Development¶
Setup¶
Install pre-commit hooks:
Running tests¶
Pass extra arguments after --:
# Run a specific test file
uv run pytest tests/test_runner.py -v
# Run tests matching a keyword
uv run pytest -k test_llm -v
Nox sessions¶
Nox orchestrates lint and tests in isolated virtual environments.
# Run all sessions (lint + tests)
uv run nox
# Run only the test session
uv run nox -s tests
# Run only the lint session
uv run nox -s lint
# Pass extra pytest args
uv run nox -s tests -- -k test_runner -v
Sessions¶
| Session | What it does |
|---|---|
lint |
Runs pre-commit run --all-files (ruff lint + ruff format) |
tests |
Runs pytest tests/ with pytest-asyncio |
Code style¶
hafermilch uses ruff for linting and formatting, enforced via pre-commit.
# Auto-fix lint issues
uv run ruff check --fix src/ tests/
# Format code
uv run ruff format src/ tests/
Ruff is configured in pyproject.toml under [tool.ruff].
Project structure¶
hafermilch/
├── src/hafermilch/
│ ├── browser/
│ │ ├── base.py # Abstract BaseBrowserAgent
│ │ ├── playwright_agent.py # Playwright backend (+ login action)
│ │ ├── agent_browser.py # agent-browser subprocess backend
│ │ ├── context.py # PageContext dataclass
│ │ └── factory.py # create_browser_agent()
│ ├── core/
│ │ ├── models.py # Pydantic models (incl. TokenUsage, Credentials)
│ │ ├── settings.py # pydantic-settings (env vars)
│ │ └── exceptions.py # Custom exception hierarchy
│ ├── evaluation/
│ │ ├── runner.py # EvaluationRunner orchestrator
│ │ └── prompter.py # Prompt construction (with credentials)
│ ├── llm/
│ │ ├── base.py # Abstract LLMProvider
│ │ ├── litellm_provider.py # LiteLLM unified provider
│ │ └── factory.py # LLMProviderFactory
│ ├── personas/
│ │ └── loader.py # YAML loading + ${ENV_VAR} interpolation
│ ├── reporting/
│ │ ├── reporter.py # JSON + Markdown + HTML output
│ │ └── templates/
│ │ └── report.html # Jinja2 HTML report template
│ └── cli.py # Typer CLI entrypoint
├── tests/
│ ├── conftest.py # Shared fixtures
│ ├── test_llm_base.py
│ ├── test_persona_loader.py
│ ├── test_prompter.py
│ ├── test_agent_browser.py
│ └── test_runner.py
├── examples/
│ ├── personas/ # Built-in persona YAMLs
│ └── plans/ # Built-in plan YAMLs
├── pyproject.toml
├── noxfile.py
└── .pre-commit-config.yaml
Adding a new LLM provider¶
hafermilch uses LiteLLM as a unified gateway, so most providers work out of the box — just set the provider and model in the persona YAML. If you need custom behavior beyond what LiteLLM provides:
- Create
src/hafermilch/llm/myprovider.pysubclassingLLMProvider - Implement
async def complete(self, messages)andsupports_vision - Add a branch in
src/hafermilch/llm/factory.py - Add the new provider name to the
providerfield validation incore/models.py
Adding a new browser backend¶
- Create
src/hafermilch/browser/mybackend.pysubclassingBaseBrowserAgent - Implement
start(),stop(),navigate(),capture(),execute(), and theselector_hintproperty - Add the backend to
BrowserBackendinbrowser/factory.py - Update
create_browser_agent()with a new branch
CI¶
GitHub Actions runs on every push and pull request to master/main:
- Lint —
nox -s linton ubuntu-latest - Tests —
nox -s testsacross Python 3.11, 3.12, and 3.13