Development¶

Setup¶

git clone https://github.com/sandeep-selvaraj/hafermilch.git
cd hafermilch
uv sync

Install pre-commit hooks:

uv run pre-commit install

Running tests¶

uv run pytest

Pass extra arguments after --:

# Run a specific test file
uv run pytest tests/test_runner.py -v

# Run tests matching a keyword
uv run pytest -k test_llm -v

Nox sessions¶

Nox orchestrates lint and tests in isolated virtual environments.

# Run all sessions (lint + tests)
uv run nox

# Run only the test session
uv run nox -s tests

# Run only the lint session
uv run nox -s lint

# Pass extra pytest args
uv run nox -s tests -- -k test_runner -v

Sessions¶

Session	What it does
`lint`	Runs `pre-commit run --all-files` (ruff lint + ruff format)
`tests`	Runs `pytest tests/` with `pytest-asyncio`

Code style¶

hafermilch uses ruff for linting and formatting, enforced via pre-commit.

# Auto-fix lint issues
uv run ruff check --fix src/ tests/

# Format code
uv run ruff format src/ tests/

Ruff is configured in pyproject.toml under [tool.ruff].

Project structure¶

hafermilch/
├── src/hafermilch/
│   ├── browser/
│   │   ├── base.py             # Abstract BaseBrowserAgent
│   │   ├── playwright_agent.py # Playwright backend (+ login action)
│   │   ├── agent_browser.py    # agent-browser subprocess backend
│   │   ├── context.py          # PageContext dataclass
│   │   └── factory.py          # create_browser_agent()
│   ├── core/
│   │   ├── models.py           # Pydantic models (incl. TokenUsage, Credentials)
│   │   ├── settings.py         # pydantic-settings (env vars)
│   │   └── exceptions.py       # Custom exception hierarchy
│   ├── evaluation/
│   │   ├── runner.py           # EvaluationRunner orchestrator
│   │   └── prompter.py         # Prompt construction (with credentials)
│   ├── llm/
│   │   ├── base.py             # Abstract LLMProvider
│   │   ├── litellm_provider.py # LiteLLM unified provider
│   │   └── factory.py          # LLMProviderFactory
│   ├── personas/
│   │   └── loader.py           # YAML loading + ${ENV_VAR} interpolation
│   ├── reporting/
│   │   ├── reporter.py         # JSON + Markdown + HTML output
│   │   └── templates/
│   │       └── report.html     # Jinja2 HTML report template
│   └── cli.py                  # Typer CLI entrypoint
├── tests/
│   ├── conftest.py             # Shared fixtures
│   ├── test_llm_base.py
│   ├── test_persona_loader.py
│   ├── test_prompter.py
│   ├── test_agent_browser.py
│   └── test_runner.py
├── examples/
│   ├── personas/               # Built-in persona YAMLs
│   └── plans/                  # Built-in plan YAMLs
├── pyproject.toml
├── noxfile.py
└── .pre-commit-config.yaml

Adding a new LLM provider¶

hafermilch uses LiteLLM as a unified gateway, so most providers work out of the box — just set the provider and model in the persona YAML. If you need custom behavior beyond what LiteLLM provides:

Create src/hafermilch/llm/myprovider.py subclassing LLMProvider
Implement async def complete(self, messages) and supports_vision
Add a branch in src/hafermilch/llm/factory.py
Add the new provider name to the provider field validation in core/models.py

Adding a new browser backend¶

Create src/hafermilch/browser/mybackend.py subclassing BaseBrowserAgent
Implement start(), stop(), navigate(), capture(), execute(), and the selector_hint property
Add the backend to BrowserBackend in browser/factory.py
Update create_browser_agent() with a new branch

CI¶

GitHub Actions runs on every push and pull request to master/main:

Lint — nox -s lint on ubuntu-latest
Tests — nox -s tests across Python 3.11, 3.12, and 3.13