# Fuse
Pull any GGUF model from HuggingFace. Extract structured data on CPU. No Pydantic boilerplate.
```shell
$ uvx fusellm extract "Sarah Chen, 34, architect at Stripe" \
    --model bartowski/Llama-3.2-1B-Instruct-GGUF \
    --fields "name:str,age:int,company:str"
{"name": "Sarah Chen", "age": 34, "company": "Stripe"}
```
## What is Fuse?
Fuse lets you pull any GGUF model from HuggingFace, run zero-shot structured extraction with dynamic schemas, fine-tune with LoRA, and export to GGUF for fast CPU inference.
Define what to extract, as a Python dict, JSON Schema, or natural-language description, and Fuse handles prompt construction, constrained generation, and JSON parsing.
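The prompt-construction step can be sketched roughly like this. This is a hypothetical template for illustration only; Fuse's actual prompt is internal to the library.

```python
# Hypothetical sketch of the kind of extraction prompt Fuse might build
# from a field spec; the real template is internal to the library.
def build_extraction_prompt(text: str, fields: dict[str, type]) -> str:
    field_lines = "\n".join(
        f"- {name} ({t.__name__})" for name, t in fields.items()
    )
    return (
        "Extract the following fields from the text "
        "and reply with JSON only.\n"
        f"Fields:\n{field_lines}\n"
        f"Text: {text}"
    )

prompt = build_extraction_prompt(
    "Sarah Chen is a 34-year-old software architect at Stripe.",
    {"name": str, "age": int},
)
```

Constrained generation then guarantees the reply matches the schema, so the prompt only has to get the model close.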
## How it works

```mermaid
flowchart LR
    S[Schema\ndict · JSON · description] --> E[Extractor]
    M[GGUF Model\nlocal or HuggingFace] --> B[LlamaCppBackend]
    B --> E
    E -->|constrained generation\nvia outlines| R[Structured JSON]
```
## Key features
### Zero-shot extraction
No training data needed. Pass a dict of fields and get structured JSON back from any instruction-tuned model.
### Dynamic schemas
Build schemas from Python dicts, JSON Schema, or natural language. No Pydantic boilerplate required.
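The dict-to-schema translation can be pictured like this. The mapping below is an illustrative sketch of the kind of conversion such a library would perform, not Fuse's actual API or internals.

```python
# Illustrative translation from a Python field dict to JSON Schema,
# the kind of conversion presumably done internally (assumed, not Fuse's API).
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def fields_to_json_schema(fields: dict[str, type]) -> dict:
    return {
        "type": "object",
        "properties": {
            name: {"type": PY_TO_JSON[t]} for name, t in fields.items()
        },
        "required": list(fields),
    }

schema = fields_to_json_schema({"name": str, "age": int})
# {'type': 'object',
#  'properties': {'name': {'type': 'string'}, 'age': {'type': 'integer'}},
#  'required': ['name', 'age']}
```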
### HuggingFace auto-download
Pass a repo name and Fuse downloads the best Q4 GGUF automatically. Models are cached locally.
### Fine-tune with LoRA
Train on your domain data with Unsloth or HuggingFace Transformers, then export to GGUF for deployment.
## Quick example
```python
import fuse

backend = fuse.LlamaCppBackend(model_name="bartowski/Llama-3.2-1B-Instruct-GGUF")
extractor = fuse.Extractor(backend)

result = extractor.extract_from_fields(
    "Sarah Chen is a 34-year-old software architect at Stripe.",
    {"name": str, "age": int, "job_title": str, "company": str},
)
# {'name': 'Sarah Chen', 'age': 34, 'job_title': 'software architect', 'company': 'Stripe'}
```
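The final JSON-parsing step of the pipeline can be approximated as below. This is a simplified sketch, not Fuse's implementation: models sometimes wrap JSON in prose or code fences, so a lenient parser grabs the first balanced `{...}` span.

```python
import json

# Hypothetical fallback parser: pull the first balanced {...} span out of
# the model's reply and parse it. Naive: ignores braces inside string
# literals, which is usually fine for short extraction outputs.
def parse_first_json_object(text: str) -> dict:
    start = text.index("{")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("no complete JSON object found")

parse_first_json_object('Sure! {"name": "Sarah Chen", "age": 34} Done.')
# {'name': 'Sarah Chen', 'age': 34}
```

With constrained generation this fallback is rarely needed, since the output is forced to be valid JSON in the first place.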
## Supported models
Any GGUF model on HuggingFace works. Some good small models for CPU extraction:
| Model | Size (Q4) | HuggingFace Repo |
|---|---|---|
| Llama 3.2 1B Instruct | ~1 GB | `bartowski/Llama-3.2-1B-Instruct-GGUF` |
| Llama 3.2 3B Instruct | ~2 GB | `bartowski/Llama-3.2-3B-Instruct-GGUF` |
| Qwen 2.5 1.5B Instruct | ~1 GB | `bartowski/Qwen2.5-1.5B-Instruct-GGUF` |
| Phi-4 Mini Instruct | ~2.5 GB | `bartowski/Phi-4-mini-instruct-GGUF` |