# Documentation
Welcome to the official documentation for igllama, a Zig-based inference engine for running large language models locally from GGUF model files.
GGUF is a binary model file format from the ggml project, designed for quantized models and optimized for efficient inference on consumer hardware. It lets you run capable AI models entirely on your own machine, with no cloud dependencies.
## Getting Started

### Installation
Set up igllama on your system. This guide covers building from source and configuring GPU backends.
- Prerequisites and dependencies
- Building for Windows, macOS, and Linux
- GPU backend configuration (Metal, Vulkan, CUDA)
- Build options and customization
### Quickstart
Get running in under 10 minutes with a complete walkthrough of the core workflow.
- Pull a model from HuggingFace
- Run single-turn inference
- Start an interactive chat session
- Import local GGUF files
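The four quickstart steps above can be sketched as a shell session. This is a dry run that only prints the commands, so it can be read without igllama installed; the model and repository names are placeholders, and the subcommands (`pull`, `run`, `chat`, `import`) are those listed in the CLI reference below.

```shell
# Dry-run sketch of the quickstart workflow. Model and repository
# names are placeholders -- substitute real ones.
steps=(
  "igllama pull <user>/<gguf-repo>    # fetch a model from HuggingFace"
  "igllama run <model> 'Hello there'  # single-turn inference"
  "igllama chat <model>               # interactive chat session"
  "igllama import ./model.gguf        # register a local GGUF file"
)
printf '%s\n' "${steps[@]}"
```

Each step builds on the previous one: pull (or import) a model first, then run or chat with it.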
## Core Features

### Command Line Interface
Complete reference for all igllama CLI commands, flags, and options.
- Model management (`pull`, `list`, `rm`, `import`)
- Inference commands (`run`, `chat`)
- API server (`api`, `serve`)
- Environment variables and configuration
- Command examples and use cases
### Interactive Chat
Engage in multi-turn conversations with context management and session persistence.
- Starting chat sessions
- In-chat commands (`/save`, `/load`, `/clear`, `/system`)
- Chat template support (ChatML, Llama 3, Mistral, and more)
- Session management and history
- Sampling parameters
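As an illustration, a session using the in-chat commands above might look like the following transcript. This script only prints the sketch; the model name and session name are placeholders, and the model's reply is elided.

```shell
# Illustrative chat transcript (printed, not executed against igllama).
# <model> and "demo" are placeholders.
transcript='$ igllama chat <model>
> /system You are a concise assistant.
> Summarize GGUF in one sentence.
  ...model reply streams here...
> /save demo
> /clear
> /load demo'
printf '%s\n' "$transcript"
```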
### API Server
OpenAI-compatible REST API for integrating igllama with your applications.
- `/v1/chat/completions` endpoint with streaming support
- `/v1/embeddings` endpoint
- Request and response formats
- Python and JavaScript examples
- CORS and error handling
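As a sketch, a chat completion request against the OpenAI-compatible endpoint could be issued with `curl`. The base URL below is an assumption (use whatever address your igllama server binds to) and the model name is a placeholder; the command string is printed rather than executed so it can be inspected without a running server.

```shell
# Sketch of an OpenAI-compatible chat completion request.
# BASE_URL is an assumption; <model> is a placeholder.
BASE_URL="http://localhost:8080"
BODY='{"model":"<model>","messages":[{"role":"user","content":"Hello"}],"stream":false}'
CMD="curl -s $BASE_URL/v1/chat/completions -H 'Content-Type: application/json' -d '$BODY'"
echo "$CMD"
```

Setting `"stream": true` in the body requests token-by-token streaming, as noted in the endpoint list above.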
## Documentation Index
| Section | Description |
|---|---|
| Installation | Build from source and configure GPU backends |
| Quickstart | First steps and basic usage guide |
| CLI Reference | Complete command-line interface documentation |
| Chat Mode | Interactive conversation features |
| API Server | REST API integration guide |
## Support and Resources
- GitHub Repository: github.com/bkataru/igllama - Source code, issues, and discussions
- License: MIT License
Ready to get started? Head to Installation to set up igllama on your system.