# Documentation
Welcome to the official documentation for igllama, a Zig-based inference engine for running large language models locally from GGUF model files.
GGUF is a binary model file format from the ggml project, designed for quantized models and optimized for efficient inference on consumer hardware. It lets you run capable AI models entirely on your own machine, with no cloud dependencies.
## Getting Started

### Installation
Set up igllama on your system. This guide covers building from source and configuring GPU backends.
- Prerequisites and dependencies
- Building for Windows, macOS, and Linux
- GPU backend configuration (Metal, Vulkan, CUDA)
- Build options and customization
### Quickstart
Get running in under 10 minutes with a complete walkthrough of the core workflow.
- Pull a model from HuggingFace
- Run single-turn inference
- Start an interactive chat session
- Import local GGUF files
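The four quickstart steps above can be sketched as a shell session. This is a dry run that only prints the commands, so it can be read without igllama installed; the model and repository names are placeholders, and the subcommands (`pull`, `run`, `chat`, `import`) are those listed in the CLI reference below.

```shell
# Dry-run sketch of the quickstart workflow. Model and repository
# names are placeholders -- substitute real ones.
steps=(
  "igllama pull <user>/<gguf-repo>    # fetch a model from HuggingFace"
  "igllama run <model> 'Hello there'  # single-turn inference"
  "igllama chat <model>               # interactive chat session"
  "igllama import ./model.gguf        # register a local GGUF file"
)
printf '%s\n' "${steps[@]}"
```

Each step builds on the previous one: pull (or import) a model first, then run or chat with it.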
## Core Features

### Command Line Interface
Complete reference for all igllama CLI commands, flags, and options.
- Model management (`pull`, `list`, `rm`, `import`)
- Inference commands (`run`, `chat`)
- API server (`api`, `serve`)
- Environment variables and configuration
- Command examples and use cases
### Interactive Chat
Engage in multi-turn conversations with context management and session persistence.
- Starting chat sessions
- In-chat commands (`/save`, `/load`, `/clear`, `/system`)
- Chat template support (ChatML, Llama 3, Mistral, and more)
- Session management and history
- Sampling parameters
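As an illustration, a session using the in-chat commands above might look like the following transcript. This script only prints the sketch; the model name and session name are placeholders, and the model's reply is elided.

```shell
# Illustrative chat transcript (printed, not executed against igllama).
# <model> and "demo" are placeholders.
transcript='$ igllama chat <model>
> /system You are a concise assistant.
> Summarize GGUF in one sentence.
  ...model reply streams here...
> /save demo
> /clear
> /load demo'
printf '%s\n' "$transcript"
```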
### API Server
OpenAI-compatible REST API for integrating igllama with your applications.
- `/v1/chat/completions` endpoint with streaming support
- `/v1/embeddings` endpoint
- Request and response formats
- Python and JavaScript examples
- CORS and error handling
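As a sketch, a chat completion request against the OpenAI-compatible endpoint could be issued with `curl`. The base URL below is an assumption (use whatever address your igllama server binds to) and the model name is a placeholder; the command string is printed rather than executed so it can be inspected without a running server.

```shell
# Sketch of an OpenAI-compatible chat completion request.
# BASE_URL is an assumption; <model> is a placeholder.
BASE_URL="http://localhost:8080"
BODY='{"model":"<model>","messages":[{"role":"user","content":"Hello"}],"stream":false}'
CMD="curl -s $BASE_URL/v1/chat/completions -H 'Content-Type: application/json' -d '$BODY'"
echo "$CMD"
```

Setting `"stream": true` in the body requests token-by-token streaming, as noted in the endpoint list above.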
## Documentation Index
| Section | Description |
|---|---|
| Installation | Build from source and configure GPU backends |
| Quickstart | First steps and basic usage guide |
| CLI Reference | Complete command-line interface documentation |
| Chat Mode | Interactive conversation features |
| API Server | REST API integration guide |
## Support and Resources
- GitHub Repository: github.com/bkataru/igllama - Source code, issues, and discussions
- License: MIT License
Ready to get started? Head to Installation to set up igllama on your system.