Documentation

Welcome to the official documentation for igllama, a Zig-based inference engine for running large language models locally from GGUF model files.

GGUF is the binary model format used by the ggml ecosystem. It packages model weights (typically quantized) together with metadata in a single file and is optimized for fast loading and efficient inference on consumer hardware, so you can run capable models locally without cloud dependencies.
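To make the format concrete, here is a minimal sketch of parsing the fixed-size GGUF header (magic, version, tensor count, metadata key-value count, per the version 3 layout of the GGUF specification). This is an illustration only; igllama's actual loader is written in Zig and is not shown here.

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header: 4-byte magic, u32 version,
    u64 tensor count, u64 metadata key-value count (all little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "metadata_kv_count": n_kv}

# Build a synthetic header (version 3, 2 tensors, 5 metadata pairs) to demonstrate.
header = struct.pack("<4sIQQ", b"GGUF", 3, 2, 5)
print(read_gguf_header(header))
```

The metadata key-value section that follows the header carries everything a loader needs (architecture, tokenizer, quantization type), which is why a single `.gguf` file is self-describing.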


Getting Started

Installation

Set up igllama on your system. This guide covers building from source and configuring GPU backends.

  • Prerequisites and dependencies
  • Building for Windows, macOS, and Linux
  • GPU backend configuration (Metal, Vulkan, CUDA)
  • Build options and customization

Quickstart

Get running in under 10 minutes with a complete walkthrough of the core workflow.

  • Pull a model from HuggingFace
  • Run single-turn inference
  • Start an interactive chat session
  • Import local GGUF files

Core Features

Command Line Interface

Complete reference for all igllama CLI commands, flags, and options.

  • Model management (pull, list, rm, import)
  • Inference commands (run, chat)
  • API server (api, serve)
  • Environment variables and configuration
  • Command examples and use cases

Interactive Chat

Engage in multi-turn conversations with context management and session persistence.

  • Starting chat sessions
  • In-chat commands (/save, /load, /clear, /system)
  • Chat template support (ChatML, Llama 3, Mistral, and more)
  • Session management and history
  • Sampling parameters

API Server

OpenAI-compatible REST API for integrating igllama with your applications.

  • /v1/chat/completions endpoint with streaming support
  • /v1/embeddings endpoint
  • Request and response formats
  • Python and JavaScript examples
  • CORS and error handling
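As a sketch of the OpenAI-compatible flow described above, the following builds a `/v1/chat/completions` request body and posts it with the Python standard library. The base URL, port, and model name are placeholder assumptions; substitute whatever your `igllama` server reports.

```python
import json
import urllib.request

def build_chat_request(model, messages, stream=False, temperature=0.7):
    """Assemble an OpenAI-style chat completions request body."""
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
    }

def send_chat_request(base_url, payload):
    """POST the payload to /v1/chat/completions and return the decoded JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "my-model.gguf",  # placeholder model name
    [{"role": "user", "content": "Say hello."}],
)
print(json.dumps(payload, indent=2))
# send_chat_request("http://localhost:8080", payload)  # with the server running
```

Because the request and response shapes match the OpenAI API, existing OpenAI client libraries can usually be pointed at the local server by overriding their base URL.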

Documentation Index

Section         Description
Installation    Build from source and configure GPU backends
Quickstart      First steps and basic usage guide
CLI Reference   Complete command-line interface documentation
Chat Mode       Interactive conversation features
API Server      REST API integration guide

Support and Resources


Ready to get started? Head to Installation to set up igllama on your system.