Welcome to igllama 🦄 - the Zig-based Ollama alternative for running LLMs locally.

Why igllama? Built on top of llama.cpp.zig bindings, igllama provides a pure-Zig experience for running GGUF models with an Ollama-like CLI, with no Python or system dependencies to install.

Key Features

  • Pure Zig 🚫🐍 - No Python or system dependencies
  • Ollama-like CLI 🖥️ - Familiar commands: pull, run, chat (see the session sketch after this list)
  • HuggingFace Integration - Download models directly
  • OpenAI-compatible API - REST server with /v1/chat/completions (example request after this list)
  • GGUF Support - Native GGUF format support
  • GPU Acceleration 🚀 - Metal, Vulkan, and CUDA backends

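A typical session might look like the following. Only run with -p is confirmed by the Quick Start below; the argument formats for pull and chat are assumptions, so check igllama's own help output for exact syntax:

igllama pull <model-reference>        # fetch a GGUF model (argument format is an assumption)
igllama run model.gguf -p "Hello!"    # one-shot prompt, as in the Quick Start
igllama chat model.gguf               # interactive chat (argument format is an assumption)
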
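Once the REST server is running (how it is started is not covered here), any OpenAI-style client can talk to it. A minimal curl sketch, assuming the server listens on localhost:8080 and that the model field names the loaded GGUF file:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "model.gguf",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
# localhost:8080 and the "model" value are assumptions; check the server's startup output
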
Quick Start

git clone --recursive https://github.com/bkataru/igllama.git   # --recursive pulls in git submodules (e.g. the llama.cpp.zig bindings)
cd igllama
zig build -Doptimize=ReleaseFast                                # release build; the binary lands in zig-out/bin
./zig-out/bin/igllama run model.gguf -p "Hello!"                # run a local GGUF model with a one-shot prompt

View Full Documentation