Registry

23 models. Ready to run.

On-device, cross-platform. Browse the catalog, copy the integration snippet, ship.

Text Generation gguf

Gemma 3 1B

Gemma 3 1B - Google's mobile-optimized instruction-tuned LLM with 32K context

Google · 1B params · 749.4 MB

Text Generation gguf

Gemma 4 E2B

Gemma 4 E2B - Google's compact multimodal LLM with 2.3B effective params, 128K context, and audio/image/video understanding

Google · 5.1B params · —

Text Generation gguf

Gemma 4 E4B

Gemma 4 E4B - Google's mid-range multimodal LLM with 4.5B effective params, 128K context, and audio/image/video understanding

Google · 8B params · —

Text Generation gguf

Gemma3NPC 1B

Gemma3NPC 1B - Mobile-friendly NPC roleplay fine-tune of Gemma 3 1B for in-character game dialogue

Google · 1B params · 806.0 MB

Text Generation gguf

Gemma3NPC-it

Gemma3NPC-it - NPC roleplay fine-tune of Gemma3n-E4B for in-character game dialogue

Google · 7B params · 3.8 GB

Text to Speech onnx

KittenTTS Micro 0.8

KittenTTS Micro 0.8 - Compact StyleTTS 2 (40M params, 8 named voices, OpenPhonemizer)

KittenML · 40M params · 90.2 MB

Text to Speech onnx

KittenTTS Mini 0.8

KittenTTS Mini 0.8 - High-quality StyleTTS 2 (80M params, 8 named voices, OpenPhonemizer)

KittenML · 80M params · 111.1 MB

Text to Speech onnx

KittenTTS Nano 0.2

KittenTTS Nano 0.2 - Ultra-lightweight TTS (15M params, <25MB)

KittenML · 15M params · 18.3 MB

Text to Speech onnx

KittenTTS Nano 0.8

KittenTTS Nano 0.8 - Ultra-lightweight StyleTTS 2 (15M params, 8 named voices, OpenPhonemizer)

KittenML · 15M params · 78.3 MB

Text to Speech onnx

Kokoro 82M

Kokoro 82M - High-quality TTS with 24 voices, Misaki dictionary

Hexgrad · 82M params · 174.9 MB

Text Generation gguf

LFM2.5 350M

Liquid LFM2.5 350M - hybrid conv+attention LLM optimized for edge deployment (9 languages, tool calling)

Liquid · 354M params · —

Text Generation gguf

Llama 3.2 1B

Llama 3.2 1B - Meta's lightweight mobile-optimized instruction-tuned LLM with 128K context

Meta · 1B params · 754.3 MB

Text Generation gguf

Ministral 3 3B

Ministral 3 3B Instruct - Mistral AI's edge-optimized instruction-tuned LLM with 256K context

Mistral · 3.4B params · 2.0 GB

Text Generation gguf

Mistral 7B

Mistral 7B Instruct v0.3 - High-quality desktop LLM with function calling support

Mistral · 7B params · 4.0 GB

Text to Speech gguf

NeuTTS Air Q4

NeuTTS Air (~500M) - codec TTS with voice cloning. Q4 GGUF backbone + NeuCodec decoder. Higher quality than Nano.

Neuphonic · 500M params · 755.0 MB

Text to Speech gguf

NeuTTS Nano Q4

NeuTTS Nano (120M) - codec TTS with voice cloning. Q4 GGUF backbone + NeuCodec ONNX decoder.

Neuphonic · 120M params · 469.8 MB

Text Generation gguf

Phi-4 Mini

Phi-4 Mini 3.8B - Microsoft's compact reasoning LLM with 128K context

Microsoft · 3.8B params · 8.1 GB

Text Generation gguf

Qwen2.5 0.5B Instruct

Qwen 2.5 0.5B Instruct - Small but capable instruction-tuned LLM from Alibaba Cloud

Qwen · 500M params · 455.2 MB

Text Generation gguf

Qwen3.5 0.8B

Qwen 3.5 0.8B - Alibaba Cloud's lightweight multimodal LLM (text-only mode, 201 languages)

Qwen · 800M params · 495.7 MB

Text Generation gguf

Qwen3.5 2B

Qwen 3.5 2B - Alibaba Cloud's compact multimodal LLM (text-only mode, 201 languages)

Qwen · 2B params · 1.2 GB

Text Generation gguf

SmolLM2 360M

SmolLM2 360M - Hugging Face's tiny LLM with a strong quality-to-size ratio

Hugging Face · 360M params · 254.2 MB

Speech Recognition onnx

Wav2Vec2 Base 960h

Wav2Vec2 Base 960h - English ASR with CTC decoding

Meta · 95M params · 220.3 MB

Speech Recognition safetensors

Whisper Tiny

Whisper Tiny - Fast multilingual ASR (Candle/SafeTensors runtime)

OpenAI · 39M params · 84.7 MB