
Homelab LLM Infrastructure (homelab-llm-infra)

TurboQuant multi-model inference with voice pipeline.

Overview

Hybrid LLM manager for homelab GPU inference. It manages multiple Ollama models with TurboQuant optimization, orchestrates context windows, hands requests off between models automatically, and integrates a voice pipeline for real-time interaction.
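The automatic model handoff described above could, as one minimal sketch, route each request to the smallest model whose context window fits it. Everything below — model names, window sizes, and the token heuristic — is an illustrative assumption, not the repo's actual roster or logic:

```python
"""Sketch of context-based model handoff (all names/thresholds are assumptions)."""

from dataclasses import dataclass


@dataclass
class ModelSlot:
    name: str      # Ollama model tag (hypothetical)
    max_ctx: int   # context window in tokens


# Hypothetical roster, ordered from cheapest to largest context window.
ROSTER = [
    ModelSlot("llama3:8b", 8_192),
    ModelSlot("qwen2.5:32b", 32_768),
    ModelSlot("long-context:70b", 262_144),
]


def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)


def pick_model(prompt: str, history_tokens: int = 0) -> str:
    """Hand off to the smallest model whose window fits the request."""
    needed = estimate_tokens(prompt) + history_tokens
    for slot in ROSTER:
        if needed <= slot.max_ctx:
            return slot.name
    # Nothing fits: fall back to the largest window available.
    return ROSTER[-1].name
```

A real manager would also track per-model VRAM pressure and warm/cold load state before switching, but the selection step reduces to a threshold walk like this.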

Stack

  • Python
  • Ollama

Quick Start

# Clone
git clone ssh://git@192.168.183.110:2222/pook/homelab-llm-infra.git
cd homelab-llm-infra
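After cloning, inference settings live in config.py at the repo root. As a purely hypothetical sketch of what TurboQuant-style settings might look like (every name and value here is an assumption for illustration, not the repo's actual config):

```python
# Hypothetical config.py fragment — names and values are illustrative only.
CONTEXT_WINDOW = 262_144        # 262K-token context window
KV_CACHE_TYPE = "q8_0"          # quantized KV cache to cut VRAM usage
OLLAMA_HOST = "127.0.0.1:11434" # default Ollama listen address
```

Quantizing the KV cache trades a small accuracy loss for roughly half the cache memory versus f16, which is what makes very large context windows feasible on homelab GPUs.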

Status

Active

License

Private — All rights reserved