Homelab LLM Infrastructure (homelab-llm-infra)
TurboQuant multi-model inference with voice pipeline.
Overview
Hybrid LLM manager for homelab GPU inference. It manages multiple Ollama models with TurboQuant optimization, orchestrates context windows, hands requests off between models automatically, and drives an integrated voice pipeline for real-time interaction.
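To make the model-handoff idea concrete, here is a minimal sketch of routing a prompt to one of several Ollama models based on a rough token estimate of the context. The model names, tier thresholds, and OLLAMA_HOST default are illustrative assumptions, not the actual values in config.py or manager.py.

```python
# Illustrative sketch only: model names, tier limits, and host are assumptions,
# not this repo's actual configuration.
import os
import requests

OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

# Hypothetical tiers: (max estimated prompt tokens, model to hand off to)
MODEL_TIERS = [
    (4_000, "llama3.1:8b"),
    (16_000, "qwen2.5:14b"),
    (64_000, "qwen2.5:32b"),
]

def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def pick_model(prompt: str) -> str:
    """Hand off to a larger-context model as the prompt grows."""
    tokens = estimate_tokens(prompt)
    for limit, model in MODEL_TIERS:
        if tokens <= limit:
            return model
    return MODEL_TIERS[-1][1]

def generate(prompt: str) -> str:
    """Send the prompt to the selected model via Ollama's /api/generate endpoint."""
    model = pick_model(prompt)
    resp = requests.post(
        f"{OLLAMA_HOST}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(generate("Summarize the homelab GPU inventory in one sentence."))
```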
Stack
- Python
- Ollama
Quick Start
```bash
# Clone
git clone ssh://git@192.168.183.110:2222/pook/homelab-llm-infra.git
cd homelab-llm-infra
```
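The Quick Start stops at cloning. As a hypothetical sanity check before running the manager (the host URL below is Ollama's default and an assumption, not documented here), you can confirm the Ollama daemon is reachable and list the models it has pulled:

```python
# Sanity check: list models the local Ollama daemon has pulled.
# The host URL is an assumption (Ollama's default), not taken from this README.
import requests

OLLAMA_HOST = "http://localhost:11434"

resp = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=10)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])
```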
Status
Active
License
Private — All rights reserved