Jarvis AI System (TorresVault 2.0)
The Jarvis AI System is the next major evolution of the TorresVault ecosystem: a fully local, private, high-performance AI platform built on modern GPU hardware and integrated deeply with Home Assistant, the Proxmox cluster, the UniFi network, and smart-home automations.
This page outlines the theoretical architecture for Jarvis under the TorresVault v2 upgrade plan.
System Architecture Overview
Jarvis is designed as a local-first AI compute node with the following goals:
- Zero dependency on cloud services
- Near-instant local processing
- High availability as part of the wider TorresVault ecosystem
- Tight integration with HA Voice, Nabu Casa, BLE tracking, and dashboards
- Scalable model hosting for multiple LLMs and vision models
Hardware Overview
Jarvis AI Box (theoretical build):
- CPU: AMD Ryzen (preferred for efficiency + PCIe lanes)
- GPU: NVIDIA RTX 3060 (12GB VRAM)
- RAM: 32-64GB
- Storage: NVMe SSD (OS + models), optional SATA SSD for app data
- Networking: Dual 2.5GbE or 10GbE, depending on the Minisforum/mini-PC model chosen
- OS: Ubuntu Server or Debian
- Containerization: Docker or Podman
- Base Services: vLLM or Ollama, Whisper, Piper, FastAPI endpoints
Jarvis exists as its own node, separate from PVE1/PVE2, to maintain failure-domain isolation, consistent with the TorresVault lab philosophy.
Speech Pipeline (STT/TTS)
Jarvis handles speech in two stages:
1. **Speech Input (STT): Whisper**
All voice commands enter through:
- Home Assistant Voice Assistants (1 per room eventually)
- The car BLE + TTS node for mobile interactions
- n8n request endpoints
- Optional microphone endpoints on Pi-based sensors
Whisper STT pipeline:
- Local Whisper server running large-v3 or medium-int8
- Audio cleaned → transcribed → command classification
- Routed to HA or the LLM depending on context
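The classification and routing step at the end of the Whisper pipeline could look something like the sketch below. It is only an illustration: the verb list, the `RoutedCommand` type, and the `route_transcript` function are hypothetical names, and a real deployment would more likely rely on Home Assistant intents or a small classifier than on prefix matching.

```python
from dataclasses import dataclass

# Hypothetical verb list (an assumption, not part of the design):
# transcripts starting with one of these are treated as direct HA commands.
HA_COMMAND_VERBS = ("turn on", "turn off", "set", "start", "stop", "dim", "lock", "unlock")


@dataclass
class RoutedCommand:
    target: str  # "home_assistant" or "llm"
    text: str    # normalized transcript


def route_transcript(transcript: str) -> RoutedCommand:
    """Normalize a Whisper transcript and decide where it should go."""
    # Lowercase and collapse repeated whitespace.
    text = " ".join(transcript.lower().split())
    # Device-style commands go straight to Home Assistant.
    if any(text.startswith(verb + " ") for verb in HA_COMMAND_VERBS):
        return RoutedCommand("home_assistant", text)
    # Everything else is treated as a free-form question for the LLM.
    return RoutedCommand("llm", text)
```

With this sketch, "Turn on the office lights" would be routed to Home Assistant, while "Summarize the last 24h of Proxmox warnings" would fall through to the LLM.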
2. **Speech Output (TTS): Piper**
Piper output is Home Assistant Voice-ready.
Used for:
- Spoken responses from Jarvis
- Car TTS notifications (weekday/time logic)
- Room announcements (office, kitchen, hallway)
Languages/voices:
- en-us (rhasspy onnx baseline)
- Optional emotional / expressive voices later
n8n Automations Layer
n8n acts as the automation router:
- Receives events from HA (motion, BLE, network events)
- Sends structured workflows to Jarvis (LLM queries, text cleaning)
- Sends TTS announcements back via HA/Piper
- Handles long-running processes that HA shouldn't manage
Example workflows:
- "Item Left Behind" (BLE + car tracker → TTS + WLED warning)
- "Client Count Drop" → notify Jarvis → escalate warning
- "Network health degradation" → log into Grafana via webhook
- "Daily home-health summary"
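As a rough illustration of how an event might be handed to n8n, the sketch below builds a JSON POST for an n8n Webhook node. The host name, webhook path, and the `event`/`data` payload shape are all assumptions made for this example; only the general pattern (HTTP POST to a webhook URL) reflects how n8n webhooks are typically used.

```python
import json
import urllib.request

# Assumption: an n8n Webhook node is listening at this hypothetical URL.
N8N_WEBHOOK_URL = "http://n8n.local:5678/webhook/jarvis-event"


def build_event_request(event_type: str, payload: dict,
                        url: str = N8N_WEBHOOK_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing one event."""
    # Hypothetical envelope: {"event": ..., "data": ...}
    body = json.dumps({"event": event_type, "data": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event_type: str, payload: dict,
               url: str = N8N_WEBHOOK_URL, timeout: float = 5.0) -> int:
    """Send the event and return the HTTP status code."""
    request = build_event_request(event_type, payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending keeps the payload format easy to test without a running n8n instance.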
Integration With Home Assistant
Jarvis integrates deeply into HA:
Inputs from HA → Jarvis
- Microphone audio → Whisper
- Room presence (BLE/EP1/Atom Lite/Pi Zero)
- Sensor events (motion, lights, AP offline, doors)
- Network events (UniFi client counts, WAN latency)
- FPP show status
- System notifications (VM backups, node failures)
Outputs from Jarvis → HA
- TTS responses (Piper)
- Script execution
- Automations (scene changes, WLED preset triggers)
- Dashboard chips/cards (dynamic status indicators)
- FPP commands (start playlist, stop, status polling)
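One plausible way for Jarvis to trigger scripts or scenes is the Home Assistant REST API, which accepts `POST /api/services/<domain>/<service>` with a long-lived access token. The sketch below only constructs such a request; the base URL, the token value, and the `script.office_welcome` entity are placeholders assumed for illustration.

```python
import json
import urllib.request

# Assumption: Home Assistant is reachable at its default local address.
HA_BASE_URL = "http://homeassistant.local:8123"


def build_service_call(token: str, domain: str, service: str, service_data: dict,
                       base_url: str = HA_BASE_URL) -> urllib.request.Request:
    """Build a POST to /api/services/<domain>/<service> (not sent here)."""
    url = f"{base_url}/api/services/{domain}/{service}"
    return urllib.request.Request(
        url,
        data=json.dumps(service_data).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example: run a hypothetical script entity.
request = build_service_call(
    token="YOUR_LONG_LIVED_TOKEN",  # placeholder, not a real token
    domain="script",
    service="turn_on",
    service_data={"entity_id": "script.office_welcome"},  # hypothetical entity
)
```

Passing the result to `urllib.request.urlopen(request, timeout=5)` would perform the call; that step is left out so the example stays runnable without a live Home Assistant instance.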
Integration With Proxmox & Servers
Jarvis is not a hypervisor; it *observes and automates*.
Planned integrations:
- Monitoring PVE node status via API
- Reading VM load / uptime / alerts
- Suggesting workload migrations
- Triggering safe shutdown ops if NUT signals UPS failure
- Reading Prometheus/Grafana alerts
- Providing natural-language summaries of cluster health
Example:
- "Jarvis, is my storage node overloaded?"
- "Jarvis, summarize the last 24h of Proxmox warnings."
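Before an LLM is asked to phrase a cluster-health answer, the raw node data would probably be condensed into a few lines of text. The sketch below assumes node entries shaped like the `data` list returned by the Proxmox `GET /api2/json/nodes` endpoint (`node`, `status`, `cpu`, `mem`, `maxmem`); the warning thresholds and the `summarize_nodes` function are hypothetical.

```python
def summarize_nodes(nodes: list, cpu_warn: float = 0.85, mem_warn: float = 0.90) -> str:
    """Turn a list of node dicts into one status line per node.

    cpu_warn and mem_warn are illustrative thresholds (fractions of capacity),
    not values taken from the design.
    """
    lines = []
    for entry in sorted(nodes, key=lambda n: n.get("node", "")):
        name = entry.get("node", "unknown")
        if entry.get("status") != "online":
            lines.append(f"{name}: OFFLINE")
            continue
        cpu = entry.get("cpu", 0.0)          # fraction between 0 and 1
        maxmem = entry.get("maxmem", 0)      # bytes
        mem_frac = entry.get("mem", 0) / maxmem if maxmem else 0.0
        flags = []
        if cpu >= cpu_warn:
            flags.append("high CPU")
        if mem_frac >= mem_warn:
            flags.append("high memory")
        state = ", ".join(flags) if flags else "healthy"
        lines.append(f"{name}: {state} (cpu {cpu:.0%}, mem {mem_frac:.0%})")
    return "\n".join(lines)


# Sample data for illustration only; not real cluster output.
sample = [
    {"node": "pve2", "status": "offline"},
    {"node": "pve1", "status": "online", "cpu": 0.5, "mem": 4, "maxmem": 8},
]
print(summarize_nodes(sample))
```

A compact text summary like this could then be passed to the LLM as context for questions such as the two examples above.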
Integration With the Network (UniFi + Pi-hole)
Jarvis can interface with:
UniFi
- Client analytics (AP client drops)
- Device heartbeat checks
- WAN latency monitoring
- Firewall rule toggles (matrix, mega-tree, VPN-to-TX)
- SSID toggles
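As one way to make "AP client drops" concrete, the sketch below flags a drop when the client count on an access point falls by a given fraction between two polls. The function name and both thresholds are assumptions chosen for illustration; how the counts are obtained from UniFi is left out.

```python
def detect_client_drop(previous: int, current: int,
                       min_drop_fraction: float = 0.5,
                       min_clients: int = 4) -> bool:
    """Return True when the client count fell sharply between two polls.

    min_drop_fraction and min_clients are hypothetical tuning values:
    APs with very few clients are ignored to avoid noisy alerts.
    """
    if previous < min_clients:
        return False
    drop_fraction = (previous - current) / previous
    return drop_fraction >= min_drop_fraction
```

For example, a fall from 10 clients to 4 is a 60% drop and would be flagged, while a fall from 10 to 6 would not.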
Pi-hole
- DNS lookup logs
- Query summaries
- Blocklist additions
- Warn if VIP 192.168.1.5 becomes unreachable
This creates a unified "AI-aware network".
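The VIP warning listed under Pi-hole could be approximated with a plain TCP connection test against the DNS port. This is a minimal sketch: the function name and timeout are assumptions, and a TCP connect only shows that something is listening on port 53, not that DNS answers are correct.

```python
import socket

PIHOLE_VIP = "192.168.1.5"  # VIP address from the Pi-hole list above
DNS_PORT = 53               # standard DNS port


def is_tcp_reachable(host: str, port: int = DNS_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A scheduled job could call `is_tcp_reachable(PIHOLE_VIP)` and raise a warning after several consecutive failures.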
Car Integration
Using:
- M5Stack Atom Echo (`car_ble`)
- BLE detection for keys, wallet, badge, laptop
- Piper TTS for "item left behind" alerts
- WLED (`wled_car_warning`) for green/red status visuals
Morning startup workflow:
- Car BLE powers on
- BLE scan → verify all 4 tracked items
- Piper TTS: "All items present, have a great day Nathan!"
- WLED animation: green
- OR if missing: "Your work badge is not detected." (red animation)
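The decision at the heart of the morning startup workflow can be sketched as a pure function that maps the set of detected items to a spoken message and a WLED color. The item keys, the `morning_check` name, and the wording for multiple missing items are assumptions; the single-item and all-present messages follow the workflow above.

```python
# Tracked items: key -> (spoken label, verb). Keys are hypothetical identifiers.
TRACKED_ITEMS = {
    "keys": ("keys", "are"),
    "wallet": ("wallet", "is"),
    "badge": ("work badge", "is"),
    "laptop": ("laptop", "is"),
}


def morning_check(detected: set) -> tuple:
    """Return (tts_message, wled_color) for the items seen in the BLE scan."""
    missing = [key for key in TRACKED_ITEMS if key not in detected]
    if not missing:
        return ("All items present, have a great day Nathan!", "green")
    if len(missing) == 1:
        label, verb = TRACKED_ITEMS[missing[0]]
        return (f"Your {label} {verb} not detected.", "red")
    # Assumed phrasing when more than one item is missing.
    labels = ", ".join(TRACKED_ITEMS[key][0] for key in missing)
    return (f"Missing items: {labels}.", "red")
```

The returned message would go to Piper and the color to the `wled_car_warning` device; both hand-offs are outside this sketch.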
Future Modules (TorresVault 2.0 Timeline)
- On-device embeddings for faster semantic search
- Camera inference (face / person detection) on the GPU
- Local RAG system using your wiki + docs
- Full-room voice assistants replacing all Echo devices
- Kubernetes migration (GPU node + HA control plane)
- Distributed Pi Zero fleet for BLE triangulation
- Local-first AI dashboards on tablets
Status
This page represents the design phase of Jarvis for TorresVault v2. Implementation will begin after the Promove → TorresVault v2 migration is stable.
