===== Jarvis AI System (TorresVault 2.0) =====

The **Jarvis AI System** is the next major evolution of the TorresVault ecosystem: a fully local, private, high-performance AI platform built on modern GPU hardware and integrated deeply with Home Assistant, the Proxmox cluster, the UniFi network, and smart-home automations.

This page outlines the **theoretical architecture** for Jarvis under the TorresVault v2 upgrade plan.

----

==== 🏗️ System Architecture Overview ====

Jarvis is designed as a **local-first AI compute node** with the following goals:

  * Zero dependency on cloud services
  * Near-instant local processing
  * High availability as part of the wider TorresVault ecosystem
  * Tight integration with HA Voice, Nabu Casa, BLE tracking, and dashboards
  * Scalable model hosting for multiple LLMs and vision models

----

=== Hardware Overview ===

Jarvis AI Box (theoretical build):

  * **CPU:** AMD Ryzen (preferred for efficiency + PCIe lanes)
  * **GPU:** NVIDIA RTX 3060 (12GB VRAM)
  * **RAM:** 32–64GB
  * **Storage:** NVMe SSD (OS + models), optional SATA SSD for app data
  * **Networking:** Dual 2.5GbE or 10GbE, depending on the Minisforum/mini-PC chosen
  * **OS:** Ubuntu Server or Debian
  * **Containerization:** Docker or Podman
  * **Base Services:** vLLM or Ollama, Whisper, Piper, FastAPI endpoints

Jarvis exists as its own node, separate from PVE1/PVE2, to maintain **failure domain isolation**, consistent with the lab philosophy.

----

==== 🔊 Speech Pipeline (STT/TTS) ====

Jarvis handles speech as follows:

=== 1. **Speech Input (STT) – Whisper** ===

All voice commands enter through:

  * **Home Assistant Voice Assistants** (1 per room eventually)
  * The **car BLE + TTS node** for mobile interactions
  * n8n request endpoints
  * Optional microphone endpoints on Pi-based sensors

Whisper STT pipeline:

  * Local Whisper server running large-v3 or medium-int8
  * Audio cleaned → transcribed → command classification
  * Routed to HA or the LLM depending on context

=== 2. **Speech Output (TTS) – Piper** ===

Piper is Home Assistant Voice-ready and is used for:

  * Spoken responses from Jarvis
  * Car TTS notifications (weekday/time logic)
  * Room announcements (office, kitchen, hallway)

Languages/voices:

  * en-us (rhasspy onnx baseline)
  * Optional emotional / expressive voices later

----

==== ⚙️ n8n Automations Layer ====

n8n acts as the **automation router**:

  * Receives events from HA (motion, BLE, network events)
  * Sends structured workflows to Jarvis (LLM queries, text cleaning)
  * Sends TTS announcements back via HA/Piper
  * Handles long-running processes that HA shouldn’t manage

Example workflows:

  * “Item Left Behind” (BLE + car tracker → TTS + WLED warning)
  * “Client Count Drop” → notify Jarvis → escalate warning
  * “Network health degradation” → log into Grafana via webhook
  * “Daily home-health summary”

----

==== 🧩 Integration With Home Assistant ====

Jarvis integrates deeply into HA:

=== Inputs from HA → Jarvis ===

  * Microphone audio → Whisper
  * Room presence (BLE/EP1/Atom Lite/Pi Zero)
  * Sensor events (motion, lights, AP offline, doors)
  * Network events (UniFi client counts, WAN latency)
  * FPP show status
  * System notifications (VM backups, node failures)

=== Outputs from Jarvis → HA ===

  * TTS responses (Piper)
  * Script execution
  * Automations (scene changes, WLED preset triggers)
  * Dashboard chips/cards (dynamic status indicators)
  * FPP commands (start playlist, stop, status polling)

----

==== 🌐 Integration With Proxmox & Servers ====

Jarvis is not a hypervisor; it *observes and automates*.
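As a rough illustration of the "observe" half, the sketch below turns a node-status payload into plain-language health lines that Jarvis could speak or post to a dashboard. It is a minimal sketch under assumptions, not the planned implementation: the input is assumed to mirror the ''data'' list returned by the Proxmox VE ''GET /api2/json/nodes'' endpoint (fields ''node'', ''status'', ''cpu'', ''mem'', ''maxmem''), the ''summarize_nodes'' helper and its warning thresholds are hypothetical, and fetching the payload (authentication, TLS) is deliberately left out.

```python
def summarize_nodes(nodes, cpu_warn=0.85, mem_warn=0.90):
    """Return one plain-language status line per Proxmox node.

    `nodes` is assumed to look like the `data` list from the
    Proxmox VE `/api2/json/nodes` endpoint. The thresholds are
    illustrative defaults, not values taken from the design.
    """
    lines = []
    for n in sorted(nodes, key=lambda item: item["node"]):
        name = n["node"]
        # Offline nodes usually report no usage figures, so stop early.
        if n.get("status") != "online":
            lines.append(f"{name}: OFFLINE")
            continue
        cpu = n.get("cpu", 0.0)              # fraction of total CPU, 0.0-1.0
        maxmem = n.get("maxmem", 0)          # bytes
        mem_frac = n.get("mem", 0) / maxmem if maxmem else 0.0
        flags = []
        if cpu >= cpu_warn:
            flags.append("high CPU")
        if mem_frac >= mem_warn:
            flags.append("high memory")
        state = ", ".join(flags) if flags else "healthy"
        lines.append(f"{name}: {state} (CPU {cpu:.0%}, memory {mem_frac:.0%})")
    return lines


if __name__ == "__main__":
    GIB = 1024 ** 3
    sample = [  # made-up values for illustration only
        {"node": "pve1", "status": "online", "cpu": 0.12, "mem": 8 * GIB, "maxmem": 32 * GIB},
        {"node": "pve2", "status": "offline"},
    ]
    for line in summarize_nodes(sample):
        print(line)
```

Keeping the summary logic as a pure function over already-fetched data means it could be exercised without a live cluster; the same lines could later be passed to the LLM as context for natural-language questions about cluster health.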
Planned integrations:

  * Monitoring PVE node status via API
  * Reading VM load / uptime / alerts
  * Suggesting workload migrations
  * Triggering safe shutdown ops if NUT signals UPS failure
  * Reading Prometheus/Grafana alerts
  * Providing natural-language summaries of cluster health

Examples:

  * “Jarvis, is my storage node overloaded?”
  * “Jarvis, summarize the last 24h of Proxmox warnings.”

----

==== 📡 Integration with the Network (UniFi + Pi-hole) ====

Jarvis can interface with:

=== UniFi ===

  * Client analytics (AP client drops)
  * Device heartbeat checks
  * WAN latency monitoring
  * Firewall rule toggles (matrix, mega-tree, VPN-to-TX)
  * SSID toggles

=== Pi-hole ===

  * DNS lookup logs
  * Query summaries
  * Blocklist additions
  * Warn if VIP 192.168.1.5 becomes unreachable

This creates a unified “AI-aware network”.

----

==== 🚗 Car Integration ====

Using:

  * M5Stack Atom Echo (`car_ble`)
  * BLE detection for keys, wallet, badge, laptop
  * Piper TTS for “item left behind”
  * WLED (`wled_car_warning`) for green/red status visuals

Morning startup workflow:

  * Car BLE powers on
  * BLE scan → verify all 4 tracked items
  * If everything is present: Piper TTS says “All items present, have a great day Nathan!” and WLED plays a green animation
  * If an item is missing: Piper TTS says, for example, “Your work badge is not detected.” and WLED plays a red animation

----

==== 🧱 Future Modules (TorresVault 2.0 Timeline) ====

  * On-device embeddings for faster semantic search
  * Camera inference (face / person detection) on the GPU
  * Local RAG system using the wiki + docs
  * Full-room voice assistants replacing all Echo devices
  * Kubernetes migration (GPU node + HA control plane)
  * Distributed Pi Zero fleet for BLE triangulation
  * Local-first AI dashboards on tablets

----

==== 📌 Status ====

This page represents the **design phase** of Jarvis for TorresVault v2. Implementation will begin after the Promove → TorresVault v2 migration is stable.
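
Since this is still a design, one way to make the morning startup workflow from the Car Integration section concrete is the sketch below. It only covers the decision step: given the set of items seen in a BLE scan, it picks the spoken message and the WLED color. The function name ''morning_check'', the item keys, and the wording for several missing items are assumptions made for illustration; the BLE scan itself and the calls to Piper and WLED are out of scope here.

```python
from dataclasses import dataclass

# The four tracked items named in the Car Integration section.
TRACKED_ITEMS = ("keys", "wallet", "badge", "laptop")

# Hypothetical spoken names; only "work badge" appears in the page's example.
SPOKEN_NAMES = {"keys": "keys", "wallet": "wallet",
                "badge": "work badge", "laptop": "laptop"}
PLURAL_ITEMS = {"keys"}  # items that take "are" even when mentioned alone


@dataclass(frozen=True)
class CheckResult:
    message: str      # text to hand to Piper TTS
    wled_color: str   # "green" or "red" for the car warning light
    missing: tuple    # item keys that were not detected


def morning_check(detected):
    """Decide the announcement and light color from a set of detected items."""
    missing = tuple(item for item in TRACKED_ITEMS if item not in detected)
    if not missing:
        return CheckResult("All items present, have a great day Nathan!",
                           "green", missing)
    names = [SPOKEN_NAMES[item] for item in missing]
    if len(names) == 1:
        verb = "are" if missing[0] in PLURAL_ITEMS else "is"
        listed = names[0]
    else:
        verb = "are"
        listed = ", ".join(names[:-1]) + " and " + names[-1]
    return CheckResult(f"Your {listed} {verb} not detected.", "red", missing)


if __name__ == "__main__":
    # Example: the badge was left behind.
    result = morning_check({"keys", "wallet", "laptop"})
    print(result.message, "->", result.wled_color)
```

Returning a small result object rather than calling TTS and WLED directly keeps the decision separate from the delivery, so the same check could be triggered from Home Assistant or from an n8n workflow.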