====== 🛠️ TorresVault Basic Troubleshooting Guide ======

A clean, simple, repeatable troubleshooting guide tailored specifically to **your** home lab setup: Proxmox, UniFi, Pi-hole, NPM, Cloudflare, HA, FPP, and all the issues you've commonly faced.

Use this as your first-stop workflow when something breaks.

----

===== 🚦 1. Network & Connectivity Basics =====

**Symptoms:** VMs unreachable, dashboards offline, services not resolving.

==== 🔍 Quick Checks ====

  * Check UniFi for offline clients or adopted-but-disconnected devices.
  * Verify the correct NICs are active on the Proxmox nodes.
  * Confirm VLAN assignments match the expected networks.

==== 🔌 Common Fixes ====

  * If a VM loses network unexpectedly:
    * Reseat the cable or try a different switch port (common with dual-NIC setups)
    * Renew DHCP or verify the static mapping
    * Confirm the Proxmox bridges (`vmbr0`, `vmbr1`) are mapped to the correct NICs

----

===== 📡 2. DNS Issues (Pi-hole / Internal DNS) =====

**Symptoms:** Hostnames like `pve1` or `nextcloud.in.torresvault.com` not resolving.

==== 🔍 Quick Checks ====

  * Pi-hole → **Local DNS → Records**
  * Ensure there are no typos in hostnames
  * Ensure A records point to the new fixed IPs
  * If using wildcard entries, verify they don't conflict

==== 🔧 Common Fixes ====

  * Reload DNS: `pihole restartdns`
  * Add missing A records for internal access
  * Make sure the DHCP-provided domain suffix is `in.torresvault.com`

----

===== 🔐 3. Nginx Proxy Manager (NPM) Issues =====

**Symptoms:** Service loads internally but not externally, SSL failures.

==== 🔍 Quick Checks ====

  * NPM → Proxy Hosts → ensure the entry exists
  * Confirm the Forward IP matches the VM's fixed IP
  * Check that "Force SSL" isn't breaking internal-only apps

==== 🔧 Common Fixes ====

  * Re-request the SSL certificate using the DNS challenge
  * Make sure the Cloudflare DNS record is set to **proxied (orange)**
  * Ensure firewall/port forwarding on the UCG Max routes **80/443** correctly

----

===== ☁️ 4. Cloudflare DNS Issues =====

**Symptoms:** Public URLs not working, SSL challenge failures.

==== 🔍 Quick Checks ====

  * Confirm A records:
    * `app.torresvault.com` → Public IP
  * Ensure the Cloudflare API token used by NPM is valid
  * Check whether the record is Proxied or DNS-only

==== 🔧 Common Fixes ====

  * Recreate the DNS record
  * Rotate the Cloudflare API token and update NPM
  * Purge cached DNS

----

===== 📦 5. Docker / Docker Compose Problems =====

**Symptoms:** Container won't start, port conflicts, permission errors.

==== 🔍 Quick Checks ====

  * Check logs: `docker logs -f` followed by the container name
  * Validate docker-compose.yml: `docker compose config`
  * Confirm folders exist under `~/apps//`

==== 🔧 Common Fixes ====

  * Rebuild: `docker compose down && docker compose up -d`
  * Fix wrong port exposures
  * Ensure the user belongs to the `docker` group

----

===== 🧵 6. Proxmox Host Issues =====

**Symptoms:** VM traffic lost, backups failing, cluster partially up.

==== 🔍 Quick Checks ====

  * Check storage health: **Datacenter → Storage**
  * Check whether the `log_vg-log` volume is filling up
  * Ensure corosync links (10.10.10.0/30) are up

==== 🔧 Common Fixes ====

  * Restart the Proxmox Backup services:
    * `systemctl restart proxmox-backup`
    * `systemctl restart proxmox-backup-proxy`
  * Clean old logs only after reviewing them
  * Verify correct NIC mappings after reboots

----

===== 🔔 7. Home Assistant Issues =====

**Symptoms:** Automations failing, sensors offline, dashboards broken.

==== 🔍 Quick Checks ====

  * Developer Tools → States
  * Check whether entities are `unavailable`
  * Validate YAML with **Check Configuration**

==== 🔧 Common Fixes ====

  * Reload automations
  * Restart HA Core
  * Fix entity renames after device migrations

----

===== 🎄 8. FPP / Light Show Issues =====

**Symptoms:** Player offline, playlist not starting, show not responding.
==== 🔍 Quick Checks ====

  * Verify FPP is reachable at `http://192.168.60.55`
  * Check the output status logs
  * Confirm GPIO or Kulp port fuses (10A) aren't tripped

==== 🔧 Common Fixes ====

  * Restart the FPP service
  * Reset the playlist
  * Re-sync with HA if using the ha-fpp integration

----

===== 🧩 9. High-Level Troubleshooting Logic =====

If something breaks, follow this order:

  - **Is the device reachable?** (Ping → UniFi → Physical layer)
  - **Is DNS resolving?** (Pi-hole A records → Cloudflare)
  - **Is the reverse proxy routing correctly?** (NPM logs)
  - **Are containers healthy?** (Docker logs)
  - **Are Proxmox services running?** (pveproxy, pvedaemon, pvescheduler)
  - **Is this an automation/device-specific issue?** (Home Assistant)

This top-down approach solves 95% of the issues you've encountered.

----

===== 📘 Final Tip =====

Document any new fix you discover here so your wiki becomes the single source of truth for TorresVault.
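
The top-down order from section 9 can be sketched as a small shell helper that runs one check per layer and stops at the first failure, since lower layers usually depend on the ones above. This is only a minimal sketch, not part of the existing setup: the function names `run_layer` and `first_broken_layer` are hypothetical, and the commented usage at the bottom assumes `ping`, `dig`, and `curl` are installed on the machine you run it from.

```shell
#!/usr/bin/env bash
# Minimal sketch: walk the troubleshooting layers in order and report
# the first one that fails. Function names are illustrative only.

# run_layer LABEL CMD [ARGS...]
# Runs CMD with its output discarded, prints "PASS: LABEL" or "FAIL: LABEL",
# and returns 0 on success or 1 on failure.
run_layer() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
    return 0
  fi
  echo "FAIL: $label"
  return 1
}

# first_broken_layer "LABEL|COMMAND" ...
# Each argument is a label and a shell command separated by the first "|".
# Checks run in the order given and stop at the first failure.
first_broken_layer() {
  local entry label cmd
  for entry in "$@"; do
    label="${entry%%|*}"   # text before the first "|"
    cmd="${entry#*|}"      # everything after the first "|"
    if ! run_layer "$label" sh -c "$cmd"; then
      echo "Start troubleshooting at: $label"
      return 1
    fi
  done
  echo "All layers passed"
  return 0
}

# Example usage (assumed checks; adjust hosts and commands to your setup):
# first_broken_layer \
#   "Device reachable|ping -c 1 -W 2 192.168.60.55" \
#   "DNS resolving|dig +short nextcloud.in.torresvault.com | grep -q ." \
#   "Reverse proxy routing|curl -fsS -o /dev/null https://app.torresvault.com"
```

Because each layer is just a label plus a command, new checks (for example a `systemctl is-active --quiet pveproxy` check run on a Proxmox node) can be appended without changing the functions.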