====== TorresVault Basic Troubleshooting Guide ======
A clean, simple, repeatable troubleshooting guide tailored specifically to **your** home lab setup: Proxmox, UniFi, Pi-hole, NPM, Cloudflare, HA, FPP, and all the issues you've commonly faced.
Use this as your first-stop workflow when something breaks.
---
===== 1. Network & Connectivity Basics =====
**Symptoms:** VMs unreachable, dashboards offline, services not resolving.
==== Quick Checks ====
* Check UniFi for offline clients or adopted-but-disconnected devices.
* Verify correct NICs are active on Proxmox nodes.
* Confirm VLAN assignments match expected networks.
==== Common Fixes ====
* If a VM loses network unexpectedly:
  * Reseat the cable or switch ports (common with dual-NIC setups)
  * Renew the DHCP lease or verify the static mapping
  * Confirm the Proxmox bridges (`vmbr0`, `vmbr1`) are mapped to the correct NICs
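To confirm the bridge-to-NIC mapping on a Proxmox node, a small helper can read the `bridge-ports` lines from `/etc/network/interfaces`. This is a minimal sketch: the function name `bridge_ports` is illustrative, and it assumes the standard ifupdown layout that Proxmox normally writes.

```shell
# List each bridge and the physical port(s) it is attached to.
# Usage: bridge_ports [interfaces-file]   (defaults to /etc/network/interfaces)
bridge_ports() {
    awk '
        $1 == "iface" { current = $2 }
        $1 == "bridge-ports" || $1 == "bridge_ports" {
            line = current ":"
            for (i = 2; i <= NF; i++) line = line " " $i
            print line
        }
    ' "${1:-/etc/network/interfaces}"
}
```

Compare the output with `ip -br link` to see whether the NIC behind each bridge is actually UP.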
---
===== 2. DNS Issues (Pi-hole / Internal DNS) =====
**Symptoms:** Hostnames like `pve1` or `nextcloud.in.torresvault.com` not resolving.
==== Quick Checks ====
* Pi-hole → **Local DNS → Records**
  * Ensure there are no typos in hostnames
  * Ensure A records point to the new fixed IPs
  * If using wildcard entries, verify they don't conflict
==== Common Fixes ====
* Reload DNS: `pihole restartdns`
* Add missing A records for internal access
* Make sure the DHCP-provided domain suffix is `in.torresvault.com`
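A quick way to test a single record is to query Pi-hole directly and compare the answer with the IP you expect. The sketch below assumes `dig` is installed; the function name `check_a_record` and the placeholder arguments are illustrative, not part of the existing setup.

```shell
# Resolve a hostname against a specific DNS server and compare with the expected IP.
# Usage: check_a_record <hostname> <expected-ip> <dns-server>
check_a_record() {
    host=$1 expected=$2 server=$3
    # tail -n 1 keeps the final address if the answer includes a CNAME chain
    actual=$(dig +short "$host" A "@$server" | tail -n 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK   $host -> $actual"
    else
        echo "FAIL $host -> ${actual:-no answer} (expected $expected)"
        return 1
    fi
}

# Example (fill in your own addresses):
# check_a_record nextcloud.in.torresvault.com <expected-ip> <pihole-ip>
```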
---
===== 3. Nginx Proxy Manager (NPM) Issues =====
**Symptoms:** Service loads internally but not externally, SSL failures.
==== Quick Checks ====
* NPM → Proxy Hosts → ensure the entry exists
* Confirm the Forward IP matches the VM's fixed IP
* Check that "Force SSL" isn't breaking internal-only apps
==== Common Fixes ====
* Re-request SSL using DNS challenge
* Make sure the Cloudflare DNS record is set to **proxied (orange)**
* Ensure firewall/port forwarding on UCG Max routes **80/443** correctly
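To separate an NPM problem from a Cloudflare or port-forwarding problem, it can help to request the same hostname twice: once through normal DNS, and once pinned straight to the NPM host. This is a sketch using standard `curl` options; the function names are illustrative and the NPM IP is whatever your proxy host uses.

```shell
# Print the HTTP status code a URL returns, or 000 if the connection fails.
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Same request, but pinned to a specific IP (for example the NPM host) so that
# public DNS and Cloudflare are bypassed. Usage: http_status_via <hostname> <ip>
http_status_via() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' --resolve "$1:443:$2" "https://$1/"
}
```

If the pinned request succeeds but the public one fails, look at Cloudflare and the 80/443 forwarding; if both fail, look at NPM or the backend service.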
---
===== 4. Cloudflare DNS Issues =====
**Symptoms:** Public URLs not working, SSL challenge failures.
==== Quick Checks ====
* Confirm A records:
  * `app.torresvault.com` → Public IP
* Ensure the Cloudflare API token used by NPM is valid
* Check whether the record is Proxied or DNS-only
==== Common Fixes ====
* Recreate DNS record
* Rotate Cloudflare API token and update NPM
* Purge cached DNS
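Before rotating a token, you can ask Cloudflare whether the current one is still valid. The sketch below calls the token verification endpoint of the Cloudflare v4 API and looks for an `active` status in the JSON reply. The variable name `CF_API_TOKEN` and the function name are assumptions, and the simple `grep` match assumes the response keeps its usual shape.

```shell
# Succeed if Cloudflare reports the token in $CF_API_TOKEN as active.
# Usage: CF_API_TOKEN=... cf_token_ok && echo "token is valid"
cf_token_ok() {
    curl -s -m 10 -H "Authorization: Bearer $CF_API_TOKEN" \
        https://api.cloudflare.com/client/v4/user/tokens/verify |
        grep -q '"status"[[:space:]]*:[[:space:]]*"active"'
}
```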
---
===== 5. Docker / Docker Compose Problems =====
**Symptoms:** Container won't start, port conflicts, permission errors.
==== Quick Checks ====
* Check logs: `docker logs -f <container>`
* Validate `docker-compose.yml`: `docker compose config`
* Confirm folders exist under ~/apps//
==== Common Fixes ====
* Recreate the stack: `docker compose down && docker compose up -d`
* Fix incorrect port mappings
* Ensure your user belongs to the `docker` group
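Two of these fixes are easy to check from the shell: group membership and port conflicts. The helpers below are a sketch with illustrative names; they rely only on `id`, `ss`, and `awk`, and `ss` needs root to show process names.

```shell
# Succeed if the given user (default: current user) is in the docker group.
in_docker_group() {
    id -nG "${1:-$(id -un)}" | tr ' ' '\n' | grep -qx docker
}

# Show which listener already holds a TCP port.
# Usage: port_owner 8080
port_owner() {
    ss -ltnp | awk -v port=":$1" 'NR > 1 && $4 ~ (port "$")'
}
```

If `in_docker_group` fails, `sudo usermod -aG docker <user>` followed by a fresh login is the usual fix.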
---
===== 6. Proxmox Host Issues =====
**Symptoms:** VM traffic lost, backups failing, cluster partial.
==== Quick Checks ====
* Check storage health: **Datacenter → Storage**
* Check logs for `log_vg-log` filling up
* Ensure corosync links (10.10.10.0/30) are up
==== Common Fixes ====
* Restart Proxmox Backup services:
  * `systemctl restart proxmox-backup`
  * `systemctl restart proxmox-backup-proxy`
* Clean old logs only after reviewing them
* Verify correct NIC mappings after reboots
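A filling log volume is easier to catch with a threshold check than by reading `df` output by eye. The helper below is a sketch with an illustrative name; it prints every mount point at or above a given usage percentage, and the 90% default is an arbitrary assumption.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Usage: full_filesystems [percent]
full_filesystems() {
    df -P | awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6 " " $5
        }'
}
```

For the cluster side, `pvecm status` and `corosync-cfgtool -s` show quorum and link state for the corosync network.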
---
===== 7. Home Assistant Issues =====
**Symptoms:** Automations failing, sensors offline, dashboards broken.
==== Quick Checks ====
* Developer Tools → States
  * Check whether entities are `unavailable`
* Validate YAML with **Check Configuration**
==== Common Fixes ====
* Reload automations
* Restart HA Core
* Fix entity renames after device migrations
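After a device migration it can be quicker to list every `unavailable` entity at once than to scroll through Developer Tools. The sketch below uses the Home Assistant REST API (`/api/states`) and requires `curl` and `jq`. `HA_URL` and `HA_TOKEN` are placeholder variable names for your instance URL and a long-lived access token.

```shell
# List entity IDs whose state is "unavailable".
# Assumes HA_URL (for example http://<ha-host>:8123) and HA_TOKEN are set.
ha_unavailable() {
    curl -s -m 10 -H "Authorization: Bearer $HA_TOKEN" "$HA_URL/api/states" |
        jq -r '.[] | select(.state == "unavailable") | .entity_id'
}
```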
---
===== 8. FPP / Light Show Issues =====
**Symptoms:** Player offline, playlist not starting, show not responding.
==== Quick Checks ====
* Verify FPP is reachable at: `http://192.168.60.55`
* Check output status logs
* Confirm GPIO or Kulp port fuses (10A) aren't tripped
==== Common Fixes ====
* Restart FPP service
* Reset playlist
* Re-sync with HA if using ha-fpp integration
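The reachability check can be scripted so it is the same every time. This is a sketch: the function names are illustrative, and the `/api/fppd/status` path is an assumption about the FPP REST API that you should confirm against your FPP version before relying on it.

```shell
FPP_HOST=${FPP_HOST:-192.168.60.55}   # address from this guide

# Succeed if the player answers a single ping within 2 seconds.
fpp_reachable() {
    ping -c 1 -W 2 "$FPP_HOST" >/dev/null 2>&1
}

# Fetch the daemon status JSON (endpoint path is an assumption, see above).
fpp_status() {
    curl -s -m 5 "http://$FPP_HOST/api/fppd/status"
}
```

If `fpp_reachable` fails, go back to the network checks in section 1 before touching the FPP service.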
---
===== 9. High-Level Troubleshooting Logic =====
If something breaks, follow this order:
1. **Is the device reachable?** (Ping → UniFi → Physical layer)
2. **Is DNS resolving?** (Pi-hole A records → Cloudflare)
3. **Is the reverse proxy routing correctly?** (NPM logs)
4. **Are containers healthy?** (Docker logs)
5. **Are Proxmox services running?** (pveproxy, pvedaemon, pvescheduler)
6. **Is this an automation/device-specific issue?** (Home Assistant)
This top-down approach solves 95% of the issues you've encountered.
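The ordered checks above can be sketched as a small runner that executes each step in sequence and stops at the first failure, which tells you which layer to investigate. The function name `run_checks` and the `label::command` argument format are illustrative choices, and the example commands are placeholders to adapt.

```shell
# Run named checks in order and stop at the first failure.
# Each argument has the form "label::command".
run_checks() {
    for step in "$@"; do
        label=${step%%::*}
        cmd=${step#*::}
        if sh -c "$cmd" >/dev/null 2>&1; then
            echo "PASS $label"
        else
            echo "FAIL $label (start troubleshooting here)"
            return 1
        fi
    done
    echo "All checks passed"
}

# Example:
# run_checks \
#     "reachable::ping -c 1 -W 2 192.168.60.55" \
#     "dns::dig +short pve1.in.torresvault.com | grep -q ." \
#     "docker::docker ps" \
#     "pveproxy::systemctl is-active pveproxy"
```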
---
===== Final Tip =====
Document any new fix you discover here so your wiki becomes the single source of truth for TorresVault.