A clean, simple, repeatable troubleshooting guide tailored specifically to your home lab setup: Proxmox, UniFi, Pi-hole, NPM, Cloudflare, HA, FPP, and all the issues you've commonly faced.
Use this as your first-stop workflow when something breaks.
---
Symptoms: VMs unreachable, dashboards offline, services not resolving.
* Check UniFi for offline clients or adopted-but-disconnected devices.
* Verify correct NICs are active on Proxmox nodes.
* Confirm VLAN assignments match expected networks.
* If a VM loses network unexpectedly:
---
Symptoms: Hostnames like `pve1`, `nextcloud.in.torresvault.com` not resolving.
* Pi-hole → Local DNS → Records
* Ensure no typos in hostnames
* Ensure A records point to new fixed IPs
* If using wildcard entries, verify they don't conflict
* Reload DNS:
pihole restartdns
* Add missing A records for internal access
* Make sure the DHCP-provided domain suffix is `in.torresvault.com`
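
The DNS checks above can be scripted so you query Pi-hole directly instead of relying on whatever resolver the client happens to use. This is a minimal sketch, assuming `dig` is installed and that Pi-hole answers at the placeholder address `192.168.1.2` (not your real address; set `PIHOLE_IP` yourself). The helper names `fqdn` and `check_record` are made up for illustration.

```shell
#!/bin/sh
# Assumption: 192.168.1.2 is a placeholder. Export PIHOLE_IP with your Pi-hole's address.
PIHOLE_IP="${PIHOLE_IP:-192.168.1.2}"
DOMAIN_SUFFIX="in.torresvault.com"

# Expand a short hostname (pve1) to its full internal name.
# Names that already contain a dot are passed through unchanged.
fqdn() {
  case "$1" in
    *.*) printf '%s\n' "$1" ;;
    *)   printf '%s.%s\n' "$1" "$DOMAIN_SUFFIX" ;;
  esac
}

# Ask Pi-hole for the A record and print what it returns.
# Returns non-zero when Pi-hole has no answer for the name.
check_record() {
  name=$(fqdn "$1")
  answer=$(dig +short A "$name" @"$PIHOLE_IP")
  if [ -n "$answer" ]; then
    printf '%s -> %s\n' "$name" "$answer"
  else
    printf '%s -> NO ANSWER\n' "$name"
    return 1
  fi
}

# Usage:
#   check_record pve1
#   check_record nextcloud.in.torresvault.com
```

If `check_record` returns an answer but the client still fails, the client is probably not using Pi-hole as its resolver or is missing the domain suffix.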
---
Symptoms: Service loads internally but not externally, SSL failures.
* NPM → Proxy Hosts → ensure the entry exists
* Confirm the Forward IP matches the VM's fixed IP
* Check that "Force SSL" isn't breaking internal-only apps
* Re-request SSL using the DNS challenge
* Make sure the Cloudflare DNS record is proxied (orange cloud)
* Ensure firewall/port forwarding on the UCG Max routes 80/443 correctly
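
To tell "NPM is unreachable" apart from "NPM is up but the backend is down", it helps to look at the HTTP status code and the certificate actually being served. A rough sketch, assuming `curl` and `openssl` are available; the function names are illustrative, and the status grouping is a rule of thumb rather than anything NPM defines.

```shell
#!/bin/sh
# Print the HTTP status code for a URL. curl prints 000 when the connection fails.
http_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Show issuer and validity dates of the certificate a host serves on port 443.
cert_dates() {
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -issuer -dates
}

# Rough interpretation of a status code (rule of thumb only).
classify_status() {
  case "$1" in
    2??|3??) echo ok ;;
    000)     echo unreachable ;;      # DNS, firewall, or port forwarding
    5??)     echo upstream-error ;;   # proxy answered, backend did not (e.g. 502)
    *)       echo check-config ;;     # e.g. 403/404: wrong host entry or access list
  esac
}

# Usage (run once from inside the LAN and once from outside):
#   classify_status "$(http_status https://nextcloud.in.torresvault.com)"
#   cert_dates nextcloud.in.torresvault.com
```

An `upstream-error` result usually points at the Forward IP or port in the proxy host entry; `unreachable` from outside only points at Cloudflare or the 80/443 forwarding.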
---
Symptoms: Public URIs not working, SSL challenge failures.
* Confirm A records:
* Ensure Cloudflare API token used by NPM is valid
* Check if record is Proxied vs DNS-only
* Recreate DNS record
* Rotate Cloudflare API token and update NPM
* Purge cached DNS
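
The token and record checks can be done from the command line. A hedged sketch: it uses Cloudflare's token verification endpoint (`/client/v4/user/tokens/verify`) and a public lookup against `1.1.1.1`. The helper names and the idea of comparing the answer with your WAN IP are assumptions for illustration; you supply the token and the WAN IP yourself.

```shell
#!/bin/sh
# Ask Cloudflare whether an API token is valid. Prints the JSON response.
verify_cf_token() {
  curl -s -H "Authorization: Bearer $1" \
    https://api.cloudflare.com/client/v4/user/tokens/verify
}

# Look up the public A record through Cloudflare's resolver.
public_a() {
  dig +short A "$1" @1.1.1.1
}

# Compare the resolved IP with your WAN IP.
#   args: resolved_ip wan_ip
# A DNS-only record resolves to your WAN IP. A proxied record resolves to a
# Cloudflare address, so a mismatch means "proxied" or "pointing somewhere old".
record_mode() {
  if [ -z "$1" ]; then
    echo missing
  elif [ "$1" = "$2" ]; then
    echo dns-only
  else
    echo proxied-or-stale
  fi
}

# Usage (hostname and WAN IP are placeholders):
#   verify_cf_token "$CF_API_TOKEN"
#   record_mode "$(public_a app.example.com | head -n 1)" "203.0.113.10"
```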
---
Symptoms: Container won't start, port conflicts, permission errors.
* Check logs:
docker logs -f <container>
* Validate docker-compose.yml:
docker compose config
* Confirm folders exist under ~/apps/<app>/
* Rebuild:
docker compose down && docker compose up -d
* Fix wrong port exposures
* Ensure user belongs to docker group
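
Port conflicts and group membership are quick to confirm before rebuilding anything. A small sketch, assuming a Linux Docker host with `ss` (from iproute2) and `awk`; both function names are illustrative.

```shell
#!/bin/sh
# Read `ss -ltn` output on stdin; succeed if some socket listens on TCP port $1.
# The port is taken as the last colon-separated field of the local address,
# which also works for IPv6 entries such as [::]:8080.
listening_on() {
  awk -v p="$1" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END    { exit !found }
  '
}

# Succeed if the given user (default: current user) is in the docker group.
in_docker_group() {
  id -nG "${1:-$(id -un)}" | tr ' ' '\n' | grep -qx docker
}

# Usage:
#   ss -ltn | listening_on 8080 && echo "port 8080 is already taken"
#   in_docker_group || echo "add yourself with: sudo usermod -aG docker \$USER"
```

After adding a user to the docker group, log out and back in before testing again, since group membership is read at login.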
---
Symptoms: VM traffic lost, backups failing, cluster partial.
* Check storage health: Datacenter → Storage
* Check logs for `log_vg-log` filling up
* Ensure corosync links (10.10.10.0/30) are up
* Restart Proxmox Backup services:
systemctl restart proxmox-backup
systemctl restart proxmox-backup-proxy
* Clean old logs only after reviewing them
* Verify correct NIC mappings after reboots
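
A quick health pass on a node can cover services, cluster state, and log volume usage in one go. This is only a sketch: it assumes you run it on a Proxmox node, and it assumes the `log_vg-log` volume is mounted at `/var/log`, which may not match your layout. The function names are illustrative.

```shell
#!/bin/sh
# Print the state of each core Proxmox service (run on a Proxmox node).
pve_services() {
  for svc in pveproxy pvedaemon pvescheduler corosync; do
    printf '%-14s %s\n' "$svc" "$(systemctl is-active "$svc" 2>/dev/null)"
  done
}

# Read `df -P <path>` output on stdin and print the usage as a bare number.
used_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage on a node:
#   pve_services
#   pvecm status                     # quorum and cluster membership
#   corosync-cfgtool -s              # per-link corosync status
#   df -P /var/log | used_percent    # assumption: log_vg-log is mounted at /var/log
```

Anything other than `active` in the service list, or a log volume above roughly 90%, is worth fixing before you chase backup or cluster symptoms.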
---
Symptoms: Automations failing, sensors offline, dashboards broken.
* Developer Tools → States
* Check whether entities are `unavailable`
* Validate YAML with Check Configuration
* Reload automations
* Restart HA Core
* Fix entity renames after device migrations
---
Symptoms: Player offline, playlist not starting, show not responding.
* Verify FPP is reachable at: `http://192.168.60.55`
* Check output status logs
* Confirm GPIO or Kulp port fuses (10A) aren't tripped
* Restart FPP service
* Reset playlist
* Re-sync with HA if using ha-fpp integration
---
If something breaks, follow this order:
1. Is the device reachable? (Ping → UniFi → Physical layer)
2. Is DNS resolving? (Pi-hole A records → Cloudflare)
3. Is the reverse proxy routing correctly? (NPM logs)
4. Are containers healthy? (Docker logs)
5. Are Proxmox services running? (pveproxy, pvedaemon, pvescheduler)
6. Is this an automation/device-specific issue? (Home Assistant)
This top-down approach solves 95% of the issues you've encountered.
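
The first three steps of this order can be run from any Linux machine on the LAN as one script that stops at the first failure. A minimal sketch, assuming `ping` (iputils), `dig`, and `curl` are installed; `run_step`, `resolves`, and `triage` are made-up helper names. Steps 4 to 6 still need to be checked on the Docker host, the Proxmox node, and in Home Assistant.

```shell
#!/bin/sh
# Run one check quietly and report OK or FAIL.
#   args: label, then the command to run
run_step() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'OK    %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    return 1
  fi
}

# Succeed if the name resolves to at least one A record.
resolves() {
  [ -n "$(dig +short A "$1")" ]
}

# Walk steps 1 to 3 in order; `&&` stops at the first failing step.
#   args: ip-address, hostname
triage() {
  run_step "1. reachable" ping -c 2 -W 2 "$1" &&
  run_step "2. DNS"       resolves "$2" &&
  run_step "3. proxy"     curl -fsS -m 10 -o /dev/null "https://$2"
}

# Usage (IP is a placeholder):
#   triage 192.168.1.50 nextcloud.in.torresvault.com
```

The first `FAIL` line tells you which section of this page to open.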
---
Document any new fix you discover here so your wiki becomes the single source of truth for TorresVault.