====== Proxmox Architecture – TorresVault 2.0 (Current State, 2026) ======

This document describes the **current**, single-node Proxmox architecture powering TorresVault 2.0.

This page replaces and supersedes all references to:

  * ''pve1''
  * ''pve2''
  * The old 2-node cluster
  * The Raspberry Pi qdevice
  * All Intel-based legacy hardware
  
All of that hardware has been decommissioned.

The sole hypervisor is now:

==== ▶ PVE-NAS (192.168.1.153) ====

Running on **enterprise-grade Ryzen hardware** with **TrueNAS virtualized via HBA passthrough**, acting as the centralized compute and storage backbone for TorresVault 2.0.

Future expansions (backup NAS, mini-PC cluster, GPU with Jarvis, Flex 10G, etc.) will be documented on a separate roadmap page.
  
----
  
====== 1. High-Level Overview ======

==== Hypervisor Platform ====

  * **Proxmox VE 9.x**
  * **Single-node design (no cluster)**
  * System name: **''pve-nas''**
  * Management IP: **192.168.1.153**
  * IPMI: **192.168.1.145**
  
==== Storage Layer (under TrueNAS VM) ====

  * 8 × Samsung PM863 **1.92 TB enterprise SSDs** passed directly to TrueNAS via HBA
  * TrueNAS manages all storage pools
  * PVE-NAS uses:
    * NVMe mirror → Proxmox OS
    * 1.9 TB SSDs → VM storage
    * ZFS replication & snapshots inside TrueNAS
    * PBS nightly backups
  
==== Backup Layer ====

  * **PBS VM** on PVE-NAS
  * Writes into **pbs-main** datastore on TrueNAS
  
==== Workload Layer ====

Core services:

  * Immich
  * Nextcloud
  * Jellyfin
  * Web landing page
  * NPM reverse proxy
  * Prometheus / Grafana
  * Kuma
  * Wiki
  * n8n automations
  
==== Automation Layer ====

  * Home Assistant (external Pi 4)
  * BLE tracking
  * FPP (192.168.60.55)
  * WLED (including car warning system)

This is currently the entire virtualization footprint for TorresVault 2.0.
  
----
  
====== 2. Physical Host: PVE-NAS ======
  
==== Hardware Summary ====
  
^ Component ^ Details ^
| **Motherboard** | ASRock Rack **X570D4U-2L2T** |
| **CPU** | AMD Ryzen 7 5700G — 8 cores / 16 threads |
| **RAM** | **64 GiB DDR4 ECC** |
| **Boot** | 2 × NVMe SSD (ZFS mirror) |
| **VM Storage** | 2 × Samsung PM863 1.92 TB SSD (Proxmox local storage) |
| **HBA** | 1 × LSI IT-mode HBA (passed through to TrueNAS) |
| **TrueNAS Pool Drives** | 8 × Samsung PM863 1.92 TB SSD (full passthrough) |
| **Networking** | 2 × 1 GbE + 2 × 10 GbE (Intel X550) |
| **IPMI** | 192.168.1.145 |
  
This is now the **single most powerful and consolidated host** in TorresVault.
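
Passthrough of the LSI HBA to the TrueNAS VM can be checked or reproduced from the PVE-NAS shell. The sketch below is illustrative only: the PCI address is a placeholder (confirm it with ''lspci''), and ''pcie=1'' assumes the VM uses the q35 machine type.

<code bash>
# Find the HBA's PCI address (address below is a placeholder)
lspci -nn | grep -i 'lsi\|sas'

# Pass the whole HBA through to the TrueNAS VM (ID 108)
qm set 108 --hostpci0 0000:01:00.0,pcie=1

# Confirm the passthrough entry in the VM config
qm config 108 | grep hostpci
</code>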
  
----
  
====== 3. Network Design ======
  
Proxmox sees only the main LAN and storage networks you define.
  
==== Management & LAN ====
  
^ Interface ^ IP ^ Purpose ^
| **vmbr0** | 192.168.1.153 | Main LAN bridge & VM network |
| **eno1 / eno2** | (bridged) | 1 GbE LAN & VM connectivity |
| **ens1f0 / ens1f1** | (available) | Dual 10 GbE for future storage network / Flex 10G |
  
==== VLANs (available to VMs) ====
  
  * VLAN10 – User
  * VLAN20 – IoT
  * VLAN50 – IoT+
  * VLAN60 – Lighting
  * VLAN30 – Guest

All VLANs are managed via UniFi.
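
To place a VM on one of these VLANs, its virtual NIC gets a tag on ''vmbr0''. A minimal sketch (the VM ID and tag are arbitrary examples, it assumes the upstream UniFi port carries the tagged VLANs, and rewriting ''net0'' without an explicit MAC generates a new one):

<code bash>
# Example: put iperf VM 113 on VLAN20 (IoT)
qm set 113 --net0 virtio,bridge=vmbr0,tag=20

# Verify the NIC configuration
qm config 113 | grep net0
</code>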
  
==== IPMI ====
  
  * 192.168.1.145\\ Always available even if Proxmox is offline.
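
Out-of-band checks go straight to the BMC, so they work even when the hypervisor is down. A rough sketch using ''ipmitool'' (the credentials below are placeholders):

<code bash>
# Power state via IPMI-over-LAN (user/password are placeholders)
ipmitool -I lanplus -H 192.168.1.145 -U admin -P 'changeme' chassis power status

# Sensor readings (temperatures, fans, voltages)
ipmitool -I lanplus -H 192.168.1.145 -U admin -P 'changeme' sdr elist
</code>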
  
----
  
====== 4. Storage Architecture (Current) ======
  
There are three main storage components:
  
----
  
===== 4.1 Proxmox Local Storage (OS + VM disks) =====
  
^ Storage Name ^ Description ^ Backed By ^
| ''local'' | ISOs, templates | NVMe mirror |
| ''local-lvm'' | VM disks | NVMe mirror |
| ''ssd-backups'' | Local staging | 1.92 TB PM863 SSD |
| ''immich-nfs'' | Immich share | TrueNAS |
| ''nas-zfs'' | ZFS datasets | TrueNAS |
| ''nas-local'' | VM backups / misc | TrueNAS |
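
The same layout can be confirmed from the PVE-NAS shell. The second command is only a sketch of how a TrueNAS-backed NFS storage such as ''immich-nfs'' could be defined — the export path shown is a placeholder for the real dataset path on TrueNAS:

<code bash>
# Show all configured storages, their type and usage
pvesm status

# Sketch: add an NFS storage exported by the TrueNAS VM
# (export path is a placeholder)
pvesm add nfs immich-nfs --server 192.168.1.108 \
    --export /mnt/ssd-pool/immich --content images
</code>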
  
----
  
===== 4.2 TrueNAS VM (ID 108) =====
  
^ Component ^ Details ^
| **Disks** | 8 × PM863 1.92 TB SSDs via HBA passthrough |
| **Role** | All primary storage for Immich, Nextcloud, Jellyfin, PBS datastore |
| **IP** | 192.168.1.108 |
| **Pools** | ''ssd-pool'', ''temp-pool'', ''hdd-pool'' (if present) |
  
TrueNAS acts as the **central storage authority**.
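
Pool health lives in the TrueNAS UI, but it can also be checked quickly from the TrueNAS shell — a small sketch using the pool names from the table above:

<code bash>
# Run inside the TrueNAS VM, not on the Proxmox host
zpool status -x        # prints "all pools are healthy" when nothing is wrong
zpool list             # capacity, usage and fragmentation per pool
zpool status ssd-pool  # detailed view of the primary SSD pool
</code>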
  
----
  
===== 4.3 Proxmox Backup Server VM (ID 105) =====
  
^ Component ^ Details ^
| **Datastore** | ''pbs-main'' |
| **Backed By** | TrueNAS |
| **Backed Up?** | **NO** (PBS never backs up itself) |
  
PBS backs up:
  
  * Immich
  * Nextcloud
  * Jellyfin
  * Web / Wiki
  * Prometheus / Kuma
  * n8n
  * NPM
  
**Excluded:**
  
  * PBS (cannot back itself up)
  * TrueNAS VM (contains the datastore itself)
  
----
  
====== 5. Workload Layout (Current) ======
  
==== VMs Hosted on PVE-NAS ====
  
^ VM ID ^ Name ^ Purpose ^
| 100 | web | TorresVault home page |
| 101 | Kuma | Uptime monitoring |
| 102 | next | Nextcloud |
| 103 | immich | Photo/video backup |
| 104 | jellyfin | Media server |
| 105 | pbs | Backup server |
| 106 | n8n | Automations |
| 107 | npm | Reverse proxy |
| 108 | truenas | Core storage |
| 110 | Prometheus | Monitoring |
| 116 | wiki | DokuWiki |
| 111 / 112 / 113 / 114 | iperf-x | VLAN testing tools |
  
Everything is now consolidated on **one hypervisor**.
  
----
  
====== 6. Backup Strategy ======
  
==== Nightly PBS Backup Jobs ====
  
Backed up nightly:
  
  * Core services (web, Nextcloud, Immich, Jellyfin)
  * Monitoring stack
  * Wiki
  * n8n
  * NPM
  * Portainer (if re-added)
  * All iperf lab images (optional)
  
==== Excluded ====
  
  * **TrueNAS VM** (contains the datastore)
  * **PBS VM** (cannot self-backup)
  * **VMs with external data stores** (e.g., Nextcloud files on TrueNAS)
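
A one-off run equivalent to the nightly job can be kicked off from the PVE-NAS shell. This is only a sketch — the real schedule and VM selection live under Datacenter → Backup, and the VM ID here is just an example (TrueNAS 108 and PBS 105 are simply never selected):

<code bash>
# Manually back up a single VM (103 = immich) to the PBS datastore
vzdump 103 --storage pbs-main --mode snapshot
</code>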
  
==== Restore Flow ====
  
  - In PBS: pick snapshot
  - Restore to local-lvm or ZFS
  - Boot VM
  - Validate with service health checks
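
The same flow can be driven from the PVE-NAS shell. A minimal sketch — the snapshot timestamp is a placeholder, so copy the exact volume ID from the PBS content view:

<code bash>
# List snapshots available on the PBS-backed storage
pvesm list pbs-main

# Restore a snapshot of VM 103 (immich) onto local-lvm
# (timestamp below is a placeholder)
qmrestore pbs-main:backup/vm/103/2026-02-20T03:00:00Z 103 --storage local-lvm

# Boot it and confirm it is running
qm start 103 && qm status 103
</code>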
  
----
  
====== 7. Monitoring ======
  
The monitoring stack includes:
  
==== Node-level ====
  
  * Proxmox UI (graphs)
  * ZFS ARC graphs
  * IO delay graphs
  
==== Service-level ====
  
  * Prometheus (metrics)
  * Grafana dashboards
  * Kuma for ping/HTTP checks
  
==== Storage-level ====
  
  * TrueNAS SMART monitoring
  * PBS datastore stats
  * Verification/prune jobs
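
Verification and garbage collection normally run as scheduled jobs in the PBS UI, but they can also be triggered by hand — a sketch, run from the shell of the PBS VM (105):

<code bash>
# Verify all snapshots in the datastore
proxmox-backup-manager verify pbs-main

# Start garbage collection and watch its progress
proxmox-backup-manager garbage-collection start pbs-main
proxmox-backup-manager garbage-collection status pbs-main
</code>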
  
==== Network-level ====
  
  * UniFi metrics
  * HA sensors & automations
  
----
  
====== 8. Operations ======
  
==== Power-Down Order ====
  
  - Apps (Immich, Nextcloud, NPM, web, wiki)
  - Monitoring (Prometheus, Kuma)
  - PBS
  - TrueNAS
  - PVE-NAS
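
A scripted version of this order might look like the sketch below. The VM IDs follow the table in section 5 and the 120-second timeout is arbitrary — adjust both to taste:

<code bash>
#!/bin/bash
# Graceful shutdown in dependency order: apps → monitoring → PBS → TrueNAS
for vmid in 103 102 104 107 100 116 106 110 101 105 108; do
    echo "Shutting down VM ${vmid}..."
    qm shutdown "${vmid}" --timeout 120
done

# Finally power off the hypervisor itself
shutdown -h now
</code>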
  
==== Power-Up Order ====
  
  - Network gear
  - PVE-NAS
  - TrueNAS
  - PBS
  - Core apps
  - Monitoring
  
==== Updating Proxmox ====

<code bash>
apt update && apt full-upgrade
reboot
</code>
  
----
  
====== 9. Risks & Constraints ======
  
  * **Single-node setup** (no HA)
  * **TrueNAS + PBS + all VMs** on the same hardware = consolidated risk
  * **No shared storage**
  * **Heavy workloads can spike RAM** (currently ~65% used at steady state)
  * Future GPU / AI workloads may require more RAM
  
These trade-offs are acceptable for home lab use.
  
----
  
====== 10. Future Upgrades (TorresVault 2.0 Roadmap) ======
  
  * Add Flex 10G for a 10GbE uplink
  * Build a **backup NAS** with matching SSDs
  * Add a **UM890 Pro mini-PC cluster**
  * Add a **Jarvis AI GPU node**
  * Scale out the TrueNAS pool to 10–11 SSDs
  * Offload PBS to dedicated hardware
  * Move Home Assistant to a VM
  * Add ZFS replication from the primary NAS to the backup NAS
  
The roadmap page will detail these plans further.
  