====== Proxmox virtualization cluster ======

  * LXC images collection: http://download.proxmox.com/images/system/

===== Storage =====

===== Network structure =====

===== System configuration =====

==== Bootloader ====

=== Console redirection, VSP, IPMI SoL ===

At least when the installation is done over a serial console, the Proxmox installer configures the system in a very useful manner, so that both the bootloader and the kernel appear on the serial and the VGA console:

  GRUB_TERMINAL_INPUT="console serial"
  GRUB_TERMINAL_OUTPUT="gfxterm serial"
  GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200"
  GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX console=ttyS0,115200"

This allows convenient later access to the system via IPMI Serial-over-LAN. The inconvenient part is that it diverts kernel boot messages (''kmsg'') away from the VGA console. It is possible to show them on both the VGA console and the serial TTY, for which we create yet another file:

  GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX console=tty0"

I also don't like the default "quiet" option for boot messages, so I override it:

  GRUB_CMDLINE_LINUX_DEFAULT=""

In case of installation over Debian, where the Proxmox installer did not run, all this setup needs to be replicated:

  GRUB_CMDLINE_LINUX_DEFAULT=""
  GRUB_CMDLINE_LINUX="console=tty0 console=ttyS1,57600n8"
  GRUB_TERMINAL="serial console"
  GRUB_SERIAL_COMMAND="serial --speed=57600 --unit=1 --word=8 --parity=no --stop=1"

A different COM port and a different speed are used here, just for illustration. Note also that instead of separate ''GRUB_TERMINAL_INPUT'' and ''GRUB_TERMINAL_OUTPUT'' I use a single setting for both.

==== Fast reboots with kexec ====

FIXME this needs more work

  * https://forum.proxmox.com/threads/tip-fast-reboots-with-kexec.35624/
  * https://forum.proxmox.com/threads/proxmox-7-fast-reboot-with-kexec.93422/

==== Time sync ====

Proxmox recommends using ''chrony'' for NTP synchronization and **advises against** ''systemd-timesyncd''.
To configure NTP servers, create an additional file in the drop-in directory:

  echo 'server 10.226.130.130 iburst' > /etc/chrony/sources.d/local-ntp-server.sources
  echo 'server 10.226.130.131 iburst' >> /etc/chrony/sources.d/local-ntp-server.sources
  chronyc reload sources

==== ECC error notifications ====

Install the `rasdaemon` utility to receive reports from the hardware via the EDAC interface and get them into the logs. See [[https://www.setphaserstostun.org/posts/monitoring-ecc-memory-on-linux-with-rasdaemon/]]

==== Backup ====

It is useful to make ''zstd'' the default, since it both **completes faster** and **compresses better** than ''gzip''. We also want it to use multiple cores. For that, new systems should have at least the following in ''/etc/vzdump.conf'':

  compress: zstd
  pigz: 1
  zstd: 0

For ''zstd'', ''0'' means "use half of the available cores". For ''pigz'' the same meaning is expressed by ''1''; ''0'' would leave pigz disabled and fall back to single-threaded gzip. If you know the number of cores in the target system, you can use other, more tailored settings here. We set ''pigz'' too, just in case somebody switches to gzip.

**Note:** it's possible to [[https://forum.proxmox.com/threads/reading-blob-files-qemu-server-conf-blob.95551/|read blob files]] directly on the server, either with `proxmox-backup-debug` or "by hand":

  proxmox-backup-debug inspect file /path/to/blob --decode -
  dd if=/path/to/blob bs=1 skip=12 | zstdcat

=== Node backup ===

Hosts may need backup, too. In theory, "thin" backups containing only what is barely needed for recovery would suffice, as explained here: https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster#Re-installing_a_cluster_node . Still, I find it useful to just do a full host backup. It will not be very large (in my experience, around 5 GB) and it will be strongly deduplicated (between backups of the same node and between nodes, since they are similar), so why bother?
It is useful to create a simple shell script and run it, say, monthly:

  #!/usr/bin/bash
  
  # Connection settings for the backup server (fill in for your site)
  export PBS_FINGERPRINT=
  export PBS_REPOSITORY=@:
  export PBS_PASSWORD=
  NS=
  NOTES=$(hostname -f)
  
  # Keep the client log in RAM-backed temporary storage
  TMP=$(mktemp -d -p /dev/shm)
  
  if mountpoint -q /boot/efi
  then
      # for modern UEFI boot
      proxmox-backup-client backup --ns "${NS}" root.pxar:/ pve.pxar:/etc/pve exp.pxar:/boot/efi 2>&1 | tee "${TMP}/client.log"
  else
      # for legacy BIOS boot
      proxmox-backup-client backup --ns "${NS}" root.pxar:/ pve.pxar:/etc/pve sda1.img:/dev/sda1 2>&1 | tee "${TMP}/client.log"
  fi
  
  # Find the snapshot that was just created, then attach the log and a note to it
  SNAPSHOT=$(grep "Starting backup:" "${TMP}/client.log" | cut -d':' -f 3-)
  proxmox-backup-client snapshot upload-log --ns "${NS}" "${SNAPSHOT}" "${TMP}/client.log"
  rm -rf "${TMP}"
  proxmox-backup-client snapshot notes update --ns "${NS}" "${SNAPSHOT}" "${NOTES}"

This matches the disk structure that the Proxmox installer creates:

  * the partition table is GPT
  * ''sda1'' is a ''bios_grub'' partition of 1 MiB minus 34 sectors; it's used only on legacy systems
  * ''sda2'' is an ESP of 1 GiB mounted as ''/boot/efi''; it's used only on UEFI systems
  * ''sda3'' has everything else: it's either LVM, ZFS or BTRFS
  * ''/etc/pve'' is the Proxmox configuration file system, mounted with FUSE. The client does not cross mount points by default, so it needs a dedicated clause for its contents to be backed up as files. The data is already captured in ''root.pxar'', because it actually lives inside the SQLite file ''/var/lib/pve-cluster/config.db'', but to use that copy we would need to mount it, which could be tricky in the event of a disaster, so for convenience we back up its files too

There is no need to back up other copies of the ESP or ''bios_grub'' partitions (e.g. ''/dev/sdb1'' and ''/dev/sdb2'' in case of "software RAID"); one copy is enough.

If the node was installed by converting a Debian bookworm system, you need to adjust the backup command accordingly.
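
The snapshot lookup in the script above (''grep'' plus ''cut -d':' -f 3-'') depends on the exact shape of the "Starting backup:" line. Below is a minimal sketch of a somewhat more tolerant variant. The function name ''snapshot_from_log'' is hypothetical, and the two log line shapes described in the comments are assumptions about the client output, so check them against a real ''client.log'' before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the snapshot path announced in a client log.
# Assumption: the log contains a line shaped like one of
#   Starting backup: [<namespace>]:host/<name>/<timestamp>
#   Starting backup: host/<name>/<timestamp>
# The sed expression strips the fixed prefix and the optional "[...]:" part.
snapshot_from_log() {
    sed -E -n 's/^Starting backup: (\[[^]]*\]:)?//p' "$1" | head -n 1
}
```

In the script it would be used as ''SNAPSHOT=$(snapshot_from_log "${TMP}/client.log")''. Unlike the ''cut'' version, it does not depend on the number of colons before the snapshot path.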
If you find this too wasteful, read this thread and invent your own backup script: https://forum.proxmox.com/threads/backup-and-restore-node.115161/

===== Debian repositories =====

==== Proxmox Virtualization Environment ====

Proxmox has a very good guide on installing PVE on top of standard Debian bookworm, available here: https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm . Alternatively, the installation can be done from the ISO they distribute.

This is the package source file, which also includes the Ceph repo that is omitted from the guide above:

  deb [arch=amd64] http://download.proxmox.com/debian/pve bookworm pve-no-subscription
  deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription

Either way, it installs ''ceph.list'' and ''pve-enterprise.list'' into ''/etc/apt/sources.list.d/''. These require a subscription and will make future runs of ''apt update'' fail. You need either to configure a subscription and disable the sources in ''pve-install-repo.list'', or, if going without a subscription is desired, to disable the sources in ''ceph.list'' and ''pve-enterprise.list''.

==== Proxmox Backup Server ====

Proxmox has extensive documentation about it, available at https://pbs.proxmox.com/docs/

If another Proxmox product is installed on the same host, you don't need to install the keys again. Otherwise, run the following to add the Proxmox repository signing key:

  wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg -O /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg

PVE and PBS have the client included; you don't need to do anything to use the client.
=== Server ===

To get access to the server packages, you may start with the following repo:

  deb http://download.proxmox.com/debian/pbs bookworm pbs-no-subscription

Then, update the package lists and install the server package, either ''proxmox-backup'' (generally, do this) or ''proxmox-backup-server'' (enough when installing on a PVE host, but using the general way doesn't hurt either):

  apt update
  apt install proxmox-backup

It will install the file ''/etc/apt/sources.list.d/pbs-enterprise.list'', which references the subscription repository. That will make future runs of ''apt update'' fail. If you have a subscription, configure it and disable the source in ''pbs-install-repo.list''. If going without a subscription is desired, disable the repo in ''pbs-enterprise.list'' instead.

=== Client ===

The client is available for the Debian ''bookworm'', ''bullseye'' and ''buster'' releases. Use the following source:

  deb http://download.proxmox.com/debian/pbs-client bookworm main

Then, update the package lists and install the ''proxmox-backup-client'' package:

  apt update
  apt install proxmox-backup-client

==== HP Software Delivery Repository - Management Component Pack ====

On HPE servers it's recommended to add their SDR MCP repository to get access to useful tools for administering and monitoring the hardware: https://downloads.linux.hpe.com/SDR/project/mcp/

**Don't use** the [[https://downloads.linux.hpe.com/SDR/keys.html|procedure described by HP]] to add the keys into the keyring with ''apt-key''. It's deprecated in Debian. The modern scheme is to put keys as separate files into ''/etc/apt/trusted.gpg.d'', either as ASCII-armored ''.asc'' files or as binary ''.gpg'' keyrings.
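
Since the file suffix has to match the key encoding (text or binary), it can help to check a downloaded key before renaming it. The following is a minimal sketch; the function name ''key_suffix'' is hypothetical, and it only looks at the first line of the file for the standard OpenPGP armor header, treating everything else as binary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the suffix a key file should get
# in /etc/apt/trusted.gpg.d ("asc" for armored text, "gpg" otherwise).
key_suffix() {
    if head -n 1 "$1" | grep -q -- '^-----BEGIN PGP PUBLIC KEY BLOCK-----'; then
        echo asc
    else
        echo gpg
    fi
}
```

For example, ''key_suffix downloaded-key.pub'' (the file name is just a placeholder) prints ''asc'' for an armored key. It does not verify that the file is a valid key at all, only which of the two encodings it resembles.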
It's enough to **download only the latest HPE key file** into the mentioned directory and rename it to have the correct suffix:

  curl https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub -o /etc/apt/trusted.gpg.d/hpePublicKey2048_key1.asc

(''wget'' could be used instead of ''curl'')

At the time of writing, there is support for:

  * ''bookworm'': 12.80 (2023-09-05)
  * ''buster'', ''bullseye'': 12.20 (2021-10-04), 12.30 (2021-12-06), 12.40 (2022-06-03)

The example configuration looks like the following:

  deb http://downloads.linux.hpe.com/SDR/repo/mcp bookworm/12.80 non-free

Afterwards, it's advisable to install the most useful packages, which are:

  * ''ssacli'': a command-line interface to configure HP SmartArray RAID controllers and HBAs. It replaces the ''hpssacli'' and ''hpacucli'' packages and the usage is the same (it's the same program, renamed during rebranding)
  * ''hponcfg'': a program to retrieve a configuration from iLO and upload it back. Although some configuration can be done using standard IPMI tools like ''ipmiutil'', other things can only be configured using this proprietary tool

  apt update
  apt install ssacli hponcfg

**Also**, it might be useful to have this repository on some modern Dell and Supermicro hardware, because it distributes the `storcli` tool:

  apt update
  apt install storcli

==== HW RAID ====

On Dell and Supermicro servers we generally have an LSI/Avago/Broadcom MegaRAID card (rebranded as PERC by Dell). To administer and monitor it from the OS, we need the `megacli` and `storcli` tools. Some servers use Adaptec cards.

The `storcli` package is available in the HP SDR MCP repository (see above); [[https://hwraid.le-vert.net/wiki/DebianPackages|other essential tools]], namely `megacli`, are available in the HWRAID repo at https://hwraid.le-vert.net/ .

This repository recommends the **deprecated** way of adding a signing key to the system with ''apt-key''.
Instead, you need to **download the key file** and save it as an ''.asc'' file in the ''/etc/apt/trusted.gpg.d'' directory:

  wget -O /etc/apt/trusted.gpg.d/hwraid.le-vert.net.asc https://hwraid.le-vert.net/debian/hwraid.le-vert.net.gpg.key

(''curl'' could be used instead of ''wget'')

The repository supports the ''bookworm'', ''bullseye'' and ''buster'' releases. The example source file:

  deb http://hwraid.le-vert.net/debian bookworm main

Then, update the package lists and install the needed tools:

  apt update
  apt install megacli
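
When the same setup is repeated on hosts running different Debian releases, the source line can be generated instead of typed. This is only a sketch: the function name ''hwraid_source_line'' and the target file name in the usage note are hypothetical, and the list of accepted releases simply mirrors the three named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the apt source line for the HWRAID repo.
# Takes the release codename as an optional argument; without it, the
# codename is read from VERSION_CODENAME in /etc/os-release.
hwraid_source_line() {
    local codename="${1:-}"
    if [ -z "$codename" ]; then
        codename=$(. /etc/os-release && printf '%s' "${VERSION_CODENAME:-}")
    fi
    case "$codename" in
        bookworm|bullseye|buster)
            printf 'deb http://hwraid.le-vert.net/debian %s main\n' "$codename"
            ;;
        *)
            # Refuse releases that the text above does not list as supported
            echo "unsupported release: ${codename:-unknown}" >&2
            return 1
            ;;
    esac
}
```

A possible use is ''hwraid_source_line > /etc/apt/sources.list.d/hwraid.list'', followed by the ''apt update'' and ''apt install'' steps shown above.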