
TrueNAS

BIOS Settings

You can check the BIOS version with dmidecode -t bios -q

  1. Turn off all C-state and other power-saving features. These cause instability such as random freezes.
  2. Turn off boosting
  3. Enable XMP

Datasets, Snapshots, and Encryption

Periodic Snapshot Recommendations

Hourly Snapshots

  • Lifetime: 1 day
  • Naming Schema: hourly-%Y-%m-%d_%H-%M
  • Schedule: Hourly
  • Begin: 00:00:00
  • End: 23:59:00
  • Disallow taking empty snapshots
  • Enabled
  • Recursive

Assuming 100 datasets: 100 datasets x 24 hours = 2400 snapshots

Disallowing empty snapshots will help keep that number down.
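
To see how many snapshots a schema has actually produced, you can count them from the shell (a quick sketch; enc0 is an assumed pool name):

# Count hourly snapshots across the whole pool
zfs list -t snapshot -r -H -o name enc0 | grep -c '@hourly-'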

Daily Snapshots

  • Lifetime: 1 week
  • Naming Schema: daily-%Y-%m-%d_%H-%M
  • Schedule: Daily
  • Allow taking empty snapshots
  • Enabled
  • Recursive

Assuming 100 datasets: 100 datasets x 7 days = 700 snapshots

Replication Tasks

Before configuring, create a dataset that you'll be replicating to.

Use advanced settings.

  • Transport LOCAL

Source

  • Recursive
  • Include Dataset Properties
  • Periodic Snapshot Tasks: Select your daily task
  • Run automatically

Destination

  • Read-only Policy: SET
  • Snapshot Retention Policy: Custom
  • Lifetime: 1 month
  • Naming Schema: daily-%Y-%m-%d_%H-%M

Assuming 100 datasets: 100 datasets x 30 days = 3000 snapshots

Manually Create Named Snapshots

  1. Datasets -> Select dataset -> Create Snapshot -> Naming Schema (daily)
  2. Start replication from Data Protection
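
The same snapshot can also be taken from the shell (a sketch; enc0/smb is an assumed dataset, and the name follows the daily schema above so the replication task will pick it up):

zfs snapshot -r enc0/smb@daily-$(date +%Y-%m-%d_%H-%M)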

Migrating encrypted pools

Since you can't use -R to send encrypted datasets recursively, you'll need to use more creative tactics. Here's my recommendation:

  1. Save the datasets from a pool to a text file:

    export SNAPSHOT='@enc1-hourly-2025-03-05_09-00'
    export SEND_POOL=enc1
    export RECV_POOL=enc0
    export DATASETS_FILE=pool_datasets.txt
    
    zfs list -r -H -o name $SEND_POOL > $DATASETS_FILE
    
  2. Remove the source pool from the front of all the listed datasets. In vim, for example (see the sed one-liner after this list for a non-interactive alternative):

    :%s/^enc1\///
    
  3. Now you can run the following

    # Dry run
    for DATASET in $(cat $DATASETS_FILE); do echo "zfs send -v $SEND_POOL/$DATASET$SNAPSHOT | zfs recv $RECV_POOL/$DATASET"; done
    
    # Real thing
    for DATASET in $(cat $DATASETS_FILE); do zfs send -v $SEND_POOL/$DATASET$SNAPSHOT | zfs recv $RECV_POOL/$DATASET; done
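
As an alternative to the vim substitution in step 2, a sed one-liner can strip the pool prefix in place (assuming the same variables exported above):

sed -i "s|^$SEND_POOL/||" $DATASETS_FILE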
    

Migrating Properties

If you need to migrate your dataset comments, you can use the following bash to automate the task.

for i in $(zfs list -H -d 1 -o name backup/nvme/k3os-private); do read -r name desc < <(zfs list -H -o name,org.freenas:description $i) && pvc=$(echo "$name" | awk -F "/" '{print $NF}') && zfs set org.freenas:description="$desc" enc1/k3os-private/$pvc; done

Create and Destroy zfs Datasets

# Create a pool
zpool create rpool /dev/disk/by-id/disk-id

# Add a cache disk
zpool add backup cache /dev/sda

# Enable encryption
zpool set feature@encryption=enabled rpool

# Create a dataset
zfs create rpool/d1

# Create an encrypted dataset
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase rpool/d1

# Delete a dataset
zfs destroy rpool/d1

Create and send snapshots

export SEND_DATASET=enc0/vms/gitea-docker-runner-data
export RECV_DATASET=enc0/vms/gitea-docker-runner-data-sparse

# snapshot pool and all children
zfs snapshot -r $SEND_DATASET@now

# send all child snapshots
zfs send -R $SEND_DATASET@now | pv | zfs recv $RECV_DATASET

# use the -w raw flag to send encrypted snapshots
zfs send -R -w $SEND_DATASET@snapshot | pv | zfs recv $RECV_DATASET
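
# Optionally confirm the snapshots arrived on the destination
zfs list -t snapshot -r $RECV_DATASET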

Cleaning up old snapshots

If you want to delete every snapshot:

# Just in case, use tmux. This can take a while
tmux

# The pool you want to clean up
export POOL=backup0
# This can be anything, set it to something memorable
export SNAPSHOTS_FILE=enc0_mar2025_snapshots.txt

# Check the number of snapshots in the pool
zfs list -t snap -r $POOL | wc -l

# Save the list of snapshots to the snapshots file
zfs list -t snap -r -H -o name $POOL > $SNAPSHOTS_FILE

# Check the file 
cat $SNAPSHOTS_FILE | less

# Dry run
for SNAPSHOT in $(cat $SNAPSHOTS_FILE); do echo "zfs destroy -v $SNAPSHOT"; done | less

# Real thing
for SNAPSHOT in $(cat $SNAPSHOTS_FILE); do zfs destroy -v $SNAPSHOT; done

Creating and restoring snapshots

# Take a snapshot
zfs list -d 1 enc1/vms
export ZFS_VOL='enc1/vms/Gambox1-z4e0t'
zfs snapshot $ZFS_VOL@manual-$(date --iso-8601)

# Restore a snapshot
zfs list -t snapshot $ZFS_VOL
export ZFS_SNAPSHOT='enc1/vms/Gambox1-z4e0t@init-no-drivers-2025-03-03_05-35'
zfs rollback $ZFS_SNAPSHOT

Filesystem ACLs

If you see something like "nfs4xdr_winacl: Failed to set default ACL on...":

Dataset -> Dataset details (edit) -> Advanced Options -> ACL Type (inherit)

# Remove all ACLs
setfacl -b -R /mnt/enc0/smb/media
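
# Optionally verify only the base owner/group/other entries remain
getfacl /mnt/enc0/smb/media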

Decrypting Pools

Unlock the pool through the UI. To do that you'll need to recreate the key manifest JSON. This is a little tedious, but your keys will be correct after this process.

# List all datasets and format them as JSON keys (fill in each key and remove the trailing comma from the last entry)
export LIST_DATASET=pool0/dcsi
echo "{" && \
for DATASET_PATH in $(sudo zfs list -r $LIST_DATASET -H -o name); do echo "    \"$DATASET_PATH\": \"key_here\","; done && \
echo "}"

# If the dataset's children have all the encryption keys
# Note this generates the cat EOF commands to create the json files needed to unlock.
# Slashes in dataset names are replaced with underscores so the generated filenames are valid.
export TL_DATASET=pool0
for TL_DATASET_PATH in $(zfs list -r $TL_DATASET -H -o name -d 1); do \
echo "cat <<EOF > dataset_${TL_DATASET_PATH//\//_}_key.json" && \
echo "{" && \
for DATASET_PATH in $(zfs list -r $TL_DATASET_PATH -H -o name); do echo "    \"$DATASET_PATH\": \"key_here\","; done && \
echo "}" && \
echo "EOF";
done

ZPool Scrubbing

# Start a scrub
zpool scrub pool0

# Check status
zpool status pool0

iSCSI

Create ZVOL

  1. Create a new dataset called "iscsi" and then a dataset under that called "backups"
    1. Set sync to always
    2. Disable compression
    3. Enable Sparse
  2. Create a new zvol under backups with the same name as your server hostname (see the CLI sketch after this list)
  3. Set the size to something reasonable (Note you may need to "force size")
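
A rough CLI equivalent of the steps above (a sketch, not the UI's exact settings; the pool name enc0, hostname myhost, and 500G size are assumptions):

# -s makes the zvol sparse; sync/compression match the recommendations above
zfs create -o sync=always -o compression=off -s -V 500G enc0/iscsi/backups/myhost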

Create iSCSI Target

In Shares -> Block (iSCSI) Shares Targets

  1. Global Target Configuration -> Base Name
    1. Set the Base Name following the rules in the iSCSI Base Name section below
  2. Authorized Access -> Add
    1. Group ID arbitrary - just pick a number you haven't used
    2. Discovery Authentication: Chap
    3. User: The connecting machine's ISCSI Base Name
    4. Secret: A 16 character password with no special characters
  3. Extents -> Add
    1. Name: some-name
    2. Type: Device
    3. Device: The ZVOL you just created
    4. Sharing Platform: Modern OS
    5. Protocol Options Portal: Either create new (0.0.0.0 and ::) or select your existing portal
    6. Protocol Options Initiators: The base name of the connecting machine, following the rules in the iSCSI Base Name section below
  4. Targets -> Select the backup- target -> Edit
    1. Authentication Method: CHAP
    2. Authentication Group Number: The group number you created above

Troubleshooting

# ISCSI connection logs
tail -f /var/log/scst.log

VMs

  1. Force UEFI installation
  2. cp /boot/efi/EFI/debian/grubx64.efi /boot/efi/EFI/BOOT/bootx64.efi

Converting zvol to qcow2

# Convert zvol to raw
dd status=progress bs=1M if=/dev/zvol/enc0/vms/nextcloud-fi7tkq of=/mnt/enc0/vms/qcow2/nextcloud-fi7tkq.raw
# Convert raw to qcow2
qemu-img convert -f raw -O qcow2 /mnt/enc0/vms/qcow2/nextcloud-fi7tkq.raw /mnt/enc0/vms/qcow2/nextcloud-fi7tkq.qcow2

# Convert in batch
# Convert zvol to raw
for FILE in $(ls /dev/zvol/enc0/vms); do dd status=progress bs=1M if=/dev/zvol/enc0/vms/$FILE of=/mnt/enc0/vms/qcow2/$FILE.raw; done
# Convert raw to qcow2 (prints the commands as a dry run; remove the echo to actually run them)
for FILE in $(ls /dev/zvol/enc0/vms); do echo "qemu-img convert -f raw -O qcow2 /mnt/enc0/vms/qcow2/$FILE.raw /mnt/enc0/vms/qcow2/$FILE.qcow2"; done

Converting qcow2 to zvol

qemu-img convert -O raw -p /mnt/enc0/images/haos_ova-14.1.qcow2 /dev/zvol/enc1/vms/hass-Iph4DeeJ

Tunables

Core

sysctl kern.ipc.somaxconn=2048
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.recvspace=4194304
sysctl net.inet.tcp.sendspace=2097152
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216
sysctl net.inet.tcp.sendbuf_auto=1
sysctl net.inet.tcp.recvbuf_auto=1
sysctl net.inet.tcp.sendbuf_inc=16384
sysctl net.inet.tcp.recvbuf_inc=524288
sysctl vfs.zfs.arc_max=34359738368 # cap ARC at 32 GiB so it doesn't starve the VMs of memory
loader vm.kmem_size=34359738368 # set kmem_size to 32 GiB to force arc_max to apply
loader vm.kmem_size_max=34359738368  # set kmem_size_max to 32 GiB to sync with kmem_size

NIC options: "mtu 9000 rxcsum txcsum tso4 lro"

Scale

ARC Limit

Create an Init/Shutdown Script of type Command with the following:

# Limit to 8 GiB
echo 8589934592 >> /sys/module/zfs/parameters/zfs_arc_max

Set When to Post Init.
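
After the next boot you can confirm the limit applied:

cat /sys/module/zfs/parameters/zfs_arc_max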

Certs

Let's Encrypt

https://www.truenas.com/docs/scale/22.12/scaletutorials/credentials/certificates/settingupletsencryptcertificates/

Note: for all "Name" fields use your domain with every "." replaced with "-"

Example: driveripper.reeselink.com becomes driveripper-reeselink-com

  1. Go to Credentials > Certificates and click ADD in the ACME DNS-Authenticators widget
  2. Generate credentials for your domain via AWS IAM
  3. Click ADD in the Certificate Signing Requests widget
    1. Remember, only the SAN is required
  4. Click the wrench icon next to the new CSR
    1. In "Identifier" use domain-something-com-le-stage for staging and domain-something-com-le-prod for prod
  5. System -> General Settings -> GUI Settings -> GUI SSL Certificate -> domain-something-com-le-prod
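
Once the GUI certificate is switched you can check what's actually being served (a sketch assuming the driveripper.reeselink.com example above and that the GUI listens on 443):

openssl s_client -connect driveripper.reeselink.com:443 -servername driveripper.reeselink.com </dev/null 2>/dev/null | openssl x509 -noout -issuer -dates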

Self-signed CA

https://raymondc.net/2018/02/28/using-freenas-as-your-ca.html

  1. Create a new Root certificate (CAs -> ADD -> Internal CA)
    • Name: Something_Root
    • Key Length: 4096
    • Digest: SHA512
    • Lifetime: 825 (Apple's new requirement)
    • Extended Key Usage: Server Auth
    • Common Name: Something Root CA
    • Subject Alternate Names:
  2. Create a new intermediate certificate (CAs -> Add -> Intermediate CA)
    • Name: Something_Intermediate_CA
    • Key Length: 4096
    • Digest: SHA512
    • Lifetime: 825 (Apple's new requirement)
    • Extended Key Usage: Server Auth
  3. Create a new Certificate (Certificates -> Add -> Internal Certificate)
    • Name: Something_Certificate
    • Key Length: 4096
    • Digest: SHA512
    • Lifetime: 825 (Apple's new requirement)
    • Extended Key Usage: Server Auth

Testing

iperf

iperf3 -c mainframe -P 4
iperf3 -c mainframe -P 4 -R

iperf3 -c pc -P 4
iperf3 -c pc -P 4 -R

disk

# write 16GB to disk
dd if=/dev/zero of=/tmp/test bs=1024k count=16000
# divide result by 1000^3 to get GB/s

# read 16GB from disk
dd if=/tmp/test of=/dev/null bs=1024k
# divide result by 1000^3 to get GB/s

disk health

https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-black-ssd/product-brief-wd-black-sn750-nvme-ssd.pdf

# HDD
smartctl -a /dev/ada1 | grep "SMART Attributes" -A 18

# NVME
smartctl -a /dev/nvme1 | grep "SMART/Health Information" -A 17

Dead Disks

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD2003FZEX-00Z4SA0
Serial Number:    WD-WMC5C0D6PZYZ
LU WWN Device Id: 5 0014ee 65a5a19fc
Firmware Version: 01.01A01
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Feb 13 18:31:57 2021 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Corrupted data

One or more devices has experienced an error resulting in data corruption. Applications may be affected.

To get a list of affected files run:

zpool status -v

Stuck VMs

"[EFAULT] 'freeipa' VM is suspended and can only be resumed/powered off"

"virsh cannot acquire state change lock monitor=remoteDispatchDomainSuspend"

virsh -c "qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock" list
export VM_NAME=

# Try this first
virsh -c "qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock" resume $VM_NAME

# Or just destroy and start it again
virsh -c "qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock" destroy $VM_NAME
virsh -c "qemu+unix:///system?socket=/run/truenas_libvirt/libvirt-sock" start $VM_NAME

Mounting ZVOLS

Sometimes you need to mount zvols on the TrueNAS host. You can do this with the block device under /dev/zvol.

For simple operations:

export ZVOL_PATH=enc0/vms/gitea-docker-runner-data-sparse
mount --mkdir /dev/zvol/$ZVOL_PATH /tmp/$ZVOL_PATH

# If you need to create a filesystem
fdisk /dev/zvol/$ZVOL_PATH
mkfs.btrfs /dev/zvol/$ZVOL_PATH

For bulk operations:

for path in $(ls /dev/zvol/enc0/dcsi/apps/); do mount --mkdir /dev/zvol/enc0/dcsi/apps/$path /tmp/pvcs/$path; done
for path in $(ls /dev/zvol/enc1/dcsi/apps/); do mount --mkdir /dev/zvol/enc1/dcsi/apps/$path /tmp/pvcs/$path; done

# From driveripper
rsync --progress -av -e ssh \
    driveripper:/mnt/enc1/dcsi/nfs/pvc-ccaace81-bd69-4441-8de1-3b2b24baa7af/ \
    /tmp/transfer/ \
    --dry-run

# To Kube
rsync --progress -av --delete -e ssh \
    /tmp/transfer/ \
    kube:/opt/local-path-provisioner/ssd/pvc-4fca5cad-7640-45ea-946d-7a604a3ac875_minecraft_nimcraft/ \
    --dry-run
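
When the transfer is done, unmount everything again (a sketch mirroring the bulk mounts above):

for path in $(ls /tmp/pvcs/); do umount /tmp/pvcs/$path; done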

UPS Monitoring

First, you'll need to create a user with access to the UPS in System -> Services -> UPS. Under the Extra Users section, add a user like so:

[admin]
    password = mypass
    actions = set
    actions = fsd
    instcmds = all

Then you can run commands with upscmd

export UPS_USER=admin
export UPS_PASS=mypass

# Quick battery test
upscmd -u $UPS_USER -p $UPS_PASS ups test.battery.start.quick
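
To see which instant commands your particular UPS supports, NUT's upscmd can list them (the UPS name "ups" matches the command above):

upscmd -l ups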

ZFS Size Data

# jq -r is required otherwise the data will be invalid
zfs list -j enc0/vms -p -o available,used | \
jq -r --arg TIMESTAMP `date +%s` '"driveripper.vms.data.used " + .datasets[].properties.used.value + " " + $TIMESTAMP' | \
nc -N -4 yellow.reeselink.com 2003

ZFS Rename

Make sure you unshare any connected shares, otherwise you'll get "cannot unmount '/mnt/enc0/smb/reese_and_alica': pool or dataset is busy"

zfs rename enc0/something enc0/something_else

iSCSI

iSCSI Base Name

https://datatracker.ietf.org/doc/html/rfc3721.html#section-1.1

iqn.<year-month of domain registration>.<reversed domain>:<unique string>

iqn.2022-01.com.reeselink:driveripper

Archiving

  1. Create a recursive snapshot using the naming schema archive-%Y-%m-%d_%H-%M so it matches the replication task in step 2 (see the CLI sketch after this list)

  2. Create a replication task called "archive_pool_year_month_day"

    • select all datasets you want to back up
    • fill in the destination, e.g. enc0/archives/archive-year-month-day_hour-minute
    • full filesystem replication
    • select "Matching naming schema"
    • Use archive-%Y-%m-%d_%H-%M
    • Deselect run automatically
    • Save and run
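
For step 1, the snapshot can also be taken from the shell (a sketch; enc0 is an assumed source pool):

zfs snapshot -r enc0@archive-$(date +%Y-%m-%d_%H-%M)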

Deleting snapshots

Sometimes you need to delete many snapshots from a certain dataset. The UI is terrible for this, so we need to use zfs destroy. xargs is the best way to do this since it allows parallel processing.

# zfs list snapshots with:
# -o name: only print the name
# -S creation: sort by creation time
# -H: don't display headers
# -r: recurse through every child dataset
zfs list -t snapshot enc0/archives -o name -S creation -H -r

# pipe it through xargs with:
# -n 1: take only 1 argument from the pipe per command
# -P 8: eight parallel processes
# Also pass to zfs destroy:
# -v: verbose
# -n: dryrun
zfs list -t snapshot enc0/archives -o name -S creation -H -r | xargs -n 1 -P 8 zfs destroy -v -n

# if that looks good you can remove the "-n"
zfs list -t snapshot enc0/archives -o name -S creation -H -r | xargs -n 1 -P 8 zfs destroy -v

But First, ZFS on RPi

A really good backup server is an RPi running openzfs. See the openzfs docs for more info.

Pi Setup

Add the vault ssh CA key to your pi.

curl -o /etc/ssh/trusted-user-ca-keys.pem https://vault.ducoterra.net/v1/ssh-client-signer/public_key

echo "TrustedUserCAKeys /etc/ssh/trusted-user-ca-keys.pem" >> /etc/ssh/sshd_config

service ssh restart

Create a pi user.

adduser pi
usermod -a -G sudo pi

SSH to the pi as the "pi" user. Delete the ubuntu user.

killall -u ubuntu
userdel -r ubuntu

Disable SSH password authentication

sed -i 's/^#\?PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
service ssh restart

Change the hostname.

echo pi-nas > /etc/hostname

Upgrade and restart the pi.

apt update && apt upgrade -y && apt autoremove -y
reboot

Install ZFS.

apt install -y pv zfs-initramfs

Find the disks you want to use to create your pool

fdisk -l
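
The /dev/disk/by-id paths are safer to build the pool on than /dev/sdX names, since they survive reboots and controller reordering:

ls -l /dev/disk/by-id/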

Create a pool.

export DISK=/dev/disk/by-id/<your-disk-id>
mkdir -p /mnt/backup
zpool create \
    -o ashift=12 \
    -O acltype=posixacl -O canmount=off -O compression=lz4 \
    -O dnodesize=auto -O normalization=formD -O relatime=on \
    -O xattr=sa -O mountpoint=/mnt/backup \
    backup ${DISK}