K3S

Firewalld

# All required ports (https://docs.k3s.io/installation/requirements?_highlight=ports#local-ports)
firewall-cmd \
--permanent \
--zone=public \
--add-port=80/tcp \
--add-port=443/tcp \
--add-port=2379-2380/tcp \
--add-port=6443/tcp \
--add-port=8472/udp \
--add-port=10250/tcp

# IPv4 config
# 10.42 is for pods
# 10.43 is for services
firewall-cmd \
--permanent \
--zone=trusted \
--add-source=10.42.0.0/16 \
--add-source=10.43.0.0/16

# [Optional] IPv6 config
# fd02:c91e:56f4 is for pods
# fd02:c91e:56f5 is for services
firewall-cmd \
--permanent \
--zone=trusted \
--add-source=fd02:c91e:56f4::/56 \
--add-source=fd02:c91e:56f5::/112

firewall-cmd --reload
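
Once reloaded, it doesn't hurt to confirm the rules actually landed in each zone:

# Public zone should list the K3S ports, trusted zone the cluster CIDRs
firewall-cmd --zone=public --list-ports
firewall-cmd --zone=trusted --list-sources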

SELinux

Make sure to add --selinux to your install script.

Install Single Node K3S

Dual Stack IPv6 Support

curl -sfL https://get.k3s.io | sh -s - \
--selinux \
--disable traefik \
--disable servicelb \
--tls-san k3s.reeselink.com \
--flannel-ipv6-masq \
--kubelet-arg="node-ip=::" \
--cluster-cidr 10.42.0.0/16,fd02:c91e:56f4::/56 \
--service-cidr 10.43.0.0/16,fd02:c91e:56f5::/112 \
--cluster-dns fd02:c91e:56f5::10

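Assuming the node name matches the machine's hostname, you can check that both address families were picked up (run on the node itself, where k3s provides kubectl):

# Node should report both an IPv4 and an IPv6 InternalIP
kubectl get nodes -o wide
kubectl get node $(hostname) -o jsonpath='{.status.addresses}'
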
Single Stack IPv4

curl -sfL https://get.k3s.io | sh -s - \
--selinux \
--disable traefik \
--disable servicelb \
--disable local-storage \
--cluster-cidr 10.42.0.0/16 \
--service-cidr 10.43.0.0/16

Install Multi Node K3S

TODO: haproxy (https://docs.k3s.io/blog/2025/03/10/simple-ha?_highlight=tls&_highlight=san#load-balancer) Load balance a single registration point across all active nodes.
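
As a rough sketch of what that haproxy config might look like (plain TCP passthrough on 6443; the hostnames and IPs below are placeholders, this isn't wired up yet):

# /etc/haproxy/haproxy.cfg (excerpt)
frontend k3s_servers
    bind *:6443
    mode tcp
    default_backend k3s_servers_back

backend k3s_servers_back
    mode tcp
    option tcp-check
    balance roundrobin
    server kube1 192.168.1.11:6443 check
    server kube2 192.168.1.12:6443 check
    server kube3 192.168.1.13:6443 check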

# Generate a shared token for joining nodes
# Copy this token to each node at ~/.k3s-token
pwgen --capitalize --numerals --secure 64 1 > ~/.k3s-token

# Create the first node
curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s-token) sh -s - \
--cluster-init \
--selinux \
--disable traefik \
--disable servicelb \
--disable local-storage \
--cluster-cidr 10.42.0.0/16 \
--service-cidr 10.43.0.0/16

# Alternatively, the server's full join token can be read here and copied to the other nodes
cat /var/lib/rancher/k3s/server/token

# Join nodes
curl -sfL https://get.k3s.io | K3S_TOKEN=$(cat ~/.k3s-token) sh -s - \
--selinux \
--disable traefik \
--disable servicelb \
--disable local-storage \
--cluster-cidr 10.42.0.0/16 \
--service-cidr 10.43.0.0/16 \
--server https://kube1.reeselink.com:6443
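
After the join commands finish, confirm every member shows up as Ready (run on any server node):

kubectl get nodes -o wide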

Network Checks

At this point it's a good idea to make sure node communication is working as expected.

firewall-cmd --set-log-denied=all
# You shouldn't see any dropped traffic from your nodes.
dmesg --follow | egrep -i 'REJECT|DROP'
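
Once you're satisfied nothing is being dropped, turn the denied-packet logging back off:

firewall-cmd --set-log-denied=off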

Kube Credentials

On the operator machine

export KUBE_SERVER_ADDRESS="https://kube1.reeselink.com:6443"
# Copy the kube config down
ssh kube1-root cat /etc/rancher/k3s/k3s.yaml | \
yq -r ".clusters[0].cluster.server = \"${KUBE_SERVER_ADDRESS}\"" > \
~/.kube/admin-kube-config
export KUBECONFIG=~/.kube/admin-kube-config

Metal LB

VLAN Setup

Before working with MetalLB you'll need at least one available VLAN. On UniFi equipment this is accomplished by creating a new network. Don't assign it to anything.

On the Linux machine you can use nmcli or cockpit to configure a new VLAN network interface. With cockpit (an nmcli equivalent follows the list):

  1. Add a new VLAN network
  2. The parent should be the physical adapter connected to your switch
  3. Set the VLAN ID to the VLAN number of the UniFi network you created
  4. Click create
  5. Click into the new network
  6. Turn off IPv4 and IPv6 DNS (otherwise every VLAN adds nameservers and you exceed resolv.conf's three-nameserver limit)
  7. Turn on the network interface
  8. Attempt to ping the acquired address(es)
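
If you'd rather do this from the shell, a rough nmcli equivalent of the steps above (assuming parent interface eno1 and VLAN ID 42; substitute your own):

# Create the VLAN interface on top of the physical adapter
nmcli connection add type vlan con-name vlan42 ifname vlan42 dev eno1 id 42
# Don't let this interface push nameservers into resolv.conf
nmcli connection modify vlan42 ipv4.ignore-auto-dns yes ipv6.ignore-auto-dns yes
# Bring it up and check the acquired address(es)
nmcli connection up vlan42
ip addr show vlan42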

Installation

We'll be swapping K3S's default load balancer with MetalLB for more flexibility. ServiceLB was struggling to allocate IP addresses for load-balanced services. MetalLB does make things a little more complicated (you'll need special annotations, see below), but it's otherwise a well-tested, stable load balancing service with features to grow into.

MetalLB is pretty cool. It works via L2 advertisement or BGP. We won't be using BGP, so let's focus on L2.

When we connect our nodes to a network we give them an IP address range, e.g. 192.168.122.20/24. This range represents all the available addresses the node could be assigned. Usually we assign a single "static" IP address to our node and direct traffic to it by port forwarding from our router. This is fine for single nodes, but what if we have a cluster of nodes and we don't want our service to disappear just because one node is down for maintenance?

This is where L2 advertising comes in. MetalLB will assign a static IP address from a given pool to any arbitrary node, then advertise that node's MAC address as the location for the IP. When that node goes down, MetalLB simply advertises a new MAC address for the same IP address, effectively moving the IP to another node. This isn't really "load balancing" so much as "failover". Fortunately, that's exactly what we're looking for.

Install MetalLB
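
A minimal install sketch via Helm, plus the address pool the annotations below reference (the chart location and CRD kinds are standard MetalLB; the pool name unifi-pool matches the annotation below, while the address range is a placeholder for your VLAN's range):

helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm upgrade --install metallb metallb/metallb \
--namespace metallb-system \
--create-namespace

# Address pool + L2 advertisement (adjust the addresses to the VLAN created above)
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: unifi-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.50.10-192.168.50.50
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: unifi-pool-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - unifi-pool
EOF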

You'll need to annotate your service as follows if you want an external IP:

# Dual Stack
metadata:
  annotations:
    metallb.universe.tf/address-pool: "unifi-pool"
spec:
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
  - IPv6
  - IPv4

# Single Stack
metadata:
  annotations:
    metallb.universe.tf/address-pool: "unifi-pool"
spec:
  ipFamilyPolicy: SingleStack
  ipFamilies:
  - IPv4

Then test with

kubectl apply -f active/systemd_k3s/tests/metallb-test.yaml
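
The test service should come up with an EXTERNAL-IP from the pool:

kubectl get svc --all-namespaces | grep LoadBalancer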

External DNS

https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/aws.md

Credentials

  1. Generate credentials for the cluster
aws iam create-user --user-name "externaldns"
aws iam attach-user-policy --user-name "externaldns" --policy-arn arn:aws:iam::892236928704:policy/update-reeselink

# [OPTIONAL] Delete old access keys if you have too many
aws iam delete-access-key --user-name externaldns --access-key-id

GENERATED_ACCESS_KEY=$(aws iam create-access-key --user-name "externaldns")
ACCESS_KEY_ID=$(echo $GENERATED_ACCESS_KEY | jq -r '.AccessKey.AccessKeyId')
SECRET_ACCESS_KEY=$(echo $GENERATED_ACCESS_KEY | jq -r '.AccessKey.SecretAccessKey')

cat <<-EOF > active/kubernetes_external-dns/secrets/externaldns-credentials
[default]
aws_access_key_id = $ACCESS_KEY_ID
aws_secret_access_key = $SECRET_ACCESS_KEY
EOF

kubectl create secret generic external-dns \
--namespace kube-system \
--from-file active/kubernetes_external-dns/secrets/externaldns-credentials

helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
helm repo update
helm upgrade --install external-dns external-dns/external-dns \
--values active/kubernetes_external-dns/values.yaml \
--namespace kube-system

Annotation

metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: example.com
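
For context, an illustrative Service carrying both the external-dns and MetalLB annotations might look like this (external-dns's default sources include Services; the name, selector, and hostname are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: whoami
  annotations:
    external-dns.alpha.kubernetes.io/hostname: whoami.example.com
    metallb.universe.tf/address-pool: "unifi-pool"
spec:
  type: LoadBalancer
  ipFamilyPolicy: SingleStack
  ipFamilies:
  - IPv4
  selector:
    app: whoami
  ports:
  - port: 80
    targetPort: 8080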

Cert Manager

Install cert-manager

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install \
    cert-manager jetstack/cert-manager \
    --namespace kube-system \
    --set crds.enabled=true

Create the let's encrypt issuer (Route53 DNS)

export LE_ACCESS_KEY_ID=
export LE_SECRET_KEY=

cat <<EOF > secrets/cert-manager-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: prod-route53-credentials-cert-manager
  # cert-manager resolves ClusterIssuer secret refs in its own namespace (kube-system here)
  namespace: kube-system
data:
  # echo -n: a trailing newline would otherwise be encoded into the credentials
  access-key-id: $(echo -n $LE_ACCESS_KEY_ID | base64)
  secret-access-key: $(echo -n $LE_SECRET_KEY | base64)
EOF

kubectl apply -f secrets/cert-manager-secret.yaml
cat <<EOF > secrets/route53-cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: nginx@ducoterra.net
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - selector:
        dnsZones:
          - "reeseapps.com"
      dns01:
        route53:
          region: us-east-1
          hostedZoneID: Z012820733346FJ0U4FUF
          accessKeyID: ${LE_ACCESS_KEY_ID}
          secretAccessKeySecretRef:
            name: prod-route53-credentials-cert-manager
            key: secret-access-key
EOF

kubectl apply -f secrets/route53-cluster-issuer.yaml
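
With the issuer applied, certificates can be requested via ingress/gateway annotations or directly with a Certificate resource; a minimal sketch (names are illustrative, the hostname matches the demo below):

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-reeseapps-com
  namespace: default
spec:
  secretName: demo-reeseapps-com-tls
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
  dnsNames:
  - demo.reeseapps.com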

You can test if your ingress is working with:

# Navigate to demo.reeseapps.com
kubectl apply -f active/infrastructure_k3s/tests/ingress-nginx-test.yaml

# Cleanup
kubectl delete -f active/infrastructure_k3s/tests/ingress-nginx-test.yaml

Traefik Gateway

We'll use Traefik with the Kubernetes Gateway API to provide ingress.

# Add the repo
helm repo add traefik https://traefik.github.io/charts
helm repo update

kubectl create namespace traefik

# Generate a self-signed certificate valid for *.reeselink.com
mkdir active/kubernetes_traefik/secrets
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
-keyout active/kubernetes_traefik/secrets/tls.key -out active/kubernetes_traefik/secrets/tls.crt \
-subj "/CN=*.reeselink.com"

# Create the TLS secret in the traefik namespace
kubectl create secret tls local-selfsigned-tls \
--cert=active/kubernetes_traefik/secrets/tls.crt --key=active/kubernetes_traefik/secrets/tls.key \
--namespace traefik

# Install the chart into the 'traefik' namespace
helm upgrade --install traefik traefik/traefik \
--namespace traefik \
--values active/kubernetes_traefik/values.yaml

# Deploy a demo
kubectl apply -f active/kubernetes_traefik/demo-app.yaml
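
For reference, services are exposed through the gateway with HTTPRoute objects; a rough sketch (the Gateway name and namespace depend on what values.yaml configures, assumed here to be traefik-gateway in the traefik namespace; the backend below is a placeholder):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: whoami
  namespace: default
spec:
  parentRefs:
  - name: traefik-gateway
    namespace: traefik
  hostnames:
  - whoami.reeselink.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: whoami
      port: 80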

Longhorn Storage

Longhorn provides replicated block storage via raw files on the nodes.

On each host you need to install iscsiadm (the iscsi-initiator-utils package on Fedora/RHEL) and start iscsid

dnf install iscsi-initiator-utils
systemctl enable --now iscsid
helm repo add longhorn https://charts.longhorn.io
helm repo update

helm upgrade --install longhorn longhorn/longhorn \
--namespace longhorn-system \
--create-namespace \
--set "persistence.defaultClassReplicaCount=1"

# Check that the route was created
kubectl get httproute longhorn-httproute -n longhorn-system -o jsonpath='{.status.parents[*].conditions}'

# Create a demo app to test storage
kubectl apply -f active/kubernetes_longhorn/demo-app.yaml
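
Workloads can then request replicated volumes through the longhorn storage class (the chart default); a minimal PVC sketch:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi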

Test Minecraft Server

helm upgrade --install minecraft active/kubernetes_minecraft -n minecraft --create-namespace

Automatic Updates

https://docs.k3s.io/upgrades/automated

kubectl create namespace system-upgrade
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/latest/download/system-upgrade-controller.yaml
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/latest/download/crd.yaml
kubectl apply -f active/infrastructure_k3s/upgrade-plan.yaml

# Check plan
kubectl get plan -n system-upgrade
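
For reference, a plan roughly follows this shape (a generic sketch based on the k3s upgrade docs, not necessarily the contents of upgrade-plan.yaml):

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  nodeSelector:
    matchExpressions:
    - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  channel: https://update.k3s.io/v1-release/channels/stable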

Database Backups

https://docs.k3s.io/cli/etcd-snapshot

Note: you must back up /var/lib/rancher/k3s/server/token and use its contents as the token when restoring the backup, as snapshot data is encrypted with that token.

Backups are saved to /var/lib/rancher/k3s/server/db/snapshots/ by default.

k3s etcd-snapshot save
k3s etcd-snapshot list

k3s server \
  --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/on-demand-kube-1720459685
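
Scheduled snapshots and retention can also be set on the server; these map to standard k3s server flags (the schedule and count below are just examples):

# /etc/rancher/k3s/config.yaml
etcd-snapshot-schedule-cron: "0 */6 * * *"
etcd-snapshot-retention: 10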

Uninstall

/usr/local/bin/k3s-uninstall.sh