chart fixes and readme edits

2023-10-20 00:03:15 -04:00
parent 0462913304
commit 42b6aa33a0
24 changed files with 697 additions and 258 deletions

README.md

A project to store container-based hosting stuff.
## Table of Contents
- [Containers](#containers)
  - [Table of Contents](#table-of-contents)
  - [Platform](#platform)
  - [K3S](#k3s)
    - [Install K3S](#install-k3s)
    - [Database Backups](#database-backups)
  - [Components](#components)
    - [CoreDNS](#coredns)
    - [Metal LB](#metal-lb)
    - [Nginx Ingress](#nginx-ingress)
    - [Storage](#storage)
  - [Apps](#apps)
    - [Dashboard](#dashboard)
    - [Nextcloud](#nextcloud)
      - [Test Deploy](#test-deploy)
    - [Gitea](#gitea)
    - [Minecraft](#minecraft)
      - [Nimcraft](#nimcraft)
      - [Testing](#testing)
    - [Snapdrop](#snapdrop)
    - [Jellyfin](#jellyfin)
  - [Upgrading](#upgrading)
  - [Help](#help)
    - [Troubleshooting](#troubleshooting)
## Platform
Before you begin, be sure to take a look at the [Fedora Server Config](FedoraServer.md) readme,
which explains how to set up a basic Fedora server hosting platform with certbot.
## K3S
### Install K3S
We're going to be tweaking some installation parameters, so if you already have k3s
installed you can either uninstall it first or skip these steps.
This installation disables traefik, local-storage, and the bundled coredns (we don't really
need the first two, and we'll be installing our own CoreDNS later), and points the cluster DNS at `10.43.0.10`:
```bash
curl -sfL https://get.k3s.io | sh -s - \
--disable traefik \
--disable local-storage \
--disable coredns \
--cluster-dns 10.43.0.10
```
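Once the installer finishes, a quick sanity check with the kubectl bundled into k3s should show the node coming up Ready:
```bash
sudo k3s kubectl get nodes
```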
Now you can change the ownership of (and copy) the k3s.yaml file:
```bash
chown ducoterra /etc/rancher/k3s/k3s.yaml
scp /etc/rancher/k3s/k3s.yaml ~/.kube/config
```
Edit `~/.kube/config` and change `127.0.0.1` to `containers.reeselink.com`.
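If you'd rather not hand-edit the file, a one-liner along these lines should work (assuming the default kubeconfig path used in the copy above):
```bash
sed -i 's/127.0.0.1/containers.reeselink.com/' ~/.kube/config
```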
### Database Backups
We're using SQLite (because it's all we really need). The db is stored at
`/var/lib/rancher/k3s/server/db/`. You can just copy that folder to back up the database
and restore it by copying it back. Note: you must also copy `/var/lib/rancher/k3s/server/token`
and use its contents as the token when restoring the backup, as the data is encrypted with that token.
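A minimal backup sketch along those lines; the paths come from above, and the archive name is just an example:
```bash
# stop k3s so the database is quiescent, then archive the db folder and the token
sudo systemctl stop k3s
sudo tar czf k3s-backup-$(date +%F).tar.gz \
/var/lib/rancher/k3s/server/db/ \
/var/lib/rancher/k3s/server/token
sudo systemctl start k3s
```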
## Components
### CoreDNS
You can test DNS resolution from inside the cluster with a throwaway dnstools pod:
```bash
kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
```
### Metal LB
The problem with MetalLB comes when a service needs the real IP address of a client. You can
get the real IP with `externalTrafficPolicy: Local`, but that prevents shared IP addresses even
if the services are running on different ports. Klipper, on the other hand, seems to handle this
just fine. MetalLB isn't great for a local installation for this reason, but I'm leaving
the docs here just in case.

We'll be swapping K3S's default load balancer for MetalLB to get more flexibility. ServiceLB was
struggling to allocate IP addresses for load balanced services. MetalLB does make things a little
more complicated (you'll need special annotations, see below), but it's otherwise a well-tested,
stable load balancing service with features to grow into.
MetalLB is pretty cool. It works via L2 advertisement or BGP. We won't be using BGP, so let's
focus on L2.

When we connect our nodes to a network we give them an IP address range, e.g. `192.168.122.20/24`.
This range represents all the available addresses the node could be assigned. Usually we assign
a single "static" IP address to our node and direct traffic to it by port forwarding from our
router. This is fine for single nodes, but what if we have a cluster of nodes and we don't want
our service to disappear just because one node is down for maintenance?

This is where L2 advertising comes in. MetalLB will assign a static IP address from a given
pool to any arbitrary node, then advertise that node's MAC address as the location for the
IP. When that node goes down, MetalLB simply advertises a new MAC address for the same IP
address, effectively moving the IP to another node. This isn't really "load balancing" but
"failover". Fortunately, that's exactly what we're looking for.
```bash
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm upgrade --install metallb \
metallb/metallb
```
MetalLB doesn't know what IP addresses are available for it to allocate, so we'll have
to provide it with a list. The `metallb-addresspool.yaml` has one IP address (we'll get to
IP address sharing in a second), which is an unassigned IP address not allocated to any of our
nodes. Note: if you have many public IPs which all point to the same router or virtual network
you can list them all. We're only going to use one because we want to port forward from our router.
```bash
# create the metallb allocation pool
kubectl apply -f metallb-addresspool.yaml
```
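For reference, a minimal sketch of what `metallb-addresspool.yaml` might contain. The pool name `production` matches the annotations used later in this readme; the address and namespace are assumptions, so adjust them to your network and install:
```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: production
  namespace: metallb-system   # assumes metallb was installed into metallb-system
spec:
  addresses:
    - 192.168.122.50/32   # illustrative: one unassigned address on the node network
```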
Now we need to create the L2 advertisement. This is handled with a custom resource definition
which specifies that all of the listed nodes are eligible to be assigned, and to advertise, our
"production" IP addresses.
```bash
kubectl apply -f metallb-l2advertisement.yaml
```
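A sketch of what `metallb-l2advertisement.yaml` might look like, again assuming the `production` pool and the `metallb-system` namespace from above:
```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: production
  namespace: metallb-system
spec:
  ipAddressPools:
    - production   # advertise addresses from the production pool on all nodes
```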
We now have a problem. We only have a single production IP address and MetalLB
really doesn't want to share it. In order to allow services to allocate the
same IP address (on different ports) we'll need to annotate them as such.
MetalLB will allow services to allocate the same IP if:

- They both have the same sharing key.
- They request the use of different ports (e.g. tcp/80 for one and tcp/443 for the other).
- They both use the Cluster external traffic policy, or they both point to the exact same set of pods (i.e. the pod selectors are identical).

See <https://metallb.org/usage/#ip-address-sharing> for more info.
You'll need to annotate your service as follows if you want an external IP:
```yaml
kind: Service
metadata:
  name: {{ .Release.Name }}
  annotations:
    metallb.universe.tf/address-pool: "production"
    metallb.universe.tf/allow-shared-ip: "production"
spec:
  externalTrafficPolicy: Cluster
  selector:
```
### Nginx Ingress
Navigate to ingress-nginx-test.reeseapps.com
### Storage
<https://github.com/democratic-csi/democratic-csi/blob/master/examples/freenas-nfs.yaml>
Use NFSv4. It works without rpcbind, which makes it lovely.

We'll be installing democratic-csi for our volume manager. Specifically, we'll be installing the
freenas-api-nfs driver. All configuration is stored in `truenas-nfs.yaml`.
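If you haven't added the democratic-csi chart repo yet, something like this should do it (repo URL from the democratic-csi project; double-check it against their docs):
```bash
helm repo add democratic-csi https://democratic-csi.github.io/charts/
helm repo update
```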
```bash
helm upgrade \
--namespace democratic-csi \
--create-namespace \
zfs-iscsi-enc1 democratic-csi/democratic-csi
# enc1 stable storage (nfs)
helm upgrade \
--install \
--values secrets/truenas-nfs-enc1.yaml \
--namespace democratic-csi \
--create-namespace \
zfs-nfs-enc1 democratic-csi/democratic-csi
```
You can test that things worked with:
```bash
kubectl apply -f tests/democratic-csi-pvc-test.yaml
kubectl delete -f tests/democratic-csi-pvc-test.yaml
```
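If you need to recreate the test manifest, here is a sketch of what `tests/democratic-csi-pvc-test.yaml` might contain; the storage class name matches the `zfs-nfs-enc1` class used elsewhere in this readme, and the size is arbitrary:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: democratic-csi-pvc-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: zfs-nfs-enc1   # same class used in the Nextcloud test deploy below
```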
You can also run some rough performance tests. Use network and disk monitoring tools
to watch throughput while the tests run.
```bash
# Big writes: count how many 100MB writes complete in 10 seconds
count=0
start_time=$EPOCHREALTIME
while true; do
    dd if=/dev/zero of=test.dat bs=1M count=100 1> /dev/null 2> /dev/null
    elapsed=$(echo "$EPOCHREALTIME - $start_time" | bc)
    time_up=$(echo "$elapsed > 10" | bc)
    if [ $time_up -eq 0 ]; then
        count=$((count + 1))
        echo -e '\e[1A\e[K'$count
    else
        break
    fi
done

# Lots of little writes: count how many 1KB writes complete in 1 second
count=0
start_time=$EPOCHREALTIME
while true; do
    dd if=/dev/zero of=test.dat bs=1K count=1 1> /dev/null 2> /dev/null
    elapsed=$(echo "$EPOCHREALTIME - $start_time" | bc)
    time_up=$(echo "$elapsed > 1" | bc)
    if [ $time_up -eq 0 ]; then
        count=$((count + 1))
        echo -e '\e[1A\e[K'$count
    else
        break
    fi
done
```
Because iSCSI mounts block devices, troubleshooting mount issues, data corruption,
and exploring PVC contents must happen on the client device. Here are a few cheat-sheet
commands to make things easier:
```bash
mount -t xfs /dev/zvol/... /mnt/iscsi
iscsiadm --mode session -P 3 | grep Target -A 2 -B 2
```
## Apps
### Dashboard
The Kubernetes dashboard isn't all that useful, but it can sometimes give you a good
overview of what's running in the cluster.
### Nextcloud

```bash
helm upgrade --install \
nextcloud \
./helm/nextcloud \
--namespace nextcloud \
--create-namespace
```
Need to copy lots of files? Copy them to the user data dir and then run
Set up SES with the following links:
<https://docs.aws.amazon.com/general/latest/gr/ses.html>
#### Test Deploy
You can create a test deployment with the following:
```bash
helm upgrade --install nextcloud ./helm/nextcloud \
--namespace nextcloud-test \
--create-namespace \
--set nextcloud.domain=nextcloud-test.reeseapps.com \
--set nextcloud.html.storageClassName=zfs-nfs-enc1 \
--set nextcloud.html.storage=8Gi \
--set nextcloud.data.storageClassName=zfs-nfs-enc1 \
--set nextcloud.data.storage=8Gi \
--set postgres.storageClassName=zfs-nfs-enc1 \
--set postgres.storage=8Gi \
--set redis.storageClassName=zfs-nfs-enc1 \
--set redis.storage=8Gi \
--set show_passwords=true \
--dry-run
```
### Gitea
### Minecraft
The example below installs nimcraft. For each installation you'll want to create your own values file
with a new port. The server-downloader is called "minecraft_get_server" and is available on
[Github](https://github.com/ducoterra/minecraft_get_server).
#### Nimcraft
```bash
helm upgrade --install \
nimcraft \
./helm/minecraft \
--namespace nimcraft \
--create-namespace
```
#### Testing
```bash
helm upgrade --install \
testcraft \
./helm/minecraft \
--namespace testcraft \
--create-namespace \
--set port=25566
```
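To see what address and port the test server ended up on, list the services in its namespace:
```bash
kubectl get svc --namespace testcraft
```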
### Snapdrop
Snapdrop is a file sharing app that allows airdrop-like functionality over the web
```bash
helm upgrade --install \
snapdrop \
./helm/snapdrop \
--namespace snapdrop \
--create-namespace
```
## Upgrading
<https://docs.k3s.io/upgrades/manual#manually-upgrade-k3s-using-the-binary>
```bash
sudo su -
wget https://github.com/k3s-io/k3s/releases/download/v1.28.2%2Bk3s1/k3s
systemctl stop k3s
chmod +x k3s
mv k3s /usr/local/bin/k3s
systemctl start k3s
```
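After the restart it's worth confirming the new version took:
```bash
k3s --version
kubectl get nodes -o wide
```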
## Help
### Troubleshooting
Deleting a stuck namespace