
# Containers!

A project to store container-based hosting stuff.

## Platform

Before you begin, be sure to take a look at the [Fedora Server Config](FedoraServer.md) readme,
which explains how to set up a basic Fedora server hosting platform with certbot.

## K3S

### Install K3S

We're going to be tweaking some installation parameters, so if you already have k3s
installed you can either uninstall it or skip these steps.

This installation disables traefik, local-storage, coredns and servicelb (we install our own
replacement for each in the sections below) and sets the cluster DNS address to 10.43.0.10:

```bash
curl -sfL https://get.k3s.io | sh -s - \
    "--disable" \
    "traefik" \
    "--disable" \
    "local-storage" \
    "--disable" \
    "coredns" \
    "--disable" \
    "servicelb" \
    "--cluster-dns" \
    "10.43.0.10"
```

Now you can change the ownership of (and copy) the k3s.yaml file:

```bash
chown ducoterra /etc/rancher/k3s/k3s.yaml

scp /etc/rancher/k3s/k3s.yaml ~/.kube/config
```

Edit `~/.kube/config` and change `127.0.0.1` to `containers.reeselink.com`.
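
The edit can also be scripted. This is a minimal sketch, assuming the kubeconfig still contains the `server: https://127.0.0.1:6443` line that k3s writes by default. `set_kubeconfig_host` is a hypothetical helper name, not something shipped in this repo:

```bash
# Rewrite the API server host in a kubeconfig file.
# usage: set_kubeconfig_host <kubeconfig> <new-host>
set_kubeconfig_host() {
    local kubeconfig="$1" new_host="$2"
    # swap only the host part of the server URL and keep the port;
    # the untouched original is saved next to it as <kubeconfig>.bak
    sed -i.bak "s#https://127\.0\.0\.1:#https://${new_host}:#" "$kubeconfig"
}
```

For example: `set_kubeconfig_host ~/.kube/config containers.reeselink.com`.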

### Database Backups

We're using SQLite (because it's all we really need). The db is stored at
`/var/lib/rancher/k3s/server/db/`. You can back up the database by copying that folder
and restore it by copying it back. Note that you must also copy `/var/lib/rancher/k3s/server/token`
and use its contents as the token when restoring the backup, as the data is encrypted with that token.
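
The copy can be wrapped in a small helper so the db folder and the token always travel together. This is a sketch only: `backup_k3s_state` and the archive naming are hypothetical, and stopping k3s before running it is an assumption made here to avoid copying the database mid-write.

```bash
# Archive the k3s SQLite directory together with the server token.
# usage: backup_k3s_state <k3s-server-dir> <output-dir>
backup_k3s_state() {
    local server_dir="$1" out_dir="$2"
    local archive="${out_dir}/k3s-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "$out_dir"
    # -C keeps the paths inside the archive relative to the server directory
    tar -czf "$archive" -C "$server_dir" db token
    # print the archive path so callers can copy it elsewhere
    echo "$archive"
}
```

Run as root, this would look like `backup_k3s_state /var/lib/rancher/k3s/server /root/k3s-backups`. Restoring is the reverse: extract the archive back into the server directory.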

### CoreDNS

We'll use our own coredns server so we can add custom hosts. This prevents the server from collapsing
if the internet drops out (something that apparently happens quite frequently).

```bash
helm repo add coredns https://coredns.github.io/helm
helm repo update
helm upgrade --install \
    --namespace=kube-system \
    --values coredns-values.yaml \
    coredns \
    coredns/coredns
```

You can test your dns config with:

```bash
kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
```

### Metal LB

We'll be swapping K3S's default load balancer (ServiceLB) for MetalLB for more flexibility. ServiceLB was
struggling to allocate IP addresses for load balanced services. MetalLB does make things a little
more complicated (you'll need special annotations, see below), but it's otherwise a well-tested,
stable load balancing service with features to grow into.

```bash
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm upgrade --install metallb \
    --namespace metallb \
    --create-namespace \
    metallb/metallb
```

MetalLB doesn't know what IP addresses are available for it to allocate, so we'll have
to provide it with a list. The `metallb-addresspool.yaml` has one IP address (we'll get to
IP address sharing in a second) which is the IP address of our node.

```bash
# create the metallb allocation pool
kubectl apply -f metallb-addresspool.yaml
```
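
The contents of `metallb-addresspool.yaml` aren't reproduced here. As a rough sketch of what a single-address pool can look like, assuming MetalLB 0.13 or newer (which is configured through `IPAddressPool` and `L2Advertisement` resources) running in layer 2 mode; the helper name, the pool name and the example IP are all hypothetical:

```bash
# Print an IPAddressPool and a matching L2Advertisement for one node IP.
# usage: render_address_pool <pool-name> <ip>
render_address_pool() {
    local name="$1" ip="$2"
    cat <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ${name}
  namespace: metallb
spec:
  addresses:
    - ${ip}/32
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ${name}
  namespace: metallb
spec:
  ipAddressPools:
    - ${name}
EOF
}
```

Something like `render_address_pool containers 192.0.2.10 | kubectl apply -f -` would then create the pool, with `192.0.2.10` swapped for the node's real address.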

In order to allow services to allocate the same IP address we'll need to annotate them
as such. MetalLB will allow services to allocate the same IP if:

- They both have the same sharing key.
- They request the use of different ports (e.g. tcp/80 for one and tcp/443 for the other).
- They both use the Cluster external traffic policy, or they both point to the exact same set of pods (i.e. the pod selectors are identical).

See https://metallb.org/usage/#ip-address-sharing for more info.

You'll need to annotate your service as follows if you want an external IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}
  annotations:
    metallb.universe.tf/allow-shared-ip: "containers"
spec:
  externalTrafficPolicy: Cluster
  selector:
    app: {{ .Release.Name }}
  ports:
  - port: {{ .Values.ports.containerPort }}
    targetPort: {{ .Values.ports.targetPort }}
    name: {{ .Release.Name }}
  type: LoadBalancer
```

### Nginx Ingress

Now we need an ingress solution (preferably with certs for https). We'll be using nginx since
it's a little bit more configurable than traefik (though don't sell traefik short: it's really
good, just finicky when you have use cases they haven't explicitly coded for).

1. Install nginx

    ```bash
    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo update
    helm upgrade --install \
        ingress-nginx \
        ingress-nginx/ingress-nginx \
        --values ingress-nginx-values.yaml \
        --namespace ingress-nginx \
        --create-namespace
    ```

2. Install cert-manager

    ```bash
    helm repo add jetstack https://charts.jetstack.io
    helm repo update
    helm install \
        cert-manager jetstack/cert-manager \
        --namespace cert-manager \
        --create-namespace \
        --version v1.11.0 \
        --set installCRDs=true
    ```

3. Create the Let's Encrypt issuer

    ```bash
    kubectl apply -f letsencrypt-issuer.yaml
    ```

You can test whether your ingress is working with `kubectl apply -f ingress-nginx-test.yaml`.

Then navigate to ingress-nginx-test.reeseapps.com.

### Storage

We'll be installing democratic-csi for our volume manager. Specifically, we'll be installing the
freenas-api-nfs driver. All configuration is stored in truenas-nfs.yaml.

The nfs driver will provision an nfs store owned by user 3000 (kube). You may have to create
that user on TrueNAS. The nfs share created will be world-read/write, so any user can write to
it. Users that write to the share will have their uid/gid mapped to TrueNAS, so if user 33 writes
a file to the nfs share it will show up as owned by user 33 on TrueNAS.

The iscsi driver will require a portal ID. This is NOT what is reflected in the UI. The most
reliable way (seriously) to get the real ID is to open the network monitor in the browser, reload
TrueNAS, find the websocket connection and click on it, then create the portal and click on the
server response. It'll look something like:

```json
{"msg": "added", "collection": "iscsi.portal.query", "id": 7, "fields": {"id": 7, "tag": 1, "comment": "democratic-csi", "listen": [{"ip": "172.20.0.1", "port": 3260}], "discovery_authmethod": "NONE", "discovery_authgroup": null}}
```

The initiator group IDs seem to line up.
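
If you copy that response out of the browser, the ID can be pulled out with a one-liner rather than read by eye. This is a sketch that assumes the portal ID is the `id` inside `fields` and that `id` is the first key there, as in the sample above. `portal_id_from_message` is a hypothetical helper; with `jq` installed, `jq '.fields.id'` reads the same field.

```bash
# Print fields.id from a websocket "added" message read on stdin.
# Assumes "id" is the first key inside "fields", as in the sample response.
portal_id_from_message() {
    sed -n 's/.*"fields": *{ *"id": *\([0-9][0-9]*\).*/\1/p'
}
```

Paste the message into a file and run `portal_id_from_message < response.json`.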

It's good practice to have separate hostnames for your share export and your truenas server. This
way you can have a direct link without worrying about changing the user-facing hostname.
For example: your truenas server might be driveripper.reeselink.com and your kube server might be
containers.reeselink.com. You should also have a democratic-csi-server.reeselink.com and a
democratic-csi-client-1.reeselink.com which might be on 172.20.0.1 and 172.20.0.2.

https://github.com/democratic-csi/democratic-csi

iSCSI requires a bit of server config before proceeding:

```bash
# Install the following system packages
sudo dnf install -y lsscsi iscsi-initiator-utils sg3_utils device-mapper-multipath

# Enable multipathing
sudo mpathconf --enable --with_multipathd y

# Ensure that iscsid and multipathd are running
sudo systemctl enable iscsid multipathd
sudo systemctl start iscsid multipathd

# Start and enable iscsi
sudo systemctl enable iscsi
sudo systemctl start iscsi
```

And now you can install the drivers:

```bash
helm repo add democratic-csi https://democratic-csi.github.io/charts/
helm repo update

# enc0 bulk storage (iscsi)
helm upgrade \
    --install \
    --values truenas-iscsi-enc0.yaml \
    --namespace democratic-csi \
    --create-namespace \
    zfs-iscsi-enc0 democratic-csi/democratic-csi

# enc1 fast storage (iscsi)
helm upgrade \
    --install \
    --values truenas-iscsi-enc1.yaml \
    --namespace democratic-csi \
    --create-namespace \
    zfs-iscsi-enc1 democratic-csi/democratic-csi
```

You can test that things worked with:

```bash
kubectl apply -f democratic-csi-pvc-test.yaml
```

Because iscsi will mount block devices, troubleshooting mounting issues, data corruption,
and exploring pvc contents must happen on the client device. Here are a few cheat-sheet
commands to make things easier:

```bash
# discover all targets on the server
iscsiadm --mode discovery \
    --type sendtargets \
    --portal democratic-csi-server.reeselink.com:3260

# set this to one of the target names (iqn...) printed by the discovery above
export ISCSI_TARGET=

# delete the discovered targets
iscsiadm --mode discovery \
    --portal democratic-csi-server.reeselink.com:3260 \
    --op delete

# view discovered targets
iscsiadm --mode node

# view current session
iscsiadm --mode session

# prevent automatic login
iscsiadm --mode node \
    --portal democratic-csi-server.reeselink.com:3260 \
    --op update \
    --name node.startup \
    --value manual

# connect a target
iscsiadm --mode node \
    --login \
    --portal democratic-csi-server.reeselink.com:3260 \
    --targetname $ISCSI_TARGET

# disconnect a target
# you might have to do this if pods can't mount their volumes.
# manually connecting a target tends to make it unavailable for the pods since there
# will be two targets with the same name.
iscsiadm --mode node \
    --logout \
    --portal democratic-csi-server.reeselink.com:3260 \
    --targetname $ISCSI_TARGET

# view all connected disks
ls /dev/disk/by-path/

# mount a disk
mount -t xfs /dev/disk/by-path/... /mnt/iscsi

# emergency - by-path isn't available
# (look for "Attached scsi disk")
iscsiadm --mode session -P 3 | grep Target -A 2 -B 2
```
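
Picking the right value for `ISCSI_TARGET` out of a long discovery listing can be tedious. This is a small sketch, assuming the usual sendtargets output of one `portal,tpgt target-name` pair per line and assuming the target name contains the PVC name you're after; `iscsi_target_for` is a hypothetical helper:

```bash
# Print the target names from sendtargets discovery output (stdin)
# whose name contains the given substring, e.g. a PVC name.
# usage: iscsiadm ... | iscsi_target_for <substring>
iscsi_target_for() {
    local needle="$1"
    # column 1 is "ip:port,tpgt", column 2 is the target name
    awk -v needle="$needle" 'index($2, needle) { print $2 }'
}
```

Piping the discovery command above into `iscsi_target_for pvc-...` and exporting the result as `ISCSI_TARGET` saves copying the name by hand.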

### Dashboard

The kubernetes dashboard isn't all that useful but it can sometimes give you a good
visual breakdown when things are going wrong. It's sometimes faster than running
`kubectl get` commands over and over.

Create the dashboard and an admin user with:

```bash
helm upgrade \
    --install \
    --namespace kubernetes-dashboard \
    --create-namespace \
    dashboard-user ./helm/dashboard-user

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
```

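Then log in with the following. The first command prints a bearer token for the login page and
the second starts a local proxy, after which the dashboard is reachable at
`http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/`: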

```bash
kubectl -n kubernetes-dashboard create token admin-user
kubectl proxy
```

### Nextcloud

The first chart we'll deploy is nextcloud. This is a custom chart because Nextcloud
doesn't support helm installation natively (yet). There is a native Docker image and
really detailed installation instructions so we can pretty easily piece together what's
required.

This image runs the nextcloud cron job automatically and creates random secrets for all
infrastructure - very helpful for a secure deployment, not very helpful for migrating
clusters. You'll want to export the secrets and save them in a secure location.

```bash
helm upgrade --install \
    nextcloud \
    ./helm/nextcloud \
    --namespace nextcloud \
    --create-namespace
```
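
Once the release is up, the generated secrets can be exported for safekeeping. This is a sketch, assuming everything worth keeping lives as `Secret` objects in the `nextcloud` namespace; `export_secrets` is a hypothetical helper. Note that the values in the dump are only base64-encoded, not encrypted, so the file needs the same care as the secrets themselves.

```bash
# Dump every Secret in a namespace to a file readable only by the current user.
# usage: export_secrets <namespace> <output-file>
export_secrets() {
    local namespace="$1" out_file="$2"
    # create the file with tight permissions before any secret data lands in it
    ( umask 077 && : > "$out_file" )
    kubectl get secrets --namespace "$namespace" --output yaml > "$out_file"
}
```

For example: `export_secrets nextcloud nextcloud-secrets.yaml`, then move the file somewhere secure.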

Need to copy lots of files? Copy them to the user data dir and then run:

```bash
./occ files:scan --all
```

### Gitea

Gitea provides a helm chart [here](https://gitea.com/gitea/helm-chart/). We're not
going to modify much, but we are going to solidify some of the default values in case
they decide to change things. This is the first chart (besides ingress-nginx) where
we need to pay attention to the MetalLB annotation. This has been set in the values.yaml
file.

```bash
helm repo add gitea-charts https://dl.gitea.io/charts/
helm repo update
helm upgrade --install \
    gitea \
    gitea-charts/gitea \
    --values secrets/gitea-values.yaml \
    --namespace gitea \
    --create-namespace
```

If you need to back up and restore your database you can run:

```bash
# Backup (no -t: a TTY would add carriage returns to the redirected dump)
kubectl exec -n gitea gitea-postgresql-0 -- \
    pg_dump \
    --no-owner \
    --dbname=postgresql://gitea:gitea@localhost:5432 > gitea_backup.db

# Take gitea down to zero pods
kubectl scale statefulset gitea -n gitea --replicas 0

# Drop the existing database (this opens an interactive psql prompt)
kubectl exec -it -n gitea gitea-postgresql-0 -- psql -U gitea

# Run the following at the psql prompt:
\c postgres;
drop database gitea;
CREATE DATABASE gitea WITH OWNER gitea TEMPLATE template0 ENCODING UTF8 LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8';
exit

# Restore from backup (-i only: stdin is the backup file, not a terminal)
kubectl exec -i -n gitea gitea-postgresql-0 -- \
    psql \
    postgresql://gitea:gitea@localhost:5432 gitea < gitea_backup.db

# Restore gitea to 1 pod
kubectl scale statefulset gitea -n gitea --replicas 1
```

### Minecraft

Minecraft is available through a custom helm chart (including a server downloader). The example
below installs nimcraft. For each installation you'll want to create your own values.yaml
with a new port. The server downloader is called "minecraft_get_server" and is available on
[GitHub](https://github.com/ducoterra/minecraft_get_server).

```bash
helm upgrade --install \
    nimcraft \
    ./helm/minecraft \
    --namespace nimcraft \
    --create-namespace
```

### Troubleshooting

Deleting a stuck namespace

```bash
NAMESPACE=nginx
kubectl proxy &
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' >temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize
```

Fixing a bad volume

```bash
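# WARNING: -L zeroes the journal log and can lose recent writes; use it only
# when the volume won't mount and a plain xfs_repair refuses to run
xfs_repair -L /dev/sdg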
```

Mounting an ix-application volume from truenas

```bash
# set the mountpoint
zfs set mountpoint=/ix_pvc enc1/ix-applications/releases/gitea/volumes/pvc-40e27277-71e3-4469-88a3-a39f53435a8b

# "unset" the mountpoint (back to legacy)
zfs set mountpoint=legacy enc1/ix-applications/releases/gitea/volumes/pvc-40e27277-71e3-4469-88a3-a39f53435a8b
```

Mounting a volume

```bash
# mount
mount -t xfs /dev/zvol/enc0/dcsi/apps/pvc-d5090258-cf20-4f2e-a5cf-330ac00d0049 /mnt/dcsi_pvc

# unmount
umount /mnt/dcsi_pvc
```