move to project lifecycle structure

2024-07-21 02:20:48 -04:00
parent fd1fde499d
commit e6aff894e8
121 changed files with 6234 additions and 196 deletions


@@ -0,0 +1,155 @@
# Fedora Server
- [Fedora Server](#fedora-server)
- [Installation](#installation)
- [Setup SSH](#setup-ssh)
- [Fail2Ban](#fail2ban)
- [Automatic Updates](#automatic-updates)
- [Disable Swap](#disable-swap)
- [Extras](#extras)
<https://docs.fedoraproject.org/en-US/fedora-server/installation/postinstallation-tasks/#_manage_system_updates>
Note these instructions differentiate between an `operator` and a `server`. The operator can be
any machine that configures the server: a pipeline, a laptop, a dedicated server, etc. The server
can be its own operator, though that's not recommended, since servers should be ephemeral and the
operator will store information about each server.
## Installation
1. Make sure to use the `custom` disk partitioner and select `btrfs`.
2. Create an administrator. We'll give SSH root access later, but this gives you a Cockpit user.
3. Ensure the IPV6 connection is set to "eui64".
4. Set the hostname.
## Setup SSH
On the operator:
```bash
export SSH_HOST=kube
ssh-keygen -t rsa -b 4096 -C ducoterra@"$SSH_HOST".reeselink.com -f ~/.ssh/id_"$SSH_HOST"_rsa
# Note: If you get "too many authentication failures" it's likely because you have too many private
# keys in your ~/.ssh directory. Use `-o PubkeyAuthentication=no` (as below) to work around it.
ssh-copy-id -o PubkeyAuthentication=no -i ~/.ssh/id_"$SSH_HOST"_rsa.pub ducoterra@"$SSH_HOST".reeselink.com
cat <<EOF >> ~/.ssh/config
Host $SSH_HOST
Hostname "$SSH_HOST".reeselink.com
User root
ProxyCommand none
ForwardAgent no
ForwardX11 no
Port 22
KeepAlive yes
IdentityFile ~/.ssh/id_"$SSH_HOST"_rsa
EOF
```
On the server:
```bash
# Copy authorized_keys to root
sudo cp ~/.ssh/authorized_keys /root/.ssh/authorized_keys
# Change your password
passwd
sudo su -
echo "PasswordAuthentication no" > /etc/ssh/sshd_config.d/01-prohibit-password.conf
echo '%wheel ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/01-nopasswd-wheel
```
On the operator:
```bash
# Confirm that password authentication is now rejected
ssh -o PubkeyAuthentication=no ducoterra@"$SSH_HOST".reeselink.com
# Test that you can log into the server with ssh config
ssh $SSH_HOST
```
## Fail2Ban
On the server:
```bash
dnf install -y fail2ban
# Setup initial rules
cat <<EOF > /etc/fail2ban/jail.local
# Jail configuration additions for local installation
# Adjust the default configuration's default values
[DEFAULT]
# Optionally list trusted IPs that should never be banned
ignoreip = 2600:1700:1e6c:a81f::0/64
bantime = 6600
backend = auto
# The main configuration file defines all services but
# deactivates them by default. We have to activate the ones we need.
[sshd]
enabled = true
EOF
systemctl enable fail2ban --now
tail -f /var/log/fail2ban.log
```
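Once the jail is active, you can optionally sanity-check it (assuming the default `sshd` jail name from the config above):
```bash
# Show currently banned IPs and failure counts for the sshd jail
fail2ban-client status sshd
```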
## Automatic Updates
On the server:
```bash
dnf install dnf-automatic -y
systemctl enable --now dnf-automatic-install.timer
```
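The `dnf-automatic-install.timer` applies updates on its own; if you ever switch to the generic `dnf-automatic.timer`, behaviour is controlled by `/etc/dnf/automatic.conf` instead. A minimal sketch (the default value shown is an assumption, check your installed file):
```bash
# Ensure downloaded updates are actually applied, not just fetched
sed -i 's/^apply_updates = no/apply_updates = yes/' /etc/dnf/automatic.conf
```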
## Disable Swap
```bash
swapoff -a
zramctl --reset /dev/zram0
dnf -y remove zram-generator-defaults
```
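To confirm nothing is still swapping after the removal above:
```bash
# Should print no swap devices and 0B swap in use
swapon --show
free -h
```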
## Extras
On the server:
```bash
# Set vim as the default editor
dnf install -y vim-default-editor --allowerasing
# Install glances for system monitoring
dnf install -y glances
# Install zsh with autocomplete and suggestions
dnf install -y zsh zsh-autosuggestions zsh-syntax-highlighting
cat <<EOF > ~/.zshrc
# Basic settings
autoload bashcompinit && bashcompinit
autoload -U compinit; compinit
zstyle ':completion:*' menu select
# Prompt settings
autoload -Uz promptinit
promptinit
prompt redhat
PROMPT_EOL_MARK=
# Syntax Highlighting
source /usr/share/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh
source /usr/share/zsh-autosuggestions/zsh-autosuggestions.zsh
### Custom Commands and Aliases ###
EOF
chsh -s $(which zsh) && chsh -s $(which zsh) ducoterra
```


@@ -0,0 +1,428 @@
# K3S
- [K3S](#k3s)
- [Guide](#guide)
- [Disable Firewalld](#disable-firewalld)
- [Set SELinux to Permissive](#set-selinux-to-permissive)
- [Install K3S (Single Node)](#install-k3s-single-node)
- [Kube Credentials](#kube-credentials)
- [Storage](#storage)
- [Coredns](#coredns)
- [Metal LB](#metal-lb)
- [VLAN Setup](#vlan-setup)
- [Installation](#installation)
- [External DNS](#external-dns)
- [Credentials](#credentials)
- [Annotation](#annotation)
- [Nginx Ingress](#nginx-ingress)
- [Cert Manager](#cert-manager)
- [Test Minecraft Server](#test-minecraft-server)
- [Automatic Updates](#automatic-updates)
- [Database Backups](#database-backups)
- [Quickstart](#quickstart)
- [Help](#help)
- [Troubleshooting](#troubleshooting)
- [Deleting a stuck namespace](#deleting-a-stuck-namespace)
- [Fixing a bad volume](#fixing-a-bad-volume)
- [Mounting an ix-application volume from truenas](#mounting-an-ix-application-volume-from-truenas)
- [Mounting a volume](#mounting-a-volume)
- [Uninstall](#uninstall)
## Guide
1. Configure Host
2. Install CoreDNS for inter-container discovery
3. Install Metal LB for load balancer IP address assignment
4. Install External DNS for load balancer IP and ingress DNS records
5. Install Nginx Ingress for http services
6. Install Cert Manager for automatic Let's Encrypt certificates for Ingress nginx
7. Install longhorn storage for automatic PVC creation and management
8. Set up automatic database backups
## Disable Firewalld
<https://docs.k3s.io/advanced#red-hat-enterprise-linux--centos--fedora>
Disable firewalld. You could add rules for each service, but every time you expose a port
from a container you'd need to add a corresponding firewalld rule.
You can disable firewalld from the web interface (Cockpit).
## Set SELinux to Permissive
K3S is more than capable of running with SELinux set to enforcing. We won't be doing
that, however. We'll set it to permissive, and you can re-enable enforcing once you've added all
the rules you need to keep your services running.
Set SELinux to permissive by editing `/etc/selinux/config` and setting `SELINUX=permissive`.
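To apply the change without rebooting (a quick sketch; the config edit above is what makes it persistent):
```bash
setenforce 0   # switch the running system to permissive
getenforce     # should print "Permissive"
```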
## Install K3S (Single Node)
```bash
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.30.2+k3s2 sh -s - \
"--cluster-init" \
"--flannel-ipv6-masq" \
"--disable" \
"traefik" \
"--disable" \
"servicelb" \
"--disable" \
"coredns" \
"--disable" \
"local-storage" \
"--tls-san" \
"kube.reeselink.com" \
"--cluster-cidr" \
"10.42.0.0/16,fd02:c91e:56f4::/56" \
"--service-cidr" \
"10.43.0.0/16,fd02:c91e:56f5::/112" \
"--cluster-dns" \
"fd02:c91e:56f5::10"
```
## Kube Credentials
On the operator:
```bash
# Copy the kube config down
scp kube:/etc/rancher/k3s/k3s.yaml ~/.kube/admin-kube-config
# Edit the server to match the remote address.
```
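A sketch of that edit, assuming the default `https://127.0.0.1:6443` server address in `k3s.yaml` and the `kube.reeselink.com` TLS SAN set during install:
```bash
# Point the config at the cluster's public name and verify access
# (on macOS use: sed -i '' ...)
sed -i 's/127.0.0.1/kube.reeselink.com/' ~/.kube/admin-kube-config
export KUBECONFIG=~/.kube/admin-kube-config
kubectl get nodes
```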
## Storage
1. `mkdir /var/lib/rancher/k3s/storage`
2. Edit `/etc/fstab` to mount your drive to `/var/lib/rancher/k3s/storage` (see the example entry after this list)
3. `systemctl daemon-reload`
4. `mount -a`
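An example fstab entry for step 2 (the UUID and filesystem type are placeholders for your actual drive):
```bash
# /etc/fstab — dedicated drive for local-path-provisioner volumes
UUID=REPLACE-WITH-YOUR-UUID  /var/lib/rancher/k3s/storage  btrfs  defaults,noatime  0 0
```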
<https://github.com/rancher/local-path-provisioner/tree/master/deploy/chart/local-path-provisioner>
```bash
# Download the updated template from github
kubectl kustomize "github.com/rancher/local-path-provisioner/deploy?ref=v0.0.28" > local-path-provisioner/local-path-storage.yaml
# Apply customizations (ssd/hdd storage, read write many support)
kubectl kustomize local-path-provisioner | kubectl apply -f -
# Create test pod
kubectl apply -f k3s/tests/local-storage-test.yaml
```
## Coredns
1. Edit `coredns/values.yaml` to ensure the forward nameserver is correct.
```bash
# Install CoreDNS
helm upgrade --install \
--namespace=kube-system \
--values coredns/values.yaml \
coredns coredns/coredns
# Test DNS works
kubectl run -it --rm \
--restart=Never \
--image=infoblox/dnstools:latest \
dnstools
```
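Inside the `dnstools` pod you can check both cluster and external resolution, for example:
```bash
# Cluster-internal name, resolved by CoreDNS
host kubernetes.default.svc.cluster.local
# External name, resolved through the configured forwarder
host fedoraproject.org
```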
## Metal LB
### VLAN Setup
Before working with Metallb you'll need at least one available VLAN. On Unifi equipment
this is accomplished by creating a new network. Don't assign it to anything.
On the Linux machine you can use `nmcli` or Cockpit to configure a new VLAN network interface.
With cockpit:
1. Add a new VLAN network
2. The parent should be the physical adapter connected to your switch
3. Set the VLAN ID to the VLAN number of your created unifi network
4. Click create
5. Click into the new network
6. Turn off IPv4 and IPv6 DNS (otherwise it will exceed the resolv.conf nameserver limit)
7. Turn on the network interface
8. Attempt to ping the acquired address(es)
### Installation
We'll be swapping K3S's default load balancer with MetalLB for more flexibility. ServiceLB was
struggling to allocate IP addresses for load-balanced services. MetalLB does make things a little
more complicated (you'll need special annotations, see below), but it's otherwise a well-tested,
stable load balancing service with features to grow into.
MetalLB is pretty cool. It works via L2 advertisement or BGP. We won't be using BGP, so let's
focus on L2.
When we connect our nodes to a network we give them an IP address range: ex. `192.168.122.20/24`.
This range represents all the available addresses the node could be assigned. Usually we assign
a single "static" IP address for our node and direct traffic to it by port forwarding from our
router. This is fine for single nodes - but what if we have a cluster of nodes and we don't want
our service to disappear just because one node is down for maintenance?
This is where L2 advertising comes in. MetalLB will assign a static IP address from a given
pool to any arbitrary node, then advertise that node's MAC address as the location for the
IP. When that node goes down, MetalLB simply advertises a new MAC address for the same IP
address, effectively moving the IP to another node. This isn't really "load balancing" but
"failover". Fortunately, that's exactly what we're looking for.
```bash
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm upgrade --install metallb \
--namespace kube-system \
metallb/metallb
```
MetalLB doesn't know what IP addresses are available for it to allocate so we'll have
to provide it with a list. The `metallb-addresspool.yaml` has one IP address (we'll get to
IP address sharing in a second) which is an unassigned IP address not allocated to any of our
nodes. Note if you have many public IPs which all point to the same router or virtual network
you can list them. We're only going to use one because we want to port forward from our router.
```bash
# create the metallb allocation pool
kubectl apply -f metallb/addresspool.yaml
```
You'll need to annotate your service as follows if you want an external IP:
```yaml
metadata:
annotations:
metallb.universe.tf/address-pool: "external"
# or
metallb.universe.tf/address-pool: "internal"
spec:
ipFamilyPolicy: SingleStack
ipFamilies:
- IPv6
```
## External DNS
<https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/aws.md>
### Credentials
1. Generate credentials for the cluster
```bash
aws iam create-user --user-name "externaldns"
aws iam attach-user-policy --user-name "externaldns" --policy-arn arn:aws:iam::892236928704:policy/update-reeseapps
aws iam attach-user-policy --user-name "externaldns" --policy-arn arn:aws:iam::892236928704:policy/update-reeselink
SECRET_ACCESS_KEY=$(aws iam create-access-key --user-name "externaldns")
ACCESS_KEY_ID=$(echo $SECRET_ACCESS_KEY | jq -r '.AccessKey.AccessKeyId')
cat <<-EOF > secrets/externaldns-credentials
[default]
aws_access_key_id = $(echo $ACCESS_KEY_ID)
aws_secret_access_key = $(echo $SECRET_ACCESS_KEY | jq -r '.AccessKey.SecretAccessKey')
EOF
kubectl create secret generic external-dns \
--namespace kube-system --from-file secrets/externaldns-credentials
kubectl apply -f external-dns/sa.yaml
kubectl apply -f external-dns/deploy.yaml
```
### Annotation
```yaml
metadata:
annotations:
external-dns.alpha.kubernetes.io/hostname: example.com
```
## Nginx Ingress
Now we need an ingress solution (preferably with certs for HTTPS). We'll be using nginx since
it's a little more configurable than Traefik (though don't sell Traefik short, it's really
good, just finicky when you have use cases it hasn't explicitly coded for).
```bash
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install \
ingress-nginx \
ingress-nginx/ingress-nginx \
--values ingress-nginx/values.yaml \
--namespace kube-system
```
## Cert Manager
Install cert-manager
```bash
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install \
cert-manager jetstack/cert-manager \
--namespace kube-system \
--set crds.enabled=true
```
Create the Let's Encrypt issuer (Route53 DNS):
```bash
export LE_ACCESS_KEY_ID=
export LE_SECRET_KEY=
cat <<EOF > secrets/cert-manager-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: prod-route53-credentials-cert-manager
data:
  access-key-id: $(echo -n $LE_ACCESS_KEY_ID | base64)
  secret-access-key: $(echo -n $LE_SECRET_KEY | base64)
EOF
kubectl apply -f secrets/cert-manager-secret.yaml
```
```bash
cat <<EOF > secrets/route53-cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: nginx@ducoterra.net
privateKeySecretRef:
name: letsencrypt
solvers:
- selector:
dnsZones:
- "reeseapps.com"
dns01:
route53:
region: us-east-1
hostedZoneID: Z012820733346FJ0U4FUF
accessKeyID: ${LE_ACCESS_KEY_ID}
secretAccessKeySecretRef:
name: prod-route53-credentials-cert-manager
key: secret-access-key
EOF
kubectl apply -f secrets/route53-cluster-issuer.yaml
```
You can test if your ingress is working with:
```bash
# Navigate to demo.reeseapps.com
kubectl apply -f k3s/tests/ingress-nginx-test.yaml
# Cleanup
kubectl delete -f k3s/tests/ingress-nginx-test.yaml
```
## Test Minecraft Server
```bash
helm upgrade --install minecraft ./minecraft -n minecraft --create-namespace
```
## Automatic Updates
<https://docs.k3s.io/upgrades/automated>
```bash
kubectl create namespace system-upgrade
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/latest/download/system-upgrade-controller.yaml
kubectl apply -f https://github.com/rancher/system-upgrade-controller/releases/latest/download/crd.yaml
kubectl apply -f k3s/upgrade-plan.yaml
# Check plan
kubectl get plan -n system-upgrade
```
## Database Backups
<https://docs.k3s.io/cli/etcd-snapshot>
Note: you must back up `/var/lib/rancher/k3s/server/token`
and use its contents as the token when restoring the backup, as the snapshot data is encrypted with that token.
Backups are saved to `/var/lib/rancher/k3s/server/db/snapshots/` by default.
```bash
k3s etcd-snapshot save
k3s etcd-snapshot list
k3s server \
--cluster-reset \
--cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/on-demand-kube-1720459685
```
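From the operator, it's worth keeping a copy of that token with your other credentials (a sketch, assuming the `kube` SSH config from the server setup docs):
```bash
# Keep the snapshot encryption token alongside the kube config
scp kube:/var/lib/rancher/k3s/server/token ~/.kube/kube-server-token
```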
### Quickstart
```bash
# Create certsigner pod for all other operations
./setup.sh <server_fqdn>
# Create a user, use "admin" to create an admin user
./upsertuser.sh <ssh_address> <server_fqdn (for kubectl)> <user>
# Remove a user, their namespace, and their access
./removeuserspace <server_fqdn> <user>
```
## Help
### Troubleshooting
#### Deleting a stuck namespace
```bash
NAMESPACE=nginx
kubectl proxy &
kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' >temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize
```
#### Fixing a bad volume
```bash
xfs_repair -L /dev/sdg
```
#### Mounting an ix-application volume from truenas
```bash
# set the mountpoint
zfs set mountpoint=/ix_pvc enc1/ix-applications/releases/gitea/volumes/pvc-40e27277-71e3-4469-88a3-a39f53435a8b
#"unset" the mountpoint (back to legacy)
zfs set mountpoint=legacy enc1/ix-applications/releases/gitea/volumes/pvc-40e27277-71e3-4469-88a3-a39f53435a8b
```
#### Mounting a volume
```bash
# mount
mount -t xfs /dev/zvol/enc0/dcsi/apps/pvc-d5090258-cf20-4f2e-a5cf-330ac00d0049 /mnt/dcsi_pvc
# unmount
umount /mnt/dcsi_pvc
```
## Uninstall
```bash
/usr/local/bin/k3s-uninstall.sh
```


@@ -0,0 +1,93 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: zfs-iscsi-enc0
annotations:
"helm.sh/resource-policy": keep
spec:
storageClassName: zfs-iscsi-enc0
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: zfs-iscsi-enc1
annotations:
"helm.sh/resource-policy": keep
spec:
storageClassName: zfs-iscsi-enc1
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: zfs-nfs-enc1
annotations:
"helm.sh/resource-policy": keep
spec:
storageClassName: zfs-nfs-enc1
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: democratic-csi-test
spec:
selector:
matchLabels:
app: democratic-csi-test
template:
metadata:
labels:
app: democratic-csi-test
spec:
containers:
- image: debian
command:
- bash
- -c
- 'sleep infinity'
name: democratic-csi-test
volumeMounts:
- mountPath: /zfs_iscsi_enc0
name: zfs-iscsi-enc0
- mountPath: /zfs_iscsi_enc1
name: zfs-iscsi-enc1
- mountPath: /zfs_nfs_enc1
name: zfs-nfs-enc1
resources:
limits:
memory: "4Gi"
cpu: "2"
requests:
memory: "1Mi"
cpu: "1m"
restartPolicy: Always
volumes:
- name: zfs-iscsi-enc0
persistentVolumeClaim:
claimName: zfs-iscsi-enc0
- name: zfs-iscsi-enc1
persistentVolumeClaim:
claimName: zfs-iscsi-enc1
- name: zfs-nfs-enc1
persistentVolumeClaim:
claimName: zfs-nfs-enc1


@@ -0,0 +1,45 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: ffmpeg
spec:
selector:
matchLabels:
app: ffmpeg
template:
metadata:
labels:
app: ffmpeg
spec:
volumes:
- name: data
persistentVolumeClaim:
claimName: ffmpeg
containers:
- name: ffmpeg
image: linuxserver/ffmpeg:latest
volumeMounts:
- mountPath: /config
name: data
command:
- /bin/bash
- -c
- 'sleep infinity'
resources:
limits:
memory: "2Gi"
cpu: "8"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ffmpeg
spec:
storageClassName: zfs-iscsi-enc0-ext4
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 64Gi


@@ -0,0 +1,66 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-nginx-demo
spec:
selector:
matchLabels:
app.kubernetes.io/name: ingress-nginx-demo
strategy:
type: Recreate
template:
metadata:
labels:
app.kubernetes.io/name: ingress-nginx-demo
spec:
containers:
- name: httpd
image: httpd
ports:
- containerPort: 80
name: http
---
apiVersion: v1
kind: Service
metadata:
name: ingress-nginx-demo
spec:
type: ClusterIP
selector:
app.kubernetes.io/name: ingress-nginx-demo
ports:
- name: ingress-nginx-demo
protocol: TCP
port: 80
targetPort: http
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress-nginx-demo
annotations:
cert-manager.io/cluster-issuer: letsencrypt
external-dns.alpha.kubernetes.io/ttl: "60"
nginx.ingress.kubernetes.io/proxy-body-size: "0"
nginx.org/client-max-body-size: "0"
spec:
ingressClassName: nginx
rules:
- host: demo.reeseapps.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: ingress-nginx-demo
port:
number: 80
tls:
- hosts:
- demo.reeseapps.com
secretName: ingress-nginx-demo-tls-cert


@@ -0,0 +1,71 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ssd-test
namespace: default
spec:
storageClassName: ssd
accessModes:
- ReadWriteMany
resources:
requests:
storage: 8Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: hdd-test
namespace: default
spec:
storageClassName: hdd
accessModes:
- ReadWriteMany
resources:
requests:
storage: 8Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: local-storage-test
namespace: default
spec:
selector:
matchLabels:
app: local-storage-test
template:
metadata:
labels:
app: local-storage-test
spec:
containers:
- image: debian
command:
- bash
- -c
- 'sleep infinity'
name: local-storage-test
volumeMounts:
- mountPath: /ssd
name: ssd
- mountPath: /hdd
name: hdd
resources:
limits:
memory: "4Gi"
cpu: "2"
requests:
memory: "1Mi"
cpu: "1m"
restartPolicy: Always
volumes:
- name: hdd
persistentVolumeClaim:
claimName: hdd-test
- name: ssd
persistentVolumeClaim:
claimName: ssd-test


@@ -0,0 +1,101 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-nginx-demo-1
namespace: default
spec:
selector:
matchLabels:
app.kubernetes.io/name: ingress-nginx-demo-1
strategy:
type: Recreate
template:
metadata:
labels:
app.kubernetes.io/name: ingress-nginx-demo-1
spec:
containers:
- name: httpd
image: httpd
ports:
- containerPort: 80
name: http
resources:
requests:
memory: "100Mi"
cpu: "1m"
limits:
memory: "256Mi"
cpu: "1"
---
apiVersion: v1
kind: Service
metadata:
name: ingress-nginx-demo-1
namespace: default
annotations:
metallb.universe.tf/allow-shared-ip: "production"
metallb.universe.tf/address-pool: production
spec:
type: LoadBalancer
ports:
- name: http
protocol: TCP
port: 8001
targetPort: 80
selector:
app.kubernetes.io/name: ingress-nginx-demo-1
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-nginx-demo-2
namespace: default
spec:
selector:
matchLabels:
app.kubernetes.io/name: ingress-nginx-demo-2
strategy:
type: Recreate
template:
metadata:
labels:
app.kubernetes.io/name: ingress-nginx-demo-2
spec:
containers:
- name: httpd
image: httpd
ports:
- containerPort: 80
name: http
resources:
requests:
memory: "100Mi"
cpu: "1m"
limits:
memory: "256Mi"
cpu: "1"
---
apiVersion: v1
kind: Service
metadata:
name: ingress-nginx-demo-2
namespace: default
annotations:
metallb.universe.tf/allow-shared-ip: "production"
metallb.universe.tf/address-pool: production
spec:
type: LoadBalancer
ports:
- name: http
protocol: TCP
port: 8002
targetPort: 80
selector:
app.kubernetes.io/name: ingress-nginx-demo-2


@@ -0,0 +1,49 @@
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # has to match .spec.template.metadata.labels
serviceName: "nginx"
replicas: 3 # by default is 1
minReadySeconds: 10 # by default is 0
template:
metadata:
labels:
app: nginx # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: registry.k8s.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 1Gi


@@ -0,0 +1,19 @@
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
name: server-plan
namespace: system-upgrade
spec:
concurrency: 1
cordon: true
nodeSelector:
matchExpressions:
- key: node-role.kubernetes.io/control-plane
operator: In
values:
- "true"
serviceAccountName: system-upgrade
upgrade:
image: rancher/k3s-upgrade
channel: https://update.k3s.io/v1-release/channels/stable


@@ -0,0 +1,369 @@
# Truenas
- [Truenas](#truenas)
- [Bios settings](#bios-settings)
- [Archiving](#archiving)
- [Deleting snapshots](#deleting-snapshots)
- [But First, ZFS on RPi](#but-first-zfs-on-rpi)
- [Pi Setup](#pi-setup)
- [Datasets, Snapshots, and Encryption](#datasets-snapshots-and-encryption)
- [Migrating encrypted pools](#migrating-encrypted-pools)
- [Migrating Properties](#migrating-properties)
- [Backup Task Settings](#backup-task-settings)
- [Create and Destroy zfs Datasets](#create-and-destroy-zfs-datasets)
- [Create and send snapshots](#create-and-send-snapshots)
- [Cleaning up old snapshots](#cleaning-up-old-snapshots)
- [VMs](#vms)
- [Converting zvol to qcow2](#converting-zvol-to-qcow2)
- [Tunables](#tunables)
- [Core](#core)
- [Scale](#scale)
- [ARC Limit](#arc-limit)
- [Certs](#certs)
- [Testing](#testing)
- [iperf](#iperf)
- [disk](#disk)
- [disk health](#disk-health)
- [Dead Disks](#dead-disks)
- [Corrupted data](#corrupted-data)
## Bios settings
These are my recommended settings that seem stable and allow GPU passthrough:
1. Memory 3200 MHz, fabric 1600 MHz
2. AC Power - On
3. SVM - On
4. IOMMU - On (do not touch Resizable BAR or other PCIe settings)
5. Fans 100%
6. Initial video output: pci 3
7. PCIE slot 1 bifurcation: 4x4x4x4
8. Disable CSM
9. Fast Boot Enabled
## Archiving
1. Create a recursive snapshot called "archive_pool_year_month_day"
2. Create a replication task called "archive_pool_year_month_day"
- select all datasets you want to backup
- fill in enc0/archives/archive-year-month-day_hour-minute
- full filesystem replication
- select "Matching naming schema"
- Use `archive-%Y-%m-%d_%H-%M`
- Deselect run automatically
- Save and run
## Deleting snapshots
Sometimes you need to delete many snapshots from a certain dataset. The UI is terrible for this, so
we need to use `zfs destroy`. xargs is the best way to do this since it allows parallel processing.
```bash
# zfs list snapshots with:
# -o name: only print the name
# -S creation: sort by creation time
# -H: don't display headers
# -r: recurse through every child dataset
zfs list -t snapshot enc0/archives -o name -S creation -H -r
# pipe it through xargs with:
# -n 1: take only 1 argument from the pipe per command
# -P 8: eight parallel processes
# Also pass to zfs destroy:
# -v: verbose
# -n: dryrun
zfs list -t snapshot enc0/archives -o name -S creation -H -r | xargs -n 1 -P 8 zfs destroy -v -n
# if that looks good you can remove the "-n"
zfs list -t snapshot enc0/archives -o name -S creation -H -r | xargs -n 1 -P 8 zfs destroy -v
```
## But First, ZFS on RPi
A really good backup server is an RPi running openzfs. See [the openzfs docs](https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2020.04%20Root%20on%20ZFS%20for%20Raspberry%20Pi.html#step-2-setup-zfs) for more info.
### Pi Setup
Add the vault ssh CA key to your pi.
```bash
curl -o /etc/ssh/trusted-user-ca-keys.pem https://vault.ducoterra.net/v1/ssh-client-signer/public_key
echo "TrustedUserCAKeys /etc/ssh/trusted-user-ca-keys.pem" >> /etc/ssh/sshd_config
service ssh restart
```
Create a pi user.
```bash
adduser pi
usermod -a -G sudo pi
```
SSH to the pi as the "pi" user. Delete the ubuntu user.
```bash
killall -u ubuntu
userdel -r ubuntu
```
Disable SSH password authentication
```bash
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
service ssh restart
```
Change the hostname.
```bash
echo pi-nas > /etc/hostname
```
Upgrade and restart the pi.
```bash
apt update && apt upgrade -y && apt autoremove -y
reboot
```
Install ZFS.
```bash
apt install -y pv zfs-initramfs
```
Find the disks you want to use to create your pool
```bash
fdisk -l
```
Create a pool.
```bash
mkdir -p /mnt/backup
zpool create \
-o ashift=12 \
-O acltype=posixacl -O canmount=off -O compression=lz4 \
-O dnodesize=auto -O normalization=formD -O relatime=on \
-O xattr=sa -O mountpoint=/mnt/backup \
backup ${DISK}
```
## Datasets, Snapshots, and Encryption
### Migrating encrypted pools
Since you can't use `-R` to send encrypted datasets recursively you'll need to use more creative tactics. Here's my recommendation:
1. Save the datasets from a pool to a text file:
```bash
zfs list -r -o name <pool> > pool_datasets.txt
```
2. Next, remove the prefix of the source pool from the list of datasets. Also remove the source pool itself as well as any duplicate pools in the receiving dataset.
3. Now, run a command like the following:
```bash
for i in $(cat nvme_pools.txt); do zfs send -v nvme/$i@manual-2021-10-03_22-34 | zfs recv -x encryption enc0/$i; done
```
### Migrating Properties
If you need to migrate your dataset comments you can use the following bash to automate the task.
```bash
for i in $(zfs list -H -d 1 -o name backup/nvme/k3os-private); do read -r name desc < <(zfs list -H -o name,org.freenas:description $i) && pvc=$(echo "$name" | awk -F "/" '{print $NF}') && zfs set org.freenas:description=$desc enc1/k3os-private/$pvc; done
```
### Backup Task Settings
| Key | Value |
| ------------------------------------ | --------------------- |
| Destination Dataset Read-only Policy | SET |
| Recursive | true |
| Snapshot Retention Policy | Same as Source |
| Include Dataset Properties | true |
| Periodic Snapshot Tasks | <daily-snapshot-task> |
### Create and Destroy zfs Datasets
```bash
# Create a pool
zpool create rpool /dev/disk/by-id/disk-id
# Add a cache disk
zpool add backup cache /dev/sda
# Enable encryption
zpool set feature@encryption=enabled rpool
# Create a dataset
zfs create rpool/d1
# Create an encrypted dataset
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase rpool/d1
# Delete a dataset
zfs destroy rpool/d1
```
### Create and send snapshots
```bash
# snapshot pool and all children
zfs snapshot -r dataset@now
# send all child snapshots
zfs send -R dataset@snapshot | zfs recv dataset
# use the -w raw flag to send encrypted snapshots
zfs send -R -w dataset@snapshot | zfs recv dataset
```
### Cleaning up old snapshots
```bash
wget https://raw.githubusercontent.com/bahamas10/zfs-prune-snapshots/master/zfs-prune-snapshots
```
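The script's flags may change, so check `./zfs-prune-snapshots -h` first; the general pattern (flags here are assumptions from the upstream README) is a dry run followed by the real prune:
```bash
chmod +x zfs-prune-snapshots
# dry run: list snapshots older than 30 days under enc0/archives without deleting
./zfs-prune-snapshots -n 30d enc0/archives
# looks good? drop -n to actually destroy them
./zfs-prune-snapshots 30d enc0/archives
```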
## VMs
1. Force UEFI installation
2. `cp /boot/efi/EFI/debian/grubx64.efi /boot/efi/EFI/BOOT/bootx64.efi`
### Converting zvol to qcow2
```bash
dd if=/dev/zvol/enc1/vms/unifi-e373f of=unifi.raw
qemu-img convert -f raw -O qcow2 unifi.raw unifi.qcow2
```
## Tunables
### Core
```bash
sysctl kern.ipc.somaxconn=2048
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.recvspace=4194304
sysctl net.inet.tcp.sendspace=2097152
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216
sysctl net.inet.tcp.sendbuf_auto=1
sysctl net.inet.tcp.recvbuf_auto=1
sysctl net.inet.tcp.sendbuf_inc=16384
sysctl net.inet.tcp.recvbuf_inc=524288
sysctl vfs.zfs.arc_max=34359738368 # set arc size to 32 GiB to prevent eating VMs
loader vm.kmem_size=34359738368 # set kmem_size to 32 GiB to force arc_max to apply
loader vm.kmem_size_max=34359738368 # set kmem_size_max to 32 GiB to sync with kmem_size
```
NIC options: "mtu 9000 rxcsum txcsum tso4 lro"
### Scale
#### ARC Limit
Create an Init/Shutdown Script of type `Command` with the following:
```bash
echo 34359738368 >> /sys/module/zfs/parameters/zfs_arc_max
```
Set `When` to `Pre Init`.
## Certs
<https://raymondc.net/2018/02/28/using-freenas-as-your-ca.html>
1. Create a new Root certificate (CAs -> ADD -> Internal CA)
- Name: Something_Root
- Key Length: 4096
- Digest: SHA512
- Lifetime: 825 (Apple's new requirement)
- Extend Key Usage: Server Auth
- Common Name: Something Root CA
- Subject Alternate Names:
2. Create a new intermediate certificate (CAs -> Add -> Intermediate CA)
- Name: Something_Intermediate_CA
- Key Length: 4096
- Digest: SHA512
- Lifetime: 825 (Apple's new requirement)
- Extend Key Usage: Server Auth
3. Create a new Certificate (Certificates -> Add -> Internal Certificate)
- Name: Something_Certificate
- Key Length: 4096
- Digest: SHA512
- Lifetime: 825 (Apple's new requirement)
- Extend Key Usage: Server Auth
## Testing
### iperf
```bash
iperf3 -c mainframe -P 4
iperf3 -c mainframe -P 4 -R
iperf3 -c pc -P 4
iperf3 -c pc -P 4 -R
```
### disk
```bash
# write 16GB to disk
dd if=/dev/zero of=/tmp/test bs=1024k count=16000
# divide result by 1000^3 to get GB/s
# read 16GB from disk
dd if=/tmp/test of=/dev/null bs=1024k
# divide result by 1000^3 to get GB/s
```
## disk health
<https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/internal-drives/wd-black-ssd/product-brief-wd-black-sn750-nvme-ssd.pdf>
```bash
# HDD
smartctl -a /dev/ada1 | grep "SMART Attributes" -A 18
# NVME
smartctl -a /dev/nvme1 | grep "SMART/Health Information" -A 17
```
## Dead Disks
```bash
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Black
Device Model: WDC WD2003FZEX-00Z4SA0
Serial Number: WD-WMC5C0D6PZYZ
LU WWN Device Id: 5 0014ee 65a5a19fc
Firmware Version: 01.01A01
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Feb 13 18:31:57 2021 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
```
## Corrupted data
One or more devices has experienced an error resulting in data corruption. Applications may be affected.
To get a list of affected files run:
```bash
zpool status -v
```


@@ -0,0 +1,146 @@
# Ubuntu Server
- [Ubuntu Server](#ubuntu-server)
- [Setup SSH](#setup-ssh)
- [Fail2Ban](#fail2ban)
- [Automatic Updates](#automatic-updates)
- [Disable Swap](#disable-swap)
- [Extras](#extras)
Note these instructions differentiate between an `operator` and a `server`. The operator can be
any machine that configures the server: a pipeline, a laptop, a dedicated server, etc. The server
can be its own operator, though that's not recommended, since servers should be ephemeral and the
operator will store information about each server.
## Setup SSH
On the operator:
```bash
export SSH_HOST=kube
ssh-keygen -t rsa -b 4096 -C ducoterra@${SSH_HOST}.reeselink.com -f ~/.ssh/id_${SSH_HOST}_rsa
# Note: If you get "too many authentication failures" it's likely because you have too many private
# keys in your ~/.ssh directory. Use `-o PubkeyAuthentication=no` (as below) to work around it.
ssh-copy-id -o PubkeyAuthentication=no -i ~/.ssh/id_${SSH_HOST}_rsa.pub ducoterra@${SSH_HOST}.reeselink.com
cat <<EOF >> ~/.ssh/config
Host $SSH_HOST
Hostname ${SSH_HOST}.reeselink.com
User root
ProxyCommand none
ForwardAgent no
ForwardX11 no
Port 22
KeepAlive yes
IdentityFile ~/.ssh/id_${SSH_HOST}_rsa
EOF
```
On the server:
```bash
# Copy authorized_keys to root
sudo cp ~/.ssh/authorized_keys /root/.ssh/authorized_keys
# Change your password
passwd
sudo su -
echo "PasswordAuthentication no" > /etc/ssh/sshd_config.d/01-prohibit-password.conf
echo '%sudo ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/01-nopasswd-sudo
systemctl restart sshd
```
On the operator:
```bash
# Confirm that password authentication is now rejected
ssh -o PubkeyAuthentication=no ducoterra@${SSH_HOST}.reeselink.com
# Test that you can log into the server with ssh config
ssh $SSH_HOST
```
## Fail2Ban
On the server:
```bash
apt update
apt install -y fail2ban
# Setup initial rules
cat <<EOF > /etc/fail2ban/jail.local
# Jail configuration additions for local installation
# Adjust the default configuration's default values
[DEFAULT]
# Optionally list trusted IPs that should never be banned
ignoreip = 2600:1700:1e6c:a81f::0/64
bantime = 6600
backend = auto
# The main configuration file defines all services but
# deactivates them by default. We have to activate the ones we need.
[sshd]
enabled = true
EOF
systemctl enable fail2ban --now
tail -f /var/log/fail2ban.log
```
## Automatic Updates
On the server:
```bash
apt install -y unattended-upgrades
systemctl enable --now unattended-upgrades.service
```
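unattended-upgrades is driven by the APT periodic settings; a minimal sketch to make sure both the package-list refresh and the upgrade run are enabled (this mirrors what `dpkg-reconfigure unattended-upgrades` writes):
```bash
cat <<EOF > /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
```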
## Disable Swap
```bash
swapoff -a
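# Note: swapoff alone doesn't persist across reboots; also comment out any swap entries in /etc/fstab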
```
## Extras
On the server:
```bash
# Install glances for system monitoring
apt install -y glances
# Install zsh with autocomplete and suggestions
apt install -y zsh zsh-autosuggestions zsh-syntax-highlighting
cat <<EOF > ~/.zshrc
# Basic settings
autoload bashcompinit && bashcompinit
autoload -U compinit; compinit
zstyle ':completion:*' menu select
# Prompt settings
autoload -Uz promptinit
promptinit
prompt redhat
PROMPT_EOL_MARK=
# Syntax Highlighting
source /usr/share/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh
source /usr/share/zsh-autosuggestions/zsh-autosuggestions.zsh
### Custom Commands and Aliases ###
EOF
chsh -s $(which zsh) && chsh -s $(which zsh) ducoterra
# Cockpit
apt install -y cockpit
systemctl enable --now cockpit
```


@@ -0,0 +1,47 @@
# Network Management
- [Network Management](#network-management)
- [IP Addresses](#ip-addresses)
- [Route53](#route53)
- [IPV6 EUI64 Address Generation](#ipv6-eui64-address-generation)
- [NetworkManager](#networkmanager)
## IP Addresses
| Hostname | IPV4 | IPV6 |
| -------- | ----------- | ------------------ |
| unifi | 192.168.2.1 | 2603:6013:3140:102 |
| lab | 10.1.0.1 | 2603:6013:3140:100 |
| iot | 10.2.0.1 | |
| home | 10.3.0.1 | 2603:6013:3140:103 |
| metallb | 10.5.0.1 | 2603:6013:3140:101 |
## Route53
```bash
aws route53 list-hosted-zones
# reeselink
aws route53 change-resource-record-sets --hosted-zone-id Z0092652G7L97DSINN18 --change-batch file://
# reeseapps
aws route53 change-resource-record-sets --hosted-zone-id Z012820733346FJ0U4FUF --change-batch file://
```
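The `--change-batch file://` path is left blank above; a sketch of what such a change batch file might contain (the record name and value are placeholders):
```bash
cat <<EOF > change-batch.json
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.reeselink.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{ "Value": "203.0.113.10" }]
      }
    }
  ]
}
EOF
aws route53 change-resource-record-sets --hosted-zone-id Z0092652G7L97DSINN18 --change-batch file://change-batch.json
```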
## IPV6 EUI64 Address Generation
This will ensure a static IPv6 address that is based on your MAC address.
You can tell an address is EUI-64 if `ff:fe` appears in the middle of the interface identifier (straddling the 6th and 7th groups).
### NetworkManager
(Fedora Server, Raspberry Pi, Debian)
```bash
nmcli connection show --active
nmcli -f ipv6.addr-gen-mode connection show <connection>
nmcli con mod <connection> ipv6.addr-gen-mode eui64
systemctl restart NetworkManager
nmcli -f ipv6.addr-gen-mode connection show <connection>
```