checkpoint commit
All checks were successful
Podman DDNS Image / build-and-push-ddns (push) Successful in 1m3s

This commit is contained in:
2026-05-05 06:26:40 -04:00
parent e43c534ceb
commit f2015e2c71
76 changed files with 4265 additions and 235 deletions

stories/README.md Normal file
@@ -0,0 +1,22 @@
# Stories
Stories I want to tell. Unlike the `active` project, which stores notes in no
particular order, stories are meant to be read and enjoyed from top to bottom.
Hopefully they teach you something.
## This is a mkdocs project
`docs/` contains all the stories.
`mkdocs.yml` holds the project config.
```bash
# Run the mkdocs site with mkdocs serve
uv run mkdocs serve
```
## Errata
mkdocs has a [bug that breaks
autoreload](https://github.com/mkdocs/mkdocs/issues/4032). This project has
pinned click to `8.2.1` to fix it.
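One way to express the pin, assuming this project is managed with uv and a `pyproject.toml` (a sketch; the actual dependency list may differ):

```toml
[project]
name = "stories"
version = "0.1.0"
dependencies = [
    "mkdocs",
    # Pinned: newer click releases break mkdocs autoreload (mkdocs/mkdocs#4032)
    "click==8.2.1",
]
```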

@@ -0,0 +1,92 @@
# I want to build the perfect homelab server
Date Written: 02/11/26
Fedora Version: 43
## Intro
And it will run Fedora. Backstory: I ran Truenas for a long time. I started with
Freenas, got confused by jails, switched to this new thing called Truenas,
learned about ZFS, watched "Apps" come into existence, watched Kubernetes
support rise and fall, watched the disaster that was Incus container/VM support
and the subsequent "rollback" to traditional VMs, and got really tired of the
lack of control.
App backup and restore was never, and still isn't, well supported. Taking
snapshots of your apps pool and then sending those snapshots to a backup drive
un-hid the ix-systems directory (which would frequently have thousands of
snapshots due to Truenas's liberal use of subvolumes and would slow down the UI
immensely). App data was intentionally hidden from the user for some reason.
Migrating between Docker, then Kubernetes, then Incus never felt fully planned.
The Truenas Charts app market was awesome, but building a Truenas Chart was
complex and required duplicating every `values.yaml` configurable parameter
into a new yaml file that the UI could use for form-fill. They got rid of
Truenas Charts anyway, so you just have to hope that your favorite app's
maintainer is comfortable rewriting their app in the new format and supporting
some kind of migration strategy. Every six months I would expect some kind of
downtime because Truenas would change something critical and it would
inevitably impact my workflow.
So, if I'm going to be subject to the whims of a changing platform anyway (given
Truenas is supposed to be based on Debian, aka the *stable* choice), and if I'm
going to suffer breaking changes every 6 months no matter what I choose, then I
may as well have the latest and greatest via a rolling kernel distro.
So why not Arch? Simply: SELinux. SELinux is currently not officially supported
on Arch Linux. Plus Fedora Server comes with a lot built in that I like.
Cockpit, Firewalld, Podman, SELinux, OSBuild, and RPM support all work out of
the box. These are, imo, the "bare bones" requirements for a server exposed to
the internet that will run homelab services.
So let's get started configuring an awesome Fedora server to keep your data safe
and run your Homelab services with minimal downtime.
## Installation
When installing Fedora from the ISO, take some time at the installation menu to
configure some basics.
Don't worry about RAID for now, we can convert a single disk into a RAID 1 array
later.
If you don't have an SSH key already, generate one for yourself so you can log into the server. On your local machine:
```bash
# Generate the key
# Save it to the default location (~/.ssh/id_ed25519)
# Please please please encrypt it with a password. Something memorable. Write it down. Friends don't let friends have naked SSH keys.
ssh-keygen -t ed25519
```
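If you want to script key generation non-interactively (for testing only; your real key should have a passphrase as noted above), a sketch:

```shell
# Generate a throwaway ed25519 key pair non-interactively into a temp dir
# (empty passphrase, demo only -- real keys should be encrypted)
KEYDIR="$(mktemp -d)"
ssh-keygen -t ed25519 -f "$KEYDIR/id_ed25519" -N "" -C "demo-key" -q
# The public half is what ends up in the server's ~/.ssh/authorized_keys
cat "$KEYDIR/id_ed25519.pub"
```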
1. Configure the network
1. Set a hostname
2. Disable ipv6 privacy extensions
2. Configure software selection
1. Choose anything you'd like preinstalled
3. Create a non-root user
1. Set a simple password for easy login; we'll change it later
4. Configure your disk partitioning
1. Select manual (blivet) partitioning
2. Create a 1GB EFI system partition and mount it at `/boot/efi`
3. Create a 1GB btrfs partition and mount it at `/boot`
4. Create an encrypted btrfs volume with the remaining data and name it something unique, do not mount it
5. Create a btrfs subvolume called "root" and mount it at `/`
6. Create a btrfs subvolume called "home" and mount it at `/home`
7. Create any other btrfs subvolumes you might need (`/var`, for example)
5. Take note of the ipv4 and ipv6 address. Update any DNS records at this time.
6. Install and reboot
## Configuration
Once your server boots up we'll follow a basic playbook:
1. Change your password
2. Configure automatic decryption for your encrypted drives at boot with TPM2
3. Configure the package manager and apply updates
4. Secure SSH with Fail2Ban
5. Install Snapper for automatic snapshots to prevent accidental file deletion
6. Install BorgBackup for automatic backups
7. Install VM support
8. Build some images
9. Run some VMs
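For step 2 of the playbook, the enrollment itself is one command, `sudo systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7 /dev/your-luks-partition` (substitute your actual LUKS device), after which `/etc/crypttab` needs to ask the TPM at boot. A sketch of the crypttab entry; the volume name and device here are illustrative:

```
# /etc/crypttab -- unlock the root volume from the TPM2 chip at boot
# <name>     <device>          <keyfile>  <options>
luks-root    /dev/nvme0n1p3    none       tpm2-device=auto,discard
```

Binding to PCR 7 ties the unlock to your Secure Boot state, so firmware tampering forces a fallback to the passphrase.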

@@ -0,0 +1,224 @@
# I refuse to pay for LLMs
But I want them anyway. And I don't just want LLMs, I want:
1. Image Generation
2. Image Editing
3. Speech to Text
4. Text to Speech
5. Web Searching
6. RAG Retrieval
7. Guest accounts with time-based access
8. Probably other things
On rootless podman with snapshots and backups and no compromises.
- [I refuse to pay for LLMs](#i-refuse-to-pay-for-llms)
- [Create your environment](#create-your-environment)
- [Local LLM First](#local-llm-first)
- [Ollama](#ollama)
- [LM Studio](#lm-studio)
- [llama.cpp](#llamacpp)
- [Ok, so you have a backend](#ok-so-you-have-a-backend)
- [What about llama-server?](#what-about-llama-server)
- [Anything LLM](#anything-llm)
- [Open Webui](#open-webui)
- [But we don't have image editing working](#but-we-dont-have-image-editing-working)
- [Stable Diffusion CPP](#stable-diffusion-cpp)
- [Making it Run with Quadlets](#making-it-run-with-quadlets)
## Create your environment
I created a user named `ai` to run all my AI services. Do that now:
```bash
# As root: create the service user and let its systemd user session
# persist after logout (required for rootless quadlets)
useradd -m ai
loginctl enable-linger ai
# Switch to the new user and create the quadlet and SSH directories
su -l ai
mkdir -p /home/ai/.config/containers/systemd/
mkdir -p /home/ai/.ssh
```
## Local LLM First
On the Framework Desktop (or any AMD system) your options are ROCM or Vulkan drivers. Both are fine, with Vulkan pulling slightly ahead as of February 2026. Almost every backend you pick will support both, so pick a backend first.
### Ollama
is the natural place to start. Their "marketplace" is the best I've found for browsing models. They include short descriptions about what the models are good for and (almost) all of them work out of the box!
Bonus points: Ollama's API is well supported by interfaces like Anything LLM, Open Webui, a litany of F-Droid apps, and many other services.
Honestly, Ollama is still where I'd recommend anyone start. The installer is easy, performance is decent, the API is great, they (the Ollama team) curate models that work well on their platform, what's not to like?
Performance, mostly. llama.cpp just performs 20-30% better in my testing on models like gpt-oss-120b. Your mileage may vary; this is a great project.
### LM Studio
Everyone says to start with this. Ok, first of all, it's a GUI app. Yeah there's a toggle to run an API server but ain't no way I'm installing wayland on my pure, uncompromising, headless Fedora server.
I do have to admit it's the fastest way to get started with LLMs on desktop. But we're not here for desktops, we're here for servers. It runs llama.cpp in the backend anyway so skip past this and go for the good stuff.
### llama.cpp
We've landed on the best choice. You'll browse Hugging Face for models, be confused, and like it. You'll struggle to read the logs and feel right at home. You'll wonder why there isn't an intuitive CLI like Ollama. And you'll be rewarded with the fastest, most flexible way to run LLMs.
You'll need the Hugging Face CLI (`hf`). Install that.
First, download qwen3-vl-8b. This is a good jack of all trades model that supports vision, which is nice.
```bash
# Create a directory to hold your text models
# I put mine at /home/ai/models/text
mkdir -p /home/ai/models/text/qwen3-vl-8b-instruct
# Download the model from hugging face
hf download --local-dir /home/ai/models/text/qwen3-vl-8b-instruct Qwen/Qwen3-VL-8B-Instruct-GGUF Qwen3VL-8B-Instruct-Q4_K_M.gguf
# Also download the "mmproj" file for this model
# "mmproj" files allow a model to see images
hf download --local-dir /home/ai/models/text/qwen3-vl-8b-instruct Qwen/Qwen3-VL-8B-Instruct-GGUF mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf
```
With our model locked and loaded, we can run the llama.cpp server. We do have to build the llama.cpp server container first though because making this any easier would be a crime.
```bash
# Build the llama.cpp container image
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
export BUILD_TAG=$(date +"%Y-%m-%d-%H-%M-%S")
# Vulkan
podman build -f .devops/vulkan.Dockerfile -t llama-cpp-vulkan:${BUILD_TAG} -t llama-cpp-vulkan:latest .
# Run llama server (Available on port 8000)
# For gpt-oss-120b, add `--n-cpu-moe 32` to keep most expert layers on the CPU and minimize VRAM use
podman run \
--rm \
--name llama-server-demo \
--device=/dev/kfd \
--device=/dev/dri \
--pod systemd-ai-internal \
-v /home/ai/models/text:/models:z \
localhost/llama-cpp-vulkan:latest \
--port 8000 \
-c 16384 \
--perf \
--n-gpu-layers all \
--jinja \
--models-max 1 \
--models-dir /models
```
You should be able to access the llama.cpp server at http://{your-ip}:8000. From there you can select the only model you have downloaded (qwen3-vl-8b) and have a conversation.
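Since llama-server exposes an OpenAI-compatible chat completions endpoint, you can also script against it. A minimal sketch using only the Python standard library; the base URL and model name match the setup above, so adjust them to your environment:

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }


def ask(base_url: str, model: str, prompt: str) -> str:
    """POST a chat request to llama-server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires the server from above to be running):
# print(ask("http://localhost:8000", "qwen3-vl-8b", "Say hello."))
```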
## Ok, so you have a backend
Now we need a frontend. In my experience there are only 2 choices, but this is changing extremely fast.
### What about llama-server?
Good enough for testing. Honestly, if this meets your needs, more power to you.
### Anything LLM
I started here about a year ago. This is a fantastic frontend with RAG, speech to text, text to speech, web search, plugins, and decent user management. It supports Ollama, OpenAI, and a bunch of other backends.
Unfortunately, as of when I used it, there was no integrated image generation or image editing.
### Open Webui
This is, in my opinion, the best frontend experience you can get. The killer feature is side-by-side HTML rendering with your LLM response. If your LLM writes HTML/Javascript/CSS, it'll render in real time next to your chat. That's ridiculously cool.
It also supports image generation as a tool that your LLM can call. Prompts like "Generate an image of a dragon" will trigger a call to the image generation tool. Generated images show up in the chat and can be edited with another message.
```bash
mkdir /home/ai/.env
vim /home/ai/.env/open-webui-env
# Add this to the file, then save and exit
WEBUI_SECRET_KEY="some-random-key"
# Will be available on port 8080
podman run \
-d \
-p 8080:8080 \
-v open-webui:/app/backend/data \
--env-file /home/ai/.env/open-webui-env \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
```
Use the following connections when configuring models/image editing:
| Service | Endpoint |
| -------------------- | ----------------------------------------- |
| llama.cpp | <http://host.containers.internal:8000> |
| stable-diffusion.cpp | <http://host.containers.internal:1234/v1> |
## But we don't have image editing working
In the past I used stable-diffusion-webui-forge. This project relied on a very
specific set of ROCM torch versions installed via pip from the nightly ROCM pip
repository. I had Stable Diffusion XL and Flux1.dev working on an AMD GPU, but I
couldn't get this working at all on the Framework Desktop.
I found out later this might be due to a ROCM driver bug, but we have bigger and better projects to work with.
### Stable Diffusion CPP
This project is the llama.cpp equivalent for image generation. OpenAI-compatible API, tons of model support, excellent documentation; it's the best.
```bash
# Clone and build the stable diffusion cpp container
git clone https://github.com/leejet/stable-diffusion.cpp.git
cd stable-diffusion.cpp
git submodule update --init --recursive
export BUILD_TAG=$(date +"%Y-%m-%d-%H-%M-%S")
podman build -f Dockerfile.vulkan -t stable-diffusion-cpp:${BUILD_TAG} -t stable-diffusion-cpp:latest .
```
Stable diffusion CPP supports a CLI and a web server. Let's download a model and test out the CLI.
```bash
# z-turbo image model
# Fastest image generation in 8 steps. Great at text and prompt following.
# Lacks variety.
mkdir -p /home/ai/models/image/z-turbo
hf download --local-dir /home/ai/models/image/z-turbo QuantStack/FLUX.1-Kontext-dev-GGUF flux1-kontext-dev-Q4_K_M.gguf
hf download --local-dir /home/ai/models/image/z-turbo black-forest-labs/FLUX.1-schnell ae.safetensors
hf download --local-dir /home/ai/models/image/z-turbo unsloth/Qwen3-4B-Instruct-2507-GGUF Qwen3-4B-Instruct-2507-Q4_K_M.gguf
# Create our output directory
mkdir /home/ai/output
# Generate an image of a photorealistic dragon.
podman run --rm \
-v /home/ai/models:/models:z \
-v /home/ai/output:/output:z \
--device /dev/kfd \
--device /dev/dri \
localhost/stable-diffusion-cpp:latest \
--diffusion-model /models/image/z-turbo/z_image_turbo-Q4_K.gguf \
--vae /models/image/z-turbo/ae.safetensors \
--llm /models/image/z-turbo/Qwen3-4B-Instruct-2507-Q4_K_M.gguf \
--cfg-scale 1.0 \
-v \
--seed -1 \
--steps 8 \
--vae-conv-direct \
-H 1024 \
-W 1024 \
-o /output/output.png \
-p "A photorealistic dragon"
```
With any luck you should have a picture of a dragon in your output folder.
Since we know it works, we can tie everything together.
## Making it Run with Quadlets
Now that we know our setup works, we can glue it all together with systemd.
Take a look at [the framework desktop docs](https://gitea.reeseapps.com/services/homelab/src/branch/main/active/device_framework_desktop/framework_desktop.md#install-the-whole-thing-with-quadlets-tm) for the relevant commands.
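For reference, a quadlet `.container` unit for the llama.cpp server might look roughly like this (a sketch assuming the image and paths built above; the linked docs have the authoritative version). It goes in `/home/ai/.config/containers/systemd/`:

```ini
# ~/.config/containers/systemd/llama-server.container (illustrative)
[Unit]
Description=llama.cpp server

[Container]
Image=localhost/llama-cpp-vulkan:latest
Volume=/home/ai/models/text:/models:z
AddDevice=/dev/dri
PublishPort=8000:8000
Exec=--port 8000 -c 16384 --jinja --n-gpu-layers all --models-dir /models

[Service]
Restart=always

[Install]
WantedBy=default.target
```

After `systemctl --user daemon-reload`, the generated `llama-server.service` can be started and will come up on boot thanks to the lingering `ai` user.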

@@ -0,0 +1 @@
# Everyone uses this GPG thing, so should I

@@ -0,0 +1 @@
# I want to use Podman, not Docker

stories/docs/index.md Normal file
@@ -0,0 +1,11 @@
# Come, have a seat
Join me on a journey through homelab adventures. Follow along at home! These
stories will walk you through the trials of my self-hosting wins and losses.
The stories will be written in a way that allows you to skip past the text and
just copy/paste the code blocks (similar to a Medium article). Each story will
lay out its goal and the prerequisites.
Stories are ordered by time written, oldest to newest. They don't need to be
read in order, but they may reference each other. No need to read every one.

stories/mkdocs.yml Normal file
@@ -0,0 +1,3 @@
site_name: Reese's Homelab Stories
theme:
name: readthedocs