add litellm
This commit is contained in:
233
active/container_litellm/litellm.md
Normal file
233
active/container_litellm/litellm.md
Normal file
@@ -0,0 +1,233 @@
|
||||
# Podman litellm
|
||||
|
||||
- [Podman litellm](#podman-litellm)
|
||||
- [Setup litellm Project](#setup-litellm-project)
|
||||
- [Install litellm](#install-litellm)
|
||||
- [Create the ai user](#create-the-ai-user)
|
||||
- [Write the litellm compose spec](#write-the-litellm-compose-spec)
|
||||
- [A Note on Volumes](#a-note-on-volumes)
|
||||
- [Convert litellm compose spec to quadlets](#convert-litellm-compose-spec-to-quadlets)
|
||||
- [Create the litellm.env file](#create-the-litellmenv-file)
|
||||
- [Start and enable your systemd quadlet](#start-and-enable-your-systemd-quadlet)
|
||||
- [Expose litellm](#expose-litellm)
|
||||
- [Using LiteLLM](#using-litellm)
|
||||
- [Adding Models](#adding-models)
|
||||
- [Testing Models](#testing-models)
|
||||
- [Backup litellm](#backup-litellm)
|
||||
- [Upgrade litellm](#upgrade-litellm)
|
||||
- [Upgrade Quadlets](#upgrade-quadlets)
|
||||
- [Uninstall](#uninstall)
|
||||
- [Notes](#notes)
|
||||
- [SELinux](#selinux)
|
||||
|
||||
## Setup litellm Project
|
||||
|
||||
- [ ] Copy and rename this folder to active/container_litellm
|
||||
- [ ] Find and replace litellm with the name of the service.
|
||||
- [ ] Create the rootless user to run the podman containers
|
||||
- [ ] Write the compose.yaml spec for your service
|
||||
- [ ] Convert the compose.yaml spec to a quadlet
|
||||
- [ ] Install the quadlet on the podman server
|
||||
- [ ] Expose the quadlet service
|
||||
- [ ] Install a backup service and timer
|
||||
|
||||
## Install litellm
|
||||
|
||||
### Create the ai user
|
||||
|
||||
```bash
|
||||
# SSH into your podman server as root
|
||||
useradd ai
|
||||
loginctl enable-linger $(id -u ai)
|
||||
systemctl --user --machine=ai@.host enable podman-restart
|
||||
systemctl --user --machine=ai@.host enable --now podman.socket
|
||||
mkdir -p /home/ai/.config/containers/systemd
|
||||
```
|
||||
|
||||
### Write the litellm compose spec
|
||||
|
||||
See the [docker run command here](https://docs.litellm.ai/docs/proxy/docker_quick_start#32-start-proxy)
|
||||
|
||||
Edit the compose.yaml at active/container_litellm/compose/compose.yaml
|
||||
|
||||
#### A Note on Volumes
|
||||
|
||||
Named volumes are stored at `/home/litellm/.local/share/containers/storage/volumes/`.
|
||||
|
||||
### Convert litellm compose spec to quadlets
|
||||
|
||||
Run the following to convert a compose.yaml into the various `.container` files for systemd:
|
||||
|
||||
```bash
|
||||
# Generate the systemd service
|
||||
podman run \
|
||||
--security-opt label=disable \
|
||||
--rm \
|
||||
-v $(pwd)/active/container_litellm/compose:/compose \
|
||||
-v $(pwd)/active/container_litellm/quadlets:/quadlets \
|
||||
quay.io/k9withabone/podlet \
|
||||
-f /quadlets \
|
||||
-i \
|
||||
--overwrite \
|
||||
compose /compose/compose.yaml
|
||||
|
||||
# Copy the files to the server
|
||||
export PODMAN_SERVER=ai-ai
|
||||
scp -r active/container_litellm/quadlets/. $PODMAN_SERVER:/home/ai/.config/containers/systemd/
|
||||
```
|
||||
|
||||
### Create the litellm.env file
|
||||
|
||||
Should look something like:
|
||||
|
||||
```env
|
||||
LITELLM_MASTER_KEY="random-string"
|
||||
LITELLM_SALT_KEY="random-string"
|
||||
|
||||
UI_USERNAME="admin"
|
||||
UI_PASSWORD="random-string"
|
||||
```
|
||||
|
||||
Then copy it to the server
|
||||
|
||||
```bash
|
||||
export PODMAN_SERVER=ai
|
||||
scp -r active/container_litellm/config.yaml $PODMAN_SERVER:/home/ai/litellm_config.yaml
|
||||
ssh $PODMAN_SERVER chown -R ai:ai /home/ai/litellm_config.yaml
|
||||
```
|
||||
|
||||
### Start and enable your systemd quadlet
|
||||
|
||||
SSH into your podman server as root:
|
||||
|
||||
```bash
|
||||
ssh ai
|
||||
machinectl shell ai@
|
||||
systemctl --user daemon-reload
|
||||
systemctl --user restart litellm
|
||||
journalctl --user -u litellm -f
|
||||
# Enable auto-update service which will pull new container images automatically every day
|
||||
systemctl --user enable --now podman-auto-update.timer
|
||||
```
|
||||
|
||||
### Expose litellm
|
||||
|
||||
1. If you need a domain, follow the [DDNS instructions](/active/container_ddns/ddns.md#install-a-new-ddns-service)
|
||||
2. For a web service, follow the [Caddy instructions](/active/container_caddy/caddy.md#adding-a-new-caddy-record)
|
||||
3. Finally, follow your OS's guide for opening ports via its firewall service.
|
||||
|
||||
## Using LiteLLM
|
||||
|
||||
### Adding Models
|
||||
|
||||
```json
|
||||
// qwen3.5-35b-a3b-thinking
|
||||
{
|
||||
"temperature": 1,
|
||||
"top_p": 0.95,
|
||||
"presence_penalty": 1.5,
|
||||
"extra_body": {
|
||||
"top_k": 20,
|
||||
"min_p": 0,
|
||||
"repetition_penalty": 1,
|
||||
"chat_template_kwargs": {
|
||||
"enable_thinking": true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// qwen3.5-35b-a3b-coding
|
||||
{
|
||||
"temperature": 0.6,
|
||||
"top_p": 0.95,
|
||||
"presence_penalty": 0,
|
||||
"extra_body": {
|
||||
"top_k": 20,
|
||||
"min_p": 0,
|
||||
"repetition_penalty": 1,
|
||||
"chat_template_kwargs": {
|
||||
"enable_thinking": true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// qwen3.5-35b-a3b-instruct
|
||||
{
|
||||
"temperature": 0.7,
|
||||
"top_p": 0.8,
|
||||
"presence_penalty": 1.5,
|
||||
"extra_body": {
|
||||
"top_k": 20,
|
||||
"min_p": 0,
|
||||
"repetition_penalty": 1,
|
||||
"chat_template_kwargs": {
|
||||
"enable_thinking": false
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Testing Models
|
||||
|
||||
```bash
|
||||
# List models
|
||||
curl -L -X GET 'https://aipi.reeseapps.com/v1/models' \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer sk-1234'
|
||||
|
||||
curl -L -X POST 'https://aipi.reeseapps.com/v1/chat/completions' \
|
||||
-H 'Content-Type: application/json' \
|
||||
-H 'Authorization: Bearer sk-1234' \
|
||||
-d '{
|
||||
"model": "gpt-4o-mini", # 👈 REPLACE with 'public model name' for any db-model
|
||||
"messages": [
|
||||
{
|
||||
"content": "Hey, how's it going",
|
||||
"role": "user"
|
||||
}
|
||||
],
|
||||
}'
|
||||
```
|
||||
|
||||
## Backup litellm
|
||||
|
||||
Follow the [Borg Backup instructions](/active/systemd_borg/borg.md#set-up-a-client-for-backup)
|
||||
|
||||
## Upgrade litellm
|
||||
|
||||
### Upgrade Quadlets
|
||||
|
||||
Upgrades should be a repeat of [writing the compose spec](#convert-litellm-compose-spec-to-quadlets) and [installing the quadlets](#start-and-enable-your-systemd-quadlet)
|
||||
|
||||
```bash
|
||||
export PODMAN_SERVER=
|
||||
scp -r quadlets/. $PODMAN_SERVER$:/home/litellm/.config/containers/systemd/
|
||||
ssh litellm systemctl --user daemon-reload
|
||||
ssh litellm systemctl --user restart litellm
|
||||
```
|
||||
|
||||
## Uninstall
|
||||
|
||||
```bash
|
||||
# Stop the user's services
|
||||
systemctl --user disable podman-restart
|
||||
podman container stop --all
|
||||
systemctl --user disable --now podman.socket
|
||||
systemctl --user disable --now podman-auto-update.timer
|
||||
|
||||
# Delete the user (this won't delete their home directory)
|
||||
# userdel might spit out an error like:
|
||||
# userdel: user litellm is currently used by process 591255
|
||||
# kill those processes and try again
|
||||
userdel litellm
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
### SELinux
|
||||
|
||||
<https://blog.christophersmart.com/2021/01/31/podman-volumes-and-selinux/>
|
||||
|
||||
:z allows a container to share a mounted volume with all other containers.
|
||||
|
||||
:Z allows a container to reserve a mounted volume and prevents any other container from accessing.
|
||||
Reference in New Issue
Block a user