# Ollama

- [Ollama](#ollama)
  - [Run natively with GPU support](#run-natively-with-gpu-support)
  - [Unsticking models stuck in "Stopping"](#unsticking-models-stuck-in-stopping)
  - [Run Anything LLM Interface](#run-anything-llm-interface)
  - [Anything LLM Quadlet with Podlet](#anything-llm-quadlet-with-podlet)
  - [Now with Nginx and Certbot](#now-with-nginx-and-certbot)
  - [Custom Models](#custom-models)
    - [From Existing Model](#from-existing-model)
    - [From Scratch](#from-scratch)
  - [Converting to gguf](#converting-to-gguf)

## Run natively with GPU support

```bash
# Install script
curl -fsSL https://ollama.com/install.sh | sh

# Check the service is running
systemctl status ollama
```

Remember to add `Environment="OLLAMA_HOST=0.0.0.0"` to `/etc/systemd/system/ollama.service` to make it accessible on the network. For Radeon 6000 cards you'll also need to add `Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"`.

```bash
# Pull models
# Try to use higher-parameter models. Grab the q5_K_M variant at minimum.
# For a 24GB VRAM card I'd recommend:

# Anything-LLM Coding
ollama pull qwen2.5-coder:14b-instruct-q5_K_M
# Anything-LLM Math
ollama pull qwen2-math:7b-instruct-fp16
# Anything-LLM Chat
ollama pull llama3.2-vision:11b-instruct-q8_0

# VSCode Continue Autocomplete
ollama pull starcoder2:15b-q5_K_M
# VSCode Continue Chat
ollama pull llama3.1:8b-instruct-fp16
# VSCode Continue Embedder
ollama pull nomic-embed-text:137m-v1.5-fp16
```

Note your ollama instance will be available to podman containers via `http://host.containers.internal:11434`.

## Unsticking models stuck in "Stopping"

```bash
ollama ps | grep -i stopping
pgrep ollama | xargs -I '%' sh -c 'kill %'
```

## Run Anything LLM Interface

```bash
podman run \
  -d \
  -p 3001:3001 \
  --name anything-llm \
  --cap-add SYS_ADMIN \
  -v anything-llm:/app/server \
  -e STORAGE_DIR="/app/server/storage" \
  docker.io/mintplexlabs/anythingllm
```

This should now be accessible on port 3001.

Note, you'll need to allow traffic between podman and the host: use `podman network ls` to see which networks podman is running on and `podman network inspect` to get the IP address range. Then allow traffic from that range to port 11434 (ollama):

```bash
ufw allow from 10.89.0.1/24 to any port 11434
```

## Anything LLM Quadlet with Podlet

```bash
podman run --rm ghcr.io/containers/podlet --install --description "Anything LLM" \
  podman run \
  -d \
  -p 3001:3001 \
  --name anything-llm \
  --cap-add SYS_ADMIN \
  --restart always \
  -v anything-llm:/app/server \
  -e STORAGE_DIR="/app/server/storage" \
  docker.io/mintplexlabs/anythingllm
```

This generates a quadlet unit so the container autostarts. Put the generated files in `/usr/share/containers/systemd/`, then reload systemd and start the service as shown below.
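A minimal sketch of the install-and-start step, assuming podlet's output was saved as `anything-llm.container` (quadlet derives the generated service name from the file name):

```bash
# Copy the quadlet into place (system-wide location)
cp anything-llm.container /usr/share/containers/systemd/

# Quadlet generates the .service unit at daemon-reload time;
# there is no separate `systemctl enable` step for generated units
systemctl daemon-reload
systemctl start anything-llm.service
systemctl status anything-llm.service
```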
## Now with Nginx and Certbot

See [generating AWS credentials](cloud/graduated/aws_iam/README.md)

```bash
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
./aws/install

# Configure default credentials
aws configure
```

Open http/s in firewalld:

```bash
# Remember to firewall-cmd --set-default-zone=public
firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-service=https
firewall-cmd --reload

# or
ufw allow 80/tcp
ufw allow 443/tcp
```

Here are the detailed instructions for installing Nginx on Fedora Linux and setting it up with Certbot (using the Route53 DNS challenge) in front of the "Anything LLM" service running on port 3001 with WebSockets. The domain will be chatreesept.reeseapps.com.

1. Install Nginx:

```
dnf install -y nginx
```

2. Start and enable the Nginx service:

```
systemctl enable --now nginx
```

3. Install Certbot and the Route53 DNS plugin:

```
# Fedora
dnf install -y certbot python3-certbot-dns-route53

# Arch
pacman -S certbot certbot-dns-route53
```

4. Request a certificate for your domain using the Route53 DNS challenge:

```
certbot certonly --dns-route53 -d chatreesept.reeseapps.com
```

Follow the prompts to provide your Route53 credentials and email address. Repeat for any other domain you proxy; the Ollama server block below also expects a certificate for ollama.reeselink.com.

5. Configure Nginx for your domains.

First, raise the global timeouts in the main config:

```
vim /etc/nginx/nginx.conf
```

```
keepalive_timeout 1h;
send_timeout 1h;
client_body_timeout 1h;
client_header_timeout 1h;
proxy_connect_timeout 1h;
proxy_read_timeout 1h;
proxy_send_timeout 1h;
```

Create a server block for the Ollama API:

```
vim /etc/nginx/conf.d/ollama.reeselink.com.conf
```

```
server {
    listen 80;
    server_name ollama.reeselink.com;

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl;
    server_name ollama.reeselink.com;

    ssl_certificate /etc/letsencrypt/live/ollama.reeselink.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ollama.reeselink.com/privkey.pem;

    location / {
        proxy_pass http://localhost:11434;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_buffering off;
    }
}
```

Then create a new Nginx configuration file for Anything LLM:

```
vim /etc/nginx/conf.d/chatreesept.reeseapps.com.conf
```

Add the following configuration to the file:

```
server {
    listen 80;
    server_name chatreesept.reeseapps.com;

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl;
    server_name chatreesept.reeseapps.com;

    ssl_certificate /etc/letsencrypt/live/chatreesept.reeseapps.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/chatreesept.reeseapps.com/privkey.pem;

    location / {
        client_max_body_size 50m;
        proxy_pass http://localhost:3001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_buffering off;
    }
}
```

6. Test your Nginx configuration for syntax errors:

```
nginx -t
```

If there are no errors, reload Nginx to apply the changes:

```
systemctl reload nginx
```

7. Set up automatic certificate renewal. Install cronie (on Arch) and open the root crontab:

```
pacman -S cronie
sudo crontab -e
```

Add the following line to the end of the file:

```
0 0 * * * certbot renew --quiet
```

Now, your "Anything LLM" service running on port 3001 with WebSockets is accessible through the domain chatreesept.reeseapps.com with a valid SSL certificate from Let's Encrypt. The renewal check runs daily; certbot only replaces the certificate when it is close to expiry.
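Once nginx has been reloaded, a quick smoke test confirms both server blocks are serving traffic. A minimal sketch using curl; the hostnames are the ones configured above, and `/api/tags` is Ollama's model-listing endpoint:

```bash
# Ollama behind nginx: should return a JSON list of pulled models
curl https://ollama.reeselink.com/api/tags

# Anything LLM behind nginx: check status code and headers
curl -I https://chatreesept.reeseapps.com

# Dry-run renewal to confirm the Route53 plugin and cron entry will work
certbot renew --dry-run
```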
## Custom Models

### From Existing Model

```bash
# Dump the Modelfile of a model you already have
ollama show --modelfile opencoder > Modelfile

# Edit Modelfile and add the parameter you want to change, e.g.:
#   PARAMETER num_ctx 8192

# Build a new model from the edited Modelfile
ollama create opencoder-fix -f Modelfile
```

### From Scratch

Install git lfs and clone the model you're interested in:

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
```

Create a modelfile:

```
# Modelfile
FROM "./path/to/gguf"

TEMPLATE """{{ if .Prompt }}<|im_start|>
{{ .Prompt }}<|im_end|>
{{ end }}
"""

SYSTEM You are OpenCoder, created by OpenCoder Team.

PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
PARAMETER stop <|fim_prefix|>
PARAMETER stop <|fim_middle|>
PARAMETER stop <|fim_suffix|>
PARAMETER stop <|fim_end|>
PARAMETER stop """
"""
```

Build the model:

```bash
ollama create "Starling-LM-7B-beta-Q6_K" -f Modelfile
```

Run the model:

```bash
ollama run Starling-LM-7B-beta-Q6_K:latest
```

## Converting to gguf

1. Clone the llama.cpp repository and install its dependencies:

```bash
git clone https://github.com/ggerganov/llama.cpp.git
cd ~/llama.cpp
python3 -m venv venv && source venv/bin/activate
pip3 install -r requirements.txt

mkdir ~/llama.cpp/models/mistral
huggingface-cli login # necessary to download gated models
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.3 --local-dir ~/llama.cpp/models/mistral/

# Point the conversion script at the downloaded (or locally cached) model directory
python3 convert_hf_to_gguf.py ~/.cache/huggingface/hub/models--infly--OpenCoder-8B-Instruct/snapshots/01badbbf10c2dfd7e2a0b5f570065ef44548576c
```
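To load the converted model into Ollama, point a Modelfile at the resulting `.gguf` and create a model from it, as in the From Scratch section above. A minimal sketch; the output file name is illustrative, and `convert_hf_to_gguf.py --outfile` can be used to control where the gguf is written:

```bash
# Hypothetical path to the gguf produced by the conversion step above
cat > Modelfile <<'EOF'
FROM ./OpenCoder-8B-Instruct-F16.gguf
EOF

ollama create opencoder-8b-converted -f Modelfile
ollama run opencoder-8b-converted
```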