# Vibe Discord Bot with RAG Chat History
A Discord bot that stores long-term chat history in a SQLite database, with RAG (Retrieval-Augmented Generation) context retrieval powered by a custom embedding model.
- [Vibe Discord Bot with RAG Chat History](#vibe-discord-bot-with-rag-chat-history)
- [Quick Start - Available Commands](#quick-start---available-commands)
- [Pre-built Bots](#pre-built-bots)
- [Custom Bot Management](#custom-bot-management)
- [Using Custom Bots](#using-custom-bots)
- [Features](#features)
- [Setup](#setup)
- [Prerequisites](#prerequisites)
- [Environment Variables](#environment-variables)
- [Installation](#installation)
- [How It Works](#how-it-works)
- [Database Structure](#database-structure)
- [RAG Process](#rag-process)
- [Configuration Options](#configuration-options)
- [Usage](#usage)
- [File Structure](#file-structure)
- [Build](#build)
- [Using uv](#using-uv)
- [Container](#container)
- [Docs](#docs)
- [Open AI](#open-ai)
- [Models](#models)
- [Qwen3.5](#qwen35)
## Quick Start - Available Commands
### Pre-built Bots
| Command | Description | Example Usage |
| ------------ | ----------------------------- | ------------------------------------------ |
| `!doodlebob` | Generate images from text | `!doodlebob a cat sitting on a moon` |
| `!retcon` | Edit images with text prompts | `!retcon <image attachment> Make it sunny` |
### Custom Bot Management
| Command | Description | Example Usage |
| ------------------------------ | --------------------------------------------- | ------------------------------------------------ |
| `!custom <name> <personality>` | Create a custom bot with specific personality | `!custom alfred you are a proper british butler` |
| `!list-custom-bots` | List all available custom bots | `!list-custom-bots` |
| `!delete-custom-bot <name>` | Delete your custom bot | `!delete-custom-bot alfred` |
### Using Custom Bots
Once you create a custom bot, you can interact with it directly by prefixing your message with the bot name:
```bash
!<bot_name> <your message>
```
**Example:**
1. Create a bot: `!custom alfred you are a proper british butler`
2. Use the bot: `!alfred Could you fetch me some tea?`
3. The bot will respond in character as a British butler
## Features
- **Long-term chat history storage**: Persistent storage of all bot interactions
- **RAG-based context retrieval**: Smart retrieval of relevant conversation history using vector embeddings
- **Custom embedding model**: Uses qwen3-embed-4b for semantic search capabilities
- **Efficient message management**: Automatic cleanup of old messages based on configurable limits
## Setup
### Prerequisites
- Python 3.10 or higher
- [uv](https://docs.astral.sh/uv/) package manager
- Embedding API key
- Discord bot token
### Environment Variables
Create a `.env` file or export the following variables:
```bash
# Discord Bot Token
export DISCORD_TOKEN=your_discord_bot_token
# Embedding API Configuration
export OPENAI_API_KEY=your_embedding_api_key
export OPENAI_API_ENDPOINT=https://llama-embed.reeselink.com/embedding
# Image Generation (optional)
export IMAGE_GEN_ENDPOINT=http://toybox.reeselink.com:1234/v1
export IMAGE_EDIT_ENDPOINT=http://toybox.reeselink.com:1235/v1
# Database Configuration (optional)
export CHAT_DB_PATH=chat_history.db
export EMBEDDING_MODEL=qwen3-embed-4b
export EMBEDDING_DIMENSION=2048
export MAX_HISTORY_MESSAGES=1000
export SIMILARITY_THRESHOLD=0.7
export TOP_K_RESULTS=5
```
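The optional settings above can be read with their documented defaults. This is a hypothetical helper for illustration only (the names mirror the env vars, but the function itself is not part of the bot's actual code):

```python
import os

# Illustrative config loader; defaults match the values documented above.
# Passing an explicit dict makes it easy to test without touching os.environ.
def load_config(env=None):
    env = os.environ if env is None else env
    return {
        "db_path": env.get("CHAT_DB_PATH", "chat_history.db"),
        "embedding_model": env.get("EMBEDDING_MODEL", "qwen3-embed-4b"),
        "embedding_dimension": int(env.get("EMBEDDING_DIMENSION", "2048")),
        "max_history_messages": int(env.get("MAX_HISTORY_MESSAGES", "1000")),
        "similarity_threshold": float(env.get("SIMILARITY_THRESHOLD", "0.7")),
        "top_k_results": int(env.get("TOP_K_RESULTS", "5")),
    }
```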
### Installation
1. Sync dependencies with uv:
```bash
uv sync
```
2. Run the bot:
```bash
uv run main.py
```
## How It Works
### Database Structure
The system uses two SQLite tables:
1. **chat_messages**: Stores message metadata
- message_id, user_id, username, content, timestamp, channel_id, guild_id
2. **message_embeddings**: Stores vector embeddings for RAG
- message_id, embedding (as binary blob)
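The two tables can be sketched as follows. Column names are taken from the list above; the exact SQL types and the blob encoding in the real `database.py` may differ (here the vector is packed as raw little-endian float32 bytes, a common choice):

```python
import sqlite3
import struct

# Sketch of the schema described above, using an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chat_messages (
    message_id  TEXT PRIMARY KEY,
    user_id     TEXT,
    username    TEXT,
    content     TEXT,
    timestamp   TEXT,
    channel_id  TEXT,
    guild_id    TEXT
);
CREATE TABLE message_embeddings (
    message_id TEXT PRIMARY KEY REFERENCES chat_messages(message_id),
    embedding  BLOB  -- vector packed as raw float32 bytes
);
""")

# Store a message plus a toy 4-dimensional embedding as a binary blob.
vec = [0.1, 0.2, 0.3, 0.4]
conn.execute("INSERT INTO chat_messages VALUES (?,?,?,?,?,?,?)",
             ("1", "42", "alice", "hello", "2026-03-05T13:33:23", "c1", "g1"))
conn.execute("INSERT INTO message_embeddings VALUES (?, ?)",
             ("1", struct.pack(f"<{len(vec)}f", *vec)))

# Round-trip: unpack the blob back into a list of floats.
blob = conn.execute("SELECT embedding FROM message_embeddings").fetchone()[0]
restored = list(struct.unpack(f"<{len(blob) // 4}f", blob))
```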
### RAG Process
1. When a message is received, it's stored in the database
2. An embedding is generated via the configured OpenAI-compatible embedding API
3. The embedding is stored alongside the message
4. When a new message is sent to the bot:
- The system searches for similar messages using vector similarity
- Relevant context is retrieved and added to the prompt
- The LLM generates a response with awareness of past conversations
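The retrieval step (step 4) can be sketched with plain cosine similarity. This is a minimal illustration of the idea, not the bot's actual implementation; the default `threshold` and `top_k` values match `SIMILARITY_THRESHOLD` and `TOP_K_RESULTS` above:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_context(query_vec, stored, threshold=0.7, top_k=5):
    # stored: list of (message_text, embedding) pairs.
    # Keep messages above the similarity threshold, best matches first.
    scored = [(cosine(query_vec, emb), text) for text, emb in stored]
    hits = sorted((s for s in scored if s[0] >= threshold), reverse=True)
    return [text for _, text in hits[:top_k]]
```

The retrieved texts would then be prepended to the LLM prompt as context.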
### Configuration Options
- **MAX_HISTORY_MESSAGES**: Maximum number of messages to keep (default: 1000)
- **SIMILARITY_THRESHOLD**: Minimum similarity score for context retrieval (default: 0.7)
- **TOP_K_RESULTS**: Number of similar messages to retrieve (default: 5)
- **EMBEDDING_MODEL**: Embedding model to use (default: qwen3-embed-4b)
## Usage
The bot maintains conversation context automatically. When you ask a question, it will:
1. Search for similar past conversations
2. Include relevant context in the prompt
3. Generate responses that are aware of the conversation history
## File Structure
```text
vibe_discord_bots/
├── main.py # Main bot application
├── database.py # SQLite database with RAG support
├── pyproject.toml # Project dependencies (uv)
├── .env # Environment variables
├── .venv/ # Virtual environment (created by uv)
└── README.md # This file
```
## Build
### Using uv
```bash
# Set environment variables
export DISCORD_TOKEN=$(cat .token)
export OPENAI_API_KEY=your_api_key
export OPENAI_API_ENDPOINT="https://llama-cpp.reeselink.com"
export IMAGE_GEN_ENDPOINT="http://toybox.reeselink.com:1234/v1"
export IMAGE_EDIT_ENDPOINT="http://toybox.reeselink.com:1235/v1"
# Run with uv
uv run main.py
```
### Container
```bash
# Build
podman build -t vibe-bot:latest .
# Run
export DISCORD_TOKEN=$(cat .token)
podman run -e DISCORD_TOKEN localhost/vibe-bot:latest
```
## Docs
### Open AI
- Chat: <https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create>
- Images: <https://developers.openai.com/api/reference/python/resources/images/methods/edit>
## Models
### Qwen3.5
> We recommend using the following sampling parameters for generation:

- Non-thinking mode, text tasks: `temperature=1.0`, `top_p=1.00`, `top_k=20`, `min_p=0.0`, `presence_penalty=2.0`, `repetition_penalty=1.0`
- Non-thinking mode, VL tasks: `temperature=0.7`, `top_p=0.80`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- Thinking mode, text tasks: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
- Thinking mode, VL or precise coding (e.g. WebDev) tasks: `temperature=0.6`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=0.0`, `repetition_penalty=1.0`

> Note that support for these sampling parameters varies across inference frameworks.
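As an illustration, the "thinking mode, text tasks" values could be assembled into a chat-completion request body like the one below. The model name is a placeholder, and only `temperature`, `top_p`, and `presence_penalty` are standard OpenAI fields; `top_k`, `min_p`, and `repetition_penalty` are backend extensions whose exact names vary by server (llama.cpp, for instance, calls its repetition setting `repeat_penalty`):

```python
# Hypothetical request payload for an OpenAI-compatible endpoint.
payload = {
    "model": "qwen3.5",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
    # Standard OpenAI sampling fields:
    "temperature": 1.0,
    "top_p": 0.95,
    "presence_penalty": 1.5,
    # Backend-specific extensions (names vary; sent via extra_body
    # when using the official openai Python SDK):
    "top_k": 20,
    "min_p": 0.0,
    "repetition_penalty": 1.0,
}
```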