# Vibe Discord Bot with RAG Chat History

A Discord bot that stores long-term chat history in a SQLite database, with RAG (Retrieval-Augmented Generation) capabilities powered by custom embedding models.

- [Vibe Discord Bot with RAG Chat History](#vibe-discord-bot-with-rag-chat-history)
  - [Quick Start - Available Commands](#quick-start---available-commands)
    - [Pre-built Bots](#pre-built-bots)
    - [Custom Bot Management](#custom-bot-management)
    - [Using Custom Bots](#using-custom-bots)
  - [Features](#features)
  - [Setup](#setup)
    - [Prerequisites](#prerequisites)
    - [Environment Variables](#environment-variables)
    - [Installation](#installation)
  - [How It Works](#how-it-works)
    - [Database Structure](#database-structure)
    - [RAG Process](#rag-process)
    - [Configuration Options](#configuration-options)
  - [Usage](#usage)
  - [File Structure](#file-structure)
  - [Build](#build)
    - [Using uv](#using-uv)
    - [Container](#container)
  - [Docs](#docs)
    - [Open AI](#open-ai)
  - [Models](#models)
    - [Qwen3.5](#qwen35)

## Quick Start - Available Commands

### Pre-built Bots

| Command      | Description                   | Example Usage                              |
| ------------ | ----------------------------- | ------------------------------------------ |
| `!doodlebob` | Generate images from text     | `!doodlebob a cat sitting on a moon`       |
| `!retcon`    | Edit images with text prompts | `!retcon <image attachment> Make it sunny` |

### Custom Bot Management

| Command                        | Description                                   | Example Usage                                    |
| ------------------------------ | --------------------------------------------- | ------------------------------------------------ |
| `!custom <name> <personality>` | Create a custom bot with specific personality | `!custom alfred you are a proper british butler` |
| `!list-custom-bots`            | List all available custom bots                | `!list-custom-bots`                              |
| `!delete-custom-bot <name>`    | Delete your custom bot                        | `!delete-custom-bot alfred`                      |

### Using Custom Bots

Once you create a custom bot, you can interact with it directly by prefixing your message with the bot's name:

```bash
!<bot_name> <your message>
```

**Example:**

1. Create a bot: `!custom alfred you are a proper british butler`
2. Use the bot: `!alfred Could you fetch me some tea?`
3. The bot responds in character as a proper British butler

## Features

- **Long-term chat history storage**: Persistent storage of all bot interactions
- **RAG-based context retrieval**: Smart retrieval of relevant conversation history using vector embeddings
- **Custom embedding model**: Uses qwen3-embed-4b for semantic search capabilities
- **Efficient message management**: Automatic cleanup of old messages based on configurable limits
## Setup

### Prerequisites

- Python 3.10 or higher
- [uv](https://docs.astral.sh/uv/) package manager
- Embedding API key
- Discord bot token

### Environment Variables

Create a `.env` file or export the following variables:

```bash
# Discord Bot Token
export DISCORD_TOKEN=your_discord_bot_token

# Embedding API Configuration
export OPENAI_API_KEY=your_embedding_api_key
export OPENAI_API_ENDPOINT=https://llama-embed.reeselink.com/embedding

# Image Generation (optional)
export IMAGE_GEN_ENDPOINT=http://toybox.reeselink.com:1234/v1
export IMAGE_EDIT_ENDPOINT=http://toybox.reeselink.com:1235/v1

# Database Configuration (optional)
export CHAT_DB_PATH=chat_history.db
export EMBEDDING_MODEL=qwen3-embed-4b
export EMBEDDING_DIMENSION=2048
export MAX_HISTORY_MESSAGES=1000
export SIMILARITY_THRESHOLD=0.7
export TOP_K_RESULTS=5
```

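
At startup the bot presumably reads these variables with fallbacks to the documented defaults; a minimal sketch of such a loader (the function name is illustrative — the real code may differ):

```python
import os

def load_config() -> dict:
    """Hypothetical config loader mirroring the environment variables above."""
    return {
        "db_path": os.getenv("CHAT_DB_PATH", "chat_history.db"),
        "embedding_model": os.getenv("EMBEDDING_MODEL", "qwen3-embed-4b"),
        "embedding_dimension": int(os.getenv("EMBEDDING_DIMENSION", "2048")),
        "max_history_messages": int(os.getenv("MAX_HISTORY_MESSAGES", "1000")),
        "similarity_threshold": float(os.getenv("SIMILARITY_THRESHOLD", "0.7")),
        "top_k_results": int(os.getenv("TOP_K_RESULTS", "5")),
    }
```

Numeric values arrive as strings from the environment, so the loader casts them explicitly.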
### Installation

1. Sync dependencies with uv:

   ```bash
   uv sync
   ```

2. Run the bot:

   ```bash
   uv run main.py
   ```

## How It Works

### Database Structure

The system uses two SQLite tables:

1. **chat_messages**: Stores message metadata
   - message_id, user_id, username, content, timestamp, channel_id, guild_id

2. **message_embeddings**: Stores vector embeddings for RAG
   - message_id, embedding (as binary blob)

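
A schema sketch of the two tables using `sqlite3` (column types are assumptions — the README only lists column names, so the real `database.py` may differ):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_messages (
    message_id TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    username   TEXT NOT NULL,
    content    TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    channel_id TEXT NOT NULL,
    guild_id   TEXT
);
CREATE TABLE IF NOT EXISTS message_embeddings (
    message_id TEXT PRIMARY KEY REFERENCES chat_messages(message_id),
    embedding  BLOB NOT NULL  -- float vector packed as raw bytes
);
"""

def open_db(path: str = "chat_history.db") -> sqlite3.Connection:
    """Open the database and ensure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Embeddings can be packed into the `BLOB` column with `struct.pack` (or `numpy.ndarray.tobytes`) and unpacked symmetrically at query time.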
### RAG Process

1. When a message is received, it is stored in the database
2. An embedding is generated via the configured OpenAI-compatible embedding API
3. The embedding is stored alongside the message
4. When a new message is sent to the bot:
   - The system searches for similar messages using vector similarity
   - Relevant context is retrieved and added to the prompt
   - The LLM generates a response with awareness of past conversations

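
The retrieval step (similarity search, threshold, top-k) can be sketched in pure Python; this assumes embeddings have already been unpacked from their blobs into float lists:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_context(query_emb, stored, threshold=0.7, top_k=5):
    """stored: list of (message_id, embedding) pairs.
    Returns the ids of the top-k most similar messages above the threshold."""
    scored = [(cosine(query_emb, emb), mid) for mid, emb in stored]
    scored = [(s, mid) for s, mid in scored if s >= threshold]
    scored.sort(reverse=True)  # highest similarity first
    return [mid for _, mid in scored[:top_k]]
```

The defaults here mirror `SIMILARITY_THRESHOLD=0.7` and `TOP_K_RESULTS=5` from the environment variables above.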
### Configuration Options

- **MAX_HISTORY_MESSAGES**: Maximum number of messages to keep (default: 1000)
- **SIMILARITY_THRESHOLD**: Minimum similarity score for context retrieval (default: 0.7)
- **TOP_K_RESULTS**: Number of similar messages to retrieve (default: 5)
- **EMBEDDING_MODEL**: Embedding model to use (the examples above use `qwen3-embed-4b`)

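
The `MAX_HISTORY_MESSAGES` cleanup could be a single `DELETE` that keeps only the newest rows; a sketch against the tables described above (the exact SQL in `database.py` may differ):

```python
import sqlite3

def prune_old_messages(conn: sqlite3.Connection, max_messages: int = 1000) -> int:
    """Delete the oldest messages (and their embeddings) beyond max_messages.
    Returns the number of messages deleted."""
    cur = conn.execute(
        """
        DELETE FROM chat_messages WHERE message_id NOT IN (
            SELECT message_id FROM chat_messages
            ORDER BY timestamp DESC LIMIT ?
        )
        """,
        (max_messages,),
    )
    # Drop embeddings whose message no longer exists
    conn.execute(
        "DELETE FROM message_embeddings WHERE message_id NOT IN "
        "(SELECT message_id FROM chat_messages)"
    )
    return cur.rowcount
```

Running this after each insert keeps the table bounded without a separate background job.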
## Usage

The bot maintains conversation context automatically. When you ask a question, it will:

1. Search for similar past conversations
2. Include relevant context in the prompt
3. Generate responses that are aware of the conversation history

## File Structure

```text
vibe_discord_bots/
├── main.py           # Main bot application
├── database.py       # SQLite database with RAG support
├── pyproject.toml    # Project dependencies (uv)
├── .env              # Environment variables
├── .venv/            # Virtual environment (created by uv)
└── README.md         # This file
```

## Build

### Using uv

```bash
# Set environment variables
export DISCORD_TOKEN=$(cat .token)
export OPENAI_API_KEY=your_api_key
export OPENAI_API_ENDPOINT="https://llama-cpp.reeselink.com"
export IMAGE_GEN_ENDPOINT="http://toybox.reeselink.com:1234/v1"
export IMAGE_EDIT_ENDPOINT="http://toybox.reeselink.com:1235/v1"

# Run with uv
uv run main.py
```

### Container

```bash
# Build
podman build -t vibe-bot:latest .

# Run
podman run --env-file .env localhost/vibe-bot:latest
```

## Docs

### Open AI

Chat:
<https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create>

Images:
<https://developers.openai.com/api/reference/python/resources/images/methods/edit>

## Models

### Qwen3.5

> We recommend using the following set of sampling parameters for generation:

- Non-thinking mode for text tasks: temperature=1.0, top_p=1.00, top_k=20, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
- Non-thinking mode for VL tasks: temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- Thinking mode for text tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- Thinking mode for VL or precise coding (e.g. WebDev) tasks: temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0

> Please note that support for sampling parameters varies across inference frameworks.
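
These recommendations map naturally onto a small lookup table when calling the model through an OpenAI-compatible client. Note that `top_k`, `min_p`, and `repetition_penalty` are not standard OpenAI parameters and typically have to be passed through a server-specific escape hatch (e.g. the `extra_body` argument of the OpenAI Python client); a sketch:

```python
# Recommended Qwen3.5 sampling presets, copied from the list above.
SAMPLING_PRESETS = {
    "text": dict(temperature=1.0, top_p=1.00, top_k=20, min_p=0.0,
                 presence_penalty=2.0, repetition_penalty=1.0),
    "vl": dict(temperature=0.7, top_p=0.80, top_k=20, min_p=0.0,
               presence_penalty=1.5, repetition_penalty=1.0),
    "text_thinking": dict(temperature=1.0, top_p=0.95, top_k=20, min_p=0.0,
                          presence_penalty=1.5, repetition_penalty=1.0),
    "vl_or_coding_thinking": dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0,
                                  presence_penalty=0.0, repetition_penalty=1.0),
}

def split_openai_params(preset: dict) -> tuple[dict, dict]:
    """Split a preset into standard OpenAI kwargs and extra (server-specific) ones."""
    standard_keys = {"temperature", "top_p", "presence_penalty"}
    standard = {k: v for k, v in preset.items() if k in standard_keys}
    extra = {k: v for k, v in preset.items() if k not in standard_keys}
    return standard, extra
```

The standard dict can be passed as keyword arguments to a chat-completions call, with the remainder forwarded however the inference server expects.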