Vibe Discord Bot with RAG Chat History

A Discord bot that stores long-term chat history in a SQLite database, with RAG (Retrieval-Augmented Generation) capabilities powered by custom embedding models.

Quick Start - Available Commands

Pre-built Bots

  • !doodlebob: Generate images from text. Example: !doodlebob a cat sitting on a moon
  • !retcon: Edit images with text prompts. Example: !retcon <image attachment> Make it sunny

Custom Bot Management

  • !custom <name> <personality>: Create a custom bot with a specific personality. Example: !custom alfred you are a proper british butler
  • !list-custom-bots: List all available custom bots. Example: !list-custom-bots
  • !delete-custom-bot <name>: Delete your custom bot. Example: !delete-custom-bot alfred

Using Custom Bots

Once you create a custom bot, you can interact with it directly by prefixing your message with the bot name:

!<bot_name> <your message>

Example:

  1. Create a bot: !custom alfred you are a proper british butler
  2. Use the bot: !alfred Could you fetch me some tea?
  3. The bot responds in character as a proper British butler.

Features

  • Long-term chat history storage: Persistent storage of all bot interactions

  • RAG-based context retrieval: Smart retrieval of relevant conversation history using vector embeddings

  • Custom embedding model: Uses qwen3-embed-4b for semantic search capabilities

  • Efficient message management: Automatic cleanup of old messages based on configurable limits

Setup

Prerequisites

  • Python 3.10 or higher
  • uv package manager
  • Embedding API key
  • Discord bot token

Environment Variables

Create a .env file or export the following variables:

# Discord Bot Token
export DISCORD_TOKEN=your_discord_bot_token

# Embedding API Configuration
export OPENAI_API_KEY=your_embedding_api_key
export OPENAI_API_ENDPOINT=https://llama-embed.reeselink.com/embedding

# Image Generation (optional)
export IMAGE_GEN_ENDPOINT=http://toybox.reeselink.com:1234/v1
export IMAGE_EDIT_ENDPOINT=http://toybox.reeselink.com:1235/v1

# Database Configuration (optional)
export CHAT_DB_PATH=chat_history.db
export EMBEDDING_MODEL=qwen3-embed-4b
export EMBEDDING_DIMENSION=2048
export MAX_HISTORY_MESSAGES=1000
export SIMILARITY_THRESHOLD=0.7
export TOP_K_RESULTS=5
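A hypothetical loader for the optional variables above, mirroring the defaults listed in this README (main.py may read its configuration differently):

```python
import os

# Illustrative config loader; variable names and defaults follow the README,
# not necessarily the actual implementation in main.py.
def load_config() -> dict:
    return {
        "db_path": os.getenv("CHAT_DB_PATH", "chat_history.db"),
        "embedding_model": os.getenv("EMBEDDING_MODEL", "qwen3-embed-4b"),
        "embedding_dimension": int(os.getenv("EMBEDDING_DIMENSION", "2048")),
        "max_history_messages": int(os.getenv("MAX_HISTORY_MESSAGES", "1000")),
        "similarity_threshold": float(os.getenv("SIMILARITY_THRESHOLD", "0.7")),
        "top_k_results": int(os.getenv("TOP_K_RESULTS", "5")),
    }

config = load_config()
```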

Installation

  1. Sync dependencies with uv:
uv sync
  2. Run the bot:
uv run main.py

How It Works

Database Structure

The system uses two SQLite tables:

  1. chat_messages: Stores message metadata

    • message_id, user_id, username, content, timestamp, channel_id, guild_id
  2. message_embeddings: Stores vector embeddings for RAG

    • message_id, embedding (as binary blob)
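A minimal sketch of this two-table schema, assuming TEXT identifiers and the embedding stored as raw bytes (the exact column types and constraints live in database.py and may differ):

```python
import sqlite3

# Sketch of the tables described above; column names follow the README.
conn = sqlite3.connect(":memory:")  # the real bot would use CHAT_DB_PATH
conn.executescript("""
CREATE TABLE IF NOT EXISTS chat_messages (
    message_id TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    username   TEXT NOT NULL,
    content    TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    channel_id TEXT NOT NULL,
    guild_id   TEXT
);
CREATE TABLE IF NOT EXISTS message_embeddings (
    message_id TEXT PRIMARY KEY REFERENCES chat_messages(message_id),
    embedding  BLOB NOT NULL  -- vector stored as a binary blob
);
""")
```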

RAG Process

  1. When a message is received, it's stored in the database
  2. An embedding is generated via the OpenAI-compatible embedding API
  3. The embedding is stored alongside the message
  4. When a new message is sent to the bot:
    • The system searches for similar messages using vector similarity
    • Relevant context is retrieved and added to the prompt
    • The LLM generates a response with awareness of past conversations
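The retrieval step can be sketched as below. The helper names are illustrative, not functions from database.py, and the embedding blob is assumed to be packed float32 values:

```python
import math
import struct

def decode_embedding(blob: bytes) -> list[float]:
    """Unpack a float32 binary blob back into a vector."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine_similarity(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_similar(query_vec, stored, k=5, threshold=0.7):
    """Rank stored (message_id, blob) rows against the query embedding,
    keeping only matches at or above the similarity threshold."""
    scored = [
        (mid, cosine_similarity(query_vec, decode_embedding(blob)))
        for mid, blob in stored
    ]
    hits = [(mid, s) for mid, s in scored if s >= threshold]
    hits.sort(key=lambda t: t[1], reverse=True)
    return hits[:k]
```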

Configuration Options

  • MAX_HISTORY_MESSAGES: Maximum number of messages to keep (default: 1000)
  • SIMILARITY_THRESHOLD: Minimum similarity score for context retrieval (default: 0.7)
  • TOP_K_RESULTS: Number of similar messages to retrieve (default: 5)
  • EMBEDDING_MODEL: Embedding model to use (default: qwen3-embed-4b)

Usage

The bot maintains conversation context automatically. When you ask a question, it will:

  1. Search for similar past conversations
  2. Include relevant context in the prompt
  3. Generate responses that are aware of the conversation history

File Structure

vibe_discord_bots/
├── main.py              # Main bot application
├── database.py          # SQLite database with RAG support
├── pyproject.toml       # Project dependencies (uv)
├── .env                 # Environment variables
├── .venv/               # Virtual environment (created by uv)
└── README.md            # This file

Build

Using uv

# Set environment variables
export DISCORD_TOKEN=$(cat .token)
export OPENAI_API_KEY=your_api_key
export OPENAI_API_ENDPOINT="https://llama-cpp.reeselink.com"
export IMAGE_GEN_ENDPOINT="http://toybox.reeselink.com:1234/v1"
export IMAGE_EDIT_ENDPOINT="http://toybox.reeselink.com:1235/v1"

# Run with uv
uv run main.py

Container

# Build
podman build -t vibe-bot:latest .

# Run
podman run --env-file .env localhost/vibe-bot:latest

Docs

OpenAI

Chat

https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create

Images

https://developers.openai.com/api/reference/python/resources/images/methods/edit

Models

Qwen3.5

We recommend the following sampling parameters for generation:

  • Non-thinking mode for text tasks: temperature=1.0, top_p=1.00, top_k=20, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
  • Non-thinking mode for VL tasks: temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
  • Thinking mode for text tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
  • Thinking mode for VL or precise coding (e.g. WebDev) tasks: temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
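The presets above could be organized as below. Note that top_k, min_p, and repetition_penalty are not standard OpenAI chat-completion parameters; OpenAI-compatible servers such as llama.cpp typically accept them via extra_body, and support varies by framework. The preset keys here are illustrative names:

```python
# Sampling presets taken from the recommendations above.
SAMPLING_PRESETS = {
    "text": dict(temperature=1.0, top_p=1.00, top_k=20, min_p=0.0,
                 presence_penalty=2.0, repetition_penalty=1.0),
    "vl": dict(temperature=0.7, top_p=0.80, top_k=20, min_p=0.0,
               presence_penalty=1.5, repetition_penalty=1.0),
    "text_thinking": dict(temperature=1.0, top_p=0.95, top_k=20, min_p=0.0,
                          presence_penalty=1.5, repetition_penalty=1.0),
    "vl_thinking": dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0,
                        presence_penalty=0.0, repetition_penalty=1.0),
}

def chat_kwargs(task: str) -> dict:
    """Split a preset into standard OpenAI arguments plus an extra_body
    dict for the non-standard sampler extensions."""
    preset = SAMPLING_PRESETS[task].copy()
    extra = {k: preset.pop(k) for k in ("top_k", "min_p", "repetition_penalty")}
    return {**preset, "extra_body": extra}
```

These kwargs could then be spread into an OpenAI-compatible chat.completions.create call.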

Please note that support for these sampling parameters varies across inference frameworks.
