everything working again after cleanup

2026-05-23 23:56:03 -04:00
parent 6ec9fbe85f
commit 87a578f1de
13 changed files with 380 additions and 200 deletions
@@ -1,217 +1,271 @@
 # Vibe Discord Bot with RAG Chat History

-A Discord bot that stores long-term chat history using SQLite database with RAG (Retrieval-Augmented Generation) capabilities powered by custom embedding models.
+A Discord bot that stores long-term chat history using SQLite with RAG (Retrieval-Augmented Generation) capabilities. It supports custom bots with personalities, text-to-speech via Kokoro, image generation, and image editing.

 - [Vibe Discord Bot with RAG Chat History](#vibe-discord-bot-with-rag-chat-history)
-  - [Quick Start - Available Commands](#quick-start---available-commands)
-    - [Pre-built Bots](#pre-built-bots)
+  - [Available Commands](#available-commands)
    - [Custom Bot Management](#custom-bot-management)
    - [Using Custom Bots](#using-custom-bots)
+    - [Text-to-Speech](#text-to-speech)
+    - [Image Commands](#image-commands)
+    - [Bot Conversations](#bot-conversations)
  - [Features](#features)
  - [Setup](#setup)
    - [Prerequisites](#prerequisites)
    - [Environment Variables](#environment-variables)
    - [Installation](#installation)
+    - [Running the Bot](#running-the-bot)
  - [How It Works](#how-it-works)
    - [Database Structure](#database-structure)
    - [RAG Process](#rag-process)
-    - [Configuration Options](#configuration-options)
-  - [Usage](#usage)
  - [File Structure](#file-structure)
-  - [Build](#build)
-    - [Using uv](#using-uv)
+  - [Building](#building)
+    - [Local](#local)
    - [Container](#container)
-  - [Docs](#docs)
-    - [Open AI](#open-ai)
-  - [Models](#models)
-    - [Qwen3.5](#qwen35)
+  - [Testing](#testing)
+  - [Configuration](#configuration)

-
-## Quick Start - Available Commands
-
-### Pre-built Bots
-
-| Command      | Description                   | Example Usage                              |
-| ------------ | ----------------------------- | ------------------------------------------ |
-| `!doodlebob` | Generate images from text     | `!doodlebob a cat sitting on a moon`       |
-| `!retcon`    | Edit images with text prompts | `!retcon <image attachment> Make it sunny` |
+## Available Commands

 ### Custom Bot Management

-| Command                        | Description                                   | Example Usage                                    |
-| ------------------------------ | --------------------------------------------- | ------------------------------------------------ |
-| `!custom <name> <personality>` | Create a custom bot with specific personality | `!custom alfred you are a proper british butler` |
-| `!list-custom-bots`            | List all available custom bots                | `!list-custom-bots`                              |
-| `!delete-custom-bot <name>`    | Delete your custom bot                        | `!delete-custom-bot alfred`                      |
+| Command                            | Description                            | Example Usage                                        |
+| ---------------------------------- | -------------------------------------- | ---------------------------------------------------- |
+| `!custom-bot <name> <personality>` | Create a custom bot with a personality | `!custom-bot alfred you are a proper british butler` |
+| `!list-custom-bots`                | List all available custom bots         | `!list-custom-bots`                                  |
+| `!delete-custom-bot <name>`        | Delete your custom bot (owner only)    | `!delete-custom-bot alfred`                          |

 ### Using Custom Bots

-Once you create a custom bot, you can interact with it directly by prefixing your message with the bot name:
+Once you create a custom bot, interact with it by prefixing your message with the bot name:

-```bash
+```text
 !<bot_name> <your message>
 ```

 **Example:**

-1. Create a bot: `!custom alfred you are a proper british butler`
+1. Create a bot: `!custom-bot alfred you are a proper british butler`
 2. Use the bot: `!alfred Could you fetch me some tea?`
 3. The bot will respond in character as a British butler

+### Text-to-Speech
+
+| Command                    | Description                             | Example Usage                   |
+| -------------------------- | --------------------------------------- | ------------------------------- |
+| `!speak <text>`            | Convert text to speech (MP3 attachment) | `!speak hello world`            |
+| `!speak <bot_name> <text>` | Have a custom bot respond and speak     | `!speak alfred what time is it` |
+
+### Image Commands
+
+| Command      | Description                          | Example Usage                              |
+| ------------ | ------------------------------------ | ------------------------------------------ |
+| `!doodlebob` | Generate an image from a text prompt | `!doodlebob a cat sitting on the moon`     |
+| `!retcon`    | Edit an attached image with text     | `!retcon <image attachment> Make it sunny` |
+
+### Bot Conversations
+
+| Command                                | Description                                 | Example Usage                                    |
+| -------------------------------------- | ------------------------------------------- | ------------------------------------------------ |
+| `!talkforme <bot1> <bot2> <n> <topic>` | Have two bots discuss a topic for n replies | `!talkforme alfred jarvis 4 the meaning of life` |
+
 ## Features

-- **Long-term chat history storage**: Persistent storage of all bot interactions
+- **Long-term chat history storage**: Persistent storage of all bot interactions in SQLite
 - **RAG-based context retrieval**: Smart retrieval of relevant conversation history using vector embeddings
-- **Custom embedding model**: Uses qwen3-embed-4b for semantic search capabilities
-- **Efficient message management**: Automatic cleanup of old messages based on configurable limits
-
-- **Long-term chat history storage**: Persistent storage of all bot interactions
-- **RAG-based context retrieval**: Smart retrieval of relevant conversation history using vector embeddings
-- **Custom embedding model**: Uses qwen3-embed-4b for semantic search capabilities
-- **Efficient message management**: Automatic cleanup of old messages based on configurable limits
+- **Custom bots**: Create unlimited bots with unique personalities
+- **Text-to-speech**: Kokoro TTS engine converts bot responses to MP3 audio
+- **Image generation**: Generate images from text prompts via an OpenAI-compatible API
+- **Image editing**: Edit uploaded images with text instructions
+- **Bot conversations**: Two custom bots can discuss a topic autonomously
+- **Automatic message cleanup**: Configurable limits on stored messages

 ## Setup

 ### Prerequisites

-- Python 3.10 or higher
+- Python 3.13 or higher
 - [uv](https://docs.astral.sh/uv/) package manager
-- Embedding API key
 - Discord bot token
+- OpenAI-compatible API endpoints (for chat, embeddings, and image generation)

 ### Environment Variables

-Create a `.env` file or export the following variables:
+Create a `.env` file with the following variables:

 ```bash
-# Discord Bot Token
-export DISCORD_TOKEN=your_discord_bot_token
+# Discord Bot Token (required)
+DISCORD_TOKEN=your_discord_bot_token

-# Embedding API Configuration
-export OPENAI_API_KEY=your_embedding_api_key
-export OPENAI_API_ENDPOINT=https://llama-embed.reeselink.com/embedding
+# Chat/Completion API (required)
+CHAT_ENDPOINT=https://your-api.com/v1
+COMPLETION_ENDPOINT=https://your-api.com/v1
+CHAT_ENDPOINT_KEY=your_api_key
+COMPLETION_ENDPOINT_KEY=your_api_key
+CHAT_MODEL=your_model_name
+COMPLETION_MODEL=your_model_name

-# Image Generation (optional)
-export IMAGE_GEN_ENDPOINT=http://toybox.reeselink.com:1234/v1
-export IMAGE_EDIT_ENDPOINT=http://toybox.reeselink.com:1235/v1
+# Image Generation (required)
+IMAGE_GEN_ENDPOINT=https://your-api.com/v1
+IMAGE_EDIT_ENDPOINT=https://your-api.com/v1
+IMAGE_GEN_ENDPOINT_KEY=your_api_key
+IMAGE_EDIT_ENDPOINT_KEY=your_api_key
+IMAGE_GEN_MODEL=gen
+IMAGE_EDIT_MODEL=edit

-# Database Configuration (optional)
-export CHAT_DB_PATH=chat_history.db
-export EMBEDDING_MODEL=qwen3-embed-4b
-export EMBEDDING_DIMENSION=2048
-export MAX_HISTORY_MESSAGES=1000
-export SIMILARITY_THRESHOLD=0.7
-export TOP_K_RESULTS=5
+# Embedding API (required)
+EMBEDDING_ENDPOINT=https://your-api.com/v1
+EMBEDDING_ENDPOINT_KEY=your_api_key
+EMBEDDING_MODEL=your_embed_model
+
+# Optional: TTS Configuration
+TTS_MODEL_PATH=kokoro-v1.0.onnx
+TTS_VOICES_PATH=voices-v1.0.bin
+TTS_VOICE=af_sarah
+TTS_SPEED=1.0
+
+# Optional: Database/Chat Settings
+DB_PATH=chat_history.db
+MAX_COMPLETION_TOKENS=1000
+MAX_HISTORY_MESSAGES=1000
+SIMILARITY_THRESHOLD=0.7
+TOP_K_RESULTS=5
 ```

 ### Installation

-1. Sync dependencies with uv:
-```bash
-uv sync
-```
+1. Clone the repository and sync dependencies:
+
+    ```bash
+    uv sync
+    ```
+
+2. Ensure the TTS model files are present in the project root:
+
+   - `kokoro-v1.0.onnx`
+   - `voices-v1.0.bin`
+
+### Running the Bot

-2. Run the bot:
 ```bash
-uv run main.py
+uv run python -m vibe_bot.main
 ```

 ## How It Works

 ### Database Structure

-The system uses two SQLite tables:
+The system uses SQLite with three tables:

 1. **chat_messages**: Stores message metadata
-   - message_id, user_id, username, content, timestamp, channel_id, guild_id
+   - `message_id`, `user_id`, `username`, `content`, `timestamp`, `channel_id`, `guild_id`

 2. **message_embeddings**: Stores vector embeddings for RAG
-   - message_id, embedding (as binary blob)
+   - `message_id` (PK), `embedding` (binary blob of float32 values)
+
+3. **custom_bots**: Stores custom bot configurations
+   - `bot_name` (PK), `system_prompt`, `created_by`, `created_at`, `is_active`

 ### RAG Process

-1. When a message is received, it's stored in the database
-2. An embedding is generated using OpenAI's embedding API
-3. The embedding is stored alongside the message
-4. When a new message is sent to the bot:
-   - The system searches for similar messages using vector similarity
-   - Relevant context is retrieved and added to the prompt
+1. When a message is sent to a custom bot, it's stored in `chat_messages`
+2. An embedding is generated via the configured embedding API and stored in `message_embeddings`
+3. When a new message is sent:
+   - The system retrieves recent messages from the same user
+   - It searches for semantically similar messages using cosine similarity on embeddings
+   - Relevant context (user + bot message pairs) is prepended to the prompt
   - The LLM generates a response with awareness of past conversations

-### Configuration Options
-
-- **MAX_HISTORY_MESSAGES**: Maximum number of messages to keep (default: 1000)
-- **SIMILARITY_THRESHOLD**: Minimum similarity score for context retrieval (default: 0.7)
-- **TOP_K_RESULTS**: Number of similar messages to retrieve (default: 5)
-- **EMBEDDING_MODEL**: OpenAI embedding model to use (default: text-embedding-3-small)
-
-## Usage
-
-The bot maintains conversation context automatically. When you ask a question, it will:
-
-1. Search for similar past conversations
-2. Include relevant context in the prompt
-3. Generate responses that are aware of the conversation history
-
 ## File Structure

 ```text
 vibe_discord_bots/
-├── main.py              # Main bot application
-├── database.py          # SQLite database with RAG support
-├── pyproject.toml       # Project dependencies (uv)
-├── .env                 # Environment variables
-├── .venv/               # Virtual environment (created by uv)
-└── README.md           # This file
+├── vibe_bot/
+│   ├── __init__.py            # Package marker
+│   ├── main.py                # Main bot application (commands, event handlers)
+│   ├── config.py              # Environment variable loading and validation
+│   ├── database.py            # SQLite database with RAG + CustomBotManager
+│   ├── llama_wrapper.py       # OpenAI-compatible API wrappers (chat, images, embeddings)
+│   ├── tts.py                 # Kokoro TTS engine
+│   └── tests/
+│       ├── conftest.py        # Shared test fixtures
+│       ├── test_main.py       # Bot command tests
+│       ├── test_config.py     # Config loading tests
+│       ├── test_database.py   # Database + CustomBotManager tests
+│       ├── test_llama_wrapper.py  # API wrapper tests
+│       └── test_tts.py        # TTS engine tests
+├── pyproject.toml             # Project dependencies (uv)
+├── uv.lock                    # Locked dependency versions
+├── .env                       # Environment variables
+├── kokoro-v1.0.onnx           # Kokoro TTS model
+├── voices-v1.0.bin            # Kokoro voice definitions
+├── Containerfile              # Podman/Docker build file
+└── README.md                  # This file
 ```

-## Build
+## Building

-### Using uv
+### Local

 ```bash
-# Set environment variables
-export DISCORD_TOKEN=$(cat .token)
-export OPENAI_API_KEY=your_api_key
-export OPENAI_API_ENDPOINT="https://llama-cpp.reeselink.com"
-export IMAGE_GEN_ENDPOINT="http://toybox.reeselink.com:1234/v1"
-export IMAGE_EDIT_ENDPOINT="http://toybox.reeselink.com:1235/v1"
+# Sync dependencies
+uv sync

-# Run with uv
-uv run main.py
+# Run the bot
+uv run python -m vibe_bot.main
 ```

 ### Container

 ```bash
-# Build
+# Build the container image
 podman build -t vibe-bot:latest .

-# Run
+# Run with environment file
 podman run --env-file .env localhost/vibe-bot:latest
 ```

-## Docs
+## Testing

-### Open AI
+Run the full test suite:

-Chat
+```bash
+uv run pytest vibe_bot/tests/ -v
+```

-<https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create>
+Run linters:

-Images
+```bash
+# Ruff (linter + formatter)
+uv run ruff check vibe_bot/

-<https://developers.openai.com/api/reference/python/resources/images/methods/edit>
+# Mypy (type checking)
+uv run mypy vibe_bot/

-## Models
+# Pyright (type checking)
+uv run pyright vibe_bot/

-### Qwen3.5
+# Black (formatter check)
+uv run black --check vibe_bot/
+```

-> We recommend using the following set of sampling parameters for generation
+## Configuration

-- Non-thinking mode for text tasks: temperature=1.0, top_p=1.00, top_k=20, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
-- Non-thinking mode for VL tasks: temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
-- Thinking mode for text tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
-- Thinking mode for VL or precise coding (e.g. WebDev) tasks : temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
-
-> Please note that the support for sampling parameters varies according to inference frameworks.
+| Variable                | Default            | Description                           |
+| ----------------------- | ------------------ | ------------------------------------- |
+| `DISCORD_TOKEN`         | *(required)*       | Discord bot authentication token      |
+| `CHAT_ENDPOINT`         | *(required)*       | OpenAI-compatible chat API URL        |
+| `CHAT_MODEL`            | *(required)*       | Model name for chat completions       |
+| `IMAGE_GEN_ENDPOINT`    | *(required)*       | Image generation API URL              |
+| `IMAGE_EDIT_ENDPOINT`   | *(required)*       | Image editing API URL                 |
+| `EMBEDDING_ENDPOINT`    | *(required)*       | Embedding API URL                     |
+| `EMBEDDING_MODEL`       | *(required)*       | Model name for text embeddings        |
+| `MAX_COMPLETION_TOKENS` | `1000`             | Max tokens in LLM responses           |
+| `MAX_HISTORY_MESSAGES`  | `1000`             | Max messages kept in the database     |
+| `SIMILARITY_THRESHOLD`  | `0.7`              | Min cosine similarity for RAG context |
+| `TOP_K_RESULTS`         | `5`                | Number of similar messages retrieved  |
+| `TTS_MODEL_PATH`        | `kokoro-v1.0.onnx` | Path to Kokoro ONNX model file        |
+| `TTS_VOICES_PATH`       | `voices-v1.0.bin`  | Path to Kokoro voices binary file     |
+| `TTS_VOICE`             | `af_sarah`         | Default voice for TTS                 |
+| `TTS_SPEED`             | `1.0`              | Speech speed multiplier               |
+| `DB_PATH`               | `chat_history.db`  | SQLite database file path             |
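
The retrieval step described under "RAG Process" (float32 embedding blobs, cosine similarity, `SIMILARITY_THRESHOLD`, `TOP_K_RESULTS`) could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in `database.py`: the function names are hypothetical, and little-endian byte order for the blob is an assumption, since the README only says "binary blob of float32 values".

```python
import math
import struct


def pack_embedding(vector):
    """Serialize a list of floats into a float32 blob (little-endian assumed)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob):
    """Deserialize a float32 blob back into a list of floats (4 bytes each)."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, stored, threshold=0.7, top_k=5):
    """Return up to top_k (message_id, score) pairs with score >= threshold.

    `stored` maps message_id -> float32 blob, mirroring the columns of
    `message_embeddings`. Results are sorted by descending similarity.
    """
    scored = []
    for message_id, blob in stored.items():
        score = cosine_similarity(query, unpack_embedding(blob))
        if score >= threshold:
            scored.append((message_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings; real embeddings have far more dimensions.
    stored = {
        1: pack_embedding([1.0, 0.0]),  # same direction as the query
        2: pack_embedding([0.0, 1.0]),  # orthogonal, similarity 0
        3: pack_embedding([1.0, 1.0]),  # 45 degrees off, similarity about 0.707
    }
    for message_id, score in top_k_similar([1.0, 0.0], stored):
        print(message_id, round(score, 3))
```

With the default threshold of `0.7`, messages 1 and 3 are returned and message 2 is filtered out. A linear scan like this is adequate for a store capped by `MAX_HISTORY_MESSAGES`; a larger store would likely want a vector index instead.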