How AI models are built, trained, and optimized.
Backpropagation: The core algorithm for training neural networks. It computes the gradient of the loss function with respect to each weight via the chain rule, then adjusts the weights to minimize error.
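The chain-rule-then-update cycle can be sketched with a single linear neuron; the squared-error loss and all values below are illustrative assumptions, not part of the definition above.

```python
# Minimal sketch: one linear neuron pred = w * x with squared-error
# loss L = (pred - t)**2, so the chain rule gives dL/dw = 2*(pred - t)*x.
def backprop_step(w, x, t, lr=0.1):
    pred = w * x                 # forward pass
    grad = 2 * (pred - t) * x    # backward pass: chain rule
    return w - lr * grad         # adjust the weight to reduce the loss

w = 0.0
for _ in range(50):
    w = backprop_step(w, x=1.0, t=3.0)
# w has converged close to the target weight 3.0
```

Real networks repeat exactly this pattern, just with many weights and the chain rule applied layer by layer.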
Epoch: One complete pass through the entire training dataset. Models typically train for many epochs.
Batch size: The number of training examples processed before the model's weights are updated. Larger batches are more stable but use more memory.
Learning rate: A hyperparameter that controls how much to adjust weights during each update. Too high → unstable training; too low → slow convergence.
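Epochs, batch size, and learning rate all appear as knobs in one training loop. A toy sketch with a scalar model (dataset, sizes, and rates are illustrative assumptions):

```python
# Toy loop showing where epoch, batch size, and learning rate each act.
# Model: pred = w * x; loss: mean squared error over a batch; true w = 2.0.
data = [(x, 2.0 * x) for x in range(1, 9)]    # 8 training examples
batch_size = 4
learning_rate = 0.01
w = 0.0

for epoch in range(20):                        # one epoch = one full pass
    for i in range(0, len(data), batch_size):  # one weight update per batch
        batch = data[i:i + batch_size]
        grad = sum(2 * (w * x - t) * x for x, t in batch) / len(batch)
        w -= learning_rate * grad              # learning rate scales the step
# w ends very close to 2.0
```

Doubling `batch_size` halves the number of updates per epoch; raising `learning_rate` too far makes the same loop diverge.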
Transfer learning: Using a model trained on one task as the starting point for a model on a second task. Saves time and data.
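A toy sketch of the idea, assuming two related scalar regression tasks (all data here is made up): weights learned on task A let task B converge in far fewer updates than starting from zero.

```python
# Toy sketch: reuse parameters learned on task A as the starting point
# for task B, instead of training task B from scratch.
def train(w, data, lr=0.05, epochs=40):
    for _ in range(epochs):
        for x, t in data:
            w -= lr * 2 * (w * x - t) * x
    return w

task_a = [(1.0, 2.0), (2.0, 4.0)]   # true weight 2.0
task_b = [(1.0, 2.2), (2.0, 4.4)]   # related task, true weight 2.2
w_pretrained = train(0.0, task_a)
w_transferred = train(w_pretrained, task_b, epochs=5)  # short fine-tune
# w_transferred is already near 2.2 after only 5 epochs
```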
Data augmentation: Artificially expanding a training dataset by applying transformations (e.g., rotation, flipping, synonym replacement) to create new training examples.
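The flipping transformation can be sketched on a tiny made-up "image" (a 2x2 grid of pixel values); labels carry over unchanged.

```python
# Sketch: double a tiny labeled image dataset by horizontal flipping.
def hflip(image):
    return [row[::-1] for row in image]   # reverse each pixel row

dataset = [([[1, 0], [0, 1]], "cat")]
augmented = dataset + [(hflip(img), label) for img, label in dataset]
# augmented now holds the original plus one flipped copy, same label
```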
RLHF (Reinforcement Learning from Human Feedback): A technique to align model outputs with human preferences. Humans rank model responses, and a reward model is trained on those rankings. The main model is then fine-tuned to maximize the reward.
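The middle stage, turning rankings into a reward model, can be sketched in miniature. This toy learns one scalar score per response from (preferred, rejected) pairs with a Bradley-Terry-style update; the responses and hyperparameters are illustrative assumptions, and the final fine-tuning stage is omitted.

```python
import math

# Toy reward-model stage: learn a score per response from human rankings.
def train_reward_model(pairs, steps=200, lr=0.5):
    scores = {}
    for _ in range(steps):
        for preferred, rejected in pairs:
            s_p = scores.get(preferred, 0.0)
            s_r = scores.get(rejected, 0.0)
            p = 1 / (1 + math.exp(s_r - s_p))   # P(preferred wins)
            scores[preferred] = s_p + lr * (1 - p)  # push winner up
            scores[rejected] = s_r - lr * (1 - p)   # push loser down
    return scores

rankings = [("concise, correct answer", "rambling, wrong answer")]
scores = train_reward_model(rankings)
# The preferred response ends up with the higher reward score.
```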
Supervised fine-tuning: Fine-tuning a model on a dataset of input-output pairs to teach it a specific format or style of response.
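An illustrative shape for such a dataset, and one way pairs might be flattened into training text; this template is an assumption for illustration, not a standard format.

```python
# Hypothetical (prompt, response) pairs for supervised fine-tuning.
sft_pairs = [
    {"prompt": "Summarize: The cat sat on the mat.",
     "response": "A cat sat on a mat."},
]

def to_training_text(pair):
    # Assumed instruction/response template; real formats vary by model.
    return (f"### Instruction:\n{pair['prompt']}\n"
            f"### Response:\n{pair['response']}")

examples = [to_training_text(p) for p in sft_pairs]
```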
Prompt engineering: Carefully crafting prompts to guide the model's behavior instead of changing its weights. Zero-cost and reversible.
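An illustrative few-shot prompt: the model's behavior is steered entirely by the examples placed in the string, and undoing the change means editing the string, not retraining.

```python
# Few-shot prompt (illustrative): the examples teach the pattern in-context.
few_shot_prompt = (
    "Translate English to French.\n"
    "sea -> mer\n"
    "sky -> ciel\n"
    "cat -> "
)
# Swapping the examples redirects the model; no weights change.
```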
LoRA (Low-Rank Adaptation): An efficient fine-tuning technique that adds small trainable matrices to a frozen pre-trained model, drastically reducing compute and memory needs.
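The parameter savings come from replacing a full n x n update with a rank-r product B @ A, where r is much smaller than n. A toy sketch with made-up sizes and values:

```python
# Toy LoRA sketch: frozen W plus a rank-1 update from two small matrices.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

n, r = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # frozen
B = [[0.1] for _ in range(n)]          # n x r, trainable
A = [[0.2, 0.2, 0.2, 0.2]]             # r x n, trainable
delta = matmul(B, A)                   # n x n low-rank update
W_adapted = [[W[i][j] + delta[i][j] for j in range(n)] for i in range(n)]

trainable = n * r + r * n              # 8 parameters vs n*n = 16 for full FT
```

At realistic sizes (n in the thousands, r around 8 to 64), the same ratio makes the trainable parameter count a tiny fraction of the full matrix.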
Quantization: Reducing the precision of model weights (e.g., from 32-bit to 8-bit) to shrink model size and speed up inference with minimal accuracy loss.
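A minimal sketch of symmetric 8-bit quantization: floats are mapped into the int8 range [-127, 127] with one stored scale factor, then approximately recovered at inference time.

```python
# Symmetric 8-bit quantization sketch: small ints plus one float scale.
def quantize(weights):
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25]
q, scale = quantize(weights)        # e.g. ints in [-127, 127]
restored = dequantize(q, scale)     # close to the original floats
```

Each restored value is off by at most one quantization step, which is the "minimal accuracy loss" in the definition.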
Knowledge distillation: Training a smaller "student" model to mimic the behavior of a larger "teacher" model, capturing its knowledge in a more compact form.
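A toy sketch of the training signal: the student fits the teacher's outputs rather than the original labels. Here the "teacher" is a stand-in function and the "student" a single weight, both illustrative assumptions.

```python
# Toy distillation: the student regresses onto the teacher's outputs.
def teacher(x):
    return 2.0 * x          # stands in for a large pre-trained model

student_w = 0.0
for _ in range(100):
    for x in [1.0, 2.0, 3.0]:
        grad = 2 * (student_w * x - teacher(x)) * x  # match the teacher
        student_w -= 0.01 * grad
# student_w ends close to the teacher's effective weight of 2.0
```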
Speculative decoding: Using a small model to draft multiple tokens, then having the large model verify them in parallel, speeding up generation.
Retrieval-augmented generation (RAG): Augmenting a language model with an external knowledge retrieval step. The model first searches a knowledge base, then generates a response using both the retrieved info and its own training.
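A minimal sketch of the retrieve-then-generate pipeline, using word overlap as a stand-in for real retrieval (production systems typically use embeddings and a vector index). Documents and names are illustrative.

```python
# Minimal RAG sketch: retrieve the best-matching document, then build
# the prompt that the generator would see.
documents = [
    "The Eiffel Tower is in Paris.",
    "Python was created by Guido van Rossum.",
]

def words(text):
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(question):
    return max(documents, key=lambda d: len(words(d) & words(question)))

def build_prompt(question):
    return f"Context: {retrieve(question)}\nQuestion: {question}\nAnswer:"

prompt = build_prompt("Who created Python?")
# The prompt now carries the retrieved fact alongside the question.
```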
Tool use (function calling): Giving an LLM the ability to call external tools (search, calculators, APIs) to accomplish multi-step tasks.
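The runtime side can be sketched as a dispatcher: the model emits a structured call, the host executes the matching function, and the result is fed back for the model's next step. The call format here is an illustrative assumption.

```python
# Toy tool dispatcher for a structured call emitted by the model.
def calculator(expression):
    return eval(expression)   # demo only; never eval untrusted input

TOOLS = {"calculator": calculator}

model_output = {"tool": "calculator",
                "args": {"expression": "2 + 2 * 10"}}
result = TOOLS[model_output["tool"]](**model_output["args"])
# result would be returned to the model as the tool's observation
```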
Chain-of-thought prompting: Asking a model to show its reasoning step-by-step before giving an answer. Dramatically improves performance on reasoning tasks.
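An illustrative one-shot chain-of-thought prompt: the worked example shows the model how to reason aloud before committing to an answer. The questions are made up for illustration.

```python
# One-shot chain-of-thought prompt (illustrative).
cot_prompt = (
    "Q: A shop has 3 boxes of 12 apples and sells 10. How many remain?\n"
    "A: Let's think step by step. 3 x 12 = 36 apples. 36 - 10 = 26.\n"
    "The answer is 26.\n"
    "Q: A train has 4 cars of 20 seats, and 15 seats are empty. "
    "How many seats are taken?\n"
    "A: Let's think step by step."
)
```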