How AI models are built, trained, and optimized.
Backpropagation: The core algorithm for training neural networks. It computes the gradient of the loss function with respect to each weight via the chain rule, then adjusts the weights to minimize error.
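The chain-rule-then-update cycle can be sketched with a single linear neuron; the squared-error loss and all values below are illustrative assumptions, not part of the definition above.

```python
# Minimal sketch: one linear neuron pred = w * x with squared-error
# loss L = (pred - t)**2, so the chain rule gives dL/dw = 2*(pred - t)*x.
def backprop_step(w, x, t, lr=0.1):
    pred = w * x                 # forward pass
    grad = 2 * (pred - t) * x    # backward pass: chain rule
    return w - lr * grad         # adjust the weight to reduce the loss

w = 0.0
for _ in range(50):
    w = backprop_step(w, x=1.0, t=3.0)
# w has converged close to the target weight 3.0
```

Real networks repeat exactly this pattern, just with many weights and the chain rule applied layer by layer.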
Epoch: One complete pass through the entire training dataset. Models typically train for many epochs.
Batch size: The number of training examples processed before the model's weights are updated. Larger batches are more stable but use more memory.
Learning rate: A hyperparameter that controls how much to adjust weights during each update. Too high → unstable training; too low → slow convergence.
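Epochs, batch size, and learning rate all appear as knobs in one training loop. A toy sketch with a scalar model (dataset, sizes, and rates are illustrative assumptions):

```python
# Toy loop showing where epoch, batch size, and learning rate each act.
# Model: pred = w * x; loss: mean squared error over a batch; true w = 2.0.
data = [(x, 2.0 * x) for x in range(1, 9)]    # 8 training examples
batch_size = 4
learning_rate = 0.01
w = 0.0

for epoch in range(20):                        # one epoch = one full pass
    for i in range(0, len(data), batch_size):  # one weight update per batch
        batch = data[i:i + batch_size]
        grad = sum(2 * (w * x - t) * x for x, t in batch) / len(batch)
        w -= learning_rate * grad              # learning rate scales the step
# w ends very close to 2.0
```

Doubling `batch_size` halves the number of updates per epoch; raising `learning_rate` too far makes the same loop diverge.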
Transfer learning: Using a model trained on one task as the starting point for a model on a second task. Saves time and data.
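A toy sketch of the idea, assuming two related scalar regression tasks (all data here is made up): weights learned on task A let task B converge in far fewer updates than starting from zero.

```python
# Toy sketch: reuse parameters learned on task A as the starting point
# for task B, instead of training task B from scratch.
def train(w, data, lr=0.05, epochs=40):
    for _ in range(epochs):
        for x, t in data:
            w -= lr * 2 * (w * x - t) * x
    return w

task_a = [(1.0, 2.0), (2.0, 4.0)]   # true weight 2.0
task_b = [(1.0, 2.2), (2.0, 4.4)]   # related task, true weight 2.2
w_pretrained = train(0.0, task_a)
w_transferred = train(w_pretrained, task_b, epochs=5)  # short fine-tune
# w_transferred is already near 2.2 after only 5 epochs
```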
Data augmentation: Artificially expanding a training dataset by applying transformations (e.g., rotation, flipping, synonym replacement) to create new training examples.
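The flipping transformation can be sketched on a tiny made-up "image" (a 2x2 grid of pixel values); labels carry over unchanged.

```python
# Sketch: double a tiny labeled image dataset by horizontal flipping.
def hflip(image):
    return [row[::-1] for row in image]   # reverse each pixel row

dataset = [([[1, 0], [0, 1]], "cat")]
augmented = dataset + [(hflip(img), label) for img, label in dataset]
# augmented now holds the original plus one flipped copy, same label
```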
RLHF (Reinforcement Learning from Human Feedback): A technique to align model outputs with human preferences. Humans rank model responses, and a reward model is trained on those rankings. The main model is then fine-tuned to maximize the reward.
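The middle stage, turning rankings into a reward model, can be sketched in miniature. This toy learns one scalar score per response from (preferred, rejected) pairs with a Bradley-Terry-style update; the responses and hyperparameters are illustrative assumptions, and the final fine-tuning stage is omitted.

```python
import math

# Toy reward-model stage: learn a score per response from human rankings.
def train_reward_model(pairs, steps=200, lr=0.5):
    scores = {}
    for _ in range(steps):
        for preferred, rejected in pairs:
            s_p = scores.get(preferred, 0.0)
            s_r = scores.get(rejected, 0.0)
            p = 1 / (1 + math.exp(s_r - s_p))   # P(preferred wins)
            scores[preferred] = s_p + lr * (1 - p)  # push winner up
            scores[rejected] = s_r - lr * (1 - p)   # push loser down
    return scores

rankings = [("concise, correct answer", "rambling, wrong answer")]
scores = train_reward_model(rankings)
# The preferred response ends up with the higher reward score.
```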
Supervised fine-tuning: Fine-tuning a model on a dataset of input-output pairs to teach it a specific format or style of response.
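An illustrative shape for such a dataset, and one way pairs might be flattened into training text; this template is an assumption for illustration, not a standard format.

```python
# Hypothetical (prompt, response) pairs for supervised fine-tuning.
sft_pairs = [
    {"prompt": "Summarize: The cat sat on the mat.",
     "response": "A cat sat on a mat."},
]

def to_training_text(pair):
    # Assumed instruction/response template; real formats vary by model.
    return (f"### Instruction:\n{pair['prompt']}\n"
            f"### Response:\n{pair['response']}")

examples = [to_training_text(p) for p in sft_pairs]
```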
Prompt engineering: Carefully crafting prompts to guide the model's behavior instead of changing its weights. Zero-cost and reversible.
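An illustrative few-shot prompt: the model's behavior is steered entirely by the examples placed in the string, and undoing the change means editing the string, not retraining.

```python
# Few-shot prompt (illustrative): the examples teach the pattern in-context.
few_shot_prompt = (
    "Translate English to French.\n"
    "sea -> mer\n"
    "sky -> ciel\n"
    "cat -> "
)
# Swapping the examples redirects the model; no weights change.
```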
LoRA (Low-Rank Adaptation): An efficient fine-tuning technique that adds small trainable matrices to a frozen pre-trained model, drastically reducing compute and memory needs.
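The parameter savings come from replacing a full n x n update with a rank-r product B @ A, where r is much smaller than n. A toy sketch with made-up sizes and values:

```python
# Toy LoRA sketch: frozen W plus a rank-1 update from two small matrices.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

n, r = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # frozen
B = [[0.1] for _ in range(n)]          # n x r, trainable
A = [[0.2, 0.2, 0.2, 0.2]]             # r x n, trainable
delta = matmul(B, A)                   # n x n low-rank update
W_adapted = [[W[i][j] + delta[i][j] for j in range(n)] for i in range(n)]

trainable = n * r + r * n              # 8 parameters vs n*n = 16 for full FT
```

At realistic sizes (n in the thousands, r around 8 to 64), the same ratio makes the trainable parameter count a tiny fraction of the full matrix.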
Quantization: Reducing the precision of model weights (e.g., from 32-bit to 8-bit) to shrink model size and speed up inference with minimal accuracy loss.
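A minimal sketch of symmetric 8-bit quantization: floats are mapped into the int8 range [-127, 127] with one stored scale factor, then approximately recovered at inference time.

```python
# Symmetric 8-bit quantization sketch: small ints plus one float scale.
def quantize(weights):
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25]
q, scale = quantize(weights)        # e.g. ints in [-127, 127]
restored = dequantize(q, scale)     # close to the original floats
```

Each restored value is off by at most one quantization step, which is the "minimal accuracy loss" in the definition.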
Knowledge distillation: Training a smaller "student" model to mimic the behavior of a larger "teacher" model, capturing its knowledge in a more compact form.
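A toy sketch of the training signal: the student fits the teacher's outputs rather than the original labels. Here the "teacher" is a stand-in function and the "student" a single weight, both illustrative assumptions.

```python
# Toy distillation: the student regresses onto the teacher's outputs.
def teacher(x):
    return 2.0 * x          # stands in for a large pre-trained model

student_w = 0.0
for _ in range(100):
    for x in [1.0, 2.0, 3.0]:
        grad = 2 * (student_w * x - teacher(x)) * x  # match the teacher
        student_w -= 0.01 * grad
# student_w ends close to the teacher's effective weight of 2.0
```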
Speculative decoding: Using a small model to draft multiple tokens, then having the large model verify them in parallel, speeding up generation.
Retrieval-augmented generation (RAG): Augmenting a language model with an external knowledge retrieval step. The model first searches a knowledge base, then generates a response using both the retrieved info and its own training.
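A minimal sketch of the retrieve-then-generate pipeline, using word overlap as a stand-in for real retrieval (production systems typically use embeddings and a vector index). Documents and names are illustrative.

```python
# Minimal RAG sketch: retrieve the best-matching document, then build
# the prompt that the generator would see.
documents = [
    "The Eiffel Tower is in Paris.",
    "Python was created by Guido van Rossum.",
]

def words(text):
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(question):
    return max(documents, key=lambda d: len(words(d) & words(question)))

def build_prompt(question):
    return f"Context: {retrieve(question)}\nQuestion: {question}\nAnswer:"

prompt = build_prompt("Who created Python?")
# The prompt now carries the retrieved fact alongside the question.
```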
Tool use (function calling): Giving an LLM the ability to call external tools (search, calculators, APIs) to accomplish multi-step tasks.
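The runtime side can be sketched as a dispatcher: the model emits a structured call, the host executes the matching function, and the result is fed back for the model's next step. The call format here is an illustrative assumption.

```python
# Toy tool dispatcher for a structured call emitted by the model.
def calculator(expression):
    return eval(expression)   # demo only; never eval untrusted input

TOOLS = {"calculator": calculator}

model_output = {"tool": "calculator",
                "args": {"expression": "2 + 2 * 10"}}
result = TOOLS[model_output["tool"]](**model_output["args"])
# result would be returned to the model as the tool's observation
```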
Chain-of-thought prompting: Asking a model to show its reasoning step-by-step before giving an answer. Dramatically improves performance on reasoning tasks.
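An illustrative one-shot chain-of-thought prompt: the worked example shows the model how to reason aloud before committing to an answer. The questions are made up for illustration.

```python
# One-shot chain-of-thought prompt (illustrative).
cot_prompt = (
    "Q: A shop has 3 boxes of 12 apples and sells 10. How many remain?\n"
    "A: Let's think step by step. 3 x 12 = 36 apples. 36 - 10 = 26.\n"
    "The answer is 26.\n"
    "Q: A train has 4 cars of 20 seats, and 15 seats are empty. "
    "How many seats are taken?\n"
    "A: Let's think step by step."
)
```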