AI Terminology

Essential terms every AI learner should know.

Machine Learning Basics

Machine Learning (ML)

A subset of AI where systems learn patterns from data to make decisions or predictions without being explicitly programmed for each task.

Supervised Learning

Training a model on labeled data — each example has an input and a known correct output. The model learns to map inputs to outputs.

Example: Training on emails labeled "spam" or "not spam" to build a spam filter.
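The spam-filter example above can be sketched in a few lines of Python. This is a toy word-counting classifier on made-up data, not a real spam filter — the emails, labels, and scoring rule are all illustrative:

```python
from collections import Counter

# Labeled training data: each example pairs an input with a known correct output.
labeled = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]

# "Training": count how often each word appears under each label.
spam_words, ham_words = Counter(), Counter()
for text, label in labeled:
    (spam_words if label == "spam" else ham_words).update(text.split())

def classify(text):
    # Score a new email by how spam-like its words were in the training data.
    score = sum(spam_words[w] - ham_words[w] for w in text.split())
    return "spam" if score > 0 else "not spam"

print(classify("free money"))    # → spam
print(classify("noon meeting"))  # → not spam
```

The key supervised-learning ingredient is the label: the mapping from inputs to outputs is learned from examples where the correct answer is already known.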

Unsupervised Learning

Training on unlabeled data — the model finds hidden patterns or groupings on its own.

Example: Grouping customers by purchasing behavior without pre-defined categories.

Reinforcement Learning

An agent learns by interacting with an environment, receiving rewards for good actions and penalties for bad ones, optimizing for maximum cumulative reward.

Example: An AI learning to play chess by playing millions of games against itself.
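Self-play chess is far too large to sketch here, but the reward loop itself can be shown with a classic toy problem (a hypothetical two-armed bandit): the agent tries actions, receives rewards, and shifts toward the action that pays off more:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

true_payout = {"A": 0.2, "B": 0.8}  # hidden reward probabilities (the "environment")
q = {"A": 0.0, "B": 0.0}            # the agent's learned value estimates
counts = {"A": 0, "B": 0}

for step in range(2000):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(q, key=q.get)
    reward = 1.0 if random.random() < true_payout[action] else 0.0
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    q[action] += (reward - q[action]) / counts[action]

print(q)  # the estimate for "B" should end up well above "A"
```

The estimates converge toward the true payout rates, so the agent ends up exploiting the better action — the same reward-maximizing loop, at vastly larger scale, underlies game-playing agents.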

Overfitting

When a model learns the training data too well — including noise and outliers — and performs poorly on new, unseen data.

Underfitting

When a model is too simple to capture the patterns in the data, performing poorly on both training and test data.
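Both failure modes can be illustrated with a toy sketch (the data is invented): a "model" that simply memorizes the training set overfits, while one too simple to look at its input underfits:

```python
train = [(1, "spam"), (2, "ham"), (3, "spam"), (4, "ham"), (5, "spam")]
test  = [(6, "ham"), (7, "spam"), (8, "ham")]

# Overfitting: memorize the training data exactly — including any noise.
memorized = dict(train)
def overfit_predict(x):
    return memorized.get(x, "spam")  # off the training set, it can only guess

# Underfitting: too simple to capture any input-dependent pattern at all.
def underfit_predict(x):
    return "spam"  # always predict the training majority class

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

print(accuracy(overfit_predict, train))   # perfect on training data...
print(accuracy(overfit_predict, test))    # ...poor on unseen data
print(accuracy(underfit_predict, train))  # poor even on training data
```

The telltale signatures match the definitions above: overfitting shows a large gap between training and test performance; underfitting is poor on both.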

Natural Language Processing

NLP (Natural Language Processing)

A field of AI focused on enabling computers to understand, interpret, and generate human language.

Token

The smallest unit of text a model processes. Tokens can be words, subwords, or characters. A single word may be split into multiple tokens.

Example: "unhappiness" might become ["un", "happiness"] — 2 tokens.
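A greedy longest-match tokenizer shows how a word gets split against a vocabulary. The vocabulary here is a tiny hand-made set — real tokenizers (BPE, WordPiece) learn their vocabularies from data and are far more sophisticated:

```python
# Hypothetical subword vocabulary, purely for illustration.
vocab = {"un", "happiness", "happy", "ness", "token", "ize"}

def tokenize(word):
    tokens, i = [], 0
    while i < len(word):
        # Take the longest vocabulary entry that matches at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: fall back to 1 char
            i += 1
    return tokens

print(tokenize("unhappiness"))  # → ['un', 'happiness'] — 2 tokens
```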

Embedding

A numerical representation of text (or other data) in a continuous vector space, where similar items are closer together.

Example: "king", "queen", "man", "woman" are embedded so that queen - woman + man ≈ king.
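The king/queen analogy can be reproduced with hand-made 2-D vectors (real embeddings have hundreds or thousands of learned dimensions; these two are contrived so that one axis loosely means "royalty" and the other "gender"):

```python
import math

emb = {
    "king":  [0.9,  0.8],
    "queen": [0.9, -0.8],
    "man":   [0.1,  0.8],
    "woman": [0.1, -0.8],
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# queen - woman + man: remove the "female" component, add "male".
v = [q - w + m for q, w, m in zip(emb["queen"], emb["woman"], emb["man"])]
nearest = max(emb, key=lambda word: cosine(v, emb[word]))
print(nearest)  # → king
```

Because similar meanings land close together in the vector space, arithmetic on the vectors corresponds (roughly) to arithmetic on meanings.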

Context Window

The maximum number of tokens a model can process at once — both input and output combined.

Example: A 128K-token context window holds roughly 96,000 English words of combined input and output (at about 0.75 words per token on average).
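A rough fit-check can be done with the ~0.75 words-per-token rule of thumb. Real tokenizers vary by language and content, so this is only an estimate — for exact counts you would use the model's own tokenizer:

```python
def estimate_tokens(text):
    # Rule of thumb for English prose: ~0.75 words per token.
    return round(len(text.split()) / 0.75)

def fits(text, window_tokens):
    return estimate_tokens(text) <= window_tokens

doc = "word " * 90_000           # a ~90,000-word document
print(estimate_tokens(doc))      # ~120,000 estimated tokens
print(fits(doc, 128_000))        # fits in a 128K-token window
```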

Paraphrasing

Restating text in different words while preserving the original meaning. LLMs excel at this task.

Sentiment Analysis

Determining the emotional tone behind text — positive, negative, or neutral.

Example: "This product is amazing!" → Positive
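The simplest form of sentiment analysis is lexicon-based: count positive and negative words. The word lists here are tiny and hand-picked — production systems use trained models, which handle negation, sarcasm, and context far better:

```python
import string

positive = {"amazing", "great", "love", "excellent"}
negative = {"terrible", "awful", "hate", "broken"}

def sentiment(text):
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    score = sum((w in positive) - (w in negative) for w in words)
    return "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"

print(sentiment("This product is amazing!"))    # → Positive
print(sentiment("The delivery was terrible."))  # → Negative
```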

Model Concepts

LLM (Large Language Model)

A neural network with billions of parameters trained on massive text corpora to understand and generate human language. Examples: GPT-4, Claude, Gemini, Llama.

Pre-trained Model

A model that has already been trained on a large dataset and can be used as-is or fine-tuned for specific tasks.

Fine-tuning

Taking a pre-trained model and continuing to train it on a smaller, task-specific dataset to adapt its behavior.

Example: Fine-tuning GPT-4 on medical texts so it answers healthcare questions more accurately.
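The core idea — continue training from already-learned weights rather than from scratch — fits in a toy one-parameter sketch. The "pretrained" weight, the task data, and the learning rate are all invented for illustration:

```python
w = 2.0  # pretend this weight was learned during large-scale pre-training

# Small task-specific dataset: the target behavior is y = 3 * x.
task_data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

lr = 0.02
for _ in range(100):                 # a short burst of additional training
    for x, y in task_data:
        grad = 2 * (w * x - y) * x   # gradient of the squared error w.r.t. w
        w -= lr * grad               # gradient-descent update

print(round(w, 3))  # → 3.0 — the pretrained weight adapted to the new task
```

Starting near a good solution is the whole point: the pre-trained weights already encode general knowledge, so only a small, cheap adjustment is needed for the specific task.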

Parameters

The internal variables of a model that are adjusted during training. More parameters generally mean greater capacity to learn complex patterns.

Example: GPT-4 is estimated to have trillions of parameters.
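Parameter counts follow directly from the architecture. For a small fully connected network (sizes chosen here just as an example: an MNIST-like 784-unit input, one 128-unit hidden layer, a 10-unit output), each layer contributes a weight matrix plus one bias per output unit:

```python
layer_sizes = [784, 128, 10]  # input → hidden → output

params = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    params += n_in * n_out + n_out  # weight matrix + bias vector

print(params)  # → 101770
```

An LLM applies the same accounting across hundreds of much wider layers, which is how totals reach the billions.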

Inference

The process of using a trained model to generate outputs for new inputs (as opposed to training the model).
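The distinction is easy to see in code: at inference time the weights are frozen and only the forward pass runs. A toy linear model with made-up (already "trained") weights:

```python
weights = [0.4, -0.2, 0.1]  # learned during training, fixed at inference
bias = 0.5

def predict(features):
    # Forward pass only: no gradients, no weight updates.
    return sum(w * x for w, x in zip(weights, features)) + bias

print(predict([1.0, 2.0, 3.0]))  # a fresh input mapped to an output
```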

Weights

The numerical values learned during training that determine how input signals are transformed as they pass through the network.

Common Acronyms

Acronym     Meaning
AI          Artificial Intelligence
ML          Machine Learning
DL          Deep Learning
NLP         Natural Language Processing
LLM         Large Language Model
RLHF        Reinforcement Learning from Human Feedback
RAG         Retrieval-Augmented Generation
API         Application Programming Interface
SFT         Supervised Fine-Tuning
PoC         Proof of Concept
GAN         Generative Adversarial Network
CNN         Convolutional Neural Network
AGI         Artificial General Intelligence
STT / ASR   Speech-to-Text / Automatic Speech Recognition
TTS         Text-to-Speech