# AI & LLM Glossary for Everyone
Goal: To translate "Tech Bro" and "Research Scientist" speak into plain English.
## Acronyms Decoded
| Acronym | Stands For | Imagine This... |
|---|---|---|
| LLM | Large Language Model | A super-smart autocomplete that read the entire internet. |
| GPT | Generative Pre-trained Transformer | The brand name engine under the hood of ChatGPT. |
| RAG | Retrieval-Augmented Generation | Giving the AI an open-book test instead of a memory test. It looks up facts before answering. |
| RLHF | Reinforcement Learning from Human Feedback | Training a dog with treats. Humans told the AI "Good answer" or "Bad answer" to teach it manners. |
| SFT | Supervised Fine-Tuning | Deeply training the AI on a specific textbook (e.g., Coding or Law) to make it an expert. |
| CoT | Chain of Thought | Asking the AI to "Show its work" step-by-step, which makes it smarter at math and logic. |
| API | Application Programming Interface | A waiter. You (the app) tell the waiter what you want, and they bring the food (data) from the kitchen (server). |
| GPU | Graphics Processing Unit | The muscle. The specialized hardware chip that runs AI. |
| VRAM | Video RAM | The GPU's short-term memory. The more VRAM you have, the bigger (and smarter) the model you can run. |
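The RAG row above can be sketched in a few lines. This is a toy illustration, not a real system: the "retrieval" step is a crude keyword overlap standing in for real embedding-based search, and the documents are invented.

```python
# Minimal sketch of the RAG idea: look up relevant facts first,
# then paste them into the prompt so the model answers "open book".
documents = [
    "Our store opens at 9am and closes at 6pm.",
    "Refunds are accepted within 30 days of purchase.",
]

def retrieve(question):
    """Return the stored document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question):
    context = retrieve(question)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A real pipeline would swap the keyword overlap for a vector database lookup (see Embeddings and Vector Database below), but the shape is the same: retrieve, then generate.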
## Core Concepts (ELI5)
### Context Window
The Short-Term Memory. Imagine you are talking to a goldfish. It only remembers the last 10 words you said. That's a small context window. Modern AIs have huge context windows: they can "remember" multiple books' worth of conversation instantly.
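Here is a toy sketch of what "falling out of the context window" means: when the conversation gets too long, the oldest messages are dropped. The 4-characters-per-token estimate is a rough rule of thumb, not a real tokenizer.

```python
def trim_to_context(messages, max_tokens=50):
    """Keep only the most recent messages that fit a rough token budget."""
    def rough_tokens(text):
        return len(text) // 4  # crude estimate: ~4 characters per token

    kept, total = [], 0
    for msg in reversed(messages):   # walk from newest to oldest
        cost = rough_tokens(msg)
        if total + cost > max_tokens:
            break                    # older messages fall out of memory
        kept.append(msg)
        total += cost
    return list(reversed(kept))      # restore chronological order
```

This is why an AI can "forget" the start of a very long chat: the goldfish's memory simply filled up.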
### Hallucination
Confident Bullsh*t. When an AI doesn't know the answer, it sometimes makes up a plausible-sounding lie because it is designed to complete the pattern, not to be a fact-checker.
- Fix: Use RAG (giving it sources) or tell it "If you don't know, say you don't know."
### Token
A Chunk of Text. AIs don't read words; they read "tokens". In English, a token is roughly 4 characters, or about 0.75 words.
- Example: The word "hamburger" might be two tokens: "ham" and "burger".
- Why it matters: You pay by the token (or million tokens).
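The rule of thumb above can be turned into a quick back-of-the-envelope estimator. The price used here is made up for illustration; real per-token prices vary by model and provider.

```python
def estimate_tokens(text):
    """Rough token count using the ~4 characters-per-token rule of thumb."""
    return max(1, len(text) // 4)

def estimate_cost(text, usd_per_million_tokens=1.00):  # hypothetical price
    """Rough cost of sending this text, given a per-million-token price."""
    return estimate_tokens(text) / 1_000_000 * usd_per_million_tokens
```

For exact counts you would use the model's own tokenizer, since tokenization differs between models.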
### Temperature
The Creativity Slider.
- Low (0.0): The AI is a boring, predictable robot. Great for coding and factual answers.
- High (1.0): The AI is a wild poet on caffeine. Great for creative writing, bad for facts.
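Under the hood, temperature rescales the model's scores before they become probabilities. This sketch uses made-up scores (logits) for three candidate next words; note that real samplers special-case temperature 0.0 as "always pick the top choice", since dividing by zero is undefined.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                     # invented scores for three words
cold = softmax_with_temperature(logits, 0.1)  # near-deterministic: top word dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: more randomness
```

At low temperature the top word gets almost all the probability; at high temperature the alternatives stay in play, which is where the "wild poet" behavior comes from.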
### Prompt Engineering
The Art of Asking. Talking to AI is a skill. A bad prompt ("Write a story") gets a bland result. A good prompt ("Write a spy thriller in the style of Ian Fleming featuring a ham sandwich") gets a masterpiece.
### Zero-Shot vs. Few-Shot
- Zero-Shot: Asking the AI to do something with no examples. "Translate this."
- Few-Shot: Giving the AI examples first. "Convert these names to JSON. Here are 3 examples. Now do this one." (This works MUCH better).
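A few-shot prompt is really just string assembly: examples first, then the new case. The names and JSON shape below are invented; the point is the structure, not the exact wording.

```python
def few_shot_prompt(examples, query):
    """Build a prompt that shows worked examples before the real question."""
    lines = ["Convert each name to JSON."]
    for name, answer in examples:
        lines.append(f"Name: {name}")
        lines.append(f"JSON: {answer}")
    lines.append(f"Name: {query}")
    lines.append("JSON:")              # the model completes from here
    return "\n".join(lines)

prompt = few_shot_prompt(
    [("Ada Lovelace", '{"first": "Ada", "last": "Lovelace"}'),
     ("Alan Turing", '{"first": "Alan", "last": "Turing"}')],
    "Grace Hopper",
)
```

The trailing `JSON:` is deliberate: the model's job is to continue the pattern, so ending mid-pattern nudges it to answer in exactly the format you showed.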
## Technical Terms (For when you talk to devs)
### Embeddings
Converting text into a list of numbers (vectors) so the computer can calculate "meaning". It allows the AI to know that "King - Man + Woman = Queen" mathematically.
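The famous King/Queen arithmetic can be shown with toy 2-dimensional vectors (dimension 0 = royalty, dimension 1 = femaleness). Real embeddings have hundreds or thousands of learned dimensions and the relationships are approximate, not exact as they are here.

```python
# Hand-made toy "embeddings" chosen so the word arithmetic works out exactly.
vectors = {
    "king":  [1.0, 0.0],   # royal, male
    "queen": [1.0, 1.0],   # royal, female
    "man":   [0.0, 0.0],
    "woman": [0.0, 1.0],
}

def add(a, b): return [x + y for x, y in zip(a, b)]
def sub(a, b): return [x - y for x, y in zip(a, b)]

# king - man + woman: remove "maleness", add "femaleness", keep "royalty"
result = add(sub(vectors["king"], vectors["man"]), vectors["woman"])
```

Because meaning lives in the numbers, "close in vector space" means "similar in meaning", which is what the next term builds on.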
### Vector Database
A specialized database that stores Embeddings. It's used for Semantic Search (searching by meaning, not just keywords).
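Stripped to its essence, a vector database stores (text, vector) pairs and returns the entry closest in meaning to a query vector. The vectors below are invented toy values; real systems use learned embeddings and much faster approximate-nearest-neighbor indexes.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means 'pointing the same way' (similar meaning)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

store = [
    ("How do I reset my password?", [0.9, 0.1]),
    ("What is your refund policy?", [0.1, 0.9]),
]

def semantic_search(query_vector):
    """Return the stored text whose vector is most similar to the query."""
    return max(store, key=lambda item: cosine(item[1], query_vector))[0]
```

This is the "search by meaning" trick: a query about "forgotten login" would land near the password entry even though it shares no keywords with it.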
### Inference
The act of the AI actually running and generating an answer. "Training" is learning; "Inference" is doing.
### Fine-Tuning
Taking a general-purpose model (like GPT-4) and training it further on your specific company data so it speaks your language.
### Quantization
Shrinking a giant AI model so it can run on a smaller computer (like your laptop). It's like converting a high-res 4K video to 720p: it's smaller and faster, but slightly lower quality.
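The core idea can be shown on a handful of numbers: store float weights as small integers plus one scale factor, trading a little precision for a fraction of the memory. This is a bare-bones sketch of 8-bit quantization; real schemes quantize per-layer or per-block and are considerably more sophisticated.

```python
def quantize(weights):
    """Map floats to 8-bit integers (-127..127) plus a shared scale factor."""
    scale = max(abs(w) for w in weights) / 127  # largest weight maps to 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [x * scale for x in q]

weights = [0.02, -0.51, 1.27, -1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)   # close to the originals, not exact
```

Each weight now takes 1 byte instead of 4 (for 32-bit floats), which is the 720p trade-off in action: roughly a quarter of the size, with a small rounding error per weight.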