| A2A | Agent-to-Agent (protocol) |
| AGI | Artificial General Intelligence |
| ANN | Approximate Nearest Neighbor |
| API | Application Programming Interface |
| ASI | Artificial Superintelligence |
| ASR | Automatic Speech Recognition |
| AWQ | Activation-Aware Weight Quantization |
| BF16 | Brain Floating Point (16-bit) |
| BLEU | Bilingual Evaluation Understudy |
| BPE | Byte Pair Encoding |
| CLIP | Contrastive Language--Image Pre-training |
| CNN | Convolutional Neural Network |
| CoT | Chain-of-Thought |
| CUDA | Compute Unified Device Architecture |
| DARE | Drop And REscale |
| DiT | Diffusion Transformer |
| DL | Deep Learning |
| DP | Data Parallelism |
| DPO | Direct Preference Optimization |
| DQN | Deep Q-Network |
| EMA | Exponential Moving Average |
| FAISS | Facebook AI Similarity Search |
| FFN | Feed-Forward Network |
| FLOP | Floating-Point Operation |
| FP16/32 | 16-/32-bit Floating Point |
| FSDP | Fully Sharded Data Parallel |
| GAT | Graph Attention Network |
| GCN | Graph Convolutional Network |
| GGUF | GPT-Generated Unified Format |
| GIN | Graph Isomorphism Network |
| GNN | Graph Neural Network |
| GPTQ | GPT Quantization |
| GPU | Graphics Processing Unit |
| GQA | Grouped-Query Attention |
| GRPO | Group Relative Policy Optimization |
| HBM | High-Bandwidth Memory |
| HNSW | Hierarchical Navigable Small World |
| IFT | Instruction Fine-Tuning |
| INT4/8 | 4-/8-bit Integer |
| JEPA | Joint Embedding Predictive Architecture |
| JIT | Just-In-Time (compilation) |
| KD | Knowledge Distillation |
| KL | Kullback--Leibler (divergence) |
| KV-cache | Key--Value Cache |
| LDM | Latent Diffusion Model |
| LIME | Local Interpretable Model-agnostic Explanations |
| LLM | Large Language Model |
| LoRA | Low-Rank Adaptation |
| MARL | Multi-Agent Reinforcement Learning |
| MCP | Model Context Protocol |
| MCTS | Monte Carlo Tree Search |
| MDP | Markov Decision Process |
| ML | Machine Learning |
| MMLU | Massive Multitask Language Understanding |
| MoE | Mixture of Experts |
| MQA | Multi-Query Attention |
| NLP | Natural Language Processing |
| PEFT | Parameter-Efficient Fine-Tuning |
| PP | Pipeline Parallelism |
| PPO | Proximal Policy Optimization |
| PTQ | Post-Training Quantization |
| QAT | Quantization-Aware Training |
| QLoRA | Quantized Low-Rank Adaptation |
| RAG | Retrieval-Augmented Generation |
| RL | Reinforcement Learning |
| RLAIF | Reinforcement Learning from AI Feedback |
| RLHF | Reinforcement Learning from Human Feedback |
| RMSNorm | Root Mean Square Normalization |
| RNN | Recurrent Neural Network |
| RoPE | Rotary Position Embeddings |
| SAE | Sparse Autoencoder |
| SFT | Supervised Fine-Tuning |
| SGD | Stochastic Gradient Descent |
| SHAP | SHapley Additive exPlanations |
| SLERP | Spherical Linear Interpolation |
| SRAM | Static Random-Access Memory |
| SwiGLU | Swish-Gated Linear Unit |
| TGI | Text Generation Inference |
| TIES | Trim, Elect Sign, Disjoint Merge |
| TP | Tensor Parallelism |
| TPU | Tensor Processing Unit |
| TTS | Text-to-Speech |
| VAE | Variational Autoencoder |
| VLA | Vision--Language--Action (model) |
| VLM | Vision--Language Model |
| VRAM | Video Random-Access Memory |
| XAI | Explainable AI |
| ZeRO | Zero Redundancy Optimizer |