The Scaling Journey

Welcome to The Scaling Journey, a guide and blog on training and deploying large language models at scale.

Start Here

Scaling Laws — Understanding compute efficiency and power laws
- Compute Efficiency & Chinchilla Scaling
- Power Laws in LLMs
Transformers — Modern architecture deep dives
- Attention Mechanisms
- Architecture Variants (MQA, GQA, Hybrids)
Compression — Model optimization techniques
- Pruning
- Quantization & QAT
- Knowledge Distillation
- Restoration Processes
Retrieval & RAG — Knowledge-grounded generation
- RAG Systems
- Embeddings & Indexing
Pre-training — Training from scratch
- Architecture design
- Data curation
- Optimization strategies
Mid-training — Fine-tuning and alignment
- Instruction tuning
- RLHF
- Specialized tuning
Post-training — Deployment and inference
- Inference optimization
- Deployment at scale

Start exploring in the sidebar!