The Scaling Journey

Welcome to The Scaling Journey — a comprehensive guide and blog exploring the art of training and deploying large language models at scale.

  • Blog — Articles and insights on AI, ML, and scaling
  • About — Meet the creator
  • CV — Professional background
  • Scaling Laws — Understanding compute efficiency and power laws

    • Compute Efficiency & Chinchilla Scaling
    • Power Laws in LLMs
  • Transformers — Modern architecture deep dives

    • Attention Mechanisms
    • Architecture Variants (MQA, GQA, Hybrids)
  • Compression — Model optimization techniques

    • Pruning
    • Quantization & QAT
    • Knowledge Distillation
    • Restoration Processes
  • Retrieval & RAG — Knowledge-grounded generation

    • RAG Systems
    • Embeddings & Indexing
  • Pre-training — Training from scratch

    • Architecture design
    • Data curation
    • Optimization strategies
  • Mid-training — Fine-tuning and alignment

    • Instruction tuning
    • RLHF
    • Specialized tuning
  • Post-training — Deployment and inference

    • Inference optimization
    • Deployment at scale

Start exploring in the sidebar!