Suggested Reading Order
Three paths through the primer. Pick the one that matches your role and time budget, then use the tables below to navigate. All paths start at the Welcome page.
PM / Product Path (~3–4 hours)
For product managers, product strategists, and technical leaders who need fluent understanding without deep implementation knowledge.
| # | Page | Time | Notes |
|---|---|---|---|
| 1 | Welcome | 5 min | Start here |
| 2 | What Is an LLM? | 15 min | Foundation |
| 3 | Tokens & Tokenization | 15 min | Explains billing |
| 4 | Embeddings | 15 min | Powers semantic search |
| 5 | Transformer Architecture | 20 min | How it works |
| 6 | Attention Mechanism | 15 min | Why context matters |
| 7 | Training Pipeline | 25 min | SFT, RLHF explained |
| 8 | Context Windows | 20 min | Memory & cost |
| 9 | Mixture of Experts | 15 min | Why MoE matters |
| 10 | RAG | 20 min | Private knowledge |
| 11 | Agents & Tool Use | 20 min | Agentic systems |
| 12 | Cost & Latency | 20 min | Budgeting AI |
| 13 | Evaluation & Benchmarks | 15 min | Reading leaderboards |
| 14 | Agentic Vocabulary | 25 min | Terminology |
| 15 | Tools Landscape | 30 min | Reference |
Engineer Path (~6–8 hours)
For software engineers, platform engineers, and ML engineers who want deep conceptual grounding plus implementation context.
| # | Page | Time | Notes |
|---|---|---|---|
| 1 | Welcome | 5 min | |
| 2 | What Is an LLM? | 15 min | |
| 3 | Tokens & Tokenization | 15 min | |
| 4 | Embeddings | 15 min | |
| 5 | Transformer Architecture | 20 min | Read the paper's full abstract |
| 6 | Attention Mechanism | 15 min | |
| 7 | Model Architecture Types | 15 min | |
| 8 | Training Pipeline | 25 min | Read InstructGPT abstract |
| 9 | Context Windows | 20 min | |
| 10 | Mixture of Experts | 15 min | |
| 11 | Scaling Laws | 15 min | |
| 12 | RAG | 20 min | |
| 13 | Agents & Tool Use | 20 min | |
| 14 | Multimodal | 15 min | |
| 15 | Evaluation & Benchmarks | 15 min | |
| 16 | Cost & Latency | 20 min | |
| 17 | Prompt Engineering | 20 min | |
| 18 | Agentic Vocabulary | 25 min | |
| 19 | Tools Landscape | 30 min | Reference |
Researcher Path (~12–15 hours)
All pages in sidebar order plus full paper reading. For each paper cited in a page, read the full paper, not just the abstract. Use the Papers reference as your guide.
Work through every page in the order they appear in the sidebar: Basics → Intermediate → Advanced → Use Cases. When a page cites a paper, follow the link and read according to the PM reading guide at minimum; for the researcher path, read the full paper.
Estimated total: 12–15 hours depending on paper reading depth. The scaling laws papers and the transformer paper in particular reward slow reading: both are dense with implications that take time to absorb.