Personal Reference · 2026 Edition

AI Engineer Roadmap

5 phases
9–12 months
17 topics
Core focus areas: LLMs · RAG · LoRA
01 /
Foundations
Python, math, and data — the bedrock everything else rests on
6–8 weeks
🐍
Python essentials
Core syntax, data structures, OOP, file I/O, virtual environments
beginner
What to learn
Variables, loops, functions, list comprehensions
Classes & OOP, decorators, generators
pip, venv / conda, requirements.txt
NumPy & Pandas basics — arrays, DataFrames
Jupyter notebooks for experimentation
Git & GitHub basics — commits, branches, PRs
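As a small illustration of three items from the list above (a list comprehension, a decorator, and a generator), here is a minimal sketch. The function names `timed`, `squares`, and `even_squares` are made up for this example.

```python
import functools
import time


def timed(func):
    """Decorator: record how long the most recent call took."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_seconds = time.perf_counter() - start
        return result
    wrapper.last_seconds = None
    return wrapper


def squares(limit):
    """Generator: yield squares one at a time instead of building a list."""
    for n in range(limit):
        yield n * n


@timed
def even_squares(limit):
    # List comprehension that filters the generator's output.
    return [s for s in squares(limit) if s % 2 == 0]


result = even_squares(6)
print(result)  # [0, 4, 16]
```

The decorator wraps the function without changing its name or return value, which is the same pattern many libraries use for logging, caching, and route registration.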
Math for ML
Linear algebra, calculus, and probability you'll actually use in practice
required
What to learn
Vectors, matrices, dot products, matrix multiply
Derivatives & gradients — intuition, not proofs
Chain rule (backprop is the chain rule applied layer by layer)
Probability: distributions, Bayes' theorem
Statistics: mean, variance, covariance
Resource: 3Blue1Brown's Essence of Linear Algebra and Essence of Calculus series (free)
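To make the chain-rule item concrete, here is a minimal sketch that differentiates a one-weight "network" by multiplying the local derivatives, then checks the result against a finite-difference estimate. The model, the input `x = 2.0`, and the target `y = 1.0` are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, x=2.0, y=1.0):
    """Squared error of a one-weight model: L = (sigmoid(w * x) - y)^2."""
    return (sigmoid(w * x) - y) ** 2


def grad_chain_rule(w, x=2.0, y=1.0):
    """dL/dw = dL/da * da/dz * dz/dw, one factor per step of the forward pass."""
    z = w * x
    a = sigmoid(z)
    dL_da = 2.0 * (a - y)
    da_dz = a * (1.0 - a)
    dz_dw = x
    return dL_da * da_dz * dz_dw


def grad_numeric(w, eps=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (loss(w + eps) - loss(w - eps)) / (2.0 * eps)


w = 0.5
print(grad_chain_rule(w), grad_numeric(w))  # the two values should agree closely
```

Backpropagation in a deep network repeats this multiplication of local derivatives across many layers, reusing intermediate values from the forward pass.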
📊
Data & EDA
Wrangling, cleaning, and visualising datasets before any model touches them
practical
What to learn
Pandas for manipulation & cleaning
Matplotlib / Seaborn for visualisation
Handling missing data, outliers, dtypes
Train / val / test splitting correctly
Feature scaling: StandardScaler, MinMaxScaler
Kaggle datasets — start practising immediately
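One way to read "splitting correctly" together with feature scaling: split first, then learn the scaling statistics from the training portion only, so nothing leaks from validation or test data. The sketch below uses only the standard library; in practice scikit-learn's `train_test_split` and `StandardScaler` cover the same ground. The 70/15/15 ratio and the toy feature column are assumptions.

```python
import random
import statistics


def split_indices(n, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle indices once, then cut them into train / val / test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def fit_standardiser(values):
    """Learn mean and standard deviation from the training values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against zero variance
    return mean, std


def standardise(values, mean, std):
    return [(v - mean) / std for v in values]


# Toy feature column: 100 made-up values.
feature = [float(i) for i in range(100)]
train_idx, val_idx, test_idx = split_indices(len(feature))

mean, std = fit_standardiser([feature[i] for i in train_idx])
train_scaled = standardise([feature[i] for i in train_idx], mean, std)
test_scaled = standardise([feature[i] for i in test_idx], mean, std)  # reuse train stats
print(len(train_idx), len(val_idx), len(test_idx))  # 70 15 15
```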
02 /
ML & Deep Learning
Neural nets, backprop, and transformers — the building blocks of every LLM
8–10 weeks
🌲
Classical ML
Scikit-learn fundamentals — the mental models still matter
important
What to learn
Linear / logistic regression from scratch
Decision trees, random forests, XGBoost
Cross-validation, confusion matrix, AUC-ROC
Overfitting & regularisation (L1 / L2)
Scikit-learn pipelines & GridSearchCV
When NOT to use deep learning
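For the "from scratch" and regularisation items above, here is a minimal sketch of one-feature logistic regression trained by batch gradient descent with an L2 penalty. The data is made up, and the feature is centred first (an assumption of this sketch) so that gradient descent behaves well.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, l2=0.01, epochs=500):
    """Batch gradient descent on mean log-loss plus an L2 penalty on the weight."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * (grad_w / n + l2 * w)  # the L2 term shrinks the weight towards zero
        b -= lr * (grad_b / n)
    return w, b


def predict(x, w, b):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy data: hours studied -> passed (made-up numbers).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

centre = sum(hours) / len(hours)
xs = [h - centre for h in hours]  # centre the feature before training

w, b = train_logistic(xs, passed)
accuracy = sum(predict(x, w, b) == y for x, y in zip(xs, passed)) / len(xs)
print(accuracy)
```

Raising `l2` pulls `w` closer to zero, which is the trade-off the regularisation item refers to: a smoother decision boundary at the cost of fitting the training set less tightly.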
🧠
Deep Learning
Neural nets, backprop, CNNs, RNNs — foundations of all large models
core
What to learn
Perceptrons, activation functions, layers
Backpropagation & gradient descent by hand
PyTorch: tensors, autograd, nn.Module
Training loops: forward → loss → backward → step
Batch norm, dropout, weight decay
CNNs for vision; RNNs / LSTMs for sequences
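The forward → loss → backward → step loop from the list can be sketched without any framework. This version fits a line with hand-written gradients; in PyTorch, autograd's `loss.backward()` and an optimiser's `step()` take over the last two stages. The data (points on y = 2x + 1) and the hyperparameters are made up.

```python
def forward(w, b, xs):
    return [w * x + b for x in xs]


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def backward(preds, xs, ys):
    """Gradients of the mean squared error with respect to w and b."""
    n = len(ys)
    grad_w = sum(2 * (p - y) * x for p, x, y in zip(preds, xs, ys)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    return grad_w, grad_b


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2 * x + 1 for x in xs]  # targets generated from y = 2x + 1

w, b, lr = 0.0, 0.0, 0.1
losses = []
for epoch in range(200):
    preds = forward(w, b, xs)                  # forward
    loss = mse(preds, ys)                      # loss
    grad_w, grad_b = backward(preds, xs, ys)   # backward
    w -= lr * grad_w                           # step
    b -= lr * grad_b
    losses.append(loss)

print(round(w, 3), round(b, 3))  # 2.0 1.0
```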
Transformers
Attention is all you need — understand every component deeply
critical
What to learn
Self-attention: Q, K, V matrices explained
Multi-head attention — why multiple heads?
Positional encodings (sinusoidal & RoPE)
Encoder-decoder vs decoder-only models
Layer norm, residual connections, FFN layers
Implement a mini-GPT from scratch (Karpathy)
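As a warm-up for the items above, here is a minimal sketch of single-head scaled dot-product self-attention with a causal mask, written with plain lists so every step is visible. The token vectors and the identity projection matrices are made up; a real model learns `w_q`, `w_k`, and `w_v`.

```python
import math


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v, causal=True):
    """softmax(Q K^T / sqrt(d_k)) V for one head; returns (output, weights)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = []
    for i, row in enumerate(scores):
        scaled = [s / math.sqrt(d_k) for s in row]
        if causal:
            # A decoder-only model may not look at later positions.
            scaled = [s if j <= i else float("-inf") for j, s in enumerate(scaled)]
        weights.append(softmax(scaled))
    return matmul(weights, v), weights


# Three toy tokens with two dimensions each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

out, weights = self_attention(x, identity, identity, identity)
for row in weights:
    print([round(p, 3) for p in row])
```

Each row of `weights` is a probability distribution over the positions a token may attend to. Multi-head attention runs several of these in parallel with different projections and concatenates the outputs.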
03 /
LLMs & GenAI
Internals, fine-tuning, RAG, and prompt engineering — the core specialisation
8–10 weeks
🔬
LLM internals
How GPT, Claude, Llama actually work under the hood
deep-dive
What to learn
Tokenisation: BPE, SentencePiece, tiktoken
Pre-training: next-token prediction at scale
RLHF / RLAIF — how models get aligned
Context windows, KV cache, attention patterns
Emergent abilities: in-context learning, CoT
Scaling laws: Chinchilla & compute-optimal training
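To illustrate the tokenisation item, here is a minimal sketch of the core BPE training step: count adjacent symbol pairs, merge the most frequent one, and repeat. The four-word corpus and its counts are made up. Production tokenisers such as tiktoken and SentencePiece add byte-level handling, pre-tokenisation, and much larger vocabularies.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs.most_common(1)[0][0]


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged


# Toy corpus: word -> count, each word split into characters.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
words = {tuple(w): c for w, c in corpus.items()}

merges = []
for _ in range(3):
    pair = most_frequent_pair(words)
    merges.append(pair)
    words = merge_pair(words, pair)

print(merges)
print(list(words))
```

After a few merges the frequent suffix "est" becomes a single symbol, which is how BPE ends up with subword units for common fragments.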
🔧
Fine-tuning & LoRA
Adapt pretrained models cheaply and effectively with PEFT methods
practical
What to learn
Full fine-tuning vs parameter-efficient methods
LoRA: low-rank matrix decomposition explained
QLoRA — quantised + LoRA for consumer GPUs
Hugging Face PEFT & Transformers libraries
Instruction tuning with custom datasets
Evaluation: perplexity, BLEU, human eval
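The low-rank idea behind LoRA can be sketched in a few lines: the frozen weight W is left alone, and a trainable update B·A of rank r is added, scaled by alpha / r. The sketch below counts parameters for one layer and runs a forward pass with plain lists. The layer size, rank, and alpha are illustrative assumptions, not recommendations.

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning vs a rank-r LoRA update for one layer."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)  # A is (rank x d_in), B is (d_out x rank)
    return full, lora


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x, w, a, b, alpha, rank):
    """h = W x + (alpha / rank) * B (A x); W stays frozen, only A and B train."""
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [h + scale * u for h, u in zip(base, update)]


full, lora = lora_param_counts(4096, 4096, rank=8)
print(full, lora, lora / full)  # 16777216 65536 0.00390625

# Tiny forward pass: B starts at zero, so the adapted layer initially matches the base layer.
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
a = [[0.1, 0.2, 0.3]]   # rank 1
b = [[0.0], [0.0]]
x = [1.0, 2.0, 3.0]
print(lora_forward(x, w, a, b, alpha=2.0, rank=1))  # [7.0, 2.0]
```

For this hypothetical 4096 × 4096 layer, the rank-8 update trains under 0.4% of the parameters that full fine-tuning would touch.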
🗄️
RAG systems
Retrieval-augmented generation — give LLMs external memory
in-demand
What to learn
Why RAG? Reducing hallucination & working around the knowledge cutoff
Embeddings: what they are, cosine similarity
Vector DBs: Chroma, Pinecone, Qdrant, Weaviate
Chunking: fixed, semantic, hierarchical
Hybrid search: dense + sparse (BM25)
Advanced RAG: re-ranking, HyDE, query rewriting
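The retrieval half of RAG reduces to "embed, compare, take the top k". Here is a minimal sketch of that step with cosine similarity over a three-document index. The 3-dimensional vectors are hand-written stand-ins for real embeddings, and a vector DB would replace the linear scan.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k chunk texts whose embeddings are most similar to the query."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]


# (chunk text, made-up embedding) pairs.
index = [
    ("LoRA adds low-rank adapters", [0.9, 0.1, 0.0]),
    ("BM25 is a sparse retriever", [0.1, 0.9, 0.1]),
    ("Docker packages applications", [0.0, 0.1, 0.9]),
]

query_vec = [0.8, 0.2, 0.0]  # pretend embedding of "how does LoRA work?"
context = top_k(query_vec, index, k=2)
print(context)  # ['LoRA adds low-rank adapters', 'BM25 is a sparse retriever']
```

The retrieved chunks are then pasted into the prompt ahead of the user's question. Hybrid search and re-ranking refine which chunks make it into that list.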
✍️
Prompt engineering
The craft of communicating with LLMs precisely and reliably
vibe-coding
What to learn
Zero-shot, few-shot, chain-of-thought
System prompts, role-playing, personas
Structured outputs: JSON mode, function calling
Tree of Thought, ReAct, self-consistency
Prompt injection & jailbreak awareness
Evaluating prompt quality at scale (LLM-as-judge)
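Two of the techniques above are easy to sketch without calling any model: assembling a few-shot prompt, and self-consistency as a majority vote over several sampled answers. The prompt layout and the sample answers below are made up; the sampling itself would come from whichever model API is in use.

```python
from collections import Counter


def build_few_shot_prompt(instruction, examples, query):
    """Lay out an instruction, worked examples, and the new question in one prompt."""
    lines = [instruction, ""]
    for question, answer in examples:
        lines += [f"Q: {question}", f"A: {answer}", ""]
    lines += [f"Q: {query}", "A:"]
    return "\n".join(lines)


def self_consistency(answers):
    """Majority vote over several sampled answers to the same prompt."""
    normalised = [a.strip().lower() for a in answers]
    return Counter(normalised).most_common(1)[0][0]


prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("Great battery life", "positive"), ("Screen cracked in a week", "negative")],
    "Setup was painless",
)
print(prompt)

# Pretend these came from five samples at a non-zero temperature.
samples = ["Positive", "positive ", "negative", "positive", "Negative"]
print(self_consistency(samples))  # positive
```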
04 /
AI Engineering & Vibe-coding
APIs, agents, AI-native IDEs, and production deployment
10–12 weeks
🔌
APIs & SDKs
Building with Anthropic, OpenAI, and open-source model APIs
vibe-coding
What to learn
Anthropic SDK: messages, streaming, tool use
OpenAI-compatible APIs — portable patterns
Managing rate limits, retries, costs
Structured outputs & JSON schema enforcement
Streaming responses for real-time UX
Cost monitoring: tokens, context window budgets
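Retries and cost tracking are provider-independent, so they can be sketched without any SDK. Below is a minimal exponential-backoff wrapper plus a token-cost estimate. `TransientError`, `flaky_call`, and the per-million-token prices are hypothetical placeholders; real SDKs raise their own exception types and publish their own pricing.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a rate-limit or timeout error (hypothetical name)."""


def call_with_retries(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a transient failure wait 1x, 2x, 4x ... base_delay, plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))  # jitter spreads out retries


def estimate_cost(input_tokens, output_tokens, usd_per_m_in, usd_per_m_out):
    """Cost in USD given per-million-token prices."""
    return (input_tokens * usd_per_m_in + output_tokens * usd_per_m_out) / 1_000_000


attempts = {"count": 0}


def flaky_call():
    """Simulated API call that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("rate limited")
    return "ok"


delays = []
result = call_with_retries(flaky_call, sleep=delays.append)  # record delays instead of sleeping
print(result, len(delays))  # ok 2
print(estimate_cost(1000, 500, usd_per_m_in=3.0, usd_per_m_out=15.0))
```

Passing `sleep` as a parameter keeps the wrapper testable: the example records the delays rather than actually waiting.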
🤖
Agents & tools
LLMs that take actions — the frontier of practical AI engineering
hot
What to learn
Tool / function calling — give LLMs abilities
ReAct pattern: reason → act → observe loop
LangChain / LlamaIndex agent frameworks
Memory: short-term (context) vs long-term (DB)
Multi-agent systems — orchestrator + workers
Human-in-the-loop approval patterns
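The reason → act → observe loop can be sketched with a scripted stand-in for the model, which keeps the control flow visible without any API calls. `scripted_model`, the `Action: tool[input]` text format, and the `calculator` tool are all assumptions made for this example. Frameworks and native tool-calling APIs use their own structured formats.

```python
def calculator(expression):
    """Toy tool: add two integers written as 'a+b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


TOOLS = {"calculator": calculator}


def scripted_model(transcript):
    """Stand-in for an LLM call: picks the next step from the transcript so far."""
    if "Observation:" not in transcript:
        return "Action: calculator[2+3]"
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Final Answer: {observation}"


def run_agent(question, model, tools, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)                      # reason
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step[len("Final Answer:"):].strip(), transcript
        call = step[len("Action: "):].rstrip("]")     # act
        name, arg = call.split("[", 1)
        observation = tools[name](arg)
        transcript += f"Observation: {observation}\n"  # observe
    raise RuntimeError("agent did not finish within max_steps")


answer, transcript = run_agent("What is 2 + 3?", scripted_model, TOOLS)
print(answer)  # 5
print(transcript)
```

The `max_steps` cap matters in practice: without it, a model that never emits a final answer would loop and spend tokens indefinitely. A human-in-the-loop check would sit just before the tool call.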
⌨️
Vibe-coding mastery
Using AI IDEs and LLMs to build far faster than writing every line by hand
meta-skill
What to learn
Claude Code, Cursor, Windsurf, Copilot in practice
Writing prompts that generate working code
Iterative refinement — the vibe-coding loop
Debugging AI-generated code effectively
Context management in long coding sessions
When to override the AI vs trust it
🚀
Stack & infra
The tools real AI engineers actually deploy with in production
deployment
What to learn
FastAPI for serving ML models as REST APIs
Docker & Docker Compose for reproducibility
LangSmith / LangFuse for LLM observability
Hugging Face Hub — model versioning & sharing
Modal / Replicate for GPU inference in the cloud
Gradio & Streamlit for quick demos & prototypes
05 /
Mastery & Portfolio
Build real things, ship publicly, and stay current in a fast-moving field
ongoing
🏗️
Capstone projects
Build real things — this is what actually gets you hired
required
Project ideas
RAG chatbot over your own document corpus
Fine-tuned domain model with LoRA (coding assistant)
Agentic app: autonomous research or code-review agent
Multimodal app: image + text pipeline
Open-source contribution to LangChain / PEFT / Axolotl
Public GitHub + technical blog post per project
📏
Evaluation & evals
How to measure if your AI system actually works in production
underrated
What to learn
LLM-as-judge evaluation pipelines
RAGAS for RAG system evaluation
Creating golden test sets for regression
Latency, cost, and quality trade-off analysis
A/B testing prompts in production
Model red-teaming & safety evaluation
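A golden test set can start as a short list of inputs with properties the output must satisfy, rerun on every prompt or model change. The sketch below checks for required substrings and reports a pass rate. `stub_system`, the test cases, and the 0.9 threshold are made up; an LLM-as-judge scorer could replace the substring check.

```python
def run_regression(system, golden_set, min_pass_rate=0.9):
    """Run every golden case through `system` and report which ones fail."""
    failures = []
    for case in golden_set:
        output = system(case["input"])
        if not all(term.lower() in output.lower() for term in case["must_contain"]):
            failures.append(case["id"])
    pass_rate = 1 - len(failures) / len(golden_set)
    return {"pass_rate": pass_rate, "failures": failures, "ok": pass_rate >= min_pass_rate}


# Stand-in for the real pipeline (hypothetical canned answers).
CANNED = {
    "What is LoRA?": "LoRA trains small low-rank adapter matrices.",
    "What is RAG?": "RAG retrieves documents and adds them to the prompt.",
    "What is BM25?": "BM25 is a sparse ranking function.",
    "What is QLoRA?": "I am not sure.",
}


def stub_system(question):
    return CANNED.get(question, "")


golden_set = [
    {"id": "lora", "input": "What is LoRA?", "must_contain": ["low-rank"]},
    {"id": "rag", "input": "What is RAG?", "must_contain": ["retrieves"]},
    {"id": "bm25", "input": "What is BM25?", "must_contain": ["sparse"]},
    {"id": "qlora", "input": "What is QLoRA?", "must_contain": ["quantis"]},
]

report = run_regression(stub_system, golden_set)
print(report)  # {'pass_rate': 0.75, 'failures': ['qlora'], 'ok': False}
```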
📡
Stay current
The field moves weekly — staying current is itself a skill to build
ongoing
Resources
Follow: Andrej Karpathy, Sebastian Raschka, @huggingface
Papers: arXiv cs.LG, cs.CL — read abstracts daily
Courses: fast.ai, DeepLearning.AI, HF courses
Communities: r/MachineLearning, HF Discord
Newsletters: The Batch, TLDR AI, Import AI
Reproduce one new paper per month
Essential resources
Andrej Karpathy — YouTube
nanoGPT, makemore, Neural Networks: Zero to Hero. Best free DL education on the internet.
fast.ai
Practical Deep Learning. Top-down, code-first. Gets you building before you fully understand.
Hugging Face courses
Free, hands-on, industry-standard tools. Transformers, Diffusers, RL, agents.
Sebastian Raschka — LLMs
"Build a Large Language Model (From Scratch)" book + GitHub. Most thorough technical walkthrough available.
DeepLearning.AI short courses
RAG, agents, fine-tuning, prompt engineering: 1–2 hour practical courses, regularly updated.
Attention Is All You Need
The original transformer paper. Read it twice — once early, once after phase 2. Different experience.