AI Systems Research Track
Modern AI Systems
Explained Deeply
Transformers → Training → Inference → Optimization → Agents → Evaluation → Production
8 Pillars of Production AI Systems (2025)
Model Architecture
- Transformers
- Mixture-of-Experts
- State Space Models
- Multimodal
- Long-context
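To make the Mixture-of-Experts idea concrete: each token is routed to only the top-k experts, so compute stays sparse even as parameter count grows. A toy sketch of top-k gating, with simple callables standing in for real expert networks (the `moe_route` / `moe_forward` names are illustrative, not from any library):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_route(gate_logits, k=2):
    """Pick the top-k experts for a token and renormalize their gate weights."""
    topk = np.argsort(gate_logits)[::-1][:k]   # indices of the k largest logits
    weights = softmax(gate_logits[topk])       # renormalize over chosen experts only
    return topk, weights

def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of only the selected experts' outputs (sparse compute)."""
    idx, w = moe_route(gate_logits, k)
    return sum(wi * experts[i](x) for wi, i in zip(w, idx))
```

Real MoE layers (e.g. Mixtral-style) add load-balancing losses and batched expert dispatch, but the routing math is this small.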
Training & Fine-tuning
- LoRA / QLoRA
- FSDP / DeepSpeed
- ZeRO-Offload
- FlashAttention-3
- Unsloth
Inference & Serving
- vLLM
- TGI
- TensorRT-LLM
- SGLang
- PagedAttention
- Speculative Decoding
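The key insight behind PagedAttention is that a sequence's KV cache doesn't need contiguous memory: logical token positions are mapped through a block table to fixed-size physical blocks, allocated on demand. A toy block-table sketch under assumed names (`PagedKVCache` is illustrative; vLLM's real implementation manages GPU tensors and defaults to 16-token blocks):

```python
BLOCK_SIZE = 4  # tokens per physical block (toy value; vLLM defaults to 16)

class PagedKVCache:
    """Toy block table: logical token positions -> fixed-size physical blocks."""
    def __init__(self, num_blocks=100):
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}                      # sequence id -> list of block ids
        self.lengths = {}                           # sequence id -> tokens written

    def append(self, seq_id, n_tokens=1):
        """Reserve cache space for n_tokens new tokens of a sequence."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        for _ in range(n_tokens):
            if length % BLOCK_SIZE == 0:       # current block full: grab a new one
                table.append(self.free_blocks.pop(0))
            length += 1
        self.lengths[seq_id] = length
        return table
```

Because blocks are allocated per step rather than pre-reserved for a max length, memory waste is bounded by one partial block per sequence, which is what makes continuous batching of many sequences feasible.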
Retrieval & Memory
- RAG
- Vector DBs
- HyDE
- Re-ranking
- Graph RAG
- Memory-augmented LLMs
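At its core, the retrieval step in RAG is nearest-neighbor search over embeddings; vector DBs add indexing (HNSW, IVF) to make it scale. A minimal brute-force sketch, assuming embeddings are already computed as rows of a matrix:

```python
import numpy as np

def top_k_retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    return np.argsort(scores)[::-1][:k]
```

The retrieved chunks are then stuffed into the prompt; re-ranking (cross-encoders) and HyDE refine which chunks make the cut.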
Agents & Tool Use
- ReAct
- Toolformer
- LangGraph
- AutoGen
- CrewAI
- OpenAI Swarm
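All of these agent frameworks elaborate on the same ReAct control flow: the model alternates Thought → Action → Observation until it emits a final answer. A bare-bones sketch where `llm` and the tool registry are hypothetical stand-ins for a real model and real tools:

```python
def react_loop(llm, tools, question, max_steps=5):
    """Alternate Thought -> Action -> Observation until the model answers."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # model emits an Action or a Final Answer line
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            name, _, arg = step.removeprefix("Action:").strip().partition(" ")
            observation = tools[name](arg)           # run the requested tool
            transcript += f"Observation: {observation}\n"
    return None  # step budget exhausted without an answer
```

LangGraph, AutoGen, and the rest add typed state, multi-agent hand-offs, and persistence on top, but this loop is the kernel worth understanding first.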
Evaluation & Benchmarks
- MT-Bench
- Arena-Hard
- LiveBench
- GPQA
- HumanEval
- Big-Bench Hard
Safety & Alignment
- RLHF / DPO
- Constitutional AI
- Red teaming
- Adversarial robustness
- Jailbreak defense
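DPO is worth internalizing because it reduces RLHF's reward-model-plus-PPO pipeline to a single classification-style loss on preference pairs: push the policy's chosen-vs-rejected log-probability margin past the reference model's. A numerical sketch of that loss (scalar log-probs stand in for summed sequence log-probs):

```python
import numpy as np

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (pol_chosen - pol_rejected) - (ref_chosen - ref_rejected)
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# When policy and reference agree, the margin is 0 and loss is log(2);
# widening the policy's preference for the chosen answer drives loss down.
```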
MLOps & Deployment
- MLflow
- Weights & Biases
- LangSmith
- PromptLayer
- BentoML
- Modal
- RunPod
Recommended Learning Path (2025)
1. Foundations (1–2 months)
- Understand transformers from scratch (nanoGPT, Karpathy lectures)
- Master PyTorch fundamentals
- Learn tokenization, attention, positional encodings
2. Fine-tuning & Optimization (1–2 months)
- QLoRA / LoRA on consumer GPUs
- FlashAttention, Unsloth, bitsandbytes
- Gradient checkpointing, mixed precision, ZeRO
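The reason LoRA fits on consumer GPUs is arithmetic: instead of updating a full d×d weight, you train two low-rank factors and fold them back in at the end, W' = W + (α/r)·BA. A minimal merge sketch (the `lora_merge` name is illustrative; PEFT exposes this as `merge_and_unload`):

```python
import numpy as np

def lora_merge(W, A, B, alpha=16, r=8):
    """Fold a trained low-rank update into the frozen weight: W' = W + (alpha/r) B A.

    Shapes: W (d_out, d_in), A (r, d_in), B (d_out, r) — only A and B are trained.
    """
    return W + (alpha / r) * (B @ A)
```

With r=8 on a 4096×4096 layer, the trainable parameters drop from ~16.8M to ~65K per matrix; QLoRA additionally keeps the frozen W in 4-bit.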
3. Production Inference (1 month)
- vLLM vs TGI vs TensorRT-LLM
- PagedAttention, continuous batching
- Quantization (AWQ, GPTQ, GGUF)
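AWQ, GPTQ, and GGUF differ in how they pick scales and group weights, but all reduce to the same primitive: map floats to a small integer grid plus a scale factor. A sketch of the simplest variant, symmetric per-tensor int8:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0             # map the largest |weight| to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```

Production schemes quantize per-channel or per-group rather than per-tensor, and AWQ additionally rescales salient channels before rounding, precisely because a single outlier weight otherwise inflates `scale` for everything else.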
4. RAG & Agents (1–2 months)
- Vector DBs: Chroma, Weaviate, Pinecone, Qdrant
- Advanced RAG: HyDE, self-query, multi-query
- Build agents with LangGraph / AutoGen
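Multi-query RAG needs a way to merge the ranked lists that different query rewrites return; reciprocal rank fusion is the standard trick, scoring each document by where it ranks in every list. A sketch (k=60 is the constant commonly used in the RRF literature):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank_in_list)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear near the top of several lists beat documents that top only one, which is exactly the robustness multi-query retrieval is after.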
5. Evaluation & Safety (ongoing)
- LLM-as-a-judge, MT-Bench style evals
- Red teaming & jailbreak resistance
- Constitutional AI / RLHF basics
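A practical detail of LLM-as-a-judge evals worth building in from day one: judges prefer whichever answer appears first, so MT-Bench-style pipelines score each pair twice with positions swapped. A sketch where `judge` is a hypothetical callable (in practice, an LLM prompted with both answers) returning "first", "second", or "tie":

```python
def pairwise_win_rate(judge, prompts, answers_a, answers_b):
    """Score A vs B, judging each pair twice with positions swapped
    so that position bias cancels out."""
    wins = ties = total = 0
    for p, a, b in zip(prompts, answers_a, answers_b):
        for first, second, a_is_first in [(a, b, True), (b, a, False)]:
            verdict = judge(p, first, second)
            if verdict == "tie":
                ties += 1
            elif (verdict == "first") == a_is_first:
                wins += 1       # the verdict favored A regardless of position
            total += 1
    return wins / total, ties / total
```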
Weekly paper discussions • Live implementation sessions • Open research projects