Meta VP Matt Steiner on Ads Infra, GPUs, MTIA, and LLM-Written Kernels
Podcast: Semi Doped
Published On: Mon Apr 20 2026

Description: Matt Steiner, VP of Monetization Infrastructure, Ranking & AI Foundations at Meta, walks through how Meta's ad system actually works, and why the infrastructure behind it differs from what you'd build for LLMs.

We cover Andromeda (retrieval on a custom NVIDIA Grace Hopper SKU Meta co-designed), Lattice (consolidating N ranking models into one), GEM (Meta's Generative Ads Recommendation foundation model), and the adaptive ranking model, a roughly one-trillion-parameter recommender served at sub-second latency.

We get into why recommender workloads aren't embarrassingly parallel like LLMs (the "personalization blob"), what that means for Meta's MTIA custom silicon roadmap, and how LLM-written kernels (KernelEvolve) flipped the economics of running a heterogeneous hardware fleet. Demand for software engineering has actually gone up as the price has come down. Meta now wants ~100x more optimized kernels per chip.

Read the full transcript at https://www.chipstrat.com/p/an-interview-with-meta-vp-matt-steiner

Chapters:
0:00 Intro and scale
0:39 How Meta's ad system works
2:00 Meta Andromeda and the custom NVIDIA SKU
3:30 Lattice: consolidating ranking models
5:00 GEM, Meta's ads foundation model
6:30 Adaptive ranking for power users
8:17 The scale: 3B DAUs at sub-second latency
9:40 Why longer interaction histories matter
10:45 The anniversary gift analogy
12:57 A decade of compute evolution
15:21 Meta's infra as a CP-SAT problem
16:07 Co-designing Grace Hopper with NVIDIA
17:47 Matching compute shape to workload
18:26 Influencing hardware and software roadmaps
20:23 MTIA: why ads aren't LLMs
22:07 The personalization blob and I/O ratios
26:38 One trillion parameters at sub-second latency
28:26 Heterogeneous hardware trade-offs
29:30 KernelEvolve: LLMs writing custom kernels
33:30 GenAI and recommender systems cross-pollination
35:21 The 2-year infrastructure outlook
37:00 Why demand for software engineering is rising
38:53 How Matt stays on top of it all

Relevant reading:
KernelEvolve (Meta Engineering): https://engineering.fb.com/2026/04/02/developer-tools/kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure/

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/chipstrat