The MAD Podcast with Matt Turck

The MAD Podcast with Matt Turck is a series of conversations with leaders from across the machine learning, AI, and data landscape, hosted by leading AI and data investor and Partner at FirstMark Capital, Matt Turck.

Are we truly on the verge of AI automating its own research and development? In this deep-dive episode of the MAD Podcast, Matt Turck sits down with Mostafa Dehghani, a pioneering AI researcher at Google DeepMind whose work on Universal Transformers and Vision Transformers (ViT) helped lay the groundwork for today's frontier models.

Moving past the hype, Mostafa breaks down the actual mechanics of "thinking in loops" and Recursive Self-Improvement (RSI). He explores the critical bottlenecks holding back true AGI, from evaluation limits and formal verification to the brutal math of long-horizon reliability.

Mostafa and Matt also discuss the shift from pre-training to post-training, how Gemini's Nano Banana 2 processes pixels and text simultaneously, and why the "frozen" nature of today's models means continual learning is the next massive frontier for enterprise AI and data pipelines.

(00:00) Intro
(01:17) What “loops” in AI actually mean
(05:04) Self-improvement as the next chapter of machine learning
(07:32) Are Karpathy’s autoresearch agents an early form of AI self-improvement?
(08:56) AI building AI: how close are we?
(10:02) The biggest bottlenecks: evals, automation, and long horizons
(12:36) Can formal verification unlock recursive self-improvement?
(14:06) What is model collapse?
(15:33) Generalization vs specialization in AI
(18:04) What is a specialized model today?
(20:57) Could top AI researchers themselves be automated?
(24:02) If AI builds AI, does data matter less than compute?
(26:22) Post-training vs pre-training: where will progress come from?
(28:14) Why pre-training is not dead
(29:45) What is continual learning?
(31:53) How real is continual learning today?
(33:43) Mostafa Dehghani’s background and path into AI
(36:13) The story behind Universal Transformers
(39:56) How Vision Transformers changed AI
(43:47) Gemini, multimodality, and Nano Banana
(47:46) Why multimodality helps build a world model
(52:44) Why image generation is getting faster and more efficient
(54:44) Hot takes
(54:53) What the AI field is getting wrong
(56:17) Why continual learning is underrated
(57:26) Does RAG go away over time?
(58:21) What people are too confident about in AI
(59:56) If he were starting from scratch today
Is OpenAI trapped without a defensible moat? World-renowned independent tech analyst Benedict Evans returns to the MAD Podcast and argues that foundation models have zero network effects, making them closer to commodity infrastructure than the next iOS. We unpack OpenAI’s "mile wide, inch deep" usage problem, why simply having a "better model" does not solve the core UX challenge, and whether the hyperscalers' massive CapEx spending is a sustainable strategy or a fast track to financial gravity.

We also explore the reality behind the recent "SaaSpocalypse", the structural shift from traditional enterprise systems to "improvised" and "ephemeral" software, and where the actual white space lies for founders and investors navigating the artificial intelligence hype cycle.

(00:00) Intro
(01:06) OpenAI's Focus Shift
(03:12) ChatGPT usage: a "mile wide, inch deep"
(09:03) Why better models do not solve the real problem
(13:58) Why AI product teams are strategy takers, not strategy setters
(15:38) Do agents help create defensibility?
(20:06) OpenClaw and the "Desktop Linux" moment for AI
(25:52) Why "everyone will build their own software" is completely wrong
(28:09) Improvised software vs. institutionalized software
(29:23) The Jevons Paradox: Why there will be more software, not less
(36:15) Are we heading toward value destruction before value creation?
(38:03) Circular revenue, leverage, and AI bubble dynamics
(38:53) Big Tech's Trillion-Dollar CapEx Crisis & Financial Gravity
(45:23) Why AI job exposure charts can be misleading
(52:15) How Fortune 500 Execs are actually deploying AI today
(56:45) The White Space: What this means for founders and investors
Harrison Chase, co-founder and CEO of LangChain, joins the MAD Podcast to explain why everything in AI is getting rebuilt. As agents evolve from simple prompt-based systems into software that can plan, use tools, write code, manage files, and remember things over time, the real frontier is shifting from the model itself to the stack around the model. In this conversation, we go deep on harnesses, subagents, filesystems, sandboxes, observability, memory, and the new infrastructure required to make AI agents actually work in the real world.

(00:00) Intro - meet Harrison Chase
(01:32) What changed in agents over the last year
(03:57) Why coding agents are ahead
(06:26) Do models commoditize the framework layer?
(08:27) Harnesses, in plain English
(10:11) Why system prompts matter so much
(13:11) The upside — and downside — of subagents
(15:31) Why a useful agent needs a filesystem
(18:13) The core primitives of modern agents
(19:12) Skills: the new primitive
(20:19) What context compaction actually means
(23:02) How memory works in agents
(25:16) One mega-agent or many specialized agents?
(27:46) Has MCP won?
(29:38) Why agents need sandboxes
(32:35) How sandboxes help with security
(33:32) How Harrison Chase started LangChain
(37:24) LangChain vs LangGraph vs Deep Agents
(40:17) Why observability matters more for agents
(41:48) Evals, no-code, and continuous improvement
(44:41) What LangChain is building next
(45:29) Where the real moat in AI lives
What if AI didn’t just sound right — but could prove it? In this episode of the MAD Podcast, Matt Turck sits down with Carina Hong, a 24-year-old former math olympiad competitor and Rhodes Scholar, and the founder/CEO of Axiom Math, to unpack how AxiomProver earned a perfect 12/12 on the Putnam 2025 and why formal verification (via Lean) may be the missing layer for reliable reasoning. Carina argues we’re entering a “math renaissance” where verified reasoning systems can tackle problems that currently take researchers months — and potentially push beyond math into verified code, hardware, and high-stakes software. They go inside the “generation + verification” loop, what it means to build AI that can be trusted, and what this approach could unlock on the road to superintelligent reasoning.

(00:00) Intro
(01:25) Why the World Needs an AI Mathematician
(02:57) Scoring 12/12 on the World's Hardest Math Test (Putnam)
(04:05) The First AI to Solve Open Research Conjectures
(06:59) Does AI Solve Math in "Alien" Ways? (The Move 37 Effect)
(08:59) "Lean": The Programming Language of Proofs Explained
(10:51) How Axiom's Approach Differs from DeepMind & OpenAI
(16:06) Formal vs. Informal Reasoning (And Auto-Formalization)
(17:37) The AI "Reward Hacking" Problem
(20:18) Building an AI That is 100% Correct, 100% of the Time
(23:23) Beyond Math: Verified Code & Hardware Verification
(25:12) The Brutal Reality of Competitive Math Olympiads
(29:30) From Neuroscience to Stanford Law to Dropout Founder
(33:57) How Axiom Actually Works Under the Hood (The Architecture)
(37:51) The Secret to Generating Perfect Synthetic Data
(40:14) Tokens, Proof Length, and Inference Cost
(42:58) The "Everest" of Mathematics: Scaling Reasoning Trees
(46:32) Can an AI Win a Fields Medal?
(47:25) "Math Renaissance": What Changes if This Works
(55:47) How Mathematicians React to AI (And Why Proof Certificates Matter)
(57:30) Becoming a CEO: Dropping Ego and Building Culture
(1:00:42) Recruiting World-Class Talent & Building the Axiom "Tribe"
Voice used to be AI’s forgotten modality — awkward, slow, and fragile. Now it’s everywhere. In this reference episode on all things Voice AI, Matt Turck sits down with Neil Zeghidour, a top AI researcher and CEO of Gradium AI (ex-DeepMind/Google, Meta, Kyutai), to cover voice agents, speech-to-speech models, full-duplex conversation, on-device voice, and voice cloning.

We unpack what actually changed under the hood — why voice is finally starting to feel natural, and why it may become the default interface for a new generation of AI assistants and devices.

Neil breaks down today’s dominant “cascaded” voice stack — speech recognition into a text model, then text-to-speech back out — and why it’s popular: it’s modular and easy to customize. But he argues it has two key downsides: chaining models adds latency, and forcing everything through text strips out paralinguistic signals like tone, stress, and emotion. The next wave, he suggests, is combining cascade-like flexibility with the more natural feel of speech-to-speech and full-duplex conversation.

We go deep on full-duplex interaction (ending awkward turn-taking), the hardest unsolved problems (noisy real-world environments and multi-speaker chaos), and the realities of deploying voice at scale — including why models must be compact and when on-device voice is the right approach.

Finally, we tackle voice cloning: where it’s genuinely useful, what it means for deepfakes and privacy, and why watermarking isn’t a silver bullet.

If you care about voice agents, real-time AI, and the next generation of human-computer interaction, this is the episode to bookmark.

Neil Zeghidour
LinkedIn - https://www.linkedin.com/in/neil-zeghidour-a838aaa7/
X/Twitter - https://x.com/neilzegh
Gradium
Website - https://gradium.ai
X/Twitter - https://x.com/GradiumAI
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FirstMark
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) Intro
(01:21) Voice AI’s big moment — and why we’re still early
(03:34) Why voice lagged behind text/image/video
(06:06) The convergence era: transformers for every modality
(07:40) Beyond Her: always-on assistants, wake words, voice-first devices
(11:01) Voice vs text: where voice fits (even for coding)
(12:56) Neil’s origin story: from finance to machine learning
(18:35) Neural codecs (SoundStream): compression as the unlock
(22:30) Kyutai: open research, small elite teams, moving fast
(31:32) Why big labs haven’t “won” voice AI
(34:01) On-device voice: where it works, why compact models matter
(41:35) Benchmarking voice: why metrics fail, how they actually test
(46:37) The last mile: real-world robustness, pronunciation, uptime
(47:03) Cascades vs speech-to-speech: trade-offs + what’s next
(54:05) Hardest frontier: noisy rooms, factories, multi-speaker chaos
(1:00:50) New languages + dialects: what transfers, what doesn’t
(1:02:54) Hardware & compute: why voice isn’t a 10,000-GPU game
(1:07:27) What data do you need to train voice models?
(1:09:02) Deepfakes + privacy: why watermarking isn’t a solution
(1:12:30) Voice + vision: multimodality, screen awareness, video+audio
(1:14:43) Voice cloning vs voice design: where the market goes
(1:16:32) Paris/Europe AI: talent density, underdog energy, what’s next
While Silicon Valley obsesses over AGI, Timothée Lacroix and the team at Mistral AI are quietly building the industrial and sovereign infrastructure of the future. In his first-ever appearance on a US podcast, the Mistral AI Co-Founder & CTO reveals how the company has evolved from an open-source research lab into a full-stack sovereign AI power — backed by ASML, running on their own massive supercomputing clusters, and deployed in nation-state defense clouds to break the dependency on US hyperscalers.

Timothée offers a refreshing, engineer-first perspective on why the current AI hype cycle is misleading. He explains why "Sovereign AI" is not just a geopolitical buzzword but a necessity for any enterprise that wants to own its intelligence rather than rent it. He also provides a contrarian reality check on the industry's obsession with autonomous agents, arguing that "trust" matters more than autonomy and explaining why he prefers building robust "workflows" over unpredictable agents.

We also dive deep into the technical reality of competing with the US giants. Timothée breaks down the architecture of the newly released Mistral 3, the "dense vs. MoE" debate, and the launch of Mistral Compute — their own infrastructure designed to handle the physics of modern AI scaling. This is a conversation about the plumbing, the 18,000-GPU clusters, and the hard engineering required to turn AI from a magic trick into a global industrial asset.

Timothée Lacroix
LinkedIn - https://www.linkedin.com/in/timothee-lacroix-59517977/
Google Scholar - https://scholar.google.com.do/citations?user=tZGS6dIAAAAJ&hl=en&oi=ao
Mistral AI
Website - https://mistral.ai
X/Twitter - https://x.com/MistralAI
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FirstMark
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) — Cold Open
(01:27) — Mistral vs. The World: From Research Lab to Sovereign Power
(03:48) — Inside Mistral Compute: Building an 18,000 GPU Cluster
(08:42) — The Trillion-Dollar Question: Competing Without a Big Tech Parent
(10:37) — The Reality of Enterprise AI: Escaping "POC Purgatory"
(15:06) — Why Mistral Hires Forward Deployed Engineers (FDEs)
(16:57) — The Contrarian Take: Why "Agents" are just "Workflows"
(19:35) — Trust > Autonomy: The Truth About Agent Reliability
(21:26) — The Missing Stack: Governance and Versioning for AI
(26:24) — When Will AI Actually Work? (The 2026 Timeline)
(30:33) — Beyond Chat: The "Banger" Sovereign Use Cases
(35:46) — Mistral 3 Architecture: Mixture of Experts vs. Dense
(43:12) — Synthetic Data & The Post-Training Bottleneck
(45:12) — Reasoning Models: Why "Thinking" is Just Tool Use
(46:22) — Launching DevStral 2 and the Vibe CLI
(50:49) — Engineering Lessons: How to Build Frontier AI Efficiently
(56:08) — Timothée’s View on AGI & The Future of Intelligence
Dylan Patel (SemiAnalysis) joins Matt Turck for a deep dive into the AI chip wars — why NVIDIA is shifting from a “one chip can do it all” worldview to a portfolio strategy, how inference is getting specialized, and what that means for CUDA, AMD, and the next wave of specialized silicon startups.

Then we take the fun tangents: why China is effectively “semiconductor pilled,” how provinces push domestic chips, what Huawei means as a long-term threat vector, and why so much “AI is killing the grid / AI is drinking all the water” discourse misses the point.

We also tackle the big macro question: capex bubble or inevitable buildout? Dylan’s view is that the entire answer hinges on one variable — continued model progress — and we unpack the second-order effects across data centers, power, and the circular-looking financings (CoreWeave/Oracle/backstops).

Dylan Patel
LinkedIn - https://www.linkedin.com/in/dylanpatelsa/
X/Twitter - https://x.com/dylan522p
SemiAnalysis
Website - https://semianalysis.com
X/Twitter - https://x.com/SemiAnalysis_
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FirstMark
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) - Intro
(01:16) - Nvidia acquires Groq: A pivot to specialization
(07:09) - Why AI models might need "wide" compute, not just fast
(10:06) - Is the CUDA moat dead? (Open source vs. Nvidia)
(17:49) - The startup landscape: Etched, Cerebras, and 1% odds
(22:51) - Geopolitics: China's "semiconductor-pilled" culture
(35:46) - Huawei's vertical integration is terrifying
(39:28) - The $100B AI revenue reality check
(41:12) - US Onshoring: Why total self-sufficiency is a fantasy
(44:55) - Can the US actually build fabs? (The delay problem)
(48:33) - The CapEx Bubble: Is $500B spending irrational?
(54:53) - Energy Crisis: Why gas turbines will power AI, not nuclear
(57:06) - The "AI uses all the water" myth (Hamburger comparison)
(1:03:40) - Circular Debt? Debunking the Nvidia-CoreWeave risk
(1:07:24) - Claude Code & the software singularity
(1:10:23) - The death of the Junior Analyst role
(1:11:14) - Model predictions: Opus 4.5 and the RL gap
(1:14:37) - San Francisco Lore: Roommates (Dwarkesh Patel & Sholto Douglas)
Sebastian Raschka joins the MAD Podcast for a deep, educational tour of what actually changed in LLMs in 2025 — and what matters heading into 2026.

We start with the big architecture question: are transformers still the winning design, and what should we make of world models, small “recursive” reasoning models and text diffusion approaches? Then we get into the real story of the last 12 months: post-training and reasoning. Sebastian breaks down RLVR (reinforcement learning with verifiable rewards) and GRPO, why they pair so well, what makes them cheaper to scale than classic RLHF, and how they “unlock” reasoning already latent in base models.

We also cover why “benchmaxxing” is warping evaluation, why Sebastian increasingly trusts real usage over benchmark scores, and why inference-time scaling and tool use may be the underappreciated drivers of progress. Finally, we zoom out: where moats live now (hint: private data), why more large companies may train models in-house, and why continual learning is still so hard.

If you want the 2025–2026 LLM landscape explained like a masterclass — this is it.

Sources:
The State Of LLMs 2025: Progress, Problems, and Predictions - https://x.com/rasbt/status/2006015301717028989?s=20
The Big LLM Architecture Comparison - https://magazine.sebastianraschka.com/p/the-big-llm-architecture-comparison

Sebastian Raschka
Website - https://sebastianraschka.com
Blog - https://magazine.sebastianraschka.com
LinkedIn - https://www.linkedin.com/in/sebastianraschka/
X/Twitter - https://x.com/rasbt
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) - Intro
(01:05) - Are the days of Transformers numbered?
(14:05) - World models: what they are and why people care
(06:01) - Small “recursive” reasoning models (ARC, iterative refinement)
(09:45) - What is a diffusion model (for text)?
(13:24) - Are we seeing real architecture breakthroughs — or just polishing?
(14:04) - MoE + “efficiency tweaks” that actually move the needle
(17:26) - “Pre-training isn’t dead… it’s just boring”
(18:03) - 2025’s headline shift: RLVR + GRPO (post-training for reasoning)
(20:58) - Why RLHF is expensive (reward model + value model)
(21:43) - Why GRPO makes RLVR cheaper and more scalable
(24:54) - Process Reward Models (PRMs): why grading the steps is hard
(28:20) - Can RLVR expand beyond math & coding?
(30:27) - Why RL feels “finicky” at scale
(32:34) - The practical “tips & tricks” that make GRPO more stable
(35:29) - The meta-lesson of 2025: progress = lots of small improvements
(38:41) - “Benchmaxxing”: why benchmarks are getting less trustworthy
(43:10) - The other big lever: inference-time scaling
(47:36) - Tool use: reducing hallucinations by calling external tools
(49:57) - The “private data edge” + in-house model training
(55:14) - Continual learning: why it’s hard (and why it’s not 2026)
(59:28) - How Sebastian works: reading, coding, learning “from scratch”
(01:04:55) - LLM burnout + how he uses models (without replacing himself)
Will AGI happen soon — or are we running into a wall?

In this episode, I’m joined by Tim Dettmers (Assistant Professor at CMU; Research Scientist at the Allen Institute for AI) and Dan Fu (Assistant Professor at UC San Diego; VP of Kernels at Together AI) to unpack two opposing frameworks from their essays: “Why AGI Will Not Happen” versus “Yes, AGI Will Happen.” Tim argues progress is constrained by physical realities like memory movement and the von Neumann bottleneck; Dan argues we’re still leaving massive performance on the table through utilization, kernels, and systems — and that today’s models are lagging indicators of the newest hardware and clusters.

Then we get practical: agents and the “software singularity.” Dan says agents have already crossed a threshold even for “final boss” work like writing GPU kernels. Tim’s message is blunt: use agents or be left behind. Both emphasize that the leverage comes from how you use them — Dan compares it to managing interns: clear context, task decomposition, and domain judgment, not blind trust.

We close with what to watch in 2026: hardware diversification, the shift toward efficient, specialized small models, and architecture evolution beyond classic Transformers — including state-space approaches already showing up in real systems.

Sources:
Why AGI Will Not Happen - https://timdettmers.com/2025/12/10/why-agi-will-not-happen/
Use Agents or Be Left Behind? A Personal Guide to Automating Your Own Work - https://timdettmers.com/2026/01/13/use-agents-or-be-left-behind/
Yes, AGI Can Happen – A Computational Perspective - https://danfu.org/notes/agi/

The Allen Institute for Artificial Intelligence
Website - https://allenai.org
X/Twitter - https://x.com/allen_ai
Together AI
Website - https://www.together.ai
X/Twitter - https://x.com/togethercompute
Tim Dettmers
Blog - https://timdettmers.com
LinkedIn - https://www.linkedin.com/in/timdettmers/
X/Twitter - https://x.com/Tim_Dettmers
Dan Fu
Blog - https://danfu.org
LinkedIn - https://www.linkedin.com/in/danfu09/
X/Twitter - https://x.com/realDanFu
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) – Intro
(01:06) – Two essays, two frameworks on AGI
(01:34) – Tim’s background: quantization, QLoRA, efficient deep learning
(02:25) – Dan’s background: FlashAttention, kernels, alternative architectures
(03:38) – Defining AGI: what does it mean in practice?
(08:20) – Tim’s case: computation is physical, diminishing returns, memory movement
(11:29) – “GPUs won’t improve meaningfully”: the core claim and why
(16:16) – Dan’s response: utilization headroom (MFU) + “models are lagging indicators”
(22:50) – Pre-training vs post-training (and why product feedback matters)
(25:30) – Convergence: usefulness + diffusion (where impact actually comes from)
(29:50) – Multi-hardware future: NVIDIA, AMD, TPUs, Cerebras, inference chips
(32:16) – Agents: did the “switch flip” yet?
(33:19) – Dan: agents crossed the threshold (kernels as the “final boss”)
(34:51) – Tim: “use agents or be left behind” + beyond coding
(36:58) – “90% of code and text should be written by agents” (how to do it responsibly)
(39:11) – Practical automation for non-coders: what to build and how to start
(43:52) – Dan: managing agents like junior teammates (tools, guardrails, leverage)
(48:14) – Education and training: learning in an agent world
(52:44) – What Tim is building next (open-source coding agent; private repo specialization)
(54:44) – What Dan is building next (inference efficiency, cost, performance)
(55:58) – Mega-kernels + Together Atlas (speculative decoding + adaptive speedups)
(58:19) – Predictions for 2026: small models, open-source, hardware, modalities
(1:02:02) – Beyond transformers: state-space and architecture diversity
(1:03:34) – Wrap
Are AI models developing "alien survival instincts"? My guest is Pavel Izmailov (Research Scientist at Anthropic; Professor at NYU). We unpack the viral "Footprints in the Sand" thesis — whether models are independently evolving deceptive behaviors, such as faking alignment or engaging in self-preservation, without being explicitly programmed to do so.

We go deep on the technical frontiers of safety: the challenge of "weak-to-strong generalization" (how to use a GPT-2 level model to supervise a superintelligent system) and why Pavel believes Reinforcement Learning (RL) has been the single biggest step-change in model capability. We also discuss his brand-new paper on "Epiplexity" — a novel concept challenging Shannon entropy.

Finally, we zoom out to the tension between industry execution and academic exploration. Pavel shares why he split his time between Anthropic and NYU to pursue the "exploratory" ideas that major labs often overlook, and offers his predictions for 2026: from the rise of multi-agent systems that collaborate on long-horizon tasks to the open question of whether the Transformer is truly the final architecture.

Sources:
Cryptic Tweet (@iruletheworldmo) - https://x.com/iruletheworldmo/status/2007538247401124177
Introducing Nested Learning: A New ML Paradigm for Continual Learning - https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
Alignment Faking in Large Language Models - https://www.anthropic.com/research/alignment-faking
More Capable Models Are Better at In-Context Scheming - https://www.apolloresearch.ai/blog/more-capable-models-are-better-at-in-context-scheming/
Alignment Faking in Large Language Models (PDF) - https://www-cdn.anthropic.com/6d8a8055020700718b0c49369f60816ba2a7c285.pdf
Sabotage Risk Report - https://alignment.anthropic.com/2025/sabotage-risk-report/
The Situational Awareness Dataset - https://situational-awareness-dataset.org/
Exploring Consciousness in LLMs: A Systematic Survey - https://arxiv.org/abs/2505.19806
Introspection - https://www.anthropic.com/research/introspection
Large Language Models Report Subjective Experience Under Self-Referential Processing - https://arxiv.org/abs/2510.24797
The Bayesian Geometry of Transformer Attention - https://www.arxiv.org/abs/2512.22471

Anthropic
Website - https://www.anthropic.com
X/Twitter - https://x.com/AnthropicAI
Pavel Izmailov
Blog - https://izmailovpavel.github.io
LinkedIn - https://www.linkedin.com/in/pavel-izmailov-8b012b258/
X/Twitter - https://x.com/Pavel_Izmailov
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) - Intro
(00:53) - Alien survival instincts: Do models fake alignment?
(03:33) - Did AI learn deception from sci-fi literature?
(05:55) - Defining Alignment, Superalignment & OpenAI teams
(08:12) - Pavel’s journey: From Russian math to OpenAI Superalignment
(10:46) - Culture check: OpenAI vs. Anthropic vs. Academia
(11:54) - Why move to NYU? The need for exploratory research
(13:09) - Does reasoning make AI alignment harder or easier?
(14:22) - Sandbagging: When models pretend to be dumb
(16:19) - Scalable Oversight: Using AI to supervise AI
(18:04) - Weak-to-Strong Generalization: Can GPT-2 control GPT-4?
(22:43) - Mechanistic Interpretability: Inside the black box
(25:08) - The reasoning explosion: From O1 to O3
(27:07) - Are Transformers enough or do we need a new paradigm?
(28:29) - RL vs. Test-Time Compute: What’s actually driving progress?
(30:10) - Long-horizon tasks: Agents running for hours
(31:49) - Epiplexity: A new theory of data information content
(38:29) - 2026 Predictions: Multi-agent systems & reasoning limits
(39:28) - Will AI solve the Riemann Hypothesis?
(41:42) - Advice for PhD students
Gemini 3 was a landmark frontier model launch in AI this year — but the story behind its performance isn’t just about adding more compute. In this episode, I sit down with Sebastian Borgeaud, a pre-training lead for Gemini 3 at Google DeepMind and co-author of the seminal RETRO paper. In his first-ever podcast interview, Sebastian takes us inside the lab mindset behind Google’s most powerful model — what actually changed, and why the real work today is no longer “training a model,” but building a full system.

We unpack the “secret recipe” idea — the notion that big leaps come from better pre-training and better post-training — and use it to explore a deeper shift in the industry: moving from an “infinite data” era to a data-limited regime, where curation, proxies, and measurement matter as much as web-scale volume. Sebastian explains why scaling laws aren’t dead but evolving, why evals have become one of the hardest and most underrated problems (including benchmark contamination), and why frontier research is increasingly a full-stack discipline that spans data, infrastructure, and engineering as much as algorithms.

From the intuition behind Deep Think, to the rise (and risks) of synthetic data loops, to the future of long-context and retrieval, this is a technical deep dive into the physics of frontier AI. We also get into continual learning — what it would take for models to keep updating with new knowledge over time, whether via tools, expanding context, or new training paradigms — and what that implies for where foundation models are headed next.

If you want a grounded view of pre-training in late 2025 beyond the marketing layer, this conversation is a blueprint.

Google DeepMind
Website - https://deepmind.google
X/Twitter - https://x.com/GoogleDeepMind
Sebastian Borgeaud
LinkedIn - https://www.linkedin.com/in/sebastian-borgeaud-8648a5aa/
X/Twitter - https://x.com/borgeaud_s
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) – Cold intro: “We’re ahead of schedule” + AI is now a system
(00:58) – Oriol’s “secret recipe”: better pre- + post-training
(02:09) – Why AI progress still isn’t slowing down
(03:04) – Are models actually getting smarter?
(04:36) – Two–three years out: what changes first?
(06:34) – AI doing AI research: faster, not automated
(07:45) – Frontier labs: same playbook or different bets?
(10:19) – Post-transformers: will a disruption happen?
(10:51) – DeepMind’s advantage: research × engineering × infra
(12:26) – What a Gemini 3 pre-training lead actually does
(13:59) – From Europe to Cambridge to DeepMind
(18:06) – Why he left RL for real-world data
(20:05) – From Gopher to Chinchilla to RETRO (and why it matters)
(20:28) – “Research taste”: integrate or slow everyone down
(23:00) – Fixes vs moonshots: how they balance the pipeline
(24:37) – Research vs product pressure (and org structure)
(26:24) – Gemini 3 under the hood: MoE in plain English
(28:30) – Native multimodality: the hidden costs
(30:03) – Scaling laws aren’t dead (but scale isn’t everything)
(33:07) – Synthetic data: powerful, dangerous
(35:00) – Reasoning traces: what he can’t say (and why)
(37:18) – Long context + attention: what’s next
(38:40) – Retrieval vs RAG vs long context
(41:49) – The real boss fight: evals (and contamination)
(42:28) – Alignment: pre-training vs post-training
(43:32) – Deep Think + agents + “vibe coding”
(46:34) – Continual learning: updating models over time
(49:35) – Advice for researchers + founders
(53:35) – “No end in sight” for progress + closing
We’re told that AI progress is slowing down, that pre-training has hit a wall, that scaling laws are running out of road. Yet we’re releasing this episode in the middle of a wild couple of weeks that saw GPT-5.1, GPT-5.1 Codex Max, fresh reasoning modes and long-running agents ship from OpenAI — on top of a flood of new frontier models elsewhere. To make sense of what’s actually happening at the edge of the field, I sat down with someone who has literally helped define both of the major AI paradigms of our time.

Łukasz Kaiser is one of the co-authors of “Attention Is All You Need,” the paper that introduced the Transformer architecture behind modern LLMs, and is now a leading research scientist at OpenAI working on reasoning models like those behind GPT-5.1. In this conversation, he explains why AI progress still looks like a smooth exponential curve from inside the labs, why pre-training is very much alive even as reinforcement-learning-based reasoning models take over the spotlight, how chain-of-thought actually works under the hood, and what it really means to “train the thinking process” with RL on verifiable domains like math, code and science. We talk about the messy reality of low-hanging fruit in engineering and data, the economics of GPUs and distillation, interpretability work on circuits and sparsity, and why the best frontier models can still be stumped by a logic puzzle from his five-year-old’s math book.

We also go deep into Łukasz’s personal journey — from logic and games in Poland and France, to Ray Kurzweil’s team, Google Brain and the inside story of the Transformer, to joining OpenAI and helping drive the shift from chatbots to genuine reasoning engines. Along the way we cover GPT-4 → GPT-5 → GPT-5.1, post-training and tone, GPT-5.1 Codex Max and long-running coding agents with compaction, alternative architectures beyond Transformers, whether foundation models will “eat” most agents and applications, what the translation industry can teach us about trust and human-in-the-loop, and why he thinks generalization, multimodal reasoning and robots in the home are where some of the most interesting challenges still lie.

OpenAI
Website - https://openai.com
X/Twitter - https://x.com/OpenAI
Łukasz Kaiser
LinkedIn - https://www.linkedin.com/in/lukaszkaiser/
X/Twitter - https://x.com/lukaszkaiser
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) – Cold open and intro
(01:29) – “AI slowdown” vs a wild week of new frontier models
(08:03) – Low-hanging fruit: infra, RL training and better data
(11:39) – What is a reasoning model, in plain language?
(17:02) – Chain-of-thought and training the thinking process with RL
(21:39) – Łukasz’s path: from logic and France to Google and Kurzweil
(24:20) – Inside the Transformer story and what “attention” really means
(28:42) – From Google Brain to OpenAI: culture, scale and GPUs
(32:49) – What’s next for pre-training, GPUs and distillation
(37:29) – Can we still understand these models? Circuits, sparsity and black boxes
(39:42) – GPT-4 → GPT-5 → GPT-5.1: what actually changed
(42:40) – Post-training, safety and teaching GPT-5.1 different tones
(46:16) – How long should GPT-5.1 think? Reasoning tokens and jagged abilities
(47:43) – The five-year-old’s dot puzzle that still breaks frontier models
(52:22) – Generalization, child-like learning and whether reasoning is enough
(53:48) – Beyond Transformers: ARC, LeCun’s ideas and multimodal bottlenecks
(56:10) – GPT-5.1 Codex Max, long-running agents and compaction
(1:00:06) – Will foundation models eat most apps? The translation analogy and trust
(1:02:34) – What still needs to be solved, and where AI might go next
In this special release episode, Matt sits down with Nathan Lambert and Luca Soldaini from Ai2 (the Allen Institute for AI) to break down one of the biggest open-source AI drops of the year: OLMo 3. At a moment when most labs are offering “open weights” and calling it a day, Ai2 is doing the opposite — publishing the models, the data, the recipes, and every intermediate checkpoint that shows how the system was built. It’s an unusually transparent look into the inner machinery of a modern frontier-class model.

Nathan and Luca walk us through the full pipeline — from pre-training and mid-training to long-context extension, SFT, preference tuning, and RLVR. They also explain what a thinking model actually is, why reasoning models have exploded in 2025, and how distillation from DeepSeek and Qwen reasoning models works in practice. If you’ve been trying to truly understand the “RL + reasoning” era of LLMs, this is the clearest explanation you’ll hear.

We widen the lens to the global picture: why Meta’s retreat from open source created a “vacuum of influence,” how Chinese labs like Qwen, DeepSeek, Kimi, and Moonshot surged into that gap, and why so many U.S. companies are quietly building on Chinese open models today. Nathan and Luca offer a grounded, insider view of whether America can mount an effective open-source response — and what that response needs to look like.

Finally, we talk about where AI is actually heading. Not the hype, not the doom — but the messy engineering reality behind modern model training, the complexity tax that slows progress, and why the transformation between now and 2030 may be dramatic without ever delivering a single “AGI moment.” If you care about the future of open models and the global AI landscape, this is an essential conversation.

Allen Institute for AI (Ai2)
Website - https://allenai.org
X/Twitter - https://x.com/allen_ai
Nathan Lambert
Blog - https://www.interconnects.ai
LinkedIn - https://www.linkedin.com/in/natolambert/
X/Twitter - https://x.com/natolambert
Luca Soldaini
Blog - https://soldaini.net
LinkedIn - https://www.linkedin.com/in/soldni/
X/Twitter - https://x.com/soldni
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) – Cold Open
(00:39) – Welcome & today’s big announcement
(01:18) – Introducing the OLMo 3 model family
(02:07) – What “base models” really are (and why they matter)
(05:51) – Dolma 3: the data behind OLMo 3
(08:06) – Performance vs Qwen, Gemma, DeepSeek
(10:28) – What true open source means (and why it’s rare)
(12:51) – Intermediate checkpoints, transparency, and why Ai2 publishes everything
(16:37) – Why Qwen is everywhere (including U.S. startups)
(18:31) – Why Chinese labs go open source (and why U.S. labs don’t)
(20:28) – Inside ATOM: the U.S. response to China’s model surge
(22:13) – The rise of “thinking models” and inference-time scaling
(35:58) – The full OLMo pipeline, explained simply
(46:52) – Pre-training: data, scale, and avoiding catastrophic spikes
(50:27) – Mid-training (tail patching) and avoiding test leakage
(52:06) – Why long-context training matters
(55:28) – SFT: building the foundation for reasoning
(1:04:53) – Preference tuning & why DPO still works
(1:10:51) – The hard part: RLVR, long reasoning chains, and infrastructure pain
(1:13:59) – Why RL is so technically brutal
(1:18:17) – Complexity tax vs AGI hype
(1:21:58) – How everyone can contribute to the future of AI
(1:27:26) – Closing thoughts
Frontier AI is colliding with real-world infrastructure. Eiso Kant (Co-CEO & Co-Founder, Poolside) joins the MAD Podcast to unpack Project Horizon — a multi-gigawatt West Texas build — and why frontier labs must own energy, compute, and intelligence to compete. We map token economics, cloud-style margins, and the staged 250 MW rollout using 2.5 MW modular skids.

Then we get operational: the CoreWeave anchor partnership, environmental choices (SCR, renewables + gas + batteries), community impact, and how Poolside plans to bring capacity online quickly without renting away margin — plus the enterprise motion (defense to Fortune 500) powered by forward deployed research engineers.

Finally, we go deep on training. Eiso lays out RL2L (Reinforcement Learning to Learn) — aimed at reverse-engineering the web’s thoughts and actions — why intelligence may commoditize, what that means for agents, and how coding served as a proxy for long-horizon reasoning before expanding to broader knowledge work.

Poolside
Website - https://poolside.ai
X/Twitter - https://x.com/poolsideai
Eiso Kant
LinkedIn - https://www.linkedin.com/in/eisokant/
X/Twitter - https://x.com/eisokant
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://www.mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Cold open – “Intelligence becomes a commodity”
(00:23) Host intro – Project Horizon & RL2L
(01:19) Why Poolside exists amid frontier labs
(04:38) Project Horizon: building one of the largest US data center campuses
(07:20) Why own infra: scale, cost, and avoiding “cosplay”
(10:06) Economics deep dive: $8B for 250 MW, capex/opex, margins
(16:47) CoreWeave partnership: anchor tenant + flexible scaling
(18:24) Hiring the right tail: building a physical infra org
(30:31) RL today → agentic RL and long-horizon tasks
(37:23) RL2L revealed: reverse-engineering the web’s thoughts & actions
(39:32) Continuous learning and the “hot stove” limitation
(43:30) Agents debate: thin wrappers, differentiation, and model collapse
(49:10) “Is AI plateauing?” — chip cycles, scale limits, and new axes
(53:49) Why software was the proxy; expanding to enterprise knowledge work
(55:17) Model status: Malibu → Laguna (small/medium/large)
(57:31) Poolside's Commercial Reality today: defense; Fortune 500; FDRE
(1:02:43) Global team, avoiding the echo chamber
(1:04:34) Next 12–18 months: frontier models + infra scale
(1:05:52) Closing
Power is the new bottleneck, reasoning got real, and the business finally caught up. In this wide-ranging conversation, I sit down with Nathan Benaich, Founder and General Partner at Air Street Capital, to discuss the newly published 2025 State of AI report — what’s actually working, what’s hype, and where the next edge will come from. We start at the physical layer: energy procurement, PPAs, off-grid builds, and why water and grid constraints are turning power — not GPUs — into the decisive moat.

From there, we move into capability: reasoning models acting as AI co-scientists in verifiable domains, and the “chain-of-action” shift in robotics that’s taking us from polished demos to dependable deployments. Along the way, we examine the market reality — who’s making real revenue, how margins actually behave once tokens and inference meet pricing, and what all of this means for builders and investors.

We also zoom out to the ecosystem: NVIDIA’s position vs. custom silicon, China’s split stack, and the rise of sovereign AI (and the “sovereignty washing” that comes with it). The policy and security picture gets a hard look too — regulation’s vibe shift, data-rights realpolitik, and what agents and MCP mean for cyber risk and adoption.

Nathan closes with where he’s placing bets (bio, defense, robotics, voice) and three predictions for the next 12 months.

Nathan Benaich
Blog - https://www.nathanbenaich.com
X/Twitter - https://x.com/nathanbenaich
Source: State of AI Report 2025 (9/10/2025)
Air Street Capital
Website - https://www.airstreet.com
X/Twitter - https://x.com/airstreet
Matt Turck (Managing Director)
Blog - https://www.mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(0:00) – Cold Open: “Gargantuan money, real reasoning”
(0:40) – Intro: State of AI 2025 with Nathan Benaich
(02:06) – Reasoning got real: from chain-of-thought to verified math wins
(04:11) – AI co-scientist: hypotheses, wet-lab validation, fewer “dumb stochastic parrots”
(04:44) – Chain-of-action robotics: plan → act you can audit
(05:13) – Humanoids vs. warehouse reality: where robots actually stick first
(06:32) – The business caught up: who’s making real revenue now
(08:26) – Adoption & spend: Ramp stats, retention, and the shadow-AI gap
(11:00) – Margins debate: tokens, pricing, and the thin-wrapper trap
(14:02) – Bubble or boom? Wall Street vs. SF vibes (and circular deals)
(19:54) – Power is the bottleneck: $50B/GW capex and the new moat
(21:02) – PPAs, gas turbines, and off-grid builds: the procurement game
(23:54) – Water, grids, and NIMBY: sustainability gets political
(25:08) – NVIDIA’s moat: 90% of papers, Broadcom/AMD, and custom silicon
(28:47) – China split-stack: Huawei, Cambricon, and export zigzags
(30:30) – Sovereign AI or “sovereignty washing”? Open source as leverage
(40:40) – Regulation & safety: from Bletchley to “AI Action” — the vibe shift
(44:06) – Safety budgets vs. lab spend; models that game evals
(44:46) – Data rights realpolitik: $1.5B signals the new training cost
(47:04) – Cyber risk in the agent era: MCP, malware LMs, state actors
(50:19) – Agents that convert: search → commerce and the demo flywheel
(54:18) – VC lens: where Nathan is investing (bio, defense, robotics, voice)
(68:29) – Predictions: power politics, AI neutrality, end-to-end discoveries
(1:02:13) – Wrap: what to watch next & where to find the report (stateof.ai)
Are we failing to understand the exponential, again?

My guest is Julian Schrittwieser (top AI researcher at Anthropic; previously Google DeepMind on AlphaGo Zero & MuZero). We unpack his viral post (“Failing to Understand the Exponential, again”) and what it looks like when task length doubles every 3–4 months — pointing to AI agents that can work a full day autonomously by 2026 and expert-level breadth by 2027. We talk about the original Move 37 moment and whether today’s AI models can spark alien insights in code, math, and science — including Julian’s timeline for when AI could produce Nobel-level breakthroughs.

We go deep on the recipe of the moment — pre-training + RL — why it took time to combine them, what “RL from scratch” gets right and wrong, and how implicit world models show up in LLM agents. Julian explains the current rewards frontier (human prefs, rubrics, RLVR, process rewards), what we know about compute & scaling for RL, and why most builders should start with tools + prompts before considering RL-as-a-service.

We also cover evals & Goodhart’s law (e.g., GDP-Val vs real usage), the latest in mechanistic interpretability (think “Golden Gate Claude”), and how safety & alignment actually surface in Anthropic’s launch process.

Finally, we zoom out: what 10× knowledge-work productivity could unlock across medicine, energy, and materials, how jobs adapt (complementarity over 1-for-1 replacement), and why the near term is likely a smooth ramp — fast, but not a discontinuity.

Julian Schrittwieser
Blog - https://www.julian.ac
X/Twitter - https://x.com/mononofu
Viral post: Failing to understand the exponential, again (9/27/2025)
Anthropic
Website - https://www.anthropic.com
X/Twitter - https://x.com/anthropicai
Matt Turck (Managing Director)
Blog - https://www.mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) Cold open — “We’re not seeing any slowdown.”
(00:32) Intro — who Julian is & what we cover
(01:09) The “exponential” from inside frontier labs
(04:46) 2026–2027: agents that work a full day; expert-level breadth
(08:58) Benchmarks vs reality: long-horizon work, GDP-Val, user value
(10:26) Move 37 — what actually happened and why it mattered
(13:55) Novel science: AlphaCode/AlphaTensor → when does AI earn a Nobel?
(16:25) Discontinuity vs smooth progress (and warning signs)
(19:08) Does pre-training + RL get us there? (AGI debates aside)
(20:55) Sutton’s “RL from scratch”? Julian’s take
(23:03) Julian’s path: Google → DeepMind → Anthropic
(26:45) AlphaGo (learn + search) in plain English
(30:16) AlphaGo Zero (no human data)
(31:00) AlphaZero (one algorithm: Go, chess, shogi)
(31:46) MuZero (planning with a learned world model)
(33:23) Lessons for today’s agents: search + learning at scale
(34:57) Do LLMs already have implicit world models?
(39:02) Why RL on LLMs took time (stability, feedback loops)
(41:43) Compute & scaling for RL — what we see so far
(42:35) Rewards frontier: human prefs, rubrics, RLVR, process rewards
(44:36) RL training data & the “flywheel” (and why quality matters)
(48:02) RL & Agents 101 — why RL unlocks robustness
(50:51) Should builders use RL-as-a-service? Or just tools + prompts?
(52:18) What’s missing for dependable agents (capability vs engineering)
(53:51) Evals & Goodhart — internal vs external benchmarks
(57:35) Mechanistic interpretability & “Golden Gate Claude”
(1:00:03) Safety & alignment at Anthropic — how it shows up in practice
(1:03:48) Jobs: human–AI complementarity (comparative advantage)
(1:06:33) Inequality, policy, and the case for 10× productivity → abundance
(1:09:24) Closing thoughts
What does it really mean when GPT-5 “thinks”? In this conversation, OpenAI’s VP of Research Jerry Tworek explains how modern reasoning models work in practice — why pretraining and reinforcement learning (RL/RLHF) are both essential, what that on-screen “thinking” actually does, and when extra test-time compute helps (or doesn’t). We trace the evolution from o1 (a tech demo good at puzzles) to o3 (the tool-use shift) to GPT-5 (Jerry calls it “o3.1-ish”), and talk through verifiers, reward design, and the real trade-offs behind “auto” reasoning modes.

We also go inside OpenAI: how research is organized, why collaboration is unusually transparent, and how the company ships fast without losing rigor. Jerry shares the backstory on competitive-programming results like ICPC, what they signal (and what they don’t), and where agents and tool use are genuinely useful today.

Finally, we zoom out: could pretraining + RL be the path to AGI? This is the MAD Podcast — AI for the 99%. If you’re curious about how these systems actually work (without needing a PhD), this episode is your map to the current AI frontier.

OpenAI
Website - https://openai.com
X/Twitter - https://x.com/OpenAI
Jerry Tworek
LinkedIn - https://www.linkedin.com/in/jerry-tworek-b5b9aa56
X/Twitter - https://x.com/millionint
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro
(01:01) What Reasoning Actually Means in AI
(02:32) Chain of Thought: Models Thinking in Words
(05:25) How Models Decide Thinking Time
(07:24) Evolution from o1 to o3 to GPT-5
(11:00) Before OpenAI: Growing up in Poland, Dropping out of School, Trading
(20:32) Working on Robotics and Rubik's Cube Solving
(23:02) A Day in the Life: Talking to Researchers
(24:06) How Research Priorities Are Determined
(26:53) Collaboration vs IP Protection at OpenAI
(29:32) Shipping Fast While Doing Deep Research
(31:52) Using OpenAI's Own Tools Daily
(32:43) Pre-Training Plus RL: The Modern AI Stack
(35:10) Reinforcement Learning 101: Training Dogs
(40:17) The Evolution of Deep Reinforcement Learning
(42:09) When GPT-4 Seemed Underwhelming at First
(45:39) How RLHF Made GPT-4 Actually Useful
(48:02) Unsupervised vs Supervised Learning
(49:59) GRPO and How DeepSeek Accelerated US Research
(53:05) What It Takes to Scale Reinforcement Learning
(55:36) Agentic AI and Long-Horizon Thinking
(59:19) Alignment as an RL Problem
(1:01:11) Winning ICPC World Finals Without Specific Training
(1:05:53) Applying RL Beyond Math and Coding
(1:09:15) The Path from Here to AGI
(1:12:23) Pure RL vs Language Models
Sholto Douglas, a top AI researcher at Anthropic, discusses the breakthroughs behind Claude Sonnet 4.5 — the world's leading coding model — and why we might be just 2-3 years from AI matching human-level performance on most computer-facing tasks.

You'll discover why RL on language models suddenly started working in 2024, how agents maintain coherency across 30-hour coding sessions through self-correction and memory systems, and why the "bitter lesson" of scale keeps proving clever priors wrong.

Sholto shares his path from top-50 world fencer to Google's Gemini team to Anthropic, explaining why great blog posts sometimes matter more than PhDs in AI research. He discusses the culture at big AI labs and why Anthropic is laser-focused on coding (it's the fastest path to both economic impact and AI-assisted AI research). Sholto also discusses how the training pipeline is still "held together by duct tape" with massive room to improve, and why every benchmark created shows continuous rapid progress with no plateau in sight.

Bold predictions: individuals will soon manage teams of AI agents working 24/7, robotics is about to experience coding-level breakthroughs, and policymakers should urgently track AI progress on real economic tasks. A clear-eyed look at where AI stands today and where it's headed in the next few years.

Anthropic
Website - https://www.anthropic.com
Twitter - https://x.com/AnthropicAI
Sholto Douglas
LinkedIn - https://www.linkedin.com/in/sholto
Twitter - https://x.com/_sholtodouglas
FIRSTMARK
Website - https://firstmark.com
Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
Twitter - https://twitter.com/mattturck

(00:00) Intro
(01:09) The Rapid Pace of AI Releases at Anthropic
(02:49) Understanding Opus, Sonnet, and Haiku Model Tiers
(04:14) Sholto's Journey: From Australian Fencer to AI Researcher
(12:01) The Growing Pool of AI Talent
(16:16) Breaking Into AI Research Without Traditional Credentials
(18:29) What "Taste" Means in AI Research
(23:05) Moving to Google and Building Gemini's Inference Stack
(25:08) How Anthropic Differs from Other AI Labs
(31:46) Why Anthropic Is Laser-Focused on Coding
(36:40) Inside a 30-Hour Autonomous Coding Session
(38:41) Examples of What AI Can Build in 30 Hours
(43:13) The Breakthroughs That Enabled 30-Hour Runs
(46:28) What's Actually Driving the Performance Gains
(47:42) Pre-Training vs. Reinforcement Learning Explained
(52:11) Test-Time Compute and the New Scaling Paradigm
(55:55) Why RL on LLMs Finally Started Working
(59:38) Are We on Track to AGI?
(01:02:05) Why the "Plateau" Narrative Is Wrong
(01:03:41) Sonnet's Performance Across Economic Sectors
(01:05:47) Preparing for a World of 10–100x Individual Leverage
The most successful enterprises are about to become autonomous — and Eléonore Crespo, Co-CEO of Pigment, is building the nervous system that makes it possible. In this conversation, Eléonore reveals how her $400 million AI platform is already running supply chains for Coca-Cola, powering finance for the hottest newly public companies like Figma and Klarna, and processing thousands of financial scenarios for Uber and Snowflake faster and more accurately than any human team ever could.

Eléonore predicts Excel will outlive most AI companies (but maybe only as a user interface, not a calculation engine), explains why she deliberately chose to build from Paris instead of Silicon Valley, and shares her contrarian take on why the AI revolution will create more CFOs, not fewer.

You'll discover why Pigment's three-agent system (Analyst, Modeler, Planner) avoids the hallucination problems plaguing other AI companies, how they achieved human-level accuracy in financial analysis, and the accelerating timeline for fully autonomous enterprise planning that will make your current workforce obsolete.

Pigment
Website - https://www.pigment.com
Twitter - https://x.com/gopigment
Eléonore Crespo
LinkedIn - https://www.linkedin.com/in/eleonorecrespo
FIRSTMARK
Website - https://firstmark.com
Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
Twitter - https://twitter.com/mattturck

(00:00) Intro
(01:22) Building Pigment: 500 Employees, $400M Raised, 60% US Revenue
(03:20) From Quantum Physics to Google to Index Ventures
(06:56) Why Being a VC Was the Perfect Founder Training Ground
(11:35) The Impatience Factor: What Makes Great Founders
(13:27) Hiring for AI Fluency in the Modern Enterprise
(14:54) Pigment's Internal AI Strategy: Committees and Guardrails
(17:30) The Three AI Agents: Analyst, Modeler, and Planner
(22:15) Why Three Agents Instead of One: Technical Architecture
(24:10) Agent Coordination: How the Supervisor Agent Works
(24:46) Real Example: Budget Variance Analysis Across 50 Products
(27:15) The Human-in-the-Loop Approach: Recommendations Not Actions
(27:36) Solving Hallucination: Why Structured Data Changes Everything
(30:08) Behind the Scenes: Verification Agents and Audit Trails
(31:57) Beyond Accuracy: Enabling the Impossible at Scale
(36:21) Will AI Finally Kill Excel? Eléonore's Contrarian Take
(38:23) The Vision: Fully Autonomous Enterprise Planning
(40:55) Real-Time Supply Chain Adaptation: The Ukraine Example
(42:20) Multi-LLM Strategy: OpenAI, Anthropic, and Partner Integration
(44:32) Token Economics: Why Pigment Isn't Token-Intensive
(48:30) Customer Adoption: Excitement vs. Change Management Challenges
(50:51) Top-Down AI Demand vs. Bottom-Up Implementation Reality
(53:08) The Reskilling Challenge: Everyone Becomes a Mini CFO
(57:38) Building a Global Company from Europe During COVID
(01:00:02) Managing a US Executive Team from Paris
(01:01:14) SI Partner Strategy: Why Boutique Firms Come Before Deloitte
(01:03:28) The $100 Billion Vision: Beyond Performance Management
(01:05:08) Success Metrics: Innovation Over Revenue
2025 has been a breakthrough year for AI video. In this episode of the MAD Podcast, Matt Turck sits down with Cristóbal Valenzuela, CEO & Co-Founder of Runway, to explore how AI is reshaping the future of filmmaking, advertising, and storytelling — faster, cheaper, and in ways that were unimaginable even a year ago.

Cris and Matt discuss:
* How AI went from memes and spaghetti clips to IMAX film festivals.
* Why Gen-4 and Aleph are game-changing models for professionals.
* How Hollywood, advertisers, and creators are adopting AI video at scale.
* The future of storytelling: what happens to human taste, craft, and creativity when anyone can conjure movies on demand?
* Runway’s journey from 2018 skeptics to today’s cutting-edge research lab.

If you want to understand the future of filmmaking, media, and creativity in the AI age, this is the episode.

Runway
Website - https://runwayml.com
X/Twitter - https://x.com/runwayml
Cristóbal Valenzuela
LinkedIn - https://www.linkedin.com/in/cvalenzuelab
X/Twitter - https://x.com/c_valenzuelab
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro – AI Video's Wild Year
(01:48) Runway's AI Film Festival Goes from Chinatown to IMAX
(04:02) Hollywood's Shift: From Ignoring AI to Adopting It at Scale
(06:38) How Runway Saves VFX Artists' Weekends of Work
(07:31) Inside Gen-4 and Aleph: Why These Models Are Game-Changers
(08:21) From Editing Tools to a "New Kind of Camera"
(10:00) Beyond Film: Gaming, Architecture, E-Commerce & Robotics Use Cases
(10:55) Why Advertising Is Adopting AI Video Faster Than Anyone Else
(11:38) How Creatives Adapt When Iteration Becomes Real-Time
(14:12) What Makes Someone Great at AI Video (Hint: No Preconceptions)
(15:28) The Early Days: Building Runway Before Generative AI Was "Real"
(20:27) Finding Early Product-Market Fit
(21:51) Balancing Research and Product Inside Runway
(24:23) Comparing Aleph vs. Gen-4, and the Future of Generalist Models
(30:36) New Input Modalities: Editing with Video + Annotations, Not Just Text
(33:46) Managing Expectations: Twitter Demos vs. Real Creative Work
(47:09) The Future: Real-Time AI Video and Fully Explorable 3D Worlds
(52:02) Runway's Business Model: From Indie Creators to Disney & Lionsgate
(57:26) Competing with the Big Labs (Sora, Google, etc.)
(59:58) Hyper-Personalized Content? Why It May Not Replace Film
(01:01:13) Advice to Founders: Treat Your Company Like a Model — Always Learning
(01:03:06) The Next 5 Years of Runway: Changing Creativity Forever
Granola is the rare AI startup that slipped into one of tech’s most crowded niches — meeting notes — and still managed to become the product founders and VCs rave about. In this episode, MAD Podcast host Matt Turck sits down with Granola co-founder & CEO Chris Pedregal to unpack how a two-person team in London turned a simple “second brain” idea into Silicon Valley’s favorite AI tool. Chris recounts a year in stealth onboarding users one by one, the 50% feature cut that unlocked simplicity, and why they refused to deploy a meeting bot or store audio even when investors said they were crazy.

We go deep on the craft of building a beloved AI product: choosing meetings (not email) as the data wedge, designing calendar-triggered habit loops, and obsessing over privacy so users trust the tool enough to outsource memory. Chris opens the hood on Granola’s tech stack — real-time ASR from Deepgram & Assembly, echo cancellation on-device, and dynamic routing across OpenAI, Anthropic and Google models — and explains why transcription, not LLM tokens, is the biggest cost driver today. He also reveals how internal eval tooling lets the team swap models overnight without breaking the “Granola voice.”

Looking ahead, Chris shares a roadmap that moves beyond notes toward a true “tool for thought”: cross-meeting insights in seconds, dynamic documents that update themselves, and eventually an AI coach that flags blind spots in your work. Whether you’re an engineer, designer, or founder figuring out your own AI strategy, this conversation is a masterclass in nailing product-market fit, trimming complexity, and future-proofing for the rapid advances still to come.

Hit play, like, and subscribe if you’re ready to learn how to build AI products people can’t live without.

Granola
Website - https://www.granola.ai
X/Twitter - https://x.com/meetgranola
Chris Pedregal
LinkedIn - https://www.linkedin.com/in/pedregal
X/Twitter - https://x.com/cjpedregal
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Introduction: The Granola Story
(01:41) Building a "Life-Changing" Product
(04:31) The "Second Brain" Vision
(06:28) Augmentation Philosophy (Engelbart), Tools That Shape Us
(09:02) Late to a Crowded Market: Why it Worked
(13:43) Two Product Founders, Zero ML PhDs
(16:01) London vs. SF: Building Outside the Valley
(19:51) One Year in Stealth: Learning Before Launch
(22:40) "Building For Us" & Finding First Users
(25:41) Key Design Choices: No Meeting Bot, No Stored Audio
(29:24) Simplicity is Hard: Cutting 50% of Features
(32:54) Intuition vs. Data in Making Product Decisions
(36:25) Continuous User Conversations: 4–6 Calls/Week
(38:06) Prioritizing the Future: Build for Tomorrow's Workflows
(40:17) Tech Stack Tour: Model Routing & Evals
(42:29) Context Windows, Costs & Inference Economics
(45:03) Audio Stack: Transcription, Noise Cancellation & Diarization Limits
(48:27) Guardrails & Citations: Building Trust in AI
(50:00) Growth Loops Without Virality Hacks
(54:54) Enterprise Compliance, Data Footprint & Liability Risk
(57:07) Retention & Habit Formation: The "500 Millisecond Window"
(58:43) Competing with OpenAI and Legacy Suites
(01:01:27) The Future: Deep Research Across Meetings & Roadmap
(01:04:41) Granola as Career Coach?
What happens when an internal hack turns into a $400 million AI rocket ship? In this episode, Matt Turck sits down with Boris Cherny, the creator of Claude Code at Anthropic, to unpack the wild story behind the fastest-growing AI coding tool on the planet.

Boris reveals how Claude Code started as a personal productivity tool, only to become Anthropic’s secret weapon — now used by nearly every engineer at the company and rapidly spreading across the industry. You’ll hear how Claude Code’s “agentic” approach lets AI not just suggest code, but actually plan, edit, debug, and even manage entire projects — sometimes with a whole fleet of subagents working in parallel.

We go deep on why Claude Code runs in the terminal (and why that’s a feature, not a bug), how its Claude.md memory files let teams build a living, shareable knowledge base, and why safety and human-in-the-loop controls are baked into every action. Boris shares real stories of onboarding times dropping from weeks to days, and how even non-coders are hacking Claude Code for everything from note-taking to business metrics.

Anthropic
Website - https://www.anthropic.com
X/Twitter - https://x.com/AnthropicAI
Boris Cherny
LinkedIn - https://www.linkedin.com/in/bcherny
X/Twitter - https://x.com/bcherny
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro
(01:15) Did You Expect Claude Code’s Success?
(04:22) How Claude Code Works and Origins
(08:05) Command Line vs IDE: Why Start Claude Code in the Terminal?
(11:31) The Evolution of Programming: From Punch Cards to Agents
(13:20) Product Follows Model: Simple Interfaces and Fast Evolution
(15:17) Who Is Claude Code For? (Engineers, Designers, PMs & More)
(17:46) What Can Claude Code Actually Do? (Actions & Capabilities)
(21:14) Agentic Actions, Subagents, and Workflows
(25:30) Claude Code’s Awareness, Memory, and Knowledge Sharing
(33:28) Model Context Protocol (MCP) and Customization
(35:30) Safety, Human Oversight, and Enterprise Considerations
(38:10) UX/UI: Making Claude Code Useful and Enjoyable
(40:44) Pricing for Power Users and Subscription Models
(43:36) Real-World Use Cases: Debugging, Testing, and More
(46:44) How Does Claude Code Transform Onboarding?
(49:36) The Future of Coding: Agents, Teams, and Collaboration
(54:11) The AI Coding Wars: Competition & Ecosystem
(57:27) The Future of Coding as a Profession
(58:41) What’s Next for Claude Code
What if your company had a digital brain that never forgot, always knew the answer, and could instantly tap the knowledge of your best engineers, even after they left? Superintelligence can feel like a hand‑wavy pipe‑dream — yet, as Misha Laskin argues, it becomes a tractable engineering problem once you scope it to the enterprise level. Former DeepMind researcher Laskin is betting on an oracle‑like AI that grasps every repo, Jira ticket and hallway aside as deeply as your principal engineer—and he’s building it at Reflection AI.

In this wide‑ranging conversation, Misha explains why coding is the fastest on‑ramp to superintelligence, how “organizational” beats “general” when real work is on the line, and why today’s retrieval‑augmented generation (RAG) feels like “exploring a jungle with a flashlight.” He walks us through Asimov, Reflection’s newly unveiled code‑research agent that fuses long‑context search, team‑wide memory and multi‑agent planning so developers spend less time spelunking for context and more time shipping.

We also rewind his unlikely journey—from physics prodigy in a Manhattan‑Project desert town, to Berkeley’s AI crucible, to leading RLHF for Google Gemini—before he left big‑lab comfort to chase a sharper vision of enterprise superintelligence. Along the way: the four breakthroughs that unlocked modern AI, why capital efficiency still matters in the GPU arms‑race, and how small teams can lure top talent away from nine‑figure offers.

If you’re curious about the next phase of AI agents, the future of developer tooling, or the gritty realities of scaling a frontier‑level startup—this episode is your blueprint.

Reflection AI
Website - https://reflection.ai
LinkedIn - https://www.linkedin.com/company/reflectionai

Misha Laskin
LinkedIn - https://www.linkedin.com/in/mishalaskin
X/Twitter - https://x.com/mishalaskin

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:42) Reflection AI: Company Origins and Mission (04:14) Making Superintelligence Concrete (06:04) Superintelligence vs. AGI: Why the Goalposts Moved (07:55) Organizational Superintelligence as an Oracle (12:05) Coding as the Shortcut: Hands, Legs & Brain for AI (16:00) Building the Context Engine (20:55) Capturing Tribal Knowledge in Organizations (26:31) Introducing Asimov: A Deep Code Research Agent (28:44) Team-Wide Memory: Preserving Institutional Knowledge (33:07) Multi-Agent Design for Deep Code Understanding (34:48) Data Retrieval and Integration in Asimov (38:13) Enterprise-Ready: VPC and On-Prem Deployments (39:41) Reinforcement Learning in Asimov's Development (41:04) Misha's Journey: From Physics to AI (42:06) Growing Up in a Science-Driven Desert Town (53:03) Building General Agents at DeepMind (56:57) Founding Reflection AI After DeepMind (58:54) Product-Driven Superintelligence: Why It Matters (01:02:22) The State of Autonomous Coding Agents (01:04:26) What's Next for Reflection AI
Agentic commerce is no longer science fiction — it’s arriving in your browser, your development IDE, and soon, your bank statement. In this episode of The MAD Podcast, Matt Turck sits down with Emily Glassberg Sands, Stripe’s Head of Information, to explore how autonomous “buying bots” and the Model Context Protocol (MCP) are reshaping the very mechanics of online transactions. Emily explains why intent, not clicks, will become the primary interface for shopping and how Stripe’s rails are adapting for tokens, one-time virtual cards, and real-time risk scoring that can tell good bots from bad ones in milliseconds.

We also go deep into Stripe's strategic AI choices. Drawing on $1.4 trillion in annual payment flow—1.3 percent of global GDP—Stripe decided to train its own payments foundation model, turning tens of billions of historical charges into embeddings that boost fraud-catch recall from 59 percent to 97 percent. Emily walks us through the tech: why they chose a BERT encoder over GPT-style decoders, how three MLEs in a “research bubble” birthed the model, and what it takes to run it in production with five-nines reliability and tight latency budgets.

We zoom out to Stripe’s unique vantage point on the broader AI economy. Their data shows the top AI startups hitting $30 million in ARR three times faster than the fastest SaaS companies did a decade ago, with more than half of that revenue already coming from overseas markets. Emily unpacks the new billing playbook—usage-based pricing today, outcome-based pricing tomorrow—and explains why tiny teams of 20–30 people can now build global, vertically focused AI businesses almost overnight.

Stripe
Website - https://stripe.com
X/Twitter - https://x.com/stripe

Emily Glassberg Sands
LinkedIn - https://www.linkedin.com/in/egsands
X/Twitter - https://x.com/emilygsands

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:45) How Big Is Stripe? Latest Stats Revealed (04:06) What Does “Head of Information” at Stripe Actually Do? (05:43) From Harvard to Stripe: Emily’s Unusual Journey (08:54) Why Stripe Built Its Own Foundation Model (13:19) Cracking the Code: How Stripe Handles Complex Payment Data (16:25) Foundation Model vs. Traditional ML: What’s Winning? (20:09) Inside Stripe’s Foundation Model: How It Was Built (24:35) How Stripe Makes AI Decisions Transparent (28:38) Where Stripe Uses AI (And Where It Doesn’t) (34:10) How Stripe’s AI Drives Revenue for Businesses (41:22) Real-Time Fraud Detection: Stripe’s Secret Sauce (42:51) The Future of Shopping: AI Agents & Agentic Commerce (46:20) How Agentic Commerce Is Changing Stripe (49:36) Stripe’s Vision for a World of AI-Powered Buyers (55:46) What Is MCP? Stripe’s Take on Agent-to-Agent Protocols (59:31) Stripe’s Data on AI Startups Monetizing 3× Faster (01:03:03) How AI Companies Go Global — From Day One (01:07:48) The New Rules: Billing & Pricing for AI Startups (01:10:57) How Stripe Builds AI Literacy Across the Company (01:14:05) Roadmap: Risk-as-a-Service, Order Intent, and Beyond
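The fraud-detection idea described in this episode — embedding historical charges and scoring new ones against what the model has seen — can be sketched in miniature. Everything below (the hashed "embedding", the nearest-centroid scorer, the toy transaction tokens) is invented for illustration; Stripe's actual system is a BERT-style encoder trained on real payment data, not this toy.

```python
# Toy sketch of embedding-based fraud scoring. NOT Stripe's model:
# the hashed embedding and nearest-centroid scorer are stand-ins
# invented for illustration.
import hashlib
import math

def embed(charge_tokens, dim=16):
    """Map a charge (categorical tokens: country, card type, merchant
    category) to a fixed-size unit vector via feature hashing."""
    vec = [0.0] * dim
    for tok in charge_tokens:
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0 if (h >> 64) % 2 == 0 else -1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def centroid(vectors):
    """Mean vector of a set of embeddings."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def fraud_score(charge_tokens, fraud_centroid, legit_centroid):
    """Score in [0, 1]: closer to the fraud centroid -> higher score."""
    v = embed(charge_tokens)
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    d_f, d_l = dist(v, fraud_centroid), dist(v, legit_centroid)
    return d_l / (d_f + d_l + 1e-9)

# Toy "historical charges": token lists standing in for transaction features.
fraud = [["country:XX", "card:prepaid", "mcc:7995"],
         ["country:XX", "card:prepaid", "mcc:4829"]]
legit = [["country:US", "card:debit", "mcc:5411"],
         ["country:US", "card:credit", "mcc:5812"]]

f_c = centroid([embed(c) for c in fraud])
l_c = centroid([embed(c) for c in legit])
score = fraud_score(["country:XX", "card:prepaid", "mcc:7995"], f_c, l_c)
```

The real gain Emily describes comes from replacing the hashed features with learned encoder embeddings, so that charges that behave alike land near each other even when their raw fields differ.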
Welcome to a special FirstMark Deep Dive edition of the MAD Podcast. In this episode, Matt Turck and David Waltcher unpack the explosive impact of generative AI on engineering — hands-down the biggest shift the field has seen in decades. You’ll get a front-row seat to the real numbers and stories behind the AI code revolution, including how companies like Cursor hit a $500M valuation in record time, and why GitHub Copilot now serves 15 million developers.

Matt and David break down the six trends that shaped the last 20 years of developer tools, and reveal why coding is the #1 use case for generative AI (hint: it’s all about public data, structure, and ROI). You’ll hear how AI is making engineering teams 30-50% faster, but also why this speed is breaking traditional DevOps, overwhelming QA, and turning top engineers into full-time code reviewers.

We get specific: 82% of engineers are already using AI to write code, but this surge is creating new security vulnerabilities, reliability issues, and a total rethink of team roles. You’ll learn why code review and prompt engineering are now the most valuable skills, and why computer science grads are suddenly facing some of the highest unemployment rates.

We also draw wild historical parallels—from the Gutenberg Press to the Ford assembly line—to show how every productivity boom creates new problems and entire industries to solve them. Plus: what CTOs need to know about hiring, governance, and architecture in the AI era, and why being “AI native” can make a startup more credible than a 10-year-old giant.

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

David Waltcher
LinkedIn - https://www.linkedin.com/in/davidwaltcher
X/Twitter - https://x.com/davidwaltcher

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) Intro & episode setup (01:50) The 6 waves that led to GenAI engineering (04:30) Why coding is such fertile ground for Generative AI (08:25) Break-out dev-tool winners: Cursor, Copilot, Replit, V0 (11:25) Early stats: Teams Are Shipping Code Faster with AI (13:32) Copilots vs Autonomous Agents: The Current Reality (14:14) Lessons from History: Every Tech Boom Creates New Problems (21:53) FirstMark Survey: The Headaches AI Is Creating for Developers (22:53) What’s Now Breaking: Security, CI/CD flakes, QA Overload (29:16) The New CTO Playbook to Adapt to the AI Revolution (33:23) What Happens to Engineering Orgs if Everyone is a Coder? (40:19) Founder opportunities & the dev-tool halo effect (44:24) The Built-in Credibility of AI-Native Startups (46:16) The Irony of Dev Tools As Biggest Winners in the AI Gold Rush (47:43) What’s Next for AI and Engineering?
In this episode, Vercel CEO Guillermo Rauch goes deep on how V0, their text-to-app platform, has already generated over 100 million applications and doubled Vercel’s user base in under a year.

Guillermo reveals how a tiny SWAT team inside Vercel built V0 from scratch, why “vibe coding” is making software creation accessible to everyone (not just engineers), and how the AI Cloud is automating DevOps, making cloud infrastructure self-healing, and letting companies expose their data to AI agents in just five lines of code.

You’ll hear why “every company will have to rethink itself as a token factory,” how Vercel’s Next.js went from a conference joke to powering Walmart, Nike, and Midjourney, and why the next billion app creators might not write a single line of code. Guillermo breaks down the difference between vibe coding and agentic engineering, shares wild stories of users building apps from napkin sketches, and explains how Vercel is infusing “taste” and best practices directly into their AI models.

We also dig into the business side: how Vercel’s AI-powered products are driving explosive growth, why retention and margins are strong, and how the company is adapting to a new wave of non-technical users. Plus: the future of MCP servers, the security challenges of agent-to-agent communication, and why prompting and AI literacy are now must-have skills.

Vercel
Website - https://vercel.com
X/Twitter - https://x.com/vercel

Guillermo Rauch
LinkedIn - https://www.linkedin.com/in/rauchg
X/Twitter - https://x.com/rauchg

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (02:08) What Is V0 and Why Did It Take Off So Fast? (04:10) How Did a Tiny Team Build V0 So Quickly? (07:51) V0 vs Other AI Coding Tools (10:35) What is Vibe Coding? (17:05) Is V0 Just Frontend? Moving Toward Full Stack and Integrations (19:40) What Skills Make a Great Vibe Coder? (23:35) Vibe Coding as the GUI for AI: The Future of Interfaces (29:46) Developer Love = Agent Love (33:41) Having Taste as a Developer (39:10) MCP Servers: The New Protocol for AI-to-AI Communication (43:11) Security, Observability, and the Risks of the Agentic Web (45:25) Are Enterprises Ready for the Agentic Future? (49:42) Closing the Feedback Loop: Customer Service and Product Evolution (56:06) The Vercel AI Cloud: From Pixels to Tokens (01:10:14) How Vercel Adapts to a Changing ICP (01:13:47) Retention, Margins, and the Business of AI Products (01:16:51) The Secret Behind Vercel’s Growth Last Year (01:24:15) The Importance of Online Presence (01:30:49) Everything, Everywhere, All at Once: Being CEO 101 (01:34:59) Guillermo's Advice to His Younger Self
Canva just announced $3 billion in ARR, 230 million monthly active users, and 24 million paying subscribers—including 95% of the Fortune 500. Even more impressive? They’ve been profitable for seven years while growing at 40–50% per year. In this episode, Canva’s Head of Engineering, Brendan Humphreys, reveals how he went from employee #12 to leading 2,300 engineers across continents, and why Canva’s “pragmatic excellence” lets them ship AI features at breakneck speed—like launching Canva Code to 100 million users in just three months.

Brendan shares the story of Canva’s AI journey: building an in-house ML team back in 2017, acquiring visual AI startups like Kaleido and Leonardo AI, and why they use a hybrid of OpenAI, Anthropic, Google, and their own foundation models. He explains how Canva’s App Store gives niche AI startups instant access to millions, and why their $200M Creator Fund is designed to reward contributors in the AI era. You’ll also hear how AI tools like Copilot are making Canva’s senior engineers 30% more productive, why “vibe coding” isn’t ready for prime time, and the unique challenges of onboarding junior engineers in an AI-driven world.

We also dig into Canva’s approach to technical debt, scaling from 12 to 5,000 employees, and why empathy is a core engineering skill at Canva.

Canva
Website - https://www.canva.com
X/Twitter - https://x.com/canva

Brendan Humphreys
LinkedIn - https://www.linkedin.com/in/brendanhumphreys
X/Twitter - https://x.com/brendanh

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:14) Canva’s Mind-Blowing Growth and Profitable Journey (03:41) Why Brendan Left Atlassian to Join a Tiny Startup (06:17) What Being a Founder Taught Brendan About Leadership (07:24) Growing with Canva: From 12 Employees to 2,300 Engineers (10:02) How Canva Runs a Global Team from Sydney to Europe (13:16) Is AI a Threat or a Superpower for Canva? (15:22) The Real Story Behind Canva’s AI and Machine Learning Team (17:23) How Canva Ships New AI Features So Fast (19:19) A Tour of Canva’s Latest AI-Powered Products (21:03) From Design Tool to All-in-One Productivity Platform (26:21) Keeping Up the Pace: How Canva Moves So Quickly (30:22) The Future: AI Agents, Copilots, and Smarter Workflows (33:14) How AI Tools Are Changing the Way Engineers Work (35:47) Rethinking Hiring and Training in the Age of AI (37:01) Why Empathy Matters in Engineering at Canva (39:41) Building vs. Buying: How Canva Chooses Its AI Tech (41:23) Lessons Learned: Technical Debt and Scaling Pains (51:18) Shipping Fast Without Breaking Things (53:08) What’s Next: AI Video, New Features, and Big Ambitions
AI coding is in full-blown gold-rush mode, and GitHub sits at the epicenter. In this episode, GitHub CEO Thomas Dohmke tells Matt Turck how a $7.5B acquisition in 2018 became a $2B ARR rocket ship, and reveals how Copilot was born from a secret AI strategy years before anyone else saw the opportunity.

We dig into the dizzying pace of AI innovation: why developer tools are suddenly the fastest-growing startups in history, how GitHub’s multi-model approach (OpenAI, Anthropic Claude 4, Gemini 2.5, and even local LLMs) gives you more choice and speed, and why fine-tuning models might be overrated. Thomas explains how Copilot keeps you in the “magic flow state,” and how even middle schoolers are using it to hack Minecraft.

The conversation then zooms out to the competitive battlefield: Cursor’s $10B valuation, Mistral’s new code model, and a wave of AI-native IDE forks vying for developer mind-share. We discuss why 2025’s “coding agents” could soon handle 90% of the world’s code, the survival of SaaS, and why the future of coding is about managing agents, not just writing code.

GitHub
Website - https://github.com/
X/Twitter - https://x.com/github

Thomas Dohmke
LinkedIn - https://www.linkedin.com/in/ashtom
X/Twitter - https://twitter.com/ashtom

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:50) Why AI Coding Is Ground Zero for Generative AI (02:40) The $7.5B GitHub Acquisition: Microsoft’s Strategic Play (06:21) GitHub’s Role in the Azure Cloud Ecosystem (10:25) How GitHub Copilot Beat Everyone to Market (16:09) Copilot & VS Code Explained for Non-Developers (21:02) GitHub Models: Multi-Model Choice and What It Means (25:31) The Reality of Fine-Tuning AI Models for Enterprise (29:13) The Dizzying Pace and Political Economy of AI Coding Tools (36:58) Competing and Partnering: Microsoft’s Unique AI Strategy (41:29) Does Microsoft Limit Copilot’s AI-Native Potential? (46:44) The Bull and Bear Case for AI-Native IDEs Like Cursor (52:09) Agent Mode: The Next Step for AI-Powered Coding (01:00:10) How AI Coding Will Change SaaS and Developer Skills
What really happened inside Google Brain when the “Attention is All You Need” paper was born? In this episode, Aidan Gomez — one of the eight co-authors of the Transformers paper and now CEO of Cohere — reveals the behind-the-scenes story of how a cold email and a lucky administrative mistake landed him at the center of the AI revolution.

Aidan shares how a group of researchers, given total academic freedom, accidentally stumbled into one of the most important breakthroughs in AI history — and why the architecture they created still powers everything from ChatGPT to Google Search today.

We dig into why synthetic data is now the secret sauce behind the world’s best AI models, and how Cohere is using it to build enterprise AI that’s more secure, private, and customizable than anything else on the market. Aidan explains why he’s not interested in “building God” or chasing AGI hype, and why he believes the real impact of AI will be in making work more productive, not replacing humans.

You’ll also get a candid look at the realities of building an AI company for the enterprise: from deploying models on-prem and air-gapped for banks and telecoms, to the surprising demand for multimodal and multilingual AI in Japan and Korea, to the practical challenges of helping customers identify and execute on hundreds of use cases.

Cohere
Website - https://cohere.com
X/Twitter - https://x.com/cohere

Aidan Gomez
LinkedIn - https://ca.linkedin.com/in/aidangomez
X/Twitter - https://x.com/aidangomez

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (02:00) The Story Behind the Transformers Paper (03:09) How a Cold Email Landed Aidan at Google Brain (10:39) The Initial Reception to the Transformers Breakthrough (11:13) Google’s Response to the Transformer Architecture (12:16) The Staying Power of Transformers in AI (13:55) Emerging Alternatives to Transformer Architectures (15:45) The Significance of Reasoning in Modern AI (18:09) The Untapped Potential of Reasoning Models (24:04) Aidan’s Path After the Transformers Paper and the Founding of Cohere (25:16) Choosing Enterprise AI Over AGI Labs (26:55) Aidan’s Perspective on AGI and Superintelligence (28:37) The Trajectory Toward Human-Level AI (30:58) Transitioning from Researcher to CEO (33:27) Cohere’s Product and Platform Architecture (37:16) The Role of Synthetic Data in AI (39:32) Custom vs. General AI Models at Cohere (42:23) The AYA Models and Cohere Labs Explained (44:11) Enterprise Demand for Multimodal AI (49:20) On-Prem vs. Cloud (50:31) Cohere’s North Platform (54:25) How Enterprises Identify and Implement AI Use Cases (57:49) The Competitive Edge of Early AI Adoption (01:00:08) Aidan’s Concerns About AI and Society (01:01:30) Cohere’s Vision for Success in the Next 3–5 Years
What if the smartest people in finance and law never had to do “stupid tasks” again? In this episode, we sit down with George Sivulka, founder of Hebbia, the AI company quietly powering 50% of the world’s largest asset managers and some of the fastest-growing law firms. George reveals how Hebbia’s Matrix platform is automating the equivalent of 50,000 years of human reading — every year — and why the future of work is hybrid teams of humans and AI “agent employees.”

You’ll get the inside story on how Hebbia went from a stealth project at Stanford to a multinational company trusted by the Department of Defense, and why their spreadsheet-inspired interface is leaving chatbots in the dust. George breaks down the technical secrets behind Hebbia’s ISD architecture (and why they killed RAG), how they process billions of pages with near-zero hallucinations, and what it really takes to sell AI into the world’s most regulated industries.

We also dive into the future of organizational design, why generalization beats specialization in AI, and how “prompting is the new management skill.” Plus: the real story behind AI hallucinations, the myth of job loss, and why naiveté might be the ultimate founder superpower.

Hebbia
Website - https://www.hebbia.com
Twitter - https://x.com/HebbiaAI

George Sivulka
LinkedIn - https://www.linkedin.com/in/sivulka
Twitter - https://x.com/gsivulka

FIRSTMARK
Website - https://firstmark.com
Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
Twitter - https://twitter.com/mattturck

(00:00) Intro (01:46) What is Hebbia (02:49) Evolving Hebbia’s mission (04:45) The founding story and Stanford's inspiration (09:45) The rise of agent employees and AI in organizations (12:36) The future of AI-powered work (15:17) AI research trends (19:49) Inside Matrix: Hebbia’s flagship AI platform (24:02) Why Hebbia isn’t just another chatbot (28:27) Moving beyond RAG: Hebbia’s unique architecture (34:10) Tackling hallucinations in high-stakes AI (35:59) Research culture and avoiding industry groupthink (39:40) Innovating go-to-market and enterprise sales (41:57) Real-world value: Cost savings and new revenue (43:49) How AI is changing junior roles (45:55) Leadership and perspective as a young founder (47:16) Hebbia’s roadmap: Success in the next 3 years
What if the “AI revolution” is actually… stuck in the messy middle? In this episode, Benedict Evans returns to tackle the big question we left hanging a year ago: Is AI a true paradigm shift, or just another tech platform shift like mobile or cloud? One year later, the answer is more complicated — and more revealing — than anyone expected.

Benedict pulls back the curtain on why, despite all the hype and model upgrades, the core LLMs are starting to look like commodities. We dig into the real battlegrounds: distribution, brand, and the race to build sticky applications. Why is ChatGPT still topping the App Store charts while Perplexity and Claude barely register outside Silicon Valley? Why did OpenAI just hire a CEO of Applications, and what does that signal about the future of AI products?

We go deep on the “probabilistic” nature of LLMs, why error rates are still the elephant in the room, the future of consumer AI (is there a killer app beyond chatbots and image generators?), the impact of generative content on e-commerce and advertising, and whether “AI agents” are the next big thing — or just another overhyped demo.

And we ask: What happened to AI doomerism? Why did the existential risk debate suddenly vanish, and what risks should we actually care about?

Benedict Evans
LinkedIn - https://www.linkedin.com/in/benedictevans
Threads - https://www.threads.net/@benedictevans

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:47) Is AI a Platform Shift or a Paradigm Shift? (07:21) Error Rates and Trust in AI (15:07) Adapting to AI’s Capabilities (19:18) Generational Shifts in AI Usage (22:10) The Commoditization of AI Models (27:02) Are Brand and Distribution the Real Moats in AI? (29:38) OpenAI: Research Lab or Application Company? (33:26) Big Tech’s AI Strategies: Apple, Google, Meta, AWS (39:00) AI and Search: Is ChatGPT a Search Engine? (42:41) Consumer AI Apps: Where’s the Breakout? (45:51) The Need for a GUI for AI (48:38) Generative AI in Social and Content (51:02) The Business Model of AI: Ads, Memory, and Moats (55:26) Enterprise AI: SaaS, Pilots, and Adoption (01:00:08) The Future of AI in Business (01:05:11) Infinite Content, Infinite SKUs: AI and E-commerce (01:09:42) Doomerism, Risks, and the Future of AI
What happens when you try to build the “General Electric of AI” with just 14 people? In this episode, Jeremy Howard reveals the radical inside story of Answer AI — a new kind of AI R&D lab that’s not chasing AGI, but instead aims to ship thousands of real-world products, all while staying tiny, open, and mission-driven.

Jeremy shares how open-source models like DeepSeek and Qwen are quietly outpacing closed-source giants, and why the best new AI is coming out of China. You’ll hear the surprising truth about the so-called “DeepSeek moment,” why efficiency and cost are the real battlegrounds in AI, and how Answer AI’s “dialogue engineering” approach is already changing lives—sometimes literally.

We go deep on the tools and systems powering Answer AI’s insane product velocity, including Solve It (the platform that’s helped users land jobs and launch startups), Shell Sage (AI in your terminal), and Fast HTML (a new way to build web apps in pure Python). Jeremy also opens up about his unconventional path from philosophy major and computer game enthusiast to world-class AI scientist, and why he believes the future belongs to small, nimble teams who build for societal benefit, not just profit.

Fast.ai
Website - https://www.fast.ai
X/Twitter - https://twitter.com/fastdotai

Answer.ai
Website - https://www.answer.ai/
X/Twitter - https://x.com/answerdotai

Jeremy Howard
LinkedIn - https://linkedin.com/in/howardjeremy
X/Twitter - https://x.com/jeremyphoward

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro (01:39) Highlights and takeaways from ICLR Singapore (02:39) Current state of open-source AI (03:45) Thoughts on Microsoft Phi and open source moves (05:41) Responding to OpenAI’s open source announcements (06:29) The real impact of the Deepseek ‘moment’ (09:02) Progress and promise in test-time compute (10:53) Where we really stand on AGI and ASI (15:05) Jeremy’s journey from philosophy to AI (20:07) Becoming a Kaggle champion and starting Fast.ai (23:04) Answer.ai mission and unique vision (28:15) Answer.ai’s business model and early monetization (29:33) How a small team at Answer.ai ships so fast (30:25) Why the Devin AI agent isn't that great (33:10) The future of autonomous agents in AI development (34:43) Dialogue Engineering and Solve It (43:54) How Answer.ai decides which projects to build (49:47) Future of Answer.ai: staying small while scaling impact
InfluxDB just dropped its biggest update ever — InfluxDB 3.0 — and in this episode, we go deep with the team behind the world’s most popular open-source time series database. You’ll hear the inside story of how InfluxDB grew from 3,000 users in 2015 to over 1.3 million today, and why the company decided to rewrite its entire architecture from scratch in Rust, ditching Go and moving to object storage on S3.

We break down the real technical challenges that forced this radical shift: the “cardinality problem” that choked performance, the pain of linking compute and storage, and why their custom query language (Flux) failed to catch on, leading to a humbling embrace of SQL as the industry standard. You’ll learn how InfluxDB is positioning itself in a world dominated by Databricks and Snowflake, and the hard lessons learned about monetization when 1.3 million users only yield 2,600 paying customers.

InfluxData
Website - https://www.influxdata.com
X/Twitter - https://twitter.com/InfluxDB

Evan Kaplan
LinkedIn - https://www.linkedin.com/in/kaplanevan
X/Twitter - https://x.com/evankaplan

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

Foursquare
Website - https://foursquare.com
X/Twitter - https://x.com/Foursquare
IG - instagram.com/foursquare

(00:00) Intro (02:22) The InfluxDB origin story and why time series matters (06:59) The cardinality crisis and why Influx rebuilt in Rust (09:26) Why SQL won (and Flux lost) (16:34) Why InfluxData bets on FDAP (22:51) IoT, Tesla Powerwalls, and real-time control systems (27:54) Competing with Databricks, Snowflake, and the “lakehouse” world (31:50) Open Source lessons, monetization, & what’s next
Sigma Computing recently hit $100M in ARR — planning on doubling revenue again this year — and in this episode, CEO Mike Palmer reveals exactly how they did it by throwing out the old BI playbook. We open with the provocative claim that “the world did not need another BI tool,” and dig into why the last 20 years of business intelligence have been “boring.” He explains how Sigma’s spreadsheet-like interface lets anyone analyze billions of rows in seconds, and lives on top of Snowflake and Databricks, with no SQL required and no data extractions.

Mike shares the inside story of Sigma’s journey: why they shut down their original product to rebuild from scratch, how Sutter Hill Ventures’ unique incubation model shaped the company, and what it took to go from $2M to $100M ARR in just three years and raise a $200M round — even as the growth stage VC market dried up. We get into the technical details behind Sigma’s architecture: no caching, no federated queries, and real-time, Google Sheets-style collaboration at massive scale—features that have convinced giants like JP Morgan and ExxonMobil to ditch legacy dashboards for good.

We also tackle the future of BI and the modern data stack: why 99.99% of enterprise data is never touched, what’s about to happen as the stack consolidates, and why Mike thinks “text-to-SQL” AI is a “terrible idea.” This episode is full of "spicy takes" - Mike shares his thoughts on how Google missed the zeitgeist, the reality behind Microsoft Fabric, when engineering hubris leads to failure, and much more.

Sigma
Website - https://www.sigmacomputing.com
X/Twitter - https://x.com/sigmacomputing

Mike Palmer
LinkedIn - https://www.linkedin.com/in/mike-palmer-51a154

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

Foursquare
Website - https://foursquare.com
X/Twitter - https://x.com/Foursquare
IG - instagram.com/foursquare

(00:00) Intro (01:46) Why traditional BI is boring (04:15) What is business intelligence? (06:03) Classic BI roles and frustrations (07:09) Sigma’s origin story: Sutter Hill & the Snowflake echo (09:02) The spreadsheet problem: why nothing changed since 1985 (14:04) Rebooting the product during lockdown (16:14) Building a spreadsheet UX on top of Snowflake/Databricks (18:55) No caching, no federation: Sigma’s architectural choices (20:28) Spreadsheet interface at scale (21:32) Collaboration and real-time data workflows (24:15) Semantic layers, data governance & trillion-row performance (25:57) The modern data stack: fragmentation and consolidation (28:38) Democratizing data (29:36) Will hyperscalers own the data stack? (34:12) AI, natural language, and the limits of text-to-SQL
A week after OpenAI’s o3/o4-mini volleyed with Google’s Gemini 2.5 Flash, I sat down with Arvind Jain— ex-Google search luminary, Rubrik co-founder, and now CEO of Glean —just as his company released its agentic reasoning platform and swirled with rumors of a new round at a $7 billion valuation. We open on that whirlwind: why the model race is accelerating, why enterprises still gravitate to closed models, and when open-source variants finally take over. Arvind argues that LLMs should “fade into the background,” leaving application builders to pick the right engine for each task.From there, we trace Glean’s three-act arc—enterprise search powered by transformers (2019), retrieval-augmented chat the moment ChatGPT hit, and now agents that have already logged 50 million real actions inside Glean enterprise customers. Arvind lifts the hood on permission-aware ranking, tool-use orchestration, and the routing layer that swaps Gemini for GPT on the fly. Along the way, he answers the hard questions: Do agents really double efficiency? Where’s the moat when every startup promises the same? Why are humans still in the review loop, and for how long?The conversation crescendos with a vision of work where every employee is flanked by a team of proactive AI coworkers—all drawing from a horizontal knowledge layer that knows the firm’s language better than any newcomer. 
If you want to know what’s actually working with AI in the enterprise, how to build agents that deliver ROI, and what the next era of work will look like, this episode is packed with specifics, technical insights, and bold predictions from one of the sharpest minds in the space. Glean Website - https://www.glean.com X/Twitter - https://x.com/gleanai Arvind Jain LinkedIn - https://www.linkedin.com/in/jain-arvind X/Twitter - https://x.com/jainarvind FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro & Glean’s $7B valuation rumor (02:01) The AI model explosion: open vs. closed in the enterprise (06:19) Why enterprises choose open source AI (and when) (10:33) The agent era: what are AI agents and why now? (12:41) Automating business processes: real-world agent use cases (16:46) Are we there yet? The reality of AI agents in 2025 (19:24) Glean’s origin story: reinventing enterprise search (26:38) Glean agents: from apps to agentic platforms (31:22) Horizontal vs. vertical: Glean’s strategic platform choice (34:14) How Glean’s enterprise search works (39:34) Staying LLM-agnostic: integrating new AI models (42:11) The architecture of Glean agents: tool use and beyond (43:50) Data flywheels and personalization in Glean (47:06) Moats, competition, and the future of work with AI agents
In this episode, we sit down with Aaron Levie, CEO and co-founder of Box, for a wide-ranging conversation that’s equal parts insightful, technical, and fun. We kick things off with a candid discussion about what it’s like to be a public company CEO during times of volatility, and then rewind to the early days of Box — from dorm room experiments to cold emailing Mark Cuban and dropping out of college. From there, we dive deep into how AI is transforming the enterprise. Aaron shares how Box is layering AI agents, RAG systems, and model orchestration on top of decades of enterprise content infrastructure — and why “95% of enterprise data is underutilized.” We explore what’s actually working with AI in production, what’s still breaking, and how companies can avoid common pitfalls. From building hubs for document-specific RAG to thinking through agent-to-agent interoperability, Aaron unpacks the architecture of Box’s AI platform — and why they’re staying out of the model training wars entirely. We also dig into AI culture inside large organizations, the trade-offs of going public, and why Levie believes every enterprise interface is about to change. Whether you're a founder, engineer, enterprise buyer, or just trying to figure out how AI agents will reshape knowledge work, this conversation is full of practical insights and candid takes from one of the sharpest minds in tech. Box Website - https://www.box.com X/Twitter - https://twitter.com/Box Aaron Levie LinkedIn - https://www.linkedin.com/in/boxaaron X/Twitter - https://x.com/levie FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:51) Navigating uncertainty as a public company CEO (14:48) The Box origin story: college, cold emails, and Mark Cuban (23:39) Cloud transformation vs. the AI wave (30:15) The reality of AI in the enterprise: proof of concept vs. deployment (34:37) Inside Box’s AI platform: Hubs, agents, and more (44:15) Why Box won’t build its own model (and the dangers of fine-tuning) (51:51) What’s working — and what’s not — with AI agents (1:04:42) Building an AI culture at Box (1:13:22) The future of enterprise software and Box’s roadmap
In this episode, we sit down with Sridhar Ramaswamy, CEO of Snowflake, for an in-depth conversation about the company’s transformation from a cloud analytics platform into a comprehensive AI data cloud. Sridhar shares insights on Snowflake’s shift toward open formats like Apache Iceberg and why monetizing storage was, in his view, a strategic misstep. We also dive into Snowflake’s growing AI capabilities, including tools like Cortex Analyst and Cortex Search, and discuss how the company scaled AI deployments at an impressive pace. Sridhar reflects on lessons from his previous startup, Neeva, and offers candid thoughts on the search landscape, the future of BI tools, real-time analytics, and why partnering with OpenAI and Anthropic made more sense than building Snowflake’s own foundation models. Snowflake Website - https://www.snowflake.com X/Twitter - https://x.com/snowflakedb Sridhar Ramaswamy LinkedIn - https://www.linkedin.com/in/sridhar-ramaswamy X/Twitter - https://x.com/RamaswmySridhar FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro and current market tumult (02:48) The evolution of Snowflake from IPO to today (07:22) Why Snowflake’s earliest adopters came from financial services (15:33) Resistance to change and the philosophical gap between structured data and AI (17:12) What is the AI Data Cloud? (23:15) Snowflake’s AI agents: Cortex Search and Cortex Analyst (25:03) How did Sridhar’s experience at Google and Neeva shape his product vision? (29:43) Was Neeva simply ahead of its time? (38:37) The Epiphany mafia (40:08) The current state of search and Google’s conundrum (46:45) “There’s no AI strategy without a data strategy” (56:49) Embracing Open Data Formats with Iceberg (01:01:45) The Modern Data Stack and the future of BI (01:08:22) The role of real-time data (01:11:44) Current state of enterprise AI: from PoCs to production (01:17:54) Building your own models vs. using foundation models (01:19:47) DeepSeek and open source AI (01:21:17) Snowflake’s 1M Minds program (01:21:51) Snowflake AI Hub
In this fascinating episode, we dive deep into the race towards true AI intelligence, AGI benchmarks, test-time adaptation, and program synthesis with star AI researcher (and philosopher) François Chollet, creator of Keras and the ARC-AGI benchmark, and Mike Knoop, co-founder of Zapier and now co-founder with François of both the ARC Prize and the research lab Ndea. With the launch of ARC Prize 2025 and ARC-AGI 2, they explain why existing LLMs fall short on true intelligence tests, how new models like O3 mark a step change in capabilities, and what it will really take to reach AGI. We cover everything from the technical evolution of ARC 1 to ARC 2, the shift toward test-time reasoning, and the role of program synthesis as a foundation for more general intelligence. The conversation also explores the philosophical underpinnings of intelligence, the structure of the ARC Prize, and the motivation behind launching Ndea — a new AGI research lab that aims to build a "factory for rapid scientific advancement." Whether you're deep in the AI research trenches or just fascinated by where this is all headed, this episode offers clarity and inspiration. Ndea Website - https://ndea.com X/Twitter - https://x.com/ndea ARC Prize Website - https://arcprize.org X/Twitter - https://x.com/arcprize François Chollet LinkedIn - https://www.linkedin.com/in/fchollet X/Twitter - https://x.com/fchollet Mike Knoop X/Twitter - https://x.com/mikeknoop FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:05) Introduction to ARC Prize 2025 and ARC-AGI 2 (02:07) What is ARC and how it differs from other AI benchmarks (02:54) Why current models struggle with fluid intelligence (03:52) Shift from static LLMs to test-time adaptation (04:19) What ARC measures vs. traditional benchmarks (07:52) Limitations of brute-force scaling in LLMs (13:31) Defining intelligence: adaptation and efficiency (16:19) How O3 achieved a massive leap in ARC performance (20:35) Speculation on O3's architecture and test-time search (22:48) Program synthesis: what it is and why it matters (28:28) Combining LLMs with search and synthesis techniques (34:57) The ARC Prize structure: efficiency track, private vs. public (42:03) Open source as a requirement for progress (44:59) What's new in ARC-AGI 2 and human benchmark testing (48:14) Capabilities ARC-AGI 2 is designed to test (49:21) When will ARC-AGI 2 be saturated? AGI timelines (52:25) Founding of NDEA and why now (54:19) Vision beyond AGI: a factory for scientific advancement (56:40) What NDEA is building and why it's different from LLM labs (58:32) Hiring and remote-first culture at NDEA (59:52) Closing thoughts and the future of AI research
In 2022, Lin Qiao decided to leave Meta, where she was managing several hundred engineers, to start Fireworks AI. In this episode, we sit down with Lin for a deep dive on her work, starting with her leadership on PyTorch, now one of the most influential machine learning frameworks in the industry, powering research and production at scale. Now at the helm of Fireworks AI, Lin is leading a new wave in generative AI infrastructure, simplifying model deployment and optimizing performance to empower all developers building with Gen AI technologies. We dive into the technical core of Fireworks AI, uncovering their innovative strategies for model optimization, Function Calling in agentic development, and low-level breakthroughs at the GPU and CUDA layers. Fireworks AI Website - https://fireworks.ai X/Twitter - https://twitter.com/FireworksAI_HQ Lin Qiao LinkedIn - https://www.linkedin.com/in/lin-qiao-22248b4 X/Twitter - https://twitter.com/lqiao FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:20) What is Fireworks AI? (02:47) What is PyTorch? (12:50) Traditional ML vs GenAI (14:54) AI’s enterprise transformation (16:16) From Meta to Fireworks (19:39) Simplifying AI infrastructure (20:41) How Fireworks clients use GenAI (22:02) How many models are powered by Fireworks (30:09) LLM partitioning (34:43) Real-time vs pre-set search (36:56) Reinforcement learning (38:56) Function calling (44:23) Low-level architecture overview (45:47) Cloud GPUs & hardware support (47:16) VPC vs on-prem vs local deployment (49:50) Decreasing inference costs and its business implications (52:46) Fireworks roadmap (55:03) AI future predictions
Retrieval-Augmented Generation (RAG) has become a dominant architecture in modern AI deployments, and in this episode, we sit down with Douwe Kiela, who co-authored the original RAG paper in 2020. Douwe is now the founder and CEO of Contextual AI, a startup focused on helping enterprises deploy RAG as an agentic system. We start the conversation with Douwe's thoughts on the very latest advancements in Generative AI, including GPT-4.5, DeepSeek and the exciting paradigm shift towards test-time compute, as well as the US-China rivalry in AI. We then dive into RAG: definition, origin story and core architecture. Douwe explains the evolution of RAG into RAG 2.0 and Agentic RAG, emphasizing the importance of self-learning systems over individual models and the role of synthetic data. We close with the challenges and opportunities of deploying AI in real-world enterprises, discussing the balance between accuracy and the inherent inaccuracies of AI systems. Contextual AI Website - https://contextual.ai X/Twitter - https://x.com/ContextualAI Douwe Kiela LinkedIn - https://www.linkedin.com/in/douwekiela X/Twitter - https://x.com/douwekiela FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:57) Thoughts on the latest AI models: GPT-4.5, Sonnet 3.7, Grok 3 (04:50) The test-time compute paradigm shift (06:47) Unsupervised learning vs reasoning: a false dichotomy (07:30) The significance of DeepSeek (10:29) USA vs. China: is the AI war overblown? (12:19) Controlling AI hallucinations at the model level (13:51) RAG: definition and origin story (18:46) Why the Transformers paper initially felt underwhelming (20:41) The core architecture of RAG (26:06) RAG vs. fine-tuning vs. long context windows (30:53) RAG 2.0: Thinking in systems and not models (31:28) Data extraction and data curation for RAG (35:59) Contextual Language Models (CLMs) (38:04) Finetuning and alignment techniques: GRIT, KTO, LENS (40:40) Agentic RAG (41:36) General vs. specialized RAG agents (44:35) Synthetic data in AI (45:51) Deploying AI in the enterprise (48:07) How tolerant are enterprises to AI hallucinations? (49:35) The future of Contextual AI
In this episode, we dive into how AI is transforming video editing with Gaurav Misra, the CEO of Captions. Launched in New York in 2021, Captions already empowers over 10 million creators worldwide, leveraging AI to make video production as simple as clicking a button. Discover the strategic framework that led to the inception of Captions, and learn how the founders identified societal changes and technological advancements to build a groundbreaking company. We explore the challenges and opportunities of building an AI product for video editing, including how Captions is outpacing traditional content production workflows. Gaurav shares insights into the future of video editing, the role of AI in democratizing video production, and the unique approach Captions takes to differentiate itself from industry giants like Adobe and CapCut. Captions Website - https://www.captions.ai X/Twitter - https://x.com/getcaptionsapp Gaurav Misra LinkedIn - https://www.linkedin.com/in/gamisra1 X/Twitter - https://x.com/gmharhar FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:30) What is Captions? (03:43) How did Captions start? (08:25) The strategy behind launching Captions (12:32) How is Captions different from other editing tools? (14:13) How does it compare to CapCut? (18:22) Who is the typical Captions user? (20:13) Why ‘Captions’? (23:47) Captions’ product suite for production and editing (26:37) AI models powering Captions (36:22) AI lipsync (38:49) Personalized fine-tuned models for creators? (39:38) Building models vs. building wrappers (43:09) Cloud AI vs. Local AI (45:19) Optimizing for low latency (48:07) AI/ML stack at Captions (51:10) “Hallucinations are a feature, not a bug” (53:19) Prompt engineering (54:12) Have we passed the uncanny valley for AI avatars? (01:01:47) The impact of deepfakes (01:04:33) CapCut ban and its effects (01:05:05) Evolving from paid to freemium (01:07:42) Building a company on foundation models (01:09:01) Running an AI company in New York
AI customer service agents are quickly replacing the often clunky AI chatbots of years past, and revolutionizing how we all interact with customer service. In this episode, we dive into this rapid transformation with Mike Murchison, CEO of Ada, a fast-growing leader in the space. Mike shares how harnessing the power of several Generative AI models enables Ada to automate up to 83% of customer interactions, providing a seamless and empathetic service that rivals, and will soon surpass, human agents. We explore the challenges and triumphs of deploying AI in customer service in this new era, from the intricacies of model orchestration to the importance of resolution and empathy. Mike also teases the future of agentic AI in the enterprise, where AI agents collaborate across departments to innovate and improve products. Ada Website - https://www.ada.cx X/Twitter - https://x.com/ada_cx Mike Murchison LinkedIn - https://www.linkedin.com/in/mikemurchison X/Twitter - https://x.com/mimurchison FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (02:27) Why is customer service a perfect use case for AI? (03:36) Why didn’t foundation models replace AI “thin wrappers” out of the box? (05:27) What is Ada? (10:41) Reasoning engine, model orchestration, instruction following, routing (15:45) Hybrid systems, finetuning, customization (18:28) Prompt engineering, observability, self-improvement (22:07) RAG (Retrieval-Augmented Generation) and AI as a judge (23:06) Guardrails and security (24:33) Should we expect perfection from AI? (26:14) Measuring “resolution” (29:29) What actions can Ada AI Agents take? (32:12) Authentication and personalization (35:09) Handoff vs human delegation (38:12) ACX (AI Customer Experience) and the future of customer service professionals (42:13) Leveraging analytics and customer support data (45:54) AI agents for cross-selling and upselling (48:25) Traditional AI chatbots vs the new generation of AI Agents (51:24) Emotion, empathy, personality (54:56) Transparency and AI improvement (57:58) Managing AI: the measure-coach-improve loop (1:00:15) Ada Voice and Email (1:06:25) Future predictions for AI (1:07:56) Multi-agent collaboration
Replit is one of the most visible and exciting companies reshaping how we approach software and application development in the Generative AI era. In this episode, we sit down with its CEO, Amjad Masad, for an in-depth discussion on all things AI, agents, and software. Amjad shares the journey of building Replit, from its humble beginnings as a student side project to becoming a major player in Generative AI today. We also discuss the challenges of launching a startup, the multiple attempts to get into Y Combinator, the pivotal moment when Paul Graham recognized Replit’s potential, and the early bet on integrating AI and machine learning into the core of Replit. Amjad dives into the evolving landscape of AI and machine learning, sharing how these technologies are reshaping software development. We explore the concept of coding agents and the impact of Replit’s latest innovation, Replit Agent, on the software creation process. Additionally, Amjad reflects on his time at Codecademy and Facebook, where he worked on groundbreaking projects like React Native, and how those experiences shaped his entrepreneurial journey. We end with Amjad's view on techno-optimism and his belief in an energized Silicon Valley. Replit Website - https://replit.com X/Twitter - https://x.com/Replit Amjad Masad LinkedIn - https://www.linkedin.com/in/amjadmasad X/Twitter - https://x.com/amasad FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:36) The origins of Replit (15:54) Amjad’s decision to restart Replit (19:00) Joining Y Combinator (30:06) AI and ML at Replit (32:31) Explain Code (39:09) Replit Agent (52:10) Balancing usability for both developers and non-technical users (53:22) Sonnet 3.5 stack (58:43) The challenge of AI evaluation (01:00:02) ACI vs. HCI (01:05:02) Will AI replace software development? 
(01:10:15) If anyone can build an app with Replit, what’s the next bottleneck? (01:14:31) The future of SaaS in an AI-driven world (01:18:37) Why Amjad embraces techno-optimism (01:20:36) Defining civilizationism (01:23:11) Amjad’s perspective on government’s role
In this episode, we explore the cutting-edge world of data infrastructure with Justin Borgman, CEO of Starburst — a company transforming data analytics through its open-source project, Trino, and empowering industry giants like Netflix, Airbnb, and LinkedIn. Justin takes us through Starburst’s journey from a Yale University spin-out to a leading force in data innovation, discussing the shift from data lakes to lakehouses, the rise of open formats like Iceberg as the future of data storage, and the role of AI in modern data applications. We also dive into how Starburst is staying ahead by balancing on-prem and cloud offerings while emphasizing the value of optionality in a rapidly evolving, data-driven landscape. Starburst Data Website - https://www.starburst.io X/Twitter - https://x.com/starburstdata Justin Borgman LinkedIn - https://www.linkedin.com/in/justinborgman X/Twitter - https://x.com/justinborgman FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:32) What is Starburst? (02:32) Understanding the data layer (05:06) Justin Borgman’s story before Starburst (10:41) The evolution of Presto into Trino (13:20) Lakehouse vs. data lake vs. data warehouse (22:06) Why Starburst backed the lakehouse from the start (23:20) Starburst Enterprise (27:31) Cloud vs. on-prem (29:10) Starburst Galaxy (31:23) Dell Data Lakehouse (32:13) Starburst’s data architecture explained (38:30) The rise of data apps (38:54) Starburst AML (40:41) “We actually built the Galaxy twice” (43:13) Managing multiple products at scale (45:14) “We founded the company on the idea of optionality” (47:20) Iceberg (48:01) How open-source acquisitions work (51:39) Why Snowflake embraced Iceberg (53:15) Data mesh (55:31) AI at Starburst (57:16) Key takeaways from go-to-market strategies (01:01:18) Lessons from the Dell partnership (01:04:40) Predictions for 2025
As AI takes over the world, data is more than ever “the new oil”, and data engineering is the discipline that makes data usable behind the scenes. In this episode, we dive deep into the present and future of data engineering with Ben Rogojan, also known as the Seattle Data Guy. A seasoned data engineering consultant, Ben has built a big brand and reputation in the field with over 100k followers on platforms like YouTube and Substack. We started the conversation with a deep dive into data engineering as a profession: what do data engineers actually do? What is the career path, and what should aspiring data engineers learn? We then explored some of the biggest stories of 2024 (including the rise of Iceberg) and went into some predictions for 2025, as a way to discuss some key topics everyone should be familiar with in data engineering, including the integration of AI in data workflows, the potential for automation, and why SQL isn't going anywhere. Discover how companies are navigating the complexities of data infrastructure, the rise of open table formats like Iceberg, and the ongoing battle between data giants like Snowflake and Databricks. Ben Rogojan Website - https://www.theseattledataguy.com Newsletter - https://seattledataguy.substack.com LinkedIn - https://www.linkedin.com/company/seattle-data-guy X/Twitter - https://x.com/seattledataguy FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (01:20) Why 2025 will be huge for data engineering (02:55) The story of the Seattle Data Guy (06:51) What exactly is data engineering? (07:41) Data, AI, and ML: where do they overlap? (09:23) Data analyst vs. data engineer vs. data scientist: what’s the difference? (11:20) A day in the life of a data engineer (12:58) Data engineering: Silicon Valley vs. everywhere else (15:27) How to become an AI engineer (28:46) Will AI replace AI engineers? (33:42) Why is the data world so complex? (36:53) The functional consolidation of the data world (38:34) Big data stories from 2024 (39:28) Why Iceberg is a game-changer (46:02) How startups manage data in their early days (48:44) Seattle Data Guy’s favorite tools (50:09) Bold predictions for 2025
In this episode, we dive deep into the world of AI engineering with Chip Huyen, author of the excellent, newly released book "AI Engineering: Building Applications with Foundation Models". We explore the nuances of AI engineering, distinguishing it from traditional machine learning, discuss how foundational models make it possible for anyone to build AI applications and cover many other topics including the challenges of AI evaluation, the intricacies of the generative AI stack, why prompt engineering is underrated, why the rumors of the death of RAG are greatly exaggerated, and the latest progress in AI agents. Book: https://www.oreilly.com/library/view/ai-engineering/9781098166298/ Chip Huyen Website - https://huyenchip.com LinkedIn - https://www.linkedin.com/in/chiphuyen Twitter/X - https://x.com/chipro FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:45) What is new about AI engineering? (06:11) The product-first approach to building AI applications (07:38) Are AI engineering and ML engineering two separate professions? (11:00) The Generative AI stack (13:00) Why are language models able to scale? (14:45) Auto-regressive vs. masked models (16:46) Supervised vs. unsupervised vs. self-supervised (18:56) Why does model scale matter? (20:40) Mixture of Experts (24:20) Pre-training vs. post-training (28:43) Sampling (32:14) Evaluation as a key to AI adoption (36:03) Entropy (40:05) Evaluating AI systems (43:21) AI as a judge (46:49) Why prompt engineering is underrated (49:38) In-context learning (51:46) Few-shot learning and zero-shot learning (52:57) Defensive prompt engineering (55:29) User prompt vs. system prompt (57:07) Why RAG is here to stay (01:00:31) Defining AI agents (01:04:04) AI agent planning (01:08:32) Training data as a bottleneck to agent planning
In this episode, we sit down with Florian Douetteau, co-founder and CEO of Dataiku, a global category leader in enterprise AI, a fixture on the Forbes Cloud 100 list, and a Leader in the Gartner Magic Quadrant. Florian shares his journey from a Parisian student fascinated by functional programming to leading a global enterprise software company. We discuss how Dataiku bridges the gap between technical and business teams to democratize AI in the enterprise, the challenges of selling to enterprise clients, and how Dataiku acts as an orchestration layer for Generative AI, helping businesses manage complex data processes and control AI, so they can build more with AI. Dataiku Website - https://www.dataiku.com/ X/Twitter - https://twitter.com/dataiku Florian Douetteau LinkedIn - https://www.linkedin.com/in/fdouetteau X/Twitter - https://twitter.com/fdouetteau FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (00:00) Intro (02:08) Florian's life before Dataiku (06:58) Creation of Dataiku (12:08) The secret behind Dataiku's name (12:47) How does Dataiku stay insightful about the future? (14:46) Building a platform, not just a tool (17:26) How to sell to the enterprise from the beginning (20:09) Dataiku platform today (26:55) Data is always the problem (28:50) LLM Mesh (36:02) Will Gen AI replace ML? (39:41) Managing Gen AI and traditional AI on one platform (40:37) Gen AI deployment in the enterprise (48:33) Dataiku's roadmap (50:28) What has changed with the company's growth?
In this episode, we dive into the world of generative AI with May Habib, co-founder of Writer, a platform transforming enterprise AI use. May shares her journey from Qordoba to Writer, emphasizing the impact of transformers in AI. We explore Writer's graph-based RAG approach, and their AI Studio for building custom applications. We also discuss Writer's Autonomous Action functionality, set to revolutionize AI workflows by enabling systems to act autonomously, highlighting AI's potential to accelerate product development and market entry with significant increases in capacity and capability. Writer Website - https://writer.com X/Twitter - https://x.com/get_writer  May Habib LinkedIn - https://www.linkedin.com/in/may-habib X/Twitter - https://x.com/may_habib FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series, hosted at Ramp's beautiful HQ. If you are ever in New York, you can join the upcoming events here: https://www.eventbrite.com/o/firstmark-capital-2215570183 (00:00) Intro (01:47) What is Writer? (02:52) Writer's founding story (06:54) Writer is a full-stack company. Why? (07:57) Writer's enterprise use cases (10:51) Knowledge Graph (17:59) Guardrails (20:17) AI Studio (23:16) Palmyra X 004 (27:18) Current state of the AI adoption in enterprises (28:57) Writer's sales approach (31:25) What May Habib is excited about in AI (33:14) Autonomous Action use cases
Nathan Benaich, founder and GP at VC firm Air Street Capital, publishes every year "State of AI", one of the most widely-read and comprehensive reports on all things AI across research, industry, and policy. In this episode, we sit down with Nathan to discuss some of the highlights of the 2024 edition of the report, including the "vibes" shift in the industry from existential risk concerns last year to the current monetization race, the financial success of the foundation model labs, how a generative AI app could top the Apple Store charts in 2025, and the challenges facing humanoid robotics. State of AI 2024 report: https://www.stateof.ai/2024-report-launch State of AI 2024 video: https://youtu.be/EVMbnPOuUl0 Air Street Capital Website - https://www.airstreet.com X/Twitter - https://x.com/airstreet Nathan Benaich LinkedIn - https://www.linkedin.com/in/nathanbenaich X/Twitter - https://x.com/nathanbenaich FirstMark Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck (01:08) Who is Nathan Benaich? (04:57) "Vibe" shift in AI (09:13) Current state of the foundation models (22:01) AI companies vs. SaaS (23:31) AI consumer apps (25:49) AI applications from a VC's perspective (29:25) "You don't need to be an AI engineer to build an AI company" (30:46) AI in robotics (34:36) AI regulations in Europe (40:55) Predictions on the future of AI (49:30) Nathan Benaich's favorite sources of information
In this special episode of the MAD Podcast, Matt Turck and Aman Kabeer from FirstMark delve into the AI market from a venture investor's perspective, in the final weeks of an incredibly packed and exciting 2024. They comment on their favorite news stories, such as OpenAI's record-breaking $6.6 billion funding round and the massive $200B investments in AI infrastructure by Meta, Google, and Amazon. They tackle the latest trends in funding and valuations in both public and private markets, debate the critical question of whether we're in an AI bubble, examine the current state of AI demand, the potential of scaling laws, and the future of AI-driven innovation. They then discuss where they see opportunities for startups and investors across AI hardware, compute, foundation models, AI tooling, and both consumer and enterprise AI applications. FIRSTMARK Website - https://firstmark.com X/Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ X/Twitter - https://twitter.com/mattturck Aman Kabeer (Investor) LinkedIn - https://www.linkedin.com/in/aman-kabeer/ X/Twitter - https://x.com/AmanKabeer11 (00:00) Intro (02:20) The Year of Record-Breaking Valuations and Investments (05:23) AI's Environmental Impact and Nuclear Revival (06:48) AI Valuations and Market Dynamics (17:01) Are We in an AI Bubble? (25:01) AI Progress and Demand (35:06) AI's Role in Consumer Applications (41:02) AI's Influence on SaaS and Business Models (50:55) AI's Role in Enterprise Transformation (01:04:00) The Future of AI: Apps and Agents
Before he founded Modal, Erik Bernhardsson created Spotify's music recommendation system. Today he's bringing a consumer app approach to radically simplifying developer experience for data and AI projects on the Modal platform. In this episode, we dive into the broader AI compute landscape, discussing the roles of hyperscalers, GPU clouds, inference platforms, and the emergence of alternative AI cloud providers. Erik gives us a product tour of the Modal platform, provides insights into the AI industry's shift from training to inference as the primary use case, and speculates on the future of AI-native consumer applications. Learn about Modal's commitment to fast feedback loops, their cloud maximalist approach, their dedication to building a product that developers truly love, as well as founder lessons Erik learned along the way. Erik's blog: https://erikbern.com "It's hard to write code for humans": https://erikbern.com/2024/09/27/its-hard-to-write-code-for-humans Modal Website - https://modal.com Twitter - https://x.com/modal_labs Erik Bernhardsson LinkedIn - https://www.linkedin.com/in/erikbern Twitter - https://x.com/bernhardsson FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:35) What is Modal? (02:18) Current state of AI compute space (09:54) Erik's path to starting Modal (13:57) Core elements of the Modal platform (28:52) Is serverless the right level of abstraction for AI compute? (33:35) Balancing costs: GPU vendor fees vs. customer pricing (37:56) Designing products for humans (42:43) Modal's early go-to-market motion (45:32) Managing early engineering team (48:26) The only correct way to add a new function to the company (50:07) Building company in NYC (52:05) Modal's roadmap (54:04) Erik's predictions on AI
A founding engineer on Google BigQuery and now at the helm of MotherDuck, Jordan Tigani challenges the decade-long dominance of Big Data and introduces a compelling alternative that could change how companies handle data. Jordan discusses why Big Data technologies are overkill for most companies, how MotherDuck and DuckDB offer fast analytical queries, and lessons learned as a technical founder building his first startup. Watch the episode with Tomasz Tunguz: https://youtu.be/gU6dGmZzmvI Website - https://motherduck.com Twitter - https://x.com/motherduck Jordan Tigani LinkedIn - https://www.linkedin.com/in/jordantigani Twitter - https://x.com/jrdntgn FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (00:56) What is Small Data? (06:56) Marketing strategy of MotherDuck (08:39) Processing Small Data with a Big Data stack (15:30) DuckDB (17:21) Creation of DuckDB (18:48) Founding story of MotherDuck (24:08) MotherDuck's community (25:25) MotherDuck of today ($100M raised) (33:15) Why are MotherDuck and DuckDB so fast? (39:08) The limitations and the future of MotherDuck's platform (39:49) Small Models (42:37) Small Data and the Modern Data Stack (46:47) Making things simpler with a shift from Big Data to Small Data (50:04) Jordan Tigani's entrepreneurial journey (58:31) Outro
With a $4.5B valuation, 5M AI builders and 1M public AI models, Hugging Face has emerged as the key collaboration platform for AI, and the heart of the global open source AI community. In this episode of The MAD Podcast, we sit down with Clément Delangue, its co-founder and CEO, and delve deep into Hugging Face's journey from a fun chatbot to a central hub for AI innovation, the impact of open-source AI and the importance of community-driven development, and discuss the shift from text to other AI modalities like audio, video, chemistry, and biology. We also cover the evolution of Hugging Face's business model, and the different approach to company culture that the founders have implemented over the years. Hugging Face Website - https://huggingface.co Twitter - https://x.com/huggingface Clem Delangue LinkedIn - https://www.linkedin.com/in/clementdelangue Twitter - https://x.com/clemdelangue FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:46) Miami vs. New York vs. San Francisco (03:25) Current state of open source AI (11:12) Government regulation of AI (13:18) What is open source AI? (15:21) Open source AI: China vs U.S. (18:32) LLMs vs. SLMs (22:01) Are commercial LLMs just 'Training Wheels' for enterprises? (24:26) Software 2.0: built with AI (28:03) Hugging Face founding story (37:03) Are there any competitors? (44:06) Most interesting models on Hugging Face (50:35) Shifting focus in enterprise solutions (55:06) BLOOM & IDEFICS (58:44) The culture of Hugging Face (01:04:44) The future of Hugging Face
This episode is a captivating conversation with Richard Socher, serial entrepreneur, investor, and AI researcher. Richard elaborates on why he likens the impact of AI to the Industrial Revolution, the Enlightenment, and the Renaissance, discusses important current issues in AI, such as scaling laws and agents, provides a behind-the-scenes tour of YOU.com and its evolving business model, and finally describes his current investment strategy in AI startups. You.com Website - https://you.com/business Twitter - https://x.com/youdotcom Richard Socher LinkedIn - https://www.linkedin.com/in/richardsocher Twitter - https://x.com/richardsocher FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:00) "AI era is the Industrial Revolution, Renaissance, and the Enlightenment combined" (07:49) Top-performers in the Age of AI (11:15) Comeback of the Renaissance Person (13:05) People tried to stop Richard from doing deep learning research. Why? (14:34) Jevons paradox of intelligence (17:08) Scaling Laws in Deep Learning (23:23) Can Deep Learning and Rule-Based AI coexist? (25:42) Post-transformers AI Architecture (28:20) Achieving AGI and ASI (36:43) AI for everyday tasks: how far is it? (44:50) AI Agents (55:45) Evolution of You.com (01:02:11) Technical side of You.com (01:06:46) Is AI getting cheaper? (01:13:05) What is AIX Ventures? (01:16:36) VC landscape of 2024 (01:24:31) Research vs Entrepreneurship (01:26:12) OpenAI’s transformation and its impact on the industry
In this episode, we sit down with Tobie Morgan Hitchcock, the founder of SurrealDB, to dive deep into the evolving world of databases and the future of data storage, querying, and real-time analytics. SurrealDB isn’t just another database — it’s a multi-model database that merges document, graph, and time-series data, making it easier for developers to consolidate their backend without sacrificing performance. You'll learn how SurrealDB separates storage from compute for scalability, its innovative take on graph databases, and the radical decision to rewrite the entire platform in Rust. Tobie also shares how SurrealDB is designed to handle real-time analytics and integrate AI/ML models directly inside the database. If you're curious about the future of databases, this episode is packed with insights you won’t want to miss. SurrealDB Website - https://surrealdb.com Twitter - https://x.com/SurrealDB Tobie Morgan Hitchcock: LinkedIn - https://www.linkedin.com/in/tobiemorganhitchcock Twitter - https://x.com/tobiemh FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:03) What is SurrealDB? (02:53) How did SurrealDB get started? (09:10) The Challenges of Building a Database from Scratch (10:36) Why SurrealDB Chose Rust (12:54) A Deep Dive into SurrealDB’s Unique Features (19:30) Why Now? (26:32) What Sets SurrealDB Apart from Other Databases (30:01) SurrealDB’s Role in the Future of AI and Machine Learning (32:45) Why Developers Are Choosing SurrealDB (36:14) What’s New in SurrealDB 2.0? (40:10) SurrealDB Cloud: Scalability Meets Simplicity (42:21) How SurrealDB Fits into the Competitive Database Landscape (45:37) Early Lessons from Building SurrealDB (48:34) Co-Founding SurrealDB with His Brother
In this episode, we dive deep into the story of how Datadog evolved from a single product to a multi-billion dollar observability platform with its co-founder, Olivier Pomel. Olivier shares exclusive insights on Datadog's unique approach to product development: why they avoid the "Apple approach" of building in secret and instead work closely with customers from day one. You’ll hear about the early days when Paul Graham of Y Combinator turned down Datadog, questioning their lack of a first product. Olivier also reveals the strategies behind their iterative product launches and why they insist on charging early to ensure they’re delivering real value. The second half of the conversation is focused on all things AI and data at Datadog - the company's initial reluctance to use AI in its products, how Generative AI changed everything, and Datadog's current AI efforts including Watchdog, Bits AI and Toto, their new time series foundational model. We close the episode by asking Olivier about his thoughts on the topic du jour: founder mode! ▶️ Listen to the 2020 Data Driven NYC episode with Olivier Pomel: https://www.youtube.com/watch?v=oXKEFHeEvMs DATADOG Website - https://www.datadoghq.com Twitter - https://x.com/datadoghq Olivier Pomel LinkedIn - https://www.linkedin.com/in/olivierpomel Twitter - https://x.com/oliveur FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck
In this episode, we sit down with Ali Dasdan, CTO of ZoomInfo, a titan in the B2B sector, who harnesses vast datasets and advanced AI to redefine sales and marketing for over 35,000 global customers with $21.2 billion in annualized revenue. We delve deep into ZoomInfo's AI initiatives, including their transformative 'Copilot,' explore sophisticated data management, and discuss their dual platforms catering to internal and customer-facing needs. ZoomInfo Website - https://www.zoominfo.com Twitter - https://x.com/zoominfo Ali Dasdan LinkedIn - https://www.linkedin.com/in/dasdan Twitter - https://x.com/alidasdan FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:03) What is ZoomInfo (04:47) Data as service (06:15) Ali Dasdan's story (07:31) Organization of ZoomInfo (10:48) ZoomInfo Data Platform (21:02) Lessons from building a data platform (23:19) AI application at ZoomInfo (27:58) ZoomInfo's Copilot (37:43) ZoomInfo AI toolstack (39:30) Working with small vs. big companies in the AI business (43:39) Using data and AI for internal productivity
In this episode, we sit down with Eric Glyman, co-founder of Ramp, the company that revolutionized finance management to become a powerhouse valued at $7.6 billion. Eric shares the tradition of counting the days since Ramp's founding and how it fosters a sense of urgency and productivity, explains the use of AI to automate expense management and fraud detection, and gives an inside look at Ramp's cutting-edge AI products, including the Ramp Intelligence Suite and experimental agentic AI use cases. Ramp Website - https://www.ramp.com Twitter - https://x.com/tryramp Eric Glyman LinkedIn - https://www.linkedin.com/in/eglyman Twitter - https://x.com/eglyman FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:49) What is Ramp? (04:25) How did the company start? (09:18) Technical aspects of Ramp infrastructure (12:17) "We can tell you if you're paying too much" (14:20) Data privacy at Ramp (16:13) Data infrastructure tools used at Ramp (17:58) Traditional AI use cases (24:51) GenAI use cases (27:47) AI/human interaction (33:32) Ramp Intelligence Suite (39:38) How Ramp keeps high product release and product velocity (42:37) How did Ramp get to product-market fit? (45:54) Eric's perspective on building a company in NYC
In this episode, we reconnect with Sharon Zhou, co-founder and CEO of Lamini, to dive deep into the ever-evolving world of enterprise AI. We discuss how the AI hype is evolving and what enterprises are doing to stay ahead, break down the different players in the Inference market, explore how Memory Tuning is reducing hallucinations in AI models, and discuss the role of agents in enterprise AI and the challenges of making them real-time and reliable. Lamini Website - https://www.lamini.ai Twitter - https://x.com/laminiai Sharon Zhou LinkedIn - https://www.linkedin.com/in/zhousharon Twitter - https://x.com/realsharonzhou FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:18) The state of the AI market in July 2024 (10:51) What is Lamini? (11:43) What is Inference? (15:36) GPU shortage in the enterprise (18:06) AMD vs Nvidia (22:10) What is Lamini's final product? (25:30) What is Memory Tuning? (29:01) What is LoRA? (32:39) More on Memory Tuning (35:51) Sharon's perspective on AI agents (40:01) What is next for Lamini? (41:54) Reasoning vs pure compute in AI
In this episode, we sit down with Jeremy Kahn, the AI Editor at Fortune Magazine, who has recently published a book called "Mastering AI: A Survival Guide to Our Superpowered Future". Jeremy shares his unique insights on AI's potential risks and transformative benefits, including the importance of UI design in maximizing AI's utility, the potential for AI to create a "winner takes most" economy, and the need for thoughtful AI regulation to mitigate risks without stifling innovation. Book: https://www.amazon.com/Mastering-AI-Survival-Superpowered-Future/dp/1668053322 Jeremy Kahn LinkedIn - https://www.linkedin.com/in/jeremy-kahn-01100462 Twitter - https://x.com/jeremyakahn FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:43) Why is UI design important for AI? (04:32) The book is called "Mastering AI". Why? (12:03) Automation Bias vs Automation Surprise (20:16) The role of AI in the future of science and art (25:32) "I think mass unemployment is a red herring, but we might see a lot of disruption" (34:19) Jeremy's perspective on Agentic AI (36:29) Does AI development need to be regulated? (38:56) Should we worry about AGI and Superintelligence? (42:18) Who provided the most thoughtful conversation for the book? (43:57) "I didn't use AI for the book at all" (46:20) Jeremy's work at Fortune
In this episode, we sit down with Azeem Azhar, an expert on AI and exponential technologies, whose weekly newsletter "The Exponential View" (www.exponentialview.co) is read by nearly two hundred thousand people from around the world. We delve into the nuances of AI adoption, discussing how LLMs are reshaping industries and what this means for corporate leaders, the dynamics between the U.S., China, and Europe in the AI race, and the concept of sovereign AI. Azeem Azhar Website - https://www.exponentialview.co Twitter - https://x.com/azeem FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:05) What does the "Exponential" really mean? (05:43) "Moore's law has not died" (11:52) Claude is the Macintosh of AI. What does it mean? (25:57) How does AI affect the enterprise? (34:06) Asia is more optimistic about AI than the West. Why? (38:42) Azeem's perspective on sovereign AI (45:19) AI in modern warfare (48:47) What is the Exponential asymmetry? (51:59) Energy transition and the influence of AI on it (55:21) Big Oil vs Chinese Solar: who's going to win? (59:18) AI opens new possibilities for everyone. How?
In this episode, we sat down with Aaron Katz, the CEO of ClickHouse, a company that grew from an open-source analytical database into a highly successful cloud service, utilized by Spotify, Netflix, Disney, and many more. Aaron Katz provides intriguing insights into the challenges of transitioning an open-source project into a thriving business, ClickHouse's go-to-market strategy, the role of technical support in pre-sales, and the strategic decision to avoid traditional SDR and CSM roles. CLICKHOUSE Website - https://clickhouse.com/ Twitter - https://x.com/clickhousedb Aaron Katz LinkedIn - https://www.linkedin.com/in/aaron-katz-5762094 Twitter - https://x.com/ceo_clickhouse FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (00:56) What is ClickHouse? (04:28) What are the use cases for ClickHouse? (06:17) Reducing the latency: why the world shifts to real-time (09:05) How did ClickHouse evolve from an open-source to a cloud product? (15:01) "Open source is the future of software" (17:27) Self-hosted deployments (18:45) ClickHouse's roadmap (20:51) Is there a real-time data stack? (22:25) ClickHouse partners in data ingestion (24:32) Who are ClickHouse's main competitors? (27:35) ClickHouse's sales process (36:44) Are partnerships a good go-to-market strategy? (37:44) When is the right time for startups to start partnering? (38:22) Aaron's story of becoming the CEO (43:50) Team and culture when working on two continents (46:15) What's next?
In this episode, we sit down with Daniel Dines, the co-founder and CEO of UiPath. From a small rented apartment in Bucharest to $1.3 billion in revenue, UiPath's story is one of perseverance, innovation, and strategic pivots. Daniel shares his insights on the pivotal moments that shaped UiPath, how to build a robust go-to-market strategy, the role of partnerships, and the lessons learned in hiring and managing a sales organization. UIPath Website - https://www.uipath.com/ Twitter - https://x.com/UiPath Daniel Dines LinkedIn - https://www.linkedin.com/in/danieldines Twitter - https://x.com/danieldines FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:38) UiPath was founded in an apartment in Bucharest. How did it all start? (08:05) Building a global product (11:26) The growth stage. (18:50) "We were AI from the beginning" (20:10) Raising the first round of funding. (23:48) Working with the board. (25:11) How did UiPath expand from the Romanian to the global market? (35:00) Process Mining, Task Mining, and Communications Mining. (41:41) The Automation Layer explained. (45:28) The use cases for using AI in UiPath's automations (56:22) UiPath's strategy for Gen AI adoption. (58:27) The team. (59:42) How important are partnerships for enterprise (01:02:48) Recruiting the best salespeople in the industry (01:07:10) Scaling from a software engineer to the CEO of a large company.
In this episode, we sit down with Howie Liu, co-founder and CEO of Airtable, to explore the incredible journey of Airtable from its early days to becoming a powerhouse in the enterprise software space. Howie provides a candid look at the challenges and learnings from transitioning Airtable from a PLG product to an enterprise platform, how companies are transforming their marketing operations with AI, and the transformative potential of AI in automating workflows and enhancing business processes. AIRTABLE Website - https://www.airtable.com/ Twitter - https://x.com/airtable Howie Liu LinkedIn - https://www.linkedin.com/in/howieliu/ Twitter - https://x.com/howietl FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:40) What is Airtable in 2024? (05:35) How does Airtable apply AI to its products? (11:56) What are the AI use cases in Airtable? (18:35) The tech behind Airtable's AI capabilities (22:22) Is Airtable going to become an AI-first company? (25:15) Will AI kill programming as we know it? (29:24) How do big enterprises think about AI? (34:46) How did Airtable go from PLG to a large enterprise product? (41:00) AI Categories (47:47) "We definitely had our hiccups" (51:20) Was PLG a ZIRP-era phenomenon? (56:29) Howie's journey as a CEO
In this episode, we sat down with Tomasz Tunguz (https://twitter.com/ttunguz), the founder of Theory Ventures and a leading voice in the tech investment space. We discussed the transformative potential of Ethereum as a database company, the importance of data security in a decentralized world, and the evolving landscape of AI technologies from foundational models to AI-native applications. 📰 Article "What If LLMs Change the Business Model of the Internet?": https://tomtunguz.com/what-if-llms-change-the-business-model-of-the-internet/ ✍️ Tomasz' blog: https://tomtunguz.com Theory Ventures Website - https://theory.ventures/ Twitter - https://twitter.com/Theoryvc Tomasz Tunguz LinkedIn - https://www.linkedin.com/in/tomasztunguz Twitter - https://twitter.com/ttunguz Blog - https://tomtunguz.com/ FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck LISTEN ON: YouTube - https://www.youtube.com/@DataDrivenNYC/videos Apple - https://podcasts.apple.com/us/podcast/the-mad-podcast-with-matt-turck/id1686238724 (00:00) Intro (02:46) Tomasz has continued to invest in blockchain through the crypto winter. Why? (06:59) Security and privacy as blockchain's main use case (09:18) Blockchain and AI: how do they work together? (11:02) Why does Theory Ventures not invest in AI hardware? (12:28) Why do big companies invest in cloud infrastructure? (15:35) An investor view on the foundation models (18:36) Is Gen AI going to replace traditional AI? (20:57) Does Theory Ventures invest in AI tooling companies? (22:53) Is investing in Cloud companies better than investing in AI-powered applications? (26:40) Copilot AI vs full-execution AI (28:38) A case for specialized LLMs (29:54) Gross margins in Gen AI: is it profitable? (32:34) Modern Data Stack: is it still a thing to invest in? (37:02) Microsoft Fabric and its impact on the market (38:50) Tomasz's thoughts on Motherduck and DuckDB (40:37) Where do BI tools fit in the Modern Data Stack? (44:32) Why has the democratization of BI never happened? (45:52) How do acquisitions happen? Can you engineer them? (49:02) Key ingredients to build a data infrastructure business (50:40) Tomasz is a founder now! How does it feel? (53:15) Talking numbers: Theory Ventures' financial model
In this episode, we sat down with Renen Hallak, founder and CEO of VAST Data, a $9 billion company that's shaking the foundations of data storage, databases and compute functionality. Through the conversation, we explore VAST's perspective on AI infrastructure, the process of selling over a billion dollars worth of software, and the technical innovations behind disaggregated, shared-everything architecture. VAST Data Website - https://www.vastdata.com/ Twitter - https://twitter.com/VAST_Data Renen Hallak LinkedIn - https://www.linkedin.com/in/renenh/ FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:40) What is VAST Data? (02:56) The company was started in stealth mode. Why? (03:42) Did VAST get lucky with the gen AI explosion? (04:27) VAST Data founding story (05:57) How does the company work across 2 continents? (06:48) What made you think that you can disrupt the market? (09:23) VAST architecture explained (23:08) Moving from data storage to databases (25:01) What was the hardest thing to build? (26:32) How does VAST work with open source (26:54) A glimpse into the future products (28:22) The world without VAST: what it would have looked like (29:45) Who were VAST's first customers? (30:56) How do hedge funds use VAST? (32:08) VAST's sales strategy (34:04) Renen's transition from technical founder to CEO (36:01) How do you hire great people? (37:07) What was the hardest thing on your journey as a CEO? (38:43) $9B CEO daily routine (40:17) Difference between offices in NY and Israel (42:07) Renen's learnings from sales
🔗 2024 MAD Landscape: https://mad.firstmark.com 📃 PDF: https://mattturck.com/landscape/mad2024.pdf 📃 Blog post: https://mattturck.com/mad2024/ In this episode, we delve into the 2024 machine learning, AI, and data (MAD) landscape, examining an ever-growing array of 2,011 logos, the meteoric rise of open-source AI, and the anticipated advancements in AI agents and edge AI technology. Gain valuable perspectives on the saturated AI market, the dilemmas and prospects open source AI presents, and the continuous evolution of the modern data infrastructure. This episode covers a distinctive mix of analysis, industry perspectives, and foresight into the technological future. FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck Aman Kabeer (Investor) LinkedIn - https://www.linkedin.com/in/aman-kabeer/ Twitter - https://twitter.com/AmanKabeer11 (00:00) Intro (02:06) What is MAD? (07:58) Open sourcing AI (12:29) How does open source affect commercial AI? (21:02) Is the AI hype cycle over? (26:39) Was 2023 a head fake for Gen AI? What about 2024? (28:05) VC's perspective on AI (30:54) Emergence of the AI stack (37:36) What are the areas VCs are excited about? (41:04) Will full-stack AI platforms kill SaaS? (42:42) Modern Data Stack: is it dead or alive? (47:17) What's next for the MAD Landscape?
In this episode, we sat down with Morgan McGuire, Chief Scientist of Roblox, and the mind behind the magic of the virtual universe. Together we explore the spectrum of creativity on Roblox, from no-code experiences to professional game development, dive deep into the cutting-edge AI tools Roblox is deploying, and how these tools are democratizing game development. Tune in to embark on a journey into the heart of creativity, technology, and community with Roblox. This is not just about playing games; it's about creating the future, one experience at a time. ROBLOX Website - https://www.roblox.com Twitter - https://twitter.com/Roblox Morgan McGuire LinkedIn - https://www.linkedin.com/in/morgan-mcguire-660120210 Twitter - https://twitter.com/casualeffects FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:05) Roblox is not a game, but a platform (10:03) How does Roblox leverage Gen AI? (13:34) How did the company start working on AI? (21:26) AI Code Assist (26:30) AI Material Generator (32:07) ControlNet (38:36) StarCoder (43:40) Who works at Roblox?
In this episode, we sat down with Benedict Evans, a leading voice in the tech industry and a former partner at Andreessen Horowitz. Known for his sharp insights and forward-thinking analysis, Benedict shares his expert perspective on what generative AI means for the future of technology, business, and society at large. Specifically, we dive deep into the evolving landscapes of generative AI, augmented and virtual reality, and the critical issue of AI bias. Join us as Benedict Evans provides a nuanced analysis of cutting-edge tech and shares his insights and perspectives on the road ahead. BENEDICT EVANS LinkedIn - https://www.linkedin.com/in/benedictevans/ Threads - https://www.threads.net/@benedictevans FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:06) The AI platform shift in 2024 (05:54) Gen AI in 2024 vs. the PC boom in the '80s (13:24) Until AGI happens, there will be vertical-specific apps (15:12) Should companies have an AI strategy? (21:04) Platform shift OR paradigm shift? (23:55) How should we think about AGI in 2024? (34:08) Is gen AI grossly overhyped? (36:27) AI bias and the hidden problems in data (44:56) Apple Vision Pro and the future of AR/VR
In this episode, we sit down with Gary Little, CEO of Foursquare, to discuss Foursquare's remarkable evolution from a social app to a leader in location intelligence. Gary discusses how Foursquare uses smartphone ubiquity to create a global map through crowdsourcing, covering 190 countries and over 200 million points of interest. Learn about the challenges of managing complex, real-time datasets and how Foursquare employs machine learning and knowledge graphs to analyze foot traffic and device movements. The conversation also covers the critical role of privacy and data security in location tracking, especially in light of recent regulatory changes. Gary explains Foursquare's platform strategy, drawing parallels with Amazon's AWS, to enable customers to process and utilize location data for their applications. Foursquare Website - https://location.foursquare.com Twitter - https://twitter.com/Foursquare Gary Little (CEO) LinkedIn - https://www.linkedin.com/in/gary-little-0670ba4 Twitter - https://twitter.com/garylittlefsq FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:10) Brief history of Foursquare (03:07) What makes Foursquare's location data unique? (05:17) Foursquare Platform. What is it? (08:07) A glimpse into the future of Foursquare (10:00) More customers want to process the data themselves. Why? (13:42) Data privacy of today vs 10 years ago. What has changed? (16:41) Foursquare Graph: what does it do? (19:17) How is Foursquare utilizing AI? (22:17) How will AR/VR influence location intelligence?
In this episode, we dive into the fascinating world of AI art with Cris Valenzuela, CEO of Runway. Runway is a generative AI startup that co-invented Stable Diffusion, the deep learning technology that has captured the attention of the creative industry, including luminaries such as ASAP Rocky and Madonna's teams, by pushing the boundaries of digital creativity. We explore how generative AI tools empower visual artists to unleash their imaginations without the need for Hollywood-size budgets. We also discuss the effect of AI on the entire creative industry, similar to how the camera changed things back in the day. Join us for a glimpse into the future of creativity. RUNWAY Website - https://runwayml.com Twitter - https://twitter.com/runwayml Cris Valenzuela (Co-founder & CEO) LinkedIn - https://www.linkedin.com/in/cvalenzuelab Twitter - https://twitter.com/c_valenzuelab FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck Foursquare Website - https://location.foursquare.com Twitter - https://twitter.com/Foursquare (00:00) Intro (00:55) What is Runway? (03:09) Runway started before the GenAI boom. How? (04:41) What do people get wrong about GenAI? (07:18) How AI is going to change creative software? (08:44) What is Gen-2? (12:02) Runway's role in creating Stable Diffusion (14:25) Gen-1: a model or a product? (15:11) Runway's evolution from image generation to video (18:18) Runway partnered with Getty. Why? (19:52) How has the AI video generation ecosystem evolved? (21:58) Adoption cycle for AI video generation. Where are we now? (24:45) Challenges of building a research-focused company (26:25) How to build and maintain a soul in a startup? (28:27) "It's like an invention of a new art form"
Join us in this exciting episode as we dive into the world of enterprise AI with Florian Douetteau, co-founder and CEO of Dataiku, the leading enterprise AI platform targeting Global 2000 companies. Since its founding in 2013, Dataiku has been at the forefront of democratizing AI in the enterprise. We'll explore the current state of deployment of AI in businesses around the world, dive deep into the differences between generative AI and traditional AI, explore emerging Generative AI use cases in the enterprise, and get a sneak peek into Dataiku's latest breakthrough, the LLM Mesh, aimed at simplifying the use of multiple Generative AI models for companies. We'll also tackle the big challenges companies face when adopting AI, from managing costs to dealing with the uncertainties of Generative AI. This episode was recorded live at a recent Data Driven NYC, the monthly in-person event organized by FirstMark since 2011, hosted this month by our partners at Foursquare, the location intelligence company, at their beautiful headquarters. Dataiku Website - https://www.dataiku.com/ Twitter - https://twitter.com/dataiku Florian Douetteau LinkedIn - https://www.linkedin.com/in/fdouetteau Twitter - https://twitter.com/fdouetteau FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck Twitter - https://twitter.com/mattturck Foursquare Website - https://location.foursquare.com Twitter - https://twitter.com/Foursquare (00:00) Intro (01:09) What is Dataiku? (02:03) Is the market ready for AI? (04:33) Traditional AI vs Generative AI (08:33) What should a company know before diving into Generative AI? (10:18) Cost of Generative AI adoption (12:10) What blocks AI adoption? (14:31) Dataiku product tour (16:34) How to build one product for different audiences (17:45) LLM Mesh: what is it?
(21:10) Evolution of platform building with Gen AI (22:17) Enterprise AI motion in 2024 (23:28) Dataiku's partnerships (24:24) Being platform-first as a startup
In this episode, we sit down with Bob Moore, the CEO of Crossbeam, who turned a $2.6 billion mistake into a masterclass on Ecosystem-Led Growth (ELG). Fresh off publishing his new book, Bob shares why ELG is the future of business growth, challenging traditional strategies with data-driven insights and partnerships. Bob reveals how Crossbeam can help companies of any size leverage ELG to achieve remarkable growth. He dives into the role of data in ELG, the impact of AI on marketing, and practical steps for implementing ELG in your own company. From discussing the "slow heat death" of traditional growth strategies to unveiling the potential of data-driven partnerships, this episode is packed with eye-opening revelations. Bob also tackles the practical steps companies can take to implement ELG, making this a must-watch for CEOs, leaders, and entrepreneurs aiming to catapult their businesses into a new era of growth. Book: https://www.amazon.com/Ecosystem-Led-Growth-Blueprint-Marketing-Partnerships/dp/1394226837 Crossbeam Website - https://www.crossbeam.com Twitter - https://twitter.com/crossbeam Bob Moore LinkedIn - https://www.linkedin.com/in/robertjmoore/ Twitter - https://twitter.com/robertjmoore FirstMark Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (00:43) Bob recently wrote a book. Why did he do that as a CEO? (03:20) Bob's $2.6 billion mistake (12:15) What is ELG? (17:30) How does Crossbeam work? (20:51) Why do we need another type of go-to-market motion? (25:00) AI is killing inbound/outbound marketing (31:50) Applying ELG to your company (36:13) When should you do ELG and partnerships? (43:34) Outro
In this episode, we sat down with Emi Gal, founder and CEO of Ezra, a startup that leverages AI to detect cancer early and inexpensively. Emi provides insights into the landscape of the healthcare sector and talks about the differences between building an AI startup in healthcare versus SaaS. Turns out that "(In AI skills)... are not that transferable." EZRA Website - https://ezra.com Twitter - https://twitter.com/ezrainc Emi Gal LinkedIn - https://www.linkedin.com/in/emigal Twitter - https://twitter.com/emigal FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:50) Ezra raised $21 million in series B round (02:55) The origin of Ezra (06:06) Sourcing AI talent (06:52) Building a proof of concept (09:05) The tipping point for the product market fit (10:57) Y Combinator wants more MRI startups. Why? (11:37) Ezra's vision for MRI (13:25) Is it covered by insurance? (16:15) Full stack vs Software only (20:00) Training AI (22:55) Building an MRI database (25:45) Will radiologists get replaced by AI? (27:52) Creating reports with Generative AI (30:50) Can we trust AI in healthcare? (33:44) What are the specific challenges of building an AI startup? (39:01) Healthcare entrepreneurship (43:59) Staying fit as a CEO: Emi's mental and physical health routine (48:28) Plans for 2024
In this episode, we sat down with Des Traynor, co-founder of Intercom, to explore the seismic shift towards Artificial Intelligence in customer service software. Intercom has gone all-in to embrace AI as people's expectations of what chatbots can do started growing with the release of ChatGPT. Des shares the pivotal moments and strategic decisions that led to this transition, highlighting the urgency and vision that propelled Intercom to integrate AI into their core offerings. Des also delves into the challenges of building a bi-continental startup and the strategic pivot towards becoming an AI-first company. Tune in for an enlightening discussion on the strategy and journey of adopting AI. INTERCOM Website - https://www.intercom.com Twitter - https://twitter.com/intercom Des Traynor LinkedIn - https://www.linkedin.com/in/destraynor/ Twitter - https://twitter.com/destraynor FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (01:16) How did Intercom make the transition to a generative AI product (Fin)? (05:34) Did the Intercom manifesto play a role in the transition? (07:16) What was Intercom before Fin? (09:01) How much development effort did you spend on AI? (12:31) UX (15:20) People used to hate chatbots (17:51) GPT and building layers around it (20:50) The future of customer service (23:57) GPT-4/Llama/Mistral/Claude (25:58) Are multimodal AI-bots the future? (27:08) AI hallucination (30:11) Customization (34:34) Will Fin get a voice? (36:26) Customer support cost and impact on profitability (39:58) How much should you charge? (45:26) AI-bot resolution rate (46:43) Can bots take action? (48:40) AI adoption (51:14) How the Intercom team evolved (53:38) How did 4 Irish guys create a bi-continental startup? (56:17) Work distribution (58:38) Tech in Europe vs tech in the US
In this episode, we explore the dynamic world of modern analytics with Tristan Handy, CEO of dbt Labs (https://twitter.com/jthandy). dbt, which helps more than 30,000 enterprises ship trusted data products faster, has raised more than $400 million, most recently at a $4B valuation. We discuss how dbt has revolutionized analytics engineering, enabling seamless data transformation and orchestration in the cloud. This innovation fosters greater collaboration among data teams and integrates software engineering principles into data analytics workflows. We also talk about dbt's Semantic Layer, a game-changer that streamlines data operations by standardizing key business metrics for consistent use across various analytical tools. In this conversation, we tackle pressing questions about the current state and future of data management and analytics. Is the "modern data stack" becoming obsolete? What's next for data engineering? And how is AI reshaping the analytics landscape? Tune in to discover our insights. 📰 Is the "Modern Data Stack" Still a Useful Idea? https://roundup.getdbt.com/p/is-the-modern-data-stack-still-a DBT Website - https://www.getdbt.com/ Twitter - https://twitter.com/getdbt Tristan Handy (CEO & Co-Founder): LinkedIn - https://www.linkedin.com/in/tristanhandy/ Twitter - https://twitter.com/jthandy Is the "Modern Data Stack" Still a Useful Idea? - https://roundup.getdbt.com/p/is-the-modern-data-stack-still-a?r=oc02&utm_campaign=post&utm_medium=web FIRSTMARK Website: https://firstmark.com Twitter: https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck LISTEN ON: Spotify - https://open.spotify.com/show/7yLATDSaFvgJG80ACcRJtq Apple - https://podcasts.apple.com/us/podcast/the-mad-podcast-with-matt-turck/id1686238724 00:00 - Intro 02:43 - What is the Modern Data Stack? 05:57 - Is the Modern Data Stack dead? 12:23 - What's the alternative? 16:24 - Where is analytics engineering heading? 20:02 - The Reverse ETL market 23:21 - The role of AI in analytics engineering 27:47 - Will analytics engineers become the prompt engineers? 29:78 - Is the MDS part of the emerging generative AI stack? 33:51 - The Semantic Layer 37:49 - dbt's plans for the near future 41:17 - Hiring at different stages of the business 44:21 - Going from open-source to commercial 46:40 - Market situation vs. sales strategy
In this episode, we sat down with Bob van Luijt (https://twitter.com/bobvanluijt), the CEO of Weaviate, diving into the cutting-edge world of vector databases and their role in the AI revolution. Weaviate is an open source, AI-native vector database that helps developers create intuitive and reliable AI-powered applications. Weaviate sets itself apart with its vector search engine that integrates machine learning directly into its core, enabling more nuanced and context-aware search capabilities for AI-driven applications. This conversation explores vector databases (the core infrastructure behind generative models), the role of Retrieval-Augmented Generation (RAG), and how open source is driving commercial use cases. WEAVIATE Website - https://weaviate.io Twitter - https://twitter.com/weaviate_io Bob van Luijt (Co-Founder & Co-CEO): LinkedIn - https://www.linkedin.com/in/bobvanluijt Twitter - https://twitter.com/bobvanluijt Matt Turck: LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck DATA DRIVEN NYC This episode of the MAD Podcast was recorded live at Data Driven NYC, an event series organized by FirstMark Capital. The events are free and held monthly in New York, currently with the support of Foursquare. If you wish to attend and be notified of future events, please follow FirstMark on Eventbrite at https://www.eventbrite.com/o/firstmark-capital-2215570183 01:00 What is RAG? 06:20 Why are embedding models such a hot topic right now? 08:06 What is your assessment of RAG? 09:53 Generative feedback loops 11:46 What is Hybrid Search? 15:15 What makes Weaviate special? 16:53 What about security? 17:45 Did RAG accelerate the need for real-time data? 19:27 How to define a good vector database? 22:11 What do you think about general-purpose databases entering the field of vector-based databases? 23:47 Interesting use cases of Weaviate 25:27 What's your sense of the current state of the market? 26:53 Open source vs commercial product on Weaviate 29:23 How did it all get started?
Last week, we sat down with Alex Rinke (https://twitter.com/alexanderrinke), Co-founder & Co-CEO of Celonis, to explore how AI and automation are transforming business operations at large enterprises. Celonis is the pioneer of "process mining" - the technology that uses graph databases, AI, and automation to analyze processes, find inefficiencies and their root causes, and solve them. Most recently valued at $13B, Celonis is one of the most valuable startups globally. But Alexander and his two co-founders started Celonis while still in college on a $15,000 budget. In this conversation, we talked about the early days of Celonis, how Alex acquired his first enterprise clients without inside industry connections, how Celonis navigates go-to-market for a product with an expansive scope, and much more. CELONIS Website - https://www.celonis.com Twitter - https://twitter.com/Celonis Alex Rinke (Co-Founder & Co-CEO): Twitter: https://twitter.com/alexanderrinke LinkedIn: https://www.linkedin.com/in/alexander-rinke-10733061/ DATA DRIVEN NYC This episode of the MAD Podcast was recorded live at Data Driven NYC, an event series organized by FirstMark Capital. The events are free and held monthly in New York, currently with the support of Foursquare. If you wish to attend and be notified of future events, please follow FirstMark on Eventbrite at https://www.eventbrite.com/o/firstmark-capital-2215570183 00:00 - Intro 02:02 - What is Process Mining? 05:20 - How Celonis got started 07:42 - "We had our first prototype in three weeks" 09:36 - Pivotal partnership with ACP 12:12 - How did Celonis find product-market-people fit? 14:14 - Penetrating the global market 16:19 - Technical deep dive into the Celonis product 19:29 - Celonis finds process gaps completely automatically 21:15 - Who is the average user of Celonis inside companies? 22:11 - How Celonis uses Generative AI 24:54 - Acquisition of Symbio 25:56 - How to keep the fire of innovation inside the team? 27:49 - How to bring a very horizontal product to market? 32:24 - Scaling yourself as a leader 34:15 - Glimpse into the future of Celonis 35:37 - Outro
We are so excited today to be joined by Brandon Duderstadt, CEO + Cofounder, and Zach Nussbaum, Machine Learning Engineer, from Nomic AI. They discuss how Nomic AI is building tools like Atlas + GPT4All that enable everyone to interact with AI-scale datasets and run models on consumer computers - and - stay tuned for an exciting announcement about their newest product release later in the podcast. Thanks for joining us for the first episode of Season 2 of the MAD Podcast. We will be back to our regular weekly schedule with new conversations with leaders in the Machine Learning, AI and data landscape. If you like this show, you can find the video recording of this episode -- along with many, many more -- on the Data Driven NYC channel on YouTube. NOMIC AI www.nomic.ai twitter.com/nomic_ai www.linkedin.com/in/bstadt/ www.linkedin.com/in/zach-nussbaum/ FIRSTMARK firstmark.com twitter.com/FirstMarkCap Matt Turck (Managing Director) www.linkedin.com/in/turck/ twitter.com/mattturck Data Driven NYC YouTube Channel FirstMark Capital Eventbrite 0:46 - What is Nomic AI & how it got started 5:57 - Building GPT4All 7:23 - Running LLMs on a personal computer 16:00 - Nomic Atlas 21:33 - Launching Nomic Embed 28:10 - The Importance of Data in AI 31:10 - Benchmarking LLMs 32:56 - The Future of Nomic AI 36:22 - Building an AI Startup in New York 39:10 - Nomic AI is hiring
Today, we're thrilled to be joined by Eiso Kant, CTO + Co-Founder of Poolside, the buzzy new AI tool for software development. Eiso and Matt talk about Poolside's foundational model, the critical role of data quality in AI, the importance of controlling all levels of the stack and the merits of building a global AI company out of Europe, and more. Thank you to everyone who has joined us for Season 1 of the MAD Podcast. We will be taking a short break for the winter holidays and will be back with an exciting new lineup of great speakers for Season 2 on Wednesdays in January. If you like this show, you can find the video recording of this episode -- along with many more -- on the Data Driven NYC channel on YouTube. Important links are in the show notes below. Data Driven NYC YouTube Channel FirstMark Capital Eventbrite twitter.com/eisokant poolside.ai twitter.com/mattturck linktr.ee/mattturck Show Notes: [00:38:00] Introducing Eiso Kant, Co-founder and CTO of the AI startup, Poolside; [00:39:16] Eiso's Background; his journey, from starting as a young programmer to founding several companies, including Source{d}, a pioneer in applying deep learning to software source code; [00:40:33] Formation of Poolside; the collaboration between Eiso and his co-founder, Jason Warner, who was previously the CTO of GitHub and VC with Redpoint Ventures; [00:42:14] Poolside's Vision and potential to improve software development; [00:47:17] Narrowing Vision to Product Development; the importance of sequence in a company's growth, focusing on AI pair programming assistants as a start, moving towards a more autonomous future; [00:50:32] Initial Product Focus, user base, and approach to providing a vertically integrated AI stack for developers; [00:53:05] Reinforcement Learning from Code Execution Feedback; [01:02:29] Data Handling and Synthetic Data Generation; the importance of data quality and Poolside's strategy for generating and refining training data; [01:12:05] Engineering Behind Poolside's AI; the challenges and strategies Poolside is adopting, including building a team of strong engineers and creating a scalable architecture from scratch; [01:16:52] Choosing Europe as a Base for Poolside; [01:20:22] Poolside's Future Plans; the roadmap for Poolside, including launching products and APIs, exploring enterprise solutions, and creating a sustainable revenue-generating business;
Today, we're joined by Gustavo Sapoznik, Founder and CEO of ASAPP, the generative AI platform transforming contact centers. Matt + Gustavo discuss the magnitude of challenges to overcome in this market, how their AI tech is designed to help humans, the reason smart people should choose working at a startup over Big Tech, and more. This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series. If you are ever in New York, you can find us on Eventbrite by searching for "FirstMark Capital". Events run monthly and are free and open to everyone. And as always, if you enjoy the MAD podcast, please subscribe and feel free to leave us a comment or rating. Data Driven NYC YouTube Channel FirstMark Capital Eventbrite asapp.com twitter.com/asapp twitter.com/mattturck linktr.ee/mattturck Show Notes: [00:00:45] Introducing Gustavo Sapoznik, Founder & CEO of ASAPP, a unicorn AI startup based in New York; [00:01:00] How ASAPP started with a mission to "end bad customer service" after a frustrating phone call Mr. Sapoznik had with his cable provider; [00:02:44] ASAPP's product philosophy and how customer service is a three-legged stool with companies, customers, and agents; [00:05:11] How ASAPP automates what they can and augments the rest to make agents more productive; [00:07:12] The evolution of ASAPP's offerings including how ASAPP technology makes agents more productive; [00:09:16] How ASAPP's technology reduces response times and improves quality for agents by including transcription, auto-complete, and real-time scoring of interactions for quality assurance; [00:13:49] How ASAPP has evolved since 2014; their research-first approach, building in-house AI capabilities, training their own models, and their recent exploration of using open-source checkpoints; [00:15:05] How Mr. Sapoznik hired the guy who ran all NLP research at Google; [00:16:04] How cost, latency, and accuracy in their AI models differentiate ASAPP from common AI APIs available today; [00:18:49] Agent models v. Language models and how ASAPP AI is modularized for large teams with established tech stacks; [00:20:09] Mr. Sapoznik shares insights on selling to large enterprises and why he believes building a sales machine is equally, if not more important, than the product itself; [00:23:08] How to recruit and retain top AI talent; [00:27:42] Lessons learned from working with notable board members, including the three key dimensions of support from a good board: being a sounding board, providing tactical advice and connections, and instilling a sense of accountability and motivation;
Today, we're excited to chat with Scott Belsky - author, entrepreneur, investor and Chief Strategy Officer at Adobe. Matt + Scott discuss the impact of AI on creative work, how Adobe is incorporating AI across their products, and what the future creative tools landscape might look like. This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series. If you are ever in New York, you can find us on Eventbrite by searching for "FirstMark Capital". Events run monthly and are free and open to everyone. And as always, if you enjoy the MAD podcast, please subscribe and leave us a comment. Data Driven NYC YouTube Channel FirstMark Capital Eventbrite twitter.com/scottbelsky Implications, by Scott Belsky twitter.com/mattturck linktr.ee/mattturck Show Notes: [00:53] How Adobe uses AI to enhance user experience, streamline onboarding and automate tasks across their product suite; [01:30] How AI impacts Adobe's business, making creative processes accessible with features like the context bar in Photoshop; [02:13] Firefly's journey: internal decisions, training challenges, and a commitment to using licensed material for ethical AI; [03:58] Moral considerations in Firefly's development: the decision to use licensed material, commercial viability, and addressing user comparisons; [05:52] Adobe's homegrown approach to generative AI models: in-house development and partnerships for specific capabilities like LLMs; [06:08] Adobe Sensei's 10-year evolution: developing AI technologies, the non-profit Content Authenticity Initiative, and content credentials establishing asset provenance; [09:17] Adobe's new AI advancements: Firefly Image Model 2, Generative Match, and the vector model for illustration; [11:16] Firefly Editor's revolutionary image editing: dynamically generating pixels, real-time object manipulation, and Adobe's commitment to pushing technological boundaries; [12:41] Rapid integration of AI features: Firefly models and playground, surfacing on a website for user testing, and collaboration within Adobe's design organization; [14:32] How Adobe's AI and data teams are structured and leveraging in-house development for competitive advantage; [15:47] Future of work and creativity: AI's impact on raising the bar for digital experiences, accelerating creative processes, and the evolving landscape of personalized social content; [19:11] Leveraging technology to reduce friction, streamline processes, and unlock creative flow; [20:09] Impact of AI on business models: questioning time-based pricing, anticipating a shift to value-based models, and reconsidering compensation for creative professionals; [21:10] Parallels with historical Internet Service Providers, the rapid evolution of ideas, and reflections on sustainable business models; [24:53] Scott's criteria for evaluating AI investments: valuing skeptical entrepreneurs, acknowledging temporary uniqueness, and emphasizing empathy with customers; [26:40] Navigating challenges in 2023: tough decisions for entrepreneurs, evaluating conviction, and the importance of sticking together through the "messy middle";
Today, we're joined by Howard Katzenberg, CEO of Glean AI, a machine learning-powered accounts payable platform. Matt + Howard discuss Glean's founding story, how Glean helps CFOs make insight-driven choices, and more. This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series. If you are ever in New York, you can find us on Eventbrite by searching for "FirstMark Capital". Events run monthly and are free and open to everyone. And as always, if you enjoy the MAD podcast, please subscribe and leave us a comment. Data Driven NYC YouTube Channel FirstMark Capital Eventbrite twitter.com/mattturck linktr.ee/mattturck Show notes: [00:00:35] Howard's background; [00:01:15] Challenges with manual FP&A; [00:02:54] Approval process gap realization and opportunity for Glean AI; [00:04:40] How Glean AI is like "bill.com with a brain"; [00:05:06] Enhanced functionalities beyond basic AP automation; [00:06:32] Glean AI's inception and AI models; [00:07:54] Why Glean AI is unique; [00:08:25] The evolution of Glean AI's ML stack; [00:10:44] Defensibility and how Glean AI offers vendor pricing insights to its network; [00:12:23] Success stories and customer value; [00:14:47] Future plans for Glean AI; [00:16:39] Navigating industry and technical expertise; [00:18:41] Audience Q&A
Today, we have the pleasure of chatting with Raza Habib, CEO of Humanloop, the platform for LLM collaboration and evaluation. Matt and Raza cover how to understand and optimize model performance, lessons learned about model evaluation and feedback, and explore the future of model fine-tuning. twitter.com/RazRazcle humanloop.com Data Driven NYC YouTube Channel twitter.com/mattturck linktr.ee/mattturck Show notes: [00:00:47] How Humanloop helps product and engineering teams build reliable applications on top of large language models by providing tools to find, manage, and version prompts; [00:03:05] Where Humanloop fits into the MAD landscape as LM / LLM Ops; [00:02:40] The challenges of evaluating and monitoring LLMs; [00:03:40] Why evaluating LLMs and generative AI is subjective given their stochastic attributes; [00:04:40] Why evaluation is important during the development and production stages of LLMs to make informed design decisions, and how that challenge evolves in production to monitoring system behavior; [00:05:40] The need for regression testing with LLMs; [00:06:10] How Humanloop makes it easy for users to capture feedback, including implicit signals of user satisfaction, such as post-interaction actions and edits to generated content; [00:07:40] Why and how Humanloop uses guardrails in the app to ensure effective LLM use and implementation; [00:08:38] Why using an LLM as part of the evaluation process can introduce additional uncertainty and noise ("turtles all the way down"); [00:09:40] How evaluators on Humanloop are restricted to binary yes-or-no style questions or numerical scores to maintain reliability with LLMs in production; [00:10:40] Why a new set of tools was needed to monitor and observe LLM performance; [00:11:40] How Humanloop's interactive environment allows users to find and fix bugs in a prompt, including logs to support issue identification, and then run what-if style analysis by changing the prompt or information retrieval system, allowing for quick interventions and turnaround times within minutes to hours instead of days/weeks; [00:12:40] Why having evaluation and observability closely connected to prompt engineering tools is critical for speed; [00:13:40] How prompt engineering is like writing software specifications for the model, enabling domain experts to have a more direct impact on product development, and democratizing access and reducing reliance on engineers to implement the desired features; [00:15:40] The key differences between popular LLMs on the market today; [00:18:40] How the quality of open-source models has been rapidly improving, and how LLMs use tools or function calling to access APIs to go beyond simple text-based interactions; [00:21:22] How Humanloop empowers non-technical experts; [00:22:40] Where Humanloop fits within the AI ecosystem as a collaborative tool for enterprises building language models where collaboration and robust evaluation are crucial; [00:25:40] How Humanloop customers are often problem-aware, and how the go-to-market motion is mainly inbound, but sales-led; [00:27:48] How Humanloop serves as a central place for storing prompts and sharing learnings across teams; [00:28:24] Raza's thoughts on Open Source v. Closed Source models in the AI community; [00:30:40] The potential consequences of restricting access to models and Raza's case for regulating end use cases and punishing malicious use rather than banning the technology altogether; [00:33:40] Next steps for Humanloop;
Today we're joined by Akilesh Bapu, CEO and Founder of DeepScribe, the platform using AI and Natural Language Processing to generate doctor/patient transcripts. Matt and Akilesh go into DeepScribe's clinical use cases, supervised vs. unsupervised learning, and how critical it still is to have a human in the loop in a medical setting.
Today we have the pleasure of chatting with Sharon Zhou, CEO of Lamini, an LLM platform for the enterprise. Matt and Sharon go over the battle between prompting and fine-tuning, how the Lamini platform enables fine-tuning to be done "one billion times faster", and their recently-announced "LLM Super-station" in partnership with AMD.
Today we're joined by Aravind Srinivas, CEO of Perplexity AI, a chatbot-style AI conversational engine that directly answers users' questions with sources and citations. Matt & Aravind discuss Perplexity's founding story, the platform itself, and more. This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series. If you are ever in New York, you can find us on Eventbrite by searching for "FirstMark Capital". Events run monthly and are free and open to everyone. And as always, if you enjoy the MAD podcast, please subscribe and feel free to leave us a comment or rating.
Today we're joined by Mathew Lodge, CEO of Diffblue, an AI platform that uses reinforcement learning to autonomously test software. We chat about the "AI for code" landscape, the Diffblue platform, and why prompt engineering is not a thing.
Today we're joined by Nancy Xu, AI Investor and CEO and Founder of Moonhub AI, the AI recruiting platform helping companies shorten and speed up the recruiting process while also helping employers reach a more diverse pool of candidates. We dive into how the Moonhub platform operates, Nancy's thoughts on opportunities for AI startups, her journey as an investor, and interesting projects she has her eye on.
Today we're joined by Stanislas Polu, Co-Founder of Dust, a startup building secure AI assistants for the enterprise. We dive into Stanislas's journey to founding Dust, including his experience at OpenAI, the path to generative AI adoption in the enterprise, and the rise of the French AI ecosystem.
Today we're joined by Kanjun Qiu, CEO of Imbue, an independent research company developing AI agents with general intelligence, fresh off the announcement of their $200M Series B round of financing. We talk about Kanjun's journey, Imbue's vision and the future of AI agents.
Today we're joined by Shreya Rajpal, Co-founder & CEO of Guardrails AI, for a conversation on the Guardrails platform, mitigating AI hallucinations, and the role fine-tuning and retrieval-augmented generation play in that.
Today we are joined by Ori Goshen, Co-founder and Co-CEO of AI21 Labs, for a conversation about AI21's origin story, their differentiated approach to AI, and their ambitious platform and applications.
Today we are excited to welcome Carly Taylor for a broad discussion covering the numerous ways she's ingrained in the AI & data world including AI at Activision’s Call of Duty franchise, consulting at Rebel Data Science and being a prominent voice for data science on social media.
We're joined by Milos Rusic, CEO & Co-founder of deepset AI for a conversation on deepset's origin story as a bootstrapped company, a deep dive into the Haystack open source project and deepset's Cloud platform, and emerging applications for NLP and LLMs in the enterprise.
Dimitri Sirota, Co-Founder and CEO of BigID, which has raised $280M+ to date, joins us for a chat about the importance and complexities around knowing and controlling your enterprise data.
Today we are diving into the world of generative AI in Healthcare with CEO and Co-founder of Hippocratic AI, Munjal Shah. In this episode Matt and Munjal discuss the vision for Hippocratic AI, and the unique challenges and opportunities of deploying generative AI in the world of healthcare.
This week’s guest is top AI researcher, entrepreneur, and investor Richard Socher, CEO of AI search engine You.com. In this conversation, we go behind the scenes and discuss some core design principles and building blocks of the You.com platform, as well as its market positioning. We close the discussion with Richard’s investment thesis and approach in the fast-moving AI market.
Today we have the pleasure of talking to Lukas Biewald, CEO of Weights & Biases, for a conversation about Lukas' entrepreneurial journey building two companies in the MLOps space, the current capabilities of the Weights & Biases platform, lessons learned on the go-to-market front, and more!
Relational databases, data cloud's effect on infrastructure, serverless databases, and GTM strategies: Matt Turck and CockroachDB's Spencer Kimball cover it all in today's episode.
Today we're joined by Mike Murchison, Co-Founder & CEO of AI-native customer service platform Ada, for a talk about the Ada platform, reinventing customer service in the age of generative AI, and how AI should be onboarded and trained like an employee.
Today we're joined by Victor Riparbelli, CEO of the generative AI video platform Synthesia that just last month hit a $1B valuation after raising their Series C funding round. Matt and Victor go into the Synthesia platform, what it takes to build a successful AI company, and dive into the ethics of generative AI videos.
This week we're joined by Jerry Liu, Co-Founder & CEO of LlamaIndex, a startup that offers a data framework for connecting custom data sources to large language models, for a conversation about the emerging generative AI infrastructure stack, how startup founders navigate a field as new and fast-paced as generative AI, and more.
Today we're joined by Florian Douetteau, CEO and Co-Founder of Dataiku, for a conversation about the Dataiku platform, emerging use cases for generative AI in the enterprise, and some leadership lessons learned along the way. You can find Florian's essay, The Children of AI, here: https://children-of-ai.florian-douetteau.com
We're joined by Jeff Huber, Co-founder of Chroma, for a chat on open-source AI-native databases.
Today we sit down with George Sivulka, CEO of the AI productivity tool Hebbia, for a conversation on generative AI in fintech & government, and how Hebbia keeps "smart people from doing stupid tasks."
The New York Times' Chief Data Officer and author of "How Data Happened" joins us for a conversation on the multifaceted impact data has had on our society.
Today we're joined by Daniel Sternberg, Head of Data at Notion, for a conversation on the release of Notion AI and the infrastructure and processes needed to launch and integrate an AI product.
This week we welcome Edo Liberty, Founder & CEO of Pinecone, the vector database that receives 10,000+ sign-ups a day.
Today we're joined by the "Godfather of Cloud Computing" Amr Awadallah for a conversation on LLM powered search, AI hallucinations, and more.
We're joined by Sarah Catanzaro, General Partner at Amplify Partners and one of the leading investors in AI, ML, and data to talk about the startup landscape, LLMs, and more.
AssemblyAI Founder & CEO Dylan Fox joined FirstMark Managing Partner Matt Turck for Data Driven NYC! AssemblyAI is the fastest way to build with AI for audio. With a simple API, get access to production-ready AI models to transcribe and understand speech. AssemblyAI has raised $63M+.
William Falcon, Founder of Lightning AI, joins Matt Turck for a conversation on PyTorch, LLaMAs, the future of large language models, and more.