Super Data Science: ML & AI Podcast with Jon Krohn

The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship − everything you need to crush it with data science.

“The Fit Data Scientist” newsletter author Pénélope Lafeuille talks to Jon Krohn about how to give your all at work, offering her top tips for a healthy body and a healthy mind. Learn why “The SuperDataScience Podcast” made it onto her top 3 data science podcasts, and why following your passion can pay dividends for your career. Additional materials: www.superdatascience.com/952 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
VP of Engineering at Dropbox Josh Clemm speaks to Jon Krohn about consolidating search tools across apps with the AI-powered workspace Dropbox Dash, the new collaborative AI systems that enhance interoperability between team members and their projects, and how to avoid “context rot”. Dropbox Dash gives users the best of Dropbox’s cloud storage and search functions, plus a “universal search” ability to locate information across multimedia and apps. “AI really needs to understand you and your team, first and foremost, and all that connected data,” says Josh. This episode is brought to you by Dell, by Intel, by Airia and by MongoDB. Additional materials: www.superdatascience.com/951 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (01:07) All about Dropbox Dash (10:00) The benefits of browser-embedded AI (22:17) Why context engineering is so critical to agentic systems (37:51) How creating apps helps tech leadership (48:39) When to decide to use data versus intuition
In this special holiday episode, the SuperDataScience Podcast team comes together to wish you happy holidays and thank you for listening throughout the year. Team members from around the world share warm greetings in their own voices and languages as we reflect on another year of learning, curiosity, and community. From all of us at SDS, we wish you a joyful holiday season and look forward to bringing you more data science, machine learning, and AI content in the year ahead. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/950⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Alex “Sandy” Pentland, Toshiba Professor of Media Arts & Science at MIT and Fellow at Stanford, speaks to Jon Krohn about his new book, Shared Wisdom, why he attributes the collapse of the Soviet Union to AI, and why those risks to society could still be relevant today. We can only achieve better system performance, Alex says, when we build tools that keep step with the way that people make decisions. Listen to the episode to hear Alex talk about how he is helping make AI agents work for individuals rather than the companies that develop them, and his work in making sure that systems operate consistently and fairly across the world. This episode is brought to you by Dell, by Intel, by Fabi, and by Airia. Additional materials: www.superdatascience.com/949 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:19) About Alex Pentland’s new book, Shared Wisdom (16:00) About loyalagents.org (28:36) Why we need data unions (34:02) The governance of AI (41:24) How to measure the social impact of AI projects
In this November episode of the “In Case You Missed It” series, Jon Krohn selects his favorite clips from the month. Hear from Shirish Gupta and Tyler Cox (Episode 939), Vijoy Pandey (Episode 941), Marc Dupuis (Episode 937), and Maya Ackerman (Episode 943) on getting back to human motivation and the importance of evaluating the tools and data we use. Additional materials: www.superdatascience.com/948 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jeff Li tells Jon Krohn what it's like to work at scale as a data scientist and a machine learning engineer at Netflix, Spotify and DoorDash, as well as how to get a foot in the door at these companies. Jeff also discusses how to run forecasts and trends, and how to read their results. Listen to hear Jeff Li discuss how Spotify became a podcast powerhouse, his startup move.ai, and the tools he uses every day. This episode is brought to you by Dell, by Intel, by Fabi, and by Airia. Additional materials: www.superdatascience.com/947 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (09:05) Forecasting in data science (23:33) How to get a data science job at Netflix (30:06) Jeff’s experience of launching an AI startup (51:57) Jeff’s AI toolkit
Jon Krohn looks into the benefits of robotaxis, from safety to affordability, in this Five-Minute Friday. Hear about Waymo’s partnership with Jaguar Land Rover, the latest safety studies concerning driverless vehicles, and a case for robotaxis becoming the preferred method of transport in the US, where households spend roughly 15% of their budget on vehicle ownership. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/946⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Is there humor in data? Joel Beasley, host of Modern CTO, tells Jon Krohn how he used AI to turn his sights to stand-up comedy. He also shares his tips on tech leadership that he learned from his popular podcast, Modern CTO, and how he is using generative AI as a collaborative partner in his creative work. This episode is brought to you by Dell, by Intel, by Fabi, and by Gurobi. Additional materials: www.superdatascience.com/945 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:14) Joel Beasley on his comedy career (19:04) Applying the ‘memory palace’ technique (22:28) About The Modern CTO Podcast (36:24) Leadership advice from The Modern CTO
Google is steaming ahead with launching its top-league new Gemini 3 Pro model across its product suite, from Google Search to Vertex AI cloud services. The multinational tech company is also letting eager early adopters like Wayfair and GitHub try it out. Get all the details on the model, its performance across hard-to-game industry benchmarks, and what this all means for the way you use generative AI, in this week’s episode. Additional materials: www.superdatascience.com/944 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Creative human-AI partnerships and AI-generated music: WaveAI CEO and co-founder Maya Ackerman speaks with Jon Krohn about learning to see – and accept – AI’s potential as a creative partner in a human-centric, AI-forward future. Listen to the episode to hear Maya Ackerman discuss reframing hallucination as a creative force, her work at WaveAI, and how to push the boundaries of creativity using generative AI. This episode is brought to you by Dell, by Intel, by Gurobi and by Airia. Additional materials: www.superdatascience.com/943 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (05:20) Maya’s challenge to anthropocentrism (19:26) How to compose music with AI (28:13) How to invest in creative empowerment (32:18) How to produce genuinely creative artworks through AI (44:58) The future of GenAI
What’s on the horizon for AI? Jon Krohn wades through opinions from more than experts, curated by the Longitudinal Expert AI Panel (LEAP), about what we can expect from the industry. From estimates on AI-assisted workers through energy consumption to AI performance in highly skilled domains, find out just how much LEAP thinkers believe AI is permeating our daily work and life in this Five-Minute Friday.  Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/942⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Vijoy Pandey imagines a bold new society in which agents and humans make scientific discoveries and complete physical tasks together, and he tells Jon Krohn about his work at AGNTCY, Cisco’s open-source platform for the Internet of Agents. Listen to the episode to hear Vijoy Pandey talk about how a future society in which multi-agents and humans interact may be a real possibility, what TCP/IP is, how to find trustworthy AI agents, and how to get your hands on AGNTCY today! This episode is brought to you by Dell, by Intel, by Fabi and by Gurobi. Additional materials: www.superdatascience.com/941 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:37) All about AGNTCY (12:04) How an agent-human society might function (15:19) What an “Internet of Agents” means (27:17) The future of access management (41:39) How to trust AI agents (48:49) How to get started with AGNTCY
Jon Krohn curates a selection of clips from the month that was. Hear from the orchestrators of an expanding AI universe in this episode of In Case You Missed It, with news, views and groundbreaking ideas from Sheamus McGovern, Jerry Yurchisin, Stephanie Hare, Larissa Schneider, and Adrian Kosowski. We cover baby dragons, the Hippocratic Oath, and, of course, all the latest in artificial intelligence! Additional materials: www.superdatascience.com/940 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
State space models (SSMs), Granite models, and Mamba: Dell’s Tyler Cox and Shirish Gupta discuss with Jon Krohn why state space models can process information so efficiently, and how the Dell AI Factory helps enterprises manage custom AI workloads. Hear the latest on the Dell Pro AI Studio and Dell’s partnerships with IBM and Hugging Face in this episode. This episode is brought to you by Trainium2, the latest AI chip from AWS, and by Gurobi. Additional materials: www.superdatascience.com/939 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:58) Dell Pro AI Studio news (23:17) How Dell manages interoperability (28:08) About the Dell/IBM Granite models (47:38) How to troubleshoot AI tools (52:36) How Dell performs against benchmarks
Jon Krohn speaks to Rohan Kodialam, Cofounder and CEO of Sphinx, the company that redefines how machine intelligence reasons about data with frontier AI. In this Feature Friday, Jon and Rohan discuss the benefits of using Sphinx to assist with data analysis. Get under the hood to learn how Sphinx operates, from running commands to ensuring your data stays secure, and find out how you can get your hands on this great tool for free. Additional materials: www.superdatascience.com/938 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
AI tools won’t eliminate but elevate data scientists, says Marc Dupuis. The CEO of fabi.ai talks to Jon Krohn about the new wave of AI-driven platforms that integrate workflows within popular work tools like Slack and email, and how building AI-first products means widening access to all ability levels. This episode is brought to you by Gurobi, by Dell and by Intel. Additional materials: www.superdatascience.com/937 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (09:31) Will fabi.ai outshine data science practitioners (20:40) Resolving workflows with fabi.ai (24:08) Creating AI agents with fabi.ai (45:23) How to avoid ‘gaming’ targets
How much power – and risk – do we carry around with us in our pockets? A Reuters investigation about how easily LLMs can be utilized for online phishing scams is the subject of this week’s Five-Minute Friday with Jon Krohn. By asking six of the most popular LLMs (Grok, ChatGPT, Meta AI, Claude, DeepSeek and Gemini) to generate phishing emails specifically targeting elderly people, Reuters found the models’ safety measures sometimes severely lacking. Listen to the episode to hear Jon quantify this problem with real-world examples, why mere content warnings in LLMs don’t work, and the troubling results of the phishing requests. Additional materials: www.superdatascience.com/936 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jon Krohn speaks to researcher, broadcaster and author Stephanie Hare about how the Hippocratic Oath might apply to artificial intelligence, and a guiding ethos for pushing innovation while protecting users from harm. A code of conduct, she says, could be one approach to ensuring that people are using technology more mindfully and ethically, as well as an opportunity for users to feel that they belong to a wider, global community. Although she sympathizes with people concerned by overregulation undermining innovation, Stephanie also notes that we expect certain standards to be met elsewhere, such as vehicle and drug safety, as well as fair journalistic practices. As Stephanie explains, we need to find a realistic middle ground between innovation and regulation. This episode is brought to you by Dell, by Intel, by Fabi and by Gurobi. Additional materials: www.superdatascience.com/935 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (01:23) What ‘technology ethics’ is (14:46) Developing a Hippocratic Oath for tech (42:32) How to protect against sensationalism (53:38) How to maintain a balance of growth and infrastructure
With the number of jobs dramatically slowing in the last year, many question if this decline is down to companies turning to AI for completing entry-level tasks in particular. Research published earlier this month by Yale University shows no major difference in the types of roles and tasks in so-called “white-collar jobs” since late 2022, an auspicious date that coincides with the launch of ChatGPT. In this week’s Five-Minute Friday, host Jon Krohn discusses if and when AI will undercut junior-level jobs, particularly in the US. Additional materials: www.superdatascience.com/934 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Sheamus McGovern, CEO of Open Data Science, takes Jon Krohn and his listeners on a journey to launching his popular data science and AI conference, now in its tenth year, as well as the great shifts to the fields that he has seen on the way. For Sheamus, the growth of his Open Data Science Conference has shown him that an AI engineer is just the beginning of several roles that will emerge from the industry. He asks Jon to consider the breadth of tasks demanded of today’s engineers, from data profiling and transformation to feature engineering, hyper-parameter tuning, and model deployments. Just as the AI engineer emerged from the data scientist role, Sheamus expects the industry to respond to the broadening range of projects and tools with new, niche, and dynamic job roles. This episode is brought to you by Trainium2, the latest AI chip from AWS, by Gurobi, by Dell and by Intel. Additional materials: www.superdatascience.com/933 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:50) Why Sheamus started ODSC (18:27) The differences between AI engineers and data scientists (24:20) How to keep up with AI’s rapid pace (33:51) How people hire for AI orchestration (46:26) How companies can get team skillsets right
Larissa Schneider speaks to Jon Krohn in this Feature Friday about finding the right time to invest in AI solutions, and when it’s better to build them yourself. She discusses her work leading global strategy and operations at Unframe, and how the company has raised $50 million in venture capital since its launch in March 2025. Additional materials: www.superdatascience.com/932 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
AI predictions, and how to act on them: Data Science Strategist at Gurobi, Jerry Yurchisin, speaks to Jon Krohn about how mathematical optimization helps enterprises automate decisions for business success and where to find the resources to make it happen. This episode is brought to you by ODSC, the Open Data Science Conference, by Fabi, by Dell, and by Intel. Additional materials: www.superdatascience.com/931 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:34) What mathematical optimization is (13:58) How to get started with mathematical optimization (45:56) Gurobi’s use cases (56:29) Quantum computing and mathematical optimization
Jon Krohn’s highlights from this month of interviews focus on ways to future-proof your career, looking at the hardware that will get you the most mileage, the emerging roles that are well worth a look, and the developments in AI that will endure in a field constantly testing the durability of its own breakthroughs. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/930⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Breaking news: Jon Krohn welcomes Adrian Kosowski to the show to talk about the groundbreaking research happening at Pathway. Adrian and his team demonstrate how they have brought attention in AI closer to the way the brain functions, creating, in essence, a “massively parallel system of [artificial] neurons” that communicate with one another and exhibit properties similar to natural neurons. The goal is to move beyond the current limitations of transformers, so that reasoning can be generalized across more complex and extended reasoning patterns, approximating a more human-like approach to problem-solving. This episode is brought to you by Trainium2, the latest AI chip from AWS, by Dell, by Intel, and by Gurobi. Additional materials: www.superdatascience.com/929 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (01:27) Pathway’s ground-breaking new biologically inspired architecture (20:40) Limitless context windows (34:39) BDH architecture as positive space (53:11) Building multilingual models (1:01:07) How to access the BDH architecture
Prompt injections, malicious code, and AI agents: In this week’s Five-Minute Friday, Jon Krohn looks into the current security weaknesses found in AI systems. A structural vulnerability that The Economist dubs a “lethal trifecta” could cause havoc for AI users, unless we take the necessary steps to contain our systems.  Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/928⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Earlier this year, David Loker joined CodeRabbit as their Director of AI. As more people come to write code with the help of large language models, David believes CodeRabbit will become a helpful assistant for code reviewing and pull requests. He tells Jon Krohn how CodeRabbit assists developers with real-time feedback, as well as the reality of vibe coding, the optimization challenges of agentic AI, and other pressing questions in AI and tech. This episode is brought to you by Dell, by Intel, by Gurobi and by ODSC, the Open Data Science Conference. Additional materials: www.superdatascience.com/927 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (01:26) How CodeRabbit helps with coding (17:30) Context engineering in context (40:40) How CodeRabbit keeps data secure (46:10) David’s thoughts on “vibe coding” (1:03:04) If machines will ever be truly creative
In this Five-Minute Friday, Jon Krohn explores how AI is reshaping the legal industry. He investigates how AI tools are helping lawyers reach conclusions faster, how paralegals are being retrained, and the latest in-demand role in law (hint: it concerns AI). Listen to hear how Harvey AI and Thomson Reuters’ CoCounsel are using AI to help lawyers get ahead. Additional materials: www.superdatascience.com/926 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Tech innovation’s dependence on economic systems, trust in technology throughout history, and job displacement through AI: The Dieter Schwarz Associate Professor of AI and Work at the University of Oxford, Carl Benedikt Frey, talks to Jon Krohn about his latest book, How Progress Ends, as well as how different economic systems deal with innovation and scaling, dealing with the homogeneity of generative AI output, and how to stay afloat in the new wave of job automation. This episode is brought to you by Dell, by Intel, by ODSC, the Open Data Science Conference, and by Gurobi. Additional materials: www.superdatascience.com/925 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:00) All about How Progress Ends: Technology, Innovation, and the Fate of Nations (14:26) The role of weak ties in driving technological innovation (18:22) How to keep innovating as a big business (48:05) What we can learn and apply from previous industrial revolutions (54:33) How workers can try to ‘future-proof’ themselves
MIT lab NANDA (“Networked AI Agents in Decentralized Architecture”) reveals less than promising results for the future of AI adoption in businesses. According to “The GenAI Divide: State of AI in Business 2025”, a whopping 95% of enterprise AI projects “are getting zero return” on their $30-40 billion investment. Jon Krohn takes this Five-Minute Friday to look into why this has happened, with help from a critical response to the report written by Futuriom’s R. Scott Raynovich. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/924⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Graphs, but not as you would expect them: Graph analytics guru Amy Hodler speaks to Jon Krohn about the graph data structure and graph applications, graph algorithms, graph RAG, and graphs as memory systems for AI agents. We can use graphs in a surprising number of ways. Money laundering and fraud, as well as supply-chain crime, leave breadcrumbs at multiple “touch-points” over time, behaviors that graphs are better suited to reveal than rows and tables. Amy sees that most interest in graphs has been in the cybersecurity space. But this work isn’t only restricted to fighting crime! Listen to the episode to hear more case examples and how to get into graph work. This episode is brought to you by Dell, by Intel, by ODSC, the Open Data Science Conference, and by Gurobi. Additional materials: www.superdatascience.com/923 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (01:49) A brief history of graphs (10:08) Uncovering fraud with graphs (28:31) Where graphs are most commonly applied, to date (34:49) Retrieval augmented generation graphs (48:04) The future of graphs
Hugo Dozois-Caouette speaks to Jon Krohn about his startup MaintainX and how he secured $254 million in venture capital, reaching a $2.5 billion valuation. MaintainX builds computerized maintenance management systems (CMMS) and enterprise asset management (EAM) software for industrial and manufacturing companies. This "digital clipboard" delivered through web and mobile apps connects machines, work orders, and frontline teams to boost productivity, reduce downtime, and prevent costly breakdowns. The platform captures knowledge from experienced workers and delivers AI-powered insights, with features like MaintainX CoPilot helping teams troubleshoot issues and make faster decisions. Listen to the episode to hear Hugo's perspective on manufacturing gaps that technology can fill, MaintainX's tech stack, and how CMMS platforms address information disconnects that slow down frontline teams. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/922⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Using Windows for AI development and the bleeding edge of NPUs: Shirish Gupta and Ish Shah from Dell Technologies speak to Jon Krohn about the latest products from Dell, the future of neural-processing units (NPUs), and how AI developers can make sound hardware investments. This episode is brought to you by Trainium2, the latest AI chip from AWS, by ODSC, the Open Data Science Conference, and by Gurobi. Additional materials: www.superdatascience.com/921 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:18) Why Windows still outranks other operating systems (20:58) The difference between GPUs and NPUs (32:44) How to access and use Dell’s NPUs and GPUs (49:08) Using processing units on the cloud versus locally (57:43) About the Dell Pro Max
This month’s episode of In Case You Missed It gives us reasons to be cautiously optimistic about the future of large language models (LLMs), with guests discussing what to do about recent reports that found AI agents blackmailed human users when threatened, the importance of post-training LLMs, and the training we have available for data and AI engineers to create robust, secure, and useful AI. Jon Krohn includes clips from his interviews with Akshay Agrawal (Episode 911), Julien Launay (Episode 913), Michelle Yi (Episode 915), and Kirill Eremenko (Episode 917).  Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/920⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
PyTorch, AGI, and the future of alignment research: Aurélien Géron joins Jon Krohn in this live interview to talk about the fourth edition of his bestselling Hands-On Machine Learning, what makes him hopeful about superintelligence, and what concerns him about machines surpassing human intelligence. This episode is brought to you by Gurobi and by the Dell AI Factory with NVIDIA. Additional materials: www.superdatascience.com/919 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:04) Why Aurélien wrote Hands-On Machine Learning (20:54) How Aurélien came to decide on material for the new edition (28:53) Aurélien’s predictions for AGI (51:21) How to support alignment research (1:13:42) Does superintelligence mean super-capability
In this Five-Minute Friday, Jon Krohn introduces listeners to CrewAI, an open-source Python framework that can create and manage multi-agent teams. The clue is in the title: CrewAI assembles specialized agents into single “crews” that achieve complex goals between them. CrewAI’s agent teams can also learn and iterate, meaning that after the crew has achieved its goals for the first time, they can refine and tailor their approach to  future goals.  Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/918⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Founder of SuperDataScience, Kirill Eremenko, talks to Jon Krohn about how he found the best tools and approaches to help launch his 8-week AI engineering bootcamp. He breaks down the topics participants cover each week, and he also shares his tips with listeners who might want to start their own tech bootcamp or sign up for SuperDataScience’s September 2025 cohort. This episode is brought to you by the Dell AI Factory with NVIDIA and by ODSC, the Open Data Science Conference. Additional materials: www.superdatascience.com/917 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (10:58) Weeks 1-4 of the SuperDataScience bootcamp (37:52) How to use AI to drive the bottom line in business (47:50) Weeks 5-8 of the SuperDataScience bootcamp (54:50) How to convert LLMs to agents (1:09:33) Jon’s feedback on the SuperDataScience bootcamp
GPT-5 has just been released, but without much fanfare. In this Five-Minute Friday, Jon Krohn asks if GPT-5 deserves the community’s underwhelmed response to its release. He outlines five features of the model and explains why people might be feeling less than enthusiastic in the broader context of LLM development. Which LLMs are leading the way, and which are still playing the game of catch-up? Additional materials: www.superdatascience.com/916 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Tech leader, investor, and Generationship cofounder Michelle Yi talks to Jon Krohn about finding ways to trust and secure AI systems, the methods that hackers use to jailbreak code, and what users can do to build their own trustworthy AI systems. Learn all about “red teaming” and how tech teams can handle other key technical terms like data poisoning, prompt stealing, jailbreaking and slop squatting. This episode is brought to you by Trainium2, the latest AI chip from AWS and by the Dell AI Factory with NVIDIA. Additional materials: www.superdatascience.com/915 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:31) What “trustworthy AI” means (31:15) How to build trustworthy AI systems (46:55) About Michelle’s “sorry bench” (48:13) How LLMs help construct causal graphs (51:45) About Generationship
In this Five-Minute Friday, Cofounder and CTO of lakeFS Oz Katz talks to Jon Krohn about data warehouses, data lakes, and how companies can handle increasingly complex data infrastructures and formats. Hear about lakeFS’s collaboration with Legofest, lakeFS’s approach to helping users collaborate on data lakes, and how to overcome the challenges of working with multimodal data. Additional materials: ⁠www.superdatascience.com/914⁠ This episode is brought to you by the ⁠Dell AI Factory with NVIDIA⁠.
Julien Launay launched Adaptive to give data science teams in business enterprises their “RLOps tooling” to make reinforcement learning easier. Talking to Jon Krohn, Julien says, “Most of our users are data scientists who write Python codes to interface with the system”. Adaptive is also able to work with companies without data science teams, collaborating with partners like Deloitte to add the necessary personnel. Julien is currently working on making his platform more widely available. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/913⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode of In Case You Missed It, we look back on five great interview episodes from July. Hear from Lilith Bat-Leah (Episode 901), Sinan Ozdemir (Episode 903), Sebastian Gehrmann (Episode 905), Zohar Bronfman (Episode 907) and Robert Ness (Episode 909). They’ll tell you why data-centric machine learning is so important across disciplines, starting with law, and how we can use AI benchmarks and “red teaming” to refine our search for the best AI models.  Additional materials: ⁠⁠⁠⁠www.superdatascience.com/912 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Reproducibility, Python notebooks, and data science communities: Software developer Akshay Agrawal speaks to Jon Krohn about Marimo, the next-generation computational notebook for Python, how he built and fostered a thriving community around the product, and what makes this notebook so versatile and accessible for users.  Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/911⁠⁠⁠⁠⁠ This episode is brought to you by ⁠Trainium2, the latest AI chip from AWS ⁠and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this Five-Minute Friday, Jon Krohn looks into AI’s disruption of the journalism industry and how it has fundamentally reshaped news production. Lawsuits from multiple news outlets over ChatGPT’s use of copyrighted materials may have grabbed the most headlines to date, but this isn’t to say news media is rebuffing AI entirely. On the contrary, several outlets have launched summarization and analysis tools for both internal and external use, such as The New York Times’s Echo and The Washington Post’s Haystacker. This episode looks into the ways major news outlets are utilising AI, and what this means for journalists. Additional materials: www.superdatascience.com/910 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Researcher at Microsoft Robert Osazuwa Ness talks to Jon Krohn about how to achieve causality in AI with correlation-based learning, the right libraries, and handling statistical inference. When dealing with causal AI, Robert notes how important it is to stay aware of variables in the data that may mislead us and force inaccurate assumptions. Not all variables will be useful. It is essential, then, that any assumptions are grounded in a deeper understanding of how the data were gathered, and not just in what appears in the dataset. Listen to the episode to hear how you can apply causal AI to your projects. Additional materials: www.superdatascience.com/909 This episode is brought to you by Trainium2, the latest AI chip from AWS and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
The moral and ethical implications of letting AI take the wheel in business, as revealed by Anthropic: Jon Krohn looks into Anthropic’s latest research on how to use and deploy LLMs safely, specifically in business environments. The team designed scenarios to test the behavior of AI agents when given a goal and a set of obstacles to reach it. Those obstacles included 1) threats to the AI’s continued operation, and 2) conflict between the AI’s goals and the goals of the company. Hear Jon break down the results of this research in this Five-Minute Friday. Additional materials: ⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/908⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
“Intelligence has many forms,” says Zohar Bronfman, who speaks with Jon Krohn about the fascinating intersection between computational neuroscience and philosophy, and how it has brought him closer to understanding what is necessary to develop human-like intelligence in machines, as well as his motivations for launching Pecan AI and why predictive models outstrip generative models in business. Additional materials: www.superdatascience.com/907 This episode is brought to you by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:47) Why LLMs aren’t bringing us closer to AGI (33:44) About Pecan AI (51:03) Why data modeling is so challenging (1:01:25) How Pecan AI makes its tools widely accessible
Jason Corso speaks to Jon Krohn in this Five-Minute Friday all about Voxel51’s latest tool, Verified Auto-Labelling, and the company’s incredible success in developing popular tools for computer vision. Additional materials: ⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/906⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
RAG LLMs are not safer: Sebastian Gehrmann speaks to Jon Krohn about his latest research into how retrieval-augmented generation (RAG) actually makes LLMs less safe, the three ‘H’s for gauging the effectiveness and value of a RAG, and the custom guardrails and procedures we need to use to ensure our RAG is fit for purpose and secure. This is a great episode for anyone who wants to know how to work with RAG in the context of LLMs, as you’ll hear how to select the best model for the purpose, useful approaches and taxonomies to keep your projects secure, and which models he finds safest when RAG is applied. Additional materials: www.superdatascience.com/905 This episode is brought to you by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:28) Findings from the paper “RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models” (09:35) What attack surfaces are in the context of AI (38:51) Small versus large models with RAG (46:27) How to select an LLM with safety in mind
In this Five-Minute Friday, Jon Krohn reveals how AI is taking on the glitzy world of advertising. Bold claims from Meta and OpenAI that users will soon be able to plug in what they want and have AI churn out an ad campaign for little to no cost are shaking the advertising industry to its core. The fact that the four biggest sellers of ads (Google, Meta, Amazon, and ByteDance) are digital companies and accounted for over half of the global market in 2024 rubs salt in the wound. Hear the three ways that AI is disrupting the industry, and who (or what) has the most influence on digital consumers to date. Additional materials: www.superdatascience.com/904 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training data and the necessity of human-led quality assurance to detect AI hallucinations, when and why to be skeptical of AI benchmarks, and the future of benchmarking agentic and multimodal models. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/903⁠⁠⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by ⁠⁠Adverity, the conversational analytics platform⁠⁠ and by the ⁠⁠Dell AI Factory with NVIDIA⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (16:48) Sinan’s new podcast, Practically Intelligent (21:54) What to know about the limits of AI benchmarking (53:22) Alternatives to AI benchmarks (1:01:23) The difficulties in getting a model to recognize its mistakes
In this episode of “In Case You Missed It”, Jon recaps his June interviews on The SuperDataScience Podcast. Hear from Diane Hare, Avery Smith, Kirill Eremenko, and Shaun Johnson as they talk about the best portfolios for AI practitioners, how to stand out in a saturated candidate market for AI roles, how to tell when an AI startup is going places, and ways to lead AI change in business. Additional materials: ⁠⁠⁠www.superdatascience.com/902 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Senior Director of AI Labs for Epiq Lilith Bat-Leah speaks to Jon Krohn about the ways AI has disrupted the legal industry using LLMs and retrieval-augmented generation (RAG), as well as how the data-centric machine learning research movement (DMLR) is systematically improving data quality, and why that is so important. Additional materials: www.superdatascience.com/901 This episode is brought to you by the Dell AI Factory with NVIDIA and by Adverity, the conversational analytics platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (05:45) Deciphering legal tech terms (TAR, e-discovery) (13:47) How legal firms use data and AI (29:01) All about data-centric machine learning research (DMLR) (46:58) Lilith’s career journey in the AI industry
“Stay happy and healthy”: In this special Five-Minute Friday, Jon Krohn speaks with Annie, his grandmother, on her 95th birthday. Hear how she is physically and mentally coping with illnesses that limit her mobility and the joys of having a pet. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/900⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Data science skills, a data science bootcamp, and why Python and SQL still reign supreme: In this episode, Kirill Eremenko returns to the podcast to speak to Jon Krohn about SuperDataScience subscriber success stories, where to focus in a field that is evolving incredibly quickly, and why in-person working and networking might give you the edge over other candidates in landing a top AI role. Additional materials: ⁠⁠⁠⁠www.superdatascience.com/899⁠⁠⁠ This episode is brought to you by ⁠Adverity, the conversational analytics platform⁠ and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:35) Stories from five SuperDataScience subscribers (27:32) How to secure a career in a fast-paced industry (44:19) How to stand out against huge competition in data science (1:01:40) The importance of communication in data science (1:16:41) Where to focus your skills in AI engineering
In this Five-Minute Friday, Jon Krohn announces his new, free workshop on Agentic AI. In this comprehensive four-hour course, you’ll learn the key terminology for working with these flexible, multi-agent systems and then get to grips with developing and deploying this artificial “team of experts” for all your AI-driven projects. Additional materials: www.superdatascience.com/898 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Diane Hare talks to Jon Krohn about the power of storytelling for corporate buy-in of AI initiatives, how to actively implement AI to transform organizations, and how emerging professionals can upskill themselves. Hear how she discovered her background in storytelling at Ernst & Young and her work with Simon Sinek, which she finds to be integral to her process. Inspired by Sinek’s aphorism “start with why”, Diane notes that many companies neglect this crucial part of their mission because they never take the time to work on it. Additional materials: ⁠⁠⁠www.superdatascience.com/897⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:51) How Y Carrot works with BizLove (14:19) How BizLove prioritizes change management (29:18) How to upskill effectively (42:37) How BizLove integrated data from two enterprises (48:52) How to enable change in your business
The Economist reported that global Google searches for "AI unemployment" hit an all-time high earlier this year. But do we have to worry about AI taking our jobs? In this week’s Five-Minute Friday, Jon Krohn investigates whether the rise of AI has directly led to an increase in unemployment.  Additional materials: ⁠⁠⁠⁠www.superdatascience.com/896 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
How to get funded by a VC specializing in AI: Head of AIX Ventures Shaun Johnson talks to Jon Krohn about investment strategies, how to simplify AI adoption, why a little competition can be so beneficial to AI startups, and how Big Tech is circumventing anti-monopoly measures. Additional materials: ⁠⁠www.superdatascience.com/895⁠ This episode is brought to you by the ⁠⁠Dell AI Factory with NVIDIA and by ⁠Adverity, the conversational analytics platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (10:36) What Shaun looks for when evaluating early-stage AI startups (19:11) Building out AI startups (41:44) How AI practitioners can future-proof their careers (45:27) How to measure AI impact (53:30) The key verticals ripe for AI disruption
In this episode of “In Case You Missed It”, Jon Krohn takes clips from interviews with guests in May 2025. From AI agent integration and RAG-based chatbots to education through virtual reality headsets and data harmonization, this episode explores how industry leaders are developing the tools and technologies that can improve operations, education, healthcare, and marketing. Highlight clips are with John Roese, Global Chief Technology Officer and Chief AI Officer at Dell Technologies (Episode 887), Senior Developer Relations Engineer at Posit, PBC Jeroen Janssens and Lead Data Scientist at Xomnia Thijs Nieuwdorp (Episode 885), Founder of CEEK Mary Spio (Episode 889), and Martin Brunthaler, Co-founder and Chief Technology Officer at Adverity (Episode 891).  Additional materials: ⁠⁠www.superdatascience.com/894⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Avery Smith is a passionate and motivational YouTuber and careers educator for data science. In this episode, Jon Krohn asks Avery about the tools and tricks he has learned from personal experience and from his students for getting ahead in the tech industry. Avery shares the “learning ladder” he uses to help newcomers start on the right foot, with great examples from former bootcamp students who have put his theories into practice. And, if you’re using LinkedIn to find jobs, Avery explains why this might be one of the reasons you’re not getting work. Additional materials: www.superdatascience.com/893 This episode is brought to you by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:02) How to take the jump into a data science career (15:19) Avery’s recommended strategy for starting a career in data science (18:10) Recommendations for people learning data science with LLMs (32:52) What should go into a data science portfolio (46:07) Why Avery prefers practice over theory in teaching data science (48:25) The bare minimum to get your first job in data science
Businesses have entered a “trough of disillusionment” for AI. In this Five-Minute Friday, Jon Krohn learns why Fortune 500 execs are so frustrated with the tools and how they can work their way up the “slope of enlightenment” towards effective AI. Hear why AI takeup hasn’t so far gone to plan in the corporate world and what that world needs from AI to encourage greater business engagement.  Additional materials: ⁠⁠⁠www.superdatascience.com/892⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Martin Brunthaler talks to Jon Krohn about founding Adverity, a data analytics platform for marketing that simplifies integrating data from multiple sources and crunching them into actionable insights. Learn how Adverity became a data analytics powerhouse serving multiple industries, and why Martin thinks AI will strengthen rather than diminish the job market for data scientists, data analysts, and machine learning engineers. Additional materials: www.superdatascience.com/891 Today’s episode is brought to you by Trainium2, the latest AI chip from AWS and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:52) How Martin co-founded Adverity (14:26) The features of Adverity (39:24) Whether data analysts, data scientists, and ML engineers should worry about Adverity making their jobs redundant (48:29) Martin’s predictions for the future for data analysts and data scientists (51:39) Martin’s tips for success as a CTO
In this week’s Five-Minute Friday, Jon Krohn reveals highlights from Stanford University’s AI Index Report. Released a few weeks ago by the Institute for Human-Centered AI, this annual report details the incredible technical advances, policies, and investments in artificial intelligence. Hear which models achieve the best performance relative to their size, in what scenarios top AI systems can outperform humans (and when humans still outperform AI), and more in Jon’s five key takeaways. Additional materials: ⁠⁠www.superdatascience.com/890⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
CEEK founder Mary Spio talks to Jon Krohn about how the platform contributes to the emerging community of digital creators with its blockchain-powered virtual experiences. Hear how Mary got her first investors for CEEK and how it is used across industries as diverse as education, entertainment, aviation, and healthcare. Additional materials: www.superdatascience.com/889 This episode is brought to you by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:42) What CEEK is and the multiple industries it serves (38:47) How Mary developed VR headsets to reduce nausea experienced by women headset users (42:10) The growing potential for immersive experiences (44:36) How to mitigate the risks of immersive-experience misuse (51:56) Mary’s tips for career success
Mike Pell speaks to Jon Krohn about The Microsoft Garage, a program that drives the culture of innovation at the tech multinational, and how listeners can apply their principles to foster innovation in their workplace. In this Five-Minute Friday, you’ll hear more about Microsoft’s approaches to agentic AI, the future of human-AI collaboration in the workplace, and why experimentation and curiosity are critical skills for the future of work. Additional materials: ⁠www.superdatascience.com/888⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jon Krohn speaks to John Roese about the promise of multi-agent teams for business, the benefits of agentic AI systems that can identify and complete tasks independently, and how these systems demand new authentication, authorization, security and knowledge-sharing standards. They also discuss how to use AI to refine project ideas down to a core business need, as well as the new and emerging careers in the tech industry and beyond, all thanks to AI. Additional materials: ⁠⁠www.superdatascience.com/887 This episode is brought to you by ⁠Adverity, the conversational analytics platform⁠ and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:54) Why ROI is the most important aspect of an AI-driven project  (14:06) Why high-impact AI projects trigger a flywheel of success (23:32) The future of agentic systems  (30:28) How to manage agentic systems at scale (46:36) The disruptive nature of quantum computing
Our In Case You Missed It episode for April has clips on NVIDIA’s and Dell’s product and service offers including an overview of NVIDIA’s GPUs, AI Enterprise, and its microservices. You’ll also hear about AWS’ focus on bringing choice to customers and the incredible power of its Graviton CPU, how Zerve opens access to AI deployment, Merck KGaA, Darmstadt, Germany’s multi-chip integration, and why reliance on the cloud might soon become a practice of times past. Additional materials: ⁠www.superdatascience.com/886⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jeroen Janssens and Thijs Nieuwdorp are data frame library Polars’ greatest advocates in this episode with Jon Krohn, where they discuss their book, Python Polars: The Definitive Guide, best practices for using Polars, why Pandas users are switching to Polars for data frame operations in Python, and how the library reduces memory usage and compute time by up to 10x compared with Pandas. Listen to the episode to be a part of an O’Reilly giveaway! Additional materials: www.superdatascience.com/885 This episode is brought to you by Trainium2, the latest AI chip from AWS, by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:44) Why Jeroen and Thijs wrote Python Polars: The Definitive Guide (21:54) Best practices in Polars (25:55) Why Polars has so many users (34:32) The benefits of the Great Tables package (51:06) Jeroen and Thijs’ partnership with NVIDIA and Dell for Python Polars: The Definitive Guide
Model Context Protocol (MCP) is Anthropic’s hottest tool, with over 1,000 community-built MCP servers in operation by February alone. In this Five-Minute Friday, Jon Krohn explains what took so long for users to catch on: Anthropic released MCP in November 2024. Hear more about the buzz behind MCP, its applications, and how easy it is to get started. Additional materials: ⁠www.superdatascience.com/884⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Returning after the “Super Bowl of AI”, NVIDIA GTC, Sama Bali and Logan Lawler talk to Jon Krohn about their respective work at tech giants NVIDIA and Dell. Sama and Logan discuss everything from the next-gen Blackwell GPUs to NVIDIA’s collaboration with Dell in launching Pro Max PCs specially designed to take on heavy computational workloads, as well as the incredible performance of GB 10 and GB 300 workstations and the widening accessibility of AI developer tools and models. Additional materials: www.superdatascience.com/883 This episode is brought to you by ODSC, the Open Data Science Conference, by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:29) About Dell’s Pro Max PCs (14:01) Why having a Blackwell GPU from NVIDIA is a great option for those new to training and deploying AI models (36:47) When it makes sense for a data scientist to switch from a Unix to a Windows based system (46:33) Logan’s and Sama’s predictions for AI
This week’s Five-Minute Friday heads to the Netherlands to find out more about Dutch company ASML, the brains behind the lithography machines that build AI chips. Jon Krohn walks through how ASML came to dominate the market, where they’re headed next, and how ASML’s complex machines shape AI chips as well as the very future of AI. Additional materials: www.superdatascience.com/882 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Emily Webber speaks to Jon Krohn about her work at Amazon Web Services, from its Annapurna Labs-developed Nitro System, a foundational technology that can enhance security and performance in the cloud, to how Trainium2 became AWS’ most powerful AI chip with four times the compute of Trainium. Hear the specs of AWS’s chips and when to use them. Additional materials: www.superdatascience.com/881 This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (08:36) Emily’s work on AWS’ SageMaker and Trainium (23:54) How AWS Neuron lets builders tailor their approach to using frameworks (29:07) Why using an accelerator is better than using a GPU (35:29) The key differences between AWS Trainium and AWS Trainium2 (52:45) How to select between AWS Trainium and AWS Trainium2
First developed in China, Manus AI and DeepSeek have made great waves on an international scale. Sought-after for their cost-effectiveness compared to US-made tech, Manus AI and DeepSeek are quickly becoming dominant technologies inside the country. In this Five-Minute Friday, Jon Krohn asks: Do these technologies warrant the huge amount of resources spent on them by multiple industries in China, and what makes hype become a mainstay? Additional materials: www.superdatascience.com/880 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Greg Michaelson speaks to Jon Krohn about the latest developments at Zerve, an operating system for developing and delivering data and AI products, including a revolutionary feature allowing users to run multiple parts of a program’s code at once and without extra costs. You’ll also hear why LLMs might spell trouble for SaaS companies, Greg’s ‘good-cop, bad-cop’ routine that improves LLM responses, and how RAG (retrieval-augmented generation) can be deployed to create even more powerful AI applications. Additional materials: www.superdatascience.com/879 This episode is brought to you by Trainium2, the latest AI chip from AWS and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:00) Zerve’s latest features (35:26) How Zerve’s built-in API builder and GPU manager lowers barriers to entry (40:54) How to get started with Zerve (41:49) Will LLMs make SaaS companies redundant? (52:29) How to create fairer and more transparent AI systems (56:07) The future of software developer workflows
AI stacks, AGI, training neural networks, and AI authenticity: Jon Krohn rounds up his interviews from March with this episode of “In Case You Missed It”. In his favorite clips from the month, he speaks to Andriy Burkov (Episode 867), Natalie Monbiot (Episode 873), Richmond Alake (Episode 871) and Varun Godbole (Episode 869). Additional materials: www.superdatascience.com/878 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
NPUs, AIPC, and Dell’s growing suite of AI products: Shirish Gupta speaks to Jon Krohn about neural processing units and what makes them a go-to tool for AI inference workloads, reasons to move your workloads from the cloud to your local devices, what the abbreviation AIPC stands for and why it will soon be on everyone’s lips, and he offers a special intro to Dell’s new Pro-AI Studio Toolkit. Hear about several real-world AIPC applications run by Dell’s clients, from detecting manufacturing defects to improving efficiencies for first responders, massively supporting actual life-or-death situations. Additional materials: www.superdatascience.com/877 This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:28) What neural processing units (NPUs) are (23:53) About Dell Pro AI Studio (35:03) Use cases for Dell Pro AI Studio (45:16) How AI development workflows and applications will change (49:01) About Dell’s AI factory ecosystem
Small, simple, accessible: Hugging Face makes a huge contribution to the agentic AI wave with its smolagents. Jon Krohn explores how this small-but-mighty new Python library can act as the best personal assistant you never had. Hear about its features and use cases in this Five-Minute Friday. Additional materials: www.superdatascience.com/876 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Why are semiconductors so essential in this digital age, and how are they made? Jon Krohn speaks to electronics CEO Kai Beckmann about Merck KGaA, Darmstadt, Germany’s intricate manufacturing process, how we can use AI to develop materials that power next-gen AI technologies, and how a chip with the processing power of the human brain might one day be able to run on the power of a low-watt light bulb. Additional materials: www.superdatascience.com/875 This episode is brought to you by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (06:26) How Merck KGaA, Darmstadt, Germany supports groundbreaking developments in AI  (13:42) Material science’s biggest challenges for AI  (29:55) What heterogeneous integration is (34:37) How optical tech influences the electronics industry  (49:04) Navigating upturns and downturns in the semiconductor industry  (53:08) How AI regulations benefit humanity
In this Five-Minute Friday, Jon Krohn talks baseball. For decades, coaches have relied on player performance stats to make in-game decisions and refine their season strategies. Now, AI led by Statcast is taking baseball strategy even further, massively broadening analytics data to include pitch, swing and catch trajectories, spin rates, biomechanical information, player matchups, and how to enhance player performances. Listen to the episode to find out what other industries can learn from the “data-friendly” sport of baseball. Additional materials: www.superdatascience.com/874 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Natalie Monbiot is an independent advisor and collaborator for projects that concern the “virtual human”, and she is “going all in on the virtual human economy”. Jon Krohn speaks to Natalie about these new ventures, how to mitigate the divide between AI users and nonusers, and how anyone can collaborate with AI without compromising their own creativity. Additional materials: www.superdatascience.com/873 This episode is brought to you by the Dell AI Factory with NVIDIA, by Trainium2, the latest AI chip from AWS and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:21) Natalie’s influences for her work (18:30) Will machines surpass human intelligence? (29:08) Using LLMs as collaborators and partners (40:15) How platforms demand user engagement and time (56:54) Natalie Monbiot at Wizly
In this five-minute Friday, Jon Krohn looks into Microsoft’s recent release of Majorana 1, a new quantum processing unit that uses topological qubits, a step away from the fragile qubits currently in use. Get Jon’s thoughts about this “transistor for the quantum age”, potential applications for quantum computing, and why this marks an exciting future for data science and machine learning practitioners. Additional materials: www.superdatascience.com/872 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Agentic AI, AI success strategies, and why flexibility will be so important to keep up with the AI market: Jon Krohn talks to Richmond Alake about the NoSQL database MongoDB, including why it’s a great addition to your toolkit for developing (agentic) AI applications, with a look under the hood at its native vector database. Richmond also talks about why he expects multi-agent AI architectures to go mainstream in 2025.  Additional materials: www.superdatascience.com/871 This episode is brought to you by the Dell AI Factory with NVIDIA and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:10) How Richmond became a Staff Developer Advocate (07:40) How NoSQL database differs from a relational database (16:50) The advantages of working with the cloud-based MongoDB Atlas (32:26) Richmond’s predictions for agentic AI (40:38) How to create an effective AI strategy
In this Five-Minute Friday, Jon Krohn looks into what he considers the world’s most powerful research tool to date, OpenAI’s Deep Research. Find out how OpenAI trained Deep Research to compile literature reviews of limitless topics, what similar tools are on the market, and where Jon sees the tool as having real-world value including how he uses it daily. Additional materials: www.superdatascience.com/870 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jon Krohn talks to Varun Godbole about AI prompt engineering, generative wisdom, and AI generalists in this episode all about the interrelationships between humans and AI. Additional materials: www.superdatascience.com/869 This episode is brought to you by the Dell AI Factory with NVIDIA and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (10:44) Using deep learning to predict breast cancer (15:55) All about Varun’s Tuning Playbook (29:56) On the explosion of interest and news about AI and data science  (46:35) About Varun’s Wise AI
How to start a successful tech company, and how you can get started with DBT, TabPFN and BAML: Jon Krohn rounds up his favorite moments from February in this episode of “In Case You Missed It”. Additional materials: www.superdatascience.com/868 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
The realities of Agentic AI, AGI, and chatbots that don’t hallucinate: Andriy Burkov talks to Jon Krohn about AI in 2025. Best known for his concise machine learning modelling books, author and AI influencer Andriy Burkov also talks about his latest publication in the series, The Hundred-Page Language Models Book. Additional materials: www.superdatascience.com/867 This episode is brought to you by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:38) Andriy’s “trilogy” of books on machine learning (29:32) On the limitations of AI agents (41:12) On the prospect of artificial general intelligence (AGI) (54:24) On developing a chatbot that doesn’t hallucinate (01:10:07) On open-weight and open-source LLMs
Jon Krohn addresses a question for the ages: How close are we, really, to Jurassic Park? Dallas-based biotech company Colossal Biosciences is developing technology that aims to return previously extinct animals like the dodo and woolly mammoth to earth and, crucially, pull many others like the white rhino back from the brink of extinction.  Additional materials: www.superdatascience.com/866 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jon Krohn talks to Cal Al-Dhubaib about the extraordinary success of AI and machine learning solutions provider Pandata, his ironclad hack for any company to define their core values, and how to attract and secure loyal clients. Cal thinks tech professionals make two critical mistakes in their careers: The first is that they too often enjoy being the gatekeepers of their work rather than educating their clients and coworkers as to the details of their projects and why they benefit the company. The second is that tech professionals don’t show vulnerability, whether that means not knowing a topic or not fully understanding how a business works. This issue, Cal says, can spell the difference between a startup’s success and failure. Learn how tech startups can make an ironclad strategy for their future in this episode of The SuperDataScience Podcast. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (09:32) How to scale a successful data science consultancy (22:25) How Pandata navigates highly regulated environments (27:59) How to tackle tech illiteracy in business (36:32) What skills Cal looks for in new hires (35:56) How to sell on a tech company Additional materials: www.superdatascience.com/865
Jon Krohn investigates OpenAI’s new release, o3-mini, in this five-minute Friday, where he walks through the reasoning model’s capabilities and performance, cross-examining them against other major-league players, DeepSeek-R1, GPT-4o and Claude 3.5 Sonnet. Additional materials: www.superdatascience.com/864 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jon Krohn talks tabular data with Frank Hutter, Professor of Artificial Intelligence at Universität Freiburg in Germany. Despite the great steps that deep learning has made in analysing images, audio, and natural language, tabular data has remained its insurmountable obstacle. In this episode, Frank Hutter details the path he has found around this obstacle, even with limited data, by using a ground-breaking transformer architecture. Named TabPFN, this approach is vastly outperforming other architectures, as attested by a write-up of TabPFN’s capabilities in Nature. Frank talks about his work on version 2 of TabPFN, the architecture’s cross-industry applicability, and how TabPFN is able to return accurate results with synthetic data. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (05:57) All about the TabPFN architecture (21:27) Use cases for Bayesian inference (35:07) On getting published in Nature (44:03) How TabPFN handles time series data (51:52) All about Prior Labs Additional materials: www.superdatascience.com/863
In this episode of “In Case You Missed It”, Jon Krohn shares his favorite clips from the last four weeks. He talks to Azeem Azhar, Florian Neukart, Kirill Eremenko, Hadelin de Ponteves, and Brooke Hopkins on what’s in store for AI in 2025, from quantum computing and customizable tools to handy checklists and how the mathematics of exponentials can help us keep our heads about the swift advancement of AI. Additional materials: www.superdatascience.com/862 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
How does a CrossFit winner, bobsledder and swimmer go on to have a glittering career in data analytics and engineering? Colleen Fotsch talks to Jon Krohn about transitioning into very different career paths, how sports gave her the competitive mindset she needed for success in data science, and seeing the niche role of analytics engineering as a bridge between data engineering and analysis. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (05:49) Colleen’s path from athlete to data analyst (1:14:41) About the data build tool (DBT) (1:22:51) Colleen’s work at CHG Healthcare (1:32:45) How Colleen and Tia-Clair got started with PRVN GO Additional materials: www.superdatascience.com/861
DeepSeek-curious? This Five-Minute Friday is for you! Jon Krohn investigates the overwhelming overnight success of this new LLM, the product of a Chinese hedge fund. DeepSeek is a market newcomer, and yet it runs shoulder to shoulder with behemoths from OpenAI, Anthropic and Google like it’s all in a day’s work. Additional materials: www.superdatascience.com/860 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this week’s guest interview, Vaibhav Gupta talks to Jon Krohn about creating a programming language, BAML, that helps companies save up to 30% on their AI costs. He explains how he started tailoring BAML to facilitate natural language generation interactions with AI models, how BAML helps companies optimize their outputs, and he also lets listeners in on Boundary’s hiring process. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:53) What BAML stands for (14:33) Making prompt engineering a serious practice (18:00) How BAML helps companies (23:30) Using retrieval-augmented generation (RAG) (43:09) How to get a job at Boundary Additional materials: www.superdatascience.com/859
Are you an Account Executive with experience in the technology sector? In this Five-Minute Friday, Jon Krohn tells listeners about an exciting new role that has opened up at The SuperDataScience Podcast. Additional materials: www.superdatascience.com/858 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Brooke Hopkins speaks to Jon Krohn about technology’s new frontiers in AI agents, how these agents will impact society, work and our creative enterprises, and what this might mean for our data-driven future. You will learn how Coval, a simulation and evaluation platform for AI voice and chat agents, helps companies balance precision and scalability while making few concessions on the way.  This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:49) What Coval does and how the platform works (21:16) Coval’s workflows (37:40) The future of AI agents  (46:28) The metrics to evaluate performance  (55:08) How close we are to achieving AI agent autonomy Additional materials: www.superdatascience.com/857
Get excited: The fastest-growing jobs in the US are AI Engineer and AI Consultant. In this Five-Minute Friday, Jon Krohn looks into the reports that reveal this job growth, and the trends any data scientist and AI professional will want to watch in 2025. Additional materials: www.superdatascience.com/856 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
How can we use AI to solve global problems like the environmental crisis, and how will future AI start to manage increasingly complex workflows? Famed futurist Azeem Azhar talks to Jon Krohn about the future of AI as a force for good, how we can stay mindful of an evolving job market, and Azeem’s favorite tools for automating his workflows. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (05:43) Azeem Azhar’s vision for AI’s future (14:16) How to prepare for technological shifts (20:35) How to be more like an AI-first company (38:46) The tools Azeem Azhar uses regularly (50:09) The benefits and risks of transitioning to renewable energy (1:09:28) Opportunities in the future workplace Additional materials: www.superdatascience.com/855
Join Jon Krohn as he unpacks Ray Kurzweil’s six epochs of intelligence evolution, a fascinating framework from The Singularity is Nearer. From the origins of atoms and molecules to the transformative future of brain-computer interfaces and cosmic intelligence, Jon explores how each stage builds upon the last. This quick yet profound journey reveals how humanity is shaping the Fifth Epoch—and hints at what’s next for intelligence in our universe. Additional materials: www.superdatascience.com/854 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Kirill Eremenko and Hadelin de Ponteves, AI educators whose courses have been taken by over 3 million students, sit down with Jon Krohn to talk about how foundation models are transforming businesses. From real-world examples to clever customization techniques and powerful AWS tools, they cover it all. bravotech.ai - Partner with Kirill & Hadelin for GenAI implementation and training in your business. Mention the “SDS Podcast” in your inquiry to start with 3 complimentary hours of consulting. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (07:00) What are foundation models? (15:45) Overview of the foundation model lifecycle: 8 main steps. (29:11) Criteria for selecting the right foundation model for business use. (41:35) Exploring methods to customize foundation models. (53:04) Techniques to modify foundation models during deployment or inference. (01:11:00) Introduction to AWS generative AI tools like Amazon Q, Bedrock, and SageMaker. Additional materials: www.superdatascience.com/853
AI security, LLM engineering, how to choose the best LLM, and tech agnosticism: In our first “In Case You Missed It” of 2025, Jon Krohn starts the year with a round-up of our favorite recent interview moments. He selects from interviews with Andrew Ng, Ed Donner, Eiman Ebrahimi, Sadie St Lawrence, and Greg Epstein, covering the latest in AI development, touching on agentic workflows, promising new roles in AI, and what blew our minds last year. Additional materials: www.superdatascience.com/852 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Are our passwords safe, even with the increasing accessibility of quantum computing? Florian Neukart, Chief Product Officer at Terra Quantum AG, thinks so. In this episode, he outlines the three key elements of quantum-safe security. He speaks to Jon Krohn about the resourceful applications of quantum computing and workarounds for the demands of quantum computing on operational times and cooling systems. And if you’re interested in making the switch to quantum computing from machine learning, he also explores what you need (and don’t need) to make change happen. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (17:12) The real-world applications of quantum computing (23:35) The chips needed for quantum computing  (31:18) How quantum computing meets key business challenges (46:33) The ethical challenges of quantum technology (49:28) How to become proficient in quantum computing  (1:01:21) The future of quantum computing Additional materials: www.superdatascience.com/851
A new year often draws our focus towards fresh approaches to the way we work and structure our day. For Jon Krohn, the continuous calendar gives him a realistic and uninterrupted overview of his time. Plus, it’s customizable and free! In this episode, Jon also shares his plans and priorities for the New Year, and he recommends how you can assess and achieve your goals for the year; critical advice for anyone who wants to create manageable and sustainable milestones in 2025. Additional materials: www.superdatascience.com/850 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Sadie St Lawrence returns for her 4th annual prediction episode on the Super Data Science Podcast. Together with host Jon Krohn, she reflects on 2024’s most transformative trends—like agentic AI and enterprise AI monetization—and predicts what's coming in 2025, from AI-driven science to the skills data scientists need to stay ahead. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:30) 2024 AI trend recap (19:23) Comeback of the year: Google (27:29) Wow moment of the year (40:20) Looking ahead to 2025 Additional materials: www.superdatascience.com/849
In this Five-Minute Friday episode, Jon Krohn reflects on 2024’s monumental year in AI, highlighting the rapid rise of generative AI and its impact across industries. From functional coding breakthroughs to independently acting AI agents, we explore the transformative power of these advancements and the promise they hold for 2025. Jon shares optimism for the future of AI and humanity's ability to harness it for the greater good. Additional materials: www.superdatascience.com/848 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Ed Donner co-founded AI-driven recruitment platform, Nebula.io, with The SuperDataScience Podcast’s host, Jon Krohn. Ed and Jon reminisce about how they launched their company, the growing opportunities for data scientists, how to choose an LLM, and today’s top technical terms in AI.  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (11:15) What an AI engineer does (19:23) Defining today’s key terms in AI: RAG, fine tuning, agentic. (27:09) How to select an LLM (49:41) Pitting LLMs against each other in a game (53:14) What to do once you’ve selected an AI model Additional materials: www.superdatascience.com/847
In this Five-Minute Friday, Jon Krohn speaks to Anu Jain, CEO of Nexus Cognitive, and Mahesh Kumar, CMO of Acceldata. They talk about the importance of updating data, especially for predictive models that make key financial decisions for a company, as well as the current state of data governance and why it’s overdue its own update. Additional materials: www.superdatascience.com/846 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Discover how technology has become the modern belief system shaping our world. Greg Epstein, author of Tech Agnostic: How Technology Became the World's Most Powerful Religion, and Why It Desperately Needs a Reformation, draws striking parallels between tech culture and traditional faiths. From AI's "singularity" echoing prophetic narratives to Silicon Valley’s promises of salvation through innovation, Greg uncovers the profound influence of technology on our lives. He challenges us to rethink blind faith in progress, focus on genuine human connection, and navigate a future where ethics and empathy guide innovation. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (08:30) How can someone cultivate connection without religion? (15:49) Social media as a new form of community (17:00) Tech's transformation into a religion (56:08) How to set boundaries with tech (01:01:32) The singularity as a religious narrative (01:19:53) Transhumanism and effective altruism as tech cults (01:15:00) Defining tech agnosticism (01:26:55) Prioritizing human connection in a tech-driven world Additional materials: www.superdatascience.com/845
In this episode of “In Case You Missed It”, in which we round up our favorite moments from the previous month of interviews, Jon Krohn asks his guests about the future of recruitment and job applications, the multiple pathways to a career in AI, the potential of AI in developing proteins for improved healthcare, and how “AI celebrity” doesn’t necessarily equate to “AI expert”. Additional materials: www.superdatascience.com/844 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
What’s holding your AI projects back from success? Dr. Eiman Ebrahimi, CEO of Protopia AI and former NVIDIA scientist, takes us on a fascinating journey through the challenges of AI data security and enterprise scalability. Learn how to escape "proof of concept purgatory," unlock profitable AI solutions, and tackle the trade-offs between cost, speed, and security. Plus, discover how the philosophy of Alan Watts can inspire innovation and drive meaningful change in the world of AI. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (02:53) Protopia’s role in AI data security and privacy (11:45) The functionality behind Stained Glass Transform (22:20) Eiman’s journey from NVIDIA to founding Protopia (25:37) Challenges enterprises face with ROI on AI projects (36:40) Multi-tenancy in AI systems (55:37) Stained Glass Transform’s privacy-preserving capabilities (01:09:31) Emerging trends in AI (01:14:55) Alan Watts’ philosophies and their link to entrepreneurship Additional materials: www.superdatascience.com/843
In this Five-Minute Friday, Jon interviews Chris Bennett and Joseph Balsamo on the importance of flexibility in the way we deploy AI models, Dell’s brand positioning in the AI space, and whether GenAI’s business applications stand up to the hype. Additional materials: www.superdatascience.com/842 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this special episode recorded live at ScaleUp:AI in New York, Jon Krohn speaks to Andrew Ng in response to his conference talk on smart agentic AI workflows. Jon follows up with Andrew about smart agentic workflows and when to use them, how businesses should direct their efforts in investing in AI, and the new ways that AI tools can process visual and unstructured data. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (06:13) How to weigh up cost and effectiveness in new AI workflows (12:08) The crucial elements for building effective vision AI applications (15:34) How large vision models might transform global industries (18:40) How to mitigate risk in people not verifying accuracy in answers generated by agents Additional materials: www.superdatascience.com/841
What do AI, robotics, and premium wine grapes have in common? Everything, as it turns out. In this episode, we explore viticultural robotics, a revolutionary project combining machine learning, spectroscopic sensors, and VR-controlled robotics to tackle one of agriculture’s trickiest challenges: harvesting delicate wine grapes worth over $6,000 per tonne. From vineyards in the UK to cutting-edge labs, discover how these innovations could transform not just viticulture but the entire future of precision agriculture. Additional materials: www.superdatascience.com/840 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jess Ramos is redefining success in data analytics. As the Founder of Big Data Energy and a Senior Data Analyst at Crunchbase, she’s mastered the art of salary negotiation, built a massive social media following, and turned her passion for data into a thriving personal brand. She reveals how she doubled her salary in under a year, created her own SQL course, and advocates for women in STEM. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:42) How Jess got her start in data analytics (09:14) Why SQL is the most critical skill for data professionals (11:46) How Jess more than doubled her salary in less than a year (20:16) Tips for transitioning from a data job to creating your own business (31:20) The various routes to a career in data science (39:13) How Jess challenges STEM stereotypes Additional materials: www.superdatascience.com/839
Jon Krohn heads to Lisbon for an interview hosted by Bella Shing, Chapter Lead for Light Dao. He shares the stage with Regarding Consciousness podcast host, Jennifer Hill, where the three discuss AI philosophy and consciousness. Additional materials: www.superdatascience.com/838 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Deepali Vyas, Global Head of Applied Intelligence (Data Science & AI) and FinTech at Korn Ferry, talks to Jon Krohn about the best ways for data science and AI professionals to get seen and hired. Hear why video, not text, is the future of recruitment, how to get over camera shyness, and how to make a winning impression on job recruiters. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (09:49) On using GenAI to get hired (27:44) The future of video in recruitment (40:36) Tips for the camera-shy (44:15) How Fearless+ started (54:51) How AI helps organizations to ensure equity (57:43) Green-flag behaviors at work Additional materials: www.superdatascience.com/837
Economist and social-impact innovator Dr. Nat Ware reveals how our expectations shape happiness and why chasing it often leaves us unfulfilled. He shares insights on the “hedonic treadmill” and the effects of constant comparison on our well-being. Find out how to build a more meaningful life by making memories, taking chances, and focusing on genuine connections. Additional materials: www.superdatascience.com/836 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
AI systems are evolving rapidly, and in this episode, Bryan McCann, CTO of You.com, explains You.com’s unique approach to search, the impact of AI-driven research, and the game-changing potential of AI agents. With a background in natural language processing and philosophy, Bryan joins Jon Krohn to share a fresh perspective on where AI is headed and what it means for the future of work and scientific discovery. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:55) How You.com’s “do engine” approach connects users to multiple language models (11:34) How AI systems at You.com generate optimized, intent-driven queries for better results (28:39) You.com’s focus on automated workflows sets it apart from other platforms (31:31) AI agents in You.com, with Bryan predicting they’ll outnumber people by 2025 (41:49) Bryan’s path to unified AI models that can perform diverse tasks (50:40) Early experiments with alignment in AI that influenced modern transformers (01:04:45) Bryan’s research on controllable text generation (01:11:27) Language models applied to protein generation, linking text and biology sequences Additional materials: www.superdatascience.com/835
Jon Krohn starts the month with his round-up of favorite clips from the previous month. Hear from Bradley Voytek, Natalie Monbiot, Luca Antiga, Chad Sanderson, and Ritchie Vink in conversations about the ongoing potential of AI.   Additional materials: www.superdatascience.com/834 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Martin Goodson speaks to Jon Krohn about what he would add to his viral article “Ten Ways Your Data Project is Going to Fail”, why practitioners always need to be present at AI policy discussions, and Evolution AI’s breakthroughs in computer vision and NLP. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (04:25) What Evolution AI does  (11:41) How to maintain accuracy in large infrastructures (21:22) How to cultivate innovation and creativity while meeting market demands (24:27) Potential knowledge gaps for machine learning practitioners (30:57) Martin’s viral article, “Ten Ways Your Data Project is Going to Fail” (59:54) Strategies for the UK to become a key player in AI Additional materials: www.superdatascience.com/833
Host Jon Krohn unpacks Dario Amodei’s vision of a techno-utopia in his essay Machines of Loving Grace, where “Powerful AI” takes center stage. Amodei, CEO of Anthropic, imagines a future where AI doesn’t just assist but actively shapes fields like healthcare, economics, and governance with unmatched intelligence and autonomy. Jon explores the possibilities and challenges of this AI-driven future, asking how close we are to seeing these revolutionary shifts and what they mean for society. Additional materials: www.superdatascience.com/832 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
PyTorch Lightning is revolutionizing the AI landscape, and Dr. Luca Antiga, CTO of Lightning AI, joins host Jon Krohn to explain how. In this episode, they explore the tools pushing AI development forward, from Lightning Studios to LitServe, and discuss the game-changing rise of small language models that challenge industry giants with precision and speed. Luca also shares his vision for developers in an AI-enhanced world, where coding meets creativity and collaboration with intelligent tools. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: How Lightning AI's open-source tools make AI development faster [11:30] The rise of small language models and how they'll rival LLMs [37:47] Luca's journey from biomedical imaging to deep learning pioneer [52:03] How AI will transform software developer tasks [1:03:05] Additional materials: www.superdatascience.com/831
Geoffrey Hinton and Sir Demis Hassabis: A Nobel Prize is an achievement of the highest order, giving physicists, chemists, physiologists, medical practitioners, writers, pacifists and economists perhaps the greatest honor in their respective fields. In this week’s Five-Minute Friday, Jon Krohn discusses how two AI pioneers came to win prizes in physics and chemistry, respectively. Additional materials: www.superdatascience.com/830 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Neuroscientist Bradley Voytek outlines to Jon Krohn the incredible use of data science and machine learning in his research and how recent discoveries in action potentials and neurons have completely skyrocketed the field to a new understanding of the brain and its functions. You’ll also hear what Bradley thinks is most important when hiring data scientists and his contributions to Uber’s algorithm when it was still a startup.  This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: Breakthroughs in brain region communication [04:08] The future of brain research and MedTech [35:24] The libraries and software used at the Halicioglu Data Science Institute [45:11] Brain rhythm as a diagnostic tool [1:02:58] Bradley’s curriculum structure at UC San Diego [1:12:21] How Uber applies data science [1:20:07] Additional materials: www.superdatascience.com/829
The citizen data scientist: Fact or fiction? Jon Krohn holds a conversation across episodes in this Five-Minute Friday, with today’s guest Keith McCormick, in part responding to Nick Elprin’s interview in episode 811: Scaling Data Teams Effectively. Additional materials: www.superdatascience.com/828 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Ritchie Vink, CEO and Co-Founder of Polars, Inc., speaks to Jon Krohn about the new achievements of Polars, an open-source library for data manipulation. This is the episode for any data scientist on the fence about using Polars, as it explains how Polars managed to make such improvements, the APIs and integration libraries that make it so versatile, and what’s next for this efficient library. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: Why Polars is so efficient [05:20] Polars’ easy integration with other data-processing tools [21:23] Eager vs lazy execution in Polars [32:15] Polars’ data processing of large- and small-scale datasets [38:28] Ritchie’s plans to scale his company [46:14] Upcoming features in Polars [58:06] Additional materials: www.superdatascience.com/827
Next-gen IDEs, efficiency-boosting open-source Python libraries, and changes in hiring for data scientists: This episode of In Case You Missed It gives you our best clips of September’s interviews, hosted by Jon Krohn. Additional materials: www.superdatascience.com/826 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Data contracts are redefining data quality and governance, and Chad Sanderson, CEO of Gable.ai, joins host Jon Krohn to explain how they can transform your data strategy. He breaks down what data contracts are, how they shift data quality checks closer to production, and why they’re essential for reducing data debt. Chad also highlights how better alignment between data producers and consumers can elevate data reliability and tackle change-management challenges in modern organizations. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: What data contracts are and how they define expectations for data quality [03:16] What data contracts look like [09:09] The common misconceptions about data quality when implementing AI [12:55] Chad’s Chief Operator role at Data Quality Camp [19:46] How “shifting left” improves data reliability by addressing issues early [24:17] Why data professionals still struggle with data quality [30:31] How data debt forms and why it leads to complex, inefficient architectures [35:53] How will the role of human oversight evolve in ensuring data quality? [47:12] How can data teams leverage storytelling? [52:33] Additional materials: www.superdatascience.com/825
Llama 3.2 brings a new era of AI innovation with lightweight models tailored for on-device applications and powerful vision models for handling complex image inputs. Host Jon Krohn explores how this release pushes the boundaries of open-source AI, making it more accessible and versatile for developers. He also covers the Llama Stack toolkit, designed to streamline deployment, and Llama Guard 3, Meta’s latest content moderation solution. With extensive support from major cloud and hardware partners, Llama 3.2 is set to unlock groundbreaking possibilities for AI across mobile and beyond. Tune in to hear more. Additional materials: www.superdatascience.com/824 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Virtual humans are rewriting the rules of digital communication and reshaping entire industries. This week, Jon Krohn welcomes Natalie Monbiot, Head of Strategy at Hour One, to shed light on how AI avatars are revolutionizing L&D and e-commerce by turning traditional training and product listings into captivating, presenter-led content. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How do you create a virtual being? [10:55] • Reid Hoffman's avatar [13:40] • The virtual human economy [31:07] • Virtual human societies [51:24] • Virtual humans and creative expression [56:35] • Challenges in maintaining transparency [01:00:22] Additional materials: www.superdatascience.com/823
NotebookLM, Google’s latest AI tool, takes content creation to a new level. This week, Jon Krohn shares how the platform transformed his 200-page dissertation into a fascinating 11-minute podcast. Discover how AI can turn vast amounts of information into engaging and digestible content, opening up new possibilities for content creation. Additional materials: www.superdatascience.com/822  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Marck Vaisman speaks to Jon Krohn about his paradigm for understanding core data practitioner types. Hear Marck detail the four data practitioner personas that he has identified in his research, why he believes the roadmaps that influencers like to promote as surefire ways to a data science career don’t work in practice, and why the term “data scientist” is still so elusive and hard to recruit for. This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How Marck started his work in defining data science roles [08:06] • The relationship between the four data practitioner personas [15:26] • About Marck’s “menu” for effective data science [40:43] • How recruiters can hire the best data scientist for the job [59:31] Additional materials: www.superdatascience.com/821
Jon Krohn takes OpenAI’s new models (o1-preview and o1-mini) for a spin in this Five-Minute Friday, learning their key strengths and limitations, and how the o1 series may represent yet another landmark for generative AI. Additional materials: www.superdatascience.com/820  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
SuperDataScience veteran and Udemy teacher Luka Anicin is on the podcast to talk about his brand-new course, “PyTorch: From Zero to Hero”, available exclusively on superdatascience.com. Host Jon Krohn asks Luka why he feels that every data scientist should consider PyTorch as their default Python library, and why “keeping it simple” can secure the success of a machine learning project. This episode is brought to you by AWS Inferentia and AWS Trainium, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • About the PyTorch library [03:29] • Why PyTorch became so popular [25:24] • How to increase accuracy and efficiency in PyTorch [31:49] • How to utilize transfer learning [35:44] • Why real-world projects are essential to data scientists [41:10] • About Datablooz [46:49] Additional materials: www.superdatascience.com/819
Experts from AI and data science discuss the impact and benefits of decentralization, the importance of structuring AI systems in business, and why knowing the basics will always matter for data engineers. Listen to Shingai Manjengwa (episode 809), Daniel Hulme (episode 807), Jerry Yurchisin (episode 813) and Nick Elprin (episode 811) explore a future world of work that rewards continuing learners, sets tasks for the people best suited to complete them rather than those whose job titles reflect the spec, and applies a fleet of ‘AI agents’ to solve complex business tasks. Additional materials: www.superdatascience.com/818  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Dr. Julia Silge, Engineering Manager at Posit, introduces the brand-new Positron IDE, perfect for exploratory data analysis and visualization. She also lays out her top picks for LLMs that boost coding efficiency and discusses when traditional NLP methods might be the smarter choice over LLMs. Plus, Julia highlights some must-know open-source libraries that make managing MLOps easier than ever. Tune in for insights that every data scientist, ML engineer, and developer will find useful. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Overview of Posit and Positron IDE [05:20] • How the needs of a data scientist differ from those of a software developer [10:54] • How to contribute to the open-source Positron [19:50] • MLOps and Vetiver: Tools for deploying and maintaining ML models [37:01] • Natural Language Processing (NLP) and the Tidyverse approach [50:34] • The role of AI and LLMs in data science education [1:24:18] Additional materials: www.superdatascience.com/817
Jon Krohn takes on a listener's challenge to explain his work in data science to his 94-year-old grandmother, Annie. This heartwarming conversation covers what data is, the role of a data scientist, and breaks down artificial intelligence (AI) and artificial general intelligence (AGI) in simple terms. The episode provides a fresh take on how to communicate complex topics to a lay audience, offering both clarity and insight. Additional materials: www.superdatascience.com/816  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Polars, Python, Narwhals, Rust, and Pandas: Marco Gorelli talks to Jon Krohn about the many ways to use the newest data libraries available, the joys of open-source development, and the best method to win prizes in forecasting competitions. This episode is brought to you by AWS Inferentia and AWS Trainium, by Babbel, the science-backed language-learning platform, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • When to use Polars vs Pandas [08:26] • How Polars optimizes string operations and data processing [20:08] • Where Narwhals outstrips Polars and Pandas [48:37] • The benefits of using Altair [55:21] • Addressing the lack of women in data science [1:09:58] • How to win a forecasting competition [1:16:58] Additional materials: www.superdatascience.com/815
As summer winds down, this episode shifts focus from the usual tech discussions to something more personal: reflecting on the importance of balancing work with life’s simple pleasures. While the world of data science and AI continues to evolve rapidly, it's essential to remember that true success isn't just about professional milestones. It’s also about cherishing the moments that make life meaningful. Tune in for a brief but impactful reflection on how to redefine success to include not just achievements, but also the everyday joys that often go unnoticed. Additional materials: www.superdatascience.com/814  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jerry Yurchisin from Gurobi joins Jon Krohn to break down mathematical optimization, showing why it often outshines machine learning for real-world challenges. Find out how innovations like NVIDIA’s latest CPUs are helping to solve problems like the Traveling Salesman Problem in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • The Burrito Optimization Game and mathematical optimization use cases [03:36] • Key differences between machine learning and mathematical optimization [05:45] • How mathematical optimization is ideal for real-world constraints [13:50] • Gurobi’s APIs and the ease of integrating them [21:33] • How LLMs like GPT-4 can help with optimization problems [39:39] • Why integer variables are so complex to model [01:02:37] • NP-hard problems [01:11:01] • The history of optimization and its early applications [01:26:23] Additional materials: www.superdatascience.com/813
In this episode of Five-Minute Friday, Jon Krohn investigates published findings from the startup Sakana AI and its paper’s co-authors from the University of Oxford, the University of British Columbia and the Vector Institute in Toronto. These authors explore the potential of The AI Scientist, a framework that could change the way we conduct scientific research forever. Additional materials: www.superdatascience.com/812  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Nick Elprin talks to Jon Krohn about how and when to scale a data science team and its workflows to secure a company’s commercial viability. You’ll also hear how to launch your own data science startup and why it’s so important to understand that AI tools are not one-size-fits-all. This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How Nick served enterprises with his AI startup, Domino Data Lab [05:36] • About the Navy’s own mine detection models [17:43] • The hype surrounding GenAI [30:35] • How AI platforms integrate with business strategies [39:49] • When it’s time to integrate an AI tool into your business [51:12] • Why Nick started Domino Data Lab [1:03:53] Additional materials: www.superdatascience.com/811
Self-driving cars are here, and Jon Krohn is breaking down the five levels of automation that could change driving forever. From full human control at Level 0 to cars that drive themselves in any condition at Level 5, get the real story on what these levels mean. With firsthand insights from a recent autonomous vehicle experience, this episode cuts through the buzz and tells you what’s coming next. Additional materials: www.superdatascience.com/810  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Agentic AI is revolutionizing the tech landscape, and Shingai Manjengwa from ChainML is here to tell us why. Discover how AI agents are becoming an integral part of our lives, automating tasks like travel bookings and daily inspiration. Shingai explains the power of multi-agent systems, where AI agents collaborate to solve complex challenges, and highlights how blockchain technology is enhancing AI transparency and trust. Plus, get an inside look at ChainML’s innovative Theoriq protocol and the groundbreaking Council Analytics tool. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • What A.I. agents are [10:51] • How blockchain technology helps humans trust A.I. agents [18:27] • The Theoriq protocol developed by ChainML [34:05] • How Council Analytics lets you “speak” to their dataset with natural language [39:00] • A future of multi-agent systems [50:42] • Challenges and risks associated with agentic AI [1:04:17] Additional materials: www.superdatascience.com/809
Advice for emerging data scientists, the latest in model merging, and how GenAI can supercharge your creativity: Host Jon Krohn gives us his highlights from a month of interviews, packed with tips from some of the leading names in data science and beyond. Guests include Daliana Liu, Charles Duhigg, Charles Goddard, Rosanne Liu and Andrey Kurenkov. Additional materials: www.superdatascience.com/808  Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
The singularity could soon be upon us. The PESTLE framework, developed by this episode’s guest Daniel Hulme, expresses not one but six types of singularity that could occur: political, environmental, social, technological, legal and economic. Jon Krohn and Daniel Hulme discuss how each of these singularities could bring good to the world, aligning with human interests and pushing forward progress. They also talk about neuromorphic computing, machine consciousness, and applying AI at work. This episode is brought to you by AWS Inferentia and AWS Trainium, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • About the six singularities [03:43] • How the singularity could improve life on earth [09:01] • The credibility of AI experts [32:51] • How the decentralization of technology could benefit earth [43:14] • How AI might enhance creativity [1:04:33] Additional materials: www.superdatascience.com/807
Llama 3.1 is here, and it’s a game-changer. Meta’s latest AI model, especially the massive 405B variant, finally brings an open-source option to compete with giants like OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet. While Meta didn’t fully open-source everything, the availability of "open weights" is a strategic move to shake up the AI landscape. The model boasts an impressive 128,000-token context window and multilingual support in eight languages. Meta is also focusing on responsible AI development with tools like Llama Guard 3 for content moderation. This release is more than just a tech upgrade—it's about democratizing AI and sparking innovation across industries. How will you leverage Llama 3.1 to make a real impact? Tune into this week’s FMF episode and let’s explore the future with this latest AI development together. Additional materials: www.superdatascience.com/806 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Become a Supercommunicator! New York Times bestselling author Charles Duhigg, known for The Power of Habit and Smarter Faster Better, gets real about mastering communication in this episode. Discover insights from his latest book, Supercommunicators, where he reveals how to align conversation styles for deeper connections, handle conflicts effectively, and why AI can't replicate the emotional depth of human interactions. This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • The inspirations behind Supercommunicators [03:41] • The three types of conversations: Practical, emotional, and social conversations [05:22] • The matching principle: Align communication styles for better connection [10:36] • What is neural entrainment: Achieve a mind meld through synchronized brain activity [13:22] • The series of steps/principles to connect with someone [24:39] • How to avoid or de-escalate conflict conversations [31:07] • The impact of GenAI on conversations: How AI mimics dialogue but lacks emotional depth [45:24] Additional materials: www.superdatascience.com/805
Solar power now provides 6% of the world's electricity, thanks to rapid growth. Host Jon Krohn discusses the factors driving this rise, the challenges ahead, and how AI and data science are optimizing solar technologies. Tune in for insights on the future of solar power, and don't forget to like, share, and subscribe! Additional materials: www.superdatascience.com/804 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Daliana Liu is a big name in data science teaching, and she has always been generous in sharing everything she knows about getting a job in data science. In this episode, she continues to extend her generosity, helping listeners define their approach to achieving a fulfilling career in data science and tech. This episode is brought to you by AWS Inferentia and AWS Trainium, by Babbel, the science-backed language-learning platform, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Common career challenges for data scientists [34:57] • Advice for people who don’t know where to go in their career [48:05] • How to build resilience and protect against Imposter Syndrome [1:06:23] • Skills that data scientists should develop today [1:39:17] • The future of the data science and AI job market [1:46:55] Additional materials: www.superdatascience.com/803
How to grab investor interest with your AI startup idea, revisiting algorithms, and helping practitioners ensure AI safety with regulatory frameworks and beyond: This month, you missed a whole bunch of great interviews. But don’t worry, Jon Krohn is here to recap all the best bits for you! Additional materials: www.superdatascience.com/802 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Merged LLMs are the future, and we’re exploring how with Mark McQuade and Charles Goddard from Arcee AI on this episode with Jon Krohn. Learn how to combine multiple LLMs without adding bulk, train more efficiently, and dive into different expert approaches. Discover how smaller models can outperform larger ones and leverage open-source projects for big enterprise wins. This episode is packed with must-know insights for data scientists and ML engineers. Don’t miss out! Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Explanation of Charles' job title: Chief of Frontier Research [03:31] • Model Merging Technology combining multiple LLMs without increasing size [04:43] • Using MergeKit for model merging [14:49] • Evolutionary Model Merging using evolutionary algorithms [22:55] • Commercial applications and success stories [28:10] • Comparison of Mixture of Experts (MoE) vs. Mixture of Agents [37:57] • Spectrum Project for efficient training by targeting specific modules [54:28] • Future of Small Language Models (SLMs) and their advantages [01:01:22] Additional materials: www.superdatascience.com/801
The SuperDataScience Podcast is celebrating its 800th episode! Host Jon Krohn speaks to his grandmother, Annie, about growing up at a time when so many technologies we take for granted today were yet to be developed. Listen in to hear Annie’s experience of the changes in technology across 94 years and how she and her family fared in 1940s Ukraine with no electricity or running water. Additional materials: www.superdatascience.com/800
No-code games with GenAI, the creative possibilities of LLMs, and our proximity to AGI: In this episode, Jon Krohn talks to Andrey Kurenkov about what turned him from an AGI skeptic to a positivist. You’ll also hear about his wildly popular podcast “Last Week in AI” and how the NVIDIA-backed startup Astrocade is helping videogame enthusiasts to create their own games through generative AI. A must-listen! This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • All about The Gradient and Last Week in AI [10:42] • All about Astrocade and Andrey’s role at the startup [24:35] • Balancing UX and creative control at Astrocade [42:00] • The creative possibilities of LLMs [1:04:15] • The rapid emergence of AGI [1:10:31] Additional materials: www.superdatascience.com/799
Claude 3.5 Sonnet, Anthropic’s newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it’s twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI. Additional materials: www.superdatascience.com/798 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Dr. Rosanne Liu, Research Scientist at Google DeepMind and co-founder of the ML Collective, shares her journey and the mission to democratize AI research. She explains her pioneering work on intrinsic dimensions in deep learning and the advantages of curiosity-driven research. Jon and Dr. Liu also explore the complexities of understanding powerful AI models, the specifics of character-aware text encoding, and the significant impact of diversity, equity, and inclusion in the ML community. With publications in NeurIPS, ICLR, ICML, and Science, Dr. Liu offers her expertise and vision for the future of machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How the ML Collective came about [03:31] • The concept of a failure CV [16:12] • ML Collective research topics [19:03] • How Dr. Liu's work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs [21:28] • The pros and cons of curiosity-driven vs. goal-driven ML research [29:08] • Discussion on Dr. Liu's research and papers [33:17] • Character-aware vs. character-blind text encoding [54:59] • The positive impacts of diversity, equity, and inclusion in the ML community [57:51] Additional materials: www.superdatascience.com/797
Want to feel optimistic about your day? In this Friday episode, Simon Kuestenmacher talks to Jon Krohn about demography: What it is, why it’s so important, and why its forecasts should give us reason to hope for a better future. In an increasingly globalized world, and with an aging population in countries with the biggest GDPs, demography is more valuable than ever. Additional materials: www.superdatascience.com/796 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Gina Guillaume-Joseph talks to Jon Krohn about the data and regulatory frameworks set to transform the AI industry and why that’s important to anyone working with data. This episode offers a solid path to understanding AI regulation’s past, present and future. Gina walks listeners through the AI Bill of Rights, the NIST AI Risk Management Framework and the MITRE ATLAS threat model. This episode is brought to you by AWS Inferentia and AWS Trainium, by Crawlbase, the ultimate data crawling platform, and by Babbel, the science-backed language-learning platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • What “responsible AI” means [08:14] • Why the federal government should be behind AI regulation [12:22] • The US vs EU on AI regulation [18:46] • About the AI Bill of Rights [26:14] • About MITRE and the MITRE ATLAS [37:19] • What a systems engineer does [54:11] Additional materials: www.superdatascience.com/795
Trends in open-source AI: Join Jon Krohn and a panel of data science icons as they discuss the most exciting and concerning developments in open-source AI. Hear insights from Drew Conway, Jared Lander, Emily Zabor, and JD Long on the transformative potential of AI and its future impact. Additional materials: www.superdatascience.com/794 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Bayesian methods take the spotlight in this episode with Alex Andorra, co-founder of PyMC Labs, and Jon Krohn. Learn how Bayesian techniques handle tough problems, make the most of prior knowledge, and work wonders with limited data. Alex and Jon break down essentials like PyMC, PyStan, and NumPyro libraries, show how to boost model efficiency with PyTensor, and talk about using ArviZ for top-notch diagnostics and visualizations. Plus, get into advanced modeling with Gaussian Processes. This episode is brought to you by Crawlbase, the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Practical introduction to Bayesian statistics [04:54] • Definition and significance of epistemology [17:52] • Explanation of PyMC and Monte Carlo methods [27:57] • How to get started with Bayesian modeling and PyMC [34:26] • PyMC Labs and its consulting services [50:50] • ArviZ for post-modeling diagnostics and visualization [01:02:23] • Gaussian processes and their applications [01:09:02] Additional materials: www.superdatascience.com/793
Jon Krohn shares his favorite clips from May. Hear how Navdeep Martin is spearheading a company to tackle the climate crisis, why Sol Rashidi and Demetrios Brinkmann find nailing job titles so necessary in the fast-paced industries of tech and AI, and get the latest on embeddings with Luis Serrano. Additional materials: www.superdatascience.com/792 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), and Crawlbase (crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI’s decision making on preferences reflects desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791
Experts reveal their top open-source R libraries live from the New York R Conference! This Super Data Science Podcast episode features an exclusive panel with data science trailblazers Drew Conway, Jared Lander, Emily Zabor, and JD Long. They share their favorite R libraries and valuable insights to enhance your data science practice. Additional materials: www.superdatascience.com/790 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Machine Learning for Wind Energy is front and center in this episode as Jon Krohn is joined by Dr. Jason Yosinski, CEO of Windscape AI. Dr. Yosinski brings to light the latest ML advancements sparking significant changes in renewable energy. Tune in for a comprehensive review of these cutting-edge technologies and their expansive impact on the industry and the environment's well-being. This episode is brought to you by Crawlbase, the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Enhancing predictability in wind energy with ML [04:52] • Data utilization from wind turbines by energy providers [11:41] • Jason's journey into wind energy [17:55] • Landing the right startup idea [22:47] • Visualizing neural networks with the Deep Vis Toolbox [31:29] • Extreme event forecasting at Uber vs. nowcasting at Windscape AI [45:13] • Discoveries from Loss Change Allocation research [47:48] • Engaging with Jason's ML Collective [59:46] • Traits of successful AI entrepreneurs [1:10:26] Additional materials: www.superdatascience.com/789
Multi-agent systems could mark a significant turning point in generative AI. From mastering increasingly complex tasks to getting LLMs to collaborate, in this Five-Minute Friday, Jon Krohn discusses the systems that are working to bridge the remaining gaps left by the latest large language models (LLMs). Additional materials: www.superdatascience.com/788 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
MLOps, how to build an online community, and tools for scaling LLMs: In this episode, Demetrios Brinkmann speaks to Jon Krohn about the similarities and differences between LLMOps, MLOps and DevOps, and why this should matter to companies looking to hire such engineers. You will also hear how to get involved in the MLOps community wherever you are in the world, and how you can start developing great products with the available tools. This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • What MLOps is [03:51] • About LLMOps [12:06] • About LlamaIndex and Ollama [18:29] • Insights from Demetrios’ MLOps survey [20:49] • Guidance for using third-party APIs [40:18] • Recommendations for building an online community in tech and AI [47:07] Additional materials: www.superdatascience.com/787
Learn about the six keys to data science success as host Jon Krohn welcomes back Kirill Eremenko, the mastermind behind SuperDataScience. Kirill shares his top insights on data science careers, from building strong portfolios to leveraging mentors and hands-on labs. With over 2.7 million students, his advice is a must-hear for aspiring and experienced data scientists alike. Additional materials: www.superdatascience.com/786 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Dr. Luis Serrano from the Serrano Academy reveals how to make Math and Quantum ML accessible, tackles the challenges of teaching A.I. to beginners, and explores the power of embeddings in enterprise applications. Explore the future of Quantum Machine Learning and the latest trends in AI, including multimodality and autonomous systems. This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How math and AI can be made easy to understand [05:21] • The three major categories of learners [16:21] • Why embeddings are the most important component of LLMs [26:19] • How semantic search differs from a traditional keyword search [29:57] • The most exciting emerging application areas for AI [42:41] • The promising application areas for Quantum Machine Learning [49:18] Additional materials: www.superdatascience.com/785
Aligning LLMs: How can we teach pre-trained LLMs to hold a conversation and learn new information from each other? This was where Sinan Ozdemir began his investigation into aligning LLMs. In this episode, he talks to Jon Krohn about the limitations of definitions for LLMs, training LLMs, and whether it is possible to train an LLM without alignment. Additional materials: www.superdatascience.com/784 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Recent advances in GenAI, how to tackle the climate crisis with advanced technology, and addressing the knowledge gap in understanding AI: Jon Krohn speaks to Flypower co-founder and CEO Navdeep Martin about the advances made in GenAI, from products to applications, and how we might use AI to tackle climate change. This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How the Washington Post’s recommendation systems work [03:29] • Why product leaders make great CEOs [10:36] • How Flypower uses GenAI to tackle climate change [22:13] • How Flypower identifies its customers’ most pertinent questions [30:03] • How AI might come to tackle climate change [36:52] • How to mitigate hallucination in AI models [41:04] Additional materials: www.superdatascience.com/783
Hear Jon Krohn’s five favorite clips from his April interviews. Chief Scientist at Posit PBC Hadley Wickham talks about the subtle differences between Python and R. Professor of Business Analytics Barrett Thomas walks through the variables that companies should consider when using drones or any other tech to improve their business operations and bottom line. Aleksa Gordić, Founder of Runa AI, believes an overhaul of the current educational system is long overdue. Bernard Marr discusses the future of GenAI and its impact on the world of work. And SuperDataScience founder Kirill Eremenko gives a lively workshop on gradient boosting. Additional materials: www.superdatascience.com/782 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Sol Rashidi, a distinguished data executive who has served in C-suite roles at Fortune 100 companies, joins Jon Krohn to delve into successful enterprise AI strategies and the reasons behind the high turnover among Chief Data Officers. This episode provides an in-depth look at selecting AI projects that succeed and understanding the strategic value of patents in various industries. Benefit from Sol’s extensive experience and practical advice on navigating complex corporate challenges. This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why CDOs and related roles have such high turnover [09:40] • The importance of building relationships in AI projects [17:01] • How Sol's book "The AI Survival Guide" came about [20:44] • How high-criticality, low-complexity AI projects are the ones with the highest probability of success [27:11] • How enterprise data security issues can be resolved with technologies like Protopia’s stained-glass data-masking solution [36:10] • Why having great data engineers is essential [47:57] • The value of patents [51:45] Additional materials: www.superdatascience.com/781
Want to become a data scientist? Jon and Adam discuss the key steps to becoming a data scientist, with a focus on developing portfolio projects. Hear about the 10 project ideas Adam recommends in his book to help you stand out in the data science community. Additional materials: www.superdatascience.com/780 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Tidyverse, ggplot2, and the secret to a tech company’s longevity: Hadley Wickham talks to Jon Krohn about Posit’s rebrand, Tidyverse and why it needs to be in every data scientist’s toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career. This episode is brought to you by Intel and HPE Ezmeral Software. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about the Tidyverse [04:46] • Hadley’s favorite R libraries [17:10] • The goal of Posit [30:29] • On bringing multiple programming languages together [36:02] • The principles for a long-lasting tech company [52:10] • How Hadley developed ggplot2 [55:24] • How to contribute to the open-source community [1:05:43] Additional materials: www.superdatascience.com/779
Mixtral 8x22B is the focus of this week's Five-Minute Friday. Jon Krohn examines how this model from French AI startup Mistral leverages its mixture-of-experts architecture to redefine efficiency and specialization in AI-powered tasks. Tune in to learn about its performance benchmarks and the transformative potential of its open-source license. Additional materials: www.superdatascience.com/778 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Generative AI is reshaping our world, and Bernard Marr, world-renowned futurist and best-selling author, joins Jon Krohn to guide us through this transformation. In this episode, Bernard shares his insights on how AI is transforming industries, revolutionizing daily life, and addressing global challenges. With his extensive experience advising top organizations worldwide, he also examines the ethical considerations of AI deployment. This episode is brought to you by Intel and HPE Ezmeral Software. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How Generative AI will transform industries [03:55] • The evolution of Generative AI [10:19] • How Generative AI will impact daily life [16:52] • The ethical challenges of AI [18:55] • How corporations can harness Generative AI for collaboration [24:36] • Industries that will be impacted by Generative AI [32:20] • How Sora-like Generative AI systems will create highly immersive entertainment [42:16] • How Generative AI could unlock 99% of business data [53:34] Additional materials: www.superdatascience.com/777
What are the risks of AI progressing beyond a point of no return? What do we stand to gain? On this Five-Minute Friday, Jon Krohn talks ‘books’ as he outlines two nonfiction works on AI and futurism by Oxford philosopher Nick Bostrom. Listen to a breakdown of Deep Utopia and Superintelligence in this episode. Additional materials: www.superdatascience.com/776 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Tech entrepreneurship, artificial superintelligence, and the future of education: Aleksa Gordić speaks to Jon Krohn about his strategies for self-directed learning, the traits that help people succeed in moving from big tech to entrepreneurship, and the social impact of artificial superintelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How to motivate yourself to become a tech entrepreneur [17:02] • Aleksa’s checklist for the perfect CTO [35:00] • Potential sustainable solutions for LLMs [41:51] • The next major developments in AI and tech [48:29] • How hobbies have a knock-on effect for a person’s career [1:01:53] • How and why formal education needs to change [1:09:24] Additional materials: www.superdatascience.com/775
Covariant's RFM-1: Jon Krohn explores the future of AI-driven robotics with RFM-1, a groundbreaking robot arm designed by Covariant and discussed by A.I. roboticist Pieter Abbeel. Explore how this innovation aims to merge digital intelligence with the physical world, promising a new era of efficiency and autonomy. Additional materials: www.superdatascience.com/774 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Dr. Barrett Thomas, an award-winning Research Professor at the University of Iowa, explores the intricacies of Markov decision processes and their connection to Deep Reinforcement Learning. Discover how these concepts are applied in operations research to enhance business efficiency and drive innovations in same-day delivery and autonomous transportation systems. This episode is brought to you by Ready Tensor, where innovation meets reproducibility. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Barrett's start in operations logistics [02:27] • Concorde Solver and the traveling salesperson problem [09:59] • Cross-function approximation explained [19:13] • How Markov decision processes relate to deep reinforcement learning [26:08] • Understanding policy in decision-making contexts [33:40] • Revolutionizing supply chains and transportation with aerial drones [46:47] • Barrett’s career evolution: past changes and future prospects [52:19] Additional materials: www.superdatascience.com/773
PyTorch benefits, how to get funding for your AI startup, and managing scientific silos: In our new series for SuperDataScience, “In Case You Missed It”, host Jon Krohn engages in some “reinforcement learning through human feedback” of his own with need-to-hear sound bites from past SDS episodes! Additional materials: www.superdatascience.com/772 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”. Kirill walks listeners through why decision trees and random forests are fruitful for businesses, and he offers hands-on walkthroughs for the three leading gradient-boosting algorithms today: XGBoost, LightGBM, and CatBoost. This episode is brought to you by Ready Tensor, where innovation meets reproducibility, and by Data Universe, the out-of-this-world data conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about decision trees [09:17] • All about ensemble models [21:43] • All about AdaBoost [36:47] • All about gradient boosting [45:52] • Gradient boosting for classification problems [59:54] • Advantages of XGBoost [1:03:51] • LightGBM [1:17:06] • CatBoost [1:32:07] Additional materials: www.superdatascience.com/771
Explore the science of confidence with Lucy Antrobus, as she unveils neuroscience-backed strategies to build and boost confidence through practice, positive energy, and the power of laughter. An essential listen for fostering unshakable self-assurance. Additional materials: www.superdatascience.com/770 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Generative AI in medicine takes center stage as Prof. Zachary Lipton, Chief Scientific Officer at Abridge, joins host Jon Krohn to discuss the significant advancements in AI that are reshaping healthcare. This episode is brought to you by the DataConnect Conference, and by Data Universe, the out-of-this-world data conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • The inspiration for Zack to get started in ML and healthcare [03:56] • The hardware required to use Abridge [12:29] • The key data science projects at Abridge right now [35:05] • Abridge's tech stack [59:54] • How Abridge ensures reliability in a high-stakes setting like healthcare [1:07:29] • How Zack’s academic research cross-pollinates with his commercial ML projects [1:21:05] • How Zack’s jazz background molded his entrepreneurial and data science journey [1:30:32] Additional materials: www.superdatascience.com/769
Claude 3, LLMs and testing ML performance: Jon Krohn tests out Anthropic’s new model family, Claude 3, which includes the Haiku, Sonnet and Opus models (written in order of their performance power, from least to greatest). Can it stand shoulder to shoulder with other models such as GPT-4 and Gemini 1.0 Ultra? And how important is it for machine learning practitioners to try out these models with their own benchmarks? Jon walks listeners through a test of his own in this Five-Minute Friday. Additional materials: www.superdatascience.com/768 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and what’s on the horizon for LLMs. This episode is brought to you by the DataConnect Conference, and by Data Universe, the out-of-this-world data conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about Machine Learning Q and AI [04:13] • Sebastian Raschka’s role as Staff Research Engineer at Lightning AI [19:21] • PyTorch Lightning’s and Lightning Fabric’s capabilities [39:32] • Large language models: Opportunities and challenges [43:35] • DoRA vs LoRA [48:56] • How to be a successful AI educator [1:34:18] Additional materials: www.superdatascience.com/767
Kurt Vonnegut's "Player Piano" delivers striking parallels between its dystopian vision and today's AI challenges. This week, Jon Krohn explores the novel's depiction of a world where humans are marginalized by machines, reflecting on the impact of automation on society and the ethical considerations it raises. Tune in as we unpack the timeless relevance of Vonnegut's work to the AI era. Additional materials: www.superdatascience.com/766 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Explore the origins of NumPy and SciPy with their creator, Dr. Travis Oliphant. Discover the journey from personal need to global impact, the challenges overcome, and the future of these essential Python libraries in scientific computing and data science. This episode is brought to you by the DataConnect Conference, by Data Universe, the out-of-this-world data conference, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Travis's journey to creating NumPy and SciPy [08:05] • How Anaconda got started [42:24] • How Numba, a high-performance Python compiler, was brought to market [54:48] • Python's influence on the thought processes of scientists and engineers [1:04:21] • The commercial projects that support Travis’s vast open-source efforts and communities [1:10:22] • How to get involved in Travis's commercial projects and communities [1:22:34] • The future of scientific computing and Python libraries [1:29:50] Additional materials: www.superdatascience.com/765
Data science futurists, bestselling authors, and lively how-to guides from the industry’s top practitioners, which range from applying data science for good to using open-source tools for NLP: This is The Super Data Science Podcast’s top ten most listened-to episodes in 2023, hosted by Jon Krohn. A great snapshot of our great content from 2023. Additional materials: www.superdatascience.com/764 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
At Glasswing Ventures, Rudina Seseri wants to be able to answer the question: What has Glasswing Ventures done for the company beyond capital investment? She speaks to Jon Krohn about how her company uses data to assess venture capital investments, the secret sauce of successful AI startups, and why she feels generative AI is only the start of a much broader impact that AI will make in communities and businesses. This episode is brought to you by the DataConnect Conference, and by Ready Tensor, where innovation meets reproducibility. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Potential interest areas for Series A AI venture capitalists [12:22] • How Glasswing’s AI Palette helps AI startups [23:06] • How data-driven the venture capital industry is [27:21] • Advice for adopting services from AI providers [47:21] • Model collapse: Causes and concerns [58:44] • Glasswing’s checklist for AI startups [1:04:59] Additional materials: www.superdatascience.com/763
Jon Krohn presents an insightful overview of Google's groundbreaking Gemini Pro 1.5, a million-token LLM that's transforming the landscape of AI. Discover the innovative aspects of Gemini Pro 1.5, from its extensive context window to its multimodal functionalities, which are broadening the scope of AI technology and marking a significant leap in data science. Plus, join Jon for a practical demonstration, showcasing the real-world applications, capabilities, and limitations of this advanced language model. Additional materials: www.superdatascience.com/762 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Google's Gemini Ultra takes the spotlight this week, as host Jon Krohn welcomes Lisa Cohen, Google's Director of Data Science and Engineering, for a conversation about the launch of Gemini Ultra. Discover the capabilities of this cutting-edge large language model and how it stands toe-to-toe with GPT-4. Lisa shares her insights on the development, rollout, and potential of Gemini Ultra in reshaping various sectors. Whether you're a data science professional, tech enthusiast, or curious about the future of AI, this episode offers a deep dive into one of the most significant advancements in artificial intelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility, and by Intel and HPE Ezmeral Software Solutions. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Google’s Gemini model family and Lisa's key responsibilities [04:55] • How LLMs will transform the practice of Data Science [19:47] • Lisa on prompt engineering and reinforcement learning from human feedback [24:38] • How to fine-tune Gemini models with Google's Vertex AI [30:52] • How AI assistants will transform life and work for everyone from data scientists to educators to children [47:14] • The challenges of developing a data-centric culture [57:31] • Centralized vs decentralized data science teams [1:03:50] Additional materials: www.superdatascience.com/761
AI-crafted beer, machine learning for passion projects, and self-taught data science: Jon Krohn and Beau Warren’s hotly anticipated, data-driven, punny lager Krohn&Borg is finally given a taste test in this week’s Five-Minute Friday. Heading to the Species X brewery in Columbus, Ohio, Jon Krohn and Beau Warren launched the beer that had been predicted, optimized and developed by a machine-learning model. Additional materials: www.superdatascience.com/760 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you’re interested in applying LLMs to your business portfolio, you’ll want to pay close attention to this episode! This episode is brought to you by Ready Tensor, where innovation meets reproducibility, by Oracle NetSuite business software, and by Intel and HPE Ezmeral Software Solutions. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How decoder-only transformers work [15:51] • How cross-attention works in transformers [41:05] • How encoders and decoders work together (an example) [52:46] • How encoder-only architectures excel at understanding natural language [1:20:34] • The importance of masking during self-attention [1:27:08] Additional materials: www.superdatascience.com/759
Explore the groundbreaking Mamba model, a potential game-changer in AI that promises to outpace the traditional Transformer architecture with its efficient, linear-time sequence modeling. Additional materials: www.superdatascience.com/758 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Explore mind-blowing storytelling with Cole Nussbaumer Knaflic in this episode. Audience favorite and author of "Storytelling with You," Cole returns to share essential tips for crafting impactful presentations, emphasizing narrative construction and audience engagement. Learn how to effectively communicate data and stories, enhancing your presentations with insights from a leading expert in the field. This episode is brought to you by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How to become a confident communicator [11:59] • How to get rid of filler words [26:32] • How facts alone can't make a strong impact [41:44] • Cole's overview of her book Storytelling with You [55:19] • How to craft an effective presentation [1:00:24] • Common mistakes in virtual presentations [1:09:48] • Cole's virtual presentation setup [1:15:33] • Cole's next book Daphne Draws Data [1:20:23] Additional materials: www.superdatascience.com/757
AlphaGeometry, intuitive AI, and geometric deduction: In this week’s Five-Minute Friday, Super Data Science host Jon Krohn looks into developments from DeepMind, Google’s ground-breaking AI lab, and explores how this is a critical step towards a future of broadly accessible AI solutions across scientific disciplines. Additional materials: www.superdatascience.com/756 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
ChatGPT applications and data-driven beer: Beer brewer and Super Data Science regular listener Beau Warren talks to Jon Krohn about the wonders of “sweaty ales”, how to brew beer with data, and how to get started on creative machine learning projects even without a degree in data science. This episode is brought to you by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • About Species X [06:31] • How to become a certified beer taster [12:37] • How Beau checks the quality of his beer [25:01] • Beau and Jon’s machine learning project [38:02] • About genetic algorithms [52:35] • How to get creativity out of LLMs [1:24:46] Additional materials: www.superdatascience.com/755
Explore the future of coding with poolside co-founder and CEO Jason Warner as he discusses the potential of code-specialized LLMs and their revolutionary impact on the developer's role. Tune in for insights on the shift towards an AI-led development paradigm. Additional materials: www.superdatascience.com/754 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Explore the future of collaborative ML workflows in this engaging episode with Dr. Greg Michaelson, Co-Founder of Zerve. Dr. Michaelson introduces the groundbreaking Zerve IDE and Pypelines project, addressing the critical gap in AutoML for commercial use and pinpointing why many A.I. projects don't meet their objectives. Gain insights into steering AI initiatives towards success and enhancing project communication, all in this insightful session. This episode is brought to you by Oracle NetSuite business software, and by Prophets of AI, the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why Zerve IDE is so sorely needed [04:50] • Pypelines: open-source AutoML in Python [30:00] • Why most commercial A.I. projects fail and how to ensure they succeed [47:45] • How AutoML will impact the role of the data scientist [53:21] • Greg's background as a pastor and working at DataRobot [1:03:40] • How to develop impressive communication and storytelling skills [1:16:16] Additional materials: www.superdatascience.com/753
Jon Krohn interviews Hilke Schellmann about the ethics of recruitment algorithms, the field’s current state of play, and what can be improved about AI used in recruiting. Additional materials: www.superdatascience.com/752 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Venture capital and AI, and how to succeed with an AI company in 2024: Rasmus Rothe, Cofounder of Merantix, speaks to Jon Krohn about the Merantix campus in Berlin, how a venture capitalist identifies the best AI startups, the surefire ways for AI company founders to raise venture capital, and the jobs that are most and least vulnerable to disruption by automation. This episode is brought to you by Oracle NetSuite business software, by QuickChat customized AI assistants, and by Prophets of AI, the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How Merantix started [05:17] • How Merantix works and how to apply for funding [08:19] • How to secure AI funding [21:02] • How AI companies can prove competitiveness [33:46] • Ensuring AI regulation [41:17] • How AI will change the future of work [56:56] Additional materials: www.superdatascience.com/751
Explore the transformative power of AI in science. Jon Krohn reviews the groundbreaking AI-driven discoveries at MIT and beyond, showcasing how AI is reshaping various scientific fields, from pharmaceuticals to climate science, and pondering the balance between AI's capabilities and human ingenuity. Additional materials: www.superdatascience.com/750 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Data science for clean energy takes center stage as Emily Pastewka from Palmetto joins Jon Krohn this week, exploring innovative paths to a sustainable future. This episode covers the impact of AI on smart energy choices, the creation of a smart grid, and the wide array of professionals required to bring cleantech data solutions to life. This episode is brought to you by Prophets of AI, the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Emily on her Master's in Deep Learning [08:20] • Using AI to solve clean energy challenges at Palmetto [17:22] • The different roles needed to solve cleantech problems [27:33] • How econometrics impacts consumer decision-making [38:56] • How Emily manages high-performing teams [56:30] • The tools and technologies that drive small teams [1:06:58] Additional materials: www.superdatascience.com/749
Artificial General Intelligence gets a new definition: This episode introduces Google DeepMind’s paper, “Levels of AGI: Operationalizing Progress on the Path to AGI”. Hear how its authors have organized narrow and general AI into hierarchical categories defined by human capability, from Level 0 (no AI) and Level 1 (equal to or somewhat better than an unskilled human) to Level 5 (able to outperform 100% of humans). A scary thought? Or a vision of a better future? Host Jon Krohn details the strengths of this research in this Five-Minute Friday. Additional materials: www.superdatascience.com/748 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host Jon Krohn to explore what goes into well-crafted LLMs, what makes Transformers so powerful, and how to succeed as a data scientist in this new age of generative AI. This episode is brought to you by Intel and HPE Ezmeral Software Solutions, and by Prophets of AI, the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Supply and demand in AI recruitment [08:30] • Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z” [15:37] • The learning difficulty in understanding LLMs [19:46] • The basics of LLMs [22:00] • The five building blocks of transformer architecture [36:29] - 1: Input embedding [44:10] - 2: Positional encoding [50:46] - 3: Attention mechanism [54:04] - 4: Feedforward neural network [1:16:17] - 5: Linear transformation and softmax [1:19:16] • Inference vs training time [1:29:12] • Why transformers are so powerful [1:49:22] Additional materials: www.superdatascience.com/747
Jon’s continuous calendar for 2024 is here! Now in an updated format, learn about its unique layout and benefits, and how it can revolutionize your planning for the new year. Additional materials: www.superdatascience.com/746 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
2024 data science trends take the spotlight in this special episode, where Jon joins Sadie St. Lawrence to analyze last year's predictions and delve into the emerging technologies reshaping the field. From AI hardware accelerators to the transformative role of large language models, this episode is a treasure trove of insights for anyone interested in the future of data science. This episode is brought to you by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Reviewing predictions for 2023 [05:56] • Sadie's trend predictions for 2024 [20:49] • 1: Hardware evolution [21:17] • 2: LLMOS [35:30] • 3: Slow-thinking model [48:18] • 4: Tool consolidation [54:46] • 5: Workforce Upheaval [58:06] • Jon's predictions [1:06:26] • 1: AI bubble bursting [1:08:11] • 2: Breakthroughs in Edge AI [1:12:22] • Sadie on her productivity planner [1:17:50] Additional materials: www.superdatascience.com/745
2023: A year of great movement and change. Technological developments have rocketed generative AI’s capabilities into the stratosphere of possibilities for future approaches to work, health, and play. Host Jon Krohn recognizes the benefits we have seen over the past year, discusses the important role we all have in ensuring ethics remains at the core of AI development and use, and he ends the year with a musical surprise for his listeners! Additional materials: www.superdatascience.com/744 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Chatbots, large language models and generative AI: Founder of Quickchat AI Piotr Grudzień believes the key to any successful AI platform is to ensure it can be tailored to a company’s specific needs. He speaks to host Jon Krohn about helping clients generate realistic and satisfying conversations that help their customer base find what they need quickly. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • About Quickchat AI and how it works [02:46] • How to successfully set up a conversational AI [23:58] • What “temperature” is in the context of AI [38:38] • How the LLM landscape has changed in recent years [40:24] • The future of generative AI [57:43] • The advantages of an AI accelerator [1:09:38] Additional materials: www.superdatascience.com/743
Join us on a brief journey through the AI world in 2023. A year ago, GPT-3.5 crafting our holiday message was a marvel, but now, with GPT-4's arrival, we're seeing an even more astounding evolution in AI. As we wave goodbye to the trend of generative AI, the Super Data Science Podcast team is bringing a personal touch back. Tune in for our heartfelt Happy Holidays message and a big thank you to all our listeners for your unwavering support. Additional materials: www.superdatascience.com/742 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Data visualization remains at the forefront as Dr. Alberto Cairo from the University of Miami guides us beyond numerical figures, exploring the art of weaving compelling narratives through data. In his book, "The Art of Insight," he reveals the varied motivations driving visualization experts and highlights the serene, meditative process inherent in crafting visualizations. Emphasizing the fusion of scientific principles and personal style for effective data communication, Dr. Cairo also discusses with Jon the impending impact of AI on both interactive and static graphics. This episode is brought to you by Gurobi, the Decision Intelligence Leader, by Intel and HPE Ezmeral Software Solutions, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Alberto's book, The Art of Insight [04:07] • How to transform data into engaging visuals [07:06] • What it takes to enter a meditation-like flow state when creating visualizations [11:21] • How to balance the science of visualization with one’s personal style [29:29] • The importance of Smart Brevity for great data visualizations [37:32] • How data visualization can drive social change [42:31] • How diversity in designers enriches the field [52:07] • The future of data visualizations [59:10] Additional materials: www.superdatascience.com/741
Sam Altman’s exit and rehiring, AGI, and OpenAI’s Q*: In this week’s Five-Minute Friday, Jon Krohn peeks behind the curtains of OpenAI, where development of the world’s first model that can solve complex, nonlinear logical problems, Q*, might be well underway. This episode casts light on the rumors behind OpenAI’s Q*, what its emergence could mean for the future of AI, and the controversies already surrounding an agent that has not yet reached the market. Additional materials: www.superdatascience.com/740 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
AI protein design, machine learning and cancer care, and pharmaceuticals: At Exazyme, CEO and Co-Founder Ingmar Schuster uses AI to design proteins. He speaks with Jon Krohn about their wider applications in pharmaceuticals and chemistry, how kernel methods make the design of synthetic biological catalysts more efficient, and when to use shallow machine learning over deep learning. This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • On designing proteins with AI [03:14] • Designing proteins at Exazyme [08:22] • About kernel methods [18:10] • The importance of human-led approaches in protein research [35:44] • Europe’s focus on AI regulation [43:45] • Deep vs shallow in AI [59:35] • How a background in academia helps with entrepreneurship [1:09:17] Additional materials: www.superdatascience.com/739
Bioengineering and Generative AI converge under the visionary leadership of Dr. Pierre Salvy at Cambrium GmbH, propelling material science into uncharted territories. He sits down with Jon Krohn live at Merantix A.I. Campus in Berlin to discuss how he's transforming material design, exemplified by his swift development of NovaColl, a vegan collagen crafted within two years. Additional materials: www.superdatascience.com/738 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
scikit-learn co-founder Gaël Varoquaux and Jon Krohn are live at the historic Sorbonne in Paris, where they discuss the evolution of scikit-learn. From its origins as a memory-efficient Python implementation of support vector machines to its present-day status as a pivotal resource in machine learning, Gaël paints a vivid picture of its remarkable growth. Join us for a glimpse into scikit-learn's evolution, the realm of open-source collaboration, and the transformative power of data-driven insights in today's dynamic data landscape. This episode is brought to you by Gurobi, the Decision Intelligence Leader, by Data Universe, the out-of-this-world data conference, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The early beginnings and growth of scikit-learn [05:34] • Development principles of scikit-learn [18:05] • How to apply scikit-learn to your ML problem [21:16] • Resource-efficiency and scikit-learn development [25:32] • How to contribute to an open-source project like scikit-learn yourself [38:21] • The future of scikit-learn [51:13] • Gaël on the social-impact data projects in his Soda lab [1:02:33] • Why domain expertise and statistical rigor are more important than ever [1:11:24] Additional materials: www.superdatascience.com/737
AI certification and EU regulation: Jan Zawadzki, CTO and Co-Managing Director of Certif.ai, talks to Jon Krohn about the future of certification for AI startups and keeping within rigorous international regulations. Additional materials: www.superdatascience.com/736 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Artificial General Intelligence, AlphaGo, and Google DeepMind: Jon Krohn speaks to Mehdi Ghissassi, Director of Product Management at Google DeepMind, about the ethics and social impact of AI, keeping up with AI releases with safety in mind, and other pressing AI problems that keep him awake at night. In this episode, Mehdi and Jon also take a broader look at the current AI landscape, the opportunities for AI investors and startups, and what AI product managers need to get ahead. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • How DeepMind seeks to ‘solve intelligence’ [05:14] • The impact of AGI’s capabilities on medicine [16:37] • How the general public might come to apply future AI systems [28:09] • How working on product development for Africa has shaped Mehdi’s perspective on AI’s potential and challenges [37:17] • How to stay on top of rapid changes in AI [39:17] • What investors look for in AI startups [59:16] • Tips for product managers [1:03:34] Additional materials: www.superdatascience.com/735
Robot Soccer takes center stage as Jon Krohn and Dário Catarrinho, Secretary of the Dutch Nao Team and an AI student at the University of Amsterdam, discuss the intricate machine learning that enables robots to navigate the field, make decisions in real-time, respond to sound, and compete against each other in a gripping display of skill and strategy. Additional materials: www.superdatascience.com/734 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Yannic Kilcher, a leading ML YouTuber and DeepJudge CTO, teams up with Jon Krohn this week to delve into the open-source ML community, the technology powering Yannic’s Swiss-based startup, and the significant implications of adversarial examples in ML. Tune in as they also unpack Yannic's approach to tracking ML research, future AI prospects and his startup challenges. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • About the OpenAssistant project [03:39] • Alignment issues in open-source vs closed-source [08:36] • Alternative formulas vital for crafting superior LLMs [20:29] • Strategies to foster open-source LLM ecosystems [27:07] • Yannic's pioneering work in legal document processing at DeepJudge [31:31] • Comprehensive overview of adversarial examples [1:04:02] • The future AI landscape [1:18:08] • Startup challenges [1:25:35] Additional materials: www.superdatascience.com/733
Exploring our vast universe, in this episode Jon Krohn meets with Daniela Huppenkothen at the University of Amsterdam's astronomy department for a wide-ranging discussion about building instrumentation for telescopes, collecting data from outer space and how to sort astronomy’s problem of enormous amounts of data. Additional materials: www.superdatascience.com/732 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Ethics and machine intelligence pioneer Nell Watson speaks to host Jon Krohn about the differences between AI ethics and AI safety, how crying wolf may result in future complications for AI development and the importance of ensuring IEEE standards to mitigate and regulate AI risks. She also touches on what she considers a “second Enlightenment”, in which we may start to form intimate relationships with AI, to both parties’ benefit. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • AI ethics and AI safety [05:30] • How "moving fast" could break the world [18:07] • The shifting relationship between humans and machines [29:54] • International ethics standards, and their review process [52:10] • Current and future ethical standards [1:05:31] • Building a universal basic income with AI [1:19:23] Additional materials: www.superdatascience.com/731
In this episode, Kyle Daigle, COO of GitHub, joins Jon Krohn to discuss the transformative impact of generative AI tools like GitHub Copilot. Learn how these tools streamline software development, enhance collaboration, and accelerate code reviews. Discover innovative approaches to collaboration and innersourcing, reshaping the future of teamwork in the digital age. Additional materials: www.superdatascience.com/730 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Blake Richards discusses the world of AI and human cognition this week. Learn about the essence of intelligence, the ways AI research informs our understanding of the human brain, and discover the potential future scenarios where AI and humanity might intersect. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Blake's research and his take on intelligence [09:56] • How we can evaluate progress in artificial general intelligence [15:54] • Blake's thoughts on biomimicry [20:57] • Why Blake thinks the fears regarding AI are overdone [25:38] • The most effective strategies to mitigate AI fears without hindering innovation [35:31] • What steps can we take to ensure that AI supports human flourishing [45:23] • The importance of interpreting neuroscience data through the lens of ML [55:08] • Backpropagation, gradient descent and the brain [1:17:32] Additional materials: www.superdatascience.com/729
Learn how to achieve human-like outputs from LLMs in this week’s Five-Minute Friday with Jon Krohn. Understand the various current methods available to decode and generate text, as well as the differences between them. Find out about greedy search, beam search, sampling, and contrastive search, and how you can use them to create incredibly useful LLMs. Additional materials: www.superdatascience.com/728 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Coded bias, intersectionality in AI, and computer vision: Founder of the Algorithmic Justice League Joy Buolamwini talks to host Jon Krohn about the impact of exclusion and inclusion in datasets, the need to address intersectionality when identifying racial, age, or gender-based prejudice in machine learning tools, protections for artists and creative practitioners against AI, and the role that AI may have in combating systemic racism. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What coded bias is [06:49] • The problem with bias in machine learning datasets [18:41] • The Incoding Movement [42:08] • About the Pilot Parliaments Benchmark [52:07] • Ethics and the future of AI [1:20:10] • The potential for AI to end systemic racism [1:32:59] Additional materials: www.superdatascience.com/727
Ben Jones, CEO of Data Literacy, discusses the seven crucial components of effective data leadership. From ethics to technology and fostering a data-centric culture, Jones provides actionable insights and practical examples. Tune in to empower your organization with purposeful and ethical data strategies from day one. Additional materials: www.superdatascience.com/726 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Kim Stachenfeld, Research Scientist at Google DeepMind and Affiliate Professor at Columbia University, delves into the realms of AI and neuroscience as she discusses computer-based simulations of the human brain, the efficiency of language in compression, and the neuroscience theories shaping the future of artificial intelligence. Discover the secrets behind memory formation, cognitive enhancement, and the potential of Artificial General Intelligence (AGI) in this thought-provoking episode. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The importance of simulations in the context of human intelligence [05:44] • The basic approach to simulating human intelligence or physical systems [09:30] • Will simulations help us realize AGI? [37:21] • The cross-disciplinary potential of LLMs [40:20] • The special role of our brain’s hippocampus in memory formation [1:05:15] • Kim's research on reinforcement learning and neural representation [1:15:02] • Compression in representation learning [1:38:51] • What skills should an aspiring computational neuroscientist hone [1:50:30] Additional materials: www.superdatascience.com/725
In this Friday episode, host Jon Krohn talks to UCSF’s David Moses about BRAVO (Brain-Computer Interface Restoration of Arm and Voice), a study led by Edward Chang and Karunesh Ganguly that helps patients who have lost the ability to speak to communicate once again via a speech neuroprosthesis. Postdoctoral engineer David Moses, who is a part of BRAVO, reveals the data and machine learning models that help BRAVO predict the words and facial expressions that a paralyzed patient is trying to form via their brain activity, crucially helping patients to communicate with medical practitioners and loved ones. Additional materials: www.superdatascience.com/724 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Mathematical optimization should be known to every data scientist: Jon Krohn speaks to Jerry Yurchisin, Data Science Strategist at Gurobi, the decision-making technology and best-kept secret of 80% of America’s leading enterprises. This episode is brought to you by the Zerve data science dev environment, by ODSC, the Open Data Science Conference, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What mathematical optimization is [04:27] • How the Gurobi solver works [29:01] • How to use Gurobi with Python [36:08] • Coding and algebra resources [41:14] • When to use mathematical optimization and machine learning together [54:23] • Using mathematical optimization in natural language processing [1:01:00] Additional materials: www.superdatascience.com/723
This episode delves into an intriguing research paper from top institutions like UC Irvine and MIT, analyzing the carbon emissions of AI-driven writing and illustrating versus traditional human methods. The findings might surprise you. Is AI the more eco-friendly option? Listen now to explore this compelling intersection of technology and sustainability. Additional materials: www.superdatascience.com/722 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Amira Abbas, Quantum Computing Researcher at the University of Amsterdam, explores the captivating world of Quantum Machine Learning. Learn about the distinct characteristics of qubits and the vital processes of Quantum ML. For those keen on exploring further, Amira offers noteworthy ML tool suggestions to kickstart your journey in Quantum Computing. This episode is brought to you by Gurobi, the Decision Intelligence Leader, by ODSC, the Open Data Science Conference, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Quantum computing vs classical computing [03:42] • What is quantum entanglement [11:45] • What is a qubit [15:07] • The best problems for quantum ML [30:08] • Three distinct steps in quantum ML and its potential [39:06] • Quantum neural networks [49:03] • What Amira's working on at the moment [1:10:20] • How to get started in quantum ML [1:21:06] • Amira's recommended ML tools for quantum computing [1:30:39] Additional materials: www.superdatascience.com/721
DALL-E may be playing second fiddle to Midjourney no longer with OpenAI’s latest model for generative AI art, DALL-E 3. Host Jon Krohn breaks down how the newest model goes beyond producing incredible artistic images and follows your written brief to the letter. Additional materials: www.superdatascience.com/720 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode, Margot Gerritsen and Jon Krohn discuss the fundamentals of computational mathematics and its application in studying fluid dynamics. Margot also talks about how her synesthesia led to a lifelong interest in math, using computational mathematics to predict airflow, and why it is so important that underrepresented groups in data science become more visible through organizations like Women in Data Science. This episode is brought to you by the Zerve data science dev environment, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • About computational mathematics and its relation to data science [03:19] • Margot’s current research into emissions simulation [15:05] • Computational Mathematics: Real-World Applications [33:18] • The importance of wind tunnels in testing designs [47:54] • The beauty of linear algebra [1:05:59] • Synesthesia: Seeing Numbers as Colors [1:16:33] • About Women in Data Science [1:24:59] Additional materials: www.superdatascience.com/719
Elevate your ChatGPT game with a useful custom instruction. Tune in to hear Jon’s trick for maximizing ChatGPT’s potential. Additional materials: www.superdatascience.com/718 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Dan Shiebler, Head of ML at Abnormal Security, joins Jon Krohn this week and unveils the intricacies of cybercrime detection and email protection, and the role of AI in future challenges. This episode is brought to you by Grafbase, the unified data layer, by ODSC, the Open Data Science Conference, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The heuristic and “intermediate” ML models that they develop at Abnormal Security [07:08] • How Dan uses LLMs at Abnormal Security [15:46] • How false negatives are individually the biggest classification error to avoid in cybersecurity [20:49] • How head-to-head competitor analysis helps refine models [34:34] • Resilient ML in cybersecurity [38:36] • Abnormal Security’s routine for updating their models [52:37] • AI's impact on the urban world [1:09:57] • How to stay updated in data science and AI [1:13:46] Additional materials: www.superdatascience.com/717
Jon Krohn's 94-year-old grandmother, Annie, who's bursting with life and wisdom, shares her recipe for lifelong happiness and how relationships and daily intentions play an integral role. Annie also shares her curious take on modern technology. Get inspired by her infectious joy and perspective on life. Additional materials: www.superdatascience.com/716 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Join us as Dr. Allen Downey, renowned author and professor, shares insights from his upcoming book 'Probably Overthinking It,' breaking down underused techniques like Survival Analysis, explaining common paradoxes, and discussing the dynamic Overton Window. This episode is brought to you by the Zerve data science dev environment, by Modelbit, for deploying models in seconds, and by Grafbase, the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Why interpreting data is not always easy [06:21] • What is Survival Analysis [15:32] • Preston's Paradox [22:09] • Are you Normal? [36:52] • How to better prepare for rare “Black Swan” events [42:48] • What is an Overton Window? [53:06] • What is the base rate fallacy? [1:23:31] • How to protect yourself from biased samples [1:33:39] • Simpson’s Paradox [1:42:43] Additional materials: www.superdatascience.com/715
In this Friday episode, guest Tim Albiges explores with host Jon Krohn how people with blindness can have a lucrative and fulfilling career in data science, how Tim’s PhD thesis applied machine learning to help diagnose chronic respiratory diseases, and the communication tools that blind people can use to live a full and independent life. Additional materials: www.superdatascience.com/714 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Artificial General Intelligence, RLHF’s application in AI, and how entrepreneurs can enter the AI industry: Meta’s AI Research Scientist Thomas Scialom gives us behind-the-scenes insights into developing Llama 2 and what’s in the works for Llama 3. With host Jon Krohn, he discusses the future of Artificial General Intelligence, why the Galactica science-focused LLM was taken down, and what he learned from it. This episode is brought to you by AWS Inferentia, by Grafbase, the unified data layer, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Llama 2: Behind the Scenes of Today’s Top Open-Source LLM [05:04] • Responsible use of Llama 2 [15:26] • Toolformer: LLM That Learns How to Use External Tools [24:57] • Galactica: The Science-Specific LLM and Why It Was Brought Down [36:57] • Is AGI Around the Corner? [57:03] • Advice for AI entrepreneurs [1:05:46] • How Thomas develops and manages large-scale AI projects [1:14:42] Additional materials: www.superdatascience.com/713
Code Llama might just be starting the revolution for how data scientists code. In this Five-Minute Friday, host Jon Krohn investigates the suite of models under the free-to-use Code Llama and how to find the best fit for your project’s needs. Additional materials: www.superdatascience.com/712 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode, host Jon Krohn explores with his guest Ajay Jain, Co-Founder of Genmo.ai, how creative general intelligence could take the video industry by storm. They also discuss the models that got Genmo to this point, the applications of NeRF, and why understanding human psychology is essential to developing models that output high-fidelity video. This episode is brought to you by the Zerve data science dev environment, by Grafbase, the unified data layer, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • About Genmo.ai and the term “creative general intelligence” [03:47] • Why Ajay started Genmo.ai [09:26] • The increased performance of multimodal models [21:12] • All about Denoising Diffusion Probabilistic Models (DDPMs) [31:03] • The application of Neural Radiance Fields (NeRF) [55:26] • Predicting pedestrian behavior at Uber [1:01:50] • How to save money in the process of training models [1:12:42] Additional materials: www.superdatascience.com/711
Discover the power of Large Language Models with Kris Ograbek as he unravels the intricacies of LangChain and showcases a chatbot in action, all while putting our host Jon Krohn in the hot seat! Additional materials: www.superdatascience.com/710 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Meta's Senior Research Director, Dr. Laurens van der Maaten, takes center stage to unravel the captivating realm of AI innovation. Learn about his groundbreaking contributions, including pioneering the t-SNE dimensionality reduction technique and harnessing AI for novel protein synthesis, climate change mitigation, and wearable materials simulation. Join us to explore the transformative power of AI across diverse domains and gain a glimpse into its future societal implications. This episode is brought to you by AWS Inferentia, by Modelbit, for deploying models in seconds, and by Grafbase, the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Large-scale learning of image recognition models on web data [05:05] • Evolutionary Scale Modeling protein models [16:45] • Fighting climate change by building an A.I. model [29:49] • The CrypTen privacy-preserving ML framework [38:36] • Concerns about adversarial examples [53:25] • Laurens’ t-SNE algorithm [58:56] • How to make a big impact [1:07:25] Additional materials: www.superdatascience.com/709
On this week’s Five-Minute Friday, host Jon Krohn gives five reasons why he is so excited about ChatGPT’s Code Interpreter and walks listeners through its capabilities with a practical example. Additional materials: www.superdatascience.com/708 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
LLM Vicuña, Chatbot Arena, and the race to increase LLM context windows: This episode’s guest Joey Gonzalez talks to Jon Krohn about developing models and platforms that leverage and improve LLMs, as well as the future of AI development and access. This episode is brought to you by the AWS Insiders Podcast, by Modelbit, for deploying models in seconds, and by Grafbase, the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Vicuña: How the revolutionary LLM came to be [03:35] • Chatbot Arena: The leading LLM leaderboard [09:47] • Trusting LLM results [17:54] • Gorilla: The open-source ChatGPT plugin alternative [32:13] • About LMSYS and long context windows [47:48] • Open- vs closed-source LLMs: Which is better? [1:01:39] • Aqueduct [1:16:49] • Founding GraphLab [1:27:02] • How AI will positively impact society in the coming decades [1:32:31] Additional materials: www.superdatascience.com/707
In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena. Additional materials: www.superdatascience.com/706 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Join Jon Krohn as he chats with Syngenta Group's Feroz Sheikh, Jeremy Groeteke, and Thomas Jung about the digital revolution in agriculture. Learn how data science is changing farming, from precision techniques to global food solutions. A compelling blend of tech and nature. This episode is brought to you by AWS Inferentia and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is precision agriculture? [09:43] • What is computational agronomy? [12:30] • How Syngenta helps growers optimize yields [21:37] • How to bridge the gap between R&D and the real world [33:58] • What is generative chemistry? [37:52] • How generative chemistry accelerates the discovery of new compounds [41:55] • How you could make a big social impact in agriculture with data science [56:22] • How to go about designing ML models for agriculture [1:00:27] Additional materials: www.superdatascience.com/705
Take on the world of GPT and learn to develop your own commercially successful Large Language Models (LLMs) with Jon Krohn’s comprehensive, guided training video for generative AI. Get to grips with the technology, learn which tools to use, and find out how to develop an eye for business-viable models with Jon’s (ad-)free educational video. Additional materials: www.superdatascience.com/704 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Statistics history, interdisciplinarity, and data and society. Chris Wiggins talks with Jon Krohn about the power dynamics of data, the transformation of the field of biology through data-driven approaches to genetic sequencing, and the New York Times’ data science team’s cutting-edge approach to accommodating its tech stack. This episode is brought to you by the AWS Insiders Podcast and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The importance of the humanities in data science [09:18] • How data science “rearranges” power [17:19] • An overview of How Data Happened [20:36] • The controversial nature of Bayes theorem [29:16] • Why we need to consider data ethics [34:00] • How biology came to adopt data science into its field [45:44] • The data science tech stack at the New York Times [49:18] Additional materials: www.superdatascience.com/703
This week, Jon Krohn is examining Meta's newly released open-source large language model, Llama 2, highlighting its commercial prospects, immense capacity, model variety, and unique 'time awareness' feature. He also discusses its innovative two-stage RLHF approach that enhances its performance. Additional materials: www.superdatascience.com/702 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Raluca Ada Popa, renowned computer scientist, entrepreneur, and President of Opaque Systems, joins Jon Krohn to share her insights on securely interacting with AI APIs like OpenAI's GPT-4, the pros and cons of open vs. closed-source AI development, and the seamless operation of compute pipelines across multiple clouds. This episode is brought to you by AWS Inferentia and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is a confidential computing platform? [04:31] • How to get started with confidential computing [12:10] • The challenges of confidential computing and LLMs [21:11] • How to safeguard your data while using commercial LLMs like GPT-4 [38:00] • Open-source vs closed-source [52:28] • Raluca's PreVail cybersecurity company [1:01:50] • Combining entrepreneurship and an academic career [1:04:03] • DARE Program [1:10:39] Additional materials: www.superdatascience.com/701
Yoga and Hindu mythology: This special episode continues the thread of our centenary episodes, SDS 500: Yoga Nidra with Jes Allen and SDS 600: Yoga Nidra Practice with Steve Fazzari, which talked through guided meditation techniques to help improve posture and sleep and to expand consciousness. Inspired by these sessions, host Jon Krohn explores Hindu mythology via Alan Watts’ “The Dream of Life”. Additional materials: www.superdatascience.com/700 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Model deployment, data warehouse options for running models, and how to best leverage BI tools: Harry Glaser and Jon Krohn discuss Modelbit’s capabilities to automate ML models from notebooks into production-ready models, reducing the time and effort in ‘translating’ information from one mode to another. Harry’s conversation with host Jon Krohn expands on the importance of automating this task, and how developments in ML modeling have widened access so that entire teams can analyze data, whatever their level of expertise. This episode is brought to you by the AWS Insiders Podcast. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What the modern data stack is [03:28] • Version control for data scientists [13:30] • CI/CD, load balancing and logging [20:38] • Snowflake vs. Redshift [30:10] • How tools like Looker and Tableau help monitor models [35:26] Additional materials: www.superdatascience.com/699
Company-wide AI adoption can take a lot of persuasion. Rehgan Avon talks to host Jon Krohn about why AI has become necessary for forward-thinking businesses and the steps to implement AI in an institution so that everyone benefits. Additional materials: www.superdatascience.com/698 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
AI visionary and CEO of SingularityNET Dr. Ben Goertzel provides a deep dive into the possible realization of Artificial General Intelligence (AGI) within 3-7 years. Explore the intriguing connections between self-awareness, consciousness, and the future of Artificial Super Intelligence (ASI) and discover the transformative societal changes that could arise. This episode is brought to you by AWS Inferentia, by the AWS Insiders Podcast, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Decentralized and benevolent AGI [03:13] • The SingularityNET ecosystem [13:10] • Dr. Goertzel's vision for realizing AGI - combining DL with neuro-symbolic systems, genetic algorithms and knowledge graphs [25:50] • How reaching AGI will trigger Artificial Super Intelligence [38:51] • Dr. Goertzel's approach to AGI using OpenCog Hyperon [42:34] • Why Dr. Goertzel believes AGI will be positive for humankind [53:07] • How to ensure the AGI is benevolent [1:06:43] • How AGI or ASI may act ethically [1:13:50] Additional materials: www.superdatascience.com/697
Jon Krohn welcomes Professor Dr. Bob Knight to explore human intelligence, the prefrontal cortex, and the transformative potential of brain implants for data collection. Discover the pivotal role of machine learning in treating Parkinson's and delve into exciting future advancements. Additional materials: www.superdatascience.com/696 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
What are transformers in AI, and how do they help developers to run LLMs efficiently and accurately? This is a key question in this week’s episode, where Hugging Face’s ML Engineer Lewis Tunstall sits down with host Jon Krohn to discuss encoders and decoders, and the importance of continuing to foster democratic environments like GitHub for creating open-source models. This episode is brought to you by the AWS Insiders Podcast, by WithFeeling.ai, the company bringing humanity into AI, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What a transformer is, and why it is so important for NLP [04:34] • Different types of transformers and how they vary [11:39] • Why it’s necessary to know how a transformer works [31:52] • Hugging Face’s role in the application of transformers [57:10] • Lewis Tunstall’s experience of working at Hugging Face [1:02:08] • How and where to start with Hugging Face libraries [1:18:27] • The necessity to democratize ML models in the future [1:25:25] Additional materials: www.superdatascience.com/695
Modeling tabular data and spreadsheets doesn’t have to be tedious with CatBoost’s open-source tree-boosting algorithm. CatBoost does what it says on the tin, blending categories with boosting so that you can train your models faster and handle large datasets for ML tasks across multiple GPUs. In this week’s Five-Minute Friday, host Jon Krohn gets to grips with the technical components of CatBoost that give it the speed and accuracy so acclaimed by its users. Additional materials: www.superdatascience.com/694 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Harpreet Sahota, a data science expert and deep learning developer at Deci AI, joins Jon Krohn to explore the fascinating realm of object detection and the revolutionary YOLO-NAS model architecture. Discover how machine vision models have evolved and the techniques driving compute-efficient edge device applications. This episode is brought to you by AWS Inferentia, by WithFeeling.ai, the company bringing humanity into AI, and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is machine vision? [07:02] • Object detection and YOLO architectures [13:00] • Deci's YOLO-NAS: Optimal object detection model architecture [23:39] • Developer Relations [1:00:16] • Harpreet's 'top-down' approach to learning Deep Learning [1:06:50] Additional materials: www.superdatascience.com/693
Join Jon as he navigates listeners through the innovative SpQR approach—a cutting-edge, lossless LLM weight compression technique that harnesses the power of quantization. Tune in as Jon delves into the four steps behind this groundbreaking method in this week's episode. Additional materials: www.superdatascience.com/692 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
GPUs vs CPUs, chip design and the importance of chips in AI research: This highly technical episode is for anyone who wants to learn what goes into chip development and how to get into the competitive industry of accelerator design. With advice from expert guest Ron Diamant, Senior Principal Engineer at AWS, you’ll get a breakdown of the need-to-know technical terms, what chip engineers need to think about during the design phase and what the future holds for processing hardware. This episode is brought to you by Posit, the open-source data science company, by the AWS Insiders Podcast, and by WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What CPUs and GPUs are [05:29] • The differences between accelerators used for deep learning [14:31] • Trainium and Inferentia: AWS's A.I. Accelerators [22:10] • Whether model optimizations will lead to lower demand for hardware to process them [43:14] • How a chip designer goes about production [48:34] • Breaking down the technical terminology for chips (accelerator interconnect, dynamic execution, collective communications) [55:29] • The importance of AWS Neuron, a software development kit [1:15:42] • How Ron got his foot in the door with chip design [1:26:40] Additional materials: www.superdatascience.com/691
Krishna Gade, the founder and CEO of Fiddler.AI, discusses the challenges faced by Large Language Models (LLMs) in Generative AI, including inaccuracies, biases, and privacy risks. He emphasizes the importance of monitoring to build trust in AI and highlights Fiddler's explainability algorithms and pre-built bias detection tools as vital solutions. Additional materials: www.superdatascience.com/690 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Arize's Amber Roberts and Xander Song join Jon Krohn this week, sharing invaluable insights into ML Observability, drift detection, retraining strategies, and the crucial task of ensuring fairness and ethical considerations in AI development. This episode is brought to you by Posit, the open-source data science company, by AWS Inferentia, and by Anaconda, the world's most popular Python distribution. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is ML Observability [05:07] • What is Drift [08:18] • The different kinds of model drift [15:31] • How frequently production models should be retrained [25:15] • Arize's open-source product, Phoenix [30:49] • How ML Observability relates to discovering model biases [50:30] • Arize case studies [57:13] • What is a developer advocate [1:04:51] Additional materials: www.superdatascience.com/689
Prompt injection, prompt engineering, context windows, and more: In this week’s Five-Minute Friday, Jon explains why anyone looking to build their own product leveraging LLMs should stop to consider these and three more issues before jumping in. Phillip Carter first outlined these six issues in his article “All the Hard Stuff Nobody Talks About when Building Products with LLMs”. Additional materials: www.superdatascience.com/688 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Autoencoders, transformers, latent space: Learn the elements of generative AI and hear what data scientist David Foster has to say about the potential for generative AI in music, as well as the role that world models play in blending generative AI with reinforcement learning. This episode is brought to you by Posit, the open-source data science company, by Anaconda, the world's most popular Python distribution, and by WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Generative modeling vs discriminative modeling [04:21] • Generative AI for Music [13:12] • On the threats of AI [23:15] • Autoencoders Explained [38:36] • Noise in Generative AI [48:11] • What CLIP models are (Contrastive Language-Image Pre-training) [54:07] • What World Models are [1:00:40] • What a Transformer is [1:11:14] • How to use transformers for music generation [1:19:50] Additional materials: www.superdatascience.com/687
Microsoft’s Ruth Yakubu joins Jon Krohn to discuss Responsible AI principles and the open-source Responsible AI Toolbox, which allows users to assess their models for fairness, inclusiveness, privacy, explainability, accountability, and reliability before deployment. Additional materials: www.superdatascience.com/686 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Richmond Alake, a Machine Learning Architect at Slalom Build, sits down with Jon to share real-time ML insights, tools and career experiences for a high-energy, high-impact episode. From his work at Slalom Build to his two AI startups, discover the software choices, ML tools, and front-end development techniques used by a leader in the field. This episode is brought to you by Posit, the open-source data science company, by AWS Inferentia, and by WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is a Machine Learning Architect? [03:09] • Richmond's startups [12:07] • Why Richmond started a podcast [29:51] • Richmond's new course on feature stores [38:05] • Why Richmond produces data science content [43:25] • Why All Data Scientists Should Write [51:30] Additional materials: www.superdatascience.com/685
Open-source LLMs, FlashAttention and generative AI terminology: Host Jon Krohn gives us the lift we need to explore the next big steps in generative AI. Hear how Stanford University’s “exact attention” algorithm, FlashAttention, could become a competitor for GPT-4’s capabilities. Additional materials: www.superdatascience.com/684 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Monitoring malicious, user-generated content; contextual AI; adapting to novel evasion attempts: Matar Haller speaks to Jon Krohn about the challenges of identifying, analyzing and flagging malicious information online. In this episode, Matar explains how contextual AI and a “database of evil” can help resolve the multiple challenges of blocking dangerous content across a range of media, even those that are live-streamed. This episode is brought to you by Posit, the open-source data science company, by Anaconda, the world's most popular Python distribution, and by WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • How ActiveFence helps its customers to moderate platform content [05:36] • How ActiveFence finds extreme social media users trying to evade detection [16:32] • How to monitor live-streaming content and analyze it for dangerous material [29:13] • The technologies ActiveFence uses to run its platform [35:54] • Matar’s experience of the Insight Fellows Program (Data Science Fellowship) [40:28] • Leadership opportunities for women in STEM [1:00:41] • Israel’s R&D edge for AI [1:13:19] Additional materials: www.superdatascience.com/683
In this week's episode, Mico Yuk, host of 'Analytics on Fire', joins Jon Krohn to share her effective business intelligence and analytics framework, BIDS, for persuading key decision makers. She crowns one "power" tool as the analytics king and discusses emerging tools that could challenge its dominance. Tune in for unapologetic insights on current and future trends in BI and analytics. Additional materials: www.superdatascience.com/682 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Unlock the power of XGBoost by learning how to fine-tune its hyperparameters and discover its optimal modeling situations. This and more, when best-selling author and leading Python consultant Matt Harrison teams up with Jon Krohn for yet another jam-packed technical episode! Are you ready to upgrade your data science toolkit in just one hour? Tune in now! This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by Anaconda, the world's most popular Python distribution. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Matt's book ‘Effective XGBoost’ [07:05] • What is XGBoost [09:09] • XGBoost's key model hyperparameters [19:01] • XGBoost's secret sauce [29:57] • When to use XGBoost [34:45] • When not to use XGBoost [41:42] • Matt’s recommended Python libraries [47:36] • Matt's production tips [57:57] Additional materials: www.superdatascience.com/681
Industrial machinery’s dependence on data science, tech stacks to build IoT platforms, and transitioning from data science to product: This week’s Friday episode with Allegra Alessi explores the minutiae of product ownership for the Internet of Things at packaging company Bobst. Join host Jon Krohn and his guest as they unpack how the IoT is leading factory production. Additional materials: www.superdatascience.com/680 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Generative AI, MLOps, and making smart investments in AI: This week’s episode is critical listening for AI investors and generative AI creators. AI investor George Mathew talks with host Jon Krohn about the emerging generative AI stack, the critical elements of MLOps to ensure a scalable model, and the tools developers can use for a saleable product. This episode is brought to you by Posit, the open-source data science company, by AWS Inferentia, and by Anaconda, the world's most popular Python distribution. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Venture capital’s role in the technology startup ecosystem [05:59] • How RLHF helps UI become more intuitive [12:53] • The four layers of the generative AI stack [34:16] • The risks for generative AI business founders and investors [46:50] • How MLOps drive best practices and help implementation [56:33] • The importance of PLG (Product-Led Growth) [1:04:15] • How generative AI tools will impact the labor market [1:17:34] Additional materials: www.superdatascience.com/679
StableLM, the new family of open-source language models from the brilliant minds behind Stable Diffusion, is out! Small but mighty, these models have been trained on an unprecedented amount of data for single-GPU LLMs. This week, Jon breaks down the mechanics of this model. See you there! Additional materials: www.superdatascience.com/678 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
How does one use marketing analytics to drive business success? Avinash Kaushik, Chief Strategy Officer at Croud and former Sr. Director of Global Strategic Analytics at Google, joins Jon Krohn live for an exciting episode that covers the transformative power of AI, his 'four clusters of intent' framework and the value of hands-on data tools. This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by Anaconda, the world's most popular Python distribution. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is a chief strategy officer? [3:55] • Brand vs performance analytics [7:23] • Incrementality-centric marketing [32:53] • Avinash's time at Google [37:54] • How to maintain the human touch with AI [48:58] • Four clusters of intent framework [1:11:28] • Avinash's most significant career challenges [1:17:18] Additional materials: www.superdatascience.com/677
Chinchilla AI, and fine-tuning proprietary tasks with large language models: On this week’s Five-Minute Friday, host Jon Krohn outlines the principles of the Chinchilla Scaling Laws, the incredible power of models such as Cerebras-GPT based on these laws, and the impact of scaling on the number of viable applications and commercial use cases. Additional materials: www.superdatascience.com/676 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Wrangling data in Pandas, when to use Pandas, Matplotlib or Seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The advantages of using pandas over other libraries [07:55] • Why data wrangling in pandas is so helpful [12:05] • Stefanie’s Data Morph library [24:27] • When to use pandas, matplotlib, or seaborn [33:45] • Understanding the ticker module in matplotlib [36:48] • Where data analysts should start their learning journey [40:08] • What it’s like being a software engineer at Bloomberg [51:19] Additional materials: www.superdatascience.com/675
Models like Alpaca, Vicuña, GPT4All-J and Dolly 2.0 have relatively small model architectures, but they're prohibitively expensive to train even on a small amount of your own data. The standard model-training protocol can also lead to catastrophic forgetting. In this week's episode, Jon explores a solution to these problems, introducing listeners to Parameter-Efficient Fine-Tuning (PEFT) and the leading approach: Low-Rank Adaptation (LoRA). Additional materials: www.superdatascience.com/674 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Vincent Gosselin, CEO and co-founder of Taipy, an open-source Python library, joins Jon Krohn to discuss how to accelerate productivity in Python and build scalable, reusable, and maintainable data pipelines. Gosselin shares the breadth of wisdom he has honed over his decades-long AI career. This episode is brought to you by Pathway, the reactive data processing framework, and by Posit, the open-source data science company. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The Taipy library functionality [2:59] • The future of data pipelines [21:40] • Common trends of companies that are successful at adopting data pipelines [28:31] • How no-code and low-code trends impact the data science lifecycle [33:00] • How Vincent chose the programming languages that underpin Taipy [41:40] • Common trends on how companies manage their data to learn from it [45:06] • Vincent's perspective on AI winters [51:03] Additional materials: www.superdatascience.com/673
Get started with language models: Learn about the commercial-use options available for your business in this week’s Five-Minute Friday, where host Jon Krohn discusses four models that have many of the capabilities of ChatGPT and can run at a fraction of the cost. Additional materials: www.superdatascience.com/672 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Get to grips with AWS, Azure, and Google Cloud Platform on this week’s episode. Host Jon Krohn speaks with Kirill Eremenko and Hadelin de Ponteves about CloudWolf, a cloud computing educational platform that prepares students for certification in AWS (Amazon Web Services). Find out why an accreditation in cloud computing could be the safest investment for your data science career. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• About CloudWolf [07:04]• Why learning the cloud is important for data scientists [09:12]• Is learning cloud computing complex? [22:30]• Essential AWS services [28:31]• Database options on AWS [33:47]• How to run analytics on AWS [40:58]• Why an AWS certification is so helpful [56:35] Additional materials: www.superdatascience.com/671
How does Meta AI's natural language model, LLaMA, compare to the rest? Based on the Chinchilla scaling laws, LLaMA is designed to be smaller but more performant. But how exactly does it achieve this feat? It's all done by training a small model for a longer period of time. Discover how LLaMA compares to its competition, including GPT-3, in this week's episode. Additional materials: www.superdatascience.com/670 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode, Jon Krohn welcomes Adrian Kosowski, Co-Founder and Chief Product Officer at Pathway, who shares insights on streaming data processing and reactive data processing, and how they're shaping the future of machine learning. Tune in now for an unforgettable episode. This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• About Pathway's reactive data processing framework [04:45]• Reactive data processing use cases [17:08]• What is the difference between batch and streaming processing [33:18]• Transformers in data engineering and data streaming [53:44]• The benefits of Adrian's technical background as a CPO [1:04:17]• Adrian's responsibilities and favorite tools as a CPO [1:15:25]• Emerging ML approaches and tools for startups [1:28:49] Additional materials: www.superdatascience.com/669
AI risks, RLHF, and inner alignment: GPT stands to give the business world a major boost. But with everyone racing either to develop products that incorporate GPT or use it to carry out critical tasks, what dangers could lie ahead in working with a tool that applies essentially unknowable means (inner alignments) to reach its goals? This week’s guest Jérémie Harris speaks with Jon Krohn about the essential need for anyone working with GPT to understand the impact of a system comprising inner alignments that cannot – and may never – be fully understood. Additional materials: www.superdatascience.com/668 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
GPT-4, augmenting human tasks with AI, and using GPT-4 commercially: Vin Vashishta speaks to host Jon Krohn about how to leverage GPT-4 and outperform your competitors in both speed and value. Learn how GPT-4 has outmatched its predecessors – and many skilled workers – in this latest iteration of large language models. This episode is brought to you by Pathway, the reactive data processing framework, by Posit, the open-source data science company, and by epic LinkedIn Learning instructor Keith McCormick. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Using GPT-4 to screen for jobs [06:26]• A framework for improving systems with GPT [13:32]• Teaming, tooling and collaborating with GPT-4 [29:58]• How to accelerate data science with generative A.I. [45:36]• How to prepare for opportunities with GPT-4 [52:09] Additional materials: www.superdatascience.com/667
GPT-4 has landed! But how well does it compare to GPT-3.5? Tune in to hear Jon stack its performance against its predecessor; the results might just blow your mind. Additional materials: www.superdatascience.com/666 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Angel investor and data science consultant Josh Wills sits down with Jon Krohn to discuss his former roles (Google, Slack, and Cloudera) and the essential skills for engineering scalable machine learning projects. This episode is brought to you by Pathway, the reactive data processing framework, and by epic LinkedIn Learning instructor Keith McCormick. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Josh's 'Data Engineering for Machine Learning' course [06:50]• Contextual bandits [10:52]• Data quality and monitoring [16:45]• The “infinite loop of sadness” in data product development [25:12]• Josh’s definition of a data scientist [30:02]• Josh's role at WeaveGrid [37:36]• Management-Track vs Independent Contributor [48:47]• Josh's work on the Covid pandemic [1:06:46]• Josh’s favorite tech stack [1:11:13] Additional materials: www.superdatascience.com/665
Can ChatGPT make us better and faster in our work, and is it the future or just another fad? In this episode, Jon Krohn delves into a new study from MIT about the tool’s potential to boost productivity in white-collar tasks. Additional materials: www.superdatascience.com/664 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
NLP, transformer architectures, and machines beating humans at their own game: Jon Krohn talks to Alexander H. Miller about his work in building a machine that can outsmart humans in the game of Diplomacy by engineering powers of persuasion and collusion to its own advantage. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Training a natural language model to interact with Diplomacy players [05:07]• Processing speeds for a Diplomacy bot [29:32]• Using transformer architectures [37:25]• How Diplomacy AI actually works [43:25]• CICERO's potential real-world applications [55:28]• How to R&D an AI project [59:27]• How to become an AI Research Manager [1:06:12] Additional materials: www.superdatascience.com/663
Our list of the top 10 SuperDataScience podcast episodes for 2022 is here. From Pandas to causality, AI breakthroughs and data storytelling, these were your most popular episodes of the year gone by. Additional materials: www.superdatascience.com/662 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Chip Huyen, co-founder of Claypot AI and author of O'Reilly's best-selling "Designing Machine Learning Systems", is here to share her expertise on designing production-ready machine learning applications, the importance of iteration in real-world deployment, and the critical role of real-time machine learning in various applications. Technical listeners like data scientists and machine learning engineers will definitely enjoy this one! This episode is brought to you by Pathway, the reactive data processing framework (pathway.com), and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Why Chip wrote 'Designing Machine Learning Systems' [08:58]• How Chip ended up teaching at Stanford [13:18]• About Chip's book 'Designing Machine Learning Systems' [21:12]• What makes ML feel like magic [30:53]• How to align business intent, context, and metrics with ML [37:55]• The lessons Chip learned about training data [42:03]• Chip's secrets to engineering good features [53:19]• How Chip optimizes her productivity [1:07:48] Additional materials: www.superdatascience.com/661
ChatGPT is well-known for its potential to disrupt the writing industry, but in what other, perhaps less explored, ways can we use the tool? In this episode, Jon Krohn outlines five critical ways that ChatGPT can augment a data scientist’s work. From generating code to acting as a translation tool for programming languages, listen in to hear why ChatGPT could become a vital part of every data scientist’s toolkit. Additional materials: www.superdatascience.com/660 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
NLP practitioners: this episode is for you. From the awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam presents to Jon Krohn a recipe for a successful project and the open-source NLP tools to get there. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• How Vincent came to work with De Speld [08:57]• Vincent’s role at Explosion [18:59]• How users can apply spaCy [21:46]• Prodigy: Annotate training data more efficiently with scripts [26:28]• How to manage “skill anxiety” with Calmcode [32:32]• How Vincent fixed bad labels [42:47]• The value of understanding linguistics for NLP [54:42]• How to constrain artificial stupidity [1:02:38] Additional materials: www.superdatascience.com/659
What makes data products popular? Brian T. O'Neill, Founder and Principal of Designing for Analytics, returns to the podcast to help us crack the code on building data products that people love. Additional materials: www.superdatascience.com/658 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Data engineering educator Andreas Kretz joins Jon Krohn for a 1-hour primer that covers everything you need to know about the most in-demand role in data. From skills to tools, problem-solving processes and more, growing your knowledge of data engineering only improves your marketability, so tune in today if you're ready to future-proof your data career. This episode is brought to you by Glean (glean.io), the platform for data insights fast, and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Why learn data engineering? [06:55]• What is data engineering? [08:08]• What sets Senior Data Engineers apart from junior ones? [13:57]• The must-know data-engineering tools [20:26]• The right path to learn data engineering [44:24]• Are certifications worth it? [51:46]• The future of data engineering [55:24]• Andreas's career challenges [58:48] Additional materials: www.superdatascience.com/657
How to attract an AI recruiter’s attention: In this episode, Jon Krohn and Tribe AI CEO Jaclyn Rice Nelson break down the key ingredients needed to make a Tribe AI recruiter say “yes!” Get Jaclyn’s top tips for forward-thinking AI talent, the skills you need to learn, and the in-demand roles on Tribe’s list of clients. Additional materials: www.superdatascience.com/656 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Transparent data science, profitable AI, and what’s missing from a data science education: Pandata’s Data Scientist in Residence Keith McCormick and Jon Krohn discuss how “insights” can never be the end product of a data science project, how to ensure you have a specific goal at the start of a project that is related to revenue, and why there is so much miscommunication between data scientists and their clients. Exclude the C-suite at your peril! This episode is brought to you by Glean (glean.io), the platform for data insights, fast. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What an Executive Data Scientist in Residence is [05:27]• What A.I. transparency is and how it relates to the field of Explainable A.I. (XAI) [17:34]• How companies can ensure they profit from AI projects [36:47]• Possible organization structures for data science teams to be profitable [1:02:41]• The current gaps in data science education [1:09:58] Additional materials: www.superdatascience.com/655
14-year-old AI prodigy Mike Wimmer joins Jon Krohn to discuss his latest projects. Whether he's using AI to help conserve the world's coral reefs or launching his new IOT-based company, Mike is an endless source of inspiration in the field of AI. Additional materials: www.superdatascience.com/654 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Carlos Aguilar, the founder and CEO of Glean, a data exploration and visualization platform, knows a thing or two about starting and growing a tech startup. After recently raising a $7 million seed round, he sits down with Jon Krohn to dive into the makings of his platform and shares tips for building a great founding team and delighting early adopters. In this episode you will learn:• How Glean extracts actionable insights from its clients' data warehouses [06:48]• What sets Glean apart from other platforms [12:43]• Glean's software stack [14:43]• Glean's recent fundraising journey [24:56]• The essential characteristics of a founding team [30:53]• How Carlos founded Glean [36:56]• Carlos's former role at Flatiron Health [40:49]• How Carlos created a robotic painter [48:57] Additional materials: www.superdatascience.com/653 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
MedTech, communications technology and computer vision: In this Five-Minute Friday, Jon Krohn investigates the technology that allows patients who have lost their ability to speak because of medical ventilation to communicate clearly. Additional materials: www.superdatascience.com/652 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Data visualizations, color theories and color inclusivity: In this episode, Kate Strachnyi and host Jon Krohn discuss how color can make or break your data visuals, ways to make your charts and graphs more inclusive through color, and how Kate developed the tools and techniques to nail color for your data stories in her latest book, ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color. In this episode you will learn:• What a “data storyteller” is [11:01]• Why color use should always be intentional [12:52]• Is color always necessary in data visualization? [29:41]• Color selection tips for your data visuals [31:19]• Three-color scales [34:54]• How to respect individual cultures in the color choices you make [38:25]• Best tools for data visualization [54:35] Additional materials: www.superdatascience.com/651 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
SparseGPT is a noteworthy one-shot pruning technique that can halve the size of large language models like GPT-3 without adversely affecting accuracy. In this episode, Jon Krohn provides an overview of this development and explains its commercial and environmental implications. Additional materials: www.superdatascience.com/650 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Looking for a short primer on Machine Learning concepts? SDS Founder Kirill Eremenko and AI expert Hadelin de Ponteves are back, joining Jon Krohn to review essential ML concepts, from classification errors to logistic regression, feature scaling, the elbow method and more. The popular data science instructors also introduce their latest course: Machine Learning in Python: Level 1. In this episode you will learn:• Kirill and Hadelin's new course [17:34]• Supervised vs unsupervised learning [26:23]• False positives and false negatives [31:21]• Logistic regression [43:00]• Holding out a set of test data [46:39]• Feature scaling [52:45]• The Adjusted R-Squared metric [59:44]• The five assumptions of linear regression [1:05:12]• The Elbow Method [1:11:41] Additional materials: www.superdatascience.com/649 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Text-to-speech gets a groundbreaking update with Microsoft’s VALL-E. On this Five-Minute Friday, Jon Krohn investigates how the Microsoft team modeled their tool to replicate natural human speech using just three seconds of a person’s voice. Additional materials: www.superdatascience.com/648 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Knowledge management, trust of AI, and job automation: Tom Davenport speaks with Jon Krohn about the organizational obstacles to adopting AI, and why the C-suite also needs to learn how to handle data. This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Cognitive bias in understanding AI [14:13]• How AI will augment rather than replace human workers [24:27]• OpenAI and regulatory action [35:13]• Jobs that might be at risk of being automated [39:57]• The potential of citizen science in accumulating and analyzing data [1:02:18]• How AI will change the game for the C-suite [1:15:17] Additional materials: www.superdatascience.com/647
Are you still wondering how to get the most out of ChatGPT's game-changing technology? In this week's Five-Minute Friday guest episode, Jon Krohn sits down with longtime friend and e-commerce entrepreneur Zack Weinberg, to discuss the practical applications of this incredible AI tool. Additional materials: www.superdatascience.com/646 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Machine learning, security and Call of Duty collide this week as Jon Krohn sits down with Carly Taylor, Lead Machine Learning Engineer for Activision's COD franchise, to discuss the importance of low latency, the future of gaming and her favorite software packages. This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The relationship between data science and cyber security [4:49] • The importance of low latency for an optimal gaming experience [9:15] • The future of gaming [18:13] • Carly's thoughts on the Metaverse [25:43] • Carly’s favorite operating systems, software packages, and keyboards [30:27] • How to transition from a quantitative academic background into data science [45:28] • Why Carly is called the “Rebel Data Scientist” [53:27] • How to file a patent [57:21] Additional materials: www.superdatascience.com/645
Love and money matter in this week’s Five-Minute Friday, as Stanford University’s Myra Strober sits down with Jon Krohn to talk about her latest book, Money and Love, coauthored with Abby Davisson. In this unorthodox take on thinking with your head versus your heart, Myra and Abby address the life-changing impact that money and love have on each other and how to rethink this relationship to make better decisions. Additional materials: www.superdatascience.com/644 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
AI prediction tools for antibodies and using statistics to prepare healthcare systems for pandemics: host Jon Krohn speaks with Chief Scientist of Biologics AI for Exscientia Charlotte Deane about the variety of potential partnerships between medicine and machine learning. This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What does Biologics AI mean? [03:48]• How to use AI to predict protein structures [07:37]• What antibodies are [14:00]• Personalized Medicine is slow but A.I. can speed it up [24:36]• The future of predicting 4D protein structures [44:30]• Applications of machine learning during the pandemic [53:27] Additional materials: www.superdatascience.com/643
Looking to shake up your data science productivity in 2023? Switching to a continuous calendar can make all the difference. Jon Krohn shares his new calendar with those taking their yearly, monthly and daily planning to the next level. Additional materials: www.superdatascience.com/642 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
The top data science trends of 2023 are here. Sadie St. Lawrence joins Jon Krohn to share annual predictions on the future of AI. From the data mesh to multimodal models like ChatGPT, tune in to discover what's next. This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• A recap of 2022 predictions [5:22]• Our data science trend predictions for 2023:- Data as a product [23:36]- Multimodal A.I. models [32:26]- The data mesh [42:49]- Privacy & AI Trust [50:54]- Environmental Sustainability [54:37]• Sadie's goals for 2023 [1:16:04] Additional materials: www.superdatascience.com/641
From AI trends to rediscovering how fun it is to work with colleagues ‘in person’, host Jon Krohn wraps up the year’s best SuperDataScience content and looks ahead to another year of interviews with the data science community’s brightest stars. Additional materials: www.superdatascience.com/640 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Learning Python for beginners is made fun on Mariya Sha’s YouTube and Discord channels, on which she posts hacks, breakdowns and tutorials on everything to do with the world’s most important programming language. If you’re continually frustrated by the high base level at which many ML and Python courses seem to begin, this episode is a great jumping-off point for you. This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Why Mariya was first interested in learning Python [04:44]• The positive potential for future AI applications [12:02]• Useful broadcasting software [23:09]• The importance of productivity hacking in data science [34:13]• The ethical problems of web scraping [38:45]• Mariya’s favorite Python libraries [53:48]• What excites Mariya about the future of NLP [1:13:53]• Mariya’s favorite software tools [1:15:23] Additional materials: www.superdatascience.com/639
OpenAI's ChatGPT helps us generate a special holiday greeting this week. Tune in to hear the festive message that this impressive natural language generating algorithm churned out as we close out the year. Additional materials: www.superdatascience.com/638 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
It's all about data visualization this week as Jon Krohn welcomes Ann K. Emery, data visualization designer and owner of Depict Data Studio, to the show. If you want to learn data viz best practices, tips and tricks and reporting how-tos, make some time to tune in today! This episode is brought to you by Kolena (kolena.io), the testing platform for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What data storytelling is [3:40]• Pinpoints of data visualization [10:38]• Best practices for data visualization [23:41]• Surprising spreadsheet tricks [30:51]• When static dashboards are more effective than interactive ones [43:30]• Ann's top tips for presenting data in a slideshow [48:07] Additional materials: www.superdatascience.com/637
Digital literacy and data bias: Can one reduce or even eradicate the other? Law professor Orly Lobel speaks with SDS host Jon Krohn about Orly’s latest book, The Equality Machine, which offers an optimistic look into the future of AI and data mining. Additional materials: www.superdatascience.com/636 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Hand labeling data and information bias: Jon Krohn speaks with Watchful CEO Shayan Mohanty about the pitfalls of data analysis when bias comes into the equation (spoiler alert: it always does), the importance of the Chomsky hierarchy in data management, and the importance of simulation engines for returning real-time results to users. This episode is brought to you by Iterative (iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Why bias in general is good [04:06]• The arguments against hand labeling [09:47]• How Shayan solves the problem of labeling at his company [24:26]• Misconceptions concerning hand-labeled data [43:25]• What the Chomsky hierarchy is [52:38]• Watchful’s high-performance simulation engine [1:04:51]• What Shayan looks for in his new hires [1:08:15] Additional materials: www.superdatascience.com/635
Data scientist and author Serg Masís joins Jon Krohn for a Five-Minute Friday episode that touches on model error analysis. Learn how this process can improve your models and discover a helpful tool that expedites this critical process. Additional materials: www.superdatascience.com/634 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
This week's episode is all about Responsible Decentralized Intelligence as award-winning professor and tech entrepreneur, Dawn Song, joins Jon Krohn to help us explore this exciting topic in-depth. This episode is brought to you by Iterative (iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What is decentralized intelligence? [3:46]• Dawn’s Responsible Data Economy collaboration with Meta AI [11:31]• How homomorphic encryption, differential privacy, and multi-party computation can work together [16:22]• How PrivateSQL makes differential privacy easy to use [22:54]• The relationship between deep learning and federated learning [37:55]• What is a responsible data economy [42:13] Additional materials: www.superdatascience.com/633
Liquid neural networks are a type of bio-inspired machine learning set to make a huge impact in the field of data analytics. On this week’s Five-Minute Friday, Jon Krohn speaks with Pathway.com Co-Founder Dr. Adrian Kosowski about the development of this new type of network and what this means for the future of data. Additional materials: www.superdatascience.com/632 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Interview success, funny memes about data, and stakeholder management: Jon Krohn speaks with Luke Barousse, a full-time YouTuber who produces content to help aspiring data scientists. First, Jon and his guest go underwater to find out how data science can help you while working on a submarine before they emerge onto Luke’s YouTube channel. There, he discloses all the helpful hacks for data science beginners—with a generous helping of humor! As founder of MacroFit, a data-driven company that helps with meal planning, Luke is no stranger to portion sizes… This episode is brought to you by Iterative (iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Where Luke gets his inspiration for making YouTube videos [04:46]• How Luke got into creating comedy skits [08:21]• Luke’s favorite Python libraries for web scraping [14:41]• Incorrect assumptions that aspiring data scientists make [15:54]• The best time to use Power BI [19:15]• The biggest mistakes Luke made in his data science career [22:17]• Luke’s experience as a submariner and how it helped him in his data analyst career [38:13]• The must-have skills for entry-level data analyst roles [43:46] Additional materials: www.superdatascience.com/631
Jon Krohn sits down with Dr. Dan Shiebler at the Open Data Science Conference (ODSC) to dive into the critical components of building resilient machine learning. Additional materials: www.superdatascience.com/630 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Has the term developer advocacy ever left you scratching your head? This week data science developer advocate for JetBrains, Dr. Jodie Burchell, joins Jon Krohn to shed light on her responsibilities and why it's a role you might want to consider. Jodie also dives into building reproducible data science workflows and the keys to working effectively with real-world data. This episode is brought to you by Iterative (iterative.ai), the open-source company behind DVC. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Jodie’s background in psychology [2:22]• Jodie's tips for real-world data preparation [6:55]• Tour JetBrains' developer tools: PyCharm, DataSpell and Datalore [10:41]• What is a data science developer advocate? [38:47]• The books that Jodie's co-authored [46:18]• Jodie's favorite Python libraries [58:33]• How to have reproducible data science workflows [1:01:36] Additional materials: www.superdatascience.com/629
On this episode of Five-Minute Friday, Jon Krohn speaks from the Open Data Science Conference (ODSC). There, he sits down with author and data scientist Keith McCormick to discuss the conference’s key trend: learning the importance of trust in the relationship between humans and algorithms. Additional materials: www.superdatascience.com/628 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Jon Krohn speaks with Erin LeDell, H2O.ai’s Chief Machine Learning Scientist. They investigate how AutoML supercharges the data science process, the importance of admissible machine learning for an equitable data-driven future, and what Erin’s group Women in Machine Learning & Data Science is doing to increase inclusivity and representation in the field. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• The H2O AutoML platform Erin developed [07:43]• How genetic algorithms work [19:17]• Why you should consider using AutoML [28:15]• The “No Free Lunch Theorem” [33:45]• What Admissible Machine Learning is [37:59]• What motivated Erin to found R-Ladies Global and Women in Machine Learning and Data Science [47:00]• How to address bias in datasets [57:03] Additional materials: www.superdatascience.com/627
Word tokenization, character tokenization and subword tokenization go head-to-head this week as Jon Krohn delivers a mini-bootcamp on the NLP-related process. Additional materials: www.superdatascience.com/626 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Chainalysis' Director of Research, Kim Grauer, joins Jon Krohn to explore the state of economic-data analysis on the blockchain. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Kim's role as Director of Research [5:02] • The unique real-time economic-data analytics of the blockchain [13:07] • How ML can predict patterns of criminal activity on the blockchain [18:56] • Interesting use cases of ML for crime investigation [29:37] • The tools and approaches Kim uses daily [47:44] • The future of crypto, blockchains, and data science [50:54] • Why a data science bootcamp helps people break into data science [53:42] Additional materials: www.superdatascience.com/625
On this week’s Five-Minute Friday, Jon Krohn investigates Imagen Video, Google’s latest model for making video art out of text prompts. Recently published, this text-to-video model joins strong generative models already on the scene like DALL-E 2. Unlike DALL-E 2, which generates still images, it returns moving images, or time-based media. Tune in to hear Jon explain the technology that made Imagen Video the tech giant’s shiniest new tool to date. Additional materials: www.superdatascience.com/624 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Jon Krohn speaks with Shashank Kalanithi, the man who makes a sport out of YouTube and data analytics out of sports. Listen in as he talks about how he got started producing YouTube videos on data science, the essential differences between data science roles, and how data could shape the future of the sports industry. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What motivated Shashank to start his YouTube channel [04:31]• The must-have technical skills for every data scientist [16:59]• The soft skills needed for data science [20:52]• The differences between data analyst, data scientist and data engineer [24:26]• How data are currently being applied in the sports industry [38:38]• The “needs” divide between digital native and traditional companies [45:34] Additional materials: www.superdatascience.com/623
Is burnout on the horizon for you and your team? Christina Maslach, author of the new book "The Burnout Challenge," joins Jon Krohn to help us identify the common signs of looming burnout while steering us in a healthier direction. Additional materials: www.superdatascience.com/622 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Cryptocurrency and blockchain take center stage this week as we welcome Chief Economist at Chainalysis, Philip Gradwell, to discuss the data science applications in this exciting field. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform, by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts, and by Bunch (superdatascience.com/bunch), the AI driven leadership coach. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What the role of a chief economist entails [5:50]• What are blockchains and cryptocurrency? [8:23]• How analyzing cryptocurrencies differs from established fiat currencies [12:48]• Philip's work at Chainalysis [26:07]• Philip's crypto data analytics pipeline [34:48]• How Philip develops data products for a wide range of users [46:18]• How the blockchain facilitates innovative computing and machine learning technologies [51:52]• What Philip looks for in the data scientists he hires [1:04:59] Additional materials: www.superdatascience.com/621
What’s your secret to superb audio recognition? Whisper it. We mean that literally: Whisper is the latest in OpenAI’s growing suite of models aimed at benefiting humanity. On this episode of Five-Minute Friday, host Jon Krohn reviews OpenAI’s latest model, Whisper. This tool will vastly improve the way human speech is recognized and converted to text. Jon gets under the hood to show how the team managed to get such a powerfully accurate recognition model. Listen to the episode and find out how you can try it yourself, for free! Additional materials: www.superdatascience.com/620 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Jon Krohn speaks with Erik Bernhardsson, the man who invented Spotify’s original music recommendation system. They address the different ways to interview a data science candidate, how to deploy a data model into the cloud, and the approach he took that made Spotify go from a digital music startup to an AI-driven streaming giant. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform, by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts, and by Bunch (superdatascience.com/bunch), the AI driven leadership coach. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• The data problem that Erik’s company Modal Labs solves [04:32]• Erik’s prolific blogging career [09:15]• Opportunities for making data teams more efficient and productive [14:42]• Erik’s views on interviewing data scientists and software developers [20:18]• Erik’s tips and tricks for data science interviewees [31:35]• How Erik built Spotify’s original music recommendation system [38:58]• Applying vectors to other tools, and opportunities for working with vectors [47:45]• Using Annoy to search across vectors [50:57]• Building Python module Luigi for Spotify [55:20]• The tools that Erik loves to work with [1:06:23] Additional materials: www.superdatascience.com/619
Telic and atelic activities take center stage this week as Jon Krohn contemplates how our daily actions contribute to our overall sense of fulfillment. Additional materials: www.superdatascience.com/618 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Sean Taylor, Co-Founder and Chief Scientist of Motif Analytics, joins Jon Krohn this week for yet another perspective on causal modeling. Tune in for a great conversation that covers large-scale causal experimentation, Information Systems, Bayesian parameter searches, and more. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Sean on his new venture, Motif Analytics [4:23]• The relationship between causality and sequence analytics [15:26]• Sean's data science work at Lyft [22:21]• The key investments for large-scale causal experimentation [27:25]• Why and when causal modeling is helpful [32:34]• Causal modeling tools and recommendations [36:52]• Facebook's Prophet automation tool for forecasting [40:02]• What Sean looks for in data science hires [50:57]• Sean on his PhD in Information Systems [53:34] Additional materials: www.superdatascience.com/617
10,000 hours of study: Will it make you an expert? On this episode of Five-Minute Friday, host Jon Krohn explores whether increasing your skills is just a numbers game or if there is more to becoming proficient in your area of interest, whether that’s flute playing or data wrangling. Additional materials: www.superdatascience.com/616 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
“Being a great data scientist” and “being great at a data science interview” are not one and the same. Jon Krohn speaks with Nick Singh about how to strengthen your interviewee skills, and how you can even beat out more senior competition to land a coveted data science role. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Nick’s inspiration for writing his bestselling book, Ace the Data Science Interview [06:21]• Why Nick believes in being a work generalist [12:37]• How DataLemur supports emerging data scientists for free [15:43]• Why Nick started DataLemur off the back of his book [21:31]• Portfolio essentials for any data scientist [22:36]• The three most common things data scientists get wrong at the interview [24:33]• How data science introverts can shift their mindset about self-promotion [37:58]• Great responses to end your data science interview on the right foot [42:21] Additional materials: www.superdatascience.com/615
World-leading futurist, author and entrepreneur Ross Dawson joins us for the first of our extended Five-Minute Friday episodes. As information overwhelm becomes increasingly unavoidable, Dawson is here to share the five powers from his new book 'Thriving on Overload', to help us transition from overwhelm into abundance. Additional materials: www.superdatascience.com/614 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Emre Kiciman, Senior Principal Researcher at Microsoft Research, joins the podcast to share his world-leading knowledge on causal machine learning. This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data science platform, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• What is causal machine learning? [5:52]• Causal machine learning vs correlational machine learning [10:10]• Emre’s DoWhy open-source library [16:17]• The four key steps of causal inference [21:24]• How and why Emre’s key steps of causal inference will impact ML [26:36]• Emre's thoughts on the future of causal inference and AGI [34:09]• How Emre leverages social media data to solve social problems [38:36]• What's next for Emre's research [46:02]• The software tools Emre highly recommends [55:16]• What he looks for in the data science researchers he hires [58:45] Additional materials: www.superdatascience.com/613
Some exciting changes are coming to our popular Five-Minute Friday series! From longer episodes to new guests, tune in to hear what's next. Additional materials: www.superdatascience.com/612 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Dr. Ken Stanley, a world-leading expert on Open-Ended AI and author of the genre-bending book "Why Greatness Cannot Be Planned," joins Jon Krohn for a discussion that has the potential to shift your entire view on life. Tune in now to learn more about the complex topics of genetic ML algorithms, the Objective Paradox, Novelty Search, and so much more. This episode is brought to you by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn:• Ken on his book "Why Greatness Cannot Be Planned" and the Objective Paradox [4:15]• The Novelty Search approach [24:14]• How open-ended algorithms like Novelty Search can be stopped from doing something potentially dangerous [1:00:00]• The future of open-ended AI and its intimate relationship with Artificial General Intelligence [1:07:34]• Ken's new company [1:13:34]• How AI could transform life for humans in the coming decades [1:18:29] Additional materials: www.superdatascience.com/611
On this episode of Five-Minute Friday, host Jon Krohn shares his life motto, “Who dares, wins”, and the sentiment behind it: that to get anywhere in life, it is first necessary to try. Jon believes that “daring”, in this instance, simply means taking action when we have a good idea or when a new opportunity becomes available. Listen to the end for constructive advice on how to be daring in your own life right now. Additional materials: www.superdatascience.com/610 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
Jon Krohn speaks with Zhamak Dehghani, the empathetic technologist who coined the term “data mesh”. They explore what a data mesh is, and how its approach toward secure interconnectivity will help solve a roster of data-led business problems. This episode is brought to you by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The importance of data meshes [3:29] • How standardizing database interfaces helps tech giants like Amazon [6:40] • Current challenges with data meshes [9:33] • How data meshes give users the freedom to work with data [17:09] • The missing piece of the puzzle for data meshes [22:11] • How data meshes connect with the metaverse and Web3 [33:18] • The times when data meshes aren’t fit for purpose [42:24] Additional materials: www.superdatascience.com/609
Company meetings should be held to solve problems. So, why do we often feel like the weekly stand-ups and check-ins are a waste of everyone’s time? On this episode of Five-Minute Friday, host Jon Krohn brings his habit-making practices into the dreaded meeting room. Make every meeting productive and positive with his five-step method for assigning deliverables. Additional materials: www.superdatascience.com/608
We welcome Dr. Jennifer Hill, Professor of Applied Statistics at New York University, to the podcast this week for a discussion that covers causality, correlation, and inference in data science. This episode is brought to you by Pachyderm, the leader in data versioning and MLOps pipelines, and by Zencastr (zen.ai/sds), the easiest way to make high-quality podcasts. In this episode you will learn: • How causality is central to all applications of data science [4:32] • How correlation does not imply causation [11:12] • What a counterfactual is and how to design research to confidently infer causality from the results [21:18] • Jennifer’s favorite Bayesian and ML tools for making causal inferences within code [29:14] • Jennifer’s new graphical user interface for making causal inferences without the need to write code [38:41] • Tips on learning more about causal inference [43:27] • Why multilevel models are useful [49:21] Additional materials: www.superdatascience.com/607
Four thousand weeks equate to roughly 80 years—a lifetime for those of us lucky enough to get there. What do we choose to do with this time? How can we stop ourselves from feeling like time in general is slipping away? In this episode, host Jon Krohn reviews the book Four Thousand Weeks: Time Management for Mortals by journalist Oliver Burkeman. He outlines how he has personally benefited from this essential reflection on our thirst for productivity and efficiency. Additional materials: www.superdatascience.com/606
Kian Katanforoosh, CEO of Workera and Lecturer at Stanford University, joins Jon Krohn to reveal the tools, frameworks, and machine learning models that power his platform and remote team. In this episode you will learn: • What a skills intelligence platform is [3:11] • How mentorship can be life-changing [7:45] • Four ways that ML drives Kian’s skills intelligence platform [10:57] • Kian's day-to-day responsibilities as the CEO of Workera [21:00] • What frameworks and software languages Kian and his team selected for building their platform and why [24:20] • What Kian looks for in the data scientists and software engineers he hires [31:48] • Kian’s Stanford Deep Learning class and mentors [34:58] • How Kian’s passion for EdTech began [42:47] Additional materials: www.superdatascience.com/605
In this week's Five-Minute Friday episode, Jon explores a recent groundbreaking development in nuclear fusion, ignition, and what it signals for the future. Additional materials: www.superdatascience.com/604
Christina Stathopoulos, Analytical Lead for Waze and Adjunct Professor at IE Business School, joins the podcast to shed light on her work with geospatial data and how she nurtured an entire data career while abroad in Spain. In this episode you will learn: • Christina's tips on navigating an unconventional path into a data career [3:05] • Geospatial data and open-source packages for working with it [10:08] • Guidance to help women and other underrepresented groups to thrive in tech [22:28] • The hard and soft skills most essential to success in a data role today [39:26] • Christina’s #bookaweekchallenge and the top data-centric book recommendations [43:28] Additional materials: www.superdatascience.com/603
Inspired by a quote from science fiction writer Teresa Nielsen Hayden, Jon Krohn reflects on the notion of living in ancient times and the machine learning-related implications that arise from this perspective. Additional materials: www.superdatascience.com/602
This week, Sarah Catanzaro, General Partner at Amplify Partners joins Jon for an episode that dives into the venture capital side of data science. Learn how to fund your data science business idea, take note of what start-ups can do to survive or raise capital in the current economic climate, and discover how to break into the field of venture capital yourself. In this episode you will learn: • Angel vs. venture capital vs. private equity investment [7:27] • How early-stage investment is made prior to a firm having product-market fit [14:33] • How to pick winners in early-stage investments [28:08] • Tricks to accelerating from a data science idea to obtaining funding [36:21] • Observational causal inference [44:01] • How to get involved in venture capital [47:37] Additional materials: www.superdatascience.com/601
Rest and relaxation await as Steve Fazzari joins us this week for a special edition of the podcast! Tune in for a rejuvenating session of Yoga Nidra led beautifully by the expert. Additional materials: www.superdatascience.com/600
This week, Mikiko Bazeley, Senior Software Engineer at Mailchimp, joins the podcast to share her in-depth knowledge of MLOps: Machine Learning Operations. Tune in to hear her discuss what it entails, why it's so critical for the efficiency of any data science team, and the most important tools you need to master for career success in this field. In this episode you will learn: • What MLOps is [11:40] • Mikiko’s role at Mailchimp and why MLOps is critical for the efficiency of any data science team [27:11] • The three most important MLOps tools [32:15] • The six most essential MLOps skills for data scientists [47:01] • The key factors Mikiko looks for when hiring engineers [1:07:31] • Mikiko’s productivity tricks for balancing software engineering, content creation, and her athletic pursuits [1:13:20] Additional materials: www.superdatascience.com/599
Ben Taylor makes a fourth appearance on Five-Minute Friday to discuss the best ways to introduce STEM to children. Tune in to hear the many ways in which he thinks STEM education will evolve in the future. Additional materials: www.superdatascience.com/598
Dr. Miles Brundage, Head of Policy Research at OpenAI, joins Jon Krohn this week to discuss AI model production, policy, safety, and alignment. Tune in to hear him speak on GPT-3, DALL-E, Codex, and CLIP as well. In this episode you will learn: • Miles’ role as Head of Policy Research at OpenAI [4:35] • OpenAI's DALL-E model [7:20] • OpenAI's natural language model GPT-3 [30:43] • OpenAI's automated software-writing model Codex [36:57] • OpenAI’s CLIP model [44:01] • What sets AI policy, AI safety, and AI alignment apart from each other [1:07:03] • How A.I. will likely augment more professions than it displaces [1:12:06] Additional materials: www.superdatascience.com/597
Ben Taylor returns for a third Five-Minute Friday episode! This week, he looks ahead and digs into what we can expect from the A.I. platforms of the future. Additional materials: www.superdatascience.com/596
Tune in as Joe Reis and Matt Housley, co-founders of Ternary Data and co-authors of the book “Fundamentals of Data Engineering”, join Jon Krohn to discuss major undercurrents across the data engineering lifecycle, and their top tools and techniques. In this episode you will learn: • What is data engineering? [3:55] • Why Joe and Matt identify as “recovering data scientists” [6:12] • What kinds of people tend to become data scientists vs. data engineers [10:38] • Key components of Joe and Matt’s book [26:31] • Major undercurrents across the data engineering lifecycle [28:26] • The most under-utilized tool in a data engineer's toolbox [34:39] • How there are tradeoffs in any data pipeline latency considerations, but faster is typically the default assumption [38:55] • Joe and Matt’s favorite data engineering tools and techniques [43:39] Additional materials: www.superdatascience.com/595
This week, Jon Krohn and A.I. industry veteran Ben Taylor discuss the driving factors that push CEOs to prioritize A.I. over other technologies. Additional materials: www.superdatascience.com/594
Jon welcomes Professor Philip Bourne, Founding Dean of the School of Data Science at the University of Virginia to discuss his biomedical data science research, the importance of open-source and open-access within the industry and the data science skills you need to succeed today. In this episode you will learn: • Why Philip founded a School of Data Science [6:08] • How computing and data science have evolved across academic departments [15:55] • The improvements needed in higher education [26:44] • The most important data science skills for academia and industry and the 4+1 model [36:49] • Philip’s biomedical data science research and its fascinating practical applications [43:24] • The essential roles of open-source code and open-access publishing in data science [1:01:27] Additional materials: www.superdatascience.com/593
In this episode, Jon Krohn welcomes A.I. industry veteran Ben Taylor to discuss how to sell multimillion dollar A.I. contracts. Tune in to hear why trust and proof of value are some of the critical steps in his sales process. Additional materials: www.superdatascience.com/592
Mars Buttfield-Addison, PhD Candidate at the University of Tasmania, joins Jon Krohn for a high-energy episode covering everything from Machine Learning simulations to Swift, space junk, and more! In this episode you will learn: • What simulations and synthetic data are, and why they can be invaluable for real-life applications [5:47] • How simulated bots can solve any problem [9:07] • Practical uses of simulated data [21:49] • Why the mobile operating system language Swift is interesting for A.I. [25:46] • Why it's critical to track the amount of junk in space [35:47] • Whether programming or statistical skills are more important in data science [47:05] • What it’s like creating video games in a "secret" games lab [56:45] • Why you might want to do a data science internship in industry before pursuing academia [1:01:54] Additional materials: www.superdatascience.com/591
In this episode, Jon continues his two-part series on artificial general intelligence (AGI) and why we are unlikely to realize it anytime soon. Listen in as Jon reviews Meta's Yann LeCun's seven-part perspective on the topic. Additional materials: www.superdatascience.com/590
Hilary Mason, Co-Founder and CEO of Hidden Door, joins Jon Krohn for a live discussion that explores narrative A.I., emerging ML techniques, and how her OSEMN data science process developed. In this episode you will learn: How narrative A.I. can assist creativity [5:14] How to build ML products that have no quantitative error function to optimize [10:31] How to ensure creative A.I. systems do not output nonsense or explicit content [16:58] Hilary's OSEMN data science process [21:05] The emerging ML technique she’s most excited about [24:58] What it takes to be successful as CEO of an early-stage A.I. company [27:20] What she looks for in engineering hires [32:28] How she’s hopeful A.I. will transform our lives for the better in the decades to come [38:48] Additional materials: www.superdatascience.com/589
In this episode, Jon kicks off a two-part series that sees him explore the popular topic of artificial general intelligence and why it might–or might not–be only a few years away. Listen in as Jon explains the several reasons why he doesn't believe that AGI is nigh. Additional materials: www.superdatascience.com/588
Mark Freeman, Senior Data Scientist at Humu, joins Jon Krohn to talk about all things data engineering and offers listeners some critical tips for their data science career journey, from what it takes to get promoted to his number one tip for getting hired at a fast-growing venture capital-backed startup. In this episode you will learn: How Humu leverages data and machine learning to improve workplace behaviors [10:38] What is data engineering? [14:21] What it takes to get promoted into more senior data science roles [20:55] The differences between junior, senior, and staff data scientists [30:21] Mark’s top tools for data extraction, modeling, and pipeline engineering [37:08] Mark’s number one tip for getting hired at a fast-growing venture capital-backed startup [53:10] Why all data scientists should be interested in Web3 [1:11:53] Additional materials: www.superdatascience.com/587
In this episode, Jon dives into the popular topic of social media and its impact on his productivity. Tune in to hear how minimizing the use of social media can positively impact your days, mental health and work. Additional materials: www.superdatascience.com/586
In this episode, Dr. Thomas Wiecki, Core Developer of the PyMC Library and CEO of PyMC Labs, joins Jon for a masterclass in Bayesian statistics. Tune in to hear about PyMC, and discover why Bayesian statistics can be more powerful and interpretable than any other data modeling approach. In this episode you will learn: What Bayesian statistics is [7:30] Why Bayesian statistics can be more powerful and interpretable than any other data modeling approach [17:20] How PyMC was developed [20:41] Commercial applications of Bayesian stats [43:07] How to build a successful company culture [1:03:14] What Thomas looks for when hiring [1:11:13] Thomas’s top resources for learning Bayesian stats yourself [1:13:57] Additional materials: www.superdatascience.com/585
In this episode, Jon reviews the remarkable natural language model Codex by OpenAI. Learn why it has amassed a waitlist and how you can leverage its practical applications in your work. Additional materials: www.superdatascience.com/584
In this episode, natural language processing (NLP) expert and Lead Data Scientist at CB Insights, Rongyao Huang, joins Jon Krohn to discuss NLP. Listen in for a thorough review of the field over the past decade and how the coming iron age of NLP will help us overcome the limitations of today's approaches. In this episode you will learn: The evolution of NLP techniques over the past decade [4:14] What's next in the coming iron age of NLP [35:33] Rongyao’s Bauhaus-inspired model for effective data science [43:12] Rongyao's long-term career pathfinding framework [51:50] Rongyao’s top tips for staying sane while juggling career and family [1:00:30] Additional materials: www.superdatascience.com/583
In this episode, Jon wraps up his three-part series on business value and machine learning. Listen in as he explains why starting with simple models is best, and why speed is likely more important to your users than accuracy. Additional materials: www.superdatascience.com/582
In this episode founding Editor-in-Chief of the Harvard Data Science Review and Professor of Statistics at Harvard University, Prof. Xiao-Li Meng, joins Jon Krohn to dive into data trade-offs that abound, and shares his view on the paradoxical downside of having lots of data. In this episode you will learn: What the Harvard Data Science Review is and why Xiao-Li founded it [5:31] The difference between data science and statistics [17:56] The concept of 'data minding' [22:27] The concept of 'data confession' [30:31] Why there’s no “free lunch” with data, and the tricky trade-offs that abound [35:20] The surprising paradoxical downside of having lots of data [43:23] What the Bayesian, Frequentist, and Fiducial schools of statistics are, and when each of them is most useful in data science [55:47] Additional materials: www.superdatascience.com/581
In this episode, Jon resumes his series on strategies for getting business value from machine learning. Part one saw him review several ways to identify a commercial problem before starting data collection or ML model development. And now, in part two, Jon digs into the data collection process. Additional materials: www.superdatascience.com/580
In this episode, the CEO of Overjet, Dr. Wardah Inam, joins Jon Krohn to discuss the classification and quantification of dental diagnoses with computer vision, her data labeling challenges, and tips for building a successful A.I. business. In this episode you will learn: How Overjet leverages computer vision to qualify and quantify dental diagnoses [5:11] How A.I. solutions reduce the under-diagnosis of common diseases like periodontal disease [8:15] Overjet's particular ML challenges within the dental industry [15:45] Wardah's experience in introducing A.I. to the dental industry [20:12] Wardah's tips for building a successful A.I. business [23:34] What she looks for in the data scientists and software engineers she hires [39:36] Additional materials: www.superdatascience.com/579
In this episode, Jon kicks off a new Five-Minute Friday series that explores the strategies for getting business value from machine learning. Part one sees him review several ways to identify a commercial problem before starting data collection or ML model development. Additional materials: www.superdatascience.com/578
In this episode, Husayn Kassai, the former CEO and co-founder of Onfido, an AI-based ID verification company, joins Jon Krohn to discuss his path to start-up success. In this episode you will learn: How Husayn's start-up journey began [5:55] How Husayn determined that his challenge could be solved by machine vision [11:18] Onfido's initial seed stages [18:23] Launching and scaling your start-up in the U.S. market [22:00] The most important component in building the best product [26:30] Husayn's latest start-up [28:52] Husayn’s startup project decision-making process [37:49] Choosing your co-founding team [44:04] Additional materials: www.superdatascience.com/577
Hollywood has officially fallen for the drama of tech startups! Tune in to hear Jon Krohn review the small-screen adaptations of WeWork (WeCrashed), Uber (Super Pumped), and Theranos (The Dropout). Additional materials: www.superdatascience.com/576
In this episode, the Director of Architecture at NVIDIA, Dr. Magnus Ekman, joins Jon Krohn to discuss how machine learning, including deep learning, can optimize computer hardware design. The pair also review his exceptional book 'Learning Deep Learning.' In this episode you will learn: What hardware architects do [10:15] How ML can optimize hardware speed [13:19] Magnus’s Deep Learning Book [21:14] Is understanding how ML models work important? [36:16] Algorithms inspired by biological evolution [41:25] How artificial general intelligence won’t be obtained by increasing model parameters alone [51:24] Why there will always be a place for CNNs and RNNs [54:51] How people can "transition" realistically into ML [1:09:15] Additional materials: www.superdatascience.com/575
In this episode, Jon shares how the right music can power your productivity. It's no secret that he's a big fan of 'deep work,' but this week, he opens up about the artists, sites, and playlists that propel his productivity to new levels. Additional materials: www.superdatascience.com/574
In this episode, co-founder and CEO of Linea, Dr. Doris Xin, joins Jon Krohn to discuss how automating ML model deployment delivers groundbreaking change to data science productivity, and shares what it's like being the CEO of an exciting, early-stage tech start-up. In this episode you will learn: How Linea reduces ML model deployment down to a couple of lines of Python code [5:14] Linea use cases [11:30] How DAGs can 10x production workflow efficiency [22:12] ML model graphlets and reducing wasted computation [24:14] What future Doris envisions for autoML [35:23] Doris’s day-to-day life as a CEO of an early-stage start-up [42:43] What Doris looks for in the engineers and data scientists that she hires [52:21] The future of Data Science and how to prepare best for it [53:58] Additional materials: www.superdatascience.com/573
In this episode, Jon shares his habit of blocking out two hours in his mornings that are free from email and social media distractions. Tune in to learn how this habit helps him deeply focus on his most delightful tasks of the day. Additional materials: www.superdatascience.com/572
Einblick co-founder and associate professor at MIT, Tim Kraska, joins Jon Krohn to discuss no-code collaboration tools for data science and uncover the clever database and machine learning tricks under the hood of the visual data computing platform. In this episode you will learn: The inspiration behind Einblick [2:45] Einblick's progressive approximation engine [6:43] How no-code tools impact productivity [17:18] The critical steps to become more data-driven as an organization [24:30] How research universities like MIT support high-risk, long-term research [38:37] How ML applied to databases enables them to be faster and more efficient [42:03] How real-time collaboration environments like Google Docs are likely to become more widespread for data science tasks [49:24] Additional materials: www.superdatascience.com/571
In this episode, Jon is back with another A.I. model breakthrough! He updates listeners on OpenAI's outstanding DALL-E 2 model. The new natural language processing model churns out staggering visual examples of whatever text your mind can dream up. Additional materials: www.superdatascience.com/570
Research Scientist at Meta AI, Dr. Noam Brown, joins Jon Krohn to discuss his award-winning no-limit poker-playing algorithms and the real-world implications of his game-playing A.I. breakthroughs. In this episode you will learn: What Meta A.I. is and how it fits into Meta, the company [3:01] Noam's award-winning no-limit poker-playing algorithms, Libratus and Pluribus [4:33] What game theory is and how Noam integrates it into his models [8:45] The real-world implications of Noam’s game-playing A.I. breakthroughs [25:24] Why Noam elected to become a researcher at a big tech firm instead of in academia [27:06] The main barriers to getting AI game theory techniques beyond games to self-driving cars [30:16] Recommendations for people who want to break into poker AI [37:45] Additional materials: www.superdatascience.com/569
In this episode, Jon updates listeners on one of the industry's biggest breakthroughs to date: Google's new natural language processing model, PaLM. The key innovation with PaLM is scaling up Google's Pathways modeling approach to half a trillion parameters, many-fold more parameters than had previously been trained using this approach. Additional materials: www.superdatascience.com/568
In this episode, the MIT Press Director and Publisher, Dr. Amy Brand, joins Jon Krohn to discuss open-access publishing in data science and how to address the inequalities that exist for women and minorities in STEM. In this episode you will learn: What it’s like to run the prestigious MIT Press [4:34] How open access makes scholarly work more impactful [6:34] How publishing outstanding STEM books for broader audiences, including for children, can help address STEM biases [19:28] Amy's award-winning documentary Picture A Scientist [25:28] What it's like to executive produce a documentary [37:24] What can be done to change STEM to make it more welcoming to minorities [48:44] The best open-source model going forward [58:26] What fascinates Amy about natural language processing [1:01:30] How author metadata in standardized taxonomies can help authors receive the credit they deserve [1:04:50] Additional materials: www.superdatascience.com/567
In this episode, Jon reflects on the Chinese proverb: "The best time to plant a tree was 20 years ago. The second best time is now." He also challenges listeners to reflect on their long-term goals that have gone unfulfilled. Additional materials: www.superdatascience.com/566
In this episode, Jeremie Harris dives into the stirring topic of AI Safety and the existential risks that Artificial General Intelligence poses to humankind. In this episode you will learn: Why mentorship is crucial in data science career development [15:45] Canadian vs American start-up ecosystems [24:18] What is Artificial General Intelligence (AGI)? [38:50] How Artificial Superintelligence could destroy the world [1:04:00] How AGI could prove to be a panacea for humankind and life on the planet [1:27:31] How to become an AI safety expert [1:30:07] Jeremie's day-to-day work life at Mercurius [1:35:39] Additional materials: www.superdatascience.com/565
In this episode, Jon speaks with the CEO of Hugging Face, Clem Delangue, about open-source machine learning and transformer architectures, while attending the ScaleUp:AI Conference in New York. Additional materials: www.superdatascience.com/564
In this episode, superstar data science YouTuber Tina Huang joins us to discuss what it's like to work at one of the world's largest tech companies, her strategies for efficient learning, and how best to prepare for a career in data science from scratch. In this episode you will learn: The key areas to focus on when getting started in data science [6:01] Tina’s five steps to consistently doing anything [11:55] Tina's day-to-day life as a data scientist at one of the world’s largest tech companies [20:02] How Tina's computer science background helps her work [26:20] Traditional banking culture vs big tech [32:12] How Tina's background in pharmacology impacts her work in data science [36:15] The software languages that Tina uses daily in her work [45:30] How Tina’s SQL course practically prepares you for data science interviews [47:24] Additional materials: www.superdatascience.com/563
In this episode, Jon shares his daily technical exercise, which is part of an extensive habit tracking system that allows him to achieve more, create more structure within his day, and cut out bad habits. By completing a mathematics, computer science, or programming exercise daily, Jon is able to hone his technical skills in a limitlessly broad field and open new professional opportunities in the long run. Additional materials: www.superdatascience.com/562
In this episode, Ribbon Health CTO Nate Fox joins us to discuss the ins and outs of APIs. Tune in to hear him share how he and his team build out APIs from scratch; how they ensure the uptime and reliability of APIs and how they leverage machine learning to improve the quality of healthcare delivery and maximize their social impact. In this episode you will learn: What are APIs? [13:20] How Ribbon Health’s data API leverages ML models to improve the quality of healthcare delivery [16:08] How to design a data API from scratch [20:00] How to ensure the uptime and reliability of APIs [25:28] How Ribbon uses knowledge graphs, manually labeled data samples, and an XGBoost model with hundreds of inputs to assign a confidence score [27:14] Nate’s favorite tool for easily scaling up the impact of data science [37:40] What is Nate’s day-to-day like? [34:34] The qualities Nate looks for when hiring data scientists [39:50] How scientists and engineers can make a big social impact in health technology [42:50] Additional materials: www.superdatascience.com/561
In this episode, Jon shares his daily habit of reading two pages and explains how it has transformed his productivity. Additional materials: www.superdatascience.com/560
Natural language processing expert and PhD student Melanie Subbiah sits down with Jon Krohn to discuss GPT-3, its strengths and weaknesses, and the future of NLP. In this episode you will learn: What is GPT-3? [6:24] The strengths and weaknesses of GPT-3 [14:38] What is autoregression? [18:03] GPT-3's new fine-tuning abilities [20:02] Bias issues with GPT-3 [22:47] The future of natural language processing models [27:54] How Melanie ended up working at OpenAI [38:13] Melanie’s self-study process [42:19] Melanie's work on OpenAI API [45:45] How to address the climate change and bias issues that cloud discussions of large natural language models [49:40] Why Melanie chose to do a PhD at Columbia University [1:01:17] The machine learning tools Melanie’s most excited about [1:08:09] Additional materials: www.superdatascience.com/559
In this episode, Jon shares the key topics he recently discussed with the Open Data Science Conference. From the approach behind his extensive machine learning and deep learning content library to revealing the key tools and software he uses daily, get to know Jon and his process a little better. Additional materials: www.superdatascience.com/558
Pandas expert Matt Harrison sits down with Jon Krohn to discuss tips, tricks and best practices for Pandas learning and mastery. In this episode you will learn: Pros and cons of self-publishing and working with a publisher [5:05] Matt's six tips for using Pandas [17:13] The best way for corporate teams to level up their skills [40:04] How to learn anything effectively [47:14] Matt’s tricks for staying motivated [50:00] Matt’s recommendations for using Git and the Unix command line [1:00:14] Matt’s recommended software libraries for working with tabular data [1:19:45] Additional materials: www.superdatascience.com/557
Discover Jon’s extensive library of machine learning content and learn why Jon's Machine Learning House forms the knowledge structure of an outstanding data scientist or ML engineer. Additional materials: www.superdatascience.com/556
Data scientist and YouTuber Ken Jee joins Jon Krohn for a deep dive into the world of sports analytics and takes us behind the scenes of his large online data science community. In this episode you will learn: The inspiration behind Ken’s YouTube videos [18:03] Ken’s four steps for getting started in data science [24:18] How sports analytics is transforming sports like golf [33:32] Ken’s favorite tools for software scripting as well as for production code development [41:10] How the #66DaysofData hashtag can supercharge your capacity as a data scientist [42:51] Ken’s data science podcast Ken’s Nearest Neighbors [54:11] LinkedIn Q&A [1:00:32] Additional materials: www.superdatascience.com/555
In this episode, Jon shares where you can find his extensive deep learning video content and courses. Tune in to learn more about his deep learning curriculum and where you can learn for free. Additional materials: www.superdatascience.com/554
In this episode, Dr. Josh Starmer, the creative, musical genius behind the wildly popular YouTube channel StatQuest, joins the podcast to discuss statistics, learning and communication secrets, and how he grew his YouTube channel to over 650,000 subscribers. In this episode you will learn: The inspiration behind Josh’s YouTube channel [18:39] Josh's simple approach to learning something new [34:25] Josh's secret tool for creating YouTube videos with over a million views [51:01] The StatQuest Illustrated Guide to Machine Learning [53:34] How and when Josh uses R vs. Python [1:07:53] How to cluster any type of data using the R randomForest package [1:11:24] Why Josh left his academic career [1:14:24] The two stats concepts Josh thinks everyone should know [1:38:50] Additional materials: www.superdatascience.com/553
In this episode of Five-Minute Friday, Jon recaps the most popular SuperDataScience podcast episodes from 2021. See what you might have missed and catch up today! Additional materials: www.superdatascience.com/552
In this episode, gifted author and software engineer Wah Loon Keng joins the podcast to dive deep into reinforcement learning. From its history to limitations, modern industrial applications, and future developments, there's no better expert to learn from if you want to know more about this complex topic. In this episode you will learn: What is reinforcement learning? [4:50] Deep reinforcement learning vs reinforcement learning [13:17] A timeline of reinforcement learning breakthroughs [16:17] The limitations of deep RL today [39:53] Deep RL applications [53:10] Keng's open-source SLM-Lab framework [57:51] Keng’s responsibilities as an AI engineer [1:02:17] What is the future of RL? [1:08:05] Additional materials: www.superdatascience.com/551
Jon is back with another Five-Minute Friday habit-tracking episode! Listen in as he explains how writing morning pages has helped his data science work flourish with creativity. Inspired by Julia Cameron's book The Artist's Way, he details his morning pages routine and how it kickstarted a new chapter in his career. Additional materials: www.superdatascience.com/550
In this episode, Glean software engineer and Stanford graduate Lauren Zhu joins us to discuss her role at a fast-growing startup, working on natural language processing projects, and how she remains inspired by pursuing her side passions. In this episode you will learn: Lauren's experience as a course assistant [5:53] Stanford's Hacking the Coronavirus Course [11:53] How do you empower minority groups in AI [19:45] Lauren on zero-shot multilingual neural machine translation [23:25] Lauren's work at Glean [27:58] The Contrary Talent Network [34:30] The tools Lauren uses at Glean [43:39] The most important skills to possess as a data scientist [47:29] Additional materials: www.superdatascience.com/549
Our Five-Minute Friday series on habit tracking returns with a look at one of Jon's daily mindfulness habits: meditation. Learn how to keep this habit going for the long run and discover which tools help Jon stay on track. Additional materials: www.superdatascience.com/548
In this episode, Dr. Jonathan Flint, Professor of Psychiatry and Biobehavioral Sciences at the University of California Los Angeles, joins us to discuss how he uses data science and machine learning to explore the link between genetics and depression. In this episode you will learn: Jonathan's background [2:53] How we know that genetics plays a role in complex human behaviors including psychiatric disorders like anxiety, depression, and schizophrenia [8:00] The role that data science and ML play in modern genetics research [15:08] About Jonathan's book "How Genes Influence Behavior" [19:45] The day-to-day life of a world-class medical sciences researcher [32:24] The open-source software libraries that Jonathan uses for data modeling [40:33] A single question you can ask to prevent a severely depressed person from committing suicide [52:00] LinkedIn Q&A [54:41] The future of psychiatric treatments [1:05:35] Additional materials: www.superdatascience.com/547
Our Five-Minute Friday habit-tracking series continues! Learn more about alternate-nostril breathing, the mindfulness technique that is scientifically proven to lower blood pressure and regulate the stress response. Additional materials: www.superdatascience.com/546
Data scientist and entrepreneur Matthew Russell joins Jon Krohn to discuss the intersection of machine learning and fitness and dive deep into the strategies he and his team at Strongest AI use to scale data-intensive real-time applications. In this episode you will learn: About Strongest's event platform and iOS app [6:06] How Strongest scaled to serve million [8:14] Strongest's unique approach to building a fitness app [17:50] How to rapidly test ML models for deployment [29:01] The three critical traits Matthew looks for in anyone he hires [33:11] Mining the Social Web [41:14] The values instilled in Matthew by pursuing a military education [53:30] The key skills Matthew wishes he’d learned earlier in his career [1:03:51] Additional materials: www.superdatascience.com/545
Our habit-tracking series continues with a look at how making your bed can jumpstart your mornings, prevent you from taking part in negative habits and help you become happier. Additional materials: www.superdatascience.com/544
Nicole Büttner (Founder and CEO of Merantix Labs) joins the podcast to discuss driving A.I. innovation, automation, and transformation and building the ideal A.I. start-up founding team. In this episode you will learn: The three factors that spark A.I. innovation [12:48] How to make great use of unlabelled, unbalanced data sets [18:54] How to engineer reusable data and software components [25:09] Merantix's A.I. Canvas framework for successful innovation [29:59] How to be a part of Merantix's program as a founder [45:23] Additional materials: www.superdatascience.com/543
Revisit the much-underrated continuous calendar and get started with this uncommon planning method thanks to Jon's 2022 template. Additional materials: www.superdatascience.com/542
In this episode, Kevin Hu joins the podcast to talk about founding and growing the data observability startup, Metaplane. Listen in to hear about his time in academia at MIT, his experience with Y Combinator, and his current routine as a technical founder. In this episode you will learn: What is data observability? [4:35] How to identify data quality issues [8:56] Kevin's PhD research on automating data science systems using machine learning [16:18] Why Kevin launched Metaplane [28:50] The pros and cons of an academic career relative to the start-up hustle [31:57] Kevin's experience in the Y Combinator accelerator [39:50] The software tools he uses daily as a CEO [50:54] What Kevin looks for in data engineer hires [56:13] Additional materials: www.superdatascience.com/541
In this episode, Jon opens up about starting his day with a glass of water – his first morning habit that sets his day off on a healthy and successful note. Additional materials: www.superdatascience.com/540
In this episode, Serg Masís joins the podcast to share his in-depth technical knowledge of Interpretable Machine Learning. Together they discuss why this field matters, how it’s evolving, and so much more. In this episode you will learn: What is interpretable machine learning? [8:41] The social and financial ramifications of interpreting models incorrectly [10:23] The challenges involved in interpretable ML [16:00] The most important interpretable ML concepts to master [19:54] The future of Interpretable ML [32:41] What it’s like to be a Climate & Agronomic Data Scientist [42:28] Serg’s day-to-day tools [49:05] Serg's productivity tips [50:25] Why Serg pursued a Master's in Data Science [52:25] Additional materials: www.superdatascience.com/539
In this episode, Jon shares his "life-changing" habit tracking system that has allowed him to achieve more, create more structure within his day and cut out bad habits. Additional materials: www.superdatascience.com/538
Sadie St. Lawrence returns to discuss the biggest data science trends that are set to take over the industry in 2022. In this episode you will learn: A look back at data science trends for 2021 [4:03] Micro and macro data science trends for 2022 [12:30] AutoML tools [15:20] The social implications of deepfakes [21:21] Scalable AI [38:40] Macro data science trends for 2022 [42:45] The impact of the remote-working economy in data science [43:21] Blockchain in data science [50:28] Data literacy of the global workforce [1:01:07] Additional materials: www.superdatascience.com/537
Jon goes over his five biggest learnings from 2021 and what he hopes to work on in 2022. Additional materials: www.superdatascience.com/536
Prolific data science entrepreneur and Y Combinator alum Austin Ogilvie (Laika, Yhat) joins Jon Krohn for a revealing look into his journey of starting, growing, and selling a data science startup. From liberal arts graduate to twice successful technical founder, take a seat and learn from the best. In this episode you will learn: The story behind the naming of Yhat and its early beginnings [5:10] Austin and Yhat's experience at Y Combinator [19:00] The benefits of being a technical founder [25:00] From arts degree graduate to successful tech entrepreneur [27:00] Austin's latest venture, Laika [39:30] The tools that Austin uses day-to-day [47:30] Unity gaming environment [49:58] What makes a great data scientist [56:23] Additional materials: www.superdatascience.com/535
Jon sends a holiday greeting to all listeners. Additional materials: www.superdatascience.com/534
Dr. Brett Tully joins us on the podcast to discuss his work as Director of AI Output Systems at Nearmap and his previous research in biomedical topics and nuclear fusion. In this episode you will learn: What is Nearmap? [5:22] What is a Director of AI Output Systems? [7:51] A case study [20:35] MLOps at Nearmap [26:37] Brett’s day-to-day and what he looks for in hires [40:19] Brett’s academic and research history [53:30] Brett’s work in nuclear fusion and predictions for the technology [1:04:48] The tools Brett used in his research [1:26:34] ProCan project [1:34:27] Brett’s prediction for future AI applications [1:48:30] Additional materials: www.superdatascience.com/533
Jon discusses one helpful framework when it comes to problem-solving and how data scientists are uniquely positioned to employ this technique. Additional materials: www.superdatascience.com/532
Jeroen Janssens joins the podcast to discuss his book on utilizing the command line for data science and the importance of polyglot data science work. In this episode you will learn: The genesis of Jeroen’s book [3:24] Data Science at the Command Line [8:55] Creating your own command line tools [22:07] Polyglot data scientist [24:29] Data Science Workshops [27:01] Jeroen’s PhD research [30:38] Additional materials: www.superdatascience.com/531
Jon details his top ten AI thought leaders hoping that his suggestions prove valuable to you in your data science journey. Additional materials: www.superdatascience.com/530
Dave Niewinski joins us to discuss his prolific work in robotics both as a consultant and a popular YouTuber. In this episode you will learn: Dave’s Armoury [4:44] Robotic cornhole tournament [12:33] Dave’s many robots [14:25] Dave’s idea process [28:51] Future robots [31:43] Dave’s consulting business [33:27] Tools Dave likes to use [37:05] How did Dave get started in this line of work? [38:50] Dave’s advice to people who want to get into robotics [41:18] What is Dave excited about in the future? [45:38] Additional materials: www.superdatascience.com/529
Jon explores his personal anxieties as a content creator to encourage fellow creators to keep sharing their knowledge. Additional materials: www.superdatascience.com/528
Peter Bailis joins the podcast to discuss the work of his company that solves complex commercial problems through automated data analysis. In this episode you will learn: Meaning of the name Sisu [3:08] What Sisu does [4:45] Sisu and the data science stack [17:00] Going from academia to startups [22:37] What Sisu looks for when hiring [28:57] Peter’s favorite tools [32:40] Peter’s academic research [45:02] Additional materials: www.superdatascience.com/527
I finish up our three-part series on the results of the O’Reilly Survey, looking at the highest-paying data frameworks. Additional materials: www.superdatascience.com/526
Karen Jean-Francois joins us to discuss how she wants to empower her team members and a wider audience of data scientists battling imposter syndrome. In this episode you will learn: Karen’s background as a hurdler [4:42] Women in Data Podcast [10:32] Cardlytics [19:04] Karen’s background and current career [22:55] Karen’s favorite tools [31:29] Karen’s balance of fitness and work [34:45] The biggest challenge of Karen’s career [47:09] Advancement in data [54:13] What is Karen most excited about? [59:40] Additional materials: www.superdatascience.com/525
In this episode, I go over the highest-paying data tools based on the O’Reilly survey. Additional materials: www.superdatascience.com/524
Wes McKinney joins us to discuss the history and philosophy of pandas and Apache Arrow as well as his continued work in open source tools. In this episode you will learn: History of pandas [7:29] The trends of R and Python [23:33] Python for Data Analysis [25:58] pandas updates and community [30:10] Apache Arrow [41:50] Voltron Data [55:10] Origin of Wes’s project names [1:08:14] Wes’s favorite tools [1:09:46] Audience Q&A [1:15:34] Additional materials: www.superdatascience.com/523
I provide you with some quick definitions of data tools vs data platforms to prep us for deep dives in future episodes. Additional materials: www.superdatascience.com/522
Khuyen Tran joins us to discuss her work as a prolific technical writer and undergraduate data science student. In this episode you will learn: Khuyen’s online writing [4:00] Book writing [8:50] How you can increase your engagement [13:49] Khuyen’s work with Towards Data Science and NVIDIA [19:01] Ocelot Consulting [24:08] Khuyen’s undergrad work [32:12] Audience questions [47:00] Additional materials: www.superdatascience.com/521
I take a look at the results of O’Reilly’s survey on salaries for data scientists in 2021. Additional materials: www.superdatascience.com/520
James Hodson joins us to discuss his philosophy and work at AI for Good and how they aim to promote sustainability and A.I. use for social issues. In this episode you will learn: AI for Good [5:17] Founding of AI for Good [8:50] Case studies [14:58] How you can get involved [46:29] Skills James looks for in hires [50:39] Additional materials: www.superdatascience.com/519
This week, I provide a short but important bit of advice on failure. Additional materials: www.superdatascience.com/518
Sadie St. Lawrence talks in-depth about her extensive work as a data science educator through both online and collegiate courses as well as her organization for diversifying data science careers. In this episode you will learn: Sadie’s education work in SQL [4:13] The popularity of Sadie’s course [13:32] Sadie’s forthcoming machine learning certificate course [16:29] Women in Data [25:32] Sadie’s non-technical background [36:17] NFTs and VR [46:41] Additional materials: www.superdatascience.com/517
In this episode, I finish up my saga into the effects of caffeine on productivity. Additional materials: www.superdatascience.com/516
Chrys Wu joins us to discuss her community organizations, her tips, and her recommended resources for building data science communities for impact. In this episode you will learn: The world of K-Pop [4:07] Chrys’s talk at the R Conference [8:56] Write/Speak/Code [14:05] Hacks/Hackers [21:58] Tips on developing data communities [27:22] Additional materials: www.superdatascience.com/515
In this episode, I dive into the nuts and bolts of the data from my experiment on caffeine and productivity. Additional materials: www.superdatascience.com/514
Denis Rothman joins us to discuss his writing work in natural language processing, explainable AI, and more! In this episode you will learn: What are transformers and their applications? [7:54] Denis’s book on explainable AI [25:08] AI by Example [35:53] LinkedIn audience questions [42:00] Additional materials: www.superdatascience.com/513
I dive into a personal experiment that tests my productivity relative to my coffee intake and whether caffeine is actually hurting my productivity. Additional materials: www.superdatascience.com/512
Drew Conway joins us on the first live podcast to discuss his work in private investing and how data science figures into and improves his work. In this episode you will learn: The R Conference and NYHackR [6:33] Machine Learning for Hackers [20:17] Two Sigma and Drew’s work [28:27] Drew’s team structure at Two Sigma [35:12] Audience Q&A [46:27] Additional materials: www.superdatascience.com/511
In this episode, I dive into the world of reinforcement learning and deep reinforcement learning and the benefits of both. Additional materials: www.superdatascience.com/510
Parinaz Sobhani joins us to discuss the cutting-edge work of Georgian, a collaborative company that helps start-ups implement and scale machine learning and AI. In this episode you will learn: Parinaz’s work at Georgian [5:35] Use cases of Georgian’s work [14:35] Tools and approaches Parinaz uses [32:27] Environmental concerns of machine learning [42:52] Hiring at Georgian and what Parinaz looks for [48:18] How did Parinaz become interested in this? [56:19] Fairness in AI [1:09:01] Additional materials: www.superdatascience.com/509
In this episode, I discuss an interesting bit of my grandmother’s view about the process of working and going through life. Additional materials: www.superdatascience.com/508
Rob Trangucci joins us to discuss his work and study in Bayesian statistics and how he applies it to real-world problems. In this episode you will learn: Getting Rob on the show [8:12] Stan [9:34] Gradients [18:15] What is Bayesian statistics? [23:05] Multi-modal deep learning [45:20] Stan package [53:46] Applications of Bayesian stats [1:09:47] The day-to-day of a PhD in stats [1:21:56] What does the future hold? [1:42:37] Additional materials: www.superdatascience.com/507
In this episode, I continue with last week’s theme and discuss the differences between supervised and unsupervised learning. Additional materials: www.superdatascience.com/506
Hadelin de Ponteves joins us to discuss his latest educational work and how his skills as a data science educator helped him start his career in acting. In this episode you will learn: What has Hadelin been up to? [4:27] Hadelin’s cinema career and data science crossover [16:02] Sleep for productivity [27:27] How did Hadelin decide to undertake this? [32:26] Bollywood vs Hollywood [37:26] Additional materials: www.superdatascience.com/505
In this episode, I give a quick introduction to subcategories of supervised learning problems. Additional materials: www.superdatascience.com/504
Pieter Abbeel joins us to discuss his work as an academic and entrepreneur in the field of AI robotics and what the future of the industry holds. In this episode you will learn: How does Pieter do it all? [5:45] Pieter’s exciting areas of research [12:30] Research application at Covariant [32:27] Getting into AI robotics [42:18] Traits of good AI robotics apprentices [49:38] Valuable skills [56:40] What Pieter hopes to look back on [1:04:30] LinkedIn Q&A [1:06:51] Additional materials: www.superdatascience.com/503
In this episode, I explore a common issue plaguing people across fields: imposter syndrome. Additional materials: www.superdatascience.com/502
Jared Lander joins us to discuss his work as an R meetup organizer, the upcoming virtual R Conference, and his work as a consultant for a variety of companies from metal workers to professional football teams. In this episode you will learn: Jared’s R meetups and our professional history [3:27] NYHackR [6:42] The R Conference [13:25] R for Everyone [18:55] Lander Analytics [22:10] Job openings at Lander Analytics [25:04] R vs. Python [29:15] The importance of pizza in Jared’s life [32:19] Additional materials: www.superdatascience.com/501
In this very special episode, we delve into a live Yoga Nidra practice with Jes Allen and go over how you can open up to consciousness through yoga practice. In this episode you will learn: What yoga means [3:40] Jes’s current work as a yoga practitioner [10:00] How to find Jes online [22:31] The Yoga Nidra practice [27:09] Coming out of the practice [54:50] Additional materials: www.superdatascience.com/500
Barr Moses joins us to discuss the importance of data reliability for pipelines and how companies can achieve a data mesh. In this episode you will learn: Data meshes [4:25] Self-serve data reliability [15:36] How Monte Carlo helps data uptime [21:13] How to build an effective data science team [26:50] LinkedIn Q&A [31:50] Additional materials: www.superdatascience.com/499
In this episode, I dive into a recurring pattern I’ve noticed where beginners, myself included, think they’re more skilled and experienced than they really are. Additional materials: www.superdatascience.com/498
Benjamin Todd joins us to discuss his work helping professionals maximize their career capital, the top skills to learn across professions, and more. In this episode you will learn: How Benjamin helped me become a data scientist [6:56] How did 80,000 Hours come about? [9:39] The impact of 80,000 Hours [14:46] Funding [17:23] Where does the name come from? [23:32] What kind of advice does Benjamin give to people? [25:21] How data scientists can make an impact [42:04] How can someone strategize about their career? [1:02:53] Top skills that everyone should learn [1:05:49] Additional materials: www.superdatascience.com/497
In this episode, you’ll enjoy a fictional narrative I’ve titled “2040: A Brain-Computer Interface Story”. Additional materials: www.superdatascience.com/496
Greg Coquillo joins us to discuss his work on ROI for startups and the best ways to make the most of your company’s AI investment. In this episode you will learn: Our connection through Harpreet’s happy hours and DSGO [4:48] Greg’s content on LinkedIn [6:40] The scope of Greg’s work [9:25] Making the most out of AI [16:05] LinkedIn Q&A [20:00] Quantum machine learning [32:06] Additional materials: www.superdatascience.com/495
In this episode, I talk about an interesting thought experiment that helps you appreciate your existence. Additional materials: www.superdatascience.com/494
Anjali Shrivastava joins us to discuss her data science degree and her content creation efforts to bring data science to the people. In this episode you will learn: Anjali’s studies [2:00] Anjali’s YouTube channel [11:57] The content creation process [17:58] Yoga during the pandemic [21:34] Anjali as a writer [24:38] Anjali’s dual degrees [31:28] Anjali’s previous data science roles [43:04] Anjali’s first full-time data job [51:12] Anjali’s hopes for the future [55:29] Additional materials: www.superdatascience.com/493
In this episode, I discuss the changing child mortality rate as evidence of how much better the world is and how much better it could be. Additional materials: www.superdatascience.com/492
Veerle van Leemput joins us to make the case for why you should be using R for production. In this episode you will learn: Our shared powerlifting passion [2:47] The stigma of using R [12:02] What does Analytic Health do? [13:55] How Analytic Health uses R [19:08] Tidyverse [34:44] Tools for API creation [37:09] Additional materials: www.superdatascience.com/491
In this episode, I discuss why you should avoid the visually pleasing but flawed pie chart. Additional materials: www.superdatascience.com/490
Vin Vashishta joins us to discuss his AI consulting work and his philosophy on AI strategy for monetization. In this episode you will learn: V-Squared [4:59] Vin’s online content [17:18] Low-code/no-code in data science [25:33] Top five gap skills [35:19] Data sets for insights on consumers and targeting [40:26] Are there socially beneficial data science and machine learning applications? [43:16] The most difficult data science problem Vin ever faced [50:39] Additional materials: www.superdatascience.com/489
In this episode, I discuss the simple and cheap ways you can buy yourself more time during the day. Additional materials: www.superdatascience.com/488
Susan Walsh joins us to discuss the importance of data cleaning and normalization and how clean procurement data can save companies money. In this episode you will learn: Susan’s “COAT” system [7:16] The Classification Guru [15:39] Case studies [22:46] Susan’s book [30:26] Additional materials: www.superdatascience.com/487
In this episode, I go over the world history of calculus and how we still use these techniques today. Additional materials: www.superdatascience.com/486
Doug Eisenstein joins us for a great and in-depth conversation on data engineering in the financial sector. In this episode you will learn: The founding of Advanti [4:37] Aristos and solution products [16:45] The kinds of financial industries and how Doug helps [26:25] Entity Extraction [34:27] Temporality data [44:27] How to work with Doug [58:19] Additional materials: www.superdatascience.com/485
In this episode, I discuss interesting research on why humans are so quick to lose faith in algorithms. Additional materials: www.superdatascience.com/484
Andrew Jones joins us to discuss data science interviews and how you can maximize your chances through interview preparation, your resume, and more! In this episode you will learn: Data Science Infinity [5:40] “The Essential AI and Data Science Handbook for Recruitment” [17:40] How can aspiring data scientists set themselves apart? [21:30] What skillset should data scientists have? [34:36] Should data scientists be trying to be data engineers? [41:14] How can organizations ensure data science projects are a success? [50:50] Additional materials: www.superdatascience.com/483
In this episode, I talk about the advantages of using a continuous calendar. Additional materials: www.superdatascience.com/482
Kris Tait joins us to discuss the vast world of digital performance marketing and how automation, data, and optimization play an important role. In this episode you will learn: What is performance marketing? [3:29] How can advertisers take advantage of these tactics? [13:04] The importance of quality data in performance marketing [20:19] Human value performance marketing [25:30] How does Croud optimize? [29:05] What are the best KPIs in this industry? [34:02] Roles available at Croud now [39:11] Typical tools at Croud [42:43] What clients work best for Croud? [48:56] Additional materials: www.superdatascience.com/481
In this episode, I go over my top 5 tips for refining your perfect data science resume. Additional materials: www.superdatascience.com/480
Maureen Teyssier joins us to discuss the cutting-edge work Reonomy is doing in commercial property real estate and her views and tips on building a great data science team. In this episode you will learn: Maureen’s work with Reonomy [5:40] Knowledge graphs and use cases [7:35] Other tools Reonomy uses [18:58] What Maureen looks for in potential hires, soft skills and hard skills [26:28] Hiring at Reonomy [41:40] Maureen’s tips for growing a data science team [48:55] Tools to transition from academia to industry [52:45] Additional materials: www.superdatascience.com/479
In this episode, I go over my 5 keys to success to tackle any goal. Additional materials: www.superdatascience.com/478
Sidney Arcidiacono joins us to discuss her studies and work at Make School and her interest in utilizing AI for healthcare, as well as her tips and strategies for becoming a successful early-career data scientist. In this episode you will learn: What is Make School? [5:00] Sidney’s interest in AI and computer science [10:56] Graph theory and graph convolutional neural networks [19:53] What tools does Sidney use for her work? [31:16] Sidney’s internship [36:52] How other beginners can get involved in data science [38:12] Sidney’s goals [41:57] Additional materials: www.superdatascience.com/477
In this episode, I discuss the amazing benefits of implementing peer-driven learning in your professional life. Additional materials: www.superdatascience.com/476
David Langer joins us to discuss his work as a data analytics educator and his beliefs in the use of Excel, SQL and R in analytics work. In this episode you will learn: Intro to Dave on Data [6:50] 20% analytics that drives 80% of ROI [11:04] The benefits of SQL [19:15] The uses of R [24:50] Machine learning [34:15] Additional materials: www.superdatascience.com/475
In this episode, I discuss the architecture of a “machine learning house”, representing the skills and learnings you can use as foundations to build your data science career. Additional materials: www.superdatascience.com/474
Anima Anandkumar joins us to discuss her work as a researcher in machine learning at NVIDIA and a professor at Caltech, and how the two roles often go hand-in-hand and inform each other. In this episode you will learn: Anima’s recent discovery of yoga [5:20] How does Anima balance her work? [12:25] Applications of Anima’s work [14:45] Tensors [22:55] Anima’s favorite NVIDIA projects [35:35] What tools does NVIDIA use? [41:55] Caltech interdisciplinary science [47:41] The path to generalized artificial intelligence [57:19] The skills to have to get into this field [1:00:27] LinkedIn questions for Anima [1:07:03] Additional materials: www.superdatascience.com/473
In this episode, I share a note I received from a student who expressed his thoughts on the learning that never stops as he goes through his data science career. Additional materials: www.superdatascience.com/472
Kirill Eremenko returns to the SDS podcast as a guest to debunk common myths you may believe about getting a data science job. In this episode you will learn: What has Kirill been up to? [3:48] The genesis of the 99-days challenge [5:27] 5 myths about pursuing a data science career [15:49] First data science jobs [1:00:53] 5 components for success [1:08:19] Additional materials: www.superdatascience.com/471
In this episode, I follow up on the popular book recommendation portion of the podcast with my own list of favorite books. Additional materials: www.superdatascience.com/470
Konrad Körding joins us to discuss his work in educating the next generation in deep learning and his views on the importance of causality in deep learning research. In this episode you will learn: Konrad’s academic background [3:54] Neuromatch Academy [5:23] Artificial general intelligence [35:02] Defining deep learning [41:24] Symbol representation [44:12] Konrad’s career journey [47:25] What other skills should you develop for the future? [52:46] What is the future of intelligence in our timeline? [56:37] Additional materials: www.superdatascience.com/469
In this episode, I tackle another historical topic: the history of data. Additional materials: www.superdatascience.com/468
Noah Gift joins us to discuss how he believes data science urgency and the end of hierarchies will change the world for the better. In this episode you will learn: Catch up with Noah [2:50] Educational options to pursue in data science [13:09] Outside university education [24:06] Noah as a prolific author [28:15] Urgent applications of technology [37:34] Noah’s income streams color code [48:38] How to harness our free time to solve big problems [54:13] Noah’s Coursera course [1:09:12] Additional materials: www.superdatascience.com/467
In this episode, I go over what separates a good data scientist from a great one in skills, practices, and approach. Additional materials: www.superdatascience.com/466
Konrad Kopczynski joins us to discuss how data, tracking, analytics, and key performance indicators can help your professional and personal development. In this episode you will learn: What does Konrad do? [3:40] Tools and techniques used in Impakt Advisors [10:35] Impakt’s unique hiring model [18:53] How does Impakt manage remote work? [21:36] Konrad’s professional history and daily structure [28:42] Konrad’s Ironman triathlon [44:11] Konrad’s years-long project on presidential biographies [47:46] Additional materials: www.superdatascience.com/465
In this episode, I tackle three often conflated terms - AI, machine learning, and deep learning - to shine some light on what exactly they are. Additional materials: www.superdatascience.com/464
Matt Dancho joins us to discuss his various packages for time series analysis and his courses on the topic through his company Business Science. In this episode you will learn: How Matt got into time series library development [4:22] Business Science [7:00] R Shiny [9:36] Matt’s 6 time series models [14:11] Timetk [15:02] Modeltime [29:32] Gluon package [36:04] Modeltime Ensemble [43:12] Modeltime H2O [45:22] Modeltime Resample [48:10] Additional materials: www.superdatascience.com/463
In this episode, I discuss taking a positive approach to the good things that happen in life, rather than focusing on potential negative outcomes. Additional materials: www.superdatascience.com/462
Sam Hinton joins us to discuss his work since assisting with COVID-19 data pipelines, his current role in renewable energy, and applications of ML and MLOps for the industry. In this episode you will learn: Catching up with Sam [3:05] Updates on the COVID-19 data pipelines [7:07] Sam’s current work at Arenko [10:41] Sam’s stint on Survivor, PhD, and his software engineering background [16:32] Machine learning in renewable energy [35:23] Sam’s day-to-day tools [49:33] How can listeners utilize MLOps? [53:08] Sam’s forthcoming novel [59:05] Additional materials: www.superdatascience.com/461
In this episode, I talk about the ancient history of algebra, an important component of data science today. Additional materials: www.superdatascience.com/460
Vince Petaccio joins us to discuss how he sees data science, ML, and AI making positive impacts in the fight against climate change. In this episode you will learn: Where in the world is Vince? [2:08] Vince’s interest in climate science [4:33] The Citizens’ Climate Lobby [9:12] Where data science comes in [13:28] Risks of relying on tools [31:54] How can you make an impact? [37:28] Additional materials: www.superdatascience.com/459
In this week’s episode, I take you behind the scenes of our video tutorial productions to see what goes into making our tutorials. Additional materials: www.superdatascience.com/458
Harpreet Sahota joins us to discuss his data science mentorship work outside his day job and how you can land your dream job. In this episode you will learn: Harpreet’s current life and location [2:25] Data Community Content Creator Awards [8:37] The Artists of Data Science Podcast [14:46] Data Science Dream Job [24:18] Harpreet’s day job at Price Industries [30:48] Coming in data science from a non-data background [40:55] Tools and skills to know [47:57] Additional materials: www.superdatascience.com/457
In this week’s episode, I talk about one of my favorite time management techniques: the Pomodoro technique. Additional materials: www.superdatascience.com/456
Horace Wu joins us to discuss his work on Syntheia, a unique product that helps sift through massive amounts of legal data to augment the capacities and function of law firms. In this episode you will learn: Horace’s life and work in New York City [5:00] Syntheia and Horace’s role there [6:25] Horace’s background [12:07] Nearmap [16:35] Syntheia NLP use cases [21:46] Design, coding, and the team [34:19] What skills does one need for this field? [41:41] What would Horace do differently and what is he excited for? [46:15] Additional materials: www.superdatascience.com/455
In this episode, I continue my discussion about the quick-paced growth of technology and how it impacts different fields. Additional materials: www.superdatascience.com/454
Stephen Welch joins to go over his year-end 2020 list of 10 important questions and pain points that machine learning can improve. In this episode you will learn: Welch Labs on YouTube [4:54] What Stephen’s been up to [7:56] Stephen’s 2020 year-end blog post [10:11] Stephen’s reflections on 10 areas worth focusing on [16:25] Additional materials: www.superdatascience.com/453
In this week’s episode, I discuss how technology propelled the recruitment industry forward and continues to do so today. Additional materials: www.superdatascience.com/452
Dan Shiebler joins us to discuss his category theory Ph.D. program, his full-time job at Twitter, and how the two cross over and combine in his overall data work. In this episode you will learn: Dan’s neuroscience undergrad and MATLAB [4:12] Dan’s Ph.D. timeline and research [14:01] How to start a Ph.D. while working full time [22:45] Dan’s work at TrueMotion and label data [30:39] Dan’s title and role at Twitter [39:15] Specific projects at Twitter [44:09] What skills someone should bring to a Twitter job interview [52:06] What machine learning approaches will be important in the future? [1:00:38] Additional materials: www.superdatascience.com/451
This week, Jon talks with Steve Fazzari about the physical and emotional benefits of practicing Yoga Nidra. Additional materials: www.superdatascience.com/450
Ayodele Odubela joins us to discuss fairness in AI and how we can work towards a more equitable and transparent world of data science and machine learning. In this episode you will learn: Comet ML [3:22] What is a data science evangelist? [7:08] FullyConnected [12:04] Imposter Syndrome and Ayodele’s book [15:57] What Ayodele wished she learned from grad school [20:25] Uncovering Bias in Machine Learning [27:00] Where can we affect this positive change in fairness? [31:08] The potential for a rosy future [49:20] Ayodele’s LinkedIn Learning course [52:24] Additional materials: www.superdatascience.com/449
This week, I answer your questions about how to take yourself from data science practitioner to data science leader. Additional materials: www.superdatascience.com/448
Michael Segala joins us to discuss how machine learning can provide creative and novel solutions to longstanding problems in both the private and public sectors. In this episode you will learn: SFL Scientific [4:20] SFL’s example work [10:55] Public sector vs private sector work [20:28] Michael’s day-to-day [30:18] What is Michael looking for in the people he hires? [33:38] Michael’s career journey [41:39] What is Michael excited about for the future? [48:38] Additional materials: www.superdatascience.com/447
This week I answer your questions about machine learning and how to educate yourself further in the field. Additional materials: www.superdatascience.com/446
Sinan Ozdemir joins us to share his work in conversational AI and what it takes to keep chatbots up to date and functional in an ever-changing world. In this episode you will learn: Kylie.ai under Directly [4:51] Sinan’s day-to-day work and tools [10:45] Use cases [18:27] AutoML’s role in these processes [21:55] What hard or soft skills are needed for this work? [29:32] Sinan’s background in teaching [34:58] Sinan’s history in pure math and applied math [39:44] Sinan’s math tattoos [43:48] Additional materials: www.superdatascience.com/445
In today’s episode, I answer your questions on how to best future-proof your data science career in AI, AutoML, and model interpretability. Additional materials: www.superdatascience.com/444
Jeff Wald joins us to discuss his book and the research he has done into the data and trends around the job market, the decline of the 9-5 office job, and more. In this episode you will learn: The Birthday Rules [3:51] A history of work [7:41] The myth of the lifetime contract [12:15] What the data says about now [21:02] On-demand labor market [25:34] Remote work [32:09] What role will automation play? [46:27] Future of employment from the study lens [48:30] Additional materials: www.superdatascience.com/443
In today’s episode, I discuss how focusing on process and habit building can provide more for you and your professional progress than simply chasing a goal. Additional materials: www.superdatascience.com/442
Kate Strachnyi joins us to discuss her work in data visualization education, from conferences to published books, as well as her tips for visualization best practices. In this episode you will learn: What does Kate do (from her children’s perspective)? [1:56] What kind of tools does Kate employ? [5:19] Kate’s day-to-day [13:03] DATAcated Conference [16:03] How do you amass a big LinkedIn following? [20:39] Kate’s four published books [29:55] The guidelines to follow to succeed in this field [37:00] What’s next for Kate? [41:24] Additional materials: www.superdatascience.com/441
In this episode, I continue my discussion on the leaps we’re making towards AGI, by looking at MuZero. Additional materials: www.superdatascience.com/440
Deblina Bhattacharjee joins us to talk about her amazing work in computer vision and give advice for getting into and excelling in the field. In this episode you will learn: Deblina’s master’s program work [4:03] Deblina’s computer vision research and Ph.D. [11:46] Deblina’s drumming hobby [20:18] The daily work [24:40] What key skills do you need as a data scientist? [33:21] How can a data scientist prepare for the future? [37:03] How does Deblina tackle time management? [40:24] Additional materials: www.superdatascience.com/439
In this episode, I discuss DeepMind’s latest breakthrough towards AGI and the stepping stones that got them there. Additional materials: www.superdatascience.com/438
Claudia Perlich joins us to discuss her work at one of the world’s largest hedge funds and how she got to work there, as well as her history of winning data science competitions. In this episode you will learn: Life and work during the pandemic [2:23] Claudia’s history with horses and riding [8:28] Claudia’s work at Two Sigma [12:00] Claudia’s role on a daily basis [20:51] Tools of the trade [30:27] What Claudia looks for when hiring [36:37] What skills do future hires need? [40:32] Claudia’s history with data science competitions [48:22] Why work in finance and at Two Sigma? [1:00:19] Additional materials: www.superdatascience.com/437
In this episode, I continue my discussion on daily mindfulness practice and how to form a growing habit in it. Additional materials: www.superdatascience.com/436
Erica Greene joins us to discuss her work as a machine learning manager at Etsy, how they tackle problem-solving, how they implement ML scaling, and more. In this episode you will learn: Erica’s role at Etsy and problem solving between platforms [2:28] Interesting failures Erica has navigated [25:40] How does Erica’s team select problems to solve? [33:07] Engineering at scale [40:15] What does Erica’s working day look like? [46:30] Etsy is hiring [53:00] Diversity in hiring [57:12] Do data scientists need PhDs? [1:01:26] Additional materials: www.superdatascience.com/435
In this episode, I discuss my use of mindfulness and attention sharpening tools to boost my productivity throughout the day. Additional materials: www.superdatascience.com/434
Ben Taylor joins us for the fourth time to discuss the upcoming 2021 trends in the world of data science as well as the post-COVID world. In this episode you will learn: Ben’s passion for AI [9:41] Delivering results and KPIs [12:43] DataRobot and AutoML [20:38] Transparent storytelling [24:29] Federated learning [31:37] ML productionization [37:01] AI ethics [46:01] Emerging software packages/tools [54:39] Remote work [1:02:44] Additional materials: www.superdatascience.com/433
In this episode, I introduce myself, Jon Krohn, as the new host of the SuperDataScience podcast and give you a taste of what to look forward to in 2021! Additional materials: www.superdatascience.com/432
In this final episode featuring Kirill as the host, he examines and presents his top 7 learnings from this unprecedented year. In this episode you will learn: Back pain and standing desks [5:41] The internal conflict model [14:33] What acceptance really means [38:32] Intellect and Intelligence [58:10] Needs vs. wants/desires/wishes [1:08:00] Intention vs. effect [1:25:51] Do not take things personally [1:46:12] Additional materials: www.superdatascience.com/431
In this episode, I talk about the reasoning behind my decision to step down as the host of the SDS podcast. Additional materials: www.superdatascience.com/430
Jon Krohn joins us for a year-end episode about 2020’s biggest data science breakthroughs and for a big podcast announcement for 2021. In this episode you will learn: Global warming [4:37] Our big podcast announcement [6:57] Who is Jon Krohn? [12:14] Top 3 technological breakthroughs of the year [21:28] AlphaFold [23:33] GPUs [45:51] GPT-3 [1:00:26] Wrap up [1:26:40] Additional materials: www.superdatascience.com/429
In this episode, I talk about a very interesting concept around expectations and reality, and how the gap between the two might be affecting us. Additional materials: www.superdatascience.com/428
Syafri Bahar joins us for a great conversation about his work at GOJEK, a decacorn super app bringing services to Indonesia, and his philosophy of empowered data science teams. In this episode you will learn: Syafri’s day job at GOJEK [11:26] What is a super app? [14:50] The data science department at GOJEK [19:47] High-performance data science team [31:17] Syafri’s career journey and love of math [39:49] Apply to work at GOJEK [55:42] Working for the benefit of others [1:00:21] Additional materials: www.superdatascience.com/427
In this episode, I talk about something profoundly important for me this year in shifting away from ego-driven ambition towards non-materialistic meaning in your life and work. Additional materials: www.superdatascience.com/426
Rama Akkiraju joins us to discuss the past, present, and future of AI services and how companies and data scientists can best prepare themselves to become AI consumers. In this episode you will learn: 23 years at IBM, before and after data science [6:11] IBM Watson and AI services [12:25] Skills to utilize AI services [25:02] How to achieve significant ROI on AI deployment [41:31] What does the AI future look like to Rama? [52:41] Ethics and the benefits of AI [1:04:37] Additional materials: www.superdatascience.com/425
In this episode, we talk about how businesses can maximize their relationship with AI to ensure visible ROI and progress of industries. Additional materials: www.superdatascience.com/424
Amanda Obidike joined us for a great discussion about her work in Nigeria and the African continent in empowering and enabling STEM education and job placement. In this episode you will learn: Life in Lagos, Nigeria [5:22] Amanda’s journey to data science [7:28] Case studies and example projects [13:00] STEM skills and the start of STEMi [19:41] What are the issues STEMi is addressing? [24:48] Get involved in STEMi’s mentoring project [30:12] STEMi’s results so far [36:02] Amanda’s best tips for landing jobs [39:04] Work in promoting education and literacy [45:19] The progress of STEM in Africa [47:34] Additional materials: www.superdatascience.com/423
In this episode, I talk about the difference between pain and suffering and the importance of becoming aware of it. Additional materials: www.superdatascience.com/422
Theunis Barnard joins us for a great conversation about digital twins and how data scientists can learn about the technology and get involved with its applications. In this episode you will learn: Data science in South Africa [6:08] Theunis’s current companies [11:32] Industry 4.0 [13:59] Digital twins [22:37] Theunis’s day-to-day [38:54] Further examples of digital twins [42:26] Future of digital twins [48:02] Theunis’s advice for data science newcomers [57:17] Process digital twins vs. system digital twins [59:42] Additional materials: www.superdatascience.com/421
In this episode, we do an exercise using the wheel of life to examine your time management and understand how balanced your life currently is. Additional materials: www.superdatascience.com/420
Juval Löwy joins us for an exceptional episode that condenses much of his masterclass teachings into a powerful hour of information about the right approach to designing systems as well as projects. In this episode you will learn: Career planning [7:24] Consequences of designing against requirements [8:57] The framework of a good system design [30:32] The right approach to project design [44:00] Juval’s book [1:02:31] The progress and future [1:03:48] Additional materials: www.superdatascience.com/419
In this episode, I discuss a very interesting quote by Beethoven about the importance of giving space to feelings, even if that means making a mistake. Additional materials: www.superdatascience.com/418
Arthur Shectman joins us to discuss the data engineering and data product development work his team does at Elephant Ventures and the importance of capturing value through data. In this episode you will learn: What is Elephant Ventures? [8:11] Data quality engineering [21:00] The importance of focusing on business value [39:58] Methodology for understanding the company’s business value [46:05] What is data engineering? [49:28] What is data product development? [51:34] What are the technical skills needed for these jobs? [56:02] What is the future bringing for data science? [59:23] Additional materials: www.superdatascience.com/417
In this episode, I talk about the three key ingredients for a successful, happy career in data science. Additional materials: www.superdatascience.com/416
Asieh Ahani joins us to discuss her rapid career progress, the unique work she does at MassMutual, and how she maintains her technical skills while working in a leading position. In this episode you will learn: Asieh’s background [5:09] Machine learning techniques for processing biosignals [14:24] Signal processing [22:22] Asieh’s career and move from academia to industry [27:19] Maintaining technical skills as a manager [41:48] MassMutual is hiring [47:02] Leading a remote data science team and work/life balance [49:55] Asieh’s words for other women in data science [55:28] Future of data science [1:00:30] Additional materials: www.superdatascience.com/415
Today I talked about the importance of understanding the balance between acting selfishly and acting with self-neglect and how the awareness of our needs and wants can help with that. Additional materials: www.superdatascience.com/414
Emmanuel Letouzé discussed in-depth his work in global data science literacy and how he hopes data science will benefit the world in various societal and socio-economic challenges. In this episode you will learn: Parenting and its effects on Emmanuel’s life and work [3:14] Why did Data-Pop Alliance come to life? [8:42] Working with Harvard and MIT [13:04] Examples of projects and areas of focus [18:16] Data as lenses and data as lever [29:43] Sustainable Development Goals indicators [38:21] How can we use data as a lever? [43:41] How can data help with disaster resilience? [57:09] The future of data science [1:04:09] Additional materials: www.superdatascience.com/413
Today I talked with a chiropractor about how to best treat your back while working during the day. Additional materials: www.superdatascience.com/412
Jennifer Cooper talked with us about her role as a strategic analyst and how others can get involved with similar positions around analytics and hybrid roles. In this episode you will learn: Jennifer’s start in data science [6:04] What is analytics support function? [16:01] Keys to success in analytics roles [21:09] How do you find these roles? [42:42] DataScienceGO Virtual #2 [50:45] Common questions Jennifer gets [1:00:52] Additional materials: www.superdatascience.com/411
Today I talk about something important, which I recently had to reteach myself, about personal needs and communication. Additional materials: www.superdatascience.com/410
Steve Nouri talks with us about the importance of managing your personal brand, participating in hackathons, and being active in the conversations around AI as you begin your career. In this episode you will learn: Steve’s work in the Australian Computer Society [4:32] River City Labs [12:22] Hackathons during the pandemic [16:21] Choosing a path in AI [26:09] The AI bubble and its implications [31:09] Strategic data acquisition [38:04] Explainable AI [43:50] Creating a personal brand [51:35] Additional materials: www.superdatascience.com/409
Today I talk about an interesting concept that can often be the cause of conflicts in professional and personal relationships. Additional materials: www.superdatascience.com/408
Margot Gerritsen joins us for a great discussion that was both technical and inspiring, on the topics of principal component analysis and linear algebra, as well as the importance of women in data science. In this episode you will learn: Margot’s travels and background [7:29] Margot’s position and work at Stanford [13:38] What is linear algebra? [18:00] Principal component analysis [23:02] WIDS, Women in Data Science [32:08] Margot’s diversity call to action [58:12] How can men support their female colleagues? [1:05:55] Additional materials: www.superdatascience.com/407
Today we discussed the Buddhist concept “abandon hope” as a way to avoid falling victim to negative emotions and fear. Additional materials: www.superdatascience.com/406
Thomas Obrist joins us to give an advanced talk on the work he does in the financial and energy space as a quant and how it overlaps with data science. In this episode you will learn: Thomas’s background and studies [5:04] Long and short in financial markets [8:33] Thomas’s current role at Axpo [14:55] Quant vs. data scientist vs. data analyst [18:55] The Monte Carlo method [26:26] Thomas’s day-to-day [30:06] Grid loss use case [35:25] Thomas’s hackathon success [53:22] Thomas’s recommendation for those interested in the space [1:01:39] Additional materials: www.superdatascience.com/405
Today we dissect the building blocks of storytelling to help you become a better presenter of your data science insights. Additional materials: www.superdatascience.com/404
Juan Gabriel Gomila Salas joins for an exciting discussion about his work in the game industry and how gamification can boost data science impact across industries. In this episode you will learn: Juan Gabriel’s work before and during COVID-19 [3:37] Juan Gabriel’s unique career path [10:36] Video game monetization case study [25:44] How can data scientists utilize gamification in their daily jobs? [36:28] Juan Gabriel’s work as a professor [42:46] Is online education the future? [47:40] Data science in the English speaking world vs the Spanish speaking world [52:30] Where is data science headed? [59:45] Additional materials: www.superdatascience.com/403
In this episode, I discuss an interesting metaphor I’ve recently utilized to help myself face and overcome toxic or negative feelings. Additional materials: www.superdatascience.com/402
Michael Galarnyk joins to tackle your questions on data science job hunting and data science education. In this episode you will learn: Who is Michael Galarnyk? [3:48] Tools and skills to know [11:52] Building and sharing a portfolio [26:21] Advantages of online and in-person education [37:42] Teaching data science to younger students [43:55] Necessary soft skills to be a successful data scientist [51:31] Additional materials: www.superdatascience.com/401
In this anniversary episode, we discuss the importance of knowing why you do data science and how your skills may one day impact the world as challenges arise. Additional materials: www.superdatascience.com/400
Monica Royal joins us to discuss her journey from consumer to contributor in the data science community and how sharing your work and exploring networking can help you on your journey. In this episode you will learn: Monica’s activity in the data science community [5:17] The biggest takeaways from Monica’s 100 Days of Learnings [11:00] Techniques for productivity and continued learning [16:03] Monica’s interest in the SDS podcast and keeping up to date in data science [33:01] The DataScienceGO Virtual experience [35:51] Strategic thinking [38:38] Monica’s parting inspirational thoughts [41:01] Additional materials: www.superdatascience.com/399
In this episode, I discuss a very important topic on the stages and symptoms of burnout and how to tackle them at each point to avoid irreparable damage. Additional materials: www.superdatascience.com/398
We chatted with data science influencer, educator, and principal data scientist Kirk Borne about his philosophy and work in spreading data science literacy across fields and industries through his frameworks. In this episode you will learn: Live vs. virtual events [4:20] Who is Kirk Borne? [7:13] Big data’s evolution and the emergence of small data [11:17] The fourth industrial revolution and the future [22:00] How has the data science education space changed in 14 years? [33:44] Four types of data discovery [38:00] The broad categories of AI you should pursue [50:44] 5 levels of analytics maturity [53:50] LinkedIn Q&A [1:05:00] Hiring at Booz Allen [1:15:18] Additional materials: www.superdatascience.com/397
In this episode, I share a series of great tips, plus a bonus tip for getting your application further along in the hiring process and getting the job. Additional materials: www.superdatascience.com/396
Cole Nussbaumer Knaflic talks about her influential book Storytelling with Data and shares some best practices for conveying meaning from your visualizations. In this episode you will learn: Cole’s business Storytelling With Data [4:04] How did Cole get into this space? [7:24] When did Cole start writing the book? [15:33] Top 3 tips from the book [22:44] How to structure a good story [35:17] Communicating in-person vs. virtually [41:37] Cole’s upcoming workshops [43:50] LinkedIn Q&A [48:57] Cole’s advice on preparing for the future in the field [1:05:22] Additional materials: www.superdatascience.com/395
In this episode, I discuss the power of teaching what you learn to help you retain the highest amount of the information you are learning. Additional materials: www.superdatascience.com/394
John Peach joins to discuss his passion for bringing more scientific approaches to the data science field, making it smarter and more efficient. In this episode you will learn: John’s move from Canada to the US [3:37] John’s new position at Oracle [8:31] Data Science Workflows [9:34] John’s solution to data science workflow exploration [12:06] John’s data science design thinking framework [21:20] Case study [34:21] Literate statistical programming [43:12] R or Python? [51:55] Data unit testing [53:28] What drives John? [1:00:56] Additional materials: www.superdatascience.com/393
In this episode, I describe my morning ritual and discuss the importance of setting up a morning ritual for yourself. Additional materials: www.superdatascience.com/392
John Elder joins for an amazing podcast to share his data science "campfire tales" spanning over 20 years of his career in the industry. Incorporating some of these principles will definitely help you in your work. In this episode you will learn: John’s first bungee jump [4:01] Calculus and newer approaches [14:01] Elder Research [21:11] Domain knowledge advice [25:26] The importance of instincts [41:52] Ensembles and simplicity [59:33] John’s opinions on neural nets [1:10:49] Target shuffling method [1:17:27] What does the future of data science hold? [1:39:53] Additional materials: www.superdatascience.com/391
In this episode, I share a tip I came across this week about avoiding conflict in interpersonal relationships. Additional materials: www.superdatascience.com/390
Josh Hortaleza discusses how he’s become a juggernaut of an aspiring data scientist and powered through networking and internships to reach his goals in the field. In this episode you will learn: How did Kirill and Josh meet [8:26] Who is Josh? [12:42] Josh’s first internships [17:07] Being “good enough” and the luck factor [34:51] Josh’s goal [40:55] Genuine networking [43:08] Additional materials: www.superdatascience.com/389
In this episode, I share an awesome tip for anyone at any level around recruitment and headhunters. Additional materials: www.superdatascience.com/388
Lillian Pierson discusses her work on data leadership and how any data scientist can become a data leader in their organization or community. In this episode you will learn: Who is Lillian Pierson? [3:27] Winning With Data [6:08] Four superpowers of great data leaders [11:53] Benefits of developing these skills [17:27] Examples of quick win challenges in Winning With Data [19:34] Impact of COVID-19 [22:23] Where is the industry going? [28:26] Additional materials: www.superdatascience.com/387
Today, I explain cohort analysis and how this can be used for conversion metrics and tracking the customer journey. Additional materials: www.superdatascience.com/386
T. Scott Clendaniel joins to discuss advanced topics in data science and his forecasts for the future in this field. He also talks about the importance of soft skills for data scientists. In this episode you will learn: Who is Scott Clendaniel? [6:57] Scott’s role at Franklin Templeton [10:24] LinkedIn advanced Q&A [13:29] Tools that Scott uses the most [26:57] Target mean encoding technique [30:35] LinkedIn Q&A on models [33:11] LinkedIn Q&A on soft skills [54:04] LinkedIn Q&A on forecasts for the future [1:00:19] Hub and spoke model in Data Science Management [1:08:32] Scott’s advice for advanced data scientists [1:10:12] Additional materials: www.superdatascience.com/385
Today, I discuss best practices for data visualization and how to build on what we learned about cognitive load. Additional materials: www.superdatascience.com/384
Sean Casey joins to discuss his data science journey and how he’s used online courses, secondary resources, and the wider network to help him become a data visualization professional. In this episode you will learn: How Sean and Kirill met at DSGO Virtual [4:25] Sean’s experience at the virtual event [7:32] Sean’s journey [10:06] Do you need the credibility of a degree? [22:01] Sean’s supplemental readings [24:33] What can others do to replicate Sean’s success? [39:18] Sean’s advice for others just starting [50:15] Additional materials: www.superdatascience.com/383
Today, I discussed the types of cognitive load and how to best utilize them when imparting information through data. Additional materials: www.superdatascience.com/382
Tony Saldanha joins the podcast to discuss the realities of digital transformation and the steps companies must take to successfully transform in this fourth industrial revolution. In this episode you will learn: Tony’s book on digital transformation [2:51] What is digital transformation [8:30] Five stage framework of going through digital transformation [11:13] Case studies through the stages [16:43] Why do digital transformations fail? [27:26] VC portfolio approach [31:21] Tony’s consulting work and top tips [40:11] Change management and COVID-19 [44:44] Disruption vs. digital transformation [51:14] What does the future hold? [53:50] Additional materials: www.superdatascience.com/381
Today, I discuss the difference between a data analyst and data scientist and how you can join our team as a potential data analyst. Additional materials: www.superdatascience.com/380
Christopher Bishop speaks on the importance of career tactics in data science and how to prepare and move through the career path you want. In this episode you will learn: Who is Christopher Bishop? [5:18] How Christopher developed his advising framework [9:17] Why data scientists? [12:07] What is the Future Career Toolkit? [15:54] How to connect with people as an unknown data scientist [34:09] What's the intended outcome of the framework? [43:53] Additional materials: www.superdatascience.com/379
In this episode, I talk about the importance of the unconscious mind in decision making and how logic and reasoning may sometimes hinder you. Additional materials: www.superdatascience.com/378
Deborah Berebichez joins us to discuss her experience as a woman in STEM, her work with upcoming generations of women in STEM, and how she helps facilitate data science trainings. In this episode you will learn: Deborah's origins [4:21] Pursuing physics as a Jewish-Mexican woman [9:43] Deborah's work in helping women in STEM [23:10] How can companies also aid women in STEM? [28:10] How can individual data scientists work on creative thinking? [44:31] Deborah's work at Metis [48:26] Data literacy done the right way [1:00:33] The future of data science [1:04:55] Additional materials: www.superdatascience.com/377
In this FiveMinuteFriday, I talk about the need to widen your horizons, expose yourself to more varied disciplines and thought processes, and the benefits you can get in your work from doing this. Additional materials: www.superdatascience.com/376
Greg Pavlik joins me for a great talk about the current state of the cloud and how single practitioners and small businesses can take advantage of cloud services. In this episode you will learn: Will we have cloud-based solutions for VR and working from home? [8:15] Greg’s career journey [11:50] From Hadoop to Cloud [23:35] The cloud element in data science [30:17] Data science and AI in Oracle [33:00] Is Oracle more suited for larger companies only? [37:35] Fundamental differences between Oracle Cloud, Amazon, and Azure [42:12] Trends in data science and data management [45:14] Why should someone choose Oracle over any other open source? [52:50] What does the future of data management look like? [56:00] 5G and edge computing [1:01:36] Greg’s recommendation to data scientists [1:04:28] Additional materials: www.superdatascience.com/375
In this episode, I talk about an issue I’ve been having when it comes to phasing my mind out of work and into post-work activities, a concept called “attention residue”. Additional materials: www.superdatascience.com/374
Laurence Moroney sits down to talk about TensorFlow, its community, and his work educating developers in AI and machine learning. We talk about the explosive growth of the community and the great chance for career advancement for all developers, regardless of educational background. In this episode you will learn: Who is Laurence Moroney? [4:14] The importance of developers' focus on AI [8:21] What is TensorFlow and how can it help in AI? [15:53] Differences in TensorFlow editions [26:26] Careers and overcoming the fear of AI [31:14] TensorFlow community [48:46] What does the future look like? [54:40] Additional materials: www.superdatascience.com/373
Today, I talk about P-value and proper hypothesis testing as well as the importance of statistical significance. Additional materials: www.superdatascience.com/372
Anthony Metivier joins us again for an in-depth discussion about how memory and presence can boost productivity for people in their professional and personal lives. In this episode you will learn: Anthony’s technique for memorizing names [12:04] Anthony’s new book and concept of memory [15:45] Memory and productivity insights [31:44] Memory palace construction methods [37:30] How can memory techniques help a data scientist [1:01:30] Challenge frustration curve [1:07:28] Further advice and learnings [1:11:59] Additional materials: www.superdatascience.com/371
In today’s FiveMinuteFriday episode, I wanted to experiment with explaining support vector regression without the use of any visual aids. Additional materials: www.superdatascience.com/370
John Johnson joins me for a thoughtful discussion about the importance of data in the world of economics and business analytics. We discuss his academic and professional history up to his current work and how his company is sifting through economic data during the COVID-19 pandemic. In this episode you will learn: Living and working in Washington D.C. [4:11] John’s initial jobs before Edgeworth [8:41] Edgeworth's core values [12:01] Edgeworth Economics and Edgeworth Analytics case studies [16:57] Data analytics vs. data science [29:50] Parachuting into industries [36:06] Real analytics vs. “lip service” [42:11] HR business analytics [51:13] How much, as a business owner, should you rely on a consultant? [56:26] John’s advice to worried business owners [59:24] Additional materials: www.superdatascience.com/369
Today, I discuss the best ways to future-proof your career for the great restructuring of the workforce that technological advancements have already brought and will continue to bring. Additional materials: www.superdatascience.com/368
Samuel Hinton joins us again for an important and timely discussion on data pipelines and the work he’s doing to aid research on COVID-19 with the COVID-19 Critical Care Consortium. We also talk about his new online courses and his continued research into dark matter. In this episode you will learn: Sam’s current work and COVID-19 Critical Care Consortium [4:22] The COVID data science pipeline and workflow [12:50] Sam’s second online course [36:22] Bayesian inference [43:06] Sam at DSGO Virtual [53:30] Sam’s work on dark matter [1:01:25] What is Sam reading right now? [1:09:14] Additional materials: www.superdatascience.com/367
Today, I discuss a profound conversation we had with our team this month on success and how you can define your own success. Additional materials: www.superdatascience.com/366
Jon Krohn joins me to discuss his work at untapt in designing models for HR purposes. We also discuss the power of data science across fields of medicine and epidemiology, as well as the future of deep learning. In this episode you will learn: Coronavirus update in New York City [2:36] What brought Jon to New York? [5:38] Data science and coronavirus [12:50] Jon’s work at untapt [18:09] Techniques used to design models in untapt [22:02] untapt’s approach to explainability and bias [30:19] Jon’s other contributions to data science [38:10] Jon’s book and visual teaching styles [44:32] LinkedIn Q&A [52:05] Jon’s recommendation for becoming best at deep learning [1:13:09] Additional materials: www.superdatascience.com/365
Today, I’m talking with Anthony Metivier about practices to help your brain and body work through the stress of the pandemic. Additional materials: www.superdatascience.com/364
Piyanka Jain goes in-depth about the true power of data that can be unlocked when you combine intuition with data science practices and follow a hypothesis-driven framework to reach your project goals. Items mentioned in this podcast: The power of data plus intuition [5:29] BADIR framework for data science [12:36] What can students pick up from Aryng’s courses? [24:58] SWAT data science teams [34:16] The rate of successful projects [39:38] Four D’s of Data Culture [45:27] Decision science vs data science [49:17] Piyanka’s inspiration for her book [51:23] Additional materials: www.superdatascience.com/363
Today, I’m talking about an interesting topic I found in our own Data Science newsletter about the need for hybrid AI models in the future. Additional materials: www.superdatascience.com/362
John David Ariansen joins me for an episode on the best practices for getting into data science consulting, the importance of understanding data science and analytics, and how you can network, even during a pandemic. In this episode you will learn: Coronavirus and how it will affect the way we work [3:06] John David’s consulting work [8:19] Why did John get into consulting? [13:17] Does John David’s age affect his clients? [25:03] John David’s podcast [34:59] The difference between data science and analytics [40:26] Creating space for opportunities [49:54] 3 top tips for getting a job in data science [54:28] Additional materials: www.superdatascience.com/361
In this episode, I’m exploring some topics in proper sleep habits to help you keep good sleep schedules. Additional materials: www.superdatascience.com/360
Emily Robinson breaks down her new book “Build a Career in Data Science” by sharing what skills she focuses on exploring, who the data science field is for, and how to tackle interviews and negotiations. In this episode you will learn: Long-distance networking [5:58] Emily’s book [9:38] Who is the field of data science for? [14:02] Should newcomers use Python or R? [23:34] Five company archetypes [28:36] Approaching data science interviews and negotiating [31:25] How do you actually get an interview? [48:08] Emily at DSGO 2020 [57:07] Emily’s final take-home message [58:52] Where to buy Emily’s book and SDS discount code [1:04:26] Additional materials: www.superdatascience.com/359
In this episode, I’m discussing my personal experience with discrimination during a trip at the start of the pandemic and how it elevated my understanding of racism and discrimination beyond just a cognitive level. Additional materials: www.superdatascience.com/358
Tracy Crossley, a Behavioral Relationship Expert, talks about how you can explore yourself during this difficult time. We also explored how different relationship dynamics can be tested during a forced lockdown and how to avoid dangerous emotional pitfalls. In this episode you will learn: What work does Tracy do? [5:50] Tracy’s training [8:20] Tracy’s view on the consequences of the pandemic [12:55] Ways to tackle emotions during lockdown [17:14] Final advice to those struggling during lockdown [1:00:11] Additional materials: www.superdatascience.com/357
Today, I’m helping you explore working remotely, whether you’ve started doing this during the pandemic or you’ve recently been interested in and exploring remote-based jobs. I outline three advantages and three disadvantages to consider. Additional materials: www.superdatascience.com/356
DJ Patil talks about ethics in data science and the importance of data science communities working together to make sure data science is an accelerant of solutions for our children and our children’s children. In this episode you will learn: How does it feel to be the person who created data science as we know it now? [3:17] What data science is not [6:01] Ethics and data science development in different countries [10:00] What is the “biorevolution”? [16:02] The importance of data sharing [20:10] The current state of Chief Data Scientist of USA [24:07] LinkedIn Q&A [26:03] What to think about when you think about data science [44:08] Additional materials: www.superdatascience.com/355
Today I discuss a negative coefficient as a philosophical concept in problem-solving in your life. Do you make things worse by ignoring a problem or doing the wrong things to fix it? Additional materials: www.superdatascience.com/354
Brian T. O’Neill joins me for an insightful dive into how you can implement human-centric practices into your data work, whether you’re a consultant or individual contributor. There are ways and steps to workshop best practices in conversations with stakeholders. In this episode you will learn: Brian’s two lives [7:28] Brian’s human-first focal point [10:05] The process of Brian’s consulting work [17:07] How can an individual contributor be better at design thinking? [40:43] Walkthrough of Brian’s course and seminar [54:37] Additional materials: www.superdatascience.com/353
Today, we’re diving into our fifth and final part of our history of data science series by looking into data science’s future through the eyes of five of the most influential people in our space and how they see the next few decades. Additional materials: www.superdatascience.com/352
Stratos Hadjioannou is a freshly hired data scientist who is self-taught and made the jump to visit DSGO. He talks about his learnings, putting himself in a data science ecosystem, and how to tackle interviews with little experience. In this episode you will learn: Where did Stratos start? [6:16] How to keep the momentum for learning [12:20] Stratos’s goals [19:35] Planning the steps to getting a data science job [23:01] Triad for successful interviews [32:47] Application process [34:53] Experiences from the first data science job [45:51] Additional materials: www.superdatascience.com/351
Today, we take some time to discuss the real mental and emotional toll social distancing can take during the coronavirus. How can we effectively tackle each other's needs during this period? Additional materials: www.superdatascience.com/350
Brad Klingenberg talks about the unique way Stitch Fix uses algorithms and human-in-the-loop AI to generate excellent customer experiences and pull ahead of other retailers in the space. In this episode you will learn: Working in Stitch Fix [5:18] How does Stitch Fix work? [11:29] Stitch Fix algorithms tour [20:14] Open positions in Stitch Fix [36:25] Stitch Fix takeaways for other companies [39:07] Humans + machines [44:19] Stitch Fix global expansion [47:34] Future of personalization [50:16] Brad’s advice to data scientists [55:33] Additional materials: www.superdatascience.com/349
In the penultimate episode of our history of data science series, we look at 2015 on and watch as data science goes from being about hard skills and coding to being about ethics and progress. Additional materials: www.superdatascience.com/348
Kerri Twigg talks with me about her work in helping professionals talk about themselves and tell stories about their passions and professional work to land ideal jobs and propel their career trajectory. In this episode you will learn: Kerri at DSGO 2019 [6:09] Who is Kerri Twigg? [8:00] A case study from DSGO 2019 [9:51] How do you build a career story? [18:30] Kerri’s book and practices [32:35] How to prepare for interviews [43:22] 3 parts of a career story [56:53] Additional materials: www.superdatascience.com/347
In this FiveMinuteFriday we take a break from our series on the history of data science to discuss productivity and my top 5 hacks for getting more hours out of your day and week. Additional materials: www.superdatascience.com/346
I speak with Dan Shiebler, who works as a machine learning engineer at Twitter Cortex and, at the same time, is doing a Ph.D. on applying category theory in machine learning. We discuss his work at Twitter, the importance of academics, and the future of machine learning. In this episode you will learn: What is great about Twitter [5:31] Dan’s Ph.D. program [9:25] Dan’s work at Twitter [18:07] Dan at DSGO 2020 [35:16] LinkedIn Q&A [40:25] Dan’s advice [1:03:58] Additional materials: www.superdatascience.com/345
In the third of five episodes in this series, I journey through 2010 into 2015 to look at the boom of self-driving cars, the growth of data science as a profession, and the beginning of educational paths for future data scientists. Additional materials: www.superdatascience.com/344
I speak with Jose Quesada, founder and CEO of Data Science Retreat, about the purpose of his program to help data scientists learn and find jobs through a three-month retreat and portfolio project. In this episode you will learn: Overview of Jose’s current projects [5:55] "What if I don’t have a tech background?" [9:58] How does it work? [11:51] Program structure [21:24] Tips for picking a portfolio project [26:45] The program’s next intake [1:03:06] Additional materials: www.superdatascience.com/343
In the second of five episodes in this series, I take a step into the early 2000s and the true boom of data science as a profession and philosophy of study, as well as look at some of science fiction’s failed hopes for data science by this time. Additional materials: www.superdatascience.com/342
Brandon Rohrer joins me in this special episode about robotics, machine learning, and the merge of software and hardware to create innovative technology for homes around the world. In this episode you will learn: Brandon at MIT [7:41] iRobot [15:14] Moving from Facebook to iRobot [17:14] Brandon’s work in iRobot [20:18] Brandon as a data science influencer [30:08] Q&A [40:40] Additional materials: www.superdatascience.com/341
In this five-episode series, I dive into the history of data science from the beginning of mathematics to today. In this first episode, we start by looking in the 1950s and go up to the dawn of the 2000s. Additional materials: www.superdatascience.com/340
I sat down with my coach Ivor Lok to discuss the power and importance of coaching and how everyone can use it in their personal and professional lives to become happier. In this episode you will learn: Managing expectations [9:21] Personal beliefs & parenting [17:42] Value of having a coach [25:33] Mindset over skillset [37:24] Dream lists [51:06] Ivor’s new projects [1:03:20] Additional materials: www.superdatascience.com/339
I discuss an observation I had recently about how many photos we take, and how much we miss out on by focusing on capturing a moment rather than living it. Additional materials: www.superdatascience.com/338
Hadley Wickham, a huge presence in data science, sits down to talk about R, Python, and the future of potential integrations, as well as some Q&A with our listeners through LinkedIn about programming languages and how to make data science accessible for all. In this episode you will learn: Hadley’s R packages [8:26] Better integrations between R and Python [20:11] LinkedIn Q&A [33:34] useR Conference vs. RStudio Conference [50:46] LinkedIn Q&A: Career-related questions [1:01:06] LinkedIn Q&A: Future-related questions [1:08:01] Additional materials: www.superdatascience.com/337
I discuss something that popped up for me recently: is it better to have something finished or to have something be perfect? I explore the answer and what it can mean for you in your life. Additional materials: www.superdatascience.com/336
Rico Meinl failed when he tried to make a successful startup. He learned a lot from it and shared his story and learnings for nearly 2 hours in one of our longest and most insightful podcasts to date. In this episode you will learn: Rico at DSGO [8:50] Dresswell [17:10] B2B vs B2C in startups [34:03] Rico's 5 learnings [53:25] Learning no. 1 [53:54] Learning no. 2 [56:33] Learning no. 3 [1:10:43] Learning no. 4 [1:24:08] Learning no. 5 [1:34:02] Rico’s next steps [1:45:35] Additional materials: www.superdatascience.com/335
I return to the concept of no coaching in more detail and discuss how I recently had a good conversation with a friend without giving advice but offering empathy. Additional materials: www.superdatascience.com/334
Sinan Ozdemir is back again, this time talking about his work since his company Kylie.ai was acquired by Directly. We discuss his work, the way he is creating human and AI synergy and the future of NLP as it continues to progress. In this episode you will learn: Sinan’s company acquired [7:29] Explainable deep learning models [16:13] Airbnb case study [19:42] Microsoft case study [22:25] Sinan’s role at Directly [25:57] Work with Sinan [32:57] Preview of Sinan at DSGO [38:38] BERT [43:18] Sinan’s prediction for NLP in 2020 [53:17] Additional materials: www.superdatascience.com/333
I discuss the concept of putting yourself on autopilot and powering through getting work done when you feel like giving up. Additional materials: www.superdatascience.com/332
Harshal Sanap talks about how he took himself from a data science student and graduate to a full time professional in data science and shares mistakes to avoid to get started in your career. In this episode you will learn: Harshal at DSGO [8:12] Harshal’s first data science job [16:01] The process of getting your first job [21:25] 3 steps to data science job search [23:37] 4 tips on how to apply for jobs [36:59] 5 tips on how to prepare for an interview [53:21] 5 mistakes to avoid [1:10:31] Additional materials: www.superdatascience.com/331
I discuss finding the good in something that is objectively not so good and how you can take setbacks as a learning experience and challenge. Additional materials: www.superdatascience.com/330
Isaac Reyes talks about his approach to data visualization. We dive into the science behind it, the psychology, and the needs in businesses for proper and informed data storytelling. In this episode you will learn: Catching up with Isaac [6:37] StoryIQ's office in Manila [10:04] What is data storytelling? [12:29] The keys to data storytelling [15:47] Second key to data storytelling [18:36] Third key to data storytelling [21:35] Elementary Perceptual Tasks Scale [24:10] Gestalt principles [38:56] How does StoryIQ teach this? [48:35] Fourth key to data storytelling [49:20] Additional materials: www.superdatascience.com/329
In this week’s FiveMinuteFriday, I wish you all a happy New Year with an interesting story about having the choice to see the best in situations or see the worst in them. Additional materials: www.superdatascience.com/328
Hadelin and I outlined our top 5 trends in Data Science for 2020. We discussed why they’re hot topics and how companies can utilize them to drive profit and efficiency in the coming year. In this episode you will learn: The decade in review [1:45] A decade preview [5:20] 2020 trends webinar [7:30] Robotic process automation [9:00] Natural language processing [18:28] Reinforcement learning [26:35] Edge computing [37:25] Open source AI frameworks [52:02] Additional materials: www.superdatascience.com/327
This week’s FiveMinuteFriday and final episode of 2019 is about who inspires you and how it may be those closest to you without you even realizing it. Additional materials: www.superdatascience.com/326
I went over the 7 top learnings I took from this exciting year of ups, downs, and incredible adventures and explorations. In this episode you will learn: Dichotomies [6:18] F*ck FOMO [19:00] Full circle stress [27:23] Letting doors close [38:57] Managing my energy as an introvert [45:00] No coaching [56:15] Feelings [1:03:38] Additional materials: www.superdatascience.com/325
In this week’s FiveMinuteFriday, Vitaly and I talked more about a familiar topic: proximity is power. We discussed the importance of connection, how to not saturate, and how to decide with whom you spend your time. Additional materials: www.superdatascience.com/324
I chatted with top Upwork freelancer Wesley Engers who has worked over 150 jobs in data science. He’s worked in a variety of industries and shared a few of his most interesting jobs and offered advice for those considering diving into freelance data science work. In this episode you will learn: Wesley on Upwork [9:11] Wesley’s background [16:20] How Wesley onboards a client [26:32] Good clients vs. bad clients [31:12] Tools [37:23] Wesley’s best projects [45:09] Freelance vs. full-time work [59:26] Tips about getting into Upwork [1:06:48] Additional materials: www.superdatascience.com/323
In this week’s FiveMinuteFriday we are with Vitaly and Hadelin again and we are discussing our diets and how we maintain feeling healthy and good through food intake. Additional materials: www.superdatascience.com/322
I sat down with Morgan Mendis whom I met at DSGO this year. He is one of the most advanced data scientists I’ve met and he’s been using his skills and experience to give back to his community. We discuss his career, his dreams, his ideology, and his hunt for a VP of Data Science at his former company. In this episode you will hear: Catch up since DSGO 2019 [8:04] VP of Data Science at Inspire [12:00] Morgan’s career dreams [22:04] Morgan’s experience [30:50] Tools & solutions [1:01:33] How you can get involved [1:12:45] Additional materials: www.superdatascience.com/321
In this week’s FiveMinuteFriday I sat down with Vitaly and Hadelin to discuss the concept of mentorship and how we work through our professional and personal hurdles with mentors. Additional materials: www.superdatascience.com/320
I sat down with Jonathan and Ogo, two DataScienceGO attendees, who are experts in the field of data visualization. Their methods and backgrounds differ but ultimately they believe in the same goal: telling a meaningful story. Additional materials: www.superdatascience.com/319
In this week’s FiveMinuteFriday I discuss the concept of “fake it until you become it” and use of the word “amazing” when thinking about your current state and when people ask how you are. Additional materials: www.superdatascience.com/318
An incredible young guest is in this episode after he attended DSGO. Edis is a 15-year-old, building his own neural networks. We discussed his background, his process of building neural networks from scratch, Kaggle competitions, and the benefit of online data science education. Additional materials: www.superdatascience.com/317
In this week’s FiveMinuteFriday I discuss how best to handle disagreements by keeping your focus on yourself and your own actions. Additional materials: www.superdatascience.com/316
Back by popular demand is Gabriela de Queiroz to discuss various data accessibility issues and how her work, talks, and organizations are working to make data science and AI more available across the board. In this episode you will learn: What’s new with Gabriela [7:14] How does Gabriela find the time? [10:50] Gabriela at DSGO [16:45] IBM’s Data Asset Exchange [27:48] What is Docker? [44:32] Gabriela’s code background & team [46:34] R-Ladies [54:50] Additional materials: www.superdatascience.com/315
I asked the team what was one wish they had for our students on their data science journey. The answers are inspirational and encouraging for students at all levels. Additional materials: www.superdatascience.com/314
Marco Caviezel’s journey from research-based psychology into a career as a data analyst is really fascinating. He did his entire data education online and managed not only to teach himself machine learning and data visualization but also to get a job as a data analyst through his own work. Additional materials: www.superdatascience.com/313
Kirill and Mitja share some thoughts about one of the workshops at the SuperDataScience offsite retreat. They explore the practice of contemplation as a way to get a deeper understanding and insights. Additional materials: www.superdatascience.com/312
This episode with Daniel Obodovski explores smart cities and the importance of problem-solving from city to city by using data correctly. But solutions aren’t always obvious, privacy continues to be a huge issue for citizens, and not every city prioritizes problems the same way. It’s a fascinating topic. Additional materials: www.superdatascience.com/311
Kirill and Mitja share thoughts on purposeful “trials by fire” in your life and how you can force yourself to grow through intended adversity. Additional materials: www.superdatascience.com/310
A conversation between rival online educators in the data science community about the challenges of creating a worldwide community with millions of students, the trends in data science, and how education can keep up to date. Additional materials: www.superdatascience.com/309
A FiveMinuteFriday about the importance of belonging and how a connection to the larger community in the work that you do can be incredibly beneficial and meaningful for both your career and personal happiness. Additional materials: www.superdatascience.com/308
Kirill and Marc have a conversation that started as a quick FiveMinuteFriday discussion on thoughtfulness that turned into a full podcast worth of content on the power of thought, mindfulness, practice, and how even data scientists need to look past facts and information and follow their intuition. Additional materials: www.superdatascience.com/307
The Costa Rican phrase "Pura Vida" is something very important to think about because it is incredibly beautiful, filled with emotion and it is so powerful. What meaning would this phrase have for you, in your life? Additional materials: www.superdatascience.com/306
Jean-Pierre Labuschagne's career journey started in South Africa and moved to Europe, where he is bringing massive value with the power of data visualization. He is also teaching successful courses online after spending 2 years as a student of online courses himself. Additional materials: www.superdatascience.com/305
Can you think of examples when the law of attraction worked in your life? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/304
In this episode of the SuperDataScience Podcast, I chat with Astrophysicist and Online Data Science Instructor, Sam Hinton. You will hear about the Lindau Nobel Laureates meeting, where he met Nobel Prize winners and you will also hear about his appearance on the Survivor TV show. You will learn about quantum mechanics. You will also learn about the course he launched in Python for Statistical Analysis, as well as going in-depth on hypothesis testing. You will hear about Python versus R, statistical significance, why p-value of 0.5 is bad, Bayesian statistics, and what is the difference between frequentist and Bayesian approaches. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/303
What is Data Science to you? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/302
In this episode of the SuperDataScience Podcast, I chat with Data Scientist at TD Bank, Ayobami Ayodeji. You will hear Ayobami's valuable insights about the takeaways from DataScienceGO 2019, including productization of data science products, the 3 types of data science teams, and building character and resilience. You will also learn about Ayobami's career journey from project manager to data scientist and the sacrifices he made on that journey. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/301
What are you leaving for the next generation on this planet? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/300
In this episode of the SuperDataScience Podcast, I chat with Head of Data Science and Machine Learning, Michelle Keim. You will hear what working remotely is all about in data science. You will learn about the importance of failure, and why everyone should lose their job at least once. You will hear about churn and segmentation, what they meant 10 years ago and what they mean now. You will also learn about the imposter syndrome and what to do when you feel like an imposter while applying for a role. You will hear about moving from centralized data science teams to integrated experts within the business and leading people on the three key learnings that Michelle has taken away from her experience as a leader. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/299
What would you change about the things you do in your life if you thought you only had 6 months to live? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/298
In this episode of the SuperDataScience Podcast, I chat with data scientist Ayodele Odubela. You will hear how and why she chose to do a Masters in Data Science and supplemented that with online education. You will also hear about self-discovery, fortitude and passion, and how she got one of her data science jobs through Twitter. You will learn about some of Ayodele's projects, like using SVM for detecting poisonous vs. edible mushrooms, using random forests and decision trees for ranking wines based on their chemical contents, and using Naive Bayes to detect spam. You will learn about the real-world project that she's worked on, bullet stopping flying drones. You will find out what role machine learning played in that project and how the drones are going to be applied in society once they get rolled out. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/297
Do you take time to reflect on who you became or actions you took while on a path to achieving a goal? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/296
In this episode of the SuperDataScience Podcast, I chat with my friend and business partner, Hadelin de Ponteves. You will hear what new exciting things are happening in Hadelin's life now. You will hear a preview of his upcoming presentation at DataScienceGO 2019, which will cover NLP, especially the BERT model, which raised NLP to a whole new level. You will also learn about reinforcement learning and Hadelin's new course on Twin-Delayed DDPG. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/295
What about AI worries you in the professional world? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/294
In this episode of the SuperDataScience Podcast, I chat with Data Scientist, Peyman Hesami. You will find out what reinforcement learning is and how it works on an intuitive level. You will hear about the differences between reinforcement learning versus classification, or other supervised learning methods, and how it's used for personalization specifically. You will learn about six distinct advantages of reinforcement learning, what role reinforcement learning is going to play in the future of machine learning and why. Also, you will find out how and why Peyman made a career transition to work for a startup, how he's using reinforcement learning, and what is the biggest mistake he has made with reinforcement learning. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/293
How can you find a way to balance your energy through recharging in the way that works best for you? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/292
In this episode of the SuperDataScience Podcast, I chat with founder and CEO at Daisy Intelligence, Gary Saarenvirta. You'll learn about dangerous implicit assumptions, the power of theory and theory versus data. You'll also learn about two types of decisions, the spatial interaction model, traffic flow model, the concept of dividing the world in two and what humans should be doing, and what artificial intelligence should be doing. You will hear about the difference between artificial intelligence that leverages just data versus artificial intelligence that leverages theory and data, and what advantages that creates. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/291
How does your inner voice compare to your passions? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/290
In this episode of the SuperDataScience Podcast, I chat with top AI influencer, Ben Taylor. You will learn some very cool concepts about artificial intelligence such as active adverse impact mitigation, what that means and how that can help train on your dataset without bias. You will hear about AI ethics, deepfakes and Ben's current passion project, building an artificial intelligence that plays Call of Duty, which he will actually demonstrate at DataScienceGO this year at the end of September. If you enjoyed this episode, check out the video, show notes, resources, and more at www.superdatascience.com/289
Can you pick one activity to implement for yourself this week to engage in loving yourself? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/288
In this episode of the SuperDataScience Podcast, I chat with a data scientist, public speaker and the mastermind behind the Let’s Go Data Science meetups, Ashwin Chirag. You will learn why it is very important to attend meetups and what benefits and advantages you get from them. You will hear some great stories on how in-person connections with data scientists can take your career to the next level. You will also hear about Ashwin’s experience in standup comedy and what it has taught him. And finally, you will hear how attending meetups and conferences such as DataScienceGO has changed the trajectory of Ashwin’s life. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/287
Can you give yourself an hour this weekend to physically separate from your phone? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/286
In this episode of the SuperDataScience Podcast, I chat with the top contributor on Stack Overflow, Jon Skeet. You will learn what versioning is, how it affects developers and how it affects data scientists. You will hear about compiled versus interpreted languages, what the silver bullet in code diagnostics is, what kind of problems you want to diagnose and the 'divide and conquer' principle. You will also hear about the importance of community, what it means to be part of a community and how communities grow, and what you can do as a data scientist to make our community more inclusive and welcoming so that it prospers further. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/285
Who in your life can you get more inspiration and learning from by increasing your proximity? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/284
In this episode of the SuperDataScience Podcast, I chat with one of the key people behind the Python package scikit-learn, Andreas Mueller. You will learn about gradient boosting algorithms, XGBoost, LightGBM and HistGradientBoosting. You will hear Andreas's approach to solving problems, what machine learning algorithms he prefers to apply to a given data science challenge, in which order and why. You will also hear about problems with Kaggle competitions. You will find out the four key questions that Andreas recommends to ask when you have a data challenge in front of you. You will learn about his 95% rule to creating models, and creating success in business enterprises with the help of machine learning. And, finally, you will also learn about the Data Science Institute at Columbia University. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/283
What can you spend time learning that’s new to you and how can it help your lifestyle and career? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/282
In this episode of the SuperDataScience Podcast, I chat with the Founder and Director of Digital Strategy at Webfor, Kevin Getch. You will learn what digital assistants are and where they're going with the help of people like Ray Kurzweil at Google. You will hear Kevin's philosophy on 'what gets measured gets managed' and what it means for marketing and data science. You will also learn why websites are less and less important, how segmentation is slowly transitioning to personalization, creating amazing customer experiences, DISC profiles, natural language processing, and computer vision and their role in the future of marketing. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/281
Is there a period in your life you can look at and feel grateful for your freedom and experience? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/280
In this episode of the SuperDataScience Podcast, I chat with Head of Data Science at Scribd, Kevin Perko. You will learn what it's like to be a data science manager, or a data science leader, and what it's like to manage a team, and more so two teams, in two different locations, and how that is different to actually doing the technical work. Also, you'll learn about the Book Genome Project at Scribd, what it's like when a company sees data science as a product, as opposed to an auxiliary function, and a very valuable concept of decentralized, or embedded teams, versus core data science teams and the advantages and disadvantages of each approach. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/279
What do you do extremely well that no one else around you does quite as well and how can you leverage it? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/278
In this episode of the SuperDataScience Podcast, I chat with Serial Entrepreneur, Khai Pham. You will learn why data science is an advantage in terms of mindset even to be an entrepreneur. You will hear about general artificial intelligence versus superintelligence, what are the differences and why you don't really need general artificial intelligence to get to superintelligence. You'll also learn how questions are more important than answers, and hence the reasoning engine versus a search engine. You'll hear about Khai's experience in becoming a founder of companies. You'll learn what the whole idea of reasoning is and why companies need to move from data-driven and machine learning-driven to reasoning-driven. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/277
How can you use these processes in your industry? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/276
In this episode of the SuperDataScience Podcast, I chat with the Machine Learning Research Scientist, John Langford. You will hear about unsupervised learning, supervised learning and reinforcement learning, and the differences between the three. You will learn about applications of contextual bandits and reinforcement learning in general, YOLO style algorithms versus simulator algorithms, and techniques for avoiding local optima. You will also learn about the balance between exploration and exploitation, learning to search and active learning. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/275
What has happened to you recently that could call for clarity from a mentor or coach? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/274
In this episode of the SuperDataScience Podcast, I chat with Matthew Rosenquist, one of the world's leading experts in the space of cybersecurity. You will learn what balance in cybersecurity means and what the dark web is. You will hear how Matthew's career developed and how he thinks about the strategy of cybersecurity. You will also learn about the valuable role of data science in cybersecurity and the steps you can take to get into this space. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/273
How can you use data science to help keep the energy industry helpful and ethical towards the planet and future generations? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/272
In this episode of the SuperDataScience Podcast, I chat with the legend of visual journalism, Alberto Cairo, who talks about understanding if your data is measuring the right thing that you want to be measuring. You will hear about Simpson's paradox and the ecological fallacy. You will learn about the four kinds of literacy, exploratory data analysis versus communicating results, and how to create a narrative structure in your visualization and convey the insights in a certain way so that people can better understand them. And finally, you will hear about ethics in data visualization. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/271
How can you leverage this to help you focus on your work in data science? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/270
In this episode of the SuperDataScience Podcast, I chat with Justin Fortier, the principal data scientist at ViralGains. You will hear about ad tech, performing insights, getting insights, and making decisions within milliseconds with data science. You will learn about the business impact, why business impact is ultra-important, and on the other hand, why user experience is an ultra-important factor for a data scientist to consider more and more in today's world. You'll also learn about Justin's path from managing data scientists at large organizations, to being the single data scientist at smaller startups and you will hear some very interesting decisions he made throughout his career. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/269
How can you see data science disrupting and progressing the insurance industry in the future? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/268
In this episode of the SuperDataScience Podcast, I chat with Manasi Vartak, founder and CEO at Verta.ai. You will learn about model tracking, versioning, and maintenance. You will also learn what data maturity means and what the 3 areas are in which top-tier data science teams are investing. You will hear a great discussion about the boom that will happen with machine learning in the next 3 years and what you can do to prepare your career or your business for the future. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/267
Where are you overexploiting patterns in your life and where can you be more open to some exploration? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/266
In this episode of the SuperDataScience Podcast, I chat with Frank Kane, an expert and top instructor in the field of big data, who also worked at Amazon for 10 years. You will learn how data science and big data used to be different fields but are now converging and becoming closely intertwined. You will hear about recommender systems such as user-based and item-based collaborative filtering as well as other types of recommender systems and where this space is going. You will also hear about singular value decomposition or SVD model-based methods, deep learning and Amazon DSSTNE. And in the end, you will learn some very valuable tips on how to get hired by big companies like Amazon. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/265
How do you see data science continuing to improve the world’s most important industry? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/264
In this episode of the SuperDataScience Podcast, I chat with Eoin Murray, the founder of Kyso.io, a platform where you can blog about your data science projects using tools such as Jupyter notebooks. You will learn what the platform means for data scientists and how you can use it to build your online presence and online portfolio. You will hear about startups and how you can jump into creating a startup, what accelerators are, what angel investors are, and what venture capital funds are. You will also hear where data science is going and whether or not data science should be a certified profession. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/263
How can you put a routine on a goal you’re trying to achieve? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/262
In this episode of the SuperDataScience Podcast, I chat with Andrei Lyskov, a data science writer, who shares not only his experience but also his research and his thoughts and ideas in the space of getting a job in data science. You will learn about the trichotomy of control and the stages involved in job interviews in data science. You will also learn about the importance of referrals and portfolio, and you will hear about learning how to learn. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/261
How can we, as data scientists, scale the existing technologies in real estate? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/260
In this episode of the SuperDataScience Podcast, I chat with Stephen Welch, a computer vision and neural networks expert who will share a ton of information about the space of self-driving cars. You will learn about self-driving cars starting from the history of neural networks and how that was associated with self-driving cars from the '60s, '70s, '80s and all the way until now. You'll also learn about autonomous driving and the three components in the neural networks related to autonomous driving and what they are and how they work. You will find out about the five different levels of autonomous driving and what to expect in the next 10-20 years. You will hear a case study of how machine learning can be applied to historically older industries and you will also hear some very valuable career advice. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/259
How can you work S.L.O.W.L.Y. into your daily routine? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/258
In this episode of the SuperDataScience Podcast, I chat with Melanie Mitchell, one of the leading researchers in the field of AI. You will learn about complexity, what it is and how it works, and how it can be seen in different areas of life. You will hear about common sense, meta-cognition, explainable AI, and you will also hear Melanie's ideas and thoughts on the future of AI, which break down into two areas which you'll find out in this podcast. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/257
How do you see data science assisting in the transportation industry in the future? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/256
In this episode of the SuperDataScience Podcast, I chat with the founder of PyImageSearch.com, Adrian Rosebrock, who gives us a great overview of the space of computer vision. You will learn what computer vision was in the past, what it is now, and most importantly, what it will be in the future and what you need to prepare for if you're interested in computer vision. You will also learn about OpenCV and how to quickly get started with it as it is one of the most popular libraries and tools for computer vision in the world right now. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/255
What negative and positive emotions do you find yourself acknowledging most often? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/254
In this episode of the SuperDataScience Podcast, I chat with Associate Professor at the University of California San Diego, Bradley Voytek, who was the first data scientist at Uber and the person who kickstarted data science there. You will hear a lot about his past work at Uber and his current work at UCSD. You will learn 4 valuable philosophical points about data science being a separate field and you will also learn what data science skills can help you resist automation. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/253
What other problems in construction can you think of that data science could offer a solution for? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/252
In this episode of the SuperDataScience Podcast, I chat with the CEO and Data Scientist at TypingDNA. You will hear about a brand new industry which is transforming everything we know about security. You will learn about typing biometrics and how it works and how machine learning and data science enable this industry to go forward. You will also hear what it is like to run a data science startup and how to go from an idea to research and finally to creating a business. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/251
How can you pivot your way of talking about actions and yourself to make negative feelings productive? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/250
In this episode of the SuperDataScience Podcast, I chat with the CEO and Co-Founder of SFL Scientific, Michael Segala, who is joining us for the second time to share his overview of data science consulting and data science projects overall. You will hear some amazing case studies which include healthcare imaging, logistics and supply chain, and the space of energy. You will also learn about the challenge of small data and how to deal with unbalanced data sets. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/249
How has data science implementation in government helped improve your community? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/248
In this episode of the SuperDataScience Podcast, I chat with Pablos Holman, the famous hacker, inventor, and entrepreneur. You will hear how artificial intelligence is impacting the world, what Maslow's hierarchy of needs is, and how that is affected by technology. You will also hear what roles Data Science and machine learning are playing in the future of the world. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/247
How will you prepare yourself for imperfection going forward? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/246
In this episode of the SuperDataScience Podcast, I chat with the Seasoned Executive Luis Blanco about his amazing career journey from which you will gain very valuable insights for your career development. Also, you will learn about fact-based decision-making cultures, how to create and nurture them and you will also learn about cross-departmental work and sharing models between departments. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/245
How do you experience data science in your everyday use of entertainment media? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/244
In this episode of the SuperDataScience Podcast, I chat with the Senior Underwriter Dominic Roe about his actuarial work and the pioneering and implementation of a risk assessment method he’s utilized for insurance companies that’s now widespread across Australia. You will hear a very detailed explanation of how he built his model. You will also learn about geodemographic segmentation and hear some very interesting use cases of geodemographic segmentation in data science. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/243
Do you use meditation to more efficiently filter your thoughts during both work and leisure time? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/242
In this episode of the SuperDataScience Podcast, I chat with Dr. Guillermo Cecchi about the role of data science in medical research and maybe even the future of artificial intelligence. You will learn how data science and artificial intelligence are pushing the boundaries of mental healthcare. You will hear some very interesting approaches about getting insights from audio samples of patients’ voices and their speech and you will also learn about the development of some fascinating techniques, like transferring intuitive knowledge from professionals in the healthcare field into algorithms. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/241
How would you approach adopting AI into your business? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/240
In this episode of the SuperDataScience Podcast, I chat with the Data Science Headhunter and Head of Analytics Recruitment at IT Search and Selection, Adrian Clarke. You will learn about the state of the data science industry globally and the different data science roles that exist in the world. You will also learn what to expect in terms of data science salaries and why there is a huge demand for data scientists. You will hear about the concept of the hybrid professional and a lot of valuable career insights. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/239
What aspect of banking can you see making the best use of data science? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/238
In this episode of the SuperDataScience Podcast, I chat with the principal attorney, Jessica Merlet, who is an extremely experienced lawyer and gives us an excellent breakdown of what GDPR is. You will learn all about GDPR, from requirements for capturing and storing to processing data under GDPR. You will also learn some important terms such as data controller, data processor, affirmative consent, and sensitive information, and you will learn what the 4 pillars of GDPR and the 6 legal bases for capturing and processing data are. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/237
What negative emotion has been difficult for you to control lately? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/236
In this episode of the SuperDataScience Podcast, I chat with the Data Science Consultant, Nic Ryan, who does a great job in combining the technical and consulting/mentoring part of data science in his career. You will hear Nic's journey of creating his remote career and a balanced work-family relationship, along with some valuable tips you can apply to your own career. You will also hear about natural language processing and some examples of his work. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/235
What challenges have you seen lately in educational institutions where data science or AI can help? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/234
In this episode of the SuperDataScience Podcast, I chat with the Director of Data Science at Red Bull, Josh Muncke, who gives us some very valuable insights. You will hear a couple of case studies of how Red Bull uses Data Science, you will learn how Data Science Leadership is an important area for businesses in Data Science and you will also learn about the importance of asking good data questions. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/233
How do you approach your roadblocks? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/232
In this episode of the SuperDataScience Podcast, I chat with the professional data visualizer, Mollie Pettit. You will learn about the difference between a data scientist and a data visualizer. You will also learn about the D3.js javascript library, when to use it and how you can benefit from using it. You will hear one of Mollie's case studies and how she participates in projects that use Data Science to contribute to the world. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/231
How will SuperDataScience 2.0 serve you now and in the future? SuperDataScience 2.0 is available at: www.superdatascience.com/yes If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/230
In this episode of the SuperDataScience Podcast, I chat with the Co-Founder at Cursor, Adam Weinstein. You will hear Adam's journey from working at LinkedIn to founding his own company. You will learn about the concepts of Data Literacy and Citizen Data Scientist, how Cursor can help you on this journey and what it means for an organization to be Data Literate and Data Driven. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/229
What more can you add to the list of how the mining industry can benefit from data science? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/228
In this episode of the SuperDataScience Podcast, I chat with the Data Science Influencer, Sarah Nooravi, who inspires the data science community in many different ways. You will hear about Sarah's background and interests, from culinary chef to nuclear fusion, and how she found her way to data science. You will also hear a specific case study of data science in the marketing and mobile gaming industry and Sarah's role in it. You will learn about diversity in data science and how the community can help inspire data scientists to be successful. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/227
What do you consider your ‘flat tyre’ recently and how did you approach it? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/226
In this episode of the SuperDataScience Podcast, I chat with the Business Development Specialist at Velocity Group, Anna Foard, who has built an amazing data science career in a very short time while also being a mother of two children. You will hear how she is challenging and conquering the field of Data Science from many different perspectives. You will learn about her strategic approach to her career and how she has managed to meet and work on projects with many of her heroes in Data Science. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/225
What great books can you suggest for others to read this year? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/224
In this episode of the SuperDataScience Podcast, Hadelin de Ponteves and I discuss the accuracy rate of the predictions we made for 2018. We also discuss the key AI and technology trends to look out for in 2019 which can help you structure your career and design your path through technology. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/223
What key trend do you think will continue in the year 2019? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/222
In this episode of the SuperDataScience Podcast, I reflect on 2018 and share the top 7 things I learned this year. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/221
How does data science help you most when you do your shopping? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/220
In this episode of the SuperDataScience Podcast, I chat with the Vice President of Measurement and Evaluation at Kaplan, David Niemi. You will hear how David applies data in the space of education in order to extract insights and understand how the learning journey can be improved. You will learn some very valuable tips about learning and you will also hear how the education industry is booming and is going to keep growing. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/219
What’s the first thing you’ve written on the list of things that you should be grateful for? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/218
In this episode of the SuperDataScience Podcast, I chat with the aerospace engineer, Carlos Hervás García, who works for Airbus. You will hear about Aerospace and Orbital Mechanics, the International Space Station and what aerospace engineers actually do. You will learn how Data Science, Machine Learning, Deep Learning, and Artificial Intelligence can be used in Aerospace Engineering and what value these technologies bring to the field of Interplanetary Travel. You will also hear how Carlos combined his passions for Aerospace and Artificial Intelligence in his career journey. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/217
What do you think is the most significant application of data science in the healthcare industry so far? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/216
In this episode of the SuperDataScience Podcast, I chat with the full stack web developer and aspiring data scientist, Brian Dowe. You will hear what it is like to go from developer to data scientist and how to integrate data science in your career as a developer. You will learn about the development, deployment and maintenance life cycle of models in business and you will hear some ideas on modeling in general. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/215
What in your life is super amazing right now? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/214
In this episode of the SuperDataScience Podcast, I chat with the legends of visualization, Andy Kriebel and Eva Murray. You will learn about visualization and why it is important for data scientists and machine learning experts to know how to visualize data. You will also hear some amazing tips from their brand new creation, MakeoverMonday the Book. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/213
Are you willing to make the digital shift to be a model-driven business? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/212
In this episode of the SuperDataScience Podcast, I chat with the software engineer at RStudio, Javier Luraschi, who has a ton of passion for RStudio and developing packages, Apache Spark, Big Data and Big Compute. You will learn about Big Data and the whole history of Big Data, about Apache Spark, why it was created, what it is used for, how it compares with Hadoop, how it has developed over time and how it’s developing now. You will also hear about package development in RStudio and some exciting things happening in that space. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/211
What area in your life do you think you’ve plateaued? What will you do to level up your game in this area? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/210
In this episode of the SuperDataScience Podcast, I chat with the aspiring Data Scientist, Rio Branham, who now has a full-time job as a data scientist after just 1 year in the field. You will learn about the difference between data science and econometrics, what kind of tools and techniques Rio uses and what his aspirations are. You will also hear about Rio’s approach to learning which, together with Rio's mindset, makes this a very inspirational podcast episode. Rio makes a reckless commitment for 2019 right on this podcast. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/209
What talk are you particularly looking forward to listening to again? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/208
In this episode of the SuperDataScience Podcast, I chat with the Data Science Influencer, Data Science Podcast Co-Host and future Data Science Author, Kristen Kehrer. You will hear Kristen's valuable tips about using different data science tools, especially R and Python and how they complement each other, you will learn about the technical skills that actually add business value and you'll hear valuable tips on how to better structure your resume. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/207
What online applications are your favorites? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/206
In this episode of the SuperDataScience Podcast, I chat with the Data Science Influencer and Data Science Author, Kate Strachnyi. You will hear about Kate's journey to becoming a data science influencer, learn how she is massively contributing to the data science community, and get insights on how to create a culture of data science within an organization. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/205
What massive goal are you looking to achieve lately? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/204
In this episode of the SuperDataScience Podcast, I chat with the aspiring data scientist, Sasha Prokhorova. You will learn about Sasha's background experience and journey to Data Science, get valuable tips on how to get recognized and get job offers in the space of Data Science, and also get to hear some of the key things that she had learned from the speakers during the DSGO 2018 event. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/203
In what ways can you make your ideas valuable? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/202
In this episode of the SuperDataScience Podcast, we will listen to the Panel Discussion on emerging technologies held during the DataScienceGO 2018 event in San Diego, California, last October 12-14. Our panelists were 4 valued figures in the space of Data Science: the Senior Solution Architect at NVIDIA - Mark Skinner, Manager of Data Science at TrueCar - Rachel Wang, Chief AI Officer at Ziff, Inc - Ben Taylor, and the world-renowned speaker, inventor, hacker, and entrepreneur - Pablos Holman. You will listen to a very insightful panel discussion we had during DSGO where we covered many topics including Blockchain, AI, Deep Learning, Machine Learning and disruption, startups, and a lot more. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/201
Where do you think the field of Data Science is going in the coming years? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/200
In this episode of the SuperDataScience Podcast, I chat with Associate Professor at NYU Stern School of Business, Kristen Sosulski. You will hear about Kristen's experience in visualizing data, how she learned to communicate findings and present data visualization insights, and how she has helped people learn to read data, charts and graphs. You will also learn why Data Visualization is an entry pathway into Data Science and get to know the top tips Kristen has been practicing in order to be successful in this field. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/199
How do you think the concept of the 2-millimeter shift can impact you? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/198
In this episode of the SuperDataScience Podcast, I chat with the Epic Life Coach, Carl Massy. This episode is a total pause from the Data Science world as we talk about the secret to happiness. You will hear what led Carl to follow his passion for helping people find inner happiness, learn how to establish daily rituals to have a solid foundation, and get the 3 valuable tips on how to change your state of mind. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/197
How will you play your part in addressing the diversity gap in Data Science? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/196
In this episode of the SuperDataScience Podcast, I chat with the Founder of R-Ladies, Gabriela de Queiroz. You will hear about the R-Ladies organization and how it was founded, discuss the gender diversity problem in the data science field, and learn about the different skill sets Gabriela acquired during her previous careers that she was able to carry over as she moved towards being a data scientist. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/195
Would you consider switching to a vegan lifestyle? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/194
In this episode of the SuperDataScience Podcast, I chat with the Artificial Intelligence expert, Roman Yampolskiy. You will hear about Artificial Intelligence safety, hear how AI is going to quickly take over in the coming years and why we have to prioritize AI safety for safer machines, and also get valuable advice on how to start a career in AI, for business owners, and for professionals on how to get to high-end jobs. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/193
What do you think is your greatest weakness? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/192
In this episode of the SuperDataScience Podcast, I chat with the Founder of the Machine Learning Society and the creator of the CO network, Tristen Blake. You will hear about the upcoming Data+Science collaboration platform - the CO Network, know the reason for the existence of the Machine Learning Society, hear about San Diego being the next Silicon Valley, and also get to hear Tristen's courageous stories from having no Data Science background and now breaking into Machine Learning by seeing the value of this field. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/191
What do you look forward to at this year's DataScienceGO 2018 event? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/190
In this episode of the SuperDataScience Podcast, I chat with the Data Science Mentor, Randy Lao. You will learn how Randy Lao made a big rise on LinkedIn starting out with 45,000 followers and several Data Science jobs in just one year. You will also hear about the whole process of getting a job in Data Science from preparing your portfolio to getting data science jobs & opportunities, and learn about the one thing that Randy is very passionate about, which is the "end goal of Data Science". If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/189
In what ways do you see yourself filling in the data science gap? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/188
In this episode of the SuperDataScience Podcast, I chat with the Principal Data Scientist at OXXO, Favio Vázquez. You will hear about Data Science finally becoming a Science, learn the 5 characteristics of Data Science projects, and know the 3 different kinds of people who explore Data Science. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/187
What are your personal experiences relating to the concept of knowledge and execution? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/186
In this episode of the SuperDataScience Podcast, I chat with the President of McKnight Consulting Group, William McKnight. You will learn about the right pre-requisites to have the right analytics organization, learn the 4 components of data maturity of an organization, and also find out the 5 different stages an organization can be at. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/185
In what situations in your life can you apply the concept of FIFO or LIFO? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/184
In this episode of the SuperDataScience Podcast, I chat with the Founder and Chief Technology Officer of CirroLytix Research Services, Dominic Ligot. You will learn the 3 verticals in starting your own analytics business, hear where the world is going in the space of data analytics and know the demands for analytics from the industries and businesses. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/183
In what ways do you think you can give back to the Data Science community? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/182
In this episode of the SuperDataScience Podcast, I chat with Founder and managing partner at velocitygroup.io, Tim Lafferty. You will hear about Tim's insights from his consulting experience in the space of Data Science, learn about the 3 main aspects of the life cycle of a successful data project, and also know the 10 things that a degree won't teach you. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/181
In what situation in your life have you applied the value of essentialism? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/180
In this episode of the SuperDataScience Podcast, I chat with Data Science recruiter, Matt Corey. You will hear about Matt's role as a Data Science recruiter, get interesting tips and insights about what recruiters look for in candidates for a data science position, and you will learn what to expect from a recruiter and how intricate the role of a good recruiter in Data Science is. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/179
For what purpose do you usually use your data visualization skills? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/178
In this episode of the SuperDataScience Podcast, I chat with the aspiring data scientist, Zach Loertscher. You will hear about the current state of Data Science education, learn how you can build your data science career through self-learning using the available resources online, learn how to build your portfolio through LinkedIn, and how to give back to the Data Science community. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/177
Have you ever tried conveying facts to anyone by telling them as a story? How did it work? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/176
In this episode of the SuperDataScience Podcast, I chat with the President and Editor at KDNuggets, Gregory Piatetsky-Shapiro. You will hear about the recent advancements in Data Science, learn how Reinforcement Learning can greatly improve AI capabilities, and learn about the noticeable growing trend of automation in the field of data science. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/175
When was the last time that you disconnected from technology? If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/174
In this episode of the SuperDataScience Podcast, I chat with the founders of Cigen, Leigh and Daniel Pullen. You will listen to an overview of RPA, or Robotic Process Automation, and hear what it’s all about, how it works, how business gets value from it, and also how people can learn this and integrate it into their careers. You will learn the different RPA tools or global tools that any business can use across any application. Also, you will learn about the concept of Robots-for-Hire and how it is valuable for entrepreneurs. If you enjoyed this episode, check out show notes, resources, and more at www.superdatascience.com/173