Link to lecture recording on YouTube
Date: 2025-04-07
Speaker: Kaiyu Yang 杨凯峪
Speaker’s social profile: Website / Google Scholar / GitHub / LinkedIn / X (Twitter)
Education:
- Ph.D. in Computer Science, 2018-2022, Princeton University, advised by Prof. Jia Deng
- M.S. in Computer Science, 2016-2018, University of Michigan
- B.Eng. in Computer Science, 2011-2016, Tsinghua University
- B.S. in Mathematics, 2012-2016, Tsinghua University
Work:
Notes
AI arms race: leading companies release a new model every few months, and math and coding are at the center
Why math and coding?
- proxies for complex reasoning and planning
- important in human intelligence, challenging for LLMs
- unlimited applications: travel planning etc.
- relatively easy to evaluate
  - math: check answers
  - coding: run unit tests
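To make the "easy to evaluate" point concrete, here is a minimal sketch of the two kinds of checks. The function names `check_math_answer` and `run_unit_tests` are hypothetical and were not used in the lecture. The sketch also assumes math answers are simple numbers written as strings; real benchmarks need more careful answer normalization.

```python
from fractions import Fraction


def check_math_answer(predicted: str, reference: str) -> bool:
    """Compare two final answers as exact rationals; fall back to string equality."""
    try:
        return Fraction(predicted.strip()) == Fraction(reference.strip())
    except (ValueError, ZeroDivisionError):
        # Not a plain number (e.g. a symbolic expression): compare the raw text.
        return predicted.strip() == reference.strip()


def run_unit_tests(candidate, test_cases) -> bool:
    """Return True only if the candidate passes every (args, expected) case."""
    for args, expected in test_cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crash counts as a failed test.
            return False
    return True
```

Both checks return a plain pass/fail signal without a human in the loop, which is what makes math and coding convenient to evaluate at scale.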
Math and coding problems are deeply connected
Two main techniques to train LLMs to solve math problems:
- supervised finetuning (SFT): “good data are all you need”
  - start with a base model: a generic language model pre-trained on Internet-scale datasets
  - give it additional math-related web documents (e.g., webpages from MathOverflow, papers from arXiv)
  - the result is a base math model: it has seen a lot of mathematics but has yet to learn how to solve problems
  - next step: give it problems with detailed step-by-step solutions
    - optionally, the solutions can use external tools
  - training data matter most: problems + (step-by-step, tool-integrated) solutions curated by humans and LLMs
- reinforcement learning (RL): “verifiability is all you need”
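The SFT training data described above (problems plus step-by-step, tool-integrated solutions) can be pictured as simple records. The following is only an illustrative sketch: the `Step` and `SFTExample` classes, the `[tool call]` / `[tool output]` markers, and the text layout are assumptions made for this example, not a format given in the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    text: str              # natural-language reasoning for this step
    tool_code: str = ""    # optional code sent to an external tool (hypothetical)
    tool_output: str = ""  # what the tool returned for that code


@dataclass
class SFTExample:
    problem: str
    steps: list = field(default_factory=list)
    answer: str = ""


def to_training_text(ex: SFTExample) -> tuple:
    """Return (prompt, target); SFT trains the model to produce target given prompt."""
    prompt = f"Problem: {ex.problem}\nSolution:\n"
    lines = []
    for i, step in enumerate(ex.steps, start=1):
        lines.append(f"Step {i}: {step.text}")
        if step.tool_code:
            # Tool-integrated step: record both the call and its result.
            lines.append(f"[tool call] {step.tool_code}")
            lines.append(f"[tool output] {step.tool_output}")
    lines.append(f"Final answer: {ex.answer}")
    return prompt, "\n".join(lines)
```

For example, a record with one plain step and one tool-assisted step:

```python
ex = SFTExample(
    problem="What is 12 * 12 + 1?",
    steps=[
        Step("Compute the product first."),
        Step("Use a calculator for the arithmetic.", "12 * 12 + 1", "145"),
    ],
    answer="145",
)
prompt, target = to_training_text(ex)
```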
[Incomplete, work in progress]
References