Lecture 05: Coding Agents and AI for Vulnerability Detection
Link to lecture recording on YouTube
Date: 2025-03-03
Speaker: Charles Sutton
Speaker’s social profile: Website / Company Profile / Google Scholar / GitHub / LinkedIn / X (Twitter)
Education:
- Ph.D. in Computer Science, 2000-2007, University of Massachusetts Amherst, advised by Prof. Andrew McCallum
- B.A. in Computer Science and Philosophy, 1995-1999, St. Mary’s College of Maryland
Work:
- Research Scientist, Google DeepMind
Notes
Coding Agents
Speaker’s definition of LLM agents:
LLM agents are multi-turn LLMs with tool use:
- dynamic computation time: more computation time for harder problems
- information from external tools; e.g., a coding agent runs the code and writes different code based on the output
- ability to test hypotheses
- ability to take actions
What this definition de-emphasizes:
- planning
- chain of thought: can be viewed as a special case of dynamic computation time
- multi-agent
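The multi-turn, tool-using loop in the definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not the speaker's implementation: `run_code`, `agent_loop`, and `scripted_model` are hypothetical names, and `scripted_model` is a hard-coded stand-in for a real LLM call. It shows the two properties the definition stresses: the agent sees tool output before writing its next attempt, and it spends more turns only when the problem requires them.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (code the agent wrote, output the tool returned)


def run_code(code: str) -> Tuple[bool, str]:
    """Tool: execute code in-process and return (succeeded, stdout or traceback).

    A real agent would run this in a sandbox; exec() is used here only to
    keep the sketch self-contained.
    """
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {"__name__": "__agent__"})
        return True, buf.getvalue()
    except Exception:
        return False, buf.getvalue() + traceback.format_exc()


def agent_loop(task: str,
               propose: Callable[[str, List[Turn]], str],
               max_turns: int = 5) -> List[Turn]:
    """Alternate between proposing code and observing the tool's output."""
    history: List[Turn] = []
    for _ in range(max_turns):
        code = propose(task, history)   # the "LLM" step (hypothetical)
        ok, output = run_code(code)     # the tool-use step
        history.append((code, output))
        if ok:                          # stop early once the code runs cleanly
            break
    return history


def scripted_model(task: str, history: List[Turn]) -> str:
    """Stand-in for an LLM: a buggy first attempt, then a fix after the error."""
    if history and "NameError" in history[-1][1]:
        return ("def sum_to(n):\n"
                "    return n * (n + 1) // 2\n"
                "print(sum_to(3))")
    return "print(sum_to(3))"  # calls a function that was never defined


history = agent_loop("print 1 + 2 + 3", scripted_model)
for turn, (code, output) in enumerate(history, start=1):
    print(f"turn {turn}: {output.strip().splitlines()[-1]}")
```

In this toy run the first turn fails with a `NameError`, the stand-in model reads that error from the history, and the second turn prints `6`. An easier task would have finished in one turn, which is the sense in which computation time is dynamic.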
Evaluation
Pre-history of LLM code evaluation:
Examples of evaluations of code-generation capabilities:
- Mostly Basic Programming Problems (MBPP) [1] at Google
- HumanEval [2] at OpenAI
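Both benchmarks judge a generated program by functional correctness: the program is run against test cases rather than compared textually with a reference solution. The sketch below illustrates that idea under stated assumptions. The `add` problem and its three candidate programs are made up for illustration and come from neither benchmark, and `passes_tests` is a hypothetical helper, not either benchmark's harness. `pass_at_k` follows the unbiased estimator described in the HumanEval paper (Chen et al.), where `n` samples are drawn and `c` of them pass.

```python
from math import comb
from typing import List


def passes_tests(candidate_src: str, test_src: str) -> bool:
    """Run a candidate program, then its tests; any exception counts as failure.

    A real harness would sandbox execution and enforce a timeout; this
    sketch does neither.
    """
    env: dict = {}
    try:
        exec(candidate_src, env)  # define the candidate function(s)
        exec(test_src, env)       # assertions raise AssertionError on failure
        return True
    except Exception:
        return False


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct,
    given that c of n generated samples passed the tests."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: three sampled "model outputs" for an add() function.
candidates: List[str] = [
    "def add(a, b):\n    return a - b\n",  # wrong result
    "def add(a, b):\n    return a + b\n",  # correct
    "def add(a, b) return a + b\n",        # syntax error
]
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

results = [passes_tests(src, tests) for src in candidates]
n, c = len(results), sum(results)
print(results)
print(f"pass@1 = {pass_at_k(n, c, 1):.3f}")
```

With one passing sample out of three, the estimate for pass@1 is 1/3. A candidate that fails to parse is scored the same as one that returns a wrong answer, since only the test outcome matters.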
AI for Computer Security
LLM Agents for Computer Security
[Incomplete, work in progress]
References
1. Jacob Austin et al. Program Synthesis with Large Language Models. arXiv:2108.07732 [cs.PL]. 2021.
2. Mark Chen et al. Evaluating Large Language Models Trained on Code. arXiv:2107.03374 [cs.LG]. 2021.