Link to lecture recording on YouTube
Date: 2025-03-10
Speaker: Ruslan Salakhutdinov
Speaker’s social profile: Website / Google Scholar / GitHub / LinkedIn / X (Twitter)
Work:
Autonomous AI agents: many opportunities to automate menial tasks
Web agents: foundation model + text understanding (HTML) + visual encoder + web grounding
VisualWebArena [1]
Tree Search [2]
Internet-Scale Training (InSTA) [3]
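As a rough illustration of how the four web-agent components listed above might fit together, here is a minimal sketch. Everything in it is an assumption made for illustration: the names `Observation`, `WebAgent`, `encode_text`, `encode_image`, `foundation_model`, and `ground` are hypothetical, and the stubs stand in for real models. It is not the architecture of any system cited in these notes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    html: str          # raw page markup (text channel)
    screenshot: bytes  # rendered page image (visual channel)


@dataclass
class WebAgent:
    # All four components are hypothetical placeholders, not a real API.
    encode_text: Callable[[str], str]       # text understanding (HTML)
    encode_image: Callable[[bytes], str]    # visual encoder
    foundation_model: Callable[[str], str]  # maps a prompt to an intended action
    ground: Callable[[str, str], str]       # web grounding: intent -> concrete page action

    def act(self, goal: str, obs: Observation) -> str:
        # Merge the goal with both views of the page into one prompt.
        prompt = "\n".join([
            f"GOAL: {goal}",
            f"PAGE: {self.encode_text(obs.html)}",
            f"IMAGE: {self.encode_image(obs.screenshot)}",
        ])
        intent = self.foundation_model(prompt)
        # Resolve the free-form intent against the actual page.
        return self.ground(intent, obs.html)


# Stub components so the sketch runs without any model.
agent = WebAgent(
    encode_text=lambda html: html.strip(),
    encode_image=lambda img: f"<{len(img)} image bytes>",
    foundation_model=lambda prompt: "click the search button",
    ground=lambda intent, html: "click(id=search)" if 'id="search"' in html else "noop",
)

obs = Observation(html='<button id="search">Search</button>', screenshot=b"\x89PNG")
action = agent.act("Find red shoes", obs)
```

Keeping each component behind a plain callable is only one possible design; it makes it easy to swap a stub for a real model without touching the control flow in `act`.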
[Incomplete, work in progress]
[1] Jing Yu Koh et al. VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks. arXiv:2401.13649 [cs.LG]. 2024.
[2] Jing Yu Koh et al. Tree Search for Language Model Agents. arXiv:2407.01476 [cs.AI]. 2024.
[3] Brandon Trabucco et al. InSTA: Towards Internet-Scale Training For Agents. arXiv:2502.06776 [cs.LG]. 2025.