Stop Thinking, Just Do!

Sungsoo Kim's Blog

Building Large Language Models (LLMs)


2 September 2024


Article Source: Building Large Language Models (LLMs)

Abstract

For more information about Stanford’s Artificial Intelligence programs visit: https://stanford.io/ai

This lecture provides a concise overview of building a ChatGPT-like model, covering both pretraining (language modeling) and post-training (SFT/RLHF). For each component, it explores common practices in data collection, algorithms, and evaluation methods. This guest lecture was delivered by Yann Dubois in Stanford’s CS229: Machine Learning course, in Summer 2024.
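The pretraining stage mentioned above is ordinary language modeling: the model is trained to predict each next token, minimizing the average cross-entropy of the true token under the model's predicted distribution. The following is a minimal sketch of that objective in pure Python; the toy logits, vocabulary size, and function names are illustrative assumptions, not code from the lecture.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits_per_step, target_ids):
    # Average cross-entropy of the true next token at each position:
    # L = -(1/T) * sum_t log p(x_t | x_<t)
    losses = []
    for logits, target in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Toy example: vocabulary of 4 tokens, two prediction steps.
logits = [
    [2.0, 0.5, 0.1, -1.0],  # scores before predicting token 0
    [0.2, 0.1, 3.0, 0.0],   # scores before predicting token 2
]
targets = [0, 2]
loss = next_token_loss(logits, targets)
```

In practice this loss is computed over billions of tokens with a neural network producing the logits; the sketch only shows the objective itself.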

About the speaker: Yann Dubois is a fourth-year CS PhD student advised by Percy Liang and Tatsu Hashimoto. His research focuses on improving the effectiveness of AI when resources are scarce. Most recently, he has been part of the Alpaca team, working on training and evaluating language models more efficiently using other LLMs.

https://youtu.be/9vM4p9NN0Ts?si=-ONwLjx4WgNig5vI

