Article Source
In-Context Learning: A Case Study of Simple Function Classes
- Workshop: Large Language Models and Transformers
- Speaker(s): Gregory Valiant (Stanford University)
- Location: Calvin Lab Auditorium
- Date: Friday, Aug. 18, 2023
- Time: 9 – 10 a.m. PT
- Related Paper
Abstract
In-context learning refers to the ability of a model to learn new tasks from a sequence of input-output pairs given in a prompt. Crucially, this learning happens at inference time, without any parameter updates to the model. I will discuss our empirical efforts that shed light on some basic aspects of in-context learning: To what extent can Transformers, or other models such as LSTMs, be efficiently trained to in-context learn fundamental function classes, such as linear models, sparse linear models, and small decision trees? How can one evaluate in-context learning algorithms? And what are the qualitative differences between these architectures with respect to their ability to be trained to perform in-context learning? I will also discuss subsequent work by other researchers that illuminates connections between language modeling and learning: Must a good language model be able to perform in-context learning? Do large language models know how to perform regression? And are such primitives useful for language-centric tasks? This talk will be based mostly on joint work with Shivam Garg, Dimitris Tsipras, and Percy Liang.
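
To make the setup in the abstract concrete, below is a minimal sketch (not the authors' code) of the in-context learning task for linear functions: a fresh linear function is sampled, a prompt of (x_i, f(x_i)) pairs is built, and a query point must be labeled at inference time. The dimensions, example counts, and the least-squares reference learner are illustrative assumptions chosen here for the sketch.

```python
import numpy as np

# Minimal sketch of in-context learning of linear functions, as described in
# the abstract. A prompt is a sequence of (x_i, f(x_i)) pairs for a freshly
# sampled linear function f, followed by a query x_query; the model must
# predict f(x_query) at inference time, with no parameter updates.

rng = np.random.default_rng(0)

d = 20          # input dimension (illustrative choice)
n_prompt = 40   # number of in-context examples in the prompt (illustrative)

# Sample a new task: a random linear function f(x) = w @ x.
w = rng.normal(size=d)

# In-context examples the model would condition on.
xs = rng.normal(size=(n_prompt, d))
ys = xs @ w

# Query point whose label the model must predict.
x_query = rng.normal(size=d)
y_true = x_query @ w

# Reference learner for evaluating in-context predictions: ordinary least
# squares fit only on the examples appearing in the prompt.
w_ls, *_ = np.linalg.lstsq(xs, ys, rcond=None)
y_ls = x_query @ w_ls

# A trained Transformer (or LSTM) would instead be fed the sequence
# (x_1, y_1, ..., x_n, y_n, x_query) and its output read off as the
# prediction; here we only report the baseline's error as the yardstick
# such an in-context learner is compared against.
print(f"least-squares query error: {abs(y_ls - y_true):.3e}")
```

In this noiseless setting with more prompt examples than dimensions, least squares recovers the sampled function exactly, so the interesting question the talk addresses is how closely a trained sequence model's in-context predictions track such an optimal learner as the prompt length varies.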