How To Read AI Research Papers Effectively

Abstract

According to a recent survey, over two-thirds (66.9%) of developers and machine learning teams are planning production deployments of LLM apps in the next 12 months or “as fast as possible” – and 14.1% are already in production! Given the rapid rate of progress and constant drumbeat of new foundation models, orchestration frameworks and open source libraries – as well as the workaday challenges of getting an app into production – it can be difficult to find the time to read and digest the dizzying array of cutting-edge AI research papers hitting arXiv.

That task has never been more critical, however, as the time between academic discovery and industry application shrinks from years to weeks. How can teams discover and read AI research papers quickly without losing nuance, with an eye toward pragmatic application, while balancing real-world challenges?

In this session, Aparna Dhinakaran – who blends a background in academia with experience overseeing AI in production and troubleshooting real-world AI systems as co-founder and Chief Product Officer of Arize AI – will be joined by data scientist and machine learning engineer Amber Roberts to talk through strategies for understanding and applying the latest research and reducing mean time to application. The session will include a real-time exercise in digesting one or two recently released papers (to be announced).

Survey papers:

A Survey of Large Language Models, https://arxiv.org/pdf/2303.18223v12.pdf
Retrieval-Augmented Generation for Large Language Models: A Survey, https://arxiv.org/pdf/2312.10997.pdf

Benchmarking paper:

HellaSwag: Can a Machine Really Finish Your Sentence?, https://arxiv.org/pdf/1905.07830.pdf

Breakthrough paper (deep dive):

Mixtral of Experts (Mistral AI), https://arxiv.org/pdf/2401.04088.pdf

Slides: https://docs.google.com/presentation/d/18u-Xk-oVI9kmlQUAKlXszXz2nmxu8Zzp0zBVGFcbRmg/edit#slide=id.p
