Stop Thinking, Just Do!

Sungsoo Kim's Blog

Multimodal Retrieval-Augmented Generation with Gemini

22 May 2024


Article Source


How to build Multimodal Retrieval-Augmented Generation (RAG) with Gemini

Abstract

The saying “a picture is worth a thousand words” encapsulates the immense potential of visual data. But most retrieval-augmented generation (RAG) applications rely only on text. This session applies RAG to multimodal use cases. It focuses on embeddings and attributed question answering to retrieve data. We’ll begin with a high-level architecture and quickly dive into a practical demo. Attendees will learn to create powerful LLM-based workflows and embed them in existing applications.
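
To make the retrieval idea in the abstract concrete, here is a minimal sketch of embedding-based retrieval over mixed text and image items, with source attribution carried into the prompt. It is not the session's demo and does not call Gemini: the `Item` class, the `retrieve` and `build_prompt` helpers, the file names, and the three-dimensional vectors are all hypothetical stand-ins. In a real pipeline the vectors would come from a multimodal embedding model and the prompt would be sent to an LLM.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    source: str      # where the chunk came from; kept so answers can cite it
    modality: str    # "text" or "image"
    embedding: list  # toy vector (assumption); a real system would use model output


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, index, k=2):
    """Return the k items most similar to the query, regardless of modality."""
    ranked = sorted(
        index,
        key=lambda item: cosine(query_embedding, item.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, hits):
    """Number the retrieved sources so the model can attribute its answer."""
    cited = "\n".join(
        f"[{i + 1}] ({hit.modality}) {hit.source}" for i, hit in enumerate(hits)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{cited}\n\nQuestion: {question}"
    )


# Hypothetical index: text and image items share one embedding space.
INDEX = [
    Item("report.pdf#page=3", "text", [0.9, 0.1, 0.0]),
    Item("chart_q1.png", "image", [0.8, 0.2, 0.1]),
    Item("team_photo.jpg", "image", [0.0, 0.1, 0.9]),
]

query = [1.0, 0.0, 0.0]  # stand-in for the embedded question
hits = retrieve(query, INDEX, k=2)
prompt = build_prompt("How did revenue change in Q1?", hits)
print(prompt)
```

Because text and images live in one vector space here, a single similarity search can return a page of a report and a chart side by side, and the numbered source list is what lets the generated answer point back to where each claim came from.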

  • Speakers: Shilpa Kancharla, Jeff Nelson
