Speaker

Qingyao Ai

Title

Dynamic and Parametric Retrieval Augmented Generation

Abstract

Retrieval-augmented generation (RAG) has emerged as a popular paradigm for enhancing large language models (LLMs) with external knowledge. However, conventional RAG approaches often treat LLMs as static black boxes, relying solely on prompting and in-context learning to integrate retrieved information. This overlooks two important opportunities: the dynamic nature of LLM inference, and the potential to leverage the model’s internal states and parameters for more direct, efficient knowledge integration. To this end, we propose dynamic retrieval-augmented generation, which actively analyzes the evolving needs of LLMs during inference and injects retrieved knowledge directly into the model’s internal representations and parameters in real time. Specifically, in this talk I will present our recent work that dynamically modulates an LLM’s attention mechanisms and multi-layer perceptron weights to enable seamless, on-the-fly knowledge infusion without interrupting the generation process or introducing additional input overhead. Empirical results show significant improvements in accuracy, efficiency, and adaptability over traditional RAG. We believe that research on dynamic RAG has the potential to advance the frontier of LLM customization, bridging the gap between external knowledge retrieval and intrinsic model dynamics to create more responsive, context-aware AI systems.

Bio

Qingyao Ai is an associate professor in the Department of Computer Science and Technology, Tsinghua University. His research focuses on information retrieval and related topics, particularly on integrating retrieval and generative AI techniques to build better information access systems and agents, including retrieval/ranking optimization, retrieval-augmented generation, automatic prompt refinement, continual learning for agents, and more. Qingyao Ai has served as a general co-chair of SIGIR-AP 2023, a program co-chair of NTCIR-18, an associate editor of ACM TOIS, and an area chair or senior PC member for SIGIR, CIKM, WWW, EMNLP, etc. He has received multiple awards, including the CIPS Qian Wei-chang Youth Innovation Award, the ACM SIGIR Early Researcher Award, the ACM SIGIR 2024 Best Paper Award, the SIGIR-AP’23 Best Paper Honorable Mention, the Google Research Scholar Award, etc.

About

The CIIR Talk Series is an initiative for researchers and practitioners working on information retrieval and related disciplines to present their work.

Subscribe to the mailing list by sending an email to ciir-talks-request [at] cs [dot] umass [dot] edu with "subscribe" as the email subject (without the quotation marks) to receive Zoom link and passcode notifications. Alternatively, click here for the Zoom link and contact Hamed Zamani at zamani [at] cs [dot] umass [dot] edu (subject: CIIR Talks Passcode) for the passcode.

Hybrid event posted in CIIR Talk Series for Faculty, Staff, and Alumni