Zeng, Zamani Win Best Short Paper Award at 2022 ACM SIGIR Conference
Doctoral student Hansi Zeng and Assistant Professor Hamed Zamani of the Manning College of Information and Computer Sciences (CICS) at UMass Amherst, along with Vishwa Vinay of Adobe Research India, received the Best Short Paper Award at the 2022 ACM Special Interest Group on Information Retrieval (SIGIR) Conference on Research and Development in Information Retrieval in Madrid, Spain, for their paper, "Curriculum Learning for Dense Retrieval Distillation" (CL-DRD).
When a user submits a query to a search engine, algorithms work behind the scenes to deliver high-quality results in order of perceived relevance. To ensure these algorithms are capable of providing the most relevant results, computer scientists deploy Learning to Rank (LTR) techniques — methods of training algorithms to produce higher-quality search results.
Previous research has validated knowledge distillation, the transfer of knowledge from a more powerful but less efficient ranking model (teacher model) to a more efficient dense retrieval model (student model). However, traditional knowledge distillation alone may not suit a Learning to Rank task if the ranking knowledge is particularly complex or if the student and teacher models differ in structure and capacity. Curriculum learning adds an iterative training process in which the difficulty of the training data increases with each iteration, helping to mitigate the limitations of traditional knowledge distillation for ranking.
"The concept of curriculum learning is inspired by real-world learning. When teaching complex subject matter, instructors typically break the knowledge into subcategories arranged in order of increasing difficulty to allow students to learn incrementally, mastering the basics before moving on to a more complex curriculum. Our CL-DRD framework embraces a similar philosophy," explains Zeng.
The team's framework deconstructs the teacher model's ranking knowledge into subtasks organized by increasing difficulty. In each iteration, the dense retrieval student model learns one subtask before attempting the next, more complex one, which leads to improved learning and better ranking performance.
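The easy-to-hard ordering described above can be illustrated with a small sketch. This is not the CL-DRD implementation; it is a toy example built on assumptions made purely for illustration: the teacher's knowledge is reduced to pairwise preferences between documents, and a pair is treated as "easier" when the two documents sit further apart in the teacher's ranking. The function names, scores, and thresholds are all hypothetical.

```python
from itertools import combinations


def teacher_pairs(teacher_scores):
    """Turn teacher scores into (preferred_doc, other_doc, rank_gap) triples."""
    # Rank documents from highest to lowest teacher score.
    ranked = sorted(teacher_scores, key=teacher_scores.get, reverse=True)
    pairs = []
    for i, j in combinations(range(len(ranked)), 2):
        # i < j, so ranked[i] is the document the teacher prefers.
        pairs.append((ranked[i], ranked[j], j - i))
    return pairs


def curriculum_stages(teacher_scores, min_gaps):
    """Yield one list of training pairs per stage, easiest stage first.

    Assumption (illustrative only): a large rank gap makes a pair easy,
    so lowering the minimum gap at each stage admits harder pairs.
    """
    pairs = teacher_pairs(teacher_scores)
    for gap in min_gaps:
        yield [(a, b) for a, b, g in pairs if g >= gap]


# Hypothetical teacher scores for four candidate documents of one query.
scores = {"d1": 9.1, "d2": 7.4, "d3": 3.0, "d4": 0.5}
stages = list(curriculum_stages(scores, min_gaps=[3, 2, 1]))

for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {len(stage)} pairs")
```

In this sketch each stage contains every pair from the previous stage plus harder ones, so a student trained stage by stage would first learn to separate clearly relevant documents from clearly irrelevant ones before attempting finer distinctions.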
"The CL-DRD framework advances the state-of-the-art in document ranking, which is one of the most fundamental research tasks in the field of information retrieval that has been studied for several decades," says Zamani, "Improving document ranking can potentially improve the search experience for billions of users."
Zeng is a doctoral student advised by Zamani and a research assistant at the Center for Intelligent Information Retrieval. He works on topics in information retrieval and machine learning. His current research focuses on applying large pre-trained language models to search and recommendation tasks, including dense retrieval.
Zamani is the associate director of the Center for Intelligent Information Retrieval. His current research focuses on designing and evaluating statistical and machine learning models with applications to interactive information access systems, including search engines, recommender systems, and question answering.