PhD Thesis Defense: Lijun Zhang, Advanced Resource-Efficient Multi-Task Learning
Speaker
Lijun Zhang
Abstract
As machine learning continues to advance across diverse prediction and generation tasks, there is growing interest in multi-task models—single deep learning models designed to solve multiple tasks simultaneously. By sharing parameters and computations across related tasks, these models can improve generalization while reducing inference costs such as memory usage and latency. However, efficiently building effective multi-task models remains a significant challenge due to the large design space for parameter sharing and the high training cost.
This thesis addresses these challenges through a series of innovations in multi-task learning. First, it revisits conventional parameter-sharing strategies in settings where tasks differ in input and output domains, demonstrating that allocating separate early-layer parameters can significantly outperform traditional bottom-shared designs under the same memory budget. Building on this insight, the thesis introduces two automated frameworks that transform a backbone CNN into an efficient, high-performing multi-task model by systematically exploring the parameter-sharing space. Finally, this thesis explores how pre-trained diffusion models can reduce the training and design burden in multi-task learning: by enabling zero-shot performance on tasks such as image watermarking and image restoration, and by serving as parameter generators that synthesize task-specific parameters of a multi-task model without additional training.
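To make the parameter-sharing contrast concrete, below is a minimal PyTorch sketch of the two layouts mentioned above: a conventional bottom-shared design (shared early layers, task-specific heads) versus a design that gives each task its own early layers and shares the later trunk. The two-task setup, layer sizes, and class names are illustrative assumptions for this sketch, not the actual architectures or frameworks developed in the thesis.

```python
# Illustrative sketch only: toy two-task model in PyTorch, layer sizes and
# names are assumptions, not the thesis's actual architectures.
import torch
import torch.nn as nn


class BottomShared(nn.Module):
    """Conventional layout: shared early layers, task-specific heads."""

    def __init__(self):
        super().__init__()
        self.shared_stem = nn.Sequential(          # early layers shared by both tasks
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.head_a = nn.Conv2d(64, 10, 1)         # task A head
        self.head_b = nn.Conv2d(64, 1, 1)          # task B head

    def forward(self, x):
        h = self.shared_stem(x)
        return self.head_a(h), self.head_b(h)


class SeparateEarlyLayers(nn.Module):
    """Alternative layout: task-specific early layers, shared later trunk."""

    def __init__(self):
        super().__init__()
        self.stem_a = nn.Conv2d(3, 32, 3, padding=1)   # task A early layer
        self.stem_b = nn.Conv2d(3, 32, 3, padding=1)   # task B early layer
        self.shared_trunk = nn.Sequential(             # later layers shared
            nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.head_a = nn.Conv2d(64, 10, 1)
        self.head_b = nn.Conv2d(64, 1, 1)

    def forward(self, x_a, x_b):
        out_a = self.head_a(self.shared_trunk(self.stem_a(x_a)))
        out_b = self.head_b(self.shared_trunk(self.stem_b(x_b)))
        return out_a, out_b


if __name__ == "__main__":
    x = torch.randn(2, 3, 32, 32)
    print([t.shape for t in BottomShared()(x)])
    print([t.shape for t in SeparateEarlyLayers()(x, x)])
```

Both toy models have comparable parameter counts; which layers to share, and where, is exactly the design space the thesis's automated frameworks explore.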
Advisor
Hui Guan