
Text-Guided Image Editing

Thursday, 10/19/2023 12:00pm to 1:00pm
Computer Science Building, Room 150/151; Virtual via Zoom
Machine Learning and Friends Lunch

Abstract: Text-to-image diffusion models are quickly becoming a powerful tool for image creation. Using these models for intuitive control over image edits is a natural next step, yet a challenging one.

In this talk, Hertz will present two distinct methods we have developed for text-guided image editing. In the first work, Prompt-to-Prompt, we use the cross-attention layers of the diffusion model to steer the generation process through edits to the conditioning text prompt. Furthermore, we introduce an efficient "Null Text Inversion" technique that enables prompt-to-prompt editing of real images.

In our recent work, Delta Denoising Score, we introduce a score function for image editing that can be applied directly to an image or used as a loss function to train an Image2Image translation model. In this work, we analyze the noisy dynamics of Score Distillation Sampling (SDS) when it is used for image editing. We suggest adding a reference SDS branch to eliminate the noisy component during the optimization.
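
The reference-branch idea can be illustrated with a minimal toy sketch. It is not the speaker's implementation: `toy_eps_pred` is a hypothetical stand-in for a text-conditioned denoiser, `ALPHA` is an assumed single-step noise level, and the constant `bias` is an assumed model of the "noisy component" that SDS picks up. The sketch only shows that subtracting a reference SDS term computed with the same noise sample cancels that shared component.

```python
import math

ALPHA = 0.5  # assumed noise-schedule value for one timestep (hypothetical)


def add_noise(z, eps, alpha=ALPHA):
    """Forward diffusion for one step: z_t = sqrt(alpha) * z + sqrt(1 - alpha) * eps."""
    return [math.sqrt(alpha) * zi + math.sqrt(1 - alpha) * ei
            for zi, ei in zip(z, eps)]


def toy_eps_pred(z_t, cond, bias, alpha=ALPHA):
    """Hypothetical denoiser: predicts the noise that would map `cond` to z_t,
    plus a constant `bias` that stands in for the undesired noisy component."""
    return [(zt - math.sqrt(alpha) * c) / math.sqrt(1 - alpha) + bias
            for zt, c in zip(z_t, cond)]


def sds_grad(z, cond, eps, bias):
    """SDS-style direction: predicted noise minus the noise that was added."""
    pred = toy_eps_pred(add_noise(z, eps), cond, bias)
    return [p - e for p, e in zip(pred, eps)]


def dds_grad(z, cond, z_ref, cond_ref, eps, bias):
    """Delta-style direction: the SDS term for the edited image and target
    condition, minus the SDS term for the reference image and reference
    condition, both using the SAME noise sample `eps`."""
    g = sds_grad(z, cond, eps, bias)
    g_ref = sds_grad(z_ref, cond_ref, eps, bias)
    return [a - b for a, b in zip(g, g_ref)]


if __name__ == "__main__":
    z_ref = [0.2, -0.1, 0.4]   # toy "image" that already matches its condition
    eps = [0.5, -1.0, 0.3]     # shared noise sample
    bias = 0.3                 # assumed shared error of the denoiser
    print("SDS on a matched pair:", sds_grad(z_ref, z_ref, eps, bias))
    print("DDS on a matched pair:", dds_grad(z_ref, z_ref, z_ref, z_ref, eps, bias))
```

In this toy setting, the plain SDS term stays nonzero even when the image already matches its condition, whereas the difference of the two branches vanishes for a matched pair and otherwise points only along the change between the reference and target conditions.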

Bio: Amir Hertz is a research scientist at Google, working on extending image editing capabilities using generative models. He recently completed his Ph.D. studies (under review) in the Department of Computer Science at Tel-Aviv University under the supervision of Prof. Daniel Cohen-Or and Prof. Raja Giryes. His research focuses on adopting and extending machine learning practices within computer graphics. Specifically, Hertz has developed models and methods for 3D shape generation, texture synthesis, 3D modeling, and meshing.