Blended Diffusion for Text-driven Editing of Natural Images
- Omri Avrahami, Dani Lischinski, Ohad Fried
- Computer Science, Computer Vision and Pattern Recognition
- 29 November 2021
This paper introduces the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with an ROI mask, and shows several text-driven editing applications, including adding a new object to an image, removing/replacing/altering existing objects, background replacement, and image extrapolation.
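For intuition, here is a minimal sketch of the general idea behind such mask-restricted diffusion editing: at every denoising step, the text-guided result is kept only inside the ROI and is re-blended with a correspondingly noised copy of the original image outside it, so the background is preserved. The denoiser, noise schedule, and all function names below are illustrative placeholders, not the paper's actual implementation.

```python
import numpy as np

def noise_to_level(x0, t, num_steps, rng):
    """Diffuse a clean image x0 to noise level t (toy linear schedule; illustrative only)."""
    alpha = 1.0 - t / num_steps
    return np.sqrt(alpha) * x0 + np.sqrt(1.0 - alpha) * rng.standard_normal(x0.shape)

def dummy_denoise_step(x_t, t, text_prompt):
    """Placeholder for a text-guided denoising step (a real diffusion model would go here)."""
    return x_t * 0.99  # no-op stand-in

def blended_edit(image, mask, text_prompt, num_steps=50, seed=0):
    """Sketch of region-based diffusion editing: keep the denoised content inside
    the mask and blend it with a noised copy of the input outside the mask."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(image.shape)           # start from pure noise
    for t in range(num_steps, 0, -1):
        x = dummy_denoise_step(x, t, text_prompt)  # text-guided update (placeholder)
        bg = noise_to_level(image, t, num_steps, rng)
        x = mask * x + (1.0 - mask) * bg           # blend: edited ROI + preserved background
    return x

# Usage: edit a 64x64 RGB image inside a circular ROI.
img = np.zeros((64, 64, 3))
ys, xs = np.mgrid[:64, :64]
roi = (((ys - 32) ** 2 + (xs - 32) ** 2) < 15 ** 2)[..., None].astype(float)
edited = blended_edit(img, roi, "a red ball on the grass")
```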
Blended Latent Diffusion
- Omri Avrahami, Ohad Fried, Dani Lischinski
- Computer Science, arXiv
- 6 June 2022
This work evaluates the proposed latent-diffusion-based editing method against the available baselines, both qualitatively and quantitatively, and demonstrates that, in addition to being faster, it achieves better precision than existing methods.
GAN Cocktail: mixing GANs without dataset access
- Omri Avrahami, Dani Lischinski, Ohad Fried
- Computer Science, European Conference on Computer Vision
- 7 June 2021
This work tackles the problem of model merging with a novel two-stage solution, under two constraints that often arise in the real world: no access to the original training data and no increase in network size.
SpaText: Spatio-Textual Representation for Controllable Image Generation
- Omri Avrahami, Thomas Hayes, Xiaoyue Yin
- Computer Science, arXiv
- 25 November 2022
This work presents SpaText, a new method for text-to-image generation with open-vocabulary scene control, based on a novel CLIP-based spatio-textual representation, and shows its effectiveness on two state-of-the-art diffusion models: one pixel-based and one latent-based.
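As a rough illustration of what a CLIP-based spatio-textual representation might look like, the sketch below embeds each segment's local text description and places that embedding at the segment's pixels in a spatial tensor. The `embed_text` placeholder, `EMBED_DIM`, and the helper names are assumptions made for this example, not SpaText's actual implementation.

```python
import numpy as np

EMBED_DIM = 512  # CLIP-like embedding size (assumed)

def embed_text(prompt: str) -> np.ndarray:
    """Placeholder for a CLIP text encoder; a real implementation would call a pretrained model."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    v = rng.standard_normal(EMBED_DIM)
    return v / np.linalg.norm(v)

def build_spatio_textual_map(height, width, segments):
    """Build an (H, W, D) tensor where every pixel of a segment carries the text
    embedding of that segment's local description; pixels outside all segments stay zero.
    `segments` is a list of (boolean mask, prompt) pairs."""
    st_map = np.zeros((height, width, EMBED_DIM), dtype=np.float32)
    for mask, prompt in segments:
        st_map[mask.astype(bool)] = embed_text(prompt)
    return st_map

# Usage: two free-form regions, each with its own local description.
H = W = 64
sky = np.zeros((H, W), dtype=bool); sky[:24, :] = True
dog = np.zeros((H, W), dtype=bool); dog[40:60, 20:44] = True
st = build_spatio_textual_map(H, W, [(sky, "a cloudy sky"), (dog, "a golden retriever")])
```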
Ownership and Creativity in Generative Models
- Omri Avrahami, Bar Tamir
- Computer Science, arXiv
- 2 December 2021
This work proposes a possible algorithmic solution in the regime of vision-based models, raises several candidates that may own the generated content together with arguments for each, and discusses the broader implications of the problem.