In generative AI, 2022 saw a rapid leap from image generation to video generation, all in a matter of months. Building on that progress, Meta Connect 2023 brought a series of new developments for creativity and self-expression. Chief among them is “Emu,” Meta’s foundational model for image generation, which underpins a range of generative AI experiences, including AI-powered image editing tools for Instagram and the Imagine feature within Meta AI.
Emu Video: Simplifying High-Quality Video Generation
Meta introduces “Emu Video,” a text-to-video generation method based on diffusion models. Its unified architecture accepts text-only, image-only, and combined text-and-image inputs. Generation is factorized into two steps: first, an image is generated from the text prompt; then a video is generated conditioned on both the text and that image. This factorized approach simplifies video generation and produces higher-resolution videos with just two diffusion models, and it outperforms prior work in human evaluations: 96% of respondents preferred Emu Video over Meta’s earlier Make-A-Video model on output quality, and 85% preferred it on faithfulness to the text prompt.
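The two-step factorization described above can be sketched as plain control flow. This is a minimal illustration, not Meta's implementation: the names text_to_image, text_image_to_video, and generate_video are hypothetical, and both stages are toy stand-ins that return placeholder pixel data where a real system would sample from a diffusion model.

```python
from dataclasses import dataclass
from typing import List, Optional

# Toy grayscale image: rows of pixel intensities in [0, 1].
Image = List[List[float]]


@dataclass
class Video:
    frames: List[Image]


def text_to_image(prompt: str, size: int = 4) -> Image:
    """Stage 1 stand-in (hypothetical): a real system would sample
    from a text-to-image diffusion model here."""
    seed = sum(ord(ch) for ch in prompt) % 256
    return [
        [((seed + r * size + c) % 256) / 255.0 for c in range(size)]
        for r in range(size)
    ]


def text_image_to_video(prompt: str, image: Image, num_frames: int = 8) -> Video:
    """Stage 2 stand-in (hypothetical): a real system would sample from a
    video diffusion model conditioned on the prompt and the image.
    This toy version ignores the prompt and simply dims the image over time."""
    frames = []
    for t in range(num_frames):
        fade = 1.0 - t / (2 * num_frames)  # toy "motion": gradual dimming
        frames.append([[pixel * fade for pixel in row] for row in image])
    return Video(frames)


def generate_video(prompt: str, image: Optional[Image] = None) -> Video:
    """Factorized generation: produce the conditioning image first
    (unless the caller supplied one), then the video."""
    if image is None:
        image = text_to_image(prompt)
    return text_image_to_video(prompt, image)


if __name__ == "__main__":
    video = generate_video("a dog running on a beach")
    print(len(video.frames), "frames")
```

Passing an image skips the first stage, which is how a single entry point can serve text-only, image-only (with an empty prompt), and combined inputs in this sketch.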
Emu Edit: Precision Image Editing Through Text Instructions
Meta introduces “Emu Edit,” a model for precise image editing guided by text instructions. Unlike many existing generative AI models, Emu Edit follows instructions closely, altering only the pixels relevant to the edit request and leaving unrelated areas untouched. This addresses the challenge of offering precise control in image manipulation across tasks such as local and global editing, background changes, color transformations, and more. To train the model, Meta built a dataset of 10 million synthesized samples, which it believes is the largest of its kind to date. According to Meta, Emu Edit outperforms prior methods on both instruction faithfulness and image quality across a range of image editing tasks.
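To make "only the relevant pixels are altered" concrete, here is a small sketch of how one might measure how far an edit spills outside its intended region. This is a hypothetical check for illustration only: the names changed_pixels and spill_fraction, and the idea of an explicit edit region, are assumptions and not part of Emu Edit or Meta's evaluation.

```python
from typing import List, Set, Tuple

# Toy image: rows of integer pixel values. Pixels are addressed as (row, col).
Image = List[List[int]]
Pixel = Tuple[int, int]


def changed_pixels(before: Image, after: Image) -> Set[Pixel]:
    """Return the coordinates whose value differs between the two images."""
    return {
        (r, c)
        for r, row in enumerate(before)
        for c, value in enumerate(row)
        if after[r][c] != value
    }


def spill_fraction(before: Image, after: Image, edit_region: Set[Pixel]) -> float:
    """Fraction of changed pixels that fall outside the intended edit region.
    0.0 means the edit stayed entirely inside the region."""
    changed = changed_pixels(before, after)
    if not changed:
        return 0.0
    outside = changed - edit_region
    return len(outside) / len(changed)


if __name__ == "__main__":
    before = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    after = [[9, 0, 0], [0, 0, 0], [0, 0, 5]]
    region = {(0, 0), (0, 1)}  # hypothetical region the instruction refers to
    print(spill_fraction(before, after, region))
```

In this toy case one of the two changed pixels lies outside the region, so half of the edit counts as spill; an edit that respected the region would score 0.0.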
The Future of Generative AI
While this work is still fundamental research, the potential applications are easy to picture: creating animated stickers or GIFs for group chats, editing photos with no technical expertise, or adding dynamic elements to Instagram posts. Emu Video, Emu Edit, and similar technologies could help people express themselves in new ways, whether as an art director exploring new concepts, a content creator enhancing their work, or a friend sending a unique birthday greeting. As Meta continues to push the boundaries of generative AI, the range of tools for creative expression and communication keeps growing, and that is worth celebrating.