Amulya Kumar for HyScaler

ImageFX: Google’s New Tool for Generating Images from Text

Google has long been a leading player in artificial intelligence (AI), notably with its AI chatbot, Google Bard, which can converse, write, and create alongside users. In AI image generation, however, Google has been more reserved, keeping its Imagen model under wraps until now.

On Thursday, Google announced a series of AI updates focused on image generation. The company launched a new image-generation tool, ImageFX, and enabled Bard to generate images as well. Both features are powered by Imagen 2, Google’s latest text-to-image model, developed by Google DeepMind and released last month.

What is ImageFX and how does it work?

ImageFX is a new tool that lets users generate images from text prompts, similar to other AI image generators such as DALL-E 3. Users type a description of what they want to see, and ImageFX produces an image that matches the text.

But ImageFX is not just another AI image generator. It has a unique feature called “expressive chips”, which are small icons that appear on the prompt interface. These chips allow users to explore “adjacent dimensions of your creation and ideas”, according to Google. For example, users can use the chips to change the color, style, mood, or perspective of the image, or to add or remove elements from the scene.

ImageFX is available in Google Labs, the company’s experimental platform for trying out early ideas for features and products. Another product that users can test in Google Labs is the Search Generative Experience (SGE), which uses Imagen 2 to generate images based on search queries.

What is Imagen 2 and what can it do?

Imagen 2 is the text-to-image model that powers ImageFX, as well as image generation in Bard, Google’s AI chatbot. Imagen 2 is the successor to Imagen, a model that Google developed in 2022 but did not release to the public. Google describes Imagen 2 as using diffusion-based techniques; the original Imagen paired a diffusion model with a Transformer-based text encoder, an architecture that is widely used in natural language processing (NLP) and computer vision.

Google says that Imagen 2 produces its highest-quality images to date, even for challenging subjects such as human faces and hands. The model is also said to handle prompts that combine subjects with emotions or unusual attributes. For instance, Imagen 2 can generate an image of “a happy dog wearing a hat” or “a sad elephant in a tutu”.

How do you generate images in Bard?

Bard is Google’s AI chatbot that can answer questions, tell stories, write poems, and more. Now, Bard can also generate images, thanks to Imagen 2. Users ask Bard for an image using a conversational prompt, and Bard responds with a generated image. For example, users can say “Show me a picture of a cat playing with a ball of yarn” or “Draw me a portrait of yourself”.

The feature is available starting today in most countries in English. Google plans to expand the feature to other languages and regions in the future.

Where else can you find Imagen 2?

Google is also integrating Imagen 2 across its offerings, including Ads, Duet AI in Workspace, SGE, and Google Cloud’s Vertex AI. Google expects these products to benefit from the improved image quality and diversity that Imagen 2 provides.

For example, Ads users can use Imagen 2 to create ad imagery from text inputs. Duet AI in Workspace users can use Imagen 2 to add generated visuals to their work. SGE users can use Imagen 2 to generate images for their search queries. Vertex AI users can access Imagen 2 to build image generation into their own applications.

How does Google ensure the ethical use of Imagen 2?

Google is aware of the potential misuse of AI image generators, such as creating fake or harmful content. To reduce this risk, Google says it has implemented guardrails intended to block the generation of violent, offensive, and sexually explicit content. Google also requires users to follow its terms of service and responsible AI practices when using Imagen 2.

Additionally, all images generated with Imagen 2 will be watermarked with SynthID, a tool developed by Google DeepMind that embeds a watermark imperceptible to the human eye but detectable for identification. This is meant to help verify whether an image was generated by Google’s AI tools.
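
To make the idea of an imperceptible but detectable watermark concrete, here is a toy sketch in Python. It is not SynthID, whose method is not described in this article. It uses a much simpler, well-known technique (hiding bits in the least significant bit of each pixel value), and the signature, pixel values, and function names are all made up for illustration.

```python
# Toy illustration only: this is NOT SynthID. It hides a bit pattern in the
# least significant bit (LSB) of each pixel value, a classic and fragile
# technique, to show how a mark can be invisible yet machine-readable.

def embed_watermark(pixels, bits):
    """Return a copy of `pixels` (ints 0-255) with `bits` written into the low bits."""
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_watermark(pixels, length):
    """Read back the lowest bit of the first `length` pixels."""
    return [p & 1 for p in pixels[:length]]


SIGNATURE = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit tag
image = [200, 13, 77, 154, 91, 240, 18, 66, 129, 255]  # made-up grayscale values

marked = embed_watermark(image, SIGNATURE)

# Each pixel changes by at most 1 out of 255, far below what the eye can see.
max_change = max(abs(a - b) for a, b in zip(image, marked))

# A detector that knows the signature can check for it.
detected = extract_watermark(marked, len(SIGNATURE)) == SIGNATURE

print("largest pixel change:", max_change)
print("watermark detected:", detected)
```

A mark like this would not survive cropping, resizing, or recompression, so production watermarking systems need far more robust approaches; the sketch only shows the basic trade-off between invisibility and detectability.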

Users can also check the “About this image” insights in Search and Chrome, which will show whether the photo was generated using Google’s AI tools, as well as other information such as the source, the date, and the license of the image.
