DEV Community

Kapil Raghuwanshi🖥

Posted on • Edited on • Originally published at blog.bitsrc.io

Create your own GenAI Image Generator App like MidJourney or DALLE-2

Simple React App to demonstrate Generative AI Text-to-Image Capability using any third-party APIs

In the fast-paced world of web development, staying ahead often involves incorporating cutting-edge technologies into our projects. One such innovation that has been gaining traction is the integration of Artificial Intelligence (AI) into web applications. In this article, we'll explore how I leveraged third-party APIs built by Segmind to seamlessly integrate AI-generated images into my React app, pushing the boundaries of creativity and user engagement.

What's Generative AI?

It refers to a class of artificial intelligence systems designed to generate new content, such as images, text, or even music, often in a way that mimics human creativity. These systems, often based on neural networks, can learn patterns and generate novel outputs without explicit programming. Generative AI has applications in various fields, including art, content creation, and data synthesis, contributing to innovative solutions and creative outputs.

Image Generation in AI:

Image generation in AI involves using artificial intelligence models to create new, realistic images. This process often leverages generative models, which are trained on large datasets to learn patterns and generate novel content.

Stable Diffusion is a deep learning text-to-image model released in 2022 and based on diffusion techniques. It uses deep neural networks to generate high-quality images: during training, the model learns to remove noise that has been added to images, and at generation time it starts from random noise and denoises it step by step, guided by the text prompt, until a clean image emerges.

DALL·E, developed by OpenAI, generates images from textual descriptions and has gained attention for its ability to create unique and imaginative visuals. The original DALL·E was a variant of the GPT (Generative Pre-trained Transformer) architecture adapted for image generation, while DALL·E 2 moved to a diffusion-based approach.

Vision for our App:

When conceptualising my React app, I envisioned an immersive user experience that went beyond traditional static content. I wanted to incorporate dynamic, AI-generated images that would not only captivate users but also add a touch of uniqueness to each interaction. To achieve this, I turned to third-party APIs specialising in AI image generation.

App Snapshot

Top Features of the App:

  1. Generate the image from the prompt with various parameters.
  2. Surprise Me option to get a prompt idea instantly if you're running short ;).
  3. A few previously generated images to pick from.
  4. Recently generated images are kept in a history and can be picked again.
  5. Download the generated images.
  6. Responsive web app.
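
The Surprise Me feature relies on the `getRandom` and `promptIdeas` helpers that Home.jsx imports from the utilities file, which is not shown in this article. A minimal sketch of what they might look like follows. Apart from the first prompt, which is the placeholder text used in the app, the prompt list is made up for illustration, and the optional `rng` parameter is an addition that makes the function easy to test.

```javascript
// Hypothetical contents of utilities/utils.js; the real lists live in the repo.
const promptIdeas = [
  "A plush toy robot sitting against a yellow wall",
  "A lighthouse on a cliff at sunset, watercolor style",
  "An astronaut riding a bicycle on the moon",
];

// Returns one random element of `items`, or undefined for an empty list.
// `rng` is injectable so the function can be tested deterministically.
function getRandom(items, rng = Math.random) {
  if (!Array.isArray(items) || items.length === 0) return undefined;
  return items[Math.floor(rng() * items.length)];
}

console.log(getRandom(promptIdeas)); // prints one of the ideas above
```

The same helper can also pick the rotating loader message, since it works on any array.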

Technologies used:

  1. Uses React hooks (useState, useEffect) to call the SegMind text2Img API.
  2. Deployed on Firebase.
  3. Recent history is stored in localStorage.
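
The RecentResults component that keeps the history is not listed in this article, so here is a rough sketch of how recent results could be persisted in localStorage. The storage key, the cap of five entries, and the `{ prompt, image }` entry shape are assumptions for illustration, not the repo's actual implementation. The `storage` parameter defaults to the browser's localStorage and only exists so the helpers can be exercised outside a browser.

```javascript
const HISTORY_KEY = "genai-recent-images"; // hypothetical key name
const MAX_ENTRIES = 5; // hypothetical cap; data URLs are large and the storage quota is limited

// Reads the saved history; returns [] when nothing is stored or the JSON is corrupt.
function loadHistory(storage = globalThis.localStorage) {
  try {
    const parsed = JSON.parse(storage.getItem(HISTORY_KEY) ?? "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Puts the newest entry first, drops older entries with the same prompt,
// trims the list to MAX_ENTRIES, and writes it back.
function saveToHistory(entry, storage = globalThis.localStorage) {
  const history = [
    entry,
    ...loadHistory(storage).filter((item) => item.prompt !== entry.prompt),
  ].slice(0, MAX_ENTRIES);
  storage.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}
```

In the app, `saveToHistory({ prompt: promptQuery, image: imageResult })` could be called once a new image has loaded, and `loadHistory()` when the history component mounts.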

Choosing the Right API:

The first step was selecting an API that matched my vision. After some research, I settled on the SegMind text2Img API (create an account and get your API token from https://www.SegMind.com), a versatile platform known for its image generation capabilities. It offered a range of features, including style transfer, deep dreaming, and more, making it a good fit for injecting creativity into my app.

Use my referral link to sign up and get extra API credits [segmind.com/invite/3fb71a62-a0e5-4e87-af7a-8e4f3a774689].
You can explore their API Postman collection. We are going to use API sdxl1.0-txt2img in our app.
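
To get a feel for the request before wiring it into React, here is a small sketch that builds the request configuration from raw form values. The endpoint, header, and field names are the ones used in model-api.js further down. The `buildTxt2ImgRequest` function itself, its validation rules, and its error messages are hypothetical and only illustrate normalizing user input before calling the API.

```javascript
const SEGMIND_URL = "https://api.segmind.com/v1/sdxl1.0-txt2img";

// Builds an axios-style request config from raw form values (hypothetical helper).
function buildTxt2ImgRequest(apiKey, { prompt, seed, scheduler, steps }) {
  const cleanPrompt = String(prompt ?? "").trim();
  if (!cleanPrompt) throw new Error("Prompt must not be empty");

  // Form inputs arrive as strings, so coerce them to numbers first
  const numericSeed = Number(seed);
  const numericSteps = Number(steps);
  if (!Number.isInteger(numericSeed) || !Number.isInteger(numericSteps)) {
    throw new Error("Seed and steps must be whole numbers");
  }

  return {
    method: "POST",
    url: SEGMIND_URL,
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    responseType: "arraybuffer", // the image comes back as raw bytes
    data: {
      prompt: cleanPrompt,
      seed: numericSeed,
      scheduler,
      num_inference_steps: numericSteps,
    },
  };
}
```

The returned object can be passed to `axios.request(...)`. Keeping it a pure function means the input handling can be checked without making a network call.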

With the API chosen, I integrated it into the React app. Leveraging React's component-based architecture, I created a dedicated component that handles the API requests and renders the AI-generated images. Thanks to React's simplicity and flexibility, the process was smooth and well organized.

By leveraging third-party APIs, developers can effortlessly integrate AI capabilities into their projects, pushing the boundaries of what's possible in web development.

React Code

You can follow all the code behind our app on my GitHub: https://github.com/kapilraghuwanshi/gen-ai-image-generator, with all the proper file structure and required components, static content, and other config files.

Home.jsx



import React, { useState, useEffect } from "react";
import ImageBox from "../components/ImageBox";
import NavBar from "../components/NavBar";
import { fetchImages } from "../services/model-api";
import { getRandom, loaderMessages, promptIdeas } from "../utilities/utils";
import ChooseResults from "../components/ChooseResults";
import RecentResults from "../components/RecentResults";

const Home = () => {
  const [showLoader, setShowLoader] = useState(false);
  const [imageResult, setImageResult] = useState(null);
  const [promptQuery, setPromptQuery] = useState("");
  const [radioValue, setRadioValue] = useState("20");
  // Default must match one of the <option> values in the scheduler dropdown
  const [dropDownValue, setDropDownValue] = useState("Euler");
  const [seedValue, setSeedValue] = useState(17123564234);
  const [loaderMessage, setLoaderMessage] = useState(loaderMessages[0]);

  useEffect(() => {
    const loaderInterval = setInterval(() => {
      setLoaderMessage(getRandom(loaderMessages));
    }, 3000);
    // to avoid memory leak
    return () => {
      clearInterval(loaderInterval);
    };
  }, []); // start the interval once on mount instead of restarting it on every message change

  const handleSearch = (event) => {
    setPromptQuery(event.target.value);
  };

  const handleChange = (event) => {
    if (event.target.name === "radio") {
      setRadioValue(event.target.value);
    } else if (event.target.name === "dropdown") {
      setDropDownValue(event.target.value);
    } else {
      setSeedValue(event.target.value);
    }
  };

  const handleGenerate = (e) => {
    e.preventDefault();
    fetchData();
  };

  const fetchData = async () => {
    try {
      setShowLoader(true);

      const imageBlob = await fetchImages(
        promptQuery,
        seedValue,
        dropDownValue,
        radioValue
      );

      const fileReaderInstance = new FileReader();
      // Fires once the image Blob has been fully read as a data URL
      fileReaderInstance.onload = () => {
        setImageResult(fileReaderInstance.result);
        setShowLoader(false);
      };
      // Hide the loader if the Blob cannot be read
      fileReaderInstance.onerror = () => {
        console.error("Error reading image blob:", fileReaderInstance.error);
        setShowLoader(false);
      };
      // readAsDataURL() converts the image Blob into a base64 data URL
      fileReaderInstance.readAsDataURL(imageBlob);
    } catch (error) {
      // Covers API failures and an invalid Blob passed to readAsDataURL()
      console.error("Error fetching images from API:", error);
      setShowLoader(false);
    }
  };

  const handleSurpriseMe = (e) => {
    const surprisePrompt = getRandom(promptIdeas);
    setPromptQuery(surprisePrompt);
  };

  const handleAvailOptions = (option) => {
    setPromptQuery(option);
  };

  return (
    <div>
      <NavBar />
      <div className="surpriseBox">
        <label>Bring your imaginations into reality!</label>
      </div>
      <div>
        <input
          type="text"
          id="prompt"
          value={promptQuery}
          onChange={handleSearch}
          placeholder="A plush toy robot sitting against a yellow wall"
          className="promptInput"
        />
        <button onClick={handleSurpriseMe}>Surprise Me</button>
      </div>
      <div className="formBox">
        <div className="formValue">
          <label>Scheduler &nbsp;</label>
          <select name="dropdown" value={dropDownValue} onChange={handleChange}>
            <option value="Euler">Euler</option>
            <option value="LMS">LMS</option>
            <option value="Heun">Heun</option>
            <option value="DDPM">DDPM</option>
          </select>
        </div>
        <div className="formValue">
          Steps
          <label>
            <input
              type="radio"
              name="radio"
              value="20"
              checked={radioValue === "20"}
              onChange={handleChange}
            />
            20
          </label>
          <label>
            <input
              type="radio"
              name="radio"
              value="30"
              checked={radioValue === "30"}
              onChange={handleChange}
            />
            30
          </label>
          <label>
            <input
              type="radio"
              name="radio"
              value="50"
              checked={radioValue === "50"}
              onChange={handleChange}
            />
            50
          </label>
        </div>
        <div className="formValue">
          <label>Seed &nbsp;</label>
          <input
            type="number"
            name="input"
            value={seedValue}
            onChange={handleChange}
          />
        </div>
      </div>
      <div>
        <button onClick={handleGenerate}>Generate the Image</button>
      </div>

      {showLoader ? (
        <div style={{ margin: 40 }}>Blazing fast results... ⚡️⚡️⚡️</div>
      ) : (
        <>
          <ImageBox promptQuery={promptQuery} imageResult={imageResult} />
        </>
      )}
      <ChooseResults onSelect={handleAvailOptions} />
      <RecentResults
        promptQuery={promptQuery}
        imageResult={imageResult}
        onSelect={handleAvailOptions}
      />
      <div className="slideShowMessage">{loaderMessage}</div>
      <div className="footer">Powered by SegMind</div>
    </div>
  );
};

export default Home;
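
The download feature lives in the ImageBox component, which is not listed here. As a rough sketch, a download helper could look like the following. Both function names and the file-naming scheme are hypothetical. `downloadImage` needs a browser, because it uses `document`, while `buildDownloadName` is a plain string function.

```javascript
// Turns a prompt into a safe file name, e.g. for the download button.
function buildDownloadName(prompt, extension = "jpg") {
  const slug = String(prompt ?? "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into a dash
    .slice(0, 50) // keep file names reasonably short
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${slug || "generated-image"}.${extension}`;
}

// Browser-only: triggers a download of the data URL held in imageResult.
function downloadImage(dataUrl, prompt) {
  const link = document.createElement("a");
  link.href = dataUrl;
  link.download = buildDownloadName(prompt);
  document.body.appendChild(link);
  link.click();
  link.remove();
}
```

A download button in ImageBox could then call `downloadImage(imageResult, promptQuery)` in its click handler.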




model-api.js



import axios from "axios";
import { secret } from "../secret";

const { apiKey } = secret;

export const fetchImages = async (
  promptCall,
  seedValue,
  dropDownValue,
  radioValue
) => {
  const options = {
    method: "POST",
    url: "https://api.segmind.com/v1/sdxl1.0-txt2img",
    headers: {
      "x-api-key": apiKey,
      "Content-Type": "application/json",
    },
    responseType: "arraybuffer",
    data: {
      prompt: promptCall,
      seed: Number(seedValue), // form inputs arrive as strings
      scheduler: dropDownValue,
      num_inference_steps: Number(radioValue),
      negative_prompt: "NONE",
      samples: "1",
      guidance_scale: "7.5",
      strength: "1",
      shape: 512,
    },
  };

  try {
    const response = await axios.request(options);
    // Wrap the raw ArrayBuffer in a Blob with an image MIME type
    return new Blob([response.data], { type: "image/jpeg" });
  } catch (error) {
    console.error("Error while fetching Gen AI model API", error);
    // Rethrow so the caller can handle the failure instead of receiving undefined
    throw error;
  }
};
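
Because the request sets responseType to "arraybuffer", an error body from the API also arrives as raw bytes rather than readable text. The sketch below shows one way to turn it back into a message for logging. The `decodeErrorBody` helper is hypothetical, and the `error` and `message` field names are assumptions about the error payload, not something taken from the SegMind documentation.

```javascript
// Converts an error body received as raw bytes into a readable message.
function decodeErrorBody(data) {
  if (data == null) return "No response body";
  const text =
    typeof data === "string" ? data : new TextDecoder("utf-8").decode(data);
  try {
    const parsed = JSON.parse(text);
    // The `error` and `message` field names are assumptions about the payload
    return parsed.error ?? parsed.message ?? text;
  } catch {
    return text; // not JSON, so return the raw text
  }
}

// Possible use inside the catch block of fetchImages:
//   console.error("SegMind API error:", decodeErrorBody(error.response?.data));
```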




Use the SegMind text2Img API: create your account, get the API token from https://www.SegMind.com, and set it as apiKey in the secret module that model-api.js imports.
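
Based on the `import { secret } from "../secret"` line and the `apiKey` destructuring in model-api.js, the secret module would look roughly like this. The exact file location and the placeholder value are assumptions.

```javascript
// secret.js, one folder above services/ (location inferred from the "../secret" import)
export const secret = {
  apiKey: "YOUR_SEGMIND_API_KEY", // placeholder, replace with your own token
};
```

If you fork the repo, consider adding this file to .gitignore so your token is not committed. Any key bundled into frontend code is still visible to users in the browser.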

The app is ready for you to use: fork the repo, plug in your SegMind key or any other third-party API, and make it yours!

Run in local:

Open your project directory in the VS Code terminal or any console and run:



npm install
npm run start



The start script runs the app in development mode. Open http://localhost:3000 to view it in your browser. The page reloads when you make changes, and any lint errors show up in the console.

Deploy on Firebase:

We can follow this quick article to deploy our React image generation app on Firebase, Google's platform with free hosting for developers.

Run in Production:

Try our app here https://genai-image-generator.web.app/home

In case the API token has expired, put your own token in the code, deploy it on Firebase, and then run it.

Share it with your friends and family, and show your swanky product to your colleagues.

Further Scope for Developers:

I would like new developers to fork the GitHub repository https://github.com/kapilraghuwanshi/gen-ai-image-generator and work on adding a few of the features below:

  1. Create an image slideshow of the recent history of generated images.
  2. Reverse the order of the recent history images.
  3. Build REST APIs to post recent images to your own servers and database (Render, Vercel, MongoDB Atlas free tiers), then fetch them from those APIs and show them in the app.
  4. Add i18n localization to the project using react-i18next.
  5. Write unit test cases using @testing-library/react.

As technology continues to evolve, the fusion of AI and web development offers exciting opportunities for developers to create truly unique and captivating user experiences. Embrace the power of AI in your React projects, and watch as your applications come to life with dynamic, intelligent content.

That's all folks for this article!

I hope it helps you create your own MidJourney- and DALL·E 2-like applications, and that it is easy and fun!😃

Write your suggestions and feedback in the comment section below.

If you learned something new from this article or it made your dev work faster, like it, save it, and share it with your colleagues.

Also, I have recently started creating tech content on my YouTube channel, TechMonkKapil. Have a look, and subscribe if you like the content!🤝

We are also building a tech community on Telegram (Tech Monk Army) and Discord (Tech Monk Army). Join if you are looking to interact with like-minded folks.

I have been writing tech blogs for quite some time now and have mostly published through my Medium account; this is my first tech article/tutorial on Dev.to. Hope you will show it some love!🤩

Let's connect on LinkedIn and Twitter for more such engaging tech articles and tutorials.🤝

Top comments (2)

Florian Kiem

Hey, super nice post! I always thought that someone should build an app like MidJourney or Dall-E, especially because MidJourney runs completely via Discord at the moment. Nice job. :) I like it, if the UI gets improved it will easily attract a lot of users.
Related to i18n, have you used react-i18next often? I'm working for a company, inlang, and we're developing Paraglide JS which is far more performant, smaller in size, and agnostic than other libs. In case you want to take a look: inlang.com/m/gerre34r/library-inla...

Kapil Raghuwanshi🖥

Definitely @flornkm, we can improve the UI for this app, feel free to take the fork and raise the PR.