Trix Cyrus

Posted on Dec 17

Part 11: Building Your Own AI - Introduction to Generative Models: GANs and VAEs

#programming #ai #machinelearning #learning

Author: Trix Cyrus

[Try My], Waymap Pentesting tool: Click Here
[Follow] TrixSec Github: Click Here
[Join] TrixSec Telegram: Click Here

Generative models are a fascinating area of machine learning, capable of creating entirely new data resembling the training data. In this article, we’ll explore two popular types of generative models: Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). These models have applications in generating realistic images, creating deepfake videos, and even composing music.

1. What are Generative Models?

Generative models aim to understand the underlying data distribution and generate new samples that resemble the original data. Unlike discriminative models, which focus on classification or prediction, generative models create something entirely new.

2. Introduction to GANs

Generative Adversarial Networks (GANs) are a class of generative models introduced by Ian Goodfellow in 2014. GANs consist of two neural networks:

Generator: Creates new data resembling the training data.
Discriminator: Evaluates whether a sample is real (from the training data) or fake (generated).

These networks are trained in a competitive process, known as adversarial training:

The generator tries to fool the discriminator by creating realistic samples.
The discriminator tries to correctly classify real and fake samples.

GAN Workflow

The generator produces a sample from random noise.
The discriminator evaluates the sample.
Both networks adjust their weights to improve their respective tasks.

Applications of GANs

Generating realistic images (e.g., AI-generated portraits).
Creating deepfake videos.
Data augmentation for imbalanced datasets.
Super-resolution: Enhancing image quality.

3. Introduction to VAEs

Variational Autoencoders (VAEs) are another type of generative model. They work by learning a compressed representation (latent space) of the input data, then generating new data by sampling from this latent space.

How VAEs Work

Encoder: Compresses input data into a latent space representation.
Latent Space: Represents the underlying structure of the data in a reduced form.
Decoder: Reconstructs data from the latent space.

VAEs differ from traditional autoencoders because they incorporate probabilistic sampling, which allows for smoother interpolation in the latent space and the generation of new data.

Applications of VAEs

Generating new images or videos.
Anomaly detection by comparing reconstructed and original data.
Creating music and sound synthesis.

4. Hands-On: Building a GAN for Image Generation

Step 1: Install Required Libraries

pip install tensorflow keras numpy matplotlib

Step 2: Import Libraries

import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

Step 3: Define the GAN Components

# Generator
def build_generator(latent_dim):
    model = tf.keras.Sequential([
        layers.Dense(256, activation='relu', input_dim=latent_dim),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Dense(28 * 28 * 1, activation='sigmoid'),
        layers.Reshape((28, 28, 1))
    ])
    return model

# Discriminator
def build_discriminator(input_shape):
    model = tf.keras.Sequential([
        layers.Flatten(input_shape=input_shape),
        layers.Dense(128, activation='relu'),
        layers.LeakyReLU(0.2),
        layers.Dense(1, activation='sigmoid')
    ])
    return model

Step 4: Train the GAN

Generate images using the generator.
Classify real vs. fake images using the discriminator.
Train both networks in an adversarial setup.

5. Hands-On: Building a VAE for Data Generation

Step 1: Define the Encoder and Decoder

# Encoder
latent_dim = 2
encoder_input = layers.Input(shape=(28, 28, 1))
x = layers.Flatten()(encoder_input)
x = layers.Dense(128, activation='relu')(x)
z_mean = layers.Dense(latent_dim, name='z_mean')(x)
z_log_var = layers.Dense(latent_dim, name='z_log_var')(x)

# Decoder
decoder_input = layers.Input(shape=(latent_dim,))
x = layers.Dense(128, activation='relu')(decoder_input)
x = layers.Dense(28 * 28, activation='sigmoid')(x)
decoder_output = layers.Reshape((28, 28, 1))(x)

Step 2: Train the VAE

Encode input data into latent space.
Sample from latent space using ( z = z_{\text{mean}} + \text{exp}(z_{\text{log_var}}) \cdot \epsilon ).
Decode the sampled points to reconstruct the input.

6. Comparison of GANs and VAEs

Feature	GANs	VAEs
Training	Adversarial (generator vs. discriminator)	Reconstruction (minimizing loss)
Output Quality	Often more realistic	Typically smoother, less sharp
Latent Space	No explicit latent space	Explicit latent space

7. Challenges with Generative Models

GANs: Training instability, mode collapse (generator produces limited diversity).
VAEs: Blurrier outputs compared to GANs.

8. Future Trends in Generative Models

Combining GANs and VAEs for enhanced performance.
Advanced architectures like StyleGAN and BigGAN.
Applications in drug discovery and creative arts.

~Trixsec

DEV Community