Advanced Autoencoder Techniques

Duration: 7 min

This module covers advanced techniques for autoencoders, a type of neural network used for unsupervised learning. It focuses on two variants, denoising and variational autoencoders, and shows how each is built and trained. These techniques underpin applications in data compression, anomaly detection, and generative modeling.

Denoising Autoencoders

Denoising autoencoders are trained to reconstruct clean data from corrupted inputs. Noise is added to the input during training while the clean data stays the target, so the autoencoder cannot simply copy its input. It learns to ignore the noise and capture the underlying structure of the data, which yields more robust features and better generalization.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

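# Corrupt inputs with Gaussian noise, keeping pixel values in [0, 1]
def add_noise(x, noise_factor=0.3):
  noise = np.random.normal(loc=0.0, scale=noise_factor, size=x.shape)
  noisy_x = np.clip(x + noise, 0., 1.)
  # Cast back to float32 so the noisy inputs match the dtype of the clean targets
  return noisy_x.astype('float32')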

# Load dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

# Add noise
noisy_x_train = add_noise(x_train)
noisy_x_test = add_noise(x_test)

# Build denoising autoencoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
decoded = layers.Dense(784, activation='sigmoid')(encoded)
denoising_autoencoder = models.Model(input_img, decoded)
denoising_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train the model: noisy images are the inputs, clean images are the targets
denoising_autoencoder.fit(noisy_x_train, x_train, epochs=50, batch_size=256, shuffle=True, validation_data=(noisy_x_test, x_test))

# Reconstruct (denoise) the noisy test images
decoded_imgs = denoising_autoencoder.predict(noisy_x_test)
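
One way to check whether the reconstructions are actually cleaner than the noisy inputs is to compare both against the clean images with a simple metric. The sketch below is an illustration, not part of the original lesson: the helper names `mse` and `psnr` are hypothetical, and the random arrays are stand-ins so the snippet runs without the trained model.

```python
import numpy as np

def mse(reference, estimate):
    """Mean squared error between two arrays of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    error = mse(reference, estimate)
    if error == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / error)

# Stand-in data (assumption): random "images" in [0, 1] with Gaussian noise added
rng = np.random.default_rng(0)
clean = rng.random((4, 784))
noisy = np.clip(clean + rng.normal(0.0, 0.3, clean.shape), 0.0, 1.0)

print("PSNR of noisy vs. clean (dB):", psnr(clean, noisy))
```

With the model above, the same helpers could be applied as `psnr(x_test, noisy_x_test)` and `psnr(x_test, decoded_imgs)`. A higher value for the second call would suggest that the autoencoder removed noise rather than adding distortion.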


Variational Autoencoders

Variational autoencoders (VAEs) are generative models that not only reconstruct the input data but also learn a probability distribution over the latent space. The encoder outputs a mean and a log-variance for each input instead of a single point, and the loss penalizes how far that distribution is from a standard normal. Sampling from the latent space and decoding the sample lets a VAE generate new data points that are similar to the training data, which makes VAEs useful for tasks like image generation and anomaly detection.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Custom layer: samples z with the reparameterization trick and registers the KL term
class Sampling(layers.Layer):
  def call(self, inputs):
    z_mean, z_log_var = inputs
    # KL divergence between N(z_mean, exp(z_log_var)) and N(0, I), summed over latent dimensions
    kld_loss = -0.5 * tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
    # Divide by 784 so the KL term is on the same per-pixel scale as the reconstruction loss
    self.add_loss(tf.reduce_mean(kld_loss) / 784.0)
    epsilon = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon

# Encoder
input_img = layers.Input(shape=(784,))
h1 = layers.Dense(256, activation='relu')(input_img)
z_mean = layers.Dense(2)(h1)
z_log_var = layers.Dense(2)(h1)
z = Sampling()([z_mean, z_log_var])

# Decoder
decoder_h = layers.Dense(256, activation='relu')
decoder_mean = layers.Dense(784, activation='sigmoid')
h_decoded = decoder_h(z)
decoded = decoder_mean(h_decoded)

# VAE model
vae = models.Model(input_img, decoded)

# Loss function: per-pixel binary cross-entropy (reconstruction) plus the KL term added by Sampling
vae.compile(optimizer='adam', loss='binary_crossentropy')

# Load dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

# Train the model (each image is both the input and the reconstruction target)
vae.fit(x_train, x_train, epochs=50, batch_size=128, validation_data=(x_test, x_test))
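
The sampling step and the KL term used in the VAE above can be checked numerically without TensorFlow. The following is a minimal NumPy sketch under stated assumptions: the function names `sample_latent` and `kl_to_standard_normal` are hypothetical, and the input values are made up for illustration. It mirrors the formulas in the code above rather than replacing them.

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + std * epsilon, with epsilon ~ N(0, I)."""
    epsilon = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * epsilon

def kl_to_standard_normal(z_mean, z_log_var):
    """Closed-form KL(N(mean, exp(log_var)) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + z_log_var - np.square(z_mean) - np.exp(z_log_var), axis=-1)

rng = np.random.default_rng(0)

# Two made-up encoder outputs with a 2-dimensional latent space
z_mean = np.array([[0.0, 0.0], [1.0, -1.0]])
z_log_var = np.zeros((2, 2))  # log-variance 0 means unit variance

z = sample_latent(z_mean, z_log_var, rng)
kl = kl_to_standard_normal(z_mean, z_log_var)

print("sampled z shape:", z.shape)
print("KL per sample:", kl)
```

The first row already matches a standard normal, so its KL term is zero. The second row has its mean shifted by one unit in each dimension, which gives a KL of 0.5 per dimension, or 1.0 in total. This is the penalty that pulls the encoder's distributions toward the prior during training.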

💡 Tip: When training VAEs, choose a latent space dimension that matches the complexity of your data. Too small a dimension may lead to underfitting, while too large a dimension may cause overfitting.

❓ What is the primary purpose of a denoising autoencoder?

- To classify data
- To reconstruct clean data from corrupted inputs
- To generate new data points
- To reduce dimensionality

❓ What is a key characteristic of Variational Autoencoders (VAEs)?

- They use a deterministic latent space
- They generate new data points by sampling from the latent space
- They do not require a loss function
- They use a single dense layer for encoding
