
Practice Fine-Tuning
Generative Adversarial Networks

Fine-tuning Generative Adversarial Networks (GANs) is both art and engineering. This guide walks you from the conceptual building blocks to hands-on recipes for stabilizing and improving synthesis quality. We'll use the snowGAN and the Rocky Mountain snowpack dataset as a concrete example, generating synthetic snowpack images like the ones shown below.


Synthetic image generated by the coreDiffusor

Photograph of a snowpack profile from the Rocky Mountain Snowpack dataset.



Getting Started

Start by installing snowGAN, a coarsely tuned GAN ready to be fine-tuned!


git clone https://github.com/RMDig/snowGAN.git
cd snowGAN
pip install -e .

Train from scratch


snowgan --mode train

Explore different GAN configurations; in this example we make the generator stronger!


snowgan --mode train --gen_kernel '5 5' --gen_stride '2 2' --gen_lr 0.005

Generate images with your trained model


snowgan --mode generate --checkpoint keras/snowgan/latest_gen.h5 --n_samples 64

Run snowgan --help to list every tunable hyperparameter, checkpoint flag, and data path.

Model Structure

High-level view

A generative adversarial network (GAN) consists of two networks trained against each other: a Generator (G) that maps random noise to images, and a Discriminator (D) that predicts whether samples are real or synthetic. Fine-tuning adjusts the model's architecture, optimization, and regularization so that G and D stay balanced while image quality improves.

The snowGAN uses a Wasserstein GAN with gradient penalty (WGAN-GP). Its loss is based on the Earth Mover's (Wasserstein) distance, which measures the difference between the real and generated data distributions; during training the generator tries to minimize this distance and ultimately produce images that resemble the real ones. The gradient penalty stabilizes training by keeping the discriminator's gradients well-behaved, which helps prevent instability during parameter updates.
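
The gradient penalty can be made concrete with a short sketch. The snippet below is a minimal TensorFlow/Keras illustration (the repo's keras/ checkpoints suggest that framework), not the exact snowGAN implementation; the discriminator argument stands in for any Keras critic model.

import tensorflow as tf

def gradient_penalty(discriminator, real_images, fake_images, lambda_gp=10.0):
    # Sample points on straight lines between real and generated images.
    batch_size = tf.shape(real_images)[0]
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images

    # Measure the critic's gradient at those interpolated points.
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = discriminator(interpolated, training=True)
    grads = tape.gradient(scores, interpolated)

    # Penalize deviations of the per-sample gradient norm from 1
    # (the 1-Lipschitz constraint that WGAN-GP enforces softly).
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return lambda_gp * tf.reduce_mean(tf.square(norms - 1.0))

This term is added to the discriminator loss, weighted by the gradient-penalty coefficient (lambda_gp here, the --disc_lambda_gp flag in the table below).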

Generator

  • Input: latent vector (e.g., a 100-d Gaussian sample).
  • Upsampling blocks: transpose-convolution layers that progressively increase the resolution of the noise tensor.
  • Normalization: batch normalization to stabilize layers and prevent exploding gradients.
  • Output: tanh activation mapping pixels to [-1, 1]; make sure training images are scaled to the same range.
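
As a rough picture of that block pattern, here is a minimal DCGAN-style generator sketch in TensorFlow/Keras. The latent size, filter counts, and output resolution are illustrative placeholders, not the snowGAN architecture itself.

import tensorflow as tf
from tensorflow.keras import layers

def build_generator(latent_dim=100, filters=(256, 128, 64)):
    model = tf.keras.Sequential(name="generator")
    model.add(tf.keras.Input(shape=(latent_dim,)))
    # Project the latent vector to a small spatial feature map.
    model.add(layers.Dense(4 * 4 * filters[0]))
    model.add(layers.Reshape((4, 4, filters[0])))
    # Each block doubles spatial resolution with a stride-2 transpose conv.
    for f in filters:
        model.add(layers.Conv2DTranspose(f, kernel_size=5, strides=2, padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.ReLU())
    # tanh maps pixels to [-1, 1]; training images must be scaled to match.
    model.add(layers.Conv2DTranspose(3, kernel_size=5, strides=1, padding="same", activation="tanh"))
    return model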

Discriminator

  • Input: an image at the target (x, y) resolution, real or fake.
  • Downsampling blocks: conv → activation → normalization. LeakyReLU is used to avoid dead neurons.
  • Output: a single scalar score for WGAN, or a probability for a vanilla GAN, predicting whether the image is real or fake.
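
A matching critic sketch, again illustrative rather than the exact snowGAN network. Layer normalization is used instead of batch normalization because batch norm interacts poorly with the per-sample WGAN-GP penalty.

import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator(image_shape=(64, 64, 3), filters=(64, 128, 256), negative_slope=0.25):
    model = tf.keras.Sequential(name="discriminator")
    model.add(tf.keras.Input(shape=image_shape))
    # Each block halves the spatial resolution with a stride-2 convolution.
    for f in filters:
        model.add(layers.Conv2D(f, kernel_size=5, strides=2, padding="same"))
        model.add(layers.LeakyReLU(negative_slope))
        model.add(layers.LayerNormalization())
    model.add(layers.Flatten())
    # Unbounded scalar score for WGAN; a vanilla GAN would add a sigmoid here.
    model.add(layers.Dense(1))
    return model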

Hyperparameters

Below are the tunable parameters, with guidance on how to change them and why.

Hyperparameter | CLI flag(s) | Default | Description
--- | --- | --- | ---
Latent Dimension | --latent_dim | 100 | Higher dims can encode more detail; 64–256 is typical. Increase slowly and watch mode coverage.
Filters Per Layer | --gen_filters, --disc_filters | [1024, 512, 256, 128, 64], [64, 128, 256, 512, 1024] | More filters mean more capacity but slower training and higher memory use. Filters are traditionally grown or cut by powers of two.
Kernel Size | --gen_kernel, --disc_kernel | [5, 5], [5, 5] | 3×3 is standard; 5×5 captures larger structures (good for texture); use 7×7 only when needed.
Kernel Stride | --gen_stride, --disc_stride | [2, 2], [2, 2] | Stride controls the upsampling factor per block. If checkerboard artifacts appear, prefer upsample+conv over transpose conv.
Training Steps | --gen_steps, --disc_steps | 3, 1 | Increasing D steps per G update can stabilize training when D is underpowered or for WGAN-style training.
Learning Rate | --gen_lr, --disc_lr | 0.001, 0.0001 | Common range: 1e-4 to 5e-3. Use a lower LR for a high-capacity G to avoid destabilizing D.
Adam Betas | --gan_beta_1, --gan_beta_2, --disc_beta_1, --disc_beta_2 | 0.5, 0.9, 0.5, 0.9 | beta_1 = 0.5 is commonly used for GANs; try 0.0–0.9 if momentum hurts convergence.
Negative Slope | --gan_negative_slope, --disc_negative_slope | 0.25, 0.25 | LeakyReLU slope; 0.01–0.3 is typical. Lower slopes behave more like ReLU; higher slopes give smoother gradients.
Gradient Penalty | --disc_lambda_gp | 10.0 | For WGAN-GP, 10 is common; reduce it if gradients vanish, increase it if the Lipschitz constraint is loose.

Rule of thumb: change one hyperparameter at a time and run for a small number of epochs to see the direction of its effect. Note that list-valued flags are passed as strings with items separated by spaces (e.g., --gen_kernel '5 5').

Hands-On Practice

Use these stepwise recipes on the Rocky Mountain snowpack dataset.

Quick sanity run (few epochs)

  1. Install & clone repository (see Getting Started above).
  2. Run a 10–20 epoch trial to check that the architecture and sampling pipeline start learning snowpack features:
    snowgan --mode train --epochs 20 --batch_size 32 --gen_lr 0.001 --disc_lr 0.0004
  3. Inspect generated samples in synthetics/ and logs.
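
For a quick visual check of the samples, a small matplotlib sketch like the one below tiles the latest outputs. It assumes the run writes PNG files into synthetics/; adjust the glob pattern to match your output format.

import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Grab up to 16 of the most recent samples (pattern is an assumption).
paths = sorted(glob.glob("synthetics/*.png"))[:16]
fig, axes = plt.subplots(4, 4, figsize=(8, 8))
for ax in axes.flat:
    ax.axis("off")
for ax, path in zip(axes.flat, paths):
    ax.imshow(mpimg.imread(path))
    ax.set_title(path.split("/")[-1], fontsize=6)
plt.tight_layout()
plt.show()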

Strengthen generator (if D too strong)

  1. Increase generator capacity slightly, or increase --gen_steps so G updates more per iteration.
  2. snowgan --mode train --gen_filters '1024 512 256 256 128' --gen_steps 3 --gen_lr 0.002 --disc_lr 0.0001

Transfer learning

  1. Load a pretrained generator with --gen_checkpoint (or a pretrained discriminator with --disc_checkpoint), e.g. keras/snowgan/generator.keras
  2. snowgan --mode train --gen_checkpoint keras/snowgan/generator.keras --gen_lr 1e-4

Fine-Tuning Strategies

Balance G & D

The primary goal when fine-tuning is to keep the Generator and Discriminator balanced: if D learns too fast, G gets no useful signal; if G overpowers D, training can collapse onto a few modes. Strategies:

  • Adjust learning rates: reduce D LR or increase G LR when D is too strong.
  • Update frequency: set --disc_steps > 1 when D is weak; lower it if D becomes too dominant.
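
The update-frequency idea lives in the training loop. A skeleton of that pattern, with disc_update and gen_update standing in for whatever step functions your implementation provides, might look like this:

def train(dataset, epochs, disc_update, gen_update, disc_steps=3, gen_steps=1):
    for epoch in range(epochs):
        for real_batch in dataset:
            # Several critic updates per generator update is common for
            # WGAN-style training or when the discriminator is underpowered.
            for _ in range(disc_steps):
                d_loss = disc_update(real_batch)
            for _ in range(gen_steps):
                g_loss = gen_update(len(real_batch))
        print(f"epoch {epoch}: d_loss={d_loss:.4f}  g_loss={g_loss:.4f}")

Raising disc_steps (or lowering gen_steps) strengthens D's training signal; the reverse favors G.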

Transfer Learning & Warm-Starts

Instead of training from scratch, initialize G (or D) from pretrained weights.
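
With a Keras-saved generator like the one referenced in the transfer-learning recipe above, a warm start can be as simple as loading the model, optionally freezing the earliest (coarse-structure) layers, and continuing training with a reduced learning rate. The layer count and optimizer settings below are illustrative.

import tensorflow as tf

# Load the pretrained generator (path from the transfer-learning recipe above).
generator = tf.keras.models.load_model("keras/snowgan/generator.keras")

# Optionally freeze the first few layers so coarse structure is preserved
# while later layers adapt to the new data.
for layer in generator.layers[:4]:
    layer.trainable = False

# Fine-tune with a smaller learning rate than training from scratch.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.5, beta_2=0.9)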

Progressive growing & multi-resolution

Train at low resolution, then fine-tune at higher resolutions. This reduces instability and speeds early learning.
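
A full progressive-growing setup also changes the architecture as resolution grows, but you can approximate the data side of the schedule by pretraining on downsampled images and then continuing on full-resolution ones. A minimal tf.data sketch, assuming the pipeline yields image tensors:

import tensorflow as tf

def at_resolution(dataset, size):
    # Resize every image (or image batch) in the pipeline to size x size.
    return dataset.map(lambda imgs: tf.image.resize(imgs, (size, size)))

# e.g. train first on at_resolution(train_ds, 64),
# then fine-tune on at_resolution(train_ds, 128).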

Further Reading & References

This guide condenses widely used practical strategies. If you want to dive deeper, search for:

  • Original GAN paper (Goodfellow et al.)
  • WGAN and WGAN-GP (Arjovsky et al., Gulrajani et al.)
  • Spectral Normalization for GANs
  • Progressive Growing of GANs
  • FID / Inception Score evaluation methodology