A Quick GAN Lesson

Essentially, GANs consist of two neural network agents/models (called the generator and the discriminator) that compete with one another in a zero-sum game, where one agent's gain is another agent's loss. The generator produces new, plausible examples from the problem domain, whereas the discriminator classifies examples as real (from the domain) or fake (generated). In subsequent iterations, the discriminator is updated to get better at telling real and fake samples apart, and the generator is updated based on how well its generated samples fooled the discriminator (Figure 1). Over their history, numerous architectural variations and improvements on the original GAN idea have been proposed in the literature. Most GANs today are at least loosely based on the DCGAN (Deep Convolutional Generative Adversarial Network) architecture, formalized by Alec Radford, Luke Metz, and Soumith Chintala in their 2015 paper. You’re likely to see DCGAN, LAPGAN, and PGAN used for unsupervised techniques like image synthesis, and CycleGAN and Pix2Pix used for cross-modality image-to-image translation.
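The zero-sum objective described above can be made concrete with a short sketch. The snippet below is a conceptual Python/NumPy illustration (not part of the MATLAB example; the function name and inputs are our own): it computes the standard discriminator loss and the commonly used non-saturating generator loss from the discriminator's probability outputs.

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Standard GAN losses from discriminator probabilities.

    d_real: discriminator outputs D(x) on real images
    d_fake: discriminator outputs D(G(z)) on generated images
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # The discriminator wants D(x) -> 1 and D(G(z)) -> 0
    loss_d = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # The generator wants D(G(z)) -> 1 (non-saturating form of its loss)
    loss_g = -np.mean(np.log(d_fake + eps))
    return loss_d, loss_g

# A confident discriminator yields a low discriminator loss
# and a high generator loss -- one agent's gain is the other's loss.
ld, lg = gan_losses([0.9, 0.8], [0.1, 0.2])
```

At the equilibrium the game aims for, the discriminator cannot do better than guessing (D ≈ 0.5 everywhere), and neither player can unilaterally improve.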
GANs for Medical Images

The use of GANs to create synthetic medical images is motivated by the following aspects:
- Medical (imaging) datasets are heavily unbalanced, i.e., they contain many more images of healthy patients than of any given pathology. The ability to create synthetic images (in different modalities) of specific pathologies could help alleviate the problem and provide more and better samples for a deep learning model to learn from.
- Manual annotation of medical images is a costly process (compared to similar tasks for generic everyday images, which could be handled using crowdsourcing or smart image labeling tools). If a GAN-based solution were reliable enough to produce appropriate images requiring minimal labeling/annotation/validation by a medical expert, the time and cost savings would be appealing.
- Because the images are synthetically generated, there are no patient data or privacy concerns.
On the other hand, there are important caveats to keep in mind:
- Domain experts would still be needed to assess the quality of synthetic images while the model is being refined, adding significant time to the process before a reliable synthetic medical image generator can be deployed.
- Since we are ultimately dealing with patient health, the stakes involved in training (or fine-tuning) predictive models using synthetic images are higher than using similar techniques for non-critical AI applications. Essentially, if models learn from data, we must trust the data that these models are trained on.
An example

Here is an example of how to use MATLAB to generate synthetic images of skin lesions. The training dataset consists of annotated images from the ISIC 2016 challenge, Task 3 (Lesion classification) dataset, containing 900 dermoscopic lesion images in JPEG format. The code is based on a MATLAB example that uses a more generic dataset, customized here for medical images. It highlights MATLAB’s recently added capabilities for handling more complex deep learning tasks, including the ability to:
- Create deep neural networks with custom layers, in addition to commonly used built-in layers.
- Train deep neural networks with a custom training loop and automatic differentiation.
- Process and manage mini-batches of images using custom mini-batch processing functions.
- Evaluate the model gradients for each mini-batch and update the generator and discriminator parameters accordingly.
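The MATLAB example implements these steps with its deep learning framework; as a language-agnostic illustration of the same loop structure (mini-batches, gradient evaluation, alternating parameter updates), here is a deliberately tiny NumPy sketch on 1-D toy data. Everything in it — the linear generator, the logistic discriminator, the toy distribution — is our own simplification for exposition, not code from the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Toy "real" data: samples from N(3, 0.5); the generator must learn to mimic it.
def real_batch(n):
    return rng.normal(3.0, 0.5, size=n)

# Generator g(z) = a*z + b and discriminator D(x) = sigmoid(w*x + c),
# both deliberately linear so the hand-derived gradients fit in a few lines.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.1, 0.0          # discriminator parameters
lr, batch = 0.05, 64

for step in range(2000):
    # --- process a mini-batch: real samples and generated samples ---
    x_real = real_batch(batch)
    z = rng.normal(size=batch)
    x_fake = a * z + b

    # --- evaluate discriminator gradients and update its parameters ---
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    grad_w = np.mean(-(1 - d_real) * x_real) + np.mean(d_fake * x_fake)
    grad_c = np.mean(-(1 - d_real)) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # --- evaluate generator gradients (non-saturating loss) and update ---
    d_fake = sigmoid(w * x_fake + c)
    grad_a = np.mean(-(1 - d_fake) * w * z)
    grad_b = np.mean(-(1 - d_fake) * w)
    a -= lr * grad_a
    b -= lr * grad_b

print(f"generated mean ~ {np.mean(a * rng.normal(size=1000) + b):.2f} (target 3.0)")
```

The real example replaces the two hand-derived gradient blocks with automatic differentiation and the linear models with deep convolutional networks, but the alternating-update skeleton is the same.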
Practical hints and tips

If you choose to go down the path of improving, expanding, and adapting the example to your needs, keep in mind that:
- Image synthesis using GANs is a very time-consuming process (as are most deep learning solutions). Be sure to secure as much computational power as you can.
- Several things can go wrong during training, and many of them can be detected by inspecting the training progress, among them: convergence failure (when the generator and discriminator do not reach a balance during training, with one of them overpowering the other) and mode collapse (when the GAN produces a small variety of images, with many duplicates and little diversity in the output). Our example doesn’t suffer from either problem.
- Your results may not look “great” (contrast Figure 4 with Figure 2), but that is to be expected: after all, in this example we are basically using the standard DCGAN architecture. Specialized work in synthetic skin lesion image generation has moved significantly beyond DCGAN; state-of-the-art solutions (such as the ones by Bissoto et al. and by Baur et al.) use more sophisticated architectures, normalization options, and validation strategies.
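One simple way to spot convergence failure early is to log the discriminator's average score on real and generated mini-batches: in a healthy run both tend to hover near 0.5, while a long stretch with D(real) near 1 and D(G(z)) near 0 suggests the discriminator is overpowering the generator. The sketch below is a crude heuristic of our own (the function name, window, and thresholds are arbitrary choices, not from the MATLAB example); note that mode collapse is usually easier to spot by visually inspecting generated batches for near-duplicates, since scores alone won't reveal it.

```python
def diagnose_gan(d_real_hist, d_fake_hist, window=50, hi=0.95, lo=0.05):
    """Crude heuristic flags from logged discriminator scores.

    d_real_hist / d_fake_hist: per-iteration mean of D(x) and D(G(z)).
    Averages the last `window` iterations and returns one of:
    'discriminator overpowering', 'generator overpowering', 'balanced'.
    """
    n = min(window, len(d_real_hist))
    r = sum(d_real_hist[-window:]) / n
    f = sum(d_fake_hist[-window:]) / n
    if r > hi and f < lo:
        return "discriminator overpowering"   # convergence failure risk
    if f > hi:
        return "generator overpowering"       # D no longer gives a useful signal
    return "balanced"
```

A check like this can run every few hundred iterations alongside the training progress plot, flagging runs worth stopping early instead of wasting hours of GPU time.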
Key takeaways

GANs (and their numerous variations) are here to stay. They are, according to Yann LeCun, “the coolest thing since sliced bread.” Many different GAN architectures have been successfully used for generating realistic (i.e., semantically meaningful) synthetic images, which may help train deep learning models in cases where real images are rare, difficult to find, or expensive to annotate. In this blog post we have used MATLAB to show how to generate synthetic images of skin lesions using a simple DCGAN and training images from the ISIC archive. Medical image synthesis is a very active research area, and new examples of successful applications of GANs in different medical domains, specialties, and image modalities are likely to emerge in the near future. If you’re interested in learning more about it, check out this review paper and use our example as a starting point for further experimentation.