Review of Medical Image Synthesis using GAN Techniques

Abstract: Generative Adversarial Networks (GANs) are among the most effective methods for generating large numbers of high-quality synthetic images. When medical images are used to diagnose particular diseases, a common problem is that collecting data is expensive, time-consuming, and may involve high radiation doses. GANs are deep learning methods that have been applied to image-to-image translation, for example from low-resolution to high-resolution images, from computed tomography (CT) to magnetic resonance imaging (MRI), and from 3T to 7T MRI, so that multimodal datasets can be obtained from a single modality. This review discusses different GAN architectures for medical image analysis.


Introduction
Today, the majority of medical professionals diagnose diseases with the help of computer-aided imaging. Some diseases, however, cannot be identified from low-resolution images. For example, it is difficult to segment the hippocampus in MRI when diagnosing Alzheimer's disease. Deep convolutional networks can help by generating high-resolution synthetic images together with their segmentations.
The GAN is a deep neural network framework introduced by a team of researchers led by Ian Goodfellow. A GAN consists of two competing neural network models. The first takes noise as input and generates samples, and is therefore called the generator. The second, known as the discriminator, receives samples from both the generator and the training data. In game-theoretic terms, the generator is trained to produce images that look like real images, whereas the discriminator learns to distinguish generated data from real data.
The GAN is trained as a minimax game in the sense of game theory, and the two networks try to reach a Nash equilibrium with respect to each other. In medical image synthesis, for example, a CT image can be obtained from an MR image: the generator produces the CT image, and the discriminator is trained to distinguish the generated CT image from the real reference CT image given to it. GANs typically generate sharper, higher-resolution images than models trained with maximum likelihood estimation (MLE).
In this review, the GAN principle and some of its variants are explained in Section 1 and Section 2, respectively. Section 3 reviews different GAN architectures, and Section 4 covers applications of GANs in medical image synthesis. Sections 5, 6, and 7 present the summary of the review, the future challenges and applications, and the conclusion.

Structural Variants
A GAN is a deep neural framework that contains two models, the generative model (G) and the discriminative model (D). G produces faux samples that look like training samples from a latent variable z, and these faux samples are passed to D. The discriminator finds the difference between real data and data generated by G. G tries to satisfy the discriminator with its faux samples; whenever the discriminator detects faux data, the error is backpropagated to the generator. This adversarial learning is formulated in equation (1):
$$\min_G \max_D V(D, G) = \mathbb{E}_{a \sim p_{data}(a)}[\log D(a)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \qquad (1)$$
The generator and discriminator models are differentiable functions whose weights and biases can be adjusted through the backpropagation algorithm to shape the probability density function (PDF). The generator satisfies the discriminator with faux data when the PDF of the input, pdata(a), is equal to the PDF of the data created by the generator, pg(a) [1].
In this case it is difficult to tell faux data from real data, so the discriminator outputs 0.5, which shows that D is confused. The objective of the discriminator is to maximize the two terms log(1 - D(G(z))) and log D(a), i.e. D(G(z)) = 0 and D(a) = 1, so that faux and real data are classified correctly. For a fixed generator, the optimal discriminator is given as:
$$D^*(a) = p(b = 1 \mid a) = \frac{p_{data}(a)}{p_{data}(a) + p_g(a)}$$
Here b = 0 denotes generated data and b = 1 denotes real data (b is assumed to be the output), and generated and real data are assumed to be equally probable. This condition shows that the GAN uses the relative behavior of the two distributions [3] to guide the generator toward producing realistic samples. The GAN thereby also measures the discrepancy seen by the discriminator.
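The equilibrium behavior described above can be checked numerically. The following is a minimal sketch, assuming the standard GAN value function and a four-bin discrete distribution; the distributions and the function names are illustrative assumptions, not taken from the reviewed works.

```python
import numpy as np

def value_function(p_data, p_g, d):
    """V(D, G) = E_{a~p_data}[log D(a)] + E_{a~p_g}[log(1 - D(a))] on a discrete support."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_g * np.log(1.0 - d)))

def optimal_discriminator(p_data, p_g):
    """Discriminator output that maximizes V for a fixed generator."""
    return p_data / (p_data + p_g)

# Hypothetical distributions over four bins (assumed for illustration only).
p_data = np.array([0.1, 0.4, 0.4, 0.1])    # real data
p_g_bad = np.array([0.4, 0.1, 0.1, 0.4])   # generator far from the data
p_g_good = p_data.copy()                   # generator that matches the data

d_bad = optimal_discriminator(p_data, p_g_bad)
d_good = optimal_discriminator(p_data, p_g_good)

print("D* for mismatched generator:", d_bad)
print("D* for matched generator:   ", d_good)
print("V at equilibrium:", value_function(p_data, p_g_good, d_good))
print("-log(4):         ", -np.log(4.0))
```

When the generated distribution matches the data, the best discriminator can do no better than output 0.5 everywhere, and the value function reaches -log 4.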

Different GAN Architectures
In this section, we introduce GAN architectures that are applied in medical applications, such as DCGAN, MGAN, CGAN, Cycle-GAN, ACGAN, and WGAN.

Cycle-GAN
Cycle-GAN [5] is a technique that uses the cycle-consistency property to train unsupervised image translation models with unpaired images. It translates the original input image into a fake image in the other domain and then converts the fake image back into the original input image. The forward and backward generators are named G and F, respectively, in Figure 2. Cycle consistency enforces F(G(A)) ≈ A and G(F(B)) ≈ B. Both generators work similarly to an auto-encoder [6]. A proper translation from one image domain to the other is obtained by combining the cycle-consistency loss with the adversarial loss.
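The cycle-consistency constraint can be sketched as a loss term. This is a minimal illustration, assuming an L1 penalty and using simple intensity transforms as hypothetical stand-ins for the two generator networks; real Cycle-GAN generators are convolutional networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the generators: G maps domain A to B, F maps B back to A.
def G(a):
    return 2.0 * a + 1.0

def F(b):
    return (b - 1.0) / 2.0

def F_imperfect(b):
    # Backward generator with a constant bias, so the cycle does not close exactly.
    return (b - 1.0) / 2.0 + 0.1

def cycle_consistency_loss(a, b, g, f):
    """Mean absolute error of F(G(A)) against A plus that of G(F(B)) against B."""
    forward = np.mean(np.abs(f(g(a)) - a))
    backward = np.mean(np.abs(g(f(b)) - b))
    return float(forward + backward)

a = rng.random((8, 8))   # toy image from domain A
b = rng.random((8, 8))   # toy, unpaired image from domain B

loss_exact = cycle_consistency_loss(a, b, G, F)
loss_biased = cycle_consistency_loss(a, b, G, F_imperfect)
print(loss_exact, loss_biased)
```

The loss vanishes only when each generator inverts the other; in training, this term is added to the adversarial loss.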
In the improved Cycle-GAN [7] architecture, a gradient consistency loss is added to improve the boundaries of lung nodule images between the MRI and CT images. This gives more accurate synthesis and segmentation results in the generated CT/MRI images.
A Cycle-GAN with an end-to-end synthesis and segmentation network can transfer labels from one modality (MRI) to a target modality (CT) [10]. The experimental results suggest that segmented CT images with multiple organ labels can be generated from MRI images. Tanner et al. [11] used the MR-GAN framework to train on paired and unpaired data together, combining several loss functions including adversarial loss, dual cycle-consistent loss, and voxel-wise loss. Cycle-GAN has also been used for deformable image registration (DIR) of lung inhale and exhale volume measurements [8], yielding the lowest mean absolute error between the paired and unpaired images.
A TD-GAN [12] combined with Cycle-GAN is used for pixel-to-pixel translation in chest X-ray image segmentation. A U-Net-based network [13] was trained in a supervised manner on simulated synthetic X-rays, known as Digitally Reconstructed Radiographs (DRRs), for bone extraction. The DRRs are reconstructed in the Cycle-GAN with the addition of a segmentation loss.

MGAN
The MGAN makes use of a pre-trained VGG19 network with fixed weights. It is placed at the beginning of both the generative and the discriminative network to extract feature maps and to make the network preserve the original image content. The generator transfers the style of these feature maps, with the target texture, to an image, and the discriminator transforms the real or texturized input image into VGG19 feature maps that are processed by a Fully Convolutional Network (FCN). The FCN classifies patches in the VGG19 feature-map space as real or fake. Because the generator tries to satisfy the discriminator with its faux images, it is trained to generate images whose VGG19 characteristics are similar to those of the original. A loss function calculated using VGG ensures that the content of the real input image is preserved while the style of the high-level features is transferred.

ACGAN
The auxiliary classifier GAN (ACGAN) gives good results with a limited dataset. In this network, side information is reconstructed by the discriminator.
Here, the discriminator model can adapt to multi-class identification and gives a probability distribution over sources and over class labels. Each sample in the generator has a corresponding class label (Fig. 4).
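The two discriminator outputs can be turned into a source loss and a class loss. The sketch below assumes a sigmoid source probability and softmax class logits; the batch values and the function name `acgan_discriminator_losses` are hypothetical and serve only as an illustration.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def acgan_discriminator_losses(source_prob, class_logits, is_real, labels):
    """Return (source loss, class loss) as mean negative log-likelihoods."""
    eps = 1e-12
    source_ll = (is_real * np.log(source_prob + eps)
                 + (1.0 - is_real) * np.log(1.0 - source_prob + eps))
    class_prob = softmax(class_logits)
    class_ll = np.log(class_prob[np.arange(len(labels)), labels] + eps)
    return float(-source_ll.mean()), float(-class_ll.mean())

# Hypothetical batch of four samples: two real, two generated, three classes.
source_prob = np.array([0.9, 0.8, 0.2, 0.1])        # predicted P(real)
is_real = np.array([1.0, 1.0, 0.0, 0.0])
class_logits = np.array([[4.0, 0.0, 0.0],
                         [0.0, 4.0, 0.0],
                         [0.0, 0.0, 4.0],
                         [4.0, 0.0, 0.0]])
labels = np.array([0, 1, 2, 0])

source_loss, class_loss = acgan_discriminator_losses(
    source_prob, class_logits, is_real, labels)
print(source_loss, class_loss)
```

Both terms use the class label attached to every sample, including the generated ones, which is what distinguishes this setup from a plain real-versus-fake discriminator.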

DCGAN
The deep convolutional GAN (DCGAN) [16] is an extension of the originally introduced GAN. DCGAN can maintain stability in the training process and create high-resolution images. It uses convolutional neural networks, with a learning phase and a generation phase, to generate synthetic images. During training, random noise is given as input to the generator, and the generator, built from multiple deconvolutional layers, produces sample images that look like real images. The discriminator tries to differentiate generated images from the training set of images [16]. DCGAN uses Batch Norm [17] to normalize the extracted features, and Leaky ReLU [18] to prevent dead gradients.
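The two building blocks named above can be sketched in a few lines. This is a simplified illustration: the batch normalization omits the learned scale and shift and normalizes a 2-D feature matrix over the batch axis only, and the slope of 0.2 is an assumed default.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance over the batch."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def leaky_relu(x, slope=0.2):
    """Pass positive values unchanged and scale negative values by a small slope."""
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(0)
features = rng.normal(5.0, 3.0, size=(64, 4))   # toy batch: 64 samples, 4 features

normalized = batch_norm(features)
activated = leaky_relu(normalized)
print(normalized.mean(axis=0), normalized.std(axis=0))
```

Because negative inputs keep a small non-zero slope, the gradient never becomes exactly zero, which is the property the text refers to as preventing dead gradients.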

Conditional GAN (CGAN) [19]
Side information, such as a class label, is passed to both the generator and the discriminator, as shown in Fig. 5. The generator learns distributions conditioned on the side information, as it can disentangle this from the overall latent space. Because the class label is given to the discriminator, the discriminator only has to distinguish faux data from real data.
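One common way to pass a class label to the generator is to concatenate its one-hot encoding with the latent vector. The following sketch assumes this concatenation scheme; the function names and dimensions are hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes):
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def conditional_generator_input(z, labels, num_classes):
    """Concatenate the latent vectors with the one-hot class labels."""
    return np.concatenate([z, one_hot(labels, num_classes)], axis=1)

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 16))           # four latent vectors of dimension 16
labels = np.array([0, 2, 1, 2])        # hypothetical class labels, three classes

generator_input = conditional_generator_input(z, labels, num_classes=3)
print(generator_input.shape)
```

The discriminator can be conditioned in the same way, by appending the label encoding to its input.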

InfoGAN
InfoGAN [20] is designed to maximize the mutual information between the latent code and the generated samples, and it learns disentangled features. It is trained without labels and classifies the data with an error rate of less than 6%. In InfoGAN, the class label is not given to the discriminator, so the discriminator has to predict the class label as well as discriminate between real and faux data (Fig. 6).
Fig. 6. InfoGAN

WGAN
In the original GAN framework, the distributions of real and generated images are compared using the Jensen-Shannon divergence (JSD). This can lead to diminished gradients and optimization problems, which result in mode collapse and instability. Hence, the Earth Mover (EM) or Wasserstein-1 distance is used in a newer architecture, the Wasserstein GAN (WGAN). The learning process of WGAN estimates the relationship between the real and generated distributions more closely. In practice, however, WGAN leads to a slow optimization process. The value function of WGAN is designed using the Kantorovich-Rubinstein duality [23] to obtain
$$\min_G \max_{D \in \mathcal{D}} \mathbb{E}_{a \sim p_{data}(a)}[D(a)] - \mathbb{E}_{z \sim p_z(z)}[D(G(z))]$$
where $\mathcal{D}$ is the set of 1-Lipschitz functions.
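The difference between the JSD and the Earth Mover distance can be illustrated on a toy example. The sketch below assumes two point-mass histograms on a shared 1-D grid, where the Wasserstein-1 distance reduces to the summed absolute difference of the cumulative distributions; the helper names are illustrative.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))

def wasserstein_1(p, q, bin_width=1.0):
    """Earth Mover distance between two histograms on the same 1-D grid."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))) * bin_width)

def point_mass(position, size=10):
    h = np.zeros(size)
    h[position] = 1.0
    return h

real = point_mass(0)
for shift in (1, 3, 6):
    fake = point_mass(shift)
    print(shift, jsd(real, fake), wasserstein_1(real, fake))
```

For non-overlapping distributions the JSD stays at log 2 regardless of how far apart they are, so it gives the generator no useful gradient, whereas the Earth Mover distance grows with the separation.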

Synthesis
Medical images can be synthesized with conditional and unconditional image generation techniques, each trained to reduce a loss function. The generative property of both supervised and unsupervised GANs has been used to synthesize different medical images. In the following, works from both frameworks are presented. The conditional GAN framework in particular is classified by the type of imaging modality.

Unconditional Image Synthesis
Many works have recently been proposed in the field of unsupervised medical image generation using GANs, which support data simulation [22] and reduce class imbalance as well as the data scarcity problem [21]. Early results showed that DCGAN can synthesize small image patches of benign prostate, prostatic hyperplasia, and prostatic carcinoma [24], digital retinal images [23], and lung cancer nodules [21]. A separate generative model is trained for each metastasis class. For liver lesion classification, synthetic samples from a GAN trained on heavily augmented data considerably improve a CNN classifier. Synthetic samples are also used to classify skin lesions [25]. Recently, a progressive GAN was proposed [26] to synthesize highly realistic-looking images of skin lesions that even skilled dermatologists could not differentiate from real samples.

Conditional Image Synthesis
Diagnosing certain diseases from medical images requires CT scans. However, CT imaging exposes the patient to radiation, which can cause cell damage and cancer. Hence, 3D Fully Convolutional Networks are cascaded to estimate the CT image from MRI images. To improve the resolution of the synthetic images, the adversarial model is trained with additional loss functions such as an image gradient loss and a pixel-wise reconstruction loss. The cascaded generators are combined to form an Auto-Context Model (ACM). The main idea of the ACM is to train each classifier with the feature data of the source image and the probability map from the previous classifier, which gives context information and allows the GAN to remove redundancy. Cycle GANs [27] are used to convert 2D MR images to CT images without requiring co-registered, explicit data pairs. Deep supervision of the discriminator is proposed to obtain feasible results in image-to-image translation; it takes advantage of features from a VGG16 model, which provide more exact gradient updates to the generator and help tell real and synthetic CT images apart.
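The two auxiliary losses mentioned above can be sketched as follows. This is a minimal illustration, assuming a mean-squared pixel-wise loss and a gradient loss built from finite differences; the exact formulations in the reviewed works may differ, and the toy images are hypothetical.

```python
import numpy as np

def reconstruction_loss(pred, target):
    """Pixel-wise mean squared error."""
    return float(np.mean((pred - target) ** 2))

def gradient_difference_loss(pred, target):
    """Penalize mismatch between absolute finite-difference gradients along each axis."""
    loss = 0.0
    for axis in (0, 1):
        grad_pred = np.abs(np.diff(pred, axis=axis))
        grad_target = np.abs(np.diff(target, axis=axis))
        loss += np.mean((grad_pred - grad_target) ** 2)
    return float(loss)

# Toy target with a sharp vertical edge, and a prediction with a softened edge.
target = np.zeros((8, 8))
target[:, 4:] = 1.0
blurred = target.copy()
blurred[:, 3] = 0.5

print(reconstruction_loss(blurred, target))
print(gradient_difference_loss(blurred, target))
```

The gradient term responds specifically to softened edges, which is why it is combined with the pixel-wise term when sharper synthetic images are wanted.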

Segmentation
Segmentation of various biological structures is an essential step for extracting a particular region. Manual segmentation cannot always give accurate results, which has led to the use of deep learning networks for analyzing diseases from medical images. In deep learning it is hard to optimize the model coefficients when annotated training data are lacking. GANs modify the CNN's top layers, such as CRFs and SSMs [28], in different learning flows to optimize the coefficients.

Reconstruction
Reconstruction is the process of obtaining tomographic images from projections. An L2 weight regularization loss function is used to reduce overfitting [32] when reconstructing thin-slice medical images from the usual thick-slice images. A GAN-based reconstruction framework using 3D-Y-Net-GAN and 3D-DenseU-Net is proposed to reconstruct thin-slice infant MR images [33].
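L2 weight regularization adds a penalty on the squared network weights to the data loss. The sketch below assumes a simple sum-of-squares penalty; the weight shapes and the `weight_decay` value are hypothetical.

```python
import numpy as np

def l2_regularized_loss(data_loss, weights, weight_decay=1e-4):
    """Add weight_decay times the sum of squared weights to the data loss."""
    penalty = sum(float(np.sum(w ** 2)) for w in weights)
    return data_loss + weight_decay * penalty

# Toy network parameters: one 2x2 weight matrix and one vector of length 3.
weights = [np.ones((2, 2)), np.ones(3)]
total = l2_regularized_loss(data_loss=1.0, weights=weights, weight_decay=0.1)
print(total)
```

Large weights raise the total loss, so the optimizer is pushed toward smaller weights, which limits overfitting.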

Registration
Parameter dependency and a heavy optimization load are the major disadvantages of the registration process. An improved adversarial image registration method based on WGAN, with its image transformation capabilities, has been used as an image alignment algorithm in MR-TRUS image registration. An unsupervised adversarial registration network does not need ground-truth deformations; instead, it uses the sum of squared differences and cross-correlation to define the similarity between pairs of images.
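The two similarity measures named above can be sketched directly. This is a minimal illustration, assuming a global normalized cross-correlation and a toy misalignment produced by a circular shift; a lower sum of squared differences and a higher cross-correlation both indicate better alignment.

```python
import numpy as np

def ssd(fixed, moving):
    """Sum of squared differences; zero for identical images."""
    return float(np.sum((fixed - moving) ** 2))

def normalized_cross_correlation(fixed, moving):
    """Global normalized cross-correlation; one for perfectly aligned images."""
    f = fixed - fixed.mean()
    m = moving - moving.mean()
    return float(np.sum(f * m) / np.sqrt(np.sum(f ** 2) * np.sum(m ** 2)))

rng = np.random.default_rng(0)
fixed = rng.random((16, 16))                 # toy reference image
moving = np.roll(fixed, 2, axis=1)           # same image, shifted by two pixels
realigned = np.roll(moving, -2, axis=1)      # shift undone, standing in for a registration result

print(ssd(fixed, moving), normalized_cross_correlation(fixed, moving))
print(ssd(fixed, realigned), normalized_cross_correlation(fixed, realigned))
```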

Classification
To diagnose cardiovascular diseases, two-stage SCGANs (Semi-Coupled GANs) [34] are proposed to completely classify the left ventricle, covering the basal to apical slices in CMR images. A combination of DCGAN-generated images [30] and real chest X-ray images is given to a DCNN to classify abnormalities; ReLU activation functions, L2 regularization, and cross-validation are used to improve the prediction accuracy. GAN-based synthetic data augmentation [35] plays a major role in liver lesion classification, helping to identify the middle stage of cancer along with the bone and lungs.

GAN architecture | Application | Reference
DCGAN | Synthesizes images of small patches of benign prostate, prostatic hyperplasia, and prostatic carcinoma | [24]
Progressive GAN | Synthesizes highly realistic-looking images of skin lesions | [26]
Cycle GANs | Change 2D MR to CT | [27]
Y-Net-GAN and 3D-DenseU-Net | Reconstruct thin-slice infant MR images | [33]
CollaGAN | Synthesizes images across different MR sequences | [31]
SCGANs | Completely classify the left ventricle, covering the basal to apical slices in CMR images | [34]

Future challenges
It is difficult to obtain stable values of metrics such as the structural similarity index (SSIM) and the mean absolute error (MAE) for quantitative evaluation, even though visual image quality is not needed during the reconstruction process. Mode collapse, in which the generator produces only similar images, is the most important issue. Diminished gradients and non-convergence are also major problems in GANs. A huge number of training steps is required to reduce the loss function, which makes it difficult to balance the generator and the discriminator in a single training run. This increases the training cost.
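The two evaluation metrics named above can be sketched as follows. This is a simplified illustration: the SSIM here is computed once over the whole image, whereas standard implementations average it over local windows, and the stabilizing constants follow the commonly used values of 0.01 and 0.03 times the data range. The test images are synthetic.

```python
import numpy as np

def mae(x, y):
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(x - y)))

def global_ssim(x, y, data_range=1.0):
    """Single-window structural similarity index computed over the whole image."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    numerator = (2.0 * mx * my + c1) * (2.0 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
reference = rng.random((32, 32))
noisy = np.clip(reference + rng.normal(0.0, 0.1, reference.shape), 0.0, 1.0)

print(mae(reference, noisy), global_ssim(reference, noisy))
```

MAE measures average intensity error, while SSIM compares luminance, contrast, and structure, so the two metrics can rank the same synthetic images differently.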

Future applications
GANs can be used to improve radiology workflow and patient care. An advantage of GANs is their ability to perform semi-supervised and unsupervised learning. Progressive GAN is applied to medical image analysis to improve resolution. GANs can also be used to predict future video frames and for 3D modeling.

Conclusion
GANs play a major role in cross-modality image synthesis. With the high-quality MR imaging modality, GANs can reduce data acquisition time by synthesizing a new sequence of data from a previously acquired one. The recently proposed CollaGAN [31] network synthesizes images across different MR sequences and fills in missing input data with a single generator and discriminator. The availability of huge MR datasets is the reason for the widespread use of MR in the GAN framework.
Adversarial training regularizes the texture and shape of the generator output, which is important in the reconstruction and segmentation of medical images. This technique significantly reduces the loss in the segmentation of the liver from 3D CT volumes. This paper has further focused on classification using data augmentation with domain shift, which generates tiny objects such as nodules, lesions, and cells. Many studies have shown that GANs can be used to produce synthesized images for diagnosing neurodegenerative diseases.