
MADGAN: unsupervised medical anomaly detection GAN using multiple adjacent brain MRI slice reconstruction

Abstract

Background

Unsupervised learning can discover various unseen abnormalities, relying on large-scale unannotated medical images of healthy subjects. Towards this, unsupervised methods reconstruct a 2D/3D single medical image to detect outliers either in the learned feature space or from high reconstruction loss. However, without considering continuity between multiple adjacent slices, they cannot directly discriminate diseases composed of the accumulation of subtle anatomical anomalies, such as Alzheimer’s disease (AD). Moreover, no study has shown how unsupervised anomaly detection is associated with either disease stages, various (i.e., more than two types of) diseases, or multi-sequence magnetic resonance imaging (MRI) scans.

Results

We propose unsupervised medical anomaly detection generative adversarial network (MADGAN), a novel two-step method using GAN-based multiple adjacent brain MRI slice reconstruction to detect brain anomalies at different stages on multi-sequence structural MRI: (Reconstruction) Wasserstein loss with Gradient Penalty + 100 \(\ell _1\) loss—trained on 3 healthy brain axial MRI slices to reconstruct the next 3 ones—reconstructs unseen healthy/abnormal scans; (Diagnosis) Average \(\ell _2\) loss per scan discriminates them, comparing the ground truth/reconstructed slices. For training, we use two different datasets composed of 1133 healthy T1-weighted (T1) and 135 healthy contrast-enhanced T1 (T1c) brain MRI scans for detecting AD and brain metastases/various diseases, respectively. Our self-attention MADGAN can detect AD on T1 scans at a very early stage, mild cognitive impairment (MCI), with area under the curve (AUC) 0.727, and AD at a late stage with AUC 0.894, while detecting brain metastases on T1c scans with AUC 0.921.

Conclusions

Similar to physicians’ way of performing a diagnosis, using massive healthy training data, our first multiple MRI slice reconstruction approach, MADGAN, can reliably predict the next 3 slices from the previous 3 ones only for unseen healthy images. As the first unsupervised various disease diagnosis, MADGAN can reliably detect the accumulation of subtle anatomical anomalies and hyper-intense enhancing lesions, such as (especially late-stage) AD and brain metastases on multi-sequence MRI scans.

Background

Machine learning has revolutionized life science research, especially in neuroimaging and bioinformatics [1, 2], such as by modeling interactions between whole brain genomics/imaging [3, 4] and identifying Alzheimer’s disease (AD)-related proteins [5]. In particular, deep learning can achieve accurate computer-assisted diagnosis when large-scale annotated training samples are available. In medical imaging, unfortunately, preparing such massive annotated datasets is often infeasible [6, 7]; to tackle this pervasive problem, researchers have proposed various data augmentation techniques, including generative adversarial network (GAN)-based ones [8,9,10,11,12,13]; alternatively, Rauschecker et al. combined convolutional neural networks (CNNs), feature engineering, and an expert-knowledge Bayesian network to derive brain magnetic resonance imaging (MRI) differential diagnoses that approach neuroradiologists’ accuracy for 19 diseases. However, even with these techniques, supervised learning still requires many images with pathological features, even for rare diseases, to make a reliable diagnosis; moreover, it can only detect pathologies it has already learned. In contrast, just as physicians notice previously unseen anomalies using prior knowledge of healthy body structure, unsupervised anomaly detection methods that leverage only large-scale healthy images can flag overlooked diseases wherever their generalization to an unseen image fails.

Towards this, researchers reconstructed a single medical image via GANs [14], autoencoders (AEs) [15], or a combination of both, since GANs can generate realistic images and AEs, especially variational AEs (VAEs), can directly map data onto its latent representation [16]; then, unseen images were scored by comparing them with reconstructed ones to discriminate a pathological image distribution (i.e., outliers either in the learned feature space or from high reconstruction loss). However, those single image reconstruction methods mainly target diseases that are easy to detect from a single image even for non-expert human observers, such as glioblastoma on MR images [16] and lung cancer on computed tomography (CT) images [15]. Without considering continuity between multiple adjacent images, they cannot directly discriminate diseases composed of the accumulation of subtle anatomical anomalies, such as AD. Moreover, no study has shown so far how unsupervised anomaly detection is associated with either disease stages, various (i.e., more than 2 types of) diseases, or multi-sequence MRI scans.

Fig. 1

Unsupervised medical anomaly detection framework: we train WGAN-GP w/\(\ell _1\) loss on 3 healthy brain axial MRI slices to reconstruct the next 3 ones, and test it on both unseen healthy and abnormal scans to classify them according to average \(\ell _2\) loss per scan

Therefore, this paper proposes unsupervised medical anomaly detection GAN (MADGAN), a novel two-step method using GAN-based multiple adjacent brain MRI slice reconstruction to detect various diseases at various stages on multi-sequence structural MRI (Fig. 1): (Reconstruction) Wasserstein loss with gradient penalty (WGAN-GP) [17, 18] + 100 \(\ell _1\) loss, trained on 3 healthy brain axial MRI slices to reconstruct the next 3 ones, reconstructs unseen healthy/abnormal scans; the \(\ell _1\) loss generalizes well only for unseen images with a similar distribution to the training images while the WGAN-GP loss captures recognizable structure; (Diagnosis) Average \(\ell _2\) loss per scan discriminates them, comparing the ground truth/reconstructed slices; the \(\ell _2\) loss clearly discriminates the healthy/abnormal scans as squared error becomes huge for outliers. Using receiver operating characteristic (ROC) curves and their areas under the curve (AUCs), we evaluate the diagnosis performance of AD on T1-weighted (T1) MRI scans, and brain metastases/various diseases (e.g., small infarctions, aneurysms) on contrast-enhanced T1 (T1c) MRI scans. Using 1133 healthy T1 and 135 healthy T1c scans for training, our self-attention (SA) MADGAN approach can detect AD at a very early stage, mild cognitive impairment (MCI), with AUC 0.727, and AD at a late stage with AUC 0.894, while detecting brain metastases with AUC 0.921.

Contributions Our main contributions are as follows:

  • MRI Slice Reconstruction This first multiple MRI slice reconstruction approach can reliably predict the next 3 slices from the previous 3 ones only for unseen images similar to training data by combining SAGAN and \(\ell _1\) loss.

  • Unsupervised Anomaly Detection This first unsupervised multi-stage anomaly detection reveals that, like physicians’ way of performing a diagnosis, massive healthy data can aid early diagnosis, such as of MCI, while also detecting late-stage disease much more accurately by discriminating with \(\ell _2\) loss.

  • Various Disease Diagnosis This first unsupervised various disease diagnosis can reliably detect the accumulation of subtle anatomical anomalies (e.g., AD), as well as hyper-intense enhancing lesions (e.g., brain metastases) on multi-sequence MRI scans.

Related work

Alzheimer’s disease diagnosis

Even though the clinical, social, and economic impact of early AD diagnosis is of paramount importance [19]—primarily associated with MCI detection [20]—it generally relies on subjective assessment by physicians (e.g., neurologists, geriatricians, and psychiatrists). The diagnosis typically considers two characteristics: (i) medial temporal lobe atrophy (particularly hippocampus, entorhinal cortex, and perirhinal cortex) and (ii) temporo-parietal cortical atrophy. Quantifying these structures is crucial for early AD diagnosis and its progression tracking [21]. Moreover, morphometry-based markers, such as gray matter volume and cortical thickness, can play a key role in brain atrophy assessment [22].

Towards quantitative and reproducible approaches, many traditional supervised machine learning-based methods, which rely on handcrafted MRI-derived features, were proposed in the literature [23, 24]. In this context, diffusion-weighted MRI tractography enables reconstructing the brain’s physical connections that can be subsequently investigated by complex network-based techniques. Lella et al. [25] employed the whole brain structural communicability as a graph-based metric to describe the AD-relevant brain connectivity disruption. This approach achieved comparable performance with classic machine learning models (namely, support vector machines, random forests, and artificial neural networks) in terms of classification and feature importance analysis.

In recent years, deep learning has achieved outstanding performance by exploiting multiple levels of abstraction and descriptive embeddings in a hierarchy of increasingly complex features [26]: Liu et al. devised a semi-supervised CNN to significantly reduce the need for labeled training data [27]; for clinical decision-making tasks, Suk et al. integrated multiple sparse regression models (i.e., deep ensemble sparse regression network) [28]; Spasov et al. proposed a parameter-efficient CNN for 3D separable convolutions, combining dual learning and a specific layer to predict the conversion from MCI to AD within 3 years [29]; different from CNN-based approaches, Parisot used a semi-supervised graph convolutional network trained on a sub-set of labeled nodes with diagnostic outcomes to represent sparse clinical data [30]. However, to the best of our knowledge, no existing work has conducted fully unsupervised anomaly detection for AD diagnosis since capturing subtle anatomical differences between MCI and AD is challenging.

Brain metastasis and various disease diagnosis

Along with neuro-degenerative diseases, MRI can also play a definite role in abnormality diagnosis. Whereas advanced cancer screening, imaging, and therapeutics can improve oncological patients’ survival and quality of life, brain metastases still remain major contributors of morbidity and mortality, especially for patients with lung cancer, breast cancer, or malignant melanoma [31]. To tackle this, previous computational methods have detected the brain metastases in either a supervised [13, 32] or semi-automatic manner [33, 34].

Detecting various other diseases, such as cerebral aneurysms, hemorrhage, and infarctions, also remains challenging [35, 36]. Therefore, similar to the brain metastases, researchers have mostly relied on supervised methods, especially CNN-based detection [37,38,39]. Recently, unsupervised anomaly segmentation methods have been applied to brain MRI datasets for detecting multiple sclerosis lesions [40] and glioblastoma [41]. However, it is difficult to directly compare our approach with such existing unsupervised anomaly detection methods on 3D medical images since we perform a whole-brain diagnosis (i.e., classification), instead of segmentation.

Unsupervised medical anomaly detection

Unsupervised disease diagnosis is challenging because it requires estimating healthy anatomy’s normative distributions only from healthy examples to detect outliers either in the learned feature space or from high reconstruction loss. The latest advances in deep learning, mostly GANs [8] and VAEs [42], have allowed for the accurate estimation of the high-dimensional healthy distributions. Except for discriminative boundary-based approaches including [43], almost all unsupervised medical anomaly detection studies have leveraged reconstruction: as pioneering research, Schlegl et al. proposed AnoGAN to detect outliers in the learned feature space of the GAN [44]; then, the same authors presented fast AnoGAN that can efficiently map query images onto the latent space [14]; since the reconstruction-based models often suffer from many false positives, Chen et al. penalized large deviations between original/reconstructed images in gliomas and stroke lesion detection on brain MRI [45]. However, to the best of our knowledge, all previous studies are based on 2D/3D single image reconstruction, without considering continuity between multiple adjacent slices. Moreover, no existing work has investigated how unsupervised anomaly detection is associated with either disease stages, various (i.e., more than two types of) diseases, or multi-sequence MRI scans.

Self-attention GANs (SAGANs)

Zhang et al. proposed SAGAN, which deploys an SA mechanism in the generator/discriminator of a GAN to learn global and long-range dependencies for diverse image generation [46]; for further performance improvement, they suggested applying the SA modules to large feature maps. SAGANs have shown great promise in various tasks, such as human pose estimation [47], image colorization [48], photo-realistic image de-quantization [49], and large-scale image generation [50]. This SAGAN trend also applies to medical imaging to extract multi-level features for better super-resolution/denoising and lesion characterization: to mitigate the problem of thin slice thickness, Kudo et al. and Li et al. applied the SA modules to GANs on CT and MRI scans, respectively [51, 52]; similarly, in [53], the authors proposed to fuse plane SA modules and depth SA modules for low-dose 3D CT denoising; Lan et al. synthesized multi-modal 3D brain images using SA conditional GAN [53]; Ali et al. incorporated SA modules into progressive growing of GANs to generate realistic and diverse skin lesion images for data augmentation [54]. However, to the best of our knowledge, no existing work has directly exploited the SAGAN for medical disease diagnosis.

Materials and methods

Datasets

AD dataset: OASIS-3

We use a longitudinal 3.0T MRI dataset of \(176 \times 240{/}176 \times 256\) T1 brain axial MRI slices containing both normal aging subjects/AD patients, extracted from the open access series of imaging studies-3 (OASIS-3) [55]. The \(176 \times 240\) slices are zero-padded to reach \(176 \times 256\) pixels. Relying on clinical dementia rating (CDR) [56], a common clinical scale for the staging of dementia, the subjects comprise:

  • Unchanged CDR = 0: Cognitively healthy population;

  • CDR = 0.5: Very mild dementia (\(\sim\) MCI);

  • CDR = 1: Mild dementia;

  • CDR = 2: Moderate dementia.

Since our dataset is longitudinal and the same subject’s CDRs may vary (e.g., CDR = 0 to CDR = 0.5), we only use scans with unchanged CDR = 0 to ensure that the healthy scans are reliably healthy. As CDRs are not always assessed simultaneously with the MRI acquisition, we label MRI scans with CDRs at the closest date. We only select brain MRI slices including hippocampus/amygdala/ventricles among the whole 256 axial slices per scan to avoid over-fitting from AD-irrelevant information; the atrophy of the hippocampus/amygdala/cerebral cortex, and enlarged ventricles are strongly associated with AD, and thus they mainly affect the AD classification performance of machine learning [57]. Moreover, we discard low-quality MRI slices. The remaining dataset is divided as follows:
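The labeling rule described above (assign each scan the CDR assessed at the closest date, and keep only subjects whose CDR never leaves 0 as healthy) can be sketched as follows. This is a minimal illustration, not the authors' code: the `(date, cdr)` tuple layout and the function names `closest_cdr` and `has_unchanged_cdr0` are assumptions made for the example.

```python
from datetime import date


def closest_cdr(scan_date, assessments):
    """Return the CDR whose assessment date is nearest to scan_date.

    assessments: list of (date, cdr) tuples for one subject
    (a hypothetical layout chosen for this sketch).
    """
    nearest = min(assessments, key=lambda a: abs((a[0] - scan_date).days))
    return nearest[1]


def has_unchanged_cdr0(assessments):
    """True when every assessment of the subject is CDR = 0."""
    return all(cdr == 0 for _, cdr in assessments)


# Toy subject whose CDR changes from 0 to 0.5 over time.
assessments = [(date(2010, 1, 15), 0.0), (date(2012, 3, 1), 0.5)]
label = closest_cdr(date(2011, 12, 1), assessments)
healthy = has_unchanged_cdr0(assessments)
```

In this toy case the scan is closer in time to the second assessment, so it is labeled CDR = 0.5, and the subject is excluded from the unchanged CDR = 0 group.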

  • Training set: Unchanged CDR = 0 (408 subjects/1133 scans/57,834 slices);

  • Test set: Unchanged CDR = 0 (168 subjects/473 scans/24,278 slices),

    CDR = 0.5 (152 subjects/253 scans/13,813 slices),

    CDR = 1 (90 subjects/135 scans/7532 slices),

    CDR = 2 (6 subjects/10 scans/500 slices).

All scans of the same subject are placed in the same set. The datasets are strongly biased towards healthy scans, similar to MRI inspection in the clinical routine. During training for reconstruction, we only use the training set (structural MRI alone), which contains healthy slices, to conduct unsupervised learning. We do not use a validation set as our unsupervised diagnosis step is non-trainable.

Brain metastasis and various disease dataset

This paper also uses a non-longitudinal, heterogeneous 1.5T/3.0T MRI dataset of \(190 \times 224{/}216 \times 256{/}256 \times 256{/}460 \times 460\) T1c brain axial MRI slices. This dataset was collected by the authors at the National Center for Global Health and Medicine, and is not publicly available due to ethical restrictions. The dataset contains healthy subjects, brain metastasis patients [33], and patients with various diseases different from brain metastases. The slices are resized to \(176 \times 256\) pixels. The various diseases include but are not limited to:

  • Small infarctions;

  • Aneurysms;

  • Benign tumors;

  • Hemorrhages;

  • Cysts;

  • White matter lesions;

  • Post-operative inflammations.

Conforming to T1 slices, we also only select T1c slices including hippocampus, amygdala, and ventricles—a large portion of various diseases also appear in the mid-brain. The remaining dataset is divided as follows:

  • Training set: Normal (135 subjects/135 scans/7793 slices);

  • Test set: Normal (58 subjects/58 scans/3353 slices),

    Brain Metastases (79 subjects/79 scans/4872 slices),

    Various Diseases (66 subjects/66 scans/4195 slices).

Since we cannot collect healthy T1c scans on the scale of the OASIS-3 dataset, during training for reconstruction, we use both T1/T1c training sets containing healthy slices simultaneously for knowledge transfer. In clinical practice, T1c MRI is well-established in detecting various diseases, including brain metastases [58], thanks to its high contrast in the enhancing region; however, the contrast agent is not suitable for screening studies. Accordingly, such inter-sequence knowledge transfer is valuable in computer-assisted MRI diagnosis. During testing, we make an unsupervised diagnosis on T1 and T1c scans separately.

Fig. 2

Proposed MADGAN architecture for the next 3-slice generation from the input 3 \(256 \times 176\) brain MRI slices: 3-SA MADGAN has only 3 (red-contoured) SA modules after convolution/deconvolution whereas 7-SA MADGAN has 7 (red- and blue-contoured) SA modules. Similar to RGB images, we concatenate adjacent 3 gray slices into 3 channels

MADGAN-based multiple adjacent brain MRI slice reconstruction

To model strong consistency in healthy brain anatomy (Fig. 1), in each scan, we reconstruct the next 3 MRI slices from the previous 3 ones using an image-to-image GAN (e.g., if a scan includes 40 slices \(s_i\) for \(i=1,\dots ,40\), we reconstruct all possible 35 setups: \((s_i)_{i\in \{1,2,3\}} \mapsto (s_i)_{i\in \{4,5,6\}}\); \((s_i)_{i\in \{2,3,4\}} \mapsto (s_i)_{i\in \{5,6,7\}}\); ...; \((s_i)_{i\in \{35,36,37\}} \mapsto (s_i)_{i\in \{38,39,40\}}\)). As Fig. 2 shows, our MADGAN uses a U-Net-like [59, 60] generator with 4 convolutional layers in encoders and 4 deconvolutional layers in decoders respectively with skip connections, as well as a discriminator with 3 decoders. We apply batch normalization to both convolution with leaky rectified linear unit (ReLU) and deconvolution with ReLU. Between the designated convolutional/deconvolutional layers and batch normalization layers, we apply SA modules [46] for effective knowledge transfer via feature recalibration between T1 and T1c slices; as confirmed on four different image datasets [61], introducing the SA modules to GAN-based anomaly detection (i.e., attention-driven, long-range dependency modeling) can also mitigate the effect of noise by ignoring irrelevant disturbances and focusing on the salient body parts in the slice. We compare the MADGAN models with a different number of the SA modules: (i) no SA modules (i.e., MADGAN); (ii) 3 (red-contoured) SA modules (i.e., 3-SA MADGAN); (iii) 7 (red- and blue-contoured) SA modules (i.e., 7-SA MADGAN). To confirm how reconstructed slices’ realism and anatomical continuity affect medical anomaly detection, we also compare the MADGAN models with different loss functions: (i) WGAN-GP loss + 100 \(\ell _1\) loss (i.e., MADGAN); (ii) WGAN-GP loss (i.e., MADGAN w/o \(\ell _1\) loss). The \(\ell _1\) and \(\ell _2\) losses between an input image x and its reconstructed image \(x'\) are defined as follows:

$$\begin{aligned} \ell _1&= \sum _{i=1}^{P} |x_i - x'_i|, \end{aligned}$$
(1)
$$\begin{aligned} \ell _2&= \sum _{i=1}^{P} (x_i - x'_i)^2, \end{aligned}$$
(2)

where P denotes the number of pixels.
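For orientation, the "WGAN-GP loss + 100 \(\ell _1\) loss" configuration can be written out using the standard WGAN-GP formulation. This is a sketch under assumptions rather than the exact objective used here: G and D denote the generator and discriminator, z (a symbol introduced for this illustration) denotes the 3 input slices so that \(x' = G(z)\), \(\hat{x} = \epsilon x + (1-\epsilon ) x'\) with \(\epsilon \sim U[0,1]\) is a random interpolate, and \(\lambda\) is the gradient penalty weight (10 in the original WGAN-GP formulation; the value used for MADGAN is not stated in the text).

$$\begin{aligned} \mathcal {L}_D&= \mathbb {E}\left[ D(x')\right] - \mathbb {E}\left[ D(x)\right] + \lambda \, \mathbb {E}\left[ \left( \Vert \nabla _{\hat{x}} D(\hat{x})\Vert _2 - 1\right) ^2\right] , \\ \mathcal {L}_G&= -\mathbb {E}\left[ D(x')\right] + 100\, \ell _1, \end{aligned}$$

where \(\ell _1\) is the reconstruction loss of Eq. (1). Dropping the \(100\, \ell _1\) term from \(\mathcal {L}_G\) gives the "MADGAN w/o \(\ell _1\) loss" variant.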

Implementation details Each MADGAN training lasts for \(1.8 \times 10^{6}\) steps with a batch size of 16 (our maximum available batch size). We use the Adam optimizer [62] with a learning rate of \(2.0 \times 10^{-4}\). As with RGB images, we concatenate 3 adjacent grayscale slices into 3 channels. During training, the generator uses two dropout [63] layers with a rate of 0.5. We flip the discriminator’s real/synthetic labels once every three times for robustness. Using 4 NVIDIA Quadro GV100 graphics processing units, we implement the framework on TensorFlow 1.8.
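The slice windowing and the two losses can be illustrated with a short NumPy sketch. It is not the authors' TensorFlow implementation: the function names, the channels-last layout, and the toy \(4 \times 4\) slice size are assumptions made for the example, and a scan is assumed to contain at least 6 slices.

```python
import numpy as np


def make_slice_pairs(scan):
    """Split a scan of shape (n, H, W) into (input, target) windows.

    Each input stacks slices i..i+2 as 3 channels and each target stacks
    slices i+3..i+5, giving arrays of shape (n - 5, H, W, 3).
    Assumes n >= 6.
    """
    n = scan.shape[0]
    inputs = np.stack(
        [np.moveaxis(scan[i:i + 3], 0, -1) for i in range(n - 5)])
    targets = np.stack(
        [np.moveaxis(scan[i + 3:i + 6], 0, -1) for i in range(n - 5)])
    return inputs, targets


def l1_loss(x, x_rec):
    """Sum of absolute pixel differences, as in Eq. (1)."""
    return np.abs(x - x_rec).sum()


def l2_loss(x, x_rec):
    """Sum of squared pixel differences, as in Eq. (2)."""
    return ((x - x_rec) ** 2).sum()


# Toy scan with 40 slices of 4 x 4 pixels (real slices are much larger).
scan = np.random.default_rng(0).random((40, 4, 4))
inputs, targets = make_slice_pairs(scan)
```

With 40 slices this yields the 35 input/target setups mentioned in the text, and the first channel of the first target is the fourth slice of the scan.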

Unsupervised medical anomaly detection

During diagnosis, we classify unseen healthy and abnormal scans based on average \(\ell _2\) loss per scan. The average \(\ell _2\) loss is calculated over all MADGAN-reconstructed 3-slice sets of each scan containing n slices \(s_i\): \((s_i)_{i\in \{4,5,6\}}\); \((s_i)_{i\in \{5,6,7\}}\); ...; \((s_i)_{i\in \{n-2,n-1,n\}}\). We use the \(\ell _2\) loss since squared error is sensitive to outliers and it significantly outperformed other losses (i.e., \(\ell _1\) loss, dice loss, structural similarity loss) in our preliminary paper [64]. To evaluate its unsupervised AD diagnosis performance on a T1 MRI test set, we show ROCs, along with the AUC values, between CDR = 0 versus (i) all the other CDRs; (ii) CDR = 0.5; (iii) CDR = 1; (iv) CDR = 2. We also show the AUCs under different training steps (i.e., 150k, 300k, 600k, 900k, 1.8M steps) and confirm the effect of calculating average \(\ell _2\) loss (among whole slice sets or continuous 10 slice sets exhibiting the highest loss) per scan; if the 10 slice sets start from the jth slice, we use: \((s_i)_{i\in \{j,j+1,j+2\}}\); \((s_i)_{i\in \{j+1,j+2,j+3\}}\); ...; \((s_i)_{i\in \{j+9,j+10,j+11\}}\). Moreover, we visualize pixelwise \(\ell _2\) loss between real/reconstructed 3 slices, along with distributions of average \(\ell _2\) loss per scan of CDR = 0/0.5/1/2 to know how disease stages affect its discrimination. In exactly the same manner, we evaluate the diagnosis performance of brain metastases/various diseases on a T1c MRI test set, showing ROCs/AUCs between normal versus (i) brain metastases + various diseases; (ii) brain metastases; (iii) various diseases.
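The diagnosis step can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the evaluation code behind the reported numbers: `scan_score` and `auc` are hypothetical names, "continuous 10 slice sets exhibiting the highest loss" is interpreted here as the contiguous run of 10 sets with the highest mean loss, and the AUC is computed through its rank interpretation (the probability that an abnormal scan scores higher than a healthy one, counting ties as one half).

```python
import numpy as np


def scan_score(set_losses, top_k=None):
    """Average l2 loss per scan.

    set_losses: one l2 loss per reconstructed 3-slice set, in slice order.
    top_k=None averages over all sets; top_k=10 averages over the
    contiguous run of 10 sets with the highest mean loss (assumed reading).
    """
    losses = np.asarray(set_losses, dtype=float)
    if top_k is None or len(losses) <= top_k:
        return losses.mean()
    # Moving average over every contiguous run of top_k sets.
    run_means = np.convolve(losses, np.ones(top_k) / top_k, mode="valid")
    return run_means.max()


def auc(healthy_scores, abnormal_scores):
    """Area under the ROC curve via pairwise comparison of scan scores."""
    h = np.asarray(healthy_scores, dtype=float)[:, None]
    a = np.asarray(abnormal_scores, dtype=float)[None, :]
    wins = (a > h).sum() + 0.5 * (a == h).sum()
    return wins / (h.size * a.size)


# Toy scores: higher average l2 loss should indicate an abnormal scan.
healthy = [scan_score([0.1, 0.2, 0.3]), scan_score([0.2, 0.2, 0.2])]
abnormal = [scan_score([0.4, 0.6, 0.8]), scan_score([0.3, 0.5, 0.4])]
toy_auc = auc(healthy, abnormal)
```

Because a single threshold on the per-scan score is swept to trace the ROC curve, no trainable classifier is involved in this step, which is consistent with the absence of a validation set.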

Fig. 3

Example T1 brain MRI slices with CDR = 0/0.5/1/2 from a test set: a Input 3 real slices; b Ground truth next 3 real slices; c, d Next 3 slices reconstructed by MADGAN and 7-SA MADGAN. To compare the real/reconstructed next 3 slices, we show pixelwise \(\ell _2\) loss values in (b) versus (c) and (b) versus (d) columns, respectively. Using a Jet colormap in [0, 0.2] with alpha-blending, we overlay the obtained maps onto the ground truth slices. The achieved slice-level, pixelwise \(\ell _2\) loss values are also displayed

Fig. 4

Example T1c brain MRI slices with no abnormal findings/three brain metastases from a test set: a Input 3 real slices; b Ground truth next 3 real slices; c, d Next 3 slices reconstructed by MADGAN and 7-SA MADGAN. To compare the real/reconstructed next 3 slices, we show pixelwise \(\ell _2\) loss values in (b) versus (c) and (b) versus (d) columns, respectively. Using a Jet colormap in [0, 0.06] with alpha-blending, we overlay the obtained maps onto the ground truth slices. The achieved slice-level, pixelwise \(\ell _2\) loss values are also displayed

Fig. 5

Example T1c brain MRI slices with four different brain diseases from a test set: a Input 3 real slices; b Ground truth next 3 real slices; c, d Next 3 slices reconstructed by MADGAN and 7-SA MADGAN. To compare the real/reconstructed next 3 slices, we show pixelwise \(\ell _2\) loss values in (b) versus (c) and (b) versus (d) columns, respectively. Using a Jet colormap in [0, 0.06] with alpha-blending, we overlay the obtained maps onto the ground truth slices. The achieved slice-level, pixelwise \(\ell _2\) loss values are also displayed

Results

Reconstructed brain MRI slices

Figure 3 illustrates example real T1 MRI slices from a test set and their reconstruction by MADGAN and 7-SA MADGAN. Similarly, Figs. 4 and 5 show example real T1c MRI slices and their reconstructions. Pixelwise \(\ell _2\) loss tends to increase (i.e., high intensity in the heatmap) around lesions due to their different image distribution from healthy samples.

Fig. 6

Distributions of average \(\ell _2\) loss per scan evaluated on T1 slices with CDR = 0/0.5/1/2 reconstructed by: a MADGAN and b 7-SA MADGAN

Fig. 7

Distributions of average \(\ell _2\) loss per scan evaluated on T1c slices with no abnormal findings/brain metastases/various diseases reconstructed by: a MADGAN and b 7-SA MADGAN

Figures 6 and 7 indicate distributions of average \(\ell _2\) loss per scan on T1 and T1c scans, respectively. Leveraging \(\ell _1\) loss’ good realism sacrificing diversity (i.e., generalizing well only for unseen images with a similar distribution to training images) and WGAN-GP loss’ ability to capture recognizable structure, the MADGAN can successfully capture T1-specific appearance and anatomical changes from the previous 3 slices. Meanwhile, the 7-SA MADGAN tends to be less stable in keeping texture but more sensitive to abnormal anatomical changes due to the SA modules’ anomaly-sensitive reconstruction via the attention-driven, long-range dependency modeling, resulting in moderately higher average \(\ell _2\) loss than the MADGAN.

Since the models are trained only on healthy slices, as visualized by a superimposed Jet colormap, reconstruction tends to fail comparatively more on slices with higher CDRs, especially around the hippocampus, amygdala, cerebral cortex, and ventricles, due to their insufficient atrophy after reconstruction; this is plausible because physicians also perform the AD diagnosis based on their prior knowledge of normal atrophy around those body parts. Apart from these regions, we do not find other significant reconstruction failures, bearing in mind that inter-subject/sequence variability also leads to considerable reconstruction failures. The T1c scans show much lower average \(\ell _2\) loss than the T1 scans due to darker texture. Since most training images are T1 slices with brighter texture than the T1c slices, reconstruction quality clearly decreases on the T1c slices, which occasionally exhibit bright texture. Accordingly, reconstruction failure from anomaly contributes comparatively less to the average \(\ell _2\) loss, especially when local small lesions, such as brain abscesses and enhanced lesions, appear, unlike global big lesions, such as multiple cerebral infarction and blood component retention. However, the average \(\ell _2\) loss remarkably increases on brain metastases scans due to their hyper-intensity, especially for the 7-SA MADGAN.

Fig. 8

AUC performance on T1 scans using average \(\ell _2\) loss per scan under different training steps (i.e., 150k, 300k, 600k, 900k, 1.8M steps). Unchanged CDR = 0 (i.e., cognitively healthy population) is compared against: a all the other CDRs (i.e., dementia); b CDR = 0.5 (i.e., very mild dementia); c CDR = 1 (i.e., mild dementia); d CDR = 2 (i.e., moderate dementia)

Fig. 9

AUC performance on T1c scans using average \(\ell _2\) loss per scan under different training steps (i.e., 150k, 300k, 600k, 900k, 1.8M steps). No abnormal findings are compared against: a brain metastases + various diseases; b brain metastases; c various diseases

Unsupervised anomaly detection results

Figures 8 and 9 show AUCs of unsupervised anomaly detection on T1 and T1c scans under different training steps, respectively. The AUCs generally increase as training progresses, but more SA modules require more training steps until convergence due to their feature recalibration. Although most models converge after 900k steps, MADGAN with abundant SA modules might perform even better with continued training, especially on the T1c scans, which have less training data than the T1 scans.

All the best results in specific tasks, except for CDR = 0 versus CDR = 0.5, are from the SA models (e.g., 7-SA MADGAN w/o \(\ell _1\) loss under 900k steps: AUC 0.783 in CDR = 0 versus CDR = 0.5 + 1 + 2, 3-SA MADGAN under 300k steps: AUC 0.966 in normal versus brain metastases, 3-SA MADGAN under 600k steps: AUC 0.638 in normal versus various diseases); thus, whereas the SA models, which do not know the task to optimize in an unsupervised manner, perform unstably, we might use them similar to supervised learning if we could obtain good parameters for a certain disease. Without \(\ell _1\) loss, the AUCs tend to decrease, also accompanying large fluctuations; 7-SA MADGAN w/o \(\ell _1\) loss performs well on the T1 scans but poorly on the T1c scans due to the instability.

Fig. 10

Unsupervised anomaly detection results using average \(\ell _2\) loss per scan on reconstructed T1 slices (ROCs and AUCs). Unchanged CDR = 0 (i.e., cognitively healthy population) is compared against: a all the other CDRs (i.e., dementia); b CDR = 0.5 (i.e., very mild dementia); c CDR = 1 (i.e., mild dementia); d CDR = 2 (i.e., moderate dementia). Each model is trained for 1.8M steps

Fig. 11

Unsupervised anomaly detection results using average \(\ell _2\) loss per scan on reconstructed T1c slices (ROCs and AUCs). No abnormal findings are compared against: a brain metastases + various diseases; b brain metastases; c various diseases. Each model is trained for 1.8M steps

Figures 10 and 11 illustrate ROC curves and their AUCs on T1 and T1c scans under 1.8M training steps, respectively. Since brains with higher CDRs exhibit stronger anatomical atrophy than healthy brains, their AUCs against unchanged CDR = 0 remarkably increase as CDRs increase. MADGAN and 7-SA MADGAN both achieve good AUCs, especially for higher CDRs: the MADGAN obtains AUC 0.750/0.707/0.829 in CDR = 0 versus CDR = 0.5/1/2, respectively; the discrimination between healthy subjects versus MCI patients (i.e., CDR = 0 versus CDR = 0.5) is extremely difficult even in a supervised manner [57]. Whereas detecting various diseases is difficult in an unsupervised manner, the 7-SA MADGAN outperforms the MADGAN and achieves AUC 0.921 in brain metastases detection. As Tables 1 and 2 show, the effect of how the average \(\ell _2\) loss per scan is calculated (among whole slice sets or continuous 10 slice sets exhibiting the highest loss) is limited. Although no significant differences exist between the two, the best performing approach on each dataset is always based on whole slice sets.

Table 1 AUC performance of unsupervised anomaly detection on T1 scans using average \(\ell _2\) loss (among whole slice sets/continuous 10 slice sets exhibiting the highest loss) per scan. Unchanged CDR = 0 (i.e., cognitively healthy population) is compared against: (i) all the other CDRs (i.e., dementia); (ii) CDR = 0.5 (i.e., very mild dementia); (iii) CDR = 1 (i.e., mild dementia); (iv) CDR = 2 (i.e., moderate dementia). Each model is trained for 1.8M steps
Table 2 AUC performance of unsupervised anomaly detection on T1c scans using average \(\ell _2\) loss (among whole slice sets/continuous 10 slice sets exhibiting the highest loss) per scan. No abnormal findings are compared against: (i) brain metastases + various diseases; (ii) brain metastases; (iii) various diseases. Each model is trained for 1.8M steps

Discussion and conclusions

Using massive healthy data, our MADGAN-based multiple MRI slice reconstruction can reliably discriminate AD patients from healthy subjects for the first time in an unsupervised manner. To detect the accumulation of subtle anatomical anomalies, our solution leverages a two-step approach: (Reconstruction) \(\ell _1\) loss generalizes well only for unseen images with a distribution similar to that of the training images, while WGAN-GP loss captures recognizable structure; (Diagnosis) \(\ell _2\) loss clearly discriminates healthy/abnormal data, as squared error becomes huge for outliers. Using 1133 healthy T1 MRI scans for training, our approach can detect AD at a very early stage, MCI, with AUC 0.727 while detecting AD at a late stage with AUC 0.894. Accordingly, this first unsupervised anomaly detection across different disease stages reveals that, as in a physician's diagnosis, large-scale healthy data can reliably aid early diagnosis, such as that of MCI, while also detecting late-stage disease much more accurately.
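The remark that squared error becomes huge for outliers can be checked with a small numeric sketch. The residual values below are invented for illustration (intensities assumed to lie in [0, 1]), and the helper names `l1` and `l2` are hypothetical; the point is only that a localized reconstruction failure raises the mean squared error far more, in relative terms, than the mean absolute error.

```python
import numpy as np


def l1(residual):
    """Mean absolute reconstruction error."""
    return float(np.mean(np.abs(residual)))


def l2(residual):
    """Mean squared reconstruction error."""
    return float(np.mean(residual ** 2))


# Toy residuals |real - reconstructed|; all values are assumptions.
healthy = np.full(1000, 0.05)   # uniformly small error on a healthy scan
abnormal = healthy.copy()
abnormal[:100] = 0.5            # 10% of pixels fail to reconstruct

ratio_l1 = l1(abnormal) / l1(healthy)
ratio_l2 = l2(abnormal) / l2(healthy)
print(f"l1 ratio: {ratio_l1:.1f}, l2 ratio: {ratio_l2:.1f}")
# prints "l1 ratio: 1.9, l2 ratio: 10.9"
```

Under these assumptions the abnormal-to-healthy ratio is about 1.9 for the absolute error but about 10.9 for the squared error, which is why a squared-error score gives a wider margin at diagnosis time.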

To confirm its ability to also detect various other diseases, even on different MRI sequence scans, we are the first to investigate how unsupervised medical anomaly detection is associated with various diseases and with multi-sequence MRI scans. Due to the different texture of T1/T1c slices, reconstruction quality clearly decreases on the data-sparse T1c slices, and thus reconstruction failure caused by anomalies contributes comparatively less to the average \(\ell _2\) loss. Nevertheless, we generally succeed in identifying which diseases are hard and which are easy to detect in an unsupervised manner: it is hard to detect small local lesions, such as brain abscesses and enhanced lesions, but it is easy to detect hyper-intense enhancing lesions, such as brain metastases (AUC 0.921), especially for 7-SA MADGAN thanks to its feature recalibration. Our visualization of differences between real/reconstructed slices might play a key role in understanding and preventing various diseases, including rare diseases.

Since we are the first to propose a two-step unsupervised anomaly detection approach based on multiple slice reconstruction, its limitations are two-fold: neither the reconstruction nor the diagnosis is yet sufficiently generalizable. As future work, we will investigate more suitable SA modules in a reconstruction model, such as a dual attention network that captures feature dependencies in both spatial/channel dimensions [65]; here, optimizing where to place the SA modules, and how many, is the most relevant aspect. We will validate combinations of new loss functions for both reconstruction and diagnosis, including sparsity regularization [66], structural similarity [67], and perceptual loss [68]. Lastly, we plan to collect a larger number of healthy T1c scans to reliably detect and locate various diseases, including cancers and rare diseases. Integrating multi-modal imaging data, such as positron emission tomography with specific radiotracers [69], might further improve disease diagnosis [70], even when the analyzed modalities are not always available [71]. Moreover, to specify detected anomalies, we might extend this work to supervised learning with limited pathological data by discriminating normal/pathological image distributions during diagnosis, instead of calculating the average \(\ell _2\) loss per scan.

Availability of data and materials

The OASIS-3 dataset is publicly available at https://www.oasis-brains.org/. The brain metastasis and various disease dataset was collected by the National Center for Global Health and Medicine and is not publicly available due to ethical restrictions.

Abbreviations

AUC: Area under the curve
AE: Autoencoder
AD: Alzheimer’s disease
CDR: Clinical dementia rating
T1c: Contrast-enhanced T1-weighted
CNN: Convolutional neural network
CT: Computed tomography
GAN: Generative adversarial network
MRI: Magnetic resonance imaging
MADGAN: Medical anomaly detection generative adversarial network
MCI: Mild cognitive impairment
OASIS-3: Open access series of imaging studies-3
ROC: Receiver operating characteristic
ReLU: Rectified linear unit
SA: Self-attention
T1: T1-weighted
VAE: Variational autoencoder
WGAN-GP: Wasserstein loss with gradient penalty

References

  1. Gao L, Pan H, Li Q, Xie X, Zhang Z, Han J, Zhai X. Brain medical image diagnosis based on corners with importance-values. BMC Bioinform. 2017;18(1):1–13. https://doi.org/10.1186/s12859-017-1903-6.

  2. Serra A, Galdi P, Tagliaferri R. Machine learning for bioinformatics and neuroimaging. Wiley Interdisc Rev Data Min Knowl Discov. 2018;8(5):1248. https://doi.org/10.1002/widm.1248.

  3. Park B, Lee W, Han K. Modeling the interactions of Alzheimer-related genes from the whole brain microarray data and diffusion tensor images of human brain. BMC Bioinform. 2012;13(S7):10. https://doi.org/10.1186/1471-2105-13-S7-S10.

  4. Medland SE, Jahanshad N, Neale BM, Thompson PM. Whole-genome analyses of whole-brain data: working within an expanded search space. Nat Neurosci. 2014;17(6):791–800. https://doi.org/10.1038/nn.3718.

  5. Zhao T, Hu Y, Zang T, Cheng L. Identifying Alzheimer’s disease-related proteins by LRRGD. BMC Bioinform. 2019;20(18):570. https://doi.org/10.1186/s12859-019-3124-7.

  6. Han C, Rundo L, Murao K, Nemoto T, Nakayama H, Satoh S. Bridging the gap between AI and healthcare sides: towards developing clinically relevant AI-powered diagnosis systems. In: Proceedings international conference on artificial intelligence applications and innovations (AIAI); 2020. p. 320–33. https://doi.org/10.1007/978-3-030-49186-4_27.

  7. Cheplygina V, de Bruijne M, Pluim JP. Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med Image Anal. 2019;54:280–96. https://doi.org/10.1016/j.media.2019.03.009.

  8. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. In: Proceedings advances in neural information processing systems (NIPS); 2014. p. 2672–80.

  9. Frid-Adar M, Diamant I, Klang E, Amitai M, Goldberger J, Greenspan H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing. 2018;321:321–31. https://doi.org/10.1016/j.neucom.2018.09.013.

  10. Han C, Rundo L, Araki R, Furukawa Y, Mauri G, Nakayama H, Hayashi H. Infinite brain MR images: PGGAN-based data augmentation for tumor detection. In: Neural approaches to dynamics of signal exchanges. Smart innovation, systems and technologies, vol. 151. Springer; 2019. p. 291–303. https://doi.org/10.1007/978-981-13-8950-4_27.

  11. Han C, Rundo L, Araki R, Nagano Y, Furukawa Y, et al. Combining noise-to-image and image-to-image GANs: brain MR image augmentation for tumor detection. IEEE Access. 2019;7(1):156966–77. https://doi.org/10.1109/ACCESS.2019.2947606.

  12. Han C, Kitamura Y, Kudo A, Ichinose A, Rundo L, Furukawa Y, et al. Synthesizing diverse lung nodules wherever massively: 3D multi-conditional GAN-based CT image augmentation for object detection. In: Proceedings international conference on 3D vision (3DV); 2019. p. 729–37. https://doi.org/10.1109/3DV.2019.00085.

  13. Han C, Murao K, Noguchi T, et al. Learning more with less: conditional PGGAN-based data augmentation for brain metastases detection using highly-rough annotation on MR images. In: Proceedings ACM international conference on information and knowledge management (CIKM); 2019. p. 119–27. https://doi.org/10.1145/3357384.3357890.

  14. Schlegl T, Seeböck P, Waldstein SM, Langs G, Schmidt-Erfurth U. f-AnoGAN: fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal. 2019;54:30–44.

  15. Uzunova H, Schultz S, Handels H, Ehrhardt J. Unsupervised pathology detection in medical images using conditional variational autoencoders. Int J Comput Assist Radiol Surg. 2019;14(3):451–61. https://doi.org/10.1007/s11548-018-1898-0.

  16. Chen X, Konukoglu E. Unsupervised detection of lesions in brain MRI using constrained adversarial auto-encoders. In: Proceedings international conference on medical imaging with deep learning (MIDL); 2018. arXiv preprint arXiv:1806.04972.

  17. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville AC. Improved training of Wasserstein GANs. In: Proceedings advances in neural information processing systems (NIPS); 2017. p. 5769–79.

  18. Han C, Hayashi H, Rundo L, Araki R, Shimoda W, Muramatsu S, et al. GAN-based synthetic brain MR image generation. In: Proceedings international symposium on biomedical imaging (ISBI). IEEE; 2018. p. 734–38. https://doi.org/10.1109/ISBI.2018.8363678.

  19. Arvanitakis Z, Shah RC, Bennett DA. Diagnosis and management of dementia. JAMA. 2019;322(16):1589–99. https://doi.org/10.1001/jama.2019.4782.

  20. Moscoso Rial A, Silva Rodríguez J, Aldrey Vázquez JM, Cortés Hernández J, Fernández Ferreiro A, Gómez Lado N, et al. Prediction of Alzheimer’s disease dementia with MRI beyond the short-term: implications for the design of predictive models. NeuroImage Clin. 2019;23:101837. https://doi.org/10.1016/j.nicl.2019.101837.

  21. Desikan RS, Cabral HJ, Fischl B, Guttmann CR, Blacker D, Hyman BT, et al. Temporoparietal MR imaging measures of atrophy in subjects with mild cognitive impairment that predict subsequent diagnosis of Alzheimer disease. Am J Neuroradiol. 2009;30(3):532–8. https://doi.org/10.3174/ajnr.A1397.

  22. Ma X, Li Z, Jing B, Liu H, Li D, Li H. Identify the atrophy of Alzheimer’s disease, mild cognitive impairment and normal aging using morphometric MRI analysis. Front Aging Neurosci. 2016;8:243. https://doi.org/10.3389/fnagi.2016.00243.

  23. Salvatore C, Cerasa A, Battista P, Gilardi MC, Quattrone A, Castiglioni I. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer’s disease: a machine learning approach. Front Neurosci. 2015;9:307. https://doi.org/10.3389/fnins.2015.00307.

  24. Nanni L, Brahnam S, Salvatore C, Castiglioni I. Texture descriptors and voxels for the early diagnosis of Alzheimer’s disease. Artif Intell Med. 2019;97:19–26. https://doi.org/10.1016/j.artmed.2019.05.003.

  25. Lella E, Lombardi A, Amoroso N, Diacono D, Maggipinto T, Monaco A, Bellotti R, Tangaro S. Machine learning and DWI brain communicability networks for Alzheimer’s disease detection. Appl Sci. 2020;10(3):934. https://doi.org/10.3390/app10030934.

  26. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436. https://doi.org/10.1038/nature14539.

  27. Liu S, Liu S, Cai W, Pujol S, Kikinis R, Feng D. Early diagnosis of Alzheimer’s disease with deep learning. In: Proceedings international symposium on biomedical imaging (ISBI). IEEE; 2014. p. 1015–8. https://doi.org/10.1109/ISBI.2014.6868045.

  28. Suk H-I, Lee S-W, Shen D. Deep ensemble learning of sparse regression models for brain disease diagnosis. Med Image Anal. 2017;37:101–13. https://doi.org/10.1016/j.media.2017.01.008.

  29. Spasov S, Passamonti L, Duggento A, Liò P, Toschi N, Initiative ADN, et al. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to Alzheimer’s disease. NeuroImage. 2019;189:276–87. https://doi.org/10.1016/j.neuroimage.2019.01.031.

  30. Parisot S, Ktena SI, Ferrante E, Lee M, Guerrero R, Glocker B, Rueckert D. Disease prediction using graph convolutional networks: application to autism spectrum disorder and Alzheimer’s disease. Med Image Anal. 2018;48:117–30. https://doi.org/10.1016/j.media.2018.06.001.

  31. Sacks P, Rahman M. Epidemiology of brain metastases. Neurosurg Clin. 2020;31(4):481–8. https://doi.org/10.1016/j.nec.2020.06.001.

  32. Grøvik E, Yi D, Iv M, Tong E, Rubin D, Zaharchuk G. Deep learning enables automatic detection and segmentation of brain metastases on multisequence MRI. J Magn Reson Imaging. 2020;51(1):175–82. https://doi.org/10.1002/jmri.26766.

  33. Rundo L, Militello C, Russo G, Vitabile S, Gilardi MC, Mauri G. GTVcut for neuro-radiosurgery treatment planning: an MRI brain cancer seeded image segmentation method based on a cellular automata model. Nat Comput. 2018;17:521–36. https://doi.org/10.1007/s11047-017-9636-z.

  34. Rundo L, Militello C, Tangherloni A, Russo G, Vitabile S, Gilardi MC, Mauri G. NeXt for neuro-radiosurgery: a fully automatic approach for necrosis extraction in brain tumor MRI using an unsupervised machine learning technique. Int J Imaging Syst Technol. 2018;28(1):21–37. https://doi.org/10.1002/ima.22253.

  35. Miki S, Hayashi N, Masutani Y, Nomura Y, Yoshikawa T, Hanaoka S, Nemoto M, Ohtomo K. Computer-assisted detection of cerebral aneurysms in MR angiography in a routine image-reading environment: effects on diagnosis by radiologists. Am J Neuroradiol. 2016;37(6):1038–43. https://doi.org/10.3174/ajnr.A4671.

  36. Vert C, Parra-Fariñas C, Rovira À. MR imaging in hyperacute ischemic stroke. Eur J Radiol. 2017;96:125–32. https://doi.org/10.1016/j.ejrad.2017.06.013.

  37. Zhou M, Wang X, Wu Z, Pozo JM, Frangi AF. Intracranial aneurysm detection from 3D vascular mesh models with ensemble deep learning. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI). LNCS. Springer; 2019. vol. 11767, p. 243–52. https://doi.org/10.1007/978-3-030-32251-9_27.

  38. Conti V, Militello C, Rundo L, Vitabile S. A novel bio-inspired approach for high-performance management in service-oriented networks. IEEE Trans Emerg Topics Comput. 2020. https://doi.org/10.1109/TETC.2020.3018312.

  39. Federau C, Christensen S, Scherrer N, Ospel JM, Schulze-Zachau V, Schmidt N, et al. Improved segmentation and detection sensitivity of diffusion-weighted stroke lesions with synthetically enhanced deep learning. Radiol Artif Intell. 2020;2(5):190217. https://doi.org/10.1148/ryai.2020190217.

  40. Baur C, Wiestler B, Albarqouni S, Navab N. Deep autoencoding models for unsupervised anomaly segmentation in brain MR images. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI) Workshop. Springer; 2018. p. 161–9.

  41. Zimmerer D, Isensee F, Petersen J, Kohl S, Maier-Hein K. Unsupervised anomaly localization using variational auto-encoders. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI). Springer; 2019. p. 289–97.

  42. Kingma DP, Welling M. Auto-encoding variational Bayes. In: Proceedings international conference on learning representations (ICLR) 2014. arXiv preprint arXiv:1312.6114.

  43. Alaverdyan Z, Jung J, Bouet R, Lartizien C. Regularized siamese neural network for unsupervised outlier detection on brain multiparametric magnetic resonance imaging: application to epilepsy lesion screening. Med Image Anal. 2020;60:101618.

  44. Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI). Springer; 2017. p. 146–57.

  45. Chen X, You S, Tezcan KC, Konukoglu E. Unsupervised lesion detection via image restoration with a normative prior. Med Image Anal. 2020. https://doi.org/10.1016/j.media.2020.101713.

  46. Zhang H, Goodfellow I, Metaxas D, Odena A. Self-attention generative adversarial networks. In: Proceedings international conference on machine learning (ICML). PMLR, vol. 97; 2019. p. 7354–63. arXiv preprint arXiv:1805.08318.

  47. Wang X, Cao Z, Wang R, Liu Z, Zhu X. Improving human pose estimation with self-attention generative adversarial networks. IEEE Access. 2019;7:119668–80. https://doi.org/10.1109/ACCESS.2019.2936709.

  48. Sharma M, Makwana M, Upadhyay A, Singh AP, Badhwar A, Trivedi A, Saini A, Chaudhury S. Robust image colorization using self attention based progressive generative adversarial network. In: Proceedings IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW). IEEE; 2019. p. 2188–96. https://doi.org/10.1109/CVPRW.2019.00272.

  49. Zhang Y, Hu C, Lu X. Deep attentive generative adversarial network for photo-realistic image de-quantization. arXiv preprint arXiv:2004.03150 2020.

  50. Brock A, Donahue J, Simonyan K. Large scale GAN training for high fidelity natural image synthesis. In: Proceedings international conference on learning representations (ICLR) 2018. arXiv preprint arXiv:1809.11096.

  51. Kudo A, Kitamura Y, Li Y, Iizuka S, Simo-Serra E. Virtual thin slice: 3D conditional GAN-based super-resolution for CT slice interval. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI) workshop. LNCS, vol. 11905. Springer; 2019. p. 91–100. https://doi.org/10.1007/978-3-030-33843-5_9.

  52. Li Y, Huang H, Zhang L, Wang G, Zhang H, Zhou W. Super-resolution and self-attention with generative adversarial network for improving malignancy characterization of hepatocellular carcinoma. In: Proceedings IEEE 17th international symposium on biomedical imaging (ISBI). IEEE; 2020. p. 1556–60. https://doi.org/10.1109/ISBI45749.2020.9098705.

  53. Lan H, Toga AW, Sepehrband F, Initiative ADN, et al. SC-GAN: 3D self-attention conditional GAN with spectral normalization for multi-modal neuroimaging synthesis. bioRxiv 2020. https://doi.org/10.1101/2020.06.09.143297.

  54. Ali IS, Mohamed MF, Mahdy YB. Data augmentation for skin lesion using self-attention based progressive generative adversarial network. arXiv preprint arXiv:1910.11960 2019.

  55. LaMontagne PJ, Keefe S, Lauren W, et al. OASIS-3: longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer’s disease. Alzheimers Dement. 2018;14(7):1097. https://doi.org/10.1016/j.jalz.2018.06.1439.

  56. Morris JC. The clinical dementia rating (CDR): current version and scoring rules. Neurology. 1993;43(11):2412–4. https://doi.org/10.1212/wnl.43.11.2412-a.

  57. Ledig C, Schuh A, Guerrero R, Heckemann RA, Rueckert D. Structural brain imaging in Alzheimer’s disease and mild cognitive impairment: biomarker analysis and shared morphometry database. Sci Rep. 2018;8:11258. https://doi.org/10.1038/s41598-018-29295-9.

  58. Arvold ND, Lee EQ, Mehta MP, Margolin K, Alexander BM, et al. Updates in the management of brain metastases. Neuro Oncol. 2016;18(8):1043–65. https://doi.org/10.1093/neuonc/now127.

  59. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI). LNCS, vol. 9351. Springer; 2015. p. 234–41. https://doi.org/10.1007/978-3-319-24574-4_28.

  60. Rundo L, Han C, Nagano Y, et al. USE-Net: incorporating squeeze-and-excitation blocks into U-Net for prostate zonal segmentation of multi-institutional MRI datasets. Neurocomputing. 2019;365:31–43. https://doi.org/10.1016/j.neucom.2019.07.006.

  61. Kimura D, Chaudhury S, Narita M, Munawar A, Tachibana R. Adversarial discriminative attention for robust anomaly detection. In: Proceedings IEEE Winter conference on applications of computer vision (WACV); 2020. p. 2172–81. https://doi.org/10.1109/WACV45572.2020.9093428.

  62. Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 2014.

  63. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. JMLR. 2014;15(1):1929–58.

  64. Han C, Rundo L, Murao K, Milacski ZÁ, Umemoto K, Sala E, Nakayama H, Satoh S. GAN-based multiple adjacent brain MRI slice reconstruction for unsupervised Alzheimer’s disease diagnosis. In: Proceedings international conference on computational intelligence methods for bioinformatics and biostatistics (CIBB). LNBI 2020. arXiv preprint arXiv:1906.06114.

  65. Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H. Dual attention network for scene segmentation. In: Proceedings IEEE/CVF conference on computer vision and pattern recognition (CVPR). IEEE; 2019. p. 3146–54. https://doi.org/10.1109/CVPR.2019.00326.

  66. Zhou K, Gao S, Cheng J, Gu Z, Fu H, Tu Z, Yang J, Zhao Y, Liu J. Sparse-GAN: sparsity-constrained generative adversarial network for anomaly detection in retinal OCT image. In: Proceedings IEEE international symposium on biomedical imaging (ISBI). IEEE; 2020. p. 1227–31. https://doi.org/10.1109/ISBI45749.2020.9098374.

  67. Haselmann M, Gruber DP, Tabatabai P. Anomaly detection using deep learning based image completion. In: Proceedings IEEE international conference on machine learning and applications (ICMLA). IEEE; 2018. p. 1237–42. https://doi.org/10.1109/ICMLA.2018.00201.

  68. Tuluptceva N, Bakker B, Fedulova I, Schulz H, Dylov DV. Anomaly detection with deep perceptual autoencoders. arXiv preprint arXiv:2006.13265 2020.

  69. Rundo L, Stefano A, Militello C, Russo G, Sabini MG, D’Arrigo C, Marletta F, Ippolito M, Mauri G, Vitabile S, Gilardi MC. A fully automatic approach for multimodal PET and MR image segmentation in Gamma Knife treatment planning. Comput Methods Programs Biomed. 2017;144:77–96. https://doi.org/10.1016/j.cmpb.2017.03.011.

  70. Brier MR, Gordon B, Friedrichsen K, McCarthy J, Stern A, Christensen J, Owen C, Aldea P, Su Y, Hassenstab J, et al. Tau and Aβ imaging, CSF measures, and cognition in Alzheimer’s disease. Sci Transl Med. 2016;8(338):338ra66. https://doi.org/10.1126/scitranslmed.aaf2362.

  71. Li R, Zhang W, Suk H-I, Wang L, Li J, Shen D, Ji S. Deep learning based imaging data completion for improved brain disease diagnosis. In: Proceedings international conference on medical image computing and computer-assisted intervention (MICCAI). LNCS, vol. 8675. Springer; 2014. p. 305–12. https://doi.org/10.1007/978-3-319-10443-0_39.

Acknowledgements

The authors would like to thank the open access series of imaging studies project, which has Grant Numbers P50 AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, and R01 MH56584.

About this supplement

This article has been published as part of BMC Bioinformatics Volume 22, Supplement 2 2021: 15th and 16th International Conference on Computational Intelligence methods for Bioinformatics and Biostatistics (CIBB 2018-19). The full contents of the supplement are available at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-22-supplement-2.

Funding

This research was partially supported both by AMED Grant Number JP18lk1010028 and The Mark Foundation for Cancer Research and Cancer Research UK Cambridge Centre [C9685/A25177]. Additional support has been provided by the National Institute of Health Research (NIHR) Cambridge Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the National Health Service (NHS), NIHR, or Department of Health and Social Care. Zoltán Ádám Milacski was supported by Grant Number VEKOP-2.2.1-16-2017-00006.

Author information

Contributions

Conceived the idea: CH, LR, ZAM, KM. Designed the code: CH, LR, ZAM. Collected the T1c dataset: TN. Implemented the code: CH. Performed the experiments: CH. Analyzed the results: CH, LR. Wrote the manuscript: CH, LR. Critically read the manuscript and contributed to the discussion of the whole work: KM, TN, ZAM, YS, SK, ES, HN, SS. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Changhee Han.

Ethics declarations

Ethics approval and consent to participate

This research was conducted using human subjects data both (i) approved by the Ethics Committee of National Center for Global Health and Medicine and (ii) made available in open access by the open access series of imaging studies project.

Consent for publication

The brain metastasis and various disease dataset collected by the authors has written informed consent.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Han, C., Rundo, L., Murao, K. et al. MADGAN: unsupervised medical anomaly detection GAN using multiple adjacent brain MRI slice reconstruction. BMC Bioinformatics 22 (Suppl 2), 31 (2021). https://doi.org/10.1186/s12859-020-03936-1

  • DOI: https://doi.org/10.1186/s12859-020-03936-1

Keywords