
Table 1 The training and testing datasets used for variant prioritization in iMEGES

From: iMEGES: integrated mental-disorder GEnome score by deep neural network for prioritizing the susceptibility genes for mental disorders in personal genomes

Dataset | Positive | Negative | Description
Training dataset 1 | 574 | 27,735 | The most likely causal dsQTL SNPs were downloaded from deltaSVM [30]
Training dataset 2 | 1,614 | 161,400 | Regulatory-associated mutations were downloaded from HGMD (2012); negatives are random SNVs with allele frequency ≥ 1% in the 1000 Genomes Project
Training dataset 3 | 31,118 | 36,540 | eQTL SNPs were collected from 11 studies on 7 tissues/cell lines
Training dataset 4 | 78,613 | 593,335 | Non-coding eQTLs from GRASP were considered associated, while SNPs from the 1000 Genomes Project were considered not associated
Testing dataset 1 | 3,439 | 66,916 | Based on P-values of imputed SNPs from the Psychiatric Genomics Consortium (PGC) schizophrenia GWAS
Testing dataset 2 | 8,002 | 19,322 | Based on P-values of imputed SNPs from the Psychiatric Genomics Consortium (PGC) autism spectrum disorder (ASD) GWAS
Testing dataset 3 | 76 | 156 | Manually curated regulatory SNPs with experimental validation
Testing dataset 4 | 75 | 402 | The synonymous variants compiled by [72]