Overview of the integrated database-pipeline system. The three rectangles represent computational applications. The Resource (A) collects gene-, SNP-, and disease-related primary resources and constructs a primary information database. The Automatic pipeline (B) retrieves information from the primary databases and extracts essential gene-, SNP-, and disease-related data. Disease terms and their aliases, and gene names and their aliases, were mapped using the UMLS and HGNC databases, respectively. Disease terms were also normalized for noun modification, stop words, and suffixes. SNP effects were assessed by amino acid substitution, and SNP locations are also available. The Diseasome (C) is a database that holds three categories of information (gene, SNP, and disease) and the relationships among them.
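
The disease-term normalization and alias mapping in the Automatic pipeline (B) can be illustrated with a minimal sketch. It is not the authors' implementation. The stop-word list, the plural-stripping rule, the token sorting used to approximate noun-modification handling, the function names, and the placeholder identifier `DISEASE:0001` are all assumptions made for illustration, and the toy alias table stands in for records that would come from a resource such as UMLS.

```python
import re

# Hypothetical stop-word list; the source does not specify which words were removed.
STOP_WORDS = {"of", "the", "and", "in", "with"}


def strip_suffix(token):
    """Crude plural stripping, a stand-in for the suffix correction step."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if (token.endswith("s") and len(token) > 3
            and not token.endswith(("ss", "us", "is"))):
        return token[:-1]
    return token


def normalize_term(term):
    """Lowercase, drop stop words, strip plural suffixes, and sort tokens.

    Sorting makes word order irrelevant, so "cancer of the breast" and
    "breast cancer" share one key (an assumed reading of "noun modification").
    """
    tokens = re.findall(r"[a-z0-9]+", term.lower())
    kept = [strip_suffix(t) for t in tokens if t not in STOP_WORDS]
    return " ".join(sorted(kept))


def build_alias_index(records):
    """Map every normalized name or alias to its concept identifier."""
    index = {}
    for concept_id, names in records.items():
        for name in names:
            index[normalize_term(name)] = concept_id
    return index


# Toy alias table with a placeholder identifier (not a real UMLS concept ID).
records = {"DISEASE:0001": ["breast cancer", "cancer of the breast"]}
index = build_alias_index(records)

# A differently phrased mention resolves to the same concept.
match = index.get(normalize_term("Cancers of the Breast"))
```

Here `match` resolves to the placeholder concept because all three phrasings normalize to the same key. A production pipeline would likely need a proper stemmer and a curated stop-word list, since the plural rule above mishandles terms such as "diabetes".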