Large-scale reverse docking profiles and their applications
© Lee and Kim; licensee BioMed Central Ltd. 2012
- Published: 13 December 2012
Reverse docking approaches have been explored in previous drug discovery studies to overcome some problems of traditional virtual screening. However, current reverse docking approaches are limited in that the target spaces examined were rather small and their applications were restricted to identifying new drug targets. In this study, we expanded the target space to the set of all protein structures currently available and developed several new applications of the reverse docking method.
We generated a 2D matrix of docking scores between all available protein structures in yeast and human and 35 well-known drugs. By clustering the docking profile data and comparing the result with fingerprint-based clustering of the drugs, we first showed that our data contained accurate information on their chemical properties. Next, we showed that our method could be used to predict the druggability of target proteins. We also showed that a combination of sequence similarity and docking profile similarity could predict enzyme EC numbers more accurately than sequence similarity alone. In two case studies, 5-fluorouracil and cycloheximide, we showed that our method can successfully identify target proteins.
By using a large number of protein structures, we improved the sensitivity of reverse docking and showed that using as many protein structures as possible was important in finding real binding targets.
- Protein Data Bank
- Virtual Screening
- Docking Score
- Protein Data Bank Structure
- Drug Repositioning
Identifying disease genes and target proteins of drugs is a critical step in drug discovery. Once the disease genes are identified, designing lead compounds that can modulate those genes or their protein products may lead to a successful new drug. Growth in the number of available 3D protein structures and in computing power has enabled high-throughput computational screening of lead compounds, which is known as virtual screening. Conventionally, these virtual screening methods have focused on searching chemical space for chemicals that can specifically bind to a protein target.
A complication of this structure-based drug discovery strategy is that unknown off-target proteins may bind to the lead compounds unexpectedly, which poses difficulties such as severe side effects but also provides a new opportunity. Upon discovering novel targets for existing drugs, we can expand the indications of those drugs through drug repositioning. Motivated by this, reverse (or inverse) docking approaches have received increasing interest as a way to find unknown targets of natural products and existing drugs [2–4]. In reverse docking, one tries to find the protein targets that can bind to a particular ligand.
In previous studies, based on the assumption that the number of predicted potential protein targets is quite low compared to the number of genes, researchers tried to find new drug targets among a relatively small number of potential target proteins. For example, a reverse docking study by Gao et al. used ~1,100 targets, and one by Hui-fang et al. used 1,714 targets and 8 compounds. However, this may cause poor coverage of the protein structure space in reverse docking. Moreover, the only intended application of these reverse docking methods was to find the targets of drugs. On the other hand, various other approaches have been developed, including statistical methods using sequence and structure similarity, calculation of binding site similarity [9, 10], and prediction of druggability from descriptors.
Here, we present a large-scale reverse docking study. The main difference from previous studies is that we used all available protein structures in human and yeast. To the best of our knowledge, our docking profile contains the largest number of protein structures. The reverse docking profile was merged into a matrix that can be easily interpreted. We showed some properties of the large-scale docking profile and demonstrated the usefulness of these docking profile data. We also developed several new applications, such as predicting the druggability of protein targets and predicting protein function based on docking profile similarity. We discussed two interesting case studies, 5-fluorouracil and cycloheximide. In particular, we demonstrated that using as many protein structures as possible was important in improving the sensitivity of reverse docking and finding real binding targets.
Table 1. The list of ligands used to generate reverse docking profiles
The similarities among hierarchical clusterings in ligand space.
The "druggability" of a target protein represents how likely the protein is to be a real target of drugs, and it has been investigated in many previous studies [14–16]. In one such method, the druggability of a protein was inferred from homologous proteins whose druggabilities were already known. The weakness of this method is that the number of targets with known druggability is limited. Other approaches attempted to define "druggable" as "highly likely to bind to putative drugs", i.e., "bindability" [18, 19].
Protein function prediction based on docking profile similarity
Here, as an example, we tested a simple combination of sequence and docking profile information. To compensate for the low sensitivity of docking fitness at low FDR, a new distance was defined as follows: if the BLAST e-value of a pair is less than 1e-5, the e-value is used as the distance; otherwise, the Euclidean distance between the docking profiles is used. The performance of this metric is shown in Figure 5 (red). Note that this simple metric does not rely on any training, feature extraction, or machine learning technique. Even without considering which elements of the 35-dimensional docking profile are important, simply adding docking profile information yields better performance in all regions. In summary, this implies that using docking profile information together with other useful measures as features for state-of-the-art machine learning techniques, and increasing the size of the docking profile by appending the reverse docking results of additional ligands, would lead to more precise prediction of protein function.
The docking profile data generated in this study can be applied in a variety of ways. As discussed in the previous section, it can be utilized to infer protein function. On the other hand, a more common application, which has been explored in several previous studies, is to infer new binding targets for known drugs. Here, we present two case studies.
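The combined distance described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function names, the use of `None` for pairs without a BLAST hit, and the toy profiles are assumptions made for this example.

```python
import math

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def euclidean(profile_a, profile_b):
    """Euclidean distance between two docking profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def hybrid_distance(e_value, profile_a, profile_b):
    """Use the BLAST e-value for confident sequence hits, else the profile distance.

    e_value is None when BLAST reports no hit for the pair
    (a convention assumed for this sketch).
    """
    if e_value is not None and e_value < E_VALUE_CUTOFF:
        return e_value
    return euclidean(profile_a, profile_b)


# Toy 2-dimensional profiles; real profiles in the study have 35 elements.
print(hybrid_distance(1e-30, [1.0, 2.0], [4.0, 6.0]))  # sequence hit: e-value is used
print(hybrid_distance(0.5, [1.0, 2.0], [4.0, 6.0]))    # weak hit: Euclidean distance
```

The two branches return values on different scales. The text does not say whether they were rescaled, so this sketch returns them unchanged.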
Binding target of 5-FC and 5-FU
5-fluorocytosine (5-FC) and 5-fluorouracil (5-FU) are both fluorinated analogues of pyrimidine. The structures of the two ligands are quite similar. Therefore, not surprisingly, the docking profiles are quite similar as well. Moreover, the top-ranking binding site of both ligands is in the structure of a yeast exosome component, the protein product of the gene rrp6 (PDB id: 2hbm). The structure was determined relatively recently, so 2hbm has never been annotated as a putative target, let alone as druggable. The previously known mechanism of action of 5-FU is inhibition of thymidylate synthetase. Thus, the top-ranking structure, 2hbm, might be considered a false positive. However, a genome-wide study using tagged heterozygous yeast mutants provided strong evidence that the rrp6-related rRNA processing exosome is a target of 5-FU. The direct binding target of 5-FU was not identified in that study, but its results together with the docking scores strongly suggest that the protein product of rrp6 is the direct binding target of 5-FU in yeast.
Protein structures from the same sequence
Are such a large number of protein structures necessary for reverse docking?
Compared to the protein sets used in previous studies, the set used in this study is quite large and has some redundancy. One may question whether all these structures contribute to the sensitivity of reverse docking. This is an important issue because docking is still computationally expensive and time-consuming.
In our human dataset, 8,717 of the 10,886 structures share a UniProt ID with at least one other structure, and together they cover 1,339 unique UniProt IDs. In other words, those 8,717 structures could be reduced to 1,339 structures, removing at most 7,378 structures, if we filtered the set with respect to sequence redundancy only. However, there are many cases where the docking fitness profiles for identical or similar sequences are quite different.
To show this property, we first carried out hierarchical clustering of the docking profiles of the proteins. For each sub-cluster, if all the members were derived from the same UniProt ID, the members were merged into one. This procedure was repeated until there were no sub-clusters in which all members shared the same UniProt ID. As a result, 1,710 structures were eventually filtered out, i.e., only about 20% of the sequence-redundant protein structures were also redundant in their docking profiles. This is due to heterogeneity in the PDB. There are many modified structures, such as oxidized, reduced, multimeric, metal-containing, and truncated forms, even for a single protein sequence. Thus, we concluded that the sets of protein structures used in previous reverse docking studies are insufficient. For example, the interesting results from the docking of cycloheximide, which were discussed in the previous section, would not have been obtained.
Another interesting example is the main binding target of hydrocortisone, the glucocorticoid receptor (GCR). There are nine structures of the GCR in the PDB. However, datasets used for reverse docking, such as the potential drug target database (PDTD), included only two of them (PDB 1nhz and 1p93). A reverse docking study of hydrocortisone by others using PDTD could not detect the GCR as the target. In our docking profile, the top-ranking protein target was PDB 3bqd, which is another structure of the GCR. If we had removed redundancy based on sequence similarity, we could not have detected the GCR as the real target. Therefore, our reverse docking experiment suggests that using as many protein structures as possible in reverse docking is worthwhile for finding unknown drug targets or unexpected modes of action, despite the high computational cost.
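The merging procedure can be illustrated with a small, self-contained Python sketch. It is not the implementation used in the study: the naive average-linkage clustering, the Euclidean distance, the rule for choosing a representative, and the toy profiles and UniProt labels ("P1", "P2") are all assumptions made for illustration. Collapsing the largest single-ID subtrees from the top of the dendrogram gives the same result as repeatedly merging single-ID sub-clusters until none remain.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def build_dendrogram(profiles):
    """Naive average-linkage agglomerative clustering (fine for toy inputs).

    Returns a nested-tuple tree whose leaves are row indices.
    """
    clusters = [(i, [i]) for i in range(len(profiles))]  # (tree, member indices)
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [euclidean(profiles[i], profiles[j])
                         for i in clusters[a][1] for j in clusters[b][1]]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


def leaves(tree):
    if isinstance(tree, int):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def collapse(tree, uniprot_ids):
    """Keep one representative for every subtree whose leaves share one UniProt ID."""
    members = leaves(tree)
    if len({uniprot_ids[i] for i in members}) == 1:
        return [min(members)]  # arbitrary choice of representative
    left, right = tree
    return collapse(left, uniprot_ids) + collapse(right, uniprot_ids)


# Toy data: structures 0, 1, and 2 share one UniProt ID, but structure 2
# has a very different docking profile and therefore survives the filter.
profiles = [[0, 0], [1, 0], [20, 20], [22, 20], [3, 1]]
uniprot_ids = ["P1", "P1", "P1", "P2", "P2"]
kept = sorted(collapse(build_dendrogram(profiles), uniprot_ids))
print(kept)
```

In this toy case a purely sequence-based filter would keep two structures, whereas the profile-aware filter keeps four, which mirrors the observation that sequence redundancy does not imply redundancy in docking profiles.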
In this study, we generated large-scale reverse docking profiles for all X-ray protein structures in human and yeast. These data can serve as a reference for future binding assays and can be used to find unexpected binding targets of drugs. Furthermore, they would be useful for finding unknown therapeutic uses in drug repositioning. In some case studies, targets that were not previously annotated as druggable or stored in target databases exhibited high binding fitness, and they are highly likely to be real binding targets considering previous functional experiments. By using a large number of protein structures, we improved the sensitivity of reverse docking and showed that using as many protein structures as possible was important in finding real binding targets. Although we used as few as 35 ligands in docking, we were able to demonstrate the usefulness of our data. Generating this kind of reverse docking profile for a large number of ligands would be valuable in future studies.
All available X-ray protein structures in human and the budding yeast Saccharomyces cerevisiae were retrieved from the RCSB Protein Data Bank (PDB) [34, 35]. The best putative binding sites of each PDB structure were generated using the program Fpocket [36, 37]. To make the pockets appropriate inputs for docking, Open Babel was used to protonate all the pockets. Thirty-five well-known ligands (Table 1) were manually selected from previous high-throughput experimental studies [29, 39] for high-throughput reverse docking, after excluding ligands that were too large or too small for a molecular docking study. The 3D structures of the ligands were retrieved from PubChem and converted from the sdf file format into the Tripos mol2 file format.
All the protonated pockets were docked against the ligand set using GOLD. We used a 'flexible ligand-rigid protein' mode. All other options for GOLD's search algorithm and termination criteria were left at their defaults. Given several putative docking conformations, we chose only the highest-ranking binding pose for each ligand-binding site pair. The GOLD fitness value was used as the measure of binding fitness. As a result, a 10,886 × 35 matrix and a 1,165 × 35 matrix of docking fitness scores were produced for human and yeast, respectively, and used in this study (Additional files 1, 2).
The predefined non-redundant set of druggable and less-druggable binding sites (NRDLD set) was retrieved from the study by Krasowski et al. Among the 71 druggable and 44 less-druggable binding sites in the NRDLD set, 43 druggable and 8 less-druggable binding sites overlap with the human protein structures used in this study. These 51 binding sites were used for the druggability analysis.
Putative druggable and less-druggable protein binding sites were assigned by the following rules: a binding site is druggable when all 35 of its docking values are larger than the corresponding overall average values, and less-druggable when all 35 of its docking values are smaller than the corresponding average values.
To obtain the EC number composition of the assigned druggable and less-druggable sets, non-redundant (NR) putative druggable and less-druggable sets were defined. In this study, an NR set is a set that does not contain any pair of proteins sharing the same UniProt ID. Note that we did not use any sequence identity measure to remove redundancy.
Although there are several ortholog databases, none of them provides a PDB-based mapping table. Therefore, we obtained the ortholog mapping between human and yeast protein structures by the following procedure. First, we retrieved the human-yeast ortholog table from InParanoid [45, 46]. In this table, human proteins and yeast proteins are annotated by Ensembl protein ID (ENSP) and yeast ORF name, respectively. These identifiers were converted into PDB IDs using PICR to complete the PDB-based mapping.
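As an illustration of how per-pose results can be reduced to one score per pair and assembled into a fitness matrix, the following Python sketch keeps the best pose for each binding site-ligand pair. It assumes that a higher fitness value means a better pose, and the tuple layout, function name, and identifiers ("s1", "ligA") are hypothetical; it does not parse real GOLD output.

```python
def build_fitness_matrix(poses, structures, ligands):
    """Build a structures x ligands matrix from per-pose docking results.

    poses: iterable of (structure_id, ligand_id, fitness) tuples, possibly
           several poses per pair (hypothetical input layout).
    Returns a list of rows ordered as `structures`, with columns ordered as
    `ligands`; pairs without any pose are filled with None.
    """
    best = {}
    for structure_id, ligand_id, fitness in poses:
        key = (structure_id, ligand_id)
        # Keep only the highest-scoring pose for this pair.
        if key not in best or fitness > best[key]:
            best[key] = fitness
    return [[best.get((s, lig)) for lig in ligands] for s in structures]


# Made-up scores for two binding sites and two ligands.
poses = [
    ("s1", "ligA", 40.0),
    ("s1", "ligA", 45.5),
    ("s1", "ligB", 38.0),
    ("s2", "ligA", 20.0),
]
matrix = build_fitness_matrix(poses, ["s1", "s2"], ["ligA", "ligB"])
print(matrix)
```

Applied to the full dataset, each row of such a matrix is the 35-element docking profile of one binding site.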
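The assignment rule can be sketched as follows. This is a minimal illustration that interprets the "overall average values" as the per-ligand (column) means over all binding sites in the matrix; the function names, the "unassigned" label, and the toy matrix are assumptions made for this example.

```python
def column_means(matrix):
    """Mean docking value of each ligand (column) over all binding sites (rows)."""
    n_rows = len(matrix)
    return [sum(col) / n_rows for col in zip(*matrix)]


def classify_sites(matrix):
    """Label each row as 'druggable', 'less-druggable', or 'unassigned'."""
    means = column_means(matrix)
    labels = []
    for row in matrix:
        if all(v > m for v, m in zip(row, means)):
            labels.append("druggable")       # above average for every ligand
        elif all(v < m for v, m in zip(row, means)):
            labels.append("less-druggable")  # below average for every ligand
        else:
            labels.append("unassigned")      # mixed: neither rule applies
    return labels


# Toy matrix with 3 binding sites and 2 ligands (the study uses 35 ligands).
toy = [[10, 20], [1, 2], [4, 30]]
print(classify_sites(toy))
```

Binding sites with mixed behaviour receive neither label, so both rules are deliberately strict.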
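The mapping procedure amounts to joining three lookup tables. The Python sketch below shows the join on in-memory dictionaries; it does not call InParanoid or PICR, and the function name, table layout, and all identifiers are placeholders chosen for this example.

```python
def map_orthologs_to_pdb(ortholog_pairs, ensp_to_pdb, orf_to_pdb):
    """Translate (ENSP, ORF) ortholog pairs into (human PDB, yeast PDB) pairs.

    ortholog_pairs: iterable of (ensp_id, orf_name) tuples.
    ensp_to_pdb, orf_to_pdb: dicts mapping an identifier to a list of PDB IDs.
    Pairs for which either side has no structure are dropped.
    """
    pairs = set()
    for ensp_id, orf_name in ortholog_pairs:
        for human_pdb in ensp_to_pdb.get(ensp_id, []):
            for yeast_pdb in orf_to_pdb.get(orf_name, []):
                pairs.add((human_pdb, yeast_pdb))
    return sorted(pairs)


# Placeholder identifiers, not real database entries.
ortholog_pairs = [("ENSP_A", "ORF_A"), ("ENSP_B", "ORF_B")]
ensp_to_pdb = {"ENSP_A": ["h1", "h2"]}  # ENSP_B has no structure
orf_to_pdb = {"ORF_A": ["y1"], "ORF_B": ["y2"]}
mapping = map_orthologs_to_pdb(ortholog_pairs, ensp_to_pdb, orf_to_pdb)
print(mapping)
```

Because one protein can have several PDB structures, a single ortholog pair may expand into many structure pairs.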
We thank all members of Bioinformatics and Computational Biology Laboratory at KAIST for helpful discussions. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government, the Ministry of Education, Science & Technology (MEST) [2009-0086964].
This article has been published as part of BMC Bioinformatics Volume 13 Supplement 17, 2012: Eleventh International Conference on Bioinformatics (InCoB2012): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S17.
- Sperandio O, Miteva MA, Delfaud F, Villoutreix BO: Receptor-based computational screening of compound databases: the main docking-scoring engines. Current protein & peptide science. 2006, 7: 369-393. 10.2174/138920306778559377.
- Chen YZ, Zhi DG: Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins. 2001, 43: 217-226. 10.1002/1097-0134(20010501)43:2<217::AID-PROT1032>3.0.CO;2-G.
- Paul N, Kellenberger E, Bret G, Muller P, Rognan D: Recovering the true targets of specific ligands by virtual screening of the protein data bank. Proteins. 2004, 54: 671-680. 10.1002/prot.10625.
- Cai J, Han C, Hu T, Zhang J, Wu D, Wang F, Liu Y, Ding J, Chen K, Yue J: Peptide deformylase is a potential target for anti-Helicobacter pylori drugs: reverse docking, enzymatic assay, and X-ray crystallography validation. Protein science: a publication of the Protein Society. 2006, 15: 2071-2081. 10.1110/ps.062238406.
- Russ AP, Lampel S: The druggable genome: an update. Drug discovery today. 2005, 10: 1607-1610. 10.1016/S1359-6446(05)03666-4.
- Gao Z, Li H, Zhang H, Liu X, Kang L, Luo X, Zhu W, Chen K, Wang X, Jiang H: PDTD: a web-accessible protein database for drug target identification. BMC bioinformatics. 2008, 9: 104. 10.1186/1471-2105-9-104.
- Hui-fang L, Qing S, Jian Z, Wei F: Evaluation of various inverse docking schemes in multiple targets identification. Journal of molecular graphics & modelling. 2010, 29: 326-330. 10.1016/j.jmgm.2010.09.004.
- Xie L, Bourne PE: A unified statistical model to support local sequence order independent similarity searching for ligand-binding sites and its application to genome-based drug discovery. Bioinformatics. 2009, 25: i305-312. 10.1093/bioinformatics/btp220.
- Hoffmann B, Zaslavskiy M, Vert JP, Stoven V: A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction. BMC bioinformatics. 2010, 11: 99. 10.1186/1471-2105-11-99.
- Kahraman A, Morris RJ, Laskowski RA, Thornton JM: Shape variation in protein binding pockets and their ligands. Journal of molecular biology. 2007, 368: 283-301. 10.1016/j.jmb.2007.01.086.
- Gupta A, Gupta AK, Seshadri K: Structural models in the assessment of protein druggability based on HTS data. J Comput Aid Mol Des. 2009, 23: 583-592. 10.1007/s10822-009-9279-y.
- Li Q, Cheng T, Wang Y, Bryant SH: PubChem as a public resource for drug discovery. Drug discovery today. 2010, 15: 1052-1057. 10.1016/j.drudis.2010.10.003.
- Morlini I, Zani S: An overall index for comparing hierarchical clusterings. Challenges at the interface of data analysis, computer science, and optimization. Edited by: Gaul W, Geyer-Schulz A, Schmidt-Thieme L, Kunze J. 2012, New York: Springer.
- Overington JP, Al-Lazikani B, Hopkins AL: How many drug targets are there?. Nature reviews Drug discovery. 2006, 5: 993-996. 10.1038/nrd2199.
- Imming P, Sinning C, Meyer A: Drugs, their targets and the nature and number of drug targets. Nature reviews Drug discovery. 2006, 5: 821-834. 10.1038/nrd2132.
- Schmidtke P, Barril X: Understanding and predicting druggability. A high-throughput method for detection of drug binding sites. Journal of medicinal chemistry. 2010, 53: 5858-5867. 10.1021/jm100574m.
- Han LY, Zheng CJ, Xie B, Jia J, Ma XH, Zhu F, Lin HH, Chen X, Chen YZ: Support vector machines approach for predicting druggable proteins: recent progress in its exploration and investigation of its usefulness. Drug discovery today. 2007, 12: 304-313. 10.1016/j.drudis.2007.02.015.
- Sheridan RP, Maiorov VN, Holloway MK, Cornell WD, Gao YD: Drug-like density: a method of quantifying the "bindability" of a protein target based on a very large set of pockets and drug-like ligands from the Protein Data Bank. Journal of chemical information and modeling. 2010, 50: 2029-2040. 10.1021/ci100312t.
- Cheng AC, Coleman RG, Smyth KT, Cao Q, Soulard P, Caffrey DR, Salzberg AC, Huang ES: Structure-based maximal affinity model predicts small-molecule druggability. Nature biotechnology. 2007, 25: 71-75. 10.1038/nbt1273.
- Krasowski A, Muthas D, Sarkar A, Schmitt S, Brenk R: DrugPred: a structure-based approach to predict protein druggability developed using an extensive nonredundant data set. Journal of chemical information and modeling. 2011, 51: 2829-2842. 10.1021/ci200266d.
- Webb EC: Enzyme nomenclature 1992: recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the nomenclature and classification of enzymes. 1992, San Diego: Published for the International Union of Biochemistry and Molecular Biology by Academic Press.
- Giaever G, Flaherty P, Kumm J, Proctor M, Nislow C, Jaramillo DF, Chu AM, Jordan MI, Arkin AP, Davis RW: Chemogenomic profiling: identifying the functional interactions of small molecules in yeast. Proceedings of the National Academy of Sciences of the United States of America. 2004, 101: 793-798. 10.1073/pnas.0307490100.
- Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H: Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999, 285: 901-906. 10.1126/science.285.5429.901.
- Schwikowski B, Uetz P, Fields S: A network of protein-protein interactions in yeast. Nature biotechnology. 2000, 18: 1257-1261. 10.1038/82360.
- Karlin S, Altschul SF: Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proceedings of the National Academy of Sciences of the United States of America. 1990, 87: 2264-2268. 10.1073/pnas.87.6.2264.
- Vermes A, Guchelaar HJ, Dankert J: Flucytosine: a review of its pharmacology, clinical indications, pharmacokinetics, toxicity and drug interactions. The Journal of antimicrobial chemotherapy. 2000, 46: 171-179. 10.1093/jac/46.2.171.
- Midtgaard SF, Assenholt J, Jonstrup AT, Van LB, Jensen TH, Brodersen DE: Structure of the nuclear exosome component Rrp6p reveals an interplay between the active site and the HRDC domain. Proceedings of the National Academy of Sciences of the United States of America. 2006, 103: 11898-11903. 10.1073/pnas.0604731103.
- Parker WB, Cheng YC: Metabolism and mechanism of action of 5-fluorouracil. Pharmacology & therapeutics. 1990, 48: 381-395. 10.1038/clpt.1990.166.
- Lum PY, Armour CD, Stepaniants SB, Cavet G, Wolf MK, Butler JS, Hinshaw JC, Garnier P, Prestwich GD, Leonardson A: Discovering modes of action for therapeutic compounds using a genome-wide screen of yeast heterozygotes. Cell. 2004, 116: 121-137. 10.1016/S0092-8674(03)01035-3.
- Derbyshire MK, Weinstock KG, Strathern JN: HST1, a new member of the SIR2 family of genes. Yeast. 1996, 12: 631-640. 10.1002/(SICI)1097-0061(19960615)12:7<631::AID-YEA960>3.0.CO;2-8.
- Zhao K, Chai X, Clements A, Marmorstein R: Structure and autoregulation of the yeast Hst2 homolog of Sir2. Nature structural biology. 2003, 10: 864-871. 10.1038/nsb978.
- Zhao K, Chai X, Marmorstein R: Structure of the yeast Hst2 protein deacetylase in ternary complex with 2'-O-acetyl ADP ribose and histone peptide. Structure. 2003, 11: 1403-1411. 10.1016/j.str.2003.09.016.
- Wilson JM, Le VQ, Zimmerman C, Marmorstein R, Pillus L: Nuclear export modulates the cytoplasmic Sir2 homologue Hst2. EMBO reports. 2006, 7: 1247-1251. 10.1038/sj.embor.7400829.
- Deshpande N, Addess KJ, Bluhm WF, Merino-Ott JC, Townsend-Merino W, Zhang Q, Knezevich C, Xie L, Chen L, Feng Z: The RCSB Protein Data Bank: a redesigned query system and relational database based on the mmCIF schema. Nucleic acids research. 2005, 33: D233-237. 10.1093/nar/gki586.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic acids research. 2000, 28: 235-242. 10.1093/nar/28.1.235.
- Schmidtke P, Le Guilloux V, Maupetit J, Tuffery P: fpocket: online tools for protein ensemble pocket detection and tracking. Nucleic acids research. 2010, 38: W582-589. 10.1093/nar/gkq383.
- Le Guilloux V, Schmidtke P, Tuffery P: Fpocket: an open source platform for ligand pocket detection. BMC bioinformatics. 2009, 10: 168. 10.1186/1471-2105-10-168.
- O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR: Open Babel: An open chemical toolbox. Journal of cheminformatics. 2011, 3: 33. 10.1186/1758-2946-3-33.
- Hillenmeyer ME, Fung E, Wildenhain J, Pierce SE, Hoon S, Lee W, Proctor M, St Onge RP, Tyers M, Koller D: The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science. 2008, 320: 362-365. 10.1126/science.1150021.
- Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH: PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic acids research. 2009, 37: W623-633. 10.1093/nar/gkp456.
- Dalby A, Nourse JG, Hounshell WD, Gushurst AKI, Grier DL, Leland BA, Laufer J: Description of several chemical-structure file formats used by computer-programs developed at molecular design limited. J Chem Inf Comp Sci. 1992, 32: 244-255. 10.1021/ci00007a012.
- Jones G, Willett P, Glen RC: Molecular recognition of receptor sites using a genetic algorithm with a description of desolvation. Journal of molecular biology. 1995, 245: 43-53. 10.1016/S0022-2836(95)80037-9.
- Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD: Improved protein-ligand docking using GOLD. Proteins. 2003, 52: 609-623. 10.1002/prot.10465.
- Magrane M, Consortium U: UniProt Knowledgebase: a hub of integrated protein data. Database: the journal of biological databases and curation. 2011, 2011: bar009.
- Ostlund G, Schmitt T, Forslund K, Kostler T, Messina DN, Roopra S, Frings O, Sonnhammer EL: InParanoid 7: new algorithms and tools for eukaryotic orthology analysis. Nucleic acids research. 2010, 38: D196-203. 10.1093/nar/gkp931.
- O'Brien KP, Remm M, Sonnhammer EL: Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic acids research. 2005, 33: D476-480.
- Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T: The Ensembl genome database project. Nucleic acids research. 2002, 30: 38-41. 10.1093/nar/30.1.38.
- Stamm S, Smith CWJ, Lührmann R: Appendix A1: yeast nomenclature systematic Open Reading Frame (ORF) and other genetic designations. Alternative pre-mRNA Splicing. Edited by: Stamm S, Smith CWJ, Lührmann R. 2012, Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA, 603-607.
- Cote RG, Jones P, Martens L, Kerrien S, Reisinger F, Lin Q, Leinonen R, Apweiler R, Hermjakob H: The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases. BMC bioinformatics. 2007, 8: 401. 10.1186/1471-2105-8-401.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.