Table 1 XML languages for the representation of biological data types

From: XML-based approaches for the integration of heterogeneous bio-molecular data

| Type of Data | Format | Concrete Scope | Version | Comments |
|---|---|---|---|---|
| Molecular entities | BSML [57] | Biological sequences and sequence annotation | v.3.1/2005 | Uses DTD. Included in EMBLxml. |
| | ProXML [58] | Protein sequences, structures and families | v.1.0/2006 | Uses XSD. Included within HOBIT formats. |
| | RNAML [59] | RNA sequence, structure and experimental data | v.1.1/2002 | Uses XSD. |
| | AGAVE [16] | Biological sequences and sequence annotation | 2003 | Uses XSD. Included in EMBLxml. |
| | Uniprot XSD [121] | Representation of UniProt records | 2004 | Uses XSD. Successor of the SP (SwissProt) ML format. |
| | EMBLxml [17] | Biological sequences and sequence annotation | v.1.1/2007 | Uses XSD. Currently includes BSML and AGAVE. |
| | GAME [18] | Genome and sequence | v.0.3/1999 | Uses DTD. |
| | SequenceML | Sequence information | v.2.1/2006 | Designed to replace FASTA. Belongs to HOBIT XML formats. |
| Biological Expression | GeneXML [122] | Gene expression data | - | Uses DTD. |
| | MAGE-ML [123] | Microarray expression data | v.1.0/2006 | Uses DTD. |
| System Biology | CellML [124] | Models of biochemical reaction networks | v.1.1/2006 | Uses DTD. Conversion to BioPAX available. |
| | SBML [57] | Models of biochemical reaction networks | Lev. 2/2007 | Uses XSD. Conversion to BioPAX available. |
| | PSI-MI [125] | Protein interactions | v.2.5/2005 | Uses XSD and OBO. Linked with OBO vocabularies. |
| | BioPAX [60] | Metabolic pathways, molecular interactions | Lev. 3/2008 | Uses OWL. Linked to OBO vocabularies. |
| | CML [126] | Description of molecules and reactions | v.2.1/2003 | Uses XSD. |
  1. This table summarizes selected characteristics of a subset of existing XML languages. For each language we note the application scope, the number and year of the current version, and comments such as the kind of schema the language relies on or its interaction with other standards.
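
The "kind of schema" noted in the Comments column can often be read off a document instance itself. The following is a minimal sketch, not part of the original article, of one heuristic for doing so in Python: a DOCTYPE declaration suggests a DTD-based language, while an `xsi:schemaLocation` or `xsi:noNamespaceSchemaLocation` attribute on the root element suggests an XSD-based one. The function name `schema_kind` and the sample documents (element names and schema file names) are hypothetical placeholders, and many real files declare neither, so the result should be treated as a hint only.

```python
import re
import xml.etree.ElementTree as ET

# Standard namespace of the XML Schema instance attributes (xsi:...).
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_kind(xml_text: str) -> str:
    """Guess whether an XML document refers to a DTD or an XSD.

    Heuristic only: returns 'DTD', 'XSD', or 'unknown'.
    """
    # A DOCTYPE declaration signals a DTD-based language.
    if re.search(r"<!DOCTYPE\s", xml_text):
        return "DTD"
    root = ET.fromstring(xml_text)
    # XSD-based documents commonly point to their schema via xsi attributes.
    for attr in ("schemaLocation", "noNamespaceSchemaLocation"):
        if f"{{{XSI_NS}}}{attr}" in root.attrib:
            return "XSD"
    return "unknown"


# Hypothetical sample documents, made up for illustration.
dtd_doc = '<!DOCTYPE record SYSTEM "example.dtd"><record/>'
xsd_doc = (
    '<record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="example.xsd"/>'
)

if __name__ == "__main__":
    for name, doc in [("dtd_doc", dtd_doc), ("xsd_doc", xsd_doc)]:
        print(name, schema_kind(doc))
```

OWL-based languages such as BioPAX are serialized as RDF/XML and would need a separate check, which this sketch does not attempt.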