Table 1 XML languages for the representation of biological data types

From: XML-based approaches for the integration of heterogeneous bio-molecular data

| Type of Data | Format | Concrete Scope | Version | Comments |
| --- | --- | --- | --- | --- |
| Molecular entities | BSML [57] | Biological sequences and sequence annotation | v.3.1/2005 | Uses DTD. Included in EMBLxml. |
| | ProXML [58] | Protein sequences, structures and families | v.1.0/2006 | Uses XSD. Included within HOBIT formats. |
| | RNAML [59] | RNA sequence, structure and experimental data | v.1.1/2002 | Uses XSD. |
| | AGAVE [16] | Biological sequences and sequence annotation | 2003 | Uses XSD. Included in EMBLxml. |
| | UniProt XSD [121] | Representation of UniProt records | 2004 | Uses XSD. Successor of the SP (SwissProt) ML format. |
| | EMBLxml [17] | Biological sequences and sequence annotation | v.1.1/2007 | Uses XSD. Currently includes BSML and AGAVE. |
| | GAME [18] | Genome and sequence | v.0.3/1999 | Uses DTD. |
| | SequenceML | Sequence information | v.2.1/2006 | Designed to replace FASTA. Belongs to HOBIT XML formats. |
| Biological Expression | GeneXML [122] | Gene expression data | - | Uses DTD. |
| | MAGE-ML [123] | Microarray expression data | v.1.0/2006 | Uses DTD. |
| Systems Biology | CellML [124] | Models of biochemical reaction networks | v.1.1/2006 | Uses DTD. Available conversion to BioPAX. |
| | SBML [57] | Models of biochemical reaction networks | Lev. 2/2007 | Uses XSD. Available conversion to BioPAX. |
| | PSI-MI [125] | Protein interactions | v.2.5/2005 | Uses XSD and OBO. Linked with OBO vocabularies. |
| | BioPAX [60] | Metabolic pathways, molecular interactions | Lev. 3/2008 | Uses OWL. Linked to OBO vocabularies. |
| | CML [126] | Description of molecules and reactions | v.2.1/2003 | Uses XSD. |

  1. This table summarizes selected characteristics of a subset of existing XML languages: the application scope, the version number and year of the current release, and comments such as the kind of schema each language relies on or its interaction with other standards.
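
As a rough illustration of how the "Comments" column can be used, the sketch below encodes a handful of rows from the table and groups them by the schema technology they mention. The `XmlLanguage` record, the `SCHEMA_KINDS` set, and the helper functions are hypothetical names introduced for this example only; they are not part of any of the listed standards or of the original article.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class XmlLanguage:
    """One table row (hypothetical record type, for illustration only)."""
    name: str
    scope: str
    version: str
    comments: str


# A few rows copied from the table; the selection is arbitrary.
ROWS = [
    XmlLanguage("BSML", "Biological sequences and sequence annotation",
                "v.3.1/2005", "Uses DTD. Included in EMBLxml."),
    XmlLanguage("RNAML", "RNA sequence, structure and experimental data",
                "v.1.1/2002", "Uses XSD"),
    XmlLanguage("GAME", "Genome and Sequence", "v.0.3/1999", "Uses DTD"),
    XmlLanguage("SBML", "Models of biochemical reaction networks",
                "Lev. 2/2007", "Uses XSD. Available conversion to BioPAX."),
    XmlLanguage("BioPAX", "Metabolic pathways, molecular interactions",
                "Lev. 3/2008", "Uses OWL. Linked to OBO vocabularies."),
]

# Schema technologies named in the table's comments (assumed keyword set).
SCHEMA_KINDS = {"DTD", "XSD", "OWL"}


def schema_kinds(comment: str) -> set[str]:
    """Return the schema technologies mentioned as whole words in a comment."""
    words = comment.replace(".", " ").replace(",", " ").split()
    return {word for word in words if word in SCHEMA_KINDS}


def by_schema(rows: list[XmlLanguage], kind: str) -> list[str]:
    """List the names of the languages whose comment mentions `kind`."""
    return [row.name for row in rows if kind in schema_kinds(row.comments)]


if __name__ == "__main__":
    for kind in sorted(SCHEMA_KINDS):
        print(kind, by_schema(ROWS, kind))
```

Matching on whole words keeps "OBO" or "BioPAX" from being mistaken for a schema technology; a fuller version would need a dedicated column instead of free-text comments.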