The multiple alignment of homologous sequences provides important information on the evolution and the sequence-function relationships of protein families. Two types of methods, tree-based and space-based, can be used to compare sequences (reviewed in [1]). Both depend on a multiple alignment of homologous sequences. Tree methods assume a hierarchical, binary structure of the data to infer phylogenetic relationships. Space methods, on the other hand, are based on multivariate analysis of a distance matrix between the sequences and do not assume a specific structure for the data. One such method is metric multidimensional scaling (MDS), a powerful method for visualizing distances between elements [2–5]. MDS, also named principal coordinate analysis, starts from a matrix of distances between elements and represents these elements in a low-dimensional space in which the distances best approximate the original distances. Applied to biological sequences, this method usefully complements phylogeny [6–11].

The completion of the genome sequencing of a wide variety of organisms has paved the way for the comparison of protein families from different species. A very interesting property of MDS is the ability to project supplementary elements onto a reference or “active” space. The positions of the supplementary elements (also known as “out of sample” elements) are obtained from their distances to the active elements [2, 12, 13]. This property provides a very useful tool to compare orthologous sequences to a reference sequence set. In particular, when several orthologous protein families are compared, this method can be used to visualize evolutionary drifts [9].
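The projection of supplementary elements can be illustrated with a minimal sketch in base R. It assumes the standard out-of-sample formula for metric MDS, in which the squared distances from each supplementary element to the active elements are centred with the active-space statistics and then mapped onto the active eigenvectors. The toy data and all variable names (`active`, `sup`, `F_active`, `F_sup`) are illustrative assumptions, and the sketch is not meant to reproduce the implementation of any particular package.

```r
# Toy Euclidean data (assumption): 6 active and 2 supplementary elements in 2-D.
set.seed(42)
active <- matrix(rnorm(12), ncol = 2)
sup    <- matrix(rnorm(4),  ncol = 2)

# Squared distances: active-active (6 x 6) and supplementary-active (2 x 6).
D2    <- as.matrix(dist(active))^2
D2sup <- as.matrix(dist(rbind(active, sup)))[7:8, 1:6]^2

# Active space: double-centre the squared distances, then eigen-decompose.
n <- nrow(D2)
J <- diag(n) - matrix(1 / n, n, n)        # centring matrix
B <- -0.5 * J %*% D2 %*% J                # cross-product matrix
e <- eigen(B, symmetric = TRUE)
V      <- e$vectors[, 1:2]
lambda <- e$values[1:2]
F_active <- V %*% diag(sqrt(lambda))      # active coordinates

# Projection: subtract the mean squared distance of each active element,
# centre with J, and map onto the active axes.
Bsup  <- -0.5 * sweep(D2sup, 2, rowMeans(D2)) %*% J
F_sup <- Bsup %*% V %*% diag(1 / sqrt(lambda))

# For this exactly Euclidean toy example, the projected positions should
# reproduce the original supplementary-active distances (error near zero).
recovered <- as.matrix(dist(rbind(F_active, F_sup)))[7:8, 1:6]
proj_error <- max(abs(recovered - sqrt(D2sup)))
```

The active space is computed from the active elements only, so adding or removing supplementary elements leaves the reference coordinates unchanged. With real sequence distances, which are generally not exactly Euclidean, the projected positions only approximate the original distances.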

MDS is based on the eigen-decomposition (i.e., principal component analysis) of a cross-product matrix derived from the distance matrix [2–5] and can be performed with the default tools included in the R statistical language (e.g., the *cmdscale* function). In addition, several R packages such as *ade4*, *made4*, *adegenet*, and *vegan* [14–17] have been developed to provide multivariate analysis in the field of bioinformatics, including MDS. For example, the *dudi.pco* function in *ade4* [14] or the *wcmdscale* function in *vegan* [17] performs MDS analysis. However, the projection technique has not been widely used yet and, to the best of our knowledge, is not included in the available R packages.
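As a minimal sketch of the computation described above, the following base R example runs metric MDS on a toy distance matrix in two ways: with the built-in `cmdscale` function, and by hand through the eigen-decomposition of the double-centred cross-product matrix. The random toy data and the variable names are assumptions made for illustration; real input would be a matrix of distances between aligned sequences.

```r
# Toy data (assumption): five points in 3-D and their Euclidean distances.
set.seed(1)
X <- matrix(rnorm(15), nrow = 5)
D <- as.matrix(dist(X))                   # 5 x 5 distance matrix

# Built-in classical (metric) MDS, keeping two dimensions.
fit <- cmdscale(D, k = 2, eig = TRUE)

# The same analysis by hand: double-centre the squared distances to obtain
# the cross-product matrix B, then take its eigen-decomposition.
n <- nrow(D)
J <- diag(n) - matrix(1 / n, n, n)        # centring matrix
B <- -0.5 * J %*% (D^2) %*% J             # cross-product matrix
e <- eigen(B, symmetric = TRUE)
coords <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# Both sets of coordinates should agree up to the sign of each axis.
axis_diff <- max(abs(abs(coords) - abs(fit$points)))
```

Each axis is an eigenvector scaled by the square root of its eigenvalue, so the eigenvalues indicate how much of the variation in the distance matrix each axis accounts for.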

Thus, we have developed the R package *bios2mds* (from BIOlogical Sequences to MultiDimensional Scaling) to provide all the tools necessary to perform the MDS analysis of multiple sequence alignments. This package includes a function that projects supplementary sequences onto a reference space and, thus, makes it possible to compare orthologous sequence sets.