Archived Comments for: A comparison of common programming languages used in bioinformatics

  1. Regarding the use of the Python code

    Peter Cock, Biopython Project; University of Warwick

    18 February 2008

    The entire thrust of this paper is a comparison of the performance of the different languages, yet the skill level of the programmer in each language varies dramatically, which surely confounds the whole exercise.

    The authors confess to being inexperienced in Python, and it is clear from their code that they are beginners. Consider one of their observations:

    "Perl clearly outperformed Python for I/O operations. Perl was three times as fast as Python when reading a FASTA file and needed half of the space to store the sequences in memory (Fig 4)."

    The script concerned contains errors, for example attempting to remove trailing newline characters with line.rstrip('/n') rather than line.rstrip('\n').
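    To illustrate why the typo matters, here is a minimal sketch (not the authors' script; the sample strings are made up). The argument to str.rstrip is treated as a set of characters, so '/n' strips trailing '/' and 'n' characters and leaves the newline in place:

```python
# str.rstrip(chars) removes any trailing characters that appear in `chars`.
line = "ACGTACGT\n"  # hypothetical line from a FASTA file

wrong = line.rstrip('/n')  # character set is {'/', 'n'}: the newline stays
right = line.rstrip('\n')  # character set is {'\n'}: the newline is removed

print(repr(wrong))
print(repr(right))

# A side effect of the typo: a final line with no newline that ends in
# lower-case 'n' (unknown bases) silently loses those characters.
last_line = "acgtnn"
print(repr(last_line.rstrip('/n')))
```

    Under these assumptions, the mistyped call would leave a newline embedded in every stored sequence line, which also inflates the memory needed to hold the sequences.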

    More important, given their desire to look at performance metrics, is the way they have concatenated the sequences. The seq+=line idiom they used is the most natural, but it is well known in the Python community that collecting the strings in a list and concatenating them with ''.join(str_list) is far more efficient.

    It would appear that the reviewers of this manuscript were also Python novices, or at least missed this point.

    Finally, as far as I can tell, the authors have not provided all the input files used for their benchmarks, making it difficult to verify their results.
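    The join idiom can be sketched as follows. This is an illustrative reader, not the code from the paper: the function name read_fasta and the sample data are assumptions. Sequence lines are appended to a list and joined once per record instead of growing a string with seq += line:

```python
import io


def read_fasta(handle):
    """Return a dict mapping each header line (without '>') to its sequence.

    Hypothetical helper for illustration only.
    """
    records = {}
    name = None
    chunks = []  # sequence lines of the current record
    for line in handle:
        line = line.rstrip('\n')
        if line.startswith('>'):
            # A new header closes the previous record: join its lines once.
            if name is not None:
                records[name] = ''.join(chunks)
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Store the last record, which has no following header to close it.
    if name is not None:
        records[name] = ''.join(chunks)
    return records


# Made-up input standing in for a FASTA file.
sample = io.StringIO(">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n")
records = read_fasta(sample)
print(records)
```

    How much faster this is than repeated += depends on the interpreter version and the input size, so it would need to be measured on the benchmark files themselves.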

    Competing interests

    I am a Python programmer, and contribute to the Biopython project (mentioned but not used in this paper).

  2. Sub-divide comparisons to IO, computing, etc.

    Zhang Zhang, Yale University

    1 September 2008

    This paper makes a valuable attempt to compare the performance of six programming languages used in bioinformatics. The comparison is based on three common bioinformatics tasks: the Sellers algorithm, neighbor-joining (NJ) and parsing BLAST output. I think it would also be important to determine which language performs best for each kind of operation considered individually, such as I/O operations, computing operations (e.g., ML and MCMC) and sequence parsing. This could also guide us in choosing a more efficient language for a given bioinformatics programming task.

    Competing interests

    None declared