Sequence analysis (applications)

Section edited by João Setubal

This section covers all aspects of sequence analysis applications, including but not limited to software, workflows and web servers for applied sequence/genome analysis, sequence assembly, analysis of sequence features, and protein function and ligand binding as estimated from sequence features.

  1. Content type: Research article

    Eqolisins are rare acid proteases found in archaea, bacteria and fungi. Certain fungi secrete acids as part of their lifestyle and, interestingly, these also have many eqolisin paralogs; up to nine paralogs have...

    Authors: Nicolás Stocchi, María Victoria Revuelta, Priscila Ailín Lanza Castronuovo, D. Mariano A. Vera and Arjen ten Have

    Citation: BMC Bioinformatics 2018 19:338

  2. Content type: Research article

    Spanins are phage lysis proteins required to disrupt the outer membrane. Phages employ either two-component spanins or unimolecular spanins in this final step of Gram-negative host lysis. Two-component spanins...

    Authors: Rohit Kongari, Manoj Rajaure, Jesse Cahill, Eric Rasche, Eleni Mijalis, Joel Berry and Ry Young

    Citation: BMC Bioinformatics 2018 19:326

  3. Content type: Methodology article

    Procedures for controlling the false discovery rate (FDR) are widely applied as a solution to the multiple comparisons problem of high-dimensional statistics. Current FDR-controlling procedures require accurat...

    Authors: Matthew M. Parks, Benjamin J. Raphael and Charles E. Lawrence

    Citation: BMC Bioinformatics 2018 19:323

  4. Content type: Software

    Metagenomic approaches have revealed the complexity of environmental microbiomes with the advancement in whole genome sequencing displaying a significant level of genetic heterogeneity on the species level. It...

    Authors: Adeola M. Rotimi, Rian Pierneef and Oleg N. Reva

    Citation: BMC Bioinformatics 2018 19:309

  5. Content type: Software

    Microarray experiments comprise more than half of all series in the Gene Expression Omnibus (GEO). However, downloading and analyzing raw or semi-processed microarray data from GEO is not intuitive and require...

    Authors: Maria Luisa Amaral, Galina A. Erikson and Maxim N. Shokhirev

    Citation: BMC Bioinformatics 2018 19:296

  6. Content type: Software

    Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and ...

    Authors: Saurabh Baheti, Xiaojia Tang, Daniel R. O’Brien, Nicholas Chia, Lewis R. Roberts, Heidi Nelson, Judy C. Boughey, Liewei Wang, Matthew P. Goetz, Jean-Pierre A. Kocher and Krishna R. Kalari

    Citation: BMC Bioinformatics 2018 19:271

  7. Content type: Software

    Methylated RNA immunoprecipitation sequencing (MeRIP-seq or m6A-seq) has been extensively used for profiling the transcriptome-wide distribution of RNA N6-methyladenosine methylation. However, due to the intrinsic pr...

    Authors: Teng Zhang, Shao-Wu Zhang, Lin Zhang and Jia Meng

    Citation: BMC Bioinformatics 2018 19:260

  8. Content type: Research article

    Gene expression in plant chloroplasts and mitochondria is affected by RNA editing. Numerous C-to-U conversions, accompanied by reverse U-to-C exchanges in some plant clades, alter the genetic information encod...

    Authors: Henning Lenz, Anke Hein and Volker Knoop

    Citation: BMC Bioinformatics 2018 19:255

  9. Content type: Software

    Computation of reaction similarity is a pre-requisite for several bioinformatics applications including enzyme identification for specific biochemical reactions, enzyme classification and mining for specific i...

    Authors: Tadi Venkata Sivakumar, Anirban Bhaduri, Rajasekhara Reddy Duvvuru Muni, Jin Hwan Park and Tae Yong Kim

    Citation: BMC Bioinformatics 2018 19:254

  10. Content type: Software

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters...

    Authors: Nikola Tom, Ondrej Tom, Jitka Malcikova, Sarka Pavlova, Blanka Kubesova, Tobias Rausch, Miroslav Kolarik, Vladimir Benes, Vojtech Bystry and Sarka Pospisilova

    Citation: BMC Bioinformatics 2018 19:243

  11. Content type: Software

    The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assembli...

    Authors: Lauren Coombe, Jessica Zhang, Benjamin P. Vandervalk, Justin Chu, Shaun D. Jackman, Inanc Birol and René L. Warren

    Citation: BMC Bioinformatics 2018 19:234

  12. Content type: Software

    Mutational signatures have proved to be a valuable pattern in somatic genomics, mainly regarding cancer, with a potential application as a biomarker in clinical practice. Up to now, several bioinformatic pac...

    Authors: Marcos Díaz-Gay, Maria Vila-Casadesús, Sebastià Franch-Expósito, Eva Hernández-Illán, Juan José Lozano and Sergi Castellví-Bel

    Citation: BMC Bioinformatics 2018 19:224

  13. Content type: Methodology article

    Discovering over-represented approximate motifs in DNA sequences is an essential part of bioinformatics. This topic has been studied extensively because of the increasing number of potential applications. Howe...

    Authors: Chadi Saad, Laurent Noé, Hugues Richard, Julie Leclerc, Marie-Pierre Buisine, Hélène Touzet and Martin Figeac

    Citation: BMC Bioinformatics 2018 19:223

  14. Content type: Software

    Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called s...

    Authors: Jongin Lee, Daehwan Lee, Mikang Sim, Daehong Kwon, Juyeon Kim, Younhee Ko and Jaebum Kim

    Citation: BMC Bioinformatics 2018 19:216

  15. Content type: Software

    Large sequence datasets are difficult to visualize and handle. Additionally, they often do not represent a random subset of the natural diversity, but the result of uncoordinated and convenience sampling. Cons...

    Authors: Fabrizio Menardo, Chloé Loiseau, Daniela Brites, Mireia Coscolla, Sebastian M. Gygli, Liliana K. Rutaihwa, Andrej Trauner, Christian Beisel, Sonia Borrell and Sebastien Gagneux

    Citation: BMC Bioinformatics 2018 19:164

  16. Content type: Research article

    In the last decade and a half it has been firmly established that a large number of proteins do not adopt a well-defined (ordered) structure under physiological conditions. Such intrinsically disordered protei...

    Authors: Nenad S. Mitić, Saša N. Malkov, Jovana J. Kovačević, Gordana M. Pavlović-Lažetić and Miloš V. Beljanski

    Citation: BMC Bioinformatics 2018 19:158

  17. Content type: Database

    Bioactive peptides, including biological sources-derived peptides with different biological activities, are protein fragments that influence the functions or conditions of organisms, in particular humans and a...

    Authors: Krittima Anekthanakul, Apiradee Hongsthong, Jittisak Senachak and Marasri Ruengjitchatchawalya

    Citation: BMC Bioinformatics 2018 19:149

  18. Content type: Software

    The study of the huge diversity of immune receptors, often referred to as immune repertoire profiling, is a prerequisite for diagnosis, prognostication and monitoring of hematological disorders. In the era of ...

    Authors: Christos Maramis, Athanasios Gkoufas, Anna Vardi, Evangelia Stalika, Kostas Stamatopoulos, Anastasia Hatzidimitriou, Nicos Maglaveras and Ioanna Chouvarda

    Citation: BMC Bioinformatics 2018 19:144

  19. Content type: Research article

    After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires l...

    Authors: Yingxue Ren, Joseph S. Reddy, Cyril Pottier, Vivekananda Sarangi, Shulan Tian, Jason P. Sinnwell, Shannon K. McDonnell, Joanna M. Biernacka, Minerva M. Carrasquillo, Owen A. Ross, Nilüfer Ertekin-Taner, Rosa Rademakers, Matthew Hudson, Liudmila Sergeevna Mainzer and Yan W. Asmann

    Citation: BMC Bioinformatics 2018 19:139

  20. Content type: Research article

    Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA ...

    Authors: M. Heath Farris, Andrew R. Scott, Pamela A. Texter, Marta Bartlett, Patricia Coleman and David Masters

    Citation: BMC Bioinformatics 2018 19:126

  21. Content type: Methodology article

    High quality functional annotation is essential for understanding the phenotypic consequences encoded in a genome. Despite improvements in bioinformatics methods, millions of sequences in databanks are not ass...

    Authors: Jonathan Mercier, Adrien Josso, Claudine Médigue and David Vallenet

    Citation: BMC Bioinformatics 2018 19:132

  22. Content type: Methodology article

    Genome-wide association studies (GWASs) have been widely used to discover the genetic basis of complex phenotypes. However, standard single-SNP GWASs suffer from lack of power. In particular, they do not direc...

    Authors: Christine Sinoquet

    Citation: BMC Bioinformatics 2018 19:106

  23. Content type: Software

    Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by model...

    Authors: Andrey Ziyatdinov, Miquel Vázquez-Santiago, Helena Brunel, Angel Martinez-Perez, Hugues Aschard and Jose Manuel Soria

    Citation: BMC Bioinformatics 2018 19:68

  24. Content type: Software

    Small RNA molecules play important roles in many biological processes and their dysregulation or dysfunction can cause disease. The current method of choice for genome-wide sRNA expression profiling is deep se...

    Authors: Raza-Ur Rahman, Abhivyakti Gautam, Jörn Bethune, Abdul Sattar, Maksims Fiosins, Daniel Sumner Magruder, Vincenzo Capece, Orr Shomroni and Stefan Bonn

    Citation: BMC Bioinformatics 2018 19:54

  25. Content type: Research article

    The ease with which influenza virus sequence data can be used to estimate antigenic relationships between strains and the existence of databases containing sequence data for hundreds of thousands of influenza strai...

    Authors: Christopher S. Anderson, Patrick R. McCall, Harry A. Stern, Hongmei Yang and David J. Topham

    Citation: BMC Bioinformatics 2018 19:51

  26. Content type: Software

    The advent of modern high-throughput genetics continually broadens the gap between the rising volume of sequencing data, and the tools required to process them. The need to pinpoint a small subset of functiona...

    Authors: Monika Mozere, Mehmet Tekman, Jameela Kari, Detlef Bockenhauer, Robert Kleta and Horia Stanescu

    Citation: BMC Bioinformatics 2018 19:46

  27. Content type: Research article

    The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly acces...

    Authors: Dorota H. Sendorek, Cristian Caloian, Kyle Ellrott, J. Christopher Bare, Takafumi N. Yamaguchi, Adam D. Ewing, Kathleen E. Houlahan, Thea C. Norman, Adam A. Margolin, Joshua M. Stuart and Paul C. Boutros

    Citation: BMC Bioinformatics 2018 19:28

  28. Content type: Methodology article

    Protein or nucleic acid sequences contain a multitude of associated annotations representing continuous sequence elements (CSEs). Comparing these CSEs is needed, whenever we want to match identical annotations...

    Authors: Roman Prytuliak, Friedhelm Pfeiffer and Bianca Hermine Habermann

    Citation: BMC Bioinformatics 2018 19:24

  29. Content type: Software

    Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of studies of this type has increased exponentially, but the results ...

    Authors: Jordi Martorell-Marugan, Daniel Toro-Dominguez, Marta E. Alarcon-Riquelme and Pedro Carmona-Saez

    Citation: BMC Bioinformatics 2017 18:563

  30. Content type: Software

    High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number ...

    Authors: Kenneth D. Doig, Jason Ellul, Andrew Fellowes, Ella R. Thompson, Georgina Ryland, Piers Blombery, Anthony T. Papenfuss and Stephen B. Fox

    Citation: BMC Bioinformatics 2017 18:555

  31. Content type: Software

    Haloplex targeted resequencing is a popular method to analyze both germline and somatic variants in gene panels. However, involved wet-lab procedures may introduce false positives that need to be considered in...

    Authors: Matthias Beyens, Nele Boeckx, Guy Van Camp, Ken Op de Beeck and Geert Vandeweyer

    Citation: BMC Bioinformatics 2017 18:554

  32. Content type: Software

    Chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) and associated methods are widely used to define the genome wide distribution of chromatin associated proteins, post-translational epigenetic...

    Authors: Mike Myschyshyn, Marco Farren-Dai, Tien-Jui Chuang and David Vocadlo

    Citation: BMC Bioinformatics 2017 18:521

  33. Content type: Software

    Bioinformatics tools designed to identify lentiviral or retroviral vector insertion sites in the genome of host cells are used to address the safety and long-term efficacy of hematopoietic stem cell gene thera...

    Authors: Giulio Spinozzi, Andrea Calabria, Stefano Brasca, Stefano Beretta, Ivan Merelli, Luciano Milanesi and Eugenio Montini

    Citation: BMC Bioinformatics 2017 18:520

  34. Content type: Methodology article

    Accurate structural annotation depends on well-trained gene prediction programs. Training data for gene prediction programs are often chosen randomly from a subset of high-quality genes that ideally represent ...

    Authors: Megan J. Bowman, Jane A. Pulman, Tiffany L. Liu and Kevin L. Childs

    Citation: BMC Bioinformatics 2017 18:522

  35. Content type: Software

    With the plummeting cost of the next-generation sequencing technologies, high-density genetic linkage maps could be constructed in a forest hybrid F1 population. However, based on such genetic maps, quantitative ...

    Authors: Fenxiang Liu, Chunfa Tong, Shentong Tao, Jiyan Wu, Yuhua Chen, Dan Yao, Huogen Li and Jisen Shi

    Citation: BMC Bioinformatics 2017 18:515

  36. Content type: Methodology article

    Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologicall...

    Authors: Jie Tan, Matthew Huyck, Dongbo Hu, René A. Zelaya, Deborah A. Hogan and Casey S. Greene

    Citation: BMC Bioinformatics 2017 18:512

  37. Content type: Research article

    Nowadays, many public repositories containing large microarray gene expression datasets are available. However, the problem lies in the fact that microarray technology is less powerful and accurate than more ...

    Authors: Daniel Castillo, Juan Manuel Gálvez, Luis Javier Herrera, Belén San Román, Fernando Rojas and Ignacio Rojas

    Citation: BMC Bioinformatics 2017 18:506

  38. Content type: Software

    Whole-genome sequencing (WGS) projects provide short read nucleotide sequences from nuclear and possibly organelle DNA depending on the source of origin. Mitochondrial DNA is present in animals and fungi, whil...

    Authors: Kosai Al-Nakeeb, Thomas Nordahl Petersen and Thomas Sicheritz-Pontén

    Citation: BMC Bioinformatics 2017 18:510

2017 Journal Metrics

  • Citation Impact
    2.213 - 2-year Impact Factor
    3.114 - 5-year Impact Factor
    0.878 - Source Normalized Impact per Paper (SNIP)
    1.479 - SCImago Journal Rank (SJR)

  • Social Media Impact
    4446 mentions