Sequence analysis (applications)

Section edited by João Setubal

This section incorporates all aspects of sequence analysis applications, including but not limited to: software, workflows and webservers dealing with applied sequence/genome analysis, sequence assembly, analysis of sequence features, and protein function and ligand binding, estimated through sequence features.

  1. Content type: Software

    Simulation of genetic variants data is frequently required for the evaluation of statistical methods in the fields of human and animal genetics. Although a number of high-quality genetic simulators have been d...

    Authors: Apostolos Dimitromanolakis, Jingxiong Xu, Agnieszka Krol and Laurent Briollais

    Citation: BMC Bioinformatics 2019 20:26

  2. Content type: Software

    Around 1% of human proteins are predicted to contain a disordered and low complexity prion-like domain (PrLD). Mutations in PrLDs have been shown to promote a transition towards an aggregation-prone state in seve...

    Authors: Valentin Iglesias, Oscar Conchillo-Sole, Cristina Batlle and Salvador Ventura

    Citation: BMC Bioinformatics 2019 20:24

  3. Content type: Software

    Traditional map-based cloning approaches, used for the identification of desirable alleles, are extremely labour intensive and years can elapse between the mutagenesis and the detection of the polymorphism. Hi...

    Authors: Ghanasyam Rallapalli, Pilar Corredor-Moreno, Edward Chalstrey, Martin Page and Daniel MacLean

    Citation: BMC Bioinformatics 2019 20:9

  4. Content type: Software

    Direct-coupling analysis (DCA) is a method for protein contact prediction from sequence information alone. Its underlying principle is parameter estimation for a Hamiltonian interaction function stemming from ...

    Authors: Michael Schmidt and Kay Hamacher

    Citation: BMC Bioinformatics 2018 19:546

  5. Content type: Software

    As a result of its simplicity and high efficiency, the CRISPR-Cas system has been widely used as a genome editing tool. Recently, CRISPR base editors, which consist of deactivated Cas9 (dCas9) or Cas9 nickase ...

    Authors: Gue-Ho Hwang, Jeongbin Park, Kayeong Lim, Sunghyun Kim, Jihyeon Yu, Eunchong Yu, Sang-Tae Kim, Roland Eils, Jin-Soo Kim and Sangsu Bae

    Citation: BMC Bioinformatics 2018 19:542

  6. Content type: Software

    Various algorithms have been developed to predict fetal trisomies using cell-free DNA in non-invasive prenatal testing (NIPT). As a basis for prediction, a control group of non-trisomy samples is needed. Predict...

    Authors: Lennart F. Johansson, Hendrik A. de Weerd, Eddy N. de Boer, Freerk van Dijk, Gerard J. te Meerman, Rolf H. Sijmons, Birgit Sikkema-Raddatz and Morris A. Swertz

    Citation: BMC Bioinformatics 2018 19:531

  7. Content type: Research article

    Antimicrobial peptides attract considerable interest as novel agents to combat infections. Their long-time potency across bacteria, viruses and fungi as part of diverse innate immune systems offers a solution ...

    Authors: Kyle Boone, Kyle Camarda, Paulette Spencer and Candan Tamerler

    Citation: BMC Bioinformatics 2018 19:469

  8. Content type: Research article

    Eqolisins are rare acid proteases found in archaea, bacteria and fungi. Certain fungi secrete acids as part of their lifestyle and interestingly these also have many eqolisin paralogs, up to nine paralogs have...

    Authors: Nicolás Stocchi, María Victoria Revuelta, Priscila Ailín Lanza Castronuovo, D. Mariano A. Vera and Arjen ten Have

    Citation: BMC Bioinformatics 2018 19:338

  9. Content type: Research article

    Spanins are phage lysis proteins required to disrupt the outer membrane. Phages employ either two-component spanins or unimolecular spanins in this final step of Gram-negative host lysis. Two-component spanins...

    Authors: Rohit Kongari, Manoj Rajaure, Jesse Cahill, Eric Rasche, Eleni Mijalis, Joel Berry and Ry Young

    Citation: BMC Bioinformatics 2018 19:326

  10. Content type: Methodology article

    Procedures for controlling the false discovery rate (FDR) are widely applied as a solution to the multiple comparisons problem of high-dimensional statistics. Current FDR-controlling procedures require accurat...

    Authors: Matthew M. Parks, Benjamin J. Raphael and Charles E. Lawrence

    Citation: BMC Bioinformatics 2018 19:323

  11. Content type: Software

    Metagenomic approaches have revealed the complexity of environmental microbiomes with the advancement in whole genome sequencing displaying a significant level of genetic heterogeneity on the species level. It...

    Authors: Adeola M. Rotimi, Rian Pierneef and Oleg N. Reva

    Citation: BMC Bioinformatics 2018 19:309

  12. Content type: Software

    Microarray experiments comprise more than half of all series in the Gene Expression Omnibus (GEO). However, downloading and analyzing raw or semi-processed microarray data from GEO is not intuitive and require...

    Authors: Maria Luisa Amaral, Galina A. Erikson and Maxim N. Shokhirev

    Citation: BMC Bioinformatics 2018 19:296

  13. Content type: Software

    Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and ...

    Authors: Saurabh Baheti, Xiaojia Tang, Daniel R. O’Brien, Nicholas Chia, Lewis R. Roberts, Heidi Nelson, Judy C. Boughey, Liewei Wang, Matthew P. Goetz, Jean-Pierre A. Kocher and Krishna R. Kalari

    Citation: BMC Bioinformatics 2018 19:271

  14. Content type: Software

    Methylated RNA immunoprecipitation sequencing (MeRIP-seq or m6A-seq) has been extensively used for profiling transcriptome-wide distribution of RNA N6-methyladenosine methylation. However, due to the intrinsic pr...

    Authors: Teng Zhang, Shao-Wu Zhang, Lin Zhang and Jia Meng

    Citation: BMC Bioinformatics 2018 19:260

  15. Content type: Research article

    Gene expression in plant chloroplasts and mitochondria is affected by RNA editing. Numerous C-to-U conversions, accompanied by reverse U-to-C exchanges in some plant clades, alter the genetic information encod...

    Authors: Henning Lenz, Anke Hein and Volker Knoop

    Citation: BMC Bioinformatics 2018 19:255

  16. Content type: Software

    Computation of reaction similarity is a pre-requisite for several bioinformatics applications including enzyme identification for specific biochemical reactions, enzyme classification and mining for specific i...

    Authors: Tadi Venkata Sivakumar, Anirban Bhaduri, Rajasekhara Reddy Duvvuru Muni, Jin Hwan Park and Tae Yong Kim

    Citation: BMC Bioinformatics 2018 19:254

  17. Content type: Software

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters...

    Authors: Nikola Tom, Ondrej Tom, Jitka Malcikova, Sarka Pavlova, Blanka Kubesova, Tobias Rausch, Miroslav Kolarik, Vladimir Benes, Vojtech Bystry and Sarka Pospisilova

    Citation: BMC Bioinformatics 2018 19:243

  18. Content type: Software

    The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assembli...

    Authors: Lauren Coombe, Jessica Zhang, Benjamin P. Vandervalk, Justin Chu, Shaun D. Jackman, Inanc Birol and René L. Warren

    Citation: BMC Bioinformatics 2018 19:234

  19. Content type: Software

    Mutational signatures have been proved as a valuable pattern in somatic genomics, mainly regarding cancer, with a potential application as a biomarker in clinical practice. Up to now, several bioinformatic pac...

    Authors: Marcos Díaz-Gay, Maria Vila-Casadesús, Sebastià Franch-Expósito, Eva Hernández-Illán, Juan José Lozano and Sergi Castellví-Bel

    Citation: BMC Bioinformatics 2018 19:224

  20. Content type: Methodology article

    Discovering over-represented approximate motifs in DNA sequences is an essential part of bioinformatics. This topic has been studied extensively because of the increasing number of potential applications. Howe...

    Authors: Chadi Saad, Laurent Noé, Hugues Richard, Julie Leclerc, Marie-Pierre Buisine, Hélène Touzet and Martin Figeac

    Citation: BMC Bioinformatics 2018 19:223

  21. Content type: Software

    Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called s...

    Authors: Jongin Lee, Daehwan Lee, Mikang Sim, Daehong Kwon, Juyeon Kim, Younhee Ko and Jaebum Kim

    Citation: BMC Bioinformatics 2018 19:216

  22. Content type: Software

    Large sequence datasets are difficult to visualize and handle. Additionally, they often do not represent a random subset of the natural diversity, but the result of uncoordinated and convenience sampling. Cons...

    Authors: Fabrizio Menardo, Chloé Loiseau, Daniela Brites, Mireia Coscolla, Sebastian M. Gygli, Liliana K. Rutaihwa, Andrej Trauner, Christian Beisel, Sonia Borrell and Sebastien Gagneux

    Citation: BMC Bioinformatics 2018 19:164

  23. Content type: Research article

    In the last decade and a half it has been firmly established that a large number of proteins do not adopt a well-defined (ordered) structure under physiological conditions. Such intrinsically disordered protei...

    Authors: Nenad S. Mitić, Saša N. Malkov, Jovana J. Kovačević, Gordana M. Pavlović-Lažetić and Miloš V. Beljanski

    Citation: BMC Bioinformatics 2018 19:158

  24. Content type: Database

    Bioactive peptides, including biological sources-derived peptides with different biological activities, are protein fragments that influence the functions or conditions of organisms, in particular humans and a...

    Authors: Krittima Anekthanakul, Apiradee Hongsthong, Jittisak Senachak and Marasri Ruengjitchatchawalya

    Citation: BMC Bioinformatics 2018 19:149

  25. Content type: Software

    The study of the huge diversity of immune receptors, often referred to as immune repertoire profiling, is a prerequisite for diagnosis, prognostication and monitoring of hematological disorders. In the era of ...

    Authors: Christos Maramis, Athanasios Gkoufas, Anna Vardi, Evangelia Stalika, Kostas Stamatopoulos, Anastasia Hatzidimitriou, Nicos Maglaveras and Ioanna Chouvarda

    Citation: BMC Bioinformatics 2018 19:144

  26. Content type: Research article

    After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires l...

    Authors: Yingxue Ren, Joseph S. Reddy, Cyril Pottier, Vivekananda Sarangi, Shulan Tian, Jason P. Sinnwell, Shannon K. McDonnell, Joanna M. Biernacka, Minerva M. Carrasquillo, Owen A. Ross, Nilüfer Ertekin-Taner, Rosa Rademakers, Matthew Hudson, Liudmila Sergeevna Mainzer and Yan W. Asmann

    Citation: BMC Bioinformatics 2018 19:139

  27. Content type: Research article

    Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA ...

    Authors: M. Heath Farris, Andrew R. Scott, Pamela A. Texter, Marta Bartlett, Patricia Coleman and David Masters

    Citation: BMC Bioinformatics 2018 19:126

  28. Content type: Methodology article

    High quality functional annotation is essential for understanding the phenotypic consequences encoded in a genome. Despite improvements in bioinformatics methods, millions of sequences in databanks are not ass...

    Authors: Jonathan Mercier, Adrien Josso, Claudine Médigue and David Vallenet

    Citation: BMC Bioinformatics 2018 19:132

  29. Content type: Methodology article

    Genome-wide association studies (GWASs) have been widely used to discover the genetic basis of complex phenotypes. However, standard single-SNP GWASs suffer from lack of power. In particular, they do not direc...

    Authors: Christine Sinoquet

    Citation: BMC Bioinformatics 2018 19:106

  30. Content type: Software

    Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by model...

    Authors: Andrey Ziyatdinov, Miquel Vázquez-Santiago, Helena Brunel, Angel Martinez-Perez, Hugues Aschard and Jose Manuel Soria

    Citation: BMC Bioinformatics 2018 19:68

  31. Content type: Software

    Small RNA molecules play important roles in many biological processes and their dysregulation or dysfunction can cause disease. The current method of choice for genome-wide sRNA expression profiling is deep se...

    Authors: Raza-Ur Rahman, Abhivyakti Gautam, Jörn Bethune, Abdul Sattar, Maksims Fiosins, Daniel Sumner Magruder, Vincenzo Capece, Orr Shomroni and Stefan Bonn

    Citation: BMC Bioinformatics 2018 19:54

  32. Content type: Research article

    The ease with which influenza virus sequence data can be used to estimate antigenic relationships between strains and the existence of databases containing sequence data for hundreds of thousands of influenza strai...

    Authors: Christopher S. Anderson, Patrick R. McCall, Harry A. Stern, Hongmei Yang and David J. Topham

    Citation: BMC Bioinformatics 2018 19:51

  33. Content type: Software

    The advent of modern high-throughput genetics continually broadens the gap between the rising volume of sequencing data, and the tools required to process them. The need to pinpoint a small subset of functiona...

    Authors: Monika Mozere, Mehmet Tekman, Jameela Kari, Detlef Bockenhauer, Robert Kleta and Horia Stanescu

    Citation: BMC Bioinformatics 2018 19:46

  34. Content type: Research article

    The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly acces...

    Authors: Dorota H. Sendorek, Cristian Caloian, Kyle Ellrott, J. Christopher Bare, Takafumi N. Yamaguchi, Adam D. Ewing, Kathleen E. Houlahan, Thea C. Norman, Adam A. Margolin, Joshua M. Stuart and Paul C. Boutros

    Citation: BMC Bioinformatics 2018 19:28

  35. Content type: Methodology article

    Protein or nucleic acid sequences contain a multitude of associated annotations representing continuous sequence elements (CSEs). Comparing these CSEs is needed, whenever we want to match identical annotations...

    Authors: Roman Prytuliak, Friedhelm Pfeiffer and Bianca Hermine Habermann

    Citation: BMC Bioinformatics 2018 19:24

  36. Content type: Software

    Genetic association studies (GAS) aim to evaluate the association between genetic variants and phenotypes. In the last few years, the number of this type of study has increased exponentially, but the results ...

    Authors: Jordi Martorell-Marugan, Daniel Toro-Dominguez, Marta E. Alarcon-Riquelme and Pedro Carmona-Saez

    Citation: BMC Bioinformatics 2017 18:563

  37. Content type: Software

    High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number ...

    Authors: Kenneth D. Doig, Jason Ellul, Andrew Fellowes, Ella R. Thompson, Georgina Ryland, Piers Blombery, Anthony T. Papenfuss and Stephen B. Fox

    Citation: BMC Bioinformatics 2017 18:555

  38. Content type: Software

    Haloplex targeted resequencing is a popular method to analyze both germline and somatic variants in gene panels. However, involved wet-lab procedures may introduce false positives that need to be considered in...

    Authors: Matthias Beyens, Nele Boeckx, Guy Van Camp, Ken Op de Beeck and Geert Vandeweyer

    Citation: BMC Bioinformatics 2017 18:554

  39. Content type: Methodology article

    Accurate structural annotation depends on well-trained gene prediction programs. Training data for gene prediction programs are often chosen randomly from a subset of high-quality genes that ideally represent ...

    Authors: Megan J. Bowman, Jane A. Pulman, Tiffany L. Liu and Kevin L. Childs

    Citation: BMC Bioinformatics 2017 18:522

2017 Journal Metrics

  • Citation Impact
    2.213 - 2-year Impact Factor
    3.114 - 5-year Impact Factor
    0.878 - Source Normalized Impact per Paper (SNIP)
    1.479 - SCImago Journal Rank (SJR)

  • Social Media Impact
    4446 mentions