Sequence analysis (applications)

Section edited by João Setubal

This section covers all aspects of sequence analysis applications, including but not limited to software, workflows, and web servers for applied sequence and genome analysis, sequence assembly, analysis of sequence features, and protein function and ligand binding estimated from sequence features.

  1. Content type: Software

    Exploration and processing of FASTQ files are the first steps in state-of-the-art data analysis workflows of Next Generation Sequencing (NGS) platforms. The large amount of data generated by these technologies...

    Authors: Leandro Gabriel Roser, Fernán Agüero and Daniel Oscar Sánchez

    Citation: BMC Bioinformatics 2019 20:361

  2. Content type: Research article

    Protein secondary structure (PSS) is critical to further predict the tertiary structure, understand protein function and design drugs. However, experimental techniques of PSS are time consuming and expensive, ...

    Authors: Yanbu Guo, Weihua Li, Bingyi Wang, Huiqing Liu and Dongming Zhou

    Citation: BMC Bioinformatics 2019 20:341

  3. Content type: Software

    Salmonella enterica is a major cause of bacterial food-borne disease worldwide. Immunological serotyping is the most commonly used typing method to characterize S. enterica isolates, but is time-consuming and req...

    Authors: Lang Yang, Xia Zhang, Yuqi Liu, Hao Li, Shaofu Qiu, Peng Li and Hongbin Song

    Citation: BMC Bioinformatics 2019 20:215

  4. Content type: Methodology article

    Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important to understand the interior nuclear mechanisms. Despite advances in hig...

    Authors: Maria Littmann, Tatyana Goldberg, Sebastian Seitz, Mikael Bodén and Burkhard Rost

    Citation: BMC Bioinformatics 2019 20:205

  5. Content type: Software

    Genetic studies in tetraploids are lagging behind in comparison with studies of diploids as the complex genetics of tetraploids require much more elaborated computational methodologies. Recent advancements in ...

    Authors: Konrad Zych, Gerrit Gort, Chris A. Maliepaard, Ritsert C. Jansen and Roeland E. Voorrips

    Citation: BMC Bioinformatics 2019 20:148

  6. Content type: Research article

    The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the...

    Authors: Paweł Błażej, Małgorzata Wnetrzak, Dorota Mackiewicz and Paweł Mackiewicz

    Citation: BMC Bioinformatics 2019 20:114

  7. Content type: Software

    The accurate determination of parent-progeny relationships within both in situ natural populations and ex situ genetic resource collections can greatly enhance plant breeding/domestication efforts and support ...

    Authors: Arthur T. O. Melo and Iago Hale

    Citation: BMC Bioinformatics 2019 20:108

  8. Content type: Software

    Recent comparative studies have brought to our attention how somatic mutation detection from next-generation sequencing data is still an open issue in bioinformatics, because different pipelines result in a lo...

    Authors: Noemi Di Nanni, Marco Moscatelli, Matteo Gnocchi, Luciano Milanesi and Ettore Mosca

    Citation: BMC Bioinformatics 2019 20:107

  9. Content type: Software

    High-throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) has become a routine tool for biodiversity survey and ecological studies. By including sample-specific tags in the primers prior to PC...

    Authors: Yoann Dufresne, Franck Lejzerowicz, Laure Apotheloz Perret-Gentil, Jan Pawlowski and Tristan Cordier

    Citation: BMC Bioinformatics 2019 20:88

  10. Content type: Software

    With the availability of well-assembled genomes of a growing number of organisms, identifying the bioinformatic basis of whole genome duplication (WGD) is a growing field of genomics. The most extant software ...

    Authors: Yongzhi Yang, Ying Li, Qiao Chen, Yongshuai Sun and Zhiqiang Lu

    Citation: BMC Bioinformatics 2019 20:75

  11. Content type: Software

    High throughput sequencing technologies have been increasingly used in basic genetic research as well as in clinical applications. More and more variants underlying Mendelian and complex diseases are being dis...

    Authors: Mohammad Zia, Paul Spurgeon, Adrian Levesque, Thomas Furlani and Jianxin Wang

    Citation: BMC Bioinformatics 2019 20:61

  12. Content type: Software

    High-throughput technologies for analyzing chromosome conformation at a genome scale have revealed that chromatin is organized in topologically associated domains (TADs). While TADs are relatively stable acros...

    Authors: Konstantin Okonechnikov, Serap Erkek, Jan O. Korbel, Stefan M. Pfister and Lukas Chavez

    Citation: BMC Bioinformatics 2019 20:60

  13. Content type: Software

    Long reads provide valuable information regarding the sequence composition of genomes. Long reads are usually very noisy which renders their alignments on the reference genome a daunting task. It may take days...

    Authors: Mostafa Hadadian Nejad Yousefi, Maziar Goudarzi and Seyed Abolfazl Motahari

    Citation: BMC Bioinformatics 2019 20:51

  14. Content type: Software

    With sequencing technologies becoming cheaper and easier to use, more groups are able to obtain whole genome sequences of viruses of public health and scientific importance. Submission of genomic data to NCBI ...

    Authors: Ryan C. Shean, Negar Makhsous, Graham D. Stoddard, Michelle J. Lin and Alexander L. Greninger

    Citation: BMC Bioinformatics 2019 20:48

  15. Content type: Software

    Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in la...

    Authors: Aleksandr Igorevich Zhernakov, Alexey Mikhailovich Afonin, Natalia Dmitrievna Gavriliuk, Olga Mikhailovna Moiseeva and Vladimir Aleksandrovich Zhukov

    Citation: BMC Bioinformatics 2019 20:45

  16. Content type: Software

    Simulation of genetic variants data is frequently required for the evaluation of statistical methods in the fields of human and animal genetics. Although a number of high-quality genetic simulators have been d...

    Authors: Apostolos Dimitromanolakis, Jingxiong Xu, Agnieszka Krol and Laurent Briollais

    Citation: BMC Bioinformatics 2019 20:26

  17. Content type: Software

    Around 1% of human proteins are predicted to contain a disordered and low complexity prion-like domain (PrLD). Mutations in PrLDs have been shown to promote a transition towards an aggregation-prone state in seve...

    Authors: Valentin Iglesias, Oscar Conchillo-Sole, Cristina Batlle and Salvador Ventura

    Citation: BMC Bioinformatics 2019 20:24

  18. Content type: Software

    Traditional map-based cloning approaches, used for the identification of desirable alleles, are extremely labour intensive and years can elapse between the mutagenesis and the detection of the polymorphism. Hi...

    Authors: Ghanasyam Rallapalli, Pilar Corredor-Moreno, Edward Chalstrey, Martin Page and Daniel MacLean

    Citation: BMC Bioinformatics 2019 20:9

  19. Content type: Software

    Direct-coupling analysis (DCA) is a method for protein contact prediction from sequence information alone. Its underlying principle is parameter estimation for a Hamiltonian interaction function stemming from ...

    Authors: Michael Schmidt and Kay Hamacher

    Citation: BMC Bioinformatics 2018 19:546

  20. Content type: Software

    As a result of its simplicity and high efficiency, the CRISPR-Cas system has been widely used as a genome editing tool. Recently, CRISPR base editors, which consist of deactivated Cas9 (dCas9) or Cas9 nickase ...

    Authors: Gue-Ho Hwang, Jeongbin Park, Kayeong Lim, Sunghyun Kim, Jihyeon Yu, Eunchong Yu, Sang-Tae Kim, Roland Eils, Jin-Soo Kim and Sangsu Bae

    Citation: BMC Bioinformatics 2018 19:542

  21. Content type: Software

    Various algorithms have been developed to predict fetal trisomies using cell-free DNA in non-invasive prenatal testing (NIPT). As basis for prediction, a control group of non-trisomy samples is needed. Predict...

    Authors: Lennart F. Johansson, Hendrik A. de Weerd, Eddy N. de Boer, Freerk van Dijk, Gerard J. te Meerman, Rolf H. Sijmons, Birgit Sikkema-Raddatz and Morris A. Swertz

    Citation: BMC Bioinformatics 2018 19:531

  22. Content type: Research article

    Antimicrobial peptides attract considerable interest as novel agents to combat infections. Their long-time potency across bacteria, viruses and fungi as part of diverse innate immune systems offers a solution ...

    Authors: Kyle Boone, Kyle Camarda, Paulette Spencer and Candan Tamerler

    Citation: BMC Bioinformatics 2018 19:469

  23. Content type: Research article

    Eqolisins are rare acid proteases found in archaea, bacteria and fungi. Certain fungi secrete acids as part of their lifestyle and interestingly these also have many eqolisin paralogs, up to nine paralogs have...

    Authors: Nicolás Stocchi, María Victoria Revuelta, Priscila Ailín Lanza Castronuovo, D. Mariano A. Vera and Arjen ten Have

    Citation: BMC Bioinformatics 2018 19:338

  24. Content type: Research article

    Spanins are phage lysis proteins required to disrupt the outer membrane. Phages employ either two-component spanins or unimolecular spanins in this final step of Gram-negative host lysis. Two-component spanins...

    Authors: Rohit Kongari, Manoj Rajaure, Jesse Cahill, Eric Rasche, Eleni Mijalis, Joel Berry and Ry Young

    Citation: BMC Bioinformatics 2018 19:326

  25. Content type: Methodology article

    Procedures for controlling the false discovery rate (FDR) are widely applied as a solution to the multiple comparisons problem of high-dimensional statistics. Current FDR-controlling procedures require accurat...

    Authors: Matthew M. Parks, Benjamin J. Raphael and Charles E. Lawrence

    Citation: BMC Bioinformatics 2018 19:323

  26. Content type: Software

    Metagenomic approaches have revealed the complexity of environmental microbiomes with the advancement in whole genome sequencing displaying a significant level of genetic heterogeneity on the species level. It...

    Authors: Adeola M. Rotimi, Rian Pierneef and Oleg N. Reva

    Citation: BMC Bioinformatics 2018 19:309

  27. Content type: Software

    Microarray experiments comprise more than half of all series in the Gene Expression Omnibus (GEO). However, downloading and analyzing raw or semi-processed microarray data from GEO is not intuitive and require...

    Authors: Maria Luisa Amaral, Galina A. Erikson and Maxim N. Shokhirev

    Citation: BMC Bioinformatics 2018 19:296

  28. Content type: Software

    Transfer of genetic material from microbes or viruses into the host genome is known as horizontal gene transfer (HGT). The integration of viruses into the human genome is associated with multiple cancers, and ...

    Authors: Saurabh Baheti, Xiaojia Tang, Daniel R. O’Brien, Nicholas Chia, Lewis R. Roberts, Heidi Nelson, Judy C. Boughey, Liewei Wang, Matthew P. Goetz, Jean-Pierre A. Kocher and Krishna R. Kalari

    Citation: BMC Bioinformatics 2018 19:271

  29. Content type: Software

    Methylated RNA immunoprecipitation sequencing (MeRIP-seq or m6A-seq) has been extensively used for profiling transcriptome-wide distribution of RNA N6-methyladenosine methylation. However, due to the intrinsic pr...

    Authors: Teng Zhang, Shao-Wu Zhang, Lin Zhang and Jia Meng

    Citation: BMC Bioinformatics 2018 19:260

  30. Content type: Research article

    Gene expression in plant chloroplasts and mitochondria is affected by RNA editing. Numerous C-to-U conversions, accompanied by reverse U-to-C exchanges in some plant clades, alter the genetic information encod...

    Authors: Henning Lenz, Anke Hein and Volker Knoop

    Citation: BMC Bioinformatics 2018 19:255

  31. Content type: Software

    Computation of reaction similarity is a pre-requisite for several bioinformatics applications including enzyme identification for specific biochemical reactions, enzyme classification and mining for specific i...

    Authors: Tadi Venkata Sivakumar, Anirban Bhaduri, Rajasekhara Reddy Duvvuru Muni, Jin Hwan Park and Tae Yong Kim

    Citation: BMC Bioinformatics 2018 19:254

  32. Content type: Software

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters...

    Authors: Nikola Tom, Ondrej Tom, Jitka Malcikova, Sarka Pavlova, Blanka Kubesova, Tobias Rausch, Miroslav Kolarik, Vladimir Benes, Vojtech Bystry and Sarka Pospisilova

    Citation: BMC Bioinformatics 2018 19:243

  33. Content type: Software

    The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assembli...

    Authors: Lauren Coombe, Jessica Zhang, Benjamin P. Vandervalk, Justin Chu, Shaun D. Jackman, Inanc Birol and René L. Warren

    Citation: BMC Bioinformatics 2018 19:234

  34. Content type: Software

    Mutational signatures have been proved as a valuable pattern in somatic genomics, mainly regarding cancer, with a potential application as a biomarker in clinical practice. Up to now, several bioinformatic pac...

    Authors: Marcos Díaz-Gay, Maria Vila-Casadesús, Sebastià Franch-Expósito, Eva Hernández-Illán, Juan José Lozano and Sergi Castellví-Bel

    Citation: BMC Bioinformatics 2018 19:224

  35. Content type: Methodology article

    Discovering over-represented approximate motifs in DNA sequences is an essential part of bioinformatics. This topic has been studied extensively because of the increasing number of potential applications. Howe...

    Authors: Chadi Saad, Laurent Noé, Hugues Richard, Julie Leclerc, Marie-Pierre Buisine, Hélène Touzet and Martin Figeac

    Citation: BMC Bioinformatics 2018 19:223

  36. Content type: Software

    Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called s...

    Authors: Jongin Lee, Daehwan Lee, Mikang Sim, Daehong Kwon, Juyeon Kim, Younhee Ko and Jaebum Kim

    Citation: BMC Bioinformatics 2018 19:216

  37. Content type: Software

    Large sequence datasets are difficult to visualize and handle. Additionally, they often do not represent a random subset of the natural diversity, but the result of uncoordinated and convenience sampling. Cons...

    Authors: Fabrizio Menardo, Chloé Loiseau, Daniela Brites, Mireia Coscolla, Sebastian M. Gygli, Liliana K. Rutaihwa, Andrej Trauner, Christian Beisel, Sonia Borrell and Sebastien Gagneux

    Citation: BMC Bioinformatics 2018 19:164

2018 Journal Metrics

  • Citation Impact
    2.511 - 2-year Impact Factor
    2.970 - 5-year Impact Factor
    0.855 - Source Normalized Impact per Paper (SNIP)
    1.374 - SCImago Journal Rank (SJR)


  • Social Media Impact
    4446 mentions