Machine Learning and Artificial Intelligence in Bioinformatics

Section edited by Jean-Philippe Vert

This section covers recent advances in machine learning and artificial intelligence methods, including their applications to problems in bioinformatics. It considers manuscripts describing novel computational techniques to analyse high-throughput data such as sequences and gene/protein expression, as well as machine learning techniques such as graphical models, neural networks or kernel methods.

  1. This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph.

    Authors: Ewan Carr, Mathieu Carrière, Bertrand Michel, Frédéric Chazal and Raquel Iniesta

    Citation: BMC Bioinformatics 2021 22:449

    Content type: Software

  2. One of the major challenges in precision medicine is accurate prediction of individual patient’s response to drugs. A great number of computational methods have been developed to predict compounds activity usi...

    Authors: Zhaorui Zuo, Penglei Wang, Xiaowei Chen, Li Tian, Hui Ge and Dahong Qian

    Citation: BMC Bioinformatics 2021 22:434

    Content type: Methodology article

  3. Modern Next Generation- and Third Generation- Sequencing methods such as Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in Deep Learning hav...

    Authors: Anand Ramachandran, Steven S. Lumetta, Eric W. Klee and Deming Chen

    Citation: BMC Bioinformatics 2021 22:404

    Content type: Methodology article

  4. Autism spectrum disorders (ASD) imply a spectrum of symptoms rather than a single phenotype. ASD could affect brain connectivity at different degree based on the severity of the symptom. Given their excellent ...

    Authors: Jinlong Hu, Lijie Cao, Tenghui Li, Shoubin Dong and Ping Li

    Citation: BMC Bioinformatics 2021 22:379

    Content type: Methodology article

  5. Plant pathogens cause billions of dollars of crop loss every year and are a major threat to global food security. Effector proteins are the tools such pathogens use to infect the cell, predicting effectors de ...

    Authors: Ruth Kristianingsih and Dan MacLean

    Citation: BMC Bioinformatics 2021 22:372

    Content type: Software

  6. The topology of metabolic networks is both well-studied and remarkably well-conserved across many species. The regulation of these networks, however, is much more poorly characterized, though it is known to be...

    Authors: Justin Y. Lee, Britney Nguyen, Carlos Orosco and Mark P. Styczynski

    Citation: BMC Bioinformatics 2021 22:365

    Content type: Methodology article

  7. Localization of messenger RNAs (mRNAs) plays a crucial role in the growth and development of cells. Particularly, it plays a major role in regulating spatio-temporal gene expression. The in situ hybridization ...

    Authors: Prabina Kumar Meher, Anil Rai and Atmakuri Ramakrishna Rao

    Citation: BMC Bioinformatics 2021 22:342

    Content type: Methodology article

  8. Epigenetic modifications, including CG methylation (a major form of DNA methylation) and histone modifications, interact with each other to shape their genomic distribution patterns. However, the entire pictur...

    Authors: Wan Kin Au Yeung, Osamu Maruyama and Hiroyuki Sasaki

    Citation: BMC Bioinformatics 2021 22:341

    Content type: Research article

  9. Approximate Bayesian Computation (ABC) has become a key tool for calibrating the parameters of discrete stochastic biochemical models. For higher dimensional models and data, its performance is strongly depend...

    Authors: Richard M. Jiang, Fredrik Wrede, Prashant Singh, Andreas Hellander and Linda R. Petzold

    Citation: BMC Bioinformatics 2021 22:339

    Content type: Methodology article

  10. MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression post-transcriptionally via base-pairing with complementary sequences on messenger RNAs (mRNAs). Due to the technical challenges involv...

    Authors: Gilad Ben Or and Isana Veksler-Lublinsky

    Citation: BMC Bioinformatics 2021 22:264

    Content type: Research article

  11. Pseudogenes are non-functional copies of protein coding genes that typically follow a different molecular evolutionary path as compared to functional genes. The inclusion of pseudogene sequences in DNA barcodi...

    Authors: T. M. Porter and M. Hajibabaei

    Citation: BMC Bioinformatics 2021 22:256

    Content type: Methodology article

  12. Motivated by the size and availability of cell line drug sensitivity data, researchers have been developing machine learning (ML) models for predicting drug response to advance cancer treatment. As drug sensit...

    Authors: Alexander Partin, Thomas Brettin, Yvonne A. Evrard, Yitan Zhu, Hyunseung Yoo, Fangfang Xia, Songhao Jiang, Austin Clyde, Maulik Shukla, Michael Fonstein, James H. Doroshow and Rick L. Stevens

    Citation: BMC Bioinformatics 2021 22:252

    Content type: Research article

  13. The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider...

    Authors: Milad Mostavi, Yu-Chiao Chiu, Yidong Chen and Yufei Huang

    Citation: BMC Bioinformatics 2021 22:244

    Content type: Research article

  14. Current methods in machine learning provide approaches for solving challenging, multiple constraint design problems. While deep learning and related neural networking methods have state-of-the-art performance,...

    Authors: Kyle Boone, Cate Wisdom, Kyle Camarda, Paulette Spencer and Candan Tamerler

    Citation: BMC Bioinformatics 2021 22:239

    Content type: Research article

  15. Genes implicated in tumorigenesis often exhibit diverse sets of genomic variants in the tumor cohorts within which they are frequently mutated. For many genes, neither the transcriptomic effects of these varia...

    Authors: Michal R. Grzadkowski, Hannah D. Holly, Julia Somers and Emek Demir

    Citation: BMC Bioinformatics 2021 22:233

    Content type: Research article

  16. Epitope prediction is a useful approach in cancer immunology and immunotherapy. Many computational methods, including machine learning and network analysis, have been developed quickly for such purposes. Howev...

    Authors: Xiaoyun Yang, Liyuan Zhao, Fang Wei and Jing Li

    Citation: BMC Bioinformatics 2021 22:231

    Content type: Methodology article

  17. The identification of gene–gene and gene–environment interactions in genome-wide association studies is challenging due to the unknown nature of the interactions and the overwhelmingly large number of possible...

    Authors: Pål V. Johnsen, Signe Riemer-Sørensen, Andrew Thomas DeWan, Megan E. Cahill and Mette Langaas

    Citation: BMC Bioinformatics 2021 22:230

    Content type: Methodology article

  18. The Cox proportional hazards model is commonly used to predict hazard ratio, which is the risk or probability of occurrence of an event of interest. However, the Cox proportional hazard model cannot directly g...

    Authors: Eu-Tteum Baek, Hyung Jeong Yang, Soo Hyung Kim, Guee Sang Lee, In-Jae Oh, Sae-Ryung Kang and Jung-Joon Min

    Citation: BMC Bioinformatics 2021 22:192

    Content type: Methodology article

  19. The genomics data analysis has been widely used to study disease genes and drug targets. However, the existence of missing values in genomics datasets poses a significant problem, which severely hinders the us...

    Authors: Xinshan Zhu, Jiayu Wang, Biao Sun, Chao Ren, Ting Yang and Jie Ding

    Citation: BMC Bioinformatics 2021 22:188

    Content type: Methodology article

  20. Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and simi...

    Authors: Khushnood Abbas, Alireza Abbasi, Shi Dong, Ling Niu, Laihang Yu, Bolun Chen, Shi-Min Cai and Qambar Hasan

    Citation: BMC Bioinformatics 2021 22:187

    Content type: Research article

  21. Microsatellite instability (MSI) is a common genomic alteration in colorectal cancer, endometrial carcinoma, and other solid tumors. MSI is characterized by a high degree of polymorphism in microsatellite leng...

    Authors: Tao Zhou, Libin Chen, Jing Guo, Mengmeng Zhang, Yanrui Zhang, Shanbo Cao, Feng Lou and Haijun Wang

    Citation: BMC Bioinformatics 2021 22:185

    Content type: Software

  22. The interactions of proteins are determined by their sequences and affect the regulation of the cell cycle, signal transduction and metabolism, which is of extraordinary significance to modern proteomics resea...

    Authors: Yang Wang, Zhanchao Li, Yanfei Zhang, Yingjun Ma, Qixing Huang, Xingyu Chen, Zong Dai and Xiaoyong Zou

    Citation: BMC Bioinformatics 2021 22:184

    Content type: Research article

  23. Identifying lncRNA-disease associations not only helps to better comprehend the underlying mechanisms of various human diseases at the lncRNA level but also speeds up the identification of potential biomarkers...

    Authors: Rong Zhu, Yong Wang, Jin-Xing Liu and Ling-Yun Dai

    Citation: BMC Bioinformatics 2021 22:175

    Content type: Methodology article

  24. Supervised learning from high-throughput sequencing data presents many challenges. For one, the curse of dimensionality often leads to overfitting as well as issues with scalability. This can bring about inacc...

    Authors: Trevor S. Frisby, Shawn J. Baker, Guillaume Marçais, Quang Minh Hoang, Carl Kingsford and Christopher J. Langmead

    Citation: BMC Bioinformatics 2021 22:174

    Content type: Methodology article

  25. To address the need for easy and reliable species classification in plant genetic resources collections, we assessed the potential of five classifiers (Random Forest, Neighbour-Joining, 1-Nearest Neighbour, a ...

    Authors: Artur van Bemmelen van der Plaat, Rob van Treuren and Theo J. L. van Hintum

    Citation: BMC Bioinformatics 2021 22:173

    Content type: Research article

  26. Recent studies have confirmed that N7-methylguanosine (m7G) modification plays an important role in regulating various biological processes and has associations with multiple diseases. Wet-lab experiments are cos...

    Authors: Jiani Ma, Lin Zhang, Jin Chen, Bowen Song, Chenxuan Zang and Hui Liu

    Citation: BMC Bioinformatics 2021 22:152

    Content type: Software

  27. Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. ...

    Authors: Kevin De Angeli, Shang Gao, Mohammed Alawad, Hong-Jun Yoon, Noah Schaefferkoetter, Xiao-Cheng Wu, Eric B. Durbin, Jennifer Doherty, Antoinette Stroup, Linda Coyle, Lynne Penberthy and Georgia Tourassi

    Citation: BMC Bioinformatics 2021 22:113

    Content type: Research article

  28. Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smear is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, a...

    Authors: Fetulhak Abdurahman, Kinde Anlay Fante and Mohammed Aliy

    Citation: BMC Bioinformatics 2021 22:112

    Content type: Research article

  29. Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more imp...

    Authors: Mateusz Garbulowski, Klev Diamanti, Karolina Smolińska, Nicholas Baltzer, Patricia Stoll, Susanne Bornelöv, Aleksander Øhrn, Lars Feuk and Jan Komorowski

    Citation: BMC Bioinformatics 2021 22:110

    Content type: Software

  30. The accumulation of various multi-omics data and computational approaches for data integration can accelerate the development of precision medicine. However, the algorithm development for multi-omics data inte...

    Authors: Yuqi Wen, Xinyu Song, Bowei Yan, Xiaoxi Yang, Lianlian Wu, Dongjin Leng, Song He and Xiaochen Bo

    Citation: BMC Bioinformatics 2021 22:97

    Content type: Methodology article

  31. Microbes perform a fundamental economic, social, and environmental role in our society. Metagenomics makes it possible to investigate microbes in their natural environments (the complex communities) and their ...

    Authors: Raíssa Silva, Kleber Padovani, Fabiana Góes and Ronnie Alves

    Citation: BMC Bioinformatics 2021 22:87

    Content type: Software

  32. The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated to multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting thes...

    Authors: Camilo Broc, Therese Truong and Benoit Liquet

    Citation: BMC Bioinformatics 2021 22:86

    Content type: Methodology article

  33. In the last decade, Genome-wide Association studies (GWASs) have contributed to decoding the human genome by uncovering many genetic variations associated with various diseases. Many follow-up investigations i...

    Authors: Haohan Wang, Fen Pei, Michael M. Vanyukov, Ivet Bahar, Wei Wu and Eric P. Xing

    Citation: BMC Bioinformatics 2021 22:50

    Content type: Methodology article

  34. Survival analysis is an important part of cancer studies. In addition to the existing Cox proportional hazards model, deep learning models have recently been proposed in survival prediction, which directly int...

    Authors: Jiarui Feng, Heming Zhang and Fuhai Li

    Citation: BMC Bioinformatics 2021 22:47

    Content type: Methodology article

  35. Differential expression and feature selection analyses are essential steps for the development of accurate diagnostic/prognostic classifiers of complicated human diseases using transcriptomics data. These step...

    Authors: Liangqun Lu, Kevin A. Townsend and Bernie J. Daigle Jr.

    Citation: BMC Bioinformatics 2021 22:44

    Content type: Methodology article

  36. Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suf...

    Authors: Tara Eicher, Jany Chan, Han Luu, Raghu Machiraju and Ewy A. Mathé

    Citation: BMC Bioinformatics 2021 22:35

    Content type: Methodology article

  37. Predicting the response of cancer cell lines to specific drugs is an essential problem in personalized medicine. Since drug response is closely associated with genomic information in cancer cells, some large p...

    Authors: Akram Emdadi and Changiz Eslahchi

    Citation: BMC Bioinformatics 2021 22:33

    Content type: Methodology article

Annual Journal Metrics

  • Speed
    70 days to first decision for reviewed manuscripts only
    44 days to first decision for all manuscripts
    163 days from submission to acceptance
    36 days from acceptance to publication

  • Citation Impact
    3.169 - 2-year Impact Factor
    3.629 - 5-year Impact Factor
    1.276 - Source Normalized Impact per Paper (SNIP)
    1.567 - SCImago Journal Rank (SJR)

  • Usage
    5,167,186 Downloads
    5,089 Altmetric Mentions