InCoB2014: bioinformatics to tackle the data to knowledge challenge
BMC Bioinformatics volume 15, Article number: I1 (2014)
Since 2006, the International Conference on Bioinformatics (InCoB) has been publishing selected papers in BMC Bioinformatics. Papers within the scope of the journal from the 13th InCoB, held 31 July-2 August 2014 in Sydney, Australia, have been compiled in this supplement. These span protein and proteome informatics, structural bioinformatics, software development and bioimaging, pharmacoinformatics and disease informatics, representing the breadth of bioinformatics research in the Asia-Pacific.
InCoB (the International Conference on Bioinformatics) has served as the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet) since 2002, and Sydney, Australia was the venue for the 13th InCoB, 31 July-2 August 2014. In order to provide our region with publications in international, peer-reviewed journals with impact factors rather than printed books of conference proceedings, APBioNet has set up a rigorous peer review protocol and has accepted the best InCoB papers in BMC Bioinformatics supplements since 2006, gradually adding BMC Genomics and BMC Systems Biology supplements, and BMC Medical Genomics as well this year. We briefly review the articles in this supplement, providing the 2014 bioinformatics research update from the APBioNet community.
Manuscript submission and review
InCoB2014 provided authors the choice of submitting original research as full manuscripts to either the BMC track (supplement issues of BMC Bioinformatics, BMC Systems Biology or BMC Genomics) or to a special issue of PeerJ. The statistics for paper submission and acceptance, details of the peer review process, and links to the BMC Systems Biology, BMC Medical Genomics and PeerJ supplements are provided in the InCoB2014 BMC Genomics supplement introduction; the 16 “bioinformatics” articles are briefly overviewed here.
Protein and proteome informatics
Proteins display diverse functionality, usually as a consequence of specific binding sites or sequence motifs. Dipeptide propensity scores have been used to predict heme binding proteins successfully. Among biologically important post-translational modifications, O-linked glycosylation of serine and threonine residues is elusive in that no clear sequence motif is associated with the modification site. Wu et al. have applied support vector machine (SVM) learning to this problem, outperforming three other currently available tools. While mass spectrometry is frequently used to identify phosphorylated proteins, the low abundance of phosphopeptides in a sample is an obstacle to data analysis. iPhos offers an innovative workflow system for streamlining phosphoproteome analysis.
The biological function of a protein is ascribed to its 3D structure, and five papers [6–10] provide updates on structural bioinformatics research. For proteins that have neither experimental structural information nor structural homologues, predicting the 3D structure remains a challenge. Paliwal et al. have integrated evolutionary information and secondary structure prediction for protein 3D fold recognition, while Bhageerath-H provides a novel ab initio and homology-based hybrid tertiary structure prediction server. For multi-domain proteins, the domain boundaries may be delineated by identifying inter-domain linker regions, for subsequent prediction of domain 3D structures using either of the two approaches above. Liu et al. have analysed all available protein 3D structures and developed a new approach to discriminate between biologically relevant protein interactions and crystal packing contacts, while IFACEwat by Su et al. seeks to predict near native structures of protein-protein complexes with the inclusion of solvent molecules at the interface.
Coevolution is an unusual phenomenon observed among species from different phyla exerting selective pressures on one another, making cophylogenetic analysis of these species very difficult. TreeCollapse addresses the cophylogeny problem indirectly by using common topological patterns. In the area of bioimaging, phenotypic changes associated with development can be quantitatively analysed by live imaging of muscle tissue using the software tool FMAj. These tools enable scientific advances in research areas where high quality software is lacking.
Where a number of bioinformatics software programs exist, quality assurance strategies are required to verify and validate the results generated. Ho and coworkers show that metamorphic testing can be effectively applied to evaluate the results from two programs, BWA and Bowtie.
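The idea behind metamorphic testing can be illustrated with a minimal Python sketch. It is not the test suite used by Ho and coworkers, and it does not call BWA or Bowtie: `toy_align` and `buggy_align` are hypothetical stand-ins for an aligner, and the two relations (reordering and duplicating the input reads) are assumed here purely for illustration. The point is that, without knowing the correct alignment of any read, one can still check that related inputs produce consistent outputs.

```python
def toy_align(reference, reads):
    """Hypothetical stand-in for an aligner: first exact-match position, -1 if unmapped."""
    return [(read, reference.find(read)) for read in reads]


def buggy_align(reference, reads):
    """Deliberately flawed stand-in: silently fails to map the first read of a batch."""
    return [(read, -1 if i == 0 else reference.find(read))
            for i, read in enumerate(reads)]


def holds_under_reordering(align, reference, reads):
    """Relation 1 (assumed): reversing the read order must not change any mapping."""
    original = align(reference, reads)
    reordered = align(reference, list(reversed(reads)))
    return sorted(original) == sorted(reordered)


def holds_under_duplication(align, reference, reads):
    """Relation 2 (assumed): appending a copy of the reads must repeat the same mappings."""
    original = align(reference, reads)
    doubled = align(reference, reads + reads)
    n = len(original)
    return doubled[:n] == original and doubled[n:] == original


if __name__ == "__main__":
    reference = "ACGTTGCAAGCT"
    reads = ["ACGT", "TGCA", "AGCT"]
    for aligner in (toy_align, buggy_align):
        print(aligner.__name__,
              holds_under_reordering(aligner, reference, reads),
              holds_under_duplication(aligner, reference, reads))
```

Neither check needs a gold-standard alignment: a violated relation signals a defect, while a satisfied relation only increases confidence and does not prove correctness.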
Pharmacoinformatics and disease informatics
Generating a 3D map of chemical features of known ligands (a ‘pharmacophore’) for effective drug design remains a difficult problem, which has been addressed by a pharmacophore-assisted iterative closest point (ICP) method. Grover et al. have used virtual screening of the glucagon receptor to identify novel natural inhibitors as potential therapeutic candidates for combating type 2 diabetes.
Prevention is better than cure, especially in the case of Alzheimer’s disease, and Zhang et al. propose a genetic algorithm with logistic regression for the early diagnosis of this disease from the information available from non-invasive neuropsychological tests. This could result in treatment options to prevent the onset or slow down the progression of this disease.
On the other hand, adverse drug reactions leading to organ failure can pose a serious threat to a subset of sensitive patients. To identify such patients prior to drug therapy, nephrotoxicity can be predicted based on only two genes, while multi-organ failure can be anticipated using an integrative prediction score for gene expression profiles.
The articles in this supplement cover protein, proteome and structural bioinformatics, software packages as well as bioinformatics applications for drug development, early diagnosis of diseases and possible prevention of drug toxicity issues. We believe the Asia-Pacific is on track to participate in the ongoing NIH Big Data to Knowledge (BD2K) and other similar global initiatives. We welcome you to attend our 2015 InCoB meeting, to be held jointly with the Genome Informatics Workshop (GIW) in Tokyo, Japan, to contribute to this regional bioinformatics effort.
References
1. The Asia-Pacific Bioinformatics Network. [http://www.apbionet.org]
2. Schönbach C, Tan TW, Ranganathan S: InCoB2014: mining biological data from genomics for transforming industry and health. BMC Genomics. 2014, 15 (Suppl 9): I1. doi:10.1186/1471-2164-15-S9-I1.
3. Liou YF, Charoenkwan P, Srinivasulu YS, Vasylenko T, Lai SC, Lee HC, Chen YH, Huang HL, Ho SY: SCMHBP: Prediction and analysis of heme binding proteins using propensity scores of dipeptides. BMC Bioinformatics. 2014, 15 (Suppl 16): S4. doi:10.1186/1471-2105-15-S16-S4.
4. Wu HY, Lu CT, Kao HJ, Chen YJ, Chen YJ, Lee TY: Characterization and identification of protein O-GlcNAcylation sites with substrate specificity. BMC Bioinformatics. 2014, 15 (Suppl 16): S1. doi:10.1186/1471-2105-15-S16-S1.
5. Yang TH, Chang HT, Hsiao ESL, Sun JL, Wang CC, Wu HY, Liao PC, Wu WS: iPhos: a toolkit to streamline the alkaline phosphatase-assisted comprehensive LC-MS phosphoproteome investigation. BMC Bioinformatics. 2014, 15 (Suppl 16): S10. doi:10.1186/1471-2105-15-S16-S10.
6. Paliwal KK, Sharma A, Lyons J, Dehzangi A: Improving protein fold recognition using the amalgamation of evolutionary-based and structural based information. BMC Bioinformatics. 2014, 15 (Suppl 16): S12. doi:10.1186/1471-2105-15-S16-S12.
7. Jayaram B, Dhingra P, Mishra A, Kaushik R, Mukherjee G, Singh A, Shekhar S: Bhageerath-H: A homology/ab initio hybrid server for predicting tertiary structures of monomeric soluble proteins. BMC Bioinformatics. 2014, 15 (Suppl 16): S7. doi:10.1186/1471-2105-15-S16-S7.
8. Shatnawi M, Zaki N, Yoo PD: Protein inter-domain linker prediction using random forest and amino acid physiochemical properties. BMC Bioinformatics. 2014, 15 (Suppl 16): S8. doi:10.1186/1471-2105-15-S16-S8.
9. Liu Q, Li Z, Li J: Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts. BMC Bioinformatics. 2014, 15 (Suppl 16): S3. doi:10.1186/1471-2105-15-S16-S3.
10. Su CTT, Nguyen TD, Zheng J, Kwoh CK: IFACEwat: the interfacial water-implemented reranking algorithm to improve the discrimination of near native structures for protein rigid docking. BMC Bioinformatics. 2014, 15 (Suppl 16): S9. doi:10.1186/1471-2105-15-S16-S9.
11. Drinkwater B, Charleston MA: Introducing TreeCollapse: a novel greedy algorithm to solve the cophylogeny reconstruction problem. BMC Bioinformatics. 2014, 15 (Suppl 16): S14. doi:10.1186/1471-2105-15-S16-S14.
12. Kuleesha Y, Choo PW, Feng L, Wasser M: FMAj: a tool for high content analysis of muscle dynamics in Drosophila metamorphosis. BMC Bioinformatics. 2014, 15 (Suppl 16): S6. doi:10.1186/1471-2105-15-S16-S6.
13. Giannoulatou E, Park SH, Humphreys DT, Ho JWK: Verification and validation of bioinformatics software without a gold standard: a case study of BWA and Bowtie. BMC Bioinformatics. 2014, 15 (Suppl 16): S15. doi:10.1186/1471-2105-15-S16-S15.
14. Zhou L, Griffith R, Gaeta BA: Combining spatial and chemical information for clustering pharmacophores. BMC Bioinformatics. 2014, 15 (Suppl 16): S5. doi:10.1186/1471-2105-15-S16-S5.
15. Grover S, Dhanjal JK, Goyal S, Grover A, Sundar D: Computational identification of novel natural inhibitors of glucagon receptor for checking type II diabetes mellitus. BMC Bioinformatics. 2014, 15 (Suppl 16): S13. doi:10.1186/1471-2105-15-S16-S13.
16. Johnson P, Vandewater L, Wilson W, Maruff P, Savage G, Graham P, Macaulay SL, Ellis KA, Szoeke C, Martins RN, Rowe CC, Masters CL, Ames D, Zhang P: Genetic algorithm with logistic regression for prediction of progression to Alzheimer’s disease. BMC Bioinformatics. 2014, 15 (Suppl 16): S11. doi:10.1186/1471-2105-15-S16-S11.
17. Su R, Li Y, Zink D, Loo LH: Supervised prediction of drug-induced nephrotoxicity based on interleukin-6 and -8 expression levels. BMC Bioinformatics. 2014, 15 (Suppl 16): S16. doi:10.1186/1471-2105-15-S16-S16.
18. Kim J, Shin M: An integrative model of multi-organ drug-induced toxicity prediction using gene-expression data. BMC Bioinformatics. 2014, 15 (Suppl 16): S2. doi:10.1186/1471-2105-15-S16-S2.
19. Big Data to Knowledge (BD2K) initiative. [http://bd2k.nih.gov/]
20. GIW-InCoB. 2015, [http://incob.apbionet.org/incob15]
Acknowledgements
We are indebted to all members of the Program Committee and additional reviewers for their efforts and time. We are grateful for the NSW T&I Conference grant awarded to SR for hosting InCoB2014, to our sponsors (the Australian Bioinformatics Network, the International Society for Computational Biology, Qiagen, Millennium Science and ABSciEx), and for material support from Macquarie University and Sydney Business Events. Special thanks go to ASN Events for running the conference smoothly. Finally, we are deeply grateful to Isobel Peters and Jennifer Egar of BMC, who have supported us through the supplement publication process.
This article has been published as part of BMC Bioinformatics Volume 15 Supplement 16, 2014: Thirteenth International Conference on Bioinformatics (InCoB2014): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/15/S16.
SR wrote the introduction. CS and SR (Program Committee Co-chairs) managed the review and editorial processes, respectively. TWT supported the post-acceptance manuscript processing.