eccCL: parallelized GPU implementation of Ensemble Classifier Chains



Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., data derived from next-generation sequencing experiments. By adapting algorithms for use on graphics processing units, computational efficiency can be greatly improved through parallelization of computations.


Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage.


eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process more than 25,000 instances per second and can thus be used efficiently in high-throughput experiments. The software is available at


Multi-label classification (MLC) has gained significant attention in recent years in diverse fields of research, e.g., in protein function prediction [1] and text categorization [2], as well as in biomedical research [3–5]. For instance, in recent work the MLC concept of classifier chaining was applied to the problem of drug resistance prediction in HIV [6].

The concept of Classifier Chains (CC) is a generalization of binary classification. In MLC, each instance is associated with a set of labels instead of a single label as in binary classification. Formally, let L = {l_1, …, l_m} be a set of class labels and Y the power set of L, which defines the possible label combinations. Let X be the input space, where each vector x ∈ X represents an instance, e.g., a protein sequence, which is associated with a label set in Y. The idea of CC is to generate a single classifier for each l ∈ L and to link the single classifiers along a chain. The general concept of classifier chaining is illustrated for three labels in Fig. 1. One major advantage of classifier chaining is that interdependencies between class labels can be modeled, e.g., in the case of drug resistance prediction, where resistance to one drug type might also be indicative of resistance to another drug. However, the order of the classifiers in a CC may influence the prediction accuracy due to error propagation [7]. An extension that overcomes these effects is Ensemble Classifier Chains (ECC) [8]. In this approach, k classifier chains are trained, each with a random label order and a different subset of the training data. The prediction outcomes are then combined by a voting scheme, e.g., by thresholding the predictions for each label across the chains. Overall, the concept of classifier chaining has been shown to improve prediction accuracy, particularly when applied as ECC [9, 10].

Fig. 1

General concept of classifier chaining. During training, classifier C_i knows the labels L_0, ..., L_(i−1) handled by classifiers C_0, ..., C_(i−1); during classification, it receives the results of classifiers C_0, ..., C_(i−1). Here, the concept of classifier chaining is depicted for three class labels
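The chaining and voting ideas described above can be sketched in a few lines of Java. This is a minimal illustration only, not the eccCL code: the class name `ClassifierChainSketch`, the methods `predictChain`, `vote` and `toyChain`, and the hand-written toy classifiers are all hypothetical, and the sketch covers prediction only (training with random label orders and data subsets is left out).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

public class ClassifierChainSketch {

    /**
     * Prediction with one chain. classifiers.get(i) predicts label order[i]
     * from the original features followed by the i predictions made so far.
     */
    public static int[] predictChain(List<ToIntFunction<double[]>> classifiers,
                                     int[] order, double[] x) {
        int m = classifiers.size();
        int[] labels = new int[m];
        // Original features plus m slots for the predictions of earlier links.
        double[] extended = Arrays.copyOf(x, x.length + m);
        for (int i = 0; i < m; i++) {
            // Link i only sees the features and predictions 0..i-1.
            double[] input = Arrays.copyOf(extended, x.length + i);
            int prediction = classifiers.get(i).applyAsInt(input);
            extended[x.length + i] = prediction;
            labels[order[i]] = prediction;
        }
        return labels;
    }

    /** Ensemble vote: a label is set if the fraction of chains voting 1 reaches the threshold. */
    public static int[] vote(int[][] chainPredictions, double threshold) {
        int m = chainPredictions[0].length;
        int[] result = new int[m];
        for (int label = 0; label < m; label++) {
            int votes = 0;
            for (int[] prediction : chainPredictions) {
                votes += prediction[label];
            }
            result[label] = (double) votes / chainPredictions.length >= threshold ? 1 : 0;
        }
        return result;
    }

    /** Hypothetical hand-written chain for three labels and one input feature. */
    public static List<ToIntFunction<double[]>> toyChain() {
        ToIntFunction<double[]> c0 = in -> in[0] > 0.5 ? 1 : 0;          // feature only
        ToIntFunction<double[]> c1 = in -> (int) in[1];                  // copies label 0
        ToIntFunction<double[]> c2 = in -> in[1] + in[2] > 1.5 ? 1 : 0;  // needs both earlier labels
        return Arrays.asList(c0, c1, c2);
    }

    public static void main(String[] args) {
        int[] single = predictChain(toyChain(), new int[] {0, 1, 2}, new double[] {0.9});
        System.out.println(Arrays.toString(single));
        int[][] perChain = {{1, 0, 1}, {1, 0, 0}, {0, 0, 1}};
        System.out.println(Arrays.toString(vote(perChain, 0.5)));
    }
}
```

In a real ensemble, each chain would use its own random `order` and would be trained on its own subset of the data; only the combination step is shown here.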

However, processing large amounts of data is increasingly necessary and typically comes with big data problems. In biomedical research, for example, the use of data generated by next-generation sequencing technologies or functional magnetic resonance imaging [11, 12] is still challenging, as currently available implementations lack computational efficiency. Therefore, parallelized architectures, especially graphics processing unit (GPU) implementations, might provide a remedy for expensive computing time [13, 14]. For example, Olejnik et al. [15] recently published a GPU implementation to predict co-receptor usage in HIV. Whereas the CPU implementation [16] was able to classify only a few instances per second, the parallelized and optimized GPU version processes a significantly larger number of instances per second.

Here, we provide a parallelized implementation of CC and ECC optimized for parallelized GPU usage. Our implementation is able to classify over 25,000 instances per second, whereas the sequential implementation on the CPU provided by the Mulan library is able to classify only 360 instances per second.


Our software is implemented in Java using the Lightweight Java Game Library (LWJGL), which enables the development of parallel computing applications based on OpenCL. The software can be used in Java as a library or CLI application, or with R by installing the R package eccCL. For the communication between R and Java, the rJava package is used. As a base classifier, we implemented random forests for GPU usage.

A random forest [17] is an ensemble learning method for classification and regression. A random forest trains several decision trees, each on a subset of the original dataset. Major advantages of random forests are the control of overfitting and the improved prediction accuracy, which is achieved by combining the prediction results of the individual trees into a final decision. Parallelization is achieved in two ways: first, each decision tree within a random forest is built in a concurrent task in the training phase; second, in the classification phase, each instance is classified in a concurrent task. In contrast to the Mulan library, eccCL is able to use OpenCL. This implies that, unlike in Mulan, the subsets for each node are not created dynamically during training, as this is not possible in OpenCL. Furthermore, each tree has exactly the same number of nodes and the same depth; thus, the classifiers can be stored in a single array and the position of each node can be calculated. Additionally, all instances are stored in a single buffer. Instead of generating random subsets dynamically in the training phase, the index positions of the instances are stored in a separate array and reordered in a randomized manner for each node, because all arrays in OpenCL need to have a fixed size at compile time.
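To make the fixed-size layout more concrete, the following Java sketch shows one possible way to store equally shaped trees in flat arrays and to draw random subsets by reordering an index array in place. It is an assumption-laden illustration, not the eccCL implementation: the heap-style numbering (children of node i at 2i+1 and 2i+2), the three parallel arrays, and all names (`FlatForestSketch`, `nodeIndex`, `classify`, `sampleInPlace`) are hypothetical choices.

```java
import java.util.Arrays;
import java.util.Random;

public class FlatForestSketch {

    /** Number of nodes in a complete binary tree of the given depth (root is depth 0). */
    public static int nodesPerTree(int depth) {
        return (1 << (depth + 1)) - 1;
    }

    /** Position of a node in the shared arrays: tree offset plus heap-style local index. */
    public static int nodeIndex(int tree, int localNode, int depth) {
        return tree * nodesPerTree(depth) + localNode;
    }

    /** Walks one tree from the root to a leaf and returns the vote stored at that leaf. */
    public static int classify(int[] feature, double[] threshold, int[] leafVote,
                               int tree, int depth, double[] x) {
        int local = 0;
        for (int level = 0; level < depth; level++) {
            int node = nodeIndex(tree, local, depth);
            // Left child at 2i+1, right child at 2i+2; no pointers are needed.
            local = x[feature[node]] <= threshold[node] ? 2 * local + 1 : 2 * local + 2;
        }
        return leafVote[nodeIndex(tree, local, depth)];
    }

    /**
     * Moves a random sample of size k to the front of indices[from, to)
     * using a partial Fisher-Yates shuffle; no new array is allocated.
     */
    public static void sampleInPlace(int[] indices, int from, int to, int k, Random rng) {
        for (int i = from; i < from + k; i++) {
            int j = i + rng.nextInt(to - i);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int depth = 1; // two trees with three nodes each, stored back to back
        int[] feature = {0, 0, 0, 0, 0, 0};
        double[] threshold = {0.5, 0, 0, 0.2, 0, 0};
        int[] leafVote = {0, 0, 1, 0, 0, 1};
        double[] x = {0.3};
        System.out.println("tree 0 votes " + classify(feature, threshold, leafVote, 0, depth, x));
        System.out.println("tree 1 votes " + classify(feature, threshold, leafVote, 1, depth, x));

        int[] indices = {0, 1, 2, 3, 4, 5, 6, 7};
        sampleInPlace(indices, 0, indices.length, 3, new Random());
        System.out.println("first three entries form the sample: " + Arrays.toString(indices));
    }
}
```

Because every tree occupies the same number of slots, a work-item can locate any node from the tree number and the local node number alone, and the index array keeps its size no matter how the instances are split.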

Results and discussion

We developed a GPU framework for modeling CC and ECC. The software was evaluated on an Intel Xeon E5-1620 with 4 cores and an NVIDIA Tesla K20c with 2496 streaming processors. The data sets for the evaluation of our implementation were taken from different research areas. The NNRTI and PI data sets are from the realm of drug resistance prediction [18] in HIV. The data sets emotions [19], scene [20], and yeast [21] were obtained from the Mulan project, which provides an implementation of CC and ECC, albeit in a non-parallelized manner.

The software can be used via Java on the command line with parameter settings, or in R by installing the R package eccCL. The software can be downloaded from the authors' homepage. After downloading, the R package can be installed from the R command line with: install.packages('/path/to/package/eccCL.tar.gz', repos=NULL). In the following, we demonstrate how to build an ECC with an ensemble size of 20 chains and a forest size of 64 within R:

library(eccCL)

# Load file (.arff and .xml format must be available)
data <- eccCLloadWekaFile
# Build classifier
ecc <- eccCLbuildFromObject(data, ensembleSize=20, forestSize=64)
# Classify data
out <- eccCLclassifyObject(ecc, data)
# Get classification results
res <- eccCLgetResults(out)
# Save and load classifier
'/home/temp/classifier.stored')
ecc <- eccCLload('/home/temp/classifier.

The data must be provided in .arff and .xml format according to the Mulan library, and the files must be available at the given path. When building the classifier, the ensemble size and forest size can be set individually. The classifier can be saved and loaded again for later classification tasks. Equivalently, the following line shows the usage with Java as a shell command using the jar file:

java -jar EccCL.jar -inpData /path/to/dataset/NNRTI -eccES 20 -eccFS 64 -evalAllLabels

With this command, the classifier is trained and a classification is performed. To classify with a previously trained and saved classifier, without a training process, use the command:

java -jar EccCL.jar -inpData /path/to/dataset/NNRTI -classOnly /path/to/trainedClassifier
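When the jar file is driven from another Java program, the two shell commands above can be assembled and launched with the standard `ProcessBuilder` API. The sketch below is only an illustration under stated assumptions: the class `EccClCommandSketch` and its methods are hypothetical helpers, the paths are placeholders, and only the flags that appear in the commands above (`-inpData`, `-eccES`, `-eccFS`, `-evalAllLabels`, `-classOnly`) are used.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class EccClCommandSketch {

    /** Command that trains an ECC and evaluates all labels, mirroring the first shell line. */
    public static List<String> trainAndEvaluate(String jar, String dataset,
                                                int ensembleSize, int forestSize) {
        return Arrays.asList("java", "-jar", jar,
                "-inpData", dataset,
                "-eccES", String.valueOf(ensembleSize),
                "-eccFS", String.valueOf(forestSize),
                "-evalAllLabels");
    }

    /** Command that classifies with a stored classifier, mirroring the second shell line. */
    public static List<String> classifyOnly(String jar, String dataset, String trainedClassifier) {
        return Arrays.asList("java", "-jar", jar,
                "-inpData", dataset,
                "-classOnly", trainedClassifier);
    }

    /** Starts the command, forwards its output to this console, and returns the exit code. */
    public static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Placeholder paths; replace them with real locations before calling run(...).
        List<String> command = trainAndEvaluate("EccCL.jar", "/path/to/dataset/NNRTI", 20, 64);
        System.out.println(String.join(" ", command));
    }
}
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.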

Table 1 provides a speed-up comparison between our GPU implementation and the Mulan framework, using an ensemble of 20 classifier chains and 64 trees per random forest. Additionally, Table 2 shows the number of instances classified per second with eccCL compared to the Mulan framework with respect to an increasing number of instances. Overall, our GPU implementation shows a speed-up of an order of magnitude in computation times. The prediction accuracy shows no difference between the GPU implementation and the models of the Mulan framework, although it depends slightly on the parameter settings.

Table 1 Comparison between our GPU implementation and the non-parallelized Mulan framework for the classification of instances based on different data sets with different counts of instances and labels
Table 2 Instances classified per second with increasing number of bootstrapped instances exemplarily shown for the PI dataset

Our software can be used on standard desktop PCs with OpenCL-ready graphics cards; in general, currently available GPUs from almost all manufacturers support OpenCL. eccCL requires Java (version 8.0) and OpenCL (version 1.2). Furthermore, R (version 3.0) and the rJava package (version 3.2) have to be installed in advance to use eccCL through the R interface. Where OpenCL is available on the platform, the OpenCL implementation can be used; if OpenCL is not installed, a parallelized Java implementation can be executed instead, albeit on the CPU. eccCL runs on Linux and Mac OS. Overall, the software is easy to handle, and no special hardware, such as a cluster or high-end server, is needed. Currently, the eccCL package provides the random forest classifier in a parallelized manner. Random forests can be used as the base classifier in both classifier chains and ensemble classifier chains. In the future, we will work on further classifier implementations and make them available within our package.
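As an illustration of what a CPU-side parallel classification step could look like, the following sketch classifies each instance in its own task using Java parallel streams. It is not taken from eccCL; the class `ParallelClassifySketch`, the method `classifyAll`, and the threshold classifier in `main` are hypothetical, and the actual fallback implementation may be organized differently.

```java
import java.util.Arrays;
import java.util.function.ToIntFunction;
import java.util.stream.IntStream;

public class ParallelClassifySketch {

    /**
     * Classifies every instance independently. Each index is handled by one task,
     * and each task writes only to its own slot of the result array.
     */
    public static int[] classifyAll(double[][] instances, ToIntFunction<double[]> classifier) {
        int[] predictions = new int[instances.length];
        IntStream.range(0, instances.length)
                .parallel()
                .forEach(i -> predictions[i] = classifier.applyAsInt(instances[i]));
        return predictions;
    }

    public static void main(String[] args) {
        // Toy stand-in for a trained model: a single threshold on the first feature.
        ToIntFunction<double[]> classifier = x -> x[0] > 0.5 ? 1 : 0;
        double[][] instances = {{0.9}, {0.1}, {0.7}};
        System.out.println(Arrays.toString(classifyAll(instances, classifier)));
    }
}
```

The same one-task-per-instance structure maps naturally onto one OpenCL work-item per instance when a GPU is available.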


We provide an R package and a Java version of a parallelized and optimized GPU implementation of Classifier Chains and Ensemble Classifier Chains. The software is able to classify more than 25,000 instances per second and can thus efficiently speed up the classification process in high-throughput experiments.

Availability and requirements

Project name: eccCL
Project home page:
Operating system(s): Linux, Mac OS
Programming language: Java (≥ 8.0), R (≥ 3.0), (optional) OpenCL (≥ 1.2)
License: GPL (≥ 2)
Any restrictions to use by non-academics: none



Abbreviations

CC: Classifier chains
CLI: Command-line interface application
CPU: Central processing unit
ECC: Ensemble classifier chains
GPU: Graphics processing unit
HIV: Human immunodeficiency virus
MLC: Multi-label classification
NNRTI: Non-nucleoside reverse transcriptase inhibitor
PI: Protease inhibitor

References

  1. Yu G, Domeniconi C, Rangwala H, Zhang G, Yu Z. Transductive multi-label ensemble classification for protein function prediction. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’12. New York: ACM: 2012. p. 1077–85.

  2. Zhang BB-F, Xu X, Su J. An Ensemble Method for Multi-class and Multi-label Text Categorization. In: Proceedings of the International Conference on Intelligent System and Knowledge Engineering (ISKE). Chengdu: Atlantis Press: 2007. p. 1345–50.

  3. Cerri R, Barros RC, PLF de Carvalho AC, Jin Y. Reduction strategies for hierarchical multi-label classification in protein function prediction. BMC Bioinforma. 2016; 17:373.

  4. Xu YY, Yang F, Shen HB. Incorporating organelle correlations into semi-supervised learning for protein subcellular localization prediction. Bioinformatics. 2016; 32(14):2184–92.

  5. Lin W, Xu D. Imbalanced Multi-label Learning for Identifying Antimicrobial Peptides and Their Functional Types. Bioinformatics. 2016; 32(24):3745–52.

  6. Heider D, Senge R, Cheng W, Hüllermeier E. Multilabel classification for exploiting cross-resistance information in HIV-1 drug resistance prediction. Bioinformatics. 2013; 29(16):1946–52.

  7. Senge R, del Coz JJ, Hüllermeier E. On the Problem of Error Propagation in Classifier Chains for Multi-label Classification. In: Spiliopoulou M, Schmidt-Thieme L, Janning R, editors. Data Analysis, Machine Learning and Knowledge Discovery. Cham: Springer International Publishing: 2014. p. 163–70.

  8. Read J, Pfahringer B, Holmes G, Frank E. Classifier chains for multi-label classification. Mach Learn. 2011; 85(3):333–59.

  9. Tsoumakas G, Vlahavas I. Random k-labelsets: An Ensemble Method for Multilabel Classification. In: European Conference on Machine Learning. Heidelberg: Springer Berlin: 2007. p. 406–17.

  10. Read J, Pfahringer B, Holmes G. Multi-label classification using ensembles of pruned sets. In: IEEE International Conference on Data Mining (ICDM). Pisa: IEEE Computer Society: 2008. p. 995–1000.

  11. Pyka M, Hahn T, Heider D, Krug A, Sommer J, Kircher T, Jansen A. Baseline activity predicts working memory load of preceding task condition. Hum Brain Mapp. 2013; 34(11):3010–22.

  12. Hahn T, Kircher T, Straube B, Wittchen HU, Konrad C, Ströhle A, Wittmann A, Pfleiderer B, Reif A, Arolt V, Lueken U. Predicting Treatment Response to Cognitive Behavioral Therapy in Panic Disorder With Agoraphobia by Integrating Local Neural Information. JAMA Psychiatry. 2015; 72(1):68–74.

  13. Manconi A, Orro A, Manca E, Armano G, Milanesi L. A tool for mapping Single Nucleotide Polymorphisms using Graphics Processing Units. BMC Bioinforma. 2014; 15(1):10.

  14. Larsen SJ, Alkærsig FG, Ditzel HJ, Jurisica I, Alcaraz N, Baumbach J. A Simulated Annealing Algorithm for Maximum Common Edge Subgraph Detection in Biological Networks. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 16). New York: ACM: 2016. p. 341–8.

  15. Olejnik M, Steuwer M, Gorlatch S, Heider D. gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing. Bioinformatics. 2014; 30(22):3272–3.

  16. Heider D, Dybowski JN, Wilms C, Hoffmann D. A simple structure-based model for the prediction of HIV-1 co-receptor tropism. BioData Min. 2014; 7:14.

  17. Breiman L. Random forests. Mach Learn. 2001; 45(1):5–32.

  18. Riemenschneider M, Senge R, Neumann U, Hüllermeier E, Heider D. Exploiting HIV-1 protease and reverse transcriptase cross-resistance information for improved drug resistance prediction by means of multi-label classification. BioData Min. 2016; 9:10.

  19. Trohidis K, Kalliris G. Multi-Label Classification of Music Into Emotion. J Audio Speech Music Process. 2011; 2011:4.

  20. Boutell MR, Luo J, Shen X, Brown CM. Learning multi-label scene classification. Pattern Recogn. 2004; 37(9):1757–71.

  21. Elisseeff A, Weston J. A kernel method for multi-labelled classification. Adv Neural Inf Process Syst. 2001; 14:681–7.



Not applicable.


This work was supported by the Straubing Center of Science, the CiM Cluster of Excellence at the University of Münster, the German Research Foundation (DFG) and the Technische Universität München within the funding programme Open Access Publishing. None of the funding bodies have played any part in the design of the study, in the collection, analysis, and interpretation of the data, or in the writing of the manuscript.

Author information


Conceived and designed the experiments: SG, DH. Performed the experiments: MR, AH. Interpreted results: MR, AH, AR, SG, DH. Wrote the paper: MR, AR, DH. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Dominik Heider.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Riemenschneider, M., Herbst, A., Rasch, A. et al. eccCL: parallelized GPU implementation of Ensemble Classifier Chains. BMC Bioinformatics 18, 371 (2017).
