ImmunoNodes – graphical development of complex immunoinformatics workflows
BMC Bioinformatics volume 18, Article number: 242 (2017)
Immunoinformatics has become a crucial part of biomedical research. Yet many immunoinformatics tools have command line interfaces only and can be difficult to install. Web-based immunoinformatics tools, on the other hand, are difficult to integrate with other tools, an integration that the complex analysis and prediction pipelines of advanced applications typically require.
We present ImmunoNodes, an immunoinformatics toolbox that is fully integrated into the visual workflow environment KNIME. By dragging and dropping tools and connecting them to indicate the data flow through the pipeline, it is possible to construct very complex workflows without the need for coding.
ImmunoNodes allows users to build complex workflows through an intuitive, easy-to-use interface with a few clicks on any desktop computer.
Immunoinformatics methods have become a vital part of biomedical research. Their applications range from basic immunological to translational research, especially in the field of cancer research [1,2,3]. These applications often involve several methods, from pre- and post-processing routines to complex statistical analysis procedures, and require considerable development time. Additionally, the lack of standardized interfaces and data formats makes it difficult to use different tools in the same pipeline. To overcome these problems, several groups have developed web-based workbenches that allow interacting with several different approaches via a unified interface [4, 5]. However, factors such as data volume, speed, robustness, or legal restrictions (e.g., data privacy or restrictions on data sharing) often prevent the use of web-based solutions.
Because of the variety and number of tasks that a typical immunoinformatics analysis involves, we have developed ImmunoNodes, a set of components, each carrying out one specific task in immunoinformatics (e.g., human leukocyte antigen (HLA) ligand binding prediction or statistical analyses). By chaining several of these tools together, one can form a complete data analysis workflow. Workflows not only enable complex automation tasks, but also increase the reproducibility of scientific studies by documenting the complete data analysis in a standardized form.
In this work, we present an immunoinformatics toolbox whose components can be used without transferring data to a central server across the Internet (thus circumventing data privacy restrictions). It enables the user to build complex workflows and offers unified interfaces and data formats. In order to facilitate collaboration between its several components, we have fully integrated ImmunoNodes into the Konstanz Information Miner Analytics Platform (KNIME) [6, 7], an application for visual workflow development. We thus benefit from KNIME’s rich functionality covering data mining, statistics, visualization, chemo- and bioinformatics [8,9,10], as well as computational proteomics [11,12,13]. ImmunoNodes provides a wide range of well-known tools for HLA binding prediction, HLA class I antigen processing prediction, HLA genotyping, as well as epitope-based vaccine design including epitope-selection and string-of-beads assembly.
By integrating ImmunoNodes into a workflow development environment as versatile as KNIME, we hope to ease its use and thus to spread the application of advanced immunoinformatics tools to a wide range of users.
ImmunoNodes is available for all major platforms (Windows, OSX, Linux) and released under a 3-clause BSD license. It can be directly installed from the KNIME-Community repository and its source code can be found at GitHub (https://github.com/FRED-2/ImmunoNodes). The accompanying Docker image can be found at Docker Hub (https://hub.docker.com/r/aperim/immunonodes).
KNIME is a free, stand-alone, open-source workflow development framework for personal computers. Out of the box, it includes hundreds of sample workflows and more than 1,000 different tools (nodes), including a wide range of solutions for statistical analysis, data acquisition, and visualization. KNIME runs on all major operating systems and can be easily extended by writing plug-ins and extensions. It is thus a popular and widespread platform for data analysis.
The ImmunoNodes framework depends on command line tools that could, with considerable effort, be imported individually as KNIME nodes. Instead, we rely on the Generic KNIME Nodes (GKN) extension (https://github.com/genericworkflownodes), which was developed to help users add arbitrary command line tools to KNIME. Rather than requiring code for the interaction between external command line tools and KNIME, GKN lets pipeline designers concentrate on describing the tools to be added. This description is contained in a Common Tool Descriptor (CTD) file. A CTD file is an XML document defining input data, output data, and all parameters required by a tool. Input and output data types are identified by their MIME content types (e.g., text/xml, application/zip), and parameters can be as simple as a single integer restricted to a range or as complex as a list of nested values. CTDs also contain a section that maps named parameters to command line parameters and thus enables the execution of arbitrary command line tools. We use CTD as an abstraction layer for the description of all tools in ImmunoNodes; GKN then automatically generates the KNIME plugins from these abstract representations.
Several of the software components used in ImmunoNodes are difficult to install or are available exclusively for Linux. To address these issues, we have extended GKN to natively execute command line tools provided within a Docker container. Docker is a software project that enables lightweight virtualization of software applications, which in turn allows easy deployment of fully configured software suites to the end user. Docker also permits the execution of Linux-only third-party immunoinformatics tools on Windows and Mac OS X and thus gives ImmunoNodes full portability. GKN automatically generates the required Docker calls and handles the interaction between the host system and the virtualized Docker container.
The majority of nodes in ImmunoNodes are command line tools written with FRED 2, an immunoinformatics Python module that provides standardized interfaces to immunoinformatics software.
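The path from a tool description to a containerized call can be illustrated with a minimal sketch. It is not GKN's implementation and does not use the real CTD schema: the XML element and attribute names, the executable name `epitopeprediction.py`, its flags, and the `/data` mount point are all hypothetical and chosen for illustration only. The sketch shows the two ideas described above: named parameters are validated and mapped to command line parameters, and the resulting command is wrapped in a `docker run` call that mounts a host directory into the container.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified descriptor. The real CTD schema is richer and
# uses different element names; this only mimics its general idea.
DESCRIPTOR = """
<tool name="EpitopePrediction" executable="epitopeprediction.py">
  <param name="input" flag="--input" type="input-file" mime="text/plain"/>
  <param name="alleles" flag="--alleles" type="input-file" mime="text/plain"/>
  <param name="length" flag="--length" type="int" min="8" max="11"/>
  <param name="output" flag="--output" type="output-file" mime="text/plain"/>
</tool>
"""


def build_command(descriptor_xml, values):
    """Map named parameter values to an argument vector."""
    tool = ET.fromstring(descriptor_xml.strip())
    argv = [tool.get("executable")]
    for param in tool.findall("param"):
        name = param.get("name")
        if name not in values:
            continue  # parameter not set by the user
        value = values[name]
        if param.get("type") == "int":
            # Enforce the range restriction declared in the descriptor.
            value = int(value)
            low, high = int(param.get("min")), int(param.get("max"))
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
        argv += [param.get("flag"), str(value)]
    return argv


def wrap_in_docker(argv, image, host_dir, mount_point="/data"):
    """Prefix a command with a `docker run` call that mounts host_dir."""
    return ["docker", "run", "--rm",
            "-v", f"{host_dir}:{mount_point}",  # share files with the container
            "-w", mount_point,                  # run inside the mounted folder
            image, *argv]


command = build_command(DESCRIPTOR, {
    "input": "proteins.fasta", "alleles": "alleles.txt",
    "length": 9, "output": "predictions.tsv",
})
docker_call = wrap_in_docker(command, "aperim/immunonodes", "/home/user/run1")
```

The argument list could be handed to `subprocess.run` on a machine with Docker installed; relative file names resolve inside the mounted directory because the working directory is set to the mount point.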
ImmunoNodes offers twelve different nodes covering epitope, proteasomal cleavage, and transporter associated with antigen processing (TAP) prediction, distance-to-self calculations of peptides, as well as HLA genotyping (Table 1). It also offers nodes for vaccine design including epitope selection and assembly. Each node wraps a variety of state-of-the-art tools, many of which were covered in a recent review on immunoinformatics.
Epitope prediction node
Consumes two files, namely, a text file containing HLA alleles, one per line, in new nomenclature (see http://hla.alleles.org), and a text file either containing protein sequences in FASTA format or short peptide sequences, one per line. Besides specifying the desired epitope length, the user can choose an epitope prediction method from a variety of options (Table 1 - Epitope Prediction). The node returns a tab-separated file containing the predicted score for each peptide and allele.
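The data flow of such a node can be sketched in a few lines. This is an illustration only, not the node's code: the column names of the output table are assumed, and `toy_score` is a placeholder that stands in for a real prediction method from Table 1. The sketch reads protein sequences in FASTA format, cuts them into all peptides of the requested length, and writes one row per peptide and allele.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping identifier to sequence."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]  # identifier up to first whitespace
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def generate_peptides(records, length):
    """Map every peptide of the given length to the proteins containing it."""
    peptides = {}
    for name, sequence in records.items():
        for start in range(len(sequence) - length + 1):
            peptides.setdefault(sequence[start:start + length], set()).add(name)
    return peptides


def prediction_table(peptides, alleles, score):
    """Build a tab-separated table with one row per peptide and allele."""
    lines = ["peptide\tallele\tscore"]  # assumed column names
    for peptide in sorted(peptides):
        for allele in alleles:
            lines.append(f"{peptide}\t{allele}\t{score(peptide, allele):.3f}")
    return "\n".join(lines)


def toy_score(peptide, allele):
    """Placeholder score, NOT a binding predictor."""
    return sum(residue in "AILMFVWY" for residue in peptide) / len(peptide)


fasta = ">protein1 example\nMKTAYIAKQR\nQISFVK\n"
alleles = ["HLA-A*02:01", "HLA-B*07:02"]
table = prediction_table(generate_peptides(read_fasta(fasta), 9), alleles, toy_score)
```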
Neoepitope prediction node
Consumes a VCF file containing the identified somatic genomic variants, together with a text file containing HLA alleles, and generates all possible neo-epitopes based on the annotated variants contained in the VCF file by extracting the annotated transcript sequences from Ensembl and integrating the variants. Optionally, it consumes a text file containing gene IDs of the reference system used for annotation, which are used as a filter during neoepitope generation. The user can specify whether frameshift mutations, deletions, and insertions should be considered in addition to single nucleotide variations (default). NeoEpitopePrediction currently supports ANNOVAR and Variant Effect Predictor annotations for GRCh37 and GRCh38 only.
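The core idea of neoepitope generation can be sketched at the protein level. This is a deliberate simplification and not the node's implementation: it assumes the variant has already been translated into a single amino acid substitution, whereas the node itself starts from annotated genomic variants and transcript sequences. The function name and its arguments are hypothetical. Only peptides that span the mutated position are returned, since all other peptides are identical to the unmutated protein.

```python
def neoepitopes_for_substitution(protein, position, ref, alt, length):
    """Return all peptides of the given length that contain the substitution.

    position is 1-based; ref and alt are single amino acid letters.
    """
    index = position - 1
    if protein[index] != ref:
        raise ValueError(f"expected {ref} at position {position}, "
                         f"found {protein[index]}")
    if ref == alt:
        return []  # synonymous change, no new peptides
    mutated = protein[:index] + alt + protein[index + 1:]
    # A window starting at s covers the mutation if s <= index < s + length.
    first = max(0, index - length + 1)
    last = min(index, len(mutated) - length)
    return [mutated[s:s + length] for s in range(first, last + 1)]


candidates = neoepitopes_for_substitution("MKTAYIAKQRQISFVK", 5, "Y", "H", 9)
```

For a substitution far from both protein ends, a peptide length of nine yields nine overlapping candidates; near the termini the number is smaller, as in the example above.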
Cleavage prediction node
Takes a FASTA file and predicts the cleavage probability for each site (Table 1 – Cleavage Prediction). In addition, the user can specify a peptide length, which in turn will alter the output to a tab-separated text file containing peptide sequences of the specified length with their C-terminal cleavage score.
TAP prediction node
Consumes either a FASTA file or a file containing peptide sequences. Besides the TAP prediction model to use (Table 1 - TAP Prediction), the user can specify the required peptide length (if the input was a FASTA file). Its output is again a tab-separated file containing the peptide sequences and the predicted TAP score.
HLA typing node
Takes paired-end or single-end whole-exome, whole-genome, or RNA-Seq FASTQ files and infers the most likely HLA class I and II genotype, depending on the method used (see Table 1 - HLA Typing). The resulting file contains the most likely genotype with one HLA allele per line.
Epitope selection node
Selects an optimal set of epitopes from a set of candidate epitopes that maximizes the overall predicted immunogenicity. The tool implements OptiTope, an integer linear programming-based epitope selection framework proposed by Toussaint et al. As input it takes a file containing the results of (Neo)EpitopePrediction and a tab-separated HLA allele file with assigned population frequencies, similar to the type of files that AlleleFrequency can generate. Optionally, EpitopeSelection accepts a tab-separated file containing the epitope sequences of the EpitopePrediction result with assigned conservation scores. The user can specify the number of epitopes to select, the percentage of HLA alleles and antigens that have to be covered by the selected epitopes, and an HLA binding threshold that specifies at what point a peptide is considered to bind to a specific HLA allele. If an epitope conservation file is provided, the user can define a minimum conservation to filter the epitopes with.
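The selection problem can be made concrete with a small sketch. It is not the node's integer linear program: for readability it enumerates all subsets, which is only feasible for a handful of candidates. The objective used here, the allele-frequency-weighted sum of per-allele scores, is an assumption about the general form of such a selection, and scores are assumed to be oriented so that higher values mean stronger predicted binding. Peptide names and numbers are invented.

```python
from itertools import combinations


def select_epitopes(scores, allele_freq, k, threshold, min_allele_coverage=0.0):
    """Pick k epitopes maximizing frequency-weighted score by brute force.

    scores: dict mapping (epitope, allele) to a score, higher is better.
    allele_freq: dict mapping allele to its population frequency.
    threshold: score at which an epitope counts as binding an allele.
    min_allele_coverage: fraction of alleles that must be bound by the set.
    """
    epitopes = sorted({epitope for epitope, _ in scores})
    alleles = sorted(allele_freq)

    def weighted(epitope):
        return sum(allele_freq[a] * scores.get((epitope, a), 0.0) for a in alleles)

    best, best_value = None, float("-inf")
    for subset in combinations(epitopes, k):
        covered = {a for a in alleles for e in subset
                   if scores.get((e, a), 0.0) >= threshold}
        if len(covered) < min_allele_coverage * len(alleles):
            continue  # coverage constraint violated
        value = sum(weighted(e) for e in subset)
        if value > best_value:
            best, best_value = subset, value
    return best, best_value


# Invented toy data for three candidate peptides and two alleles.
toy_scores = {
    ("PEP1", "HLA-A*02:01"): 0.9, ("PEP1", "HLA-B*07:02"): 0.0,
    ("PEP2", "HLA-A*02:01"): 0.8, ("PEP2", "HLA-B*07:02"): 0.0,
    ("PEP3", "HLA-A*02:01"): 0.0, ("PEP3", "HLA-B*07:02"): 0.6,
}
toy_freq = {"HLA-A*02:01": 0.5, "HLA-B*07:02": 0.2}
unconstrained = select_epitopes(toy_scores, toy_freq, k=2, threshold=0.5)
full_coverage = select_epitopes(toy_scores, toy_freq, k=2, threshold=0.5,
                                min_allele_coverage=1.0)
```

Without a coverage requirement the two peptides with the highest weighted scores are chosen; requiring that every allele is covered forces the weaker PEP3 into the set. This trade-off is what the coverage parameters of the node control.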
Epitope assembly node
Assembles a set of epitopes into an optimal string-of-beads polypeptide vaccine construct. It consumes a peptide list and generates a traveling salesman problem (TSP) instance as described previously. Each node of the underlying fully connected graph represents a peptide, and each edge's weight expresses the cleavage probability of the connected epitopes predicted by the user-specified cleavage site prediction model. Solving the TSP instance yields a string-of-beads construct that has the highest probability to be fully recovered. The user can choose to solve the TSP instance either optimally via integer linear programming using the CBC solver (https://projects.coin-or.org/Cbc), or approximately using the Lin-Kernighan heuristic. Optionally, the user can specify a weight parameter (which defaults to 0) that activates and weights an additional term of the objective function. The additional term represents the non-junctional cleavage likelihood, which, for a weight greater than zero, will be minimized, whilst the junction cleavage likelihood will be maximized.
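The ordering problem can be illustrated with a brute-force sketch. It does not use the CBC solver or the Lin-Kernighan heuristic and is only practical for very few peptides; it also treats the construct as an open path and scores an ordering by the sum of its junction scores, both of which are simplifying assumptions. `toy_junction_score` is an arbitrary placeholder, not a cleavage site prediction model.

```python
from itertools import permutations


def assemble(peptides, junction_score):
    """Return the concatenation order with the highest total junction score.

    junction_score(left, right) should return the predicted cleavage score
    at the junction between two adjacent peptides.
    """
    best_order, best_total = None, float("-inf")
    for order in permutations(peptides):
        total = sum(junction_score(left, right)
                    for left, right in zip(order, order[1:]))
        if total > best_total:
            best_order, best_total = order, total
    return "".join(best_order), best_total


def toy_junction_score(left, right):
    """Placeholder: arbitrary score based on the last residue of `left`."""
    return 1.0 if left[-1] in "LFYMIV" else 0.2


construct, total = assemble(["AAAL", "KKKF", "GGGS"], toy_junction_score)
```

Because the number of orderings grows factorially with the number of peptides, exhaustive search stops being practical quickly, which is why the node offers an exact integer linear programming solver and a heuristic instead.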
Spacer design node
Generates a string-of-beads design similar to the EpitopeAssembly node but also constructs optimal spacer sequences maximizing the cleavage probability of the desired epitopes. The tool consumes a peptide list and generates a TSP instance. Additionally, it calculates short spacer sequences connecting two epitopes to increase the cleavage likelihood of the epitopes while simultaneously reducing the formation of neoepitopes. The user has to specify an epitope prediction model in addition to the required cleavage site model. The output, like in EpitopeAssembly, is a FASTA file containing the designed string-of-beads vaccine.
Distance-to-self nodes
Can be used to calculate the distance of a given \( l \)-mer peptide to the whole human proteome or a user-defined set of proteins. To this end, distance-to-self uses a memory-efficient trie-based data structure to hold the reference proteome or any set of protein sequences and to query it with a target peptide as previously described. The distance calculation is based on a distance measure derived from a transformed BLOSUM substitution matrix and lies between 0 (most similar) and 1 (most dissimilar). ImmunoNodes provides two distance-to-self nodes: Distance2SelfGeneration and Distance2SelfCalculation. Distance2SelfGeneration can be used to generate custom reference tries for a given protein FASTA and the desired length of peptides in the trie, while Distance2SelfCalculation calculates the distances of the \( k \) closest reference peptides of a custom-built or pre-calculated reference trie for a list of peptides given in a tab-separated file. There are four pre-calculated reference tries generated from all 8-, 9-, 10-, and 11-mers of the human reference proteome (UniProt, TrEMBL, accessed 04/07/2016).
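How a trie supports such queries can be shown with a minimal sketch. It is not the data structure used by the nodes and it does not reproduce their distance measure: the per-residue cost used here is a plain mismatch indicator (so the result is a normalized Hamming distance between 0 and 1), standing in for the BLOSUM-derived measure, and the function names are hypothetical. The sketch assumes that the query peptide has the same length as the peptides stored in the trie.

```python
import heapq


def build_trie(proteins, length):
    """Store every peptide of the given length as a path in nested dicts."""
    root = {}
    for sequence in proteins:
        for start in range(len(sequence) - length + 1):
            node = root
            for residue in sequence[start:start + length]:
                node = node.setdefault(residue, {})
    return root


def mismatch(a, b):
    """Stand-in residue cost; a substitution-matrix-based cost could be used."""
    return 0.0 if a == b else 1.0


def closest(trie, peptide, k=1, cost=mismatch):
    """Return the k closest reference peptides as sorted (distance, peptide)."""
    n = len(peptide)
    heap = []  # max-heap on distance, implemented with negated distances

    def walk(node, depth, accumulated, prefix):
        # Costs are non-negative, so a partial distance that already reaches
        # the worst kept distance cannot lead to a better hit: prune.
        if len(heap) == k and accumulated / n >= -heap[0][0]:
            return
        if depth == n:
            heapq.heappush(heap, (-accumulated / n, prefix))
            if len(heap) > k:
                heapq.heappop(heap)  # drop the currently worst hit
            return
        for residue, child in node.items():
            walk(child, depth + 1,
                 accumulated + cost(peptide[depth], residue), prefix + residue)

    walk(trie, 0, 0.0, "")
    return sorted((-negated, hit) for negated, hit in heap)


reference = build_trie(["MKTAYIAKQ"], 3)
exact = closest(reference, "KTA", k=1)
nearest_two = closest(reference, "KTY", k=2)
```

Shared prefixes are stored only once, which keeps memory use low for large reference sets, and the pruning step avoids descending into subtrees that cannot contain a closer peptide.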
Allele frequency node
Is a very simple node that takes a list of HLA alleles and assigns the probability that a given HLA allele occurs in the user-specified geographic region or population extracted from dbMHC. The output is a tab-separated file, each row containing an HLA allele and its probability of occurrence in the given region or population.
Epitope conservation node
Consumes a multiple sequence alignment, calculates the consensus sequence and generates peptides of a user-specified length. In addition to that, the multiple sequence alignment is used to calculate peptide conservation, which is defined as the product of column-wise conservation of the MSA. In the case of multiple epitope origins the maximum epitope conservation is reported. The output is a tab-separated file containing the peptide sequences and their conservation.
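The conservation definition above can be written out as a short sketch. Two points are assumptions rather than statements about the node: column-wise conservation is taken to be the fraction of sequences that carry the consensus residue in that column, and "multiple origins" is interpreted as the same peptide occurring at several positions of the consensus. Peptides containing gap characters are skipped.

```python
import math
from collections import Counter


def consensus_and_conservation(msa):
    """Return the consensus sequence and the conservation of each column."""
    consensus, conservation = [], []
    for column in zip(*msa):  # iterate over alignment columns
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        conservation.append(count / len(column))
    return "".join(consensus), conservation


def peptide_conservation(msa, length):
    """Map each consensus peptide to the product of its column conservations."""
    consensus, conservation = consensus_and_conservation(msa)
    result = {}
    for start in range(len(consensus) - length + 1):
        peptide = consensus[start:start + length]
        if "-" in peptide:
            continue  # skip windows that contain alignment gaps
        value = math.prod(conservation[start:start + length])
        # Keep the maximum if the same peptide occurs more than once.
        result[peptide] = max(value, result.get(peptide, 0.0))
    return result


alignment = ["MKTAY", "MKTAY", "MRTAY", "MKSAY"]
conserved = peptide_conservation(alignment, 3)
```

In this toy alignment, the second and third columns are each conserved in three of four sequences, so a peptide spanning both has a conservation of 0.75 × 0.75 = 0.5625.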
Example workflow 1: HLA ligandomics analysis pipeline
Recently, high throughput methodologies based on liquid chromatography and mass spectrometry (MS) have been successfully used to identify therapeutic targets for cancer immunotherapies [27,28,29]. Here, we present a peptide identification workflow for ligandomics analysis using OpenMS and ImmunoNodes (Fig. 1, http://www.myexperiment.org/workflows/4947). At the same time, this workflow will exemplify the synergistic effects of combining native KNIME nodes, other community extensions, and ImmunoNodes.
First, ligandomics data of JY cell lines are downloaded from PRIDE via an FTP download node. Then, peptide identification at 5% FDR is applied using OpenMS nodes. The resulting peptides are then annotated with their predicted binding affinity using ImmunoNodes’ EpitopePrediction with NetMHC, and simple statistics of the predicted binding affinities are calculated and visualized using native KNIME nodes.
Example workflow 2: population-based vaccine design against Zika virus
To demonstrate the usage of ImmunoNodes for vaccine design, we extracted all 221 partially and 30 fully sequenced genomes of Zika virus from the Virus Pathogen Resource database (accessed 02/22/2016). Epitope prediction was performed with PickPocket using HLA alleles with a minimal prevalence of 1% in the South American population and nine-mer peptides generated from the extracted protein sequences. The candidate epitopes were filtered based on a binding threshold of 500 nM, and EpitopeSelection was allowed to select up to ten epitopes that guaranteed the maximal obtainable antigen and HLA allele coverage (Fig. 2, http://www.myexperiment.org/workflows/4948).
The ten selected epitopes (Table 2) covered more than 95% (20 of 21) of the HLA alleles prevalent in the South American population, as well as 92% (287 of 312) of the extracted Zika antigens. The selected epitopes covered 100%, 83%, and 100% of the HLA-A, -B, and -C alleles of the South American population, respectively, resulting in a 99% population coverage (i.e., the probability that a person of the South American population carries at least one HLA allele that is covered by the vaccine is 99%).
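The notion of population coverage used above, the probability that a person carries at least one covered allele, can be computed under common simplifying assumptions. The sketch below assumes two independently drawn alleles per locus (Hardy-Weinberg proportions) and independence between loci; whether the reported 99% was obtained in exactly this way is not stated here. The frequencies in the example are invented and do not correspond to the Zika analysis.

```python
def population_coverage(covered_frequency_by_locus):
    """Probability of carrying at least one covered allele at any locus.

    covered_frequency_by_locus maps a locus name to the summed population
    frequency of the alleles at that locus that are covered by the vaccine.
    """
    probability_uncovered = 1.0
    for frequency in covered_frequency_by_locus.values():
        # Both alleles at the locus must be uncovered alleles.
        probability_uncovered *= (1.0 - frequency) ** 2
    return 1.0 - probability_uncovered


# Invented summed frequencies of covered alleles per locus.
example = population_coverage({"HLA-A": 0.5, "HLA-B": 0.3, "HLA-C": 0.4})
```

With these invented numbers the probability of carrying no covered allele is 0.25 × 0.49 × 0.36 = 0.0441, which gives a coverage of about 95.6%. Coverage can therefore be high even when the covered alleles at each single locus account for only a moderate share of the population frequency.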
The complexity and development time of accurate, state-of-the-art immunoinformatics tasks is high. To maximize quality in the results and to decrease implementation time, it is common that immunoinformatics software makes use of already existing, thoroughly tested libraries. Unfortunately, the installation and configuration of the different components of such pipelines tends to be non-trivial and often exceeds the technical capabilities of many end users.
With these aspects in mind, we developed ImmunoNodes, an immunoinformatics framework that covers essential tasks of pipelines such as epitope discovery, HLA inference, antigen processing, and vaccine design. Structuring complex scientific tasks into a collection of small, easily executable computations (i.e., a pipeline or workflow) adds a degree of reproducibility, an aspect desired in all scientific endeavors. Because ImmunoNodes is fully integrated into KNIME using GKN, it enables a wide audience to develop complex analysis workflows without having mastered a programming language. In addition, the provided Docker images relieve the end user of the installation and configuration of required third-party libraries. We are therefore confident that ImmunoNodes will enable a wide range of users to develop innovative and complex pipelines, thus spreading the usage of state-of-the-art immunoinformatics approaches.
CTD: Common tool descriptor
GKN: Generic KNIME node
HLA: Human leukocyte antigen
IDE: Integrated development environment
IEDB: Immune epitope database
KNIME: Konstanz information miner
TAP: Transporter associated with antigen processing
TSP: Traveling salesman problem
VCF: Variant calling format
XML: Extensible markup language
Boisguérin V, Castle J, Loewer M, Diekmann J, Mueller F, Britten C, Kreiter S, Türeci Ö, Sahin U. Translation of genomics-guided RNA-based personalised cancer vaccines: towards the bedside. Br J Cancer. 2014;111(8):1469–75.
Shukla SA, Rooney MS, Rajasagi M, Tiao G, Dixon PM, Lawrence MS, Stevens J, Lane WJ, Dellagatta JL, Steelman S. Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes. Nat Biotechnol. 2015;33(11):1152–8.
Kreiter S, Vormehr M, van de Roemer N, Diken M, Löwer M, Diekmann J, Boegel S, Schrörs B, Vascotto F, Castle JC. Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature. 2015;520(7549):692–6.
Schubert B, Brachvogel H-P, Jürges C, Kohlbacher O. EpiToolKit—a web-based workbench for vaccine design. Bioinformatics. 2015;31(13):2211–3.
Vita R, Overton JA, Greenbaum JA, Ponomarenko J, Clark JD, Cantrell JR, Wheeler DK, Gabbard JL, Hix D, Sette A. The immune epitope database (IEDB) 3.0. Nucleic Acids Res. 2015;43(D1):D405–12.
Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Sieb C, Thiel K, Wiswedel B. KNIME: The Konstanz information miner. Heidelberg: Springer; 2008.
Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Thiel K, Wiswedel B. KNIME-the Konstanz information miner: version 2.0 and beyond. ACM SIGKDD Explorations Newsletter. 2009;11(1):26–31.
Döring A, Weese D, Rausch T, Reinert K. SeqAn an efficient, generic C++ library for sequence analysis. BMC Bioinformatics. 2008;9(1):11.
Lindenbaum P, Le Scouarnec S, Portero V, Redon R. Knime4Bio: a set of custom nodes for the interpretation of next-generation sequencing data with KNIME. Bioinformatics. 2011;27(22):3200–1.
Beisken S, Meinl T, Wiswedel B, de Figueiredo LF, Berthold M, Steinbeck C. KNIME-CDK: Workflow-driven cheminformatics. BMC Bioinformatics. 2013;14(1):1.
Aiche S, Sachsenberg T, Kenar E, Walzer M, Wiswedel B, Kristl T, Boyles M, Duschl A, Huber CG, Berthold MR. Workflows for automated downstream data analysis and visualization in large‐scale computational mass spectrometry. Proteomics. 2015;15(8):1443–7.
Uszkoreit J, Maerkens A, Perez-Riverol Y, Meyer HE, Marcus K, Stephan C, Kohlbacher O, Eisenacher M. PIA: An intuitive protein inference engine with a web-based user interface. J Proteome Res. 2015;14(7):2988–97.
Sturm M, Bertsch A, Gröpl C, Hildebrandt A, Hussong R, Lange E, Pfeifer N, Schulz-Trieglaff O, Zerck A, Reinert K. OpenMS–an open-source software framework for mass spectrometry. BMC Bioinformatics. 2008;9(1):163.
Analytics Platform Product Sheet. https://www.knime.org/knime-analytics-platform.
de la Garza L, Veit J, Szolek A, Röttig M, Aiche S, Gesing S, Reinert K, Kohlbacher O. From the Desktop to the Grid: scalable Bioinformatics via Workflow Conversion. BMC Bioinformatics. 2016;17(1):127.
Schubert B, Walzer M, Brachvogel H-P, Szolek A, Mohr C, Kohlbacher O. FRED 2: An Immunoinformatics Framework for Python. Bioinformatics. 2016;32(13):2044–6. doi:10.1093/bioinformatics/btw113.
Backert L, Kohlbacher O. Immunoinformatics and epitope prediction in the age of genomic medicine. Genome Medicine. 2015;7(1):1–12.
Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S. Ensembl 2015. Nucleic Acids Res. 2015;43(D1):D662–9.
Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics. 2010;26(16):2069–70.
Toussaint NC, Kohlbacher O. OptiTope—a web server for the selection of an optimal set of peptides for epitope-based vaccines. Nucleic Acids Res. 2009;37 suppl 2:W617–22.
Toussaint NC, Maman Y, Kohlbacher O, Louzoun Y. Universal peptide vaccines–Optimal peptide vaccine design based on viral sequence conservation. Vaccine. 2011;29(47):8745–53.
Helsgaun K. General k-opt submoves for the Lin–Kernighan TSP heuristic. Math Program Comput. 2009;1(2–3):119–63.
Schubert B, Kohlbacher O. Designing string-of-beads vaccines with optimal spacers. Genome Medicine. 2016;8(1):1–10.
Toussaint NC, Feldhahn M, Ziehm M, Stevanović S, Kohlbacher O. T-cell epitope prediction based on self-tolerance. In: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine. ACM; 2011. p. 584–8.
NCBI RC. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2016;44(D1):D7.
Kowalewski DJ, Stevanovic S, Rammensee HG, Stickel JS. Antileukemia T-cell responses in CLL - We don’t need no aberration. Oncoimmunology. 2015;4(7):e1011527.
Peper JK, Bosmuller HC, Schuster H, Guckel B, Horzer H, Roehle K, Schafer R, Wagner P, Rammensee HG, Stevanovic S, et al. HLA ligandomics identifies histone deacetylase 1 as target for ovarian cancer immunotherapy. Oncoimmunology. 2016;5(5):e1065369.
Kowalewski DJ, Schuster H, Backert L, Berlin C, Kahn S, Kanz L, Salih HR, Rammensee HG, Stevanovic S, Stickel JS. HLA ligandome analysis identifies the underlying specificities of spontaneous antileukemia immune responses in chronic lymphocytic leukemia (CLL). Proc Natl Acad Sci U S A. 2015;112(2):E166–175.
Röst HL, Sachsenberg T, Aiche S, Bielow C, Weisser H, Aicheler F, Andreotti S, Ehrlich H-C, Gutenbrunner P, Kenar E. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Methods. 2016;13(9):741–8.
Martens L, Hermjakob H, Jones P, Adamski M, Taylor C, States D, Gevaert K, Vandekerckhove J, Apweiler R. PRIDE: the proteomics identifications database. Proteomics. 2005;5(13):3537–45.
Andreatta M, Nielsen M. Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics. 2016;32(4):511–7. doi:10.1093/bioinformatics/btv639.
Pickett BE, Sadat EL, Zhang Y, Noronha JM, Squires RB, Hunt V, Liu M, Kumar S, Zaremba S, Gu Z. ViPR: an open bioinformatics database and analysis resource for virology research. Nucleic Acids Res. 2012;40(D1):D593–8.
Zhang H, Lund O, Nielsen M. The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding. Bioinformatics. 2009;25(10):1293–9.
Parker KC, Bednarek MA, Coligan JE. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol. 1994;152(1):163–75.
Dönnes P, Elofsson A. Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinformatics. 2002;3(1):25.
Bui H-H, Sidney J, Peters B, Sathiamurthy M, Sinichi A, Purton K-A, Mothé BR, Chisari FV, Watkins DI, Sette A. Automated generation and evaluation of specific MHC binding predictive tools: ARB matrix applications. Immunogenetics. 2005;57(5):304–14.
Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics. 2005;6(1):132.
Kim Y, Sidney J, Pinilla C, Sette A, Peters B. Derivation of an amino acid similarity matrix for peptide: MHC binding and its application as a Bayesian prior. BMC Bioinformatics. 2009;10(1):394.
Sidney J, Assarsson E, Moore C, Ngo S, Pinilla C, Sette A, Peters B. Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries. Immunome Res. 2008;4(2):7580–4.
Nielsen M, Andreatta M. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets. Genome Medicine. 2016;8(1):1.
Sturniolo T, Bono E, Ding J, Raddrizzani L, Tuereci O, Sahin U, Braxenthaler M, Gallazzi F, Protti MP, Sinigaglia F. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat Biotechnol. 1999;17(6):555–61.
Zhang L, Chen Y, Wong H-S, Zhou S, Mamitsuka H, Zhu S. TEPITOPEpan: extending TEPITOPE for peptide binding prediction covering over 700 HLA-DR molecules. PLoS One. 2012;7(2):e30483. doi:10.1371/journal.pone.0030483.
Nielsen M, Lundegaard C, Lund O. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method. BMC Bioinformatics. 2007;8(1):238.
Karosiene E, Rasmussen M, Blicher T, Lund O, Buus S, Nielsen M. NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics. 2013;65(10):711–24.
Rammensee H-G, Bachmann J, Emmerich NPN, Bachor OA, Stevanović S. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics. 1999;50(3–4):213–9.
Stranzl T, Larsen MV, Lundegaard C, Nielsen M. NetCTLpan: pan-specific MHC class I pathway epitope predictions. Immunogenetics. 2010;62(6):357–68.
Calis JJ, Maybeno M, Greenbaum JA, Weiskopf D, De Silva AD, Sette A, Keşmir C, Peters B. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput Biol. 2013;9(10):e1003266.
Tenzer S, Peters B, Bulik S, Schoor O, Lemmel C, Schatz M, Kloetzel P-M, Rammensee H-G, Schild H, Holzhütter H-G. Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding. Cellular and Molecular Life Sciences CMLS. 2005;62(9):1025–37.
Dönnes P, Kohlbacher O. Integrated modeling of the major events in the MHC class I antigen processing pathway. Protein Sci. 2005;14(8):2132–40.
Nielsen M, Lundegaard C, Lund O, Keşmir C. The role of the proteasome in generating cytotoxic T-cell epitopes: insights obtained from improved predictions of proteasomal cleavage. Immunogenetics. 2005;57(1–2):33–41.
Peters B, Bulik S, Tampe R, Van Endert PM, Holzhütter H-G. Identifying MHC class I epitopes by predicting the TAP transport efficiency of epitope precursors. J Immunol. 2003;171(4):1741–9.
Doytchinova I, Hemsley S, Flower DR. Transporter associated with antigen processing preselection of peptides binding to the MHC: a bioinformatic evaluation. J Immunol. 2004;173(11):6813–9.
Szolek A, Schubert B, Mohr C, Sturm M, Feldhahn M, Kohlbacher O. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics. 2014;30(23):3310–6.
Boegel S, Löwer M, Schäfer M, Bukur T, de Graaf J, Boisguérin V, Türeci Ö, Diken M, Castle JC, Sahin U. HLA typing from RNA-Seq sequence reads. Genome Medicine. 2013;4(12):102.
This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 633592 (APERIM). OK acknowledges funding from the Deutsche Forschungsgemeinschaft (SFB685/B1).
Availability of data and materials
ImmunoNodes’ source code is hosted at GitHub (https://github.com/FRED-2/ImmunoNodes) and released under a 3-clause BSD license. Licenses for commercial use are needed for third-party software including the NetMHC-family and the LKH solver. ImmunoNodes is fully integrated into KNIME. It can be directly installed from KNIME’s graphical user interface. For further information, see the installation guide at https://github.com/FRED-2/ImmunoNodes. KNIME can be downloaded from https://www.knime.org. The presented example workflows can be downloaded from https://www.myexperiment.org or directly from ImmunoNodes’ GitHub repository.
BS, LG developed and implemented the method. CM implemented the distance-to-self nodes. MW contributed the ligandomics workflow. BS, LG, and OK wrote the paper. OK designed the study. All authors read and approved the manuscript.
The authors declare that they have no competing interests.
Ethics approval and consent to participate
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Schubert, B., de la Garza, L., Mohr, C. et al. ImmunoNodes – graphical development of complex immunoinformatics workflows. BMC Bioinformatics 18, 242 (2017). https://doi.org/10.1186/s12859-017-1667-z