Table 7. Overview of tools and resources. External tools and resources used by participating teams for the PPI tasks.

From: The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

| Name | Type | URL (ref.) | Summary |
|---|---|---|---|
| MALLET | ML | [48] | Framework for feature extraction, logistic regression models and inference |
| SVMPerf | ML | [83] | Support Vector Machine software for optimizing multivariate performance measures |
| Weka | ML | [64] | Collection of machine learning algorithms for data mining, useful for feature selection |
| LIBSVM | ML | [84] | Software for support vector classification |
| Matlab | ML | [85] | Data analysis and numeric computation software |
| Liblinear | ML | [86] | Linear classifier software |
| MEGAM | ML | [87] | Software implementing maximum entropy models |
| C&C CCG parser | NLP | [55] | Parser and taggers written in C++ |
| TreeTagger | NLP | [88] | Part-of-speech tagger (trained on the Penn Treebank) |
| SNOWBALL | NLP | [89] | Stemming program |
| NooJ | NLP | [90] | Corpus processing and dictionary matching |
| Lucene | NLP | [53] | Full-featured text search engine library |
| LingPipe | NLP | [91] | Toolkit for processing text using computational linguistics |
| PSI-MI | Lexical | [46] | Molecular Interaction Ontology used by PPI databases |
| UMLS | Lexical | [47] | Unified Medical Language System, which contains a large vocabulary database of biomedical and health-related concepts |
| MeSH | Lexical | [92] | Vocabulary thesaurus used for indexing PubMed |
| ChEBI | Lexical | [93] | Chemical Entities of Biological Interest |
| BioLexicon | Lexical | [52] | Terminological resources integrating data from various bioinformatics collections |
| Stop words | Lexical | [44] | Collection of words that are filtered out before natural language data is processed |
| NLProt | BioNLP | [94] | SVM-based tool for recognizing protein names in text |
| OSCAR3 | BioNLP | [95] | Tool for recognizing chemical name mentions in text |
| ABNER | BioNLP | [60] | Biomedical named entity recognition (proteins, genes, DNA, etc.) |