Table 5 Example sentences used in training of the SVM model

From: Improving classification in protein structure databases using text mining

| Training example | Label (+/-) |
| --- | --- |
| Sequence analysis showed that pENO2 shares 75.6% nucleotide and 89.5% deduced amino acid sequence identity with pENO1 and is encoded by a distinct gene. | + |
| The packing of the octameric enzyme in this crystal form is unusual, because the asymmetric unit contains three subunits. | + |
| Cys-592, which is essential for enzymatic activity, is located in the above-mentioned histidine-rich region. | + |
| From the significant sequence similarity between intradiol enzymes, it has been shown that intradiol enzymes evolved from a common ancestor. | + |
| Two 2,3-dihydroxybiphenyl (23DHBP) dioxygenase genes, bphC1 and etbC involved in the degradation of polychlorinated biphenyl(s) (PCBs) have been isolated and characterized from a strong PCB degrader, Rhodococcus sp. | + |
| A thermostable hydantoinase of Bacillus stearothermophilus NS1122A: cloning, sequencing, and high expression of the enzyme gene, and some properties of the expressed enzyme. | - |
| A catechol 2,3-dioxygenase gene in chromosomal DNA of P. putida KF715 was cloned and its nucleotide sequence analyzed. | - |
| The K+ ion activates the enzyme 100-fold with an activation constant of 6 mM, well below the physiologic concentration of K+ in E. coli. | - |
| A putative regulator and its possible recognition site was suggested on the basis of homology data. | - |
| The enzyme has a subunit Mr of 33,500 +/- 2000 by SDS/polyacrylamide-gel electrophoresis. | - |
  1. Sample positive (+) and negative (-) sentences, manually classified by an expert biologist according to whether they contain functional, structural or classification information, and used as training examples for learning the SVM model. Terms in italics were removed prior to training.
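
As a rough illustration of how labelled sentences like these can serve as SVM training input, the sketch below turns four of the table's sentences into bag-of-words features and fits a linear SVM by stochastic subgradient descent on the hinge loss (a Pegasos-style update). This is not the authors' pipeline: the tokenizer, the training loop, the hyperparameters, and the names `featurize`, `train_linear_svm` and `predict` are all assumptions made for illustration, and the removal of italicized terms is not reproduced.

```python
import random
import re
from collections import Counter

# Four (sentence, label) pairs copied from the table: +1 = positive, -1 = negative.
EXAMPLES = [
    ("Cys-592, which is essential for enzymatic activity, is located in "
     "the above-mentioned histidine-rich region.", +1),
    ("The packing of the octameric enzyme in this crystal form is unusual, "
     "because the asymmetric unit contains three subunits.", +1),
    ("A putative regulator and its possible recognition site was suggested "
     "on the basis of homology data.", -1),
    ("The enzyme has a subunit Mr of 33,500 +/- 2000 by "
     "SDS/polyacrylamide-gel electrophoresis.", -1),
]


def featurize(sentence):
    """Lower-cased bag-of-words term counts (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def train_linear_svm(data, lam=0.01, epochs=200, seed=0):
    """Fit sparse weights by subgradient descent on the regularized hinge loss.

    lam and epochs are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    w = {}
    t = 0
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):  # shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(w.get(k, 0.0) * v for k, v in x.items())
            # Shrink all weights (gradient of the L2 regularizer).
            scale = 1.0 - eta * lam
            for k in w:
                w[k] *= scale
            # Hinge-loss step only when the example violates the margin.
            if margin < 1:
                for k, v in x.items():
                    w[k] = w.get(k, 0.0) + eta * y * v
    return w


def predict(w, sentence):
    """Return +1 or -1 from the sign of the linear score."""
    score = sum(w.get(k, 0.0) * v for k, v in featurize(sentence).items())
    return 1 if score >= 0 else -1


data = [(featurize(s), y) for s, y in EXAMPLES]
weights = train_linear_svm(data)
```

Calling `predict(weights, sentence)` on a new sentence returns +1 or -1. With only four training sentences the result carries no real meaning; the sketch is meant to show the data flow from labelled text to a linear decision function.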