EnvMine: A text-mining system for the automatic extraction of contextual information

BMC Bioinformatics

Table 4 Results for the discrimination between environmental- and experimental-associated sentences for different classifiers in the Weka package.

*Method*	*Subset*	*Original*	*Classified*	*Correct*	*Recall*	*Precision*	*F-value*
Naive Bayes Multinomial	Exp	828	802	755	91.2%	94.1%	92.6%
Naive Bayes Multinomial	Env	323	349	276	85.4%	79.1%	82.1%
Naive Bayes	Exp	828	692	679	82.0%	98.1%	89.3%
Naive Bayes	Env	323	459	322	96.0%	70.0%	81.0%
Bayes Logistic Regression	Exp	828	796	746	90.0%	93.7%	91.8%
Bayes Logistic Regression	Env	323	355	273	84.5%	76.9%	80.5%
Bayes Net	Exp	828	617	608	73.4%	98.5%	84.1%
Bayes Net	Env	323	534	314	97.2%	58.8%	73.3%
Meta Bagging	Exp	828	752	708	85.5%	94.1%	89.6%
Meta Bagging	Env	323	399	279	86.4%	69.9%	77.3%
Rules, Decision Table	Exp	828	1041	809	97.7%	77.7%	86.6%
Rules, Decision Table	Env	323	110	91	28.2%	82.7%	42.1%
Random Forest	Exp	828	776	735	88.8%	94.7%	91.7%
Random Forest	Env	323	375	282	87.3%	75.2%	80.8%

The column "Original" gives the original number of sentences in each category (Exp: experimental; Env: environmental), "Classified" gives the number of sentences that the classifier assigned to each category, and "Correct" gives the number of those sentences that were correctly classified.
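
The percentage columns can be recomputed from the three count columns. The sketch below assumes the standard definitions (recall = correct / original, precision = correct / classified, F-value = harmonic mean of the two). The table itself does not state these definitions, but they reproduce the values of the first row. The function name `prf` is illustrative and not part of EnvMine or Weka.

```python
def prf(original, classified, correct):
    """Return (recall, precision, F-value) from the table's count columns.

    Assumed definitions (not stated in the table itself):
      recall    = correct / original
      precision = correct / classified
      F-value   = harmonic mean of precision and recall
    """
    recall = correct / original
    precision = correct / classified
    f_value = 2 * precision * recall / (precision + recall)
    return recall, precision, f_value


# Naive Bayes Multinomial, Exp subset (first row of the table)
recall, precision, f_value = prf(original=828, classified=802, correct=755)
print(f"{recall:.1%} {precision:.1%} {f_value:.1%}")  # prints "91.2% 94.1% 92.6%"
```

Applying the same function to the other rows gives a quick consistency check between the count columns and the reported percentages.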

ISSN: 1471-2105