BMC Bioinformatics

Table 1 Performance results from implemented human DLPR models

From: Supervised promoter recognition: a benchmark framework

		CNNProm dataset												ICNNP dataset
		1426 promoters				19,811 promoters				21,237 promoters				7156 promoters
		8256 non-promoters				27,731 non-promoters				35,987 non-promoters				5235 non-promoters
		TATA				Non-TATA				Complete dataset				Complete dataset
Results	Model	MCC	PPV	Sn	Sp	MCC	PPV	Sn	Sp	MCC	PPV	Sn	Sp	MCC	PPV	Sn	Sp
OS*	1	90	–	95	98	89	–	90	98	–	–	–	–	–	–	82	79
	2	–	–	–	–	–	–	–	–	–	–	–	–	–	–	90	87
	3	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
CT*	1	92	95	91	99	72	85	83	89	73	85	82	91	–	–	–	–
	CI	90–93	94–96	89–93	99–99	71–73	81–88	80–86	86–92	72–74	82–87	79–84	89–93	–	–	–	–
	2	–	–	–	–	–	–	–	–	–	–	–	–	64	83	88	74
	CI	–	–	–	–	–	–	–	–	–	–	–	–	61–67	80–87	83–93	66–82
	3	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
	CI	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
CT⁻	1	86	82	94	97	73	91	76	95	76	88	82	94	69	76	87	84
	2	–	–	–	–	–	–	–	–	69	76	87	84	70	88	87	83
	3	16	18	91	28	−5	41	93	05	−4	37	95	03	07	59	96	07
CT⁺	1	94	93	97	99	76	89	83	93	80	90	84	95	74	93	84	91
	2	–	–	–	–	–	–	–	–	72	79	86	87	82	95	89	93
	3	19	18	99	22	−3	41	96	03	0	37	97	03	07	59	96	07

		DProm dataset
		3065 promoters				26,532 promoters				29,597 promoters
		3065 non-promoters				26,532 non-promoters				29,597 non-promoters
		TATA				Non-TATA				Complete dataset
Results	Model	MCC	PPV	Sn	Sp	MCC	PPV	Sn	Sp	MCC	PPV	Sn	Sp
OS*	1	62	75	91	–	26	58	83	–	–	–	–	–
	2	–	–	–	–	–	–	–	–	–	–	–	–
	3	88	93	95	–	92	97	95	–	–	–	–	–
CT*	1	–	–	–	–	–	–	–	–	–	–	–	–
	CI	–	–	–	–	–	–	–	–	–	–	–	–
	2	–	–	–	–	–	–	–	–	–	–	–	–
	CI	–	–	–	–	–	–	–	–	–	–	–	–
	3	90	95	95	95	91	96	95	96	91	97	94	97
	CI	88–92	92–98	93–97	91–98	90–92	94–97	94–97	94–97	91–92	95–98	92–96	95–98
CT⁻	1	16	18	91	28	−5	41	93	05	−4	37	95	03
	2	–	–	–	–	–	–	–	–	27	58	87	37
	3	86	95	91	95	91	98	93	98	91	96	95	96
CT⁺	1	65	74	95	67	14	54	82	29	12	53	84	26
	2	–	–	–	–	–	–	–	–	27	58	86	38
	3	99	1	99	1	94	98	96	98	94	98	96	98

OS: model results reported in the original studies. CT: cross-testing results from our implemented models. Models: (1) CNNProm, (2) ICNNP, (3) DProm. CT⁺: results from training and testing datasets with some overlapping sequences. CT⁻: results from properly split training and testing datasets with no overlapping sequences. CT* and OS*: results obtained from 10-fold cross-validation. All values are expressed as percentages. CI rows give the 99% confidence intervals for our 10-fold cross-validation results.
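
For readers who want to reproduce figures of this kind, the following is a minimal sketch of how the four reported metrics (Matthews correlation coefficient, positive predictive value, sensitivity, specificity) can be computed from confusion-matrix counts and scaled to percentages. The function names `promoter_metrics` and `ci99_ten_folds` are hypothetical. The table does not state how its confidence intervals were derived, so the Student's t interval over ten per-fold scores shown here is an assumption, not necessarily the authors' procedure.

```python
import math
from statistics import mean, stdev


def promoter_metrics(tp, fp, tn, fn):
    """Return MCC, PPV, Sn and Sp as percentages from confusion-matrix counts."""

    def ratio(num, den):
        # A zero denominator leaves the metric undefined; report NaN.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "MCC": 100 * ratio(tp * tn - fp * fn, mcc_den),
        "PPV": 100 * ratio(tp, tp + fp),  # precision
        "Sn": 100 * ratio(tp, tp + fn),   # sensitivity (recall)
        "Sp": 100 * ratio(tn, tn + fp),   # specificity
    }


# Two-sided 99% critical value of Student's t with 9 degrees of freedom,
# i.e. for ten folds. Assumption: the table's CI method is not specified.
T_99_DF9 = 3.2498


def ci99_ten_folds(values):
    """Return a 99% confidence interval for the mean of ten per-fold scores."""
    if len(values) != 10:
        raise ValueError("expected exactly ten per-fold scores")
    centre = mean(values)
    half_width = T_99_DF9 * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width
```

Calling `promoter_metrics` once per fold and passing the ten resulting MCC values to `ci99_ten_folds` would yield one mean-centred interval comparable in form to a CI row. Note that MCC ranges from −100 to 100 on this scale, which is why negative entries can appear in the MCC columns.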

ISSN: 1471-2105
