Quality versus accuracy: result of a reanalysis of protein-binding microarrays from the DREAM5 challenge by using BayesPI2 including dinucleotide interdependence

Table 1 Prediction results for TFs with good PBM quality, using the BayesPI2 energy-independent model and the energy-dependent model including dinucleotide interactions

TF	TF family	Rank	CorrCoef (Ind)	Length (Ind)	Number (Ind)	CorrCoef (Dep)	Length (Dep)	Number (Dep)
TF_7	bHLH	1	0.78	13	1	0.786	10	1
TF_26	bHLH	2	0.66	10	1	0.67	13	1
TF_56	C2H2 ZF (4)	3	0.76	12	1	0.788	13	1
TF_55	AT hook	4	0.8	8	1	0.8	9	1
TF_17	NR	5	0.76	9	1	0.757	8	1
TF_11	NR	6	0.826	10	1	0.833	10	1
TF_16	Myb/SANT	7	0.808	10	1	0.817	9	1
TF_31	C2H2 ZF (13)	8	0.585	12	1	0.59	11	1
TF_15	Pou + Homeo	9	0.62	11	1	0.655	11	1
TF_45	Myb/SANT	11	0.8	12	1	0.78	11	1
TF_42*	Forkhead	12	0.75	12	1	0.805	13	1
TF_64	C2H2 ZF (3)	13	0.75	8	1	0.75	10	1
TF_52	NR	14	0.807	12	1	0.79	10	1
TF_3*	Forkhead	16	0.67	10	1	0.724	11	1
TF_27*	bZIP	17	0.526	9	1	0.635	11	1
TF_18	Sox	18	0.638	8	1	0.677	8	1
TF_22*	T-box	19	0.675	9	1	0.746	12	1
TF_47	Homeo	21	0.726	12	1	0.767	11	1
TF_44	GATA	22	0.633	12	1	0.68	10	1
TF_28	C2H2 ZF (8)	23	0.58	11	1	0.6	11	1
TF_13*	Pou + Homeo	24	0.584	9	1	0.675	13	1
TF_5	C2H2 ZF (3)	25	0.59	10	1	0.618	11	1
TF_43	Forkhead	27	0.577	10	1	0.614	12	1
TF_19	Sox	29	0.559	8	1	0.59	11	1
TF_39	C2H2 ZF (3)	30	0.63	11	1	0.668	13	1
TF_51*	Pou + Homeo	31	0.615	12	1	0.67	13	1
TF_23	T-box	33	0.64	12	1	0.66	13	1
TF_12*	NR	35	0.55	13	1	0.606	13	1
TF_49	NR	34	0.675	10	1	0.676	11	1
TF_53*	RFX	39	0.696	12	1	0.77	13	1
TF_14	Myb/SANT	40	0.6	8	1	0.62	9	1
TF_48	NR	43	0.755	12	1	0.78	12	1
TF_38	DM	45	0.67	9	1	0.689	8	1
TF_32	C2H2 ZF (6)	55	0.587	9	1	0.62	8	1

The 34 TFs in the table were classified by applying the fuzzy neural gas algorithm to the paired PBM quality-control parameters (i.e. the lengths of the major and minor axes of the PCA ellipses), where good agreement between training and testing PBMs indicates good PBM data quality. Rank is the position of a TF when TFs are sorted in decreasing order of their final performance score across all tested algorithms in Figure 2 of the original publication [1]. CorrCoef is the Pearson correlation between predicted intensities and testing probe intensities, Length is the length of the motif, and Number indicates whether the first or second motif is reported. (Ind) and (Dep) denote the BayesPI2 energy-independent model and the energy-dependent model including dinucleotide interactions, respectively. TFs marked by a star and bold text are those for which the Pearson correlation coefficient increases by more than 0.05 when the BayesPI2 energy-dependent model including dinucleotide interaction energies is used.
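The star criterion in the table note can be checked directly from the two CorrCoef columns. The following is a minimal sketch, not part of BayesPI2: the function name `starred`, the constant `THRESHOLD`, and the rounding to three decimals (used only to avoid floating-point noise near the cutoff) are assumptions made for illustration. The rows are a subset copied from Table 1.

```python
# Subset of Table 1: (TF, CorrCoef (Ind), CorrCoef (Dep)).
rows = [
    ("TF_7", 0.78, 0.786),
    ("TF_42", 0.75, 0.805),
    ("TF_3", 0.67, 0.724),
    ("TF_27", 0.526, 0.635),
    ("TF_44", 0.633, 0.68),
    ("TF_52", 0.807, 0.79),
]

# Gain in Pearson correlation required for a star, as stated in the table note.
THRESHOLD = 0.05


def starred(rows, threshold=THRESHOLD):
    """Return the TFs whose (Dep - Ind) correlation gain exceeds the threshold.

    The gain is rounded to three decimals (an assumption of this sketch),
    matching the precision of the values reported in the table.
    """
    return [name for name, ind, dep in rows if round(dep - ind, 3) > threshold]


print(starred(rows))
```

For the rows shown, only TF_42, TF_3, and TF_27 pass the cutoff, in agreement with the stars in the table; TF_44 gains 0.047 and TF_52 loses 0.017, so neither is marked.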

ISSN: 1471-2105