Sensitivity (true positive rate, TPR) and false positive rate (FPR) plotted against profile pair information content (IC). Here we use a single type-I error threshold of 5%, with TNW as the reference. The left column (A-C) corresponds to OrthoMCL groups and the right column (D-F) to InterPro groups. The top row (A and D) shows results for presence-absence profiling approaches, the middle row (B and E) for group size approaches using the hypergeometric test (GSHGT), and the bottom row (C and F) for group size approaches using the Kendall τ correlation index (GSCOR). Results for WRUNS on TNW and TNUW are omitted because they are extremely similar to those for DPCP. Violin plots under each diagram indicate the frequency distribution of profile pairs across information content (wider plots indicate higher frequencies). Each colour represents a different profile pair dataset: the positive datasets are TPPPI (blue) and TPCoPW (green), and the negative datasets are TNUW (magenta) and TNW (orange). The curves for the two positive datasets therefore show sensitivity (TPR), whereas the curves for the two negative datasets show FPR under alternative negative control sampling approaches. Sensitivity, FPR and IC frequency distributions are smoothed by considering neighbouring data points along the y-axis (radius 2) with equal weights. Data points with fewer than 10 observations are not shown.
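
The smoothing and filtering steps mentioned in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the truncation of the window at the ends of the series, and the choice to hide sparse points after smoothing are all assumptions made for illustration. The sketch averages each data point with its neighbours within radius 2 using equal weights, then hides points supported by fewer than 10 observations.

```python
import numpy as np


def smooth_equal_weights(values, radius=2):
    """Equal-weight moving average over a window of +/- `radius` points.

    Assumption: at the ends of the series the window is truncated to the
    points that exist; the caption does not say how edges are handled.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    smoothed = np.empty(n)
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        smoothed[i] = values[lo:hi].mean()
    return smoothed


def hide_sparse_points(values, counts, min_obs=10):
    """Replace points backed by fewer than `min_obs` observations with NaN.

    Assumption: hidden points are represented as NaN so that a plotting
    library leaves a gap in the curve.
    """
    result = np.asarray(values, dtype=float).copy()
    result[np.asarray(counts) < min_obs] = np.nan
    return result


# Hypothetical per-bin rates and observation counts, for illustration only.
rates = [0.0, 0.0, 0.5, 0.0, 0.0]
counts = [4, 12, 30, 25, 9]
curve = hide_sparse_points(smooth_equal_weights(rates), counts)
```

With a radius of 2, each interior point is the mean of five values, so isolated spikes are spread over their neighbours. Whether sparse points are excluded before or after smoothing is not stated in the caption and would change the resulting curves slightly.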