
Table 3 Characteristics of the subsets of data used for the benchmarks reported in this paper

From: A benchmark server using high resolution protein structure data, and benchmark results for membrane helix predictions

 

Characteristics of the specialised data subsets used for the benchmarks

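| Identifier for subsets of data used in benchmarks reported in this paper | TMH | ½MH | BB | Solb | All years | 2008… | OPM | PDBTM | #seqs | #MHs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1) TMH_1/2MH_OPM | Y | Y | | | Y | | Y | | 101 | 483 |
| (2) TMH_1/2MH_2008_OPM | Y | Y | | | | Y | Y | | 24 | 191 |
| (3) TMH_1/2MH_BB_SOLB_OPM | Y | Y | Y | Y | Y | | Y | | 599 | 483 |
| (4) TMH_OPM | Y | | | | Y | | Y | | 86 | 372 |
| (5) TMH_BB_SOLB_OPM | Y | | Y | Y | Y | | Y | | 584 | 372 |
| (6) TMH_PDBTM | Y | | | | Y | | | Y | 86 | 464 |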

  1. All of these data subsets were restricted to sequences with less than 30% similarity to each other, with similarity measured by EMBOSS global sequence alignment. Other parameters used to build the data subsets are indicated by ticks (Y) in the columns. For parameters not specified here, the benchmark server default values were used. Legend: TMH: transmembrane helices; ½MH: half-membrane helices; BB: membrane β-barrels; Solb: soluble proteins; All years: sequences used in the benchmark were not restricted by the date on which the PDB model was made available; 2008: the benchmark was restricted to sequences belonging to PDB structures deposited in 2008 or later, with no structure of similar sequence deposited before 2008; OPM: the benchmark used OPM-adjusted membrane helix assignments; PDBTM: the benchmark used PDBTM membrane helix assignments, excluding segments assigned as loops; #seqs: total number of sequences; #MHs: total number of membrane helices.
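For readers who want to work with these subsets programmatically, the table can be transcribed into a small data structure. The following is a minimal sketch, not part of the benchmark server: the `Subset` class, its field names, and the `select` helper are hypothetical names introduced here for illustration, and only the identifiers, ticks, and counts are taken from the table.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Subset:
    """One row of Table 3 (hypothetical container, not the server's own format)."""
    identifier: str
    contents: FrozenSet[str]  # protein classes ticked: "TMH", "1/2MH", "BB", "Solb"
    date_range: str           # "All years" or "2008" (deposited 2008 or later)
    assignment: str           # helix assignment source: "OPM" or "PDBTM"
    n_seqs: int               # #seqs column
    n_mhs: int                # #MHs column


# Values transcribed from the table; nothing here is computed or estimated.
SUBSETS: List[Subset] = [
    Subset("TMH_1/2MH_OPM", frozenset({"TMH", "1/2MH"}), "All years", "OPM", 101, 483),
    Subset("TMH_1/2MH_2008_OPM", frozenset({"TMH", "1/2MH"}), "2008", "OPM", 24, 191),
    Subset("TMH_1/2MH_BB_SOLB_OPM", frozenset({"TMH", "1/2MH", "BB", "Solb"}),
           "All years", "OPM", 599, 483),
    Subset("TMH_OPM", frozenset({"TMH"}), "All years", "OPM", 86, 372),
    Subset("TMH_BB_SOLB_OPM", frozenset({"TMH", "BB", "Solb"}),
           "All years", "OPM", 584, 372),
    Subset("TMH_PDBTM", frozenset({"TMH"}), "All years", "PDBTM", 86, 464),
]


def select(subsets: List[Subset],
           assignment: Optional[str] = None,
           includes: Optional[str] = None) -> List[Subset]:
    """Return the subsets matching an assignment source and/or a ticked protein class."""
    result = []
    for s in subsets:
        if assignment is not None and s.assignment != assignment:
            continue
        if includes is not None and includes not in s.contents:
            continue
        result.append(s)
    return result


if __name__ == "__main__":
    for s in select(SUBSETS, includes="BB"):
        print(s.identifier, s.n_seqs, s.n_mhs)
```

Calling `select(SUBSETS, includes="BB")` returns subsets (3) and (5). In the table these have the same #MHs as subsets (1) and (4) respectively, but 498 more sequences each.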