Table 1 Description for the Data Sets

From: A novel hierarchical clustering algorithm for gene sequences

| Data | Name | Number | Average length (bp) | Description |
|---|---|---|---|---|
| DS1 | beta-globin | 176 | 1531 | Cytochrome P450 |
| | beta-Hemoglobin | 89 | 448 | Hemoglobin subunit |
| | integrin_alpha | 142 | 3360 | Integrin, alpha |
| | ketoacyl-synt1 | 43 | 754 | Estradiol 17-beta-dehydrogenase 8 |
| | myoglobin | 55 | 478 | Cytoglobin Myoglobin |
| | RWD | 93 | 825 | RWD domain-containing protein |
| | VCL | 92 | 2746 | Vinculin |
| | Histone | 81 | 668 | Histone |
| DS2 | HBG106679 | 22 | 446 | Copper uptake protein 2 |
| | HBG108349 | 49 | 718 | Prolactin |
| | HBG079775 | 26 | 3152 | Transcription elongation factor SPT5 |
| | HBG058842 | 34 | 1351 | TNFR superfamily member 1A |
| | HBG002834 | 92 | 951 | Calumenin/Reticulocalbin |
| | HBG050441 | 58 | 1899 | ATP-binding cassette sub-family G member |
| DS3 | HBG093787 | 32 | 1769 | Hypothetical membrane proteins |
| | HBG099893 | 34 | 430 | Putative membrane protein precursor |
| | HBG415481 | 65 | 557 | Phasin like/family protein |
| | HBG423057 | 32 | 236 | Hypothetical proteins |
| | HBG050644 | 99 | 3129 | Beta galactosidase, beta glucuronidase, Evolved beta-D-galactosidase alpha subunit |
| | HBG364776 | 48 | 1069 | Formate dehydrogenase gamma subunit precursor |
| DS4 | HBG000080 | 29 | 674 | BWK-1, CG6617-PA, Zgc:73100 C20orf11 homolog, RH01588p |
| | HBG060165 | 28 | 163 | ATP synthase, H+ transporting mitochondrial F1 complex/epsilon subunit |
| | HBG010471 | 48 | 1802 | Hypothetical Glycosyl transferase, family 25/Endoplasmic reticulum targeting sequence containing protein |
| | HBG000013 | 70 | 318 | 60S ribosomal protein L36a-like, 60S ribosomal protein L42, L44, IP15820p, RPL |
| | HBG000026 | 18 | 3157 | Eukaryotic translation initiation factor 2-alpha kinase 3 precursor, Eukaryotic translation initiati |
| | HBG065748 | 48 | 1238 | AT20832p, AT27361p, CG10513-PA, CG10514-PA, CG10550-PA, isoform A, CG10553-PA, CG10559-PA, CG10560-P |