Table 1 Whole-genome long-read sequencing datasets used in the current version of RepeatHMM-scan

From: Genome-wide detection of short tandem repeat expansions by long-read sequencing

| Genome name | Number of long reads | Mapped coverage |
| --- | --- | --- |
| NA12878 [33] | 68,064,542 | 54X |
| NA24385 [33] | 26,325,971 | 55X |
| NA24149 [33] | 12,927,769 | 26X |
| NA24143 [33] | 12,655,875 | 26X |
| NA24631 [33] | 20,640,162 | 56X |
| NA24694 [43] | 10,211,241 | 28X |
| NA24695 [43] | 10,075,227 | 28X |
| AK1 [32] | 1,082,595,779 | 297X |
| CHM1 [35] | 49,203,975 | 100X |
| CHM13 [44] | 69,236,262 | 176X |
| HG00268 [34] | 18,556,018 | 81X |
| HG00514 [34] | 51,979,497 | 213X |
| HG00733 [34] | 38,400,667 | 143X |
| HG01352 [34] | 33,512,701 | 144X |
| HG02059 [34] | 43,154,257 | 155X |
| HG02106 [34] | 20,165,840 | 71X |
| HG02818 [34] | 51,357,293 | 224X |
| HG04217 [34] | 68,629,541 | 203X |
| NA19240 [34] | 48,378,501 | 125X |
| NA19434 [34] | 32,040,706 | 147X |
| HX1 [30] | 27,541,832 | 84X |

  1. The reference assembly is GRCh38/hg38. All samples except HX1 were sequenced with PacBio SMRT technology; HX1 was sequenced with Oxford Nanopore technology. CHM1 and CHM13 are haploid human genomes.