Table 1 Probability of finding one or more "quasi-ditags" in a nucleotide sequence of a given length (P(20–24))

From: Incidence of "quasi-ditags" in catalogs generated by Serial Analysis of Gene Expression (SAGE)

Columns 2 to 4 give the frequency of ≥ 1 quasi-ditags in the sequence.

| Sequence length (L) | Mathematical model | Computer simulation¹ | In vivo simulation² (S. cerevisiae) | S. cerevisiae chromosome³ |
|---|---|---|---|---|
| 600 bp | 0.039392 | 0.039858 | 0.016666 | IV [NC_001136] |
| 700 bp | 0.046231 | 0.046655 | 0.026666 | X [NC_001142] |
| 800 bp | 0.053070 | 0.053383 | 0.036666 | XIV [NC_001146] |
| 900 bp | 0.059909 | 0.060051 | 0.040000 | VIII [NC_001140] |
| 1,000 bp | 0.066748 | 0.066743 | 0.046666 | V [NC_001137] |
| 1,100 bp | 0.073587 | 0.073225 | 0.066666 | IX [NC_001141] |
| 1,200 bp | 0.080426 | 0.079793 | 0.070000 | XI [NC_001143] |

  1. For computer simulation, 1,000,000 files, each consisting of a random, sequence-imitating combination of A, C, G and T nucleotides of the selected length, were analyzed in search of SAGE "quasi-ditags".
  2. For in vivo / in silico simulations, 300 sequences were created for each L value by fragmentation of randomly selected chromosomes of Saccharomyces cerevisiae. Larger samplings (900–1,400 sequences) were created and tested for selected sequence lengths and did not change the results significantly.
  3. GenBank database accession numbers are given in brackets.
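
The in vivo simulation frequencies can be read as counts out of the 300 sequences mentioned in footnote 2, which gives a rough sense of their sampling uncertainty. The sketch below is an illustration, not part of the original analysis. It assumes each frequency is a simple proportion of 300 independently sampled sequences and uses the ordinary binomial standard error. The names `IN_VIVO`, `MODEL`, `implied_count` and `standard_error` are hypothetical helpers introduced here.

```python
import math

# Values copied from Table 1, keyed by sequence length in bp.
MODEL = {600: 0.039392, 700: 0.046231, 800: 0.053070, 900: 0.059909,
         1000: 0.066748, 1100: 0.073587, 1200: 0.080426}
IN_VIVO = {600: 0.016666, 700: 0.026666, 800: 0.036666, 900: 0.040000,
           1000: 0.046666, 1100: 0.066666, 1200: 0.070000}

N_SEQUENCES = 300  # sample size per L value, from footnote 2


def implied_count(freq, n=N_SEQUENCES):
    """Number of sequences with >= 1 quasi-ditag implied by a frequency.

    Assumption: the reported frequency is count / n, truncated when printed.
    """
    return round(freq * n)


def standard_error(freq, n=N_SEQUENCES):
    """Binomial standard error of a proportion (assumes independent samples)."""
    return math.sqrt(freq * (1.0 - freq) / n)


if __name__ == "__main__":
    for length in sorted(IN_VIVO):
        freq = IN_VIVO[length]
        print(f"{length} bp: {implied_count(freq)}/{N_SEQUENCES} sequences, "
              f"SE ~ {standard_error(freq):.4f}, model = {MODEL[length]:.6f}")
```

Under these assumptions the in vivo frequencies correspond to 5, 8, 11, 12, 14, 20 and 21 sequences out of 300 for L = 600 to 1,200 bp. With counts this small, the standard errors are of the same order as the gap between the in vivo and model columns.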