Table 3 Informational density of various corpora

From: Automatic reconstruction of a bacterial regulatory network using Natural Language Processing

| A) Corpus | B) # Docs | C) Size (MB) | D) Interactions in RegulonDB | E) Total interactions | F) % in RegulonDB | G) Average doc size (KB) | H) RegulonDB interactions per doc |
|---|---|---|---|---|---|---|---|
| RN | 724 | 24.9 | 1026 | 1643 | 62.4 | 65.9 | 1.41 |
| RP | 2475 | 99.0 | 1316 | 2650 | 49.6 | 40.0 | 0.53 |
| RA | 3075 | 3.3 | 322 | 414 | 77.7 | 1.07 | 0.1 |
| EA | 13334 | 14.4 | 402 | 627 | 64.1 | 1.08 | 0.03 |
| RS | 12059 | 12.3 | 400 | 718 | 55.7 | 1.02 | 0.03 |
| ST | 58312 | 10.7 | 342 | 691 | 49.5 | 0.18 | 0.005 |

  1. A comparison of how informative various corpora are about transcriptional regulation in E. coli K-12, as measured by the number of RegulonDB-attested interactions they contain. The table gives the total number of documents and interactions (Cols. B & E), the number and percentage of those interactions found in RegulonDB (Cols. D & F), and the average document size (Col. G). Rows are ordered by the number of RegulonDB interactions per document (Col. H).
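
Columns F and H appear to be simple ratios of the counts in Cols. B, D and E. The Python sketch below recomputes them from the table's own counts. The function and variable names are illustrative and do not come from the article. That Col. H divides RegulonDB-attested interactions (Col. D) by documents (Col. B) is an assumption inferred from the caption and the published values.

```python
# Counts copied from Table 3:
# corpus -> (documents [Col. B], interactions in RegulonDB [Col. D], total interactions [Col. E])
TABLE3 = {
    "RN": (724, 1026, 1643),
    "RP": (2475, 1316, 2650),
    "RA": (3075, 322, 414),
    "EA": (13334, 402, 627),
    "RS": (12059, 400, 718),
    "ST": (58312, 342, 691),
}


def pct_in_regulondb(in_regulondb: int, total: int) -> float:
    """Col. F (assumed): share of extracted interactions attested in RegulonDB, in percent."""
    return 100.0 * in_regulondb / total


def interactions_per_doc(in_regulondb: int, docs: int) -> float:
    """Col. H (assumed): RegulonDB-attested interactions per document."""
    return in_regulondb / docs


def derived_columns(table=TABLE3):
    """Return corpus -> (Col. F, Col. H), recomputed from the raw counts."""
    return {
        corpus: (pct_in_regulondb(in_db, total), interactions_per_doc(in_db, docs))
        for corpus, (docs, in_db, total) in table.items()
    }


if __name__ == "__main__":
    for corpus, (pct, per_doc) in derived_columns().items():
        print(f"{corpus}: {pct:.1f}% in RegulonDB, {per_doc:.3f} per document")
```

The recomputed values may differ from the published ones in the last digit, depending on whether a figure was rounded or truncated.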