Table 4 Number of protein references successfully assigned to ROGs, broken down by assignment score.

From: iRefIndex: A consolidated protein interaction database with provenance

  

| Score type | Total number with this score type (%) | Example ROG assignment score | Number of cases | Details for one example |
|---|---|---|---|---|
| 1 | 598590 (77.43) | P | 512650 | UniProt:Q15118 is cited in the interaction record as the primary reference (P). |
| | | S | 52738 | UniProt:P94102 is cited in the interaction record as the secondary reference (S). |
| | | PD | 14166 | "protein accession" is cited as the source database for accession Q9Z2F5 (D). |
| | | SM | 2154 | Accession NP 191913 is cited in a modified form (M) without the underscore. |
| | | SVGO+ | 262 | EntrezGeneId:26207 (G) encodes multiple proteins (+) but only one matches the original (O) sequence given in the interaction record (RefSeq:NP_858057.1). |
| 2 | 24664 (3.19) | PU | 18542 | UniProt:O95686 is cited and updated (U) to UniProt/KB:Q9UQK1. |
| | | PE | 264 | GenBank GI:12962935 is cited and updated to RefSeq:NP_002458.2 using eUtils (E). |
| | | PUO+ | 6 | UniProt:P38706 is cited. Two updates are possible (+) but only one matches the original (O) sequence in the interaction record (P0C2H6). |
| 3 | 121540 (15.72) | PT | 52074 | Protein reference cites taxon id 9534 (African green monkey) but the sequence record cites taxon id 9606. |
| | | ST | 60205 | Protein reference cites taxon id 40674 (Mammalia) but the sequence record cites taxon id 9606 (human). |
| 4 | 2803 (0.36) | PTUO+ | 15 | UniProt:O04063 is cited with taxon identifier 4530 (rice). More than one updated accession exists (+U). Only one possibility has the same sequence as cited in the interaction record (P0C5B0) with taxon identifier 39947 (a specific strain of rice). |
| 5 | 9840 (1.27) | SL+ | 9090 | The primary reference cited is not found. 49 secondary references are cited (S). 15 of these were found to map to 8 distinct proteins (+). The protein with the largest (L) SEGUID is arbitrarily chosen. |
| | | PUTL+ | 187 | UniProt:Q9MAY7 is cited with a taxon id of 4530 (rice). Two updated accessions are available (+U). Neither one has the expected sequence or taxon id (T) given in the interaction record. The accession with the largest (L) SEGUID is arbitrarily chosen. |
| | | SVGL+ | 303 | EntrezGene:9912 is cited (G). This gene encodes two proteins (+). Neither has the sequence expected from the interaction record. The one with the largest (L) SEGUID is selected. |
| | | PTQ | 21 | Primary accession P84244 is cited as a "see also" (Q) reference with taxon id 9606. The sequence record cites taxon id 10090 (T). |
| 6 | 15649 (2.02) | PN | 8909 | Q95Q01 is an obsolete accession. The sequence is retrieved from the interaction record. The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
| | | SEN | 5561 | RefSeq:NP_010441 is an obsolete accession. The sequence is retrieved using eUtils (E). The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
| | | STGOEN+ | 2 | EntrezGene 196549 (G) is cited and encodes two proteins (+). The protein accessions cited by EntrezGene are retired. Sequences are retrieved using eUtils (E). One matches the sequence cited in the interaction record (O). The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
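
Each ROG assignment score in Table 4 is a string of single-character feature codes, so a score such as PTUO+ can be read one character at a time. The following is a minimal sketch of such a decoder, not part of iRefIndex itself. The names `FEATURES`, `UNEXPLAINED` and `decode_score` are hypothetical, and the glosses are paraphrased from the example descriptions in this table rather than taken from an official definition. The code V occurs in SVGO+ and SVGL+ but is not explained in the table, so the sketch reports it as unexplained instead of guessing.

```python
# Feature codes as glossed by the example descriptions in Table 4.
# Assumption: these paraphrases are illustrative, not official definitions.
FEATURES = {
    "P": "primary reference cited in the interaction record",
    "S": "secondary reference cited in the interaction record",
    "D": "source database cited for the accession",
    "M": "accession cited in a modified form",
    "G": "EntrezGene identifier cited",
    "O": "candidate matches the original sequence in the interaction record",
    "+": "more than one candidate protein or accession",
    "U": "accession updated",
    "E": "resolved using eUtils",
    "T": "taxon id differs between protein reference and sequence record",
    "L": "candidate with the largest SEGUID arbitrarily chosen",
    "Q": "cited as a 'see also' reference",
    "N": "SEGUID and ROGID stored locally as a new entry",
}

# Marker for codes (such as "V") that Table 4 uses but does not explain.
UNEXPLAINED = "not explained in Table 4"


def decode_score(score):
    """Return a list of (code, gloss) pairs, one per character of the score."""
    return [(code, FEATURES.get(code, UNEXPLAINED)) for code in score]


if __name__ == "__main__":
    for code, gloss in decode_score("PTUO+"):
        print(f"{code}: {gloss}")
```

Keeping unknown codes in the output, rather than dropping them or raising an error, means a score is never silently truncated when it contains a feature the lookup does not cover.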