From: Infrastructure for the life sciences: design and implementation of the UniProt website
Data set | Description | References | Entries | Path | Formats |
---|---|---|---|---|---|
UniProtKB | Protein sequence and annotation data | UniRef, UniParc, Literature citations, Taxonomy, Keywords | 6.4 M | /uniprot/ | Plain text, FASTA, (GFF), XML, RDF |
UniRef | Clusters of proteins with similar sequences | UniProtKB, UniParc, Taxonomy | 12.3 M | /uniref/ | FASTA, XML, RDF |
UniParc | Protein sequence archive | UniProtKB, Taxonomy | 17.0 M | /uniparc/ | FASTA, XML, RDF |
Literature citations | Literature cited in UniProtKB (based on PubMed) | | 0.4 M | /citations/ | RDF |
Taxonomy | Taxonomy data (based on NCBI taxonomy) | | 0.5 M | /taxonomy/ | RDF, (Tab-delimited) |
Keywords | Keywords used in UniProtKB | | 1 K | /keywords/ | RDF, (OBO) |
Subcellular locations | Subcellular location terms used in UniProtKB | | 375 | /locations/ | RDF, (OBO) |
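
As a rough illustration of how the Path and Formats columns could be used together, the following Python sketch encodes the table as a lookup structure and composes a resource path for one entry. The `<path><identifier>.<extension>` pattern, the extension names, the helper `resource_path`, and the identifier `P12345` are assumptions made for illustration; the table itself lists only the paths and the format names. Formats shown in parentheses in the table are included without interpreting what the parentheses mean.

```python
from typing import Dict, Tuple

# Path and formats per data set, transcribed from the table above.
DATA_SETS: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "UniProtKB": ("/uniprot/", ("Plain text", "FASTA", "GFF", "XML", "RDF")),
    "UniRef": ("/uniref/", ("FASTA", "XML", "RDF")),
    "UniParc": ("/uniparc/", ("FASTA", "XML", "RDF")),
    "Literature citations": ("/citations/", ("RDF",)),
    "Taxonomy": ("/taxonomy/", ("RDF", "Tab-delimited")),
    "Keywords": ("/keywords/", ("RDF", "OBO")),
    "Subcellular locations": ("/locations/", ("RDF", "OBO")),
}

# ASSUMPTION: file extensions for each format; these are not given in the table.
EXTENSIONS: Dict[str, str] = {
    "Plain text": "txt",
    "FASTA": "fasta",
    "GFF": "gff",
    "XML": "xml",
    "RDF": "rdf",
    "Tab-delimited": "tab",
    "OBO": "obo",
}


def resource_path(data_set: str, identifier: str, fmt: str) -> str:
    """Build a hypothetical path for one entry of a data set in a given format.

    Raises KeyError for an unknown data set and ValueError if the table
    does not list the requested format for that data set.
    """
    path, formats = DATA_SETS[data_set]
    if fmt not in formats:
        raise ValueError(f"{data_set} is not listed with format {fmt!r}")
    # ASSUMPTION: entries are addressed as <path><identifier>.<extension>.
    return f"{path}{identifier}.{EXTENSIONS[fmt]}"


if __name__ == "__main__":
    # "P12345" is a placeholder identifier used only for this example.
    print(resource_path("UniProtKB", "P12345", "XML"))
```

Keeping the table as data rather than hard-coding paths means an unsupported combination, such as plain text for UniRef, is rejected before any request is made.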