From: Accessing the SEED Genome Databases via Web Services API: Tools for Programmers
Method Name | Parameters & Order | Description |
---|---|---|
abstract_coupled_to | peg | Get the pegs that may be coupled to this peg through abstract coupling. Input is a peg, output is list of [protein, score] for things that are coupled to this peg |
Adjacent | pegs | Retrieve the set of pegs in order along the chromosome. Input is a comma separated list of pegs, and output is the pegs in order along the genome. |
alias2fig | alias | Get the FIG ID(s) (peg) for a given external identifier. Input is an identifier used by another database, output is a list of our identifiers. Note that an alias can refer to more than one protein since the mapping is done via protein sequence. |
aliases_of | peg | Get the aliases of a peg. These are the identifiers that other databases use. Input is a peg, output is an array of aliases |
ali_to_seq | alias | Retrieve the protein sequence for a given identifier. Input is an alias, output is a sequence |
all_families | | Get all the FIG protein families (FIGfams). No input needed, it just returns a list of all families |
all_families_with_funcs | | Get all the FIG protein families (FIGfams) with their assigned functions. No input needed, it just returns a list of all the families and their functions. |
all_genomes | complete, restrictions, domain | Get a set of genomes. The inputs are a series of constraints - whether the sequence is complete, other restrictions, and a domain of life (Bacteria, Archaea, Eukarya, Viral, Environmental Genome). Output is a list of genome ids. An example use is with the parameters ("complete", undef, "Bacteria") that will return all complete bacterial genomes. |
all_subsystem_classifications | | Get a list of all the subsystems and their classifications. No input needed, it just returns a list of all the subsystems and their classifications |
boundaries_of | locations | Get the boundaries of a feature location. A feature can have multiple locations on a contig (e.g. split locations, introns, etc). This just returns an array of [contig, beginning, end]. You can pass it the output from feature_location directly |
CDS_data | families | Get all the pegs in some FIGfams, their functions, and aliases. Input is a tab-separated list of pegs, returns a 3-column comma separated table [peg, Function, Aliases] |
CDS_sequences | families | Get the protein sequences for a list of proteins. Input is a tab-separated list of pegs, returns a 2-column comma separated table of [peg, sequence] |
cluster_by_bbhs | peg | Get the clusters for a peg by bidirectional best hits. Input is a peg, output is a two-column table of [peg, cluster] |
cluster_by_sim | peg | Get the clusters for a peg by similarity. Input is a peg, output is a two-column table of [peg, cluster] |
contigs_of | genomeid | Get a comma-separated list of all the contigs in a genome |
contig_ln | genomeid, contig | Get the length of the DNA sequence in a contig in a genome. Input is a genome id and a contig name, return is the length of the contig |
coupled_to | peg | Get the pegs that are coupled to any given peg. Input is a peg, output is list of [protein, score] for things that are coupled to this peg |
dna_seq | genomeid, location1 | Get the DNA sequence for a region in a genome. Input is a genome ID and a location in the form contig_start_stop, output is the DNA sequence in fasta format. |
ec_name | EC_number | Get the name for a given E.C. number. Input is an EC number, output is the name |
external_calls | peg | Get the annotations for a peg from all other known sources. Input is a peg, output is two column table of [peg, other function] |
feature_location | peg | Get the location of a peg on its contig. Input is a peg, output is a list of locations on contigs. Usually this will be a single location, but sometimes it can be more than one region on a contig, or even regions on multiple contigs. For convenience it is a comma-joined list; often you will want to pass that to boundaries_of |
fid2dna | peg | Get the DNA sequence for a given protein identifier. Input is a peg, output is the DNA sequence in fasta format. |
fids2dna | peg | Get the DNA sequence for a set of protein identifiers. Input is a comma-joined list of pegs, output is the DNA sequence in fasta format. |
function_of | peg | Get the functional annotation of a given protein identifier. Input is a peg, output is a function |
Genomes | complete, restrictions, domain | Get a set of genomes. The inputs are a series of constraints - whether the sequence is complete, other restrictions, and a domain of life (Bacteria, Archaea, Eukarya, Viral, Environmental Genome). Output is a list of genome ids with the genus species appended. An example use is with the parameters ("complete", undef, "Bacteria") that will return all complete bacterial genomes. |
genomes_of | peg | Get the genome(s) that a given protein identifier refers to. Input is a peg, output is a single column table of genomes |
genus_species | genomeid | Get the genus and species of a genome identifier. Input is a genome ID, output is the genus and species of the genome |
get_corresponding_ids | peg | Get the corresponding ids of a peg. These are the identifiers that other databases use. Input is a peg, output is an array of aliases |
get_dna_seq | featureid | Retrieve the DNA sequence for a particular feature. Note that this will take a feature id (peg, rna, etc), and return the DNA sequence for that id. There is also a separate method to get the DNA sequence for an arbitrary location on a genome |
get_translation | peg | Get the translation (protein sequence) of a peg. Input is a peg, output is the translation. (Note that this is a synonym of translation_of). |
is_archaeal | genomeid | Test whether an organism is Archaeal. Input is a genome identifier, and output is true or false (or 1 or 0) |
is_bacterial | genomeid | Test whether an organism is Bacterial. Input is a genome identifier, and output is true or false (or 1 or 0) |
is_eukaryotic | genomeid | Test whether an organism is Eukaryotic. Input is a genome identifier, and output is true or false (or 1 or 0) |
is_member_of | sequences | Tries to put a protein sequence in a family. Input is a tab-separated id and sequence, delimited by new lines. The output is a comma-separated 2-column table [your sequence id, FamilyID] if the sequence is placed in a family. |
is_prokaryotic | genomeid | Test whether an organism is a Prokaryote. Input is a genome identifier, and output is true or false (or 1 or 0) |
list_members | families | Get all the pegs in some FIGfams. The input is a tab-separated list of family IDs, and the output is a two column table of [family id, peg] |
pegs_of | genomeid | Get all the protein identifiers associated with a genome. Input is a genome id, output is a list of pegs in that genome |
pegs_with_md5 | md5 | Get the FIG IDs associated with the MD5 sum of a protein sequence. Input is the md5 checksum, output is an array of strings of FIG ids. This should be faster, and more complete, than using aliases or other ways to match protein sequences. |
pegs_with_md5_string | md5 | Get the FIG IDs associated with the MD5 sum of a protein sequence. Input is the md5 checksum, output is a comma separated list of FIG ids as a single string. This should be faster, and more complete, than using aliases or other ways to match protein sequences. |
pinned_region_data | peg_id, n_pch_pins, n_sims, sim_cutoff, color_sim_cutoff, sort_by | Input is a FIG (peg) ID and ..., output is the pinned regions data |
reaction_to_role | Reaction_number, genomeid | Get a tab-separated list of [subsystem name, functional role, peg, subsystem variant code for that genome] for any given reaction id and genome id. Maps the reaction id to peg, peg to genome, and genome to variant code |
replaces | genomeid | If this genome replaces another one (it is a more up-to-date version), what is the ID of the older genome? |
Rnas_of | genomeid | Get all the RNA identifiers associated with a genome. Input is a genome ID, and output is a list (an array) of the RNAs in that genome |
search_and_grep | pattern1, pattern2 | Search and grep through the database. Input is two patterns, first one is used in search_index, second used to grep the results to restrict to a smaller set. Output is an array of hashes with keys id, organism, otherIds, functionalAssignment, and annotator. |
Simple_search | pattern | Search the database. Input is a pattern to search for, output is list of pegs and roles |
Sims | peg, maxN, maxP | Retrieve the sims (precomputed BLAST hits) for a given protein sequence. Input is a peg, an optional maximum number of hits (default = 50), and an optional maximum E value (default = 1e-5). The output is a list of sims in modified tab separated (-m 8) format. Additional columns include length of query and database sequences, and method used. |
taxonomy_of | genomeid | Returns the taxonomy of a given genomeid |
translation_of | peg | Get the translation (protein sequence) of a peg. Input is a peg, output is the protein sequence. (Note that this is a synonym of get_translation). |
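
Several methods in the table exchange data as delimited strings: feature_location returns a comma-joined list of locations, dna_seq takes a location in the form contig_start_stop, is_member_of expects tab-separated id and sequence pairs delimited by new lines, and CDS_sequences returns a 2-column comma separated table. The following is a minimal Python sketch of client-side helpers for those formats. It does not call the web service; the helper names are hypothetical, and it assumes that contig names may themselves contain underscores and that table rows are separated by new lines, neither of which is stated in the table.

```python
def parse_location(loc):
    """Split one 'contig_start_stop' string into (contig, start, stop).

    Assumption: the last two underscore-separated fields are the
    coordinates, so a contig name containing underscores stays intact.
    """
    contig, start, stop = loc.rsplit("_", 2)
    return contig, int(start), int(stop)


def parse_feature_location(joined):
    """Split a comma-joined location list (as described for
    feature_location) into a list of (contig, start, stop) tuples."""
    return [parse_location(part) for part in joined.split(",") if part]


def format_is_member_of_input(sequences):
    """Build the input described for is_member_of: one line per
    sequence, with id and sequence separated by a tab."""
    return "\n".join(f"{seq_id}\t{seq}" for seq_id, seq in sequences.items())


def parse_two_column_table(text):
    """Parse a 2-column comma separated table such as [peg, sequence].

    Assumption: one row per line. Only the first comma is treated as
    the column separator.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        left, right = line.split(",", 1)
        rows.append((left, right))
    return rows


if __name__ == "__main__":
    # Illustrative values only; these are not results from the service.
    print(parse_feature_location("contig1_10_50,contig1_80_120"))
    print(repr(format_is_member_of_input({"query1": "MKT", "query2": "MAL"})))
    print(parse_two_column_table("fig|83333.1.peg.1,MKT\nfig|83333.1.peg.2,MAL\n"))
```

Splitting from the right keeps the coordinates separable even when the contig name is something like NC_000913; if the service ever returns locations in another form, parse_location is the only function that would need to change.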