Figure 1 | BMC Bioinformatics

Figure 1

From: ProFAT: a web-based tool for the functional annotation of protein sequences

Workflow of a ProFAT analysis. (A) A ProFAT analysis requires a protein sequence and a keyword list as inputs. ProFAT first runs a domain search (RPS-BLAST) against the NCBI CDD database. If RPS-BLAST detects no conserved domain, the user can proceed to domain prediction (A, right figure), which combines an RPS-BLAST search using relaxed parameters with a BLAST search and subsequent text-mining for the biological relevance of the identified hits. Alternatively, the user can choose to split the sequence into fragments of between 150 and 300 amino acids for further processing. Selected conserved domains and/or regions of the input query can then be submitted to the Annotation Engine and/or the Threading Engine. The Annotation Engine combines a PSI-BLAST search with text-mining of the Gene Ontology annotations, features and PubMed abstracts associated with the identified hits, thereby extracting hits involved in the process or function described by the user's keyword list. The Threading Engine combines a Threader 3.5 run with text-mining of the PDB keywords, features, compound information and PubMed abstracts associated with the identified structures, which are then post-filtered using the user-provided keyword list.

(B) HMMerThread pipeline. HMMerThread combines an HMMer search against the PFAM database of conserved domains with a Threader run. The input query is first sent to an HMMer search, and only domains with an associated 3D structure are chosen for further processing. The selected domains are then sent to Threader 3.5, preceded by secondary structure prediction (PSI-PRED), coiled-coil prediction (COILS2) and low-complexity filtering (SEG), all of which are performed on the entire input sequence to achieve higher accuracy. HMMerThread can therefore give a highly accurate prediction of conserved domains.
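The caption gives only the size range for the fragment-splitting option in panel A, not the method ProFAT uses to choose fragment boundaries. The Python sketch below is one possible illustration, not ProFAT's implementation. It assumes non-overlapping fragments of near-equal length. The function name `split_into_fragments` and its parameters are hypothetical.

```python
from math import ceil


def split_into_fragments(sequence, min_len=150, max_len=300):
    """Split a protein sequence into contiguous, non-overlapping fragments.

    Assumption (not from the ProFAT paper): fragments are made as equal in
    length as possible. Requires 2 * min_len <= max_len, which guarantees
    that every fragment of a sequence longer than max_len lies in
    [min_len, max_len]. Sequences of max_len residues or fewer are
    returned whole, even if they are shorter than min_len.
    """
    if min_len < 1 or 2 * min_len > max_len:
        raise ValueError("need 1 <= min_len and 2 * min_len <= max_len")

    total = len(sequence)
    if total <= max_len:
        return [(0, total, sequence)]

    # Fewest fragments that keep every fragment at or below max_len.
    n_fragments = ceil(total / max_len)
    # Spread the residues evenly; the first `extra` fragments get one more.
    base, extra = divmod(total, n_fragments)

    fragments = []
    start = 0
    for i in range(n_fragments):
        end = start + base + (1 if i < extra else 0)
        # Store 0-based, half-open coordinates alongside the fragment.
        fragments.append((start, end, sequence[start:end]))
        start = end
    return fragments


if __name__ == "__main__":
    query = "A" * 700  # placeholder sequence, for illustration only
    for start, end, fragment in split_into_fragments(query):
        print(start, end, len(fragment))
```

For the 700-residue placeholder, this yields three fragments of 234, 233 and 233 residues. Near-equal lengths are a design choice made here so that no short tail fragment is left over. A real pipeline might instead use overlapping windows or cut at predicted domain boundaries.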
