Flow chart of the multiple sequence comparison (MSC) method. In step (A), MSC uses BLAST with the indicated filters to define a group of sequences closely related to the query (the close sequence set, N). In step (B), the reference protein set consists of protein sequences identified by BLAST in step (A) without filters that are 1) from the Swiss-Prot database, 2) from the Protein Data Bank (PDB), and 3) linked to PubMed articles in NCBI. The reference protein sequences are then used as queries against a database consisting of the close sequence set, and the hits are filtered as indicated. If the filtered hit ratio is larger than 0.8, a score is assigned to the reference protein.
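
The decision at the end of step (B) can be sketched in a few lines of Python. This is only an illustrative sketch, not the MSC implementation: the caption does not define the filtered hit ratio, so the code assumes it is the number of hits that pass the filters divided by the size of the close sequence set N. The function and parameter names are hypothetical; only the 0.8 threshold comes from the caption.

```python
# Threshold stated in the caption; everything else here is an assumption.
RATIO_THRESHOLD = 0.8


def filtered_hit_ratio(n_filtered_hits: int, n_close_set: int) -> float:
    """Return the filtered hit ratio for one reference protein.

    ASSUMPTION: ratio = (hits passing the filters) / (size of close set N).
    The caption does not give the exact definition.
    """
    if n_close_set <= 0:
        # An empty close sequence set cannot produce any hits.
        return 0.0
    return n_filtered_hits / n_close_set


def is_scored(n_filtered_hits: int, n_close_set: int,
              threshold: float = RATIO_THRESHOLD) -> bool:
    """True if the reference protein would be assigned a score.

    The comparison is strict ("larger than 0.8"), as in the caption.
    """
    return filtered_hit_ratio(n_filtered_hits, n_close_set) > threshold
```

Under this assumed definition, a reference protein with 9 filtered hits against a close set of 10 sequences would be scored, while one with exactly 8 of 10 would not, because the comparison is strict.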