- Methodology article
- Open Access
Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach
© Truong and Ikura; licensee BioMed Central Ltd. 2002
- Received: 17 August 2001
- Accepted: 10 January 2002
- Published: 10 January 2002
Most profile and motif databases strive to classify protein sequences into a broad spectrum of protein families. The next step of such database studies should include the development of classification systems capable of distinguishing between subfamilies within a structurally and functionally diverse superfamily. This would be helpful in elucidating sequence-structure-function relationships of proteins.
Here, we present a method to assign sequences to subfamilies by employing hidden Markov models (HMMs) to find windows of residues that are distinct among subfamilies (called signatures). The method starts with a multiple sequence alignment (MSA) of the subfamily. Then, we build an HMM database representing all sliding windows of a fixed size across the MSA. Finally, we construct an HMM histogram of the matches of each sliding window in the entire superfamily. To illustrate the efficacy of the method, we have applied the analysis to find subfamily signatures in two well-studied superfamilies: the cadherin and the EF-hand protein superfamilies. As a corollary, the HMM histograms of the analyzed subfamilies revealed information about their Ca2+ binding sites and loops.
The method can be used to create HMM databases for diagnosing subfamilies of protein superfamilies; these databases complement broad profile and motif databases such as BLOCKS, PROSITE, Pfam, SMART, PRINTS and InterPro.
- Multiple Sequence Alignment
- Protein Superfamily
- Solve Crystal Structure
- Cadherin Superfamily
- Motif Database
The biological function of a protein can often be inferred from its similarity to sequences of known function in sequence databases using single-sequence similarity algorithms such as BLAST or FASTA. Such algorithms are suitable for detecting highly similar sequences, but are not sensitive enough to capture highly divergent sequences. Therefore, many members of an evolutionarily diverse family of proteins may be overlooked. Within the last decade, the sensitivity of sequence searching techniques has been improved by profile- or motif-based analysis, which uses information derived from MSAs to construct and search for sequence patterns [3–6]. Unlike single-sequence similarity, a profile or motif can exploit additional information, such as the position and identity of residues that are conserved throughout the family, as well as variable insertion and deletion probabilities.
Currently, the most widely used profile and motif databases are: BLOCKS, which stores ungapped MSAs corresponding to the most conserved regions of protein families; PROSITE, which uses single consensus patterns and profiles to characterize each family of sequences; Pfam and SMART, which use profile hidden Markov models (HMMs) to find commonly occurring protein domains; and PRINTS-S, which is similar to both PROSITE and BLOCKS, except that it uses "fingerprints" composed of more than one pattern to characterize a protein family. Recently, a new profile and motif database, InterPro [10, 11], an amalgamation of PROSITE, ProDom, Pfam and the PRINTS fingerprint database, was used in the automatic annotation of complete proteomes, including those of fly and human.
The purpose of subfamily partitioning is to create an MSA of each subfamily; however, if high-quality MSAs of the subfamilies already exist, the analysis can commence at that point, as is done for PRINTS-S. This section outlines a simple procedure for partitioning, although other methods exist that may be preferable [17–21]. Many methods, like the one described herein, use a tree clustering approach based on sequence distance or identity.
The members of a protein family can be identified by collecting the sequences that match profile or motif databases such as the ones described in the Background. This initial set of sequences is designated the superfamily database, and the total number of sequences in it is denoted n_T. How a protein subfamily is selected and delimited depends on the researcher who defines it. Subfamilies can be partitioned based on sequence or function; while function-based methods are valid, sequence-based methods can be automated.
To divide the sequences into subfamilies, construct a square similarity matrix, S, of dimensions n_T by n_T, where S_{i,j} is the percent similarity between sequence i and sequence j. The alignment between a pair of sequences is determined in CLUSTALW by performing a global alignment with a gap opening penalty of 10, a gap extension penalty of 0.1 and a Gonnet scoring matrix [23, 24]. The percent similarity is estimated by dividing the alignment score by the maximum alignment score obtained when each of the two sequences is aligned to itself.
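The normalization step can be sketched in a few lines of Python. This is a minimal illustration, not the in-house code used in the study: it assumes the raw pairwise alignment scores have already been computed (for example with CLUSTALW) and stored in a square list of lists, and it reads "maximum alignment score" as the larger of the two self-alignment scores. The function name `percent_similarity` and the toy scores are hypothetical.

```python
def percent_similarity(raw):
    """Convert raw pairwise alignment scores into percent similarities.

    raw[i][j] is the alignment score of sequence i against sequence j;
    raw[i][i] is the score of sequence i aligned to itself.
    Assumption: the normalizer is the larger of the two self-alignment scores.
    """
    n = len(raw)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = max(raw[i][i], raw[j][j])
            sim[i][j] = 100.0 * raw[i][j] / denom
    return sim


# Toy scores for three sequences (illustrative values only).
raw_scores = [
    [200.0, 150.0, 40.0],
    [150.0, 180.0, 36.0],
    [40.0, 36.0, 160.0],
]
S = percent_similarity(raw_scores)
print(S[0][1], S[0][2], S[1][2])
```

With this normalization the diagonal is always 100 and the matrix stays symmetric as long as the raw scores are symmetric.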
The similarity matrix is used to build a tree with the UPGMA (unweighted pair group method using arithmetic averages) clustering algorithm, so that sequences can be partitioned by sequence similarity. Sjolander pointed out that, at this point, any partition of the tree may be meaningful. Indeed, no partitioning criterion is objectively better than another. In the end, the biologist must decide on the most appropriate partitioning criterion given their experience with the protein superfamily. Therefore, the introduction of complementary methods may be important for consistent and reproducible analysis.
Our aim is to achieve a high-quality MSA of each subfamily. A benchmark of the quality of an MSA is how well it reflects the structural alignment. Comparative homology modeling allows us to predict the three-dimensional structure of a target protein based on its alignment to one or more proteins with a known template structure. It has been observed that as the sequence identity between the target sequence and the template increases, the average structural similarity between the template and the target also increases, and for closely related protein sequences with identity over 40%, the alignment is almost always correct. Therefore, if a similarity threshold greater than 40% is used for partitioning, the resulting MSAs should be of reasonably high quality and well correlated with the structure. Since Dayhoff used 60% identity as the threshold for a subfamily, we adopt, as a slight modification, a universal similarity threshold of 60%. This strict threshold may create multiple partitions of the same subfamily; however, careful inspection of the sequence descriptions hints at which partitions can be joined.
Let n_S be the number of subfamilies and n_i be the number of sequences in the ith subfamily. The number of sequences that cannot be partitioned, n_H, is then given by:
n_H = n_T - \sum_{i=1}^{n_S} n_i
These sequences are less than 60% similar to each other and to sequences in any subfamily. Note that n_H will never be zero due to the intermediate nodes of the initial tree. Also note that n_H will increase as the similarity of the sequences in the superfamily decreases.
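The partitioning and the count of unpartitioned sequences can be illustrated with a small average-linkage clustering sketch. It is written under stated assumptions rather than as the procedure actually run in the study: clusters are merged greedily while their average pairwise similarity stays at or above the threshold (which amounts to cutting a UPGMA tree at that level), and any sequence left in a cluster of size one is treated as unpartitioned. The function name `partition_by_similarity` and the toy matrix are hypothetical.

```python
def partition_by_similarity(sim, threshold=60.0):
    """Average-linkage (UPGMA-style) clustering cut at a similarity threshold.

    sim is a symmetric matrix of percent similarities.
    Returns (subfamilies, unpartitioned): clusters with at least two
    sequences, and the indices of sequences left on their own.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg_sim(a, b):
        # Arithmetic mean of all pairwise similarities between two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        if best[0] < threshold:
            break  # no remaining pair of clusters is similar enough to merge
        _, x, y = best
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)

    subfamilies = [sorted(c) for c in clusters if len(c) > 1]
    unpartitioned = sorted(c[0] for c in clusters if len(c) == 1)
    return subfamilies, unpartitioned


# Toy similarity matrix for six sequences (illustrative values only).
sim = [
    [100, 80, 75, 20, 10, 5],
    [80, 100, 70, 25, 15, 8],
    [75, 70, 100, 30, 12, 6],
    [20, 25, 30, 100, 65, 9],
    [10, 15, 12, 65, 100, 7],
    [5, 8, 6, 9, 7, 100],
]
subfamilies, unpartitioned = partition_by_similarity(sim, threshold=60.0)
n_T = len(sim)
n_H = n_T - sum(len(sub) for sub in subfamilies)
print(subfamilies, unpartitioned, n_H)
```

In this toy case two subfamilies are recovered and one sequence remains unpartitioned, so n_H computed from the subfamily sizes agrees with the number of leftover sequences.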
Creating an HMM histogram for one subfamily
The creation of an HMM histogram for a subfamily commences with an MSA, which can be acquired from manual or automatic sequence alignment of the sequences in each subfamily. If another method was used for partitioning subfamilies, it is necessary to check if the automatically generated MSAs are correct; however, using the outlined partitioning procedure, an automatic MSA method such as CLUSTALW should produce a structurally correlated MSA, since the sequences in the subfamilies have a greater than 40% sequence identity.
Sliding MSA windows of width w are created. Let a_i be the width of the MSA of the ith subfamily; then the number of MSA windows for the ith subfamily, b_i, is:
b_i = a_i - w + 1
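As a small illustration of the windowing step, the sketch below slices an alignment into every contiguous block of w columns. It assumes the MSA is held as a list of equal-length aligned strings with "-" for gaps; the function name `sliding_windows` and the toy alignment are hypothetical. A window of width w can start at a - w + 1 positions of an alignment of width a, which is what the loop bound reflects.

```python
def sliding_windows(msa, w):
    """Return every width-w column window of an alignment.

    msa is a list of aligned sequences of equal length (gaps as '-').
    Window x (0-based) holds columns x .. x + w - 1 of every sequence.
    """
    width = len(msa[0])
    if any(len(row) != width for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    return [[row[x:x + w] for row in msa] for x in range(width - w + 1)]


# Toy alignment of width 6 (illustrative only).
msa = ["ACDEFG", "ACDQFG", "A-DEFG"]
windows = sliding_windows(msa, w=4)
print(len(windows))   # 6 - 4 + 1 windows
print(windows[0])
print(windows[-1])
```

Each element of `windows` is itself a small alignment and could be written to a file and passed to an HMM-building tool.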
An HMM is created for each sliding MSA window of the subfamily with the HMMER software package. The HMM database of the subfamily is created by concatenating all of these individual HMMs and is calibrated with a sample size of 10000 sequences. The superfamily sequence database is then searched with the HMM database, and an HMM histogram is constructed from the number of matches of each window. Let the HMM histogram of the ith subfamily be represented by f_i(x), where x is the starting position of the window.
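Once the search results are available, building the histogram is a counting exercise. The sketch below assumes the hits have already been parsed into tuples of (window start, sequence identifier, e-value); this record layout, the function name `hmm_histogram` and the toy hits are hypothetical and do not correspond to any HMMER output format. Each sequence is counted at most once per window.

```python
def hmm_histogram(hits, num_windows, evalue_threshold=100.0):
    """Count, for each window, the distinct sequences matched at or below
    the e-value threshold.

    hits is an iterable of (window_start, seq_id, evalue) tuples with
    0-based window starts (an assumed, pre-parsed representation).
    """
    matched = [set() for _ in range(num_windows)]
    for window_start, seq_id, evalue in hits:
        if evalue <= evalue_threshold:
            matched[window_start].add(seq_id)
    return [len(ids) for ids in matched]


# Toy hits for three windows (illustrative values only).
hits = [
    (0, "seqA", 1e-5),
    (0, "seqB", 50.0),
    (0, "seqB", 2.0),    # a second hit of seqB in the same window
    (1, "seqA", 500.0),  # above the tolerant threshold of 100
    (2, "seqC", 0.01),
]
f_loose = hmm_histogram(hits, num_windows=3, evalue_threshold=100.0)
f_strict = hmm_histogram(hits, num_windows=3, evalue_threshold=0.1)
print(f_loose, f_strict)
```

Running the same hits through two thresholds shows how the choice of e-value cut-off changes the histogram.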
Using HMM histograms to find subfamily signatures
Finding signatures involves discovering MSA windows that can distinguish this subfamily from all other subfamilies. A particular MSA window can fall into one of three categories: divergent window (a window that is not shared by the subfamily), superfamily window (shared by the superfamily), or subfamily window (shared by the subfamily). Divergent windows can be easily identified from an MSA by a stretch of positions that do not align well; however, superfamily and subfamily windows cannot be separated because they will both align well.
In an HMM histogram, however, the three categories can be separated by comparing the number of matches, f_i(x), with the number of sequences in the subfamily MSA, n_i: subfamily windows have f_i(x) = n_i; superfamily windows, f_i(x) > n_i; and divergent windows, f_i(x) < n_i. Since the HMM histogram sweeps across the MSA with a window size of w, a subfamily signature longer than w positions will be identified by consecutive subfamily windows.
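The classification rule and the search for consecutive subfamily windows can be sketched as follows. This is a simplified illustration: it compares only the match count with n_i, as in the rule above, and does not check which sequences were matched. The function names and the toy histogram are hypothetical.

```python
def classify_window(matches, n_i):
    """Label a window by comparing its match count with the subfamily size."""
    if matches == n_i:
        return "subfamily"
    if matches > n_i:
        return "superfamily"
    return "divergent"


def signature_runs(f, n_i):
    """Return (first, last) window-start indices (0-based, inclusive) of
    every run of consecutive subfamily windows in histogram f."""
    runs = []
    start = None
    for x, matches in enumerate(f):
        if matches == n_i:
            if start is None:
                start = x
        elif start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(f) - 1))
    return runs


# Toy histogram for a subfamily of five sequences (illustrative only).
f = [12, 12, 5, 5, 5, 3, 5, 5, 30]
labels = [classify_window(m, 5) for m in f]
runs = signature_runs(f, 5)
print(labels)
print(runs)
```

A run from window x1 to window x2 with window width w covers alignment columns x1 through x2 + w - 1, which gives the extent of the candidate signature.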
To define an HMM match, HMMER returns both a score and an e-value. The score is the base-two logarithm of the ratio of the probability that the query sequence is generated by the HMM to the probability that it is generated by a random model. The e-value is the expected number of sequences that would achieve a score greater than or equal to the returned HMM score by chance. Decreasing the e-value threshold makes the reported matches more likely to be true positives, whereas increasing it reduces the number of true matches that are missed, at the cost of admitting more false positives. For finding subfamily signatures, a tolerant e-value threshold of 100 is used, because windows that match only sequences in the subfamily even under such loose conditions are characteristic of the subfamily.
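In symbols, and assuming the usual log-odds form of a bit score, the score of a sequence s against a window model M relative to a random model R can be written as below. The symbols S, s, M and R are introduced here for illustration only.

```latex
\[
S(s) = \log_2 \frac{P(s \mid M)}{P(s \mid R)}
\]
```

A positive score therefore means that the window model explains the sequence better than the random model does.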
Using HMM histograms to visualize functional regions
In the previous section, to identify subfamily signatures, we focused on subfamily windows. However, superfamily windows also may provide insight into which regions in the subfamily share functional significance relative to the superfamily. Peaks in the HMM histogram can suggest which regions are particularly well conserved across the entire superfamily.
To extract this data, a few modifications to the method are needed. First, create an HMM histogram of the ith subfamily as previously described, but with an e-value threshold of 0.1. This stringent threshold is chosen because, for this purpose, it is important to favor true positives. Thus far, the HMM histograms presented are functions of the starting position of the window, f_i(x). While this is convenient for identifying subfamily signatures, HMM histograms as a function of the position in the alignment, g_i(x), are useful for assessing the contribution of individual positions.
The mapping from f_i(x) to g_i(x) is determined by adding a count of 1 to each position in the window whenever a match is found. With window width w and b_i windows, the mapping can be written as:
g_i(x) = \sum_{k=\max(1,\, x-w+1)}^{\min(x,\, b_i)} f_i(k)
Peaks in g_i(x) may hint at positions of functional importance.
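The conversion from a window-indexed histogram to a position-indexed one can be sketched as a simple accumulation: every match of a window adds one count to each of the w alignment positions that the window covers. The function name `position_histogram` and the toy values are hypothetical, and positions are 0-based here for convenience.

```python
def position_histogram(f, w):
    """Map a window-start histogram f onto alignment positions.

    f[x] is the number of matches of the window starting at position x
    (0-based). Each match contributes one count to positions x .. x + w - 1.
    """
    if not f:
        return []
    alignment_width = len(f) + w - 1
    g = [0] * alignment_width
    for start, matches in enumerate(f):
        for pos in range(start, start + w):
            g[pos] += matches
    return g


# Toy histogram with three windows of width 2 (illustrative only).
f = [2, 1, 3]
g = position_histogram(f, w=2)
print(g)
```

Because each match is spread over w positions, the counts in g sum to w times the total number of matches in f, which is a quick sanity check on any implementation.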
Analysis of the cadherin superfamily
Cadherins represent a large family of proteins with diverse functions, including cell-cell adhesion, morphogenesis, synapse formation, cell polarization, cell sorting, cell migration, and cell rearrangements. All members of the cadherin superfamily possess a cadherin repeat (CR). Using Pfam's HMM of the CR, 203 sequences that match the model with an e-value below 0.1 were retrieved from the SWISS-PROT sequence database (Release 39).
Tabulation of sequences in cadherin subfamilies
Number of sequences (n_i)
Kidney Specific Cadherin
Liver Intestine Cadherin
Tyrosine Receptor Kinase
Vascular Endothelial Cadherin
Unpartitioned (n_H)
Total (n_T)
Various biochemical and structural studies have suggested that Ca2+ binding occurs between all CRs. These Ca2+ binding linkers seem to play critical roles in the cell-adhesion function of cadherins, as they are directly involved in molecular assembly. The high peak at the linker between CR2 and CR3 in the HMM histogram (Fig. 5B) strongly suggests the functional importance of this domain linker. Interestingly, the two linkers between the last three CRs do not display an intense peak in the HMM histogram. These findings may suggest that the two N-terminal linkers are functionally more essential than the two C-terminal linkers. Further structural and mutagenesis studies are required to test this hypothesis derived from our sequence analysis.
Analysis of the EF-hand superfamily
Kretsinger and Nockolds discovered the EF-hand motif in the crystal structure of parvalbumin in 1973. The EF-hand motif has a characteristic helix-loop-helix structure, consisting of approximately 30 residues. Numerous proteins that interact with Ca2+ contain the EF-hand motif. The most prevalent classification of the EF-hand superfamily based on domain relations has been reported previously.
Tabulation of sequences in EF-hand subfamilies
Number of sequences (n_i)
Ca2+ Dependent Protein Kinase
Guanylyl Cyclase Activating Protein
Unpartitioned (n_H)
Total (n_T)
We developed a method to decipher signature regions of protein subfamilies, which can be used to build HMM databases for diagnosing subfamilies of large protein superfamilies. Using this method, we identified subfamily signatures and built HMM databases for two well-studied superfamilies of cadherins and EF-hand proteins. Additionally, peaks in the HMM histogram plots of subfamilies were found to coincide with functionally important regions (i.e. Ca2+ binding sites and loops). Future work should include the comparison between different subfamily partitioning techniques and also the creation of richly annotated databases for subfamilies of superfamilies for possible application in automated genomic annotation in conjunction with other motif and profile databases.
The studies were performed using a variety of tools and, whenever necessary, in-house programs were written to pre- and post-process data from the different applications. MSAs were generated using CLUSTALW and all HMMs were created using the HMMER package. Data were stored in the Oracle relational database management system, and Microsoft FoxPro was used as an ODBC (Open Database Connectivity) client for querying and joining tables from the database. Microsoft Excel was used for dynamic charting of data. Perl was used for shell scripting, text manipulation and pattern matching with regular expressions. HMMER, CLUSTALW, the Oracle database server (version 8) and the Perl scripts were executed on a machine with dual 750 MHz UltraSPARC-III processors and 4 GB of RAM running SunOS 5.8. Microsoft FoxPro and Excel were executed on a machine with a 500 MHz Intel Celeron processor and 128 MB of RAM running Windows 98.
The time required to analyze one superfamily depended largely on the computation platform, the number of sequences in the superfamily and the average width of the subfamily MSAs. Using the computation platforms described, the computation time to generate the MSA using CLUSTALW was ~3 hours for the cadherin superfamily (~200 sequences, ~800 average width) and ~9 hours for the EF-hand superfamily (~700 sequences, ~200 average width). The computation time for the creation of a calibrated HMM database (window size of 20) was ~6 hours for an average cadherin subfamily and ~45 minutes for an EF-hand subfamily. The execution time for searching the superfamily database with an average HMM database was ~12 hours for a cadherin subfamily and ~7 hours for an EF-hand subfamily. The computation time was extensive, but the analysis could easily be adapted to a parallel computing system.
The HMM database created for the cadherin and EF-hand superfamilies and all glue programs that were used for the analysis are available upon request.
We would like to thank Gil Prive for critical reading of the manuscript. This work was supported by grants to MI from the National Cancer Institute of Canada. MI is an HHMI (Howard Hughes Medical Institute) International Scholar and a CIHR (Canadian Institutes of Health Research) Scientist.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. doi:10.1093/nar/25.17.3389
- Pearson WR: Using the FASTA program to search protein and DNA sequence databases. Methods Mol Biol 1994, 25: 365–389. doi:10.1385/0-89603-276-0:365
- Hofmann K, Bucher P, Falquet L, Bairoch A: The PROSITE database, its status in 1999. Nucleic Acids Res 1999, 27: 215–219. doi:10.1093/nar/27.1.215
- Henikoff S, Henikoff JG, Pietrokovski S: Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 1999, 15: 471–479. doi:10.1093/bioinformatics/15.6.471
- Barton GJ: Protein multiple sequence alignment and flexible pattern matching. Methods Enzymol 1990, 183: 403–428. doi:10.1016/0076-6879(90)83027-7
- Eddy SR: Profile hidden Markov models. Bioinformatics 1998, 14: 755–763. doi:10.1093/bioinformatics/14.9.755
- Bateman A, Birney E, Durbin R, Eddy SR, Howe KL, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res 2000, 28: 263–266. doi:10.1093/nar/28.1.263
- Schultz J, Milpetz F, Bork P, Ponting CP: SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci USA 1998, 95: 5857–5864. doi:10.1073/pnas.95.11.5857
- Attwood TK, Croning MD, Flower DR, Lewis AP, Mabey JE, Scordis P, Selley JN, Wright W: PRINTS-S: the database formerly known as PRINTS. Nucleic Acids Res 2000, 28: 225–227. doi:10.1093/nar/28.1.225
- Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, Bucher P, Cerutti L, Corpet F, Croning MD, Durbin R, Falquet L, Fleischmann W, Gouzy J, Hermjakob H, Hulo N, Jonassen I, Kahn D, Kanapin A, Karavidopoulou Y, Lopez R, Marx B, Mulder NJ, Oinn TM, Pagni M, Servant F: The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res 2001, 29: 37–40. doi:10.1093/nar/29.1.37
- Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, Bucher P, Cerutti L, Corpet F, Croning MD, Durbin R, Falquet L, Fleischmann W, Gouzy J, Hermjakob H, Hulo N, Jonassen I, Kahn D, Kanapin A, Karavidopoulou Y, Lopez R, Marx B, Mulder NJ, Oinn TM, Pagni M, Servant F, Sigrist CJ, Zdobnov EM: InterPro–an integrated documentation resource for protein families, domains and functional sites. Bioinformatics 2000, 16: 1145–1150. doi:10.1093/bioinformatics/16.12.1145
- Corpet F, Servant F, Gouzy J, Kahn D: ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res 2000, 28: 267–269. doi:10.1093/nar/28.1.267
- Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, George RA, Lewis SE, Richards S, Ashburner M, Henderson SN, Sutton GG, Wortman JR, Yandell MD, Zhang Q, Chen LX, Brandon RC, Rogers YH, Blazej RG, Champe M, Pfeiffer BD, Wan KH, Doyle C, Baxter EG, Helt G, Nelson CR, Gabor GL, Abril JF, Agbayani A, An HJ, Andrews-Pfannkoch C, Baldwin D, Ballew RM, Basu A, Baxendale J, Bayraktaroglu L, Beasley EM, Beeson KY, Benos PV, Berman BP, Bhandari D, Bolshakov S, Borkova D, Botchan MR, Bouck J, Brokstein P, Brottier P, Burtis KC, Busam DA, Butler H, Cadieu E, Center A, Chandra I, Cherry JM, Cawley S, Dahlke C, Davenport LB, Davies P, de Pablos B, Delcher A, Deng Z, Mays AD, Dew I, Dietz SM, Dodson K, Doup LE, Downes M, Dugan-Rocha S, Dunkov BC, Dunn P, Durbin KJ, Evangelista CC, Ferraz C, Ferriera S, Fleischmann W, Fosler C, Gabrielian AE, Garg NS, Gelbart WM, Glasser K, Glodek A, Gong F, Gorrell JH, Gu Z, Guan P, Harris M, Harris NL, Harvey D, Heiman TJ, Hernandez JR, Houck J, Hostin D, Houston KA, Howland TJ, Wei MH, Ibegwam C, et al.: The genome sequence of Drosophila melanogaster. Science 2000, 287: 2185–2195. doi:10.1126/science.287.5461.2185
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, et al.: Initial sequencing and analysis of the human genome. Nature 2001, 409: 860–921. doi:10.1038/35057062
- Tepass U, Truong K, Godt D, Ikura M, Peifer M: Cadherins in embryonic and neural morphogenesis. Nature Review 2000, 1: 91–100. doi:10.1038/35040042
- Kawasaki H, Nakayama S, Kretsinger RH: Classification and evolution of EF-hand proteins. Biometals 1998, 11: 277–295. doi:10.1023/A:1009282307967
- Dayhoff MO: The origin and evolution of protein superfamilies. Fed Proc 1976, 35: 2132–2138.
- Lichtarge O, Bourne HR, Cohen FE: An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol 1996, 257: 342–358. doi:10.1006/jmbi.1996.0167
- Corpet F, Gouzy J, Kahn D: Browsing protein families via the 'Rich Family Description' format. Bioinformatics 1999, 15: 1020–1027. doi:10.1093/bioinformatics/15.12.1020
- Sjolander K: Phylogenetic inference in protein superfamilies: analysis of SH2 domains. Proc Int Conf Intell Syst Mol Biol 1998, 6: 165–174.
- Wicker N, Perrin GR, Thierry JC, Poch O: Secator: a program for inferring protein subfamilies from phylogenetic trees. Mol Biol Evol 2001, 18: 1435–1441.
- Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970, 48: 443–453.
- Benner SA, Cohen MA, Gonnet GH: Amino acid substitution during functionally constrained divergent evolution of protein sequences. Protein Eng 1994, 7: 1323–1332.
- Barker WC, Ketcham LK, Dayhoff MO: A comprehensive examination of protein sequences for evidence of internal gene duplication. J Mol Evol 1978, 10: 265–281.
- Prager EM, Wilson AC: Construction of phylogenetic trees for proteins and nucleic acids: empirical evaluation of alternative matrix methods. J Mol Evol 1978, 11: 129–142.
- Sanchez R, Sali A: Comparative protein structure modeling. Introduction and practical examples with modeller. Methods Mol Biol 2000, 143: 97–129.
- Marti-Renom MA, Stuart AC, Fiser A, Sanchez R, Melo F, Sali A: Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 2000, 29: 291–325. doi:10.1146/annurev.biophys.29.1.291
- Eddy SR: The HMMER Package. 1995.
- Nagar B, Overduin M, Ikura M, Rini JM: Structural basis of Ca2+-induced E-cadherin rigidification and dimerization. Nature 1996, 380: 360–364. doi:10.1038/380360a0
- Ozawa M, Engel J, Kemler R: Single amino acid substitutions in one Ca2+-binding site of uvomorulin abolish the adhesive function. Cell 1990, 63: 1033–1038. doi:10.1016/0092-8674(90)90506-A
- Alattia JR, Kurokawa H, Ikura M: Structural view of cadherin-mediated cell-cell adhesion. Cell Mol Life Sci 1999, 55: 359–367. doi:10.1007/s000180050297
- Kretsinger RH, Nockolds CE: Carp muscle Ca2+-binding protein. II. Structure determination and general description. J Biol Chem 1973, 248: 3313–3326.
- Lewit-Bentley A, Rety S: EF-hand Ca2+-binding proteins. Curr Opin Struct Biol 2000, 10: 637–643. doi:10.1016/S0959-440X(00)00142-1
- Akerfeldt KS, Coyne AN, Wilk RR, Thulin E, Linse S: Ca2+-binding stoichiometry of calbindin D28k as assessed by spectroscopic analyses of synthetic peptide fragments. Biochemistry 1996, 35: 3662–3669. doi:10.1021/bi9527956
- Kakalis LT, Kennedy M, Sikkink R, Rusnak F, Armitage IM: Characterization of the Ca2+-binding sites of calcineurin B. FEBS Lett 1995, 362: 55–58. doi:10.1016/0014-5793(95)00207-P
- Weber C, Lee VD, Chazin WJ, Huang B: High level expression in Escherichia coli and characterization of the EF-hand Ca2+-binding protein caltractin. J Biol Chem 1994, 269: 15795–15802.
- Lannergren J, Elzinga G, Stienen GJ: Force relaxation, labile heat and parvalbumin content of skeletal muscle fibres of Xenopus laevis. J Physiol 1993, 463: 123–140.
- Muntener M, Kaser L, Weber J, Berchtold MW: Increase of skeletal muscle relaxation speed by direct injection of parvalbumin cDNA. Proc Natl Acad Sci U S A 1995, 92: 6504–6508.
- Swain AL, Kretsinger RH, Amma EL: Restrained least squares refinement of native (Ca2+) and Cd-substituted carp parvalbumin using X-ray crystallographic data at 1.6-Å resolution. J Biol Chem 1989, 264: 16620–16628.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22: 4673–4680.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.