SODa: An Mn/Fe superoxide dismutase prediction and design server
© Kwasigroch et al; licensee BioMed Central Ltd. 2008
Received: 30 November 2007
Accepted: 02 June 2008
Published: 02 June 2008
Superoxide dismutases (SODs) are ubiquitous metalloenzymes that play an important role in the defense of aerobic organisms against oxidative stress, by converting reactive oxygen species into nontoxic molecules. We focus here on the SOD family that uses Fe or Mn as cofactor.
The SODa webtool http://babylone.ulb.ac.be/soda predicts if a target sequence corresponds to an Fe/Mn SOD. If so, it predicts the metal ion specificity (Fe, Mn or cambialistic) and the oligomerization mode (dimer or tetramer) of the target. In addition, SODa proposes a list of residue substitutions likely to improve the predicted preferences for the metal cofactor and oligomerization mode. The method is based on residue fingerprints, consisting of residues conserved in SOD sequences or typical of SOD subgroups, and of interaction fingerprints, containing residue pairs that are in contact in SOD structures.
SODa is shown to outperform and to be more discriminative than traditional techniques based on pairwise sequence alignments. Moreover, the fact that it proposes selected mutations makes it a valuable tool for rational protein design.
Normal cellular metabolism produces reactive oxygen species, whose accumulation is prevented by the action of SODs. These enzymes convert superoxide to hydrogen peroxide, which is then removed by glutathione peroxidase or catalase. However, overproduction of reactive oxygen species can occur in abnormal processes such as irradiation, aging and several diseases [1, 2]. In such cases, natural SODs may become insufficient to ensure detoxification. Therefore, a better understanding of how SODs function and the design of active SOD mimetics would be particularly important for treating the effects of oxidative damage or for using SODs as therapeutic targets.
There are several forms of SOD enzymes that are generally classified according to their metal cofactor, i.e. Cu/Zn, Ni, Fe and Mn ions. We focus here on Fe and Mn SODs, which are prevalent in bacteria and mitochondria. The large majority of these SODs specifically require either Fe or Mn to perform their biological activity; only some, called cambialistic, function with both types of ions.
Fe and Mn SODs have very similar sequences and structures, and it is often quite difficult to determine the metal specificity on the basis of their primary, secondary or even tertiary structures [7, 8]. Two groups of Fe/Mn SODs can moreover be defined on the basis of their oligomeric properties, as they form either homodimers or homotetramers in solution. This property too is quite difficult to detect on the basis of sequence and structure.
Consequently, developing a prediction method that makes it possible to tune the activity and specificity of SOD enzymes and to design specific SOD mutants is very challenging.
SODa is available through the SODa home page (see Section Availability and requirements). The main program is implemented in C; the web interface and the processing of the results are performed using Perl, Bourne shell scripts, and PHP. The output files in PDF format are created with the "HTML To PDF" PHP class.
The SODa prediction method relies on datasets of annotated and aligned sequences and structures, from which residue and interaction fingerprints are derived [10, 11]. These fingerprints form the basis of the SODa method: they are combined to predict if a target sequence is a SOD, its metal specificity and oligomerization mode.
Datasets of aligned sequences and structures
The sequence dataset encompasses 374 SOD sequences, for which an assignment of the metal cofactor and oligomeric state is established on the basis of the SwissProt annotation if available, and on literature resources otherwise. It contains 234 dimers (116 Fe-specific, 102 Mn-specific, and 16 cambialistic SODs) and 140 tetramers (42 Fe-specific, 94 Mn-specific, and 4 cambialistic SODs). This set was used to derive SOD- and SODtype-fingerprints.
In addition to this sequence dataset, a structure dataset was considered, containing 17 high-resolution X-ray structures, which were retrieved from the Protein Quaternary Structure server. A list of these structures can be found in . They were aligned using the SoFiSt algorithm onto the E. coli SOD structure with Protein Data Bank code 1isa; this protein was chosen as the representative SOD.
To obtain a global alignment of the 374 sequences of the learning set, each of them was aligned onto the 1isa sequence, using the CLUSTALX sequence alignment algorithm. This alignment was manually improved on the basis of the structure alignment of the 17 SODs from the structure dataset.
Derivation of SOD- and SODtype-fingerprints
Single residues and residue pair interactions that are conserved in all SOD enzymes (SOD-fingerprint) or are specific to a SOD type (SODtype-fingerprints) have been identified from a set of 374 aligned SOD sequences and 17 aligned structures, merging Fe- and Mn-specific SODs, dimers and tetramers (see above). Pair interactions are defined as pairs of residues whose side chain geometric centers are separated by at most 8 Å in one of the 17 aligned SOD structures. These fingerprints form the kernel of the SODa method.
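The contact criterion above can be illustrated with a minimal sketch. It assumes that the side chain atom coordinates are already available as plain tuples; the input layout, the function names and the toy coordinates are hypothetical and are not taken from the SODa implementation.

```python
import math

def centroid(coords):
    """Geometric center of a list of (x, y, z) atom coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def contact_pairs(side_chains, cutoff=8.0):
    """Return residue pairs whose side chain centroids lie within `cutoff` Å.

    `side_chains` maps a residue label to its side chain atom coordinates
    (hypothetical input layout, not the SODa file format).
    """
    centers = {res: centroid(atoms) for res, atoms in side_chains.items()}
    labels = sorted(centers)
    pairs = []
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if math.dist(centers[a], centers[b]) <= cutoff:
                pairs.append((a, b))
    return pairs

# Toy coordinates, invented for illustration only.
toy = {
    "His26": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "His73": [(5.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
    "Asp156": [(20.0, 0.0, 0.0), (22.0, 0.0, 0.0)],
}
print(contact_pairs(toy))  # [('His26', 'His73')]
```

In the method described here, a pair counts as an interaction if the criterion holds in at least one of the aligned structures, so a function of this kind would be applied to each structure and the resulting pairs merged.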
The SOD-fingerprint contains the residues and the residue pair interactions that are present in at least 80% of all 374 aligned SOD sequences, and is used to identify whether a target sequence is a SOD or not. Among these, the four residues that bind the metal cofactor (His26, His73, Asp156, His160, following the numbering of the E. coli SOD 1isa) are perfectly conserved, and so is Glu159, which makes a salt bridge with His160 across the dimer interface. The interactions in the SOD-fingerprint link the four central metal-bound residues to the residues situated in their immediate neighborhood, in the first shell around these central residues. Several of these interactions occur across the dimer interface, which corresponds to the main channel leading to the active site.
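As an illustration of the 80% conservation criterion, the following sketch scans the columns of a multiple alignment and keeps those in which a single residue reaches the threshold. It covers single residues only, and the function name and the toy alignment are assumptions made for the example.

```python
from collections import Counter

def conserved_positions(alignment, threshold=0.8):
    """Map column index -> residue for columns where one residue
    occurs in at least `threshold` of all aligned sequences."""
    n_seq = len(alignment)
    result = {}
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment)
        counts.pop("-", None)  # a gap never counts as a conserved residue
        if not counts:
            continue
        residue, count = counts.most_common(1)[0]
        if count / n_seq >= threshold:
            result[col] = residue
    return result

# Toy alignment of five equal-length sequences, invented for illustration.
toy_alignment = ["HAD-H", "HGDEH", "HADEH", "HSDEH", "HAEEH"]
print(conserved_positions(toy_alignment))  # {0: 'H', 2: 'D', 3: 'E', 4: 'H'}
```

Columns 0 and 4 are perfectly conserved, columns 2 and 3 reach exactly 80%, and column 1 (60% alanine) is rejected.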
The SODtype-fingerprints are used to predict what type of SOD the target sequence corresponds to, that is, whether it is an Fe dimer, Fe tetramer, Mn dimer or Mn tetramer. They contain residues and residue pair interactions that are typical of a SOD subgroup, i.e. which occur in at least 80% of the members of the subgroup and in less than 20% of the other SODs. In addition to the four basic subgroups (Fe dimer, Fe tetramer, Mn dimer, Mn tetramer), we also consider larger subgroups that are the union of several basic subgroups: Fe, Mn, dimers, tetramers, all but Fe dimers, all but Mn dimers, all but Fe tetramers, and all but Mn tetramers. As expected, the tetramer fingerprint involves residues in the region where the structure differs between dimers and tetramers. Note that dimer and tetramer fingerprints contain residues at different positions, whereas Mn- and Fe-specific fingerprints concern several identical positions occupied by different amino acids, which tune the preference towards Mn or Fe. Note also that the interaction fingerprints involve several π-π, cation-π, amino-π, H-bond and salt bridge interactions. The SOD- and SODtype-fingerprints are listed on the webpage (see Availability and requirements section).
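The subgroup criterion (at least 80% inside the subgroup, less than 20% outside) can be sketched as follows. A subgroup is passed as a set of basic type labels, so that unions such as "Mn" are handled in the same way as basic subgroups. The function name, the labels and the toy data are hypothetical, and only single residues are treated.

```python
def type_fingerprint(alignment, labels, subgroup, inside=0.8, outside=0.2):
    """Return the set of (column, residue) pairs typical of `subgroup`.

    `labels[i]` is the basic SOD type of `alignment[i]`;
    `subgroup` is a set of basic types (one type or a union of several).
    """
    members = [s for s, l in zip(alignment, labels) if l in subgroup]
    others = [s for s, l in zip(alignment, labels) if l not in subgroup]
    found = set()
    for col in range(len(alignment[0])):
        for residue in {s[col] for s in members} - {"-"}:
            freq_in = sum(s[col] == residue for s in members) / len(members)
            freq_out = (sum(s[col] == residue for s in others) / len(others)
                        if others else 0.0)
            if freq_in >= inside and freq_out < outside:
                found.add((col, residue))
    return found

# Toy labelled alignment, invented for illustration.
toy_alignment = ["HQA", "HQA", "HGT", "HGT", "HGA"]
toy_labels = ["Fe dimer", "Fe dimer", "Mn dimer", "Mn dimer", "Mn tetramer"]

print(type_fingerprint(toy_alignment, toy_labels, {"Fe dimer"}))
print(type_fingerprint(toy_alignment, toy_labels, {"Mn dimer", "Mn tetramer"}))
```

In this toy case the histidine in column 0 is rejected for every subgroup because it is conserved everywhere: such a residue would belong to the SOD-fingerprint rather than to a SODtype-fingerprint.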
Prediction of SOD and SODtype
The SOD-fingerprint is used to perform the first prediction, that is, to identify whether or not a target sequence is an Fe/Mn SOD. For that purpose, the target sequence is aligned to a hidden Markov profile built from the 17 sequences of our structure set using the HMMER program with the default parameters. On the basis of this alignment, the residues and interactions of the SOD-fingerprint that are conserved in the target are identified; note that an interaction is assumed to be present if the two residues forming the interaction occur in the sequence. The target is predicted to be a SOD if it contains all the perfectly conserved residues and interactions, and at least 40% of the others.
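The decision rule can be written down compactly. The sketch below assumes that the alignment step has already produced the set of fingerprint residues found in the target. The five perfectly conserved residues are those named in the text; the entries of `OTHERS` are hypothetical placeholders, since the full fingerprint is not reproduced here.

```python
# Perfectly conserved residues named in the text (E. coli 1isa numbering).
STRICT = {"His26", "His73", "Asp156", "Glu159", "His160"}
# Hypothetical placeholders for the remaining fingerprint features;
# a tuple stands for a residue pair interaction.
OTHERS = {"res_a", "res_b", "res_c", "res_d", ("res_a", "res_b")}

def observed_features(residues_in_target, features):
    """A residue feature is present if the residue is found; an
    interaction is present if both of its residues are found."""
    seen = set()
    for feature in features:
        partners = feature if isinstance(feature, tuple) else (feature,)
        if all(r in residues_in_target for r in partners):
            seen.add(feature)
    return seen

def predict_sod(residues_in_target, strict=STRICT, others=OTHERS,
                min_fraction=0.4):
    """True if all strict features and >= min_fraction of the others occur."""
    if observed_features(residues_in_target, strict) != set(strict):
        return False
    if not others:
        return True
    hits = observed_features(residues_in_target, others)
    return len(hits) / len(others) >= min_fraction

print(predict_sod(STRICT | {"res_a", "res_b"}))              # 3 of 5 others
print(predict_sod((STRICT - {"His26"}) | {"res_a", "res_b"}))  # ligand missing
```

The first call satisfies both conditions; the second fails because one metal-binding residue is absent, regardless of how many other features match.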
If the target sequence is predicted to be a SOD, the program proceeds to the second prediction level, which consists of predicting whether it is an Fe dimer, Fe tetramer, Mn dimer or Mn tetramer. It uses the SODtype-fingerprints for that purpose. The target is assigned to the basic subgroup presenting the highest weight, evaluated as follows. If a residue or interaction specific to one of the four basic subgroups (e.g. Mn dimer) occurs in the target, a weight of 1 is added to the subgroup; if it is specific to the union of two basic subgroups (e.g. Mn), a weight of 1/2 is added to the two basic subgroups involved (in this case Mn dimer and Mn tetramer); if it is specific to the union of three basic subgroups (e.g. non Fe tetramer), a weight of 1/3 is added to the three basic subgroups involved (in this case, Fe dimer, Mn dimer and Mn tetramer). All the weights are normalized through division by the maximum possible weight of the subgroup, and expressed in percent. The subgroup with the highest normalized weight w_max is the predicted one. If the next highest weight is larger than w_max – 20%, the corresponding subgroup is predicted too. Thus, cambialistic SODs and oligomeric modes that are not well defined can be predicted by the SODa server.
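The weighting scheme lends itself to a short worked example. In the sketch below each fingerprint feature carries the set of basic subgroups it is specific to, and contributes 1/n to each of them, where n is the size of that set. The feature names are hypothetical, and the 20% margin is interpreted as 20 percentage points on the normalized weights, which is an assumption about the wording of the text.

```python
BASIC = ("Fe dimer", "Fe tetramer", "Mn dimer", "Mn tetramer")

def subgroup_scores(fingerprints, observed):
    """Normalized weight (in percent) of each basic subgroup.

    `fingerprints` is a list of (feature, set_of_basic_subgroups);
    `observed` is the set of features found in the target.
    """
    got = dict.fromkeys(BASIC, 0.0)
    best = dict.fromkeys(BASIC, 0.0)  # maximum possible weight
    for feature, groups in fingerprints:
        share = 1.0 / len(groups)     # 1, 1/2 or 1/3
        for g in groups:
            best[g] += share
            if feature in observed:
                got[g] += share
    return {g: 100.0 * got[g] / best[g] if best[g] else 0.0 for g in BASIC}

def predict_types(scores, margin=20.0):
    """Top subgroup, plus the runner-up if it lies within `margin` points."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, w_max), (second, w_second) = ranked[0], ranked[1]
    return [top, second] if w_second > w_max - margin else [top]

# Hypothetical fingerprint features f1..f7, for illustration only.
toy_fingerprints = [
    ("f1", {"Mn dimer"}),
    ("f2", {"Mn dimer"}),
    ("f3", {"Mn dimer", "Mn tetramer"}),
    ("f4", {"Mn tetramer"}),
    ("f5", {"Fe dimer", "Mn dimer", "Mn tetramer"}),
    ("f6", {"Fe dimer"}),
    ("f7", {"Fe tetramer"}),
]
scores = subgroup_scores(toy_fingerprints, {"f1", "f2", "f3", "f5"})
print(scores)
print(predict_types(scores))  # ['Mn dimer']
```

Here the Mn dimer reaches 100%, the Mn tetramer about 45% and the Fe dimer 25%, so only one subgroup is reported. A target scoring, say, 90% and 75% for two subgroups would be reported as belonging to both.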
To evaluate the performance of SODa, we compared it with the commonly used prediction method that assigns the metal cofactor and oligomer state on the basis of pairwise sequence comparisons.
Pairwise sequence comparison method
In applying the pairwise sequence comparison method, four reference sequences are considered, which are representatives of the four SOD types, i.e. a dimeric Fe SOD (E. coli; accession number P0AGD3), a dimeric Mn SOD (E. coli; P00448), a tetrameric Fe SOD (Streptomyces coelicolor; Q9X469) and a tetrameric Mn SOD (human mitochondrial; P04179).
The target sequence is aligned onto these four reference sequences using the WATER program from the EMBOSS package. The target is assigned the metal specificities and oligomer properties of the reference sequence that yields the best alignment score, defined as the WATER similarity score divided by the maximum score reached by aligning the reference sequence onto itself, and expressed in %.
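The normalization of the alignment score can be sketched as follows. To keep the example self-contained, a simplified Smith-Waterman scoring with fixed match, mismatch and linear gap values stands in for the WATER program, which uses a substitution matrix and affine gap penalties; the scores are therefore not comparable to those of WATER. The reference strings are invented toy sequences, not the four reference proteins.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (simplified scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best

def normalized_score(target, reference):
    """Score of target vs reference, in % of the reference self-score."""
    self_score = smith_waterman_score(reference, reference)
    return 100.0 * smith_waterman_score(target, reference) / self_score

def assign_type(target, references):
    """Return the type of the best-scoring reference and all scores."""
    scores = {name: normalized_score(target, seq)
              for name, seq in references.items()}
    return max(scores, key=scores.get), scores

# Invented toy sequences, for illustration only.
toy_references = {"Fe dimer": "HHAFDW", "Mn dimer": "HHGMDW"}
print(assign_type("HHAFDW", toy_references))
```

Dividing by the self-alignment score makes the values comparable across reference sequences of different lengths, which is the purpose of the normalization described above.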
Assessment of the SODa predictions
To evaluate the predictions performed by SODa, we first applied it to all 374 SODs of the learning set. All these sequences but one were correctly recognized as Fe/Mn SODs, which amounts to a score of 99.7%. The prediction of their cofactor specificity and oligomer state reaches a score of 97%. To have an objective estimation of the predictive power of SODa, we compared it with the commonly used assignment method, where the target sequence is aligned onto SODs of each type and assigned the oligomer state and metal specificity of the most similar sequence, as described in Implementation. Clearly, the latter method yields poorer results: the percentage of target sequences correctly assigned drops to 88%, with 30 incorrect assignments spread over all SOD subgroups. Moreover, the scores of the four subgroups are much more similar, which results in a drop of discriminating power.
Comparison of the SODa predictions, the sequence comparison predictions, and the observed SOD types
Sequence comparison prediction: Mn tetramer; Fe dimer; Fe dimer; Mn dimer; Fe dimer; Fe dimer; Fe tetramer
Mutations for rational SOD design
Another novelty and strength of SODa lies in the list of suggested mutations that are likely to reinforce the SOD function, the specificity for the metal cofactor, or the dimeric or tetrameric character. We would like to emphasize that, for a set of experimentally characterized SOD mutations, the tendencies predicted by SODa have been shown to be in excellent agreement with the measured ones.
The prediction scores of the SODa method are higher and allow better discrimination between the four SOD types than the commonly used method based on pairwise sequence comparisons. This high discriminative power and the suggestion of targeted mutations make the SODa server particularly well suited for rational design of SOD proteins with modified or enhanced activity and specificity.
Availability and requirements
SODa is freely available on the webpage http://babylone.ulb.ac.be/soda.
The SODa user can submit a query by filling in a form on the SODa web page. The sequence to be predicted must be in FASTA format. The "E-mail" field is required for later identification. The results are available quickly (typically after one minute) on a web page, accessible via the link displayed or via the "results" page upon typing the E-mail address. The files remain available for seven days. The results of the prediction are described in three files named "align", "observed" and "missing", available in both HTML and PDF format.
The "align" file contains the main results of the prediction, in particular, whether the target is a "SOD" or a "non SOD" and, in the former case, what type of SOD it is. To allow the evaluation of the strength of the prediction, the percentage of residues and interactions from the target that match the SOD- and SODtype-fingerprints are indicated. Moreover, the alignment of the target sequence onto four reference proteins, one of each SOD type, is given. The residues and interactions corresponding to the SOD- and SODtype-fingerprints are colored in the alignment, and the missing characteristics are marked by an "X". Since it is impossible to indicate all the information in the alignment because some fingerprints overlap, the observed and missing characteristics are listed in the text files "observed" and "missing". The information in the latter file provides proposals for residue substitutions that are likely to reinforce the predicted SOD type.
We acknowledge support from the Communauté Française de Belgique through an Action de Recherche Concertée (02/07-289), the Belgian State Science Policy Office through an Interuniversity Attraction Poles Programme (DYSCO), and the Belgian Fund for Scientific Research (FRS) through an FRFC project. RW and MR are Research Associate and Research Director, respectively, at the FRS.
- Halliwell B, Gutteridge J, Cross C: Free radicals, antioxidants and human disease. Where are we now? J Lab Clin Med 1992, 119: 598–662.
- Afonso V, Champy R, Mitrovic D, Collin P, Lomri A: Reactive oxygen species and superoxide dismutases: Role in joint diseases. Joint Bone Spine 2007, 74: 324–329. 10.1016/j.jbspin.2007.02.002
- Greenberger J: Gene therapy approaches for stem cell protection. Gene Therapy 2008, 15: 100–108. 10.1038/sj.gt.3303004
- Dive D, Gratepanche S, Yera H, Bécuwe P, Daher W, Delplace P, Odberg-Ferragut C, Capron M, Khalife J: Superoxide dismutase in Plasmodium: a current survey. Redox Report 2003, 8: 265–267. 10.1179/135100003225002871
- Whittaker J: The irony of manganese superoxide dismutase. Biochem Soc Trans 2003, 31(Pt 6):1318–1321.
- Stallings W, Pattridge K, Strong R, Ludwig M: Manganese and iron superoxide dismutases are structural homologs. Journal of Biological Chemistry 1984, 259: 10695–10699.
- Parker M, Blake C, Barra D, Bossa F, Schinina M, Bannister W, Bannister J: Structural identity between the iron- and manganese-containing superoxide dismutases. Protein Engineering 1987, 1: 393–397. 10.1093/protein/1.5.393
- Parker M, Blake C: Iron- and manganese-containing superoxide dismutases can be distinguished by analysis of their primary structures. FEBS Letters 1988, 229: 377–382. 10.1016/0014-5793(88)81160-8
- HTML to PDF conversion[http://www.rustyparts.com/pdf.php]
- Wintjens R, Noël C, May A, Gerbod D, Dufernez F, Capron M, Viscogliosi E, Rooman M: Specific and phenetic relationships of iron- and manganese-containing superoxide dismutases on the basis of structure and sequence comparisons. Journal of Biological Chemistry 2004, 279: 9248–9254. 10.1074/jbc.M312329200
- Wintjens R, Gilis D, Rooman M: Mn/Fe superoxide dismutase interaction fingerprints and prediction of oligomerization and metal cofactor from sequence. Proteins 2008, 70: 1564–1577. 10.1002/prot.21650
- Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 2003, 31: 365–370. [http://www.expasy.org/sprot] 10.1093/nar/gkg095
- Henrick K, Thornton JM: PQS: a protein quaternary structure file server. Trends Biochem Sci 1998, 23(9):358–61. [http://pqs.ebi.ac.uk/] 10.1016/S0968-0004(98)01253-5
- Boutonnet N, Rooman M, Ochagavia M, Richelle J, Wodak S: Optimal protein structure alignments by multiple linkage clustering: application to distantly related proteins. Protein Engineering 1995, 8: 647–662.
- Berman H, Westbrook J, Feng Z, Gilliland G, Bhat T, Weissig H, Shindyalov I, Bourne P: The Protein Data Bank. Nucleic Acids Research 2000, 28: 235–242. 10.1093/nar/28.1.235
- Lah M, Dixon M, Pattridge K, Stallings W, Fee J, Ludwig M: Structure-function in Escherichia coli iron superoxide dismutase: comparisons with the manganese enzyme from Thermus thermophilus. Biochemistry 1995, 34: 1646–1660. 10.1021/bi00005a021
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25: 4876–4882. 10.1093/nar/25.24.4876
- Eddy S: Profile Hidden Markov Models. Bioinformatics 1998, 14: 755–763. 10.1093/bioinformatics/14.9.755
- Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends in Genetics 2000, 16: 276–277. [http://emboss.sourceforge.net] 10.1016/S0168-9525(00)02024-2
- Cooper J, McIntyre K, Badasso M, Wood S, Zhang Y, Garbe T, Young D: X-ray structure analysis of the iron-dependent superoxide dismutase from Mycobacterium tuberculosis at 2.0 Angstroms resolution reveals novel dimer-dimer interactions. Journal of Molecular Biology 1995, 246: 531–544. 10.1006/jmbi.1994.0105
- Dash B, Metz R, Huebner H, Porter W, Phillips T: Molecular characterization of two superoxide dismutases from Hydra vulgaris. Gene 2007, 387: 93–108. 10.1016/j.gene.2006.08.020
- Castellano I, A DM, Ruocco M, Chambery A, Parente A, Di Martino M, Parlato G, Masullo M, De Vendittis E: Psychrophilic superoxide dismutase from Pseudoalteromonas haloplanktis: biochemical characterization and identification of a highly reactive cysteine residue. Biochimie 2006, 88: 1377–1389. 10.1016/j.biochi.2006.04.005
- Davydova M, Gorshkov O, Tarasova N: Periplasmic superoxide dismutase from Desulfovibrio desulfuricans 1388 is an iron protein. Biochemistry (Mosc) 2006, 71: 68–72. 10.1134/S000629790601010X
- Seo S, Lee J, Kim Y: Characterization of iron- and manganese-containing superoxide dismutase from Methylobacillus sp. strain SK1 DSM 8269. Molecules and Cells 2007, 23: 370–378.
- Zheng Z, Jiang Y, Miao J, Wang Q, Zhang B, Li G: Purification and characterization of a cold-active iron-superoxide dismutase from a psychrophilic bacterium, Marinomonas sp. NJ522. Biotechnology Letters 2006, 28: 85–88. 10.1007/s10529-005-4951-3
- He Y, Fan K, Jia C, Wang Z, Pan W, Huang L, Yang K, Dong Z: Characterization of a hyperthermostable Fe-superoxide dismutase from hot spring. Applied Microbiology and Biotechnology 2007, 75: 367–376. 10.1007/s00253-006-0834-3
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.