LUD, a new protein domain associated with lactate utilization
© Hwang et al.; licensee BioMed Central Ltd. 2013
Received: 3 July 2013
Accepted: 19 November 2013
Published: 26 November 2013
A novel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and LutC. Both proteins are encoded by a highly conserved LutABC operon, which has been implicated in lactate utilization in bacteria. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine DUF162 as the LUD domain family.
JCSG solved the first crystal structure [PDB:2G40] from the LUD domain family: LutC protein, encoded by ORF DR_1909, of Deinococcus radiodurans. LutC shares features with domains in the functionally diverse ISOCOT superfamily. We have observed that the LUD domain has an increased abundance in the human gut microbiome.
We propose a model for the substrate and cofactor binding and regulation in LUD domain. The significance of LUD-containing proteins in the human gut microbiome, and the implication of lactate metabolism in the radiation-resistance of Deinococcus radiodurans are discussed.
Keywords: LUD, DUF162, LutB, LutC, Domain of unknown function, Deinococcus radiodurans
We are now in an era when we can routinely sequence the complete genomes of microbes and rapidly identify their protein coding complements. The sequences of millions of proteins are now known. Despite this wealth of information, we are still far from understanding how all of these proteins operate to give rise to a living organism. At present, the function of a substantial percentage of proteins remains unknown [1, 2]. From our analysis of 23 million proteins in the Pfam sequence database (Pfam release 27.0), 20% have no associated Pfam domain, and more are classified into DUF (Domains of Unknown Function) families. This uncharacterized set of proteins potentially contains novel biological systems. Therefore, it is important to uncover these hidden functions through analysis of protein sequence, protein structure, and finally through directed experimental analyses [4-7].
There have been various attempts to classify the multitude of protein sequences into families to facilitate an improved understanding of the functional repertoire of proteins. In addition, a growing number of defined protein families contain no member that has ever been experimentally characterized. These families have been called DUFs or Uncharacterized Protein Families (UPFs). The Pfam database contains one of the largest collections of such families, with over 4,000 defined to date.
A novel domain, DUF162 [Pfam: PF02589] [COG: COG1556] [eggNOG: COG1556] [CDD: 224473], is found predominantly in Bacteria, and to a lesser extent in Archaea and Eukaryota. Recently, one protein in the DUF162 family (YvbY from Bacillus subtilis) was identified as lactate-utilization protein C (LutC). It is homologous to the YkgG protein in E. coli, hinting at a possible role in lactate utilization [9, 10]. Indeed, the DUF162 domain is a constituent domain of two proteins (LutB and LutC) encoded by the conserved LutABC operon in bacteria. This operon has been linked to lactate utilization [9, 10] and is implicated in the oxidative conversion of L-lactate into pyruvate. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine the DUF162 domain as the LUD domain.
Here, we report the first crystal structure [PDB: 2G40] of the LUD domain family: LutC protein (encoded by ORF DR_1909) from Deinococcus radiodurans [11, 12] at 1.70 Å resolution. We propose a model for the substrate and cofactor binding and regulation.
Results and discussion
LUD domain structure
Structural alignment with other protein structures present in the Protein Data Bank, using the program DALI [13, 14], suggests that the LutC protein is structurally akin to proteins found in the ISOCOT superfamily. This is consistent with its classification in SCOP as part of the NagB/RpiA/CoA transferase-like fold and superfamily. The ISOCOT superfamily is known to comprise proteins of diverse functions including sugar isomerases, translation factor eIF2B, ligand-binding domains of the DeoR-family transcription factors, acetyl-CoA transferases, and methenyltetrahydrofolate synthetase.
LUD domain-containing proteins encoded by the highly conserved LutABC operon
Presence in gut microbiome
It is worth noting that the LUD domain has an increased abundance in the gut microbiome. From our comparative genomics analysis of the MetaHIT human gut microbiome of 124 human subjects (unpublished result, data not shown), the average ratio of the number of homologs from the MetaHIT human gut microbiome to the number found in UniProtKB is about 0.07. The ratio for the LUD domain is about ten times higher, at 0.72, suggesting that it plays a significant role in the gut microbiome, possibly related to its role in anaerobic metabolism. Interestingly, lactic acid bacteria (LAB) are being used as probiotics. Lactate metabolism is integral to human health and host-pathogen interactions. Pathogenic bacteria have been shown to decrease local pH in hosts, through an increase in lactate production, so as to facilitate the release of iron from host transferrin. In other species, acquisition of lactate is necessary for bacteremia and colonization. Lactate is also a potent signaling molecule in inflammatory pathways and has emerged as a critical regulator of cancer development, maintenance and metastasis. By modulating lactate concentrations in the host's environment through LUD domains and other lactate-related pathways, lactobacilli could thus influence the outcomes of both pathogenicity and disease.
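The enrichment stated above can be checked with a short calculation. The sketch below is only an illustration of the arithmetic: the two ratios are taken from the text, while the function name `fold_enrichment` is a hypothetical helper and not part of any analysis pipeline described here.

```python
def fold_enrichment(domain_ratio: float, background_ratio: float) -> float:
    """Return how many times larger a domain's ratio is than the background ratio.

    Both arguments are ratios of (homologs in the gut metagenome) to
    (homologs in UniProtKB); the background is the average over all domains.
    """
    if background_ratio <= 0:
        raise ValueError("background_ratio must be positive")
    return domain_ratio / background_ratio


# Values quoted in the text (the underlying counts are unpublished).
average_ratio = 0.07  # average over all domains
lud_ratio = 0.72      # LUD domain

fold = fold_enrichment(lud_ratio, average_ratio)
print(f"LUD domain fold enrichment: {fold:.1f}x")
```

With the quoted values, the ratio for the LUD domain is roughly ten times the average, in line with the statement in the text.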
Model for LUD domain substrate-cofactor binding and regulation
Another moderately conserved cavity, lined by residues R155, C120, and D137 (Figure 7), roughly coincides with the ISOCOT superfamily primary binding site. When NAD is docked into this small, shallow cavity, the ligand is not fully embedded and remains partially exposed. Thus, this cavity is unlikely to form the active site. Nevertheless, it could bind smaller molecules and is a good candidate for allosteric regulation. Allosteric regulation has been reported for certain proteins of the ISOCOT superfamily [26, 27].
Functional implications in Deinococcus radiodurans
The LutC protein was selected as a target because of JCSG's interest in Deinococcus radiodurans. Deinococcus radiodurans is the most radiation-resistant bacterium known to date. It can survive 4000 Gray (Gy) of irradiation, a dose hundreds of times greater than that considered lethal for most organisms. How it accomplishes such a remarkable feat remains enigmatic. A study examining global gene expression following ionizing radiation exposure and desiccation allowed a dissection of the response to double strand breaks (induced by both ionizing radiation and desiccation) and oxidative stress associated with reactive oxygen species (ROS). The LutC protein was not induced in either treatment but was constitutively expressed. Free radicals, in particular ROS, generated when cells are exposed to ionizing radiation, are cytotoxic. The unpaired electrons of free radicals render them highly reactive with biological molecules. Unsaturated fatty acids present in the membrane are particularly susceptible to free radicals. Furthermore, the formation of oxygen free radicals will deplete oxygen in the cytosol and abolish aerobic metabolism. Under these conditions, anaerobic lactate metabolism can be an indispensable alternative energy source. Moreover, lactate can function as a scavenger of free radicals. Thus, lactate utilization may contribute to the radiation resistance of Deinococcus radiodurans. As the LutC protein from Deinococcus radiodurans represents a prototypical LUD domain in lactate utilization, it could be contributing towards radiation resistance in this bacterium.
Lactate metabolism is integral to human health, and may play a role in the radiation resistance of Deinococcus radiodurans. The LUD domain is a highly conserved protein domain that has recently been shown to play a role in lactate metabolism. In this report, we described the crystal structure of the Deinococcus radiodurans LutC protein, the first for a member of the LUD domain family. Using sequence and structure analysis, we proposed a model for the substrate and cofactor binding and regulation in LUD domains. We also analyzed possible implications for radiation resistance in Deinococcus radiodurans. Further experimental characterization will be needed to test these hypotheses.
The alignment of representative sequences of the LUD family (Pfam DUF162-PF02589) was built by taking the SEED sequences of the family, reducing redundancy at 40% sequence identity, and finally realigning the remaining sequences plus the sequence of 2G40 (UniProtKB id: Q9RT57) with ClustalW. For better visualisation, the alignment has been split into two parts, (a) and (b). In (a) we show the N-terminal part of the alignment, which continues toward the C-terminus in (b). Shades of grey reflect average similarity as calculated from the BLOSUM62 amino acid substitution matrix (black most conserved, white least conserved). Dashes (-) represent deletions, dots (.) represent insertions and lower case letters represent inserted residues. For each sequence, we report the UniProtKB id (e.g. F9YU00), the position along the protein sequence of the first and last residue in the alignment (in the case of Q9RT57, for example, aligned residues range from 45 to 212) and, finally, the amino acid sequence. The 2G40 (Q9RT57) sequence is highlighted by a shaded box. The alignment is visualized with Belvu (sonnhammer.sbc.su.se/Belvu.html). More sequence and domain analysis for the LUD domain family can be found in Additional file 1.
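The redundancy-reduction step mentioned above can be illustrated with a minimal sketch. The text does not state which tool or which definition of sequence identity was used, so the greedy filter below is an assumption made for illustration only, and the toy sequences and the names `percent_identity` and `reduce_redundancy` are hypothetical.

```python
GAP_CHARS = "-."  # dashes and dots both mark gaps in the alignment


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (x.upper(), y.upper())
        for x, y in zip(a, b)
        if x not in GAP_CHARS and y not in GAP_CHARS
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def reduce_redundancy(aligned: dict[str, str], threshold: float = 40.0) -> list[str]:
    """Greedily keep a sequence only if it is at most `threshold` percent
    identical to every sequence already kept (input order decides ties)."""
    kept: list[str] = []
    for name, seq in aligned.items():
        if all(percent_identity(seq, aligned[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy pre-aligned sequences (not real LUD family members).
toy_alignment = {
    "seqA": "MKVLAAGG",
    "seqB": "MKVLAAGA",  # 7 of 8 positions identical to seqA
    "seqC": "TRPWDEQH",  # no positions identical to seqA
}
representatives = reduce_redundancy(toy_alignment)
print(representatives)
```

In this toy case seqB is discarded as redundant with seqA, while seqC is retained. A greedy filter depends on input order, so a production analysis would normally rely on an established clustering tool instead.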
Structure determination of the LutC protein was carried out by the JCSG high-throughput structural biology pipeline. Diffraction data were collected at Stanford Synchrotron Radiation Lightsource (SSRL) beamline 1-5. The crystal structure was determined by MAD phasing using seleno-methionine-derivatized protein. The structure was validated using the JCSG Quality Control server (http://smb.slac.stanford.edu/jcsg/QC). Experimental details as well as structural and refinement statistics can be found in Additional file 2.
Atomic coordinates and experimental structure factors have been deposited into the Protein Data Bank (http://www.rcsb.org) with PDB ID: 2G40.
The LutC protein dimer was generated from symmetry-related positions in PyMOL. The dimer interface was assessed by PISA. Conservation of LutC protein amino acid residues was assessed by ConSurf, which obtained close homologous sequences through BLAST. Molecular docking was performed with MVD using default parameters. Structure graphics were prepared in Chimera.
We are grateful to the Sanford Burnham Medical Research Institute for hosting the DUF annotation jamboree in June 2013, which allowed the authors to collaborate on this work. We would like to thank all the participants of this workshop for their intellectual contributions to this work: L. Aravind, Herbert L. Axelrod, Alex Bateman, Yuanyuan Chang, Penny Coggill, Debanu Das, Ruth Y. Eberhardt, Robert D. Finn, Adam Godzik, William C. Hwang, Lukasz Jaroszewski, Alexey Murzin, Padmaja Natarajan, Marco Punta, Neil Rawlings, Daniel Rigden, Mayya Sedova, Anna Sheydina, John Wooley. We thank the members of the JCSG high-throughput structural biology pipeline for their contribution to this work.
Wellcome Trust (grant numbers WT077044/Z/05/Z); Funding for open access charge: Wellcome Trust (grant numbers WT077044/Z/05/Z); Portions of this research were carried out at the Stanford Synchrotron Radiation Lightsource, a Directorate of SLAC National Accelerator Laboratory and an Office of Science User Facility operated for the U.S. Department of Energy Office of Science by Stanford University. The SSRL Structural Molecular Biology Program is supported by the DOE Office of Biological and Environmental Research, and by the National Institutes of Health, National Institute of General Medical Sciences (including P41GM103393). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of NIGMS, NCRR or NIH. This work was supported in part by National Institutes of Health Grant U54 GM094586 from the NIGMS Protein Structure Initiative to the Joint Center for Structural Genomics.
- Jaroszewski L, Li Z, Krishna SS, Bakolitsa C, Wooley J, et al: Exploration of uncharted regions of the protein universe. PLoS Biol. 2009, 7: e1000205. 10.1371/journal.pbio.1000205.
- Bateman A, Coggill P, Finn RD: DUFs: families in search of function. Acta Crystallogr Sect F: Struct Biol Cryst Commun. 2010, 66: 1148-1152. 10.1107/S1744309110001685.
- Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, et al: The Pfam protein families database. Nucleic Acids Res. 2012, 40: D290-D301. 10.1093/nar/gkr1065.
- Roberts RJ: Identifying protein function-a call for community action. PLoS Biol. 2004, 2: E42. 10.1371/journal.pbio.0020042.
- Roberts RJ, Chang YC, Hu Z, Rachlin JN, Anton BP, et al: COMBREX: a project to accelerate the functional annotation of prokaryotic genomes. Nucleic Acids Res. 2011, 39: D11-D14. 10.1093/nar/gkq1168.
- Galperin MY, Koonin EV: From complete genome sequence to 'complete' understanding?. Trends Biotechnol. 2010, 28: 398-406. 10.1016/j.tibtech.2010.05.006.
- Hanson AD, Pribat A, Waller JC, De Crecy-Lagard V: 'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list-and how to find it. Biochem J. 2010, 425: 1-11. 10.1042/BJ20091328.
- Bateman A, Coin L, Durbin R, Finn RD, Hollich V, et al: The Pfam protein families database. Nucleic Acids Res. 2004, 32: D138-D141. 10.1093/nar/gkh121.
- Chai Y, Kolter R, Losick R: A widely conserved gene cluster required for lactate utilization in Bacillus subtilis and its involvement in biofilm formation. J Bacteriol. 2009, 191: 2423-2430. 10.1128/JB.01464-08.
- Smaldone GT, Antelmann H, Gaballa A, Helmann JD: The FsrA sRNA and FbpB protein mediate the iron-dependent induction of the Bacillus subtilis lutABC iron-sulfur-containing oxidases. J Bacteriol. 2012, 194: 2586-2593. 10.1128/JB.05567-11.
- Schmid AK, Howell HA, Battista JR, Peterson SN, Lidstrom ME: Global transcriptional and proteomic analysis of the Sig1 heat shock regulon of Deinococcus radiodurans. J Bacteriol. 2005, 187: 3339-3351. 10.1128/JB.187.10.3339-3351.2005.
- Makarova KS, Aravind L, Wolf YI, Tatusov RL, Minton KW, et al: Genome of the extremely radiation-resistant bacterium Deinococcus radiodurans viewed from the perspective of comparative genomics. Microbiol Mol Biol Rev. 2001, 65: 44-79. 10.1128/MMBR.65.1.44-79.2001.
- Holm L, Rosenstrom P: Dali server: conservation mapping in 3D. Nucleic Acids Res. 2010, 38: W545-W549. 10.1093/nar/gkq366.
- Hasegawa H, Holm L: Advances and pitfalls of protein structural alignment. Curr Opin Struct Biol. 2009, 19: 341-348. 10.1016/j.sbi.2009.04.003.
- Anantharaman V, Aravind L: Diversification of catalytic activities and ligand interactions in the protein fold shared by the sugar isomerases, eIF2B, DeoR transcription factors, acyl-CoA transferases and methenyltetrahydrofolate synthetase. J Mol Biol. 2006, 356: 823-842. 10.1016/j.jmb.2005.11.031.
- Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995, 247: 536-540.
- Hamann N, Mander GJ, Shokes JE, Scott RA, Bennati M, et al: A cysteine-rich CCG domain contains a novel [4Fe-4S] cluster binding motif as deduced from studies with subunit B of heterodisulfide reductase from Methanothermobacter marburgensis. Biochemistry. 2007, 46: 12875-12885. 10.1021/bi700679u.
- Ljungh A, Wadstrom T: Lactic acid bacteria as probiotics. Curr Issues Intest Microbiol. 2006, 7: 73-89.
- Friedman DB, Stauff DL, Pishchany G, Whitwell CW, Torres VJ, et al: Staphylococcus aureus redirects central metabolism to increase iron availability. PLoS Pathog. 2006, 2: e87. 10.1371/journal.ppat.0020087.
- Herbert MA, Hayes S, Deadman ME, Tang CM, Hood DW, et al: Signature tagged Mutagenesis of Haemophilus influenzae identifies genes required for in vivo survival. Microb Pathog. 2002, 33: 211-223. 10.1006/mpat.2002.0530.
- Exley RM, Wu H, Shaw J, Schneider MC, Smith H, et al: Lactate acquisition promotes successful colonization of the murine genital tract by Neisseria gonorrhoeae. Infect Immun. 2007, 75: 1318-1324. 10.1128/IAI.01530-06.
- Doherty JR, Cleveland JL: Targeting lactate metabolism for cancer therapeutics. J Clin Invest. 2013, 123: 3685-3692. 10.1172/JCI69741.
- Maudsdotter L, Jonsson H, Roos S, Jonsson AB: Lactobacilli reduce cell cytotoxicity caused by Streptococcus pyogenes by producing lactic acid that degrades the toxic component lipoteichoic acid. Antimicrob Agents Chemother. 2011, 55: 1622-1628. 10.1128/AAC.00770-10.
- Najmanovich R, Kurbatova N, Thornton J: Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites. Bioinformatics. 2008, 24: i105-i111.
- Clarke AR, Wigley DB, Chia WN, Barstow D, Atkinson T, et al: Site-directed mutagenesis reveals role of mobile arginine residue in lactate dehydrogenase catalysis. Nature. 1986, 324: 699-702. 10.1038/324699a0.
- Horjales E, Altamirano MM, Calcagno ML, Garratt RC, Oliva G: The allosteric transition of glucosamine-6-phosphate deaminase: the structure of the T state at 2.3 A resolution. Structure. 1999, 7: 527-537. 10.1016/S0969-2126(99)80069-0.
- Rudino-Pinera E, Morales-Arrieta S, Rojas-Trejo SP, Horjales E: Structural flexibility, an essential component of the allosteric activation in Escherichia coli glucosamine-6-phosphate deaminase. Acta Crystallogr D Biol Crystallogr. 2002, 58: 10-20. 10.1107/S0907444901016699.
- Groussard C, Morel I, Chevanne M, Monnier M, Cillard J, et al: Free radical scavenging and antioxidant effects of lactate ion: an in vitro study. J Appl Physiol. 2000, 89: 169-175.
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, et al: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 2003, 31: 3497-3500. 10.1093/nar/gkg500.
- Sonnhammer EL, Hollich V: Scoredist: a simple and robust protein sequence distance estimator. BMC Bioinforma. 2005, 6: 108. 10.1186/1471-2105-6-108.
- Elsliger MA, Deacon AM, Godzik A, Lesley SA, Wooley J, et al: The JCSG high-throughput structural biology pipeline. Acta Crystallogr Sect F: Struct Biol Cryst Commun. 2010, 66: 1137-1142. 10.1107/S1744309110038212.
- The PyMOL molecular graphics system. Schrödinger, LLC, http://pymol.sourceforge.net/faq.html#CITE, Version 12r3pre.
- Krissinel E, Henrick K: Inference of macromolecular assemblies from crystalline state. J Mol Biol. 2007, 372: 774-797. 10.1016/j.jmb.2007.05.022.
- Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N: ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 2010, 38: W529-W533. 10.1093/nar/gkq399.
- Thomsen R, Christensen MH: MolDock: a new technique for high-accuracy molecular docking. J Med Chem. 2006, 49: 3315-3321. 10.1021/jm051197e.
- Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, et al: UCSF Chimera-a visualization system for exploratory research and analysis. J Comput Chem. 2004, 25: 1605-1612. 10.1002/jcc.20084.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.