Identification of germ cell-specific genes in mammalian meiotic prophase
© Li et al.; licensee BioMed Central Ltd. 2013
Received: 25 June 2012
Accepted: 21 February 2013
Published: 27 February 2013
Mammalian germ cells undergo meiosis to produce sperm or eggs, haploid cells that are primed to meet and propagate life. Meiosis is initiated by retinoic acid and meiotic prophase is the first and most complex stage of meiosis when homologous chromosomes pair to exchange genetic information. Errors in meiosis can lead to infertility and birth defects. However, despite the importance of this process, germ cell-specific gene expression patterns during meiosis remain undefined due to difficulty in obtaining pure germ cell samples, especially in females, where prophase occurs in the embryonic ovary. Indeed, mixed signals from both germ cells and somatic cells complicate gonadal transcriptome studies.
We developed a machine-learning method for identifying germ cell-specific patterns of gene expression in microarray data from mammalian gonads, specifically during meiotic initiation and prophase. At 10% recall, the method detected spermatocyte genes and oocyte genes with 90% and 94% precision, respectively. Our method outperformed gonadal expression levels and gonadal expression correlations in predicting germ cell-specific expression. Top-predicted spermatocyte and oocyte genes were both preferentially localized to the X chromosome and significantly enriched for essential genes. Also identified were transcription factors and microRNAs that might regulate germ cell-specific expression. Finally, we experimentally validated Rps6ka3, a top-predicted X-linked spermatocyte gene. Protein localization studies in the mouse testis revealed germ cell-specific expression of RPS6KA3, mainly detected in the cytoplasm of spermatogonia and prophase spermatocytes.
We have demonstrated that, through the use of machine-learning methods, it is possible to detect germ cell-specific expression from gonadal microarray data. Results from this study improve our understanding of the transition from germ cells to meiocytes in the mammalian gonad. Further, this approach is applicable to other tissues for which isolating cell populations remains difficult.
Multi-cellular eukaryotes are made of two fundamental cell types—germ cell and somatic cell. The distinguishing characteristic of a germ cell is its capability to undergo meiosis. Meiosis is a highly specialized cell division that converts diploid germ cells into haploid sperm or eggs, cells that are primed to meet for the propagation of the organism. Mammalian meiosis is initiated by an extrinsic signal—retinoic acid—and consists of meiosis I and II, each of which is divided into prophase, metaphase, anaphase, and telophase [1-4]. Prophase of meiosis I (abbreviated as prophase) is the first and most complex stage of meiosis, when maternal and paternal homologs pair to allow the exchange of genetic information. Based on chromosomal packaging, prophase itself is subdivided into five stages: leptotene, zygotene, pachytene, diplotene, and diakinesis.
Although the components of meiosis are similar in both sexes (pre-meiotic germ cell proliferation and differentiation, initiation and progression through meiosis, and gamete maturation), the fundamentals are dramatically different with respect to timing, outcome, and ability to produce normal gametes. The first wave of spermatogenesis begins at puberty and proceeds relatively synchronously, and is followed by continuous, asynchronous spermatogenesis throughout life. In contrast, initiation of oogenesis is confined to a narrow window of fetal development. The entire pool of oocytes initiates meiosis in a semi-synchronized manner, but the process arrests at the end of prophase, before birth. One arrested oocyte, on average, then resumes oogenesis during each ovulation cycle starting from puberty [3, 4, 6]. At the end of meiosis, a single egg is produced in females, compared with four sperm in males. An additional difference is that the incidence of aneuploid gametes produced in humans is at least an order of magnitude greater in the female than in the male.
Meiotic entry and progression require highly precise and ordered gene expression. Identifying these gene expression signatures is imperative to circumvent clinical disorders, including infertility, birth defects, and germ cell tumors. However, our understanding of the factors that control germ cell entry into and progression through meiosis remains rudimentary. This is because studies of mammalian germ cells are usually limited to in vivo animal models. Further, oocytes enter meiosis during fetal life, when access to ovarian tissue is extremely limited. While time-series transcriptome studies of mammalian gonads have delineated the temporal sequence of genome-wide expression [7-13], identifying germ cell-specific genes necessary for meiosis has been difficult due to the mixture of germ and somatic cells in gonads, each of which contributes to the total transcriptome. Although it is possible to isolate germ cells from the testis using physical separation methods [14, 15], isolation of pure oocyte populations from the fetal ovary has been challenging due to the limited amount of ovarian tissue. Further, gene expression and cell physiology may differ in sorted germ cell samples versus in vivo populations, and the purity of isolated samples has been questioned.
Ideally, germ cell expression signals would be deciphered from whole-gonadal expression without physically isolating germ cells. Here, we applied a machine-learning algorithm, support vector machine (SVM), to predict mouse germ cell genes during meiotic initiation and prophase from time-course gonadal microarray profiles. This timeframe was selected for two reasons. First, prophase is the most important and complicated stage of meiosis. Second, the entire germ cell pool progresses through prophase in a relatively synchronized fashion during oogenesis and the first wave of spermatogenesis, thus global gene expression can be monitored by microarrays. Our approach allowed us to locate hidden germ cell patterns at high resolution and outperformed other methods in detecting germ cell-specific expression from mixed gonadal samples. Further, our method ranked genome-wide mouse genes according to the probability of being expressed by germ cells, enabling prioritization of candidate genes for experimental follow-up. In summary, results from this study increase our knowledge of germ cell-specific expression during the critical stage of meiotic initiation and prophase. Predicted germ cell genes advance our understanding of the genetic control of germ cell development, sex-specific differences in meiosis, as well as factors predisposing to infertility and birth defects.
Computational models to predict germ cell genes during meiotic initiation and prophase
Germ cells, but not somatic cells, of the testis and ovary undergo meiosis. Microarray profiles of mammalian gonads, however, record combined signals from both germ cells and somatic cells. We built SVM classifiers to predict mouse germ cell genes in meiotic initiation and prophase from gonadal microarray data. SVM identified a combination of expression patterns in the microarray profile that maximally separated genes expressed by germ cells from those not expressed by germ cells. We developed two versions of the SVM classifier: the spermatocyte model predicted germ cell genes using spermatocyte training examples and microarray studies on postnatal testis during prophase of the first wave of spermatogenesis; the oocyte model predicted germ cell genes using oocyte training examples and microarray studies on embryonic ovary during prophase [12, 13, 16]. Genes known to be expressed by germ cells in prophase served as the positive training set, and genes known not to be expressed by germ cells served as the negative training set. Our positive training data were all derived from single-gene studies [9, 12, 17-19]. Importantly, the training data were completely independent from the microarray studies, which served as the features of the SVM classifiers.
Performance evaluation of the germ cell models
Table 1. Performance comparison between the germ cell models and other methods. Column: precision at 10% recall. Rows: germ cell model, gonadal expression level, gonadal expression correlation.
The precision of identifying a spermatocyte gene at random was 45% based on the training data (129 positive examples and 159 negative examples). The spermatocyte model reached a precision of 90% at 10% recall, a two-fold improvement over random precision. The precision of identifying an oocyte gene at random was 25%, as estimated from the training data (46 positive examples and 138 negative examples). The oocyte model yielded a precision of 94% at 10% recall, close to a four-fold increase over random precision. Average precisions of 78% and 52% were achieved for the spermatocyte model and oocyte model, respectively, equivalent to 1.7-fold and 2.1-fold increases over the respective random precisions. The average recall was 68% for the spermatocyte model and 57% for the oocyte model. These results suggest that our models are highly accurate in predicting top-ranked germ cell genes, but not necessarily sensitive in the overall classification of germ and non-germ cell genes.
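The arithmetic behind these figures can be checked with a short Python sketch. The training-set counts and reported precisions are taken from the text; the function names and the toy ranked list are illustrative assumptions, not the authors' code.

```python
def random_precision(n_pos, n_neg):
    """Expected precision when genes are picked at random from the training set."""
    return n_pos / (n_pos + n_neg)


def precision_at_recall(ranked_labels, recall_target):
    """Precision at the first rank where recall reaches recall_target.

    ranked_labels: 1 (germ cell gene) or 0 (non-germ cell gene), sorted by
    descending model score.
    """
    total_pos = sum(ranked_labels)
    true_pos = 0
    for rank, label in enumerate(ranked_labels, start=1):
        true_pos += label
        if total_pos and true_pos / total_pos >= recall_target:
            return true_pos / rank
    return 0.0


# Counts reported in the text.
sperm_random = random_precision(129, 159)   # about 0.45
oocyte_random = random_precision(46, 138)   # 0.25

# Fold improvement of the reported precision at 10% recall over random.
sperm_fold = 0.90 / sperm_random
oocyte_fold = 0.94 / oocyte_random

# Hypothetical ranked list, used only to exercise precision_at_recall.
toy_ranking = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_precision = precision_at_recall(toy_ranking, 0.75)
```

Because precision is read off at a fixed recall, the measure rewards models that place true germ cell genes at the very top of the ranking, which matches the goal of prioritizing candidates for follow-up.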
We further evaluated the performance of germ cell models by receiver operating characteristic (ROC) curves (Figure 2B). True positive rate (recall) is the fraction of correctly predicted germ cell genes over all germ cell genes while false positive rate is the fraction of incorrectly predicted germ cell genes over all non-germ cell genes. We observed that the spermatocyte model performed better than the oocyte model based on the area under the ROC curve (AUC=0.87 versus 0.75). However, the lower left portion of the ROC curves indicated comparable performance. For a true positive rate of 10%, the spermatocyte and oocyte models showed a false positive rate of 1% and 0.2%, respectively, suggesting the top-ranked germ cell genes are the most reliable predictions.
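The definitions of true positive rate and false positive rate translate directly into a threshold sweep. The following Python sketch assumes distinct scores (no ties) and uses hypothetical scores and labels; it illustrates how an ROC curve and its area can be computed and is not the authors' implementation.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    The threshold is lowered one gene at a time through the ranked list.
    Assumes scores are distinct and both classes are present.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    true_pos = false_pos = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores: label 1 marks a germ cell gene, 0 a non-germ cell gene.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = area_under_curve(roc_points(toy_scores, toy_labels))
```

The lower left corner of the curve corresponds to the highest-scoring genes, which is why a low false positive rate at 10% true positive rate indicates reliable top-ranked predictions.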
Performance comparison with other prediction methods
The second approach was to compute Pearson correlation of gonadal expression across prophase for each training gene pair. Pairs of two germ cell genes or two non-germ cell genes were positive correlation examples, while pairs of one germ cell gene and one non-germ cell gene were negative correlation examples. Training gene pairs were sorted in descending order of correlation coefficient, and precision and recall were computed from the sorted list (Figure 3B, Table 1). The results showed that precisions at 10% recall and average precisions were both close to random precisions, suggesting limited power of this method in predicting germ cell genes.
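As a rough illustration of this correlation baseline, the Python sketch below ranks gene pairs by Pearson correlation and marks each pair as a positive example when both genes share a class. The gene names and expression profiles are hypothetical, and the code assumes every profile has non-zero variance.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def rank_gene_pairs(profiles, is_germ):
    """Sort gene pairs by descending correlation.

    Each entry is (r, gene1, gene2, same_class); same_class is True for a
    positive correlation example (both germ or both non-germ).
    """
    pairs = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        pairs.append((r, gene1, gene2, is_germ[gene1] == is_germ[gene2]))
    pairs.sort(key=lambda pair: -pair[0])
    return pairs


# Hypothetical expression across four time points.
toy_profiles = {
    "germA": [1.0, 2.0, 4.0, 8.0],
    "germB": [2.0, 4.0, 8.0, 16.0],
    "somaA": [8.0, 4.0, 2.0, 1.0],
}
toy_labels = {"germA": True, "germB": True, "somaA": False}
ranked_pairs = rank_gene_pairs(toy_profiles, toy_labels)
```

Precision and recall can then be computed down the sorted list exactly as for a ranked list of single genes.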
It is also possible to identify new germ cell genes by performing hierarchical clustering on microarray profiles across all time points of meiotic prophase (Additional file 1: Figure S1). Genes specifically expressed in germ cells (positive training data) exhibited a particular expression pattern, which allowed separating them from other genes. The advantage of our germ cell models over hierarchical clustering is that they can prioritize genes for experimental testing based on the probability of being germ cell genes.
Performance comparison with microarray expression of male germ cell isolates
Although it remains challenging to isolate a very small number of oocytes from the embryonic ovary, techniques have been developed to isolate male germ cells of different stages with reasonable purity. Two published studies performed global expression profiling on spermatogonia and spermatocytes isolated via gravity sedimentation and sequential enzymatic digestion [7, 12, 20]. Spermatogonia undergo proliferation and differentiation prior to meiotic initiation; pachytene spermatocytes are in the prophase stage. Therefore, we evaluated expression levels of isolated spermatogonia and pachytene cells in predicting spermatocyte genes during meiotic initiation and prophase.
Performance comparison between the spermatocyte model and microarray expression of male germ cell isolates
Characterization of predicted germ cell genes
Our models assigned probabilities to mouse genes genome-wide, allowing prioritization of potential germ cell genes for analysis. We focused on the top-1,000 predicted spermatocyte genes and oocyte genes; limited overlap existed between the two gene lists (144 genes, Jaccard index=0.08). We first identified the chromosome location of predicted germ cell genes. Strikingly, both top spermatocyte and oocyte genes were significantly enriched on the X chromosome, and the enrichment was more significant in the female than in the male (P-value=0.04 for spermatocyte genes; P-value=0.0002 for oocyte genes). No enrichment was observed on any other chromosome (1-19 and Y).
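The reported Jaccard index follows from the list sizes and the overlap. A minimal Python sketch (function names are illustrative):

```python
def jaccard_from_counts(size_a, size_b, overlap):
    """Jaccard index |A and B| / |A or B| computed from set sizes."""
    return overlap / (size_a + size_b - overlap)


def jaccard(set_a, set_b):
    """Jaccard index of two gene sets."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Two top-1,000 lists sharing 144 genes, as reported in the text.
reported_index = jaccard_from_counts(1000, 1000, 144)  # 144 / 1856
```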
Significantly enriched GO terms among 1,000 top-predicted germ cell genes
response to DNA damage stimulus
regulation of translation
intracellular protein transport
ubiquitin-dependent protein catabolic process
response to DNA damage stimulus
protein K48-linked ubiquitination
Potential transcription factors activating predicted germ cell genes
Meiotic initiation and progression through prophase depend on a robust, germ cell-specific transcription program. However, the transcription factors for prophase genes remain uncharacterized. Here, we uncovered putative transcription factors by detecting over-represented sequence motifs in the promoter regions of the top-1,000 predicted germ cell genes using FIRE.
One CG-rich motif was also significantly over-represented among the top-1,000 oocyte genes. Although this motif was different from the one enriched among top spermatocyte genes, it was also recognized as the binding site for C2H2-type zinc finger domains; both motifs had a position bias towards transcription start sites (Figure 6). Sp1 was the transcription factor most closely associated with the oocyte CG-motif. Sp1 binds to CG-rich motifs and regulates the expression of a large number of genes involved in a variety of processes. In particular, Sp1 mediates transcriptional activation of male germ cell genes that are expressed during meiotic initiation and prophase [24, 25]. One LIM homeodomain motif was also enriched among the top-1,000 oocyte genes, and exhibited a strong positive co-occurrence with the CG-rich motif. LIM-homeodomain proteins play fundamental roles in tissue patterning and differentiation. Although Lhx3 was identified as the best matching transcription factor for this motif, it was barely expressed in the embryonic ovary and had no association with germ cell development. In contrast, Lhx9 is known to be expressed in the embryonic ovary and is essential for mouse gonad formation. Our results suggest that Lhx9 might be a potential regulator of oocyte genes during meiotic initiation and progression.
Potential microRNAs repressing predicted non-germ cell genes
Like transcription factors, microRNAs have emerged as critical developmental regulators. MicroRNAs are small endogenous RNAs that typically bind their target 3’UTRs through exact or near-exact complementarity. This binding event leads to translational repression and mRNA degradation of target genes. We were interested in identifying microRNAs that could potentially repress non-germ cell genes predicted from the models.
MicroRNAs potentially repressing predicted non-germ cell genes
Experimental validation of predicted germ cell genes
Our models predicted preferential localization of germ cell genes on the X chromosome. We further focused on X-linked spermatocyte genes because functional characterization of knockout mice was relatively easy. Males have one copy of X-linked genes, thus the phenotype of loss-of-function mutations would not be masked by a second allele. Among top-1,000 spermatocyte genes, 43 were X-linked and unique to the male, i.e., not overlapping with top-1,000 oocyte genes. We manually went through the list to identify candidates that were not previously linked to spermatocyte expression and function. Rps6ka3 (ribosomal protein S6 kinase alpha-3) emerged as an interesting candidate because it encodes a growth-factor-regulated protein kinase and is a known disease gene for which knockout mouse lines and commercial antibodies are available [31-33]. Mutations in this gene are responsible for Coffin-Lowry syndrome, which is characterized in male patients by mental retardation, growth retardation, and skeletal anomalies. The estimated incidence is 1:50,000 to 1:100,000. In addition, our previous co-expression study also identified Rps6ka3 as a candidate prophase gene.
Germ cells initiate meiosis in response to the extrinsic factor retinoic acid. Meiotic initiation is followed by prophase, a critical developmental stage of germ cells when homologous chromosomes undergo recombination to generate genetic diversity in offspring. Examining patterns of gene expression at a genomic level is necessary to better understand the process of meiotic initiation and progression as well as to identify key factors involved in the process. Further, comparison of male and female expression time courses allows for better understanding of the sexually dimorphic aspects of germ cell differentiation that may contribute to the inherently high meiotic error rate in the female. Microarrays have been utilized extensively in transcriptome profiling of mammalian gonads [7-13]. A major complication, however, is that the mRNA expression represents a combination of signals from both germ cells and somatic cells.
To overcome this obstacle, we outlined a framework for determining germ cell expression during meiotic entry and progression through prophase from gonadal microarray data in male and female mice. SVM was used to detect hidden patterns of germ cell signals and did not require cell-type frequency in gonadal samples. Our germ cell models accurately predicted spermatocyte genes with a 90% precision and oocyte genes with a 94% precision at 10% recall. Further, our models outperformed other methods substantially in predicting germ cell genes from whole-gonadal expression studies. Although experimental methods have been developed to isolate mRNA samples enriched for spermatogonia and spermatocytes [14, 15], oocyte sorting from the embryonic ovary is not yet feasible. It remains a challenge to examine gene expression in embryonic oocytes. Therefore, our study is particularly valuable for identifying oocyte genes from the ovary microarray data.
We have demonstrated that top-predicted germ cell genes had GO annotations consistent with gonadal tissue in prophase. Top-predicted germ cell genes were also significantly enriched for essential genes. This suggests that many genes expressed during meiotic initiation and prophase are essential for mouse viability. One interesting observation is that top-predicted germ cell genes were preferentially located on the X chromosome, but not on any other chromosome. The enrichment on the X chromosome was more significant in the female than in the male. This observation was in strong concordance with sex chromosomal dynamics in pre-meiotic and meiotic stages. In the male, both X and Y chromosomes are active in spermatogonia prior to meiotic entry. In fact, spermatogonium-expressed genes are more densely populated on the X chromosome in both human and mouse [18, 19, 37]. The X and Y chromosomes continue to be active entering prophase but become transcriptionally silenced at the pachytene stage, a process called meiotic sex chromosome inactivation. This inactivation is mainly driven by the unpaired state of the sex chromosomes. In contrast, in the female, one X chromosome is usually silent due to gene dosage compensation. The inactive X chromosome is reactivated preceding meiotic entry such that both X chromosomes remain active throughout meiosis [39, 40]. Therefore, we expected to observe more significant enrichment of germ cell genes on the X chromosome in the female during meiotic initiation and prophase. The characterization of top-predicted germ cell genes suggests that our models are truly informative for germ cell-specific expression.
The use of regulatory sequence motifs to identify potential transcription factors has been successfully applied in many areas [41-43]. We located sequence motifs in the promoter region of top-predicted germ cell genes to determine putative regulators. A C2H2-type zinc finger domain was enriched among both spermatocyte genes and oocyte genes. In addition, a LIM homeodomain was identified only for top oocyte genes. The use of sequence motifs to locate regulators has its limitations: the presence of a motif does not necessarily represent the binding and/or functional activity of a transcription factor, and binding motifs of many transcription factors are unknown. Nevertheless, this approach can serve as an initial screen for potential transcription factors.
Transcriptional regulation of germ cell genes is further complicated by microRNAs. For example, the expression of the Let-7 family of microRNAs is increased in spermatogonia after treatment with retinoic acid. Testis-specific microRNAs are preferentially mapped to the X chromosome and most of the X-linked microRNAs are expressed in pachytene spermatocytes, suggesting possible roles in post-transcriptional regulation of prophase genes [30, 46]. In contrast to the male, studies of microRNAs in prophase oocytes are scarce. Based on predicted targets of microRNAs and our germ cell model predictions, we identified several candidate microRNAs that may repress gene expression during meiotic initiation and progression, including the X-linked mmu-miR-351.
Using the germ cell models, we were able to rank genome-wide genes and make high-quality predictions for genes expressed during meiotic initiation and prophase. We were particularly interested in X-linked spermatocyte genes because loss-of-function mutations can be easily obtained by deleting one copy of X-linked genes in the male. We experimentally validated Rps6ka3, an X-linked disease gene previously unknown to have meiotic function, in the mouse testis using immunofluorescence. Protein expression was germ cell-specific and was mainly confined to spermatogonia and spermatocytes in prophase, concordant with the model prediction. Thus, this experiment lays a foundation for future meiotic functional study of Rps6ka3 by characterizing knockout mouse lines [31-33]. Further, this validation experiment serves as a proof of concept and indicates that our systems biology approach integrating computation and experimentation is valuable in the identification of novel meiotic genes. Such large-scale, unbiased, and quantitative studies provide an essential complement to the traditional reductionist approaches by studying individual genes.
Results from this study provide a fundamental understanding of germ cell genes active in meiotic initiation and prophase, a critical developmental stage. We have demonstrated that, through the use of machine-learning methods, it is possible to detect germ cell-specific signals from gonadal microarray datasets. Our ability to make such predictions will likely improve with the increased number of germ cell genes being characterized in the future. While we are primarily motivated by meiotic prophase studies of germ cells, this approach is applicable to a variety of areas in which it is not yet possible to obtain pure cell samples [48-50].
Training data of germ cells
Our goal was to predict germ cell genes expressed during meiotic initiation and prophase in male and female mice. Thus, positive training examples were genes currently known to be expressed in germ cells during prophase, while negative training examples were genes currently known not to be expressed in germ cells during prophase. We obtained the training data from the literature and the mouse Gene Expression Database (GXD). GXD collects detailed single-gene expression data from RNA in situ hybridization, immunohistochemistry, Northern and Western blots, RT-PCR, RNase and nuclease S1 protection assays, and in situ knock-in reporters. Genes that are expressed and not expressed in an anatomical structure and a developmental stage are recorded in the database. Genes labeled as “Very strong”, “Strong”, and “Present” were collected as positive training data; genes labeled as “Absent” and “Trace” were negative training data. Genes with conflicting assignments were treated as positive examples. Note that the database does not include any large-scale expression studies (e.g., microarray).
The male training data included genes studied in spermatogonia and primary spermatocytes during postnatal development, as defined in GXD. Additionally, we collected male training data from the literature: genes expressed in premeiotic and prophase germ cells as positive examples [9, 12, 18, 19], and genes only expressed in Leydig, Sertoli, and myoid cells as negative examples. A total of 137 positive and 26 negative germ cell genes were obtained for the male mouse. Similarly, we collected female training data from primordial germ cells and primary oocytes in the fetal ovary during embryonic days 12-16, recorded in GXD. Training data were further supplemented with genes manually curated from the literature. In total, 47 positive and 4 negative examples served as the training data for the female mouse.
Because of the limited number of negative training examples, we also obtained negative data from microarray profiles of 61 mouse tissues. Genes expressed in only one tissue type, other than testis or ovary, were collected. Specifically, the dataset across all tissue types was concatenated, and the median expression value was extracted. Tissue-specific genes were those whose expression exceeded 10 times the median value in one tissue other than testis or ovary and fell below the median in all other tissues. In this way, we obtained 177 tissue-specific genes as negative training examples for both males and females. We combined these genes with those from single-gene experiments, and further limited the training data to those present in the Mouse Genome 430 2.0 Array (Affymetrix, Santa Clara, CA). Finally, a total of 288 genes served as the training data (129 positive and 159 negative examples) for the spermatocyte model and 184 genes served as the training data (46 positive and 138 negative examples) for the oocyte model.
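One way to read the tissue-specificity rule is sketched below in Python. It assumes the median is taken over all gene-by-tissue values pooled together, which is an interpretation of "concatenated" rather than a confirmed detail; the gene names, tissue names, and expression values are hypothetical.

```python
from statistics import median


def tissue_specific_genes(expression, excluded=("testis", "ovary"), fold=10.0):
    """Map each tissue-specific gene to the single tissue expressing it.

    expression: {gene: {tissue: value}}.
    Assumption: the reference median is computed over all pooled values.
    """
    pooled = [value for by_tissue in expression.values() for value in by_tissue.values()]
    reference = median(pooled)
    selected = {}
    for gene, by_tissue in expression.items():
        high = [tissue for tissue, value in by_tissue.items() if value > fold * reference]
        # Require exactly one high tissue, and it must not be a gonad.
        if len(high) != 1 or high[0] in excluded:
            continue
        others = [value for tissue, value in by_tissue.items() if tissue != high[0]]
        if all(value < reference for value in others):
            selected[gene] = high[0]
    return selected


# Hypothetical expression values for three genes in four tissues.
toy_expression = {
    "geneA": {"liver": 5000, "brain": 10, "testis": 20, "ovary": 15},
    "geneB": {"liver": 30, "brain": 40, "testis": 9000, "ovary": 20},
    "geneC": {"liver": 200, "brain": 300, "testis": 250, "ovary": 220},
}
negative_examples = tissue_specific_genes(toy_expression)
```

In this toy input the pooled median is 120, so only geneA qualifies: geneB is high in testis, which is excluded, and geneC never exceeds the 10-fold threshold.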
Microarray data on gonadal tissue and male germ cells
Time-series microarray studies have been conducted to characterize global gene expression in the mouse testis and ovary during germ cell progression through meiotic prophase (GSE12769 and GSE6916) [12, 13, 16]. In these published studies, whole testes were obtained from male mice at postnatal days 6, 8, 10, and 14 during the first wave of spermatogenesis; whole ovaries were collected from female mice at embryonic days 11.5, 12.5, 14.5, and 16.5. Expression values at each of the time points served as the features for the SVM classifiers. In both testis and ovary studies, duplicate samples were obtained and applied to Mouse Genome 430 2.0 Array (Affymetrix, Santa Clara, CA). The raw data were normalized by MAS5 and signals from duplicate samples were averaged. The probe-sets were translated into genes based on NetAffx Annotation Release 31. The expression level of each gene was defined by choosing the value from the top level probe-set as ranked by Affymetrix. In case of more than one probe-set present at the top level, the average value was used.
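The probe-set to gene step can be sketched in Python as follows. The numeric `level` field, with a lower number meaning a higher-ranked probe-set, is a hypothetical encoding chosen for illustration and is not the NetAffx format; the signal values are made up.

```python
from collections import defaultdict


def collapse_probesets(probesets):
    """Collapse probe-set signals to one value per gene.

    probesets: iterable of (gene, level, signal). The top level is the
    smallest level number (an assumption of this sketch). When several
    probe-sets share the top level, their signals are averaged.
    """
    by_gene = defaultdict(list)
    for gene, level, signal in probesets:
        by_gene[gene].append((level, signal))
    gene_expression = {}
    for gene, entries in by_gene.items():
        top_level = min(level for level, _ in entries)
        top_signals = [signal for level, signal in entries if level == top_level]
        gene_expression[gene] = sum(top_signals) / len(top_signals)
    return gene_expression


# Hypothetical probe-set records.
toy_probesets = [
    ("Rps6ka3", 1, 100.0),
    ("Rps6ka3", 1, 200.0),
    ("Rps6ka3", 2, 900.0),
    ("Stra8", 1, 50.0),
]
toy_gene_expression = collapse_probesets(toy_probesets)
```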
Two published studies described global gene expression of isolated male germ cells. One study isolated type A and B spermatogonia and pachytene spermatocytes via gravity sedimentation; the purity of spermatogonia was >85% and the purity of pachytene spermatocytes was >95% [12, 20]. The other study isolated spermatogonia and pachytene spermatocytes via sequential enzymatic digestion and sedimentation at unit gravity; spermatogonia were obtained at a purity >85% and pachytene spermatocytes were obtained at a purity >82.5%. In both cases, duplicate samples were obtained and applied to Mouse Genome 430 2.0 Array (Affymetrix, Santa Clara, CA). Data processing followed the same procedure as for the gonadal microarray data.
SVM classifiers to predict germ cell genes
We built SVM classifiers to predict genes expressed by germ cells during meiotic initiation and prophase using the e1071 package in R. Given a set of training genes known to be expressed or not expressed by germ cells, SVM classifiers identified a specific pattern of expression from microarray experiments that could best separate germ cell genes from non-germ cell genes. Specifically, SVM classifiers are trained by minimizing $\frac{1}{2}\beta^{T}\beta + C\sum_{i=1}^{N}\zeta_i$ subject to $y_i(\beta^{T}\phi(x_i) + \beta_0) \ge 1 - \zeta_i$ and $\zeta_i \ge 0$. Here, $N$ is the number of training examples, $x_i$ is the vector of microarray data on training example $i$, $x_i$ is mapped into a higher dimensional space by the function $\phi(x_i)$, $\beta^{T}\phi(x_i) + \beta_0$ is the discriminant function to determine the classification of $y_i$ ($y_i = 1$ or $-1$), $\zeta_i$ is the slack variable, and $C$ is the penalty parameter. The kernel function, $K(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$, measures the similarity between two training examples, $i$ and $j$. Different kernel functions were explored, including linear, polynomial, sigmoid, and radial basis function. Parameters for each kernel were empirically optimized on the training set through a grid search to achieve the best performance. Classifiers with the best parameters were evaluated by five-fold cross-validation, which was repeated 100 times. Based on AUC values of ROC curves from cross-validation, SVM classifiers with a radial basis kernel performed best for both the spermatocyte model and oocyte model. The radial basis kernel is defined by $K(x_i, x_j) = \exp(-\gamma\|x_i - x_j\|^2)$, where $\gamma$ controls the kernel width. The optimal parameters for the spermatocyte model are $\gamma = 8$ and $C = 2$; the optimal parameters for the oocyte model are $\gamma = 1$ and $C = 2048$.
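The classifiers themselves were built with e1071 in R; the Python sketch below is not that code. It only illustrates two ingredients described above, the radial basis kernel and the index splits for five-fold cross-validation repeated 100 times. The expression vectors, the seed, and the function names are assumptions made for illustration.

```python
import random
from math import exp


def rbf_kernel(x_i, x_j, gamma):
    """Radial basis kernel K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return exp(-gamma * squared_distance)


def repeated_kfold(n_examples, k=5, repeats=100, seed=0):
    """Yield (repeat, fold, test_indices) for repeated k-fold cross-validation.

    Each repeat reshuffles the examples, then deals them into k folds.
    """
    rng = random.Random(seed)
    for repeat in range(repeats):
        order = list(range(n_examples))
        rng.shuffle(order)
        for fold in range(k):
            yield repeat, fold, order[fold::k]


# Hypothetical scaled expression values at four time points for two genes.
gene_a = [0.2, 0.4, 0.9, 1.0]
gene_b = [0.3, 0.4, 0.7, 1.0]

# gamma values reported for the spermatocyte and oocyte models.
similarity_gamma8 = rbf_kernel(gene_a, gene_b, gamma=8)
similarity_gamma1 = rbf_kernel(gene_a, gene_b, gamma=1)

# 288 training genes (spermatocyte model), five folds, 100 repeats.
splits = list(repeated_kfold(n_examples=288, k=5, repeats=100))
```

A larger $\gamma$ makes the kernel value fall off faster with distance, so the same pair of profiles is treated as less similar under $\gamma = 8$ than under $\gamma = 1$.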
Chromosome localization of mouse genes was based on the UCSC genome annotation database for the December 2011 assembly of the mouse genome (GRCm38/mm10). To determine whether the top-1,000 predicted genes have a preferential chromosome location, we computed a hypergeometric P-value: $P(X \ge k) = \sum_{i=k}^{\min(m,n)} \binom{n}{i}\binom{N-n}{m-i} \big/ \binom{N}{m}$, where $N$ is the number of mouse genes genome-wide, $m$ equals 1,000 (the number of top-predicted germ cell genes), $n$ is the number of genes located on a chromosome, and $k$ is the number of top-predicted genes located on that chromosome. To correct for multiple hypothesis testing, the P-value of chromosome enrichment was further subjected to Bonferroni correction. We considered the top-1,000 predicted genes to be significantly enriched on a chromosome if and only if $P(X \ge k) < 0.05/M$, where $M = 21$ is the number of chromosomes (1-19, X and Y) in the mouse.
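The enrichment test can be sketched in Python with exact integer binomial coefficients. The symbols N, m, n, k, and M follow the definitions above; the gene counts in the example are hypothetical and are not values from this study.

```python
from math import comb


def hypergeom_tail(N, m, n, k):
    """P(X >= k): at least k of the m top genes fall among the n genes on a chromosome.

    N: genes genome-wide, m: top-predicted genes, n: genes on the chromosome.
    """
    upper = min(m, n)
    numerator = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, upper + 1))
    return numerator / comb(N, m)


def is_enriched(p_value, n_tests=21, alpha=0.05):
    """Bonferroni rule from the text: significant iff p < alpha / n_tests."""
    return p_value < alpha / n_tests


# Hypothetical counts: 20,000 genes, 1,000 top genes, 800 genes on the
# chromosome, 60 of which are among the top genes (40 expected by chance).
example_p = hypergeom_tail(N=20000, m=1000, n=800, k=60)
example_significant = is_enriched(example_p)
```

The same function applies to GO term enrichment by letting n be the number of genes annotated with a term and replacing the number of tests with the number of GO terms considered.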
GO term enrichment
The full ontology file (V1.2) and mouse gene association file (V1.919) were downloaded from http://www.geneontology.org/. To identify GO terms significantly enriched among the top-1,000 predicted germ cell genes, we computed a hypergeometric P-value with the same formula as for chromosome localization, with the following changes in notation: n is the number of genes annotated by a GO term and k is the number of top-1,000 predicted genes annotated by that GO term. The P-value of GO term enrichment was corrected for multiple testing by multiplying it by the number of GO terms considered.
We used FIRE, a motif-finding algorithm, to search for over-represented motifs in the promoter regions of predicted germ cell genes. The promoter region was obtained from the UCSC Genome Browser (GRCm38/mm10) and included 8 kb upstream and 2 kb downstream of the characterized transcription start site. Exons and repetitive sequences were masked for motif searching. Motifs could be present on either the transcribed or the non-transcribed strand. Potential transcription factors were identified by comparing motifs to known binding sites of mammalian transcription factors in the JASPAR and TRANSFAC databases [53, 54] using STAMP, a tool for DNA motif matching.
All animal procedures were approved by the Washington State University Animal Care and Use Committee. The BL/6-129 mice were housed in a specific-pathogen-free facility. Adult males around 90 days postpartum were euthanized by exposure to a highly concentrated atmosphere of CO2, and testicular tissue was collected. Tissue was fixed with 4% paraformaldehyde, subsequently dehydrated with ethanol and embedded in paraffin. Tissue sections of 4 μm were placed on slides for immunofluorescence experiments.
Tissue slides were boiled in 0.01 M citrate buffer (pH 6) for 5 min, then blocked with 10% donkey serum for 30 min. Incubation with goat RPS6KA3 antibody (1:100; sc-1430, Santa Cruz Biotechnology) was performed at room temperature overnight [33, 36]. Tissue sections were subsequently incubated with Alexa Fluor 568 donkey anti-goat IgG (1:1,000; Invitrogen) at room temperature for 1 h. Tissue slides were mounted with ProLong® Gold Antifade Reagent with DAPI (Invitrogen) and digitally photographed using a Zeiss Axioplan microscope. Control experiments followed the same procedure except that the RPS6KA3 antibody was omitted. Cross-sections of testis from at least three mice were analyzed.
We would like to acknowledge Cathryn Hogarth for help with the immunofluorescence experiments and Leanne Whitmore for reading of the manuscript. This work was supported by the March of Dimes Basil O’Connor Starter Scholar Research Award #5-FY10-485 to PY.
- Koubova J, Menke DB, Zhou Q, Capel B, Griswold MD, Page DC: Retinoic acid regulates sex-specific timing of meiotic initiation in mice. Proc Natl Acad Sci USA 2006, 103: 2474-2479. 10.1073/pnas.0510813103
- Anderson EL, Baltus AE, Roepers-Gajadien HL, Hassold TJ, de Rooij DG, van Pelt AM, Page DC: Stra8 and its inducer, retinoic acid, regulate meiotic initiation in both spermatogenesis and oogenesis in mice. Proc Natl Acad Sci USA 2008, 105: 14976-14980. 10.1073/pnas.0807297105
- Hunt PA, Hassold TJ: Human female meiosis: what makes a good egg go bad? Trends Genet 2008, 24: 86-93. 10.1016/j.tig.2007.11.010
- Bowles J, Koopman P: Retinoic acid, meiosis and germ cell fate in mammals. Development 2007, 134: 3401-3411. 10.1242/dev.001107
- Hunt PA, Hassold TJ: Sex matters in meiosis. Science 2002, 296: 2181-2183. 10.1126/science.1071907
- Hogarth CA, Griswold MD: The key role of vitamin A in spermatogenesis. J Clin Invest 2010, 120: 956-962. 10.1172/JCI41303
- Chalmel F, Rolland AD, Niederhauser-Wiederkehr C, Chung SS, Demougin P, Gattiker A, Moore J, Patard JJ, Wolgemuth DJ, Jegou B, Primig M: The conserved transcriptome in human and rodent male gametogenesis. Proc Natl Acad Sci USA 2007, 104: 8346-8351. 10.1073/pnas.0701883104
- Houmard B, Small C, Yang L, Naluai-Cecchini T, Cheng E, Hassold T, Griswold M: Global gene expression in the human fetal testis and ovary. Biol Reprod 2009, 81: 438-443. 10.1095/biolreprod.108.075747
- Lawson C, Gieske M, Murdoch B, Ye P, Li Y, Hassold T, Hunt PA: Gene expression in the fetal mouse ovary is altered by exposure to low doses of bisphenol A. Biol Reprod 2011, 84: 79-86. 10.1095/biolreprod.110.084814
- Rolland AD, Lehmann KP, Johnson KJ, Gaido KW, Koopman P: Uncovering gene regulatory networks during mouse fetal germ cell development. Biol Reprod 2011, 84: 790-800. 10.1095/biolreprod.110.088443
- Schultz N, Hamra FK, Garbers DL: A multitude of genes expressed solely in meiotic or postmeiotic spermatogenic cells offers a myriad of contraceptive targets. Proc Natl Acad Sci USA 2003, 100: 12201-12206. 10.1073/pnas.1635054100
- Shima JE, McLean DJ, McCarrey JR, Griswold MD: The murine testicular transcriptome: characterizing gene expression in the testis during the progression of spermatogenesis. Biol Reprod 2004, 71: 319-330. 10.1095/biolreprod.103.026880
- Small CL, Shima JE, Uzumcu M, Skinner MK, Griswold MD: Profiling gene expression during the differentiation and development of the murine embryonic gonad. Biol Reprod 2005, 72: 492-501. 10.1095/biolreprod.104.033696
- Bellve AR: Purification, culture, and fractionation of spermatogenic cells. Methods Enzymol 1993, 225: 84-113.
- Wolgemuth DJ, Gizang-Ginsberg E, Engelmyer E, Gavin BJ, Ponzetto C: Separation of mouse testis cells on a Celsep (TM) apparatus and their usefulness as a source of high molecular weight DNA or RNA. Gamete Res 1985, 12: 1-10. 10.1002/mrd.1120120102
- Zhou Q, Nie R, Li Y, Friel P, Mitchell D, Hess RA, Small C, Griswold MD: Expression of stimulated by retinoic acid gene 8 (Stra8) in spermatogenic cells induced by retinoic acid: an in vivo study in vitamin A-sufficient postnatal murine testes. Biol Reprod 2008, 79: 35-42. 10.1095/biolreprod.107.066795
- Smith CM, Finger JH, Hayamizu TF, McCright IJ, Eppig JT, Kadin JA, Richardson JE, Ringwald M: The mouse gene expression database (GXD): 2007 update. Nucleic Acids Res 2007, 35: D618-623. 10.1093/nar/gkl1003
- Wang PJ, McCarrey JR, Yang F, Page DC: An abundance of X-linked genes expressed in spermatogonia. Nat Genet 2001, 27: 422-426. 10.1038/86927
- Wang PJ, Page DC, McCarrey JR: Differential expression of sex-linked and autosomal germ-cell-specific genes during spermatogenesis in the mouse. Hum Mol Genet 2005, 14: 2911-2918. 10.1093/hmg/ddi322
- Zhou Q, Li Y, Nie R, Friel P, Mitchell D, Evanoff RM, Pouchnik D, Banasik B, McCarrey JR, Small C, Griswold MD: Expression of stimulated by retinoic acid gene 8 (Stra8) and maturation of murine gonocytes and spermatogonia induced by retinoic acid in vitro. Biol Reprod 2008, 78: 537-545. 10.1095/biolreprod.107.064337
- Yuan Y, Xu Y, Xu J, Ball RL, Liang H: Predicting the lethal phenotype of the knockout mouse by integrating comprehensive genomic data. Bioinformatics 2012, 28: 1246-1252. 10.1093/bioinformatics/bts120
- Elemento O, Slonim N, Tavazoie S: A universal framework for regulatory element discovery across all genomes and data types. Mol Cell 2007, 28: 337-350. 10.1016/j.molcel.2007.09.027
- Medina R, Buck T, Zaidi SK, Miele-Chamberland A, Lian JB, Stein JL, van Wijnen AJ, Stein GS: The histone gene cell cycle regulator HiNF-P is a unique zinc finger transcription factor with a novel conserved auxiliary DNA-binding motif. Biochemistry 2008, 47: 11415-11423. 10.1021/bi800961d
- Linher K, Cheung Q, Baker P, Bedecarrats G, Shiota K, Li J: An epigenetic mechanism regulates germ cell-specific expression of the porcine deleted in Azoospermia-like (DAZL) gene. Differentiation 2009, 77: 335-349. 10.1016/j.diff.2008.08.001
- Thomas K, Wu J, Sung DY, Thompson W, Powell M, McCarrey J, Gibbs R, Walker W: SP1 transcription factors in male germ cell development and differentiation. Mol Cell Endocrinol 2007, 270: 1-7. 10.1016/j.mce.2007.03.001
- Hobert O, Westphal H: Functions of LIM-homeobox genes. Trends Genet 2000, 16: 75-83. 10.1016/S0168-9525(99)01883-1
- Birk OS, Casiano DE, Wassif CA, Cogliati T, Zhao L, Zhao Y, Grinberg A, Huang S, Kreidberg JA, Parker KL: The LIM homeobox gene Lhx9 is essential for mouse gonad formation. Nature 2000, 403: 909-913. 10.1038/35002622
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 2005, 120: 15-20. 10.1016/j.cell.2004.12.035
- Betel D, Wilson M, Gabow A, Marks DS, Sander C: The microRNA.org resource: targets and expression. Nucleic Acids Res 2008, 36: D149-153.
- Song R, Ro S, Michaels JD, Park C, McCarrey JR, Yan W: Many X-linked microRNAs escape meiotic sex chromosome inactivation. Nat Genet 2009, 41: 488-493. 10.1038/ng.338
- Dufresne SD, Bjorbaek C, El-Haschimi K, Zhao Y, Aschenbach WG, Moller DE, Goodyear LJ: Altered extracellular signal-regulated kinase signaling and glycogen metabolism in skeletal muscle from p90 ribosomal S6 kinase 2 knockout mice. Mol Cell Biol 2001, 21: 81-87. 10.1128/MCB.21.1.81-87.2001
- El-Haschimi K, Dufresne SD, Hirshman MF, Flier JS, Goodyear LJ, Bjorbaek C: Insulin resistance and lipodystrophy in mice lacking ribosomal S6 kinase 2. Diabetes 2003, 52: 1340-1346. 10.2337/diabetes.52.6.1340
- Lin JX, Spolski R, Leonard WJ: Critical role for Rsk2 in T-lymphocyte activation. Blood 2008, 111: 525-533. 10.1182/blood-2007-02-072207
- Pereira PM, Schneider A, Pannetier S, Heron D, Hanauer A: Coffin-Lowry syndrome. Eur J Hum Genet 2010, 18: 627-633. 10.1038/ejhg.2009.189
- Su Y, Li Y, Ye P: Mammalian meiosis is more conserved by sex than by species: conserved co-expression networks of meiotic prophase. Reproduction 2011, 142: 675-687. 10.1530/REP-11-0260
- Dumont J, Umbhauer M, Rassinier P, Hanauer A, Verlhac MH: p90Rsk is not involved in cytostatic factor arrest in mouse oocytes. J Cell Biol 2005, 169: 227-231. 10.1083/jcb.200501027
- Koslowski M, Sahin U, Huber C, Tureci O: The human X chromosome is enriched for germline genes expressed in premeiotic germ cells of both sexes. Hum Mol Genet 2006, 15: 2392-2399. 10.1093/hmg/ddl163
- Turner JM, Mahadevaiah SK, Ellis PJ, Mitchell MJ, Burgoyne PS: Pachytene asynapsis drives meiotic sex chromosome inactivation and leads to substantial postmeiotic repression in spermatids. Dev Cell 2006, 10: 521-529. 10.1016/j.devcel.2006.02.009
- Heard E, Turner J: Function of the sex chromosomes in mammalian fertility. Cold Spring Harb Perspect Biol 2011, 3: a002675. 10.1101/cshperspect.a002675
- Yan W, McCarrey JR: Sex chromosome inactivation in the male. Epigenetics 2009, 4: 452-456. 10.4161/epi.4.7.9923
- Rhodes DR, Kalyana-Sundaram S, Mahavisno V, Barrette TR, Ghosh D, Chinnaiyan AM: Mining for regulatory programs in the cancer transcriptome. Nat Genet 2005, 37: 579-583. 10.1038/ng1578
- Goodarzi H, Elemento O, Tavazoie S: Revealing global regulatory perturbations across human cancers. Mol Cell 2009, 36: 900-911. 10.1016/j.molcel.2009.11.016
- Huttenhower C, Mutungu KT, Indik N, Yang W, Schroeder M, Forman JJ, Troyanskaya OG, Coller HA: Detailing regulatory networks through large scale data integration. Bioinformatics 2009, 25: 3267-3274. 10.1093/bioinformatics/btp588
- Hayashi K, Chuva de Sousa Lopes SM, Kaneda M, Tang F, Hajkova P, Lao K, O'Carroll D, Das PP, Tarakhovsky A, Miska EA, Surani MA: MicroRNA biogenesis is required for mouse primordial germ cell development and spermatogenesis. PLoS One 2008, 3: e1738. 10.1371/journal.pone.0001738
- Tong MH, Mitchell D, Evanoff R, Griswold MD: Expression of Mirlet7 family microRNAs in response to retinoic acid-induced spermatogonial differentiation in mice. Biol Reprod 2011, 85: 189-197. 10.1095/biolreprod.110.089458
- Ro S, Park C, Sanders KM, McCarrey JR, Yan W: Cloning and expression profiling of testis-expressed microRNAs. Dev Biol 2007, 311: 592-602. 10.1016/j.ydbio.2007.09.009
- Aguilar AL, Piskol R, Beitzinger M, Zhu JY, Kruspe D, Aszodi A, Moser M, Englert C, Meister G: The small RNA expression profile of the developing murine urinary and reproductive systems. FEBS Lett 2010, 584: 4426-4434. 10.1016/j.febslet.2010.09.050
- Chikina MD, Huttenhower C, Murphy CT, Troyanskaya OG: Global prediction of tissue-specific gene expression and context-dependent gene networks in Caenorhabditis elegans. PLoS Comput Biol 2009, 5: e1000417. 10.1371/journal.pcbi.1000417
- Shen-Orr SS, Tibshirani R, Khatri P, Bodian DL, Staedtler F, Perry NM, Hastie T, Sarwal MM, Davis MM, Butte AJ: Cell type-specific gene expression differences in complex tissues. Nat Methods 2010, 7: 287-289. 10.1038/nmeth.1439
- Ghosh D: Mixture models for assessing differential expression in complex tissues using microarray data. Bioinformatics 2004, 20: 1663-1669. 10.1093/bioinformatics/bth139
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA 2004, 101: 6062-6067. 10.1073/pnas.0400782101
- Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel A: e1071: misc functions of the department of statistics. Version 1.6. 2011. http://cran.r-project.org/web/packages/e1071/index.html
- Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B: JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 2004, 32: D91-94. 10.1093/nar/gkh012
- Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV: TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 2003, 31: 374-378. 10.1093/nar/gkg108
- Mahony S, Benos PV: STAMP: a web tool for exploring DNA-binding motif similarities. Nucleic Acids Res 2007, 35: W253-258. 10.1093/nar/gkm272
- Ray D, Hogarth CA, Evans EB, An W, Griswold MD, Ye P: Experimental validation of Ankrd17 and Anapc10, two novel meiotic genes predicted by computational models in mice. Biol Reprod 2012, 86: 102. 10.1095/biolreprod.111.095216
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.