GigaTON: an extensive publicly searchable database providing a new reference transcriptome in the pacific oyster Crassostrea gigas
© Riviere et al. 2015
Received: 20 May 2015
Accepted: 24 November 2015
Published: 2 December 2015
The Pacific oyster, Crassostrea gigas, is one of the most important aquaculture shellfish resources worldwide. Important efforts have been undertaken towards a better knowledge of its genome and transcriptome, which makes now C. gigas becoming a model organism among lophotrochozoans, the under-described sister clade of ecdysozoans within protostomes. These massive sequencing efforts offer the opportunity to assemble gene expression data and make such resource accessible and exploitable for the scientific community. Therefore, we undertook this assembly into an up-to-date publicly available transcriptome database: the GigaTON (Gigas TranscriptOme pipeliNe) database.
We assembled 2204 million sequences obtained from 114 publicly available RNA-seq libraries that were realized using all embryo-larval development stages, adult organs, different environmental stressors including heavy metals, temperature, salinity and exposure to air, which were mostly performed as part of the Crassostrea gigas genome project. This data was analyzed in silico and resulted into 56621 newly assembled contigs that were deposited into a publicly available database, the GigaTON database. This database also provides powerful and user-friendly request tools to browse and retrieve information about annotation, expression level, UTRs, splice and polymorphism, and gene ontology associated to all the contigs into each, and between all libraries.
The GigaTON database provides a convenient, potent and versatile interface to browse, retrieve, confront and compare massive transcriptomic information in an extensive range of conditions, tissues and developmental stages in Crassostrea gigas. To our knowledge, the GigaTON database constitutes the most extensive transcriptomic database to date in marine invertebrates, thereby a new reference transcriptome in the oyster, a highly valuable resource to physiologists and evolutionary biologists.
KeywordsOyster Transcriptome Development Stress Tissues NGS
The Pacific Oyster, Crassostrea gigas, is a bivalve mollusk that belongs to lophotrochozoans, an under-described phylogenetic group among protostomes despite being the sister clade of ecdysozoans, which encompass insects and nematodes. C. gigas is an intertidal bivalve with a pelagic larval phase followed by a benthic adult life. Pacific oysters are farmed in intertidal or nearshore areas that undergo highly variable environmental conditions as well as an important anthropization, making C. gigas considered a sentinel species. Besides, C. gigas constitutes one of the most important shellfish aquaculture resources worldwide with an estimated market value reaching 1.3 billion dollars in 2013 (FAO). In this context, knowledge about the physiology of C. gigas is of fundamental, ecological and applied interests, leading to the generation of an important amount of transcriptomic data, as illustrated by the Gigasdatabase expressed-sequenced tags (EST) database . These ESTs have enabled the development of microarrays used in physiological studies notably focusing on reproduction [2, 3], resistance to summer mortality [4–6], hypoxia , thermal stress  or ploïdy . Additionally, MPSS or RNA-Seq analyses have been performed to investigate molecular processes involved in larval growth heterosis , immune responses to pathogenic bacteria  or response to salinity stress . In many studies, biological understanding of data was hampered by limited annotation of transcripts, notably due to incomplete assembly or lack of reference transcriptome resource.
Aside from transcriptome studies, the genome of the Pacific oyster has recently been sequenced and assembled, and is now available for the community to explore . The present assembly of the oyster genome has been annotated and genes were predicted in silico by comparison with other species. Together with the genome characterization, extensive transcriptomic data have also been generated from a very broad range of developmental stages and physiological conditions giving rise to 114 RNA-seq libraries. These data were directly mapped to the genome and used to gain insights into developmental, shell formation and stress response issues in the oyster from studies based on differential expression of the predicted genes.
Up to date, these RNA-seq data were mostly used to validate the published version of the genome assembly. In addition, such a massive sequencing effort  offers the opportunity to gain more insights into the transcriptome of the Pacific oyster at different development stages and under different environmental conditions. Insights into these features are critical for our understanding of many aspects of the biology in marine bivalves and their evolution. However, there is to date, to the best of our awareness, no convenient publicly accessible resource (i) where all the transcriptomic data is assembled and gathered and (ii) that is able to answer the need for a convenient way to browse into such an important amount of data. Therefore, we further exploited the C. gigas transcriptomic data generated in parallel to the genome characterization by Zhang et al , by per-sample assembling, then meta-assembling of the obtained contigs, the 2204 million reads from the 114 RNA-seq libraries. We provide these results including sequence of complete open reading frames (ORFs) (thereby protein), presence of transcript variants, knowledge of 5’ and 3’-untranslated regions, polymorphism and expression levels between physiological stages and conditions into an extensive, searchable, highly potent and user-friendly online database based on RNABrowse , the GigaTON database.
Construction and content
System design and implementation
- 1.Data acquisition
The 114 RNA-seq libraries were generated as a part of the oyster genome project , except for two of them that were used in a study dealing with salinity stress response . They comprise 99 single and 15 paired raw read files and have been downloaded in fastq format, from the NCBI SRA website (study number SRP014559 and SRP013236, respectively).
- 2.Sequence/data cleansing
The reads were first cleaned from remaining sequencing adapters using trim_galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Then the over-represented reads were filtered out using the normalize_by_kmer_coverage.pl script from the Trinity software package . The next step aimed at discarding the invalid base calls by extracting the longest sub-sequence without unidentified nucleotides (Ns) from each read. If the length of the longest sub-sequence did not exceed half of the sequence length, the read and his mate, if present, were removed. This last step was performed using an in-house script (available upon request).
- 3.Contig assembly
The assembly was performed in two steps described hereafter. A per sample assembly was first performed, allowing us to build a contig set for each sample. Then, all sample contig sets were merged, then duplicates were removed and references with very low read counts were filtered out.
For each sample, nine Oases  assemblies using different k-mers (25, 31, 37, 43, 49, 55, 61, 65 and 69) were performed on cleaned reads. Each assembly produced a transcripts.fa contig file, which header includes an assembly locus. Only the longest and most covered (read depth) loci were kept using the OasesV0.2.04OutputToCsvDataBase.py script developed by a Brown University team (https://code.google.com/p/oases-to-csv/). After that, all files were merged. Finally, anti-sense chimeras (accidentally produced by the assembly step) were split or removed with an in-house script (available upon request).
Because similar collections of contigs were produced by different k-mers, a cd-hit-est  step was used to remove redundant contigs based on their sequence similarities (parameters -M 0 -d 0 -c 0.98 -T 8). Because different kmers sometimes constructed different transcript parts, TGICL  was used to assemble contigs having significant overlaps (parameters -c 4 -l 60 -p 96 -s 100000). The contigs were then filtered on a minimum length of 200 base pairs.
Input reads were mapped back to the contigs using bwa aln . The resulting alignment files were used to correct the contig sequences from spurious insertions and deletions using an in-house script (available upon request) and to filter out those with very low coverage. The filter excluded contigs with less than two mapped reads per million.
All sample contig fasta files were merged and each contig renamed by adding the sample name at the beginning of its name. The longest ORF was then searched with getorf from EMBOSS . A cd-hit clustering was performed on the ORFs with sequence identity equal or greater than 0.9 in order to extract from each cluster, the contig with the longest ORF or, if the ORF size are the same, the longest contig. The remaining contigs were clustered by cd-hit-est with sequence identity equal or greater than 0.95. Input reads from all conditions were then mapped back to the contigs using bwa aln . Contigs with very low coverage, less than two mapped reads per million, were filtered out.
The final set contained 56,651 contigs, corresponding to a total of 93,954,163 base pairs. The median and mean length of the contigs were respectively 1180 and 1,659.35 base pairs. The longest contig contained 66,407 base pairs (Fig. 1).
- 4.Functional annotation
The contig set was annotated using blast versus several databases including uniprot-swissprot, refseq protein, refseq_rna, rfam, Ensembl oyster proteins and the contigs themselves.
All sample read sets were aligned on the final contig set using bwa aln to quantify the expression. A quantification table was built with these counts. It can be retrieved from the download section of the website (count_matrix.csv).
- 6.Variant calling
The alignment files were processed with Samtools mpile-up. Variants including SNPs and INDELs were called using the following command line and parameters : samtools mpileup -EDu -d 10000 -q 30 | bcftools view -vcg | vcfutils.pl varFilter -d 6 -D 10000 -a 2 vcftools --maf 0.15. A subset of 336,018 variants was stored in the database and the complete variant files containing 2,144,528 SNPs and 77,274 INDELs can be found in the download section of the website. The polymorphisms have been annotated using snpEff v4.15  using the transdecoder open reading frame annotations made on the contigs. The complete annotated VCF files have been added to the download page of the website, one for SNPs (ALL_SNP.oysterAssembly1.0_annot1.vcf) and another one for INDELs (ALL_INDEL.oysterAssembly1.0_annot1.vcf). The results indicate that out of the 2,181,748 SNPs found, 33,98 % cause synonymous changes. The impact of SNPs and INDELS are detailed in Table 1 and Table 2, respectively.Table 1
Impact of SNPs on the GigaTON protein set
TYPE (alphabetical order)
5’-UTR premature start codon gain variant
Initiator codon variant
Missense variant + splice region variant
Splice regioin variant
Splice region variant + synonymous variant
Stop gained + splice region variant
Stop retained variant
33.985 %Table 2
Impact of INDELs on the GigaTON protein set
TYPE (alphabetical order)
Chromosome number variation
Disruptive in-frame deletion
Disruptive in-frame insertion
Frameshift variant + start lost
Frameshift variant + stop gained
Frameshift variant + stop lost
Start lost + disruptive in-frame deletion
Start lost + in-frame deletion
Start lost + in-frame insertion
Stop gained + disruptive in-frame deletion
Stop gained + disruptive in-frame insertion
Stop gained + in-frame insertion
Stop lost + in-frame deletion
- 7.RNABrowse environment setting, database features and request tools
Differential analyses: Venn diagrams and Digital Differential Display (DDD). The GigaTON database can be requested for differential analysis of contig content (Venn diagrams) and expression level between libraries (DDD) using a Fisher’s exact test. These requests are easily performed using a ‘point-and-click’ interface for sample library grouping, analysis parameters and results exploitation (browsing and downloading) (Fig. 2). A detailed description of each library can be found in the download section of the GigaTON website (libraries description .xlsx).
Contig search and browser: the BioMart portal. The information about the contigs can be searched, browsed and retrieved online using the Biomart portal (www.biomart.org) . Briefly, the GigaTON contigs or subgroup of contigs can be filtered out according to various criteria and sub-criteria (name, length, annotation number in a wide range of repository databases, Gene Ontology, keyword, expression), and retrieved with their corresponding attributes selected by the user among an extensive list of 71, including: filter criteria, best annotation identifier, hit start/stop position, number of variants, DNA sequence, expression pattern in tissues, developmental stages and various physiological conditions (Fig. 3). The retrieved contigs can be browsed further for additional information (sequence, counterparts in other databases, expression between libraries) in convenient and user-friendly point-and-click interface tabs (Fig. 4).
Utility and discussion
Gain of knowledge regarding existing databases
First, the novelty and gains of GigaTON are novel contigs. Among the 56621 contigs in GigaTON, 53,515 contigs (94,5 %) align to the Ensembl Oyter genome, with a vast majority displaying 95 % identity and 90 % coverage. A contig can be considered as ‘novel’ when its sequence is unknown or present in databases but not annotated, therefore including those which do not align with the genome, in addition to those that align with unannotated areas of the genome. The GigaTON database contains 38,316 contigs (67.7 %) that display an overlap with one of the 28636 genes contained in the oyster Ensembl GTF file. The remaining 18305 ‘novel’ contigs have been aligned to the RefSeq protein database, (http://www.ncbi.nlm.nih.gov/genome/annotation_euk/Crassostrea_gigas/100/) which contains more transcripts (45,394 gene models). Out of the 18,305 ‘novel’ contigs, 7750 produce significant alignments with a RefSeq protein. Among these contigs, 2241 have no hit on the Ensembl oyster genome and 1779 match oyster proteins in the RefSeq database, whereas the remaining match known proteins in other organisms. More drastically, 11,817 ‘novel’ contigs within GigaTON did not match any annotated region of the genome nor any oyster RefSeq protein, whereas 1078 have a RefSeq annotation in other organisms, mainly Lottia gigantea, Saccoglossus kowalevskii, Hydra vulgaris, Nematostella vectensis, Branchiostoma floridae and Strongylocentrus purpuratus (139, 83, 76, 72 62 and 52 hits, respectively). The distribution statistics of the length of the ‘known’ vs. ‘novel’ contigs concerning the coding regions are : minimum, 200 vs. 202; 1st quartile, 821 vs. 500; median 1372 vs. 730; mean, 1860 vs. 899; 3rd quartile, 2258 vs. 1070; maximum, 66410 vs. 19500, respectively (Fig. 5a).
An example of a novel contig in GigaTON is the contig ‘CHOYP_contig_047748’ that corresponds to the sequence encoding the precursor of a novel oyster neuropeptide family named “RxIamide”. This sequence does not match any oyster Ensembl predicted gene, cDNA or protein, despite (i) ESTs present identity (HS206944, HS234774), and (ii) the gene products were characterized using mass spectrometry . However, the considered sequence matches unannotated genomic sequences in the scaffold 40412 (43,230-43,117; 30,085-29,480). This situation is not unique and was also met for a number of neuropeptide-encoding genes (Buccalin, cerebrin, GGN-related peptide, LRNFVamide, opioid neuropeptide-like, tachykinin-like) . Such examples illustrate the gain brought by GigaTON database, which, in addition to bringing new transcriptomic information, might help improving gene prediction.
Second, the GigaTON contigs provide insights into the 5’ and 3’ untranslated regions (UTRs) of the oyster transcribed sequences (Fig. 3 and Fig. 5b). Importantly, the 5’ information is a significant asset towards the determination of a consensus promoter sequence in a bivalve mollusk, which may help unlock functional genomics studies in lophotrochozoans. Besides, there is a widely admitted approximation that considers the 1 kb sequence upstream translation starts as being proximal promoter sequences, which is commonly used at present for the interpretation of NGS data [24–26]. However, the mean 5’-additional information we provide in the GigaTON database is around 600 pb (Fig. 5b), indicating a mean 600 pb length for 5’UTRs in the Pacific oyster. Therefore the aforementioned approximation presents a potential severe bias since up to 60 % of the sequence information might be mis-considered as promoter sequence instead of UTR, and thereby should be cautiously considered, as illustrated for some Hox genes orthologues . This contradicts the general assumption that oysters, like other lophotrochozoans, have very short 5’-UTRs. However, because the length of the predicted proteins corresponding to the GigaTON contigs also displays a great increase (Fig. 5d), the additional 5’ sequence information can also correspond to coding DNA, thereby minimizing the flaw in proximal promoter approximation previously stated. The same rationale can be proposed regarding the additional 3’ sequence information provided within the GigaTON contigs as being 3’UTRs and/or coding regions. This point is important for the understanding of the mi/piRNA pathway in oysters, which mostly affects 3’-UTRs in vertebrates [28, 29] but could also highlight alternative polyadenylation signals in mollusks. Therefore, additional work, including functional analyses dedicated to the determination of consensus promoter and/or polyadenylation signals, are required to decipher such potentially important issues for our understanding of gene expression regulation in the Pacific oyster, and should take great advantage of the present database.
Second, the GigaTON pipeline led to the identification of a high number of assembly variants, since 30 % of the contigs display at least one variant (Fig. 5b). Some of these assembly variants highly likely correspond to transcript variants that could reflect alternative splicing when these variations are located within ORFs. Alternatively, when variations are located within UTRs, the presence of transcript variants could illustrate the usage of alternative promoters and/or polyadenylation signals. Both can have important functional outcomes in terms of protein structure or gene expression level regulation. As an illustration, the contig CHOYP_41.3.3 (Fig. 4) exemplifies a putative transcript variant that might lack a region between two functional domains, thereby diminishing the length of the protein with probable physiological consequences.
Third, the great increase in length for a number of proteins as mentioned above, points out that the in silico protein prediction from the genome, which is mostly based on homology with other invertebrates, is significantly improved by the present assembly. This indicates a noticeable difference in protein sets between the Pacific oyster and other invertebrates. Such a discrepancy between invertebrates seems surprising, except if one considers the growing number of documented lophotrochozoan features, such as DNA methylation patterns [24, 30], angiotensin-converting enzyme characteristics [31, 32] or BMP cDNA sequences , that display a closer resemblance with vertebrates than with ecdysozoans. Besides, in a similar fashion than mapping long 454 reads helped assembling genomic scaffolds , the GigaTON database provides even longer contigs that we think could importantly help improve the current assembly of the oyster genome. These three features might be of functional significance for our understanding of the oyster physiology and support the importance of the GigaTON database, which encompasses a high diversity of transcriptomes, in the context of post-genomic studies in lophotrochozoans.
Easiness, potency and versatility of the tools
The GigaTON database provides, to our knowledge, the most complete and integrated transcriptomic resource to date in a marine mollusk. A remarkable strength of this database is the possibility to browse data quickly and easily to get a global picture of the user’s question using Venn diagrams or DDD, which can easily be further examined in deep details. This is noticeably allowed by the use of the BioMart portal within GigaTON, which links extremely massive sequence and expression data in various physiological conditions with a user-friendly, easy to use, point-and-click interface. Our searchable online database enables a high versatility tool both in terms of request criteria and retrieved attributes that can be easily downloaded and further analyzed using dedicated software packages. In parallel, specific contig characteristics such as transcript variants, polymorphism and expression can be browsed online in the graphic interface. Furthermore, the included browser allows a convenient view of the target contig counterparts in numerous other databases at a single glance (Fig. 4c), without any particular need for the user to be skilled in bioinformatics. Therefore, the GigaTON database is most likely to be extensively accessed by an increasing scientific community studying evolutionary, environmental or aquaculture related aspects in C. gigas. Together with the current genome assembly, it constitutes one of its strong assets as an emerging model species.
In this work, we assembled 114 RNA-seq libraries, mostly generated aside of the oyster genome project, into a publicly accessible database that includes user-friendly ‘search, browse and retrieve’ online tools, the GigaTON database. To date, the GigaTON database constitutes the most extensive transcriptomic resource in a marine invertebrate to our knowledge, and sets the new reference transcriptome in C. gigas, one of the most important shellfish aquaculture resources worldwide.
Availability and requirements
The GigaTON database is available at http://gigaton.sigenae.org. There are no restrictions for its use by non-academics.
expressed sequence tag
open reading frame
The authors are grateful to Alan Semple (Dublin, Ireland) for critical reading of the manuscript. This work was funded by the Agence Nationale de la Recherche (grant number ANR-08-GENM-04 ‘Gametogenes’).
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Fleury E, Huvet A, Lelong C, De LJ, Boulo V, Gueguen Y, et al. Generation and analysis of a 29 , 745 unique Expressed Sequence Tags from the Pacific oyster ( Crassostrea gigas ) assembled into a publicly accessible database : the GigasDatabase. BMC Genomics. 2009;15:1–15.Google Scholar
- Dheilly NM, Lelong C, Huvet A, Favrel P. Development of a Pacific oyster (Crassostrea gigas) 31,918-feature microarray: identification of reference genes and tissue-enriched expression patterns. BMC Genomics. 2011;12:468.View ArticlePubMedPubMed CentralGoogle Scholar
- Dheilly NM, Lelong C, Huvet A, Kellner K, Dubos M-P, Riviere G, et al. Gametogenesis in the Pacific oyster Crassostrea gigas: a microarrays-based analysis identifies sex and stage specific genes. PLoS One. 2012;7:e36353.View ArticlePubMedPubMed CentralGoogle Scholar
- Fleury E, Moal J, Boulo V, Daniel J, Mazurais D. Microarray-Based Identification of Gonad Transcripts Differentially Expressed Between Lines of Pacific Oyster Selected to Be Resistant or Susceptible to Summer Mortality. Mar Biotechnol. 2010;12:326–39.View ArticlePubMedGoogle Scholar
- Jouaux A, Lafont M, Blin J-L, Houssin M, Mathieu M, Lelong C. Physiological change under OsHV-1 contamination in Pacific oyster Crassostrea gigas through massive mortality events on fields. BMC Genomics. 2013;14:590.View ArticlePubMedPubMed CentralGoogle Scholar
- Chaney ML, Gracey AY. Mass mortality in Pacific oysters is associated with a specific gene expression signature. Mol Ecol. 2011;20:2942–54.View ArticlePubMedGoogle Scholar
- Sussarellu R, Fabioux C, Le Moullac G, Fleury E, Moraga D. Transcriptomic response of the Pacific oyster Crassostrea gigas to hypoxia. Mar Genomics. 2010;3:133–43.View ArticlePubMedGoogle Scholar
- Lang RP, Bayne CJ, Camara MD, Cunningham C, Jenny MJ, Langdon CJ: Transcriptome Profiling of Selectively Bred Pacific Oyster Crassostrea gigas Families that Differ in Tolerance of Heat Shock. Mar Biotechnol 2009;11(5):650-68.
- Dheilly NM, Jouaux A, Boudry P, Favrel P, Lelong C. Transcriptomic profiling of gametogenesis in triploid Pacific Oysters Crassostrea gigas: towards an understanding of partial sterility associated with triploidy. PLoS One. 2014;9:e112094.View ArticlePubMedPubMed CentralGoogle Scholar
- Hedgecock D, Lin J, Decola S, Haudenschild CD, Meyer E, Manahan DT, et al. Transcriptomic analysis of growth heterosis in larval Pacific oysters (Crassostrea gigas). Proc Natl Acad Sci USA. 2007;104:2313–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhang L, Li L, Guo X, Litman GW, Dishaw LJ, Zhang G. Massive expansion and functional divergence of innate immune genes in a protostome. Sci Rep. 2015;5:8693.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhang L, Hou R, Su H, Hu X, Wang S, Bao Z. Network analysis of oyster transcriptome revealed a cascade of cellular responses during recovery after heat shock. PLoS One. 2012;7:e35484.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhang GG, Fang X, Guo X, Li L, Luo R, Xu F, et al. The oyster genome reveals stress adaptation and complexity of shell formation. Nature. 2012;490:49–54.View ArticlePubMedGoogle Scholar
- Mariette J, Noirot C, Nabihoudine I, Bardou P, Hoede C, Djari A, et al. RNAbrowse: RNA-Seq de novo assembly results browser. PLoS One. 2014;9:e96821.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhao X, Yu H, Kong L, Li Q. Transcriptomic responses to salinity stress in the Pacific oyster Crassostrea gigas. PLoS One. 2012;7:e46244.View ArticlePubMedPubMed CentralGoogle Scholar
- Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.View ArticlePubMedGoogle Scholar
- Schulz MH, Zerbino DR, Vingron M, Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinforma. 2012;28(8):1086–92.View ArticleGoogle Scholar
- Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinforma. 2006;22(13):1658–9.View ArticleGoogle Scholar
- Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, et al. TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinforma. 2003;19(5):651–2.View ArticleGoogle Scholar
- Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinforma. 2009;25(14):1754–60.View ArticleGoogle Scholar
- Rice P, Longden I, Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–7.View ArticlePubMedGoogle Scholar
- Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012;6:80–92.View ArticlePubMedPubMed CentralGoogle Scholar
- Stewart MJ, Favrel P, Rotgans BA, Wang T, Zhao M, Sohail M, et al. Neuropeptides encoded by the genomes of the Akoya pearl oyster Pinctata fucata and Pacific oyster Crassostrea gigas: a bioinformatic and peptidomic survey. BMC Genomics. 2014;15:840.View ArticlePubMedPubMed CentralGoogle Scholar
- Riviere G. Epigenetic features in the oyster Crassostrea gigas suggestive of functionally relevant promoter DNA methylation in invertebrates. Front Physiol. 2014;5:129.PubMedPubMed CentralGoogle Scholar
- Gavery MR, Roberts SB. Predominant intragenic methylation is associated with gene expression characteristics in a bivalve mollusc. PeerJ. 2013;1:e215.View ArticlePubMedPubMed CentralGoogle Scholar
- Olson CE, Roberts SB. Genome-wide profiling of DNA methylation and gene expression in Crassostrea gigas male gametes. Front Physiol. 2014;5:224.View ArticlePubMedPubMed CentralGoogle Scholar
- Saint-Carlier E, Riviere G. Regulation of Hox orthologues in the oyster Crassostrea gigas evidences a functional role for promoter DNA methylation in an invertebrate. FEBS Lett. 2015;589:1459–66.View ArticlePubMedGoogle Scholar
- Jing Q, Huang S, Guth S, Zarubin T, Motoyama A, Chen J, et al. Involvement of microRNA in AU-rich element-mediated mRNA instability. Cell. 2005;120:623–34.View ArticlePubMedGoogle Scholar
- Lewis B, Burge C, Bartel D. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005;120:15–20.View ArticlePubMedGoogle Scholar
- Sarda S, Zeng J, Hunt BG, Yi SV. The evolution of invertebrate gene body methylation. Mol Biol Evol. 2012;29:1907–16.View ArticlePubMedGoogle Scholar
- Rivière G, Michaud A, Deloffre L, Vandenbulcke F, Levoye A, Breton C, et al. Characterization of the first non-insect invertebrate functional angiotensin-converting enzyme (ACE): leech TtACE resembles the N-domain of mammalian ACE. Biochem J. 2004;382(Pt 2):565–73.View ArticlePubMedPubMed CentralGoogle Scholar
- Riviere G, Fellous A, Franco A, Bernay B, Favrel P. A Crucial Role in Fertility for the Oyster Angiotensin- Converting Enzyme Orthologue CgACE. PLoS One. 2011;6:e27833.View ArticlePubMedPubMed CentralGoogle Scholar
- Lelong C, Mathieu M, Favrel P. Structure and expression of mGDF, a new member of the transforming growth factor-beta superfamily in the bivalve mollusc Crassostrea gigas. Eur J Biochem. 2000;267:3986–93.View ArticlePubMedGoogle Scholar