The main objectives of this study were sequencing, assembling, and annotation of chloroplast genome of one of the main Siberian boreal forest tree conifer species Siberian larch (Larix sibirica Ledeb.) and detection of polymorphic genetic markers – microsatellite loci or simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs).
We used the data of the whole genome sequencing of three Siberian larch trees from different regions - the Urals, Krasnoyarsk, and Khakassia, respectively. Sequence reads were obtained using the Illumina HiSeq2000 in the Laboratory of Forest Genomics at the Genome Research and Education Center of the Siberian Federal University. The assembling was done using the Bowtie2 mapping program and the SPAdes genomic assembler. The genome annotation was performed using the RAST service. We used the GMATo program for the SSRs search, and the Bowtie2 and UGENE programs for the SNPs detection. Length of the assembled chloroplast genome was 122,561 bp, which is similar to 122,474 bp in the closely related European larch (Larix decidua Mill.). As a result of annotation and comparison of the data with the existing data available only for three larch species - L. decidua, L. potaninii var. chinensis (complete genome 122,492 bp), and L. occidentalis (partial genome of 119,680 bp), we identified 110 genes, 34 of which represented tRNA, 4 rRNA, and 72 protein-coding genes. In total, 13 SNPs were detected; two of them were in the tRNA-Arg and Cell division protein FtsH genes, respectively. In addition, 23 SSR loci were identified.
The complete chloroplast genome sequence was obtained for Siberian larch for the first time. The reference complete chloroplast genomes, such as one described here, would greatly help in the chloroplast resequencing and search for additional genetic markers using population samples. The results of this research will be useful for further phylogenetic and gene flow studies in conifers.