Web-based design and analysis tools for CRISPR base editing
BMC Bioinformatics volume 19, Article number: 542 (2018)
Owing to its simplicity and high efficiency, the CRISPR-Cas system has been widely used as a genome editing tool. Recently, CRISPR base editors, which consist of deactivated Cas9 (dCas9) or Cas9 nickase (nCas9) linked to a cytidine or an adenine deaminase, have been developed. Base editing tools will be very useful for gene correction because they can produce highly specific DNA substitutions without the introduction of any donor DNA, but dedicated web-based tools to facilitate their use have not yet been developed.
We present two web tools for base editors, named BE-Designer and BE-Analyzer. BE-Designer provides all possible base editor target sequences in a given input DNA sequence with useful information including potential off-target sites. BE-Analyzer, a tool for assessing base editing outcomes from next generation sequencing (NGS) data, provides information about mutations in a table and interactive graphs. Furthermore, because the tool runs client-side, large amounts of targeted deep sequencing data (< 1 GB) do not need to be uploaded to a server, substantially reducing running time and increasing data security. BE-Designer and BE-Analyzer can be freely accessed at http://www.rgenome.net/be-designer/ and http://www.rgenome.net/be-analyzer/, respectively.
We developed two web tools for CRISPR base editors: BE-Designer, which designs target sequences, and BE-Analyzer, which analyzes NGS data from experimental results.
CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR associated), an immune system in bacteria and archaea that targets nucleic acids of viruses and plasmids, is now widely used as a genome editing tool because of its convenience and high efficiency [1,2,3,4,5]. The most popular endonuclease, type II CRISPR-Cas9, makes DNA double-stranded breaks (DSBs) at a desired site with the help of its single-guide RNA (sgRNA) [6,7,8]. The DSBs provoke the cell’s own repair systems: error-prone non-homologous end joining (NHEJ) and error-free homology-directed repair (HDR), resulting in gene knock-out and knock-in (or gene correction), respectively. However, it is relatively difficult to induce gene corrections such as single-nucleotide substitutions because HDR occurs rarely in mammalian cells compared to NHEJ. Furthermore, Cas9 can frequently induce DSBs at undesired sites with sequences similar to that of the sgRNA [10, 11].
Recently, CRISPR-mediated base editing tools have been developed. These tools enable the direct conversion of one nucleotide to another without producing DSBs in the target sequence and without the introduction of donor DNA templates. The initial base editors (named BEs), composed of dCas9 or nCas9 linked to a cytidine deaminase such as APOBEC1 (apolipoprotein B editing complex 1) or AID (activation-induced deaminase), convert C to T. Later, adenine base editors (ABEs) were constructed by using tRNA adenine deaminase (TadA), evolved to enable the direct conversion of A to G in DNA. Because of their ability to make highly specific DNA substitutions, these base editing tools will be very useful for gene correction [17,18,19,20,21,22], but to the best of our knowledge, a user-friendly and freely available web-based tool for their design and analysis has not yet been developed.
BE-Designer is an sgRNA design tool for CRISPR base editors. BE-Designer rapidly provides a list of all possible sgRNA sequences from a given input DNA sequence along with useful information: possible editable sequences in a target window, relative target positions, GC content, and potential off-target sites. The BE-Designer interface was developed using Django as the backend framework.
Input panels in BE-Designer
BE-Designer presently provides analysis for CRISPR base editors based on SpCas9 from Streptococcus pyogenes, which recognizes 5’-NGG-3′ protospacer-adjacent motif (PAM) sequences, as well as SpCas9 variants: SpCas9-VQR (5’-NGAN-3′), SpCas9-EQR (5’-NGAG-3′), SpCas9-VRER (5’-NGCG-3′), and xCas9 3.7 (TLIKDIV SpCas9; 5’-NGR-3′ and 5’-NG-3′) [23,24,25]. BE-Designer also provides analysis for CRISPR base editors based on StCas9 from Streptococcus thermophilus (5’-NNAGAAW-3′), CjCas9 from Campylobacter jejuni (5’-NNNVRYAC-3′), SaCas9 from Staphylococcus aureus (5’-NNGRRT-3′), and its engineered form, SaCas9-KKH (5’-NNNRRT-3′) [26,27,28]. Currently, BE-Designer supports sgRNA design in 319 different organisms, including vertebrates, insects, plants, and bacteria. Users can input DNA sequences directly in the target sequence panel of the web site or upload a text file containing DNA sequences. The DNA sequence should be a raw string composed of IUPAC nucleotide codes or FASTA-formatted text. By using an analysis parameter, users can manually select the type of base editor, either BE or ABE, and the base editing window in the target DNA (Fig. 1a).
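The PAM sequences listed above are written with IUPAC degenerate codes (N, R, W, V, Y). As a minimal sketch of how such a PAM can be tested against a concrete site, the following Python snippet expands each code into a regular-expression character class. The function names pam_to_regex and matches_pam are hypothetical and are not taken from the BE-Designer source.

```python
import re

# IUPAC nucleotide codes mapped to the concrete bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_to_regex(pam):
    """Translate an IUPAC PAM such as 'NNGRRT' into a compiled regex."""
    return re.compile("".join(f"[{IUPAC[code]}]" for code in pam.upper()))


def matches_pam(pam, site):
    """Return True if `site` (plain A/C/G/T) satisfies the IUPAC `pam`."""
    if len(site) != len(pam):
        return False
    return pam_to_regex(pam).fullmatch(site.upper()) is not None


print(matches_pam("NGG", "TGG"))            # SpCas9 PAM
print(matches_pam("NNGRRT", "CTGAGT"))      # SaCas9 PAM
print(matches_pam("NNNVRYAC", "AAATGCAC"))  # CjCas9 PAM; T is not a V base
```

Expanding the codes once per PAM keeps the per-site check to a single regex match, which is convenient when scanning long input sequences.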
Selection of sgRNAs
Within a given DNA sequence, BE-Designer finds all possible target sites based on input parameters; in the base editing window, target nucleotides are highlighted in red, and their relative position and GC content are indicated. BE-Designer then invokes Cas-OFFinder to search throughout the entire genome of interest for possible off-target sequences that differ by up to 2 nucleotides from the on-target sequences (Additional file 1: Figure S1).
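The selection step can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the BE-Designer implementation: it scans only the forward strand, assumes a 20-nt protospacer followed by an NGG PAM, numbers protospacer positions from 1 at the PAM-distal end, and keeps only sites with at least one editable base in the window. The names find_targets, gc_content, and count_mismatches are hypothetical; count_mismatches only illustrates the "up to 2 nucleotides" criterion and does not replace Cas-OFFinder.

```python
def gc_content(seq):
    """Percentage of G or C bases in `seq`."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_targets(dna, window=(4, 8), target_base="C", spacer_len=20):
    """List forward-strand protospacers followed by an NGG PAM.

    `window` holds 1-based, inclusive protospacer positions (assumption).
    Sites without `target_base` inside the window are skipped.
    """
    dna = dna.upper()
    lo, hi = window
    hits = []
    for start in range(len(dna) - spacer_len - 2):
        spacer = dna[start:start + spacer_len]
        pam = dna[start + spacer_len:start + spacer_len + 3]
        if pam[1:] != "GG":
            continue
        editable = [p for p in range(lo, hi + 1) if spacer[p - 1] == target_base]
        if editable:
            hits.append({
                "start": start,
                "spacer": spacer,
                "pam": pam,
                "editable_positions": editable,
                "gc_percent": gc_content(spacer),
            })
    return hits


for hit in find_targets("GAGCCTAGCTAGGATCGATCTGGAA"):
    print(hit["spacer"], hit["pam"], hit["editable_positions"], hit["gc_percent"])
```

For an ABE, the same function could be called with target_base="A" and the window appropriate to that editor.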
Owing to its high sensitivity and precision, targeted deep sequencing is the best method for assessing the results of base editing. BE-Analyzer accepts targeted deep-sequencing data and analyzes them to calculate base conversion ratios. In addition to the interactive table and graphs showing the results, BE-Analyzer also provides a full list of all query sequences aligned to a given wild-type (WT) sequence, so that users can confirm mutation patterns manually. BE-Analyzer runs entirely in the client-side web browser, so there is no need to upload very large NGS datasets (< 1 GB) to a server, which removes a time-consuming step in genome editing analysis. The BE-Analyzer interface was also developed using Django as the backend framework. The core algorithm of BE-Analyzer was written in C++ and then trans-compiled to WebAssembly with Emscripten (http://kripken.github.io/emscripten-site/).
Input panels in BE-Analyzer
To analyze query sequences in NGS data, BE-Analyzer requires basic information: a full WT sequence for reference, the type of base editor, the desired base editing window, and the target DNA sequence (Fig. 2b). Previous studies have reported the optimal target window for each base editor. For example, BE3 usually induces base conversion in a region ranging from 13 to 17 nucleotides (nt) upstream of the PAM, and TARGET-AID is most efficient within a region 15 to 19 nt upstream of the PAM. By default, BE-Analyzer sets the window to the optimal values reported in previous studies, but users can change these values manually. It has also been reported that base editors can introduce substitutions outside of the DNA target sequences at a low frequency. Therefore, BE-Analyzer accepts a parameter that adds flanking windows on each side of the target to the analysis.
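Because windows are quoted here as distances upstream of the PAM, it can help to convert them into protospacer positions and to see how a flanking parameter widens the analyzed region. The Python sketch below assumes a 20-nt protospacer in which the base adjacent to the PAM is 1 nt upstream and position 1 is the PAM-distal end. The function names upstream_to_positions and region_with_flanks are hypothetical and do not come from BE-Analyzer.

```python
def upstream_to_positions(upstream_from, upstream_to, spacer_len=20):
    """Convert 'n nt upstream of the PAM' into 1-based protospacer positions.

    Assumes the base adjacent to the PAM is 1 nt upstream, so a distance u
    maps to position spacer_len + 1 - u counted from the PAM-distal end.
    """
    start = spacer_len + 1 - upstream_to
    end = spacer_len + 1 - upstream_from
    return start, end


def region_with_flanks(reference, target, flank):
    """Return (lo, hi) half-open reference coordinates of the target
    extended by `flank` nt on each side and clipped to the reference."""
    idx = reference.find(target)
    if idx < 0:
        raise ValueError("target sequence not found in reference")
    lo = max(idx - flank, 0)
    hi = min(idx + len(target) + flank, len(reference))
    return lo, hi


print(upstream_to_positions(13, 17))  # BE3 window quoted in the text
print(upstream_to_positions(15, 19))  # TARGET-AID window quoted in the text

reference = "AAAAA" + "GAGCCTAGCTAGGATCGATC" + "TGGTT"
print(region_with_flanks(reference, "GAGCCTAGCTAGGATCGATC", flank=3))
```

Under these assumptions the two windows correspond to protospacer positions 4 to 8 and 2 to 6, respectively.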
Analysis of NGS data
From uploaded NGS data, BE-Analyzer first defines 15-nt indicator sequences on both sides of the given reference sequence; only queries that contain both indicator sequences, each with ≤1 nt mismatch, are collected. BE-Analyzer then counts the frequency of each sequence and sorts the queries in descending order of frequency. In this procedure, sequences with frequencies below the minimum are discarded. Each remaining sequence is aligned to the reference sequence with EMBOSS needle (https://www.ebi.ac.uk/Tools/psa/emboss_needle/) (Additional file 1: Figure S1). The aligned sequences are then classified into four groups based on the presence of hyphens (−), which mark alignment gaps. If hyphens are found in the reference sequence or the query, the query is classified as an insertion or a deletion by comparing the number of hyphens in the two sequences. If hyphens (inserted or deleted sequences) are not found in the given target window including the additional flanking regions, the query is referred to as a WT sequence. Otherwise, queries that contain a few mismatched nucleotides in the given target window are classified as substitutions (Additional file 1: Figure S2).
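The filtering and classification steps can be sketched in Python as follows. This is one possible reading of the procedure, not the BE-Analyzer code (which is written in C++): it assumes the indicators are the first and last 15 nt of the reference, treats the minimum frequency as inclusive, takes already-aligned sequence pairs as input instead of calling EMBOSS needle, and labels a tie in gap counts as a deletion. All function names are hypothetical.

```python
from collections import Counter


def mismatches(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_indicator(read, indicator, max_mismatch=1, start=0):
    """First index >= start where `indicator` matches within the mismatch
    limit, or -1 if there is no such index."""
    k = len(indicator)
    for i in range(start, len(read) - k + 1):
        if mismatches(read[i:i + k], indicator) <= max_mismatch:
            return i
    return -1


def collect_queries(reads, reference, indicator_len=15, min_freq=2):
    """Keep reads containing both indicators, trim them to the span between
    the indicators, count duplicates, and drop rare sequences."""
    left, right = reference[:indicator_len], reference[-indicator_len:]
    counts = Counter()
    for read in reads:
        i = find_indicator(read, left)
        if i < 0:
            continue
        j = find_indicator(read, right, start=i + indicator_len)
        if j < 0:
            continue
        counts[read[i:j + indicator_len]] += 1
    # most_common() returns sequences sorted by descending frequency.
    return [(seq, n) for seq, n in counts.most_common() if n >= min_freq]


def classify(aligned_ref, aligned_query, window):
    """Classify one aligned pair; `window` is a half-open column range that
    covers the target window plus any flanking regions."""
    ref_gaps = aligned_ref.count("-")
    query_gaps = aligned_query.count("-")
    if ref_gaps or query_gaps:
        # Gaps in the reference mean the query carries extra bases.
        return "insertion" if ref_gaps > query_gaps else "deletion"
    lo, hi = window
    if aligned_ref[lo:hi] == aligned_query[lo:hi]:
        return "wild type"
    return "substitution"


print(classify("ACGT-ACGT", "ACGTTACGT", (0, 9)))
print(classify("ACGTACGT", "ACTTACGT", (2, 6)))
```

Trimming each read to the region between the indicators means that reads differing only in adapter or primer overhang are counted as the same sequence.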
The results are summarized as a table with 9 columns (Fig. 3a): (i) ‘Total Sequence’ indicates the number of all reads present in the Fastq file, (ii) ‘With both indicator sequences’ indicates the number of reads having both indicator sequences, (iii) ‘More than minimum frequency’ indicates the number of reads that remain after the reads that appear with less than the minimum frequency are removed, (iv, v, vi) ‘Wild type’, ‘Insertions’, and ‘Deletions’ indicate the number of reads in each category, (vii) the 7th column indicates the number of reads having at least one base substitution, (viii) the 8th column indicates the number of reads that have nucleotide conversions induced by CRISPR base editors in target windows, and (ix) the 9th column indicates the intended substitution rate (such as ‘C to T Substitution Rate’), obtained by dividing the number of reads that have intended conversions in the base editing window by the number of reads above the minimum frequency (3rd column).
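As a small worked example of the 9th column, the Python function below divides the number of reads carrying the intended conversion by the number of reads above the minimum frequency. The function name and the read counts are hypothetical, and expressing the rate as a percentage is an assumption.

```python
def intended_substitution_rate(reads_with_intended_conversion,
                               reads_above_min_frequency):
    """Intended substitution rate in percent (column 9), using the reads
    above the minimum frequency (column 3) as the denominator."""
    if reads_above_min_frequency == 0:
        return 0.0
    return 100.0 * reads_with_intended_conversion / reads_above_min_frequency


# Hypothetical counts: 1,250 reads with a C to T conversion in the window
# out of 5,000 reads that passed the minimum-frequency filter.
print(intended_substitution_rate(1250, 5000))
```

Note that the denominator excludes reads lacking either indicator sequence as well as low-frequency reads, so the rate is relative to the filtered read set, not to the whole Fastq file.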
For base editing, it is crucial to know how the mutation of one or a few nucleotides changes the amino acid sequence. To address this issue, BE-Analyzer provides the expected amino acid sequences for three different reading frames, so that users can select among three possible start positions (Fig. 3b). For each nucleotide, BE-Analyzer displays the nucleotide mutation rate in detail, highlighted with a color gradient.
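To illustrate why the reading frame matters, the following Python sketch translates a sequence in the three forward frames using the standard genetic code. It is a stand-alone illustration, not the BE-Analyzer code; the function names and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Stop codons are shown as '*'; codons containing other symbols as 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(dna):
    """Translations for the three possible forward start positions."""
    return [translate(dna, frame) for frame in range(3)]


wild_type = "ATGCAGTGA"
edited = "ATGTAGTGA"  # C to T at the fourth base turns CAG (Gln) into TAG (stop)
print(three_frames(wild_type))
print(three_frames(edited))
```

In frame 0 the single C to T conversion introduces a premature stop codon, whereas in the other two frames the same edit changes or preserves a different amino acid, which is why the choice of start position affects the interpretation.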
Although cytidine deaminases mainly introduce C to T transitions in the base editing window, C to A or C to G transversions may also occur in flanking regions with low probability. Thus, BE-Analyzer shows the substitution rate at each site in the flanking windows and the C to D substitution pattern (D being the IUPAC code for A, G, or T) in the target windows (Fig. 3c). In the C to D substitution graph, each substitution pattern is presented with its percentage, and the type of substitution is indicated by color (red-black-green). Optionally, if users have uploaded data from a CRISPR-untreated control, BE-Analyzer displays the substitution rate at each of those sites in the negative direction. Furthermore, for users’ convenience, BE-Analyzer shows substitution patterns within the flanking windows with a heat map, which enables visualization of the dominant substitution patterns as well as background patterns.
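A per-position substitution table of the kind plotted in these graphs can be sketched in Python as below. The sketch assumes gap-free reads that have the same length as the reference and are supplied as (sequence, count) pairs; the function name substitution_rates and the example data are hypothetical.

```python
from collections import Counter


def substitution_rates(reference, reads):
    """For each reference position, return a dict mapping each substituted
    base to the percentage of reads that carry it at that position.

    `reads` is a list of (sequence, count) pairs; sequences are assumed to
    be gap-free and as long as the reference.
    """
    total = sum(count for _, count in reads)
    per_position = [Counter() for _ in reference]
    for seq, count in reads:
        for i, (ref_base, query_base) in enumerate(zip(reference, seq)):
            if ref_base != query_base:
                per_position[i][query_base] += count
    if total == 0:
        return [{} for _ in reference]
    return [
        {base: 100.0 * n / total for base, n in counter.items()}
        for counter in per_position
    ]


reference = "ACCT"
reads = [("ACCT", 6), ("ATCT", 3), ("AGTT", 1)]
for position, rates in enumerate(substitution_rates(reference, reads), start=1):
    print(position, reference[position - 1], rates)
```

Running the same function on reads from an untreated control would give the background rates that can then be drawn in the negative direction for comparison.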
At the bottom of the results page, a list of categorized sequence reads aligned to the reference sequence is presented (Fig. 3d). Users can confirm all filtered sequences from the input data in this table and can also save the results by clicking the ‘Download Data’ button.
BE-Designer is an easy-to-use web tool for optimal selection of sgRNAs in a given target sequence. It identifies all possible target sequences in a given sequence and displays information about each target sequence, including predicted mutation patterns, mutation positions, and potential off-target sites. Users can easily select the optimal sgRNA sequence for current base editors. Benchling, Inc., a company developing biotech platforms, also provides a design tool for CRISPR-mediated base editors (https://benchling.com/). A comparison of BE-Designer with Benchling’s designer is summarized in Table 1.
BE-Analyzer is another web tool for instant assessment of deep sequencing data obtained after treatment with base editors. BE-Analyzer analyzes deep sequencing data in the client-side web browser and displays the results using interactive tables and graphs for users’ convenience. Useful information, including the ratio of intended conversions, substitution patterns, and sequence alignments, is provided so that users can easily infer how frequently and where intended or unwanted substitutions are generated.
ABEs: Adenine base editors
BEs: Cytosine base editors
CRISPR-Cas: Clustered regularly interspaced short palindromic repeats and CRISPR associated
DSBs: DNA double-stranded breaks
NGS: Next generation sequencing
NHEJ: Non-homologous end joining
TadA: tRNA adenine deaminase
Kim H, Kim J-S. A guide to genome engineering with programmable nucleases. Nat Rev Genet. 2014;15:321–34.
Baek K, Kim DH, Jeong J, Sim SJ, Melis A, Kim J-S, et al. DNA-free two-gene knockout in Chlamydomonas reinhardtii via CRISPR-Cas9 ribonucleoproteins. Sci Rep. 2016;6:30620.
Koonin EV, Makarova KS, Zhang F. Diversity, classification and evolution of CRISPR-Cas systems. Curr Opin Microbiol. 2017;37:67–78.
Doudna JA, Charpentier E. The new frontier of genome engineering with CRISPR-Cas9. Science. 2014;346:1258096.
Sander JD, Joung JK. CRISPR-Cas systems for editing, regulating and targeting genomes. Nat Biotechnol. 2014;32:347–55.
Nishimasu H, Ran FA, Hsu PD, Konermann S, Shehata SI, Dohmae N, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell. 2014;156:935–49.
Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–23.
Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–21.
Mao Z, Bozzella M, Seluanov A, Gorbunova V. DNA repair by nonhomologous end joining and homologous recombination during cell cycle in human cells. Cell Cycle. 2008;7:2902–6.
Cho SW, Kim S, Kim Y, Kweon J, Kim HS, Bae S, et al. Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Res. 2014;24:132–41.
Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol. 2013;31:822–6.
Bikard D, Jiang W, Samai P, Hochschild A, Zhang F, Marraffini LA. Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 2013;41:7429–37.
Ran FA, Hsu PD, Lin C-Y, Gootenberg JS, Konermann S, Trevino AE, et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell. 2013;154:1380–9.
Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016;533:420–4.
Nishida K, Arazoe T, Yachie N, Banno S, Kakimoto M, Tabata M, et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science. 2016;353:aaf8729.
Gaudelli NM, Komor AC, Rees HA, Packer MS, Badran AH, Bryson DI, et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature. 2017;551:464–71.
Kim K, Ryu S-M, Kim S-T, Baek G, Kim D, Lim K, et al. Highly efficient RNA-guided base editing in mouse embryos. Nat Biotechnol. 2017;35:435–7.
Zong Y, Wang Y, Li C, Zhang R, Chen K, Ran Y, et al. Precise base editing in rice, wheat and maize with a Cas9-cytidine deaminase fusion. Nat Biotechnol. 2017;35:438–40.
Liang P, Ding C, Sun H, Xie X, Xu Y, Zhang X, et al. Correction of β-thalassemia mutant by base editor in human embryos. Protein Cell. 2017;8:811–22.
Liang P, Sun H, Sun Y, Zhang X, Xie X, Zhang J, et al. Effective gene editing by high-fidelity base editor 2 in mouse zygotes. Protein Cell. 2017;8:601–11.
Kuscu C, Parlak M, Tufan T, Yang J, Szlachta K, Wei X, et al. CRISPR-STOP: gene silencing through base-editing-induced nonsense mutations. Nat Methods. 2017;14:710–2.
Billon P, Bryant EE, Joseph SA, Nambiar TS, Hayward SB, Rothstein R, et al. CRISPR-Mediated Base Editing Enables Efficient Disruption of Eukaryotic Genes through Induction of STOP Codons. Mol Cell. 2017;67:1068–1079.e4.
Kleinstiver BP, Prew MS, Tsai SQ, Topkar VV, Nguyen NT, Zheng Z, et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015;523:481–5.
Nishimasu H, Shi X, Ishiguro S, Gao L, Hirano S, Okazaki S, et al. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science. 2018;361:1259–62.
Hu JH, Miller SM, Geurts MH, Tang W, Chen L, Sun N, et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature. 2018;556:57–63.
Kleinstiver BP, Prew MS, Tsai SQ, Nguyen NT, Topkar VV, Zheng Z, et al. Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat Biotechnol. 2015;33:1293–8.
Kim E, Koo T, Park SW, Kim D, Kim K, Cho H-Y, et al. In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni. Nat Commun. 2017;8:14500.
Müller M, Lee CM, Gasiunas G, Davis TH, Cradick TJ, Siksnys V, et al. Streptococcus thermophilus CRISPR-Cas9 systems enable specific editing of the human genome. Mol Ther. 2016;24:636–44.
Bae S, Park J, Kim J-S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics. 2014;30:1473–5.
Park J, Bae S, Kim J-S. Cas-designer: a web-based tool for choice of CRISPR-Cas9 target sites. Bioinformatics. 2015;31:4014–6.
Park J, Lim K, Kim J-S, Bae S. Cas-analyzer: an online tool for assessing genome editing results using NGS data. Bioinformatics. 2017;33:286–8.
We thank Dr. M. Schlesner at DKFZ for helpful discussion.
This work was supported by National Research Foundation of Korea (NRF) Grants (no. 2017M3A9G8084539 and 2018M3A9H3022412), Next Generation BioGreen 21 Program grant no. PJ01319301, Technology Innovation Program funded by the Ministry of Trade, Industry and Energy (no. 20000158), and Korea Healthcare technology R&D Project grant no. HI16C1012 to S.B.
Availability of data and materials
Example NGS data are freely accessible from the web site (http://www.rgenome.net/be-analyzer/example).
The authors declare that they have no competing interests.
Figure S1. The internal programs used in this study for implementation of BE-Designer and BE-Analyzer. Figure S2. The workflow for classifying query sequences in BE-Analyzer. (DOCX 338 kb)
Hwang, GH., Park, J., Lim, K. et al. Web-based design and analysis tools for CRISPR base editing. BMC Bioinformatics 19, 542 (2018). https://doi.org/10.1186/s12859-018-2585-4