In Type II CRISPR systems, both the tracrRNA (trans-activating crRNA) and the crRNA (CRISPR RNA) are required for target interference, and the two RNAs can be fused to generate a chimeric single-guide RNA (sgRNA). The sgRNA consists of a variable domain complementary to the DNA target and a structured scaffold domain derived from the bacterial crRNA and tracrRNA molecules. The sgRNA forms a functional complex with the CRISPR-associated nuclease Cas9 and guides it to genomic loci that match the 20-nt variable domain and lie immediately upstream of a required 5′-NGG protospacer adjacent motif (PAM); Cas9 then cleaves the DNA about 3 bp upstream of the PAM. A key advantage of the CRISPR-Cas9 technology is that the Cas9 protein remains invariant yet can be readily reprogrammed to cleave novel DNA target sites by expressing an sgRNA with a different variable domain.
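The targeting rule described above (a 20-nt protospacer followed directly by an NGG PAM) can be sketched as a simple sequence scan. This is only an illustrative sketch: the function names `find_target_sites` and `reverse_complement` are hypothetical, the input is assumed to be plain A/C/G/T DNA, and no scoring of efficacy or specificity is attempted.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(dna: str, guide_len: int = 20):
    """List candidate Cas9 sites of the form N(guide_len)-NGG on both strands.

    Returns tuples (strand, start, protospacer, pam). `start` is 0-based and,
    for the "-" strand, refers to the reverse-complemented sequence.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Each returned protospacer is a candidate variable domain for an sgRNA; in practice candidates would still need to be filtered for off-target matches elsewhere in the genome.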
The sgRNA consists of crRNA- and tracrRNA-derived sequences connected by an artificial tetraloop. The crRNA sequence can be divided into guide (20-nt) and repeat (12-nt) regions, while the tracrRNA sequence can be divided into an anti-repeat (14-nt) region and three tracrRNA stem loops. The crystal structure reveals that the sgRNA binds the target DNA to form a T-shaped architecture comprising a guide:target heteroduplex, a repeat:anti-repeat duplex, and stem loops 1-3. The guide (nucleotides 1-20) and target DNA (nucleotides 1′-20′) form the guide:target heteroduplex via 20 Watson-Crick base pairs. The repeat (nucleotides 21-32) and the anti-repeat (nucleotides 37-50) form the repeat:anti-repeat duplex via 9 base pairs. The repeat:anti-repeat duplex and stem loop 1 are connected by a single nucleotide (A51), while stem loops 1 and 2 are connected by a 5-nt single-stranded linker.
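The nucleotide coordinates given above can be written down as a small lookup table that slices an sgRNA string into its named regions. This is a minimal sketch: the names `SGRNA_REGIONS` and `split_sgrna` are hypothetical, the tetraloop range (33-36) is inferred from the gap between the repeat and anti-repeat, and everything after A51 is returned as one piece because the text does not give coordinates for the individual stem loops.

```python
# 1-based, inclusive nucleotide ranges taken from the text.
SGRNA_REGIONS = {
    "guide": (1, 20),
    "repeat": (21, 32),
    "tetraloop": (33, 36),  # inferred: positions between repeat and anti-repeat
    "anti_repeat": (37, 50),
    "linker_A51": (51, 51),
}


def split_sgrna(sgrna: str) -> dict:
    """Slice an sgRNA sequence into the regions listed in SGRNA_REGIONS.

    The remainder after nucleotide 51 (stem loops 1-3 and the linker between
    stem loops 1 and 2) is returned under "stem_loops" without subdivision.
    """
    if len(sgrna) < 51:
        raise ValueError("sequence is shorter than the 51 nt covered by the table")
    parts = {
        name: sgrna[start - 1:end]  # convert 1-based inclusive to a Python slice
        for name, (start, end) in SGRNA_REGIONS.items()
    }
    parts["stem_loops"] = sgrna[51:]
    return parts


# Placeholder letters only (not a real sgRNA), chosen to make the regions visible.
example = "g" * 20 + "r" * 12 + "t" * 4 + "a" * 14 + "A" + "s" * 45
for name, segment in split_sgrna(example).items():
    print(f"{name:12s} {len(segment):3d} nt")
```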
The secondary structure of the sgRNA is crucial for Cas9 recognition and binding, and it depends on base pairing between the crRNA- and tracrRNA-derived sequences as well as within each of them. The 20-nt target-recognition sequence at the 5′ end of the crRNA varies with the DNA region targeted. Because this sequence can also base-pair with the crRNA- or tracrRNA-derived scaffold, its GC content or sequence complexity might affect the stability of the sgRNA structure. Consistent with this, the guide sequences of non-functional sgRNAs had a higher GC content on average than those of functional sgRNAs. The mechanism by which the sgRNA secondary structure forms and is maintained requires more detailed study.
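Since GC content of the guide is mentioned as a feature that differs between functional and non-functional sgRNAs, a short helper for computing it may be useful. This is only a sketch with a hypothetical function name, `gc_content`; the text gives no cutoff value, so none is applied here.

```python
def gc_content(guide: str) -> float:
    """Return the fraction of G and C bases in a guide sequence (0.0 to 1.0)."""
    guide = guide.upper()
    if not guide:
        raise ValueError("guide sequence is empty")
    return sum(base in "GC" for base in guide) / len(guide)
```

Comparing this value across a panel of candidate guides gives a quick first-pass view of which ones are unusually GC-rich.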
Fig 1. Schematic representation and structure of the sgRNA:target DNA complex.
An sgRNA is designed with a guide sequence domain (designated gRNA) at its 5′ end that is complementary to the target sequence and confers DNA target specificity. The gRNA domain determines both the efficacy and the specificity of genome editing by Cas9. The guide sequence consists of 20 nucleotides that pair perfectly with the targeted genomic sequence, thereby recruiting the Cas9 protein to the target site. Interestingly, DNA targets and sgRNA guide sequences that differ from the canonical 20-nt length have been reported in some plant studies, whereas in mammalian systems targets of the consensus form (N)20-NGG are normally used.
The guide sequence comprises a seed sequence and a non-seed sequence. The 3′ end of the guide sequence, known as the "seed region", plays a critical role in recognition of the target sequence. The seed sequence influences the specificity of Cas9-sgRNA binding through multiple potential mechanisms. Structural analysis showed that accessibility of the last three bases in the seed region is a prominent feature distinguishing functional sgRNAs from non-functional ones. The sgRNA is functionally equivalent to the crRNA-tracrRNA complex but is much simpler to use as a research tool for mammalian genome editing. By modifying the guide sequence, it is possible to create sgRNAs with different target specificities.
Modifying the sgRNA has two potential applications: first, improving genome editing efficiency with an optimized sgRNA at sites where the conventional sgRNA shows low efficiency; second, improving genome editing specificity by truncating the target-recognition sequence at the 5′ end of the crRNA. Although the targeting specificity of Cas9 is believed to be tightly controlled by the 20-nt guide sequence of the sgRNA and the presence of a PAM adjacent to the target sequence in the genome, off-target cleavage can still occur at DNA sequences with as many as three to five mismatches in the PAM-distal part of the sgRNA guide sequence. Moreover, previous studies have demonstrated that both the sequence and the structure of the guide RNA can affect cleavage at on-target and off-target sites.
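To make the distinction between PAM-distal and PAM-proximal (seed) mismatches concrete, the sketch below compares a guide with a candidate off-target protospacer and reports where they differ. The function name `mismatch_profile` is hypothetical, and the default seed length of 12 nt is an assumption made for illustration; the text does not specify how long the seed region is.

```python
def mismatch_profile(guide: str, site: str, seed_len: int = 12) -> dict:
    """Locate mismatches between a guide and a same-length candidate site.

    Positions are 1-based from the 5' (PAM-distal) end of the guide. The last
    `seed_len` positions at the 3' end are counted as the seed region
    (seed_len=12 is an illustrative assumption, not a value from the text).
    """
    guide = guide.upper().replace("U", "T")  # accept RNA or DNA spelling
    site = site.upper().replace("U", "T")
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    positions = [i + 1 for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    seed_start = len(guide) - seed_len + 1
    return {
        "positions": positions,
        "pam_distal": sum(p < seed_start for p in positions),
        "seed": sum(p >= seed_start for p in positions),
    }
```

A candidate site whose mismatches all fall in the PAM-distal count is, by the reasoning above, the kind of sequence at which off-target cleavage may still be tolerated.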
Various strategies have been reported to reduce off-target effects. First, the sgRNA sequence can be altered. Truncating the 3′ end of the sgRNA (the tracrRNA-derived domain that interacts with Cas9), shortening the target-complementary region at the 5′ end of the sgRNA by as many as 3 nt (tru-gRNA), or adding two guanine nucleotides to the 5′ end, directly next to the target-complementary region of the guide RNA, improves target specificity, decreasing undesired mutagenesis at some off-target sites by 5,000-fold. However, RNA-guided endonucleases (RGENs) that use these altered sgRNAs also show decreased on-target activity.
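Two of the sequence alterations described above, 5′ truncation of the guide (tru-gRNA) and addition of two 5′ guanines, are simple string operations on the guide. The sketch below uses hypothetical helper names (`truncate_guide`, `add_5prime_gg`) and only models the sequence change, not its effect on activity.

```python
def truncate_guide(guide: str, n_removed: int = 3) -> str:
    """Remove up to 3 nt from the 5' end of the guide (tru-gRNA style)."""
    if not 0 <= n_removed <= 3:
        raise ValueError("the text describes truncation by at most 3 nt")
    return guide[n_removed:]


def add_5prime_gg(guide: str) -> str:
    """Prepend two guanines directly 5' of the target-complementary region."""
    return "GG" + guide


guide = "ACGTACGTACGTACGTACGT"  # arbitrary 20-nt example, not a real target
print(len(truncate_guide(guide)), truncate_guide(guide))
print(len(add_5prime_gg(guide)), add_5prime_gg(guide))
```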
Second, off-target effects can be minimized by controlling the concentration of the Cas9-sgRNA complex, that is, by titrating the amounts of Cas9 and sgRNA delivered. In addition, the wild-type Cas9 nuclease can be replaced with the D10A mutant nickase version of Cas9 and paired with two sgRNAs, each directing cleavage of only one strand; this also reduces off-target effects significantly. Other methods for reducing off-target effects do not involve the sgRNA, such as fusing catalytically inactive Cas9 to the FokI nuclease domain (fCas9), which edits target DNA sites with >140-fold higher specificity than wild-type Cas9 and at least fourfold higher specificity than paired nickases at loci with highly similar off-target sites. Further efforts are needed to enhance both efficiency and specificity.