CRISPR-Cas systems have the unique ability to heritably alter the genome of their bacterial or archaeal host by incorporating small fragments of foreign DNA, called spacers, between the repeats of a CRISPR locus. This process is known as spacer acquisition. Spacers are transcribed and processed into individual CRISPR RNAs (crRNAs), which guide Cas effector nucleases to destroy complementary invading nucleic acids. Spacer sequences thus define the specificity of the CRISPR-Cas immune response and confer immunity on both the host and its progeny. Spacer acquisition can be conceptually divided into two stages: capture of protospacers, the sequences in the invading genome that give rise to spacers, and spacer integration. During the first stage, protospacers are selected and extracted from foreign genomes. In the second stage, spacers are processed and incorporated into the CRISPR locus.
Found in approximately 45% of bacteria and 85% of archaea, CRISPR systems have been categorized by cas gene content into two classes, six types and more than 20 subtypes. Each of the six types uses functionally distinct effector complexes that mediate the destruction of foreign nucleic acids. With a few exceptions, the spacer acquisition mechanism has been studied in detail mainly in the E. coli type I CRISPR-Cas system. The core machinery that mediates spacer acquisition, encoded primarily by cas1 and cas2, is well conserved across the different types, so this system provides a simple and elegant experimental model for studying the phenomenon.
Two modes of adaptation have been reported for type I systems: naive and primed. During naive adaptation, the organism obtains a spacer from a foreign DNA source without the help of a pre-existing spacer. In contrast, primed adaptation relies on a pre-existing spacer that enables biased and enhanced uptake of new spacers. Both modes depend on two key proteins, Cas1 and Cas2. Naive adaptation requires only Cas1 and Cas2, whereas primed adaptation additionally requires the type I interference complex Cascade (CRISPR-associated complex for antiviral defense, which is guided by crRNA) and the Cas3 nuclease. Other CRISPR-Cas types encode additional proteins that appear to be involved in spacer acquisition.
During type I-E targeting, Cascade binds to a foreign target in a manner dependent on the protospacer-adjacent motif (PAM) and subsequently recruits Cas3 for target destruction. In addition to eliminating the foreign nucleic acid, the nuclease and helicase activities of Cas3 drive the generation of spacer substrates. This interference-driven mode seems to yield markedly higher rates of spacer acquisition than primed acquisition from partially matching targets. In the absence of a proper PAM, Cascade can still bind the target and recruit Cas3 in a manner dependent on Cas1-Cas2. In this case, the nuclease domain of Cas3 is inactive, and its helicase activity is believed to translocate the Cas1-Cas2 complex along the nearby DNA and drive primed spacer acquisition by the integrase complex.
Fig 1. Protospacer selection and capture. A. Cascade binds to a foreign target in a PAM-dependent manner. B. Imperfect target recognition by Cascade.
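The two recognition outcomes described above can be summarized in a small toy model. This is only a sketch under stated assumptions, not a biochemical model: the PAM value `AAG`, the mismatch cutoff, and the names `Target`, `count_mismatches` and `acquisition_route` are hypothetical and chosen for illustration, and strand complementarity is ignored by comparing spacer and protospacer as identical-sense strings.

```python
from dataclasses import dataclass

# Assumption: illustrative motif only; the real PAM depends on the subtype.
ASSUMED_PAM = "AAG"
# Assumption: hypothetical cutoff above which Cascade is treated as not binding.
MAX_MISMATCHES = 3


@dataclass(frozen=True)
class Target:
    pam: str          # motif flanking the protospacer in the foreign DNA
    protospacer: str  # foreign sequence compared with the spacer


def count_mismatches(spacer: str, protospacer: str) -> int:
    """Count positions at which spacer and protospacer differ."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have equal length")
    return sum(a != b for a, b in zip(spacer, protospacer))


def acquisition_route(
    spacer: str,
    target: Target,
    pam: str = ASSUMED_PAM,
    max_mismatches: int = MAX_MISMATCHES,
) -> str:
    """Classify a target into the routes described in the text (toy rules)."""
    mismatches = count_mismatches(spacer, target.protospacer)
    if mismatches > max_mismatches:
        # Too divergent for this toy model to count as recognition.
        return "no recognition"
    if target.pam == pam and mismatches == 0:
        # Proper PAM and full match: Cas3 degrades the target and its
        # nuclease and helicase activities generate spacer substrates.
        return "interference-driven"
    # Missing PAM or partial match: Cas3 nuclease is inactive and its
    # helicase activity is thought to move Cas1-Cas2 along the DNA.
    return "primed"


if __name__ == "__main__":
    spacer = "ACGTACGTAC"
    print(acquisition_route(spacer, Target("AAG", spacer)))
    print(acquisition_route(spacer, Target("CCG", spacer)))
```

The cutoff is a parameter rather than a constant in the function body because the text gives no threshold for how much mismatch Cascade tolerates.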
During CRISPR spacer acquisition, Cas1 interacts with Cas2 to form a complex that acts as the spacer integrase. This heterohexameric complex contains two separate DNA-binding regions, one that binds the incoming protospacer and one that binds the CRISPR array. Once loaded with the incoming spacer, the Cas1-Cas2 complex catalyses two cleavage-ligation reactions, first at the leader end of the first repeat of the CRISPR array and subsequently at the spacer end of the same repeat. In each reaction, the terminal 3′-OH of one strand of the protospacer DNA carries out a nucleophilic attack on one end of the repeat DNA. The product is an intermediate in which the 3′ ends of a double-stranded DNA (dsDNA) protospacer are ligated to single-stranded DNA (ssDNA) repeat sequences. These ssDNA gaps are presumably filled by a DNA polymerase and then ligated, resulting in simultaneous spacer insertion and repeat duplication.
Research has shown that integration of new spacers by the Cas1-Cas2 complex is polarized; that is, new spacers are predominantly added at the leader end of the CRISPR array. For this reason, CRISPR loci have been likened to molecular fossil records of past infections, with the newest memories located at the leader end and the most ancestral spacer sequences positioned at the trailer end. By ordering spacers chronologically, CRISPR systems optimize their immune response against the most recent invaders, as leader-end spacers provide more robust immunity than spacers at more downstream positions. How the polarized addition of new spacers is achieved differs among CRISPR-Cas types.
Fig 2. Integration of new spacers into the CRISPR locus. A. General schematic of the spacer integration reaction. B. Two mechanisms for preferential spacer acquisition at the leader end of the CRISPR array.
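The net outcome of polarized integration (a new spacer at the leader end plus one duplicated repeat) can be illustrated with a minimal sketch. It models the array as a Python list and abstracts away the cleavage-ligation chemistry and gap filling; the function names, the repeat string and the spacer labels are hypothetical placeholders, not real sequences.

```python
from typing import List


def integrate_spacer(array: List[str], repeat: str, new_spacer: str) -> List[str]:
    """Return a new array with new_spacer inserted at the leader end.

    The array is modelled as alternating elements, leader side first:
    [repeat, spacer, repeat, spacer, ..., repeat].
    """
    if not array or array[0] != repeat:
        raise ValueError("array must start with the leader-proximal repeat")
    # The first repeat is opened at both ends by the two cleavage-ligation
    # reactions; after gap filling, a full repeat copy flanks each side of
    # the new spacer, so the array gains one spacer and one repeat.
    return [repeat, new_spacer] + array


def spacers_newest_first(array: List[str]) -> List[str]:
    """List spacers from the leader end (newest) to the trailer end (oldest)."""
    return array[1::2]


if __name__ == "__main__":
    REPEAT = "REPEAT"  # hypothetical placeholder for the repeat sequence
    array = [REPEAT]   # an empty array: a single repeat next to the leader
    for spacer in ("spacer_1", "spacer_2", "spacer_3"):
        array = integrate_spacer(array, REPEAT, spacer)
    print(spacers_newest_first(array))
```

Reading the spacers from the leader end then gives the reverse order of acquisition, which is the sense in which the array works as a chronological record.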
In type I CRISPR-Cas systems, an α-helix of Cas1 makes sequence-specific contacts with the minor groove at the 3′ end of the leader, but this is not sufficient to enforce spacer addition at the leader end. Rather, factors encoded by the host genome are required for site-specific integration. Type I leaders contain a conserved integration host factor (IHF)-binding site. Binding of IHF, which is required for polarized spacer integration in vitro and for spacer acquisition in vivo, induces a distortion of the CRISPR array DNA. This creates the ideal target substrate for the Cas1-Cas2 integrase specifically at the first repeat. Additionally, because IHF bends the DNA, the Cas1-Cas2 integrase makes contacts with IHF and with upstream sequences in the leader. Consistent with a general requirement for host factors, an archaeal type I-A system, whose host lacks IHF, exhibits leader specificity for spacer integration in a manner dependent on one or more as-yet-unidentified host factors.
Type II CRISPR-Cas systems also exhibit strictly polarized spacer integration. As in type I systems, an α-helix of the type II Cas1 makes sequence-specific contacts with the minor groove of the leader DNA, at a site termed the leader-anchoring sequence (LAS). These contacts are sufficient to position spacer integration at the leader end of the array without the need for any additional host factors. This is possible because the additional stabilizing contacts between the LAS and Cas1 improve the kinetics of the cleavage-ligation reaction at the leader-repeat junction. Because the second cleavage-ligation reaction occurs at the spacer-repeat junction, where the sequence varies from array to array, the LAS-interacting domain of Cas1 needs some flexibility to catalyse the reaction. Most likely as a result of this flexibility, type II CRISPR systems that lack a proper LAS can integrate new spacers in the middle of the array.