Abstract
Prime editing (PE) is a highly versatile CRISPR–Cas9 genome editing technique. The current constructs, however, have variable efficiency and may require laborious experimental optimization. This study presents statistical models that learn the salient epigenomic and sequence features of target sites that modulate editing efficiency, and it provides guidelines for designing optimal PEs. We found that both regional constitutive heterochromatin and local nucleosome occlusion of target sites impede editing, while position-specific G/C nucleotides in the primer-binding site (PBS) and reverse transcription (RT) template regions of PE guide RNA (pegRNA) yield high editing efficiency, especially for short PBS designs. The presence of G/C nucleotides was most critical immediately 5′ to the protospacer adjacent motif (PAM) site for all designs. The effects of different last templated nucleotides were quantified and observed to depend on the length of both PBS and RT templates. Our models found AGG to be the preferred PAM and detected a guanine nucleotide four bases downstream of the PAM to facilitate editing, suggesting a hitherto-unrecognized interaction with Cas9. A neural network interpretation method based on nonextensive statistical mechanics further revealed multi-nucleotide preferences, indicating dependency among several bases across pegRNA. Our work clarifies previous conflicting observations and uncovers context-dependent features important for optimizing PE designs.
Original language | English (US) |
---|---|
Article number | 1222112 |
Journal | Frontiers in Genetics |
Volume | 14 |
State | Published - 2023 |
Keywords
- CRISPR–Cas9
- DNA-RNA hybridization
- heterochromatin
- machine learning
- neural network interpretation
- nucleosome positioning
- nucleotide preference
- prime editing
ASJC Scopus subject areas
- Molecular Medicine
- Genetics
- Genetics (clinical)