Cell: Researchers Develop Novel Base-editing Tool Through Innovative Protein Clustering Approach

Jul 03, 2023

Caixia Gao's group at the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, has pioneered the use of AI-assisted structure prediction to establish a protein clustering method based on tertiary structure, applied it to mining new deaminases, and developed a series of novel base editing tools. This work provides a new strategy for protein function analysis and for mining new functional elements. The newly developed base editing systems give China independent intellectual property rights in precision gene editing technology (a PCT invention patent application has been filed). The results were published in Cell. 
Proteins are the main bearers of life activities. Functional clustering of proteins is an important way to explore the physiological processes they take part in and to design novel proteins. Existing methods mainly cluster proteins by the similarity of their primary amino acid sequences and use the resulting clusters to infer function and evolutionary relationships. However, protein function is determined by three-dimensional structure, so a high-throughput clustering method based on three-dimensional structure would provide a more direct and reliable means of studying protein function and would help uncover the functions of uncharacterized proteins. 
Base editing systems can edit DNA or RNA with single-nucleotide precision, making them a transformative technology for gene function research, disease treatment, and crop and livestock breeding. However, the deaminases at the core of existing base editing systems originate from a single family. As a result, base editing still has many limitations and struggles to meet the needs of diverse applications. It is therefore especially important to discover novel deaminases and to develop new base editing tools for different application scenarios. 
To address these problems, Caixia Gao's research group used AI-assisted large-scale protein structure prediction to establish a new high-throughput protein clustering method based on tertiary structure. With it, the group mined deaminases in depth by structure and function, identified entirely new chassis elements that differ from known deaminase tool enzymes, and developed a series of novel base editing tools with independent intellectual property rights. 
The researchers used the protein structure prediction model AlphaFold2 to predict, in bulk, the 3D structures of representative deaminase sequences. They then performed multiple structural alignment and clustering on the predicted 3D structures, classifying the candidate deaminases into 20 different branches. In addition to the previously reported APOBEC/AID cytosine deaminases, they found five branches of active cytosine deaminases that are novel in both structure and sequence. Within one of these branches, further structural clustering and functional validation of proteins with a DddA-like (double-stranded DNA deaminase toxin A-like) deamination domain showed that, besides proteins with the double-stranded DNA deamination activity previously assumed for the whole branch, it contains a large number of proteins that deaminate only single-stranded DNA. This overturned the previous understanding of the function of this class of proteins. The study shows that AI-assisted protein structure clustering can give more accurate results than traditional clustering based on primary amino acid sequences when the proteins in a collection have low sequence homology and diverse functions. The method thus provides an efficient and reliable new strategy for analyzing and mining protein function. 
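The clustering step described above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes that pairwise structural similarity scores between predicted structures (for example, TM-score-like values between 0 and 1) have already been computed by a structural alignment tool, and it groups proteins by simple average-linkage agglomerative clustering. The protein names, the scores, and the function `cluster_by_structure` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def cluster_by_structure(names, similarity, threshold):
    """Average-linkage agglomerative clustering on structural similarity.

    names:      list of protein identifiers (hypothetical).
    similarity: dict mapping frozenset({a, b}) to a score in [0, 1],
                assumed to come from pairwise structural alignment.
    threshold:  merging stops once the best average similarity between
                any two clusters falls below this value.
    """
    clusters = [[n] for n in names]

    def avg_sim(c1, c2):
        # Mean similarity over all cross-cluster pairs.
        scores = [similarity[frozenset((a, b))] for a in c1 for b in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        (i, j), best = max(
            (((i, j), avg_sim(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(c) for c in clusters)


# Toy data: two structurally tight groups and one outlier (made-up scores).
names = ["A1", "A2", "B1", "B2", "C1"]
high = {("A1", "A2"): 0.90, ("B1", "B2"): 0.85}
similarity = {
    frozenset(pair): high.get(pair, 0.25)
    for pair in combinations(names, 2)
}

clusters = cluster_by_structure(names, similarity, threshold=0.5)
print(clusters)
```

In this toy setting the threshold plays the role of the cut that separates branches: a stricter threshold yields more, smaller groups, while a looser one merges them. Proteins with low sequence identity can still land in the same group if their structural scores are high, which is the property the article attributes to structure-based clustering.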
Based on this further clustering, the researchers newly identified 45 single-stranded DNA cytosine deaminases (Sdd) and 13 double-stranded DNA cytosine deaminases (Ddd). All of these deaminases are prokaryotic (bacterial) in origin, whereas the existing APOBEC/AID deaminase family members are all eukaryotic in origin (mainly from humans, other mammals, or fish). The researchers developed a series of novel base editing systems based on these deaminases and tested them in animal and plant cells. The double-stranded base editing systems based on the Ddd1 and Ddd9 deaminases overcame a shortcoming of conventional editors, whose editing efficiency is markedly lower in GC sequence contexts. The single-stranded base editing systems based on Sdd7 and Sdd3 showed very high editing activity and also edited GC sequences with considerable efficiency. The single-stranded base editing system based on Sdd6 showed extremely high specificity, with almost undetectable off-target events. Through rational protein design and functional validation, the study further developed a novel Sdd6-CBE base editor that can be packaged in a single adeno-associated virus (AAV) and reached an editing efficiency of up to 43.1% in a mouse cell line, solving the problem that conventional base editors are too large to be delivered by a single AAV particle. In addition, to address the long-standing problem of low base editing efficiency in soybean, the team developed the Sdd7-CBE system and obtained 34 stably edited plants from 154 positive soybean seedlings, an editing efficiency of up to 22.1%. This research breaks through the bottleneck imposed by existing deaminases and shows the promise of the novel base editing systems in medicine and agriculture. 
The research was supported by the National Natural Science Foundation of China, the National Key Research and Development Program of China, and the Strategic Priority Research Program of the Chinese Academy of Sciences.