Astra Bertelli

AI enthusiasm #8 - AI can modify human DNA🔬

Breaking news from life sciences

There's a biotech startup that claims to have edited human DNA with the aid of AI. How come we are able to edit human DNA at all? And how exactly can AI help in doing that? Let's take one step at a time.

The CRISPR era


Ever since Watson and Crick described the structure of DNA in 1953, the El Dorado of molecular biologists has been to "crack the code": to introduce modifications into a genetic sequence in order to change its features. The first attempts came in the 60s and 70s, when wheat plants were modified to be resistant to cold temperatures and parasites, even though these first GMOs were obtained through very imprecise and coarse modifications, based on DNA uptake and/or horizontal DNA transfer mechanisms already observed in bacteria (namely plasmids). Other routes that scientists explored were radiation, nanoparticle bombardment and viral vector transduction, but all these methods were either imprecise or very expensive, so none of them could really be scaled to wide use, for both safety and financial reasons (one of the first medical treatments based on viral vectors for gene modification, Zolgensma, used against spinal muscular atrophy, cost around 2 million dollars).

A revolutionary technique on the rise

While studying bacteria, scientists discovered in 1987 some intriguing repeated sequences in their DNA, between which unique, non-repetitive sequences were always found. An explanation for this pattern came only about twenty years later, when it was understood that prokaryotes use this system as a sort of "immune defense" against viruses (there is a wide class of viruses that "eat" bacteria, known as bacteriophages).

To put it plainly, there is a family of proteins, the Cas family, that can recognize and chop up the DNA of a viral invader based on previous encounters the cell had with the same agent. To do so, they use a sort of "tracker": a copy of the portion of viral DNA they have to cut. This portion is stored in the bacterial genome, flanked by the repeated sequences discovered in 1987, which act as the signal that makes it accessible. The system was then called CRISPR-Cas, with CRISPR standing for Clustered Regularly Interspaced Short Palindromic Repeats (luckily they found a crispy acronym for that!).
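To make the "repeats with unique pieces in between" idea concrete, here is a toy sketch in Python. It treats DNA as a plain string, and every sequence in it is made up for illustration (the repeat, the stored fragments and the "phage" are all hypothetical). Real cells do this with RNA copies and proteins, not substring search, so take it as an analogy rather than a model of the biology.

```python
def extract_spacers(crispr_array: str, repeat: str) -> list[str]:
    """Return the unique fragments stored between copies of the repeat."""
    pieces = crispr_array.split(repeat)
    # The first and last pieces lie outside the repeat-flanked region.
    return [piece for piece in pieces[1:-1] if piece]


def recognizes(invader_dna: str, spacers: list[str]) -> bool:
    """True if the invader contains any fragment remembered in the array."""
    return any(spacer in invader_dna for spacer in spacers)


# Hypothetical toy sequences, not taken from any real organism.
REPEAT = "GTTTTAGAGCTA"
ARRAY = REPEAT + "ACGTACGTAA" + REPEAT + "TTGACCGGTA" + REPEAT

spacers = extract_spacers(ARRAY, REPEAT)
known_phage = "CCCTTGACCGGTAGGG"   # contains the second stored fragment
unknown_dna = "AAAAAAAAAA"         # never seen before

print(spacers)
print(recognizes(known_phage, spacers))
print(recognizes(unknown_dna, spacers))
```

The stored fragments work like a list of "mugshots": an invader is only recognized if part of it matches something the cell has already filed away between two repeats.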

Here came the breakthrough: what if we repurposed the CRISPR-Cas system to target our own genes and correct the errors in them? In the end, it only takes a protein (Cas9 for most purposes) and a guide sequence, and the system can do its work (it is a little more complex than this, but bear with me for today's article).
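For the curious, part of that "little more complex" can be sketched in a few lines of Python. The sketch below assumes the commonly used Cas9 from Streptococcus pyogenes, which only cuts where the guide match is immediately followed by a short "NGG" motif (the PAM), and which cuts about 3 bases before that motif. The function names and the sequences are hypothetical, only the forward strand is searched, and mismatches are not tolerated, so this is a simplification and not a real guide design tool.

```python
import re


def find_cut_sites(genome: str, guide: str) -> list[int]:
    """Return cut positions for every guide match followed by an NGG PAM.

    A position p means the strand is split into genome[:p] and genome[p:].
    Only the forward strand is searched (a simplifying assumption).
    """
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(f"(?={re.escape(guide)}[ACGT]GG)")
    cuts = []
    for match in pattern.finditer(genome):
        pam_start = match.start() + len(guide)
        cuts.append(pam_start - 3)  # cut about 3 bases upstream of the PAM
    return cuts


def cut(genome: str, position: int) -> tuple[str, str]:
    """Split the sequence at the given cut position."""
    return genome[:position], genome[position:]


# Hypothetical 20-letter guide and toy "genome".
GUIDE = "GACGTTACCGGATCAAGCTT"
GENOME = "TTTT" + GUIDE + "AGG" + "CCCC"

sites = find_cut_sites(GENOME, GUIDE)
left, right = cut(GENOME, sites[0])
print(sites)
print(left, right)
```

Changing the guide string is all it takes to point the same protein at a different site, which is what makes the system so attractive for editing. The PAM requirement is also one reason why a given Cas protein cannot reach every position in a genome.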

CRISPR-Cas9 was indeed adapted for gene editing, and its impact on today's science was so massive that in 2020 Jennifer Doudna and Emmanuelle Charpentier, two CRISPR pioneers, were awarded the Nobel Prize in Chemistry "for the development of a method for genome editing".

Needless to say, this also comes with several ethical implications, which even turned into a sci-fi-like scenario in 2018, when He Jiankui, a Chinese biophysicist, used CRISPR-Cas9 to modify human embryos in an attempt to make them resistant to HIV.

AI comes into play

Regardless of the ethical problems that may come with its use, CRISPR-Cas9 has proved to be one of the most effective and trustworthy techniques for editing DNA, and many experiments and clinical trials now rely on it. Nevertheless, one of the biggest problems is that a given Cas protein is not always suited to the editing target: proteins are molecules with a highly complex 3D structure that does not fit everywhere.

In this sense, finding alternatives to Cas9 and to the whole editing scaffold, in order to generate combinations that fit a given situation well, is a key problem for scientists. Or, better, it was a key problem: Profluent, a biotech startup, has now achieved remarkable results in generating the sequences of Cas-like proteins with generative AI models.

In their paper "Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences" they explain how they did it:

  1. First, they mined 26 terabases of assembled microbial genomes, extracting more than 1 million CRISPR operons. They collected all these data in the "CRISPR-Cas Atlas"
  2. They trained generative AI models on these data to produce Cas9-like protein sequences; one of the generated editors is named OpenCRISPR-1
  3. They tested the editing efficacy of the generated proteins in human cell lines
  4. They collected data on other proteins involved in editing and designed a "fully synthetic base editor system" built around OpenCRISPR-1, which they tested in human cell lines as before.

The first results look very promising, and it seems we are less than one step away from unlocking a new frontier of gene editing, which could be used for therapeutics, fighting the climate crisis, granting access to food for everyone, optimizing agricultural production, eradicating disease-bearing insects... and many other applications.

The coolest part of this project is that it is released openly: anyone can use OpenCRISPR as long as they sign Profluent's license and terms of use, which are meant to ensure ethical application in research and commercial settings.


  • Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences. Jeffrey A. Ruffolo, Stephen Nayfach, Joseph Gallagher, Aadyot Bhatnagar, Joel Beazer, Riffat Hussain, Jordan Russ, Jennifer Yip, Emily Hill, Martin Pacesa, Alexander J. Meeske, Peter Cameron, Ali Madani. bioRxiv 2024.04.22.590591; doi:
  • OpenCRISPR GitHub
  • A wonderful book about CRISPR: Editing Humanity: The CRISPR Revolution and the New Era of Genome Editing, by Kevin Davies (this is a spontaneous suggestion, not an advertisement)

Leave your thoughts!

What do you think about the impact of this kind of AI on health, society and environment? Is it dangerous to apply AI to genome editing?

Leave a comment below!
