It’s a breakthrough that sounds straight out of sci-fi. A team at Stanford University announced in a recent preprint that they’ve used AI to generate a virus genome, and from that produced functional virus particles. The viruses in question are bacteriophages, which only infect bacteria – they can’t infect humans, or any other animals for that matter. But applications of this technology have the potential to impact human society in a big way.
ChatGPT for DNA?
We had to find out more, so we caught up with senior author Dr Brian L. Hie, Assistant Professor of Chemical Engineering and supervisor of the Laboratory of Evolutionary Design.
“A lot of the progress in AI for biology has happened at the level of proteins,” Hie told IFLScience. He’s not wrong – such a project even won the Nobel Prize in Chemistry last year. But Hie’s lab is interested in taking things even further: “Earlier this year we developed this model called Evo 2. It’s a second generation of DNA language models.”
Many of us will have used ChatGPT or another generative AI tool. These bots are able to generate text that feels close to human language because the large language models underlying them have been trained on huge amounts of textual data. A DNA language model works on a similar principle, but it’s been trained on genetic data. So instead of spitting out the answer to a question, it produces DNA sequences.
Evo 2 can process inputs of up to 1 million DNA base pairs, Hie explained. In theory, the team knew it was possible that it could produce a sequence long enough to constitute an entire genome – they just hadn’t tested this ability in the lab.
But why start with bacteriophages or, as most scientists refer to them, phages? “They’re very historically significant,” Hie told IFLScience. “These are the first DNA genomes ever sequenced by Fred Sanger in the 70s, the first genomes ever synthesized. So, they’re also an attractive target for the first generatively designed genomes.”
There’s also a question of practicality. The phage the team used as a template, ΦX174, has a genome made up of a little over 5,000 base pairs. An individual human genome has over 3 billion. “It’s shorter than some human genes, actually, the entire genome of the phage,” Hie said.
[Image: Model structure of the bacteriophage ΦX174.]
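To make the DNA language model idea described above a little more concrete, here is a deliberately tiny sketch in Python. It is not Evo 2 and bears no resemblance to its architecture: it is a simple k-mer counting model, and the function names, the three training sequences, and the settings are all made up for illustration. What it shares with the real thing is the basic task of learning from existing sequences which base tends to come next, then extending a prompt one base at a time.

```python
import random
from collections import Counter, defaultdict


def train_kmer_model(sequences, k=3):
    """Count which base follows each k-base context in the training data."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts


def generate(counts, prompt, length, k=3, seed=0):
    """Extend `prompt` one base at a time by sampling from the learned counts."""
    rng = random.Random(seed)
    out = prompt
    while len(out) < length:
        options = counts.get(out[-k:])
        if not options:
            # Context never seen in training: fall back to a uniform guess.
            out += rng.choice("ACGT")
            continue
        bases, weights = zip(*options.items())
        out += rng.choices(bases, weights=weights)[0]
    return out


# Hypothetical training data, invented purely for this example.
training = ["ATGGCGTACGTTAGC", "ATGGCGTTCGATAGC", "ATGCCGTACGTTAGG"]
model = train_kmer_model(training)
print(generate(model, prompt="ATG", length=30))
```

A real DNA language model works at a vastly larger scale and with far longer context, but the job is the same: produce the next base, over and over. What makes a phage such a practical first target is how short its genome is.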
This matters because DNA synthesis is quite costly. Phages are also easy to manipulate in the lab, and their genomes have been well studied so we understand a lot about them.
How do you make a virus?
Once the team had asked their AI model to generate phage genomes, they had to test whether the output was plausible: was what it was spitting out actually a viable genome sequence? We’ve all heard stories about chatbots hallucinating…
Hie says they realized they were onto something very early on, when the generated sequences were “fooling the first line of bioinformatic tools” that verify whether genome sequences are real or not. The model “still definitely hallucinates”, but the team developed filters to weed out the non-viable sequences. That still left them with enough plausible material to move on to the next phase of their experiments.
This, we were surprised to learn, was “actually quite simple,” as Hie explained. “Samuel [King], the first author of the paper, did a lot of work in making the testing protocol as easy as possible.”
To go from a text file full of DNA bases to an actual functioning phage, you first have to synthesize the DNA molecules themselves, and there are companies that will do that for you. Once your little tubes of DNA arrive in the mail, you run a reaction to convert the linear strands of DNA into circular fragments called plasmids, which are more bacteria-friendly.
That’s important because the next step in your process is to combine the plasmids with a load of E. coli bacteria. Treating the bacteria with a heat shock causes the DNA to diffuse inside their cells, where the cells can get to work turning it into proteins.
Here, biology kind of takes over. If the phage genome that the AI model generated is legit, the proteins will start to self-assemble into a phage. You’ll know it’s worked when that phage goes on to kill the bacteria that were doing all this work for you. Brutal, but elegant.
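Before any of that lab work, generated sequences have to get past the filtering stage mentioned above. The article does not spell out what the team’s filters check, so the sketch below is purely hypothetical: the function names, the three checks (valid bases and sensible length, GC content in a rough range, at least one long open reading frame), and every threshold are our own illustrative assumptions, not the Stanford method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf(seq):
    """Length in bases of the longest ATG-to-stop stretch on the forward strand."""
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def looks_plausible(seq, min_len=4000, max_len=6500,
                    gc_range=(0.35, 0.55), min_orf=300):
    """Cheap sanity checks; all thresholds are illustrative assumptions."""
    if not seq or set(seq) - set("ACGT"):
        return False
    if not (min_len <= len(seq) <= max_len):
        return False
    if not (gc_range[0] <= gc_fraction(seq) <= gc_range[1]):
        return False
    return longest_orf(seq) >= min_orf


# Toy inputs, invented for this example (not real genomes).
toy_gene = "ATG" + "GCAGAT" * 60 + "TAA"
print(looks_plausible(toy_gene, min_len=300, max_len=400))  # True
print(looks_plausible("A" * 5000))                          # False
```

Whatever passes checks like these still has to be built and tested in bacteria, and that is where the lab protocol earns its keep.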
“It ended up being a pretty fast system because if the bacteria die, then the phage is probably working, right? And we can actually read out if the bacteria die pretty easily, because if the liquid is cloudy then there’s lots of bacteria, but if the liquid is clear then the bacteria probably died,” Hie explained.
In their study, the team report the production of 16 viable phages, many of which were found to be more effective at killing bacteria than the original ΦX174 on which they were modeled.
Where could this technology lead in the future?
We’ve touched on some of the reasons why phages were a convenient subject for this experiment, but it’s not just about convenience. There’s a huge amount of interest in the applications of phages, and AI generation could help solve some of the problems that exist in this field.
One major point of interest is in using phages to treat antibiotic-resistant infections. Phages kill bacteria – couldn’t we harness this to help us kill the bacteria that are killing us? This is not a new idea; it can apply to agricultural pests as well as human diseases, and there are places actively using these treatments. In some parts of the world, Hie told us, “you can go to the grocery store and get an over-the-counter phage cocktail.”
But leveraging AI and the control it allows over biological design could mean, in the future, that scientists could produce phages “to order” for personalized medicine, and diverse strains of phages that are known to produce better therapeutic results.
Hie is full of optimism for this technology, and it’s easy to get swept along in the impressive achievements described in the paper. But there’s another aspect to this story that also has to be considered: biosecurity.
Is the world prepared for AI-generated viruses?
“We’re nowhere near ready for a world in which artificial intelligence can create a working virus,” opens a recent Washington Post opinion piece authored by Tal Feldman, a Yale Law student with a background in AI and data science, and Jonathan Feldman, a computer science and biology student at Georgia Institute of Technology.
Feldman and Feldman concede that the Stanford team “played it safe” – indeed, Hie and co-authors included an entire section in the supplementary materials of the paper to specifically address these questions, and their attention to safety is made clear throughout. But, Feldman and Feldman ask, “what’s to stop others from using open data on human pathogens to build their own models?” Could we be heading towards a situation where bad actors can leverage this technology to make their own deadly human viruses?
“These models are progressing very fast,” Hie told IFLScience. “Even if we in our study were pretty safe and thoughtful, as the models progress and become more ubiquitous, potentially other groups might not be as thoughtful about safety.”
In terms of potential safeguards and regulatory actions that could be taken in the future, Hie said the Washington Post article put forward some good suggestions. He noted that actually synthesizing the DNA and producing the viruses may be simple to explain, but it’s “non-trivial” in practice. It’s also not necessarily true that someone with ill intent would have to wait to use AI to help them achieve their nefarious purposes: “You know, sadly, it’s actually not very difficult to design something dangerous right now.”
The problems that this technology could help us solve, on the other hand, are major, existential-crisis-level problems for humanity. Antimicrobial resistance is forecast to contribute to over 39 million deaths by 2050 if more is not done – a better route to phage therapy could help with that.
“I think the potential upside far outweighs the risk,” concluded Hie.
“We definitely need to do it responsibly and safely, but if we do continue to advance the technology of these models then they’ll have a major positive impact on humanity.”
The preprint, which is currently undergoing peer review and is yet to be published, is available via bioRxiv.