A string of about 30,000 genetic letters was all that it took to start the nightmare of covid-19, the death toll from which is likely to be more than 20m. Exactly how this story began has been hotly contested. Many think that covid-19’s emergence was a zoonosis—a spillover, as so many new pathogens are, from wild animals, for it resembles a group of coronaviruses found in bats. Others have pointed to the enthusiastic coronavirus engineering going on in laboratories around the world, but particularly in Wuhan—the Chinese city where the virus was first identified. In February 2021 a team of scientists assembled by the World Health Organisation (WHO) to visit Wuhan said a laboratory leak was extremely unlikely. However, this conclusion was subsequently challenged by the WHO’s boss, who said ruling out this theory was premature.
Two recent publications appear to have bolstered the case for a natural origin connected to a “wet market” in Wuhan. These markets sell live animals, often housed in poor conditions, and are known to be sites where new pathogens jump from animal to human. Early cases of covid-19 clustered around this market. But critics counter that there are so many missing data about the epidemic’s initial days that this portrait may be inaccurate.
The opposing idea of a leak from a laboratory is not implausible. The accidental escape of viruses from labs is more common than many people realise. The flu epidemic of 1977 is thought to have started this way. But an escaped virus does not imply an engineered virus. Virology labs are also full of the unengineered sort.
Research such as that done in Wuhan offers a number of ways for a virus to leak out. A researcher on a field trip could have picked it up in the wild and then returned to Wuhan, and so spread it to others there. Or someone might have been infected with a wild-collected virus in the laboratory itself. But some argue that sars-cov-2 could have been assembled in a laboratory from other viruses that were already to hand, and then leaked out.
Into this fray comes an analysis from an unlikely source. Alex Washburne is a mathematical biologist who runs Selva, a small startup in microbiome science based in New York. He is an outsider, although he has worked in the past on virological modelling as a researcher at Montana State University. For this study, Dr Washburne collaborated with two other scientists. One is Antonius VanDongen, an associate professor of pharmacology at Duke University, in North Carolina. The other, Valentin Bruttel, is a molecular immunologist at the University of Würzburg, Germany. Dr Washburne and Dr VanDongen have been active proponents of an investigation into the lab-leak theory.
The trio base their claim on a novel method of detecting plausibly lab-engineered viruses. Their analysis, published on October 20th on bioRxiv, a preprint server, suggests sars-cov-2 has some genomic features that they say would appear if the virus had been stitched together by some form of genetic engineering. By examining how many of these putative stitching sites sars-cov-2 has, and how short the fragments between them are, they attempt to assess how much the virus resembles others found in nature.
They start from the presumption that creating a genome as long as that of sars-cov-2 would mean combining shorter fragments of existing viruses together. For a coronavirus genome assembly they say an ideal arrangement would be to use between five and eight fragments, all under 8,000 letters long. Such fragments are created using restriction enzymes. These are molecular scissors which cut genomic material at particular sequences of genetic letters. If a genome does not have such restriction sites in opportune places, researchers typically create new ones of their own.
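The cutting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy sequence are invented for the example, and the cut is placed at the start of each recognition site for simplicity (the real enzymes cut a few letters to one side of it). The recognition sequences, GGTCTC for BsaI and CGTCTC for BsmBI, are the standard ones for these two enzymes.

```python
# Recognition sequences for the two enzymes, read 5'->3'.
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(genome, enzymes=RECOGNITION_SITES):
    """Sorted start positions of every recognition site on either strand."""
    genome = genome.upper()
    positions = set()
    for motif in enzymes.values():
        # A site on the opposite strand shows up as the reverse complement.
        for pattern in (motif, reverse_complement(motif)):
            start = genome.find(pattern)
            while start != -1:
                positions.add(start)
                start = genome.find(pattern, start + 1)
    return sorted(positions)


def fragment_lengths(genome, cut_positions):
    """Lengths of the pieces left after cutting a linear genome.

    Simplifying assumption: each cut falls at the start of a site.
    """
    inner = [p for p in cut_positions if 0 < p < len(genome)]
    edges = [0] + inner + [len(genome)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Toy 26-letter "genome" (invented): one BsaI site on the forward strand,
# one BsmBI site on the reverse strand.
toy_genome = "AAAA" + "GGTCTC" + "TTTTTTTT" + "GAGACG" + "CC"
toy_sites = find_sites(toy_genome)
toy_fragments = fragment_lengths(toy_genome, toy_sites)
print(toy_sites, toy_fragments)
```

Run on a real coronavirus genome, the same two functions would give the number of fragments and the length of the longest one, the two quantities on which the preprint's argument rests.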
They argue that the distribution of restriction sites for two popular restriction enzymes—BsaI and BsmBI—is “anomalous” in the sars-cov-2 genome. And the length of the longest fragment is far shorter than would be expected. They determined this by taking 70 disparate coronavirus genomes (not including sars-cov-2) and cutting them into pieces with 214 commonly used restriction enzymes. From the resulting collection, they were able to work out the expected lengths of fragments when coronaviruses are cut into varying numbers of pieces.
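One simple way to picture this comparison is sketched below in Python. The article does not describe the authors' statistical procedure, so this is an assumption-laden illustration rather than their method: the function names are hypothetical and the reference digests are made-up numbers. The idea is to group reference digests by how many fragments they produce, then ask what share of them have a longest fragment that is, as a fraction of the genome, no longer than the one observed.

```python
from collections import defaultdict


def longest_fragment_fraction(lengths):
    """Longest fragment as a fraction of the total genome length."""
    return max(lengths) / sum(lengths)


def build_reference(digests):
    """Group longest-fragment fractions by the number of fragments."""
    reference = defaultdict(list)
    for lengths in digests:
        reference[len(lengths)].append(longest_fragment_fraction(lengths))
    return reference


def share_at_or_below(observed_lengths, reference):
    """Share of reference digests, with the same fragment count, whose
    longest fragment is no longer (relatively) than the observed one.

    Returns None when the reference holds no digest with that count.
    A small share means the observed longest fragment is unusually short.
    """
    peers = reference.get(len(observed_lengths), [])
    if not peers:
        return None
    observed = longest_fragment_fraction(observed_lengths)
    return sum(1 for value in peers if value <= observed) / len(peers)


# Invented reference digests: each inner list is one genome cut by one enzyme.
toy_digests = [[10, 10, 80], [20, 30, 50], [30, 30, 40], [5, 5, 90], [50, 50]]
toy_reference = build_reference(toy_digests)

evenly_cut = share_at_or_below([33, 33, 34], toy_reference)
typical_cut = share_at_or_below([25, 25, 50], toy_reference)
print(evenly_cut, typical_cut)
```

In this toy data the evenly cut genome sits below every reference digest with three fragments, which is the kind of pattern the preprint describes as anomalous. Whether such a pattern could still arise by chance is exactly what the paper's critics dispute.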
The paper, which as a preprint has received no formal peer review, and which has not been accepted for publication in a journal, will be picked apart in the coming days—as well it should be, for this is the way that science works. Early reactions, though, have been deeply divided. Francois Balloux, a professor of computational systems biology at University College London, said he found the results intriguing. “Contrary to many of my colleagues, I couldn’t identify any fatal flaw in the reasoning and methodology. The distribution of BsaI/BsmBI restriction sites in sars-cov-2 is atypical.” Dr Balloux said the findings needed to be assessed in good faith. But Edward Holmes, an evolutionary biologist and virologist at the University of Sydney, said that every one of the features identified by the paper was natural and already found in other bat viruses. Someone engineering a virus, he argued, would undoubtedly introduce some new ones. He added, “there are a whole range of technical reasons why this is complete nonsense.”
Sylvestre Marillonnet, an expert in synthetic biology at the Leibniz Institute for Plant Biochemistry, in Germany, agreed that the number and distribution of these restriction sites did not look quite random, and that the number of silent mutations found in these sites did suggest that sars-cov-2 might have been engineered. (Silent mutations are a result of engineers wanting to make changes in a sequence of genetic material without making changes to the proteins encoded by that sequence.) But Dr Marillonnet also said that there are arguments against this hypothesis. One of them is the tiny length of one of the six fragments, something that “does not seem logical to me”.
The other point Dr Marillonnet makes is that it is not necessary for the restriction sites to have been present in the final sequence. “Why would people introduce and leave sites in the genome when it is not needed?” he wondered. Previous arguments in support of the possibility of a lab leak have stressed that a manipulated virus would not need to have any such tell-tales. However, Justin Kinney, a professor at Cold Spring Harbor Laboratory, in New York, said that researchers have created coronaviruses before and left such sites in the genome. He said the genetic signature indicates a virus ready for further experiments and said it needed to be taken seriously, but warned the paper needed rigorous peer review.
Erik van Nimwegen, from the University of Basel, says there are only small scraps of information and it is “hard to pull anything definitive out of that”. He adds, “one cannot really exclude at all that such a constellation of sites may have occurred by chance”. The authors of the paper concede this is the case. Kristian Andersen, a professor of immunology and microbiology at the Scripps Research Institute in La Jolla, California, described the pattern, on Twitter, as “random noise”.
Any conclusion that sars-cov-2 was engineered will be hotly contested. China denies the virus came from a Chinese lab, and has asked for investigations into whether it may have originated in America. Dr Washburne and his colleagues say their predictions are testable. If a progenitor genome to sars-cov-2 is found in the wild with restriction sites that are the same, or intermediate, it would make it more likely that this pattern arose by chance.
Any widely supported conclusion that the virus was genetically engineered would have profound ramifications, both political and scientific. It would put in a new light the behaviour of the Chinese government in the early days of the outbreak, particularly its reluctance to share epidemiological data from those days. It would also raise questions about what was known, when, and by whom about the presumably accidental escape of an engineered virus. For now, this is a first draft of science, and needs to be treated as such. But the scrutineers are already at work. ■
Editor’s note: The preprint “Endonuclease fingerprint indicates a synthetic origin of sars-cov-2” by Bruttel, Washburne and VanDongen can be found at bioRxiv.