The Global Initiative on Sharing Avian Influenza Data (GISAID), a large database launched in 2008 to collect and share the genome information of influenza viruses, is a key tool for scientific cooperation at the international level. Through it, scientists have been able to access genetic information of the latest strains of flu, including bird flu, and collaborate on responses.
Recently, however, the platform has been involved in a controversy over its data access practices. And in attempting to address it, GISAID stepped into another mess, this time over the timeline of the covid genome release.
In a statement published on Mar. 21 to address an issue of data availability, GISAID included information on its background to establish its authority. Among other things, it described its role in contributing to covid research, saying that in early 2020, “the China CDC made the first whole-genome sequences available to the world via GISAID shortly after midnight on 10 January 2020 UTC.”
Many virologists involved in the early stages of covid research beg to differ.
A better covid origin debate?
This isn’t the first time GISAID has claimed to have been the first to publish the covid virus’s genome. Its version of events was included in a CNN documentary about covid discovery, and in a landmark covid study published in the New England Journal of Medicine. Many scientists have been contesting this narrative for years, reports Science, which published a detailed and fascinating breakdown of the debate over who got the genome first.
As a broad group of international scientists who studied covid remember it, it was Edward Holmes, a virologist at the University of Sydney, who published the genome first, after receiving it from Zhang Yong-Zhen, a virologist at Fudan University. The genome went live on the forum virological.org early in the morning of Jan. 11, 2020. GISAID, they claim, came only later, on Jan. 12.
Zhang and Holmes were among the authors of the first paper detailing the emergence of a new coronavirus on Jan. 7, 2020. Zhang had previously submitted the genome to another platform on Jan. 3, but he hadn’t made it public in the tense climate of China’s denial of the outbreak.
He eventually did, ahead of GISAID. The digital time-stamps of tweets published in early January 2020 by researchers commenting on the genome release suggest no discussion of the genome before Jan. 11, though that is hardly definitive proof of either version.
The internet’s bad memory
GISAID’s own database reports the submission on Jan. 10, but there is no other online evidence of its existence until Jan. 12, 2020. The World Health Organization (WHO), too, does not mention the Jan. 10, 2020 submission in its official timeline of covid discovery.
But while the agency reports receiving the genome sequence from the Chinese authorities on Jan. 12, 2020, it does not mention virological.org or Zhang. Still, there are no traces of GISAID announcing the release on Jan. 10, 2020, or of other scientists commenting on it.
According to Science, no tweets support the database’s assertion. But GISAID, which is strongly defending its claim, has a screenshot of communications with the WHO that shows some discussions about the virus, though no references to its genome being shared. Numerous scientists involved in early covid work told the publication they did not remember getting the genome from GISAID, nor know anyone who did.
As Zhang says, “the internet has memories,” and though the ones surrounding this case might be incomplete, they appear to point to his version as the more accurate.
It’s all a little convoluted, and lacks a clear resolution. Still, this apparently banal disagreement could end up further eroding trust in GISAID—and perhaps even in genome sharing, a critical aspect of the scientific community’s response to deadly viruses.