Joint Subproject 4

Increasing pandemic preparedness by computational high-throughput virus discovery: Identification of RNA viruses and host reservoirs with high spillover risk

We have a long-standing interest in the discovery and evolution of viruses. Our strategy relies on two computational pipelines: Virushunter, which identifies next-generation sequencing (NGS) experiments that contain viral sequence reads, and Virusgatherer, which assembles the corresponding viral genomes.

Computational virus discovery workflow. Left column: Analysis of hundreds of thousands of unprocessed NGS data sets from the Sequence Read Archive (SRA) requires enormous capacities for data storage and management. Middle column: In a first step, we apply Virushunter to detect data sets (highlighted in red) containing viral sequence fragments with high sensitivity and specificity. Right column: Only the virus-positive data sets are submitted to targeted assembly of viral genomes by Virusgatherer. © dkfz.de
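The screen-then-assemble idea behind this workflow can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the implementation of Virushunter or Virusgatherer: the k-mer matching used for screening, the placeholder "assembler" that simply returns the longest read, the accession names `SRR_A` and `SRR_B`, and all function names and parameters are hypothetical and chosen only to show that the expensive second step runs on virus-positive data sets alone.

```python
from typing import Callable, Dict, List, Set

Reads = List[str]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen(reads: Reads, viral_kmers: Set[str], k: int, min_hits: int) -> bool:
    """Stage 1 (cheap): flag a data set if at least min_hits reads
    share a k-mer with the viral reference. Toy stand-in for a real
    sensitive sequence search."""
    hits = sum(1 for read in reads if kmers(read, k) & viral_kmers)
    return hits >= min_hits


def two_stage(
    datasets: Dict[str, Reads],
    viral_kmers: Set[str],
    k: int,
    min_hits: int,
    assemble: Callable[[Reads], str],
) -> Dict[str, str]:
    """Run the expensive stage 2 only on data sets that pass stage 1."""
    positives = {
        acc: reads
        for acc, reads in datasets.items()
        if screen(reads, viral_kmers, k, min_hits)
    }
    return {acc: assemble(reads) for acc, reads in positives.items()}


def longest_read(reads: Reads) -> str:
    """Placeholder for stage 2; NOT a real genome assembler."""
    return max(reads, key=len)


# Hypothetical toy data: one virus-positive and one negative data set.
K = 5
reference = "ATGGCGTACGTTAGC"  # made-up viral reference fragment
datasets = {
    "SRR_A": ["ATGGCGTACG", "CGTACGTTAGC", "TTTTTTTTTT"],
    "SRR_B": ["GGGGGGGGGG", "CCCCCCCCCC"],
}

result = two_stage(datasets, kmers(reference, K), K, min_hits=2,
                   assemble=longest_read)
print(result)
```

In this toy run only `SRR_A` reaches the second stage, which mirrors the design choice in the figure: screening is applied to every data set, while targeted assembly is reserved for the small virus-positive subset.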

In the proposed project, we will use these data to contribute to sustainable strategies for preparedness against future emerging infectious diseases (EIDs) and pandemics, and to better understand and control the current pandemic caused by SARS-CoV-2. For future EID preparedness, we will derive, validate and apply a spillover and pandemic risk score to identify the RNA viruses with the highest probability of zoonosis. For control of the current pandemic, we will compile an ACE2 receptor sequence catalogue to identify potential animal reservoirs of SARS-CoV-2 in particular and of viral EIDs in general. We will further study possible cross-interference between SARS-CoV-2 and other respiratory viruses as well as ubiquitous persistent viruses.

Phylogeny of 929 viral sequences discovered in the SRA that cluster with reference viruses from the Branch 3 RNA virus supergroup. This maximum-likelihood tree is based on conserved regions of RdRp protein sequences. The diameters of the circles at internal nodes are proportional to the bootstrap support of the respective branching events. The colored, right-justified horizontal lines extending from the tips of the tree indicate the organism group of the sequencing experiment in which a virus was discovered. © dkfz.de