Description
Wastewater samples collected at municipal wastewater treatment plants have become a promising source of data to complement traditional infectious disease surveillance. In addition to measuring pathogen concentrations in wastewater over time to monitor transmission dynamics, viral RNA sequencing has been used to investigate the genetic diversity of SARS-CoV-2 and track the emergence and relative abundance of variants. To analyse sequences from wastewater samples, most approaches use mutation calling on read alignments, combined with downstream statistical analyses to detect specific variants or deconvolve lineage proportions. In contrast, phylogenetic inference, which is widely used for epidemiological analysis of whole-genome sequences from clinical samples, has rarely been used in wastewater-based epidemiology.
The limited application of phylogenetic methods to wastewater sequences is likely due in part to the challenges posed by sequencing reads obtained through tiling amplicon next-generation sequencing (NGS) of wastewater. In particular, because wastewater samples contain a heterogeneous mix of genomic variants from a large pool of individuals, constructing consensus sequences for downstream phylogenetic analysis is highly error-prone: the consensus can be a chimera that combines mutations from different variants co-occurring in the population. In addition, PCR-based amplification of small amounts of viral RNA against a large genetic background introduces considerable noise and potential bias.
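The chimera problem can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the two four-base variants, the amplicon layout, the read depths, and the helper `majority_consensus`. None of it comes from the study itself. The sketch assumes a simple per-site majority vote, which is one common way to build a consensus, and shows that when the dominant variant differs between amplicons, the resulting sequence matches neither variant.

```python
from collections import Counter


def majority_consensus(pileup):
    """Per-site majority vote: the most frequent base at each site wins."""
    return "".join(Counter(bases).most_common(1)[0][0] for bases in pileup)


# Two hypothetical variants that differ at the first and last site.
variant_a = "ACGT"
variant_b = "TCGA"

# Hypothetical amplicon layout: amplicon 1 covers sites 0-1, amplicon 2 covers sites 2-3.
site_to_amplicon = [0, 0, 1, 1]

# Hypothetical read counts (variant A, variant B) per amplicon.
# Variant A dominates amplicon 1 and variant B dominates amplicon 2,
# as could happen through amplification bias or sampling noise.
reads_per_amplicon = [(7, 3), (4, 6)]

# Build the pileup: the list of bases observed at each site across all reads.
pileup = []
for site, amplicon in enumerate(site_to_amplicon):
    n_a, n_b = reads_per_amplicon[amplicon]
    pileup.append([variant_a[site]] * n_a + [variant_b[site]] * n_b)

consensus = majority_consensus(pileup)
print(consensus)  # ACGA: first half from variant A, second half from variant B
print(consensus in (variant_a, variant_b))  # False: the consensus is a chimera
```

Restricting the vote to reads from a single amplicon avoids mixing across amplicons in this toy example, which is one motivation for working at the amplicon level rather than with whole-genome consensus sequences.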
In this work, we evaluate the phylogenetic information that can be obtained from wastewater sequences, taking into account these important limitations. We analyse tiling amplicon reads from longitudinal samples collected during the COVID-19 pandemic at the wastewater treatment plant of Zurich, Switzerland. We construct phylogenetic trees at the amplicon level under different assumptions about the amplification process and compare their topology and temporal signal with trees constructed from whole-genome clinical sequences from the same population and time period. We discuss the implications of our results for the future potential of phylogenetic methods in wastewater-based epidemiology.