Jun 19 – 22, 2024
Squamish, BC, Canada
Canada/Pacific timezone

VILOCA: LOCAL HAPLOTYPE RECONSTRUCTION AND MUTATION CALLING FOR SHORT- AND LONG-READ VIRAL SEQUENCING DATA

Not scheduled
20m
Squamish, BC, Canada

Oral Software, tools & methods

Speaker

Lara Fuhrmann (ETH Zurich)

Description

RNA viruses exist in large heterogeneous populations within their host, which affects disease progression and treatment outcomes. To effectively control spread and develop targeted treatments and vaccines, quantitative characterization of within-host viral genetic diversity is crucial. Next-generation sequencing allows for comprehensive analysis of viral populations, from single-nucleotide variants to local and global haplotype sequences. Various methods have been developed for this purpose in recent years. However, recent benchmarking studies showed that these methods perform very poorly on virus populations with high mutation rates and consequently high diversity, such as Human Immunodeficiency Virus (HIV). We present VILOCA, a method for mutation calling and reconstruction of local haplotypes from both short- and long-read sequencing data. Local haplotypes are haplotype sequences spanning local genomic regions of roughly the length of the input reads. VILOCA recovers local haplotypes, even at low frequencies, by using a Dirichlet process mixture model to cluster reads around their unobserved haplotypes. We compared the performance of VILOCA to LoFreq, CliqueSNV, and other tools in terms of mutation calling and haplotype reconstruction on simulated Illumina, PacBio, and Oxford Nanopore samples, as well as on three experimental samples of HIV, Potato Virus Y, and Influenza A strains. On simulated and experimental Illumina samples, VILOCA performed better than or similarly to the other methods. On the simulated long-read data, VILOCA recovered 60-100% of the ground-truth mutations (recall) with high precision (60-100%), compared to 20-60% recall and 40-80% precision for the second-best method. In experimental mixtures sequenced on MinION, VILOCA outperformed the other methods in reliable mutation calling, with an F1 score of 88% compared to 75% for the second-best method.
In summary, VILOCA provides significantly improved accuracy in mutation and haplotype calling, especially for long-read sequencing data, and therefore facilitates the comprehensive characterization of heterogeneous within-host viral populations.
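
The recall, precision, and F1 figures quoted above compare a set of called mutations against a ground-truth set. The following is a minimal Python sketch of how such metrics are typically computed. It is not VILOCA's evaluation code: the function name `score_calls`, the tuple representation of a mutation, and the example data are all hypothetical and chosen only for illustration.

```python
def score_calls(called, truth):
    """Return (precision, recall, f1) for called vs. ground-truth mutations.

    Both arguments are iterables of hashable mutation identifiers.
    This helper is illustrative and not part of any published tool.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    # Precision: fraction of called mutations that are real.
    precision = true_positives / len(called) if called else 0.0
    # Recall: fraction of real mutations that were called.
    recall = true_positives / len(truth) if truth else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up mutations encoded as (position, reference base, alternate base).
truth = {(101, "A", "G"), (250, "C", "T"), (312, "G", "A"), (480, "T", "C")}
called = {(101, "A", "G"), (250, "C", "T"), (312, "G", "A"), (599, "A", "T")}

precision, recall, f1 = score_calls(called, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy example three of the four called mutations are true, and three of the four true mutations are recovered. Because F1 is a harmonic mean, a caller with high recall but low precision (or the reverse) scores lower than one that balances both.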

Primary authors

Lara Fuhrmann (ETH Zurich), Benjamin Langer (ETH Zurich), Ivan Topolsky (ETH Zurich), Prof. Niko Beerenwinkel (ETH Zurich)
