Conveners
Software, Tools & Methods
- Caroline Colijn (Simon Fraser University)
- Nadia Neuner-Jehle (Swiss Tropical and Public Health Institute, University of Basel, Swiss Institute of Bioinformatics)
Naming matters, especially when viruses evolve faster than our labels. Non-polio enteroviruses (NP-EVs) cause diseases ranging from mild infections to severe neurological illnesses in children. Yet their clade nomenclature often reflects chance sampling and outdated circulation patterns rather than current evolutionary understanding. Subjective naming systems complicate communication between...
HIV-1 persistence during antiretroviral therapy (ART) remains the main barrier to a cure. Eliminating persistent viral reservoirs is extremely challenging, in part because HIV-1 exhibits extraordinary genetic variability, generating heterogeneous populations capable of drug resistance and immune evasion. Although ART dramatically reduces viral population size, reservoirs retain sufficient genetic...
The increasing frequency of infectious disease outbreaks, driven by changing Earth systems and globalization, underscores the need for reliable short-term forecasting methods to inform public health decision-making. However, most short-term case count forecasting approaches depend on extensive historical data for model training and are therefore poorly suited for emerging pathogens, early...
Direct-from-sample sequencing is an essential tool in modern clinical and public health microbiology, particularly for sequencing viruses or unculturable pathogens, and for diagnostic metagenomics. Targeted metagenomics uses sets of oligonucleotide probes (up to hundreds of thousands) to selectively sequence targeted loci or genomes from hundreds of pathogen species simultaneously, removing the need to...
Reconstructing the phylogenetic trees of pathogens responsible for disease outbreaks has become a standard tool among public health agencies. But particularly when sampling is non-constant and non-uniform, it can be challenging to link sequences and corresponding phylogenetic trees to epidemiological quantities of interest such as rates of infection, reproduction numbers, or serial intervals....
Genomic language models (gLMs) have emerged as powerful tools for learning numerical representations of DNA sequences. Most existing models, however, are either not trained on viral genomes or trained on only limited viral references, and they lack systematic evaluation frameworks tailored to virology. Here, we introduce vir2vec, a 422-million-parameter decoder-only genomic language model obtained through continual...