May 6 – 9, 2025
Abbaye de Royaumont, Asnières-sur-Oise, France
Europe/Paris timezone

INTEGRATING SYNTHETIC PHYLODYNAMIC LIBRARIES WITH OBSERVED GENOMIC AND EPIDEMIOLOGICAL DATA TO IMPROVE INFECTIOUS DISEASE FORECASTING

Not scheduled
20m
Abbaye de Royaumont, 95270 Asnières-sur-Oise, France
Oral
Software, tools & methods

Speaker

Dr Lauren Castro (Los Alamos National Laboratory)

Description

Climate change and globalization are expected to increase the frequency of infectious disease outbreaks, underscoring the need for reliable forecasting methods to inform decision-making. The COVID-19 pandemic highlighted significant limitations in forecasting accuracy, particularly in scenarios where episodic selection drives the emergence of new genetic variants—such as the Delta and Omicron variants—resulting in surges in case counts.

Advances in high-throughput sequencing and the availability of extensive genomic data offer an opportunity to incorporate patterns of viral diversification and adaptation into forecasting models. Previous studies have shown that integrating the proportions of circulating variants of concern can improve the accuracy of COVID-19 case and death forecasts. However, these studies are limited by their retrospective design and by assumptions about real-time sequence data availability, and they have limited applicability to diseases without large genomic datasets.

To address these challenges, we propose a novel approach that leverages synthetic libraries of fast-evolving pathogens to enhance forecasting. Using the previously published phylodynamic model MutAntiGen, we simulate a comprehensive synthetic library of paired viral evolution and epidemiological dynamics. By systematically varying evolutionary, epidemiological, and immunological parameters, we capture a wide range of outbreak scenarios, encompassing differences in viral turnover dynamics and outbreak frequencies.
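One way to picture the systematic parameter variation described above is a full factorial sweep, in which every combination of parameter values defines one simulated scenario. The sketch below is illustrative only: the parameter names and values are hypothetical and are not MutAntiGen's actual configuration keys, and the abstract does not state which sampling design was used.

```python
from itertools import product

# Hypothetical parameter names and values, chosen for illustration only.
# They are NOT MutAntiGen's actual configuration keys.
GRID = {
    "mutation_rate": [1e-4, 5e-4, 1e-3],      # evolutionary
    "transmission_rate": [0.3, 0.5],          # epidemiological
    "immunity_waning_rate": [0.01, 0.05],     # immunological
}


def build_parameter_sets(grid):
    """Return one dict per combination of parameter values (full factorial design)."""
    names = sorted(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


parameter_sets = build_parameter_sets(GRID)
print(len(parameter_sets), "scenarios to simulate")
```

Each dictionary in `parameter_sets` would then be handed to the simulator to produce one paired evolutionary and epidemiological trajectory. For larger parameter spaces, a space-filling design such as Latin hypercube sampling is a common alternative to a full grid.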

From these simulations, we derive features commonly used in phylodynamics, including measures of genetic diversity and topological properties. These features are then incorporated into a gradient boosting model framework that combines synthetic and real-time data. Using COVID-19 as a case study, we demonstrate that integrating synthetic phylodynamic time series with observed historical data significantly improves forecasting accuracy compared to models relying solely on historical sequence and case data.
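As a rough illustration of the kind of features mentioned above, the sketch below computes one genetic diversity measure (mean pairwise nucleotide diversity) and one topological measure (the Colless imbalance index) for a single time window. These two statistics are standard in phylodynamics, but the abstract does not list the exact feature set, so their use here is an assumption. The function names, the nested-tuple tree encoding, and the toy inputs are all hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise proportion of differing sites among aligned sequences."""
    if len(sequences) < 2:
        return 0.0
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length for s1, s2 in pairs
    )
    return total / len(pairs)


def colless_index(tree):
    """Colless imbalance of a rooted binary tree.

    The tree is encoded as nested 2-tuples; anything that is not a tuple
    is treated as a leaf.
    """
    def walk(node):
        # Returns (number of leaves below node, accumulated imbalance).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    return walk(tree)[1]


# Toy inputs, invented for illustration only.
window_sequences = ["ACGTAC", "ACGTAT", "ACCTAT"]
window_tree = ((("s1", "s2"), "s3"), "s4")  # fully unbalanced (caterpillar) tree

features = {
    "nucleotide_diversity": nucleotide_diversity(window_sequences),
    "colless_index": colless_index(window_tree),
}
print(features)
```

Computed per time window, such values form a feature time series that can be placed alongside case counts as inputs to a regression model such as gradient boosting.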

This work bridges forecasting and genomic epidemiology, offering a powerful framework for addressing data gaps and enhancing preparedness for future outbreaks of fast-evolving pathogens.

Primary authors

Dr Lauren Castro (Los Alamos National Laboratory)
Dr Emma Goldberg (Los Alamos National Laboratory)
Dr William Fischer (Los Alamos National Laboratory)
Dr Alexander Murph (Los Alamos National Laboratory)
Dr Lauren Beesley (Los Alamos National Laboratory)
Dr Dave Osthus (Los Alamos National Laboratory)
