Computational threat detection
Designing new computational methods to detect pathogens in sequencing data
The NAO’s computational challenge
To detect and respond to an emerging pathogen in metagenomic sequencing data, one must first distinguish that pathogen’s sequences from a complex and variable microbial background. This is no easy task, especially when a pathogen is at very low prevalence, differs significantly in its genome sequence from known threats, or has been deliberately engineered to evade detection. A core part of the NAO’s mission is to solve this problem by developing methods that provide robust and reliable early warning for as broad a range of pathogens as possible.
Exponential growth detection
One promising approach to addressing the NAO’s computational challenge is to analyze the growth pattern of sequences in metagenomic data. To represent a serious threat, any pathogen must grow to become highly abundant in its target population. Consequently, by focusing on sequences whose abundance shows sustained, roughly exponential growth over time, we can potentially identify a wide range of pathogens, both natural and engineered, while making very few assumptions about their underlying biology.
Despite its promise, exponential growth detection faces significant challenges, including open questions about its sensitivity and its vulnerability to false positives when applied to complex systems like wastewater. The NAO’s computational team is working to design exponential growth detection systems and evaluate their performance on real metagenomic datasets.
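As a rough illustration of the idea, the sketch below estimates a growth rate for each sequence by fitting a straight line to the logarithm of its relative abundance over time, then flags sequences whose rate exceeds a threshold. This is a minimal sketch under stated assumptions, not the NAO’s method: the function names, the pseudocount, and the `min_rate` threshold are all hypothetical choices made for illustration, and a real system would also need to account for noise, sequencing depth, and multiple testing.

```python
import math


def log_linear_growth_rate(days, counts, totals, pseudocount=0.5):
    """Least-squares slope of log relative abundance against time.

    days:   sampling times (e.g. days since the first sample)
    counts: reads matching one sequence in each sample
    totals: total reads in each sample
    The pseudocount (an illustrative assumption) keeps zero counts finite.
    """
    if not (len(days) == len(counts) == len(totals)) or len(days) < 2:
        raise ValueError("need at least two samples of equal length")
    ys = [math.log((c + pseudocount) / t) for c, t in zip(counts, totals)]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    return sxy / sxx


def flag_growing_sequences(days, counts_by_seq, totals, min_rate=0.05):
    """Return {sequence_id: rate} for sequences growing at least min_rate per day.

    min_rate is a hypothetical threshold, not a recommended value.
    """
    flagged = {}
    for seq_id, counts in counts_by_seq.items():
        rate = log_linear_growth_rate(days, counts, totals)
        if rate >= min_rate:
            flagged[seq_id] = rate
    return flagged


# Toy data: weekly samples with equal sequencing depth.
days = [0, 7, 14, 21, 28]
totals = [1_000_000] * 5
counts_by_seq = {
    "growing": [2, 4, 8, 16, 32],   # doubles every week
    "flat": [10, 11, 9, 10, 10],    # fluctuates around a constant level
}
flagged = flag_growing_sequences(days, counts_by_seq, totals)
```

On this toy data only the doubling sequence is flagged. A simple threshold on a fitted slope like this is exactly where the sensitivity and false-positive concerns described above arise: low counts make the slope noisy, and many background sequences fluctuate for reasons unrelated to an outbreak.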
Other approaches
Beyond exponential growth detection, the NAO also recognizes the potential of other methods for detecting pathogenic threats in metagenomic data. These include strategies to detect genetically engineered sequences as indicators of dangerous human tampering; thorough characterization of a location’s normal background to identify concerning deviations; and identification of novel sequences that emerge simultaneously across multiple monitoring locations. The NAO remains dedicated to exploring and refining these and other approaches to improve our ability to reliably detect future outbreaks.