Pooled Lab Discards for Pathogen Detection

Authors Jeff Kaufman & Lennart Justen
Date May 14, 2025

We gratefully acknowledge Simon Grimm and Harmon Bhasin for their help investigating different types of clinical lab discards.

A “stealth” pathogen, one with a very long period of asymptomatic spread, could potentially spread extremely widely before detection. The consequences of this kind of spread could be severe. Since most pathogens that could spread this way would have DNA or RNA genomes, this suggests an early warning system could look for suspicious nucleic acid sequences. Metagenomic sequencing is an attractive tool here, because it doesn’t require choosing in advance which sequences to look for. Instead, you attempt to survey the full range of nucleic acids present in a sample, and then apply many different computational algorithms to flag patterns for expert review.

In order to apply this approach, however, you need biological samples from humans. These samples can be collected from the environment or directly from individuals.

Environmental samples like wastewater and air tend to have many individuals contributing to each sample, and so are very low cost per individual. On the other hand, they require very deep sequencing to make up for the extremely small fraction of nucleic acids that come from human-infecting pathogens. For example, when looking for respiratory pathogens in municipal wastewater, if 1% of people caught the flu in a week, we’d only expect about 1 in 1 billion sequencing reads to match influenza. Other wastewater options like pooled airplane lavatory waste are better, but likely only by 1-2 orders of magnitude.
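To make the sequencing-depth implication concrete, here is a minimal sketch of the arithmetic. It assumes that the expected number of matching reads scales linearly with both total reads and relative abundance. The function name `reads_needed` and the target of 100 matching reads are illustrative assumptions, not figures from our analysis.

```python
# Figure from the text: roughly 1 in 1 billion municipal wastewater reads
# would match influenza if 1% of people caught the flu in a week.
READS_PER_MATCH_MUNICIPAL = 1_000_000_000


def reads_needed(target_matches, reads_per_match, improvement=1):
    """Total reads needed to expect `target_matches` pathogen reads.

    `improvement` is how many times higher the pathogen's relative
    abundance is than the baseline (the text suggests roughly 10-100x
    for pooled airplane lavatory waste). Assumes linear scaling.
    """
    return target_matches * reads_per_match / improvement


# Hypothetical target: 100 matching reads (an assumption for illustration).
target = 100
for label, improvement in [
    ("municipal wastewater", 1),
    ("airplane waste, 10x better", 10),
    ("airplane waste, 100x better", 100),
]:
    total = reads_needed(target, READS_PER_MATCH_MUNICIPAL, improvement)
    print(f"{label}: about {total:.0e} reads")
```

Even under the optimistic 100x assumption, this sketch still calls for on the order of a billion reads to see a modest number of influenza reads at 1% weekly incidence.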

Individual samples, like blood or nasal swabs, tend to have a greater proportion of reads coming from humans and human-infecting viruses, requiring much less sequencing, but are more expensive to collect on a per-person basis. Our volunteer swab sampling program, for example, requires sampler time, sampling materials, and participant compensation for every sample.

Given the distinct advantages of these different approaches[1], we’re excited about sampling strategies that can cheaply access many individual samples, approaching the coverage of municipal wastewater while increasing sensitivity. We’ve previously written about one such approach, accessing leftover samples from the blood and plasma supply, and here we discuss another approach we’re excited about: accessing laboratory discards en masse.

The medical system already involves the collection and centralized processing of a large number of samples from humans, as part of standard medical diagnosis. After testing is complete, the remainder of the sample (“discard”) is potentially available for this kind of public health testing: in the US this kind of sample does not require patient consent for research[2], and other countries may also have favorable regulatory treatment. There is precedent for using discard samples in public health monitoring (Pilewskie et al. 2025) and in many other kinds of biotech research.

The most useful samples from a biosurveillance perspective are ones that provide broad coverage of pathogens, particularly those not well-covered by other sampling approaches, and where nucleic acids from these pathogens would comprise a high fraction of the nucleic acids in the sample.

The NAO did a shallow investigation into different types of potential clinical sample discards and came away thinking that the most promising sample types were probably blood (and its components), sputum, lymph and lymph node biopsies, and cerebrospinal fluid. Additionally, accessing laboratory discards could potentially be a sufficiently good way to collect nasal swabs that it wouldn’t make sense to continue collecting swab samples from volunteers.

We do see a few main issues with this approach, however:

  • Quantity: while some sample types (e.g., blood) are collected extremely frequently, others (e.g., sputum, lymph, and cerebrospinal fluid) are less so. Because of the high relative abundance of pathogens we expect in these samples, sequencing would be most cost-effective with pools of 200+ individuals, and assembling pools of this scale might not be practical for the less commonly collected sample types.

  • Delay: it’s not usually known immediately how much of a sample is excess. Instead, the lab needs to complete the primary test(s). As a result, discards usually come with a multiple-day delay. If a pathogen is doubling in the population every 1-4 days, each day of delay means it has become roughly 1.2-2x more widespread before you could detect it.

  • Cost: many biotech organizations are interested in these samples. In 2023 and early 2024 we looked into buying discards from multiple small US-based vendors and weren’t able to find any below $10/person. This might be workable, depending on funding, but the numbers look dramatically different if we can get down to the $1/person range.
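The delay and cost points above come down to simple arithmetic, sketched below. This is a rough illustration under stated assumptions: exponential growth with a fixed doubling time, a hypothetical 3-day delay, and a cost that counts only sample acquisition (not sequencing or processing). The function names `growth_factor` and `pool_acquisition_cost` are ours for illustration.

```python
def growth_factor(delay_days, doubling_time_days):
    """How many times more widespread a pathogen is after `delay_days`,
    assuming steady exponential growth with the given doubling time."""
    return 2 ** (delay_days / doubling_time_days)


def pool_acquisition_cost(pool_size, price_per_person):
    """Cost of acquiring the samples for one pool (excludes sequencing)."""
    return pool_size * price_per_person


# Doubling times of 1-4 days are from the text; the 3-day delay is a
# hypothetical stand-in for "multiple-day".
for doubling_time in (1, 2, 4):
    factor = growth_factor(3, doubling_time)
    print(f"doubling every {doubling_time}d, 3d delay: {factor:.1f}x spread")

# A pool of 200 individuals at the two price points mentioned in the text.
for price in (10, 1):
    cost = pool_acquisition_cost(200, price)
    print(f"200-person pool at ${price}/person: ${cost}")
```

Under these assumptions, a few days of delay can cost anywhere from under one doubling to several, and moving from $10 to $1 per person cuts the acquisition cost of each 200-person pool by a factor of ten.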

Overall, blood does very well here. It’s a sample type that has very different sensitivity tradeoffs than wastewater or nasal swabs, it’s processed at a high enough quantity that assembling large pools is practical, and because so much is collected it’s likely there are options that offer lower delay and/or cost.

Footnotes

  1. Another helpful feature of individual samples is that if you detect some signal of pathogen spread or genetic engineering, you can be more confident that it’s indicative of something spreading among humans, compared to originating from plants, animals, or other microbes in the environment.

  2. This is a complex area, but the key regulation is 45 CFR 46. In 46.102(e)(1) a “human subject” is defined as “a living individual about whom an investigator (whether professional or student) conducting research: (i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or (ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.” The FDA also has guidance stating that it “does not intend to object to the use, without informed consent, of leftover human specimens – remnants of specimens collected for routine clinical care or analysis that would otherwise have been discarded – in investigations that meet the criteria for exemption from the Investigational Device Exemptions (IDE) regulation at 21 CFR 812.2(c)(3), as long as subject privacy is protected by using only specimens that are not individually identifiable.” A more conservative approach involves working on samples where the patient gave “broad consent” for research use at collection time. We haven’t decided yet what our approach would be if we were to begin working with this kind of sample.