Nucleic Acid Observatory

Sharing Data

In our work comparing different approaches to pathogen detection and piloting a biosurveillance system, we are collecting a range of data, primarily metagenomic sequencing. We aim to share most of this data as quickly as possible, but some is subject to access restrictions.

We’re currently able to share data from:

  • Boston Swab Sampling: we have been collecting nasal swab samples at busy public places around greater Boston. Sequences we identify as potentially coming from human-infecting viruses are linked from our sample log in FASTQ format. In the future we plan to make all non-human-genome sequences public, but we first need to improve our filtering to ensure we don't accidentally share human DNA.
  • Los Angeles Wastewater Sequencing: we collaborated with Jason Rothman, formerly of Katrine Whiteson’s lab at the University of California, Irvine, and now at his own lab at the University of California, Riverside, to sequence and analyze wastewater. The sequencing is complete and available on SRA (PRJNA1198001), with a total of 45B read pairs. We’re aiming to make the metadata and other details public with a data paper in late 2024, but in the meantime please contact us if you have questions.
  • Ongoing Wastewater Sequencing: we're collaborating with Marc Johnson's group at the University of Missouri to sequence wastewater from multiple metropolitan areas. As of 2024-11-05 this totals 79B read pairs going back to samples collected in December 2023, with about 17B more added every other week. Marc intends to submit a manuscript and make these public in early 2025, after which we'll be making future sequencing runs public on an ongoing basis. In the meantime, if you'd like access to this data please send us a description of your planned research and we may be able to share it.
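
For readers new to the FASTQ format mentioned above, here is a minimal Python sketch of reading FASTQ records. It assumes uncompressed files in the common four-lines-per-record layout (header, sequence, separator, quality). The function name `read_fastq` and the sample record are hypothetical, made up for illustration, and are not taken from any NAO tooling or data.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) tuples from a FASTQ stream.

    Assumes four lines per record; wrapped (multi-line) sequences
    are not handled by this sketch.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # skip blank lines between or after records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header line, got: {header!r}")
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:].strip(), sequence, quality


# Made-up record for illustration only; not real sample data.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
```

For a downloaded file, the same function could be called on an open handle, for example `with open(path) as f: ...`, or on `gzip.open(path, "rt")` if the file is gzip-compressed.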