1. Oltean HN, Allen KJ, Frisbie L, Lunn SM, Torres LM, Manahan L, Painter I, Russell D, Singh A, Peterson JM, Grant K, Peter C, Cao R, Garcia K, Mackellar D, Jones L, Halstead H, Gray H, Melly G, Nickerson D, Starita L, Frazar C, Greninger AL, Roychoudhury P, Mathias PC, Kalnoski MH, Ting CN, Lykken M, Rice T, Gonzalez-Robles D, Bina D, Johnson K, Wiley CL, Magnuson SC, Parsons CM, Chapman ED, Valencia CA, Fortna RR, Wolgamot G, Hughes JP, Baseman JG, Bedford T, Lindquist S. Sentinel Surveillance System Implementation and Evaluation for SARS-CoV-2 Genomic Data, Washington, USA, 2020-2021. Emerg Infect Dis 2023; 29:242-251. [PMID: 36596565; PMCID: PMC9881772; DOI: 10.3201/eid2902.221482] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023]
Abstract
Genomic data provides useful information for public health practice, particularly when combined with epidemiologic data. However, sampling bias is a concern because inferences from nonrandom data can be misleading. In March 2021, the Washington State Department of Health, USA, partnered with submitting and sequencing laboratories to establish sentinel surveillance for SARS-CoV-2 genomic data. We analyzed available genomic and epidemiologic data during presentinel and sentinel periods to assess representativeness and timeliness of availability. Genomic data during the presentinel period was largely unrepresentative of all COVID-19 cases. Data available during the sentinel period improved representativeness for age, death from COVID-19, outbreak association, long-term care facility-affiliated status, and geographic coverage; timeliness of data availability and captured viral diversity also improved. Hospitalized cases were underrepresented, indicating a need to increase inpatient sampling. Our analysis emphasizes the need to understand and quantify sampling bias in phylogenetic studies and continue evaluation and improvement of public health surveillance systems.
2. Volk D, Yang-Turner F, Didelot X, Crook DW, Wyllie D. Catwalk: identifying closely related sequences in large microbial sequence databases. Microb Genom 2022; 8. [PMID: 35771206; PMCID: PMC9455716; DOI: 10.1099/mgen.0.000850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
There is a need to identify microbial sequences that may form part of transmission chains, or that may represent importations across national boundaries, amidst large numbers of SARS-CoV-2 and other bacterial or viral sequences. Reference-based compression is a sequence analysis technique that allows both compact storage of sequence data and comparisons between sequences. Published implementations of the approach are being challenged by the large sample collections now being generated. Our aim was to develop fast software for detecting highly similar sequences in large collections of microbial genomes, including millions of SARS-CoV-2 genomes. To do so, we developed Catwalk, a tool that bypasses bottlenecks in the generation, comparison and in-memory storage of microbial genomes generated by reference mapping. It is a compiled solution, coded in Nim to increase performance. It can be accessed via command line, REST API or web server interfaces. We tested Catwalk using both SARS-CoV-2 and Mycobacterium tuberculosis genomes generated by prospective public-health sequencing programmes. Pairwise sequence comparisons, using clinically relevant similarity cut-offs, took about 0.39 and 0.66 μs, respectively; in 1 s, between 1 and 2 million sequences can be searched. Catwalk operates about 1700 times faster than, and uses about 8 % of the RAM of, a Python reference-based compression and comparison tool in current use for outbreak detection. Catwalk can rapidly identify close relatives of a SARS-CoV-2 or M. tuberculosis genome amidst millions of samples.
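The idea of reference-based compression described in this abstract can be illustrated with a minimal Python sketch. It is not Catwalk's implementation (which the abstract says is written in Nim); the names `Compressed`, `compress`, and `snp_distance`, and the rule of ignoring positions called `N`, are assumptions made for illustration. Each sample is stored only as the positions where it differs from a reference, and two samples are compared by looking at those stored positions alone.

```python
from dataclasses import dataclass


@dataclass
class Compressed:
    """Hypothetical compact form of one reference-mapped sequence."""
    diffs: dict      # position -> base, only where the sample differs from the reference
    unknown: set     # positions with an uncalled base ("N")


def compress(reference: str, sample: str) -> Compressed:
    """Store a sample as its differences from the reference."""
    if len(reference) != len(sample):
        raise ValueError("sample must be mapped to the same coordinates as the reference")
    diffs, unknown = {}, set()
    for pos, (ref_base, base) in enumerate(zip(reference, sample)):
        if base == "N":
            unknown.add(pos)
        elif base != ref_base:
            diffs[pos] = base
    return Compressed(diffs, unknown)


def snp_distance(a: Compressed, b: Compressed) -> int:
    """Count positions where a and b differ, skipping positions unknown in either."""
    candidates = (a.diffs.keys() | b.diffs.keys()) - a.unknown - b.unknown
    # A missing key means "same as reference", so .get() returns None for it.
    return sum(1 for pos in candidates if a.diffs.get(pos) != b.diffs.get(pos))


reference = "ACGTACGT"
s1 = compress(reference, "ACGTACGA")  # differs from the reference at position 7
s2 = compress(reference, "ATGTACGT")  # differs from the reference at position 1
s3 = compress(reference, "ANGTACGA")  # uncalled at position 1, differs at position 7

print(snp_distance(s1, s2))
print(snp_distance(s1, s3))
```

Because closely related genomes differ from the reference at few positions, the comparison touches only a handful of entries rather than the whole genome, which is the property that makes this representation attractive for large collections.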
Affiliation(s)
- Denis Volk: Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Fan Yang-Turner: Nuffield Department of Medicine, University of Oxford, Oxford, UK; Present address: UKRI Science and Technologies Facilities Council, Harwell, UK
- Xavier Didelot: School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK; Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
- Derrick W Crook: Nuffield Department of Medicine, University of Oxford, Oxford, UK
- David Wyllie: UK Health Security Agency, Forvie Site, Addenbrookes' Campus, Robinson Way, Cambridge CB2 0SR, UK
3. Shakeel M, Irfan M, Un Nisa Z, Farooq S, Ul Ain N, Iqbal W, Kakar N, Jahan S, Shahzad M, Siddiqi S, Khan IA. Genome sequencing and analysis of genomic diversity in the locally transmitted SARS-CoV-2 in Pakistan. Transbound Emerg Dis 2022; 69:e2418-e2430. [PMID: 35510932; PMCID: PMC9348400; DOI: 10.1111/tbed.14586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/02/2022] [Revised: 04/11/2022] [Accepted: 05/01/2022] [Indexed: 12/01/2022]
Abstract
Surveillance of the genetic diversity of SARS‐CoV‐2 is extremely important to detect the emergence of more infectious and deadly strains of the virus. In this study, we evaluated mutational events in SARS‐CoV‐2 genomes through whole genome sequencing. The samples were collected from COVID‐19 patients in different major cities of Pakistan during the four waves of the pandemic (May 2020 to July 2021). Using in silico and machine learning tools, the viral mutational events were analyzed, and variants of concern and of interest were identified during each of the four waves. The overall mutation frequency (mutations per genome) increased during the course of the pandemic from 12.19 to 23.63, 31.03, and 41.22 in the first, second, third, and fourth waves, respectively. We determined that the viral strains rose to higher frequencies in local transmission. The first wave had three most common strains: B.1.36, B.1.160, and B.1.255; the second wave comprised B.1.36 and B.1.247 strains; the third wave had B.1.1.7 (Alpha variant) and B.1.36 strains; and the fourth wave comprised B.1.617.2 (Delta). Intriguingly, the B.1.36 variants were found in all the waves of the infection, indicating their survival fitness. Through phylogenetic analysis, the probable routes of transmission of various strains in the country were determined. Collectively, our study provided an insight into the evolution of SARS‐CoV‐2 lineages in the spatiotemporal local transmission during different waves of the pandemic, which aided the state institutions in implementing adequate preventive measures.
Affiliation(s)
- Muhammad Shakeel: Jamil-ur-Rahman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, ICCBS, University of Karachi, Karachi, 75270, Pakistan
- Muhammad Irfan: Jamil-ur-Rahman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, ICCBS, University of Karachi, Karachi, 75270, Pakistan
- Zaib Un Nisa: Jamil-ur-Rahman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, ICCBS, University of Karachi, Karachi, 75270, Pakistan
- Saba Farooq: National Institute of Virology, Dr. Panjwani Center for Molecular Medicine and Drug Research, ICCBS, University of Karachi, Karachi, 75270, Pakistan
- Noor Ul Ain: Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan
- Waseem Iqbal: Pathology Unit, Mardan Medical Complex, Mardan, Pakistan
- Niamatullah Kakar: Center for Advanced Studies in Vaccinology and Biotechnology (CASVAB), University of Balochistan, Quetta, Pakistan
- Shah Jahan: Department of Immunology, University of Health Sciences Lahore, Lahore, Pakistan
- Mohsin Shahzad: Department of Molecular Biology, Shaheed Zulfiqar Ali Bhutto Medical University, Islamabad, Pakistan
- Saima Siddiqi: Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan
- Ishtiaq Ahmad Khan: Jamil-ur-Rahman Center for Genome Research, Dr. Panjwani Center for Molecular Medicine and Drug Research, ICCBS, University of Karachi, Karachi, 75270, Pakistan