1
Colson P, Fantini J, Delerce J, Bader W, Levasseur A, Pontarotti P, Devaux C, Raoult D. "Outlaw" mutations in quasispecies of SARS-CoV-2 inhibit replication. Emerg Microbes Infect 2024; 13:2368211. [PMID: 38916498 PMCID: PMC11207925 DOI: 10.1080/22221751.2024.2368211] [Received: 11/09/2023] [Accepted: 06/10/2024] [Indexed: 06/26/2024]
Abstract
The evolution of SARS-CoV-2, the agent of COVID-19, has been remarkable for its high mutation potential, leading to the appearance of variants. Some mutations have never appeared in the published genomes, which represent consensus, or bona fide genomes. Here we tested the hypothesis that mutations that did not appear in consensus genomes were, in fact, as frequent as the mutations that appeared during the various epidemic episodes, but were not expressed because they were lethal. To identify these mutations, we analysed the genomes of 90 nasopharyngeal samples and the quasispecies determined by next-generation sequencing. Mutations observed in the quasispecies and not in the consensus genomes were considered to be lethal; we called these "outlaw" mutations. Among these mutations, we analysed the 21 most frequent. Eight of these "outlaws" were in the RNA polymerase and we were able to use a structural biology model and molecular dynamics simulations to demonstrate the functional incapacity of these mutated RNA polymerases. Three other mutations affected the spike, a major protein involved in the pathogenesis of COVID-19. Overall, by analysing the SARS-CoV-2 quasispecies obtained during sequencing, this method made it possible to identify "outlaws," showing areas that could potentially become the target of treatments.
Affiliation(s)
- Philippe Colson
- IHU Méditerranée Infection, Marseille, France
- Microbes Evolution Phylogeny and Infections (MEPHI), Institut de Recherche pour le Développement (IRD), Aix-Marseille University, Marseille, France
- Assistance Publique-Hôpitaux de Marseille (AP-HM), Marseille, France
- Jacques Fantini
- INSERM UMR UA 16, Aix-Marseille Université, Marseille, France
- Wahiba Bader
- IHU Méditerranée Infection, Marseille, France
- Microbes Evolution Phylogeny and Infections (MEPHI), Institut de Recherche pour le Développement (IRD), Aix-Marseille University, Marseille, France
- Anthony Levasseur
- IHU Méditerranée Infection, Marseille, France
- Microbes Evolution Phylogeny and Infections (MEPHI), Institut de Recherche pour le Développement (IRD), Aix-Marseille University, Marseille, France
- Pierre Pontarotti
- IHU Méditerranée Infection, Marseille, France
- Department of Biological Sciences, Centre National de la Recherche Scientifique (CNRS), Marseille, France
- Christian Devaux
- IHU Méditerranée Infection, Marseille, France
- Department of Biological Sciences, Centre National de la Recherche Scientifique (CNRS), Marseille, France
- Didier Raoult
- IHU Méditerranée Infection, Marseille, France
- Microbes Evolution Phylogeny and Infections (MEPHI), Institut de Recherche pour le Développement (IRD), Aix-Marseille University, Marseille, France
2
Guangxin L, Guangfeng L, Ce L, Hongling M, Yiqin D, Changhong C, Jianjun J, Sigang F, Juan F, Li L, Zhendong Q, Zhixun G. Genome sequencing analysis and validation of infestation-related functional genes of Vibrio parahaemolyticus LG2206 isolated from the hepatopancreas of diseased mud crab (Scylla paramamosain) in South China. Fish & Shellfish Immunology 2024; 153:109854. [PMID: 39179188 DOI: 10.1016/j.fsi.2024.109854] [Received: 06/17/2024] [Revised: 08/20/2024] [Accepted: 08/20/2024] [Indexed: 08/26/2024]
Abstract
Vibrio parahaemolyticus (V. parahaemolyticus) is a major bacterial pathogen found in brackish environments, leading to disease outbreaks and great economic losses in the mud crab industry. This study investigated the molecular mechanism of V. parahaemolyticus infecting mud crabs through genome sequencing analysis, survival experiments, and the expression patterns of related functional genes. A strain of V. parahaemolyticus with high pathogenicity and lethality was isolated from diseased mud crab in South China. The genome sequencing results showed that the genome size of V. parahaemolyticus was a circular chromosome of 3,357,271 bp, with a GC content of 45 %, containing 2985 protein-coding genes, denoted as V. parahaemolyticus LG2206. Genome analysis data revealed that a total of 113 adherence coding genes were obtained, including 120 virulence factor coding genes, 37 type III secretion system (T3SS) coding genes, and 277 sequences of T3SS effectors. Survival experiments showed that the mortality was 20 % within 96 h in the 1 × 10⁴ CFU/mL infection group, 90 % in the 3.2 × 10⁵ CFU/mL treatment group, and 100 % in the 1 × 10⁶ CFU/mL treatment group. The LD50 of V. parahaemolyticus LG2206 was determined as 4.6 × 10⁴ CFU/mL. Six genes of znuA and fliD (flagellin encoding genes), yscE and yscR (T3SS encoding genes), and nfuA and htpX (virulence factor encoding genes) were selected and validated by quantitative real-time PCR analysis after infection with 4.6 × 10⁴ CFU/mL of V. parahaemolyticus LG2206 for 96 h. The expression of the six genes exhibited a significant up-regulation trend at all tested time points. The results indicated that the infestation-related genes screened in the experiment play important roles in the infestation process. This study provides timely and effective information to further analyze the molecular mechanism of V. parahaemolyticus infection and develop comprehensive measures for disease prevention and control.
Affiliation(s)
- Liu Guangxin
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Liu Guangfeng
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Li Ce
- Zhaoqing Aquatic Technology Extension Center, Zhaoqing, 526060, China
- Ma Hongling
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Deng Yiqin
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Cheng Changhong
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Jiang Jianjun
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Fan Sigang
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Feng Juan
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China
- Lin Li
- Guangdong Provincial Water Environment and Aquatic Products Security Engineering Technology Research Center, Guangzhou Key Laboratory of Aquatic Animal Diseases and Waterfowl Breeding, Zhongkai University of Agriculture and Engineering, Guangzhou, 510225, China.
- Qin Zhendong
- Guangdong Provincial Water Environment and Aquatic Products Security Engineering Technology Research Center, Guangzhou Key Laboratory of Aquatic Animal Diseases and Waterfowl Breeding, Zhongkai University of Agriculture and Engineering, Guangzhou, 510225, China.
- Guo Zhixun
- Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, 510300, China.
3
Ren P, Zhang J, Vijg J. Somatic mutations in aging and disease. GeroScience 2024; 46:5171-5189. [PMID: 38488948 PMCID: PMC11336144 DOI: 10.1007/s11357-024-01113-3] [Received: 10/11/2023] [Accepted: 02/27/2024] [Indexed: 03/17/2024]
Abstract
Time always leaves its mark, and our genome is no exception. Mutations in the genome of somatic cells were first hypothesized to be the cause of aging in the 1950s, shortly after the molecular structure of DNA had been described. Somatic mutation theories of aging are based on the fact that mutations in DNA as the ultimate template for all cellular functions are irreversible. However, it took until the 1990s to develop the methods to test if DNA mutations accumulate with age in different organs and tissues and estimate the severity of the problem. By now, numerous studies have documented the accumulation of somatic mutations with age in normal cells and tissues of mice, humans, and other animals, showing clock-like mutational signatures that provide information on the underlying causes of the mutations. In this review, we will first briefly discuss the recent advances in next-generation sequencing that now allow quantitative analysis of somatic mutations. Second, we will provide evidence that the mutation rate differs between cell types, with a focus on differences between germline and somatic mutation rate. Third, we will discuss somatic mutational signatures as measures of aging, environmental exposure, and activities of DNA repair processes. Fourth, we will explain the concept of clonally amplified somatic mutations, with a focus on clonal hematopoiesis. Fifth, we will briefly discuss somatic mutations in the transcriptome and in our other genome, i.e., the genome of mitochondria. We will end with a brief discussion of a possible causal contribution of somatic mutations to the aging process.
Affiliation(s)
- Peijun Ren
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Jie Zhang
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Jan Vijg
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA.
4
McDonald JB, Wade B, Andrews DM, Van TTH, Moore RJ. Development of tools for the genetic manipulation of Campylobacter and their application to the N-glycosylation system of Campylobacter hepaticus, an emerging pathogen of poultry. mBio 2024; 15:e0110124. [PMID: 39072641 PMCID: PMC11389370 DOI: 10.1128/mbio.01101-24] [Received: 04/11/2024] [Accepted: 06/19/2024] [Indexed: 07/30/2024]
Abstract
Various species of campylobacters cause significant disease problems in both humans and animals. The continuing development of tools and methods for genetic and molecular manipulation of campylobacters enables the detailed study of bacterial virulence and disease pathogenesis. Campylobacter hepaticus is an emerging pathogen that causes spotty liver disease (SLD) in poultry. SLD has a significant economic and animal welfare impact as the disease results in elevated mortalities and significant decreases in egg production. Although potential virulence genes of C. hepaticus have been identified, they have not been further studied and characterized, as appropriate genetic tools and methods to transform and perform mutagenesis studies in C. hepaticus have not been available. In this study, the genetic manipulation of C. hepaticus is reported, with the development of novel plasmid vectors, methods for transformation, site-specific mutagenesis, and mutant complementation. These tools were used to delete the pglB gene, an oligosaccharyltransferase, a central enzyme of the N-glycosylation pathway, by allelic exchange. In the mutant strain, N-glycosylation was completely abolished. The tools and methods developed in this study represent innovative approaches that can be applied to further explore important virulence factors of C. hepaticus and other closely related Campylobacter species. IMPORTANCE Spotty liver disease (SLD) of layer chickens, caused by infection with Campylobacter hepaticus, is a significant economic and animal welfare burden on an important food production industry. Currently, SLD is controlled using antibiotics; however, alternative intervention methods are needed due to increased concerns associated with environmental contamination with antibiotics, and the development of antimicrobial resistance in many bacterial pathogens of humans and animals. This study has developed methods that have enabled the genetic manipulation of C. hepaticus. To validate the methods, the pglB gene was inactivated by allelic exchange to produce a C. hepaticus strain that could no longer N-glycosylate proteins. Subsequently, the mutation was complemented by reintroduction of the gene in trans, on a plasmid vector, to demonstrate that the phenotypic changes noted were caused by the mutation of the targeted gene. The tools developed enable ongoing studies to understand other virulence mechanisms of this important emerging pathogen.
Affiliation(s)
- Jamieson B McDonald
- School of Science, RMIT University, Bundoora West Campus, Bundoora, Victoria, Australia
- Ben Wade
- School of Science, RMIT University, Bundoora West Campus, Bundoora, Victoria, Australia
- Daniel M Andrews
- Bioproperties Pty Ltd, RMIT University, Bundoora West Campus, Bundoora, Victoria, Australia
- Thi Thu Hao Van
- School of Science, RMIT University, Bundoora West Campus, Bundoora, Victoria, Australia
- Robert J Moore
- School of Science, RMIT University, Bundoora West Campus, Bundoora, Victoria, Australia
5
Yan Z, Shi L, Li W, Liu W, Galderisi C, Spittle C, Li J. A Novel Next-Generation Sequencing Assay for the Identification of BCR::ABL1 Transcript Type and Accurate and Sensitive Detection of TKI-Resistant Mutations. J Appl Lab Med 2024:jfae096. [PMID: 39225048 DOI: 10.1093/jalm/jfae096] [Received: 04/01/2024] [Accepted: 07/09/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND The clinical management of chronic myeloid leukemia (CML) patients requires the identification of the type of BCR::ABL1 transcript at diagnosis and the monitoring of its expression and potential tyrosine kinase inhibitor (TKI) resistance mutations during treatment. Detection of resistance mutations requires transcript type-specific amplification of BCR::ABL1 from RNA. METHODS In this study, a custom RNA-based next-generation sequencing (NGS) assay (Dup-Seq BCR::ABL1) that enables (a) the identification of BCR::ABL1 transcript type and (b) the detection of resistance mutations from common and atypical BCR::ABL1 transcript types was developed and validated. The assay design covers BCR exon 1 to ABL1 exon 10 and employs duplicate PCR amplification for error correction. The custom data analysis pipeline enables breakpoint determination and overlapped mutation calling from duplicates, which minimizes the low-level mutation artifacts. RESULTS This study demonstrates that this novel assay achieves high accuracy (positive percent agreement (PPA) for fusion: 98.5%; PPA and negative percent agreement (NPA) for mutation at 97.8% and 100.0%, respectively) and sensitivity (limit of detection (LOD) for mutation detection at 3% from 10 000 copies of BCR::ABL1 input). CONCLUSIONS The Dup-Seq BCR::ABL1 assay not only allows for the identification of BCR::ABL1 typical and atypical transcript types and accurate and sensitive detection of TKI-resistant mutations but also simplifies molecular testing workflow for the clinical management of CML patients.
Affiliation(s)
- Zhenyu Yan
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Lin Shi
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Wei Li
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Weihua Liu
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Chad Galderisi
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Cynthia Spittle
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
- Jin Li
- ICON Laboratory Services, ICON plc, Cambridge, MA, United States
6
Allender CJ, Wike CL, Porter WT, Ellis D, Lemmer D, Pond SJK, Engelthaler DM. Sequencing by binding rivals SMOR error-corrected sequencing by synthesis technology for accurate detection and quantification of minor (< 0.1%) subpopulation variants. BMC Genomics 2024; 25:789. [PMID: 39160478 PMCID: PMC11331594 DOI: 10.1186/s12864-024-10697-1] [Received: 05/09/2024] [Accepted: 08/09/2024] [Indexed: 08/21/2024]
Abstract
BACKGROUND Detecting very minor (< 1%) subpopulations using next-generation sequencing is a critical need for multiple applications, including the detection of drug resistant pathogens and somatic variant detection in oncology. A recently available sequencing approach termed 'sequencing by binding (SBB)' claims to have higher base calling accuracy data "out of the box." This paper evaluates the utility of using SBB for the detection of ultra-rare drug resistant subpopulations in Mycobacterium tuberculosis (Mtb) using a targeted amplicon assay and compares the performance of SBB to single molecule overlapping reads (SMOR) error corrected sequencing by synthesis (SBS) data. RESULTS SBS displayed an elevated error rate when compared to SMOR error-corrected SBS and SBB techniques. SMOR error-corrected SBS and SBB technologies performed similarly within the linear range studies and error rate studies. CONCLUSIONS With lower sequencing error rates within SBB sequencing, this technique looks promising for both targeted and unbiased whole genome sequencing, leading to the identification of minor (< 1%) subpopulations without the need for error correction methods.
Affiliation(s)
- Christopher J Allender
- Pathogen and Microbiome Division, Translational Genomics Research Institute, 3051 W. Shamrell Blvd., Suite 106, Flagstaff, AZ, 86005, USA
- Candice L Wike
- Emerging Opportunities Division, Translational Genomics Research Institute, 445 N 5th Street, Phoenix, AZ, USA
- W Tanner Porter
- Pathogen and Microbiome Division, Translational Genomics Research Institute, 3051 W. Shamrell Blvd., Suite 106, Flagstaff, AZ, 86005, USA
- Dean Ellis
- Emerging Opportunities Division, Translational Genomics Research Institute, 445 N 5th Street, Phoenix, AZ, USA
- Darrin Lemmer
- Pathogen and Microbiome Division, Translational Genomics Research Institute, 3051 W. Shamrell Blvd., Suite 106, Flagstaff, AZ, 86005, USA
- Stephanie J K Pond
- Emerging Opportunities Division, Translational Genomics Research Institute, 445 N 5th Street, Phoenix, AZ, USA
- David M Engelthaler
- Pathogen and Microbiome Division, Translational Genomics Research Institute, 3051 W. Shamrell Blvd., Suite 106, Flagstaff, AZ, 86005, USA.
7
Chung YS, Kang S, Kim J, Lee S, Kim S. CLEMENT: genomic decomposition and reconstruction of non-tumor subclones. Nucleic Acids Res 2024; 52:e62. [PMID: 38922688 DOI: 10.1093/nar/gkae527] [Received: 06/07/2023] [Revised: 05/27/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024]
Abstract
Genome-level clonal decomposition of a single specimen has been widely studied; however, it is mostly limited to cancer research. In this study, we developed a new algorithm CLEMENT, which conducts accurate decomposition and reconstruction of multiple subclones in genome sequencing of non-tumor (normal) samples. CLEMENT employs the Expectation-Maximization (EM) algorithm with optimization strategies specific to non-tumor subclones, including false variant call identification, non-disparate clone fuzzy clustering, and clonal allele fraction confinement. In the simulation and in vitro cell line mixture data, CLEMENT outperformed current cancer decomposition algorithms in estimating the number of clones (root-mean-square-error = 0.58-0.78 versus 1.43-3.34) and in the variant-clone membership agreement (∼85.5% versus 70.1-76.7%). Additional testing on human multi-clonal normal tissue sequencing confirmed the accurate identification of subclones that originated from different cell types. Clone-level analysis, including mutational burden and signatures, provided a new understanding of normal-tissue composition. We expect that CLEMENT will serve as a crucial tool in the currently emerging field of non-tumor genome analysis.
Affiliation(s)
- Young-Soo Chung
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Seungseok Kang
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Jisu Kim
- DataShape team, Inria Saclay Île-De-France, Palaiseau 91120, France
- Department of Statistics, Seoul National University, Seoul 08826, Republic of Korea
- Sangbo Lee
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- Sangwoo Kim
- Department of Biomedical Systems Informatics, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
8
Liu S, Obert C, Yu YP, Zhao J, Ren BG, Liu JJ, Wiseman K, Krajacich BJ, Wang W, Metcalfe K, Smith M, Ben-Yehezkel T, Luo JH. Utility analyses of AVITI sequencing chemistry. BMC Genomics 2024; 25:778. [PMID: 39127634 DOI: 10.1186/s12864-024-10686-4] [Received: 01/31/2024] [Accepted: 08/02/2024] [Indexed: 08/12/2024]
Abstract
BACKGROUND DNA sequencing is a critical tool in modern biology. Over the last two decades, it has been revolutionized by the advent of massively parallel sequencing, leading to significant advances in the genome and transcriptome sequencing of various organisms. Nevertheless, challenges with accuracy, lack of competitive options and prohibitive costs associated with high throughput parallel short-read sequencing persist. RESULTS Here, we conduct a comparative analysis using matched DNA and RNA short-reads assays between Element Biosciences' AVITI and Illumina's NextSeq 550 chemistries. Similar comparisons were evaluated for synthetic long-read sequencing for RNA and targeted single-cell transcripts between the AVITI and Illumina's NovaSeq 6000. For both DNA and RNA short-read applications, the study found that the AVITI produced significantly higher per sequence quality scores. For PCR-free DNA libraries, we observed an average 89.7% lower experimentally determined error rate when using the AVITI chemistry, compared to the NextSeq 550. For short-read RNA quantification, AVITI platform had an average of 32.5% lower error rate than that for NextSeq 550. With regards to synthetic long-read mRNA and targeted synthetic long read single cell mRNA sequencing, both platforms' respective chemistries performed comparably in quantification of genes and isoforms. The AVITI displayed a marginally lower error rate for long reads, with fewer chemistry-specific errors and a higher mutation detection rate. CONCLUSION These results point to the potential of the AVITI platform as a competitive candidate in high-throughput short read sequencing analyses when juxtaposed with the Illumina NextSeq 550.
Affiliation(s)
- Silvia Liu
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA.
- High Throughput Genome Center, University of Pittsburgh School of Medicine, Pittsburgh, USA.
- Pittsburgh Liver Research Center, University of Pittsburgh School of Medicine, Pittsburgh, USA.
- Caroline Obert
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Yan-Ping Yu
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA
- High Throughput Genome Center, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Pittsburgh Liver Research Center, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Junhua Zhao
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Bao-Guo Ren
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA
- High Throughput Genome Center, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Jia-Jun Liu
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA
- High Throughput Genome Center, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Pittsburgh Liver Research Center, University of Pittsburgh School of Medicine, Pittsburgh, USA
- Kelly Wiseman
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Benjamin J Krajacich
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Wenjia Wang
- Department of Biostatistics, University of Pittsburgh School of Public Health, Pittsburgh, USA
- Kyle Metcalfe
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Mat Smith
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Tuval Ben-Yehezkel
- Element Biosciences Inc, 10055 Barnes Canyon Road, Suite 100, San Diego, CA, 92121, USA
- Jian-Hua Luo
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA.
- High Throughput Genome Center, University of Pittsburgh School of Medicine, Pittsburgh, USA.
- Pittsburgh Liver Research Center, University of Pittsburgh School of Medicine, Pittsburgh, USA.
9
Baumann A, Ruckert C, Meier C, Hutschenreiter T, Remy R, Schnur B, Döbel M, Fankep RCN, Skowronek D, Kutz O, Arnold N, Katzke AL, Forster M, Kobiela AL, Thiedig K, Zimmer A, Ritter J, Weber BHF, Honisch E, Hackmann K, Schmidt G, Sturm M, Ernst C. Limitations in next-generation sequencing-based genotyping of breast cancer polygenic risk score loci. Eur J Hum Genet 2024; 32:987-997. [PMID: 38907004 PMCID: PMC11291653 DOI: 10.1038/s41431-024-01647-2] [Received: 12/21/2023] [Revised: 05/17/2024] [Accepted: 06/10/2024] [Indexed: 06/23/2024]
Abstract
Considering polygenic risk scores (PRSs) in individual risk prediction is increasingly implemented in genetic testing for hereditary breast cancer (BC) based on next-generation sequencing (NGS). To calculate individual BC risks, the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) with the inclusion of the BCAC 313 or the BRIDGES 306 BC PRS is commonly used. The PRS calculation depends on accurately reproducing the variant allele frequencies (AFs) and, consequently, the distribution of PRS values anticipated by the algorithm. Here, the 324 loci of the BCAC 313 and the BRIDGES 306 BC PRS were examined in population-specific database gnomAD and in real-world data sets of five centers of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC), to determine whether these expected AFs can be reproduced by NGS-based genotyping. Four PRS loci were non-existent in gnomAD v3.1.2 non-Finnish Europeans, further 24 loci showed noticeably deviating AFs. In real-world data, between 11 and 23 loci were reported with noticeably deviating AFs, and were shown to have effects on final risk prediction. Deviations depended on the sequencing approach, variant caller and calling mode (forced versus unforced) employed. Therefore, this study demonstrates the necessity to apply quality assurance not only in terms of sequencing coverage but also observed AFs in a sufficiently large cohort, when implementing PRSs in a routine diagnostic setting. Furthermore, future PRS design should be guided by the technical reproducibility of expected AFs across commonly used genotyping methods, especially NGS, in addition to the observed effect sizes.
Affiliation(s)
- Alexandra Baumann
- Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
- ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany
- National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- German Cancer Consortium (DKTK), Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Christian Ruckert
- Department of Medical Genetics, University Hospital Münster, Münster, Germany
- Christoph Meier
- Institute of Human Genetics, University of Regensburg, Regensburg, Germany
- Tim Hutschenreiter
- Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
- ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany
- National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- German Cancer Consortium (DKTK), Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Robert Remy
- Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
- Benedikt Schnur
- Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Marvin Döbel
- Institute of Medical Genetics and Applied Genomics, University Hospital Tübingen, Tübingen, Germany
- Rudel Christian Nkouamedjo Fankep
- Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
- Dariush Skowronek
- Department of Human Genetics, University Medicine Greifswald and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Greifswald, Germany
- Oliver Kutz
- Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
- ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany
- National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- German Cancer Consortium (DKTK), Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Department of Gynecology and Obstetrics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
- Norbert Arnold
- Department of Gynecology and Obstetrics, Institute of Clinical Chemistry Institute of Clinical Molecular Biology, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Anna-Lena Katzke
- Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Michael Forster
- Department of Gynecology and Obstetrics, Institute of Clinical Chemistry Institute of Clinical Molecular Biology, University Hospital Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Anna-Lena Kobiela
- Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany
- Katharina Thiedig
- Division of Gynaecology and Obstetrics, Klinikum rechts der Isar der Technischen Universität München, München, Germany
- Andreas Zimmer
- Institute for Human Genetics, Medical Center University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Julia Ritter
- Department of Human Genetics, Labor Berlin - Charité Vivantes GmbH, Berlin, Germany
- Bernhard H F Weber
- Institute of Human Genetics, University of Regensburg, Regensburg, Germany
- Institute of Clinical Human Genetics, University Hospital Regensburg, Regensburg, Germany
- Ellen Honisch
- Department of Gynaecology and Obstetrics, University Hospital Düsseldorf, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- Karl Hackmann
- Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany
- ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Dresden, Germany
- National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- German Cancer Consortium (DKTK), Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Gunnar Schmidt
- Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Marc Sturm
- Institute of Medical Genetics and Applied Genomics, University Hospital Tübingen, Tübingen, Germany
- Corinna Ernst
- Center for Familial Breast and Ovarian Cancer, Center for Integrated Oncology (CIO), Medical Faculty, University of Cologne and University Hospital Cologne, Cologne, Germany.
10
Lin J, Nguyen MA, Lin LY, Zeng J, Verma A, Neri NR, da Silva LF, Mucci A, Wolfe S, Shaw KL, Clement K, Brendel C, Pinello L, Pellin D, Bauer DE. Scalable assessment of genome editing off-targets associated with genetic variants. bioRxiv 2024:2024.07.24.605019. [PMID: 39211178 PMCID: PMC11360989 DOI: 10.1101/2024.07.24.605019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Genome editing with RNA-guided DNA binding factors carries risk of off-target editing at homologous sequences. Genetic variants may introduce sequence changes that increase homology to a genome editing target, thereby increasing risk of off-target editing. Conventional methods to verify candidate off-targets rely on access to cells with genomic DNA carrying these sequences. However, for candidate off-targets associated with genetic variants, appropriate cells for experimental verification may not be available. Here we develop a method, Assessment By Stand-in Off-target LentiViral Ensemble with sequencing (ABSOLVE-seq), to integrate a set of candidate off-target sequences along with unique molecular identifiers (UMIs) in genomes of primary cells followed by clinically relevant gene editor delivery. Gene editing of dozens of candidate off-target sequences may be evaluated in a single experiment with high sensitivity, precision, and power. We provide an open-source pipeline to analyze sequencing data. This approach enables experimental assessment of the influence of human genetic diversity on specificity evaluation during gene editing therapy development.
11
Szpechcinski A, Moes-Sosnowska J, Skronska P, Lechowicz U, Pelc M, Szolkowska M, Rudzinski P, Wojda E, Maszkowska-Kopij K, Langfort R, Orlowski T, Sliwinski P, Polaczek M, Chorostowska-Wynimko J. The Advantage of Targeted Next-Generation Sequencing over qPCR in Testing for Druggable EGFR Variants in Non-Small-Cell Lung Cancer. Int J Mol Sci 2024; 25:7908. [PMID: 39063150 PMCID: PMC11277480 DOI: 10.3390/ijms25147908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 07/09/2024] [Accepted: 07/10/2024] [Indexed: 07/28/2024]
Abstract
The emergence of targeted therapies in non-small-cell lung cancer (NSCLC), including inhibitors of epidermal growth factor receptor (EGFR) tyrosine kinase, has increased the need for robust companion diagnostic tests. Nowadays, detection of actionable variants in exons 18-21 of the EGFR gene by qPCR and direct DNA sequencing is often replaced by next-generation sequencing (NGS). In this study, we evaluated the diagnostic usefulness of targeted NGS for testing druggable EGFR variants in clinical NSCLC material previously analyzed by the IVD-certified qPCR test with respect to DNA reference material. We tested 59 NSCLC tissue and cytology specimens for EGFR variants using the NGS 'TruSight Tumor 15' assay (Illumina) and the qPCR 'cobas EGFR mutation test v2' (Roche Diagnostics). The sensitivity and specificity of the targeted NGS assay were evaluated using biosynthetic and biological DNA reference material with known variant allelic frequencies (VAFs) of EGFR variants. NGS demonstrated a sufficient lower detection limit for diagnostic applications (VAF < 5%) in DNA reference material; all EGFR variants were correctly identified. NGS showed high repeatability of VAF assessment between runs (CV% from 0.02 to 3.98). In clinical material, the overall concordance between NGS and qPCR was 76.14% (Cohen's Kappa = 0.5933). The majority of discordant results concerned false-positive detection of EGFR exon 20 insertions by qPCR. A total of 9 out of 59 (15%) clinical samples showed discordant results for one or more EGFR variants in both assays. Additionally, we observed TP53 to be a frequently co-mutated gene in EGFR-positive NSCLC patients. In conclusion, targeted NGS showed a number of superior features over qPCR in EGFR variant detection (exact identification of variants, calculation of allelic frequency, high analytical sensitivity), which might enhance the basic diagnostic report.
Affiliation(s)
- Adam Szpechcinski
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
- Joanna Moes-Sosnowska
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
- Paulina Skronska
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
- Urszula Lechowicz
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
- Magdalena Pelc
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
- Malgorzata Szolkowska
- Department of Pathology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (M.S.); (R.L.)
- Piotr Rudzinski
- Clinics of Thoracic Surgery, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (P.R.); (T.O.)
- Emil Wojda
- III Department of Lung Diseases and Oncology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (E.W.); (M.P.)
- II Department of Lung Diseases, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland;
- Renata Langfort
- Department of Pathology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (M.S.); (R.L.)
- Tadeusz Orlowski
- Clinics of Thoracic Surgery, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (P.R.); (T.O.)
- Pawel Sliwinski
- II Department of Lung Diseases, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland;
- Mateusz Polaczek
- III Department of Lung Diseases and Oncology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (E.W.); (M.P.)
- Joanna Chorostowska-Wynimko
- Department of Genetics and Clinical Immunology, The Institute of Tuberculosis and Lung Diseases, 01-138 Warsaw, Poland; (J.M.-S.); (P.S.); (U.L.); (M.P.); (J.C.-W.)
12
Goldswain H, Penrice-Randal R, Donovan-Banfield I, Duffy CW, Dong X, Randle N, Ryan Y, Rzeszutek AM, Pilgrim J, Keyser E, Weller SA, Hutley EJ, Hartley C, Prince T, Darby AC, Aye Maung N, Nwume H, Hiscox JA, Emmett SR. SARS-CoV-2 population dynamics in immunocompetent individuals in a closed transmission chain shows genomic diversity over the course of infection. Genome Med 2024; 16:89. [PMID: 39014481 PMCID: PMC11251137 DOI: 10.1186/s13073-024-01360-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 07/04/2024] [Indexed: 07/18/2024]
Abstract
BACKGROUND: SARS-CoV-2 remains rapidly evolving, and many biologically important genomic substitutions/indels have characterised novel SARS-CoV-2 lineages, which have emerged during successive global waves of the pandemic. Worldwide genomic sequencing has been able to monitor these waves, track transmission clusters, and examine viral evolution in real time to help inform healthcare policy. One school of thought is that an apparent greater than average divergence in an emerging lineage from contemporary variants may require persistent infection, for example in an immunocompromised host. Due to the nature of the COVID-19 pandemic and sampling, there were few studies that examined the evolutionary trajectory of SARS-CoV-2 in healthy individuals. METHODS: We investigated viral evolutionary trends and participant symptomatology within a cluster of 16 SARS-CoV-2 infected, immunocompetent individuals with no co-morbidities in a closed transmission chain. Longitudinal nasopharyngeal swab sampling allowed characterisation of SARS-CoV-2 intra-host variation over time at both the dominant and minor genomic variant levels through Nimagen-Illumina sequencing. RESULTS: A change in viral lineage assignment was observed in individual infections; however, there was only one indel and no evidence of recombination over the period of an acute infection. Minor and dominant genomic modifications varied between participants, with some minor genomic modifications increasing in abundance to become the dominant viral sequence during infection. CONCLUSIONS: Data from this cohort of SARS-CoV-2-infected participants demonstrated that long-term persistent infection in an immunocompromised host was not necessarily a prerequisite for generating a greater than average frequency of amino acid substitutions. Amino acid substitutions at both the dominant and minor genomic sequence level were observed in immunocompetent individuals during infection, showing that viral lineage changes can occur, generating viral diversity.
Affiliation(s)
- Hannah Goldswain
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Rebekah Penrice-Randal
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- I'ah Donovan-Banfield
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Craig W Duffy
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Xiaofeng Dong
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Nadine Randle
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Yan Ryan
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Jack Pilgrim
- Centre for Genomic Research, University of Liverpool, Liverpool, L69 3BX, UK
- Emma Keyser
- Defence Science Technology Laboratory, Porton Down, Salisbury, SP4 0JQ, UK
- Simon A Weller
- Defence Science Technology Laboratory, Porton Down, Salisbury, SP4 0JQ, UK
- Emma J Hutley
- Centre for Defence Pathology, Royal Centre for Defence Medicine, OCT Centre, Birmingham, B15 2WB, UK
- Catherine Hartley
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Tessa Prince
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Alistair C Darby
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK
- Niall Aye Maung
- British Army, Hunter House, St Omer Barracks, Aldershot, Hampshire, GU11 2BG, UK
- Henry Nwume
- Defence Science Technology Laboratory, Porton Down, Salisbury, SP4 0JQ, UK
- Julian A Hiscox
- Institute for Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, L3 5RF, UK.
- A*STAR Infectious Diseases Laboratories (A*STAR ID Labs), Agency for Science, Technology and Research (A*STAR), Connexis North Tower, 1 Fusionopolis Way, #20-10, Singapore 138632, Singapore.
- Stevan R Emmett
- Defence Science Technology Laboratory, Porton Down, Salisbury, SP4 0JQ, UK.
13
Luan T, Commichaux S, Hoffmann M, Jayeola V, Jang JH, Pop M, Rand H, Luo Y. Benchmarking short and long read polishing tools for nanopore assemblies: achieving near-perfect genomes for outbreak isolates. BMC Genomics 2024; 25:679. [PMID: 38978005 PMCID: PMC11232133 DOI: 10.1186/s12864-024-10582-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 07/01/2024] [Indexed: 07/10/2024]
Abstract
BACKGROUND: Oxford Nanopore provides high throughput sequencing platforms able to reconstruct complete bacterial genomes with 99.95% accuracy. However, even small levels of error can obscure the phylogenetic relationships between closely related isolates. Polishing tools have been developed to correct these errors, but it is uncertain if they obtain the accuracy needed for the high-resolution source tracking of foodborne illness outbreaks. RESULTS: We tested 132 combinations of assembly and short- and long-read polishing tools to assess their accuracy for reconstructing the genome sequences of 15 highly similar Salmonella enterica serovar Newport isolates from a 2020 onion outbreak. While long-read polishing alone improved accuracy, near perfect accuracy (99.9999% accuracy or ~5 nucleotide errors across the 4.8 Mbp genome, excluding low confidence regions) was only obtained by pipelines that combined both long- and short-read polishing tools. Notably, medaka was a more accurate and efficient long-read polisher than Racon. Among short-read polishers, NextPolish showed the highest accuracy, but Pilon, Polypolish, and POLCA performed similarly. Among the 5 best performing pipelines, polishing with medaka followed by NextPolish was the most common combination. Importantly, the order of polishing tools mattered, i.e., using less accurate tools after more accurate ones introduced errors. Indels in homopolymers and repetitive regions, where the short reads could not be uniquely mapped, remained the most challenging errors to correct. CONCLUSIONS: Short reads are still needed to correct errors in nanopore sequenced assemblies to obtain the accuracy required for source tracking investigations. Our granular assessment of the performance of the polishing pipelines allowed us to suggest best practices for tool users and areas for improvement for tool developers.
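The accuracy figures quoted in this abstract can be related to absolute error counts with simple arithmetic. The following is a minimal sketch, not part of the cited study: it assumes errors are spread uniformly over the genome, and the function name `expected_errors` is hypothetical. Only the 4.8 Mbp genome size and the two accuracy percentages are taken from the abstract.

```python
def expected_errors(genome_length_bp: int, accuracy_percent: float) -> float:
    """Expected number of erroneous bases for a given per-base accuracy.

    Assumption (not from the source): errors are uniformly distributed,
    so the count is simply length * (1 - accuracy).
    """
    error_rate = (100.0 - accuracy_percent) / 100.0
    return genome_length_bp * error_rate


GENOME_LENGTH_BP = 4_800_000  # 4.8 Mbp, as stated in the abstract

# Compare the unpolished accuracy with the "near perfect" polished accuracy.
for accuracy in (99.95, 99.9999):
    errors = expected_errors(GENOME_LENGTH_BP, accuracy)
    print(f"{accuracy}% accuracy -> about {errors:.1f} errors")
```

Under this assumption, 99.9999% accuracy corresponds to roughly 5 errors across the genome, which matches the figure given in the abstract, whereas 99.95% would correspond to a few thousand errors.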
Affiliation(s)
- Tu Luan
- Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
- Seth Commichaux
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, Laurel, MD, 20708, USA.
- Maria Hoffmann
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, 20740, USA
- Victor Jayeola
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, 20740, USA
- Jae Hee Jang
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, 20740, USA
- Mihai Pop
- Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
- Hugh Rand
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, 20740, USA
- Yan Luo
- Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, 20740, USA
14
Jia H, Tan S, Zhang YE. Chasing Sequencing Perfection: Marching Toward Higher Accuracy and Lower Costs. Genomics Proteomics Bioinformatics 2024; 22:qzae024. [PMID: 38991976 DOI: 10.1093/gpbjnl/qzae024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Revised: 01/25/2024] [Accepted: 01/29/2024] [Indexed: 07/13/2024]
Abstract
Next-generation sequencing (NGS), represented by Illumina platforms, has been an essential cornerstone of basic and applied research. However, the sequencing error rate of 1 per 1000 bp (10⁻³) represents a serious hurdle for research areas focusing on rare mutations, such as somatic mosaicism or microbe heterogeneity. By examining the high-fidelity sequencing methods developed in the past decade, we summarized three major factors underlying errors and the corresponding 12 strategies mitigating these errors. We then proposed a novel framework to classify 11 preexisting representative methods according to the corresponding combinatory strategies and identified three trends that emerged during methodological developments. We further extended this analysis to eight long-read sequencing methods, emphasizing error reduction strategies. Finally, we suggest two promising future directions that could achieve comparable or even higher accuracy with lower costs in both NGS and long-read sequencing.
Affiliation(s)
- Hangxing Jia
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Shengjun Tan
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Yong E Zhang
- CAS Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
15
Zhu M, Xu R, Yuan J, Wang J, Ren X, Cong T, You Y, Ju A, Xu L, Wang H, Zheng P, Tao H, Lin C, Yu H, Du J, Lin X, Xie W, Li Y, Lan X. Tracking-seq reveals the heterogeneity of off-target effects in CRISPR-Cas9-mediated genome editing. Nat Biotechnol 2024:10.1038/s41587-024-02307-y. [PMID: 38956324 DOI: 10.1038/s41587-024-02307-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 06/06/2024] [Indexed: 07/04/2024]
Abstract
The continued development of novel genome editors calls for a universal method to analyze their off-target effects. Here we describe a versatile method, called Tracking-seq, for in situ identification of off-target effects that is broadly applicable to common genome-editing tools, including Cas9, base editors and prime editors. Through tracking replication protein A (RPA)-bound single-stranded DNA followed by strand-specific library construction, Tracking-seq requires a low cell input and is suitable for in vitro, ex vivo and in vivo genome editing, providing a sensitive and practical genome-wide approach for off-target detection in various scenarios. We show, using the same guide RNA, that Tracking-seq detects heterogeneity in off-target effects between different editor modalities and between different cell types, underscoring the necessity of direct measurement in the original system.
Affiliation(s)
- Ming Zhu
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China.
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China.
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China.
- Runda Xu
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China
- Junsong Yuan
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Jiacheng Wang
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China
- School of Life Sciences, Tsinghua University, Beijing, China
- Xiaoyu Ren
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Tingting Cong
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Yaxian You
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China
- Anji Ju
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China
- Longchen Xu
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- School of Life Sciences, Tsinghua University, Beijing, China
- Huimin Wang
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- Peiyuan Zheng
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Huiying Tao
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- Department of Urology, Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai, China
- Chunhua Lin
- Department of Urology, Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai, China
- Honghao Yu
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- Key Laboratory of Medical Biotechnology and Translational Medicine, Guilin Medical University, Guilin, China
- Juanjuan Du
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Xin Lin
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
- Wei Xie
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China
- School of Life Sciences, Tsinghua University, Beijing, China
- Yinqing Li
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China.
- IDG-McGovern Institute for Brain Research, Center for Synthetic and Systems Biology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China.
- Xun Lan
- Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing, China.
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China.
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, Tsinghua University, Beijing, China.
16
Marshall VA, Cornejo Castro EM, Goodman CA, Labo N, Liu I, Fisher NC, Moore KN, Nair A, Immonen T, Keele BF, Polizzotto MN, Uldrick TS, Mu Y, Saswat T, Krug LT, McBride KM, Lurain K, Ramaswami R, Yarchoan R, Whitby D. Sequencing of Kaposi's Sarcoma Herpesvirus (KSHV) genomes from persons of diverse ethnicities and provenances with KSHV-associated diseases demonstrate multiple infections, novel polymorphisms, and low intra-host variance. PLoS Pathog 2024; 20:e1012338. [PMID: 39008527 PMCID: PMC11271956 DOI: 10.1371/journal.ppat.1012338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 07/25/2024] [Accepted: 06/11/2024] [Indexed: 07/17/2024]
Abstract
Recently published near full-length KSHV genomes from a Cameroon Kaposi sarcoma case-control study showed strong evidence of viral recombination and mixed infections, but no sequence variations associated with disease. Using the same methodology, an additional 102 KSHV genomes from 76 individuals with KSHV-associated diseases have been sequenced. Diagnoses comprise all KSHV-associated diseases (KAD): Kaposi sarcoma (KS), primary effusion lymphoma (PEL), KSHV-associated large cell lymphoma (KSHV-LCL), a type of multicentric Castleman disease (KSHV-MCD), and KSHV inflammatory cytokine syndrome (KICS). Participants originated from 22 different countries, providing the opportunity to obtain new near full-length sequences of a wide diversity of KSHV genomes. These include near full-length sequences of genomes with KSHV K1 subtypes A, B, C, and F as well as subtype E, for which no full sequence was previously available. High levels of recombination were observed. Fourteen individuals (18%) showed evidence of infection with multiple KSHV variants (from two to four unique genomes). Twenty-six comparisons of sequences, obtained from various sampling sites including PBMC, tissue biopsies, oral fluids, and effusions in the same participants, identified near complete genome conservation between different biological compartments. Polymorphisms were identified in coding and non-coding regions, including indels in the K3 and K15 genes and sequence inversions here reported for the first time. One such polymorphism in KSHV ORF46, specific to the KSHV K1 subtype E2, encoded a mutation in the leucine loop extension of the uracil DNA glycosylase that results in alteration of biochemical functions of this protein. This confirms that KSHV sequence variations can have functional consequences warranting further investigation. This study represents the largest and most diverse analysis of KSHV genome sequences to date among individuals with KAD and provides important new information on global KSHV genomics.
Affiliation(s)
- Vickie A. Marshall
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Elena M. Cornejo Castro
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Charles A. Goodman
- Retroviral Evolution Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Nazzarena Labo
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Isabella Liu
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Nicholas C. Fisher
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Kyle N. Moore
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Ananthakrishnan Nair
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Taina Immonen
- Retroviral Evolution Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Brandon F. Keele
- Retroviral Evolution Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Mark N. Polizzotto
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Thomas S. Uldrick
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Yunxiang Mu
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Tanuja Saswat
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Laurie T. Krug
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Kevin M. McBride
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Kathryn Lurain
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Ramya Ramaswami
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Robert Yarchoan
- HIV and AIDS Malignancy Branch, National Cancer Institute, Bethesda, Maryland, United States of America
- Denise Whitby
- Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
17
Gabernet G, Marquez S, Bjornson R, Peltzer A, Meng H, Aron E, Lee NY, Jensen CG, Ladd D, Polster M, Hanssen F, Heumos S, Yaari G, Kowarik MC, Nahnsen S, Kleinstein SH. nf-core/airrflow: An adaptive immune receptor repertoire analysis workflow employing the Immcantation framework. PLoS Comput Biol 2024; 20:e1012265. [PMID: 39058741 PMCID: PMC11305553 DOI: 10.1371/journal.pcbi.1012265] [Received: 01/28/2024] [Revised: 08/07/2024] [Accepted: 06/20/2024] [Indexed: 07/28/2024]
Abstract
Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) is a valuable experimental tool to study the immune state in health and following immune challenges such as infectious diseases, (auto)immune diseases, and cancer. Several tools have been developed to reconstruct B cell and T cell receptor sequences from AIRR-seq data and infer B and T cell clonal relationships. However, currently available tools offer limited parallelization across samples, scalability or portability to high-performance computing infrastructures. To address this need, we developed nf-core/airrflow, an end-to-end bulk and single-cell AIRR-seq processing workflow which integrates the Immcantation Framework following BCR and TCR sequencing data analysis best practices. The Immcantation Framework is a comprehensive toolset, which allows the processing of bulk and single-cell AIRR-seq data from raw read processing to clonal inference. nf-core/airrflow is written in Nextflow and is part of the nf-core project, which collects community contributed and curated Nextflow workflows for a wide variety of analysis tasks. We assessed the performance of nf-core/airrflow on simulated sequencing data with sequencing errors and show example results with real datasets. To demonstrate the applicability of nf-core/airrflow to the high-throughput processing of large AIRR-seq datasets, we validated and extended previously reported findings of convergent antibody responses to SARS-CoV-2 by analyzing 97 COVID-19 infected individuals and 99 healthy controls, including a mixture of bulk and single-cell sequencing datasets. Using this dataset, we extended the convergence findings to 20 additional subjects, highlighting the applicability of nf-core/airrflow to validate findings in small in-house cohorts with reanalysis of large publicly available AIRR datasets.
Affiliation(s)
- Gisela Gabernet
- Department of Pathology, Yale School of Medicine, New Haven, Connecticut, United States of America
- Quantitative Biology Center, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Susanna Marquez
- Department of Pathology, Yale School of Medicine, New Haven, Connecticut, United States of America
- Robert Bjornson
- Yale Center for Research Computing, New Haven, Connecticut, United States of America
- Hailong Meng
- Department of Pathology, Yale School of Medicine, New Haven, Connecticut, United States of America
- Edel Aron
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Noah Y. Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Cole G. Jensen
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- David Ladd
- oNKo-Innate Pty Ltd, Melbourne, Victoria, Australia
- Mark Polster
- Quantitative Biology Center, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Department of Computer Science, Eberhard-Karls University of Tübingen, Tübingen, Germany
- M3 Research Center, University Hospital, Tübingen, Germany
- Friederike Hanssen
- Quantitative Biology Center, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Department of Computer Science, Eberhard-Karls University of Tübingen, Tübingen, Germany
- M3 Research Center, University Hospital, Tübingen, Germany
- Simon Heumos
- Quantitative Biology Center, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Department of Computer Science, Eberhard-Karls University of Tübingen, Tübingen, Germany
- M3 Research Center, University Hospital, Tübingen, Germany
- Gur Yaari
- Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Markus C. Kowarik
- Department of Neurology and Stroke, Center for Neurology, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Hertie Institute for Clinical Brain Research, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Sven Nahnsen
- Quantitative Biology Center, Eberhard-Karls University of Tübingen, Tübingen, Germany
- Department of Computer Science, Eberhard-Karls University of Tübingen, Tübingen, Germany
- M3 Research Center, University Hospital, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics (IBMI), Eberhard-Karls University of Tübingen, Tübingen, Germany
- Steven H. Kleinstein
- Department of Pathology, Yale School of Medicine, New Haven, Connecticut, United States of America
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Immunobiology, Yale School of Medicine, New Haven, Connecticut, United States of America
18
Kasmi Y, Neumann H, Haslob H, Blancke T, Möckel B, Postel U, Hanel R. Comparative analysis of bottom trawl and nanopore sequencing in fish biodiversity assessment: The Sylt Outer Reef example. Mar Environ Res 2024; 199:106602. [PMID: 38870557 DOI: 10.1016/j.marenvres.2024.106602] [Received: 01/20/2024] [Revised: 06/03/2024] [Accepted: 06/07/2024] [Indexed: 06/15/2024]
Abstract
The assessment of fish diversity is crucial for effective conservation and management strategies, especially in ecologically sensitive regions such as marine protected areas. This study contrasts the effectiveness of environmental DNA (eDNA) metabarcoding analysis employing Nanopore technology with that of beam trawl surveys at the Sylt Outer Reef, a Natura 2000 site in the North Sea, Germany. Out of the 17 fish species caught in a bottom trawl (using a 3 m beam trawl), 14 were also identified through eDNA extracted from water samples. The three species not detected in the eDNA results were absent because they lacked representation in public DNA databases. The eDNA method detected twice as many fish species as the beam trawl, totalling 36 species, of which 14 were also detected by the trawl. Additionally, the selection of primers (MiFish) facilitated the identification of one marine mammal species, the harbour porpoise. In conclusion, the findings underscore the potential of eDNA coupled with MinION sequencing (long-read technology) as a robust tool for biodiversity assessment, surpassing traditional methods in detecting species richness.
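The species counts reported in this abstract imply a simple overlap calculation, sketched below in Python. The function name `survey_overlap` and the dictionary keys are illustrative assumptions. The derived figures (species unique to each method, combined richness) follow arithmetically from the abstract's counts and are not stated by the authors.

```python
def survey_overlap(trawl_total: int, edna_total: int, shared: int) -> dict:
    """Derive overlap statistics for two species lists from their sizes."""
    if shared > min(trawl_total, edna_total):
        raise ValueError("shared count cannot exceed either total")
    return {
        # Species found by only one of the two methods.
        "trawl_only": trawl_total - shared,
        "edna_only": edna_total - shared,
        # Inclusion-exclusion: size of the union of both species lists.
        "combined": trawl_total + edna_total - shared,
        # How many times more species eDNA detected than the trawl.
        "edna_to_trawl_ratio": edna_total / trawl_total,
    }


# Counts taken from the abstract: 17 trawl species, 36 eDNA species, 14 shared.
stats = survey_overlap(trawl_total=17, edna_total=36, shared=14)
print(stats)
```

With these counts the sketch gives 3 species seen only in the trawl, 22 seen only by eDNA and 39 species across both methods, so the "twice as many" comparison corresponds to a ratio of about 2.1.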
Affiliation(s)
- Yassine Kasmi
- Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
- Holger Haslob
- Thünen Institute of Sea Fisheries, Bremerhaven, Germany
- Tina Blancke
- Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
- Benita Möckel
- Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
- Ute Postel
- Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
- Reinhold Hanel
- Thünen Institute of Fisheries Ecology, Bremerhaven, Germany
19
Hemstrom W, Grummer JA, Luikart G, Christie MR. Next-generation data filtering in the genomics era. Nat Rev Genet 2024:10.1038/s41576-024-00738-6. [PMID: 38877133 DOI: 10.1038/s41576-024-00738-6] [Accepted: 04/25/2024] [Indexed: 06/16/2024]
Abstract
Genomic data are ubiquitous across disciplines, from agriculture to biodiversity, ecology, evolution and human health. However, these datasets often contain noise or errors and are missing information that can affect the accuracy and reliability of subsequent computational analyses and conclusions. A key step in genomic data analysis is filtering - removing sequencing bases, reads, genetic variants and/or individuals from a dataset - to improve data quality for downstream analyses. Researchers are confronted with a multitude of choices when filtering genomic data; they must choose which filters to apply and select appropriate thresholds. To help usher in the next generation of genomic data filtering, we review and suggest best practices to improve the implementation, reproducibility and reporting standards for filter types and thresholds commonly applied to genomic datasets. We focus mainly on filters for minor allele frequency, missing data per individual or per locus, linkage disequilibrium and Hardy-Weinberg deviations. Using simulated and empirical datasets, we illustrate the large effects of different filtering thresholds on common population genetics statistics, such as Tajima's D value, population differentiation (FST), nucleotide diversity (π) and effective population size (Ne).
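Two of the filters named in this abstract, minor allele frequency and per-locus missing data, can be illustrated with a small Python sketch. The genotype encoding (0, 1 or 2 copies of the alternate allele at a diploid locus, `None` for a missing call), the function names and the thresholds are assumptions chosen for illustration and are not taken from the review. The abstract itself notes that threshold choice can strongly affect downstream statistics.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0, 1 or 2 alternate-allele copies; None = missing call


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of individuals with no genotype call at this locus."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> Optional[float]:
    """MAF at one diploid locus, or None if every call is missing."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_loci(matrix: List[List[Genotype]],
                min_maf: float,
                max_missing: float) -> List[int]:
    """Return the indices of loci (rows) that pass both filters."""
    kept = []
    for index, calls in enumerate(matrix):
        if missing_fraction(calls) > max_missing:
            continue  # too many missing calls at this locus
        maf = minor_allele_frequency(calls)
        if maf is None or maf < min_maf:
            continue  # uninformative or too rare
        kept.append(index)
    return kept


# Hypothetical toy data: four loci (rows) genotyped in four individuals.
loci = [
    [0, 0, 1, 2],        # MAF 0.375, no missing calls
    [0, 0, 0, 0],        # monomorphic, MAF 0
    [0, None, None, 1],  # half of the calls missing
    [2, 2, 1, None],     # MAF about 0.167, a quarter of the calls missing
]
# Illustrative thresholds only (hypothetical, not recommendations).
print(filter_loci(loci, min_maf=0.05, max_missing=0.25))
```

Raising `min_maf` to 0.2 in this toy example drops the last locus as well, which is a small-scale version of the threshold sensitivity the abstract describes.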
Affiliation(s)
- William Hemstrom
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Jared A Grummer
- Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Gordon Luikart
- Flathead Lake Biological Station, Wildlife Biology Program and Division of Biological Sciences, University of Montana, Missoula, MT, USA
- Mark R Christie
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, USA
20
Dobner J, Nguyen T, Pavez-Giani MG, Cyganek L, Distelmaier F, Krutmann J, Prigione A, Rossi A. mtDNA analysis using Mitopore. Mol Ther Methods Clin Dev 2024; 32:101231. [PMID: 38572068 PMCID: PMC10988129 DOI: 10.1016/j.omtm.2024.101231] [Received: 09/14/2023] [Accepted: 03/08/2024] [Indexed: 04/05/2024]
Abstract
Mitochondrial DNA (mtDNA) analysis is crucial for the diagnosis of mitochondrial disorders, forensic investigations, and basic research. Existing pipelines are complex, expensive, and require specialized personnel. In many cases, including the diagnosis of detrimental single nucleotide variants (SNVs), mtDNA analysis is still carried out using Sanger sequencing. Here, we developed a simple workflow and a publicly available webserver named Mitopore that allows the detection of mtDNA SNVs, indels, and haplogroups. To simplify mtDNA analysis, we tailored our workflow to process noisy long-read sequencing data, focusing on sequence alignment and parameter optimization. We implemented Mitopore with eliBQ (eliminate bad quality reads), an innovative quality enhancement that increases per-base quality by over 20% for low-quality data. The whole Mitopore workflow and webserver were validated using patient-derived and induced pluripotent stem cells harboring mtDNA mutations. Mitopore streamlines mtDNA analysis as an easy-to-use, fast, reliable, and cost-effective analysis method for both long- and short-read sequencing data. This significantly enhances the accessibility of mtDNA analysis and reduces the cost per sample, contributing to the progress of mtDNA-related research and diagnosis.
Affiliation(s)
- Jochen Dobner
- Institut für Umweltmedizinische Forschung (IUF)-Leibniz Research Institute for Environmental Medicine, 40225 Düsseldorf, Germany
- Thach Nguyen
- Institut für Umweltmedizinische Forschung (IUF)-Leibniz Research Institute for Environmental Medicine, 40225 Düsseldorf, Germany
- Mario Gustavo Pavez-Giani
- Clinic for Cardiology and Pneumology, University Medical Center Göttingen, 37075 Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Göttingen, 37075 Göttingen, Germany
- Lukas Cyganek
- Clinic for Cardiology and Pneumology, University Medical Center Göttingen, 37075 Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), Partner Site Göttingen, 37075 Göttingen, Germany
- Cluster of Excellence “Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells” (MBExC), University of Göttingen, 37075 Göttingen, Germany
- Felix Distelmaier
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Medical Faculty, Heinrich Heine University, 40225 Düsseldorf, Germany
- Jean Krutmann
- Institut für Umweltmedizinische Forschung (IUF)-Leibniz Research Institute for Environmental Medicine, 40225 Düsseldorf, Germany
- Medical Faculty, Heinrich Heine University, 40225 Düsseldorf, Germany
- Alessandro Prigione
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Medical Faculty, Heinrich Heine University, 40225 Düsseldorf, Germany
- Andrea Rossi
- Institut für Umweltmedizinische Forschung (IUF)-Leibniz Research Institute for Environmental Medicine, 40225 Düsseldorf, Germany
21
Davison A, Chowdhury M, Johansen M, Uliano-Silva M, Blaxter M. High heteroplasmy is associated with low mitochondrial copy number and selection against non-synonymous mutations in the snail Cepaea nemoralis. BMC Genomics 2024; 25:596. [PMID: 38872121 DOI: 10.1186/s12864-024-10505-w] [Received: 03/01/2024] [Accepted: 06/06/2024] [Indexed: 06/15/2024]
Abstract
Molluscan mitochondrial genomes are unusual because they show wide variation in size and radical genome rearrangements, and frequently show high variation (> 10%) within species. As progress in understanding this variation has been limited, we used whole genome sequencing of a six-generation matriline of the terrestrial snail Cepaea nemoralis, as well as whole genome sequences from wild-collected C. nemoralis, the sister species C. hortensis, and multiple other snail species to explore the origins of mitochondrial DNA (mtDNA) variation. The main finding is that a high rate of SNP heteroplasmy in somatic tissue was negatively correlated with mtDNA copy number in both Cepaea species. In individuals with under ten mtDNA copies per nuclear genome, more than 10% of all positions were heteroplasmic, with evidence for transmission of this heteroplasmy through the germline. Further analyses showed evidence for purifying selection acting on non-synonymous mutations, even at low frequency of the rare allele, especially in cytochrome oxidase subunit 1 and cytochrome b. The mtDNA of some individuals of Cepaea nemoralis contained a length heteroplasmy, including up to 12 direct repeat copies of tRNA-Val, with 24 copies in another snail, Candidula rugosiuscula, and repeats of tRNA-Thr in C. hortensis. These repeats likely arise due to error-prone replication but are not correlated with mitochondrial copy number in C. nemoralis. Overall, the findings provide key insights into mechanisms of replication, mutation and evolution in molluscan mtDNA, and so will inform wider studies on the biology and evolution of mtDNA across animal phyla.
Affiliation(s)
- Angus Davison
- School of Life Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
- Mehrab Chowdhury
- School of Life Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
- Margrethe Johansen
- School of Life Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
- Marcela Uliano-Silva
- Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire, CB10 1SA, UK
- Mark Blaxter
- Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire, CB10 1SA, UK
22
Hersch SJ, Chandrasekaran S, Lam J, Nafissi N, Slavcev RA. Manufacturing DNA in E. coli yields higher-fidelity DNA than in vitro enzymatic synthesis. Mol Ther Methods Clin Dev 2024; 32:101227. [PMID: 38516691 PMCID: PMC10951457 DOI: 10.1016/j.omtm.2024.101227] [Received: 09/18/2023] [Accepted: 02/26/2024] [Indexed: 03/23/2024]
Abstract
Biotechnologies such as gene therapy have brought DNA vectors to the forefront of pharmaceuticals. The quality of starting material plays a pivotal role in determining final product quality. Here, we examined the fidelity of DNA replication using enzymatic methods (in vitro) compared to plasmid DNA produced in vivo in E. coli. Next-generation sequencing approaches rely on in vitro polymerases, which have inherent limitations in sensitivity. To address this challenge, we introduce a novel assay based on loss-of-function (LOF) mutations in the conditionally toxic sacB gene. Our findings show that DNA production in E. coli results in significantly fewer LOF mutations (80- to 3,000-fold fewer) than enzymatic DNA replication methods such as polymerase chain reaction (PCR) and rolling circle amplification (RCA). These results suggest that using DNA produced by PCR or RCA may introduce a substantial number of mutation impurities, potentially affecting the quality and yield of final pharmaceutical products. Our study underscores that DNA synthesized in vitro has a significantly higher mutation rate than DNA produced traditionally in E. coli. Therefore, utilizing in vitro enzymatically produced DNA in biotechnology and biomanufacturing may entail considerable fidelity-related risks, while using DNA starting material derived from E. coli substantially mitigates this risk.
Affiliation(s)
- Jamie Lam
- Mediphage Bioceuticals, Inc, Toronto, ON, Canada
- Nafiseh Nafissi
- Mediphage Bioceuticals, Inc, Toronto, ON, Canada
- School of Pharmacy, University of Waterloo, Waterloo, ON, Canada
- Roderick A. Slavcev
- Mediphage Bioceuticals, Inc, Toronto, ON, Canada
- School of Pharmacy, University of Waterloo, Waterloo, ON, Canada
- Centre for Eye and Vision Research, HKSTP, Ma Liu Shui, Hong Kong
23
Yang ZX, Deng DH, Gao ZY, Zhang ZK, Fu YW, Wen W, Zhang F, Li X, Li HY, Zhang JP, Zhang XB. OliTag-seq enhances in cellulo detection of CRISPR-Cas9 off-targets. Commun Biol 2024; 7:696. [PMID: 38844522 PMCID: PMC11156888 DOI: 10.1038/s42003-024-06360-w] [Received: 05/05/2023] [Accepted: 05/20/2024] [Indexed: 06/09/2024]
Abstract
The potential for off-target mutations is a critical concern for the therapeutic application of CRISPR-Cas9 gene editing. Current detection methodologies, such as GUIDE-seq, exhibit limitations in oligonucleotide integration efficiency and sensitivity, which could hinder their utility in clinical settings. To address these issues, we introduce OliTag-seq, an in-cellulo assay specifically engineered to enhance the detection of off-target events. OliTag-seq employs a stable oligonucleotide for precise break tagging and an innovative triple-priming amplification strategy, significantly improving the scope and accuracy of off-target site identification. This method surpasses traditional assays by providing comprehensive coverage across various sgRNAs and genomic targets. Our research particularly highlights the superior sensitivity of induced pluripotent stem cells (iPSCs) in detecting off-target mutations, advocating for using patient-derived iPSCs for refined off-target analysis in therapeutic gene editing. Furthermore, we provide evidence that prolonged Cas9 expression and transient HDAC inhibitor treatments enhance the assay's ability to uncover off-target events. OliTag-seq merges the high sensitivity typical of in vitro assays with the practical application of cellular contexts. This approach significantly improves the safety and efficacy profiles of CRISPR-Cas9 interventions in research and clinical environments, positioning it as an essential tool for the precise assessment and refinement of genome editing applications.
Grants
- the National Key Research and Development Program of China (Grant Nos. 2019YFA0110803, 2019YFA0110204, and 2021YFA1100900), the National Natural Science Foundation of China (Grant Nos. 82070115 and 81890990), the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (CIFMS) (Grant Nos. 2022-I2M-2-003, 2022-I2M-2-001, 2021-I2M-1-041, 2021-I2M-1-040, and 2021-I2M-1-001), the Nonprofit Central Research Institute Fund of Chinese Academy of Medical Sciences (Grant No. 2020-PT310-011), the Tianjin Synthetic Biotechnology Innovation Capacity Improvement Project (Grant No. TSBICIP-KJGG-017), the CAMS Fundamental Research Funds for Central Research Institutes (Grant No. 3332021093), the Haihe Laboratory of Cell Ecosystem Innovation Fund (Grant No. HH23KYZX0005 and HH22KYZX0022), the State Key Laboratory of Experimental Hematology Research Grant (Grant No. Z23-05), and the Postdoctoral Fellowship Program of CPSF (Grant No. GZB20230081)
Affiliation(s)
- Zhi-Xue Yang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Dong-Hao Deng
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Zhu-Ying Gao
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Zhi-Kang Zhang
- College of Computer Science and Technology, China University of Petroleum (East China), 266000, Qingdao, China
- Ya-Wen Fu
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Wei Wen
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Feng Zhang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Xiang Li
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Hua-Yu Li
- College of Computer Science and Technology, China University of Petroleum (East China), 266000, Qingdao, China
- Jian-Ping Zhang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
- Xiao-Bing Zhang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, 300020, Tianjin, China
- Tianjin Institutes of Health Science, 301600, Tianjin, China
24
Goussarov G, Mysara M, Cleenwerck I, Claesen J, Leys N, Vandamme P, Van Houdt R. Benchmarking short-, long- and hybrid-read assemblers for metagenome sequencing of complex microbial communities. Microbiology (Reading) 2024; 170:001469. [PMID: 38916949 PMCID: PMC11261854 DOI: 10.1099/mic.0.001469] [Received: 03/14/2024] [Accepted: 05/23/2024] [Indexed: 06/26/2024]
Abstract
Metagenome community analysis, driven by the continued development of sequencing technology, is rapidly providing insights into many aspects of microbiology and becoming a cornerstone tool. Illumina, Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) are the leading technologies, each with their own advantages and drawbacks. Illumina provides accurate reads at a low cost, but their length is too short to close bacterial genomes. Long reads overcome this limitation, but these technologies produce reads with lower accuracy (ONT) or with lower throughput (PacBio high-fidelity reads). In a critical first analysis step, reads are assembled to reconstruct genomes or individual genes within the community. However, to date, the performance of existing assemblers has never been challenged with a complex mock metagenome. Here, we evaluate the performance of current assemblers that use short, long or both read types on a complex mock metagenome consisting of 227 bacterial strains with varying degrees of relatedness. We show that many of the current assemblers are not suited to handle such a complex metagenome. In addition, hybrid assemblies do not fulfil their potential. We conclude that ONT reads assembled with CANU and Illumina reads assembled with SPAdes offer the best value for reconstructing genomes and individual genes of complex metagenomes, respectively.
Affiliation(s)
- Gleb Goussarov
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Mohamed Mysara
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Bioinformatics group, Information Technology & Computer Science, Nile University, Giza, Egypt
- Ilse Cleenwerck
- Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Jürgen Claesen
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Natalie Leys
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Peter Vandamme
- Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Rob Van Houdt
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
25
Liu Y, Jiao B, Champer J, Qian W. Overriding Mendelian inheritance in Arabidopsis with a CRISPR toxin-antidote gene drive that impairs pollen germination. Nat Plants 2024; 10:910-922. [PMID: 38886523 DOI: 10.1038/s41477-024-01692-1] [Received: 10/10/2023] [Accepted: 04/09/2024] [Indexed: 06/20/2024]
Abstract
Synthetic gene drives, inspired by natural selfish genetic elements and transmitted to progeny at super-Mendelian (>50%) frequencies, present transformative potential for disseminating traits that benefit humans throughout wild populations, even facing potential fitness costs. Here we constructed a gene drive system in plants called CRISPR-Assisted Inheritance utilizing NPG1 (CAIN), which uses a toxin-antidote mechanism in the male germline to override Mendelian inheritance. Specifically, a guide RNA-Cas9 cassette targets the essential No Pollen Germination 1 (NPG1) gene, serving as the toxin to block pollen germination. A recoded, CRISPR-resistant copy of NPG1 serves as the antidote, providing rescue only in pollen cells that carry the drive. To limit potential consequences of inadvertent release, we used self-pollinating Arabidopsis thaliana as a model. The drive demonstrated a robust 88-99% transmission rate over two successive generations, producing minimal resistance alleles that are unlikely to inhibit drive spread. Our study provides a strong basis for rapid genetic modification or suppression of outcrossing plant populations.
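The notion of super-Mendelian transmission in the abstract above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' analysis: it computes a transmission rate from progeny counts and an exact one-sided binomial probability of seeing at least that many drive-carrying progeny under the Mendelian expectation of 50%. The function names and the progeny counts are hypothetical.

```python
from math import comb


def transmission_rate(drive_progeny, total_progeny):
    """Fraction of scored progeny that inherited the drive allele."""
    return drive_progeny / total_progeny


def p_at_least_under_mendelian(drive_progeny, total_progeny, p0=0.5):
    """Exact one-sided binomial tail: P(X >= drive_progeny) when each
    progeny inherits the allele independently with probability p0."""
    return sum(
        comb(total_progeny, k) * p0**k * (1 - p0) ** (total_progeny - k)
        for k in range(drive_progeny, total_progeny + 1)
    )


# Hypothetical counts, chosen only to illustrate the calculation.
rate = transmission_rate(88, 100)
p_value = p_at_least_under_mendelian(88, 100)
```

A rate well above 0.5 combined with a very small tail probability is what "overriding Mendelian inheritance" means in statistical terms; the test actually used in the paper is not stated in the abstract.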
Affiliation(s)
- Yang Liu
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- Bingke Jiao
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Jackson Champer
- Center for Bioinformatics, School of Life Sciences, Center for Life Sciences, Peking University, Beijing, China
- Wenfeng Qian
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
|
26
|
Wade KJ, Suseno R, Kizer K, Williams J, Boquett J, Caillier S, Pollock NR, Renschen A, Santaniello A, Oksenberg JR, Norman PJ, Augusto DG, Hollenbach JA. MHConstructor: A high-throughput, haplotype-informed solution to the MHC assembly challenge. bioRxiv 2024:2024.05.20.595060. [PMID: 38826378 PMCID: PMC11142050 DOI: 10.1101/2024.05.20.595060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short-read de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole-genome sequencing and target-capture short-read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short-read data. MHConstructor facilitates widespread access to high-quality, alignment-free MHC sequence analysis.
Affiliation(s)
- Kristen J. Wade
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Rayo Suseno
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Kerry Kizer
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Jacqueline Williams
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Juliano Boquett
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Stacy Caillier
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Nicholas R. Pollock
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Adam Renschen
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Adam Santaniello
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Jorge R. Oksenberg
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Paul J. Norman
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Danillo G. Augusto
- Department of Biological Sciences, University of North Carolina Charlotte, Charlotte, NC, United States
- Programa de Pós-Graduação em Genética, Universidade Federal do Paraná, Curitiba, Brazil
- Jill A. Hollenbach
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, United States
|
27
|
de la Mata R, Mollá-Morales A, Méndez-Vigo B, Torres-Pérez R, Oliveros JC, Gómez R, Marcer A, Castilla AR, Nordborg M, Alonso-Blanco C, Picó FX. Variation and plasticity in life-history traits and fitness of wild Arabidopsis thaliana populations are not related to their genotypic and ecological diversity. BMC Ecol Evol 2024; 24:56. [PMID: 38702598 PMCID: PMC11067129 DOI: 10.1186/s12862-024-02246-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 04/25/2024] [Indexed: 05/06/2024] Open
Abstract
BACKGROUND Despite its implications for population dynamics and evolution, the relationship between genetic and phenotypic variation in wild populations remains unclear. Here, we estimated variation and plasticity in life-history traits and fitness of the annual plant Arabidopsis thaliana in two common garden experiments that differed in environmental conditions. We used up to 306 maternal inbred lines from six Iberian populations characterized by low and high genotypic (based on whole-genome sequences) and ecological (vegetation type) diversity. RESULTS Low and high genotypic and ecological diversity was found in edge and core Iberian environments, respectively. Given that selection is expected to be stronger in edge environments and that ecological diversity may enhance both phenotypic variation and plasticity, we expected genotypic diversity to be positively associated with phenotypic variation and plasticity. However, maternal lines, irrespective of the genotypic and ecological diversity of their population of origin, exhibited a substantial amount of phenotypic variation and plasticity for all traits. Furthermore, all populations harbored maternal lines with canalization (robustness) or sensitivity in response to harsher environmental conditions in one of the two experiments. CONCLUSIONS Overall, we conclude that the environmental attributes of each population probably determine their genotypic diversity, but all populations maintain substantial phenotypic variation and plasticity for all traits, which represents an asset to endure in changing environments.
Affiliation(s)
- Raul de la Mata
- Departamento de Biología Evolutiva, Estación Biológica de Doñana (EBD), Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, 41092, Spain
- Faculty of Forestry, Institute of Dehesa Research (INDEHESA), Universidad de Extremadura, 10600, Plasencia, Spain
- Belén Méndez-Vigo
- Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), 28049, Madrid, Spain
- Rafael Torres-Pérez
- Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), 28049, Madrid, Spain
- Juan Carlos Oliveros
- Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), 28049, Madrid, Spain
- Rocío Gómez
- Departamento de Biología Evolutiva, Estación Biológica de Doñana (EBD), Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, 41092, Spain
- Arnald Marcer
- CREAF, Bellaterra (Cerdanyola del Vallès), 08193, Catalonia, Spain
- Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), 08193, Catalonia, Spain
- Antonio R Castilla
- Department of Plant Biology, Ecology, and Evolution, College of Arts and Sciences, Oklahoma State University, Stillwater, OK, 74078-3031, USA
- Magnus Nordborg
- Gregor Mendel Institute, Austrian Academy of Sciences, 1030, Vienna, Austria
- Carlos Alonso-Blanco
- Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), 28049, Madrid, Spain
- F Xavier Picó
- Departamento de Biología Evolutiva, Estación Biológica de Doñana (EBD), Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, 41092, Spain
|
28
|
Wang D, Trimbos KB, Gomes SIF, Jacquemyn H, Merckx VSFT. Metabarcoding read abundances of orchid mycorrhizal fungi are correlated to copy numbers estimated using ddPCR. New Phytol 2024; 242:1825-1834. [PMID: 37929750 DOI: 10.1111/nph.19385] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/28/2023] [Accepted: 10/20/2023] [Indexed: 11/07/2023]
Abstract
Quantifying the abundances of fungi is key to understanding natural variation in mycorrhizal communities in relation to plant ecophysiology and environmental heterogeneity. High-throughput metabarcoding approaches have transformed our ability to characterize and compare complex mycorrhizal communities. However, it remains unclear how well metabarcoding read counts correlate with actual read abundances in the sample, potentially limiting their use as a proxy for species abundances. Here, we use droplet digital PCR (ddPCR) to evaluate the reliability of ITS2 metabarcoding data for quantitative assessments of mycorrhizal communities in the orchid species Neottia ovata sampled at multiple sites. We performed specific ddPCR assays for eight families of orchid mycorrhizal fungi and compared the results with read counts obtained from metabarcoding. Our results demonstrate a significant correlation between DNA copy numbers measured by ddPCR assays and metabarcoding read counts of major mycorrhizal partners of N. ovata, highlighting the usefulness of metabarcoding for quantifying the abundance of orchid mycorrhizal fungi. Yet, the levels of correlation between the two methods and the numbers of false zero values varied across fungal families, which warrants cautious evaluation of the reliability of low-abundance families. This study underscores the potential of metabarcoding data for more quantitative analyses of mycorrhizal communities and presents practical workflows for metabarcoding and ddPCR to achieve a more comprehensive understanding of orchid mycorrhizal communities.
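The comparison described above, between ddPCR copy numbers and metabarcoding read counts for one fungal family, can be sketched as follows. This is an illustrative sketch only: the abstract does not say which correlation coefficient was used, so Spearman's rank correlation is an assumption here, as is the definition of a "false zero" (a sample with a positive ddPCR copy number but zero reads). The function names and all numbers are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def false_zeros(copies, reads):
    """Count samples where ddPCR detects the family but metabarcoding reports no reads."""
    return sum(1 for c, r in zip(copies, reads) if c > 0 and r == 0)


# Hypothetical per-sample values for a single fungal family.
ddpcr_copies = [10, 200, 50, 0, 1200]
read_counts = [0, 340, 90, 0, 2100]
rho = spearman(ddpcr_copies, read_counts)
n_false_zero = false_zeros(ddpcr_copies, read_counts)
```

Running both summaries per family mirrors the abstract's caution: a family can show a high overall correlation and still have low-abundance samples that metabarcoding misses entirely.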
Affiliation(s)
- Deyi Wang
- Naturalis Biodiversity Center, 2332 AA, Leiden, the Netherlands
- Institute of Biology, Leiden University, 2333 BE, Leiden, the Netherlands
- Krijn B Trimbos
- Department of Environmental Biology, Institute of Environmental Sciences, 2333 CC, Leiden University, Leiden, the Netherlands
- Sofia I F Gomes
- Institute of Biology, Leiden University, 2333 BE, Leiden, the Netherlands
- Hans Jacquemyn
- Department of Biology, Plant Conservation and Population Biology, KU Leuven, Kasteelpark Arenberg 31, Heverlee, 3001, Leuven, Belgium
- Vincent S F T Merckx
- Naturalis Biodiversity Center, 2332 AA, Leiden, the Netherlands
- Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH, Amsterdam, the Netherlands
|
29
|
Efstratiou A, Gaigher A, Künzel S, Teles A, Lenz TL. Template-specific optimization of NGS genotyping pipelines reveals allele-specific variation in MHC gene expression. Mol Ecol Resour 2024; 24:e13935. [PMID: 38332480 DOI: 10.1111/1755-0998.13935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 01/19/2024] [Accepted: 01/25/2024] [Indexed: 02/10/2024]
Abstract
Using high-throughput sequencing for precise genotyping of multi-locus gene families, such as the major histocompatibility complex (MHC), remains challenging, due to the complexity of the data and difficulties in distinguishing genuine from erroneous variants. Several dedicated genotyping pipelines for data from high-throughput sequencing, such as next-generation sequencing (NGS), have been developed to tackle the ensuing risk of artificially inflated diversity. Here, we thoroughly assess three such multi-locus genotyping pipelines for NGS data, the DOC method, AmpliSAS and ACACIA, using MHC class IIβ data sets of three-spined stickleback gDNA, cDNA and "artificial" plasmid samples with known allelic diversity. We show that genotyping of gDNA and plasmid samples at optimal pipeline parameters was highly accurate and reproducible across methods. However, for cDNA data, the gDNA-optimal parameter configuration yielded decreased overall genotyping precision and consistency between pipelines. Further adjustments of key clustering parameters were required to account for higher error rates and larger variation in sequencing depth per allele, highlighting the importance of template-specific pipeline optimization for reliable genotyping of multi-locus gene families. Through accurate paired gDNA-cDNA typing and MHC-II haplotype inference, we show that MHC-II allele-specific expression levels correlate negatively with allele number across haplotypes. Lastly, sibship-assisted cDNA-typing of MHC-I revealed novel variants linked in haplotype blocks, and a higher-than-previously-reported individual MHC-I allelic diversity. In conclusion, we provide novel genotyping protocols for the three-spined stickleback MHC-I and -II genes, and evaluate the performance of popular NGS-genotyping pipelines. We also show that fine-tuned genotyping of paired gDNA-cDNA samples facilitates amplification bias-corrected MHC allele expression analysis.
Affiliation(s)
- Artemis Efstratiou
- Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Research Group for Evolutionary Immunogenomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Arnaud Gaigher
- Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Research Group for Evolutionary Immunogenomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- CIBIO-InBIO, Research Center in Biodiversity and Genetic Resources, University of Porto, Vairão, Portugal
- Sven Künzel
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Ana Teles
- Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Research Group for Evolutionary Immunogenomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Tobias L Lenz
- Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Research Group for Evolutionary Immunogenomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
|
30
|
Koptagel H, Jun SH, Hård J, Lagergren J. Scuphr: A probabilistic framework for cell lineage tree reconstruction. PLoS Comput Biol 2024; 20:e1012094. [PMID: 38723024 PMCID: PMC11125557 DOI: 10.1371/journal.pcbi.1012094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 05/24/2024] [Accepted: 04/20/2024] [Indexed: 05/25/2024] Open
Abstract
Cell lineage tree reconstruction methods are developed for various tasks, such as investigating development, differentiation, and cancer progression. Single-cell sequencing technologies enable more thorough analysis with higher resolution. We present Scuphr, a distance-based cell lineage tree reconstruction method using bulk and single-cell DNA sequencing data from healthy tissues. Scuphr accounts for common challenges of single-cell DNA sequencing, such as allelic dropouts and amplification errors. Scuphr computes the distance between cell pairs and reconstructs the lineage tree using the neighbor-joining algorithm. With its embarrassingly parallel design, Scuphr runs faster than state-of-the-art methods while achieving better accuracy. The method's robustness is investigated using various synthetic datasets and a biological dataset of 18 cells.
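The final step named in the abstract, building a tree from pairwise cell distances with neighbor-joining, can be illustrated with a short sketch. It is a plain textbook neighbor-joining implementation, not Scuphr's code: Scuphr's probabilistic distance computation is not reproduced, and the cell labels and distances below are hypothetical values chosen to be consistent with a known tree.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree from pairwise distances.

    names: list of leaf labels.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node, neighbour, branch_length) edges.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all other current nodes.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimises the neighbor-joining Q criterion.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dab = d[frozenset((a, b))]
        new = f"internal{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        edges.append((a, new, len_a))
        edges.append((b, new, dab - len_a))
        # Distances from the new node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - dab
                )
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Hypothetical distances between four cells, additive on a known tree.
cells = ["cell_A", "cell_B", "cell_C", "cell_D"]
pairwise = {
    frozenset(("cell_A", "cell_B")): 3,
    frozenset(("cell_A", "cell_C")): 9,
    frozenset(("cell_A", "cell_D")): 10,
    frozenset(("cell_B", "cell_C")): 10,
    frozenset(("cell_B", "cell_D")): 11,
    frozenset(("cell_C", "cell_D")): 7,
}
tree_edges = neighbor_joining(cells, pairwise)
```

Because each pairwise distance depends only on the two cells involved, the distance matrix can be filled in parallel before this step, which is presumably what the abstract's "embarrassingly parallel design" refers to.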
Affiliation(s)
- Hazal Koptagel
- School of EECS, KTH Royal Institute of Technology, Stockholm, Sweden
- Science for Life Laboratory, Stockholm, Sweden
- Seong-Hwan Jun
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, United States of America
- Joanna Hård
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Jens Lagergren
- School of EECS, KTH Royal Institute of Technology, Stockholm, Sweden
- Science for Life Laboratory, Stockholm, Sweden
|
31
|
Gunasekaran D, Ardell DH, Nobile CJ. SNP-SVant: A Computational Workflow to Predict and Annotate Genomic Variants in Organisms Lacking Benchmarked Variants. Curr Protoc 2024; 4:e1046. [PMID: 38717471 PMCID: PMC11081530 DOI: 10.1002/cpz1.1046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2024]
Abstract
Whole-genome sequencing is widely used to investigate population genomic variation in organisms of interest. Assorted tools have been independently developed to call variants from short-read sequencing data aligned to a reference genome, including single nucleotide polymorphisms (SNPs) and structural variations (SVs). We developed SNP-SVant, an integrated, flexible, and computationally efficient bioinformatic workflow that predicts high-confidence SNPs and SVs in organisms without benchmarked variants, which are traditionally used for distinguishing sequencing errors from real variants. In the absence of these benchmarked datasets, we leverage multiple rounds of statistical recalibration to increase the precision of variant prediction. The SNP-SVant workflow is flexible, with user options to trade off accuracy for sensitivity. The workflow predicts SNPs and small insertions and deletions using the Genome Analysis ToolKit (GATK) and predicts SVs using the Genome Rearrangement IDentification Software Suite (GRIDSS), and it culminates in variant annotation using custom scripts. A key utility of SNP-SVant is its scalability. Variant calling is a computationally expensive procedure, and thus, SNP-SVant uses a workflow management system with intermediary checkpoint steps to ensure efficient use of resources by minimizing redundant computations and omitting steps where dependent files are available. SNP-SVant also provides metrics to assess the quality of called variants and converts between VCF and aligned FASTA format outputs to ensure compatibility with downstream tools to calculate selection statistics, which are commonplace in population genomics studies. By accounting for both small and large structural variants, users of this workflow can obtain a wide-ranging view of genomic alterations in an organism of interest. Overall, this workflow advances our capabilities in assessing the functional consequences of different types of genomic alterations, ultimately improving our ability to associate genotypes with phenotypes. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol: Predicting single nucleotide polymorphisms and structural variations
Support Protocol 1: Downloading publicly available sequencing data
Support Protocol 2: Visualizing variant loci using Integrated Genome Viewer
Support Protocol 3: Converting between VCF and aligned FASTA formats
Affiliation(s)
- Deepika Gunasekaran
- Quantitative and Systems Biology Graduate Program, University of California, Merced, CA, USA
- Department of Molecular and Cell Biology, School of Natural Sciences, University of California, Merced, CA, USA
- David H. Ardell
- Department of Molecular and Cell Biology, School of Natural Sciences, University of California, Merced, CA, USA
- Clarissa J. Nobile
- Department of Molecular and Cell Biology, School of Natural Sciences, University of California, Merced, CA, USA
- Health Science Research Institute, University of California, Merced, CA, USA
|
32
|
Espinosa E, Bautista R, Larrosa R, Plata O. Advancements in long-read genome sequencing technologies and algorithms. Genomics 2024; 116:110842. [PMID: 38608738 DOI: 10.1016/j.ygeno.2024.110842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/01/2024] [Accepted: 04/06/2024] [Indexed: 04/14/2024]
Abstract
The recent advent of long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), has led to substantial improvements in accuracy and computational cost in sequencing genomes. However, de novo whole-genome assembly remains a formidable challenge, both in its computational demands and in the quality of the results. As sequencing accuracy and throughput steadily advance, a continuous stream of innovative assembly tools floods the field. Navigating this dynamic landscape necessitates a reasonable choice of sequencing platform, depth, and assembly tools to orchestrate high-quality genome reconstructions. This comprehensive review delves into the intricate interplay between cutting-edge long-read sequencing technologies, assembly methodologies, and the ever-evolving field of genomics. With a focus on addressing the pivotal challenges and harnessing the opportunities presented by these advancements, we provide an in-depth exploration of the crucial factors influencing the selection of optimal strategies for achieving robust and insightful genome assemblies.
Affiliation(s)
- Elena Espinosa
- Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain
- Rocio Bautista
- Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain
- Rafael Larrosa
- Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain
- Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain
- Oscar Plata
- Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain
|
33
|
Lee M, Guo Q, Kim M, Choi J, Segura A, Genceroglu A, LeBlanc L, Ramirez N, Jang YJ, Jang Y, Lee BK, Marcotte EM, Kim J. Systematic mapping of TF-mediated cell fate changes by a pooled induction coupled with scRNA-seq and multi-omics approaches. Genome Res 2024; 34:484-497. [PMID: 38580401 PMCID: PMC11067882 DOI: 10.1101/gr.277926.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 02/21/2024] [Indexed: 04/07/2024]
Abstract
Transcriptional regulation controls cellular functions through interactions between transcription factors (TFs) and their chromosomal targets. However, understanding the fate conversion potential of multiple TFs in an inducible manner remains limited. Here, we introduce iTF-seq as a method for identifying individual TFs that can alter cell fate toward specific lineages at a single-cell level. iTF-seq enables time course monitoring of transcriptome changes, and with biotinylated individual TFs, it provides a multi-omics approach to understanding the mechanisms behind TF-mediated cell fate changes. Our iTF-seq study in mouse embryonic stem cells identified multiple TFs that trigger rapid transcriptome changes indicative of differentiation within a day of induction. Moreover, cells expressing these potent TFs often show a slower cell cycle and increased cell death. Further analysis using bioChIP-seq revealed that GCM1 and OTX2 act as pioneer factors and activators by increasing gene accessibility and activating the expression of lineage specification genes during cell fate conversion. iTF-seq has utility in both mapping cell fate conversion and understanding cell fate conversion mechanisms.
Affiliation(s)
- Muyoung Lee
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Qingqing Guo
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Mijeong Kim
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Joonhyuk Choi
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Alia Segura
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Alper Genceroglu
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Lucy LeBlanc
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Nereida Ramirez
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Yu Jin Jang
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Yeejin Jang
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Bum-Kyu Lee
- Department of Biomedical Sciences, Cancer Research Center, University at Albany, State University of New York, Rensselaer, New York 12144, USA
- Edward M Marcotte
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Jonghwan Kim
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712, USA
|
34
|
Goldberg ME, Noyes MD, Eichler EE, Quinlan AR, Harris K. Effects of parental age and polymer composition on short tandem repeat de novo mutation rates. Genetics 2024; 226:iyae013. [PMID: 38298127 PMCID: PMC10990422 DOI: 10.1093/genetics/iyae013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 08/11/2023] [Accepted: 01/05/2024] [Indexed: 02/02/2024] Open
Abstract
Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than polymerase slippage in replicating progenitor cells. These results echo the recent finding that DNA damage in oocytes is a significant source of de novo single nucleotide variants and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to known hotspots of oocyte mutagenesis, nor are postzygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on de novo mutation (DNM) rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at G/C-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and contradict prior attribution of replication slippage as the primary mechanism of STR mutagenesis.
Affiliation(s)
- Michael E Goldberg
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Departments of Human Genetics and Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, USA
- Michelle D Noyes
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Aaron R Quinlan
- Departments of Human Genetics and Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, USA
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Computational Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
|
35
|
Zhuang X, Ye R, Zhou Y, Cheng MY, Cui H, Wang L, Zhang S, Wang S, Cui Y, Zhang W. Leveraging new methods for comprehensive characterization of mitochondrial DNA in esophageal squamous cell carcinoma. Genome Med 2024; 16:50. [PMID: 38566210 PMCID: PMC10985887 DOI: 10.1186/s13073-024-01319-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 03/21/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND Mitochondria play essential roles in tumorigenesis; however, little is known about the contribution of mitochondrial DNA (mtDNA) to esophageal squamous cell carcinoma (ESCC). Whole-genome sequencing (WGS) is by far the most efficient technology to fully characterize the molecular features of mtDNA; however, due to the high redundancy and heterogeneity of mtDNA in regular WGS data, methods for mtDNA analysis are far from satisfactory. METHODS Here, we developed a likelihood-based method dMTLV to identify low-heteroplasmic mtDNA variants. In addition, we described fNUMT, which can simultaneously detect non-reference nuclear sequences of mitochondrial origin (non-ref NUMTs) and their derived artifacts. Using these new methods, we explored the contribution of mtDNA to ESCC utilizing the multi-omics data of 663 paired tumor-normal samples. RESULTS dMTLV outperformed the existing methods in sensitivity without sacrificing specificity. The verification using Nanopore long-read sequencing data showed that fNUMT has superior specificity and more accurate breakpoint identification than the current methods. Leveraging the new method, we identified a significant association between the ESCC overall survival and the ratio of mtDNA copy number of paired tumor-normal samples, which could be potentially explained by the differential expression of genes enriched in pathways related to metabolism, DNA damage repair, and cell cycle checkpoint. Additionally, we observed that the expression of CBWD1 was downregulated by the non-ref NUMTs inserted into its intron region, which might provide precursor conditions for the tumor cells to adapt to a hypoxic environment. Moreover, we identified a strong positive relationship between the number of mtDNA truncating mutations and the contribution of signatures linked to tumorigenesis and treatment response. 
CONCLUSIONS Our new frameworks promote the characterization of mtDNA features, which enables the elucidation of the landscapes and roles of mtDNA in ESCC essential for extending the current understanding of ESCC etiology. dMTLV and fNUMT are freely available from https://github.com/sunnyzxh/dMTLV and https://github.com/sunnyzxh/fNUMT , respectively.
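The survival association above rests on a tumor-to-normal ratio of mtDNA copy number. The abstract does not say how the authors computed copy number, so the following is only a minimal sketch of a commonly used depth-based estimate (copies per cell ≈ ploidy × mitochondrial depth / autosomal depth). The function names, the default ploidy of 2, and the example depths are illustrative assumptions, and tumor purity and aneuploidy are ignored.

```python
def mtdna_copy_number(mt_depth: float, autosomal_depth: float, ploidy: float = 2.0) -> float:
    """Estimate mtDNA copies per cell from mean WGS depths.

    Assumption (not from the abstract): copies ~= ploidy * mt_depth / autosomal_depth.
    """
    if autosomal_depth <= 0:
        raise ValueError("autosomal_depth must be positive")
    return ploidy * mt_depth / autosomal_depth


def tumor_normal_ratio(tumor_cn: float, normal_cn: float) -> float:
    """Ratio of tumor to matched-normal mtDNA copy number."""
    if normal_cn <= 0:
        raise ValueError("normal_cn must be positive")
    return tumor_cn / normal_cn


# Hypothetical depths for one paired sample (illustrative values only).
tumor_cn = mtdna_copy_number(mt_depth=6000.0, autosomal_depth=30.0)
normal_cn = mtdna_copy_number(mt_depth=3000.0, autosomal_depth=30.0)
print(tumor_normal_ratio(tumor_cn, normal_cn))
```

A ratio above 1 would indicate more mtDNA copies per cell in the tumor than in the matched normal under these simplifying assumptions.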
Affiliation(s)
- Xuehan Zhuang
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Rui Ye
- Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Yong Zhou
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Matthew Yibo Cheng
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Heyang Cui
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Longlong Wang
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Shuangping Zhang
- The Department of Thoracic Surgery, Shanxi Cancer Hospital; Key Laboratory of Cellular Physiology of the Ministry of Education, Department of Pathology, Shanxi Medical University, Taiyuan, Shanxi, 030001, China
- Shubin Wang
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China
- Yongping Cui
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China.
- The Department of Thoracic Surgery, Shanxi Cancer Hospital; Key Laboratory of Cellular Physiology of the Ministry of Education, Department of Pathology, Shanxi Medical University, Taiyuan, Shanxi, 030001, China.
- Weimin Zhang
- Cancer Institute, Department of Oncology, Peking University Shenzhen Hospital, Shenzhen Peking University-the Hong Kong University of Science and Technology (PKU-HKUST) Medical Center; Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518000, China.
- State Key Laboratory of Molecular Oncology, Beijing Key Laboratory of Carcinogenesis and Translational Research, Laboratory of Molecular Oncology, Peking University Cancer Hospital & Institute; Research Unit of Molecular Cancer Research, Chinese Academy of Medical Sciences, Beijing, 100142, China.
36
Underhill HR, Karsy M, Davidson CJ, Hellwig S, Stevenson S, Goold EA, Vincenti S, Sellers DL, Dean C, Harrison BE, Bronner MP, Colman H, Jensen RL. Subclonal Cancer Driver Mutations Are Prevalent in the Unresected Peritumoral Edema of Adult Diffuse Gliomas. Cancer Res 2024; 84:1149-1164. [PMID: 38270917 PMCID: PMC10982644 DOI: 10.1158/0008-5472.can-23-2557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 11/20/2023] [Accepted: 01/23/2024] [Indexed: 01/26/2024]
Abstract
Adult diffuse gliomas commonly recur regardless of therapy. As recurrence typically arises from the peritumoral edema adjacent to the resected bulk tumor, the profiling of somatic mutations from infiltrative malignant cells within this critical, unresected region could provide important insights into residual disease. A key obstacle has been the inability to distinguish between next-generation sequencing (NGS) noise and the true but weak signal from tumor cells hidden among the noncancerous brain tissue of the peritumoral edema. Here, we developed and validated True2 sequencing to reduce NGS-associated errors to <1 false positive/100 kb panel positions while detecting 97.6% of somatic mutations with an allele frequency ≥0.1%. True2 was then used to study the tumor and peritumoral edema of 22 adult diffuse gliomas including glioblastoma, astrocytoma, oligodendroglioma, and NF1-related low-grade neuroglioma. The tumor and peritumoral edema displayed a similar mutation burden, indicating that surgery debulks these cancers physically but not molecularly. Moreover, variants in the peritumoral edema included unique cancer driver mutations absent in the bulk tumor. Finally, analysis of multiple samples from each patient revealed multiple subclones with unique mutations in the same gene in 17 of 22 patients, supporting the occurrence of convergent evolution in response to patient-specific selective pressures in the tumor microenvironment that may form the molecular foundation of recurrent disease. Collectively, True2 enables the detection of ultralow frequency mutations during molecular analyses of adult diffuse gliomas, which is necessary to understand cancer evolution, recurrence, and individual response to therapy. 
SIGNIFICANCE True2 is a next-generation sequencing workflow that facilitates unbiased discovery of somatic mutations across the full range of variant allele frequencies, which could help identify residual disease vulnerabilities for targeted adjuvant therapies.
Affiliation(s)
- Hunter R. Underhill
- Department of Pediatrics, Division of Medical Genetics, University of Utah, Salt Lake City, Utah
- Department of Radiology, University of Utah, Salt Lake City, Utah
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah
- Michael Karsy
- Department of Neurological Surgery, University of Utah, Salt Lake City, Utah
- Samuel Stevenson
- Department of Pediatrics, Division of Medical Genetics, University of Utah, Salt Lake City, Utah
- Eric A. Goold
- Department of Pathology, University of Utah, Salt Lake City, Utah
- Drew L. Sellers
- Department of Bioengineering, University of Washington, Seattle, Washington
- Charlie Dean
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah
- Brion E. Harrison
- Department of Pediatrics, Division of Medical Genetics, University of Utah, Salt Lake City, Utah
- Mary P. Bronner
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah
- Department of Pathology, University of Utah, Salt Lake City, Utah
- Howard Colman
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah
- Department of Neurological Surgery, University of Utah, Salt Lake City, Utah
- Department of Internal Medicine, Division of Oncology, University of Utah, Salt Lake City, Utah
- Randy L. Jensen
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah
- Department of Neurological Surgery, University of Utah, Salt Lake City, Utah
37
Carter KA, France MT, Rutt L, Bilski L, Martinez-Greiwe S, Regan M, Brotman RM, Ravel J. Sexual transmission of urogenital bacteria: whole metagenome sequencing evidence from a sexual network study. mSphere 2024; 9:e0003024. [PMID: 38358269 PMCID: PMC10964427 DOI: 10.1128/msphere.00030-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 01/21/2024] [Indexed: 02/16/2024] Open
Abstract
Sexual transmission of the urogenital microbiota may contribute to adverse sexual and reproductive health outcomes. The extent of sexual transmission of the urogenital microbiota is unclear as prior studies largely investigated specific pathogens. We used epidemiologic data and whole metagenome sequencing to characterize urogenital microbiota strain concordance between participants of a sexual network study. Individuals who screened positive for genital Chlamydia trachomatis were enrolled and referred their sexual contacts from the prior 60-180 days. Snowball recruitment of sexual contacts continued for up to four waves. Vaginal swabs and penile urethral swabs were collected for whole metagenome sequencing. We evaluated bacterial strain concordance using inStrain and network analysis. We defined concordance as ≥99.99% average nucleotide identity over ≥50% shared coverage; we defined putative sexual transmission as concordance between sexual contacts with <5 single-nucleotide polymorphisms per megabase. Of 138 participants, 74 (54%) were female; 120 (87%) had genital chlamydia; and 43 (31%) were recruited contacts. We identified 115 strain-concordance events among 54 participants representing 25 bacterial species. Seven events (6%) were between sexual contacts including putative heterosexual transmission of Fannyhessea vaginae, Gardnerella leopoldii, Prevotella amnii, Sneathia sanguinegens, and Sneathia vaginalis (one strain each), and putative sexual transmission of Lactobacillus iners between female contacts. 
Most concordance events (108, 94%) were between non-contacts, including eight female participants connected through 18 Lactobacillus crispatus and 3 Lactobacillus jensenii concordant strains, and 14 female and 2 male participants densely interconnected through 52 Gardnerella swidsinskii concordance events.
IMPORTANCE Epidemiologic evidence consistently indicates bacterial vaginosis (BV) is sexually associated and may be sexually transmitted, though sexual transmission remains subject to debate. This study is not capable of demonstrating BV sexual transmission; however, we do provide strain-level metagenomic evidence that strongly supports heterosexual transmission of BV-associated species. These findings strengthen the evidence base that supports ongoing investigations of concurrent male partner treatment for reducing BV recurrence. Our data suggest that measuring the impact of male partner treatment on F. vaginae, G. leopoldii, P. amnii, S. sanguinegens, and S. vaginalis may provide insight into why a regimen does or does not perform well. We also observed a high degree of strain concordance between non-sexual-contact female participants. We posit that this may reflect limited dispersal capacity of vaginal bacteria coupled with individuals' comembership in regional transmission networks where transmission may occur between parent and child at birth, cohabiting individuals, or sexual contacts.
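The two definitions given in this abstract (strain concordance and putative sexual transmission) can be written as a small classifier. This is a minimal sketch and not the authors' pipeline: the `StrainComparison` record and its field names are hypothetical stand-ins for the per-pair values a strain-comparison tool would report, and only the numeric thresholds are taken from the abstract.

```python
from dataclasses import dataclass

# Thresholds as stated in the abstract.
ANI_MIN = 0.9999          # >= 99.99% average nucleotide identity
COVERAGE_MIN = 0.50       # >= 50% shared coverage
SNPS_PER_MB_MAX = 5.0     # < 5 SNPs per megabase


@dataclass
class StrainComparison:
    """One pairwise strain comparison (hypothetical record layout)."""
    ani: float               # average nucleotide identity as a fraction
    shared_coverage: float   # fraction of the genome covered in both samples
    snps_per_mb: float       # single-nucleotide polymorphisms per megabase
    sexual_contacts: bool    # whether the two participants reported contact


def is_concordant(c: StrainComparison) -> bool:
    """Concordance: high identity over enough shared coverage."""
    return c.ani >= ANI_MIN and c.shared_coverage >= COVERAGE_MIN


def is_putative_sexual_transmission(c: StrainComparison) -> bool:
    """Concordant pair between sexual contacts with very few SNPs."""
    return is_concordant(c) and c.sexual_contacts and c.snps_per_mb < SNPS_PER_MB_MAX


pair = StrainComparison(ani=0.99995, shared_coverage=0.8, snps_per_mb=2.0, sexual_contacts=True)
print(is_concordant(pair), is_putative_sexual_transmission(pair))
```

Under these rules a concordant pair between non-contacts is still counted as a concordance event but never as putative sexual transmission, which matches how the abstract separates the 7 contact events from the 108 non-contact events.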
Affiliation(s)
- Kayla A. Carter
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Michael T. France
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Lindsay Rutt
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Lisa Bilski
- School of Nursing, University of Maryland, Baltimore, Maryland, USA
- Mary Regan
- School of Nursing, University of Maryland, Baltimore, Maryland, USA
- Rebecca M. Brotman
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Jacques Ravel
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
38
Zhang J, Hou C, Liu C. CRISPR-powered quantitative keyword search engine in DNA data storage. Nat Commun 2024; 15:2376. [PMID: 38491032 PMCID: PMC10943086 DOI: 10.1038/s41467-024-46767-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 03/08/2024] [Indexed: 03/18/2024] Open
Abstract
Despite the growing interest of archiving information in synthetic DNA to confront data explosion, quantitatively querying the data stored in DNA is still a challenge. Herein, we present Search Enabled by Enzymatic Keyword Recognition (SEEKER), which utilizes CRISPR-Cas12a to rapidly generate visible fluorescence when a DNA target corresponding to the keyword of interest is present. SEEKER achieves quantitative text searching since the growth rate of fluorescence intensity is proportional to keyword frequency. Compatible with SEEKER, we develop non-collision grouping coding, which reduces the size of dictionary and enables lossless compression without disrupting the original order of texts. Using four queries, we correctly identify keywords in 40 files with a background of ~8000 irrelevant terms. Parallel searching with SEEKER can be performed on a 3D-printed microfluidic chip. Overall, SEEKER provides a quantitative approach to conducting parallel searching over the complete content stored in DNA with simple implementation and rapid result generation.
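The abstract states that the growth rate of fluorescence intensity is proportional to keyword frequency. One way to read that quantitatively is sketched below: fit a straight line to a fluorescence time series, take the slope as the growth rate, and scale it against a calibration sample of known frequency. The least-squares fit, the calibration step, and all names and numbers here are illustrative assumptions, not the published SEEKER procedure.

```python
from typing import Sequence


def fluorescence_growth_rate(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Least-squares slope of intensity versus time."""
    n = len(times)
    if n < 2 or n != len(intensities):
        raise ValueError("need two or more paired time/intensity readings")
    mean_t = sum(times) / n
    mean_f = sum(intensities) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times, intensities))
    return sxy / sxx


def estimate_keyword_frequency(rate: float, reference_rate: float, reference_frequency: float) -> float:
    """Scale a growth rate by a calibration sample (assumes strict proportionality)."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return reference_frequency * rate / reference_rate


# Hypothetical readings: a query sample and a calibration sample with 3 known occurrences.
query_rate = fluorescence_growth_rate([0, 1, 2, 3], [10, 12, 14, 16])
reference_rate = fluorescence_growth_rate([0, 1, 2, 3], [10, 11, 12, 13])
print(estimate_keyword_frequency(query_rate, reference_rate, reference_frequency=3))
```

In practice the linear range of the reaction and background fluorescence would limit where such a proportional reading holds.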
Affiliation(s)
- Jiongyu Zhang
- Department of Biomedical Engineering, University of Connecticut Health Center, Farmington, CT, 06030, USA
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT, 06269, USA
- Chengyu Hou
- Department of Biomedical Engineering, University of Connecticut Health Center, Farmington, CT, 06030, USA
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT, 06269, USA
- Changchun Liu
- Department of Biomedical Engineering, University of Connecticut Health Center, Farmington, CT, 06030, USA.
39
Colson P, Delerce J, Pontarotti P, Devaux C, La Scola B, Fantini J, Raoult D. Resistance-associated mutations to the anti-SARS-CoV-2 agent nirmatrelvir: Selection not induction. J Med Virol 2024; 96:e29462. [PMID: 38363015 DOI: 10.1002/jmv.29462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 01/21/2024] [Accepted: 01/27/2024] [Indexed: 02/17/2024]
Abstract
Mutations associated with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) resistance to antiprotease nirmatrelvir were reported. We aimed to detect them in SARS-CoV-2 genomes and quasispecies retrieved in our institute before drug availability in January 2022 and to analyze the impact of mutations on protease (3CLpro) structure. We sought for 38 3CLpro nirmatrelvir resistance mutations in a set of 62 673 SARS-CoV-2 genomes obtained in our institute from respiratory samples collected between 2020 and 2023 and for these mutations in SARS-CoV-2 quasispecies for 90 samples collected in 2020, using Python. SARS-CoV-2 protease with major mutation E166V was generated with Swiss Pdb Viewer and Molegro Molecular Viewer. We detected 22 (58%) of the resistance-associated mutations in 417 (0.67%) of the genomes analyzed; 325 (78%) of these genomes had been obtained from samples collected in 2020-2021. APOBEC signatures were found for 12/22 mutations. We also detected among viral quasispecies from 90 samples some minority reads harboring any of 15 nirmatrelvir resistance mutations, including E166V. Also, we predicted that E166V has a very limited effect on 3CLpro structure but may prevent drug attachment. Thus, we evidenced that mutations associated with nirmatrelvir resistance pre-existed in SARS-CoV-2 before drug availability. These findings further warrant SARS-CoV-2 genomic surveillance and SARS-CoV-2 quasispecies characterization.
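The abstract says only that the mutation search was done "using Python". As a rough illustration of what such a screen involves, the sketch below checks a translated protein sequence for a list of amino-acid substitutions written in the usual `E166V` style (reference residue, 1-based position, variant residue). The function names and the toy sequence are hypothetical, and this is not the authors' script.

```python
import re
from typing import List, Tuple


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'E166V' into (reference, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def find_substitutions(protein_seq: str, labels: List[str]) -> List[str]:
    """Return the labels whose variant residue is present at the stated position."""
    found = []
    for label in labels:
        _ref, pos, alt = parse_substitution(label)
        # Positions are 1-based; skip positions outside the sequence.
        if 1 <= pos <= len(protein_seq) and protein_seq[pos - 1] == alt:
            found.append(label)
    return found


# Toy 6-residue sequence (not the real 3CLpro), carrying V at position 4.
toy_sequence = "MKTVLA"
print(find_substitutions(toy_sequence, ["E4V", "T3A"]))
```

Applied to quasispecies data, the same check would run per read or per translated read fragment rather than per consensus genome, which is how minority variants such as those reported in the abstract would surface.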
Affiliation(s)
- Philippe Colson
- IHU Méditerranée Infection, 19-21 boulevard Jean Moulin, Marseille, France
- Aix-Marseille Univ., Institut de Recherche pour le Développement (IRD), Microbes Evolution Phylogeny and Infections (MEPHI), 27 boulevard Jean Moulin, Marseille, France
- Assistance Publique-Hôpitaux de Marseille (AP-HM), Marseille, France
- Jérémy Delerce
- IHU Méditerranée Infection, 19-21 boulevard Jean Moulin, Marseille, France
- Pierre Pontarotti
- IHU Méditerranée Infection, 19-21 boulevard Jean Moulin, Marseille, France
- Department of Biological Sciences, Centre National de la Recherche Scientifique (CNRS)-SNC5039, Marseille, France
- Bernard La Scola
- IHU Méditerranée Infection, 19-21 boulevard Jean Moulin, Marseille, France
- Aix-Marseille Univ., Institut de Recherche pour le Développement (IRD), Microbes Evolution Phylogeny and Infections (MEPHI), 27 boulevard Jean Moulin, Marseille, France
- Assistance Publique-Hôpitaux de Marseille (AP-HM), Marseille, France
- Jacques Fantini
- INSERM UMR_S 1072, Aix-Marseille Université, Marseille, France
- Didier Raoult
- IHU Méditerranée Infection, 19-21 boulevard Jean Moulin, Marseille, France
- Aix-Marseille Univ., Institut de Recherche pour le Développement (IRD), Microbes Evolution Phylogeny and Infections (MEPHI), 27 boulevard Jean Moulin, Marseille, France
40
Phillips AL, Ferguson S, Burton RA, Watson-Haigh NS. CLAW: An automated Snakemake workflow for the assembly of chloroplast genomes from long-read data. PLoS Comput Biol 2024; 20:e1011870. [PMID: 38335225 PMCID: PMC10883564 DOI: 10.1371/journal.pcbi.1011870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 02/22/2024] [Accepted: 01/29/2024] [Indexed: 02/12/2024] Open
Abstract
Chloroplasts are photosynthetic organelles in algal and plant cells that contain their own genome. Chloroplast genomes are commonly used in evolutionary studies and taxonomic identification and are increasingly becoming a target for crop improvement studies. As DNA sequencing becomes more affordable, researchers are collecting vast swathes of high-quality whole-genome sequence data from laboratory and field settings alike. Whole tissue read libraries sequenced with the primary goal of understanding the nuclear genome will inadvertently contain many reads derived from the chloroplast genome. These whole-genome, whole-tissue read libraries can additionally be used to assemble chloroplast genomes with little to no extra cost. While several tools exist that make use of short-read second generation and third-generation long-read sequencing data for chloroplast genome assembly, these tools may have complex installation steps, inadequate error reporting, poor expandability, and/or lack scalability. Here, we present CLAW (Chloroplast Long-read Assembly Workflow), an easy to install, customise, and use Snakemake tool to assemble chloroplast genomes from chloroplast long-reads found in whole-genome read libraries (https://github.com/aaronphillips7493/CLAW). Using 19 publicly available reference chloroplast genome assemblies and long-read libraries from algal, monocot and eudicot species, we show that CLAW can rapidly produce chloroplast genome assemblies with high similarity to the reference assemblies. CLAW was designed such that users have complete control over parameterisation, allowing individuals to optimise CLAW to their specific use cases. We expect that CLAW will provide researchers (with varying levels of bioinformatics expertise) with an additional resource useful for contributing to the growing number of publicly available chloroplast genome assemblies.
Affiliation(s)
- Aaron L Phillips
- Department of Food Science, University of Adelaide, Adelaide, South Australia, Australia
- Scott Ferguson
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Rachel A Burton
- Department of Food Science, University of Adelaide, Adelaide, South Australia, Australia
- Nathan S Watson-Haigh
- South Australian Genomics Centre (SAGC), SAHMRI, Adelaide, South Australia, Australia
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Victoria, Australia
- Alkahest Inc., San Carlos, California, United States of America
41
Gabernet G, Marquez S, Bjornson R, Peltzer A, Meng H, Aron E, Lee NY, Jensen C, Ladd D, Hanssen F, Heumos S, Yaari G, Kowarik MC, Nahnsen S, Kleinstein SH. nf-core/airrflow: an adaptive immune receptor repertoire analysis workflow employing the Immcantation framework. bioRxiv 2024:2024.01.18.576147. [PMID: 38293151 PMCID: PMC10827190 DOI: 10.1101/2024.01.18.576147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) is a valuable experimental tool to study the immune state in health and following immune challenges such as infectious diseases, (auto)immune diseases, and cancer. Several tools have been developed to reconstruct B cell and T cell receptor sequences from AIRR-seq data and infer B and T cell clonal relationships. However, currently available tools offer limited parallelization across samples, scalability or portability to high-performance computing infrastructures. To address this need, we developed nf-core/airrflow, an end-to-end bulk and single-cell AIRR-seq processing workflow which integrates the Immcantation Framework following BCR and TCR sequencing data analysis best practices. The Immcantation Framework is a comprehensive toolset, which allows the processing of bulk and single-cell AIRR-seq data from raw read processing to clonal inference. nf-core/airrflow is written in Nextflow and is part of the nf-core project, which collects community contributed and curated Nextflow workflows for a wide variety of analysis tasks. We assessed the performance of nf-core/airrflow on simulated sequencing data with sequencing errors and show example results with real datasets. To demonstrate the applicability of nf-core/airrflow to the high-throughput processing of large AIRR-seq datasets, we validated and extended previously reported findings of convergent antibody responses to SARS-CoV-2 by analyzing 97 COVID-19 infected individuals and 99 healthy controls, including a mixture of bulk and single-cell sequencing datasets. Using this dataset, we extended the convergence findings to 20 additional subjects, highlighting the applicability of nf-core/airrflow to validate findings in small in-house cohorts with reanalysis of large publicly available AIRR datasets. nf-core/airrflow is available free of charge, under the MIT license on GitHub (https://github.com/nf-core/airrflow). 
Detailed documentation and example results are available on the nf-core website at (https://nf-co.re/airrflow).
42
Gorecki A, Ostapczuk P, Dziewit L. Diversity of antibiotic resistance gene variants at subsequent stages of the wastewater treatment process revealed by a metagenomic analysis of PCR amplicons. Front Genet 2024; 14:1334646. [PMID: 38274111 PMCID: PMC10808613 DOI: 10.3389/fgene.2023.1334646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024] Open
Abstract
Wastewater treatment plants have been recognised as point sources of various antibiotic-resistant bacteria (ARB) and antibiotic resistance genes (ARG) which are considered recently emerging biological contaminants. So far, culture-based and molecular-based methods have been successfully applied to monitor antimicrobial resistance (AMR) in WWTPs. However, the methods applied do not permit the comprehensive identification of the true diversity of ARGs. In this study we applied next-generation sequencing for a metagenomic analysis of PCR amplicons of ARGs from the subsequent stages of the analysed WWTP. The presence of 14 genes conferring resistance to different antibiotic families was screened by PCR. In the next step, three genes were selected for detailed analysis of changes of the profile of ARG variants along the process. A relative abundance of 79 variants was analysed. The highest diversity was revealed in the ermF gene, with 52 variants. The relative abundance of some variants changed along the purification process, and some ARG variants might be present in novel hosts for which they were currently unassigned. Additionally, we identified a pool of novel ARG variants present in the studied WWTP. Overall, the results obtained indicated that the applied method is sufficient for analysing ARG variant diversity.
Affiliation(s)
- Adrian Gorecki
- Department of Biochemistry and Microbiology, Institute of Biology, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
- Piotr Ostapczuk
- Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw, Poland
- Lukasz Dziewit
- Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw, Poland
43
Rádai Z, Váradi A, Takács P, Nagy NA, Schmitt N, Prépost E, Kardos G, Laczkó L. An overlooked phenomenon: complex interactions of potential error sources on the quality of bacterial de novo genome assemblies. BMC Genomics 2024; 25:45. [PMID: 38195441 PMCID: PMC10777565 DOI: 10.1186/s12864-023-09910-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 12/15/2023] [Indexed: 01/11/2024] Open
Abstract
BACKGROUND Parameters adversely affecting the contiguity and accuracy of the assemblies from Illumina next-generation sequencing (NGS) are well described. However, past studies generally focused on their additive effects, overlooking their potential interactions possibly exacerbating one another's effects in a multiplicative manner. To investigate whether or not they act interactively on de novo genome assembly quality, we simulated sequencing data for 13 bacterial reference genomes, with varying levels of error rate, sequencing depth, PCR and optical duplicate ratios. RESULTS We assessed the quality of assemblies from the simulated sequencing data with a number of contiguity and accuracy metrics, which we used to quantify both additive and multiplicative effects of the four parameters. We found that the tested parameters are engaged in complex interactions, exerting multiplicative, rather than additive, effects on assembly quality. Also, the ratio of non-repeated regions and GC% of the original genomes can shape how the four parameters affect assembly quality. CONCLUSIONS We provide a framework for consideration in future studies using de novo genome assembly of bacterial genomes, e.g. in choosing the optimal sequencing depth, balancing between its positive effect on contiguity and negative effect on accuracy due to its interaction with error rate. Furthermore, the properties of the genomes to be sequenced also should be taken into account, as they might influence the effects of error sources themselves.
Affiliation(s)
- Zoltán Rádai
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary.
- Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany.
- Alex Váradi
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
- Department of Laboratory Medicine, Medical School, University of Pécs, Pécs, Hungary
- Péter Takács
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
- Department of Health Informatics, Institute of Health Sciences, Faculty of Health, University of Debrecen, Debrecen, Hungary
- Nikoletta Andrea Nagy
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
- Department of Evolutionary Zoology, ELKH-DE Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Nicholas Schmitt
- Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany
- Eszter Prépost
- Department of Health Industry, University of Debrecen, Debrecen, Hungary
- Gábor Kardos
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
- Department of Gerontology, Faculty of Health Sciences, University of Debrecen, Debrecen, Hungary
- Levente Laczkó
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
- ELKH-DE Conservation Biology Research Group, Debrecen, Hungary
44
Hall MB, Coin LJM. Pangenome databases improve host removal and mycobacteria classification from clinical metagenomic data. Gigascience 2024; 13:giae010. [PMID: 38573185 PMCID: PMC10993716 DOI: 10.1093/gigascience/giae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 01/10/2024] [Accepted: 02/27/2024] [Indexed: 04/05/2024] Open
Abstract
BACKGROUND Culture-free real-time sequencing of clinical metagenomic samples promises both rapid pathogen detection and antimicrobial resistance profiling. However, this approach introduces the risk of patient DNA leakage. To mitigate this risk, we need near-comprehensive removal of human DNA sequences at the point of sequencing, typically involving the use of resource-constrained devices. Existing benchmarks have largely focused on the use of standardized databases and largely ignored the computational requirements of depletion pipelines as well as the impact of human genome diversity. RESULTS We benchmarked host removal pipelines on simulated and artificial real Illumina and Nanopore metagenomic samples. We found that construction of a custom kraken database containing diverse human genomes results in the best balance of accuracy and computational resource usage. In addition, we benchmarked pipelines using kraken and minimap2 for taxonomic classification of Mycobacterium reads using standard and custom databases. With a database representative of the Mycobacterium genus, both tools obtained improved specificity and sensitivity, compared to the standard databases for classification of Mycobacterium tuberculosis. Computational efficiency of these custom databases was superior to most standard approaches, allowing them to be executed on a laptop device. CONCLUSIONS Customized pangenome databases provide the best balance of accuracy and computational efficiency when compared to standard databases for the task of human read removal and M. tuberculosis read classification from metagenomic samples. Such databases allow for execution on a laptop, without sacrificing accuracy, an especially important consideration in low-resource settings. We make all customized databases and pipelines freely available.
Affiliation(s)
- Michael B Hall
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, 3000 Victoria, Australia
- Lachlan J M Coin
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, 3000 Victoria, Australia
|
45
|
Ng JK, Turner TN. HAT: de novo variant calling for highly accurate short-read and long-read sequencing data. Bioinformatics 2024; 40:btad775. [PMID: 38175776 PMCID: PMC10777354 DOI: 10.1093/bioinformatics/btad775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 12/05/2023] [Indexed: 01/06/2024] Open
Abstract
MOTIVATION: De novo variants (DNVs) are variants that are present in offspring but not in their parents. DNVs are important both for examining mutation rates and for identifying disease-related variation. Despite prior efforts, calling DNVs from parent-child sequenced trio data remains challenging. We developed Hare And Tortoise (HAT) as an automated DNV detection workflow for highly accurate short-read and long-read sequencing data. Reliable detection of DNVs is important for human genomics and HAT addresses this need. RESULTS: HAT is a computational workflow that begins with aligned read data (i.e. CRAM or BAM) from a parent-child sequenced trio and outputs DNVs. HAT detects high-quality DNVs from Illumina short-read whole-exome sequencing, Illumina short-read whole-genome sequencing, and highly accurate PacBio HiFi long-read whole-genome sequencing data. The quality of these DNVs is high based on a series of quality metrics including number of DNVs per individual, percent of DNVs at CpG sites, and percent of DNVs phased to the paternal chromosome of origin. AVAILABILITY AND IMPLEMENTATION: https://github.com/TNTurnerLab/HAT.
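The definition of a DNV given in this abstract (present in the offspring, absent from both parents) can be illustrated with a naive set difference. This is only a sketch of the definition and is not HAT's workflow, which starts from aligned reads and applies quality filtering; the function name, the tuple layout, and the toy variants below are all hypothetical.

```python
def candidate_dnvs(child, father, mother):
    """Return variants seen in the child but in neither parent, sorted.

    Each argument is a set of (chrom, pos, ref, alt) tuples. This layout is
    an assumption made for illustration, not a format defined by HAT.
    """
    return sorted(child - (father | mother))


# Toy trio (made-up variants, for illustration only).
child = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr3", 42, "G", "A")}
father = {("chr1", 1000, "A", "G")}
mother = {("chr2", 500, "C", "T")}

print(candidate_dnvs(child, father, mother))
```

In practice a plain set difference over-calls, because a variant missed in a parent through low coverage looks de novo; that is one reason dedicated workflows add read-level checks.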
Affiliation(s)
- Jeffrey K Ng
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
- Tychele N Turner
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
|
46
|
Muyas F, Rodriguez MJG, Cascão R, Afonso A, Sauer CM, Faria CC, Cortés-Ciriano I, Flores I. The ALT pathway generates telomere fusions that can be detected in the blood of cancer patients. Nat Commun 2024; 15:82. [PMID: 38167290 PMCID: PMC10762111 DOI: 10.1038/s41467-023-44287-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024] Open
Abstract
Telomere fusions (TFs) can trigger the accumulation of oncogenic alterations leading to malignant transformation and drug resistance. Despite their relevance in tumour evolution, our understanding of the patterns and consequences of TFs in human cancers remains limited. Here, we characterize the rates and spectrum of somatic TFs across >30 cancer types using whole-genome sequencing data. TFs are pervasive in human tumours with rates varying markedly across and within cancer types. In addition to end-to-end fusions, we find patterns of TFs that we mechanistically link to the activity of the alternative lengthening of telomeres (ALT) pathway. We show that TFs can be detected in the blood of cancer patients, which enables cancer detection with high specificity and sensitivity even for early-stage tumours and cancers of high unmet clinical need. Overall, we report a genomic footprint that enables characterization of the telomere maintenance mechanism of tumours and liquid biopsy analysis.
Affiliation(s)
- Francesc Muyas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Rita Cascão
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Angela Afonso
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Carolin M Sauer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Claudia C Faria
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Department of Neurosurgery, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte (CHULN), Lisboa, Portugal
- Isidro Cortés-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Ignacio Flores
- Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, 28029, Spain
- Centro de Biologia Molecular Severo Ochoa, CSIC-UAM, Cantoblanco, Madrid, 28049, Spain
|
47
|
Arslan S, Garcia FJ, Guo M, Kellinger MW, Kruglyak S, LeVieux JA, Mah AH, Wang H, Zhao J, Zhou C, Altomare A, Bailey J, Byrne MB, Chang C, Chen SX, Cho B, Dennler CN, Dien VT, Fuller D, Kelley R, Khandan O, Klein MG, Kim M, Lajoie BR, Lin B, Liu Y, Lopez T, Mains PT, Price AD, Robertson SR, Taylor-Weiner H, Tippana R, Tomaney AB, Zhang S, Abtahi M, Ambroso MR, Bajari R, Bellizzi AM, Benitez CB, Berard DR, Berti L, Blease KN, Blum AP, Boddicker AM, Bondar L, Brown C, Bui CA, Calleja-Aguirre J, Cappa K, Chan J, Chang VW, Charov K, Chen X, Constandse RM, Damron W, Dawood M, DeBuono N, Dimalanta JD, Edoli L, Elango K, Faustino N, Feng C, Ferrari M, Frankie K, Fries A, Galloway A, Gavrila V, Gemmen GJ, Ghadiali J, Ghorbani A, Goddard LA, Guetter AR, Hendricks GL, Hentschel J, Honigfort DJ, Hsieh YT, Hwang Fu YH, Im SK, Jin C, Kabu S, Kincade DE, Levy S, Li Y, Liang VK, Light WH, Lipsher JB, Liu TL, Long G, Ma R, Mailloux JM, Mandla KA, Martinez AR, Mass M, McKean DT, Meron M, Miller EA, Moh CS, Moore RK, Moreno J, Neysmith JM, Niman CS, Nunez JM, Ojeda MT, Ortiz SE, Owens J, Piland G, Proctor DJ, Purba JB, Ray M, Rong D, Saade VM, Saha S, Tomas GS, Scheidler N, Sirajudeen LH, Snow S, Stengel G, Stinson R, Stone MJ, Sundseth KJ, Thai E, Thompson CJ, Tjioe M, Trejo CL, Trieger G, Truong DN, Tse B, Voiles B, Vuong H, Wong JC, Wu CT, Yu H, Yu Y, Yu M, Zhang X, Zhao D, Zheng G, He M, Previte M. Sequencing by avidity enables high accuracy with low reagent consumption. Nat Biotechnol 2024; 42:132-138. [PMID: 37231263 DOI: 10.1038/s41587-023-01750-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 03/15/2023] [Indexed: 05/27/2023]
Abstract
We present avidity sequencing, a sequencing chemistry that separately optimizes the processes of stepping along a DNA template and that of identifying each nucleotide within the template. Nucleotide identification uses multivalent nucleotide ligands on dye-labeled cores to form polymerase-polymer-nucleotide complexes bound to clonal copies of DNA targets. These polymer-nucleotide substrates, termed avidites, decrease the required concentration of reporting nucleotides from micromolar to nanomolar and yield negligible dissociation rates. Avidity sequencing achieves high accuracy, with 96.2% and 85.4% of base calls having an average of one error per 1,000 and 10,000 base pairs, respectively. We show that the average error rate of avidity sequencing remained stable following a long homopolymer.
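The two accuracy thresholds quoted in this abstract, one error per 1,000 bases and one error per 10,000 bases, correspond to Phred quality scores of Q30 and Q40 under the standard definition Q = -10 log10(p). The sketch below is not taken from the paper; it only shows the conversion, and the function names are illustrative.

```python
import math


def phred_from_error_prob(p):
    """Convert a per-base error probability p to a Phred score Q = -10*log10(p)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def error_prob_from_phred(q):
    """Convert a Phred score Q back to a per-base error probability 10**(-Q/10)."""
    return 10 ** (-q / 10)


# The two thresholds named in the abstract.
for errors_per_base in (1 / 1_000, 1 / 10_000):
    print(round(phred_from_error_prob(errors_per_base)))
```

Read this way, the abstract reports that 96.2% of base calls reach Q30 and 85.4% reach Q40.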
Affiliation(s)
- Bill Lin
- Element Biosciences, San Diego, CA, USA
- Yu Liu
- Element Biosciences, San Diego, CA, USA
- Su Zhang
- Element Biosciences, San Diego, CA, USA
- Xiyi Chen
- Element Biosciences, San Diego, CA, USA
- Chao Feng
- Element Biosciences, San Diego, CA, USA
- Yu Li
- Element Biosciences, San Diego, CA, USA
- Rui Ma
- Element Biosciences, San Diego, CA, USA
- Max Mass
- Element Biosciences, San Diego, CA, USA
- Ben Tse
- Element Biosciences, San Diego, CA, USA
- Hua Yu
- Element Biosciences, San Diego, CA, USA
- Ming Yu
- Element Biosciences, San Diego, CA, USA
- Xi Zhang
- Element Biosciences, San Diego, CA, USA
- Da Zhao
- Element Biosciences, San Diego, CA, USA
- Molly He
- Element Biosciences, San Diego, CA, USA
|
48
|
Goldberg ME, Noyes MD, Eichler EE, Quinlan AR, Harris K. Effects of parental age and polymer composition on short tandem repeat de novo mutation rates. bioRxiv 2023:2023.12.22.573131. [PMID: 38187618 PMCID: PMC10769404 DOI: 10.1101/2023.12.22.573131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with both paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than the classical mechanism of polymerase slippage in replicating progenitor cells. These results also echo the recent finding that DNA damage in quiescent oocytes is a significant source of de novo SNVs and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to previously discovered hotspots of oocyte mutagenesis, nor are post-zygotic mutations likely to contribute significantly. STR nucleotide composition has divergent effects on de novo mutation (DNM) rates between the sexes. Unlike in the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at GC-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and are especially surprising considering the prior belief in replication slippage as the dominant mechanism of STR mutagenesis.
Affiliation(s)
- Michael E. Goldberg
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Departments of Human Genetics and Biomedical Informatics, University of Utah, 15 S 2030 E, Salt Lake City, UT, 84112
- Michelle D. Noyes
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Howard Hughes Medical Institute, 3720 15 Ave NE, University of Washington, Seattle, WA, 98195
- Aaron R. Quinlan
- Departments of Human Genetics and Biomedical Informatics, University of Utah, 15 S 2030 E, Salt Lake City, UT, 84112
- These authors contributed equally to this work
- Kelley Harris
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Computational Biology Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109
- These authors contributed equally to this work
|
49
|
Counihan KL, Kanrar S, Tilman S, Gehring A. Evaluation of Long-Read Sequencing Simulators to Assess Real-World Applications for Food Safety. Foods 2023; 13:16. [PMID: 38201044 PMCID: PMC10778541 DOI: 10.3390/foods13010016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 12/07/2023] [Accepted: 12/16/2023] [Indexed: 01/12/2024] Open
Abstract
Shiga toxin-producing Escherichia coli (STEC) and Listeria monocytogenes are routinely responsible for severe foodborne illnesses in the United States. Current identification methods used by the U.S. Food Safety and Inspection Service require at least four days to identify STEC and six days for L. monocytogenes. Adoption of long-read, whole genome sequencing for food safety testing could significantly reduce the time needed for identification, but method development costs are high. Therefore, the goal of this project was to use NanoSim-H software to simulate Oxford Nanopore sequencing reads to assess the feasibility of sequencing-based foodborne pathogen detection and guide experimental design. Sequencing reads were simulated for STEC, L. monocytogenes, and a 1:1 combination of STEC and Bos taurus genomes using NanoSim-H. At least 2500 simulated reads were needed to identify the seven genes of interest targeted in STEC, and at least 500 reads were needed to detect the gene targeted in L. monocytogenes. The number of reads needed for 30x genome coverage was estimated at 21,521 for STEC and 11,802 for L. monocytogenes. Approximately 5-6% of reads simulated from both bacteria did not align with their respective reference genomes due to the introduction of errors. For the STEC and B. taurus 1:1 genome mixture, all genes of interest were detected with 1,000,000 reads, but less than 1x coverage was obtained. The results suggested that sample enrichment would be necessary to detect foodborne pathogens with long-read sequencing, but that this would still take less time than current methods. Additionally, simulation data will be useful for reducing the time and expense associated with laboratory experimentation.
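Read counts for a target depth such as the 30x figures above are commonly estimated from the relation coverage ≈ (number of reads × mean read length) / genome length. The sketch below shows that arithmetic only; it is not the procedure used in the study, and the genome size and mean read length in the usage line are placeholder values, not figures from the abstract.

```python
import math


def reads_for_coverage(genome_length_bp, mean_read_length_bp, target_coverage):
    """Estimate how many reads give the target mean depth of coverage."""
    if genome_length_bp <= 0 or mean_read_length_bp <= 0 or target_coverage <= 0:
        raise ValueError("all inputs must be positive")
    return math.ceil(target_coverage * genome_length_bp / mean_read_length_bp)


def coverage_from_reads(n_reads, mean_read_length_bp, genome_length_bp):
    """Mean depth of coverage implied by a read count."""
    return n_reads * mean_read_length_bp / genome_length_bp


# Hypothetical inputs: a 5,000,000 bp genome and an 8,000 bp mean read length.
print(reads_for_coverage(5_000_000, 8_000, 30))
```

The same relation explains the mixture result: when half of the sequenced DNA comes from a host genome that is hundreds of times larger than the bacterial one, only a small fraction of the reads lands on the pathogen, so depth on the target stays low even at high read counts.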
Affiliation(s)
- Katrina L. Counihan
- Eastern Regional Research Center, United States Department of Agriculture, Agricultural Research Service, Wyndmoor, PA 19038, USA; (S.K.); (S.T.); (A.G.)
|
50
|
Gömer A, Klöhn M, Jagst M, Nocke MK, Pischke S, Horvatits T, Schulze zur Wiesch J, Müller T, Hardtke S, Cornberg M, Wedemeyer H, Behrendt P, Steinmann E, Todt D. Emergence of resistance-associated variants during sofosbuvir treatment in chronically infected hepatitis E patients. Hepatology 2023; 78:1882-1895. [PMID: 37334496 PMCID: PMC10653298 DOI: 10.1097/hep.0000000000000514] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/08/2023] [Accepted: 04/21/2023] [Indexed: 06/20/2023]
Abstract
BACKGROUND AND AIMS: Chronic HEV infections remain a serious problem in immunocompromised patients, as specifically approved antiviral drugs are unavailable. In 2020, a 24-week multicenter phase II pilot trial (Trial Number NCT03282474) evaluated the nucleotide analog sofosbuvir in nine chronically HEV-infected patients. During the study, antiviral therapy reduced virus RNA levels initially but did not lead to a sustained virologic response. Here, we characterize the changes in HEV intrahost populations during sofosbuvir treatment to identify the emergence of treatment-associated variants. APPROACH AND RESULTS: We performed high-throughput sequencing on RNA-dependent RNA polymerase sequences to characterize viral population dynamics in study participants. Subsequently, we used an HEV-based reporter replicon system to investigate sofosbuvir sensitivity in high-frequency variants. Most patients had heterogeneous HEV populations, suggesting high adaptability to treatment-related selection pressures. We identified numerous amino acid alterations emerging during treatment and found that the EC50 of patient-derived replicon constructs was up to ~12-fold higher than the wild-type control, suggesting that variants associated with lower drug sensitivity were selected during sofosbuvir treatment. In particular, a single amino acid substitution (A1343V) in the finger domain of ORF1 could reduce susceptibility to sofosbuvir significantly in 8 of 9 patients. CONCLUSIONS: Viral population dynamics played a critical role during antiviral treatment. High population diversity during sofosbuvir treatment led to the selection of variants (especially A1343V) with lower sensitivity to the drug, uncovering a novel mechanism of resistance-associated variants during sofosbuvir treatment.
Affiliation(s)
- André Gömer
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- Mara Klöhn
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- Michelle Jagst
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- Institute of Virology, University of Veterinary Medicine Hannover, Hannover, Germany
- Maximilian K. Nocke
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- Sven Pischke
- Medical Clinic and Polyclinic, University Medical Centre Hamburg Eppendorf, Hamburg, Germany
- German Center for Infection Research (DZIF), Partner Site Hamburg Lübeck-Borstel-Riems, Germany
- Thomas Horvatits
- Medical Clinic and Polyclinic, University Medical Centre Hamburg Eppendorf, Hamburg, Germany
- German Center for Infection Research (DZIF), Partner Site Hamburg Lübeck-Borstel-Riems, Germany
- Gastromedics Health Center, Eisenstadt, Austria
- Julian Schulze zur Wiesch
- Medical Clinic and Polyclinic, University Medical Centre Hamburg Eppendorf, Hamburg, Germany
- German Center for Infection Research (DZIF), Partner Site Hamburg Lübeck-Borstel-Riems, Germany
- Tobias Müller
- Department of Gastroenterology and Hepatology, Charité Campus Virchow-Klinikum (CVK), Berlin, Germany
- Svenja Hardtke
- German Center for Infection Research (DZIF); HepNet Study-House/German Liver Foundation (DLS), Hannover, Germany
- Institute for Infections Research and Vaccine, University Medical Centre Hamburg Eppendorf, Hamburg, Germany
- Markus Cornberg
- German Center for Infection Research (DZIF); HepNet Study-House/German Liver Foundation (DLS), Hannover, Germany
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, Germany
- German Center for Infection Research (DZIF); Partner Site Hannover Braunschweig, Germany
- Center for Individualized Infection Medicine (CiiM), Hannover, Germany
- Heiner Wedemeyer
- German Center for Infection Research (DZIF); HepNet Study-House/German Liver Foundation (DLS), Hannover, Germany
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, Germany
- German Center for Infection Research (DZIF); Partner Site Hannover Braunschweig, Germany
- Patrick Behrendt
- Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, Germany
- German Center for Infection Research (DZIF); Partner Site Hannover Braunschweig, Germany
- Institute of Experimental Virology, TWINCORE Centre for Experimental and Clinical Infection Research, Hannover, Germany
- Eike Steinmann
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- German Centre for Infection Research (DZIF), Bochum, Germany
- Daniel Todt
- Department of Molecular and Medical Virology, Ruhr University Bochum, Bochum, Germany
- European Virus Bioinformatics Center (EVBC), Jena, Germany
|