1
Ferreira MR, Carratto TMT, Frontanilla TS, Bonadio RS, Jain M, de Oliveira SF, Castelli EC, Mendes-Junior CT. Advances in forensic genetics: Exploring the potential of long read sequencing. Forensic Sci Int Genet 2025; 74:103156. [PMID: 39427416 DOI: 10.1016/j.fsigen.2024.103156] [Received: 05/03/2024] [Revised: 10/04/2024] [Accepted: 10/06/2024] [Indexed: 10/22/2024]
Abstract
DNA-based technologies have been used in forensic practice since the mid-1980s. While PCR-based STR genotyping using capillary electrophoresis remains the gold standard for generating DNA profiles in routine casework worldwide, the research community is continually seeking alternative methods capable of providing additional information to enhance discrimination power or contribute new investigative leads. Oxford Nanopore Technologies (ONT) and PacBio third-generation sequencing have revolutionized the field, offering real-time capabilities, single-molecule resolution, and long-read sequencing (LRS). ONT, the pioneer of nanopore sequencing, uses biological nanopores to analyze nucleic acids in real time. Its devices have revolutionized sequencing and may represent an interesting alternative for forensic research and routine casework, given that they offer unparalleled flexibility in a portable size: they enable sequencing approaches that range widely from PCR-amplified short target regions (e.g., CODIS STRs) to PCR-free whole transcriptome or even ultra-long whole genome sequencing. Despite its higher error rate compared to Illumina sequencing, ONT sequencing can significantly improve accuracy in read alignment against a reference genome or de novo genome assembly. This is achieved by generating long contiguous sequences that correctly assemble repetitive sections and regions with structural variation. Moreover, it allows real-time determination of DNA methylation status from native DNA without the need for bisulfite conversion. LRS enables the analysis of thousands of markers at once, providing phasing information and eliminating the need for multiple assays. This maximizes the information retrieved from a single invaluable sample. In this review, we explore the potential use of LRS in different forensic genetics approaches.
Affiliation(s)
- Marcel Rodrigues Ferreira
- Molecular Genetics and Bioinformatics Laboratory, Experimental Research Unit - Unipex, School of Medicine, São Paulo State University - Unesp, Botucatu, São Paulo, Brazil
- Thássia Mayra Telles Carratto
- Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP 14040-901, Brazil
- Tamara Soledad Frontanilla
- Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP 14049-900, Brazil
- Raphael Severino Bonadio
- Depto Genética e Morfologia, Instituto de Ciências Biológicas, Universidade de Brasília, Brasília, DF, Brazil
- Miten Jain
- Department of Bioengineering, Department of Physics, Khoury College of Computer Sciences, Northeastern University, Boston, MA, United States
- Erick C Castelli
- Molecular Genetics and Bioinformatics Laboratory, Experimental Research Unit - Unipex, School of Medicine, São Paulo State University - Unesp, Botucatu, São Paulo, Brazil; Pathology Department, School of Medicine, São Paulo State University - Unesp, Botucatu, São Paulo, Brazil
- Celso Teixeira Mendes-Junior
- Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP 14040-901, Brazil
2
Udine E, Finch NA, DeJesus-Hernandez M, Jackson JL, Baker MC, Saravanaperumal SA, Wieben E, Ebbert MTW, Shah J, Petrucelli L, Rademakers R, Oskarsson B, van Blitterswijk M. Targeted long-read sequencing to quantify methylation of the C9orf72 repeat expansion. Mol Neurodegener 2024; 19:99. [PMID: 39709476 DOI: 10.1186/s13024-024-00790-0] [Received: 09/13/2024] [Accepted: 12/12/2024] [Indexed: 12/23/2024] [Open Access]
Abstract
BACKGROUND The gene C9orf72 harbors a non-coding hexanucleotide repeat expansion known to cause amyotrophic lateral sclerosis and frontotemporal dementia. While previous studies have estimated the length of this repeat expansion in multiple tissues, technological limitations have impeded researchers from exploring additional features, such as methylation levels. METHODS We aimed to characterize C9orf72 repeat expansions using a targeted, amplification-free long-read sequencing method. Our primary goal was to determine the presence and subsequent quantification of observed methylation in the C9orf72 repeat expansion. In addition, we measured the repeat length and purity of the expansion. To do this, we sequenced DNA extracted from blood for 27 individuals with an expanded C9orf72 repeat. RESULTS For these individuals, we obtained a total of 7,765 on-target reads, including 1,612 fully covering the expanded allele. Our in-depth analysis revealed that the expansion itself is methylated, with great variability in total methylation levels observed, as represented by the proportion of methylated CpGs (13 to 66%). Interestingly, we demonstrated that the expanded allele is more highly methylated than the wild-type allele (P-Value = 2.76E-05) and that increased methylation levels are observed in longer repeat expansions (P-Value = 1.18E-04). Furthermore, methylation levels correlate with age at collection (P-Value = 3.25E-04) as well as age at disease onset (P-Value = 0.020). Additionally, we detected repeat lengths up to 4,088 repeats (~ 25 kb) and found that the expansion contains few interruptions in the blood. CONCLUSIONS Taken together, our study demonstrates robust ability to quantify methylation of the expanded C9orf72 repeat, capturing differences between individuals harboring this expansion and revealing clinical associations.
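Two of the quantities reported in this abstract, the span of the expansion in base pairs and the proportion of methylated CpGs, come down to simple arithmetic. The Python sketch below is illustrative only: the function names, the 0.5 calling threshold, and the toy probabilities are assumptions and are not taken from the study's pipeline.

```python
HEXANUCLEOTIDE_UNIT_BP = 6  # the C9orf72 repeat unit is six bases long


def repeat_span_bp(repeat_count):
    """Length in base pairs of an uninterrupted run of hexanucleotide repeats."""
    return repeat_count * HEXANUCLEOTIDE_UNIT_BP


def methylated_fraction(cpg_probabilities, threshold=0.5):
    """Fraction of CpG sites called methylated at an assumed probability threshold."""
    if not cpg_probabilities:
        raise ValueError("at least one CpG probability is required")
    called = sum(1 for p in cpg_probabilities if p >= threshold)
    return called / len(cpg_probabilities)


# 4,088 repeats is the longest expansion reported in the abstract.
print(repeat_span_bp(4088))  # 24528 bp, i.e. roughly 25 kb

# Hypothetical per-CpG methylation probabilities for a single read.
toy_read = [0.9, 0.1, 0.8, 0.2]
print(methylated_fraction(toy_read))  # 0.5
```

The first call reproduces the "~ 25 kb" figure quoted for 4,088 repeats; the second shows how a per-read methylation proportion could be derived once per-site calls are available.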
Affiliation(s)
- Evan Udine
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Mayo Clinic Graduate School of Biomedical Sciences, Mayo Clinic, Jacksonville, FL, USA
- NiCole A Finch
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Jazmyne L Jackson
- Fels Cancer Institute for Personalized Medicine, Temple University, Lewis Katz School of Medicine, Philadelphia, PA, USA
- Matthew C Baker
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Eric Wieben
- Genome Analysis Core, Mayo Clinic, Rochester, MN, USA
- Mark T W Ebbert
- Department of Neuroscience, University of Kentucky Sanders-Brown Center on Aging, Lexington, KY, USA
- Jaimin Shah
- Department of Neurology, Mayo Clinic, Jacksonville, FL, USA
- Leonard Petrucelli
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Mayo Clinic Graduate School of Biomedical Sciences, Mayo Clinic, Jacksonville, FL, USA
- Rosa Rademakers
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- VIB Center for Molecular Neurology, Antwerp, Belgium
- Department of Biomedical Science, University of Antwerp, Antwerp, Belgium
- Marka van Blitterswijk
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
- Mayo Clinic Graduate School of Biomedical Sciences, Mayo Clinic, Jacksonville, FL, USA
3
Sahin H, Salehi R, Islam S, Müller M, Giehr P, Carell T. Robust Bisulfite-Free Single-Molecule Real-Time Sequencing of Methyldeoxycytidine Based on a Novel hpTet3 Enzyme. Angew Chem Int Ed Engl 2024; 63:e202418500. [PMID: 39535873 DOI: 10.1002/anie.202418500] [Received: 09/25/2024] [Revised: 10/31/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024]
Abstract
In addition to the four canonical nucleosides dA, dG, dC and T, genomic DNA contains the additional base 5-methyldeoxycytidine (mdC). The presence of this methylated cytidine nucleoside in promoter regions or gene bodies significantly affects the transcriptional activity of the corresponding gene. Consequently, the methylation patterns of genes are crucial for either silencing or activating genes. Sequencing the positions of mdC in the genome is therefore of paramount importance for early cancer diagnostics as it helps determine incorrect gene expression. Currently, the bisulfite method is the gold standard for mdC-sequencing. However, this method has the drawback that the majority of the input DNA is degraded during the bisulfite treatment. Additionally, bisulfite sequencing is prone to errors. Here, we report a benign, bisulfite-free mdC sequencing method termed EMox-seq, which is based on third-generation single-molecule SMRT sequencing. The foundation of this technology is a new Tet3 enzyme that efficiently oxidizes mdCs to 5-carboxycytidine (cadC). In turn, cadC provides an excellent readout by SMRT sequencing using specially trained AI-based algorithms.
Affiliation(s)
- Hanife Sahin
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Raheleh Salehi
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Shariful Islam
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Markus Müller
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Pascal Giehr
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Thomas Carell
- Center for Nucleic Acid Therapies at the Department of Chemistry, Institute for Chemical Epigenetics, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
4
Shen Q, Zhang X, Qi H, Tang Q, Sheng Q, Yi S. Chromosome-level genome assembly of the butterfly hillstream loach Beaufortia pingi. Sci Data 2024; 11:1260. [PMID: 39567629 PMCID: PMC11579477 DOI: 10.1038/s41597-024-04144-9] [Received: 01/31/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024] [Open Access]
Abstract
The Butterfly hillstream loach (Beaufortia pingi), an aquatic benthic fish species inhabiting mountain rapids, exhibits exceptional capabilities in movement, adsorption, and desorption processes, enabling it to adhere to smooth and contaminated surfaces in turbulent streams. These attributes make it a significant subject for genetic and evolutionary research. In this study, the genomic sequences of this species were acquired utilizing PacBio sequencing and Hi-C methods. The genome assembly is 459.8 Mb in size with a contig N50 of 5.35 Mb, and the assembled contigs were anchored into 25 chromosomes. BUSCO analysis confirmed a high completeness level with 97.0% gene coverage. A total of 111.47 Mb repetitive sequences (24.25% of the assembled genome), and 22,906 protein-coding genes were identified in the genome. This study represents the first investigation of the species' genome. The establishment of this genome assembly provides valuable resources for future genetic research and facilitates the study of genetic changes during evolution.
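Contig N50, the contiguity statistic quoted in this abstract, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal Python sketch of that definition follows; the function name and the toy contig lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), total 24, half 12.
toy_contigs = [2, 2, 3, 4, 5, 8]
print(n50(toy_contigs))  # 5, because 8 + 5 = 13 is the first sum to reach 12
```

A larger N50 for the same total assembly size indicates that more of the genome sits in long contigs.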
Affiliation(s)
- Qi Shen
- School of Life Sciences, Huzhou University, Huzhou, 313000, China
- Xinhui Zhang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, Shenzhen, 518081, China
- Hangyu Qi
- School of Life Sciences, Huzhou University, Huzhou, 313000, China
- Qiongying Tang
- School of Life Sciences, Huzhou University, Huzhou, 313000, China
- Qiang Sheng
- School of Life Sciences, Huzhou University, Huzhou, 313000, China
- Shaokui Yi
- School of Life Sciences, Huzhou University, Huzhou, 313000, China
5
Liu X, Ni Y, Ye L, Guo Z, Tan L, Li J, Yang M, Chen S, Li R. Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications. Genome Res 2024; 34:2025-2038. [PMID: 39358016 DOI: 10.1101/gr.279012.124] [Received: 01/30/2024] [Accepted: 09/25/2024] [Indexed: 10/04/2024]
Abstract
DNA modifications in bacteria present diverse types and distributions, playing crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modification induces errors in nanopore reads and that these errors are found in only one strand, showing a strand-specific bias. Leveraging this discovery, we developed Hammerhead, a pioneering pipeline designed for de novo methylation discovery that circumvents the necessity of raw signal inference and a methylation-free control. The majority (14 out of 16) of the identified motifs can be validated by raw signal comparison methods or by identifying corresponding methyltransferases in bacteria. Additionally, we included a novel polishing strategy employing duplex reads to correct modification-induced errors in bacterial genome assemblies, achieving a reduction of over 85% in such errors. In summary, Hammerhead enables users to effectively locate bacterial DNA methylation sites from nanopore FASTQ/FASTA reads, and thus holds promise as a routine pipeline for a wide range of nanopore sequencing applications, such as genome assembly, metagenomic binning, decontaminating eukaryotic genome assemblies, and functional analysis for DNA modifications.
Affiliation(s)
- Xudong Liu
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Ying Ni
- Department of Biomedical Sciences, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Department of Precision Diagnostic and Therapeutic Technology, City University of Hong Kong Shenzhen Futian Research Institute, Shenzhen 518000, China
- Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong 999077, China
- Lianwei Ye
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Zhihao Guo
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Lu Tan
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Jun Li
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Mengsu Yang
- Department of Biomedical Sciences, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Department of Precision Diagnostic and Therapeutic Technology, City University of Hong Kong Shenzhen Futian Research Institute, Shenzhen 518000, China
- Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong 999077, China
- Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute of City University of Hong Kong, Shenzhen 518000, China
- Sheng Chen
- State Key Lab of Chemical Biology and Drug Discovery, Department of Food Science and Nutrition, The Hong Kong Polytechnic University, Hong Kong 999077, China
- Runsheng Li
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong 999077, China
- Department of Precision Diagnostic and Therapeutic Technology, City University of Hong Kong Shenzhen Futian Research Institute, Shenzhen 518000, China
- Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong 999077, China
6
Theme 2 Genetics and Genomics. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:105-121. [PMID: 39508667 DOI: 10.1080/21678421.2024.2403299] [Indexed: 11/15/2024]
7
Garg V, Bohra A, Mascher M, Spannagl M, Xu X, Bevan MW, Bennetzen JL, Varshney RK. Unlocking plant genetics with telomere-to-telomere genome assemblies. Nat Genet 2024; 56:1788-1799. [PMID: 39048791 DOI: 10.1038/s41588-024-01830-7] [Received: 02/18/2024] [Accepted: 06/12/2024] [Indexed: 07/27/2024]
Abstract
Contiguous genome sequence assemblies will help us to realize the full potential of crop translational genomics. Recent advances in sequencing technologies, especially long-read sequencing strategies, have made it possible to construct gapless telomere-to-telomere (T2T) assemblies, thus offering novel insights into genome organization and function. Plant genomes pose unique challenges, such as a continuum of ancient to recent polyploidy and abundant highly similar and long repetitive elements. Owing to progress in sequencing approaches, for most crop plants, chromosome-scale reference genome assemblies are available, but T2T assembly construction remains challenging. Here we describe methods for haplotype-resolved, gapless T2T assembly construction in plants, including various crop species. We outline the impact of T2T assemblies in elucidating the roles of repetitive elements in gene regulation, as well as in pangenomics, functional genomics, genome-assisted breeding and targeted genome manipulation. In conjunction with sequence-enriched germplasm repositories, T2T assemblies thus hold great promise for basic and applied plant sciences.
Affiliation(s)
- Vanika Garg
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- Abhishek Bohra
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- ICAR-Indian Institute of Pulses Research, Kanpur, India
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Seeland, Germany
- Manuel Spannagl
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- Plant Genome and Systems Biology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, Germany
- Xun Xu
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- BGI-Shenzhen, Shenzhen, China
- Rajeev K Varshney
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
8
Engelbrecht E, Rodriguez OL, Watson CT. Addressing Technical Pitfalls in Pursuit of Molecular Factors That Mediate Immunoglobulin Gene Regulation. J Immunol 2024; 213:651-662. [PMID: 39007649 PMCID: PMC11333172 DOI: 10.4049/jimmunol.2400131] [Received: 03/08/2024] [Accepted: 06/13/2024] [Indexed: 07/16/2024]
Abstract
The expressed Ab repertoire is a critical determinant of immune-related phenotypes. Ab-encoding transcripts are distinct from other expressed genes because they are transcribed from somatically rearranged gene segments. Human Abs are composed of two identical H and L chain polypeptides derived from genes in the IGH locus and one of two L chain loci. The combinatorial diversity that results from Ab gene rearrangement and the pairing of different H and L chains contributes to the immense diversity of the baseline Ab repertoire. During rearrangement, Ab gene selection is mediated by factors that influence chromatin architecture, promoter/enhancer activity, and V(D)J recombination. Interindividual variation in the composition of the Ab repertoire associates with germline variation in IGH, implicating polymorphism in Ab gene regulation. Determining how IGH variants directly mediate gene regulation will require integration of these variants with other functional genomic datasets. In this study, we argue that standard approaches using short reads have limited utility for characterizing regulatory regions in IGH at haplotype resolution. Using simulated and chromatin immunoprecipitation sequencing reads, we define features of IGH that limit the use of short reads and a single reference genome, namely 1) the highly duplicated nature of the DNA sequence in IGH and 2) structural polymorphisms that are frequent in the population. We demonstrate that personalized diploid references enhance performance of short-read data for characterizing mappable portions of the locus, while also showing that long-read profiling tools will ultimately be needed to fully resolve functional impacts of IGH germline variation on expressed Ab repertoires.
Affiliation(s)
- Eric Engelbrecht
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
- Oscar L Rodriguez
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
9
Bai X, Yao HC, Wu B, Liu LR, Ding YY, Xiao CL. DeepBAM: a high-accuracy single-molecule CpG methylation detection tool for Oxford nanopore sequencing. Brief Bioinform 2024; 25:bbae413. [PMID: 39177264 PMCID: PMC11342253 DOI: 10.1093/bib/bbae413] [Received: 02/21/2024] [Revised: 07/29/2024] [Accepted: 08/05/2024] [Indexed: 08/24/2024] [Open Access]
Abstract
The recent nanopore sequencing system (R10.4) has enhanced base calling accuracy and is being increasingly utilized for detecting CpG methylation state. However, the robustness and universality of the methylation calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources to carry out comprehensive evaluations, which showed that Dorado performed significantly differently across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved superior and more stable performance compared with Dorado, including higher area under the ROC curves (98.47% on average and up to 7.36% improvement) and F1 scores (94.97% on average and up to 16.24% improvement) across the datasets. DeepBAM-based whole genome methylation frequencies have achieved >0.95 correlations with BS-seq on four of five datasets, outperforming Dorado in all instances. It enables unraveling allele-specific methylation patterns, including regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.
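The two benchmark metrics named in this abstract, F1 score and area under the ROC curve, can be computed from per-site calls as sketched below in Python. This is a generic illustration under stated assumptions, not the evaluation code used for DeepBAM or Dorado: the function names, counts, and scores are hypothetical.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from confusion-matrix counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def roc_auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win; this pairwise form equals the area under
    the ROC curve but costs O(n * m), so it only suits small examples.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
print(round(f1_score(8, 2, 2), 3))  # 0.8

# Hypothetical methylation scores for truly methylated and unmethylated sites.
methylated = [0.9, 0.8, 0.4]
unmethylated = [0.7, 0.3, 0.2]
print(round(roc_auc(methylated, unmethylated), 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

F1 depends on the chosen calling threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason benchmarks commonly report both.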
Affiliation(s)
- Xin Bai
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China
- Hui-Cong Yao
- School of Artificial Intelligence, Sun Yat-Sen University, Gaoxin District, Zhuhai 519000, China
- Bo Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China
- Luo-Ran Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China
- Yu-Ying Ding
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China
- Chuan-Le Xiao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China
10
Fu Y, Aganezov S, Mahmoud M, Beaulaurier J, Juul S, Treangen TJ, Sedlazeck FJ. MethPhaser: methylation-based long-read haplotype phasing of human genomes. Nat Commun 2024; 15:5327. [PMID: 38909018 PMCID: PMC11193733 DOI: 10.1038/s41467-024-49588-0] [Received: 09/12/2023] [Accepted: 06/11/2024] [Indexed: 06/24/2024] [Open Access]
Abstract
The assignment of variants across haplotypes, phasing, is crucial for predicting the consequences, interaction, and inheritance of mutations and is a key step in improving our understanding of phenotype and disease. However, phasing is limited by read length and stretches of homozygosity along the genome. To overcome this limitation, we designed MethPhaser, a method that utilizes methylation signals from Oxford Nanopore Technologies to extend Single Nucleotide Variation (SNV)-based phasing. We demonstrate that haplotype-specific methylation exists extensively in human genomes and that the advent of long-read technologies has enabled direct reporting of methylation signals. For ONT R9 and R10 cell line data, we increase the phase length N50 by 78%-151% at a phasing accuracy of 83.4-98.7%. To assess the impact of tissue purity and random methylation signals due to inactivation, we also applied MethPhaser on blood samples from 4 patients, still showing improvements over SNV-only phasing. MethPhaser further improves phasing across HLA and multiple other medically relevant genes, improving our understanding of how mutations interact across multiple phenotypes. The concept of MethPhaser can also be extended to non-human diploid genomes. MethPhaser is available at https://github.com/treangenlab/methphaser.
Affiliation(s)
- Yilei Fu
- Department of Computer Science, Rice University, Houston, TX, USA
- Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Sissel Juul
- Oxford Nanopore Technologies Inc, New York, NY, USA
- Todd J Treangen
- Department of Computer Science, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Fritz J Sedlazeck
- Department of Computer Science, Rice University, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
11
Grewal S, Yang CY, Scholefield D, Ashling S, Ghosh S, Swarbreck D, Collins J, Yao E, Sen TZ, Wilson M, Yant L, King IP, King J. Chromosome-scale genome assembly of bread wheat's wild relative Triticum timopheevii. Sci Data 2024; 11:420. [PMID: 38653999 PMCID: PMC11039740 DOI: 10.1038/s41597-024-03260-w] [Received: 01/16/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024] [Open Access]
Abstract
Wheat (Triticum aestivum) is one of the most important food crops with an urgent need for increase in its production to feed the growing world. Triticum timopheevii (2n = 4x = 28) is an allotetraploid wheat wild relative species containing the At and G genomes that has been exploited in many pre-breeding programmes for wheat improvement. In this study, we report the generation of a chromosome-scale reference genome assembly of T. timopheevii accession PI 94760 based on PacBio HiFi reads and chromosome conformation capture (Hi-C). The assembly comprised a total size of 9.35 Gb, featuring a contig N50 of 42.4 Mb and included the mitochondrial and plastid genome sequences. Genome annotation predicted 166,325 gene models including 70,365 genes with high confidence. DNA methylation analysis showed that the G genome had on average more methylated bases than the At genome. In summary, the T. timopheevii genome assembly provides a valuable resource for genome-informed discovery of agronomically important genes for food security.
Affiliation(s)
- Surbhi Grewal
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
- Cai-Yun Yang
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
- Duncan Scholefield
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
- Stephen Ashling
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
- Sreya Ghosh
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- David Swarbreck
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Joanna Collins
- Genome Reference Informatics Team, Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1RQ, UK
- Eric Yao
- University of California, Department of Bioengineering, Berkeley, CA, 94720, USA
- United States Department of Agriculture-Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, 800 Buchanan St., Albany, CA, 94710, USA
| | - Taner Z Sen
- University of California, Department of Bioengineering, Berkeley, CA, 94720, USA
- United States Department of Agriculture-Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, 800 Buchanan St., Albany, CA, 94710, USA
| | - Michael Wilson
- University of Nottingham, University Park, Nottingham, NG7 2RD, UK
| | - Levi Yant
- University of Nottingham, University Park, Nottingham, NG7 2RD, UK
| | - Ian P King
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
| | - Julie King
- Wheat Research Centre, Department of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough, LE12 5RD, UK
|
12
|
Wang B, Jia Y, Dang N, Yu J, Bush SJ, Gao S, He W, Wang S, Guo H, Yang X, Ma W, Ye K. Near telomere-to-telomere genome assemblies of two Chlorella species unveil the composition and evolution of centromeres in green algae. BMC Genomics 2024; 25:356. [PMID: 38600443 PMCID: PMC11005252 DOI: 10.1186/s12864-024-10280-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 04/02/2024] [Indexed: 04/12/2024] Open
Abstract
BACKGROUND Centromeres play a crucial and conserved role in cell division, although their composition and evolutionary history in green algae, the evolutionary ancestors of land plants, remain largely unknown. RESULTS We constructed near telomere-to-telomere (T2T) assemblies for two Trebouxiophyceae species, Chlorella sorokiniana NS4-2 and Chlorella pyrenoidosa DBH, with chromosome numbers of 12 and 13, and genome sizes of 58.11 Mb and 53.41 Mb, respectively. We identified and validated their centromere sequences using CENH3 ChIP-seq and found that, similar to humans and higher plants, the centromeric CENH3 signals of green algae display a pattern of hypomethylation. Interestingly, the centromeres of both species largely comprised transposable elements, although they differed significantly in their composition. Species within the Chlorella genus display a more diverse centromere composition, with major constituents including members of the LTR/Copia, LINE/L1, and LINE/RTEX families. This is in contrast to green algae including Chlamydomonas reinhardtii, Coccomyxa subellipsoidea, and Chromochloris zofingiensis, in which centromeres are instead dominated by a single element. Moreover, we observed significant differences in the composition and structure of centromeres among chromosomes with strong collinearity within the Chlorella genus, suggesting that centromeric sequence evolves more rapidly than sequence in non-centromeric regions. CONCLUSIONS This study not only provides high-quality genome data for comparative genomics of green algae but also gives insight into the composition and evolutionary history of centromeres in early plants, laying an important foundation for further research on their evolution.
Affiliation(s)
- Bo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Yanyan Jia
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Ningxin Dang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
| | - Jie Yu
- College of Life Sciences, Shanghai Normal University, Shanghai, China
| | - Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Shenghan Gao
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Wenxi He
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
| | - Sirui Wang
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
| | - Hongtao Guo
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Xiaofei Yang
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Weimin Ma
- College of Life Sciences, Shanghai Normal University, Shanghai, China.
| | - Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China.
- Faculty of Science, Leiden University, Leiden, The Netherlands.
|
13
|
Ermini L, Driguez P. The Application of Long-Read Sequencing to Cancer. Cancers (Basel) 2024; 16:1275. [PMID: 38610953 PMCID: PMC11011098 DOI: 10.3390/cancers16071275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 03/20/2024] [Accepted: 03/21/2024] [Indexed: 04/14/2024] Open
Abstract
Cancer is a multifaceted disease arising from numerous genomic aberrations that have been identified as a result of advancements in sequencing technologies. While next-generation sequencing (NGS), which uses short reads, has transformed cancer research and diagnostics, it is limited by read length. Third-generation sequencing (TGS), led by the Pacific Biosciences and Oxford Nanopore Technologies platforms, employs long-read sequences, which have marked a paradigm shift in cancer research. Cancer genomes often harbour complex events, and TGS, with its ability to span large genomic regions, has facilitated their characterisation, providing a better understanding of how complex rearrangements affect cancer initiation and progression. TGS has also characterised the entire transcriptome of various cancers, revealing cancer-associated isoforms that could serve as biomarkers or therapeutic targets. Furthermore, TGS has advanced cancer research by improving genome assemblies, detecting complex variants, and providing a more complete picture of transcriptomes and epigenomes. This review focuses on TGS and its growing role in cancer research. We investigate its advantages and limitations, providing a rigorous scientific analysis of its use in detecting previously hidden aberrations missed by NGS. This promising technology holds immense potential for both research and clinical applications, with far-reaching implications for cancer diagnosis and treatment.
Affiliation(s)
- Luca Ermini
- NORLUX Neuro-Oncology Laboratory, Department of Cancer Research, Luxembourg Institute of Health, L-1210 Luxembourg, Luxembourg
| | - Patrick Driguez
- Bioscience Core Lab, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
|
14
|
Xiong X, Chen H, Zhang Q, Liu Y, Xu C. Uncovering the roles of DNA hemi-methylation in transcriptional regulation using MspJI-assisted hemi-methylation sequencing. Nucleic Acids Res 2024; 52:e24. [PMID: 38261991 PMCID: PMC10954476 DOI: 10.1093/nar/gkae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 12/13/2023] [Accepted: 01/03/2024] [Indexed: 01/25/2024] Open
Abstract
Hemi-methylated cytosine dyads occur widely in mammalian genomic DNA and can be stably inherited across cell divisions, serving as potential epigenetic marks. Previous identification of hemi-methylation relied on harsh bisulfite treatment, leading to extensive DNA degradation and loss of methylation information. Here we introduce Mhemi-seq, a bisulfite-free strategy, to efficiently resolve the methylation status of cytosine dyads into unmethylation, strand-specific hemi-methylation, or full-methylation. Mhemi-seq reproduces methylomes from bisulfite-based sequencing (BS-seq & hpBS-seq), including the asymmetric hemi-methylation enrichment flanking CTCF motifs. By avoiding base conversion, Mhemi-seq resolves allele-specific methylation and associated imprinted gene expression more efficiently than BS-seq. Furthermore, we reveal an inhibitory role of hemi-methylation in gene expression and transcription factor (TF)-DNA binding, in some cases to an extent similar to that of full-methylation. Finally, we uncover new hemi-methylation patterns within Alu retrotransposon elements. Collectively, Mhemi-seq can accelerate the identification of DNA hemi-methylation and facilitate its integration into the chromatin environment for future studies.
Affiliation(s)
- Xiong Xiong
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Hengye Chen
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Qifan Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yangying Liu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Chenhuan Xu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
|
15
|
Sigurpalsdottir BD, Stefansson OA, Holley G, Beyter D, Zink F, Hardarson MÞ, Sverrisson SÞ, Kristinsdottir N, Magnusdottir DN, Magnusson OÞ, Gudbjartsson DF, Halldorsson BV, Stefansson K. A comparison of methods for detecting DNA methylation from long-read sequencing of human genomes. Genome Biol 2024; 25:69. [PMID: 38468278 PMCID: PMC10929077 DOI: 10.1186/s13059-024-03207-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 02/28/2024] [Indexed: 03/13/2024] Open
Abstract
BACKGROUND Long-read sequencing can enable the detection of base modifications, such as CpG methylation, in single molecules of DNA. The most commonly used methods for long-read sequencing are nanopore sequencing, developed by Oxford Nanopore Technologies (ONT), and single molecule real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio). In this study, we systematically compare the performance of CpG methylation detection from long-read sequencing. RESULTS We demonstrate that CpG methylation detection from 7179 nanopore-sequenced DNA samples is highly accurate and consistent with 132 oxidative bisulfite-sequenced (oxBS) samples, isolated from the same blood draws. We introduce quality filters for CpGs that further enhance the accuracy of CpG methylation detection from nanopore-sequenced DNA, while removing at most 30% of CpGs. We evaluate the per-site performance of CpG methylation detection across different genomic features and CpG methylation rates and demonstrate how the latest R10.4 flowcell chemistry and base-calling algorithms improve methylation detection from nanopore sequencing. Additionally, we show how the methylation detection of 50 SMRT-sequenced genomes compares to nanopore sequencing and oxBS. CONCLUSIONS This study provides the first systematic comparison of CpG methylation detection tools for long-read sequencing methods. We compare two commonly used computational methods for the detection of CpG methylation in a large number of nanopore genomes, including samples sequenced using the latest R10.4 nanopore flowcell chemistry and 50 SMRT-sequenced samples. We provide insights into the strengths and limitations of each sequencing method as well as recommendations for standardization and evaluation of tools designed for genome-scale modified base detection using long-read sequencing.
Affiliation(s)
- Brynja D Sigurpalsdottir
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland.
- School of Technology, Reykjavík University, Reykjavík, Iceland.
| | - Doruk Beyter
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland
| | - Florian Zink
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland
| | - Marteinn Þ Hardarson
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland
- School of Technology, Reykjavík University, Reykjavík, Iceland
| | - Daniel F Gudbjartsson
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavík, Iceland
| | - Bjarni V Halldorsson
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland.
- School of Technology, Reykjavík University, Reykjavík, Iceland.
| | - Kari Stefansson
- deCODE Genetics/Amgen Inc., Sturlugata 8, Reykjavík, Iceland
- Faculty of Medicine, School of Health Science, University of Iceland, Reykjavík, Iceland
|
16
|
Nakamura W, Hirata M, Oda S, Chiba K, Okada A, Mateos RN, Sugawa M, Iida N, Ushiama M, Tanabe N, Sakamoto H, Sekine S, Hirasawa A, Kawai Y, Tokunaga K, Tsujimoto SI, Shiba N, Ito S, Yoshida T, Shiraishi Y. Assessing the efficacy of target adaptive sampling long-read sequencing through hereditary cancer patient genomes. NPJ Genom Med 2024; 9:11. [PMID: 38368425 PMCID: PMC10874402 DOI: 10.1038/s41525-024-00394-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 01/15/2024] [Indexed: 02/19/2024] Open
Abstract
Innovations in sequencing technology have led to the discovery of novel mutations that cause inherited diseases. However, many patients with suspected genetic diseases remain undiagnosed. Long-read sequencing technologies are expected to significantly improve the diagnostic rate by overcoming the limitations of short-read sequencing. In addition, Oxford Nanopore Technologies (ONT) offers adaptive sampling, a computationally driven target enrichment technology. This enables more affordable intensive analysis of target gene regions compared to standard non-selective long-read sequencing. In this study, we developed an efficient computational workflow for target adaptive sampling long-read sequencing (TAS-LRS) and evaluated it through application to 33 genomes collected from suspected hereditary cancer patients. Our workflow can identify single nucleotide variants with nearly the same accuracy as the short-read platform and elucidate complex forms of structural variations. We also newly identified several SINE-R/VNTR/Alu (SVA) elements affecting the APC gene in two patients with familial adenomatous polyposis, as well as their sites of origin. In addition, we demonstrated that off-target reads from adaptive sampling, which are typically discarded, can be effectively used to accurately genotype common single-nucleotide polymorphisms (SNPs) across the entire genome, enabling the calculation of a polygenic risk score. Furthermore, we identified allele-specific MLH1 promoter hypermethylation in a Lynch syndrome patient. In summary, our workflow with TAS-LRS can simultaneously capture monogenic risk variants including complex structural variations, polygenic background, and epigenetic alterations, and will be an efficient platform for genetic disease research and diagnosis.
Affiliation(s)
- Wataru Nakamura
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
- Department of Pediatrics, Yokohama City University Hospital, Kanagawa, Japan
| | - Makoto Hirata
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
- Department of Molecular Pathology, National Cancer Center Research Institute, Tokyo, Japan
| | - Satoyo Oda
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
- Division of Laboratory Medicine, National Cancer Center Hospital, Tokyo, Japan
| | - Kenichi Chiba
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Ai Okada
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Raúl Nicolás Mateos
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Masahiro Sugawa
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Naoko Iida
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Mineko Ushiama
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
- Department of Clinical Genetics, National Cancer Center Research Institute, Tokyo, Japan
| | - Noriko Tanabe
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
| | - Hiromi Sakamoto
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
- Department of Clinical Genetics, National Cancer Center Research Institute, Tokyo, Japan
| | - Shigeki Sekine
- Division of Molecular Pathology, National Cancer Center Research Institute, Tokyo, Japan
| | - Akira Hirasawa
- Department of Clinical Genetics and Genomic Medicine, Okayama University Hospital, Okayama, Japan
| | - Yosuke Kawai
- Genome Medical Science Project, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
| | - Katsushi Tokunaga
- Genome Medical Science Project, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
- Central Biobank, National Center Biobank Network, Tokyo, Japan
| | - Shin-Ichi Tsujimoto
- Department of Pediatrics, Yokohama City University Hospital, Kanagawa, Japan
| | - Norio Shiba
- Department of Pediatrics, Yokohama City University Hospital, Kanagawa, Japan
| | - Shuichi Ito
- Department of Pediatrics, Yokohama City University Hospital, Kanagawa, Japan
| | - Teruhiko Yoshida
- Division of Genetic Medicine and Services, National Cancer Center Hospital, Tokyo, Japan
- Department of Clinical Genetics, National Cancer Center Research Institute, Tokyo, Japan
| | - Yuichi Shiraishi
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan.
|
17
|
Chaisson MJP, Sulovari A, Valdmanis PN, Miller DE, Eichler EE. Advances in the discovery and analyses of human tandem repeats. Emerg Top Life Sci 2023; 7:361-381. [PMID: 37905568 PMCID: PMC10806765 DOI: 10.1042/etls20230074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/12/2023] [Revised: 10/18/2023] [Accepted: 10/18/2023] [Indexed: 11/02/2023]
Abstract
Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.
Affiliation(s)
- Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, U.S.A
- The Genomic and Epigenomic Regulation Program, USC Norris Cancer Center, University of Southern California, Los Angeles, CA 90089, U.S.A
| | - Arvis Sulovari
- Computational Biology, Cajal Neuroscience Inc, Seattle, WA 98102, U.S.A
| | - Paul N Valdmanis
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
| | - Danny E Miller
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, U.S.A
- Department of Pediatrics, University of Washington, Seattle, WA 98195, U.S.A
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, U.S.A
|
18
|
Fields PD, Weber MM, Waneka G, Broz AK, Sloan DB. Chromosome-Level Genome Assembly for the Angiosperm Silene conica. Genome Biol Evol 2023; 15:evad192. [PMID: 37862134 PMCID: PMC10630074 DOI: 10.1093/gbe/evad192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 09/28/2023] [Accepted: 10/16/2023] [Indexed: 10/22/2023] Open
Abstract
The angiosperm genus Silene has been the subject of extensive study in the field of ecology and evolution, but the availability of high-quality reference genome sequences has been limited for this group. Here, we report a chromosome-level assembly for the genome of Silene conica based on Pacific Biosciences HiFi, Hi-C, and Bionano technologies. The assembly produced 10 scaffolds (1 per chromosome) with a total length of 862 Mb and only ∼1% gap content. These results confirm previous observations that S. conica and its relatives have a reduced base chromosome number relative to the genus's ancestral state of 12. Silene conica has an exceptionally large mitochondrial genome (>11 Mb), predominantly consisting of sequence of unknown origin. Analysis of shared sequence content suggests that it is unlikely that transfer of nuclear DNA is the primary driver of this mitochondrial genome expansion. More generally, this assembly should provide a valuable resource for future genomic studies in Silene, including comparative analyses with related species that recently evolved sex chromosomes.
Affiliation(s)
- Peter D Fields
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
- Mammalian Genetics, The Jackson Laboratory, Bar Harbor, Maine, USA
| | - Melody M Weber
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| | - Gus Waneka
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| | - Amanda K Broz
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| | - Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
|
19
|
Abrouk M, Wang Y, Cavalet-Giorsa E, Troukhan M, Kravchuk M, Krattinger SG. Chromosome-scale assembly of the wild wheat relative Aegilops umbellulata. Sci Data 2023; 10:739. [PMID: 37880246 PMCID: PMC10600132 DOI: 10.1038/s41597-023-02658-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 10/17/2023] [Indexed: 10/27/2023] Open
Abstract
Wild wheat relatives have been explored in plant breeding to increase the genetic diversity of bread wheat, one of the most important food crops. Aegilops umbellulata is a diploid U genome-containing grass species that serves as a genetic reservoir for wheat improvement. In this study, we report the construction of a chromosome-scale reference assembly of Ae. umbellulata accession TA1851 based on corrected PacBio HiFi reads and chromosome conformation capture. The total assembly size was 4.25 Gb with a contig N50 of 17.7 Mb. In total, 36,268 gene models were predicted. We benchmarked the performance of hifiasm and LJA, two of the most widely used assemblers, using standard and corrected HiFi reads, revealing a positive effect of corrected input reads. Comparative genome analysis confirmed substantial chromosome rearrangements in Ae. umbellulata compared to bread wheat. In summary, the Ae. umbellulata assembly provides a resource for comparative genomics in Triticeae and for the discovery of agriculturally important genes.
Affiliation(s)
- Michael Abrouk
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
| | - Yajun Wang
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Emile Cavalet-Giorsa
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Simon G Krattinger
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
|
20
|
Zhang L, Li J. Unlocking the secrets: the power of methylation-based cfDNA detection of tissue damage in organ systems. Clin Epigenetics 2023; 15:168. [PMID: 37858233 PMCID: PMC10588141 DOI: 10.1186/s13148-023-01585-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/11/2023] [Indexed: 10/21/2023] Open
Abstract
BACKGROUND Detecting organ and tissue damage is essential for early diagnosis, treatment decisions, and monitoring disease progression. Methylation-based assays offer a promising approach, as DNA methylation patterns can change in response to tissue damage. These assays have potential applications in early detection, monitoring disease progression, evaluating treatment efficacy, and assessing organ viability for transplantation. cfDNA released into the bloodstream upon tissue or organ injury can serve as a biomarker for damage. The epigenetic state of cfDNA, including DNA methylation patterns, can provide insights into the extent of tissue and organ damage. CONTENT Firstly, this review highlights DNA methylation as an extensively studied epigenetic modification that plays a pivotal role in processes such as cell growth, differentiation, and disease development. It then presents a variety of highly precise 5-mC methylation detection techniques that serve as powerful tools for gaining insight into epigenetic alterations linked with tissue damage. Subsequently, the review examines the mechanisms underlying DNA methylation changes in organ and tissue damage, encompassing inflammation, oxidative stress, and DNA damage repair mechanisms. Next, it addresses the current research status of cfDNA methylation in the detection of damage to specific organs and tissues. Finally, it provides an overview of the multiple steps involved in identifying specific methylation markers associated with tissue and organ damage for clinical trials. Overall, this review covers the current state of research on cfDNA methylation-based assays for detecting organ and tissue damage, the underlying mechanisms, and potential applications in clinical practice.
Affiliation(s)
- Lijing Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, No. 1 Dahua Road, Dongdan, Beijing, 100730, People's Republic of China
- Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing, People's Republic of China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, No. 1 Dahua Road, Dongdan, Beijing, 100730, People's Republic of China.
- Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, People's Republic of China.
- Beijing Engineering Research Center of Laboratory Medicine, Beijing, People's Republic of China.
|
21
|
Song B, Ning W, Wei D, Jiang M, Zhu K, Wang X, Edwards D, Odeny DA, Cheng S. Plant genome resequencing and population genomics: Current status and future prospects. Mol Plant 2023; 16:1252-1268. [PMID: 37501370 DOI: 10.1016/j.molp.2023.07.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/16/2022] [Revised: 05/30/2023] [Accepted: 07/25/2023] [Indexed: 07/29/2023]
Abstract
Advances in DNA sequencing technology have sparked a genomics revolution, driving breakthroughs in plant genetics and crop breeding. Recently, the focus has shifted from cataloging genetic diversity in plants to exploring their functional significance and delivering beneficial alleles for crop improvement. This transformation has been facilitated by the increasing adoption of whole-genome resequencing. In this review, we summarize the current progress of population-based genome resequencing studies and how these studies affect crop breeding. A total of 187 land plants from 163 countries have been resequenced, comprising 54 413 accessions. As part of resequencing efforts 367 traits have been surveyed and 86 genome-wide association studies have been conducted. Economically important crops, particularly cereals, vegetables, and legumes, have dominated the resequencing efforts, leaving a gap in 49 orders, including Lycopodiales, Liliales, Acorales, Austrobaileyales, and Commelinales. The resequenced germplasm is distributed across diverse geographic locations, providing a global perspective on plant genomics. We highlight genes that have been selected during domestication, or associated with agronomic traits, and form a repository of candidate genes for future research and application. Despite the opportunities for cross-species comparative genomics, many population genomic datasets are not accessible, impeding secondary analyses. We call for a more open and collaborative approach to population genomics that promotes data sharing and encourages contribution-based credit policy. The number of plant genome resequencing studies will continue to rise with the decreasing DNA sequencing costs, coupled with advances in analysis and computational technologies. This expansion, in terms of both scale and quality, holds promise for deeper insights into plant trait genetics and breeding design.
Affiliation(s)
- Bo Song
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Weidong Ning
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China; Huazhong Agricultural University, College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Wuhan, Hubei, China
- Di Wei
  - Biotechnology Research Institute, Guangxi Academy of Agricultural Sciences, Nanning 53007, China
- Mengyun Jiang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China; State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China; Shenzhen Research Institute of Henan University, Shenzhen 518000, China
- Kun Zhu
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China; State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China; Shenzhen Research Institute of Henan University, Shenzhen 518000, China
- Xingwei Wang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China; State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China; Shenzhen Research Institute of Henan University, Shenzhen 518000, China
- David Edwards
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Damaris A Odeny
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) - Eastern and Southern Africa, Nairobi, Kenya
- Shifeng Cheng
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
|
22
|
Ahmed HI, Heuberger M, Schoen A, Koo DH, Quiroz-Chavez J, Adhikari L, Raupp J, Cauet S, Rodde N, Cravero C, Callot C, Lazo GR, Kathiresan N, Sharma PK, Moot I, Yadav IS, Singh L, Saripalli G, Rawat N, Datla R, Athiyannan N, Ramirez-Gonzalez RH, Uauy C, Wicker T, Tiwari VK, Abrouk M, Poland J, Krattinger SG. Einkorn genomics sheds light on history of the oldest domesticated wheat. Nature 2023; 620:830-838. [PMID: 37532937 PMCID: PMC10447253 DOI: 10.1038/s41586-023-06389-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/16/2022] [Accepted: 06/29/2023] [Indexed: 08/04/2023]
Abstract
Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago [1,2]. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.
Affiliation(s)
- Hanin Ibrahim Ahmed
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Matthias Heuberger
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Adam Schoen
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Dal-Hoe Koo
  - Wheat Genetics Resource Center and Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- Laxman Adhikari
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- John Raupp
  - Wheat Genetics Resource Center and Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- Stéphane Cauet
  - INRAE, CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
- Nathalie Rodde
  - INRAE, CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
- Charlotte Cravero
  - INRAE, CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
- Caroline Callot
  - INRAE, CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
- Gerard R Lazo
  - Crop Improvement and Genetics Research Unit, Western Regional Research Center, Agricultural Research Service, United States Department of Agriculture, Albany, CA, USA
- Nagarajan Kathiresan
  - KAUST Supercomputing Core Lab (KSL), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Parva K Sharma
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Ian Moot
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Inderjit Singh Yadav
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Lovepreet Singh
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Gautam Saripalli
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Nidhi Rawat
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Raju Datla
  - Global Institute for Food Security, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Naveenkumar Athiyannan
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Thomas Wicker
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Vijay K Tiwari
  - Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Michael Abrouk
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Jesse Poland
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Simon G Krattinger
  - Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
|