1
Ungar RA, Goddard PC, Jensen TD, Degalez F, Smith KS, Jin CA, Bonner DE, Bernstein JA, Wheeler MT, Montgomery SB. Impact of genome build on RNA-seq interpretation and diagnostics. Am J Hum Genet 2024; 111:1282-1300. [PMID: 38834072 DOI: 10.1016/j.ajhg.2024.05.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 05/04/2024] [Accepted: 05/06/2024] [Indexed: 06/06/2024] Open
Abstract
Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and for disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network and Genomics Research to Elucidate the Genetics of Rare Disease Consortium. Across six routinely collected biospecimens, 61% of quantified genes were not influenced by genome build. However, we identified 1,492 genes with build-dependent quantification, 3,377 genes with build-exclusive expression, and 9,077 genes with annotation-specific expression across these biospecimens, including 566 clinically relevant and 512 known OMIM genes. Further, we demonstrate that between builds for a given gene, a larger difference in quantification is well correlated with a larger change in expression outlier calling. Combined, we provide a database of genes impacted by build choice and recommend that transcriptomics-guided analyses and diagnoses be cross-referenced with these data for robustness.
Affiliation(s)
- Rachel A Ungar
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Pagé C Goddard
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Tanner D Jensen
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Kevin S Smith
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Christopher A Jin
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Devon E Bonner
- Department of Pediatrics, School of Medicine, Stanford University, Stanford, CA, USA; Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Jonathan A Bernstein
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Matthew T Wheeler
- Department of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Stephen B Montgomery
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
2
Miga KH. From complete genomes to pangenomes. Am J Hum Genet 2024; 111:1265-1268. [PMID: 38996470 DOI: 10.1016/j.ajhg.2024.05.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 07/14/2024] Open
Abstract
Highlighting the Distinguished Speakers Symposium on "The Future of Human Genetics and Genomics," this collection of articles is based on presentations at the ASHG 2023 Annual Meeting in Washington, DC, in celebration of all our field has accomplished in the past 75 years, since the founding of ASHG in 1948.
Affiliation(s)
- Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
3
Ji Y, Zhao J, Gong J, Sedlazeck FJ, Fan S. Unveiling novel genetic variants in 370 challenging medically relevant genes using the long read sequencing data of 41 samples from 19 global populations. Mol Genet Genomics 2024; 299:65. [PMID: 38972030 DOI: 10.1007/s00438-024-02158-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 06/16/2024] [Indexed: 07/08/2024]
Abstract
BACKGROUND A large number of challenging medically relevant genes (CMRGs) are situated in complex or highly repetitive regions of the human genome, hindering comprehensive characterization of genetic variants using next-generation sequencing technologies. In this study, we employed long-read sequencing technology, extensively utilized in studying complex genomic regions, to characterize genetic alterations, including short variants (single nucleotide variants and short insertions and deletions) and copy number variations, in 370 CMRGs across 41 individuals from 19 global populations. RESULTS Our analysis revealed high levels of genetic variants in CMRGs, with 68.73% exhibiting copy number variations and 65.20% containing short variants that may disrupt protein function across individuals. Such variants can influence pharmacogenomics, genetic disease susceptibility, and other clinical outcomes. We observed significant differences in CMRG variation across populations, with individuals of African ancestry harboring the highest number of copy number variants and short variants compared to samples from other continents. Notably, 15.79% to 33.96% of short variants were exclusively detectable through long-read sequencing. While the T2T-CHM13 reference genome significantly improved the assembly of CMRG regions, thereby facilitating variant detection in these regions, some regions still lacked resolution. CONCLUSION Our results provide an important reference for future clinical and pharmacogenetic studies, highlighting the need for a comprehensive representation of global genetic diversity in the reference genome and improved variant calling techniques to fully resolve medically relevant genes.
Affiliation(s)
- Yanfeng Ji
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, School of Life Science, Fudan University, Shanghai, 200438, China
- Junfan Zhao
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, School of Life Science, Fudan University, Shanghai, 200438, China
- Jiao Gong
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, School of Life Science, Fudan University, Shanghai, 200438, China
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA.
- Shaohua Fan
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, School of Life Science, Fudan University, Shanghai, 200438, China.
4
Omidiran O, Patel A, Usman S, Mhatre I, Abdelhalim H, DeGroat W, Narayanan R, Singh K, Mendhe D, Ahmed Z. GWAS advancements to investigate disease associations and biological mechanisms. Clin Transl Discov 2024; 4:e296. [PMID: 38737752 PMCID: PMC11086745 DOI: 10.1002/ctd2.296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 04/16/2024] [Indexed: 05/14/2024]
Abstract
Genome-wide association studies (GWAS) have been instrumental in elucidating the genetic architecture of various traits and diseases. Despite the success of GWAS, inherent limitations, such as difficulty identifying rare and ultra-rare variants, the potential for spurious associations, and difficulty pinpointing causative agents, can undermine diagnostic capabilities. This review provides an overview of GWAS and highlights recent advances in genetics that employ a range of methodologies, including Whole Genome Sequencing (WGS), Mendelian Randomization (MR), the Pangenome's high-quality T2T-CHM13 panel, and the Human BioMolecular Atlas Program (HuBMAP), as potential enablers of current and future GWAS research. The current literature demonstrates the capabilities of these techniques in enhancing the statistical power of GWAS. WGS, with its comprehensive approach, captures the entire genome, surpassing the capabilities of the traditional GWAS technique focused on predefined Single Nucleotide Polymorphism (SNP) sites. The Pangenome's T2T-CHM13 panel, with its holistic approach, aids in the analysis of regions with high sequence identity, such as segmental duplications (SDs). Mendelian Randomization has advanced causative inference, improving clinical diagnostics and facilitating definitive conclusions. Furthermore, spatial biology techniques like HuBMAP enable 3D molecular mapping of tissues at single-cell resolution, offering insights into the pathology of complex traits. This study aims to elucidate and advocate for the increased application of these technologies, highlighting their potential to shape the future of GWAS research.
Affiliation(s)
- Oluwaferanmi Omidiran
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Aashna Patel
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Sarah Usman
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Ishani Mhatre
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- William DeGroat
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Rishabh Narayanan
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Kritika Singh
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Dinesh Mendhe
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Zeeshan Ahmed
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA
5
Jeong H, Dishuck PC, Yoo D, Harvey WT, Munson KM, Lewis AP, Kordosky J, Garcia GH, Yilmaz F, Hallast P, Lee C, Pastinen T, Eichler EE. Structural polymorphism and diversity of human segmental duplications. bioRxiv 2024:2024.06.04.597452. [PMID: 38895457 PMCID: PMC11185583 DOI: 10.1101/2024.06.04.597452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Segmental duplications (SDs) contribute significantly to human disease, evolution, and diversity yet have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies where the majority of SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms, we identify 173.2 Mbp of duplicated sequence (47.4 Mbp not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy number when compared to non-African samples. A comparison to a resource of 563 million full-length Iso-Seq reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.
Affiliation(s)
- Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Altos Labs, San Diego, CA, USA
- Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- William T. Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P. Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gage H. Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Tomi Pastinen
- Children’s Mercy Hospital and University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
6
Nanda AS, Wu K, Irkliyenko I, Woo B, Ostrowski MS, Clugston AS, Sayles LC, Xu L, Satpathy AT, Nguyen HG, Alejandro Sweet-Cordero E, Goodarzi H, Kasinathan S, Ramani V. Direct transposition of native DNA for sensitive multimodal single-molecule sequencing. Nat Genet 2024; 56:1300-1309. [PMID: 38724748 PMCID: PMC11176058 DOI: 10.1038/s41588-024-01748-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 04/08/2024] [Indexed: 05/23/2024]
Abstract
Concurrent readout of sequence and base modifications from long unamplified DNA templates by Pacific Biosciences of California (PacBio) single-molecule sequencing requires large amounts of input material. Here we adapt Tn5 transposition to introduce hairpin oligonucleotides and fragment (tagment) limiting quantities of DNA for generating PacBio-compatible circular molecules. We developed two methods that implement tagmentation and use 90-99% less input than current protocols: (1) single-molecule real-time sequencing by tagmentation (SMRT-Tag), which allows detection of genetic variation and CpG methylation; and (2) single-molecule adenine-methylated oligonucleosome sequencing assay by tagmentation (SAMOSA-Tag), which uses exogenous adenine methylation to add a third channel for probing chromatin accessibility. SMRT-Tag of 40 ng or more human DNA (approximately 7,000 cell equivalents) yielded data comparable to gold standard whole-genome and bisulfite sequencing. SAMOSA-Tag of 30,000-50,000 nuclei resolved single-fiber chromatin structure, CTCF binding and DNA methylation in patient-derived prostate cancer xenografts and uncovered metastasis-associated global epigenome disorganization. Tagmentation thus promises to enable sensitive, scalable and multimodal single-molecule genomics for diverse basic and clinical applications.
Affiliation(s)
- Arjun S Nanda
- Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Ke Wu
- Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Iryna Irkliyenko
- Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Brian Woo
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Megan S Ostrowski
- Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Andrew S Clugston
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Leanne C Sayles
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Lingru Xu
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Ansuman T Satpathy
- Department of Pathology, Stanford University, Stanford, CA, USA
- Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
- Gladstone-University of California, San Francisco Institute for Genomic Immunology, Gladstone Institutes, San Francisco, CA, USA
- Hao G Nguyen
- Helen-Diller Cancer Center, San Francisco, CA, USA
- E Alejandro Sweet-Cordero
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Hani Goodarzi
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Helen-Diller Cancer Center, San Francisco, CA, USA
- Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
- Bakar Computational Health Sciences Institute, San Francisco, CA, USA
- Sivakanthan Kasinathan
- Gladstone-University of California, San Francisco Institute for Genomic Immunology, Gladstone Institutes, San Francisco, CA, USA.
- Division of Rheumatology, Department of Pediatrics, Stanford University, Stanford, CA, USA.
- Vijay Ramani
- Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA.
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA.
- Helen-Diller Cancer Center, San Francisco, CA, USA.
- Bakar Computational Health Sciences Institute, San Francisco, CA, USA.
7
Lee AT, Chang EF, Paredes MF, Nowakowski TJ. Large-scale neurophysiology and single-cell profiling in human neuroscience. Nature 2024; 630:587-595. [PMID: 38898291 DOI: 10.1038/s41586-024-07405-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 04/09/2024] [Indexed: 06/21/2024]
Abstract
Advances in large-scale single-unit human neurophysiology, single-cell RNA sequencing, spatial transcriptomics and long-term ex vivo tissue culture of surgically resected human brain tissue have provided an unprecedented opportunity to study human neuroscience. In this Perspective, we describe the development of these paradigms, including Neuropixels and recent brain-cell atlas efforts, and discuss how their convergence will further investigations into the cellular underpinnings of network-level activity in the human brain. Specifically, we introduce a workflow in which functionally mapped samples of human brain tissue resected during awake brain surgery can be cultured ex vivo for multi-modal cellular and functional profiling. We then explore how advances in human neuroscience will affect clinical practice, and conclude by discussing societal and ethical implications to consider. Potential findings from the field of human neuroscience will be vast, ranging from insights into human neurodiversity and evolution to providing cell-type-specific access to study and manipulate diseased circuits in pathology. This Perspective aims to provide a unifying framework for the field of human neuroscience as we welcome an exciting era for understanding the functional cytoarchitecture of the human brain.
Affiliation(s)
- Anthony T Lee
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Mercedes F Paredes
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Tomasz J Nowakowski
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA.
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA.
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA.
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, San Francisco, CA, USA.
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA.
8
Figueroa KP, Gross C, Buena-Atienza E, Paul S, Gandelman M, Kakar N, Sturm M, Casadei N, Admard J, Park J, Zühlke C, Hellenbroich Y, Pozojevic J, Balachandran S, Händler K, Zittel S, Timmann D, Erdlenbruch F, Herrmann L, Feindt T, Zenker M, Klopstock T, Dufke C, Scoles DR, Koeppen A, Spielmann M, Riess O, Ossowski S, Haack TB, Pulst SM. A GGC-repeat expansion in ZFHX3 encoding polyglycine causes spinocerebellar ataxia type 4 and impairs autophagy. Nat Genet 2024; 56:1080-1089. [PMID: 38684900 DOI: 10.1038/s41588-024-01719-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 03/18/2024] [Indexed: 05/02/2024]
Abstract
Despite linkage to chromosome 16q in 1996, the mutation causing spinocerebellar ataxia type 4 (SCA4), a late-onset sensory and cerebellar ataxia, remained unknown. Here, using long-read single-strand whole-genome sequencing (LR-GS), we identified a heterozygous GGC-repeat expansion in a large Utah pedigree encoding polyglycine (polyG) in zinc finger homeobox protein 3 (ZFHX3), also known as AT-binding transcription factor 1 (ATBF1). We queried 6,495 genome sequencing datasets and identified the repeat expansion in seven additional pedigrees. Ultrarare DNA variants near the repeat expansion indicate a common distant founder event in Sweden. Intranuclear ZFHX3-p62-ubiquitin aggregates were abundant in SCA4 basis pontis neurons. In fibroblasts and induced pluripotent stem cells, the GGC expansion led to increased ZFHX3 protein levels and abnormal autophagy, which were normalized with small interfering RNA-mediated ZFHX3 knockdown in both cell types. Improving autophagy points to a therapeutic avenue for this novel polyG disease. The coding GGC-repeat expansion in an extremely G+C-rich region was not detectable by short-read whole-exome sequencing, which demonstrates the power of LR-GS for variant discovery.
Affiliation(s)
- Karla P Figueroa
- Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Caspar Gross
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Elena Buena-Atienza
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Sharan Paul
- Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Mandi Gandelman
- Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Naseebullah Kakar
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Department of Biotechnology, FLS&I, BUITEMS, Quetta, Pakistan
- Marc Sturm
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Nicolas Casadei
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Jakob Admard
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Joohyun Park
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Christine Zühlke
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Yorck Hellenbroich
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Jelena Pozojevic
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Saranya Balachandran
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Kristian Händler
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- Simone Zittel
- Department of Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Dagmar Timmann
- Department of Neurology and Center for Translational Neuro- and Behavioral Sciences (C-TNBS), Essen University Hospital, University of Duisburg-Essen, Essen, Germany
- Friedrich Erdlenbruch
- Department of Neurology and Center for Translational Neuro- and Behavioral Sciences (C-TNBS), Essen University Hospital, University of Duisburg-Essen, Essen, Germany
- Laura Herrmann
- Department of Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Martin Zenker
- Institute of Human Genetics, University Hospital Magdeburg and Medical Faculty, Otto-von-Guericke University, Magdeburg, Germany
- Thomas Klopstock
- Department of Neurology with Friedrich-Baur-Institute, University Hospital of Ludwig-Maximilians-Universität München, Munich, Germany
- German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Claudia Dufke
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Daniel R Scoles
- Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Malte Spielmann
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck and Kiel University, Lübeck, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Hamburg, Lübeck, Kiel, Lübeck, Germany
- Olaf Riess
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany.
- NGS Competence Center Tübingen, Tübingen, Germany.
- Stephan Ossowski
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
- Tobias B Haack
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- NGS Competence Center Tübingen, Tübingen, Germany
- Stefan M Pulst
- Department of Neurology, University of Utah, Salt Lake City, UT, USA.
- Clinical Neurosciences Center, University of Utah Hospitals and Clinics, Salt Lake City, UT, USA.
9
Liu MH, Costa BM, Bianchini EC, Choi U, Bandler RC, Lassen E, Grońska-Pęski M, Schwing A, Murphy ZR, Rosenkjær D, Picciotto S, Bianchi V, Stengs L, Edwards M, Nunes NM, Loh CA, Truong TK, Brand RE, Pastinen T, Wagner JR, Skytte AB, Tabori U, Shoag JE, Evrony GD. DNA mismatch and damage patterns revealed by single-molecule sequencing. Nature 2024; 630:752-761. [PMID: 38867045 PMCID: PMC11216816 DOI: 10.1038/s41586-024-07532-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 05/07/2024] [Indexed: 06/14/2024]
Abstract
Mutations accumulate in the genome of every cell of the body throughout life, causing cancer and other diseases1,2. Most mutations begin as nucleotide mismatches or damage in one of the two strands of the DNA before becoming double-strand mutations if unrepaired or misrepaired3,4. However, current DNA-sequencing technologies cannot accurately resolve these initial single-strand events. Here we develop a single-molecule, long-read sequencing method (Hairpin Duplex Enhanced Fidelity sequencing (HiDEF-seq)) that achieves single-molecule fidelity for base substitutions when present in either one or both DNA strands. HiDEF-seq also detects cytosine deamination, a common type of DNA damage, with single-molecule fidelity. We profiled 134 samples from diverse tissues, including from individuals with cancer predisposition syndromes, and derive from them single-strand mismatch and damage signatures. We find correspondences between these single-strand signatures and known double-strand mutational signatures, which resolves the identity of the initiating lesions. Tumours deficient in both mismatch repair and replicative polymerase proofreading show distinct single-strand mismatch patterns compared to samples that are deficient in only polymerase proofreading. We also define a single-strand damage signature for APOBEC3A. In the mitochondrial genome, our findings support a mutagenic mechanism occurring primarily during replication. As double-strand DNA mutations are only the end point of the mutation process, our approach to detect the initiating single-strand events at single-molecule resolution will enable studies of how mutations arise in a variety of contexts, especially in cancer and ageing.
Affiliation(s)
- Mei Hong Liu
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Benjamin M Costa
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Emilia C Bianchini
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Una Choi
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Rachel C Bandler
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Emilie Lassen
- Cryos International Sperm and Egg Bank, Aarhus, Denmark
- Marta Grońska-Pęski
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Adam Schwing
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Zachary R Murphy
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Shany Picciotto
- Department of Urology, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Vanessa Bianchi
- Program in Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, Ontario, Canada
- Lucie Stengs
- Program in Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, Ontario, Canada
- Melissa Edwards
- Program in Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, Ontario, Canada
- Nuno Miguel Nunes
- Program in Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, Ontario, Canada
- Caitlin A Loh
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Tina K Truong
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA
- Randall E Brand
- Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Tomi Pastinen
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- J Richard Wagner
- Department of Nuclear Medicine and Radiobiology, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- Uri Tabori
- Program in Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, Ontario, Canada
- Division of Haematology/Oncology, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
- Jonathan E Shoag
- Department of Urology, University Hospitals Cleveland Medical Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Gilad D Evrony
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, NY, USA.
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, USA.
10. Wang H, Sun F. UNC-45A: A potential therapeutic target for malignant tumors. Heliyon 2024; 10:e31276. [PMID: 38803956 PMCID: PMC11128996 DOI: 10.1016/j.heliyon.2024.e31276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 12/31/2023] [Accepted: 05/14/2024] [Indexed: 05/29/2024] Open
Abstract
Uncoordinated mutant number-45 myosin chaperone A (UNC-45A), a protein highly conserved throughout evolution, is ubiquitously expressed in somatic cells. It is correlated with tumorigenesis, proliferation, metastasis, and invasion of multiple malignant tumors. The current understanding of the role of UNC-45A in tumor progression is mainly related to the regulation of non-muscle myosin II (NM-II). However, many studies have suggested that the mechanisms by which UNC-45A is involved in tumor progression extend far beyond NM-II regulation. UNC-45A can also promote tumor cell proliferation by regulating checkpoint kinase 1 (ChK1) phosphorylation or the transcriptional activity of nuclear receptors, and it induces chemoresistance to paclitaxel in tumor cells by destabilizing microtubule activity. In this review, we discuss the recent advances illuminating the role of UNC-45A in tumor progression. We also put forward therapeutic strategies targeting UNC-45A, in the hope of paving the way for the development of UNC-45A-targeted therapies for patients with malignant tumors.
Affiliation(s)
- Hong Wang
- School of Nursing, Binzhou Medical University, Yantai, 264003, PR China
- Fude Sun
- Department of Anesthesiology, Yantai Penglai Traditional Chinese Medicine Hospital, Yantai, 265699, PR China
11. Chang CM, Chang WC, Hsieh SL. Characterization of the genetic variation and evolutionary divergence of the CLEC18 family. J Biomed Sci 2024; 31:53. [PMID: 38764023 PMCID: PMC11103991 DOI: 10.1186/s12929-024-01034-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Accepted: 04/25/2024] [Indexed: 05/21/2024] Open
Abstract
BACKGROUND: The C-type lectin family 18 (CLEC18) with lipid and glycan binding capabilities is important to metabolic regulation and innate immune responses against viral infection. However, human CLEC18 comprises three paralogous genes with highly similar sequences, making it challenging to distinguish genetic variations, expression patterns, and biological functions of individual CLEC18 paralogs. Additionally, the evolutionary relationship between human CLEC18 and its counterparts in other species remains unclear. METHODS: To identify the sequence variation and evolutionary divergence of human CLEC18 paralogs, we conducted a comprehensive analysis using various resources, including human and non-human primate reference genome assemblies, human pangenome assemblies, and long-read-based whole-genome and -transcriptome sequencing datasets. RESULTS: We uncovered paralogous sequence variants (PSVs) and polymorphic variants (PVs) of human CLEC18 proteins, and identified distinct signatures specific to each CLEC18 paralog. Furthermore, we unveiled a novel segmental duplication for the human CLEC18A gene. By comparing CLEC18 across human and non-human primates, our research showed that the CLEC18 paralogy probably occurred in the common ancestor of human and closely related non-human primates, and the lipid-binding CAP/SCP/TAPS domain of CLEC18 is more diverse than its glycan-binding CTLD. Moreover, we found that certain amino acid alterations at variant positions are exclusive to human CLEC18 paralogs. CONCLUSIONS: Our findings offer a comprehensive profiling of the intricate variations and evolutionary characteristics of human CLEC18.
Affiliation(s)
- Che-Mai Chang
- Genomics Research Center, Academia Sinica, No. 128, Sec. 2, Academia Rd., Nangang Dist., Taipei City, 115, Taiwan
- Department of Clinical Pharmacy, School of Pharmacy, Taipei Medical University, No.250, Wuxing St., Xinyi Dist, Taipei City, 110, Taiwan
- Wei-Chiao Chang
- Department of Clinical Pharmacy, School of Pharmacy, Taipei Medical University, No.250, Wuxing St., Xinyi Dist, Taipei City, 110, Taiwan.
- Master Program in Clinical Genomics and Proteomics, School of Pharmacy, Taipei Medical University, Taipei City, 110, Taiwan.
- Department of Pharmacy, Wan Fang Hospital, Taipei Medical University, Taipei City, 116, Taiwan.
- Integrative Research Center for Critical Care, Wan Fang Hospital, Taipei Medical University, Taipei City, 116, Taiwan.
- Department of Pharmacology, National Defense Medical Center, Taipei City, 114, Taiwan.
- Shie-Liang Hsieh
- Genomics Research Center, Academia Sinica, No. 128, Sec. 2, Academia Rd., Nangang Dist., Taipei City, 115, Taiwan.
- Master Program in Clinical Genomics and Proteomics, School of Pharmacy, Taipei Medical University, Taipei City, 110, Taiwan.
- Immunology Research Center, National Health Research Institutes, No. 35, Keyan Rd., Zhunan Township, Miaoli County, 350, Taiwan.
- Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei City, 112, Taiwan.
- Department of Medical Research, Taipei Veterans General Hospital, Taipei City, 112, Taiwan.
12. Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Catacchio CR, Porubsky D, Mao Y, Yoo D, Rautiainen M, Koren S, Nurk S, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Ventura M, Alexandrov IA, Eichler EE. The variation and evolution of complete human centromeres. Nature 2024; 629:136-145. [PMID: 38570684 PMCID: PMC11062924 DOI: 10.1038/s41586-024-07278-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 03/07/2024] [Indexed: 04/05/2024]
Abstract
Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Affiliation(s)
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Allison N Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Claudia R Catacchio
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Oxford Nanopore Technologies, Oxford, United Kingdom
- Julian K Lucas
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mario Ventura
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Ivan A Alexandrov
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
- Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
13. Rangwala SH, Rudnev DV, Ananiev VV, Oh DH, Asztalos A, Benica B, Borodin EA, Bouk N, Evgeniev VI, Kodali VK, Lotov V, Mozes E, Omelchenko MV, Savkina S, Sukharnikov E, Virothaisakun J, Murphy TD, Pruitt KD, Schneider VA. The NCBI Comparative Genome Viewer (CGV) is an interactive visualization tool for the analysis of whole-genome eukaryotic alignments. PLoS Biol 2024; 22:e3002405. [PMID: 38713717 PMCID: PMC11101090 DOI: 10.1371/journal.pbio.3002405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 05/17/2024] [Accepted: 04/08/2024] [Indexed: 05/09/2024] Open
Abstract
We report a new visualization tool for analysis of whole-genome assembly-assembly alignments, the Comparative Genome Viewer (CGV) (https://ncbi.nlm.nih.gov/genome/cgv/). CGV visualizes pairwise same-species and cross-species alignments provided by the National Center for Biotechnology Information (NCBI) using assembly alignment algorithms developed by us and others. Researchers can examine large structural differences spanning chromosomes, such as inversions or translocations. Users can also navigate to regions of interest, where they can detect and analyze smaller-scale deletions and rearrangements within specific chromosome or gene regions. RefSeq or user-provided gene annotation is displayed where available. CGV currently provides approximately 800 alignments from over 350 animal, plant, and fungal species. CGV and related NCBI viewers are undergoing active development to further meet the needs of the research community in comparative genome visualization.
Affiliation(s)
- Sanjida H. Rangwala
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Dmitry V. Rudnev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Victor V. Ananiev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Dong-Ha Oh
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Andrea Asztalos
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Barrett Benica
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Evgeny A. Borodin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Nathan Bouk
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Vladislav I. Evgeniev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Vamsi K. Kodali
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Vadim Lotov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Eyal Mozes
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Marina V. Omelchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Sofya Savkina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Ekaterina Sukharnikov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Joël Virothaisakun
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Terence D. Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Kim D. Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
- Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
14. Qiu Y, Shen Y, Kingsford C. Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants. Algorithms Mol Biol 2024; 19:17. [PMID: 38679703 PMCID: PMC11056321 DOI: 10.1186/s13015-024-00262-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Accepted: 03/21/2024] [Indexed: 05/01/2024] Open
Abstract
The graph traversal edit distance (GTED), introduced by Ebrahimpour Boroojeny et al. (2018), is an elegant distance measure defined as the minimum edit distance between strings reconstructed from Eulerian trails in two edge-labeled graphs. GTED can be used to infer evolutionary relationships between species by comparing de Bruijn graphs directly without the computationally costly and error-prone process of genome assembly. Ebrahimpour Boroojeny et al. (2018) propose two ILP formulations for GTED and claim that GTED is polynomially solvable because the linear programming relaxation of one of the ILPs always yields optimal integer solutions. This claim contradicts the complexity results of existing string-to-graph matching problems. We resolve this conflict by proving that GTED is NP-complete and showing that the ILPs proposed by Ebrahimpour Boroojeny et al. do not solve GTED but instead solve for a lower bound of GTED and are not solvable in polynomial time. In addition, we provide the first two correct ILP formulations of GTED and evaluate their empirical efficiency. These results provide solid algorithmic foundations for comparing genome graphs and point to the direction of heuristics. The source code to reproduce experimental results is available at https://github.com/Kingsford-Group/gtednewilp/.
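The definition of GTED given in this abstract can be illustrated with a small brute-force sketch. This is not one of the ILP formulations from the paper; it simply enumerates every Eulerian trail of two tiny directed, edge-labeled graphs and takes the minimum Levenshtein distance over all pairs of spelled strings. The function names and the `(source, target, label)` edge encoding are assumptions made for illustration, and the exhaustive search is only feasible for toy inputs.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def eulerian_trail_strings(edges):
    """Return the set of label strings spelled by all Eulerian trails.

    edges is a list of (source, target, label) tuples for a directed graph
    (a hypothetical encoding chosen for this sketch). A trail is Eulerian
    when it uses every edge exactly once.
    """
    results = set()
    n = len(edges)

    def extend(node, used, labels):
        if len(labels) == n:
            results.add("".join(labels))
            return
        for i, (u, v, lab) in enumerate(edges):
            if not used[i] and u == node:
                used[i] = True
                labels.append(lab)
                extend(v, used, labels)
                labels.pop()
                used[i] = False

    # Try every vertex with an outgoing edge as a possible trail start.
    for start in {u for u, _, _ in edges}:
        extend(start, [False] * n, [])
    return results


def gted_bruteforce(edges1, edges2):
    """Minimum edit distance over all pairs of Eulerian-trail strings."""
    s1 = eulerian_trail_strings(edges1)
    s2 = eulerian_trail_strings(edges2)
    if not s1 or not s2:
        raise ValueError("both graphs need at least one Eulerian trail")
    return min(edit_distance(a, b) for a in s1 for b in s2)


if __name__ == "__main__":
    cycle = [(0, 1, "A"), (1, 2, "C"), (2, 0, "G")]  # spells ACG, CGA, GAC
    path = [(0, 1, "A"), (1, 2, "C"), (2, 3, "T")]   # spells ACT only
    print(gted_bruteforce(cycle, path))
```

The number of Eulerian trails can grow exponentially with graph size, so this enumeration is useful only for checking the definition on small examples.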
Affiliation(s)
- Yutong Qiu
- Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, PA, USA
- Yihang Shen
- Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, PA, USA
- Carl Kingsford
- Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, PA, USA.
15. Hu J, Wang Z, Sun Z, Hu B, Ayoola AO, Liang F, Li J, Sandoval JR, Cooper DN, Ye K, Ruan J, Xiao CL, Wang D, Wu DD, Wang S. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol 2024; 25:107. [PMID: 38671502 PMCID: PMC11046930 DOI: 10.1186/s13059-024-03252-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.
Affiliation(s)
- Jiang Hu
- GrandOmics Biosciences, Beijing, 102206, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Zhuo Wang
- GrandOmics Biosciences, Beijing, 102206, China
- Zongyi Sun
- GrandOmics Biosciences, Beijing, 102206, China
- Benxia Hu
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Adeola Oluwakemi Ayoola
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Fan Liang
- GrandOmics Biosciences, Beijing, 102206, China
- Jingjing Li
- GrandOmics Biosciences, Beijing, 102206, China
- José R Sandoval
- Centro de Investigación de Genética y Biología Molecular (CIGBM), Instituto de Investigación, Facultad de Medicina, Universidad de San Martín de Porres, Lima, 15102, Peru
- David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Jue Ruan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Chuan-Le Xiao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, #7 Jinsui Road, Tianhe District, Guangzhou, China
- Depeng Wang
- GrandOmics Biosciences, Beijing, 102206, China.
- Dong-Dong Wu
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
- Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), National Resource Center for Non-Human Primates, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650107, China.
- Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
- Sheng Wang
- Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China.
- Yunnan Key Laboratory of Biodiversity Information, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
16. Laudanski K, Elmadhoun O, Mathew A, Kahn-Pascual Y, Kerfeld MJ, Chen J, Sisniega DC, Gomez F. Anesthetic Considerations for Patients with Hereditary Neuropathy with Liability to Pressure Palsies: A Narrative Review. Healthcare (Basel) 2024; 12:858. [PMID: 38667620 PMCID: PMC11050561 DOI: 10.3390/healthcare12080858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 03/28/2024] [Accepted: 03/29/2024] [Indexed: 04/28/2024] Open
Abstract
Hereditary neuropathy with liability to pressure palsies (HNPP) is an autosomal dominant demyelinating neuropathy characterized by an increased susceptibility to peripheral nerve injury from trauma, compression, or shear forces. Patients with this condition are unique, necessitating distinct considerations for anesthesia and surgical teams. This review describes the etiology, prevalence, clinical presentation, and management of HNPP and presents contemporary evidence and recommendations for optimal care for HNPP patients in the perioperative period. While the incidence of HNPP is reported at 7-16 per 100,000, this figure may be an underestimation due to underdiagnosis, further complicating medicolegal issues. With the subtle nature of symptoms associated with HNPP, patients with this condition may remain unrecognized during the perioperative period, posing significant risks. Several aspects of caring for this population, including anesthetic choices, intraoperative positioning, and monitoring strategy, may deviate from standard practices. As such, a tailored approach to caring for this unique population, coupled with meticulous preoperative planning, is crucial and requires a multidisciplinary approach.
Affiliation(s)
- Krzysztof Laudanski
- Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Omar Elmadhoun
- Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Amal Mathew
- School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104, USA
- Yul Kahn-Pascual
- St George’s University Hospitals NHS Foundation Trust, London SW17 0QT, UK
- Mitchell J. Kerfeld
- Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- James Chen
- Department of Anesthesiology and Perioperative Care, Mayo Clinic, Rochester, MN 55902, USA
- Daniella C. Sisniega
- Department of Neurology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Francisco Gomez
- Department of Neurology, University of Missouri, Columbia, MO 65211, USA
17. Zhang S, Xu N, Fu L, Yang X, Li Y, Yang Z, Feng Y, Ma K, Jiang X, Han J, Hu R, Zhang L, de Gennaro L, Ryabov F, Meng D, He Y, Wu D, Yang C, Paparella A, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Comparative genomics of macaques and integrated insights into genetic variation and population history. bioRxiv 2024:2024.04.07.588379. [PMID: 38645259 PMCID: PMC11030432 DOI: 10.1101/2024.04.07.588379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The crab-eating macaque (Macaca fascicularis) and rhesus macaque (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting the potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.
18
Hujoel MLA, Handsaker RE, Sherman MA, Kamitaki N, Barton AR, Mukamel RE, Terao C, McCarroll SA, Loh PR. Protein-altering variants at copy number-variable regions influence diverse human phenotypes. Nat Genet 2024; 56:569-578. [PMID: 38548989 PMCID: PMC11018521 DOI: 10.1038/s41588-024-01684-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 02/08/2024] [Indexed: 04/09/2024]
Abstract
Copy number variants (CNVs) are among the largest genetic variants, yet CNVs have not been effectively ascertained in most genetic association studies. Here we ascertained protein-altering CNVs from UK Biobank whole-exome sequencing data (n = 468,570) using haplotype-informed methods capable of detecting subexonic CNVs and variation within segmental duplications. Incorporating CNVs into analyses of rare variants predicted to cause gene loss of function (LOF) identified 100 associations of predicted LOF variants with 41 quantitative traits. A low-frequency partial deletion of RGL3 exon 6 conferred one of the strongest protective effects of gene LOF on hypertension risk (odds ratio = 0.86 (0.82-0.90)). Protein-coding variation in rapidly evolving gene families within segmental duplications (previously invisible to most analysis methods) generated some of the human genome's largest contributions to variation in type 2 diabetes risk, chronotype and blood cell traits. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.
Affiliation(s)
- Margaux L A Hujoel
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Robert E Handsaker
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Maxwell A Sherman
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Serinus Biosciences Inc., New York, NY, USA
- Nolan Kamitaki
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Alison R Barton
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Ronen E Mukamel
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
- Steven A McCarroll
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Po-Ru Loh
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
19
Mao Y, Harvey WT, Porubsky D, Munson KM, Hoekzema K, Lewis AP, Audano PA, Rozanski A, Yang X, Zhang S, Yoo D, Gordon DS, Fair T, Wei X, Logsdon GA, Haukness M, Dishuck PC, Jeong H, Del Rosario R, Bauer VL, Fattor WT, Wilkerson GK, Mao Y, Shi Y, Sun Q, Lu Q, Paten B, Bakken TE, Pollen AA, Feng G, Sawyer SL, Warren WC, Carbone L, Eichler EE. Structurally divergent and recurrently mutated regions of primate genomes. Cell 2024; 187:1547-1562.e13. [PMID: 38428424 PMCID: PMC10947866 DOI: 10.1016/j.cell.2024.01.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 11/26/2023] [Accepted: 01/31/2024] [Indexed: 03/03/2024]
Abstract
We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
Affiliation(s)
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Allison Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Tyler Fair
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Xiaoxi Wei
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Ricardo Del Rosario
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Vanessa L Bauer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Will T Fattor
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Gregory K Wilkerson
- Department of Veterinary Sciences, Michale E. Keeling Center for Comparative Medicine and Research, The University of Texas MD Anderson Cancer Center, Bastrop, TX, USA; Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
- Yuxiang Mao
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
- Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
- Qiang Sun
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
- Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Alex A Pollen
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Guoping Feng
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sara L Sawyer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Wesley C Warren
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA; Department of Surgery, School of Medicine, University of Missouri, Columbia, MO, USA; Institute of Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Lucia Carbone
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA; Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA; Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA; Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
20
Annapragada AV, Niknafs N, White JR, Bruhm DC, Cherry C, Medina JE, Adleff V, Hruban C, Mathios D, Foda ZH, Phallen J, Scharpf RB, Velculescu VE. Genome-wide repeat landscapes in cancer and cell-free DNA. Sci Transl Med 2024; 16:eadj9283. [PMID: 38478628 DOI: 10.1126/scitranslmed.adj9283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 02/16/2024] [Indexed: 03/22/2024]
Abstract
Genetic changes in repetitive sequences are a hallmark of cancer and other diseases, but characterizing these has been challenging using standard sequencing approaches. We developed a de novo kmer finding approach, called ARTEMIS (Analysis of RepeaT EleMents in dISease), to identify repeat elements from whole-genome sequencing. Using this method, we analyzed 1.2 billion kmers in 2837 tissue and plasma samples from 1975 patients, including those with lung, breast, colorectal, ovarian, liver, gastric, head and neck, bladder, cervical, thyroid, or prostate cancer. We identified tumor-specific changes in these patients in 1280 repeat element types from the LINE, SINE, LTR, transposable element, and human satellite families. These included changes to known repeats and 820 elements that were not previously known to be altered in human cancer. Repeat elements were enriched in regions of driver genes, and their representation was altered by structural changes and epigenetic states. Machine learning analyses of genome-wide repeat landscapes and fragmentation profiles in cfDNA detected patients with early-stage lung or liver cancer in cross-validated and externally validated cohorts. In addition, these repeat landscapes could be used to noninvasively identify the tissue of origin of tumors. These analyses reveal widespread changes in repeat landscapes of human cancers and provide an approach for their detection and characterization that could benefit early detection and disease monitoring of patients with cancer.
Affiliation(s)
- Akshaya V Annapragada
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Noushin Niknafs
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- James R White
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Daniel C Bruhm
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Christopher Cherry
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Jamie E Medina
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Vilmos Adleff
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Carolyn Hruban
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Dimitrios Mathios
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Zachariah H Foda
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Jillian Phallen
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Robert B Scharpf
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Victor E Velculescu
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
21
Porubsky D, Eichler EE. A 25-year odyssey of genomic technology advances and structural variant discovery. Cell 2024; 187:1024-1037. [PMID: 38290514 DOI: 10.1016/j.cell.2024.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 12/20/2023] [Accepted: 01/02/2024] [Indexed: 02/01/2024]
Abstract
This perspective focuses on advances in genome technology over the last 25 years and their impact on germline variant discovery within the field of human genetics. The field has witnessed tremendous technological advances from microarrays to short-read sequencing and now long-read sequencing. Each technology has provided genome-wide access to different classes of human genetic variation. We are now on the verge of comprehensive variant detection of all forms of variation for the first time with a single assay. We predict that this transition will further transform our understanding of human health and biology and, more importantly, provide novel insights into the dynamic mutational processes shaping our genomes.
Affiliation(s)
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
22
Cerdán-Vélez D, Tress ML. The T2T-CHM13 reference assembly uncovers essential WASH1 and GPRIN2 paralogues. Bioinform Adv 2024; 4:vbae029. [PMID: 38464973 PMCID: PMC10924726 DOI: 10.1093/bioadv/vbae029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 01/02/2024] [Accepted: 02/26/2024] [Indexed: 03/12/2024]
Abstract
Summary: The recently published T2T-CHM13 reference assembly completed the annotation of the final 8% of the human genome. It introduced 1956 genes, close to 100 of which are predicted to be coding because they have a protein coding parent gene. Here, we confirm the coding status and functional relevance of two of these genes, paralogues of WASHC1 and GPRIN2. We find that LOC124908094, one of four novel subtelomeric WASH1 genes uncovered in the new assembly, produces the WASH1 protein that forms part of the vital actin-regulatory WASH complex. Its coding status is supported by abundant proteomics, conservation, and cDNA evidence. It was previously assumed that gene WASHC1 produced the functional WASH1 protein, but new evidence shows that WASHC1 is a human-derived duplication and likely to be one of 12 WASH1 pseudogenes in the human gene set. We also find that the T2T-CHM13 assembly has added a functionally important copy of GPRIN2 to the human gene set. We demonstrate that uniquely mapping peptides from proteomics databases support the novel LOC124900631 rather than the GRCh38 assembly GPRIN2 gene. These new additions to the set of human coding genes underlines the importance of the new T2T-CHM13 assembly. Availability and implementation: None.
Affiliation(s)
- Daniel Cerdán-Vélez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Michael Liam Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
23
Brann T, Beltramini A, Chaparro C, Berriman M, Doyle SR, Protasio AV. Subtelomeric plasticity contributes to gene family expansion in the human parasitic flatworm Schistosoma mansoni. BMC Genomics 2024; 25:217. [PMID: 38413905 PMCID: PMC10900676 DOI: 10.1186/s12864-024-10032-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 01/19/2024] [Indexed: 02/29/2024] Open
Abstract
BACKGROUND: The genomic region that lies between the telomere and chromosome body, termed the subtelomere, is heterochromatic, repeat-rich, and frequently undergoes rearrangement. Within this region, large-scale structural changes enable gene diversification, and, as such, large multicopy gene families are often found at the subtelomere. In some parasites, genes associated with proliferation, invasion, and survival are often found in these regions, where they benefit from the subtelomere's highly plastic, rapidly changing nature. The increasing availability of complete (or near complete) parasite genomes provides an opportunity to investigate these typically poorly defined and overlooked genomic regions and potentially reveal relevant gene families necessary for the parasite's lifestyle. RESULTS: Using the latest chromosome-scale genome assembly and hallmark repeat richness observed at chromosome termini, we have identified and characterised the subtelomeres of Schistosoma mansoni, a metazoan parasitic flatworm that infects over 250 million people worldwide. Approximately 12% of the S. mansoni genome is classified as subtelomeric, and, in line with other organisms, we find these regions to be gene-poor but rich in transposable elements. We find that S. mansoni subtelomeres have undergone extensive interchromosomal recombination and that these sites disproportionately contribute to the 2.3% of the genome derived from segmental duplications. This recombination has led to the expansion of subtelomeric gene clusters containing 103 genes, including the immunomodulatory annexins and other gene families with unknown roles. The largest of these is a 49-copy plexin domain-containing protein cluster, exclusively expressed in the tegument (the tissue located at the host-parasite physical interface) of intramolluscan life stages. CONCLUSIONS: We propose that subtelomeric regions act as a genomic playground for trial-and-error of gene duplication and subsequent divergence. Owing to the importance of subtelomeric genes in other parasites, gene families implicated in this subtelomeric expansion within S. mansoni warrant further characterisation for a potential role in parasitism.
Affiliation(s)
- T Brann
- Department of Pathology, University of Cambridge, Cambridge, CB1 2PQ, UK
- A Beltramini
- Department of Pathology, University of Cambridge, Cambridge, CB1 2PQ, UK
- C Chaparro
- IHPE, CNRS, IFREMER, UPVD, University Montpellier, Perpignan, F-66860, France
- M Berriman
- School of Infection and Immunity, University of Glasgow, Glasgow, G12 8TA, UK
- S R Doyle
- Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- A V Protasio
- Department of Pathology, University of Cambridge, Cambridge, CB1 2PQ, UK.
- Christ's College, Cambridge, CB2 3BU, UK.
24
Li J, Cai X, Jiang P, Wang H, Zhang S, Sun T, Chen C, Fan K. Co-based Nanozymatic Profiling: Advances Spanning Chemistry, Biomedical, and Environmental Sciences. Adv Mater 2024; 36:e2307337. [PMID: 37724878 DOI: 10.1002/adma.202307337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 09/12/2023] [Indexed: 09/21/2023]
Abstract
Nanozymes, next-generation enzyme-mimicking nanomaterials, have entered an era of rational design; among them, Co-based nanozymes have emerged as captivating players over times. Co-based nanozymes have been developed and have garnered significant attention over the past five years. Their extraordinary properties, including regulatable enzymatic activity, stability, and multifunctionality stemming from magnetic properties, photothermal conversion effects, cavitation effects, and relaxation efficiency, have made Co-based nanozymes a rising star. This review presents the first comprehensive profiling of the Co-based nanozymes in the chemistry, biology, and environmental sciences. The review begins by scrutinizing the various synthetic methods employed for Co-based nanozyme fabrication, such as template and sol-gel methods, highlighting their distinctive merits from a chemical standpoint. Furthermore, a detailed exploration of their wide-ranging applications in biosensing and biomedical therapeutics, as well as their contributions to environmental monitoring and remediation is provided. Notably, drawing inspiration from state-of-the-art techniques such as omics, a comprehensive analysis of Co-based nanozymes is undertaken, employing analogous statistical methodologies to provide valuable guidance. To conclude, a comprehensive outlook on the challenges and prospects for Co-based nanozymes is presented, spanning from microscopic physicochemical mechanisms to macroscopic clinical translational applications.
Affiliation(s)
- Jingqi Li
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Xinda Cai
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Peng Jiang
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Huayuan Wang
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Shiwei Zhang
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Tiedong Sun
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Chunxia Chen
- College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University, Harbin, 150040, P. R. China
- Aulin College, Northeast Forestry University, Harbin, 150040, P. R. China
- Kelong Fan
- CAS Engineering Laboratory for Nanozyme, Key Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, P. R. China
- Nanozyme Medical Center, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, P. R. China
25
Kawakami R, Hiraide T, Watanabe K, Miyamoto S, Hira K, Komatsu K, Ishigaki H, Sakaguchi K, Maekawa M, Yamashita K, Fukuda T, Miyairi I, Ogata T, Saitsu H. RNA sequencing and target long-read sequencing reveal an intronic transposon insertion causing aberrant splicing. J Hum Genet 2024; 69:91-99. [PMID: 38102195 DOI: 10.1038/s10038-023-01211-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/28/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023]
Abstract
More than half of cases with suspected genetic disorders remain unsolved by genetic analysis using short-read sequencing such as exome sequencing (ES) and genome sequencing (GS). RNA sequencing (RNA-seq) and long-read sequencing (LRS) are useful for interpretation of candidate variants and detection of structural variants containing repeat sequences, respectively. Recently, adaptive sampling on nanopore sequencers enables target LRS more easily. Here, we present a Japanese girl with premature chromatid separation (PCS)/mosaic variegated aneuploidy (MVA) syndrome. ES detected a known pathogenic maternal heterozygous variant (c.1402-5A>G) in intron 10 of BUB1B (NM_001211.6), a known responsive gene for PCS/MVA syndrome with autosomal recessive inheritance. Minigene splicing assay revealed that almost all transcripts from the c.1402-5G allele have mis-splicing with 4-bp insertion. GS could not detect another pathogenic variant, while RNA-seq revealed abnormal reads in intron 2. To extensively explore variants in intron 2, we performed adaptive sampling and identified a paternal 3.0 kb insertion. Consensus sequence of 16 reads spanning the insertion showed that the insertion consists of Alu and SVA elements. Realignment of RNA-seq reads to the new reference sequence containing the insertion revealed that 16 reads have 5' splice site within the insertion and 3' splice site at exon 3, demonstrating causal relationship between the insertion and aberrant splicing. In addition, immunoblotting showed severely diminished BUB1B protein level in patient derived cells. These data suggest that detection of transcriptomic abnormalities by RNA-seq can be a clue for identifying pathogenic variants, and determination of insert sequences is one of merits of LRS.
Affiliation(s)
- Ryota Kawakami
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Takuya Hiraide
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kazuki Watanabe
- Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Sachiko Miyamoto
- Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kota Hira
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kazuyuki Komatsu
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Hidetoshi Ishigaki
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kimiyoshi Sakaguchi
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Masato Maekawa
- Department of Laboratory Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Keita Yamashita
- Department of Laboratory Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Tokiko Fukuda
- Department of Hamamatsu Child Health and Developmental Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
| | - Isao Miyairi
- Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
| | - Tsutomu Ogata
- Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Department of Pediatrics, Hamamatsu Medical Center, Hamamatsu, Japan
| | - Hirotomo Saitsu
- Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan.
|
26
|
Lu X, Liu L. Genome stability from the perspective of telomere length. Trends Genet 2024; 40:175-186. [PMID: 37957036 DOI: 10.1016/j.tig.2023.10.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/21/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023]
Abstract
Telomeres and their associated proteins protect the ends of chromosomes to maintain genome stability. Telomeres undergo progressive shortening with each cell division in mammalian somatic cells without telomerase, resulting in genome instability. When telomeres reach a critically short length or are recognized as a damage signal, cells enter a state of senescence, followed by cell cycle arrest, programmed cell death, or immortalization. This review provides an overview of recent advances in the intricate relationship between telomeres and genome instability. Alongside well-established mechanisms such as chromosomal fusion and telomere fusion, we will delve into the perspective on genome stability by examining the role of retrotransposons. Retrotransposons represent an emerging pathway to regulate genome stability through their interactions with telomeres.
Affiliation(s)
- Xinyi Lu
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, Tianjin 300350, China.
| | - Lin Liu
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, Tianjin 300350, China; Frontiers Science Center for Cell Responses, College of Life Science, Nankai University, Tianjin, Tianjin 300071, China; Haihe Laboratory of Cell Ecosystem, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China; Institute of Translational Medicine, Tianjin Union Medical Center, Nankai University, Tianjin 300000, China.
|
27
|
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
| | - Mikhail O Ushakov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Tatyana E Lazareva
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Yulia A Nasykhova
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Andrey S Glotov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Alexander V Predeus
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
|
28
|
Lin X, Tang B, Li Z, Shi L, Zhu H. Genome-wide identification and expression analyses of CYP450 genes in sweet potato (Ipomoea batatas L.). BMC Genomics 2024; 25:58. [PMID: 38218763 PMCID: PMC10787477 DOI: 10.1186/s12864-024-09965-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 01/03/2024] [Indexed: 01/15/2024] Open
Abstract
BACKGROUND Cytochrome P450 monooxygenases (CYP450s) play a crucial role in various biochemical reactions involved in the synthesis of antioxidants, pigments, structural polymers, and defense-related compounds in plants. As sweet potato (Ipomoea batatas L.) holds significant economic importance, a comprehensive analysis of CYP450 genes in this plant species can offer valuable insights into the evolutionary relationships and functional characteristics of these genes. RESULTS In this study, we identified and categorized 95 CYP450 genes from the sweet potato genome into 5 families and 31 subfamilies. The predicted subcellular localization results indicate that CYP450s are distributed in the cell membrane system. The promoter region of the IbCYP450 genes contains various cis-acting elements related to plant hormones and stress responses. In addition, ten conserved motifs (Motif1-Motif10) have been identified in the IbCYP450 family proteins, with 5 genes lacking introns and consisting of a single exon. We observed extensive duplication events within the CYP450 gene family, which may account for its expansion. The gene duplication analysis results showed the presence of 15 pairs of genes with tandem repeats. Interaction network analysis reveals that IbCYP450 families can interact with multiple target genes and that there are protein-protein interactions within the family. Transcription factor interaction analysis suggests that IbCYP450 families interact with multiple transcription factors. Furthermore, gene expression analysis revealed tissue-specific expression patterns of CYP450 genes in sweet potatoes, as well as their response to abiotic stress and plant hormones. Notably, quantitative real-time polymerase chain reaction (qRT‒PCR) analysis indicated the involvement of CYP450 genes in the defense response against abiotic stresses in sweet potatoes.
CONCLUSIONS These findings provide a foundation for further investigations aiming to elucidate the biological functions of CYP450 genes in sweet potatoes.
Affiliation(s)
- Xiongjian Lin
- College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China
| | - Binquan Tang
- College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China
| | - Zhenqin Li
- College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China
| | - Lei Shi
- College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China
| | - Hongbo Zhu
- College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, China.
|
29
|
Ungar RA, Goddard PC, Jensen TD, Degalez F, Smith KS, Jin CA, Bonner DE, Bernstein JA, Wheeler MT, Montgomery SB. Impact of genome build on RNA-seq interpretation and diagnostics. medRxiv 2024:2024.01.11.24301165. [PMID: 38260490 PMCID: PMC10802764 DOI: 10.1101/2024.01.11.24301165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network (UDN) and Genomics Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium. We identified 2,800 genes with build-dependent quantification across six routinely-collected biospecimens, including 1,391 protein-coding genes and 341 known rare disease genes. We further observed multiple genes that only have detectable expression in a subset of genome builds. Finally, we characterized how genome build impacts the detection of outlier transcriptomic events. Combined, we provide a database of genes impacted by build choice, and recommend that transcriptomics-guided analyses and diagnoses are cross-referenced with these data for robustness.
Affiliation(s)
- Rachel A. Ungar
- Department of Genetics, School of Medicine, Stanford University
- Department of Pathology, School of Medicine, Stanford University
| | - Pagé C. Goddard
- Department of Genetics, School of Medicine, Stanford University
- Department of Pathology, School of Medicine, Stanford University
| | - Tanner D. Jensen
- Department of Genetics, School of Medicine, Stanford University
- Department of Pathology, School of Medicine, Stanford University
| | | | - Kevin S. Smith
- Department of Pathology, School of Medicine, Stanford University
| | | | | | - Devon E. Bonner
- Department of Pediatrics, School of Medicine, Stanford University
- Stanford Center for Undiagnosed Diseases, Stanford University
| | | | - Matthew T. Wheeler
- Department of Cardiovascular Medicine, School of Medicine, Stanford University
| | - Stephen B. Montgomery
- Department of Genetics, School of Medicine, Stanford University
- Department of Pathology, School of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
|
30
|
Abiusi E, Costa-Roger M, Bertini ES, Tiziano FD, Tizzano EF, Abiusi E, Baranello G, Bertini E, Boemer F, Burghes A, Codina-Solà M, Costa-Roger M, Dangouloff T, Groen E, Gos M, Jędrzejowska M, Kirschner J, Lemmink HH, Müller-Felber W, Ouillade MC, Quijano-Roy S, Rucinski K, Saugier-Veber P, Tiziano FD, Tizzano EF, Wirth B. 270th ENMC International Workshop: Consensus for SMN2 genetic analysis in SMA patients 10-12 March, 2023, Hoofddorp, the Netherlands. Neuromuscul Disord 2024; 34:114-122. [PMID: 38183850 DOI: 10.1016/j.nmd.2023.12.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2024]
Abstract
The 270th ENMC workshop aimed to develop a common procedure to optimize the reliability of SMN2 gene copy number determination and to reinforce collaborative networks between molecular scientists and clinicians. The workshop involved neuromuscular and clinical experts and representatives of patient advocacy groups and industry. SMN2 copy number is currently one of the main determinants of therapeutic decisions in SMA patients: participants discussed the issues that laboratories may encounter in this molecular test and the critical importance of accurate determination, due to its implications as a prognostic factor in symptomatic patients and in individuals identified through newborn screening programmes. At the end of the workshop, the attendees defined a set of recommendations divided into four topics: SMA molecular prognosis assessment, newborn screening for SMA, SMN2 copies and treatments, and modifiers and biomarkers. Moreover, the group drew up a series of recommendations for the companies manufacturing laboratory kits that will help to minimize the risk of errors, regardless of the laboratories' expertise.
Affiliation(s)
- Emanuela Abiusi
- Section of Genomic Medicine, Department of Public Health and Life Sciences, Università Cattolica del Sacro Cuore, Roma, Italy
| | - Mar Costa-Roger
- Clinical and Molecular Genetics Area, Vall d'Hebron Hospital; Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
| | - Enrico Silvio Bertini
- Research Unit of Neuromuscular Disease, Bambino Gesù Children's Hospital, IRCCS, Roma, Italy
| | - Francesco Danilo Tiziano
- Section of Genomic Medicine, Department of Public Health and Life Sciences, Università Cattolica del Sacro Cuore, Roma, Italy
- Complex Unit of Medical Genetics, Fondazione Policlinico Universitario IRCCS “A. Gemelli”, Roma, Italy
| | - Eduardo F Tizzano
- Clinical and Molecular Genetics Area, Vall d'Hebron Hospital; Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
| | - Emanuela Abiusi
- Section of Genomic Medicine, Dept. of Life Sciences and Public Health, Catholic University of the Sacred Heart, Roma, Italy
| | - Giovanni Baranello
- The Dubowitz Neuromuscular Centre, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, NIHR Great Ormond Street Hospital Biomedical Research Centre & Great Ormond Street Hospital NHS Foundation Trust, 30 Guilford Street, London WC1N 1EH, UK
| | - Enrico Bertini
- Research Unit of Neuromuscular Disease, Bambino Gesù Children's Hospital, IRCCS, Roma, Italy
| | - François Boemer
- Biochemical Genetics Lab, Department of Human Genetics, University Hospital, University of Liège, 4000 Liège, Belgium
| | - Arthur Burghes
- Department of Neurology, The Ohio State University Wexner Medical Center, Columbus, OH, USA
| | - Marta Codina-Solà
- Neuromuscular Reference Center, Department of Paediatrics, University Hospital Liege & University of Liege, Belgium
| | - Mar Costa-Roger
- Department of Neurology & Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - Tamara Dangouloff
- Department of Medical Genetics, Institute of Mother and Child, Warsaw, Poland
| | - Ewout Groen
- Department of Neurology, Medical University of Warsaw, Warsaw, Poland
| | - Monika Gos
- Department of Neuropediatrics and Muscle Disorders, Medical Center University of Freiburg, Faculty of Medicine, Freiburg, Germany
| | - Maria Jędrzejowska
- Department of Genetics, University of Groningen, University Medical Center Groningen, 9700 RB Groningen, The Netherlands
| | - Janbernd Kirschner
- Centre for Neuromuscular Disorders, Center for Translational Neuro and Behavioral Sciences, Department of Pediatric Neurology, University Duisburg-Essen, 45147 Essen, Germany
| | - Henny H Lemmink
- AFM Téléthon, Évry, France; SMA Europe; European Alliance for Newborn Screening in Spinal Muscular Atrophy
| | - Wolfgang Müller-Felber
- Pediatric Neuromuscular Unit (NEIDF Reference Center at FILNEMUS & Euro-NMD), Child Neurology Department, Raymond Poincaré Hospital (UVSQ), APHP Université Paris Saclay, Garches France
| | - Marie-Christine Ouillade
- Fundacja SMA, Warsaw, Poland; SMA Europe; European Alliance for Newborn Screening in Spinal Muscular Atrophy
| | - Susana Quijano-Roy
- Univ Rouen Normandie, Inserm U1245, Normandie Univ and CHU Rouen, Department of Genetics and Nord/Est/Ile de France Neuromuscular Reference Center, F-76000 Rouen, France
| | - Kacper Rucinski
- Institute of Medical Genomics, Dept. of Life Sciences and Public Health, Catholic University of the Sacred Heart, and Complex Unit of Medical Genetics, Fondazione Policlinico Universitario IRCCS “A. Gemelli”, Roma, Italy
| | - Pascale Saugier-Veber
- Institute of Human Genetics, University Hospital of Cologne, Center for Molecular Medicine, University of Cologne and Center for Rare Diseases Cologne, University Hospital of Cologne, Cologne, Germany
| | - Francesco Danilo Tiziano
- Institute of Medical Genomics, Dept. of Life Sciences and Public Health, Catholic University of the Sacred Heart, and Complex Unit of Medical Genetics, Fondazione Policlinico Universitario IRCCS “A. Gemelli”, Roma, Italy
| | - Eduardo Fidel Tizzano
- Clinical and Molecular Genetics Area, Vall d'Hebron Hospital; Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
| | - Brunhilde Wirth
- Institute of Human Genetics, University Hospital of Cologne, Center for Molecular Medicine, University of Cologne and Center for Rare Diseases Cologne, University Hospital of Cologne, Cologne, Germany
|
31
|
Martin R, Espinoza CY, Large CRL, Rosswork J, Van Bruinisse C, Miller AW, Sanchez JC, Miller M, Paskvan S, Alvino GM, Dunham MJ, Raghuraman MK, Brewer BJ. Template switching between the leading and lagging strands at replication forks generates inverted copy number variants through hairpin-capped extrachromosomal DNA. PLoS Genet 2024; 20:e1010850. [PMID: 38175823 PMCID: PMC10766183 DOI: 10.1371/journal.pgen.1010850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 10/23/2023] [Indexed: 01/06/2024] Open
Abstract
Inherited and germ-line de novo copy number variants (CNVs) are increasingly found to be correlated with human developmental and cancerous phenotypes. Several models for template switching during replication have been proposed to explain the generation of these gross chromosomal rearrangements. We proposed a model of template switching (ODIRA: origin-dependent inverted repeat amplification) in which simultaneous ligation of the leading and lagging strands at diverging replication forks could generate segmental inverted triplications through an extrachromosomal inverted circular intermediate. Here, we created a genetic assay using split-ura3 cassettes to trap the proposed inverted intermediate. However, instead of recovering circular inverted intermediates, we found inverted linear chromosomal fragments ending in native telomeres, suggesting that a template switch had occurred at the centromere-proximal fork of a replication bubble. As telomeric inverted hairpin fragments can also be created through double-strand breaks, we tested whether replication errors or repair of double-stranded DNA breaks were the most likely initiating event. The results from CRISPR/Cas9 cleavage experiments and growth in the replication inhibitor hydroxyurea indicate that it is a replication error, not a double-stranded break, that creates the inverted junctions. Since inverted amplicons of the SUL1 gene occur during long-term growth in sulfate-limited chemostats, we sequenced evolved populations to look for evidence of linear intermediates formed by an error in replication. All of the data are compatible with a two-step version of the ODIRA model in which sequential template switching at short inverted repeats between the leading and lagging strands at a replication fork, followed by integration via homologous recombination, generates inverted interstitial triplications.
Affiliation(s)
- Rebecca Martin
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Claudia Y. Espinoza
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Christopher R. L. Large
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Joshua Rosswork
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Cole Van Bruinisse
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Aaron W. Miller
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Joseph C. Sanchez
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Madison Miller
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Samantha Paskvan
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Gina M. Alvino
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Maitreya J. Dunham
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - M. K. Raghuraman
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Bonita J. Brewer
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
|
32
|
Volpe E, Corda L, Tommaso ED, Pelliccia F, Ottalevi R, Licastro D, Guarracino A, Capulli M, Formenti G, Tassone E, Giunta S. The complete diploid reference genome of RPE-1 identifies human phased epigenetic landscapes. bioRxiv 2023:2023.11.01.565049. [PMID: 38168337 PMCID: PMC10760208 DOI: 10.1101/2023.11.01.565049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Comparative analysis of recent human genome assemblies highlights profound sequence divergence that peaks within polymorphic loci such as centromeres. This raises the question of whether human reference genomes are adequate for accurately analyzing sequencing data derived from experimental cell lines. Here, we generated the complete diploid genome assembly for human retinal epithelial cells (RPE-1), a widely used non-cancer laboratory cell line with a stable karyotype, to use as a matched reference for multi-omics sequencing data analysis. Our RPE1v1.0 assembly presents completely phased haplotypes and chromosome-level scaffolds that span centromeres with ultra-high base accuracy (>QV60). We mapped the haplotype-specific genomic variation of this cell line, including t(Xq;10q), a stable 73.18 Mb duplication of chromosome 10 translocated onto the microdeleted chromosome X telomere. Polymorphisms between haplotypes of the same genome reveal genetic and epigenetic variation for all chromosomes, especially at centromeres. Using the RPE-1 assembly as a matched reference genome improves the mapping quality of multi-omics reads originating from RPE-1 cells, with a drastic reduction in alignment mismatches compared to using the most complete human reference to date (CHM13). Leveraging the accuracy achieved using a matched reference, we were able to identify the kinetochore sites at base pair resolution and show unprecedented variation between haplotypes. This work showcases the use of matched reference genomes for multi-omics analyses and serves as the foundation for a call to comprehensively assemble experimentally relevant cell lines for widespread application.
Affiliation(s)
- Emilia Volpe
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Luca Corda
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Elena Di Tommaso
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Franca Pelliccia
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Riccardo Ottalevi
- Department of Bioinformatic, Dante Genomics Corp Inc., 667 Madison Avenue, New York, NY 10065 USA and S.s.17, 67100, L’Aquila, Italy
| | | | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Mattia Capulli
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, L’Aquila, Italy
| | - Giulio Formenti
- The Rockefeller University, 1230 York Avenue, 10065 New York, USA
| | - Evelyne Tassone
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Simona Giunta
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
|
33
|
Lee AS, Ayers LJ, Kosicki M, Chan WM, Fozo LN, Pratt BM, Collins TE, Zhao B, Rose MF, Sanchis-Juan A, Fu JM, Wong I, Zhao X, Tenney AP, Lee C, Laricchia KM, Barry BJ, Bradford VR, Lek M, MacArthur DG, Lee EA, Talkowski ME, Brand H, Pennacchio LA, Engle EC. A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders. medRxiv 2023:2023.12.22.23300468. [PMID: 38234731 PMCID: PMC10793524 DOI: 10.1101/2023.12.22.23300468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Affiliation(s)
- Arthur S Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Lauren J Ayers
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Michael Kosicki
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Wai-Man Chan
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Lydia N Fozo
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Brandon M Pratt
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Thomas E Collins
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Boxun Zhao
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
| | - Matthew F Rose
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Department of Pathology, Boston Children's Hospital, Boston, MA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
| | - Alba Sanchis-Juan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Jack M Fu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Isaac Wong
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Xuefang Zhao
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Alan P Tenney
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Cassia Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Harvard College, Cambridge, MA
| | - Kristen M Laricchia
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Brenda J Barry
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Victoria R Bradford
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Monkol Lek
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Daniel G MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Eunjung Alice Lee
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Department of Genetics, Harvard Medical School, Boston, MA
| | - Michael E Talkowski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Harrison Brand
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Elizabeth C Engle
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
- Department of Ophthalmology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| |
|
34
|
Chaisson MJP, Sulovari A, Valdmanis PN, Miller DE, Eichler EE. Advances in the discovery and analyses of human tandem repeats. Emerg Top Life Sci 2023; 7:361-381. [PMID: 37905568 PMCID: PMC10806765 DOI: 10.1042/etls20230074] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Revised: 10/18/2023] [Accepted: 10/18/2023] [Indexed: 11/02/2023]
Abstract
Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.
Affiliation(s)
- Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, U.S.A
- The Genomic and Epigenomic Regulation Program, USC Norris Cancer Center, University of Southern California, Los Angeles, CA 90089, U.S.A
| | - Arvis Sulovari
- Computational Biology, Cajal Neuroscience Inc, Seattle, WA 98102, U.S.A
| | - Paul N Valdmanis
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
| | - Danny E Miller
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, U.S.A
- Department of Pediatrics, University of Washington, Seattle, WA 98195, U.S.A
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, U.S.A
| |
|
35
|
Yang Y, Wu Z, Wu Z, Li T, Shen Z, Zhou X, Wu X, Li G, Zhang Y. A near-complete assembly of asparagus bean provides insights into anthocyanin accumulation in pods. Plant Biotechnology Journal 2023; 21:2473-2489. [PMID: 37558431 PMCID: PMC10651155 DOI: 10.1111/pbi.14142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 07/11/2023] [Accepted: 07/23/2023] [Indexed: 08/11/2023]
Abstract
Asparagus bean (Vigna unguiculata ssp. sesquipedialis), a subspecies of V. unguiculata, is a vital legume crop widely cultivated in Asia for its tender pods consumed as vegetables. However, the existing asparagus bean assemblies still contain numerous gaps and unanchored sequences, which presents challenges to functional genomics research. Here, we present an improved reference genome sequence of an elite asparagus bean variety, Fengchan 6, achieved through the integration of nanopore ultra-long reads, PacBio high-fidelity reads, and Hi-C technology. The improved assembly is 521.3 Mb in length and demonstrates several enhancements, including a higher N50 length (46.4 Mb), an anchor ratio of 99.8%, and the presence of only one gap. Furthermore, we successfully assembled 14 telomeres and all 11 centromeres, including four telomere-to-telomere chromosomes. Remarkably, the centromeric regions cover a total length of 38.1 Mb, providing valuable insights into the complex architecture of centromeres. Among the 30 594 predicted protein-coding genes, we identified 2356 genes that are tandemly duplicated in segmental duplication regions. These findings have implications for defence responses and may contribute to evolutionary processes. By utilizing the reference genome, we were able to effectively identify the presence of the gene VuMYB114, which regulates the accumulation of anthocyanins, thereby controlling the purple coloration of the pods. This discovery holds significant implications for understanding the underlying mechanisms of color determination and the breeding process. Overall, the highly improved reference genome serves as a crucial resource and lays a solid foundation for asparagus bean genomic studies and genetic improvement efforts.
Affiliation(s)
- Yi Yang
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| | - Zhikun Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat‐Sen University, Guangzhou, China
| | - Zengxiang Wu
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| | - Tinyao Li
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| | - Zhuo Shen
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| | - Xuan Zhou
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| | - Xinyi Wu
- Institute of Vegetable, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Guojing Li
- Institute of Vegetable, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Yan Zhang
- Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
| |
|
36
|
He Y, Chu Y, Guo S, Hu J, Li R, Zheng Y, Ma X, Du Z, Zhao L, Yu W, Xue J, Bian W, Yang F, Chen X, Zhang P, Wu R, Ma Y, Shao C, Chen J, Wang J, Li J, Wu J, Hu X, Long Q, Jiang M, Ye H, Song S, Li G, Wei Y, Xu Y, Ma Y, Chen Y, Wang K, Bao J, Xi W, Wang F, Ni W, Zhang M, Yu Y, Li S, Kang Y, Gao Z. T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese. Genomics, Proteomics & Bioinformatics 2023; 21:1085-1100. [PMID: 37595788 PMCID: PMC11082261 DOI: 10.1016/j.gpb.2023.08.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/18/2023] [Revised: 08/01/2023] [Accepted: 08/08/2023] [Indexed: 08/20/2023]
Abstract
Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version - T2T-CHM13 - reaches its highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22 + X + M and 22 + Y chromosomes in both haploids. The quality of T2T-YAO is much better than those of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from the ancient ancestors. Each haplotype of T2T-YAO possesses ∼ 330-Mb exclusive sequences, ∼ 3100 unique genes, and tens of thousands of nucleotide and structural variations as compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.
Affiliation(s)
- Yukun He
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China
| | - Yanan Chu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Shuming Guo
- Linfen Clinical Medicine Research Center, Linfen 041000, China; Institute of Chest and Lung Diseases, Shanxi Medical University, Taiyuan 030001, China
| | - Jiang Hu
- GrandOmics Biosciences Co., Ltd, Wuhan 430076, China
| | - Ran Li
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Yali Zheng
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Xinqian Ma
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Zhenglin Du
- Institute of PSI Genomics, Wenzhou 325024, China
| | - Lili Zhao
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Wenyi Yu
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Jianbo Xue
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Wenjie Bian
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Feifei Yang
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Xi Chen
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Pingan Zhang
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Rihan Wu
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Yifan Ma
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Changjun Shao
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Jing Chen
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Jian Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Jiwei Li
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Jing Wu
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Xiaoyi Hu
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Qiuyue Long
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Mingzheng Jiang
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Hongli Ye
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Shixu Song
- Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China
| | - Guangyao Li
- Linfen Clinical Medicine Research Center, Linfen 041000, China
| | - Yue Wei
- Linfen Clinical Medicine Research Center, Linfen 041000, China
| | - Yu Xu
- Beijing Jishuitan Hospital, Capital Medical University, Beijing 100035, China
| | - Yanliang Ma
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Yanwen Chen
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Keqiang Wang
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Jing Bao
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Wen Xi
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Fang Wang
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Wentao Ni
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Moqin Zhang
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Yan Yu
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Shengnan Li
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China
| | - Yu Kang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100490, China.
| | - Zhancheng Gao
- Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Institute of Chest and Lung Diseases, Shanxi Medical University, Taiyuan 030001, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
| |
|
37
|
Tan KT, Slevin MK, Leibowitz ML, Garrity-Janger M, Li H, Meyerson M. Neotelomeres and Telomere-Spanning Chromosomal Arm Fusions in Cancer Genomes Revealed by Long-Read Sequencing. bioRxiv 2023:2023.11.30.569101. [PMID: 38077026 PMCID: PMC10705422 DOI: 10.1101/2023.11.30.569101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Alterations in the structure and location of telomeres are key events in cancer genome evolution. However, previous genomic approaches, unable to span long telomeric repeat arrays, could not characterize the nature of these alterations. Here, we applied both long-read and short-read genome sequencing to assess telomere repeat-containing structures in cancers and cancer cell lines. Using long-read genome sequences that span telomeric repeat arrays, we defined four types of telomere repeat variations in cancer cells: neotelomeres where telomere addition heals chromosome breaks, chromosomal arm fusions spanning telomere repeats, fusions of neotelomeres, and peri-centromeric fusions with adjoined telomere and centromere repeats. Analysis of lung adenocarcinoma genome sequences identified somatic neotelomere and telomere-spanning fusion alterations. These results provide a framework for systematic study of telomeric repeat arrays in cancer genomes that could serve as a model for understanding the somatic evolution of other repetitive genomic elements.
Affiliation(s)
- Kar-Tong Tan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
| | - Michael K. Slevin
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Center for Cancer Genomics, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Mitchell L. Leibowitz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
| | - Max Garrity-Janger
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02215, USA
| | - Matthew Meyerson
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
- Center for Cancer Genomics, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Lead contact
| |
|
38
|
Rangwala SH, Rudnev DV, Ananiev VV, Asztalos A, Benica B, Borodin EA, Bouk N, Evgeniev VI, Kodali VK, Lotov V, Mozes E, Oh DH, Omelchenko MV, Savkina S, Sukharnikov E, Virothaisakun J, Murphy TD, Pruitt KD, Schneider VA. Interactive visualization of whole eukaryote genome alignments using NCBI's Comparative Genome Viewer (CGV). bioRxiv 2023:2023.10.30.564672. [PMID: 38077029 PMCID: PMC10705539 DOI: 10.1101/2023.10.30.564672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
We report a new visualization tool for analysis of whole genome assembly-assembly alignments, the Comparative Genome Viewer (CGV) (https://ncbi.nlm.nih.gov/genome/cgv/). CGV visualizes pairwise same-species and cross-species alignments provided by NCBI using assembly alignment algorithms developed by us and others. Researchers can examine the alignments between the two assemblies using two alternate views: a chromosome ideogram-based view or a 2D genome dotplot. Whole genome alignment views expose large structural differences spanning chromosomes, such as inversions or translocations. Users can also navigate to regions of interest, where they can detect and analyze smaller-scale deletions and rearrangements within specific chromosome or gene regions. RefSeq or user-provided gene annotation is displayed in the ideogram view where available. CGV currently provides approximately 700 alignments from over 300 animal, plant, and fungal species. CGV and related NCBI viewers are undergoing active development to further meet needs of the research community in comparative genome visualization.
Affiliation(s)
- Sanjida H Rangwala
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Dmitry V Rudnev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Victor V Ananiev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Andrea Asztalos
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Barrett Benica
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Evgeny A Borodin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Nathan Bouk
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vladislav I Evgeniev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vamsi K Kodali
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vadim Lotov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Eyal Mozes
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Dong-Ha Oh
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Marina V Omelchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Sofya Savkina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Ekaterina Sukharnikov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Joël Virothaisakun
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Terence D. Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| |
|
39
|
Wang S, Shen Y, Lin Z, Miao Y, Wang C, Zhang W, Zhang Y. New genes driven by segmental duplications share a testis-specific expression pattern in the chromosome-level genome assembly of tree sparrow. Integr Zool 2023. [PMID: 38014459 DOI: 10.1111/1749-4877.12789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Based on a chromosome-level genome assembly, a burst of new genes with different structures but a similar testis-specific expression pattern was detected in tree sparrow.
Affiliation(s)
- Shengnan Wang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yue Shen
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Zhaocun Lin
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yuquan Miao
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Chengqi Wang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Wenya Zhang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| | - Yingmei Zhang
- Gansu Key Laboratory of Biomonitoring and Bioremediation for Environmental Pollution, School of Life Science, Lanzhou University, Lanzhou, China
| |
|
40
|
Morgan MA, Mohammad Parast S, Iwanaszko M, Aoi Y, Yoo D, Dumar ZJ, Howard BC, Helmin KA, Liu Q, Thakur WR, Zeidner JM, Singer BD, Eichler EE, Shilatifard A. ELOA3: A primate-specific RNA polymerase II elongation factor encoded by a tandem repeat gene cluster. Science Advances 2023; 9:eadj1261. [PMID: 37992162 PMCID: PMC10664989 DOI: 10.1126/sciadv.adj1261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 10/19/2023] [Indexed: 11/24/2023]
Abstract
The biological role of the repetitive DNA sequences in the human genome remains an outstanding question. Recent long-read human genome assemblies have allowed us to identify a function for one of these repetitive regions. We have uncovered a tandem array of conserved primate-specific retrogenes encoding the protein Elongin A3 (ELOA3), a homolog of the RNA polymerase II (RNAPII) elongation factor Elongin A (ELOA). Our genomic analysis shows that the ELOA3 gene cluster is conserved among primates and the number of ELOA3 gene repeats is variable in the human population and across primate species. Moreover, the gene cluster has undergone concerted evolution and homogenization within primates. Our biochemical studies show that ELOA3 functions as a promoter-associated RNAPII pause-release elongation factor with distinct biochemical and functional features from its ancestral homolog, ELOA. We propose that the ELOA3 gene cluster has evolved to fulfil a transcriptional regulatory function unique to the primate lineage that can be targeted to regulate cellular hyperproliferation.
Affiliation(s)
- Marc A. J. Morgan
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Saeid Mohammad Parast
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Marta Iwanaszko
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Yuki Aoi
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA 98195, USA
| | - Zachary J. Dumar
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Benjamin C. Howard
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Kathryn A. Helmin
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Simpson Querrey Lung Institute for Translational Science, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Qianli Liu
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Simpson Querrey Lung Institute for Translational Science, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - William R. Thakur
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Jacob M. Zeidner
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Benjamin D. Singer
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Simpson Querrey Lung Institute for Translational Science, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Ali Shilatifard
- Department of Biochemistry and Molecular Genetics, Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| |
Collapse
|
41
|
Feng LY, Lin PF, Xu RJ, Kang HQ, Gao LZ. Comparative Genomic Analysis of Asian Cultivated Rice and Its Wild Progenitor (Oryza rufipogon) Has Revealed Evolutionary Innovation of the Pentatricopeptide Repeat Gene Family through Gene Duplication. Int J Mol Sci 2023; 24:16313. [PMID: 38003501 PMCID: PMC10671101 DOI: 10.3390/ijms242216313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 11/10/2023] [Accepted: 11/12/2023] [Indexed: 11/26/2023] Open
Abstract
The pentatricopeptide repeat (PPR) gene family is one of the largest gene families in land plants. However, current knowledge about the evolution of the PPR gene family remains largely limited. In this study, we performed a comparative genomic analysis of the PPR gene family in O. sativa and its wild progenitor, O. rufipogon, and outlined a comprehensive landscape of gene duplications. Our findings suggest that the majority of PPR genes originated from dispersed duplications. Although segmental duplications have only expanded approximately 11.30% and 13.57% of the PPR gene families in the O. sativa and O. rufipogon genomes, we interestingly obtained evidence that segmental duplication promotes the structural diversity of PPR genes through incomplete gene duplications. In the O. sativa and O. rufipogon genomes, 10 (~33.33%) and 22 pairs of gene duplications (~45.83%) had non-PPR paralogous genes through incomplete gene duplication. Segmental duplications leading to incomplete gene duplications might result in the acquisition of domains, thus promoting functional innovation and structural diversification of PPR genes. This study offers a unique perspective on the evolution of PPR gene structures and underscores the potential role of segmental duplications in PPR gene structural diversity.
Affiliation(s)
- Li-Ying Feng
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
- Pei-Fan Lin
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
- Rong-Jing Xu
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
- Hai-Qi Kang
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
- Li-Zhi Gao
  - Institution of Genomics and Bioinformatics, South China Agricultural University, Guangzhou 510642, China
  - Tropical Biodiversity and Genomics Research Center, Hainan University, Haikou 570228, China
|
42
|
Miga KH, Eichler EE. Envisioning a new era: Complete genetic information from routine, telomere-to-telomere genomes. Am J Hum Genet 2023; 110:1832-1840. [PMID: 37922882 PMCID: PMC10645551 DOI: 10.1016/j.ajhg.2023.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 11/07/2023] Open
Abstract
Advances in long-read sequencing and assembly now mean that individual labs can generate phased genomes that are more accurate and more contiguous than the original human reference genome. With declining costs and increasing democratization of technology, we suggest that complete genome assemblies, where both parental haplotypes are phased telomere to telomere, will become standard in human genetics. Soon, even in clinical settings where rigorous sample-handling standards must be met, affected individuals could have reference-grade genomes fully sequenced and assembled in just a few hours given advances in technology, computational processing, and annotation. Complete genetic variant discovery will transform how we map, catalog, and associate variation with human disease and fundamentally change our understanding of the genetic diversity of all humans.
Affiliation(s)
- Karen H Miga
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
|
43
|
Bredemeyer KR, Hillier L, Harris AJ, Hughes GM, Foley NM, Lawless C, Carroll RA, Storer JM, Batzer MA, Rice ES, Davis BW, Raudsepp T, O'Brien SJ, Lyons LA, Warren WC, Murphy WJ. Single-haplotype comparative genomics provides insights into lineage-specific structural variation during cat evolution. Nat Genet 2023; 55:1953-1963. [PMID: 37919451 PMCID: PMC10845050 DOI: 10.1038/s41588-023-01548-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/23/2023] [Accepted: 09/20/2023] [Indexed: 11/04/2023]
Abstract
The role of structurally dynamic genomic regions in speciation is poorly understood due to challenges inherent in diploid genome assembly. Here we reconstructed the evolutionary dynamics of structural variation in five cat species by phasing the genomes of three interspecies F1 hybrids to generate near-gapless single-haplotype assemblies. We discerned that cat genomes have a paucity of segmental duplications relative to great apes, explaining their remarkable karyotypic stability. X chromosomes were hotspots of structural variation, including enrichment with inversions in a large recombination desert with characteristics of a supergene. The X-linked macrosatellite DXZ4 evolves more rapidly than 99.5% of the genome clarifying its role in felid hybrid incompatibility. Resolved sensory gene repertoires revealed functional copy number changes associated with ecomorphological adaptations, sociality and domestication. This study highlights the value of gapless genomes to reveal structural mechanisms underpinning karyotypic evolution, reproductive isolation and ecological niche adaptation.
Affiliation(s)
- Kevin R Bredemeyer
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
  - Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
- LaDeana Hillier
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Andrew J Harris
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
  - Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
- Graham M Hughes
  - School of Biology & Environmental Sciences, University College Dublin, Dublin, Ireland
- Nicole M Foley
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Colleen Lawless
  - School of Biology & Environmental Sciences, University College Dublin, Dublin, Ireland
- Rachel A Carroll
  - Department of Animal Sciences, University of Missouri, Columbia, MO, USA
- Mark A Batzer
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Edward S Rice
  - Department of Animal Sciences, University of Missouri, Columbia, MO, USA
- Brian W Davis
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
  - Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
- Terje Raudsepp
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
  - Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
- Stephen J O'Brien
  - Guy Harvey Oceanographic Center, Nova Southeastern University, Fort Lauderdale, FL, USA
- Leslie A Lyons
  - Department of Veterinary Medicine & Surgery, University of Missouri, Columbia, MO, USA
- Wesley C Warren
  - Department of Animal Sciences, University of Missouri, Columbia, MO, USA.
- William J Murphy
  - Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA.
  - Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA.
|
44
|
Meng X, Lin Q, Zeng X, Jiang J, Li M, Luo X, Chen K, Wu H, Hu Y, Liu C, Su B. Brain developmental and cortical connectivity changes in transgenic monkeys carrying the human-specific duplicated gene SRGAP2C. Natl Sci Rev 2023; 10:nwad281. [PMID: 38090550 PMCID: PMC10712708 DOI: 10.1093/nsr/nwad281] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/16/2023] [Revised: 10/18/2023] [Accepted: 11/01/2023] [Indexed: 02/12/2024] Open
Abstract
Human-specific duplicated genes contributed to phenotypic innovations during the origin of our own species, such as an enlarged brain and highly developed cognitive abilities. While prior studies on transgenic mice carrying the human-specific SRGAP2C gene have shown enhanced brain connectivity, the relevance to humans remains unclear due to the significant evolutionary gap between humans and rodents. In this study, to investigate the phenotypic outcome and underlying genetic mechanism of SRGAP2C, we generated transgenic cynomolgus macaques (Macaca fascicularis) carrying the human-specific SRGAP2C gene. Longitudinal MRI imaging revealed delayed brain development with region-specific volume changes, accompanied by altered myelination levels in the temporal and occipital regions. On a cellular level, the transgenic monkeys exhibited increased deep-layer neurons during fetal neurogenesis and delayed synaptic maturation in adolescence. Moreover, transcriptome analysis detected neotenic expression in molecular pathways related to neuron ensheathment, synaptic connections, extracellular matrix and energy metabolism. Cognitively, the transgenic monkeys demonstrated improved motor planning and execution skills. Together, our findings provide new insights into the mechanisms by which the newly evolved gene shapes the unique development and circuitry of the human brain.
Affiliation(s)
- Xiaoyu Meng
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
- Qiang Lin
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Xuerui Zeng
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
- Jin Jiang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Min Li
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Xin Luo
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
- Kaimin Chen
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
- Haixu Wu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
- Yan Hu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Cirong Liu
  - Center for Excellence in Brain Science and Intelligence Technology, Institute of Neuroscience, CAS Key Laboratory of Primate Neurobiology, Chinese Academy of Sciences, Shanghai 200031, China
- Bing Su
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
|
45
|
Sun M, Yao C, Shu Q, He Y, Chen G, Yang G, Xu S, Liu Y, Xue Z, Wu J. Telomere-to-telomere pear (Pyrus pyrifolia) reference genome reveals segmental and whole genome duplication driving genome evolution. Hortic Res 2023; 10:uhad201. [PMID: 38023478 PMCID: PMC10681005 DOI: 10.1093/hr/uhad201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 10/01/2023] [Indexed: 12/01/2023]
Abstract
Previously released pear genomes contain a plethora of gaps and unanchored genetic regions. Here, we report a telomere-to-telomere (T2T) gap-free genome for the red-skinned pear, 'Yunhong No. 1' (YH1; Pyrus pyrifolia), which is mainly cultivated in Yunnan Province (southwest China), the pear's primary region of origin. The YH1 genome is 501.20 Mb long with a contig N50 length of 29.26 Mb. All 17 chromosomes were assembled to the T2T level with 34 characterized telomeres. The 17 centromeres were predicted and mainly consist of centromeric-specific monomers (CEN198) and long terminal repeat (LTR) Gypsy elements (≥74.73%). By filling all unclosed gaps, the integrity of YH1 is markedly improved over previous P. pyrifolia genomes ('Cuiguan' and 'Nijisseiki'). A total of 1531 segmental duplication (SD) driven duplicated genes were identified and enriched in stress response pathways. Intrachromosomal SDs drove the expansion of disease resistance genes, suggesting the potential of SDs in adaptive pear evolution. A large proportion of duplicated gene pairs exhibit dosage effects or sub-/neo-functionalization, which may affect agronomic traits like stone cell content, sugar content, and fruit skin russet. Furthermore, as core regulators of anthocyanin biosynthesis, we found that MYB10 and MYB114 underwent various gene duplication events. Multiple copies of MYB10 and MYB114 displayed obvious dosage effects, indicating role differentiation in the formation of red-skinned pear fruit. In summary, the T2T gap-free pear genome provides invaluable resources for genome evolution and functional genomics.
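The contig N50 quoted in the abstract above is a standard contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that pushes coverage to >= 50%.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in Mb), not the YH1 contigs.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 for a fixed genome size indicates that more of the assembly sits in long, uninterrupted contigs, which is why the metric is commonly reported alongside total assembly length.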
Affiliation(s)
- Manyi Sun
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Chenjie Yao
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Qun Shu
  - Institute of Horticulture, Yunnan Academy of Agricultural Sciences, Kunming 650205, China
- Yingyun He
  - Institute of Horticulture, Yunnan Academy of Agricultural Sciences, Kunming 650205, China
- Guosong Chen
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Guangyan Yang
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Shaozhuo Xu
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Yueyuan Liu
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Zhaolong Xue
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
- Jun Wu
  - College of Horticulture, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Zhongshan Biological Breeding Laboratory, No.50 Zhongling Street, Nanjing, Jiangsu 210014, China
|
46
|
Paparella A, L’Abbate A, Palmisano D, Chirico G, Porubsky D, Catacchio CR, Ventura M, Eichler EE, Maggiolini FAM, Antonacci F. Structural Variation Evolution at the 15q11-q13 Disease-Associated Locus. Int J Mol Sci 2023; 24:15818. [PMID: 37958807 PMCID: PMC10648317 DOI: 10.3390/ijms242115818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 10/26/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023] Open
Abstract
The impact of segmental duplications on human evolution and disease is only just starting to unfold, thanks to advancements in sequencing technologies that allow for their discovery and precise genotyping. The 15q11-q13 locus is a hotspot of recurrent copy number variation associated with Prader-Willi/Angelman syndromes, developmental delay, autism, and epilepsy and is mediated by complex segmental duplications, many of which arose recently during evolution. To gain insight into the instability of this region, we characterized its architecture in human and nonhuman primates, reconstructing the evolutionary history of five different inversions that rearranged the region in different species primarily by accumulation of segmental duplications. Comparative analysis of human and nonhuman primate duplication structures suggests a human-specific gain of directly oriented duplications in the regions flanking the GOLGA cores and HERC segmental duplications, representing potential genomic drivers for the human-specific expansions. The increasing complexity of segmental duplication organization over the course of evolution underlies its association with human susceptibility to recurrent disease-associated rearrangements.
Affiliation(s)
- Annalisa Paparella
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
- Alberto L’Abbate
  - Institute of Biomembranes, Bioenergetics, and Molecular Biotechnology (IBIOM), 70125 Bari, Italy
- Donato Palmisano
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
- Gerardina Chirico
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Claudia R. Catacchio
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
- Mario Ventura
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute (HHMI), University of Washington, Seattle, WA 98195, USA
- Flavia A. M. Maggiolini
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
  - Research Centre for Viticulture and Enology, Council for Agricultural Research and Economics (CREA), 70010 Bari, Italy
- Francesca Antonacci
  - Department of Biosciences, Biotechnology and Environment, University of Bari “Aldo Moro”, 70125 Bari, Italy
|
47
|
Clifton BD, Hariyani I, Kimura A, Luo F, Nguyen A, Ranz JM. Paralog transcriptional differentiation in the D. melanogaster-specific gene family Sdic across populations and spermatogenesis stages. Commun Biol 2023; 6:1069. [PMID: 37864070 PMCID: PMC10589255 DOI: 10.1038/s42003-023-05427-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 10/05/2023] [Indexed: 10/22/2023] Open
Abstract
How recently originated gene copies become stable genomic components remains uncertain as high sequence similarity of young duplicates precludes their functional characterization. The tandem multigene family Sdic is specific to Drosophila melanogaster and has been annotated across multiple reference-quality genome assemblies. Here we show the existence of a positive correlation between Sdic copy number and total expression, plus vast intrastrain differences in mRNA abundance among paralogs, using RNA-sequencing from testis of four strains with variable paralog composition. Single cell and nucleus RNA-sequencing data expose paralog expression differentiation in meiotic cell types within testis from third instar larva and adults. Additional RNA-sequencing across synthetic strains only differing in their Y chromosomes reveal a tissue-dependent trans-regulatory effect on Sdic: upregulation in testis and downregulation in male accessory gland. By leveraging paralog-specific expression information from tissue- and cell-specific data, our results elucidate the intraspecific functional diversification of a recently expanded tandem gene family.
Affiliation(s)
- Bryan D Clifton
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA.
- Imtiyaz Hariyani
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Ashlyn Kimura
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Fangning Luo
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Alvin Nguyen
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- José M Ranz
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA.
|
48
|
Liu J, Liu F, Pan W. Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs. Genes (Basel) 2023; 14:1926. [PMID: 37895275 PMCID: PMC10606404 DOI: 10.3390/genes14101926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 09/13/2023] [Accepted: 09/20/2023] [Indexed: 10/29/2023] Open
Abstract
For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity for generating complete chromosome sequences. Nevertheless, for the majority of genomes, the chromosome-level assemblies generated using existing methods still miss a high proportion of sequences due to losing small contigs in the step of assembly and scaffolding. To address this shortcoming, in this paper, we propose a novel method that is able to identify and fill the gaps in the chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method is able to improve the completeness of the chromosome-level assembly.
Affiliation(s)
- Junyang Liu
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen 518120, China
- Fang Liu
  - Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
  - National Key Laboratory of Cotton Bio-Breeding and Integrated Utilization, Institute of Cotton Research, Chinese Academy of Agricultural Sciences (ICR, CAAS), Anyang 455000, China
- Weihua Pan
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen 518120, China
|
49
|
Chrisman B, He C, Jung JY, Stockham N, Paskov K, Washington P, Petereit J, Wall DP. Localizing unmapped sequences with families to validate the Telomere-to-Telomere assembly and identify new hotspots for genetic diversity. Genome Res 2023; 33:1734-1746. [PMID: 37879860 PMCID: PMC10691534 DOI: 10.1101/gr.277175.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 05/25/2023] [Indexed: 10/27/2023]
Abstract
Although it is ubiquitous in genomics, the current human reference genome (GRCh38) is incomplete: It is missing large sections of heterochromatic sequence, and as a singular, linear reference genome, it does not represent the full spectrum of human genetic diversity. To characterize gaps in GRCh38 and human genetic diversity, we developed an algorithm for sequence location approximation using nuclear families (ASLAN) to identify the region of origin of reads that do not align to GRCh38. Using unmapped reads and variant calls from whole-genome sequences (WGSs), ASLAN uses a maximum likelihood model to identify the most likely region of the genome that a subsequence belongs to given the distribution of the subsequence in the unmapped reads and phasings of families. Validating ASLAN on synthetic data and on reads from the alternative haplotypes in the decoy genome, ASLAN localizes >90% of 100-bp sequences with >92% accuracy and ∼1 Mb of resolution. We then ran ASLAN on 100-mers from unmapped reads from WGS from more than 700 families, and compared ASLAN localizations to alignment of the 100-mers to the recently released T2T-CHM13 assembly. We found that many unmapped reads in GRCh38 originate from telomeres and centromeres that are gaps in GRCh38. ASLAN localizations are in high concordance with T2T-CHM13 alignments, except in the centromeres of the acrocentric chromosomes. Comparing ASLAN localizations and T2T-CHM13 alignments, we identified sequences missing from T2T-CHM13 or sequences with high divergence from their aligned region in T2T-CHM13, highlighting new hotspots for genetic diversity.
Collapse
Affiliation(s)
- Brianna Chrisman
- Department of Bioengineering, Stanford University, Stanford, California 94305, USA;
- Nevada Bioinformatics Center, University of Nevada, Reno, Nevada 89557, USA
| | - Chloe He
- Department of Biomedical Data Science, Stanford University, Stanford, California 94305, USA
| | - Jae-Yoon Jung
- Department of Pediatrics (Systems Medicine), Stanford University, Stanford, California 94305, USA
| | - Nate Stockham
- Department of Neuroscience, Stanford University, Stanford, California 94305, USA
| | - Kelley Paskov
- Department of Biomedical Data Science, Stanford University, Stanford, California 94305, USA
| | - Peter Washington
- Department of Bioengineering, Stanford University, Stanford, California 94305, USA
| | - Juli Petereit
- Nevada Bioinformatics Center, University of Nevada, Reno, Nevada 89557, USA
| | - Dennis P Wall
- Department of Biomedical Data Science, Stanford University, Stanford, California 94305, USA
- Department of Pediatrics (Systems Medicine), Stanford University, Stanford, California 94305, USA
| |
|
50
|
Yang C, Zhou Y, Song Y, Wu D, Zeng Y, Nie L, Liu P, Zhang S, Chen G, Xu J, Zhou H, Zhou L, Qian X, Liu C, Tan S, Zhou C, Dai W, Xu M, Qi Y, Wang X, Guo L, Fan G, Wang A, Deng Y, Zhang Y, Jin J, He Y, Guo C, Guo G, Zhou Q, Xu X, Yang H, Wang J, Xu S, Mao Y, Jin X, Ruan J, Zhang G. The complete and fully-phased diploid genome of a male Han Chinese. Cell Res 2023; 33:745-761. [PMID: 37452091 PMCID: PMC10542383 DOI: 10.1038/s41422-023-00849-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/06/2023] [Accepted: 06/29/2023] [Indexed: 07/18/2023] Open
Abstract
Since the release of the complete human genome, the priority of human genomic study has been shifting towards closing gaps in ethnic diversity. Here, we present a fully phased and well-annotated diploid human genome from a Han Chinese male individual (CN1), in which the assemblies of both haplotypes achieve the telomere-to-telomere (T2T) level. Comparison of this diploid genome with the CHM13 haploid T2T genome revealed significant variations in the centromere. Outside the centromere, we discovered 11,413 structural variations, including numerous novel ones. We also detected thousands of CN1 alleles that have accumulated high substitution rates and a few that have been under positive selection in the East Asian population. Further, we found that CN1 outperforms CHM13 as a reference genome in mapping and variant calling for the East Asian population owing to the distinct structural variants of the two references. Comparison of SNP calling for a large cohort of 8869 Chinese genomes using CN1 and CHM13 as references, respectively, showed that reference bias profoundly impacts rare SNP calling, with nearly 2 million rare SNPs miscalled depending on the reference genome used. Finally, using CN1 as a reference, we discovered 5.80 Mb and 4.21 Mb of putative introgressed sequence from Neanderthal and Denisovan, respectively, including many East Asian-specific sequences undetected using CHM13 as the reference. Our analyses reveal the advantages of using CN1 as a reference for population genomic and paleogenomic studies. This complete genome will serve as an alternative reference for future genomic studies on the East Asian population.
Affiliation(s)
- Chentao Yang
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- Center for Evolutionary & Organismal Biology, & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Yang Zhou
- BGI-Shenzhen, Shenzhen, Guangdong, China
- BGI Research-Wuhan, BGI, Wuhan, Hubei, China
| | - Yanni Song
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Dongya Wu
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- Center for Evolutionary & Organismal Biology, & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
- Institute of Crop Science & Institute of Bioinformatics, Zhejiang University, Hangzhou, Zhejiang, China
| | - Yan Zeng
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Lei Nie
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | | | - Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Guangji Chen
- BGI-Shenzhen, Shenzhen, Guangdong, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Jinjin Xu
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Hongling Zhou
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Long Zhou
- Center for Evolutionary & Organismal Biology, & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
- Innovation Center of Yangtze River Delta, Zhejiang University, Hangzhou, Zhejiang, China
| | - Xiaobo Qian
- BGI-Shenzhen, Shenzhen, Guangdong, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Chenlu Liu
- Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang, China
| | | | | | - Wei Dai
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Mengyang Xu
- BGI-Shenzhen, Shenzhen, Guangdong, China
- BGI-Qingdao, BGI-Shenzhen, Qingdao, Shandong, China
| | - Yanwei Qi
- BGI-Qingdao, BGI-Shenzhen, Qingdao, Shandong, China
| | - Xiaobo Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Lidong Guo
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- BGI-Qingdao, BGI-Shenzhen, Qingdao, Shandong, China
| | - Guangyi Fan
- BGI-Qingdao, BGI-Shenzhen, Qingdao, Shandong, China
| | - Aijun Wang
- BGI-Qingdao, BGI-Shenzhen, Qingdao, Shandong, China
| | - Yuan Deng
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Yong Zhang
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | | | - Yunqiu He
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- Center for Evolutionary & Organismal Biology, & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Chunxue Guo
- BGI-Shenzhen, Shenzhen, Guangdong, China
- BGI-Hangzhou, Hangzhou, Zhejiang, China
| | - Guoji Guo
- School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
| | - Qing Zhou
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
- Life Sciences Institute, Zhejiang University, Hangzhou, Zhejiang, China
| | - Xun Xu
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | | | - Jian Wang
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Shuhua Xu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
- Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, China
- Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, International Joint Center of Genomics of Jiangsu Province School of Life Sciences, Jiangsu Normal University, Xuzhou, Jiangsu, China
- Department of Liver Surgery and Transplantation Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Xin Jin
- BGI-Shenzhen, Shenzhen, Guangdong, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Guojie Zhang
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, China
- Center for Evolutionary & Organismal Biology, & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
- Innovation Center of Yangtze River Delta, Zhejiang University, Hangzhou, Zhejiang, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| |
|