1. Kawakami R, Hiraide T, Watanabe K, Miyamoto S, Hira K, Komatsu K, Ishigaki H, Sakaguchi K, Maekawa M, Yamashita K, Fukuda T, Miyairi I, Ogata T, Saitsu H. RNA sequencing and target long-read sequencing reveal an intronic transposon insertion causing aberrant splicing. J Hum Genet 2024; 69:91-99. [PMID: 38102195 DOI: 10.1038/s10038-023-01211-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/28/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023]
Abstract
More than half of cases with suspected genetic disorders remain unsolved by genetic analysis using short-read sequencing such as exome sequencing (ES) and genome sequencing (GS). RNA sequencing (RNA-seq) and long-read sequencing (LRS) are useful for interpreting candidate variants and for detecting structural variants containing repeat sequences, respectively. Recently, adaptive sampling on nanopore sequencers has made targeted LRS easier to perform. Here, we present a Japanese girl with premature chromatid separation (PCS)/mosaic variegated aneuploidy (MVA) syndrome. ES detected a known pathogenic, maternally inherited heterozygous variant (c.1402-5A>G) in intron 10 of BUB1B (NM_001211.6), a known causative gene for PCS/MVA syndrome with autosomal recessive inheritance. A minigene splicing assay revealed that almost all transcripts from the c.1402-5G allele are mis-spliced with a 4-bp insertion. GS could not detect another pathogenic variant, while RNA-seq revealed abnormal reads in intron 2. To extensively explore variants in intron 2, we performed adaptive sampling and identified a paternal 3.0 kb insertion. The consensus sequence of 16 reads spanning the insertion showed that the insertion consists of Alu and SVA elements. Realignment of RNA-seq reads to a new reference sequence containing the insertion revealed that 16 reads have a 5' splice site within the insertion and a 3' splice site at exon 3, demonstrating a causal relationship between the insertion and aberrant splicing. In addition, immunoblotting showed a severely diminished BUB1B protein level in patient-derived cells. These data suggest that detection of transcriptomic abnormalities by RNA-seq can be a clue for identifying pathogenic variants, and that determination of insert sequences is one of the merits of LRS.
Affiliation(s)
- Ryota Kawakami
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Takuya Hiraide
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kazuki Watanabe
  - Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Sachiko Miyamoto
  - Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kota Hira
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kazuyuki Komatsu
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
  - Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Hidetoshi Ishigaki
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Kimiyoshi Sakaguchi
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Masato Maekawa
  - Department of Laboratory Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Keita Yamashita
  - Department of Laboratory Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Tokiko Fukuda
  - Department of Hamamatsu Child Health and Developmental Medicine, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Isao Miyairi
  - Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Tsutomu Ogata
  - Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
  - Department of Pediatrics, Hamamatsu Medical Center, Hamamatsu, Japan
- Hirotomo Saitsu
  - Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
2. Sproul JS, Hotaling S, Heckenhauer J, Powell A, Marshall D, Larracuente AM, Kelley JL, Pauls SU, Frandsen PB. Analyses of 600+ insect genomes reveal repetitive element dynamics and highlight biodiversity-scale repeat annotation challenges. Genome Res 2023; 33:1708-1717. [PMID: 37739812 PMCID: PMC10691545 DOI: 10.1101/gr.277387.122] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/06/2022] [Accepted: 09/20/2023] [Indexed: 09/24/2023]
Abstract
Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE-gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies, we detected ∼36% more REs than short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, whereas DNA transposons and LINEs showed less respective technology-related bias. In most insect lineages, 25%-85% of repetitive sequences were "unclassified" following automated annotation, compared with only ∼13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress toward this goal.
Affiliation(s)
- John S Sproul
  - Department of Biology, Brigham Young University, Provo, Utah 84602, USA
  - Department of Biology, University of Nebraska Omaha, Omaha, Nebraska 68182, USA
  - Department of Biology, University of Rochester, Rochester, New York 14627, USA
- Scott Hotaling
  - School of Biological Sciences, Washington State University, Pullman, Washington 99163, USA
  - Department of Watershed Sciences, Utah State University, Logan, Utah 84322, USA
- Jacqueline Heckenhauer
  - LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
  - Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt, Germany
- Ashlyn Powell
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah 84602, USA
- Dez Marshall
  - Department of Biology, University of Nebraska Omaha, Omaha, Nebraska 68182, USA
- Joanna L Kelley
  - School of Biological Sciences, Washington State University, Pullman, Washington 99163, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, California 95064, USA
- Steffen U Pauls
  - LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
  - Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt, Germany
  - Department of Insect Biotechnology, Justus-Liebig-University Gießen, 35392 Gießen, Germany
- Paul B Frandsen
  - LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah 84602, USA
  - Data Science Lab, Smithsonian Institution, Washington, District of Columbia 20560, USA
3. Zhao P, Peng C, Fang L, Wang Z, Liu GE. Taming transposable elements in livestock and poultry: a review of their roles and applications. Genet Sel Evol 2023; 55:50. [PMID: 37479995 PMCID: PMC10362595 DOI: 10.1186/s12711-023-00821-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/16/2023] [Accepted: 06/30/2023] [Indexed: 07/23/2023] Open
Abstract
Livestock and poultry play a significant role in human nutrition by converting agricultural by-products into high-quality proteins. To meet the growing demand for safe animal protein, genetic improvement of livestock must be done sustainably while minimizing negative environmental impacts. Transposable elements (TE) are important components of livestock and poultry genomes, contributing to their genetic diversity, chromatin states, gene regulatory networks, and complex traits of economic value. However, compared to other species, research on TE in livestock and poultry is still in its early stages. In this review, we analyze 72 studies published in the past 20 years, summarize the TE composition in livestock and poultry genomes, and focus on their potential roles in functional genomics. We also discuss bioinformatic tools and strategies for integrating multi-omics data with TE, and explore future directions, feasibility, and challenges of TE research in livestock and poultry. In addition, we suggest strategies to apply TE in basic biological research and animal breeding. Our goal is to provide a new perspective on the importance of TE in livestock and poultry genomes.
Affiliation(s)
- Pengju Zhao
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Chen Peng
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Lingzhao Fang
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark
- Zhengguang Wang
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- George E Liu
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA
4. Storer JM, Hubley R, Rosen J, Smit AFA. Methodologies for the De novo Discovery of Transposable Element Families. Genes (Basel) 2022; 13:709. [PMID: 35456515 PMCID: PMC9025800 DOI: 10.3390/genes13040709] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/23/2022] [Revised: 04/14/2022] [Accepted: 04/15/2022] [Indexed: 02/07/2023] Open
Abstract
The discovery and characterization of transposable element (TE) families are crucial tasks in the process of genome annotation. Careful curation of TE libraries for each organism is necessary as each has been exposed to a unique and often complex set of TE families. De novo methods have been developed; however, a fully automated and accurate approach to the development of complete libraries remains elusive. In this review, we cover established methods and recent developments in de novo TE analysis. We also present various methodologies used to assess these tools and discuss opportunities for further advancement of the field.
Affiliation(s)
- Arian F. A. Smit
  - Institute for Systems Biology, Seattle, WA 98109, USA; (J.M.S.); (R.H.); (J.R.)
5. Zverinova S, Guryev V. Variant calling: Considerations, practices, and developments. Hum Mutat 2021; 43:976-985. [PMID: 34882898 PMCID: PMC9545713 DOI: 10.1002/humu.24311] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/25/2021] [Revised: 11/02/2021] [Accepted: 12/03/2021] [Indexed: 11/10/2022]
Abstract
The success of many clinical, association, or population genetics studies relies critically on a properly performed variant-calling step. The variety of modern genomics protocols, techniques, and platforms makes the choice of methods and algorithms difficult, and there is no "one size fits all" solution for study design and data analysis. In this review, we discuss considerations that need to be taken into account while designing the study and preparing for the experiments. We outline the variety of variant types that can be detected using sequencing approaches and highlight some specific requirements and basic principles of their detection. Finally, we cover interesting developments that enable variant calling for a broad range of applications in the genomics field. We conclude by discussing technological and algorithmic advances that have the potential to change the ways of calling DNA variants in the near future.
Affiliation(s)
- Stepanka Zverinova
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands
6. Wang Y, Zhao B, Choi J, Lee EA. Genomic approaches to trace the history of human brain evolution with an emerging opportunity for transposon profiling of ancient humans. Mob DNA 2021; 12:22. [PMID: 34663455 PMCID: PMC8525043 DOI: 10.1186/s13100-021-00250-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/21/2021] [Accepted: 09/27/2021] [Indexed: 12/17/2022] Open
Abstract
Transposable elements (TEs) significantly contribute to shaping the diversity of the human genome, and several lines of evidence suggest that TEs are one of the driving forces of human brain evolution. Existing computational approaches, including cross-species comparative genomics and population genetic modeling, can be adapted for the study of the role of TEs in evolution. In particular, diverse ancient and archaic human genome sequences are increasingly available, allowing reconstruction of past human migration events and holding the promise of identifying and tracking TEs among other evolutionarily important genetic variants at an unprecedented spatiotemporal resolution. However, highly degraded short DNA templates and other unique challenges presented by ancient human DNA call for major changes in current experimental and computational procedures to enable the identification of evolutionarily important TEs. Ancient human genomes are valuable resources for investigating TEs in the evolutionary context, and efforts to explore ancient human genomes will potentially provide a novel perspective on the genetic mechanism of human brain evolution and inspire a variety of technological and methodological advances. In this review, we summarize computational and experimental approaches that can be adapted to identify and validate evolutionarily important TEs, especially for human brain evolution. We also highlight strategies that leverage ancient genomic data and discuss unique challenges in ancient transposon genomics.
Affiliation(s)
- Yilan Wang
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Boxun Zhao
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Jaejoon Choi
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Eunjung Alice Lee
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
7. McDonald TL, Zhou W, Castro CP, Mumm C, Switzenberg JA, Mills RE, Boyle AP. Cas9 targeted enrichment of mobile elements using nanopore sequencing. Nat Commun 2021; 12:3586. [PMID: 34117247 PMCID: PMC8196195 DOI: 10.1038/s41467-021-23918-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 02/24/2021] [Accepted: 05/25/2021] [Indexed: 02/05/2023] Open
Abstract
Mobile element insertions (MEIs) are repetitive genomic sequences that contribute to genetic variation and can lead to genetic disorders. Targeted and whole-genome approaches using short-read sequencing have been developed to identify reference and non-reference MEIs; however, the read length hampers detection of these elements in complex genomic regions. Here, we pair Cas9-targeted nanopore sequencing with computational methodologies to capture active MEIs in human genomes. We demonstrate parallel enrichment for distinct classes of MEIs, averaging 44% of reads on-targeted signals and exhibiting a 13.4-54x enrichment over whole-genome approaches. We show an individual flow cell can recover most MEIs (97% L1Hs, 93% AluYb, 51% AluYa, 99% SVA_F, and 65% SVA_E). We identify seventeen non-reference MEIs in GM12878 overlooked by modern, long-read analysis pipelines, primarily in repetitive genomic regions. This work introduces the utility of nanopore sequencing for MEI enrichment and lays the foundation for rapid discovery of elusive, repetitive genetic elements.
Affiliation(s)
- Torrin L McDonald
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Weichen Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Christopher P Castro
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Camille Mumm
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Jessica A Switzenberg
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Ryan E Mills
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Alan P Boyle
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
8. Liu S, Gao G, Layer RM, Thorgaard GH, Wiens GD, Leeds TD, Martin KE, Palti Y. Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing. Front Genet 2021; 12:639355. [PMID: 33732289 PMCID: PMC7959816 DOI: 10.3389/fgene.2021.639355] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/08/2020] [Accepted: 02/08/2021] [Indexed: 12/14/2022] Open
Abstract
Genomic structural variants (SVs) are a major source of genetic and phenotypic variation but have not been investigated systematically in rainbow trout (Oncorhynchus mykiss), an important aquaculture species of cold freshwater. The objectives of this study were 1) to identify and validate high-confidence SVs in rainbow trout using whole-genome re-sequencing; and 2) to examine the contribution of transposable elements (TEs) to SVs in rainbow trout. A total of 96 rainbow trout, including 11 homozygous lines and 85 outbred fish from three breeding populations, were whole-genome sequenced with an average genome coverage of 17.2×. Putative SVs were identified using the program Smoove which integrates LUMPY and other associated tools into one package. After rigorous filtering, 13,863 high-confidence SVs were identified. Pacific Biosciences long-reads of Arlee, one of the homozygous lines used for SV detection, validated 98% (3,948 of 4,030) of the high-confidence SVs identified in the Arlee homozygous line. Based on principal component analysis, the 85 outbred fish clustered into three groups consistent with their populations of origin, further indicating that the high-confidence SVs identified in this study are robust. The repetitive DNA content of the high-confidence SV sequences was 86.5%, which is much higher than the 57.1% repetitive DNA content of the reference genome, and is also higher than the repetitive DNA content of Atlantic salmon SVs reported previously. TEs thus contribute substantially to SVs in rainbow trout as TEs make up the majority of repetitive sequences. Hundreds of the high-confidence SVs were annotated as exon-loss or gene-fusion variants, and may have phenotypic effects. The high-confidence SVs reported in this study provide a foundation for further rainbow trout SV studies.
Affiliation(s)
- Sixin Liu
  - National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, United States Department of Agriculture, Kearneysville, WV, United States
- Guangtu Gao
  - National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, United States Department of Agriculture, Kearneysville, WV, United States
- Ryan M Layer
  - BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, United States
  - Department of Computer Science, University of Colorado Boulder, Boulder, CO, United States
- Gary H Thorgaard
  - Center for Reproductive Biology, School of Biological Sciences, Washington State University, Pullman, WA, United States
- Gregory D Wiens
  - National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, United States Department of Agriculture, Kearneysville, WV, United States
- Timothy D Leeds
  - National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, United States Department of Agriculture, Kearneysville, WV, United States
- Yniv Palti
  - National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, United States Department of Agriculture, Kearneysville, WV, United States