1
Hu K, Ni P, Xu M, Zou Y, Chang J, Gao X, Li Y, Ruan J, Hu B, Wang J. HiTE: a fast and accurate dynamic boundary adjustment approach for full-length transposable element detection and annotation. Nat Commun 2024; 15:5573. [PMID: 38956036 PMCID: PMC11219922 DOI: 10.1038/s41467-024-49912-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Accepted: 06/25/2024] [Indexed: 07/04/2024] Open
Abstract
Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, HiTE has identified numerous novel transposons with well-defined structures containing protein-coding domains, some of which are directly inserted within crucial genes, leading to direct alterations in gene expression. A Nextflow version of HiTE is also available, with enhanced parallelism, reproducibility, and portability.
Affiliation(s)
- Kang Hu
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Peng Ni
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Minghua Xu
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- You Zou
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Jianye Chang
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518000, China
- Xin Gao
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Yaohang Li
  - Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA
- Jue Ruan
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518000, China
- Bin Hu
  - Key Laboratory of Brain Health Intelligent Evaluation and Intervention, Ministry of Education (Beijing Institute of Technology), Beijing, P. R. China
  - School of Medical Technology, Beijing Institute of Technology, Beijing, P. R. China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
2
Wang W, Zhou L, Li H, Sun T, Wen X, Li W, Esteban MA, Hoffman AR, Hu JF, Cui J. Profiling the role of m6A effectors in the regulation of pluripotent reprogramming. Hum Genomics 2024; 18:33. [PMID: 38566168 PMCID: PMC10986062 DOI: 10.1186/s40246-024-00597-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 03/11/2024] [Indexed: 04/04/2024] Open
Abstract
The N6-methyladenosine (m6A) RNA modification plays essential roles in multiple biological processes, including stem cell fate determination. To explore the role of the m6A modification in pluripotent reprogramming, we used RNA-seq to map m6A effectors in human iPSCs, fibroblasts, and H9 ESCs, as well as in mouse ESCs and fibroblasts. By integrating the human and mouse RNA-seq data, we found that 19 m6A effectors were significantly upregulated in reprogramming. Notably, IGF2BPs, particularly IGF2BP1, were among the most upregulated genes in pluripotent cells, while YTHDF3 had high levels of expression in fibroblasts. Using quantitative PCR and Western blot, we validated the pluripotency-associated elevation of IGF2BPs. Knockdown of IGF2BP1 induced the downregulation of stemness genes and exit from pluripotency. Proteome analysis of cells collected at both the beginning and terminal states of the reprogramming process revealed that the IGF2BP1 protein was positively correlated with stemness markers SOX2 and OCT4. The eCLIP-seq target analysis showed that IGF2BP1 interacted with the coding sequence (CDS) and 3'UTR regions of the SOX2 transcripts, in agreement with the location of m6A modifications. This study identifies IGF2BP1 as a vital pluripotency-associated m6A effector, providing new insight into the interplay between m6A epigenetic modifications and pluripotent reprogramming.
Affiliation(s)
- Wenjun Wang
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
  - VA Palo Alto Health Care System, Stanford University School of Medicine, Palo Alto, CA, 94304, USA
- Lei Zhou
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
- Hui Li
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
- Tingge Sun
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
- Xue Wen
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
- Wei Li
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
- Miguel A Esteban
  - Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, Guangdong, 510530, PR China
- Andrew R Hoffman
  - VA Palo Alto Health Care System, Stanford University School of Medicine, Palo Alto, CA, 94304, USA
- Ji-Fan Hu
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
  - VA Palo Alto Health Care System, Stanford University School of Medicine, Palo Alto, CA, 94304, USA
- Jiuwei Cui
  - Cancer Center, The First Hospital of Jilin University, Changchun, Jilin, 130021, China
3
Oliveira DS, Fablet M, Larue A, Vallier A, Carareto CA, Rebollo R, Vieira C. ChimeraTE: a pipeline to detect chimeric transcripts derived from genes and transposable elements. Nucleic Acids Res 2023; 51:9764-9784. [PMID: 37615575 PMCID: PMC10570057 DOI: 10.1093/nar/gkad671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 07/25/2023] [Accepted: 08/09/2023] [Indexed: 08/25/2023] Open
Abstract
Transposable elements (TEs) produce structural variants and are considered an important source of genetic diversity. Notably, TE-gene fusion transcripts, i.e. chimeric transcripts, have been associated with adaptation in several species. However, the identification of these chimeras remains hindered due to the lack of detection tools at a transcriptome-wide scale, and to the reliance on a reference genome, even though different individuals/cells/strains have different TE insertions. Therefore, we developed ChimeraTE, a pipeline that uses paired-end RNA-seq reads to identify chimeric transcripts through two different modes. Mode 1 is the reference-guided approach that employs canonical genome alignment, and Mode 2 identifies chimeras derived from fixed or insertionally polymorphic TEs without any reference genome. We have validated both modes using RNA-seq data from four Drosophila melanogaster wild-type strains. We found ∼1.12% of all genes generating chimeric transcripts, most of them from TE-exonized sequences. Approximately 23% of all detected chimeras were absent from the reference genome, indicating that TEs belonging to chimeric transcripts may be recent, polymorphic insertions. ChimeraTE is the first pipeline able to automatically uncover chimeric transcripts without a reference genome, consisting of two running Modes that can be used as a tool to investigate the contribution of TEs to transcriptome plasticity.
Affiliation(s)
- Daniel S Oliveira
  - São Paulo State University (Unesp), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, SP, Brazil
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
- Marie Fablet
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
  - Institut Universitaire de France (IUF), Paris, Île-de-France F-75231, France
- Anaïs Larue
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
  - Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Agnès Vallier
  - Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Claudia M A Carareto
  - São Paulo State University (Unesp), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, SP, Brazil
- Rita Rebollo
  - Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Cristina Vieira
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
4
Li Z, Liu X, Wang C, Li Z, Jiang B, Zhang R, Tong L, Qu Y, He S, Chen H, Mao Y, Li Q, Pook T, Wu Y, Zan Y, Zhang H, Li L, Wen K, Chen Y. The pig pangenome provides insights into the roles of coding structural variations in genetic diversity and adaptation. Genome Res 2023; 33:1833-1847. [PMID: 37914227 PMCID: PMC10691484 DOI: 10.1101/gr.277638.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Accepted: 09/12/2023] [Indexed: 11/03/2023]
Abstract
Structural variations have emerged as an important driving force for genome evolution and phenotypic variation in various organisms, yet their contributions to genetic diversity and adaptation in domesticated animals remain largely unknown. Here we constructed a pangenome based on 250 sequenced individuals from 32 pig breeds in Eurasia and systematically characterized coding sequence presence/absence variations (PAVs) within pigs. We identified 308.3-Mb nonreference sequences and 3438 novel genes absent from the current reference genome. Gene PAV analysis showed that 16.8% of the genes in the pangene catalog undergo PAV. A number of newly identified dispensable genes showed close associations with adaptation. For instance, several novel swine leukocyte antigen (SLA) genes discovered in nonreference sequences potentially participate in immune responses to porcine reproductive and respiratory syndrome virus (PRRSV) infection. We delineated previously unidentified features of the pig mobilome that contained 490,480 transposable element insertion polymorphisms (TIPs) resulting from recent mobilization of 970 TE families, and investigated their population dynamics along with influences on population differentiation and gene expression. In addition, several candidate adaptive TE insertions were detected to be co-opted into genes responsible for responses to hypoxia, skeletal development, regulation of heart contraction, and neuronal cell development, likely contributing to local adaptation of Tibetan wild boars. These findings enhance our understanding of hidden layers of the genetic diversity in pigs and provide novel insights into the role of SVs in the evolutionary adaptation of mammals.
Affiliation(s)
- Zhengcao Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Xiaohong Liu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Chen Wang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Zhenyang Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Bo Jiang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Ruifeng Zhang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Lu Tong
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Youping Qu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Sheng He
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Haifan Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Yafei Mao
  - Bio-X Institutes, Shanghai Jiao Tong University, 200240 Shanghai, China
- Qingnan Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Torsten Pook
  - Animal Breeding and Genomics, Wageningen University & Research, Wageningen 6700 AH, The Netherlands
- Yu Wu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Yanjun Zan
  - Key Laboratory of Tobacco Improvement and Biotechnology, Tobacco Research Institute, Chinese Academy of Agricultural Sciences, Qingdao 266000, China
- Hui Zhang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Lu Li
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Keying Wen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
- Yaosheng Chen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, 510006 Guangzhou, China
5
Coronado-Zamora M, González J. Transposons contribute to the functional diversification of the head, gut, and ovary transcriptomes across Drosophila natural strains. Genome Res 2023; 33:1541-1553. [PMID: 37793782 PMCID: PMC10620055 DOI: 10.1101/gr.277565.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 08/08/2023] [Indexed: 10/06/2023]
Abstract
Transcriptomes are dynamic, with cells, tissues, and body parts expressing particular sets of transcripts. Transposable elements (TEs) are a known source of transcriptome diversity; however, studies often focus on a particular type of chimeric transcript, analyze single body parts or cell types, or are based on incomplete TE annotations from a single reference genome. In this work, we have implemented a method based on de novo transcriptome assembly that minimizes the potential sources of errors while identifying a comprehensive set of gene-TE chimeras. We applied this method to the head, gut, and ovary dissected from five Drosophila melanogaster natural strains, with individual reference genomes available. We found that ∼19% of body part-specific transcripts are gene-TE chimeras. Overall, chimeric transcripts contribute a mean of 43% to the total gene expression, and they provide protein domains for DNA binding, catalytic activity, and DNA polymerase activity. Our comprehensive data set is a rich resource for follow-up analysis. Moreover, because TEs are present in virtually all species sequenced to date, their role in spatially restricted transcript expression is likely not exclusive to the species analyzed in this work.
Affiliation(s)
- Josefa González
  - Institute of Evolutionary Biology, CSIC, UPF, Barcelona 08003, Spain
6
Osipovich AB, Dudek KD, Trinh LT, Kim LH, Shrestha S, Cartailler JP, Magnuson MA. ZFP92, a KRAB domain zinc finger protein enriched in pancreatic islets, binds to B1/Alu SINE transposable elements and regulates retroelements and genes. PLoS Genet 2023; 19:e1010729. [PMID: 37155670 PMCID: PMC10166502 DOI: 10.1371/journal.pgen.1010729] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/26/2023] [Accepted: 04/02/2023] [Indexed: 05/10/2023] Open
Abstract
Repressive KRAB domain-containing zinc-finger proteins (KRAB-ZFPs) are abundant in mammalian genomes and contribute both to the silencing of transposable elements (TEs) and to the regulation of developmental stage- and cell type-specific gene expression. Here we describe studies of zinc finger protein 92 (Zfp92), an X-linked KRAB-ZFP that is highly expressed in pancreatic islets of adult mice, by analyzing global Zfp92 knockout (KO) mice. Physiological, transcriptomic and genome-wide chromatin binding studies indicate that the principal function of ZFP92 in mice is to bind to and suppress the activity of B1/Alu type of SINE elements and modulate the activity of surrounding genomic entities. Deletion of Zfp92 leads to changes in expression of select LINE and LTR retroelements and genes located in the vicinity of ZFP92-bound chromatin. The absence of Zfp92 leads to altered expression of specific genes in islets, adipose and muscle that result in modest sex-specific alterations in blood glucose homeostasis, body mass and fat accumulation. In islets, Zfp92 influences blood glucose concentration in postnatal mice via transcriptional effects on Mafb, whereas in adipose and muscle, it regulates Acacb, a rate-limiting enzyme in fatty acid metabolism. In the absence of Zfp92, a novel TE-Capn11 fusion transcript is overexpressed in islets and several other tissues due to de-repression of an IAPez TE adjacent to ZFP92-bound SINE elements in intron 3 of the Capn11 gene. Together, these studies show that ZFP92 functions both to repress specific TEs and to regulate the transcription of specific genes in discrete tissues.
Affiliation(s)
- Anna B. Osipovich
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Karrie D. Dudek
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Cell and Developmental Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Linh T. Trinh
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Cell and Developmental Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Lily H. Kim
  - College of Arts and Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Shristi Shrestha
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Jean-Philippe Cartailler
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Mark A. Magnuson
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Center for Stem Cell Biology, Vanderbilt University, Nashville, Tennessee, United States of America
  - Department of Cell and Developmental Biology, Vanderbilt University, Nashville, Tennessee, United States of America
7
Gasparotto E, Burattin FV, Di Gioia V, Panepuccia M, Ranzani V, Marasca F, Bodega B. Transposable Elements Co-Option in Genome Evolution and Gene Regulation. Int J Mol Sci 2023; 24:ijms24032610. [PMID: 36768929 PMCID: PMC9917352 DOI: 10.3390/ijms24032610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 01/26/2023] [Accepted: 01/28/2023] [Indexed: 01/31/2023] Open
Abstract
The genome is no longer deemed as a fixed and inert item but rather as a moldable matter that is continuously evolving and adapting. Within this frame, Transposable Elements (TEs), ubiquitous, mobile, repetitive elements, are considered an alive portion of the genomes to date, whose functions, although long considered "dark", are now coming to light. Here we will review that, besides the detrimental effects that TE mobilization can induce, TEs have shaped genomes in their current form, promoting genome sizing, genomic rearrangements and shuffling of DNA sequences. Although TEs are mostly represented in the genomes by evolutionarily old, short, degenerated, and sedentary fossils, they have been thoroughly co-opted by the hosts as a prolific and original source of regulatory instruments for the control of gene transcription and genome organization in the nuclear space. For these reasons, the deregulation of TE expression and/or activity is implicated in the onset and progression of several diseases. It is likely that we have just revealed the outermost layers of TE functions. Further studies on this portion of the genome are required to unlock novel regulatory functions that could also be exploited for diagnostic and therapeutic approaches.
Affiliation(s)
- Erica Gasparotto
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
  - SEMM, European School of Molecular Medicine, 20139 Milan, Italy
- Filippo Vittorio Burattin
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
  - Department of Biosciences, University of Milan, 20133 Milan, Italy
- Valeria Di Gioia
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
  - SEMM, European School of Molecular Medicine, 20139 Milan, Italy
- Michele Panepuccia
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- Valeria Ranzani
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- Federica Marasca
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
  - Department of Clinical Sciences and Community Health, University of Milan, 20122 Milan, Italy
- Beatrice Bodega
  - Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
  - Department of Biosciences, University of Milan, 20133 Milan, Italy
  - Correspondence:
8
Agoni L. Alternative and aberrant splicing of human endogenous retroviruses in cancer. What about head and neck? —A mini review. Front Oncol 2022; 12:1019085. [PMID: 36338752 PMCID: PMC9631305 DOI: 10.3389/fonc.2022.1019085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Accepted: 10/03/2022] [Indexed: 11/13/2022] Open
Abstract
Human endogenous retroviruses (HERVs) are transcribed in many cancer types, including head and neck cancer. Because of accumulating mutations at proviral loci over evolutionary time, HERVs are functionally defective and cannot complete their viral life cycle. Despite that, HERV transcripts, including full-length viral RNAs and viral RNAs spliced as expected at the conventional viral splice sites, can be detected in particular conditions, such as cancer. Interestingly, non-viral–related transcription, including aberrant, non-conventionally spliced RNAs, has been reported as well. The role of HERV transcription in cancer and its contribution to oncogenesis or progression are still debated. Nonetheless, HERVs may constitute a suitable cancer biomarker or a target for therapy. Thus, ongoing research aims both to clarify the basic mechanisms underlying HERV transcription in cancer and to exploit its potential toward clinical application. In this mini-review, we summarize the current knowledge, the most recent findings, and the future perspectives of research on HERV transcription and splicing, with particular focus on head and neck cancer.
9
SoloTE for improved analysis of transposable elements in single-cell RNA-Seq data using locus-specific expression. Commun Biol 2022; 5:1063. [PMID: 36202992 PMCID: PMC9537157 DOI: 10.1038/s42003-022-04020-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/30/2022] [Accepted: 09/21/2022] [Indexed: 11/08/2022] Open
Abstract
Transposable Elements (TEs) contribute to the repetitive fraction in almost every eukaryotic genome known to date, and their transcriptional activation can influence the expression of neighboring genes in healthy and disease states. Single cell RNA-Seq (scRNA-Seq) is a technical advance that allows the study of gene expression on a cell-by-cell basis. Although a current computational approach is available for the single cell analysis of TE expression, it omits their genomic location. Here we show SoloTE, a pipeline that outperforms the previous approach in terms of computational resources and by allowing the inclusion of locus-specific TE activity in scRNA-Seq expression matrixes. We then apply SoloTE to several datasets to reveal the repertoire of TEs that become transcriptionally active in different cell groups, and based on their genomic location, we predict their potential impact on gene expression. As our tool takes as input the resulting files from standard scRNA-Seq processing pipelines, we expect it to be widely adopted in single cell studies to help researchers discover patterns of cellular diversity associated with TE expression.
10
Functional Characterization of the N-Terminal Disordered Region of the piggyBac Transposase. Int J Mol Sci 2022; 23:ijms231810317. [PMID: 36142241 PMCID: PMC9499001 DOI: 10.3390/ijms231810317] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/22/2022] [Revised: 08/22/2022] [Accepted: 09/03/2022] [Indexed: 01/15/2023] Open
Abstract
The piggyBac DNA transposon is an active element initially isolated from the cabbage looper moth, but members of this superfamily are also present in most eukaryotic evolutionary lineages. The functionally important regions of the transposase are well described. There is an RNase H-like fold containing the DDD motif responsible for the catalytic DNA cleavage and joining reactions and a C-terminal cysteine-rich domain important for interaction with the transposon DNA. However, the protein also contains a ~100 amino acid long N-terminal disordered region (NTDR) whose function is currently unknown. Here we show that deletion of the NTDR significantly impairs piggyBac transposition, although the extent of decrease is strongly cell-type specific. Moreover, replacing the NTDR with scrambled but similarly disordered sequences did not rescue transposase activity, indicating the importance of sequence conservation. Cell-based transposon excision and integration assays reveal that the excision step is more severely affected by NTDR deletion. Finally, bioinformatic analyses indicated that the NTDR is specific for the piggyBac superfamily and is also present in domesticated, transposase-derived proteins incapable of catalyzing transposition. Our results indicate an essential role of the NTDR in the “fine-tuning” of transposition and its significance in the functions of piggyBac-originated co-opted genes.
11
Zhang M, Zheng S, Liang JQ. Transcriptional and reverse transcriptional regulation of host genes by human endogenous retroviruses in cancers. Front Microbiol 2022; 13:946296. [PMID: 35928153 PMCID: PMC9343867 DOI: 10.3389/fmicb.2022.946296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 06/29/2022] [Indexed: 11/16/2022] Open
Abstract
Human endogenous retroviruses (HERVs) originated from ancient retroviral infections of germline cells millions of years ago and have evolved as part of the host genome. HERVs not only retain the capacity as retroelements but also regulate host genes. The expansion of HERVs involves transcription by RNA polymerase II, reverse transcription, and re-integration into the host genome. Fast progress in deep sequencing and functional analysis has revealed the importance of domesticated copies of HERVs, including their regulatory sequences, transcripts, and proteins in normal cells. However, evidence also suggests the involvement of HERVs in the development and progression of many types of cancer. Here we summarize the current state of knowledge about the expression of HERVs, transcriptional regulation of host genes by HERVs, and the functions of HERVs in reverse transcription and gene editing with their reverse transcriptase.
Affiliation(s)
- Mengwen Zhang
  - The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  - Ministry of Education Key Laboratory of Cancer Prevention and Intervention, Second Affiliated Hospital, Cancer Institute, Zhejiang University School of Medicine, Hangzhou, China
- Shu Zheng
  - The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  - Ministry of Education Key Laboratory of Cancer Prevention and Intervention, Second Affiliated Hospital, Cancer Institute, Zhejiang University School of Medicine, Hangzhou, China
  - *Correspondence: Shu Zheng,
- Jessie Qiaoyi Liang
  - Department of Medicine and Therapeutics, Faculty of Medicine, Center for Gut Microbiota Research, Li Ka Shing Institute of Health Sciences, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
  - Jessie Qiaoyi Liang,
12
The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome. BMC Genomics 2022; 23:487. [PMID: 35787153 PMCID: PMC9251931 DOI: 10.1186/s12864-022-08717-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/17/2022] [Accepted: 06/16/2022] [Indexed: 12/30/2022] Open
Abstract
Investigating the functions and activities of genes requires proper annotation of the transcribed units. However, transcript assembly efforts have produced a surprisingly large variation in the number of transcripts, especially for noncoding transcripts. This heterogeneity in assembled transcript sets might be partially explained by sequencing depth. Here, we used real and simulated short-read sequencing data as well as long-read data to systematically investigate the impact of sequencing depth on the accuracy of assembled transcripts. We assembled and analyzed transcripts from 671 human short-read data sets and four long-read data sets. At the first level, there is a positive correlation between the number of reads and the number of recovered transcripts. However, the effect of sequencing depth varied with cell or tissue type, the type of read, and the nature and expression levels of the transcripts. The detection of coding transcripts saturated rapidly with both short and long reads; however, there was no sign of early saturation for noncoding transcripts at any sequencing depth. Increasing long-read sequencing depth specifically benefited transcripts containing transposable elements. Finally, we show how single-cell RNA-seq can be guided by transcripts assembled from bulk long-read samples, and demonstrate that noncoding transcripts are expressed at similar levels to coding transcripts but are expressed in fewer cells. This study highlights the impact of sequencing depth on transcript assembly.
|
13
|
Ma G, Babarinde IA, Zhou X, Hutchins AP. Transposable Elements in Pluripotent Stem Cells and Human Disease. Front Genet 2022; 13:902541. [PMID: 35719395 PMCID: PMC9201960 DOI: 10.3389/fgene.2022.902541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 05/20/2022] [Indexed: 11/18/2022] Open
Abstract
Transposable elements (TEs) are mobile genetic elements that can randomly integrate into other genomic sites. They have successfully replicated and now occupy around 40% of the total DNA sequence in humans. TEs in the genome have a complex relationship with the host cell, being both potentially deleterious and advantageous at the same time. Only a tiny minority of TEs are still capable of transposition, yet their fossilized sequence fragments are thought to be involved in various molecular processes, such as gene transcriptional activity, RNA stability and subcellular localization, and chromosomal architecture. TEs have also been implicated in biological processes, although it is often hard to separate cause from correlation because of formidable technical issues in analyzing TEs. In this review, we compare and contrast two views of TE activity: one in the pluripotent state, where TEs are broadly beneficial, or at least mechanistically useful, and a second in human disease, where TEs are uniformly considered harmful.
|
14
|
Metabolic and epigenetic dysfunctions underlie the arrest of in vitro fertilized human embryos in a senescent-like state. PLoS Biol 2022; 20:e3001682. [PMID: 35771762 PMCID: PMC9246109 DOI: 10.1371/journal.pbio.3001682] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/07/2022] [Accepted: 05/19/2022] [Indexed: 12/18/2022] Open
Abstract
Around 60% of in vitro fertilized (IVF) human embryos irreversibly arrest before compaction, between the 3- and 8-cell stages, posing a significant clinical problem. The mechanisms behind this arrest are unclear. Here, we show that the arrested embryos enter a senescent-like state, marked by cell cycle arrest, the down-regulation of ribosomes and histones, and down-regulation of MYC and p53 activity. The arrested embryos can be divided into 3 types. Type I embryos fail to complete the maternal-zygotic transition, and Type II/III embryos have low levels of glycolysis and either high (Type II) or low (Type III) levels of oxidative phosphorylation. Treatment with the SIRT agonist resveratrol or nicotinamide riboside (NR) can partially rescue the arrested phenotype, which is accompanied by changes in metabolic activity. Overall, our data suggest that metabolic and epigenetic dysfunctions underlie the arrest of human embryos.
|
15
|
Zhang T, Zheng R, Li M, Yan C, Lan X, Tong B, Lu P, Jiang W. Active endogenous retroviral elements in human pluripotent stem cells play a role in regulating host gene expression. Nucleic Acids Res 2022; 50:4959-4973. [PMID: 35451484 PMCID: PMC9122532 DOI: 10.1093/nar/gkac265] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/26/2021] [Revised: 03/22/2022] [Accepted: 04/01/2022] [Indexed: 12/20/2022] Open
Abstract
Human endogenous retroviruses, also called LTR elements, can be bound by transcription factors and marked by different histone modifications in different biological contexts. Recently, individual LTRs or certain subclasses of LTRs, such as the LTR7/HERVH and LTR5_Hs/HERVK families, have been identified as cis-regulatory elements. However, there are still many LTR elements with unknown functions. Here, we dissected the landscape of histone modifications and the regulatory map of LTRs by integrating 98 ChIP-seq datasets from human embryonic stem cells (ESCs), and annotated the active LTRs enriched for enhancer/promoter-related histone marks. Notably, we found that MER57E3 acted functionally as a proximal regulatory element to activate its respective ZNF gene. Additionally, HERVK transcripts could function mainly in the nucleus to activate adjacent genes. Since LTR5_Hs/LTR5 was bound by many early embryo-specific transcription factors, we further investigated its expression dynamics in different pluripotent states. LTR5_Hs/LTR5/HERVK exhibited higher expression levels in naïve ESCs and extended pluripotent stem cells (EPSCs). Functionally, LTR5_Hs/LTR5 elements with high activity could serve as distal enhancers to regulate host genes. Ultimately, our study not only provides a comprehensive regulatory map of LTRs in human ESCs, but also explores the regulatory models of MER57E3 and LTR5_Hs/LTR5 in the host genome.
Affiliation(s)
- Tianzhe Zhang
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Ran Zheng
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Mao Li
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Chenchao Yan
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Xianchun Lan
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Bei Tong
- Department of Cardiology, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Pei Lu
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Wei Jiang
- Department of Biological Repositories, Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China; Human Genetics Resource Preservation Center of Wuhan University, Wuhan 430071, China; Hubei Provincial Key Laboratory of Developmentally Originated Disease, Wuhan 430071, China
|