1
Kováčová M, Hlaváč V, Koževnikovová R, Rauš K, Gatěk J, Souček P. Artificial Intelligence-Driven Prediction Revealed CFTR Associated with Therapy Outcome of Breast Cancer: A Feasibility Study. Oncology 2024:1-12. [PMID: 39025053 DOI: 10.1159/000540395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 07/09/2024] [Indexed: 07/20/2024]
Abstract
INTRODUCTION In silico tools capable of predicting the functional consequences of genomic differences between individuals, many of which are AI-driven, have been the most effective over the past two decades for non-synonymous single nucleotide variants (nsSNVs). When these tools are appropriately selected for the purpose of the study, high predictive performance can be expected. In this feasibility study, we investigate the distribution of nsSNVs with an allele frequency below 5%. To classify their putative functional consequences, a tier-based filtration led by AI-driven predictors and a scoring system was incorporated into the overall decision-making process, resulting in a list of prioritised genes.
METHODS The study was conducted on breast cancer patients of homogeneous ethnicity. Germline rare variants were sequenced in genes that influence pharmacokinetic parameters of anticancer drugs or molecular signalling pathways in cancer. After AI-driven functional pathogenicity classification and data mining in pharmacogenomic (PGx) databases, variants were collapsed to the gene level and ranked according to their putative deleterious role.
RESULTS In breast cancer patients, seven of the twelve genes prioritised based on the predictions were found to be associated with response to oncotherapy, histological grade, and tumour subtype. Most importantly, we showed that the group of patients with at least one rare nsSNV in cystic fibrosis transmembrane conductance regulator (CFTR) had significantly reduced disease-free (log rank, p = 0.002) and overall survival (log rank, p = 0.006).
CONCLUSION AI-driven in silico analysis with PGx data mining provided an effective approach for navigating functional consequences across the germline genetic background, which can be easily integrated into the overall decision-making process for future studies. The study revealed statistically significant associations with numerous clinicopathological parameters, including treatment response. Our study indicates that CFTR may be involved in the processes influencing the effectiveness of oncotherapy or in the malignant progression of the disease itself.
Affiliation(s)
- Mária Kováčová
  - Third Faculty of Medicine, Charles University, Prague, Czechia
- Viktor Hlaváč
  - Laboratory of Pharmacogenomics, Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Pilsen, Czechia
  - Toxicogenomics Unit, National Institute of Public Health, Prague, Czechia
- Karel Rauš
  - Institute for the Care for Mother and Child, Prague, Czechia
- Jiří Gatěk
  - Department of Surgery, EUC Hospital and University of Tomas Bata in Zlin, Zlin, Czechia
- Pavel Souček
  - Laboratory of Pharmacogenomics, Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Pilsen, Czechia
  - Toxicogenomics Unit, National Institute of Public Health, Prague, Czechia
2
McCoy MJ, Fire AZ. Parallel gene size and isoform expansion of ancient neuronal genes. Curr Biol 2024; 34:1635-1645.e3. [PMID: 38460513 PMCID: PMC11043017 DOI: 10.1016/j.cub.2024.02.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/16/2023] [Accepted: 02/11/2024] [Indexed: 03/11/2024]
Abstract
How nervous systems evolved is a central question in biology. A diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning hundreds of thousands of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. Although many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and, in many cases, the emergence of neurons as dedicated cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes potentially driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection, as demonstrated by low dN/dS ratios, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J McCoy
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
- Andrew Z Fire
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA; Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
3
Lan L, Leng L, Liu W, Ren Y, Reeve W, Fu X, Wu Z, Zhang X. The haplotype-resolved telomere-to-telomere carnation (Dianthus caryophyllus) genome reveals the correlation between genome architecture and gene expression. Horticulture Research 2024; 11:uhad244. [PMID: 38225981 PMCID: PMC10788775 DOI: 10.1093/hr/uhad244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 11/12/2023] [Indexed: 01/17/2024]
Abstract
Carnation (Dianthus caryophyllus) is one of the most valuable commercial flowers, due to its richness of color and form, and its excellent storage and vase life. The diverse demands of the market require faster breeding in carnations. A full understanding of carnations is therefore required to guide the direction of breeding. Hence, we assembled the haplotype-resolved gap-free carnation genome of the variety 'Baltico', which is the most common white standard variety worldwide. Based on high-depth HiFi, ultra-long nanopore, and Hi-C sequencing data, we assembled the telomere-to-telomere (T2T) genome to be 564 479 117 and 568 266 215 bp for the two haplotypes Hap1 and Hap2, respectively. This T2T genome exhibited great improvement in genome assembly and annotation results compared with the former version. The improvements were seen when different approaches to evaluation were used. Our T2T genome first informs the analysis of the telomere and centromere region, enabling us to speculate about specific centromere characteristics that cannot be identified by high-order repeats in carnations. We analyzed allele-specific expression in three tissues and the relationship between genome architecture and gene expression in the haplotypes. This demonstrated that the length of the genes, coding sequences, and introns, the exon numbers and the transposable element insertions correlate with gene expression ratios and levels. The insertions of transposable elements repress expression in gene regulatory networks in carnation. This gap-free finished T2T carnation genome provides a valuable resource to illustrate the genome characteristics and for functional genomics analysis in further studies and molecular breeding.
Affiliation(s)
- Lan Lan
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
  - College of Science, Health, Engineering and Education, Murdoch University, Murdoch 6150, Western Australia, Australia
  - Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Luhong Leng
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
  - Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Weichao Liu
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
  - Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
  - Key Laboratory of Horticultural Plant Biology, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Yonglin Ren
  - College of Science, Health, Engineering and Education, Murdoch University, Murdoch 6150, Western Australia, Australia
- Wayne Reeve
  - College of Science, Health, Engineering and Education, Murdoch University, Murdoch 6150, Western Australia, Australia
- Xiaopeng Fu
  - Key Laboratory of Horticultural Plant Biology, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Zhiqiang Wu
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
  - Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Xiaoni Zhang
  - Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
4
Balasooriya GI, Wee TL, Spector DL. A sub-set of guanine- and cytosine-rich genes are actively transcribed at the nuclear Lamin B1 region. bioRxiv: The Preprint Server for Biology 2023:2023.10.28.564411. [PMID: 37961255 PMCID: PMC10634887 DOI: 10.1101/2023.10.28.564411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Chromatin organization in the mammalian cell nucleus plays a vital role in the regulation of gene expression. The lamina-associated domain at the inner nuclear membrane has been proposed to harbor heterochromatin, while the nuclear interior has been shown to contain most of the euchromatin. Here, we show that a sub-set of actively transcribing genes, marked by RNA Pol II pSer2, are associated with Lamin B1 at the inner nuclear envelope in mESCs, and that the number of these genes proportionally increases upon in vitro differentiation of mESCs to olfactory precursor cells. These nuclear periphery-associated actively transcribing genes primarily represent housekeeping genes, and their gene bodies are significantly enriched with guanine and cytosine compared to genes actively transcribed at the nuclear interior. We found the promoters of these genes to also be significantly enriched with guanine and to be predominantly regulated by zinc finger protein transcription factors. We provide evidence supporting the emerging notion that the Lamin B1 region is not solely transcriptionally silent.
5
McCoy MJ, Fire AZ. Ancient origins of complex neuronal genes. bioRxiv: The Preprint Server for Biology 2023:2023.03.28.534655. [PMID: 37034725 PMCID: PMC10081198 DOI: 10.1101/2023.03.28.534655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
How nervous systems evolved is a central question in biology. An increasing diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning millions of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. While many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and in many cases the emergence of dedicated neuronal cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection as demonstrated by low dN/dS scores, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J. McCoy
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Whitman Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Andrew Z. Fire
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
6
Chai S, Wakefield L, Norgard M, Li B, Enicks D, Marks DL, Grompe M. Strong ubiquitous micro-promoters for recombinant adeno-associated viral vectors. Mol Ther Methods Clin Dev 2023; 29:504-512. [PMID: 37287749 PMCID: PMC10241652 DOI: 10.1016/j.omtm.2023.05.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 05/12/2023] [Indexed: 06/09/2023]
Abstract
Significant progress has been made in developing recombinant adeno-associated virus (rAAV) for clinical gene therapy. While rAAV is a versatile gene delivery platform, its packaging limit of 4.7 kb limits the diseases it can target. Here, we report two unusually small promoters that enable the expression of larger transgenes than standard promoters. These micro-promoters are only 84 (MP-84) and 135 bp (MP-135) in size but have activity in most cells and tissues comparable to the CAG promoter, the strongest ubiquitous promoter to date. MP-84- and MP-135-based rAAV constructs displayed robust activity in cultured cells from the three different germ-layer lineages. In addition, reporter gene expression was documented in human primary hepatocytes and pancreatic islets and in multiple mouse tissues in vivo, including brain and skeletal muscle. MP-84 and MP-135 will enable the therapeutic expression of transgenes currently too large for rAAV vectors.
Affiliation(s)
- Sunghee Chai
  - Papé Family Pediatric Research Institute, Oregon Stem Cell Center, Portland, OR, USA
- Leslie Wakefield
  - Papé Family Pediatric Research Institute, Oregon Stem Cell Center, Portland, OR, USA
- Mason Norgard
  - Department of Pediatrics, Oregon Health & Science University, Portland, OR, USA
- Bin Li
  - Papé Family Pediatric Research Institute, Oregon Stem Cell Center, Portland, OR, USA
- David Enicks
  - Papé Family Pediatric Research Institute, Oregon Stem Cell Center, Portland, OR, USA
- Daniel L. Marks
  - Department of Pediatrics, Oregon Health & Science University, Portland, OR, USA
- Markus Grompe
  - Papé Family Pediatric Research Institute, Oregon Stem Cell Center, Portland, OR, USA
7
Truong DD, Lamhamedi-Cherradi SE, Porter RW, Krishnan S, Swaminathan J, Gibson A, Lazar AJ, Livingston JA, Gopalakrishnan V, Gordon N, Daw NC, Navin NE, Gorlick R, Ludwig JA. Dissociation protocols used for sarcoma tissues bias the transcriptome observed in single-cell and single-nucleus RNA sequencing. BMC Cancer 2023; 23:488. [PMID: 37254069 PMCID: PMC10230784 DOI: 10.1186/s12885-023-10977-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/06/2022] [Accepted: 05/17/2023] [Indexed: 06/01/2023] Open
Abstract
BACKGROUND Single-cell RNA-seq has emerged as an innovative technology used to study complex tissues and characterize cell types, states, and lineages at a single-cell level. Classification of bulk tumors by their individual cellular constituents has also created new opportunities to generate single-cell atlases for many organs, cancers, and developmental models. Despite the tremendous promise of this technology, recent evidence studying epithelial tissues and diverse carcinomas suggests the methods used for tissue processing, cell disaggregation, and preservation can significantly bias gene expression and alter the observed cell types. To determine whether sarcomas - tumors of mesenchymal origin - are subject to the same technical artifacts, we profiled patient-derived tumor explants (PDXs) propagated from three aggressive subtypes: osteosarcoma (OS), Ewing sarcoma (ES), and desmoplastic small round cell tumor (DSRCT). Given the rarity of these sarcoma subtypes, we explored whether single-nuclei RNA-seq from more widely available archival frozen specimens could accurately be identified by gene expression signatures linked to tissue phenotype or pathognomonic fusion proteins.
RESULTS We systematically assessed dissociation methods across different sarcoma subtypes. We compared gene expression from single-cell and single-nucleus RNA-sequencing of 125,831 whole-cells and nuclei from ES, DSRCT, and OS PDXs. We detected warm dissociation artifacts in single-cell samples and gene length bias in single-nucleus samples. Classic sarcoma gene signatures were observed regardless of the dissociation method. In addition, we showed that dissociation method biases could be computationally corrected.
CONCLUSIONS We highlighted transcriptional biases, including warm dissociation and gene-length biases, introduced by the dissociation method for various sarcoma subtypes. This work is the first to characterize how the dissociation methods used for sc/snRNA-seq may affect the interpretation of the molecular features in sarcoma PDXs.
Affiliation(s)
- Danh D Truong
  - Sarcoma Medical Oncology Department, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Robert W Porter
  - Sarcoma Medical Oncology Department, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Sandhya Krishnan
  - Sarcoma Medical Oncology Department, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Amber Gibson
  - Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Alexander J Lazar
  - Division of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- J Andrew Livingston
  - Sarcoma Medical Oncology Department, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Vidya Gopalakrishnan
  - Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Nancy Gordon
  - Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Najat C Daw
  - Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Nicholas E Navin
  - Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Richard Gorlick
  - Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Joseph A Ludwig
  - Sarcoma Medical Oncology Department, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.
8
Cao Y. Neural induction drives body axis formation during embryogenesis, but a neural induction-like process drives tumorigenesis in postnatal animals. Front Cell Dev Biol 2023; 11:1092667. [PMID: 37228646 PMCID: PMC10203556 DOI: 10.3389/fcell.2023.1092667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 04/17/2023] [Indexed: 05/27/2023] Open
Abstract
Characterization of cancer cells and neural stem cells indicates that tumorigenicity and pluripotency are coupled cell properties determined by neural stemness, and tumorigenesis represents a process of progressive loss of original cell identity and gain of neural stemness. This is reminiscent of a most fundamental process required for the development of the nervous system and body axis during embryogenesis, i.e., embryonic neural induction. In neural induction, in response to extracellular signals that are secreted by the Spemann-Mangold organizer in amphibians or the node in mammals and that inhibit epidermal fate in ectoderm, the ectodermal cells lose their epidermal fate, assume the neural default fate, and consequently turn into neuroectodermal cells. They further differentiate into the nervous system and also some non-neural cells via interaction with adjacent tissues. Failure in neural induction leads to failure of embryogenesis, and ectopic neural induction due to ectopic organizer or node activity or activation of embryonic neural genes causes the formation of a secondary body axis or a conjoined twin. During tumorigenesis, cells progressively lose their original cell identity and gain neural stemness, and consequently gain tumorigenicity and pluripotency, due to various intra-/extracellular insults in cells of a postnatal animal. Tumorigenic cells can be induced to differentiate into normal cells and integrate into normal embryonic development within an embryo. However, they form tumors and cannot integrate into animal tissues/organs in a postnatal animal because of the lack of embryonic inducing signals. The combination of studies of developmental and cancer biology indicates that neural induction drives embryogenesis in gastrulating embryos but a similar process drives tumorigenesis in a postnatal animal. Tumorigenicity is by nature the manifestation of aberrant occurrence of a pluripotent state in a postnatal animal. Pluripotency and tumorigenicity are distinct manifestations of neural stemness in pre- and postnatal stages of animal life, respectively. Based on these findings, I discuss some confusion in cancer research, propose to distinguish causality from association and to discriminate causal from supporting factors involved in tumorigenesis, and suggest revisiting the focus of cancer research.
Affiliation(s)
- Ying Cao
  - Shenzhen Research Institute of Nanjing University, Shenzhen, China
  - MOE Key Laboratory of Model Animals for Disease Study, Model Animal Research Center of Medical School, Nanjing University, Nanjing, China
  - Jiangsu Key Laboratory of Molecular Medicine of Medical School, Nanjing University, Nanjing, China
9
Mimoso CA, Adelman K. U1 snRNP increases RNA Pol II elongation rate to enable synthesis of long genes. Mol Cell 2023; 83:1264-1279.e10. [PMID: 36965480 PMCID: PMC10135401 DOI: 10.1016/j.molcel.2023.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/28/2022] [Revised: 02/06/2023] [Accepted: 02/28/2023] [Indexed: 03/27/2023]
Abstract
The expansion of introns within mammalian genomes poses a challenge for the production of full-length messenger RNAs (mRNAs), with increasing evidence that these long AT-rich sequences present obstacles to transcription. Here, we investigate RNA polymerase II (RNAPII) elongation at high resolution in mammalian cells and demonstrate that RNAPII transcribes faster across introns. Moreover, we find that this acceleration requires the association of U1 snRNP (U1) with the elongation complex at 5' splice sites. The role of U1 to stimulate elongation rate through introns reduces the frequency of both premature termination and transcriptional arrest, thereby dramatically increasing RNA production. We further show that changes in RNAPII elongation rate due to AT content and U1 binding explain previous reports of pausing or termination at splice junctions and the edge of CpG islands. We propose that U1-mediated acceleration of elongation has evolved to mitigate the risks that long AT-rich introns pose to transcript completion.
Affiliation(s)
- Claudia A Mimoso
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Karen Adelman
  - Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA; Ludwig Center at Harvard, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
10
Khandia R, Saeed M, Alharbi AM, Ashraf GM, Greig NH, Kamal MA. Codon Usage Bias Correlates With Gene Length in Neurodegeneration Associated Genes. Front Neurosci 2022; 16:895607. [PMID: 35860292 PMCID: PMC9289476 DOI: 10.3389/fnins.2022.895607] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/14/2022] [Accepted: 06/08/2022] [Indexed: 11/13/2022] Open
Abstract
Codon usage analysis is a crucial part of molecular characterization and is used to determine the factors affecting the evolution of a gene. The length of a gene is an important parameter that affects the characteristics of the gene, such as codon usage, compositional parameters, and sometimes, its functions. In the present study, we investigated the association of various parameters related to codon usage with the length of genes. Gene expression is affected by nucleotide disproportion. In sixty genes related to neurodegenerative disorders, the G nucleotide was the most abundant and the T nucleotide was the least. The nucleotide T exhibited a significant association with the length of the gene at both the overall compositional level and the first and second codon positions. Codon usage bias (CUB) of these genes was affected by pyrimidine and keto skews. Gene length was found to be significantly correlated with codon bias in neurodegeneration associated genes. In gene segments with lengths below 1,200 bp and above 2,400 bp, CUB was positively associated with length. Relative synonymous CUB, which is another measure of CUB, showed that codons TTA, GTT, GTC, TCA, GGT, and GGA exhibited a positive association with length, whereas codons GTA, AGC, CGT, CGA, and GGG showed a negative association. GC-ending codons were preferred over AT-ending codons. Overall analysis indicated that the association between CUB and length varies depending on the segment size; however, CUB of 1,200–2,000 bp gene segments appeared not to be affected by gene length. In synopsis, the analysis suggests that the length of the genes correlates with various imperative molecular signatures, including A/T nucleotide disproportion and codon choices. In the present study we additionally evaluated various molecular features and their correlation with different indices of codon usage, like the Codon Adaptation Index (CAI) and Relative Synonymous Codon Usage (RSCU) of codons. We also considered the impact of gene fragment size on different molecular features in genes related to neurodegeneration. This analysis will aid in understanding and potentially modulating gene expression in cases of defective gene functioning in clinical settings.
Affiliation(s)
- Rekha Khandia
  - Department of Biochemistry and Genetics, Barkatullah University, Bhopal, India
  - *Correspondence: Rekha Khandia
- Mohd. Saeed
  - Department of Biology, College of Sciences, University of Hail, Hail, Saudi Arabia
- Ahmed M. Alharbi
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
- Ghulam Md. Ashraf
  - Pre-clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Nigel H. Greig
  - Drug Design and Development Section, Translational Gerontology Branch, Intramural Research Program National Institute on Aging, NIH, Baltimore, MD, United States
- Mohammad Amjad Kamal
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
  - Enzymoics, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
11
Zhang M, Liu Y, Shi L, Fang L, Xu L, Cao Y. Neural stemness unifies cell tumorigenicity and pluripotent differentiation potential. J Biol Chem 2022; 298:102106. [PMID: 35671824 PMCID: PMC9254501 DOI: 10.1016/j.jbc.2022.102106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 05/24/2022] [Accepted: 05/25/2022] [Indexed: 11/16/2022] Open
Abstract
Neural stemness is suggested to be the ground state of tumorigenicity and pluripotent differentiation potential. However, the relationship between these cell properties is unclear. Here, by disrupting the neural regulatory network in neural stem and cancer cells and by serial transplantation of cancer cells, we show that tumorigenicity and pluripotent differentiation potential are coupled cell properties unified by neural stemness. We show that loss of neural stemness via inhibition of SETDB1, an oncoprotein with enriched expression in embryonic neural cells during vertebrate embryogenesis, led to neuronal differentiation with reduced tumorigenicity and pluripotent differentiation potential in neural stem and cancer cells, whereas enhancement of neural stemness by SETDB1 overexpression caused the opposite effects. SETDB1 maintains a regulatory network comprising proteins involved in developmental programs and basic cellular functional machineries, including epigenetic modifications (EZH2), ribosome biogenesis (RPS3), translation initiation (EIF4G), and spliceosome assembly (SF3B1); all of these proteins are enriched in embryonic neural cells and play active roles in cancers. In addition, SETDB1 represses the transcription of genes promoting differentiation and cell cycle and growth arrest. Serial transplantation of cancer cells showed that neural stemness, tumorigenicity, and pluripotent differentiation potential were simultaneously enhanced; these effects were accompanied by increased expression of proteins involved in developmental programs and basic machineries, including SETDB1 and the abovementioned proteins, as well as by increased alternative splicing events. These results indicate that basic machineries work together to define a highly proliferative state with pluripotent differentiation potential and also suggest that neural stemness unifies tumorigenicity and differentiation potential.
Affiliation(s)
- Min Zhang
- Shenzhen Research Institute of Nanjing University, Shenzhen, China; MOE Key Laboratory of Model Animals for Disease Study and State Key Laboratory of Pharmaceutical Biotechnology, Model Animal Research Center of Medical School
| | - Yang Liu
- Shenzhen Research Institute of Nanjing University, Shenzhen, China; MOE Key Laboratory of Model Animals for Disease Study and State Key Laboratory of Pharmaceutical Biotechnology, Model Animal Research Center of Medical School
| | - Lihua Shi
- MOE Key Laboratory of Model Animals for Disease Study and State Key Laboratory of Pharmaceutical Biotechnology, Model Animal Research Center of Medical School
| | - Lei Fang
- Jiangsu Key Laboratory of Molecular Medicine of Medical School, Nanjing University, Nanjing, China
| | - Liyang Xu
- MOE Key Laboratory of Model Animals for Disease Study and State Key Laboratory of Pharmaceutical Biotechnology, Model Animal Research Center of Medical School
| | - Ying Cao
- Shenzhen Research Institute of Nanjing University, Shenzhen, China; MOE Key Laboratory of Model Animals for Disease Study and State Key Laboratory of Pharmaceutical Biotechnology, Model Animal Research Center of Medical School.
| |
|
12
|
Sugasawa T, Komine R, Manevich L, Tamai S, Takekoshi K, Kanki Y. Gene Expression Profile Provides Novel Insights of Fasting-Refeeding Response in Zebrafish Skeletal Muscle. Nutrients 2022; 14:nu14112239. [PMID: 35684038 PMCID: PMC9182819 DOI: 10.3390/nu14112239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/02/2022] [Revised: 05/24/2022] [Accepted: 05/25/2022] [Indexed: 02/05/2023] Open
Abstract
Recently, fasting has been spotlighted from a healthcare perspective. However, the detailed biological mechanisms and significance by which the effects of fasting confer health benefits are not yet clear. Due to certain advantages of the zebrafish as a vertebrate model, it is widely utilized in biological studies. However, the biological responses to nutrient metabolism within zebrafish skeletal muscles have not yet been amply reported. Therefore, we aimed to reveal a gene expression profile in zebrafish skeletal muscles in response to fasting-refeeding. Accordingly, mRNA-sequencing and bioinformatics analysis were performed to examine comprehensive gene expression changes in skeletal muscle tissues during fasting-refeeding. Our results produced a novel set of nutrition-related genes under a fasting-refeeding protocol. Moreover, we found that five genes were dramatically upregulated in each fasting (for 24 h) and refeeding (after 3 h), exhibiting a rapid response to the provided conditional changes. The assessment of the gene length revealed that the gene set whose expression was elevated only after 3 h of refeeding had a shorter length, suggesting that nutrition-related gene function is associated with gene length. Taken together, our results from the bioinformatics analyses provide new insights into biological mechanisms induced by fasting-refeeding conditions within zebrafish skeletal muscle.
Affiliation(s)
- Takehito Sugasawa
- Laboratory of Clinical Examination and Sports Medicine, Department of Clinical Medicine, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Department of Sports Medicine Analysis, Open Facility Network Office, Organization for Open Facility Initiatives, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
| | - Ritsuko Komine
- Department of Sports Medicine Analysis, Open Facility Network Office, Organization for Open Facility Initiatives, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Doctoral Program in Sports Medicine, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
| | - Lev Manevich
- Experimental Pathology, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Doctoral Program in Biomedical Sciences, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
| | - Shinsuke Tamai
- Laboratory of Clinical Examination and Sports Medicine, Department of Clinical Medicine, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Department of Sport Science and Research, Japan Institute of Sports Sciences, 3-15-1 Nishigaoka, Kita-ku, Tokyo 115-0056, Japan
| | - Kazuhiro Takekoshi
- Laboratory of Clinical Examination and Sports Medicine, Department of Clinical Medicine, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Correspondence: Tel.: +81-29-853-3209 (K.T. & Y.K.)
| | - Yasuharu Kanki
- Laboratory of Clinical Examination and Sports Medicine, Department of Clinical Medicine, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Department of Sports Medicine Analysis, Open Facility Network Office, Organization for Open Facility Initiatives, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8577, Ibaraki, Japan
- Correspondence: Tel.: +81-29-853-3209 (K.T. & Y.K.)
| |
|
13
|
Rekad Z, Izzi V, Lamba R, Ciais D, Van Obberghen-Schilling E. The Alternative Matrisome: alternative splicing of ECM proteins in development, homeostasis and tumor progression. Matrix Biol 2022; 111:26-52. [DOI: 10.1016/j.matbio.2022.05.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/05/2022] [Revised: 04/19/2022] [Accepted: 05/04/2022] [Indexed: 12/14/2022]
|
14
|
Foe VE. Does the Pachytene Checkpoint, a Feature of Meiosis, Filter Out Mistakes in Double-Strand DNA Break Repair and as a side-Effect Strongly Promote Adaptive Speciation? Integr Org Biol 2022; 4:obac008. [PMID: 36827645 PMCID: PMC8998493 DOI: 10.1093/iob/obac008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022] Open
Abstract
This essay aims to explain two biological puzzles: why eukaryotic transcription units are composed of short segments of coding DNA interspersed with long stretches of non-coding (intron) DNA, and the near ubiquity of sexual reproduction. As is well known, alternative splicing of its coding sequences enables one transcription unit to produce multiple variants of each encoded protein. Additionally, padding transcription units with non-coding DNA (often many thousands of base pairs long) provides a readily evolvable way to set how soon in a cell cycle the various mRNAs will begin being expressed and the total amount of mRNA that each transcription unit can make during a cell cycle. This regulation complements control via the transcriptional promoter and facilitates the creation of complex eukaryotic cell types, tissues, and organisms. However, it also makes eukaryotes exceedingly vulnerable to double-strand DNA breaks, which end-joining break repair pathways can repair incorrectly. Transcription units cover such a large fraction of the genome that any mis-repair producing a reorganized chromosome has a high probability of destroying a gene. During meiosis, the synaptonemal complex aligns homologous chromosome pairs and the pachytene checkpoint detects, selectively arrests, and in many organisms actively destroys gamete-producing cells with chromosomes that cannot adequately synapse; this creates a filter favoring transmission to the next generation of chromosomes that retain the parental organization, while selectively culling those with interrupted transcription units. This same meiotic checkpoint, reacting to accidental chromosomal reorganizations inflicted by error-prone break repair, can, as a side effect, provide a mechanism for the formation of new species in sympatry. It has been a long-standing puzzle how something as seemingly maladaptive as hybrid sterility between such new species can arise. 
I suggest that this paradox is resolved by understanding the adaptive importance of the pachytene checkpoint, as outlined above.
|
15
|
Ramakrishna NB, Murison K, Miska EA, Leitch HG. Epigenetic Regulation during Primordial Germ Cell Development and Differentiation. Sex Dev 2021; 15:411-431. [PMID: 34847550 DOI: 10.1159/000520412] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/30/2021] [Accepted: 10/10/2021] [Indexed: 11/19/2022] Open
Abstract
Germline development varies significantly across metazoans. However, mammalian primordial germ cell (PGC) development has key conserved landmarks, including a critical period of epigenetic reprogramming that precedes sex-specific differentiation and gametogenesis. Epigenetic alterations in the germline are of unique importance due to their potential to impact the next generation. Therefore, regulation of, and by, the non-coding genome is of utmost importance during these epigenomic events. Here, we detail the key chromatin changes that occur during mammalian PGC development and how these interact with the expression of non-coding RNAs alongside broader epitranscriptomic changes. We identify gaps in our current knowledge, in particular regarding epigenetic regulation in the human germline, and we highlight important areas of future research.
Affiliation(s)
- Navin B Ramakrishna
- Wellcome/CRUK Gurdon Institute, University of Cambridge, Cambridge, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Genome Institute of Singapore, A*STAR, Biopolis, Singapore, Singapore
| | - Keir Murison
- MRC London Institute of Medical Sciences, London, United Kingdom
- Institute of Clinical Sciences, Imperial College London, London, United Kingdom
| | - Eric A Miska
- Wellcome/CRUK Gurdon Institute, University of Cambridge, Cambridge, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Harry G Leitch
- MRC London Institute of Medical Sciences, London, United Kingdom
- Institute of Clinical Sciences, Imperial College London, London, United Kingdom
- Centre for Paediatrics and Child Health, Faculty of Medicine, Imperial College London, London, United Kingdom
| |
|
16
|
Cao Y. Neural is Fundamental: Neural Stemness as the Ground State of Cell Tumorigenicity and Differentiation Potential. Stem Cell Rev Rep 2021; 18:37-55. [PMID: 34714532 DOI: 10.1007/s12015-021-10275-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/01/2021] [Indexed: 01/07/2023]
Abstract
Tumorigenic cells are similar to neural stem cells or embryonic neural cells in regulatory networks, tumorigenicity and pluripotent differentiation potential. By integrating the evidence from developmental biology, tumor biology and evolution, I will make a detailed discussion on the observations and propose that neural stemness underlies two coupled cell properties, tumorigenicity and pluripotent differentiation potential. Neural stemness property of tumorigenic cells can hopefully integrate different observations/concepts underlying tumorigenesis. Neural stem cells and tumorigenic cells share regulatory networks; both exhibit neural stemness, tumorigenicity and pluripotent differentiation potential; both depend on expression or activation of ancestral genes; both rely primarily on aerobic glycolytic metabolism; both can differentiate into various cells/tissues that are derived from three germ layers, leading to tumor formation resembling severely disorganized or more degenerated process of embryonic tissue differentiation; both are enriched in long genes with more splice variants that provide more plastic scaffolds for cell differentiation, etc. Neural regulatory networks, which include higher levels of basic machineries of cell physiological functions and developmental programs, work concertedly to define a basic state with fast cell cycle and proliferation. This is predestined by the evolutionary advantage of neural state, the ground or initial state for multicellularity with adaptation to an ancient environment. Tumorigenesis might represent a process of restoration of neural ground state, thereby restoring a state with fast proliferation and pluripotent differentiation potential in somatic cells. Tumorigenesis and pluripotent differentiation potential might be better understood from understanding neural stemness, and cancer therapy should benefit more from targeting neural stemness.
|
17
|
Ho JSY, Di Tullio F, Schwarz M, Low D, Incarnato D, Gay F, Tabaglio T, Zhang J, Wollmann H, Chen L, An O, Chan THM, Hall Hickman A, Zheng S, Roudko V, Chen S, Karz A, Ahmed M, He HH, Greenbaum BD, Oliviero S, Serresi M, Gargiulo G, Mann KM, Hernando E, Mulholland D, Marazzi I, Wee DKB, Guccione E. HNRNPM controls circRNA biogenesis and splicing fidelity to sustain cancer cell fitness. eLife 2021; 10:e59654. [PMID: 34075878 PMCID: PMC8346284 DOI: 10.7554/elife.59654] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/04/2020] [Accepted: 05/30/2021] [Indexed: 12/25/2022] Open
Abstract
High spliceosome activity is a dependency for cancer cells, making them more vulnerable to perturbation of the splicing machinery compared to normal cells. To identify splicing factors important for prostate cancer (PCa) fitness, we performed pooled shRNA screens in vitro and in vivo. Our screens identified heterogeneous nuclear ribonucleoprotein M (HNRNPM) as a regulator of PCa cell growth. RNA- and eCLIP-sequencing identified HNRNPM binding to transcripts of key homeostatic genes. HNRNPM binding to its targets prevents aberrant exon inclusion and backsplicing events. In both linear and circular mis-spliced transcripts, HNRNPM preferentially binds to GU-rich elements in long flanking proximal introns. Mimicry of HNRNPM-dependent linear-splicing events using splice-switching-antisense-oligonucleotides was sufficient to inhibit PCa cell growth. This suggests that PCa dependence on HNRNPM is likely a result of mis-splicing of key homeostatic coding and non-coding genes. Our results have further been confirmed in other solid tumors. Taken together, our data reveal a role for HNRNPM in supporting cancer cell fitness. Inhibition of HNRNPM activity is therefore a potential therapeutic strategy in suppressing growth of PCa and other solid tumors.
Affiliation(s)
- Jessica SY Ho
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Federico Di Tullio
- Center for Therapeutics Discovery, Department of Oncological Sciences and Pharmacological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Megan Schwarz
- Center for Therapeutics Discovery, Department of Oncological Sciences and Pharmacological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Diana Low
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Danny Incarnato
- IIGM (Italian Institute for Genomic Medicine), Torino, Italy
- Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Torino, Italy
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, Netherlands
| | - Florence Gay
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Tommaso Tabaglio
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - JingXian Zhang
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Heike Wollmann
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Leilei Chen
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Omer An
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
| | - Tim Hon Man Chan
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
| | - Alexander Hall Hickman
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Simin Zheng
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, United States
- NTU Institute of Structural Biology, Nanyang Technological University, Singapore, Singapore
| | - Vladimir Roudko
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Sujun Chen
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Center, University Health Network, Toronto, Canada
- Ontario Institute for Cancer Research, Toronto, Canada
| | - Alcida Karz
- Interdisciplinary Melanoma Cooperative Group, New York University Langone Medical Center, New York, United States
- Department of Pathology, New York University Langone Medical Center, New York, United States
| | - Musaddeque Ahmed
- Princess Margaret Cancer Center, University Health Network, Toronto, Canada
| | - Housheng Hansen He
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Center, University Health Network, Toronto, Canada
| | - Benjamin D Greenbaum
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Medicine, Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Salvatore Oliviero
- IIGM (Italian Institute for Genomic Medicine), Torino, Italy
- Dipartimento di Scienze della Vita e Biologia dei Sistemi, Università di Torino, Torino, Italy
| | - Michela Serresi
- Max Delbruck Center for Molecular Medicine, Berlin-Buch, Germany
| | | | - Karen M Mann
- Department of Molecular Oncology, Moffitt Cancer Center, Tampa, United States
| | - Eva Hernando
- Interdisciplinary Melanoma Cooperative Group, New York University Langone Medical Center, New York, United States
- Department of Pathology, New York University Langone Medical Center, New York, United States
| | - David Mulholland
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Ivan Marazzi
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, United States
| | - Dave Keng Boon Wee
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Ernesto Guccione
- Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Center for Therapeutics Discovery, Department of Oncological Sciences and Pharmacological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
| |
|
18
|
Telonis AG, Rigoutsos I. The transcriptional trajectories of pluripotency and differentiation comprise genes with antithetical architecture and repetitive-element content. BMC Biol 2021; 19:60. [PMID: 33765992 PMCID: PMC7995781 DOI: 10.1186/s12915-020-00928-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2020] [Accepted: 11/18/2020] [Indexed: 12/12/2022] Open
Abstract
Background Extensive molecular differences exist between proliferative and differentiated cells. Here, we conduct a meta-analysis of publicly available transcriptomic datasets from preimplantation and differentiation stages examining the architectural properties and content of genes whose abundance changes significantly across developmental time points. Results Analysis of preimplantation embryos from human and mouse showed that short genes whose introns are enriched in Alu (human) and B (mouse) elements, respectively, have higher abundance in the blastocyst compared to the zygote. These highly expressed genes encode ribosomal proteins or metabolic enzymes. On the other hand, long genes whose introns are depleted in repetitive elements have lower abundance in the blastocyst and include genes from signaling pathways. Additionally, the sequences of the genes that are differentially expressed between the blastocyst and the zygote contain distinct collections of pyknon motifs that differ between up- and down-regulated genes. Further examination of the genes that participate in the stem cell-specific protein interaction network shows that their introns are short and enriched in Alu (human) and B (mouse) elements. As organogenesis progresses, in both human and mouse, we find that the primarily short and repeat-rich expressed genes make way for primarily longer, repeat-poor genes. With that in mind, we used a machine learning-based approach to identify gene signatures able to classify human adult tissues: we find that the most discriminatory genes comprising these signatures have long introns that are repeat-poor and include transcription factors and signaling-cascade genes. The introns of widely expressed genes across human tissues, on the other hand, are short and repeat-rich, and coincide with those with the highest expression at the blastocyst stage. 
Conclusions Protein-coding genes that are characteristic of each trajectory, i.e., proliferation/pluripotency or differentiation, exhibit antithetical biases in their intronic and exonic lengths and in their repetitive-element content. While the respective human and mouse gene signatures are functionally and evolutionarily conserved, their introns and exons are enriched or depleted in organism-specific repetitive elements. We posit that these organism-specific repetitive sequences found in exons and introns are used to effect the corresponding genes’ regulation. Supplementary Information The online version contains supplementary material available at 10.1186/s12915-020-00928-8.
Affiliation(s)
- Aristeidis G Telonis
- Computational Medicine Center, Sidney Kimmel College of Medicine, Thomas Jefferson University, 1020 Locust Street, Suite M81, Philadelphia, PA, 19107, USA; Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA.
| | - Isidore Rigoutsos
- Computational Medicine Center, Sidney Kimmel College of Medicine, Thomas Jefferson University, 1020 Locust Street, Suite M81, Philadelphia, PA, 19107, USA.
| |
|
19
|
Savarese M, Välipakka S, Johari M, Hackman P, Udd B. Is Gene-Size an Issue for the Diagnosis of Skeletal Muscle Disorders? J Neuromuscul Dis 2021; 7:203-216. [PMID: 32176652 PMCID: PMC7369045 DOI: 10.3233/jnd-190459] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/15/2022]
Abstract
Human genes have a variable length. Those having a coding sequence of extraordinary length and a high number of exons were almost impossible to sequence using the traditional Sanger-based gene-by-gene approach. High-throughput sequencing has partly overcome the size-related technical issues, enabling a straightforward, rapid and relatively inexpensive analysis of large genes. Several large genes (e.g. TTN, NEB, RYR1, DMD) are recognized as disease-causing in patients with skeletal muscle diseases. However, because of their sheer size, the clinical interpretation of variants in these genes is probably the most challenging aspect of the high-throughput genetic investigation in the field of skeletal muscle diseases. The main aim of this review is to discuss the technical and interpretative issues related to the diagnostic investigation of large genes and to reflect upon the current state of the art and the future advancements in the field.
Affiliation(s)
- Marco Savarese
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
| | - Salla Välipakka
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
| | - Mridul Johari
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
| | - Peter Hackman
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
| | - Bjarne Udd
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland; Neuromuscular Research Center, Tampere University and University Hospital, Tampere, Finland; Department of Neurology, Vaasa Central Hospital, Vaasa, Finland
| |
|
20
|
Lopes I, Altab G, Raina P, de Magalhães JP. Gene Size Matters: An Analysis of Gene Length in the Human Genome. Front Genet 2021; 12:559998. [PMID: 33643374 PMCID: PMC7905317 DOI: 10.3389/fgene.2021.559998] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 05/07/2020] [Accepted: 01/06/2021] [Indexed: 12/23/2022] Open
Abstract
While it is expected for gene length to be associated with factors such as intron number and evolutionary conservation, we are yet to understand the connections between gene length and function in the human genome. In this study, we show that, as expected, there is a strong positive correlation between gene length, transcript length, and protein size as well as a correlation with the number of genetic variants and introns. Among tissue-specific genes, we find that the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and the brain, while the smallest transcripts tend to be expressed in the pancreas, skin, stomach, vagina, and testis. We report, as shown previously, that natural selection suppresses changes for genes with longer transcripts and promotes changes for genes with smaller transcripts. We also observe that genes with longer transcripts tend to have a higher number of co-expressed genes and protein-protein interactions, as well as more associated publications. In the functional analysis, we show that bigger transcripts are often associated with neuronal development, while smaller transcripts tend to play roles in skin development and in the immune system. Furthermore, pathways related to cancer, neurons, and heart diseases tend to have genes with longer transcripts, with smaller transcripts being present in pathways related to immune responses and neurodegenerative diseases. Based on our results, we hypothesize that longer genes tend to be associated with functions that are important in the early development stages, while smaller genes tend to play a role in functions that are important throughout the whole life, like the immune system, which requires fast responses.
Affiliation(s)
| | | | | | - João Pedro de Magalhães
- Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
| |
|
21
|
Xu L, Zhang M, Shi L, Yang X, Chen L, Cao N, Lei A, Cao Y. Neural stemness contributes to cell tumorigenicity. Cell Biosci 2021; 11:21. [PMID: 33468253 PMCID: PMC7814647 DOI: 10.1186/s13578-021-00531-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/29/2020] [Accepted: 01/05/2021] [Indexed: 01/08/2023] Open
Abstract
Background Previous studies demonstrated the dependence of cancer on nerve. Recently, a growing number of studies reveal that cancer cells share the property and regulatory network with neural stem/progenitor cells. However, relationship between the property of neural stemness and cell tumorigenicity is unknown. Results We show that neural stem/progenitor cells, but not non-neural embryonic or somatic stem/progenitor cell types, exhibit tumorigenicity and the potential for differentiation into tissue types of all germ layers when they are placed in non-native environment by transplantation into immunodeficient nude mice. Likewise, cancer cells capable of tumor initiation have the property of neural stemness because of their abilities in neurosphere formation in neural stem cell-specific serum-free medium and in differentiation potential, in addition to their neuronal differentiation potential that was characterized previously. Moreover, loss of a pro-differentiation factor in myoblasts, which have no tumorigenicity, lead to the loss of myoblast identity, and gain of the property of neural stemness, tumorigenicity and potential for re-differentiation. By contrast, loss of neural stemness via differentiation results in the loss of tumorigenicity. These suggest that the property of neural stemness contributes to cell tumorigenicity, and tumor phenotypic heterogeneity might be an effect of differentiation potential of neural stemness. Bioinformatic analysis reveals that neural genes in general are correlated with embryonic development and cancer, in addition to their role in neural development; whereas non-neural genes are not. Most of neural specific genes emerged in typical species representing transition from unicellularity to multicellularity during evolution. Genes in Monosiga brevicollis, a unicellular species that is a closest known relative of metazoans, are biased toward neural cells. 
Conclusions We suggest that the property of neural stemness is the source of cell tumorigenicity. This is because the neural-biased unicellular state is the ground state for multicellularity, and hence for cell type diversification or differentiation during evolution, and tumorigenesis is a process of restoration of the neural ground state in somatic cells along a default route that is pre-determined by an evolutionary advantage of the neural state.
Affiliation(s)
- Liyang Xu
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Min Zhang
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Lihua Shi
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Xiaoli Yang
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Lu Chen
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Ning Cao
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Anhua Lei
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
- Ying Cao
- MOE Key Laboratory of Model Animals for Disease Study, and Model Animal Research Center of the Medical School, Nanjing University, 12 Xuefu Road, Pukou High-Tech Zone, Nanjing, 210061, China
|
22
|
McCoy MJ, Fire AZ. Intron and gene size expansion during nervous system evolution. BMC Genomics 2020; 21:360. [PMID: 32410625 PMCID: PMC7222433 DOI: 10.1186/s12864-020-6760-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/30/2020] [Accepted: 04/28/2020] [Indexed: 01/07/2023] Open
Abstract
Background The evolutionary radiation of animals was accompanied by extensive expansion of gene and genome sizes, increased isoform diversity, and complexity of regulation. Results Here we show that the longest genes are enriched for expression in neuronal tissues of diverse vertebrates and of invertebrates. Additionally, we show that neuronal gene size expansion occurred predominantly through net gains in intron size, with a positional bias toward the 5′ end of each gene. Conclusions We find that intron and gene size expansion is a feature of many genes whose expression is enriched in nervous systems. We speculate that unique attributes of neurons may subject neuronal genes to evolutionary forces favoring net size expansion. This process could be associated with tissue-specific constraints on gene function and/or the evolution of increasingly complex gene regulation in nervous systems.
Affiliation(s)
- Matthew J McCoy
- Grass Fellowship Program, Marine Biological Laboratory, Woods Hole, MA, 02543, USA; Departments of Pathology and Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA.
- Andrew Z Fire
- Departments of Pathology and Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA.
|
23
|
Albaradei S, Magana-Mora A, Thafar M, Uludag M, Bajic VB, Gojobori T, Essack M, Jankovic BR. Splice2Deep: An ensemble of deep convolutional neural networks for improved splice site prediction in genomic DNA. Gene 2020; 763S:100035. [PMID: 32550561 PMCID: PMC7285987 DOI: 10.1016/j.gene.2020.100035] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/30/2020] [Accepted: 05/06/2020] [Indexed: 12/21/2022]
Abstract
Background The accurate identification of exon/intron boundaries is critical for the correct annotation of genes with multiple exons. Donor and acceptor splice sites (SS) demarcate these boundaries. Therefore, deriving accurate computational models to predict SS is useful for the functional annotation of genes and genomes, and for finding alternative SS associated with different diseases. Although various models have been proposed for the in silico prediction of SS, their accuracy must be improved for reliable annotation. Moreover, models are often derived and tested using the same genome, providing no evidence of broad applicability, i.e. to other poorly studied genomes. Results With this in mind, we developed the Splice2Deep models for SS detection. Each model is an ensemble of deep convolutional neural networks. We evaluated the performance of the models based on their ability to detect SS in Homo sapiens, Oryza sativa japonica, Arabidopsis thaliana, Drosophila melanogaster, and Caenorhabditis elegans. Results demonstrate that the models efficiently detect SS in other organisms not considered during the training of the models. Compared to the state-of-the-art tools, Splice2Deep models achieved significantly reduced average error rates of 41.97% and 28.51% for acceptor and donor SS, respectively. Moreover, the Splice2Deep cross-organism validation demonstrates that the models correctly identify conserved genomic elements, enabling annotation of SS in new genomes by choosing the taxonomically closest model. Conclusions The results of our study demonstrated that Splice2Deep achieved both a considerably reduced error rate compared to other state-of-the-art models and the ability to accurately recognize SS in other organisms for which the model was not trained, enabling annotation of poorly studied or newly sequenced genomes.
Splice2Deep models are implemented in Python using the Keras API; the models and the data are available at https://github.com/SomayahAlbaradei/Splice_Deep.git.
Key Words
- AUC, area under curve
- AcSS, acceptor splice site
- Acc, accuracy
- Bioinformatics
- CNN, convolutional neural network
- CONV, convolutional layers
- DL, deep learning
- DNA, deoxyribonucleic acid
- DT, decision trees
- Deep-learning
- DoSS, donor splice site
- FC, fully connected layer
- ML, machine learning
- NB, naive Bayes
- NN, neural network
- POOL, pooling layer
- Prediction
- RF, random forest
- RNA, ribonucleic acid
- ReLU, rectified linear unit layer
- SS, splice site
- SVM, support vector machine
- Sn, sensitivity
- Sp, specificity
- Splice sites
- Splicing
Affiliation(s)
- Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
- Arturo Magana-Mora
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Saudi Aramco, EXPEC-ARC, Drilling Technology Team, Dhahran 31311, Saudi Arabia
- Maha Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Faculty of Computers and Information Systems, Taif University, Saudi Arabia
- Mahmut Uludag
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia; Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Boris R Jankovic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
|