1
Association of genetic defects in the apelin-AGTRL1 system with myocardial infarction risk in Han Chinese. Gene 2020; 766:145143. [PMID: 32911028 DOI: 10.1016/j.gene.2020.145143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2020] [Revised: 08/28/2020] [Accepted: 09/02/2020] [Indexed: 11/21/2022]
Abstract
We aimed to test the hypothesis that the apelin (APLN) gene and the gene for its receptor AGTRL1 (APLNR) contribute to the pathogenesis of myocardial infarction in Han Chinese. This was a hospital-based, case-control association study involving 1067 patients with myocardial infarction and 942 healthy controls. Myocardial infarction was diagnosed by electrocardiogram or anatomopathological examination. Eight polymorphisms in the APLN gene and 5 in the APLNR gene were genotyped using the TaqMan assay. Risk was summarized as odds ratio (OR) and 95% confidence interval (CI). In males, the rs56204867-G allele (adjusted OR, 95% CI, p: 0.21, 0.08-0.55, 0.002) and the rs2235309-T allele (0.60, 0.42-0.84, 0.004) were associated with a significantly reduced risk of myocardial infarction, whereas mutations of rs2235310 (1.41, 1.06-2.52, 0.021) and the rs948847-GG genotype (1.85, 1.23-2.91, 0.007) were associated with an increased risk. In females, the rs56204867-AG and -GG genotypes were significantly associated with 44% and 50% reduced risk (0.56 and 0.50, 0.40-8.04 and 0.29-0.86, 0.007 and 0.036), respectively; for rs2235310, the CC genotype was associated with a 72% increased risk (1.72, 1.09-3.22, 0.016), and the odds ratio for myocardial infarction was 3.47 for the rs9943582-TT genotype (95% CI: 1.53-7.57, p: 0.009). The gender-specific association of the APLN and APLNR genes with myocardial infarction was reinforced by further linkage and haplotype analyses. Finally, nomograms based on the significant polymorphisms performed satisfactorily, with C-indexes over 80% for both genders. Taken together, our findings indicate that the APLN and APLNR genes are potential candidates in the pathogenesis of myocardial infarction in Han Chinese and, importantly, that their contribution is gender-dependent.
2
Casanova EL, Konkel MK. The Developmental Gene Hypothesis for Punctuated Equilibrium: Combined Roles of Developmental Regulatory Genes and Transposable Elements. Bioessays 2020; 42:e1900173. [PMID: 31943266 DOI: 10.1002/bies.201900173] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/23/2019] [Revised: 11/30/2019] [Indexed: 12/13/2022]
Abstract
Theories of the genetics underlying punctuated equilibrium (PE) have been vague to date. Here the developmental gene hypothesis is proposed, which states that: 1) developmental regulatory (DevReg) genes are responsible for the orchestration of metazoan morphogenesis, and their extreme conservation and mutation intolerance generate the equilibrium or stasis present throughout much of the fossil record; and 2) the accumulation of regulatory elements and recombination within these same genes (often derived from transposable elements) drives punctuated bursts of morphological divergence and speciation across metazoa. This two-part hypothesis helps to explain the features that characterize PE, providing a theoretical genetic basis for the once-controversial theory. Also see the video abstract here https://youtu.be/C-fu-ks5yDs.
Affiliation(s)
- Emily L Casanova
- Department of Biomedical Sciences, University of South Carolina School of Medicine at Greenville, 200A Patewood Dr., Greenville, SC, 29615, USA
- Miriam K Konkel
- Department of Genetics, Clemson Center for Human Genetics, Biomedical Data Science and Informatics Program, Clemson University, 105 Collings St., Clemson, SC, 29631, USA
3
Casanova EL, Switala AE, Dandamudi S, Hickman AR, Vandenbrink J, Sharp JL, Feltus FA, Casanova MF. Autism risk genes are evolutionarily ancient and maintain a unique feature landscape that echoes their function. Autism Res 2019; 12:860-869. [PMID: 31025836 PMCID: PMC6613973 DOI: 10.1002/aur.2112] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/01/2018] [Revised: 03/22/2019] [Accepted: 04/06/2019] [Indexed: 11/09/2022]
Abstract
Previous research on autism risk (ASD), developmental regulatory (DevReg), and central nervous system (CNS) genes suggests they tend to be large in size, enriched in nested repeats, and mutation intolerant. The relevance of these genomic features is intriguing yet poorly understood. In this study, we investigated the feature landscape of these gene groups to discover structural themes useful in interpreting their function, developmental patterns, and evolutionary history. ASD, DevReg, CNS, housekeeping, and whole genome control (WGC) groups were compiled using various resources. Multiple gene features of interest were extracted from NCBI/UCSC Bioinformatics. Residual variation intolerance scores, Exome Aggregation Consortium pLI scores, and copy number variation data from Decipher were used to estimate variation intolerance. Gene age and protein-protein interactions (PPI) were estimated using Ensembl and EBI Intact databases, respectively. Compared to WGC: ASD, DevReg, and CNS genes are longer, produce larger proteins, maintain greater numbers/density of conserved noncoding elements and transposable elements, produce more transcript variants, and are comparatively variation intolerant. After controlling for gene size, mutation tolerance, and clinical association, ASD genes still retain many of these same features. In addition, we also found that ASD genes that are extremely mutation intolerant have larger PPI networks. These data support many of the recent findings within the field of autism genetics but also expand our understanding of the evolution of these broad gene groups, their potential regulatory complexity, and the extent to which they interact with the cellular network. Autism Res 2019, 12: 860-869. © 2019 International Society for Autism Research, Wiley Periodicals, Inc. LAY SUMMARY: Autism risk genes are more ancient compared to other genes in the genome. 
As such, they exhibit physical features related to their age, including long gene and protein size and regulatory sequences that help to control gene expression. They share many of these same features with other genes that are expressed in the brain and/or are associated with prenatal development.
Affiliation(s)
- Emily L. Casanova
- Department of Biomedical Sciences, University of South Carolina, South Carolina, USA
- Department of Pediatrics, Greenville Health System, Greenville, USA
- Andrew E. Switala
- Department of Bioengineering, University of Louisville, Louisville, Kentucky, USA
- Srini Dandamudi
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- Allison R. Hickman
- Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina, USA
- Julia L. Sharp
- Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- F. Alex Feltus
- Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina, USA
- Manuel F. Casanova
- Department of Biomedical Sciences, University of South Carolina, South Carolina, USA
- Department of Pediatrics, Greenville Health System, Greenville, USA
4
Akkuratov EE, Walters L, Saha-Mandal A, Khandekar S, Crawford E, Zirbel CL, Leisner S, Prakash A, Fedorova L, Fedorov A. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence. Gene 2014; 548:81-90. [DOI: 10.1016/j.gene.2014.07.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/29/2013] [Revised: 06/26/2014] [Accepted: 07/07/2014] [Indexed: 11/26/2022]
5
Grignolio A, Mishto M, Faria AMC, Garagnani P, Franceschi C, Tieri P. Towards a liquid self: how time, geography, and life experiences reshape the biological identity. Front Immunol 2014; 5:153. [PMID: 24782860 PMCID: PMC3988364 DOI: 10.3389/fimmu.2014.00153] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 12/13/2013] [Accepted: 03/24/2014] [Indexed: 01/08/2023] Open
Abstract
The conceptualization of the immunological self is among the most important theories of modern biology, serving as a theoretical guideline for experimental immunologists seeking to understand how host constituents are ignored by the immune system (IS). A consistent advancement in this field has been represented by the danger/damage theory and its subsequent refinements, which at present represent the most comprehensive conceptualization of the immunological self. Here, we present the new hypothesis of the "liquid self," which integrates and extends the danger/damage theory. The main novelty of the liquid self hypothesis lies in the full integration of the immune response mechanisms into the host body's ecosystems, i.e., in adding the temporal, as well as the geographical/evolutionary and environmental, dimensions, which we suggest calling "immunological biography." Our hypothesis takes into account the important biological changes occurring with time (age) in the IS (including immunosenescence and inflammaging), as well as changes in the organismal context related to nutrition, lifestyle, and geography (populations). We argue that such temporal and geographical dimensions impinge upon, and continuously reshape, the antigenicity of physical entities (molecules, cells, bacteria, viruses), making them switch between "self" and "non-self" states in a dynamical, "liquid" fashion. Particular attention is devoted to oral tolerance and gut microbiota, as well as to a new potential source of unexpected self epitopes produced by proteasome splicing. Finally, our framework allows the setup of a variety of testable predictions, the most straightforward being that the immune responses to defined molecules representing potential antigens will be quantitatively and qualitatively quite different according to the immuno-biographical background of the host.
Affiliation(s)
- Andrea Grignolio
- Interdepartmental Center "Luigi Galvani" for Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Michele Mishto
- Centro Interdipartimentale di Ricerca sul Cancro "G. Prodi", University of Bologna, Bologna, Italy; Institut für Biochemie, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Ana Maria Caetano Faria
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Paolo Garagnani
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- Claudio Franceschi
- Interdepartmental Center "Luigi Galvani" for Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy; Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy; IRCCS of Neurological Science, Bologna, Italy; Institute of Organic Synthesis and Photoreactivity, National Research Council, Bologna, Italy
- Paolo Tieri
- Institute for Applied Mathematics "M. Picone", National Research Council, Rome, Italy
6
Kelemen O, Convertini P, Zhang Z, Wen Y, Shen M, Falaleeva M, Stamm S. Function of alternative splicing. Gene 2013; 514:1-30. [PMID: 22909801 PMCID: PMC5632952 DOI: 10.1016/j.gene.2012.07.083] [Citation(s) in RCA: 504] [Impact Index Per Article: 45.8] [Received: 04/30/2012] [Revised: 07/21/2012] [Accepted: 07/30/2012] [Indexed: 12/15/2022]
Abstract
Almost all polymerase II transcripts undergo alternative pre-mRNA splicing. Here, we review the functions of alternative splicing events that have been experimentally determined. The overall function of alternative splicing is to increase the diversity of mRNAs expressed from the genome. Alternative splicing changes proteins encoded by mRNAs, which has profound functional effects. Experimental analysis of these protein isoforms showed that alternative splicing regulates binding between proteins, between proteins and nucleic acids as well as between proteins and membranes. Alternative splicing regulates the localization of proteins, their enzymatic properties and their interaction with ligands. In most cases, changes caused by individual splicing isoforms are small. However, cells typically coordinate numerous changes in 'splicing programs', which can have strong effects on cell proliferation, cell survival and properties of the nervous system. Due to its widespread usage and molecular versatility, alternative splicing emerges as a central element in gene regulation that interferes with almost every biological function analyzed.
Affiliation(s)
- Olga Kelemen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Paolo Convertini
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Zhaiyi Zhang
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Yuan Wen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Manli Shen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Marina Falaleeva
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Stefan Stamm
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
7
ACE2 gene polymorphism and essential hypertension: an updated meta-analysis involving 11,051 subjects. Mol Biol Rep 2012; 39:6581-9. [PMID: 22297693 DOI: 10.1007/s11033-012-1487-1] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 11/17/2011] [Accepted: 01/24/2012] [Indexed: 12/20/2022]
Abstract
Polymorphisms of the angiotensin-converting enzyme 2 (ACE2) gene have been suggested to be linked to an increased risk of essential hypertension in multiple populations. However, the results are still debatable. To assess the association between the ACE2 G8790A genetic polymorphism and essential hypertension, we conducted a meta-analysis of case-control studies across different ethnicities. PubMed, Embase, CBM, Wanfang and VIP databases were searched, and a total of 11 separate studies in females and nine separate studies in males met the inclusion criteria. Because ACE2 is on the X chromosome, data for each sex were analyzed separately. The selected studies contained 7,251 (4,472 females/2,779 males) hypertensive patients and 3,800 (2,161 females/1,639 males) normotensive controls. A statistically significant association was observed between the G8790A gene polymorphism and essential hypertension risk in the female hypertensive group in the recessive genetic model (AA vs. GG+GA: P = 0.03, OR = 1.15, 95% CI = 1.02-1.30, P(heterogeneity) = 0.40, I(2) = 5%, fixed-effects model). Although no association was shown between the frequency of the A allele and the genetic susceptibility to essential hypertension in all male patients (A Allele: P = 0.38, OR = 1.10, 95% CI = 0.89-1.38, P(heterogeneity) = 0.02, I(2) = 56%, random-effects model), we found a relationship between carriage of the A allele and essential hypertension risk in the Han-Chinese male patient subgroup (A Allele: P = 0.006, OR = 1.21, 95% CI = 1.06-1.38, P(heterogeneity) = 0.10, I(2) = 44%, fixed-effects model). The current meta-analysis provided solid evidence suggesting that the ACE2 gene polymorphism G8790A was probably a genetic risk factor for essential hypertension across different ethnic populations in female subjects and in Han-Chinese male subjects.
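As a reading aid, the odds ratios, confidence intervals, and P values reported in this abstract can be cross-checked against one another. The following is a minimal Python sketch, assuming the intervals are Wald-type (symmetric on the log-odds scale) and the P values are two-sided normal-approximation values. The function name `p_from_or_ci` is hypothetical and is not taken from the paper. Because the published figures are rounded, only approximate agreement should be expected.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value implied by an OR and its 95% CI.

    Assumes a Wald-type interval, i.e. ln(OR) +/- z_crit * SE.
    """
    # Standard error on the log scale, recovered from the CI width.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    # Wald z statistic for the null hypothesis OR = 1.
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values as reported in the abstract above.
p_female_recessive = p_from_or_ci(1.15, 1.02, 1.30)  # reported P = 0.03
p_han_male_allele = p_from_or_ci(1.21, 1.06, 1.38)   # reported P = 0.006

print(f"female recessive model: p ~ {p_female_recessive:.3f}")
print(f"Han-Chinese male A allele: p ~ {p_han_male_allele:.3f}")
```

Under these assumptions, both recomputed values fall in the same range as the reported ones. A large discrepancy in a check of this kind would instead suggest a transcription error in one of the numbers.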
8
Gambichler T, Pantelaki I, Othlinghaus N, Moritz RKC, Stricker I, Skrygan M. Deep intronic point mutations of the KIT gene in a female patient with cutaneous clear cell sarcoma and her family. Cancer Genet 2012; 205:182-5. [PMID: 22559980 DOI: 10.1016/j.cancergen.2012.02.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/29/2011] [Revised: 01/30/2012] [Accepted: 02/03/2012] [Indexed: 12/20/2022]
Abstract
Clear cell sarcoma (CCS) of tendons and aponeuroses is an aggressive neoplasm that is characterized by a pathognomonic translocation, t(12;22)(q13;q12), resulting in an EWSR1-ATF1 chimeric gene. We report for the first time a female patient with CCS exhibiting both EWSR1-ATF1 fusion transcripts and hereditary homozygous point mutations in introns 11 and 16 of the KIT gene. Her parents and two brothers each had heterozygous point mutations in intron 11 or intron 16 of the KIT gene. The functional significance of these germline deep intronic point mutations and their relationship to the pathogenesis of CCS are unclear. Future studies investigating KIT intron mutations in a larger cohort of CCS patients are warranted.
Affiliation(s)
- Thilo Gambichler
- Department of Dermatology, Ruhr-University Bochum, Bochum, Germany.
9
Oliveira P, Sanges R, Huntsman D, Stupka E, Oliveira C. Characterization of the intronic portion of cadherin superfamily members, common cancer orchestrators. Eur J Hum Genet 2012; 20:878-83. [PMID: 22317972 DOI: 10.1038/ejhg.2012.11] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/23/2022] Open
Abstract
Cadherins are cell-cell adhesion proteins essential for the maintenance of tissue architecture and integrity, and their impairment is often associated with human cancer. Knowledge regarding regulatory mechanisms associated with cadherin misexpression in cancer is scarce. Specific features of the intronic structure and intron-based regulatory mechanisms in the cadherin superfamily are unidentified. This study aims to systematically characterize the intronic portion of cadherin superfamily members and to identify intronic regions constituting putative targets/triggers of regulation, using a bioinformatic approach and biological data mining. Our study demonstrates that the cadherin superfamily genes harbour specific characteristics in comparison to all non-cadherin genes, both from the genomic and transcriptional standpoints. Cadherin superfamily genes display a higher average total intron number and significantly longer introns than other genes, and this holds across the entire vertebrate lineage. Moreover, in the human genome, we observed an uncommonly high frequency of MIR (mammalian-wide interspersed repeats) and MaLR (mammalian apparent LTR retrotransposons, a subtype of LTR) regulatory-associated repetitive elements at 5'-located introns, concomitantly with increased de novo intronic transcription. Using this approach, we identified cadherin intronic-specific sites that may constitute novel targets/triggers of cadherin superfamily expression regulation. These findings pinpoint the need to identify mechanisms affecting particularly MIR and MaLR elements located in introns 2 and 3 of human cadherin genes, possibly important in the expression modulation of this superfamily in homeostasis and cancer.
Affiliation(s)
- Patrícia Oliveira
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Rua Dr Roberto Frias, s/n, Porto, Portugal
10
Sakabe NJ, Savic D, Nobrega MA. Transcriptional enhancers in development and disease. Genome Biol 2012; 13:238. [PMID: 22269347 PMCID: PMC3334578 DOI: 10.1186/gb-2012-13-1-238] [Citation(s) in RCA: 102] [Impact Index Per Article: 8.5] [Received: 10/03/2011] [Accepted: 01/13/2012] [Indexed: 01/24/2023] Open
Abstract
Distal transcription enhancers are cis-regulatory elements that promote gene expression, enabling spatiotemporal control of genetic programs such as those required in metazoan developmental processes. Because of their importance, their disruption can lead to disease.
Affiliation(s)
- Noboru Jo Sakabe
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
11
Schanze D, Ekici AB, Pfuhlmann B, Reis A, Stöber G. Evaluation of conserved and ultra-conserved non-genic sequences in chromosome 15q15-linked periodic catatonia. Am J Med Genet B Neuropsychiatr Genet 2012; 159B:77-86. [PMID: 22162401 DOI: 10.1002/ajmg.b.32004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/17/2011] [Accepted: 11/03/2011] [Indexed: 01/14/2023]
Abstract
Conserved and ultra-conserved non-genic sequence elements (CNGs, UCEs) between human and other mammalian genomes seem to constitute a heterogeneous group of functional sequences which likely have important biological function. To determine whether variation in CNGs and UCEs contributes to risk for the schizophrenic subphenotype of periodic catatonia (according to K. Leonhard; OMIM 605419), we evaluated non-coding elements at a critical 7.35 Mb interval on chromosome 15q15 in 8 unrelated cases with periodic catatonia (derived from pedigrees compatible with linkage to chromosome 15q15) and 8 controls, followed by association studies in a cohort of 510 cases and controls. Among 65 CNGs (≥100 bp, 100% identity; human-mouse comparison), 7 CNGs matched criteria for UCE (≥200 bp, 100% identity). A hot spot of 62/65 CNGs (95%) appeared at the MEIS2 locus, which implicates functional importance of associated (ultra-)conserved elements to this early developmental gene, which is present in the human fetal neocortex and associated with metabolic side effects to antipsychotic drugs. Further CNGs were identified at the PLCB2 and DLL4 locus or located intergenic between TYRO3 and MAPKBP1. Automated sequencing revealed genetic variation in 12.3% of CNGs, but frequencies were low (MAF: 0.06-0.4) in cases. Three variants located inside CNGs/UCEs were found in cases only. In a case-control association study we could not confirm a significant association of these three CNG-variants with periodic catatonia. Our results suggest genetic variation in (ultra-)conserved non-genic sequence elements which might alter functional properties. The identified variants are genetically not associated with the phenotype of periodic catatonia.
Affiliation(s)
- Denny Schanze
- Institute of Human Genetics, University of Erlangen-Nuremberg, Erlangen, Germany
12
Sironi M, Guerini FR, Agliardi C, Biasin M, Cagliani R, Fumagalli M, Caputo D, Cassinotti A, Ardizzone S, Zanzottera M, Bolognesi E, Riva S, Kanari Y, Miyazawa M, Clerici M. An evolutionary analysis of RAC2 identifies haplotypes associated with human autoimmune diseases. Mol Biol Evol 2011; 28:3319-29. [PMID: 21680873 DOI: 10.1093/molbev/msr164] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 12/16/2023] Open
Abstract
The human RAC2 gene encodes a small GTP-binding protein with a pivotal role in immune activation and in the induction of peripheral immune tolerance through restimulation-induced cell death (RICD). Different human pathogens target the protein product of RAC2, suggesting that the gene may be subject to natural selection, and that variants in RAC2 may affect immunological phenotypes in humans. We scanned the genomic region encompassing the entire transcription unit for the presence of putative noncoding regulatory elements conserved across mammals. This information was used to select two RAC2 gene regions and analyze their intraspecific genetic diversity. Results suggest that a region covering the 3' untranslated region has been a target of multiallelic balancing selection (or diversifying selection), and three major RAC2 haplogroups occur in human populations. Haplotypes belonging to one of these clades are associated with increased susceptibility to multiple sclerosis (P = 0.022) and earlier onset of disease symptoms (P = 0.025). This same haplogroup is significantly more common in patients with Crohn's disease compared with healthy controls (P = 0.048). These data reinforce recent evidence that susceptibility alleles/haplotypes are shared among multiple autoimmune disorders and support a causal role for RAC2 variants in the pathogenesis of autoimmune diseases. Other genes with a role in RICD have previously been associated with autoimmunity in humans, suggesting that this pathway and RAC2 may represent novel therapeutic targets in autoimmune disorders.
Affiliation(s)
- Manuela Sironi
- Bioinformatics Laboratory, Scientific Institute IRCCS E. Medea, Bosisio Parini (LC), Italy
13
Webber C. Functional enrichment analysis with structural variants: pitfalls and strategies. Cytogenet Genome Res 2011; 135:277-85. [PMID: 21997137 DOI: 10.1159/000331670] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022] Open
Abstract
Interpreting the phenotypic consequences of human structural variation remains challenging. Functional enrichment analysis, which can identify functional enrichments among genes affected by structural variants, is providing significant biological insights into the genotype-phenotype relationship. In this review, we discuss the different approaches and choices in the application of this technique to human structural variation. We consider the importance of choosing the right background distribution for detection, the significance of the gene selection criteria, the effects of tissue-specific gene length biases and discuss sources of functional annotations with a focus on Gene Ontology and mouse phenotypic resources. Throughout this review, we highlight potential sources of significant bias that are of particular concern to the analysis of structural variants, and illustrate the importance of examining the expectations upon which enrichment analysis techniques depend.
Affiliation(s)
- C Webber
- Department of Physiology, Anatomy and Genetics, MRC Functional Genomics Unit, University of Oxford, Oxford, UK.
14
Lee AP, Brenner S, Venkatesh B. Mouse transgenesis identifies conserved functional enhancers and cis-regulatory motif in the vertebrate LIM homeobox gene Lhx2 locus. PLoS One 2011; 6:e20088. [PMID: 21629789 PMCID: PMC3100342 DOI: 10.1371/journal.pone.0020088] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/28/2011] [Accepted: 04/17/2011] [Indexed: 12/03/2022] Open
Abstract
The vertebrate Lhx2 is a member of the LIM homeobox family of transcription factors. It is essential for the normal development of the forebrain, eye, olfactory system and liver, as well as for the differentiation of lymphoid cells. However, despite the highly restricted spatio-temporal expression pattern of Lhx2, nothing is known about its transcriptional regulation. In mammals and chicken, Crb2, Dennd1a and Lhx2 constitute a conserved linkage block, while the intervening Dennd1a is lost in the fugu Lhx2 locus. To identify functional enhancers of Lhx2, we predicted conserved noncoding elements (CNEs) in the human, mouse and fugu Crb2-Lhx2 loci and assayed their function in transgenic mice at E11.5. Four of the eight CNE constructs tested functioned as tissue-specific enhancers in specific regions of the central nervous system and the dorsal root ganglia (DRG), recapitulating partial and overlapping expression patterns of the Lhx2 and Crb2 genes. There was considerable overlap in the expression domains of the CNEs, which suggests that the CNEs are either redundant enhancers or regulate different genes in the locus. Using a large set of CNEs (810 CNEs) associated with transcription factor-encoding genes that are expressed predominantly in the central nervous system, we predicted four over-represented 8-mer motifs that are likely to be associated with expression in the central nervous system. Mutation of one of them in a CNE that drove reporter expression in the neural tube and DRG abolished expression in both domains, indicating that this motif is essential for expression in these domains. The failure of the four functional enhancers to recapitulate the complete expression pattern of Lhx2 at E11.5 indicates that there must be other Lhx2 enhancers that are either located outside the region investigated or divergent in mammals and fishes. Other approaches, such as sequence comparison between multiple mammals, are required to identify and characterize such enhancers.
Affiliation(s)
- Alison P. Lee
- Comparative Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore
- Sydney Brenner
- Comparative Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore
- Byrappa Venkatesh
- Comparative Genomics Laboratory, Institute of Molecular and Cell Biology, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore
15
Rearick D, Prakash A, McSweeny A, Shepard SS, Fedorova L, Fedorov A. Critical association of ncRNA with introns. Nucleic Acids Res 2010; 39:2357-66. [PMID: 21071396 PMCID: PMC3064772 DOI: 10.1093/nar/gkq1080] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.1] [Indexed: 12/04/2022] Open
Abstract
It has been widely acknowledged that non-coding RNAs are master-regulators of genomic functions. However, the significance of the presence of ncRNA within introns has not received proper attention. ncRNA within introns are commonly produced through the post-splicing process and are specific signals of gene transcription events, impacting many other genes and modulating their expression. This study, along with the following discussion, details the association of thousands of ncRNAs—snoRNA, miRNA, siRNA, piRNA and long ncRNA—within human introns. We propose that such an association between human introns and ncRNAs has a pronounced synergistic effect with important implications for fine-tuning gene expression patterns across the entire genome.
Affiliation(s)
- David Rearick
- University of Toledo Health Science Campus, Toledo, OH 43614, USA
|
16
|
Anatskaya OV, Vinogradov AE. Somatic polyploidy promotes cell function under stress and energy depletion: evidence from tissue-specific mammal transcriptome. Funct Integr Genomics 2010; 10:433-46. [PMID: 20625914 DOI: 10.1007/s10142-010-0180-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 03/30/2010] [Revised: 06/12/2010] [Accepted: 06/16/2010] [Indexed: 02/08/2023]
Abstract
Polyploid cells show great among-species and among-tissues diversity and relation to developmental mode, suggesting their importance in adaptive evolution and developmental programming. At the same time, excessive polyploidization is a hallmark of functional impairment, aging, growth disorders, and numerous pathologies including cancer and cardiac diseases. To shed light on this paradox and to find out how polyploidy contributes to organ functions, we review here the ploidy-associated shifts in activity of narrowly expressed (tissue specific) genes in human and mouse heart and liver, which have the reciprocal pattern of polyploidization. For this purpose, we use the modular biology approach and genome-scale cross-species comparison. It is evident from this review that heart and liver show similar traits in response to polyploidization. In both organs, polyploidy protects vitality (mainly due to the activation of sirtuin-mediated pathways), triggers the reserve adenosine-5'-triphosphate (ATP) production, and sustains tissue-specific functions by switching them to energy saving mode. In heart, the strongest effects consisted in the concerted up-regulation of contractile proteins and substitution of energy intensive proteins with energy economic ones. As a striking example, the energy intensive alpha myosin heavy chain (providing fast contraction) decreased its expression by a factor of 10, allowing a 270-fold increase of expression of beta myosin heavy chain (providing slow contraction), which has approximately threefold lower ATP-hydrolyzing activity. The liver showed the enhancement of immunity, reactive oxygen species and xenobiotic detoxication, and numerous metabolic adaptations to long-term energy depletion. Thus, somatic polyploidy may be an ingenious evolutionary instrument for fast adaptation to stress and new environments allowing trade-offs between high functional demand, stress, and energy depletion.
Affiliation(s)
- Olga V Anatskaya
- Institute of Cytology, Russian Academy of Sciences, Group of Bioinformatics and Functional Genomics, St Petersburg, Russia.
|
17
|
Shen-Orr SS, Pilpel Y, Hunter CP. Composition and regulation of maternal and zygotic transcriptomes reflects species-specific reproductive mode. Genome Biol 2010; 11:R58. [PMID: 20515465 PMCID: PMC2911106 DOI: 10.1186/gb-2010-11-6-r58] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 12/24/2009] [Revised: 04/23/2010] [Accepted: 06/01/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Early embryos contain mRNA transcripts expressed from two distinct origins; those expressed from the mother's genome and deposited in the oocyte (maternal) and those expressed from the embryo's genome after fertilization (zygotic). The transition from maternal to zygotic control occurs at different times in different animals according to the extent and form of maternal contributions, which likely reflect evolutionary and ecological forces. Maternally deposited transcripts rely on post-transcriptional regulatory mechanisms for precise spatial and temporal expression in the embryo, whereas zygotic transcripts can use both transcriptional and post-transcriptional regulatory mechanisms. The differences in maternal contributions between animals may be associated with gene regulatory changes detectable by the size and complexity of the associated regulatory regions. RESULTS We have used genomic data to identify and compare maternal and/or zygotic expressed genes from six different animals and find evidence for selection acting to shape gene regulatory architecture in thousands of genes. We find that mammalian maternal genes are enriched for complex regulatory regions, suggesting an increase in expression specificity, while egg-laying animals are enriched for maternal genes that lack transcriptional specificity. CONCLUSIONS We propose that this lack of specificity for maternal expression in egg-laying animals indicates that a large fraction of maternal genes are expressed non-functionally, providing only supplemental nutritional content to the developing embryo. These results provide clear predictive criteria for analysis of additional genomes.
Affiliation(s)
- Shai S Shen-Orr
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Ave, Cambridge, MA 02138, USA
|
18
|
Rao YS, Wang ZF, Chai XW, Wu GZ, Zhou M, Nie QH, Zhang XQ. Selection for the compactness of highly expressed genes in Gallus gallus. Biol Direct 2010; 5:35. [PMID: 20465857 PMCID: PMC2883972 DOI: 10.1186/1745-6150-5-35] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 12/26/2009] [Accepted: 05/14/2010] [Indexed: 11/10/2022]
Abstract
Background Coding sequence (CDS) length, gene size, and intron length vary within a genome and among genomes. Previous studies in diverse organisms, including human, D. melanogaster, C. elegans, S. cerevisiae, and Arabidopsis thaliana, indicated that expression level is negatively related to gene size, CDS length, and intron length. Different models, such as the selection for economy model, the genomic design model, and mutational bias hypotheses, have been proposed to explain this observation. The debate over which model best explains the observation has not been settled. The chicken (Gallus gallus) is an important model organism that bridges the evolutionary gap between mammals and other vertebrates. Because chicken, like D. melanogaster, has a larger effective population size, selection on the chicken genome is expected to be more effective in increasing protein synthesis efficiency. Therefore, in this study the chicken was used as a model organism to elucidate the interaction between gene features and expression pattern under selection pressure. Results Based on different technologies, we gathered expression data for nuclear protein-coding, single-splicing genes from the Gallus gallus genome and compared them with gene parameters. We found that gene size, CDS length, first intron length, average intron length, and total intron length are significantly negatively correlated with expression level and expression breadth. Tissue specificity is positively correlated with the first intron length but negatively correlated with the average intron length, and not correlated with the CDS length or the number of protein domains. Comparison analyses showed that ubiquitously expressed genes and narrowly expressed genes with similar expression levels do not differ in compactness. Our data provided evidence that the genomic design model cannot, at least in part, explain our observations.
We grouped all somatic-tissue-specific genes (n = 1105) and compared the first intron length and the average intron length between highly expressed genes (top 5% expressed genes) and weakly expressed genes (bottom 5% expressed genes). We found that the first intron length and the average intron length in highly expressed genes are not different from those in weakly expressed genes. We also made a comparison between ubiquitously expressed genes and narrowly expressed somatic genes with similar expression levels. Our data demonstrated that ubiquitously expressed genes are less compact than narrowly expressed genes with similar expression levels. These observations cannot be explained by mutational bias hypotheses either. We also found that the significant trend between genes' compactness and expression level was not affected by local mutational biases. We argue that the selection for economy model is the most likely one to explain the relationship between gene expression and gene characteristics in the chicken genome. Conclusion Natural selection appears to favor the compactness of highly expressed genes in the chicken genome. This observation can be explained by the selection for economy model. Reviewers This article was reviewed by Dr. Gavin Huttley, Dr. Liran Carmel (nominated by Dr. Eugene V. Koonin) and Dr. Araxi Urrutia (nominated by Dr. Laurence D. Hurst).
Affiliation(s)
- You S Rao
- Department of Biological Technology, Jiangxi Educational Institute, Nanchang, Jiangxi, China
|
19
|
Vinogradov AE. Human transcriptome nexuses: basic-eukaryotic and metazoan. Genomics 2010; 95:345-54. [PMID: 20298777 DOI: 10.1016/j.ygeno.2010.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/15/2009] [Revised: 03/01/2010] [Accepted: 03/08/2010] [Indexed: 01/10/2023]
Abstract
Using a new approach, I analysed human transcriptome coexpression network and revealed two large-scale nexuses. Besides gene coexpression, each nexus is characterized by a combination of gene evolutionary origin, function and among-tissues expression breadth. The first nexus contains mostly genes of pre-metazoan origin, which are widely expressed and have cell-centred functions. The second nexus is enriched in genes of metazoan origin, which are expressed more narrowly and have organism-centred functions. The revealed nexuses are supported by asymmetry in distribution of transcription factor targets between them. Within the metazoan nexus, there is a subnexus that is more pronounced in the nervous tissues and is enriched in gene regulatory complexity. It mostly contains genes related to nervous system, cell communication and multicellular organism processes and development. The revealed nexuses indicate a dichotomy in the transcriptional regulation and can provide a framework for further functional genomics studies.
|
20
|
Liu S. Increasing alternative promoter repertories is positively associated with differential expression and disease susceptibility. PLoS One 2010; 5:e9482. [PMID: 20208995 PMCID: PMC2830428 DOI: 10.1371/journal.pone.0009482] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/17/2009] [Accepted: 01/07/2010] [Indexed: 12/03/2022]
Abstract
Background Alternative promoter (AP) usage has been shown to enable diversified transcriptional regulation of individual genes in a context-specific way (e.g., pathway, cell lineage, tissue type, and developmental stage). Aberrant use of APs has been directly linked to the mechanisms of certain human diseases. However, whether there is a general link between a gene's AP repertoire and its expression diversity is currently unknown. The general relation between a gene's AP repertoire and its disease susceptibility also remains largely unexplored. Methodology/Principal Findings Based on the differential expression ratio inferred from all human microarray data in NCBI GEO and the list of disease genes curated in public repositories, we systematically analyzed the general relation of AP repertoire with expression diversity and disease susceptibility. We found that genes with APs are more likely to be differentially expressed and/or disease associated than those with a single promoter (SP), and genes with more APs are more likely to be differentially expressed and disease susceptible than those with fewer APs. Further analysis showed that genes with an increased number of APs tend to have increased length in all aspects of gene structure including the 3′ UTR, to be associated with increased duplicability, and to have increased connectivity in the protein-protein interaction network. Conclusions Our genome-wide analysis provided evidence that an increasing alternative promoter repertoire is positively associated with differential expression and disease susceptibility.
Affiliation(s)
- Song Liu
- Department of Biostatistics, Roswell Park Cancer Institute, Buffalo, New York, United States of America.
|
21
|
Abstract
Proteins encoded by highly expressed genes evolve more slowly. This correlation is thought to arise owing to purifying selection against toxicity of misfolded proteins (that should be more crucial for highly expressed genes). It is now widely accepted that this individual (by-gene) effect is a dominant cause in protein evolution. Here, I show that in mammals, the evolutionary rate of a protein is much more strongly related to the evolutionary rate of coexpressed proteins (and proteins of the same biological pathway) than to the expression level of its encoding gene. The complexity of gene regulation (estimated by the numbers of transcription factor targets and regulatory microRNA targets in the encoding gene) is another important cause, which is much stronger than gene expression level. Proteins encoded by complexly regulated genes evolve more slowly. The intronic length and the ratio of intronic to coding sequence lengths also correlate negatively with protein evolutionary rate (which contradicts the expectation from the negative link between expression level and evolutionary rate). One more important factor, which is much stronger than gene expression level, is evolutionary age. More recent proteins evolve faster, and expression level of an encoding gene becomes quite a minor cause in the evolution of mammal proteins of metazoan origin. These data suggest that, in contrast to a widespread opinion, systemic factors dominate mammal protein evolution.
|
22
|
Patterns of DNA-sequence divergence between Drosophila miranda and D. pseudoobscura. J Mol Evol 2009; 69:601-11. [PMID: 19859648 DOI: 10.1007/s00239-009-9298-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/03/2009] [Accepted: 10/07/2009] [Indexed: 12/22/2022]
Abstract
Contrary to the classical view, a large amount of non-coding DNA seems to be selectively constrained in Drosophila and other species. Here, using Drosophila miranda BAC sequences and the Drosophila pseudoobscura genome sequence, we aligned coding and non-coding sequences between D. pseudoobscura and D. miranda, and investigated their patterns of evolution. We found two patterns that have previously been observed in comparisons between Drosophila melanogaster and its relatives. First, there is a negative correlation between intron divergence and intron length, suggesting that longer non-coding sequences may contain more regulatory elements than shorter sequences. Our other main finding is a negative correlation between the rate of non-synonymous substitutions (d(N)) and codon usage bias (F(op)), showing that fast-evolving genes have a lower codon usage bias, consistent with strong positive selection interfering with weak selection for codon usage.
|
23
|
Navratilova P, Becker TS. Genomic regulatory blocks in vertebrates and implications in human disease. BRIEFINGS IN FUNCTIONAL GENOMICS AND PROTEOMICS 2009; 8:333-42. [DOI: 10.1093/bfgp/elp019] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 12/21/2022]
|
24
|
Abstract
The human GSTM gene family is composed of five gene members, GSTM1-5, and plays an important role in detoxification. In this study, the human GSTM5 gene was found to have a long inverted repeat (LIR) in intron 5. The LIR is able to form a stem-loop structure with a 31-bp stem and a 9-nt loop. The intronic LIR was also identified in other primates but not in non-primates. The human and chimpanzee LIRs had undergone compensating mutations that make the stem loop more stable, suggesting a functional role for the LIR. Sequence homology showed that the LIR was actually a part of inverted exons acquired by the intron. Results of phylogenetic analysis indicate that the inverted exons were derived from exon 5 of GSTM4 and exon 5 of GSTM1. The intronic LIR and inverted GSTM exons can probably introduce complexity in the expression of GSTM gene family.
|
25
|
Mello BP, Abrantes EF, Torres CH, Machado-Lima A, Fonseca RDS, Carraro DM, Brentani RR, Reis LFL, Brentani H. No-match ORESTES explored as tumor markers. Nucleic Acids Res 2009; 37:2607-17. [PMID: 19270067 PMCID: PMC2677862 DOI: 10.1093/nar/gkp074] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
Sequencing technologies and new bioinformatics tools have led to the complete sequencing of various genomes. However, information regarding the human transcriptome and its annotation is yet to be completed. The Human Cancer Genome Project, using ORESTES (open reading frame EST sequences) methodology, contributed to this objective by generating data from about 1.2 million expressed sequence tags. Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a set of these ESTs could represent new transcripts, we constructed a cDNA microarray. This platform was hybridized against 12 different normal or tumor tissues. We identified 3421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. The total number of differentially expressed sequences was 1007. Also, 28% of analyzed sequences could represent noncoding RNAs. Our data reinforce the view that the human genome is pervasively transcribed and point to molecular marker candidates for different cancers. To reinforce our data, we confirmed, by real-time PCR, the differential expression of three out of eight potential tumor markers in prostate tissues. Lists of the 1007 differentially expressed sequences and the 291 potentially noncoding tumor markers are provided.
Affiliation(s)
- Barbara P Mello
- Hospital A. C. Camargo, Rua Prof. Antônio Prudente 211, São Paulo, SP, Brazil
|
26
|
Cagliani R, Fumagalli M, Riva S, Pozzoli U, Comi GP, Menozzi G, Bresolin N, Sironi M. The signature of long-standing balancing selection at the human defensin beta-1 promoter. Genome Biol 2008; 9:R143. [PMID: 18817538 PMCID: PMC2592704 DOI: 10.1186/gb-2008-9-9-r143] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Received: 03/28/2008] [Revised: 05/21/2008] [Accepted: 09/25/2008] [Indexed: 12/26/2022]
Abstract
BACKGROUND Defensins, small endogenous peptides with antimicrobial activity, are pivotal components of the innate immune response. A large cluster of defensin genes is located on human chromosome 8p; among them, the beta defensin 1 (DEFB1) promoter has been extensively studied since the discovery that specific polymorphisms and haplotypes associate with asthma and atopy, susceptibility to severe sepsis, as well as HIV and Candida infection predisposition. RESULTS Here, we characterize the sequence variation and haplotype structure of the DEFB1 promoter region in six human populations. In all of them, we observed high levels of nucleotide variation, an excess of intermediate-frequency alleles, reduced population differentiation and a genealogy with common haplotypes separated by deep branches. Indeed, a significant departure from the expectation of evolutionary neutrality was observed in all populations and the possibility that this is due to demographic history alone was ruled out. Also, we verified that the selection signature is restricted to the promoter region and not due to a linked balanced polymorphism. A phylogeny-based estimation indicated that the two major haplotype clades separated around 4.5 million years ago, approximately the time when the human and chimpanzee lineages split. CONCLUSION Altogether, these features represent strong molecular signatures of long-term balancing selection, a process that is thought to be extremely rare outside major histocompatibility complex genes. Our data indicate that the DEFB1 promoter region carries functional variants and support previous hypotheses whereby alleles predisposing to atopic disorders are widespread in modern societies because they conferred resistance to pathogens in ancient settings.
Affiliation(s)
- Rachele Cagliani
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Matteo Fumagalli
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Bioengineering Department, Politecnico di Milano, Pzza L. da Vinci, 32, 20133 Milan, Italy
- Stefania Riva
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Uberto Pozzoli
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Giacomo P Comi
- Dino Ferrari Centre, Department of Neurological Sciences, University of Milan, IRCCS Ospedale Maggiore Policlinico, Mangiagalli and Regina Elena Foundation, Via F. Sforza 35, 20100 Milan, Italy
- Giorgia Menozzi
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Nereo Bresolin
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
- Dino Ferrari Centre, Department of Neurological Sciences, University of Milan, IRCCS Ospedale Maggiore Policlinico, Mangiagalli and Regina Elena Foundation, Via F. Sforza 35, 20100 Milan, Italy
- Manuela Sironi
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy
|
27
|
|
28
|
Tsirigos A, Rigoutsos I. Human and mouse introns are linked to the same processes and functions through each genome's most frequent non-conserved motifs. Nucleic Acids Res 2008; 36:3484-93. [PMID: 18450818 PMCID: PMC2425492 DOI: 10.1093/nar/gkn155] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 01/03/2023]
Abstract
We identified the most frequent, variable-length DNA sequence motifs in the human and mouse genomes and sub-selected those with multiple recurrences in the intergenic and intronic regions and at least one additional exonic instance in the corresponding genome. We discovered that these motifs have virtually no overlap with intronic sequences that are conserved between human and mouse, and thus are genome-specific. Moreover, we found that these motifs span a substantial fraction of previously uncharacterized human and mouse intronic space. Surprisingly, we found that these genome-specific motifs are over-represented in the introns of genes belonging to the same biological processes and molecular functions in both the human and mouse genomes even though the underlying sequences are not conserved between the two genomes. In fact, the processes and functions that are linked to these genome-specific sequence-motifs are distinct from the processes and functions which are associated with intronic regions that are conserved between human and mouse. The findings show that intronic regions from different genomes are linked to the same processes and functions in the absence of underlying sequence conservation. We highlight the ramifications of this observation with a concrete example that involves the microsatellite instability gene MLH1.
Affiliation(s)
- Aristotelis Tsirigos
- Bioinformatics and Pattern Discovery Group, IBM Thomas J. Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA
|
29
|
Li L, Zhu Q, He X, Sinha S, Halfon MS. Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses. Genome Biol 2008; 8:R101. [PMID: 17550599 PMCID: PMC2394749 DOI: 10.1186/gb-2007-8-6-r101] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Received: 04/11/2007] [Revised: 05/23/2007] [Accepted: 06/05/2007] [Indexed: 02/01/2023]
Abstract
Analysis of 280 experimentally verified cis-regulatory modules from Drosophila reveals features both common to all and unique to distinct subclasses of modules. Background Transcriptional cis-regulatory modules (for example, enhancers) play a critical role in regulating gene expression. While many individual regulatory elements have been characterized, they have never been analyzed as a class. Results We have performed the first such large-scale study of cis-regulatory modules in order to determine whether they have common properties that might aid in their identification and contribute to our understanding of the mechanisms by which they function. A total of 280 individual, experimentally verified cis-regulatory modules from Drosophila were analyzed for a range of sequence-level and functional properties. We report here that regulatory modules do indeed share common properties, among them an elevated GC content, an increased level of interspecific sequence conservation, and a tendency to be transcribed into RNA. However, we find that dense clustering of transcription factor binding sites, especially homotypic clustering, which is commonly believed to be a general characteristic of regulatory modules, is rather a feature that belongs chiefly to a specific subclass. This has important implications for current computational approaches, many of which are biased toward this subset. We explore two new strategies to assess binding site clustering and gauge their performance with respect to their ability to detect all 280 modules and various functionally coherent subsets. Conclusion Our findings demonstrate that cis-regulatory modules share common features that help to define them as a class and that may lead to new insights into mechanisms of gene regulation. However, these properties alone may not be sufficient to reliably distinguish regulatory from non-regulatory sequences.
We also demonstrate that there are distinct subclasses of cis-regulatory modules that are more amenable to in silico detection than others and that these differences must be taken into account when attempting genome-wide regulatory element discovery.
Affiliation(s)
- Long Li
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Qianqian Zhu
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Xin He
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Marc S Halfon
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14214, USA
- New York State Center of Excellence in Bioinformatics and the Life Sciences, Buffalo, NY 14203, USA
- Department of Molecular and Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
30
|
Zhou H, Lin K. Excess of microRNAs in large and very 5' biased introns. Biochem Biophys Res Commun 2008; 368:709-15. [PMID: 18249189 DOI: 10.1016/j.bbrc.2008.01.117] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 01/13/2008] [Accepted: 01/27/2008] [Indexed: 11/29/2022]
Abstract
Many microRNAs (miRNAs) and small nucleolar RNAs (snoRNAs) are located within the introns of genes in eukaryotes. In contrast to intronic snoRNAs, intronic miRNAs are processed from unspliced intronic regions before the catalysis of splicing in vertebrates. By analyzing the distribution patterns of the length and position of the introns hosting these two groups of small RNA genes, we observed that both human and mouse intronic miRNAs tended to be present in large introns, and that miRNA host introns have a more 5'-biased position distribution compared with all other introns in the two genomes. These observations indicate that the negative selection of functional constraints might affect the intron size in both genomes. Interestingly, the very 5'-biased positions of miRNA host introns may be necessary for the transcription and regulation of intronic miRNAs to utilize the regulatory signals within the 5'-UTRs of their host genes.
Affiliation(s)
- Hongjun Zhou
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and College of Life Sciences, Beijing Normal University, No. 19, Xinjiekouwai Street, Beijing 100875, China
|
31
|
Woolfe A, Elgar G. Organization of conserved elements near key developmental regulators in vertebrate genomes. ADVANCES IN GENETICS 2008; 61:307-38. [PMID: 18282512 DOI: 10.1016/s0065-2660(07)00012-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023]
Abstract
Sequence conservation has traditionally been used as a means to target functional regions of complex genomes. In addition to its use in identifying coding regions of genes, the recent availability of whole genome data for a number of vertebrates has permitted high-resolution analyses of the noncoding "dark matter" of the genome. This has resulted in the identification of a large number of highly conserved sequence elements that appear to be preserved in all bony vertebrates. Further positional analysis of these conserved noncoding elements (CNEs) in the genome demonstrates that they cluster around genes involved in developmental regulation. This chapter describes the identification and characterization of these elements, with particular reference to their composition and organization.
Affiliation(s)
- Adam Woolfe
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
|
32
|
Simons C, Makunin IV, Pheasant M, Mattick JS. Maintenance of transposon-free regions throughout vertebrate evolution. BMC Genomics 2007; 8:470. [PMID: 18093339 PMCID: PMC2241635 DOI: 10.1186/1471-2164-8-470] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 06/23/2007] [Accepted: 12/20/2007] [Indexed: 01/23/2023]
Abstract
Background We recently reported the existence of large numbers of regions up to 80 kb long that lack transposon insertions in the human, mouse and opossum genomes. These regions are significantly associated with loci involved in developmental and transcriptional regulation. Results Here we report that transposon-free regions (TFRs) are prominent genomic features of amphibian and fish lineages, and that many have been maintained throughout vertebrate evolution, although most transposon-derived sequences have entered these lineages after their divergence. The zebrafish genome contains 470 TFRs over 10 kb and a further 3,951 TFRs over 5 kb, which is comparable to the number identified in mammals. Two thirds of zebrafish TFRs over 10 kb are orthologous to TFRs in at least one mammal, and many have orthologous TFRs in all three mammalian genomes as well as in the genome of Xenopus tropicalis. This indicates that the mechanism responsible for the maintenance of TFRs has been active at these loci for over 450 million years. However, the majority of TFR bases cannot be aligned between distantly related species, demonstrating that TFRs are not the by-product of strong primary sequence conservation. Syntenically conserved TFRs are also more enriched for regulatory genes compared to lineage-specific TFRs. Conclusion We suggest that TFRs contain extended regulatory sequences that contribute to the precise expression of genes central to early vertebrate development, and can be used as predictors of important regulatory regions.
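The length thresholds quoted in this abstract (regions over 10 kb or over 5 kb with no transposon insertion) can be illustrated as a gap search over annotated transposon intervals. The following is a minimal sketch, not the authors' pipeline: the function name `transposon_free_regions`, the half-open 0-based coordinates, and the input format are all assumptions made for illustration.

```python
def transposon_free_regions(chrom_length, transposons, min_length=10_000):
    """Return half-open (start, end) intervals of at least `min_length` bases
    that contain no annotated transposon base.

    `transposons` is an iterable of half-open (start, end) intervals on one
    chromosome; the intervals may overlap and need not be sorted.
    """
    regions = []
    cursor = 0  # first position not yet covered by any transposon seen so far
    for start, end in sorted(transposons):
        # The stretch between the cursor and the next transposon is a gap.
        if start - cursor >= min_length:
            regions.append((cursor, start))
        # Advance past this transposon (max handles nested or overlapping ones).
        cursor = max(cursor, end)
    # Trailing gap between the last transposon and the chromosome end.
    if chrom_length - cursor >= min_length:
        regions.append((cursor, chrom_length))
    return regions
```

Calling the function with `min_length=5_000` instead of the default mirrors the second, more permissive threshold mentioned in the abstract.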
Affiliation(s)
- Cas Simons
- Australian Research Council Special Research Center for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia QLD 4072, Australia.
|
33
|
Niu W, Qi Y, Hou S, Zhou W, Qiu C. Correlation of angiotensin-converting enzyme 2 gene polymorphisms with stage 2 hypertension in Han Chinese. Transl Res 2007; 150:374-80. [PMID: 18022600 DOI: 10.1016/j.trsl.2007.06.002] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/08/2007] [Revised: 05/23/2007] [Accepted: 06/01/2007] [Indexed: 11/17/2022]
Abstract
Experimental evidence indicates that angiotensin-converting enzyme 2 (ACE2), a homologue of human ACE, might negatively regulate the activated renin-angiotensin-aldosterone system (RAAS) and might function as a protective regulator in the pathogenesis of hypertension. However, association studies regarding ACE2 are sparse in the literature, with negative results in the majority of cases. Here we conducted an association study between 2 intronic polymorphisms (A1075G and G8790A) of the ACE2 gene and stage 2 hypertension in Han Chinese. We genotyped the 2 polymorphisms in 1494 subjects (808 stage 2 hypertensives and 686 normotensives) recruited from the Fangshan district (Beijing). Data were analyzed using the χ² test, 1-way analysis of variance, and logistic regression where appropriate. The frequency of A1075G allele distribution in males differed significantly (P < 0.0001), whereas the genotype and allele distributions of G8790A polymorphism were similar, between stage 2 hypertensives and normotensives. Systolic blood pressure (SBP) differed significantly in females across both genotypes: SBP was significantly lower in subjects with the 1075AA and 8790GG genotypes, higher in the 1075GG (+13.65 mm Hg versus AA) and 8790AA (+13.36 mm Hg versus GG) genotypes, and intermediate in the 1075AG (+5.76 mm Hg versus AA) and 8790GA (+5.65 mm Hg versus GG) genotypes. Our data suggest that the polymorphism (A1075G) might be a risk factor, or at least a marker, for stage 2 hypertension in males and that the 2 studied polymorphisms might be the indicators of systolic hypertension in females.
Affiliation(s)
- Wenquan Niu
- National Laboratory of Medical Molecular Biology, the Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences/Peking Union Medical College (CAMS/PUMC), Beijing, China.
|
34
|
Perez DS, Hoage TR, Pritchett JR, Ducharme-Smith AL, Halling ML, Ganapathiraju SC, Streng PS, Smith DI. Long, abundantly expressed non-coding transcripts are altered in cancer. Hum Mol Genet 2007; 17:642-55. [PMID: 18006640 DOI: 10.1093/hmg/ddm336] [Citation(s) in RCA: 166] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Abstract
Recent studies with tiling arrays have revealed more genomic transcription than previously anticipated. Whole new groups of non-coding transcripts (NCTs) have been detected. Some of these NCTs, including miRNAs, can regulate gene expression. To date, most known NCTs studied have been relatively short, but several important regulatory NCTs, including XIST, MALAT-1, BC1 and BC200, are considerably larger in length and represent a novel class of long, non-coding RNA species. Whole-genome tiling arrays were utilized to identify novel long NCTs across the entire human genome. Our results have identified a new group of long (>400 nt), abundantly expressed NCTs and have found that a subset of these are also highly evolutionarily conserved. In this report, we have begun to characterize 15 long, conserved NCTs. Quantitative real-time RT-PCR was used to analyze their expression in different normal human tissue and also in breast and ovarian cancers. We found altered expression of many of these NCTs in both cancer types. In addition, several of these NCTs have consistent mutations when sequences of normal samples were compared with a panel of cancer-derived cell lines. One NCT was found to be consistently mutated in a panel of endometrial cancers compared with matched normal blood. These NCTs were among the most abundantly expressed transcripts detected. There are probably many long, conserved NCTs, albeit with lower levels of expression. Although the function of these NCTs is currently unknown, our study indicates that they may play an important function in both normal cells and in cancer development.
Affiliation(s)
- Damon S Perez
- Division of Experimental Pathology, Department of Laboratory Medicine and Pathology, Mayo Clinic and Foundation, 200 First Street, S.W., Rochester, MN 55905, USA.
|
35
|
Li SW, Feng L, Niu DK. Selection for the miniaturization of highly expressed genes. Biochem Biophys Res Commun 2007; 360:586-92. [PMID: 17610841 DOI: 10.1016/j.bbrc.2007.06.085] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2007] [Accepted: 06/18/2007] [Indexed: 11/29/2022]
Abstract
Most widely expressed genes are also highly expressed. Based on high or wide expression, different models were proposed to explain the small sizes of highly/widely expressed genes. We found that housekeeping genes are not more compact than narrowly expressed genes with similar expression levels, but compactness and expression level are correlated in housekeeping genes (except that highly expressed Arabidopsis HK genes have longer intron length). Meanwhile, we found evidence that genes with high functional/regulatory complexity do not have longer introns and longer proteins. The genome design hypothesis is thus not supported. Furthermore, we found that housekeeping genes are not more compact than the narrowly expressed somatic genes with similar average expression levels. Because housekeeping genes are expected to have much higher germline expression levels than narrowly expressed somatic genes, transcription-associated deletion bias is not supported. Selection of the compactness of highly expressed genes for economy is supported.
Affiliation(s)
- Shu-Wei Li
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
|
36
|
Voelker RB, Berglund JA. A comprehensive computational characterization of conserved mammalian intronic sequences reveals conserved motifs associated with constitutive and alternative splicing. Genome Res 2007; 17:1023-33. [PMID: 17525134 PMCID: PMC1899113 DOI: 10.1101/gr.6017807] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2006] [Accepted: 04/12/2007] [Indexed: 11/24/2022]
Abstract
Orthologous mammalian introns contain many highly conserved sequences. Of these sequences, many are likely to represent protein binding sites that are under strong positive selection. In order to identify conserved protein binding sites that are important for splicing, we analyzed the composition of intronic sequences that are conserved between human and six eutherian mammals. We focused on all completely conserved sequences of seven or more nucleotides located in the regions adjacent to splice-junctions. We found that these conserved intronic sequences are enriched in specific motifs, and that many of these motifs are statistically associated with either alternative or constitutive splicing. In validation of our methods, we identified several motifs that are known to play important roles in alternative splicing. In addition, we identified several novel motifs containing GCT that are abundant and are associated with alternative splicing. Furthermore, we demonstrate that, for some of these motifs, conservation is a strong indicator of potential functionality since conserved instances are associated with alternative splicing while nonconserved instances are not. A surprising outcome of this analysis was the identification of a large number of AT-rich motifs that are strongly associated with constitutive splicing. Many of these appear to be novel and may represent conserved intronic splicing enhancers (ISEs). Together these data show that conservation provides important insights into the identification and possible roles of cis-acting intronic sequences important for alternative and constitutive splicing.
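The selection criterion described in this abstract (completely conserved sequences of seven or more nucleotides) can be sketched as a scan over the columns of a multiple alignment. This is an illustrative sketch only, not the authors' method: the function name `conserved_runs`, the use of `-` as the gap character, and the assumption that the input sequences are already aligned to equal length are all hypothetical choices.

```python
def conserved_runs(aligned_seqs, min_length=7):
    """Return (start, sequence) for each maximal run of alignment columns
    that is gap-free and identical in every sequence, keeping only runs of
    at least `min_length` columns.

    `aligned_seqs` is a non-empty list of equal-length strings in which '-'
    marks an alignment gap.
    """
    reference = aligned_seqs[0]
    length = len(reference)
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    run_start = None  # column where the current conserved run began
    # Iterate one column past the end so a run touching the end is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and reference[i] != "-"
            and all(seq[i] == reference[i] for seq in aligned_seqs)
        )
        if conserved:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                runs.append((run_start, reference[run_start:i]))
            run_start = None
    return runs
```

Restricting the scan to the regions adjacent to splice junctions, as the abstract describes, would amount to slicing the alignment before calling the function.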
Affiliation(s)
- Rodger B. Voelker
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon 97403, USA
- J. Andrew Berglund
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon 97403, USA
|
37
|
Mehler MF, Mattick JS. Noncoding RNAs and RNA Editing in Brain Development, Functional Diversification, and Neurological Disease. Physiol Rev 2007; 87:799-823. [PMID: 17615389 DOI: 10.1152/physrev.00036.2006] [Citation(s) in RCA: 238] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
Abstract
The progressive maturation and functional plasticity of the nervous system in health and disease involve a dynamic interplay between the transcriptome and the environment. There is a growing awareness that the previously unexplored molecular and functional interface mediating these complex gene-environmental interactions, particularly in brain, may encompass a sophisticated RNA regulatory network involving the twin processes of RNA editing and multifaceted actions of numerous subclasses of non-protein-coding RNAs. The mature nervous system encompasses a wide range of cell types and interconnections. Long-term changes in the strength of synaptic connections are thought to underlie memory retrieval, formation, stabilization, and effector functions. The evolving nervous system involves numerous developmental transitions, such as neurulation, neural tube patterning, neural stem cell expansion and maintenance, lineage elaboration, differentiation, axonal path finding, and synaptogenesis. Although the molecular bases for these processes are largely unknown, RNA-based epigenetic mechanisms appear to be essential for orchestrating these precise and versatile biological phenomena and in defining the etiology of a spectrum of neurological diseases. The concerted modulation of RNA editing and the selective expression of non-protein-coding RNAs during seminal as well as continuous state transitions may comprise the plastic molecular code needed to couple the intrinsic malleability of neural network connections to evolving environmental influences to establish diverse forms of short- and long-term memory, context-specific behavioral responses, and sophisticated cognitive capacities.
Affiliation(s)
- Mark F Mehler
- Institute for Brain Disorders and Neural Regeneration, Department of Neurology, Einstein Cancer Center, Albert Einstein College of Medicine, Bronx, New York 10461, USA.
|
38
|
Freudenberg J, Fu YH, Ptácek LJ. Enrichment of HapMap recombination hotspot predictions around human nervous system genes: evidence for positive selection? Eur J Hum Genet 2007; 15:1071-8. [PMID: 17568387 DOI: 10.1038/sj.ejhg.5201876] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
Channels and developmental genes belong to the molecular key players in the human central nervous system (CNS). Mutations in these genes often cause monogenic neurological disease and interspecies comparisons had shown reduced divergence. On the other hand, accelerated evolution of genes with roles in neurotransmission and development had indicated widespread positive selection in hominids. In the present study, we hypothesized that recombination hotspots could be enriched at genes with particularly important role in the CNS, because at those loci beneficial mutations may occur on a highly constrained background and consequently increased recombination could promote their fixation. To test this hypothesis, we retrieved CNS genes based on keyword search, expression data and expert knowledge. Consistent with our hypothesis, we find an enrichment of hotspot predictions around genes that are retrieved by all three strategies. Moreover, when comparing human genes based on their Gene Ontology annotations, we find hotspot predictions preferentially located around channels and neurodevelopmental genes. Taken together with the distinct sequence evolution that was reported by comparative genomic studies, this finding indicates continued positive selection at many CNS gene loci. In support of this interpretation, we also find an enrichment of recombination hotspot predictions around conserved noncoding regions that were reported to display a signature of accelerated evolution in the human lineage. Widespread positive selection acting on CNS gene loci could relate to the high prevalence of human nervous system disorders with genetically complex inheritance, potentially under an ancestral susceptibility allele model.
Affiliation(s)
- Jan Freudenberg
- Department of Neurology, Institute of Human Genetics, University of California San Francisco, San Francisco, CA, USA.
|
39
|
Abstract
SUMMARY
It is usually thought that the development of complex organisms is controlled by protein regulatory factors and morphogenetic signals exchanged between cells and differentiating tissues during ontogeny. However, it is now evident that the majority of all animal genomes is transcribed, apparently in a developmentally regulated manner, suggesting that these genomes largely encode RNA machines and that there may be a vast hidden layer of RNA regulatory transactions in the background. I propose that the epigenetic trajectories of differentiation and development are primarily programmed by feed-forward RNA regulatory networks and that most of the information required for multicellular development is embedded in these networks, with cell–cell signalling required to provide important positional information and to correct stochastic errors in the endogenous RNA-directed program.
Affiliation(s)
- John S Mattick
- ARC Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia QLD 4072, Australia.
|
40
|
Yeo GW, Van Nostrand EL, Liang TY. Discovery and analysis of evolutionarily conserved intronic splicing regulatory elements. PLoS Genet 2007; 3:e85. [PMID: 17530930 PMCID: PMC1877881 DOI: 10.1371/journal.pgen.0030085] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2006] [Accepted: 04/13/2007] [Indexed: 02/05/2023] Open
Abstract
Knowledge of the functional cis-regulatory elements that regulate constitutive and alternative pre-mRNA splicing is fundamental for biology and medicine. Here we undertook a genome-wide comparative genomics approach using available mammalian genomes to identify conserved intronic splicing regulatory elements (ISREs). Our approach yielded 314 ISREs, and insertions of ~70 ISREs between competing splice sites demonstrated that 84% of ISREs altered 5′ and 94% altered 3′ splice site choice in human cells. Consistent with our experiments, comparisons of ISREs to known splicing regulatory elements revealed that 40%–45% of ISREs might have dual roles as exonic splicing silencers. Supporting a role for ISREs in alternative splicing, we found that 30%–50% of ISREs were enriched near alternatively spliced (AS) exons, and included almost all known binding sites of tissue-specific alternative splicing factors. Further, we observed that genes harboring ISRE-proximal exons have biases for tissue expression and molecular functions that are ISRE-specific. Finally, we discovered that for Nova1, neuronal PTB, hnRNP C, and FOX1, the most frequently occurring ISRE proximal to an alternative conserved exon in the splicing factor strongly resembled its own known RNA binding site, suggesting a novel application of ISRE density and the propensity for splicing factors to auto-regulate to associate RNA binding sites to splicing factors. Our results demonstrate that ISREs are crucial building blocks in understanding general and tissue-specific AS regulation and the biological pathways and functions regulated by these AS events.
During RNA splicing, sequences (introns) in a pre-mRNA are excised and discarded, and the remaining sequences (exons) are joined to form the mature RNA. Splicing is regulated not only by the binding of the basic splicing machinery to splice sites located at the exon–intron boundaries, but also by the combined effects of various other splicing factors that bind to a multitude of sequence elements located both in the exons as well as the flanking introns. Instances of alternative splicing, where usage of splice site(s) is incomplete or different between tissues, cell types, or lineages, can be created by the interaction of sequence elements and tissue, cell type, and stage-specific splicing factors. To better understand constitutive and alternative pre-mRNA splicing, the authors describe a comparative genomics approach, using available mammalian genomes, to systematically identify splicing regulatory elements located in the introns proximal to exons. A quarter of the elements were tested experimentally, and most of them altered splicing in human cells. The authors also showed that the intronic elements are close to tissue-specific alternative exons and are more likely to be located in specific positions in the introns, suggestive of potential regulatory function. These elements are also frequently found in tissue-specific genes, suggesting a coupling between expression and alternative splicing of these genes. Finally, the authors propose a strategy using the elements to identify the binding sites of several splicing factors.
Affiliation(s)
- Gene W Yeo
- Crick-Jacobs Center for Theoretical and Computational Biology, Salk Institute, La Jolla, California, United States of America.
|
41
|
Freudenberg J, Fu YH, Ptácek LJ. Bioinformatic analysis of human CNS-expressed ion channels as candidates for episodic nervous system disorders. Neurogenetics 2007; 8:159-68. [PMID: 17333079 DOI: 10.1007/s10048-007-0082-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2006] [Accepted: 01/29/2007] [Indexed: 10/23/2022]
Abstract
As monogenic forms of episodic nervous system disorders are often caused by ion channel mutations, we looked for features of human central nervous system (CNS) expressed ion channels that further our understanding of those phenotypes. To this end, we compared human ion channels with other CNS-expressed genes, which we categorized according to the existence of transmembrane domains. When looking at the phylogenetic distribution of these genes, we observed an increased percentage of ion channels that exist in vertebrate genomes while missing in invertebrate genomes. Because we hypothesized that this pattern may relate to a more specific expression, we searched for characteristics of ion channels that indicate a tighter expression regulation. We found that ion channels have longer intron and protein sequences, features typical of genes with more specific expression. In addition, ion channels have increased human-rodent conservation around their transcription start site, as indicated by a higher fraction of conserved noncoding regions. This points to a high relevance of mutations that regulate ion channel expression. When we finally asked whether vertebrate-specific diversification is also displayed by non-ion channel genes with important roles in the CNS, we found a similar phylogenetic distribution. This concordant phylogenetic pattern suggests that vertebrate-specific adaptations may account for a large part of the shared genetic basis of episodic CNS disorders, including monogenic and genetically complex disease manifestations. Consequently, this phylogenetic pattern may contribute to the prioritization of candidate genes in human genetic studies of episodic CNS disorders.
Affiliation(s)
- Jan Freudenberg
- Laboratories of Neurogenetics, Department of Neurology, Institute of Human Genetics, University of California San Francisco, San Francisco, CA 94158-2922, USA.
|
42
|
Nakaya HI, Amaral PP, Louro R, Lopes A, Fachel AA, Moreira YB, El-Jundi TA, da Silva AM, Reis EM, Verjovski-Almeida S. Genome mapping and expression analyses of human intronic noncoding RNAs reveal tissue-specific patterns and enrichment in genes related to regulation of transcription. Genome Biol 2007; 8:R43. [PMID: 17386095 PMCID: PMC1868932 DOI: 10.1186/gb-2007-8-3-r43] [Citation(s) in RCA: 155] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2006] [Revised: 01/17/2007] [Accepted: 03/26/2007] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND RNAs transcribed from intronic regions of genes are involved in a number of processes related to post-transcriptional control of gene expression. However, the complement of human genes in which introns are transcribed, and the number of intronic transcriptional units and their tissue expression patterns are not known. RESULTS A survey of mRNA and EST public databases revealed more than 55,000 totally intronic noncoding (TIN) RNAs transcribed from the introns of 74% of all unique RefSeq genes. Guided by this information, we designed an oligoarray platform containing sense and antisense probes for each of 7,135 randomly selected TIN transcripts plus the corresponding protein-coding genes. We identified exonic and intronic tissue-specific expression signatures for human liver, prostate and kidney. The most highly expressed antisense TIN RNAs were transcribed from introns of protein-coding genes significantly enriched (p = 0.002 to 0.022) in the 'Regulation of transcription' Gene Ontology category. RNA polymerase II inhibition resulted in increased expression of a fraction of intronic RNAs in cell cultures, suggesting that other RNA polymerases may be involved in their biosynthesis. Members of a subset of intronic and protein-coding signatures transcribed from the same genomic loci have correlated expression patterns, suggesting that intronic RNAs regulate the abundance or the pattern of exon usage in protein-coding messages. CONCLUSION We have identified diverse intronic RNA expression patterns, pointing to distinct regulatory roles. This gene-oriented approach, using a combined intron-exon oligoarray, should permit further comparative analysis of intronic transcription under various physiological and pathological conditions, thus advancing current knowledge about the biological functions of these noncoding RNAs.
Affiliation(s)
- Helder I Nakaya
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Paulo P Amaral
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Rodrigo Louro
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- André Lopes
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Angela A Fachel
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Yuri B Moreira
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Tarik A El-Jundi
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Aline M da Silva
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Eduardo M Reis
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
- Sergio Verjovski-Almeida
- Departamento de Bioquimica, Instituto de Quimica, Universidade de São Paulo, 05508-900 São Paulo, SP, Brazil
|
43
|
Xing Y, Lee C. Relating alternative splicing to proteome complexity and genome evolution. Adv Exp Med Biol 2007; 623:36-49. [PMID: 18380339 DOI: 10.1007/978-0-387-77374-2_3] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
Prior to genomics, studies of alternative splicing primarily focused on the function and mechanism of alternative splicing in individual genes and exons. This has changed dramatically since the late 1990s. High-throughput genomics technologies, such as EST sequencing and microarrays designed to detect changes in splicing, led to genome-wide discoveries and quantification of alternative splicing in a wide range of species from human to Arabidopsis. Consensus estimates of AS frequency in the human genome grew from less than 5% in mid-1990s to as high as 60-74% now. The rapid growth in sequence and microarray data for alternative splicing has made it possible to look into the global impact of alternative splicing on protein function and evolution of genomes. In this chapter, we review recent research on alternative splicing's impact on proteomic complexity and its role in genome evolution.
Affiliation(s)
- Yi Xing
- Department of Internal Medicine, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, USA
|
44
|
Taft RJ, Pheasant M, Mattick JS. The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays 2007; 29:288-99. [PMID: 17295292 DOI: 10.1002/bies.20544] [Citation(s) in RCA: 403] [Impact Index Per Article: 23.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
There are two intriguing paradoxes in molecular biology--the inconsistent relationship between organismal complexity and (1) cellular DNA content and (2) the number of protein-coding genes--referred to as the C-value and G-value paradoxes, respectively. The C-value paradox may be largely explained by varying ploidy. The G-value paradox is more problematic, as the extent of protein coding sequence remains relatively static over a wide range of developmental complexity. We show by analysis of sequenced genomes that the relative amount of non-protein-coding sequence increases consistently with complexity. We also show that the distribution of introns in complex organisms is non-random. Genes composed of large amounts of intronic sequence are significantly overrepresented amongst genes that are highly expressed in the nervous system, and amongst genes downregulated in embryonic stem cells and cancers. We suggest that the informational paradox in complex organisms may be explained by the expansion of cis-acting regulatory elements and genes specifying trans-acting non-protein-coding RNAs.
Affiliation(s)
- Ryan J Taft
- ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Australia
|
45
|
Compensatory relationship between splice sites and exonic splicing signals depending on the length of vertebrate introns. BMC Genomics 2006; 7:311. [PMID: 17156453 PMCID: PMC1713244 DOI: 10.1186/1471-2164-7-311] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2006] [Accepted: 12/08/2006] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND The signals that determine the specificity and efficiency of splicing are multiple and complex, and are not fully understood. Among other factors, the relative contributions of different mechanisms appear to depend on intron size inasmuch as long introns might hinder the activity of the spliceosome through interference with the proper positioning of the intron-exon junctions. Indeed, it has been shown that the information content of splice sites positively correlates with intron length in the nematode, Drosophila, and fungi. We explored the connections between the length of vertebrate introns, the strength of splice sites, exonic splicing signals, and evolution of flanking exons. RESULTS A compensatory relationship is shown to exist between different types of signals, namely, the splice sites and the exonic splicing enhancers (ESEs). In the range of relatively short introns (approximately, < 1.5 kilobases in length), the enhancement of the splicing signals for longer introns was manifest in the increased concentration of ESEs. In contrast, for longer introns, this effect was not detectable, and instead, an increase in the strength of the donor and acceptor splice sites was observed. Conceivably, accumulation of A-rich ESE motifs beyond a certain limit is incompatible with functional constraints operating at the level of protein sequence evolution, which leads to compensation in the form of evolution of the splice sites themselves toward greater strength. In addition, however, a correlation between sequence conservation in the exon ends and intron length, particularly, in synonymous positions, was observed throughout the entire length range of introns. Thus, splicing signals other than the currently defined ESEs, i.e., potential new classes of ESEs, might exist in exon sequences, particularly, those that flank long introns. 
CONCLUSION Several weak but statistically significant correlations were observed between vertebrate intron length, splice site strength, and potential exonic splicing signals. Taken together, these findings attest to a compensatory relationship between splice sites and exonic splicing signals, depending on intron length.
|
46
|
Ponting CP, Lunter G. Signatures of adaptive evolution within human non-coding sequence. Hum Mol Genet 2006; 15 Spec No 2:R170-5. [PMID: 16987880 DOI: 10.1093/hmg/ddl182] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Indexed: 01/13/2023] Open
Abstract
The human genome is often portrayed as consisting of three sequence types, each distinguished by their mode of evolution. Purifying selection is estimated to act on 2.5-5.0% of the genome, whereas virtually all remaining sequence is considered to have evolved neutrally and to be devoid of functionality. The third mode of evolution, positive selection of advantageous changes, is considered rare. Such instances have been inferred only for a handful of sites, and these lie almost exclusively within protein-coding genes. Nevertheless, the majority of positively selected sequence is expected to lie within the wealth of functional 'dark matter' present outside of the coding sequence. Here, we review the evolutionary evidence for the majority of human-conserved DNA lying outside of the protein-coding sequence. We argue that within this non-coding fraction lies at least 1 Mb of functional sequence that has accumulated many beneficial nucleotide replacements. Illuminating the functions of this adaptive dark matter will lead to a better understanding of the sequence changes that have shaped the innovative biology of our species.
Affiliation(s)
- Chris P Ponting
- MRC Functional Genetics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK.
|
47
|
Abstract
Human tissue-specific genes were reported to be longer than housekeeping genes (in both coding and intronic parts). Competing neutralist and adaptationist models were proposed to explain this observation. Here I show that in the human genome the longest genes are those with an intermediate expression pattern. From the standpoint of information theory, the regulation of such genes should be the most complex. In the genome-wide context, they are found here to have a higher informational load on all available levels: from participation in protein interaction networks, pathways and modules reflected in Gene Ontology categories, through transcription factor regulatory sets and protein functional domains, to amino acid tuples (words) in encoded proteins and nucleotide tuples in introns and promoter regions. Thus, the intermediately expressed genes have a higher functional and regulatory complexity that is reflected in their greater length (which is consistent with the 'genome design' model). The dichotomy of housekeeping versus tissue-specific entities is more pronounced on the modular level than on the molecular level. There are far fewer intermediate-specific modules (modules overrepresented in the intermediately expressed genes) than housekeeping or tissue-specific modules (normalized to gene number). The dichotomy of housekeeping versus tissue-specific genes and modules in multicellular organisms is probably caused by the burden of regulatory complexity acting on the intermediately expressed genes.
|
48
|
Anatskaya OV, Vinogradov AE. Genome multiplication as adaptation to tissue survival: evidence from gene expression in mammalian heart and liver. Genomics 2006; 89:70-80. [PMID: 17029690 DOI: 10.1016/j.ygeno.2006.08.014] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Received: 05/30/2006] [Revised: 08/11/2006] [Accepted: 08/28/2006] [Indexed: 12/13/2022]
Abstract
To elucidate the functional significance of genome multiplication in somatic tissues, we performed a large-scale analysis of ploidy-associated changes in expression of non-tissue-specific (i.e., broadly expressed) genes in the heart and liver of human and mouse (6585 homologous genes were analyzed). These species have inverse patterns of polyploidization in cardiomyocytes and hepatocytes. The between-species comparison of two pairs of homologous tissues with crisscross contrast in ploidy levels allows the removal of the effects of species and tissue specificity on the profile of gene activity. The different tests performed from the standpoint of modular biology revealed a consistent picture of ploidy-associated alteration in a wide range of functional gene groups. The major effects consisted of hypoxia-inducible factor-triggered changes in main cellular processes and signaling pathways, activation of defense against DNA lesions, acceleration of protein turnover and transcription, and the impairment of apoptosis, the immune response, and cytoskeleton maintenance. We also found a severe decline in aerobic respiration and stimulation of sugar and fatty acid metabolism. These metabolic rearrangements create a special type of metabolism that can be considered intermediate between aerobic and anaerobic. The metabolic and physiological changes revealed (reflected in the alteration of gene expression) help explain the unique ability of polyploid tissues to combine proliferation and differentiation, which are separated in diploid tissues. We argue that genome multiplication promotes cell survival and tissue regeneration under stressful conditions.
Affiliation(s)
- Olga V Anatskaya
- Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, St. Petersburg 194064, Russia
|
49
|
Sun H, Skogerbø G, Chen R. Conserved distances between vertebrate highly conserved elements. Hum Mol Genet 2006; 15:2911-22. [PMID: 16923797 DOI: 10.1093/hmg/ddl232] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 01/30/2023] Open
Abstract
High numbers of sequence elements with very high (>95%) sequence conservation between the human and other vertebrate genomes have been reported and ascribed putative cis-regulatory functions. We have investigated the structural relationships between such elements in mammalian genomes and find that not only their sequences, but also the distances between them, are significantly (P < 2.2 × 10^-16) more conserved than corresponding distances between orthologous protein-coding genes or between exons within these genes. Regions of largely conserved distance between consecutive highly conserved elements (HCEs) generally overlap previously identified HCE clusters, but may be far longer (up to 20 Mb) and possibly cover close to 25% of the human genome sequence. Similar conservation of distance is found between bird (chicken) and mammalian genomes and is also discernible in comparisons between fish and mammals. The data suggest either that a substantial number of essential (functionally active) elements with lower sequence conservation occupy the space between the HCEs or that distance itself is an important factor in transcriptional regulation or chromatin modelling.
Affiliation(s)
- Hong Sun
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing, P.R. China
|
50
|
Marques AT, Antunes A, Fernandes PA, Ramos MJ. Comparative evolutionary genomics of the HADH2 gene encoding Abeta-binding alcohol dehydrogenase/17beta-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10). BMC Genomics 2006; 7:202. [PMID: 16899120 PMCID: PMC1559703 DOI: 10.1186/1471-2164-7-202] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 04/02/2006] [Accepted: 08/09/2006] [Indexed: 11/17/2022] Open
Abstract
Background The Aβ-binding alcohol dehydrogenase/17β-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10) is an enzyme involved in pivotal metabolic processes and in the mitochondrial dysfunction seen in Alzheimer's disease. Here we use comparative genomic analyses to study the evolution of the HADH2 gene encoding ABAD/HSD10 across several eukaryotic species. Results Both vertebrate and nematode HADH2 genes showed a six-exon/five-intron organization, while those of the insects had a reduced and varied number of exons (two to three). Eutherian mammal HADH2 genes revealed some highly conserved noncoding regions, which may indicate the presence of functional elements, namely in the upstream region about 1 kb from the transcription start site and in the first part of intron 1. These regions were also conserved between Tetraodon and Fugu fishes. We identified an alternative splicing event conserved between human and dog, which produces a nine amino acid deletion that removes the strand βF. This strand is one of the seven strands that compose the core β-sheet of the Rossmann fold dinucleotide-binding motif characteristic of the short chain dehydrogenase/reductase (SDR) family members. However, the fact that the substrate binding cleft residues are retained and the existence of a shared variant between human and dog suggest that it might be functional. Molecular adaptation analyses across eutherian mammal orthologues revealed the existence of sites under positive selection, some of which are localized in the substrate-binding cleft and in the insertion 1 region on loop D (an important region for Aβ binding to the enzyme). Interestingly, a higher than expected number of nonsynonymous substitutions was observed between human/chimpanzee and orangutan, with six out of the seven amino acid replacements being under molecular adaptation (including three in loop D and one in the substrate binding loop).
Conclusion Our study revealed that HADH2 genes maintained a reasonably conserved organization across a large evolutionary distance. The conserved noncoding regions identified among mammals and between pufferfishes, the evidence of an alternative splicing variant conserved between human and dog, and the detection of positive selection across eutherian mammals may be of importance for further research on ABAD/HSD10 function and its implication in Alzheimer's disease.
Affiliation(s)
- Alexandra T Marques
- REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal
- Agostinho Antunes
- REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal
- Pedro A Fernandes
- REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal
- Maria J Ramos
- REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687, 4169-007 Porto, Portugal
|