1
Leypold NA, Speicher MR. Evolutionary conservation in noncoding genomic regions. Trends Genet 2021; 37:903-918. [PMID: 34238591 DOI: 10.1016/j.tig.2021.06.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/02/2021] [Revised: 05/25/2021] [Accepted: 06/07/2021] [Indexed: 12/28/2022]
Abstract
Humans may share more genomic commonalities with other species than previously thought. According to current estimates, ~5% of the human genome is functionally constrained, which is a much larger fraction than the ~1.5% occupied by annotated protein-coding genes. Hence, ~3.5% of the human genome comprises likely functional conserved noncoding elements (CNEs) preserved among organisms, whose common ancestors existed throughout hundreds of millions of years of evolution. As whole-genome sequencing emerges as a standard procedure in genetic analyses, interpretation of variations in CNEs, including the elucidation of mechanistic and functional roles, becomes a necessity. Here, we discuss the phenomenon of noncoding conservation via four dimensions (sequence, regulatory conservation, spatiotemporal expression, and structure) and the potential significance of CNEs in phenotype variation and disease.
Affiliation(s)
- Nicole A Leypold
- Institute of Human Genetics, Diagnostic and Research Center for Molecular Biomedicine, Medical University of Graz, 8010 Graz, Austria.
- Michael R Speicher
- Institute of Human Genetics, Diagnostic and Research Center for Molecular Biomedicine, Medical University of Graz, 8010 Graz, Austria; BioTechMed-Graz, Graz, Austria.
2
Functional and structural basis of extreme conservation in vertebrate 5' untranslated regions. Nat Genet 2021; 53:729-741. [PMID: 33821006 PMCID: PMC8825242 DOI: 10.1038/s41588-021-00830-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/10/2020] [Accepted: 02/26/2021] [Indexed: 01/07/2023]
Abstract
The lack of knowledge about extreme conservation in genomes remains a major gap in our understanding of the evolution of gene regulation. Here, we reveal an unexpected role of extremely conserved 5' untranslated regions (UTRs) in noncanonical translational regulation that is linked to the emergence of essential developmental features in vertebrate species. Endogenous deletion of conserved elements within these 5' UTRs decreased gene expression, and extremely conserved 5' UTRs possess cis-regulatory elements that promote cell-type-specific regulation of translation. We further developed in-cell mutate-and-map (icM2), a new methodology that maps RNA structure inside cells. Using icM2, we determined that an extremely conserved 5' UTR encodes multiple alternative structures and that each single nucleotide within the conserved element maintains the balance of alternative structures important to control the dynamic range of protein expression. These results explain how extreme sequence conservation can lead to RNA-level biological functions encoded in the untranslated regions of vertebrate genomes.
3
Seridi L, Ryu T, Ravasi T. Dynamic epigenetic control of highly conserved noncoding elements. PLoS One 2014; 9:e109326. [PMID: 25289637 PMCID: PMC4188601 DOI: 10.1371/journal.pone.0109326] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/25/2014] [Accepted: 09/11/2014] [Indexed: 11/19/2022]
Abstract
Background Many noncoding genomic loci have remained constant over long evolutionary periods, suggesting that they are exposed to strong selective pressures. The molecular functions of these elements have been partially elucidated, but the fundamental reason for their extreme conservation is still unknown. Results To gain new insights into the extreme selection of highly conserved noncoding elements (HCNEs), we used a systematic analysis of multi-omic data to study the epigenetic regulation of such elements during the development of Drosophila melanogaster. At the sequence level, HCNEs are GC-rich and have a characteristic oligomeric composition. They have higher levels of stable nucleosome occupancy than their flanking regions, and lower levels of mononucleosomes and H3.3, suggesting that these regions reside in compact chromatin. Furthermore, these regions showed remarkable modulations in histone modification and the expression levels of adjacent genes during development. Although HCNEs are primarily initiated late in replication, about 10% were related to early replication origins. Finally, HCNEs showed strong enrichment within lamina-associated domains. Conclusion HCNEs have distinct and protective sequence properties, undergo dynamic epigenetic regulation, and appear to be associated with the structural components of the chromatin, replication origins, and nuclear matrix. These observations indicate that such elements are likely to have essential cellular functions, and offer insights into their epigenetic properties.
Affiliation(s)
- Loqmane Seridi
- Division of Biological and Environmental Sciences & Engineering, Division of Applied Mathematics and Computer Sciences, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Taewoo Ryu
- Division of Biological and Environmental Sciences & Engineering, Division of Applied Mathematics and Computer Sciences, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- * E-mail: (T. Ryu); (T. Ravasi)
- Timothy Ravasi
- Division of Biological and Environmental Sciences & Engineering, Division of Applied Mathematics and Computer Sciences, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Department of Medicine, Division of Genetics, University of California San Diego, La Jolla, California, United States of America
- * E-mail: (T. Ryu); (T. Ravasi)
4
McCole RB, Fonseka CY, Koren A, Wu CT. Abnormal dosage of ultraconserved elements is highly disfavored in healthy cells but not cancer cells. PLoS Genet 2014; 10:e1004646. [PMID: 25340765 PMCID: PMC4207606 DOI: 10.1371/journal.pgen.1004646] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 02/10/2014] [Accepted: 08/04/2014] [Indexed: 12/17/2022]
Abstract
Ultraconserved elements (UCEs) are strongly depleted from segmental duplications and copy number variations (CNVs) in the human genome, suggesting that deletion or duplication of a UCE can be deleterious to the mammalian cell. Here we address the process by which CNVs become depleted of UCEs. We begin by showing that depletion for UCEs characterizes the most recent large-scale human CNV datasets and then find that even newly formed de novo CNVs, which have passed through meiosis at most once, are significantly depleted for UCEs. In striking contrast, CNVs arising specifically in cancer cells are, as a rule, not depleted for UCEs and can even become significantly enriched. This observation raises the possibility that CNVs that arise somatically and are relatively newly formed are less likely to have established a CNV profile that is depleted for UCEs. Alternatively, lack of depletion for UCEs from cancer CNVs may reflect the diseased state. In support of this latter explanation, somatic CNVs that are not associated with disease are depleted for UCEs. Finally, we show that it is possible to observe the CNVs of induced pluripotent stem (iPS) cells become depleted of UCEs over time, suggesting that depletion may be established through selection against UCE-disrupting CNVs without the requirement for meiotic divisions.
Affiliation(s)
- Ruth B. McCole
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Chamith Y. Fonseka
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Biological and Biomedical Sciences PhD program, Harvard Medical School, Boston, Massachusetts, United States of America
- Amnon Koren
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- C.-ting Wu
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
5
Classification of selectively constrained DNA elements using feature vectors and rule-based classifiers. Genomics 2014; 104:79-86. [PMID: 25058025 DOI: 10.1016/j.ygeno.2014.07.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 06/09/2014] [Accepted: 07/15/2014] [Indexed: 12/29/2022]
Abstract
Little work has been done on analyzing the composition of conserved non-coding elements (CNEs), which are identified by comparisons of two or more genomes and are found in all metazoan genomes. Here we present an analysis of CNEs with a methodology that takes into account word occurrence at various length scales, in the form of a feature-vector representation, together with rule-based classifiers. We implement our approach on both protein-coding exons and CNEs, originating from human, insect (Drosophila melanogaster) and worm (Caenorhabditis elegans) genomes, that are either identified in the present study or obtained from the literature. Alignment-free feature-vector representation of sequences combined with rule-based classification methods leads to successful classification of the different CNE classes. Biologically meaningful results are derived by comparison with the genomic signatures approach, and classification rates for a variety of functional elements of the genomes along with surrogates are presented.
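As a rough illustration of an alignment-free feature vector built from word occurrences at several length scales, the following Python sketch counts k-mers and normalizes them per word length. The function name `kmer_feature_vector`, the choice of k = 1 to 3, and the per-length normalization are assumptions made for illustration; the abstract does not specify the word lengths, weighting, or classifier used in the study.

```python
from collections import Counter
from itertools import product


def kmer_feature_vector(seq, k_values=(1, 2, 3)):
    """Return relative k-mer frequencies for each k, concatenated in a fixed order.

    Hypothetical helper: for each word length k, the 4**k possible ACGT words
    are listed in lexicographic order and their frequencies are normalized so
    that each length scale sums to 1 (or is all zeros for very short input).
    """
    seq = seq.upper()
    vector = []
    for k in k_values:
        words = ["".join(p) for p in product("ACGT", repeat=k)]
        # Count every overlapping window of length k.
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        # Windows containing ambiguous bases (e.g. N) are ignored in the total.
        total = sum(counts[w] for w in words)
        vector.extend(counts[w] / total if total else 0.0 for w in words)
    return vector
```

A vector of this kind has a fixed length (84 entries for k = 1 to 3), so sequences of different lengths can be passed to any rule-based or other classifier without alignment.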
6
Polychronopoulos D, Sellis D, Almirantis Y. Conserved noncoding elements follow power-law-like distributions in several genomes as a result of genome dynamics. PLoS One 2014; 9:e95437. [PMID: 24787386 PMCID: PMC4008492 DOI: 10.1371/journal.pone.0095437] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/20/2013] [Accepted: 03/26/2014] [Indexed: 12/31/2022]
Abstract
Conserved, ultraconserved and other classes of constrained elements (collectively referred to as CNEs here), identified by comparative genomics in a wide variety of genomes, are non-randomly distributed across chromosomes. These elements are defined using various degrees of conservation between organisms and several thresholds of minimal length. Here we investigate the chromosomal distribution of CNEs by studying the statistical properties of distances between consecutive CNEs. We find widespread power-law-like distributions, i.e. linearity in double logarithmic scale, in the inter-CNE distances, a feature which is connected with fractality and self-similarity. Given that CNEs are often found to be spatially associated with genes, especially with those that regulate developmental processes, we verify by appropriate gene masking that a power-law-like pattern emerges irrespective of whether elements found close to or inside genes are excluded. An evolutionary model is put forward for the understanding of these findings that includes segmental or whole genome duplication events and eliminations (loss) of most of the duplicated CNEs. Simulations reproduce the main features of the observed size distributions. Power-law-like patterns in the genomic distributions of CNEs are in accordance with current knowledge about their evolutionary history in several genomes.
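The idea of "linearity in double logarithmic scale" for inter-CNE distances can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the use of the cumulative count N(≥ d), and the ordinary least-squares fit are all assumptions, and the abstract does not state how the distributions were binned or fitted.

```python
import math
from bisect import bisect_left


def inter_element_distances(elements):
    """Gaps between consecutive (start, end) elements on one chromosome."""
    ordered = sorted(elements)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap > 0:  # overlapping or abutting elements contribute no gap
            gaps.append(gap)
    return gaps


def loglog_slope(distances):
    """Least-squares slope of log10 N(>= d) against log10 d.

    A roughly constant slope over a wide range of d is what a
    power-law-like distribution looks like on a log-log plot.
    """
    ordered = sorted(distances)
    n = len(ordered)
    points = []
    for d in sorted(set(ordered)):
        count_at_least_d = n - bisect_left(ordered, d)
        points.append((math.log10(d), math.log10(count_at_least_d)))
    if len(points) < 2:
        raise ValueError("need at least two distinct distances")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx
```

A single fitted slope says nothing about goodness of fit; inspecting the log-log plot over the full range of distances remains necessary before calling a pattern power-law-like.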
Affiliation(s)
- Dimitris Polychronopoulos
- Institute of Biosciences and Applications, National Center for Scientific Research “Demokritos”, Athens, Greece
- Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
- Diamantis Sellis
- Department of Biology, Stanford University, Stanford, California, United States of America
- Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research “Demokritos”, Athens, Greece
- * E-mail:
7
Dai Y, Li S, Dong X, Sun H, Li C, Liu Z, Ying B, Ding G, Li Y. The de novo sequence origin of two long non-coding genes from an inter-genic region. BMC Genomics 2013; 14 Suppl 8:S6. [PMID: 24564579 PMCID: PMC4042238 DOI: 10.1186/1471-2164-14-s8-s6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
BACKGROUND The gene Polymorphic derived intron-containing, known as Pldi, is a long non-coding RNA (lncRNA) first discovered in mouse. Although parts of its sequence were reported to be conserved in rat and human, it is expressed only in mouse testis, from a mouse-specific transcription start site. The consensus sequence of Pldi is also part of an antisense transcript, AK158810, expressed in a wide range of mouse tissues. RESULT We focused on the sequence origin of Pldi and AK158810. We demonstrated that their sequence originated from an inter-genic region and is present only in mammals. Transposition events and chromosome rearrangements were involved in the evolution of the ancestral sequence. Moreover, we discovered that high conservation in part of this region was correlated with chromosome rearrangements, CpG demethylation and transcription factor binding motifs. These results demonstrated that multiple factors contributed to the sequence origin of Pldi. CONCLUSIONS We comprehensively analyzed the sequence origin of the Pldi-AK158810 locus. We showed that various factors, including rearrangements and transposable elements, contributed to the formation of the sequence.
Affiliation(s)
- Yulin Dai
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Shengdi Li
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Xiao Dong
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Han Sun
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Rd. Shanghai 201203, PR China
- Chao Li
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Zhi Liu
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Graduate School of Chinese Academy of Sciences, 19 Yuquan Rd. Beijing 100049, PR China
- Beili Ying
- School of Life Sciences, Fudan University, 220 Handan Rd. Shanghai 200433, PR China
- Guohui Ding
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Rd. Shanghai 201203, PR China
- Yixue Li
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd. Shanghai 200031, PR China
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Rd. Shanghai 201203, PR China
8
Harmston N, Baresic A, Lenhard B. The mystery of extreme non-coding conservation. Philos Trans R Soc Lond B Biol Sci 2013; 368:20130021. [PMID: 24218634 PMCID: PMC3826495 DOI: 10.1098/rstb.2013.0021] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
Regions of several dozen to several hundred base pairs of extreme conservation have been found in non-coding regions in all metazoan genomes. The distribution of these elements within and across genomes has suggested that many have roles as transcriptional regulatory elements in multi-cellular organization, differentiation and development. Currently, there is no known mechanism or function that would account for this level of conservation at the observed evolutionary distances. Previous studies have found that, while these regions are under strong purifying selection, and not mutational coldspots, deletion of entire regions in mice does not necessarily lead to identifiable changes in phenotype during development. These opposing findings lead to several questions regarding their functional importance and why they are under strong selection in the first place. In this perspective, we discuss the methods and techniques used in identifying and dissecting these regions, their observed patterns of conservation, and review the current hypotheses on their functional significance.
Affiliation(s)
- Nathan Harmston
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London and MRC Clinical Sciences Centre, Hammersmith Hospital Campus, Du Cane Road, London W12 0NN, UK
9
Basu S, Müller F, Sanges R. Examples of sequence conservation analyses capture a subset of mouse long non-coding RNAs sharing homology with fish conserved genomic elements. BMC Bioinformatics 2013; 14 Suppl 7:S14. [PMID: 23815359 PMCID: PMC3633045 DOI: 10.1186/1471-2105-14-s7-s14] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/17/2023]
Abstract
Background Long non-coding RNAs (lncRNA) are a major class of non-coding RNAs. They are involved in diverse intra-cellular mechanisms like molecular scaffolding, splicing and DNA methylation. Through these mechanisms they are reported to play a role in cellular differentiation and development. They show an enriched expression in the brain where they are implicated in maintaining cellular identity, homeostasis, stress responses and plasticity. Low sequence conservation and lack of functional annotations make it difficult to identify homologs of mammalian lncRNAs in other vertebrates. A computational evaluation of the lncRNAs through systematic conservation analyses of both sequences as well as their genomic architecture is required. Results Our results show that a subset of mouse candidate lncRNAs could be distinguished from random sequences based on their alignment with zebrafish phastCons elements. Using ROC analyses we were able to define a measure to select significantly conserved lncRNAs. Indeed, starting from ~2,800 mouse lncRNAs we could predict that between 4 and 11% present conserved sequence fragments in fish genomes. Gene ontology (GO) enrichment analyses of protein coding genes, proximal to the region of conservation, in both organisms highlighted similar GO classes like regulation of transcription and central nervous system development. The proximal coding genes in both the species show enrichment of their expression in brain. In summary, we show that interesting genomic regions in zebrafish could be marked based on their sequence homology to a mouse lncRNA, overlap with ESTs and proximity to genes involved in nervous system development. Conclusions Conservation at the sequence level can identify a subset of putative lncRNA orthologs. The similar protein-coding neighborhood and transcriptional information about the conserved candidates provide support to the hypothesis that they share functional homology. 
The pipeline presented here is a proof of principle showing that between 4 and 11% of lncRNAs retain regions of conservation between mammals and fishes. We believe this study will be useful as a reference for analyzing the conservation of lncRNAs in newly sequenced genomes and transcriptomes.
Affiliation(s)
- Swaraj Basu
- Laboratory of Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy
10
Ryu T, Seridi L, Ravasi T. The evolution of ultraconserved elements with different phylogenetic origins. BMC Evol Biol 2012; 12:236. [PMID: 23217155 PMCID: PMC3556307 DOI: 10.1186/1471-2148-12-236] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 04/26/2012] [Accepted: 11/09/2012] [Indexed: 11/10/2022]
Abstract
Background Ultraconserved elements of DNA have been identified in vertebrate and invertebrate genomes. These elements have been found to have diverse functions, including enhancer activities in developmental processes. The evolutionary origins and functional roles of these elements in cellular systems, however, have not yet been determined. Results Here, we identified a wide range of ultraconserved elements common to distant species, from primitive aquatic organisms to terrestrial species with complicated body systems, including some novel elements conserved in fruit fly and human. In addition to a well-known association with developmental genes, these DNA elements have a strong association with genes implicated in essential cell functions, such as epigenetic regulation, apoptosis, detoxification, innate immunity, and sensory reception. Interestingly, we observed that ultraconserved elements clustered by sequence similarity. Furthermore, species composition and flanking genes of clusters showed lineage-specific patterns. Ultraconserved elements are highly enriched with binding sites to developmental transcription factors regardless of how they cluster. Conclusion We identified large numbers of ultraconserved elements across distant species. Specific classes of these conserved elements seem to have been generated before the divergence of taxa and fixed during the process of evolution. Our findings indicate that these ultraconserved elements are not the exclusive property of higher modern eukaryotes, but rather transmitted from their metazoan ancestors.
Affiliation(s)
- Taewoo Ryu
- Integrative Systems Biology Lab, Division of Biological and Environmental Sciences & Engineering, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia.
11
Ultraconserved elements in the human genome: association and transmission analyses of highly constrained single-nucleotide polymorphisms. Genetics 2012; 192:253-66. [PMID: 22714408 DOI: 10.1534/genetics.112.141945] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022]
Abstract
Ultraconserved elements in the human genome likely harbor important biological functions as they are dosage sensitive and are able to direct tissue-specific expression. Because they are under purifying selection, variants in these elements may have a lower frequency in the population but a higher likelihood of association with complex traits. We tested a set of highly constrained SNPs (hcSNPs) distributed genome-wide among ultraconserved and nearly ultraconserved elements for association with seven traits related to reproductive (age at natural menopause, number of children, age at first child, and age at last child) and overall [longevity, body mass index (BMI), and height] fitness. Using up to 24,047 European-American samples from the National Heart, Lung, and Blood Institute Candidate Gene Association Resource (CARe), we observed an excess of associations with BMI and height. In an independent replication panel the most strongly associated SNPs showed an 8.4-fold enrichment of associations at the nominal level, including three variants in previously identified loci and one in a locus (DENND1A) previously shown to be associated with polycystic ovary syndrome. Finally, using 1430 family trios, we showed that the transmissions from heterozygous parents to offspring of the derived alleles of rare (frequency ≤ 0.5%) hcSNPs are not biased, particularly after adjusting for the rates of genotype missingness and error in the data. The lack of transmission bias ruled out an immediately and strongly deleterious effect due to the rare derived alleles, consistent with the observation that mice homozygous for the deletion of ultraconserved elements showed no overt phenotype. Our study also illustrated the importance of carefully modeling potential technical confounders when analyzing genotype data of rare variants.
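The transmission analysis in family trios can be illustrated with an exact two-sided binomial test of whether heterozygous parents pass on the derived allele more or less often than half the time. This is only a sketch in the spirit of a transmission disequilibrium test: the function name and the choice of an exact binomial test are assumptions, and the adjustments for genotype missingness and error mentioned in the abstract are not modeled here.

```python
from math import comb


def transmission_pvalue(transmitted, untransmitted):
    """Exact two-sided binomial p-value for unbiased (50:50) transmission.

    `transmitted` and `untransmitted` are counts of derived-allele
    transmissions and non-transmissions from heterozygous parents.
    Under the null hypothesis each transmission has probability 0.5,
    so the distribution is symmetric and the two-sided p-value is
    twice the smaller tail, capped at 1.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    k = min(transmitted, untransmitted)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A large p-value from such a test is consistent with no transmission bias but, as the abstract notes, technical confounders in rare-variant genotype data need to be modeled before drawing that conclusion.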
12
Beaster-Jones L. Cis-regulation and conserved non-coding elements in amphioxus. Brief Funct Genomics 2012; 11:118-30. [DOI: 10.1093/bfgp/els006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
13
Gondo Y, Fukumura R, Murata T, Makino S. ENU-based gene-driven mutagenesis in the mouse: a next-generation gene-targeting system. Exp Anim 2011; 59:537-48. [PMID: 21030782 DOI: 10.1538/expanim.59.537] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 10/31/2022]
Abstract
As a new mouse mutant resource, the RIKEN ENU-based gene-driven mutagenesis system in the mouse has been available to the research community since 2002. By using random base-substitution mutagenesis with ENU, a new reverse genetics infrastructure has been developed as a next-generation gene-targeting system. The construction of a large-scale mutant mouse library and high-throughput mutation discovery systems were the keys making it practically feasible. The RIKEN mutant mouse library consists of ~ 10,000 G1 mice, within which 100-150 mutant strains have been established based on users' requests every year. Use of the system is very simple: users 1) download an application form from our web site and send to us, and 2) design the PCR primers for the target gene. Then, we screen the RIKEN mutant mouse library and report all the detected mutations to the user. From among the allelic series of discovered mutations, users decide which mutant strain(s) to analyze and request the live mutant strain for functional studies of the target gene. Users have been reporting various functional mutations in the RIKEN mutant mouse library: e.g., missense, knockout-type and even functional non-coding mutations. In the near future, next-generation re-sequencing systems should drastically enhance the utility of the ENU-based gene-driven mutagenesis not only for the mouse but also for other species.
Affiliation(s)
- Yoichi Gondo
- Mutagenesis and Genomics Team, RIKEN BioResource Center, Ibaraki, Japan
14
Janes DE, Chapus C, Gondo Y, Clayton DF, Sinha S, Blatti CA, Organ CL, Fujita MK, Balakrishnan CN, Edwards SV. Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor. Genome Biol Evol 2010; 3:102-13. [PMID: 21183607 PMCID: PMC3035132 DOI: 10.1093/gbe/evq087] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Accepted: 12/15/2010] [Indexed: 12/14/2022]
Abstract
Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.
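The LCNS definition above (≥500 bp in length and ≥95% similarity between species) can be expressed as a simple filter over a pairwise alignment. The sketch below is illustrative only: the function names are hypothetical, similarity is approximated as percent identity over alignment columns, and length is taken as the shorter ungapped sequence, none of which is specified in the abstract.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")  # skip columns that are gaps in both
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(pairs)


def is_lcns(aligned_a, aligned_b, min_length=500, min_identity=95.0):
    """Apply the two LCNS thresholds to one pairwise alignment."""
    ungapped_a = sum(1 for c in aligned_a if c != "-")
    ungapped_b = sum(1 for c in aligned_b if c != "-")
    long_enough = min(ungapped_a, ungapped_b) >= min_length
    return long_enough and percent_identity(aligned_a, aligned_b) >= min_identity
```

With these defaults, a 500 bp ungapped alignment passes with up to 25 mismatches and fails with 26.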
Affiliation(s)
- D E Janes
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA.
15
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
|
16
|
Sathirapongsasuti JF, Sathira N, Suzuki Y, Huttenhower C, Sugano S. Ultraconserved cDNA segments in the human transcriptome exhibit resistance to folding and implicate function in translation and alternative splicing. Nucleic Acids Res 2010; 39:1967-79. [PMID: 21062826 PMCID: PMC3064809 DOI: 10.1093/nar/gkq949] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Ultraconservation, defined as perfect human-to-rodent sequence identity at least 200-bp long, is a strong indicator of evolutionary and functional importance and has been explored extensively at the genome level. However, it has not been investigated at the transcript level, where such extreme conservation might highlight loci with important post-transcriptional regulatory roles. We present 96 ultraconserved cDNA segments (UCSs), stretches of human mature mRNAs that match identically with orthologous regions in the mouse and rat genomes. UCSs can span multiple exons, a feature we leverage here to elucidate the role of ultraconservation in post-transcriptional regulation. UCS sites are implicated in functions at essentially every post-transcriptional stage: pre-mRNA splicing and degradation through alternative splicing and nonsense-mediated decay (AS-NMD), mature mRNA silencing by miRNA, fast mRNA decay rate and translational repression by upstream AUGs. We also found UCSs to exhibit resistance to formation of RNA secondary structure. These multiple layers of regulation underscore the importance of the UCS-containing genes as key global RNA processing regulators, including members of the serine/arginine-rich protein and heterogeneous nuclear ribonucleoprotein (hnRNP) families of essential splicing regulators. The discovery of UCSs shed new light on the multifaceted, fine-tuned and tight post-transcriptional regulation of gene families as conserved through the majority of the mammalian lineage.
Affiliation(s)
- J Fah Sathirapongsasuti
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan.
|
17
|
Fadista J, Thomsen B, Holm LE, Bendixen C. Copy number variation in the bovine genome. BMC Genomics 2010; 11:284. [PMID: 20459598 PMCID: PMC2902221 DOI: 10.1186/1471-2164-11-284] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.0] [Received: 12/11/2009] [Accepted: 05/06/2010] [Indexed: 12/12/2022] Open
Abstract
Background: Copy number variations (CNVs), which represent a significant source of genetic diversity in mammals, have been shown to be associated with phenotypes of clinical relevance and to be causative of disease. Notwithstanding, little is known about the extent to which CNV contributes to genetic variation in cattle. Results: We designed and used a set of NimbleGen CGH arrays that tile across the assayable portion of the cattle genome with approximately 6.3 million probes, at a median probe spacing of 301 bp. This study reports the highest resolution map of copy number variation in the cattle genome, with 304 CNV regions (CNVRs) being identified among the genomes of 20 bovine samples from 4 dairy and beef breeds. The CNVRs identified covered 0.68% (22 Mb) of the genome, and ranged in size from 1.7 to 2,031 kb (median size 16.7 kb). About 20% of the CNVs co-localized with segmental duplications, while 30% encompass genes, of which the majority are involved in environmental response. About 10% of the human orthologues of these genes are associated with human disease susceptibility and, hence, may have important phenotypic consequences. Conclusions: Together, this analysis provides a useful resource for assessment of the impact of CNVs regarding variation in bovine health and production traits.
Affiliation(s)
- João Fadista
- Group of Molecular Genetics and Systems Biology, Department of Genetics and Biotechnology, Faculty of Agricultural Sciences, Aarhus University, Blichers Allé 20, DK-8830 Tjele, Denmark
|
18
|
Beckers J, Wurst W, de Angelis MH. Towards better mouse models: enhanced genotypes, systemic phenotyping and envirotype modelling. Nat Rev Genet 2010; 10:371-80. [PMID: 19434078 DOI: 10.1038/nrg2578] [Citation(s) in RCA: 89] [Impact Index Per Article: 6.4] [Indexed: 02/07/2023]
Abstract
The mouse is the leading mammalian model organism for basic genetic research and for studying human diseases. Coordinated international projects are currently in progress to generate a comprehensive map of mouse gene functions - the first for any mammalian genome. There are still many challenges ahead to maximize the value of the mouse as a model, particularly for human disease. These involve generating mice that are better models of human diseases at the genotypic level, systemic (assessing all organ systems) and systematic (analysing all mouse lines) phenotyping of existing and new mouse mutant resources, and assessing the effects of the environment on phenotypes.
Affiliation(s)
- Johannes Beckers
- Institute of Experimental Genetics, Helmholtz Zentrum München, GmbH, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany.
|
19
|
Tseng HHE, Tompa M. Algorithms for locating extremely conserved elements in multiple sequence alignments. BMC Bioinformatics 2009; 10:432. [PMID: 20021665 PMCID: PMC2808710 DOI: 10.1186/1471-2105-10-432] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/29/2009] [Accepted: 12/18/2009] [Indexed: 11/25/2022] Open
Abstract
Background: In 2004, Bejerano et al. announced the startling discovery of hundreds of "ultraconserved elements", long genomic sequences perfectly conserved across human, mouse, and rat. Their announcement stimulated a flurry of subsequent research. Results: We generalize the notion of ultraconserved element in a natural way from extraordinary human-rodent conservation to extraordinary conservation over an arbitrary set of species. We call these "Extremely Conserved Elements". There is a linear time algorithm to find all such Extremely Conserved Elements in any multiple sequence alignment, provided that the conservation is required to be across all the aligned species. For the general case of conservation across an arbitrary subset of the aligned species, we show that the question of whether there exists an Extremely Conserved Element is NP-complete. We illustrate the linear time algorithm by cataloguing all 177 Extremely Conserved Elements in the currently available 44-vertebrate whole-genome alignment, and point out some of the characteristics of these elements. Conclusions: The NP-completeness in the case of conservation across an arbitrary subset of the aligned species implies that it is unlikely an efficient algorithm exists for this general case. Despite this fact, for the interesting case of conservation across all or most of the aligned species, our algorithm is efficient enough to be practical. The 177 Extremely Conserved Elements that we catalog demonstrate many of the characteristics of the original ultraconserved elements of Bejerano et al.
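To illustrate why the all-species case can be handled in a single pass, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the exact definition of an Extremely Conserved Element, so the sketch assumes the simplest criterion, namely perfect identity with no gaps in every aligned row over at least `min_len` consecutive columns. The function name `conserved_runs` and the `min_len` parameter are hypothetical.

```python
from typing import List, Tuple


def conserved_runs(alignment: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges that are identical in all rows.

    Assumption (not from the paper): a column counts as conserved only if
    every row carries the same non-gap character in that column.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all alignment rows must have the same length")

    runs: List[Tuple[int, int]] = []
    start = None  # first column of the current conserved run, if any
    for col in range(width):
        first = alignment[0][col]
        identical = first != "-" and all(row[col] == first for row in alignment)
        if identical:
            if start is None:
                start = col
        else:
            # The run (if any) ends just before this column.
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    # Close a run that extends to the last column.
    if start is not None and width - start >= min_len:
        runs.append((start, width))
    return runs


if __name__ == "__main__":
    toy = ["ACGTACGTTT", "ACGTTCGTTT", "ACGTACGTTA"]
    print(conserved_runs(toy, min_len=4))
```

Each column is inspected once per row, so the running time is proportional to the size of the alignment. Allowing conservation across an arbitrary subset of rows removes this simple structure, which is the case the abstract reports to be NP-complete.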
Affiliation(s)
- Huei-Hun E Tseng
- Department of Computer Science and Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350, USA.
|
20
|
Gondo Y, Fukumura R, Murata T, Makino S. Next-generation gene targeting in the mouse for functional genomics. BMB Rep 2009; 42:315-23. [PMID: 19558788 DOI: 10.5483/bmbrep.2009.42.6.315] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/20/2022] Open
Abstract
To elucidate the ultimate biological function of the genome, model animal systems carrying mutations are indispensable. Recently, large-scale mutagenesis projects have been launched in various species. The mouse in particular is considered an ideal model for humans because it is a mammalian species with well-established genetic and embryonic technologies. In the 1990s, large-scale mouse mutagenesis projects were first initiated with a potent chemical mutagen, N-ethyl-N-nitrosourea (ENU), using the phenotype-driven approach, or forward genetics. Knockout mouse mutagenesis projects with trapping/conditional mutagenesis have followed as Phase II since 2006, using the gene-driven approach, or reverse genetics. More recently, the next-generation gene targeting system has also become available to the research community; it allows mutant mice carrying an allelic series of base substitutions in target genes to be established and analyzed, as another form of reverse genetics. This article reviews overall trends in large-scale mouse mutagenesis, focusing particularly on recent advances in the next-generation gene targeting system. The drastic expansion of mutant mouse resources will enhance the systematic understanding of life. The construction of mutant mouse resources by forward and reverse genetic mutagenesis is just the beginning of the annotation of the mammalian genome. These resources provide basic infrastructure for understanding the molecular mechanisms of genes and the genome, and will contribute not only to basic research but also to applied sciences such as human disease modelling, genomic medicine and personalized medicine.
Affiliation(s)
- Yoichi Gondo
- Mutagenesis and Genomics Team, RIKEN BioResource Center, 3-1-1 Koyadai, Tsukuba, Ibaraki 305-0074, Japan
|
21
|
Hughes AL, Friedman R. More radical amino acid replacements in primates than in rodents: support for the evolutionary role of effective population size. Gene 2009; 440:50-6. [PMID: 19332110 DOI: 10.1016/j.gene.2009.03.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 02/16/2009] [Revised: 03/16/2009] [Accepted: 03/19/2009] [Indexed: 02/04/2023]
Abstract
We examined the pattern of nucleotide substitution in 4933 conserved single-copy orthologous protein-coding genes of human, rhesus, mouse, and rat. Consistent with previous studies, the median ratio of the number of nonsynonymous substitutions per nonsynonymous site (d(N)) to the number of synonymous substitutions per synonymous site (d(S)) was significantly higher in the comparison between the two primates than in the comparison between the two rodents. This pattern was particularly strong in the case of genes expressed in the immune system, but also occurred in other genes, including a set of highly conserved genes involved in the regulation of transcription. Both synonymous and nonsynonymous differences occurred independently in the same codons in the primates and in the rodents to a greater extent than expected by chance, but the extent of the deviation from random expectation was much greater in the case of nonsynonymous differences. Parallel amino acid replacements occurred at the same sites in the primates and rodents far more frequently than expected by chance, but tended to involve very conservative amino acid changes. Divergent amino acid changes involved more chemically different amino acids than parallel changes, and divergent amino acid replacements between the primates were significantly more radical than those between the rodents. These results are most easily explained on the hypothesis that the evolution of these genes has been shaped largely by purifying selection, which has been less effective in primates than in rodents, presumably as a consequence of lower long-term effective population sizes in the former.
Affiliation(s)
- Austin L Hughes
- Department of Biological Sciences, Coker Life Sciences Building, 715 Sumter St., University of South Carolina, Columbia SC 29208, USA.
|