101
|
Brault V, Pereira P, Duchon A, Hérault Y. Modeling chromosomes in mouse to explore the function of genes, genomic disorders, and chromosomal organization. PLoS Genet 2006; 2:e86. [PMID: 16839184 PMCID: PMC1500809 DOI: 10.1371/journal.pgen.0020086] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 11/27/2022] Open
Abstract
One of the challenges of genomic research after the completion of the human genome project is to assign a function to all the genes and to understand their interactions and organizations. Among the various techniques, the emergence of chromosome engineering tools with the aim to manipulate large genomic regions in the mouse model offers a powerful way to accelerate the discovery of gene functions and provides more mouse models to study normal and pathological developmental processes associated with aneuploidy. The combination of gene targeting in ES cells, recombinase technology, and other techniques makes it possible to generate new chromosomes carrying specific and defined deletions, duplications, inversions, and translocations that are accelerating functional analysis. This review presents the current status of chromosome engineering techniques and discusses the different applications as well as the implication of these new techniques in future research to better understand the function of chromosomal organization and structures.
Affiliation(s)
- Véronique Brault
- Institut de Transgénose, IEM, CNRS Uni Orléans, UMR6218, Orléans, France
|
102
|
Abstract
The widespread occurrence of noncoding (nc) RNAs--unannotated eukaryotic transcripts with reduced protein coding potential--suggests that they are functionally important. Study of ncRNAs is increasing our understanding of the organization and regulation of genomes.
|
103
|
Biunno I, Cattaneo M, Orlandi R, Canton C, Biagiotti L, Ferrero S, Barberis M, Pupa SM, Scarpa A, Ménard S. SEL1L a multifaceted protein playing a role in tumor progression. J Cell Physiol 2006; 208:23-38. [PMID: 16331677 DOI: 10.1002/jcp.20574] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 01/17/2023]
Abstract
Since the cloning in 1997 of SEL1L, the human ortholog of the sel-1 gene of C. elegans, most studies have focused on its role in cancer progression and have provided significant evidence linking its increased expression to a decrease in tumor aggressiveness. SEL1L resides in a "Genome Desert area" on chromosome 14q24.3-31 and is highly conserved in evolution. The function of the SEL1L-encoded protein is still very elusive, although several lines of evidence from lower organisms indicate that it plays a major role in protein degradation using the ubiquitin-proteasome system. SEL1L has a very complex structure made up of modules: genomically, it consists of 21 exons featuring several alternative transcripts encoding putative protein isoforms. This structural complexity ensures protein flexibility and specificity; indeed, the protein was found in different sub-cellular compartments and may turn on a particular transcript in response to specific stimuli. The overall architecture of SEL1L guarantees an exquisite regulation in the expression of the gene.
MESH Headings
- Amino Acid Sequence
- Animals
- Cell Proliferation
- Cell Transformation, Neoplastic/genetics
- Cell Transformation, Neoplastic/pathology
- Chromosome Deletion
- Chromosomes, Human, Pair 14
- DNA Mutational Analysis
- DNA, Neoplasm/genetics
- Disease Progression
- Exons/genetics
- Fetus/chemistry
- Gene Expression Regulation, Neoplastic/genetics
- Gene Expression Regulation, Neoplastic/physiology
- Humans
- Molecular Sequence Data
- Neoplasm Metastasis
- Neoplasms/genetics
- Neoplasms/pathology
- Neoplasms/physiopathology
- Polymorphism, Genetic/genetics
- Protein Isoforms/analysis
- Protein Isoforms/chemistry
- Protein Isoforms/genetics
- Protein Isoforms/physiology
- Proteins/analysis
- Proteins/chemistry
- Proteins/genetics
- Proteins/physiology
- Receptors, Notch/genetics
- Receptors, Notch/physiology
- Signal Transduction/genetics
- Signal Transduction/physiology
- Transforming Growth Factor beta/genetics
- Transforming Growth Factor beta/physiology
Affiliation(s)
- Ida Biunno
- Istituto di Tecnologie Biomediche, CNR, Segrate-Milano, Italy
|
104
|
Freeman JL, Perry GH, Feuk L, Redon R, McCarroll SA, Altshuler DM, Aburatani H, Jones KW, Tyler-Smith C, Hurles ME, Carter NP, Scherer SW, Lee C. Copy number variation: new insights in genome diversity. Genome Res 2006; 16:949-61. [PMID: 16809666 DOI: 10.1101/gr.3677206] [Citation(s) in RCA: 545] [Impact Index Per Article: 30.3] [Indexed: 12/14/2022]
Abstract
DNA copy number variation has long been associated with specific chromosomal rearrangements and genomic disorders, but its ubiquity in mammalian genomes was not fully realized until recently. Although our understanding of the extent of this variation is still developing, it seems likely that, at least in humans, copy number variants (CNVs) account for a substantial amount of genetic variation. Since many CNVs include genes that result in differential levels of gene expression, CNVs may account for a significant proportion of normal phenotypic variation. Current efforts are directed toward a more comprehensive cataloging and characterization of CNVs that will provide the basis for determining how genomic diversity impacts biological function, evolution, and common human diseases.
Affiliation(s)
- Jennifer L Freeman
- Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA
|
105
|
Abstract
The first wave of information from the analysis of the human genome revealed SNPs to be the main source of genetic and phenotypic human variation. However, the advent of genome-scanning technologies has now uncovered an unexpectedly large extent of what we term 'structural variation' in the human genome. This comprises microscopic and, more commonly, submicroscopic variants, which include deletions, duplications and large-scale copy-number variants - collectively termed copy-number variants or copy-number polymorphisms - as well as insertions, inversions and translocations. Rapidly accumulating evidence indicates that structural variants can comprise millions of nucleotides of heterogeneity within every genome, and are likely to make an important contribution to human diversity and disease susceptibility.
Affiliation(s)
- Lars Feuk
- The Centre for Applied Genomics and Program in Genetics and Genomic Biology, The Hospital for Sick Children, Department of Molecular and Medical Genetics, University of Toronto, Ontario, Canada
|
106
|
Furuno M, Pang KC, Ninomiya N, Fukuda S, Frith MC, Bult C, Kai C, Kawai J, Carninci P, Hayashizaki Y, Mattick JS, Suzuki H. Clusters of internally primed transcripts reveal novel long noncoding RNAs. PLoS Genet 2006; 2:e37. [PMID: 16683026 PMCID: PMC1449886 DOI: 10.1371/journal.pgen.0020037] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.4] [Received: 08/16/2005] [Accepted: 02/01/2006] [Indexed: 02/07/2023] Open
Abstract
Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (Kb), respectively. We surveyed the 102,801 FANTOM3 mouse cDNA clones and found that Air and Xist were present not as single, full-length transcripts but as a cluster of multiple, shorter cDNAs, which were unspliced, had little coding potential, and were most likely primed from internal adenine-rich regions within longer parental transcripts. We therefore conducted a genome-wide search for regional clusters of such cDNAs to find novel macro ncRNA candidates. Sixty-six regions were identified, each of which mapped outside known protein-coding loci and which had a mean length of 92 Kb. We detected several known long ncRNAs within these regions, supporting the basic rationale of our approach. In silico analysis showed that many regions had evidence of imprinting and/or antisense transcription. These regions were significantly associated with microRNAs and transcripts from the central nervous system. We selected eight novel regions for experimental validation by northern blot and RT-PCR and found that the majority represent previously unrecognized noncoding transcripts that are at least 10 Kb in size and predominantly localized in the nucleus. Taken together, the data not only identify multiple new ncRNAs but also suggest the existence of many more macro ncRNAs like Xist and Air.

The human genome has been sequenced, and, intriguingly, less than 2% specifies the information for the basic protein building blocks of our bodies. So, what does the other 98% do? It now appears that the mammalian genome also specifies the instructions for many previously undiscovered “non protein-coding RNA” (ncRNA) genes. However, what these ncRNAs do is largely unknown. In recent years, strategies have been designed that have successfully identified hundreds of short ncRNAs, termed microRNAs, many of which have since been shown to act as genetic regulators. Also known to be functionally important are a handful of ncRNAs orders of magnitude larger in size than microRNAs. The availability of complete genome and comprehensive transcript sequences allows for the systematic discovery of more large ncRNAs. The authors developed a computational strategy to screen the mouse genome and identify large ncRNAs. They detected existing large ncRNAs, thus validating their approach, but, more importantly, discovered more than 60 other candidates, some of which were subsequently confirmed experimentally. This work opens the door to a virtually unexplored world of large ncRNAs and beckons future experimental work to define the cellular functions of these molecules.
Affiliation(s)
- Masaaki Furuno
- Mouse Genome Informatics Consortium, The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Ken C Pang
- Australian Research Council Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- T Cell laboratory, Ludwig Institute for Cancer Research, Austin Health, Heidelberg, Victoria, Australia
- Noriko Ninomiya
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Shiro Fukuda
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Martin C Frith
- Australian Research Council Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Carol Bult
- Mouse Genome Informatics Consortium, The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Chikatoshi Kai
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Jun Kawai
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Japan
- Piero Carninci
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Japan
- Yoshihide Hayashizaki
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Japan
- John S Mattick
- Australian Research Council Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- Harukazu Suzuki
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
- * To whom correspondence should be addressed. E-mail:
|
107
|
Abstract
The house mouse has been used as a privileged model organism since the early days of genetics, and the numerous experiments made with this small mammal have regularly contributed to enrich our knowledge of mammalian biology and pathology, ranging from embryonic development to metabolic disease, histocompatibility, immunology, behavior, and cancer. Over the past two decades, a number of large-scale integrated and concerted projects have been undertaken that will probably open a new era in the genetics of the species. The sequencing of the genome, which will allow researchers to make comparisons with other mammals and identify regions conserved by evolution, is probably the most important project, but many other initiatives, such as the massive production of point or chromosomal mutations associated with comprehensive and standardized phenotyping of the mutant phenotypes, will help annotation of the approximately 25,000 genes packed in the mouse genome. In the same way, and as another consequence of the sequencing, the discovery of many single nucleotide polymorphisms and the development of new tools and resources, like the Collaborative Cross, will contribute to the development of modern quantitative genetics. It is clear that mouse genetics has changed dramatically over the last 10-15 years and its future looks promising.
Affiliation(s)
- Jean Louis Guénet
- Département de Biologie du Développement, Institut Pasteur, 75724 Paris Cedex 15, France.
|
108
|
Abstract
The human genome project has had an impact on both biological research and its political organization; this review focuses primarily on the scientific novelty that has emerged from the project but also touches on its political dimensions. The project has generated both anticipated and novel information; in the latter category are the description of the unusual distribution of genes, the prevalence of non-protein-coding genes, and the extraordinary evolutionary conservation of some regions of the genome. The applications of the sequence data are just starting to be felt in basic, rather than therapeutic, biomedical research and in the vibrant human origins and variation debates. The political impact of the project is in the unprecedented extent to which directed funding programs have emerged as drivers of basic research and the organization of the multidisciplinary groups that are needed to utilize the human DNA sequence.
Affiliation(s)
- Peter F R Little
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney 2074, New South Wales, Australia.
|
109
|
Hirashima K, Iwaki T, Takegawa K, Giga-Hama Y, Tohda H. A simple and effective chromosome modification method for large-scale deletion of genome sequences and identification of essential genes in fission yeast. Nucleic Acids Res 2006; 34:e11. [PMID: 16434698 PMCID: PMC1351375 DOI: 10.1093/nar/gnj011] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 01/08/2023] Open
Abstract
The technologies for chromosome modification developed to date are not satisfactorily universal, owing to the typical requirements for special enzymes and sequences. In the present report, we propose a new approach for chromosome modification in Schizosaccharomyces pombe that does not involve any special enzymes or sequences. This method, designated the 'Latour system', has wide applicability with extremely high efficiency, although both the basic principle and the operation are very simple. We demonstrate the ability of the Latour system to discriminate essential genes, with a long chromosomal area of 100 kb containing 33 genes deleted simultaneously and efficiently. Since no foreign sequences are retained after deletion using the Latour system, this system can be repeatedly applied at other sites. Provided that a negative selectable marker is available, the Latour system relies solely upon homologous recombination, which is highly conserved in living organisms. For this reason, it is expected that the system will be applicable to various yeasts.
Affiliation(s)
- Tomoko Iwaki
- Department of Life Sciences, Faculty of Agriculture, Kagawa University, Miki-cho, Kagawa 761-0795, Japan
- Kaoru Takegawa
- Department of Life Sciences, Faculty of Agriculture, Kagawa University, Miki-cho, Kagawa 761-0795, Japan
- Hideki Tohda
- To whom correspondence should be addressed. Tel: +81 45 374 7377; Fax: +81 45 374 8872;
|
110
|
Abstract
Until recently the study of individual DNA sequences and of total DNA content (the C-value) sat at opposite ends of the spectrum in genome biology. For gene sequencers, the vast stretches of non-coding DNA found in eukaryotic genomes were largely considered to be an annoyance, whereas genome-size researchers attributed little relevance to specific nucleotide sequences. However, the dawn of comprehensive genome sequencing has allowed a new synergy between these fields, with sequence data providing novel insights into genome-size evolution, and with genome-size data being of both practical and theoretical significance for large-scale sequence analysis. In combination, these formerly disconnected disciplines are poised to deliver a greatly improved understanding of genome structure and evolution.
Affiliation(s)
- T Ryan Gregory
- Department of Integrative Biology, University of Guelph, Ontario N1G 2W1, Canada.
|
111
|
Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C, Reymond A, Excoffier L, Attar H, Antonarakis SE, Dermitzakis ET, Hirschhorn JN. Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nat Genet 2005; 38:223-7. [PMID: 16380714 DOI: 10.1038/ng1710] [Citation(s) in RCA: 187] [Impact Index Per Article: 9.8] [Received: 10/14/2005] [Accepted: 11/04/2005] [Indexed: 02/06/2023]
Abstract
Noncoding genetic variants are likely to influence human biology and disease, but recognizing functional noncoding variants is difficult. Approximately 3% of noncoding sequence is conserved among distantly related mammals, suggesting that these evolutionarily conserved noncoding regions (CNCs) are selectively constrained and contain functional variation. However, CNCs could also merely represent regions with lower local mutation rates. Here we address this issue and show that CNCs are selectively constrained in humans by analyzing HapMap genotype data. Specifically, new (derived) alleles of SNPs within CNCs are rarer than new alleles in nonconserved regions (P = 3 × 10^-18), indicating that evolutionary pressure has suppressed CNC-derived allele frequencies. Intronic CNCs and CNCs near genes show greater allele frequency shifts, with magnitudes comparable to those for missense variants. Thus, conserved noncoding variants are more likely to be functional. Allele frequency distributions highlight selectively constrained genomic regions that should be intensively surveyed for functionally important variation.
Affiliation(s)
- Jared A Drake
- Program in Genomics and Division of Endocrinology, Children's Hospital, Boston, Massachusetts 02115, USA
|
112
|
Chen FC, Wang SS, Chen CJ, Li WH, Chuang TJ. Alternatively and constitutively spliced exons are subject to different evolutionary forces. Mol Biol Evol 2005; 23:675-82. [PMID: 16368777 DOI: 10.1093/molbev/msj081] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023] Open
Abstract
There has been a controversy on whether alternatively spliced exons (ASEs) evolve faster than constitutively spliced exons (CSEs). Although it has been noted that ASEs are subject to weaker selective constraints than CSEs, so they evolve faster, there have also been studies that indicated slower evolution in ASEs than in CSEs. In this study, we retrieve more than 5,000 human-mouse orthologous exons and calculate the synonymous (KS) and nonsynonymous (KA) substitution rates in these exons. Our results show that ASEs have higher KA values and higher KA/KS ratios than CSEs, indicating faster amino acid-level evolution in ASEs. The faster evolution may be in part due to weaker selective constraints. It is also possible that the faster rate is in part due to faster functional evolution in ASEs. On the other hand, the majority of ASEs have lower KS values than CSEs. With reference to the substitution rate in introns, we show that the KS values in ASEs are close to the neutral substitution rate, whereas the synonymous substitution rate in CSEs has likely been accelerated. The elevated synonymous rate in CSEs is not related to CpG dinucleotides or low-complexity regions of protein but may be weakly related to codon usage bias. The overall trends of higher KA and lower KS in ASEs than in CSEs are also observed in human-rat and mouse-rat comparisons. Therefore, our observations hold for mammals of different molecular clocks.
Affiliation(s)
- Feng-Chi Chen
- Genomics Research Center, Academia Sinica, Taipei, Taiwan
|
113
|
Adams MD. Conserved sequences and the evolution of gene regulatory signals. Curr Opin Genet Dev 2005; 15:628-33. [PMID: 16185862 DOI: 10.1016/j.gde.2005.09.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 08/04/2005] [Accepted: 09/14/2005] [Indexed: 12/26/2022]
Abstract
Studies of evolutionary conservation of gene regulatory signals have led to a paradox: extensive sequence similarity implies functional conservation in non-coding regions across mammalian species; however, this stands in contrast to our understanding of transcriptional regulatory sites composed of degenerate recognition sequences for transcription factors that can maintain functional equivalence despite considerable sequence divergence. The latter observation provides an explanation for the rapid evolution of new traits through the gain and loss of transcription factor binding sites that bring new genes under the control of an existing genetic regulatory network. The former observation might point to novel mechanisms of gene regulation and/or chromosome function that are currently unappreciated. Recent comparative genome analysis has highlighted extensive conserved sequences in mammalian genomes that are beginning to be functionally characterized.
Affiliation(s)
- Mark D Adams
- Department of Genetics, Center for Human Genetics, Center for Computational Genomics and Systems Biology, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106, USA.
|
114
|
Itoh T, Toyoda A, Taylor TD, Sakaki Y, Hattori M. Identification of large ancient duplications associated with human gene deserts. Nat Genet 2005; 37:1041-3. [PMID: 16186813 DOI: 10.1038/ng1648] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 04/18/2005] [Accepted: 08/12/2005] [Indexed: 11/09/2022]
Abstract
We identified 15 regions of >1 Mb in the human genome composed of large ancient local duplications corresponding to gene deserts. We detected these intrachromosomal duplications in mouse and dog but not in chicken; they present as patches of similarity as low as 60%. These findings suggest that some human gene deserts originated from duplications of segments lacking genes in a mammalian common ancestor.
Affiliation(s)
- Takehiko Itoh
- Research Center for Advanced Science and Technology, Mitsubishi Research Institute, Inc., Tokyo 100-8141, Japan.
|
115
|
Nusbaum C, Zody MC, Borowsky ML, Kamal M, Kodira CD, Taylor TD, Whittaker CA, Chang JL, Cuomo CA, Dewar K, FitzGerald MG, Yang X, Abouelleil A, Allen NR, Anderson S, Bloom T, Bugalter B, Butler J, Cook A, DeCaprio D, Engels R, Garber M, Gnirke A, Hafez N, Hall JL, Norman CH, Itoh T, Jaffe DB, Kuroki Y, Lehoczky J, Lui A, Macdonald P, Mauceli E, Mikkelsen TS, Naylor JW, Nicol R, Nguyen C, Noguchi H, O'Leary SB, O'Neill K, Piqani B, Smith CL, Talamas JA, Topham K, Totoki Y, Toyoda A, Wain HM, Young SK, Zeng Q, Zimmer AR, Fujiyama A, Hattori M, Birren BW, Sakaki Y, Lander ES. DNA sequence and analysis of human chromosome 18. Nature 2005; 437:551-5. [PMID: 16177791 DOI: 10.1038/nature03983] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Received: 01/21/2005] [Accepted: 06/27/2005] [Indexed: 11/08/2022]
Abstract
Chromosome 18 appears to have the lowest gene density of any human chromosome and is one of only three chromosomes for which trisomic individuals survive to term. There are also a number of genetic disorders stemming from chromosome 18 trisomy and aneuploidy. Here we report the finished sequence and gene annotation of human chromosome 18, which will allow a better understanding of the normal and disease biology of this chromosome. Despite the low density of protein-coding genes on chromosome 18, we find that the proportion of non-protein-coding sequences evolutionarily conserved among mammals is close to the genome-wide average. Extending this analysis to the entire human genome, we find that the density of conserved non-protein-coding sequences is largely uncorrelated with gene density. This has important implications for the nature and roles of non-protein-coding sequence elements.
Affiliation(s)
- Chad Nusbaum
- Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, Massachusetts 02141, USA.
|
116
|
Kapranov P, Drenkow J, Cheng J, Long J, Helt G, Dike S, Gingeras TR. Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays. Genome Res 2005; 15:987-97. [PMID: 15998911 PMCID: PMC1172043 DOI: 10.1101/gr.3455305] [Citation(s) in RCA: 228] [Impact Index Per Article: 12.0] [Indexed: 11/24/2022]
Abstract
Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. In this report, the structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work are represented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual that a single base pair can be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications concerning the correlation of genotypes to phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.
|
117
|
Taylor J. Clues to function in gene deserts. Trends Biotechnol 2005; 23:269-71. [PMID: 15922077 DOI: 10.1016/j.tibtech.2005.04.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 02/28/2005] [Revised: 03/21/2005] [Accepted: 04/05/2005] [Indexed: 02/07/2023]
Abstract
Recent work by Ivan Ovcharenko and colleagues has shed new light on the functional importance of gene deserts. They demonstrate that sequence conservation levels separate gene deserts into stable (more conserved) and variable classes. Both classes exhibit characteristics suggestive of function. The stable deserts in particular show features suggesting a role in the complex regulation of core vertebrate genes.
Affiliation(s)
- James Taylor
- Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA 16802, USA. james.bx.psu.edu
|
118
|
Babak T, Blencowe BJ, Hughes TR. A systematic search for new mammalian noncoding RNAs indicates little conserved intergenic transcription. BMC Genomics 2005; 6:104. [PMID: 16083503 PMCID: PMC1199595 DOI: 10.1186/1471-2164-6-104] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Received: 07/22/2005] [Accepted: 08/05/2005] [Indexed: 11/10/2022] Open
Abstract
Background: Systematic identification and functional characterization of novel types of noncoding (nc)RNA in genomes is more difficult than it is for protein coding mRNAs, since ncRNAs typically do not possess sequence features such as splicing or translation signals, or long open reading frames. Recent "tiling" microarray studies have reported that a surprisingly larger proportion of mammalian genomes is transcribed than was previously anticipated. However, these non-genic transcripts often appear to be low in abundance, and their functional significance is not known.
Results: To systematically search for functional ncRNAs, we designed microarrays to detect 3,478 intergenic and intronic sequences that are conserved between the human, mouse, and rat genomes, and that score highly by other criteria that characterize ncRNAs. We probed these arrays with total RNA isolated from 16 wild-type mouse tissues. Among 55 candidates for highly-expressed novel ncRNAs tested by northern blotting, eight were confirmed as small, highly and ubiquitously expressed RNAs in mouse. Of the eight, five were also detected in rat tissues, but none were detected at appreciable levels in human tissues or cultured cells.
Conclusion: Since the sequence and expression of most known coding transcripts and functional ncRNAs is conserved between human and mouse, the lack of northern-detectable expression in human cells and tissues of the novel mouse and rat ncRNAs that we identified suggests that they are not functional or possibly have rodent-specific functions. Our results confirm that relatively little of the intergenic sequence conserved between human, mouse and rat is transcribed at high levels in mammalian tissues, possibly suggesting a limited role for transcribed intergenic and intronic sequences as independent functional elements.
Affiliation(s)
- Tomas Babak
- Banting and Best Department of Medical Research, 112 College St., Toronto, ON M5G 1L6 Canada
- Department of Medical Genetics and Microbiology, 10 King's College Circle, Toronto, ON M1R 4F9 Canada
- Benjamin J Blencowe
- Banting and Best Department of Medical Research, 112 College St., Toronto, ON M5G 1L6 Canada
- Department of Medical Genetics and Microbiology, 10 King's College Circle, Toronto, ON M1R 4F9 Canada
- Timothy R Hughes
- Banting and Best Department of Medical Research, 112 College St., Toronto, ON M5G 1L6 Canada
- Department of Medical Genetics and Microbiology, 10 King's College Circle, Toronto, ON M1R 4F9 Canada
|
119
|
Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, Haussler D. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 2005; 15:1034-50. [PMID: 16024819 PMCID: PMC1182216 DOI: 10.1101/gr.3715005] [Citation(s) in RCA: 2792] [Impact Index Per Article: 146.9] [Received: 01/19/2005] [Accepted: 06/02/2005] [Indexed: 11/24/2022]
Abstract
We have conducted a comprehensive search for conserved elements in vertebrate genomes, using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu rubripes). Parallel searches have been performed with multiple alignments of four insect species (three species of Drosophila and Anopheles gambiae), two species of Caenorhabditis, and seven species of Saccharomyces. Conserved elements were identified with a computer program called phastCons, which is based on a two-state phylogenetic hidden Markov model (phylo-HMM). PhastCons works by fitting a phylo-HMM to the data by maximum likelihood, subject to constraints designed to calibrate the model across species groups, and then predicting conserved elements based on this model. The predicted elements cover roughly 3%-8% of the human genome (depending on the details of the calibration procedure) and substantially higher fractions of the more compact Drosophila melanogaster (37%-53%), Caenorhabditis elegans (18%-37%), and Saccharomyces cerevisiae (47%-68%) genomes. From yeasts to vertebrates, in order of increasing genome size and general biological complexity, increasing fractions of conserved bases are found to lie outside of the exons of known protein-coding genes. In all groups, the most highly conserved elements (HCEs), by log-odds score, are hundreds or thousands of bases long. These elements share certain properties with ultraconserved elements, but they tend to be longer and less perfectly conserved, and they overlap genes of somewhat different functional categories. In vertebrates, HCEs are associated with the 3' UTRs of regulatory genes, stable gene deserts, and megabase-sized regions rich in moderately conserved noncoding sequences. Noncoding HCEs also show strong statistical evidence of an enrichment for RNA secondary structure.
Affiliation(s)
- Adam Siepel
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA.
|
120
|
Lossie AC, Meehan TP, Castillo A, Zheng L, Weiser KC, Strivens MA, Justice MJ. 18th International Mouse Genome Conference. Mamm Genome 2005; 16:471-5. [PMID: 16151691 DOI: 10.1007/s00335-005-0026-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2005] [Accepted: 04/01/2005] [Indexed: 10/25/2022]
Affiliation(s)
- Amy C Lossie
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
|
121
|
Abstract
The frequency of individual ancestral non-coding conserved regions within the genome helps in assessing the probability that they function in transcription regulation or RNA coding. Genomic segments that do not code for proteins yet show high conservation among vertebrates have recently been identified by various computational methodologies. We refer to them as ANCORs (ancestral non-coding conserved regions). The frequency of individual ANCORs within the genome, along with their (correlated) inter-species identity scores, helps in assessing the probability that they function in transcription regulation or RNA coding.
Affiliation(s)
- Ronny Aloni
- Department of Molecular Genetics and the Crown Human Genome Center, Weizmann Institute of Science, Rehovot 76100, Israel
- Doron Lancet
- Department of Molecular Genetics and the Crown Human Genome Center, Weizmann Institute of Science, Rehovot 76100, Israel
|
122
|
Abstract
The past four years have seen an explosion in the number of detected RNA transcripts with no apparent protein-coding potential. This has led to speculation that non-protein-coding RNAs (ncRNAs) might be as important as proteins in the regulation of vital cellular functions. However, there has been significantly less progress in actually demonstrating the functions of these transcripts. In this article, we review the results of recent experiments that show that transcription of non-protein-coding RNA is far more widespread than was previously anticipated. Although some ncRNAs act as molecular switches that regulate gene expression, the function of many ncRNAs is unknown. New experimental and computational approaches are emerging that will help determine whether these newly identified transcription products are evidence of important new biochemical pathways or are merely 'junk' RNA generated by the cell as a by-product of its functional activities.
Affiliation(s)
- Alexander Hüttenhofer
- Division of Genomics and RNomics, Innsbruck Medical University-Biocenter, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria.
|
123
|
Lettice LA, Hill RE. Preaxial polydactyly: a model for defective long-range regulation in congenital abnormalities. Curr Opin Genet Dev 2005; 15:294-300. [PMID: 15917205 DOI: 10.1016/j.gde.2005.04.002] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Received: 02/11/2005] [Accepted: 04/01/2005] [Indexed: 02/07/2023]
Abstract
Point mutations in the long-range, limb-specific regulatory element of the SHH gene are responsible for the human limb abnormality called preaxial polydactyly (PPD). Disruptions of regulatory elements in developmental genes are a small but increasingly significant class of mutations responsible for congenital defects. Identifying regulatory elements that might reside hundreds of kilobases from their relevant genes is difficult but rendered possible by the emerging field of comparative genomics. Genetic analysis of PPD highlights the notion that regulatory mutations might generate phenotypes distinct from any of those identified for coding region mutations.
Affiliation(s)
- Laura A Lettice
- MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
|
124
|
Abstract
There is growing evidence that mammalian genomes produce thousands of transcripts that do not encode proteins, and this RNA class might even rival the complexity of mRNAs. There is no doubt that a number of these non-protein-coding RNAs have important regulatory functions in the cell. However, do all transcripts have a function or are many of them products of fortuitous transcription with no function? The second scenario is mirrored by numerous alternative-splicing events that lead to truncated proteins. Nevertheless, analogous to 'superfluous' genomic DNA, aberrant transcripts or processing products embody evolutionary potential and provide novel RNAs that natural selection can act on.
Affiliation(s)
- Jürgen Brosius
- Institute of Experimental Pathology, ZMBE, University of Münster, Von-Esmarch-Str. 56, Münster, Germany.
|
125
|
Koide T, Hayata T, Cho KWY. Xenopus as a model system to study transcriptional regulatory networks. Proc Natl Acad Sci U S A 2005; 102:4943-8. [PMID: 15795378 PMCID: PMC555977 DOI: 10.1073/pnas.0408125102] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Received: 11/01/2004] [Indexed: 11/18/2022] Open
Abstract
Development is controlled by a complex series of events requiring sequential gene activation. Understanding the logic of gene networks during development is necessary for a complete understanding of how genes contribute to phenotype. Pioneering work initiated in the sea urchin and Drosophila has demonstrated that reasonable transcriptional regulatory network diagrams representing early development in multicellular animals can be generated through use of appropriate genomic, genetic, and biochemical tools. Establishment of similar regulatory network diagrams for vertebrate development is a necessary step. The amphibian Xenopus has long been used as a model for vertebrate early development and has contributed greatly to the elucidation of gene regulation. Because the best and most extensively studied transcriptional regulatory network in Xenopus is that underlying the formation and function of Spemann's organizer, we describe the current status of our understanding of this gene regulatory network and its relationship to mesodermal patterning. Seventy-four transcription factors currently known to be expressed in the mesoendoderm of Xenopus gastrula were characterized according to their modes of action, DNA binding consensus sequences, and target genes. Among them, nineteen transcription factors were characterized in sufficient detail, allowing us to generate a gene regulatory network diagram. Additionally, we discuss recent amphibian work using a combined DNA microarray and bioinformatics approach that promises to accelerate regulatory network studies.
Affiliation(s)
- Tetsuya Koide
- Developmental Biology Center and the Department of Developmental and Cell Biology, University of California, Irvine, CA 92697-2300, USA
|
126
|
Dermitzakis ET, Reymond A, Antonarakis SE. Conserved non-genic sequences: an unexpected feature of mammalian genomes. Nat Rev Genet 2005; 6:151-7. [PMID: 15716910 DOI: 10.1038/nrg1527] [Citation(s) in RCA: 211] [Impact Index Per Article: 11.1] [Indexed: 01/27/2023]
Abstract
Mammalian genomes contain highly conserved sequences that are not functionally transcribed. These sequences are single copy and comprise approximately 1-2% of the human genome. Evolutionary analysis strongly supports their functional conservation, although their potentially diverse, functional attributes remain unknown. It is likely that genomic variation in conserved non-genic sequences is associated with phenotypic variability and human disorders. So how might their function and contribution to human disorders be examined?
Affiliation(s)
- Emmanouil T Dermitzakis
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
|
127
|
Ovcharenko I, Loots GG, Nobrega MA, Hardison RC, Miller W, Stubbs L. Evolution and functional classification of vertebrate gene deserts. Genome Res 2004; 15:137-45. [PMID: 15590943 PMCID: PMC540279 DOI: 10.1101/gr.3015505] [Citation(s) in RCA: 192] [Impact Index Per Article: 9.6] [Indexed: 11/24/2022]
Abstract
Large tracts of the human genome, known as gene deserts, are devoid of protein-coding genes. Dichotomy in their level of conservation with chicken separates these regions into two distinct categories, stable and variable. The separation is not caused by differences in rates of neutral evolution but instead appears to be related to different biological functions of stable and variable gene deserts in the human genome. Gene Ontology categories of the adjacent genes are strongly biased toward transcriptional regulation and development for the stable gene deserts, and toward distinctively different functions for the variable gene deserts. Stable gene deserts resist chromosomal rearrangements and appear to harbor multiple distant regulatory elements physically linked to their neighboring genes, with the linearity of conservation invariant throughout vertebrate evolution.
Affiliation(s)
- Ivan Ovcharenko
- Energy, Environment, Biology, and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
|
128
|
|
129
|
In Brief. Nat Rev Genet 2004. [DOI: 10.1038/nrg1509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
130
|
Mice do fine without 'junk DNA'. Nature 2004. [DOI: 10.1038/news041018-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|