1
Kang LF, Zhu ZL, Zhao Q, Chen LY, Zhang Z. Newly evolved introns in human retrogenes provide novel insights into their evolutionary roles. BMC Evol Biol 2012; 12:128. [PMID: 22839428 PMCID: PMC3565874 DOI: 10.1186/1471-2148-12-128] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 03/30/2012] [Accepted: 07/19/2012] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND Retrogenes generally do not contain introns. However, in some instances, retrogenes may recruit internal exonic sequences as introns, which is known as intronization. A retrogene that undergoes intronization is a good model with which to investigate the origin of introns. Nevertheless, previously, only two cases in vertebrates have been reported. RESULTS In this study, we systematically screened the human (Homo sapiens) genome for retrogenes that evolved introns and analyzed their patterns in structure, expression and origin. In total, we identified nine intron-containing retrogenes. Alignment of pairs of retrogenes and their parents indicated that, in addition to intronization (five cases), retrogenes also may have gained introns by insertion of external sequences into the genes (one case) or reversal of the orientation of transcription (three cases). Interestingly, many intronizations were promoted not by base substitutions but by cryptic splice sites, which were silent in the parental genes but active in the retrogenes. We also observed that the majority of introns generated by intronization did not involve frameshifts. CONCLUSIONS Intron gains in retrogenes are not as rare as previously thought. Furthermore, diverse mechanisms may lead to intron creation in retrogenes. The activation of cryptic splice sites in the intronization of retrogenes may be triggered by the change of gene structure after retroposition. A high percentage of non-frameshift introns in retrogenes may be because non-frameshift introns do not dramatically affect host proteins. Introns generated by intronization in human retrogenes are generally young, which is consistent with previous findings for Caenorhabditis elegans. Our results provide novel insights into the evolutionary role of introns.
Affiliation(s)
- Li-Fang Kang
- College of Life Sciences, Chongqing University, Chongqing 400044, China
2
Changes of globin expression in the Japanese medaka (Oryzias latipes) in response to acute and chronic hypoxia. J Comp Physiol B 2010; 181:199-208. [PMID: 20963423 DOI: 10.1007/s00360-010-0518-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 05/28/2010] [Revised: 09/25/2010] [Accepted: 10/05/2010] [Indexed: 11/27/2022]
Abstract
Fishes live in an aquatic environment with low or temporally changing O₂ availability. Variations in O₂ levels require many anatomical, behavioral, physiological, and biochemical adaptations that ensure the uptake of an adequate amount of O₂. Some fish species are comparatively well adapted to tolerate low O₂ partial pressure (hypoxia). The Japanese ricefish medaka (Oryzias latipes) is an important model organism for biomedical research that shows remarkable tolerance towards hypoxia. We have investigated the regulation and role of globins under hypoxia. We applied four different regimes of chronic hypoxia (24 and 48 h at PO₂ = 2 or 4 kPa) as well as acute hypoxia (2 h at PO₂ = 0.5 kPa) to adult male medaka. Changes of mRNA levels of seven globin genes (adult hemoglobin α and β, myoglobin, neuroglobin, cytoglobin 1 and 2, globin X), three hypoxia-response genes (lactate dehydrogenase b, phosphoglycerate kinase, adrenomedullin 1) and two putative reference genes (cyclophilin, acidic ribosomal phosphoprotein P0) were monitored by means of quantitative real-time reverse-transcription PCR. We observed strong upregulation of myoglobin, which is also expressed in the medaka brain, as previously demonstrated for carp, goldfish and zebrafish. The hemoglobin chains were found upregulated, whereas earlier studies found down-regulation of hemoglobin in hypoxic zebrafish. By contrast, neuroglobin mRNA was not affected by hypoxia in medaka, but had been found upregulated in zebrafish. Globin X is induced in medaka brain, but down-regulated in zebrafish. Thus, the patterns of hypoxia response of globins are strikingly different in various fish species, which can be interpreted as an indication for different roles of the various globins in hypoxia response and for alternative metabolic strategies of fish species in coping with O₂ deprivation.
3
Zhang YE, Vibranovski MD, Landback P, Marais GAB, Long M. Chromosomal redistribution of male-biased genes in mammalian evolution with two bursts of gene gain on the X chromosome. PLoS Biol 2010; 8. [PMID: 20957185 PMCID: PMC2950125 DOI: 10.1371/journal.pbio.1000494] [Citation(s) in RCA: 152] [Impact Index Per Article: 10.9] [Received: 03/15/2010] [Accepted: 08/16/2010] [Indexed: 01/20/2023] Open
Abstract
Mammalian X chromosomes evolved under various mechanisms including sexual antagonism, the faster-X process, and meiotic sex chromosome inactivation (MSCI). These forces may contribute to nonrandom chromosomal distribution of sex-biased genes. In order to understand the evolution of gene content on the X chromosome and autosome under these forces, we dated human and mouse protein-coding genes and miRNA genes on the vertebrate phylogenetic tree. We found that the X chromosome recently acquired a burst of young male-biased genes, which is consistent with fixation of recessive male-beneficial alleles by sexual antagonism. For genes originating earlier, however, this pattern diminishes and finally reverses with an overrepresentation of the oldest male-biased genes on autosomes. MSCI contributes to this dynamic since it silences X-linked old genes but not X-linked young genes. This demasculinization process seems to be associated with feminization of the X chromosome with more X-linked old genes expressed in ovaries. Moreover, we detected another burst of gene originations after the split of eutherian mammals and opossum, and these genes were quickly incorporated into transcriptional networks of multiple tissues. Preexisting X-linked genes also show significantly higher protein-level evolution during this period compared to autosomal genes, suggesting positive selection accompanied the early evolution of mammalian X chromosomes. These two findings cast new light on the evolutionary history of the mammalian X chromosome in terms of gene gain, sequence, and expressional evolution.
Affiliation(s)
- Yong E. Zhang
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
- Maria D. Vibranovski
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
- Patrick Landback
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
- Gabriel A. B. Marais
- Université de Lyon, Centre National de la Recherche Scientifique, Laboratoire de Biométrie et Biologie évolutive, Villeurbanne, France
- Manyuan Long
- Department of Ecology and Evolution, the University of Chicago, Chicago, Illinois, United States of America
4
Eisermann K, Tandon S, Bazarov A, Brett A, Fraizer G, Piontkivska H. Evolutionary conservation of zinc finger transcription factor binding sites in promoters of genes co-expressed with WT1 in prostate cancer. BMC Genomics 2008; 9:337. [PMID: 18631392 PMCID: PMC2515153 DOI: 10.1186/1471-2164-9-337] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 01/22/2008] [Accepted: 07/16/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene expression analyses have led to a better understanding of growth control of prostate cancer cells. We and others have identified the presence of several zinc finger transcription factors in the neoplastic prostate, suggesting a potential role for these genes in the regulation of the prostate cancer transcriptome. One of the transcription factors (TFs) identified in the prostate cancer epithelial cells was the Wilms tumor gene (WT1). To rapidly identify coordinately expressed prostate cancer growth control genes that may be regulated by WT1, we used an in silico approach. RESULTS Evolutionarily conserved transcription factor binding sites (TFBS) recognized by WT1, EGR1, SP1, SP2, AP2 and GATA1 were identified in the promoters of 24 differentially expressed prostate cancer genes from eight mammalian species. To test the relationship between sequence conservation and function, chromatin of LNCaP prostate cancer and kidney 293 cells was tested for TF binding using chromatin immunoprecipitation (ChIP). Multiple putative TFBS in gene promoters of placental mammals were found to be shared with those in human gene promoters and some were conserved between genomes that diverged about 170 million years ago (i.e., primates and marsupials), therefore implicating these sites as candidate binding sites. Among those genes coordinately expressed with WT1 was the kallikrein-related peptidase 3 (KLK3) gene, commonly known as the prostate specific antigen (PSA) gene. This analysis located several potential WT1 TFBS in the PSA gene promoter and led to the rapid identification of a novel putative binding site confirmed in vivo by ChIP. Conversely, for two prostate growth control genes, androgen receptor (AR) and vascular endothelial growth factor (VEGF), known to be transcriptionally regulated by WT1, regulatory sequence conservation was observed and TF binding in vivo was confirmed by ChIP.
CONCLUSION Overall, this targeted approach rapidly identified important candidate WT1-binding elements in genes coordinately expressed with WT1 in prostate cancer cells, thus enabling a more focused functional analysis of the most likely target genes in prostate cancer progression. Identifying these genes will help to better understand how gene regulation is altered in these tumor cells.
Affiliation(s)
- Kurtis Eisermann
- School of Biomedical Sciences, Kent State University, Kent, Ohio, USA.
5
Kim H, Hurwitz B, Yu Y, Collura K, Gill N, SanMiguel P, Mullikin JC, Maher C, Nelson W, Wissotski M, Braidotti M, Kudrna D, Goicoechea JL, Stein L, Ware D, Jackson SA, Soderlund C, Wing RA. Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza. Genome Biol 2008; 9:R45. [PMID: 18304353 PMCID: PMC2374706 DOI: 10.1186/gb-2008-9-2-r45] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Received: 12/14/2007] [Revised: 02/12/2008] [Accepted: 02/28/2008] [Indexed: 01/31/2023] Open
Abstract
Bacterial artificial chromosome (BAC) fingerprint and end-sequenced physical maps representing the ten genome types of Oryza are presented. We describe the establishment and analysis of a genus-wide comparative framework composed of 12 bacterial artificial chromosome fingerprint and end-sequenced physical maps representing the 10 genome types of Oryza aligned to the O. sativa ssp. japonica reference genome sequence. Over 932 Mb of end sequence was analyzed for repeats, simple sequence repeats, miRNA and single nucleotide variations, providing the most extensive analysis of Oryza sequence to date.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA.
6
Congdon CB, Aman JC, Nava GM, Gaskins HR, Mattingly CJ. An evaluation of information content as a metric for the inference of putative conserved noncoding regions in DNA sequences using a genetic algorithms approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:1-14. [PMID: 18245871 DOI: 10.1109/tcbb.2007.1059] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/25/2023]
Abstract
In previous work, we presented GAMI, an approach to motif inference that uses a genetic algorithms search. GAMI is designed specifically to find putative conserved regulatory motifs in noncoding regions of divergent species, and is designed to allow for analysis of long nucleotide sequences. In this work, we compare GAMI's performance when run with its original fitness function (a simple count of the number of matches) and when run with information content, as well as several variations on these metrics. Results indicate that information content does not identify highly conserved regions, and thus is not the appropriate metric for this task, while variations on information content as well as the original metric succeed in identifying putative conserved regions.
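To make the two kinds of fitness measure concrete, the following is a minimal sketch, not GAMI's actual implementation. It assumes the textbook Shannon definition of information content with a uniform A/C/G/T background and no small-sample correction, and `match_count` is a hypothetical reading of "a simple count of the number of matches". The example sites are made up for illustration.

```python
import math
from collections import Counter


def column_information(column):
    """Information content (bits) of one alignment column over A, C, G, T.

    Assumes a uniform background, so a fully conserved column scores 2 bits.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def motif_information(sites):
    """Sum of per-column information content for equal-length aligned sites."""
    return sum(column_information(col) for col in zip(*sites))


def match_count(candidate, sites):
    """Hypothetical match-count score: positions where a site agrees with the candidate."""
    return sum(a == b for site in sites for a, b in zip(candidate, site))


# Illustrative, made-up sites (one per species), not data from the paper.
sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTAA"]
print(motif_information(sites))
print(match_count("TGACTCA", sites))
```

Both scores reward agreement between sites, but information content depends only on the base frequencies within each column, whereas the match count is measured against a specific candidate motif.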
Affiliation(s)
- Clare Bates Congdon
- Computer Science Department, University of Southern Maine, Portland 04104, USA.
7
Porterfield VM, Piontkivska H, Mintz EM. Identification of novel light-induced genes in the suprachiasmatic nucleus. BMC Neurosci 2007; 8:98. [PMID: 18021443 PMCID: PMC2216081 DOI: 10.1186/1471-2202-8-98] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Received: 08/29/2007] [Accepted: 11/19/2007] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND The transmission of information about the photic environment to the circadian clock involves a complex array of neurotransmitters, receptors, and second messenger systems. Exposure of an animal to light during the subjective night initiates rapid transcription of a number of immediate-early genes in the suprachiasmatic nucleus of the hypothalamus. Some of these genes have known roles in entraining the circadian clock, while others have unknown functions. Using laser capture microscopy, microarray analysis, and quantitative real-time PCR, we performed a comprehensive screen for changes in gene expression immediately following a 30 minute light pulse in the suprachiasmatic nucleus of mice. RESULTS The results of the microarray screen successfully identified previously known light-induced genes as well as several novel genes that may be important in the circadian clock. Newly identified light-induced genes include early growth response 2, proviral integration site 3, growth-arrest and DNA-damage-inducible 45 beta, and TCDD-inducible poly(ADP-ribose) polymerase. Comparative analysis of promoter sequences revealed the presence of evolutionarily conserved CRE and associated TATA box elements in most of the light-induced genes, while other core clock genes generally lack this combination of promoter elements. CONCLUSION The photic signalling cascade in the suprachiasmatic nucleus activates an array of immediate-early genes, most of which have unknown functions in the circadian clock. Detected evolutionary conservation of CRE and TATA box elements in promoters of light-induced genes suggests that the functional role of these elements has likely remained the same over evolutionary time across mammalian orders.
8
Amemiya CT, Gomez-Chiarri M. Comparative genomics in vertebrate evolution and development. J Exp Zool A Comp Exp Biol 2006; 305:672-82. [PMID: 16902957 DOI: 10.1002/jez.a.308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
The vast quantities of publicly available DNA sequencing data and genome resources are enabling biologists to investigate age-old problems in biology that were not addressable previously. In this review, we discuss how comparative genomics is practiced and how the data can be used to make biological inferences with respect to vertebrate evolution and development. Examples are taken from the well-known HOX clusters, which are always a high-priority target for genomic analyses due to their inferred role in the evolution of metazoans. In addition, we briefly discuss the application of genomic approaches to problems in comparative endocrinology.
Affiliation(s)
- Chris T Amemiya
- Molecular Genetics Program, Benaroya Research Institute at Virginia Mason, Seattle, Washington 98101, USA.
9
Hou HH, Kuo MYP, Luo YW, Chang BE. Recapitulation of human betaB1-crystallin promoter activity in transgenic zebrafish. Dev Dyn 2006; 235:435-43. [PMID: 16331646 DOI: 10.1002/dvdy.20652] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022] Open
Abstract
Development of the eye is morphologically similar among vertebrates, indicating that the underlying mechanism regulating the process may have been highly conserved during evolution. Herein we analyzed the promoter of the human betaB1-crystallin gene in zebrafish by transgenic experiments. To delineate the evolutionarily conserved regulatory elements, we performed serial deletion assays in the promoter region. The results demonstrated that the -90/+61-bp upstream proximal promoter region is sufficient to confer lens-tissue specificity to the human betaB1-crystallin gene in transgenic zebrafish. Through phylogenetic sequence comparisons and an electrophoretic mobility shift assay (EMSA), a highly conserved cis-element of a six-base pair sequence TG(A/C)TGA, the consensus sequence for the Maf protein binding site, within the proximal promoter region was revealed. Further, a site-mutational assay showed that this element is crucial for promoter activity. These data suggest that the fundamental transcriptional regulatory mechanism of the betaB1-crystallin gene has been well conserved between humans and zebrafish, and plausibly among all vertebrates, during evolution.
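As an illustration of how a degenerate consensus such as TG(A/C)TGA can be located in a promoter sequence, here is a minimal sketch using a regular expression on both strands. It is not the authors' method (they used phylogenetic comparison and EMSA); the function names and the test sequence are hypothetical.

```python
import re

# TG(A/C)TGA written as a regex; the lookahead also reports overlapping hits.
MAF_CONSENSUS = re.compile(r"(?=(TG[AC]TGA))")
MOTIF_LENGTH = 6


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq):
    """List (start, strand, motif) hits; start is a 0-based forward-strand coordinate.

    For '-' hits the motif is reported as read on the reverse strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MAF_CONSENSUS.finditer(seq)]
    for m in MAF_CONSENSUS.finditer(reverse_complement(seq)):
        # Map the reverse-strand start back to the leftmost forward-strand base.
        hits.append((len(seq) - m.start() - MOTIF_LENGTH, "-", m.group(1)))
    return sorted(hits)


# Made-up sequence for illustration only, not the betaB1-crystallin promoter.
print(find_sites("AATGATGACCTCAGCAGG"))
```

A pattern match of this kind only nominates candidate sites; whether a site is functional still has to be tested experimentally, as the deletion and site-mutation assays in the abstract do.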
Affiliation(s)
- Hsin-Han Hou
- Graduate Institute of Oral Biology, College of Medicine, National Taiwan University, Taipei, Taiwan, Republic of China
10
Barnes DW, Mattingly CJ, Parton A, Dowell LM, Bayne CJ, Forrest JN. Marine organism cell biology and regulatory sequence discovery in comparative functional genomics. Cytotechnology 2005; 46:123-37. [PMID: 19003267 PMCID: PMC3449718 DOI: 10.1007/s10616-005-1719-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 08/04/2005] [Accepted: 08/04/2005] [Indexed: 01/28/2023] Open
Abstract
The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure–function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5′-upstream, 3′-downstream and introns of genes. Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention.
Affiliation(s)
- David W Barnes
- Mount Desert Island Biological Laboratory, Center for Marine Functional Genomics Studies, P.O. Box 35, Old Bar Harbour Road, Salisbury Cove, ME 04672, USA
11
Raymond CK, Subramanian S, Paddock M, Qiu R, Deodato C, Palmieri A, Chang J, Radke T, Haugen E, Kas A, Waring D, Bovee D, Stacy R, Kaul R, Olson MV. Targeted, haplotype-resolved resequencing of long segments of the human genome. Genomics 2005; 86:759-66. [PMID: 16249066 DOI: 10.1016/j.ygeno.2005.08.013] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 05/05/2005] [Revised: 08/26/2005] [Accepted: 08/30/2005] [Indexed: 01/09/2023]
Abstract
Currently, challenges exist to acquire long-range (hundreds of kilobase pairs) phase-discriminated sequence across substantial numbers of individuals. We have developed a straightforward method for isolating and characterizing specific genomic regions in a haplospecific manner. Real-time PCR is carried out to STS content map and genotype pools of fosmid clones arrayed in 384-well microtiter plates. Single-nucleotide polymorphisms, microsatellite markers, and insertion-deletion polymorphisms are used to differentiate the target region into haplotype-specific tiling paths. DNA of clones from these tiling paths is retrieved from the library and either sequenced by standard shotgun methods or amplified in vitro and sequenced by a primer-based, directed method. This approach provides convenient access to complete, haplotype-resolved resequencing data from multiple individuals across tens to hundreds of thousands of basepairs. We illustrate its implementation with a detailed example of more than 400 kbp from the human CFTR region, across 15 individuals, and summarize our experience applying it to many other human loci.
Affiliation(s)
- Christopher K Raymond
- University of Washington Genome Center, Department of Medicine, University of Washington, Seattle, WA 98195, USA
12
Alsop AE, Miethke P, Rofe R, Koina E, Sankovic N, Deakin JE, Haines H, Rapkins RW, Marshall Graves JA. Characterizing the chromosomes of the Australian model marsupial Macropus eugenii (tammar wallaby). Chromosome Res 2005; 13:627-36. [PMID: 16170627 DOI: 10.1007/s10577-005-0989-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 05/25/2005] [Accepted: 07/05/2005] [Indexed: 11/26/2022]
Abstract
Marsupials occupy a phylogenetic middle ground that is very valuable in genome comparisons of mammal and other vertebrate species. For this reason, whole genome sequencing is being undertaken for two distantly related marsupial species, including the model kangaroo species Macropus eugenii (the tammar wallaby). As a first step towards the molecular characterization of the tammar genome, we present a detailed description of the tammar karyotype, report the development of a set of molecular anchor markers and summarize the comparative mapping data for this species.
Affiliation(s)
- Amber E Alsop
- ARC Centre for Kangaroo Genomics, Research School of Biological Sciences, The Australian National University, Canberra, ACT 2601, Australia.
13
Koelsch BU, Rajewsky MF, Kindler-Röhrborn A. A 6-Mb contig-based comparative gene and linkage map of the rat schwannoma tumor suppressor region at 10q32.3. Genomics 2005; 85:322-9. [PMID: 15718099 DOI: 10.1016/j.ygeno.2004.11.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/19/2004] [Accepted: 11/23/2004] [Indexed: 10/25/2022]
Abstract
Frequent genetic aberrations of malignant schwannomas induced by the alkylating agent N-ethyl-N-nitrosourea in hybrids from inbred BD rat strains include allelic imbalances of the telomeric 20 Mb of chromosome 5 (Dis-2) and of the telomeric 5 Mb of chromosome 10q32 (Dis-1) in 59 and 94% of the tumors, respectively. The Dis-1 minimal loss of heterozygosity consensus region extends from D10Rat4 to the telomere and harbors a putative tumor suppressor gene(s). We constructed a 6-Mb BAC/PAC contig containing more than 70 known genes, 18 mapped microsatellites, and further ESTs/reference RNAs. A continuous block of strongly conserved synteny with mouse chromosome 11E2 and human chromosome 17q25.3 was found. Combining the sequence information from the rat and closely related syntenic regions of different mammalian species produces nearly complete gene maps as a basis for a positional candidate approach and gives insight into mammalian genomic evolution.
Affiliation(s)
- Bernd U Koelsch
- Department of Neuropathology, University of Bonn Medical Center, Sigmund-Freud-Strasse 25, 53105 Bonn, Germany
14
Hughes JR, Cheng JF, Ventress N, Prabhakar S, Clark K, Anguita E, De Gobbi M, de Jong P, Rubin E, Higgs DR. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences. Proc Natl Acad Sci U S A 2005; 102:9830-5. [PMID: 15998734 PMCID: PMC1174996 DOI: 10.1073/pnas.0503401102] [Citation(s) in RCA: 110] [Impact Index Per Article: 5.8] [Indexed: 01/25/2023] Open
Abstract
An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of approximately 238kb containing the human alpha globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs.
Affiliation(s)
- Jim R Hughes
- Medical Research Council Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford OX3 9DS, United Kingdom
15
de Souza FSJ, Santangelo AM, Bumaschny V, Avale ME, Smart JL, Low MJ, Rubinstein M. Identification of neuronal enhancers of the proopiomelanocortin gene by transgenic mouse analysis and phylogenetic footprinting. Mol Cell Biol 2005; 25:3076-86. [PMID: 15798195 PMCID: PMC1069613 DOI: 10.1128/mcb.25.8.3076-3086.2005] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.8] [Indexed: 12/17/2022] Open
Abstract
The proopiomelanocortin (POMC) gene is expressed in the pituitary and arcuate neurons of the hypothalamus. POMC arcuate neurons play a central role in the control of energy homeostasis, and rare loss-of-function mutations in POMC cause obesity. Moreover, POMC is the prime candidate gene within a highly significant quantitative trait locus on chromosome 2 associated with obesity traits in several human populations. Here, we identify two phylogenetically conserved neuronal POMC enhancers designated nPE1 (600 bp) and nPE2 (150 bp) located approximately 10 to 12 kb upstream of mammalian POMC transcriptional units. We show that mouse or human genomic regions containing these enhancers are able to direct reporter gene expression to POMC hypothalamic neurons, but not the pituitary of transgenic mice. Conversely, deletion of nPE1 and nPE2 in the context of the entire transcriptional unit of POMC abolishes transgene expression in the hypothalamus without affecting pituitary expression. Our results indicate that the nPEs are necessary and sufficient for hypothalamic POMC expression and that POMC expression in the brain and pituitary is controlled by independent sets of enhancers. Our study advances the understanding of the molecular nature of hypothalamic POMC neurons and will be useful to determine whether polymorphisms in POMC regulatory regions play a role in the predisposition to obesity.
16
Margulies EH, Maduro VVB, Thomas PJ, Tomkins JP, Amemiya CT, Luo M, Green ED. Comparative sequencing provides insights about the structure and conservation of marsupial and monotreme genomes. Proc Natl Acad Sci U S A 2005; 102:3354-9. [PMID: 15718282 PMCID: PMC549084 DOI: 10.1073/pnas.0408539102] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Received: 08/30/2004] [Indexed: 11/18/2022] Open
Abstract
Sequencing and comparative analyses of genomes from multiple vertebrates are providing insights about the genetic basis for biological diversity. To date, these efforts largely have focused on eutherian mammals, chicken, and fish. In this article, we describe the generation and study of genomic sequences from noneutherian mammals, a group of species occupying unusual phylogenetic positions. A large sequence data set (totaling >5 Mb) was generated for the same orthologous region in three marsupial (North American opossum, South American opossum, and Australian tammar wallaby) and one monotreme (platypus) genomes. These ancient mammalian genomes are characterized by unusual architectural features with respect to G + C and repeat content, as well as compression relative to human. Approximately 14% and 34% of the human sequence forms alignments with the orthologous sequence from platypus and the marsupials, respectively; these numbers are distinctly lower than that observed with nonprimate eutherian mammals (45-70%). The alignable sequences between human and each marsupial species are not completely overlapping (only 80% common to all three species) nor are the platypus-alignable sequences completely contained within the marsupial-alignable sequences. Phylogenetic analysis of synonymous coding positions reveals that platypus has a notably long branch length, with the human-platypus substitution rate being on average 55% greater than that seen with human-marsupial pairs. Finally, analyses of the major mammalian lineages reveal distinct patterns with respect to the common presence of evolutionarily conserved vertebrate sequences. Our results confirm that genomic sequence from noneutherian mammals can contribute uniquely to unraveling the functional and evolutionary histories of the mammalian genome.
Affiliation(s)
- Elliott H Margulies
- Genome Technology Branch and NISC, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
17
|
Abstract
The relatively new field of phylogenomics is beginning to reveal the potential of genomic data for evolutionary studies. As the cost of whole genome sequencing falls, anticipation of complete genome sequences from divergent species, reflecting the major lineages of modern mammals, is no longer a distant dream. In this article, we describe how comparative genomic data from mammals is progressing to resolve long-standing phylogenetic controversies, to refine dogma on how chromosomes evolve and to guide annotation of human and other vertebrate genomes.
Affiliation(s)
- William J Murphy
- Basic Research Laboratory, SAIC-Frederick, Laboratory of Genomic Diversity, National Cancer Institute, Frederick, MD 21702, USA.
|
18
|
Abstract
We describe a multiple alignment program named MAP2 based on a generalized pairwise global alignment algorithm for handling long, different intergenic and intragenic regions in genomic sequences. The MAP2 program produces an ordered list of local multiple alignments of similar regions among sequences, where different regions between local alignments are indicated by reporting only similar regions. We propose two similarity measures for the evaluation of the performance of MAP2 and existing multiple alignment programs. Experimental results produced by MAP2 on four real sets of orthologous genomic sequences show that MAP2 rarely missed a block of transitively similar regions and that MAP2 never produced a block of regions that are not transitively similar. Experimental results by MAP2 on six simulated data sets show that MAP2 found the boundaries between similar and different regions precisely. This feature is useful for finding conserved functional elements in genomic sequences. The MAP2 program is freely available in source code form at http://bioinformatics.iastate.edu/aat/sas.html for academic use.
Affiliation(s)
- Xiaoqiu Huang
- To whom correspondence should be addressed. Tel: +1 515 294 2432; Fax: +1 515 294 0258;
|
19
|
Abstract
The utility of DNA sequence information for phylogenetics and phylogeography is now well known. Rather than attempt to summarize studies addressing this well-demonstrated utility, this chapter focuses on fundamental approaches and techniques that implement the collection of DNA sequence data for comparative phylogenetic purposes in a genomic context (phylogenomics). Whole genome sequencing approaches have changed the way we think about phylogenetics and have opened the way for new perspectives on "old" phylogenetics concerns. Some of these concerns are which gene regions to use and how much sequence information is needed for robust phylogenetic inference. Whole genome sequences of a few animal model organisms have gone a long way to implement approaches to better understand these important phylogenetic concerns. This chapter also addresses how genomics has made it more important for a clear understanding of orthology of gene regions in comparative biology. Finally, genome-enabled technologies that are affecting comparative biology are also discussed.
Affiliation(s)
- Rob DeSalle
- Department of Invertebrate Zoology, American Museum of Natural History, New York, New York 10024, USA
|
20
|
Roesner A, Fuchs C, Hankeln T, Burmester T. A globin gene of ancient evolutionary origin in lower vertebrates: evidence for two distinct globin families in animals. Mol Biol Evol 2004; 22:12-20. [PMID: 15356282 DOI: 10.1093/molbev/msh258] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023] Open
Abstract
Hemoglobin, myoglobin, neuroglobin, and cytoglobin are four types of vertebrate globins with distinct tissue distributions and functions. Here, we report the identification of a fifth and novel globin gene from fish and amphibians, which has apparently been lost in the evolution of higher vertebrates (Amniota). Because its function is presently unknown, we tentatively call it globin X (GbX). Globin X sequences were obtained from three fish species, the zebrafish Danio rerio, the goldfish Carassius auratus, and the pufferfish Tetraodon nigroviridis, and the clawed frog Silurana tropicalis. Globin X sequences are distinct from vertebrate hemoglobins, myoglobins, neuroglobins, and cytoglobins. Globin X displays the highest identity scores with neuroglobin (approximately 26% to 35%), although it is not a neuronal protein, as revealed by RT-PCR experiments on goldfish RNA from various tissues. The distal ligand-binding and the proximal heme-binding histidines (E7 and F8), as well as the conserved phenylalanine CD1 are present in the globin X sequences, but because of extensions at the N-terminal and C-terminal, the globin X proteins are longer than the typical eight alpha-helical globins and comprise about 200 amino acids. In addition to the conserved globin introns at helix positions B12.2 and G7.0, the globin X genes contain two introns in E10.2 and H10.0. The intron in E10.2 is shifted by 1 bp in respect to the vertebrate neuroglobin gene (E11.0), providing possible evidence for an intron sliding event. Phylogenetic analyses confirm an ancient evolutionary relationship of globin X with neuroglobin and suggest the existence of two distinct globin types in the last common ancestor of Protostomia and Deuterostomia.
Affiliation(s)
- Anja Roesner
- Institute of Zoology, Johannes Gutenberg University, Mainz, Germany
|
21
|
Everts-van der Wind A, Kata SR, Band MR, Rebeiz M, Larkin DM, Everts RE, Green CA, Liu L, Natarajan S, Goldammer T, Lee JH, McKay S, Womack JE, Lewin HA. A 1463 gene cattle-human comparative map with anchor points defined by human genome sequence coordinates. Genome Res 2004; 14:1424-37. [PMID: 15231756 PMCID: PMC442159 DOI: 10.1101/gr.2554404] [Citation(s) in RCA: 108] [Impact Index Per Article: 5.4] [Indexed: 11/24/2022]
Abstract
A second-generation 5000 rad radiation hybrid (RH) map of the cattle genome was constructed primarily using cattle ESTs that were targeted to gaps in the existing cattle-human comparative map, as well as to sparsely populated map intervals. A total of 870 targeted markers were added, bringing the number of markers mapped on the RH(5000) panel to 1913. Of these, 1463 have significant BLASTN hits (E < e(-5)) against the human genome sequence. A cattle-human comparative map was created using human genome sequence coordinates of the paired orthologs. One-hundred and ninety-five conserved segments (defined by two or more genes) were identified between the cattle and human genomes, of which 31 are newly discovered and 34 were extended singletons on the first-generation map. The new map represents an improvement of 20% genome-wide comparative coverage compared with the first-generation map. Analysis of gene content within human genome regions where there are gaps in the comparative map revealed gaps with both significantly greater and significantly lower gene content. The new, more detailed cattle-human comparative map provides an improved resource for the analysis of mammalian chromosome evolution, the identification of candidate genes for economically important traits, and for proper alignment of sequence contigs on cattle chromosomes.
|
22
|
Abstract
Zebrafish have emerged as a useful vertebrate model system in which unbiased large-scale screens have revealed hundreds of mutations affecting vertebrate development. Many zebrafish mutants closely resemble known human disorders, thus providing intriguing prospects for uncovering the genetic basis of human diseases and for the development of pharmacologic agents that inhibit or correct the progression of developmental disorders. The rapid pace of advances in genomic sequencing and map construction, in addition to morpholino targeting and transgenic techniques, have facilitated the identification and analysis of genes associated with zebrafish mutants, thus promoting the development of zebrafish as a model for human disorders. This review aims to illustrate how the zebrafish has been used to identify unknown genes, to assign function to known genes, and to delineate genetic pathways, all contributing valuable leads toward understanding human pathophysiology.
Affiliation(s)
- Trista E North
- Division of Hematology/Oncology, Department of Medicine, Children's Hospital of Boston, Enders Research Building, Boston, Massachusetts 02115, USA
|
23
|
Abstract
Interpreting the functional content of a given genomic sequence is one of the central challenges of biology today. Perhaps the most promising approach to this problem is based on the comparative method of classic biology in the modern guise of sequence comparison. For instance, protein-coding regions tend to be conserved between species. Hence, a simple method for distinguishing a functional exon from the chance absence of stop codons is to investigate its homologue from closely related species. Predicting regulatory elements is even more difficult than exon prediction, but again, comparisons pinpointing conserved sequence motifs upstream of translation start sites are helping to unravel gene regulatory networks. In addition to interspecific studies, intraspecific sequence comparison yields insights into the evolutionary forces that have acted on a species in the past. Of particular interest here is the identification of selection events such as selective sweeps. Both intra- and interspecific sequence comparisons are based on a variety of computational methods, including alignment, phylogenetic reconstruction, and coalescent theory. This article surveys the biology and the central computational ideas applied in recent comparative genomics projects. We argue that the most fruitful method of understanding the functional content of genomes is to study them in the context of related genomic sequences. In particular, such a study may reveal selection, a fundamental pointer to biological relevance.
Affiliation(s)
- Bernhard Haubold
- Fachbereich Biotechnologie & Bioinformatik, Fachhochschule Weihenstephan, 85350 Freising, Germany.
|
24
|
Wystub S, Ebner B, Fuchs C, Weich B, Burmester T, Hankeln T. Interspecies comparison of neuroglobin, cytoglobin and myoglobin: Sequence evolution and candidate regulatory elements. Cytogenet Genome Res 2004; 105:65-78. [PMID: 15218260 DOI: 10.1159/000078011] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.3] [Received: 08/21/2003] [Accepted: 12/08/2003] [Indexed: 11/19/2022] Open
Abstract
Neuroglobin and cytoglobin are two novel members of the vertebrate globin family. Their physiological role is poorly understood, although both proteins bind oxygen reversibly and may be involved in cellular oxygen homeostasis. Here we investigate the selective constraints on coding and non-coding sequences of the neuroglobin and cytoglobin genes in human, mouse, rat and fish. Neuroglobin and cytoglobin are highly conserved, displaying very low levels of non-synonymous nucleotide substitutions. An oxygen supply function predicts distinct modes of gene regulation, involving hypoxia-responsive transcription factors. To detect conserved candidate regulatory elements, we compared the neuroglobin and cytoglobin genes in mammals and fish. The myoglobin gene was included to test if it also contains hypoxia-responsive regulatory elements. Long conserved non-coding sequences, indicative of gene-regulatory elements, were found in the cytoglobin and myoglobin, but not in the neuroglobin gene. Sequence comparison and experimental data allowed us to delimit upstream regions of the neuroglobin and cytoglobin genes that contain the putative promoters, defining candidate regulatory regions for functional tests. The neuroglobin and the myoglobin genes both lack conserved hypoxia-responsive elements (HREs) for transcriptional activation, but contain conserved hypoxia-inducible mRNA stabilization signals in their 3' untranslated regions. The cytoglobin gene, in contrast, harbors both conserved HREs and mRNA stabilization sites, strongly suggestive of an oxygen-dependent regulation.
Affiliation(s)
- S Wystub
- Institute of Molecular Genetics and Institute of Zoology, Johannes Gutenberg University Mainz, Mainz, Germany
|
25
|
Mattingly C, Parton A, Dowell L, Rafferty J, Barnes D. Cell and Molecular Biology of Marine Elasmobranchs: Squalus acanthias and Raja erinacea. Zebrafish 2004; 1:111-20. [DOI: 10.1089/zeb.2004.1.111] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open
Affiliation(s)
- Angela Parton
- Mount Desert Island Biological Laboratories, Salisbury Cove, Maine
- Lori Dowell
- Mount Desert Island Biological Laboratories, Salisbury Cove, Maine
- Jason Rafferty
- Mount Desert Island Biological Laboratories, Salisbury Cove, Maine
- David Barnes
- Mount Desert Island Biological Laboratories, Salisbury Cove, Maine
|
26
|
van Hijum SAFT, Zomer AL, Kuipers OP, Kok J. Projector: automatic contig mapping for gap closure purposes. Nucleic Acids Res 2004; 31:e144. [PMID: 14602937 PMCID: PMC275581 DOI: 10.1093/nar/gng144] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Projector was designed for automatic positioning of contigs from an unfinished prokaryotic genome onto a template genome of a closely related strain or species. Projector mapped 84 contigs of Lactococcus lactis MG1363 (corresponding to 81% of the assembly nucleotides) against the genome of L.lactis IL1403. Ninety three percent of subsequent gap closure PCRs were successful. Moreover, a significant improvement in the N50 and N80 values (describing the assembly quality) was observed after the use of Projector. Because increasing numbers of bacterial genomes are being sequenced, Projector provides an efficient method to close a significant number of remaining gaps in the late stages of a genome sequencing project.
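For readers unfamiliar with the assembly statistics named in this abstract: N50 is commonly defined as the contig length L such that contigs of length L or longer together contain at least 50% of the assembled bases, and N80 is the same with an 80% threshold. The sketch below illustrates that common definition only; it is not taken from the Projector paper, and the function name `nx` and the contig lengths are hypothetical.

```python
def nx(contig_lengths, percent):
    """Return the Nx statistic under the common definition.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; the result is the length of the contig at which the
    running total first reaches `percent` percent of all assembled bases.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
lengths = [100, 80, 60, 40, 20]
n50 = nx(lengths, 50)  # 100 + 80 = 180 >= 150, so N50 is 80
n80 = nx(lengths, 80)  # 100 + 80 + 60 = 240 >= 240, so N80 is 60
print(n50, n80)
```

Under this definition, joining contigs across closed gaps replaces several short contigs with fewer long ones, which is why both values would be expected to rise after successful gap closure.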
Affiliation(s)
- Sacha A F T van Hijum
- Molecular Genetics, University of Groningen, Groningen Biomolecular Sciences and Biotechnology Institute, PO Box 14, 9750 AA Haren, The Netherlands.
|
27
|
Mattingly CJ, Colby GT, Rosenstein MC, Forrest JN, Boyer JL. Promoting comparative molecular studies in environmental health research: an overview of the comparative toxicogenomics database (CTD). Pharmacogenomics J 2004; 4:5-8. [PMID: 14735110 DOI: 10.1038/sj.tpj.6500225] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Affiliation(s)
- C J Mattingly
- Department of Bioinformatics, Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA.
|
28
|
Eggen A, Hocquette JF. Genomic approaches to economic trait loci and tissue expression profiling: application to muscle biochemistry and beef quality. Meat Sci 2004; 66:1-9. [DOI: 10.1016/s0309-1740(03)00020-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Received: 10/03/2002] [Accepted: 12/02/2002] [Indexed: 11/26/2022]
|
29
|
Fredriksson R, Höglund PJ, Gloriam DEI, Lagerström MC, Schiöth HB. Seven evolutionarily conserved human rhodopsin G protein-coupled receptors lacking close relatives. FEBS Lett 2003; 554:381-8. [PMID: 14623098 DOI: 10.1016/s0014-5793(03)01196-7] [Citation(s) in RCA: 199] [Impact Index Per Article: 9.5] [Indexed: 10/27/2022]
Abstract
We report seven new members of the superfamily of human G protein-coupled receptors (GPCRs) found by searches in the human genome databases, termed GPR100, GPR119, GPR120, GPR135, GPR136, GPR141, and GPR142. We also report 16 orthologues of these receptors in mouse, rat, fugu (pufferfish) and zebrafish. Phylogenetic analysis shows that these are additional members of the family of rhodopsin-type GPCRs. GPR100 shows similarity with the orphan receptor SALPR. Remarkably, the other receptors do not have any close relative among other known human rhodopsin-like GPCRs. Most of these orphan receptors are highly conserved through several vertebrate species and are present in single copies. Analysis of expressed sequence tag (EST) sequences indicated individual expression patterns, such as for GPR135, which was found in a wide variety of tissues including eye, brain, cervix, stomach and testis. Several ESTs for GPR141 were found in marrow and cancer cells, while the other receptors seem to have more restricted expression patterns.
Affiliation(s)
- Robert Fredriksson
- Department of Neuroscience, Uppsala University, BMC, Box 593, 751 24 Uppsala, Sweden
|
30
|
Larkin DM, Everts-van der Wind A, Rebeiz M, Schweitzer PA, Bachman S, Green C, Wright CL, Campos EJ, Benson LD, Edwards J, Liu L, Osoegawa K, Womack JE, de Jong PJ, Lewin HA. A cattle-human comparative map built with cattle BAC-ends and human genome sequence. Genome Res 2003; 13:1966-72. [PMID: 12902387 PMCID: PMC403790 DOI: 10.1101/gr.1560203] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
As a step toward the goal of adding the cattle genome to those available for multispecies comparative genome analysis, 40,224 cattle BAC clones were end-sequenced, yielding 60,547 sequences (BAC end sequences, BESs) after trimming with an average read length of 515 bp. Cattle BACs were anchored to the human and mouse genome sequences by BLASTN search, revealing 29.4% and 10.1% significant hits (E < e-5), respectively. More than 60% of all cattle BES hits in both the human and mouse genomes are located within known genes. In order to confirm in silico predictions of orthology and their relative position on cattle chromosomes, 84 cattle BESs with similarity to sequences on HSA11 were mapped using a cattle-hamster radiation hybrid (RH) panel. Resulting RH maps of BTA15 and BTA29 cover approximately 85% of HSA11 sequence, revealing a complex patchwork shuffling of segments not explained by a simple translocation followed by internal rearrangements. Overlay of the mouse conserved syntenies onto HSA11 revealed that segmental boundaries appear to be conserved in all three species. The BAC clone-based comparative map provides a foundation for the evolutionary analysis of mammalian karyotypes and for sequencing of the cattle genome.
Affiliation(s)
- Denis M Larkin
- Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801 USA
|
31
|
Schwartz S, Elnitski L, Li M, Weirauch M, Riemer C, Smit A, Green ED, Hardison RC, Miller W. MultiPipMaker and supporting tools: Alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res 2003; 31:3518-24. [PMID: 12824357 PMCID: PMC168985 DOI: 10.1093/nar/gkg579] [Citation(s) in RCA: 169] [Impact Index Per Article: 8.0] [Indexed: 11/12/2022] Open
Abstract
Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs include a stacked set of percent identity plots, called a MultiPip, comparing the reference sequence with subsequent sequences, and a nucleotide-level multiple alignment. New tools are provided to search MultiPipMaker output for conserved matches to a user-specified pattern and for conserved matches to position weight matrices that describe transcription factor binding sites (singly and in clusters). We illustrate the use of MultiPipMaker to identify candidate regulatory regions in WNT2 and then demonstrate by transfection assays that they are functional. Analysis of the alignments also confirms the phylogenetic inference that horses are more closely related to cats than to cows.
Affiliation(s)
- Scott Schwartz
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
|
32
|
Abstract
Changes in technology in the past decade have had such an impact on the way that molecular evolution research is done that it is difficult now to imagine working in a world without genomics or the Internet. In 1992, GenBank was less than a hundredth of its current size and was updated every three months on a huge spool of tape. Homology searches took 30 minutes and rarely found a hit. Now it is difficult to find sequences with only a few homologs to use as examples for teaching bioinformatics. For molecular evolution researchers, the genomics revolution has showered us with raw data and the information revolution has given us the wherewithal to analyze it. In broad terms, the most significant outcome from these changes has been our newfound ability to examine the evolution of genomes as a whole, enabling us to infer genome-wide evolutionary patterns and to identify subsets of genes whose evolution has been in some way atypical.
Affiliation(s)
- Kenneth H Wolfe
- Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin 2, Ireland.
|
33
|
Affiliation(s)
- S Blair Hedges
- NASA Astrobiology Institute and Department of Biology, 208 Mueller Laboratory, Pennsylvania State University, University Park, PA 16802-5301, USA
|
34
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2002. [PMCID: PMC2448432 DOI: 10.1002/cfg.119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|