1
|
Velandia-Huerto CA, Berkemer SJ, Hoffmann A, Retzlaff N, Romero Marroquín LC, Hernández-Rosales M, Stadler PF, Bermúdez-Santana CI. Orthologs, turn-over, and remolding of tRNAs in primates and fruit flies. BMC Genomics 2016; 17:617. [PMID: 27515907 PMCID: PMC4981973 DOI: 10.1186/s12864-016-2927-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/10/2016] [Accepted: 07/11/2016] [Indexed: 12/26/2022] Open
Abstract
Background: Transfer RNAs (tRNAs) are ubiquitous in all living organisms. They implement the genetic code so that most genomes contain distinct tRNAs for almost all 61 codons. They behave similarly to mobile elements and proliferate in genomes, spawning both local and non-local copies. Most tRNA families are therefore typically present as multicopy genes. The members of the individual tRNA families evolve under concerted or rapid birth-death evolution, so that paralogous copies maintain almost identical sequences over long evolutionary time-scales. To a good approximation these are functionally equivalent. Individual tRNA copies thus are evolutionarily unstable and easily turn into pseudogenes and disappear. This leads to a rapid turnover of tRNAs and often large differences in the tRNA complements of closely related species. Since tRNA paralogs are not distinguished by sequence, common methods cannot be used to establish orthology between tRNA genes. Results: In this contribution we introduce a general framework to distinguish orthologs and paralogs in gene families that are subject to concerted evolution. It is based on the use of uniquely aligned adjacent sequence elements as anchors to establish syntenic conservation of sequence intervals. In practice, anchors and intervals can be extracted from genome-wide multiple sequence alignments. Syntenic clusters of concertedly evolving genes of different families can then be subdivided by list alignments, leading to usually small clusters of candidate co-orthologs. On the basis of recent advances in phylogenetic combinatorics, these candidate clusters can be further processed by cograph editing to recover their duplication histories. We developed a workflow that can be conceptualized as stepwise refinement of a graph of homologous genes. We apply this analysis strategy with different types of synteny anchors to investigate the evolution of tRNAs in primates and fruit flies. We identified a large number of tRNA remolding events concentrated at the tips of the phylogeny. With one notable exception all phylogenetically old tRNA remoldings do not change the isoacceptor class. Conclusions: Gene families evolving under concerted evolution are not amenable to classical phylogenetic analyses since paralogs maintain identical, species-specific sequences, precluding the estimation of correct gene trees from sequence differences. This leaves conservation of syntenic arrangements with respect to “anchor elements” that are not subject to concerted evolution as the only viable source of phylogenetic information. We have demonstrated here that a purely synteny-based analysis of tRNA gene histories is indeed feasible. Although the choice of synteny anchors influences the resolution, in particular when tight gene clusters are present, and the quality of sequence alignments, genome assemblies, and genome rearrangements limits the scope of the analysis, largely coherent results can be obtained for tRNAs. In particular, we conclude that a large fraction of the tRNAs are recent copies. This proliferation is compensated by rapid pseudogenization as exemplified by many very recent alloacceptor remoldings. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2927-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Cristian A Velandia-Huerto
  - Biology Department, Universidad Nacional de Colombia, Carrera 45 # 26-85, Edif. Uriel Gutiérrez, Bogotá, D.C, Colombia
- Sarah J Berkemer
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, D-04107, Germany
- Anne Hoffmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, D-04107, Germany
- Nancy Retzlaff
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, D-04107, Germany
- Liliana C Romero Marroquín
  - Biology Department, Universidad Nacional de Colombia, Carrera 45 # 26-85, Edif. Uriel Gutiérrez, Bogotá, D.C, Colombia
- Maribel Hernández-Rosales
  - CONACYT - Instituto de Matemáticas, UNAM Juriquilla, Av. Juriquilla #3001, Santiago de Querétaro, MX-76230, QRO, México
- Peter F Stadler
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, D-04107, Germany
  - Fraunhofer Institut for Cell Therapy and Immunology, Perlickstraße 1, Leipzig, D-04103, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, Vienna, A-1090, Austria
  - Center for non-coding RNA in Technology and Health, Grønegårdsvej 3, Frederiksberg C, DK-1870, Denmark
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Clara I Bermúdez-Santana
  - Biology Department, Universidad Nacional de Colombia, Carrera 45 # 26-85, Edif. Uriel Gutiérrez, Bogotá, D.C, Colombia
|
2
|
HoxA Genes and the Fin-to-Limb Transition in Vertebrates. J Dev Biol 2016; 4:jdb4010010. [PMID: 29615578 PMCID: PMC5831813 DOI: 10.3390/jdb4010010] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/01/2015] [Revised: 01/27/2016] [Accepted: 02/04/2016] [Indexed: 12/12/2022] Open
Abstract
HoxA genes encode important DNA-binding transcription factors that act during limb development, regulating primarily gene expression and, consequently, morphogenesis and skeletal differentiation. Within these genes, HoxA11 and HoxA13 were proposed to have played an essential role in the enigmatic evolutionary transition from fish fins to tetrapod limbs. Indeed, comparative gene expression analyses led to the suggestion that changes in their regulation might have been essential for the diversification of vertebrates' appendages. In this review, we highlight three potential modifications in the regulation and function of these genes that may have boosted appendage evolution: (1) the expansion of polyalanine repeats in the HoxA11 and HoxA13 proteins; (2) the origin of a novel long-non-coding RNA with a possible inhibitory function on HoxA11; and (3) the acquisition of cis-regulatory elements modulating 5' HoxA transcription. We discuss the relevance of these mechanisms for appendage diversification, reviewing the current state of the art and performing additional comparative analyses to characterize, in a phylogenetic framework, HoxA11 and HoxA13 expression, alanine composition within the encoded proteins, long-non-coding RNAs and cis-regulatory elements.
|
3
|
Le Duc D, Renaud G, Krishnan A, Almén MS, Huynen L, Prohaska SJ, Ongyerth M, Bitarello BD, Schiöth HB, Hofreiter M, Stadler PF, Prüfer K, Lambert D, Kelso J, Schöneberg T. Kiwi genome provides insights into evolution of a nocturnal lifestyle. Genome Biol 2015; 16:147. [PMID: 26201466 PMCID: PMC4511969 DOI: 10.1186/s13059-015-0711-4] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 02/13/2015] [Accepted: 07/01/2015] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Kiwi, comprising five species from the genus Apteryx, are endangered, ground-dwelling bird species endemic to New Zealand. They are the smallest and only nocturnal representatives of the ratites. The timing of kiwi adaptation to a nocturnal niche and the genomic innovations, which shaped sensory systems and morphology to allow this adaptation, are not yet fully understood. RESULTS We sequenced and assembled the brown kiwi genome to 150-fold coverage and annotated the genome using kiwi transcript data and non-redundant protein information from multiple bird species. We identified evolutionary sequence changes that underlie adaptation to nocturnality and estimated the onset time of these adaptations. Several opsin genes involved in color vision are inactivated in the kiwi. We date this inactivation to the Oligocene epoch, likely after the arrival of the ancestor of modern kiwi in New Zealand. Genome comparisons between kiwi and representatives of ratites, Galloanserae, and Neoaves, including nocturnal and song birds, show diversification of kiwi's odorant receptors repertoire, which may reflect an increased reliance on olfaction rather than sight during foraging. Further, there is an enrichment of genes influencing mitochondrial function and energy expenditure among genes that are rapidly evolving specifically on the kiwi branch, which may also be linked to its nocturnal lifestyle. CONCLUSIONS The genomic changes in kiwi vision and olfaction are consistent with changes that are hypothesized to occur during adaptation to nocturnal lifestyle in mammals. The kiwi genome provides a valuable genomic resource for future genome-wide comparative analyses to other extinct and extant diurnal ratites.
Affiliation(s)
- Diana Le Duc
  - Institute of Biochemistry, Medical Faculty, University of Leipzig, Johannisallee 30, Leipzig, 04103, Germany
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, 04103, Germany
- Gabriel Renaud
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, 04103, Germany
- Arunkumar Krishnan
  - Department of Neuroscience, Unit of Functional Pharmacology, Uppsala University, Box 593, Husargatan 3, Uppsala, 751 24, Sweden
- Markus Sällman Almén
  - Department of Neuroscience, Unit of Functional Pharmacology, Uppsala University, Box 593, Husargatan 3, Uppsala, 751 24, Sweden
- Leon Huynen
  - Griffith School of Environment and School of Biomolecular and Physical Sciences, Griffith University, Nathan, Queensland, 4111, Australia
- Sonja J Prohaska
  - Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, 04103, Germany
- Matthias Ongyerth
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, 04103, Germany
- Bárbara D Bitarello
  - Department of Genetics and Evolutionary Biology, University of São Paulo, São Paulo, SP, 05508-090, Brazil
- Helgi B Schiöth
  - Department of Neuroscience, Unit of Functional Pharmacology, Uppsala University, Box 593, Husargatan 3, Uppsala, 751 24, Sweden
- Michael Hofreiter
  - Adaptive Evolutionary Genomics, Institute for Biochemistry and Biology, University Potsdam, Potsdam, 14469, Germany
- Peter F Stadler
  - Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, 04103, Germany
- Kay Prüfer
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, 04103, Germany
- David Lambert
  - Griffith School of Environment and School of Biomolecular and Physical Sciences, Griffith University, Nathan, Queensland, 4111, Australia
- Janet Kelso
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, 04103, Germany
- Torsten Schöneberg
  - Institute of Biochemistry, Medical Faculty, University of Leipzig, Johannisallee 30, Leipzig, 04103, Germany
|
4
|
|
5
|
Abstract
As more and more systems biology approaches are used to investigate the different types of biological macromolecules, increasing numbers of whole genomic studies are now available for a large array of organisms. Whether it is genomics, transcriptomics, proteomics, interactomics or metabolomics, the full complement of genomic information on all different levels can be juxtaposed between different organisms to reveal similarities or differences, and even to provide consensus models. At the intersection of comparative genomics and systems biology lies great possibility for discovery, analysis and prediction. This paper explores this nexus and the relationship from four general levels: DNA, RNA, protein and extragenomic. For each level, we provide an overview of the methods, discuss the potential challenges and survey the current research. Finally, we suggest some organizing principles and make proposals for new areas that will be important for future research.
Affiliation(s)
- Jimmy Lin
- Wilmer Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
|
6
|
Origin and functional diversification of an amphibian defense peptide arsenal. PLoS Genet 2013; 9:e1003662. [PMID: 23935531 PMCID: PMC3731216 DOI: 10.1371/journal.pgen.1003662] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 11/26/2012] [Accepted: 06/05/2013] [Indexed: 11/19/2022] Open
Abstract
The skin secretion of many amphibians contains an arsenal of bioactive molecules, including hormone-like peptides (HLPs) acting as defense toxins against predators, and antimicrobial peptides (AMPs) providing protection against infectious microorganisms. Several amphibian taxa seem to have independently acquired the genes to produce skin-secreted peptide arsenals, but it remains unknown how these originated from a non-defensive ancestral gene and evolved diverse defense functions against predators and pathogens. We conducted transcriptome, genome, peptidome and phylogenetic analyses to chart the full gene repertoire underlying the defense peptide arsenal of the frog Silurana tropicalis and reconstruct its evolutionary history. Our study uncovers a cluster of 13 transcriptionally active genes, together encoding up to 19 peptides, including diverse HLP homologues and AMPs. This gene cluster arose from a duplicated gastrointestinal hormone gene that attained a HLP-like defense function after major remodeling of its promoter region. Instead, new defense functions, including antimicrobial activity, arose by mutation of the precursor proteins, resulting in the proteolytic processing of secondary peptides alongside the original ones. Although gene duplication did not trigger functional innovation, it may have subsequently facilitated the convergent loss of the original function in multiple gene lineages (subfunctionalization), completing their transformation from HLP gene to AMP gene. The processing of multiple peptides from a single precursor entails a mechanism through which peptide-encoding genes may establish new functions without the need for gene duplication to avoid adaptive conflicts with older ones.
|
7
|
Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, Missal K, Mosig A, Müller B, Prohaska SJ, Stadler BMR, Stadler PF, Tanzer A, Washietl S, Witwer C. Evolutionary patterns of non-coding RNAs. Theory Biosci 2012; 123:301-69. [PMID: 18202870 DOI: 10.1016/j.thbio.2005.01.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 12/22/2004] [Accepted: 01/24/2005] [Indexed: 01/04/2023]
Abstract
A plethora of new functions of non-coding RNAs (ncRNAs) have been discovered in the past few years. In fact, RNA is emerging as the central player in cellular regulation, taking on active roles in multiple regulatory layers from transcription, RNA maturation, and RNA modification to translational regulation. Nevertheless, very little is known about the evolution of this "Modern RNA World" and its components. In this contribution, we attempt to provide at least a cursory overview of the diversity of ncRNAs and functional RNA motifs in non-translated regions of regular messenger RNAs (mRNAs) with an emphasis on evolutionary questions. This survey is complemented by an in-depth analysis of examples from different classes of RNAs focusing mostly on their evolution in the vertebrate lineage. We present a survey of Y RNA genes in vertebrates and study the molecular evolution of the U7 snRNA, the snoRNAs E1/U17, E2, and E3, the Y RNA family, the let-7 microRNA (miRNA) family, and the mRNA-like evf-1 gene. We furthermore discuss the statistical distribution of miRNAs in metazoans, which suggests an explosive increase in the miRNA repertoire in vertebrates. The analysis of the transcription of ncRNAs suggests that small RNAs in general are genetically mobile in the sense that their association with a host gene (e.g. when transcribed from introns of an mRNA) can change on evolutionary time scales. The let-7 family demonstrates that even the mode of transcription (as intron or as exon) can change among paralogous ncRNAs.
|
8
|
Yu H, Lindsay J, Feng ZP, Frankenberg S, Hu Y, Carone D, Shaw G, Pask AJ, O'Neill R, Papenfuss AT, Renfree MB. Evolution of coding and non-coding genes in HOX clusters of a marsupial. BMC Genomics 2012; 13:251. [PMID: 22708672 PMCID: PMC3541083 DOI: 10.1186/1471-2164-13-251] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 11/30/2011] [Accepted: 05/22/2012] [Indexed: 12/13/2022] Open
Abstract
Background: The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results: Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. MicroRNA target prediction showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions: This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.
Affiliation(s)
- Hongshi Yu
- ARC Centre of Excellence in Kangaroo Genomics, The University of Melbourne, Melbourne, Victoria 3010, Australia
|
9
|
Erb I, González-Vallinas JR, Bussotti G, Blanco E, Eyras E, Notredame C. Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Res 2012; 40:e52. [PMID: 22230796 PMCID: PMC3326335 DOI: 10.1093/nar/gkr1292] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/22/2023] Open
Abstract
We address the challenge of regulatory sequence alignment with a new method, Pro-Coffee, a multiple aligner specifically designed for homologous promoter regions. Pro-Coffee uses a dinucleotide substitution matrix estimated on alignments of functional binding sites from TRANSFAC. We designed a validation framework using several thousand families of orthologous promoters. This dataset was used to evaluate the accuracy for predicting true human orthologs among their paralogs. We found that whereas other methods achieve on average 73.5% accuracy, and 77.6% when trained on that same dataset, the figure goes up to 80.4% for Pro-Coffee. We then applied a novel validation procedure based on multi-species ChIP-seq data. Trained and untrained methods were tested for their capacity to correctly align experimentally detected binding sites. Whereas the average number of correctly aligned sites for two transcription factors is 284 for default methods and 316 for trained methods, Pro-Coffee achieves 331, 16.5% above the default average. We find a high correlation between a method's performance when classifying orthologs and its ability to correctly align proven binding sites. Not only has this interesting biological consequences, it also allows us to conclude that any method that is trained on the ortholog data set will result in functionally more informative alignments.
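The percentages quoted in this abstract can be checked with a short calculation. The sketch below is only an arithmetic illustration of the reported figures (284, 316 and 331 correctly aligned sites; 73.5% and 80.4% accuracy); the function and variable names are hypothetical and are not part of Pro-Coffee.

```python
def percent_gain(baseline, value):
    """Relative improvement of `value` over `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


# Figures quoted in the abstract (average number of correctly aligned sites).
default_sites = 284      # default (untrained) methods
trained_sites = 316      # methods trained on the ortholog data set
pro_coffee_sites = 331   # Pro-Coffee

gain_vs_default = percent_gain(default_sites, pro_coffee_sites)
gain_vs_trained = percent_gain(trained_sites, pro_coffee_sites)

# Difference in ortholog-classification accuracy, in percentage points.
accuracy_points = 80.4 - 73.5

print(f"gain over default methods: {gain_vs_default:.1f}%")
print(f"gain over trained methods: {gain_vs_trained:.1f}%")
print(f"accuracy difference: {accuracy_points:.1f} percentage points")
```

The first value reproduces the "16.5% above the default average" stated in the abstract; the other two are derived from the same quoted numbers and are not claims made by the authors.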
Affiliation(s)
- Ionas Erb
- Bioinformatics and Genomics program, Centre for Genomic Regulation and UPF, 08003 Barcelona, Spain
|
10
|
Bicego M, Dellaglio F, Felis GE. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences. J Bioinform Comput Biol 2011; 5:1069-85. [DOI: 10.1142/s0219720007003065] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/03/2006] [Revised: 07/02/2007] [Accepted: 07/08/2007] [Indexed: 11/18/2022]
Abstract
The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased the interest in the microbial taxonomy research area. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be carried out on the basis of protein-encoding nucleotide sequences or of the encoded amino acid sequences: the two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that integrates gene and protein information in order to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from the multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous branch whose power has already been demonstrated in other pattern recognition contexts. The proposed approach can be applied to distance matrix–based phylogenetic techniques (like neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence– and amino acid sequence–based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis.
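The idea of fusing two distance matrices before tree building can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the weighted mean of max-normalized matrices below is an assumption borrowed from generic multiclassifier practice; `normalize`, `fuse_distances`, the `weight` parameter and the toy matrices are all hypothetical.

```python
def normalize(matrix):
    """Scale a distance matrix so that its largest entry becomes 1.0."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


def fuse_distances(nucleotide, protein, weight=0.5):
    """Weighted mean of two normalized distance matrices (assumed rule)."""
    if len(nucleotide) != len(protein):
        raise ValueError("both matrices must cover the same taxa")
    a, b = normalize(nucleotide), normalize(protein)
    return [
        [weight * x + (1.0 - weight) * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Toy pairwise distances for three taxa (illustrative values only).
nuc = [[0.0, 0.2, 0.4], [0.2, 0.0, 0.4], [0.4, 0.4, 0.0]]
prot = [[0.0, 0.1, 0.1], [0.1, 0.0, 0.2], [0.1, 0.2, 0.0]]

fused = fuse_distances(nuc, prot)
for row in fused:
    print(row)
```

The fused matrix could then be passed to any distance-based tree-building method such as neighbor joining; normalizing first keeps one modality from dominating simply because its distances are on a larger scale.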
Affiliation(s)
- Manuele Bicego
  - Dip. di Economia Impresa e Regolamentazione, University of Sassari, via Torre Tonda, 34, 07100 Sassari, Italy
- Franco Dellaglio
  - Dip. Scientifico e Tecnologico, University of Verona, Strada Le Grazie, 15, 37135 Verona, Italy
- Giovanna E. Felis
  - Dip. Scientifico e Tecnologico, University of Verona, Strada Le Grazie, 15, 37135 Verona, Italy
|
11
|
Raincrow JD, Dewar K, Stocsits C, Prohaska SJ, Amemiya CT, Stadler PF, Chiu CH. Hox clusters of the bichir (Actinopterygii, Polypterus senegalus) highlight unique patterns of sequence evolution in gnathostome phylogeny. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution 2011; 316:451-64. [PMID: 21688387 DOI: 10.1002/jez.b.21420] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 02/02/2011] [Revised: 03/27/2011] [Accepted: 04/24/2011] [Indexed: 12/12/2022]
Abstract
Teleost fishes have extra Hox gene clusters owing to shared or lineage-specific genome duplication events in ray-finned fish (actinopterygian) phylogeny. Hence, extrapolating between genome function of teleosts and human or even between different fish species is difficult. We have sequenced and analyzed Hox gene clusters of the Senegal bichir (Polypterus senegalus), an extant representative of the most basal actinopterygian lineage. Bichir possesses four Hox gene clusters (A, B, C, D); phylogenetic analysis supports their orthology to the four Hox gene clusters of the gnathostome ancestor. We have generated a comprehensive database of conserved Hox noncoding sequences that include cartilaginous, lobe-finned, and ray-finned fishes (bichir and teleosts). Our analysis identified putative and known Hox cis-regulatory sequences with differing depths of conservation in Gnathostoma. We found that although bichir possesses four Hox gene clusters, its pattern of conservation of noncoding sequences is mosaic between outgroups, such as human, coelacanth, and shark, with four Hox gene clusters and teleosts, such as zebrafish and pufferfish, with seven or eight Hox gene clusters. Notably, bichir Hox gene clusters have been invaded by DNA transposons and this trend is further exemplified in teleosts, suggesting an as yet unrecognized mechanism of genome evolution that may explain Hox cluster plasticity in actinopterygians. Taken together, our results suggest that actinopterygian Hox gene clusters experienced a reduction in selective constraints that surprisingly predates the teleost-specific genome duplication.
Affiliation(s)
- Jeremy D Raincrow
- Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
|
12
|
Mannaert A, Amemiya CT, Bossuyt F. Comparative analyses of vertebrate posterior HoxD clusters reveal atypical cluster architecture in the caecilian Typhlonectes natans. BMC Genomics 2010; 11:658. [PMID: 21106068 PMCID: PMC3091776 DOI: 10.1186/1471-2164-11-658] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/11/2010] [Accepted: 11/24/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The posterior genes of the HoxD cluster play a crucial role in the patterning of the tetrapod limb. This region is under the control of a global, long-range enhancer that is present in all vertebrates. Variation in limb types, as is the case in amphibians, can probably not only be attributed to variation in Hox genes, but is likely to be the product of differences in gene regulation. With a collection of vertebrate genome sequences available today, we used a comparative genomics approach to study the posterior HoxD cluster of amphibians. A frog and a caecilian were included in the study to compare coding sequences as well as to determine the gain and loss of putative regulatory sequences. RESULTS We sequenced the posterior end of the HoxD cluster of a caecilian and performed comparative analyses of this region using HoxD clusters of other vertebrates. We determined the presence of conserved non-coding sequences and traced gains and losses of these footprints during vertebrate evolution, with particular focus on amphibians. We found that the caecilian HoxD cluster is almost three times larger than its mammalian counterpart. This enlargement is accompanied by the loss of one gene and the accumulation of repeats in that area. A similar phenomenon was observed in the coelacanth, where a different gene was lost and the area where the gene was lost has expanded. At least one phylogenetic footprint present in all vertebrates was lost in amphibians. This conserved region is a known regulatory element and functions as a boundary element in neural tissue to prevent expression of Hoxd genes. CONCLUSION The posterior part of the HoxD cluster of Typhlonectes natans is among the largest known today. The loss of Hoxd-12 and the expansion of the intergenic region may exert an influence on the limb enhancer, by having to bypass a distance seven times that of regular HoxD clusters. Whether or not there is a correlation with the loss of limbs remains to be investigated. These results, together with data on other vertebrates, show that the tetrapod Hox clusters are more variable than previously thought.
Affiliation(s)
- An Mannaert
  - Biology Department, ECOL, Amphibian Evolution Lab, Vrije Universiteit Brussel, Brussels, Belgium
- Chris T Amemiya
  - Benaroya Research Institute at Virginia Mason and University of Washington, Seattle, USA
- Franky Bossuyt
  - Biology Department, ECOL, Amphibian Evolution Lab, Vrije Universiteit Brussel, Brussels, Belgium
|
13
|
Evolution of conserved non-coding sequences within the vertebrate Hox clusters through the two-round whole genome duplications revealed by phylogenetic footprinting analysis. J Mol Evol 2010; 71:427-36. [PMID: 20981416 DOI: 10.1007/s00239-010-9396-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 07/27/2010] [Accepted: 09/17/2010] [Indexed: 02/01/2023]
Abstract
As a result of two rounds of whole genome duplication, four or more paralogous Hox clusters exist in vertebrate genomes. The paralogous genes in the Hox clusters show similar expression patterns, implying shared regulatory mechanisms for expression of these genes. Previous studies partly revealed the expression mechanisms of Hox genes. However, the cis-regulatory elements that control the expression of these paralogous genes are still poorly understood. Toward solving this problem, the authors searched for conserved non-coding sequences (CNSs), which are candidate cis-regulatory elements. When comparing orthologous Hox clusters of 19 vertebrate species, 208 intergenic conserved regions were found. The authors then searched for CNSs that were conserved not only between orthologous clusters but also among the four paralogous Hox clusters. The authors found three regions that are conserved among all four clusters and eight regions that are conserved between intergenic regions of two paralogous Hox clusters. In total, 28 CNSs were identified in the paralogous Hox clusters, and nine of them were newly found in this study. One of these novel regions bears a RARE motif. These CNSs are candidates for gene expression regulatory regions among paralogous Hox clusters. The authors also compared vertebrate CNSs with amphioxus CNSs within the Hox cluster, and found that two CNSs in the HoxA and HoxB clusters retain homology with amphioxus CNSs through the two rounds of whole genome duplication.
|
14
|
Complete HOX cluster characterization of the coelacanth provides further evidence for slow evolution of its genome. Proc Natl Acad Sci U S A 2010; 107:3622-7. [PMID: 20139301 DOI: 10.1073/pnas.0914312107] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 01/12/2023] Open
Abstract
The living coelacanth is a lobe-finned fish that represents an early evolutionary departure from the lineage that led to land vertebrates, and is of extreme interest scientifically. It has changed very little in appearance from fossilized coelacanths of the Cretaceous (150 to 65 million years ago), and is often referred to as a "living fossil." An important general question is whether long-term stasis in morphological evolution is associated with stasis in genome evolution. To this end we have used targeted genome sequencing for acquiring 1,612,752 bp of high quality finished sequence encompassing the four HOX clusters of the Indonesian coelacanth Latimeria menadoensis. Detailed analyses were carried out on genomic structure, gene and repeat contents, conserved noncoding regions, and relative rates of sequence evolution in both coding and noncoding tracts. Our results demonstrate conclusively that the coelacanth HOX clusters are evolving comparatively slowly and that this taxon should serve as a viable outgroup for interpretation of the genomes of tetrapod species.
|
15
|
Pascual-Anaya J, D'Aniello S, Garcia-Fernàndez J. Unexpectedly large number of conserved noncoding regions within the ancestral chordate Hox cluster. Dev Genes Evol 2008; 218:591-7. [PMID: 18791732 DOI: 10.1007/s00427-008-0246-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 05/26/2008] [Accepted: 08/11/2008] [Indexed: 10/21/2022]
Abstract
The single amphioxus Hox cluster contains 15 genes and may well resemble the ancestral chordate Hox cluster. We have sequenced the Hox genomic complement of the European amphioxus Branchiostoma lanceolatum and compared it to the American species, Branchiostoma floridae, by phylogenetic footprinting to gain insights into the evolution of Hox gene regulation in chordates. We found that Hox intergenic regions are largely conserved between the two amphioxus species, especially in the case of genes located at the 3' of the cluster, a trend previously observed in vertebrates. We further compared the amphioxus Hox cluster with the human HoxA, HoxB, HoxC, and HoxD clusters, finding several conserved noncoding regions, both in intergenic and intronic regions. This suggests that the regulation of Hox genes is highly conserved across chordates, consistent with the similar Hox expression patterns in vertebrates and amphioxus.
Affiliation(s)
- Juan Pascual-Anaya
- Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona 08028, Spain
|
16
|
Amemiya CT, Prohaska SJ, Hill-Force A, Cook A, Wasserscheid J, Ferrier DE, Pascual-Anaya J, Garcia-Fernàndez J, Dewar K, Stadler PF. The amphioxus Hox cluster: characterization, comparative genomics, and evolution. J Exp Zool B Mol Dev Evol 2008; 310:465-77. [DOI: 10.1002/jez.b.21213] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Indexed: 12/22/2022]
|
17
|
Rose D, Hertel J, Reiche K, Stadler PF, Hackermüller J. NcDNAlign: plausible multiple alignments of non-protein-coding genomic sequences. Genomics 2008; 92:65-74. [PMID: 18511233 DOI: 10.1016/j.ygeno.2008.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/21/2007] [Revised: 04/09/2008] [Accepted: 04/09/2008] [Indexed: 10/22/2022]
Abstract
Genome-wide multiple sequence alignments (MSAs) are a necessary prerequisite for an increasingly diverse collection of comparative genomic approaches. Here we present a versatile method that generates high-quality MSAs for non-protein-coding sequences. The NcDNAlign pipeline combines pairwise BLAST alignments to create initial MSAs, which are then locally improved and trimmed. The program is optimized for speed and hence is particularly well-suited to pilot studies. We demonstrate the practical use of NcDNAlign in three case studies: the search for ncRNAs in gammaproteobacteria and the analysis of conserved noncoding DNA in nematodes and teleost fish, in the latter case focusing on the fate of duplicated ultra-conserved regions. Compared to the currently widely used genome-wide alignment program TBA, our program results in a 20- to 30-fold reduction of the CPU time necessary to generate gammaproteobacterial alignments. A showcase application of bacterial ncRNA prediction based on alignments of both algorithms results in similar sensitivity, false discovery rates, and up to 100 putatively novel ncRNA structures. Similar findings hold for our application of NcDNAlign to the identification of ultra-conserved regions in nematodes and teleosts. Both approaches yield conserved sequences of unknown function, result in novel evolutionary insights into conservation patterns among these genomes, and manifest the benefits of an efficient and reliable genome-wide alignment package. The software is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/NcDNAlign/.
Affiliation(s)
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
|
18
|
Hoegg S, Boore JL, Kuehl JV, Meyer A. Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni. BMC Genomics 2007; 8:317. [PMID: 17845724 PMCID: PMC2080641 DOI: 10.1186/1471-2164-8-317] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Received: 04/13/2007] [Accepted: 09/10/2007] [Indexed: 11/10/2022] Open
Abstract
Background Teleost fish have seven paralogous clusters of Hox genes stemming from two complete genome duplications early in vertebrate evolution, and an additional genome duplication during the evolution of ray-finned fish, followed by the secondary loss of one cluster. Gene duplications on the one hand, and the evolution of regulatory sequences on the other, are thought to be among the most important mechanisms for the evolution of new gene functions. Cichlid fish, the largest family of vertebrates with about 2500 species, are famous examples of speciation and morphological diversity. Since this diversity could be based on regulatory changes, we chose to study the coding as well as putative regulatory regions of their Hox clusters within a comparative genomic framework. Results We sequenced and characterized all seven Hox clusters of Astatotilapia burtoni, a haplochromine cichlid fish. Comparative analyses with data from other teleost fish such as zebrafish, two species of pufferfish, stickleback and medaka were performed. We traced losses of genes and microRNAs of Hox clusters, the medaka lineage seems to have lost more microRNAs than the other fish lineages. We found that each teleost genome studied so far has a unique set of Hox genes. The hoxb7a gene was lost independently several times during teleost evolution, the most recent event being within the radiation of East African cichlid fish. The conserved non-coding sequences (CNS) encompass a surprisingly large part of the clusters, especially in the HoxAa, HoxCa, and HoxDa clusters. Across all clusters, we observe a trend towards an increased content of CNS towards the anterior end. Conclusion The gene content of Hox clusters in teleost fishes is more variable than expected, with each species studied so far having a different set. 
Although the highest loss rate of Hox genes occurred immediately after whole genome duplications, our analyses showed that gene loss continued and is still ongoing in all teleost lineages. Along with the gene content, the CNS content also varies across clusters. The excess of CNS at the anterior end of clusters could imply a stronger conservation of anterior expression patterns than of those towards more posterior areas of the embryo.
Affiliation(s)
- Simone Hoegg
- Lehrstuhl für Evolutionsbiologie und Zoologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Jeffrey L Boore
- Program in Evolutionary Genomics, DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, and University of California, Berkeley, California 94720, USA
- SymBio Corporation, 1455 Adams Drive, Menlo Park, CA 94025, and University of California, Berkeley, California 94720, USA
- Jennifer V Kuehl
- Program in Evolutionary Genomics, DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, and University of California, Berkeley, California 94720, USA
- Axel Meyer
- Lehrstuhl für Evolutionsbiologie und Zoologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
|
19
|
Hoegg S, Meyer A. Phylogenomic analyses of KCNA gene clusters in vertebrates: why do gene clusters stay intact? BMC Evol Biol 2007; 7:139. [PMID: 17697377 PMCID: PMC1978502 DOI: 10.1186/1471-2148-7-139] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 12/07/2006] [Accepted: 08/15/2007] [Indexed: 11/12/2022] Open
Abstract
Background Gene clusters are of interest for the understanding of genome evolution since they provide insight into large-scale duplication events as well as patterns of individual gene losses. Vertebrates tend to have multiple copies of gene clusters that typically are only single clusters or are not present at all in genomes of invertebrates. We investigated the genomic architecture and conserved non-coding sequences of vertebrate KCNA gene clusters. KCNA genes encode shaker-related voltage-gated potassium channels and are arranged in two three-gene clusters in tetrapods. Teleost fish are found to possess four clusters. The two tetrapod KCNA clusters are of approximately the same age as the Hox gene clusters that arose through duplications early in vertebrate evolution. For some genes, their conserved retention and arrangement in clusters are thought to be related to regulatory elements in the intergenic regions, which might prevent rearrangements and gene loss. Interestingly, this hypothesis does not appear to apply to the KCNA clusters, as too few conserved putative regulatory elements are retained. Results We obtained KCNA coding sequences from basal ray-finned fishes (sturgeon, gar, bowfin) and confirmed that the duplication of these genes is specific to teleosts and therefore consistent with the fish-specific genome duplication (FSGD). Phylogenetic analyses of the genes suggest a basal position of the only intron-containing KCNA gene in vertebrates (KCNA7). Sister-group relationships of KCNA1/2 and KCNA3/6 support that a large-scale duplication gave rise to the two clusters found in the genome of tetrapods. We analyzed the intergenic regions of KCNA clusters in vertebrates and found that there are only a few conserved sequences shared between tetrapods and teleosts or between paralogous clusters. The orthologous teleost clusters, however, show sequence conservation in these regions. 
Conclusion The lack of overall conserved sequences in intergenic regions suggests that there are either other processes than regulatory evolution leading to cluster conservation or that the ancestral regulatory relationships among genes in KCNA clusters have been changed together with their regulatory sites.
Affiliation(s)
- Simone Hoegg
- Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Axel Meyer
- Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
|
20
|
Wagner GP, Otto W, Lynch V, Stadler PF. A stochastic model for the evolution of transcription factor binding site abundance. J Theor Biol 2007; 247:544-53. [PMID: 17475285 DOI: 10.1016/j.jtbi.2007.03.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/06/2006] [Revised: 02/27/2007] [Accepted: 02/27/2007] [Indexed: 10/23/2022]
Abstract
Both experimental and sequence-evolution evidence suggests that transcription factor binding sites can undergo divergence and turnover even when the transcriptional output remains conserved. Furthermore, it is likely that there exist lineage-specific differences in the retention rate of binding sites that make it desirable to estimate the rate of acquisition and decay of transcription factor binding sites from comparative sequence data. In this paper we propose a stochastic, phenomenological model for binding site turnover. For a given genomic region we assume a constant rate of binding site origination lambda and a constant per-site decay rate mu. We derived an explicit expression for the conditional probability distribution of the number of binding sites n at time t given n_0 binding sites at t = 0. The analytical result was compared to a simulation model and we found that it closely predicts the simulated sequence evolution. We then analyzed a small data set of the number of estrogen response elements (ERE) in mammalian HoxA sequences and showed that the data are broadly consistent with the assumption of a stationary turnover process. A regression of shared EREs over the time since divergence led to an estimate of the half-life time for an ERE in the primate HoxA clusters of about 27 Myr, which corresponds to a per-site decay rate of mu approximately 1.3 x 10^-8 per year and a rate of origination of lambda approximately 1.6 x 10^-7 per year. We conclude that the model can be used to estimate the rate of binding site turnover from comparative genomic data.
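A model with constant origination rate lambda and constant per-site decay rate mu has the form of the classical immigration-death process. The Python sketch below is illustrative only: it assumes the standard textbook result that the number of sites at time t is the sum of a binomial count of surviving original sites and an independent Poisson count of newly gained sites, which may be written differently from the expression derived in the paper. The function names `turnover_pmf` and `expected_sites` are hypothetical, and the rates are the estimates quoted in the abstract.

```python
import math

def turnover_pmf(n, n0, lam, mu, t):
    """P(n sites at time t | n0 sites at time 0) for an immigration-death process.

    Assumed form: Binomial(n0, exp(-mu*t)) survivors plus an independent
    Poisson((lam/mu) * (1 - exp(-mu*t))) number of sites gained and retained.
    """
    p = math.exp(-mu * t)           # probability that an original site survives
    m = (lam / mu) * (1.0 - p)      # Poisson mean of gained sites still present
    total = 0.0
    for k in range(min(n, n0) + 1):  # k = number of surviving original sites
        binom = math.comb(n0, k) * p**k * (1.0 - p)**(n0 - k)
        pois = math.exp(-m) * m**(n - k) / math.factorial(n - k)
        total += binom * pois
    return total

def expected_sites(n0, lam, mu, t):
    """Mean number of sites at time t under the same assumptions."""
    p = math.exp(-mu * t)
    return n0 * p + (lam / mu) * (1.0 - p)

# Rates quoted in the abstract, per year.
LAM, MU = 1.6e-7, 1.3e-8
stationary_mean = LAM / MU  # long-run mean number of sites
```

With the quoted rates the stationary mean lambda/mu is about 12 sites per region, which gives a quick plausibility check when comparing observed site counts against the stationarity assumption.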
Affiliation(s)
- Günter P Wagner
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520-8106, USA.
|
21
|
Abstract
DIALIGN is a software program for multiple alignment of DNA or protein sequences that combines global and local alignment features. In recent years, the program has been used extensively to compare syntenic regions in genomic sequences. An anchoring option speeds up the alignment procedure and makes it possible to use user-defined constraints to improve the quality of the program output. This chapter explains features of DIALIGN that are useful if genomic sequences are to be aligned. The program is available online through the Göttingen Bioinformatics Compute Server at http://dialign.gobics.de/.
|
22
|
Amemiya CT, Gomez-Chiarri M. Comparative genomics in vertebrate evolution and development. J Exp Zool A Comp Exp Biol 2006; 305:672-82. [PMID: 16902957 DOI: 10.1002/jez.a.308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
The vast quantities of publicly available DNA sequencing data and genome resources are enabling biologists to investigate age-old problems in biology that were not addressable previously. In this review, we discuss how comparative genomics is practiced and how the data can be used to make biological inferences with respect to vertebrate evolution and development. Examples are taken from the well-known HOX clusters, which are always a high-priority target for genomic analyses due to their inferred role in the evolution of metazoans. In addition, we briefly discuss the application of genomic approaches to problems in comparative endocrinology.
Affiliation(s)
- Chris T Amemiya
- Molecular Genetics Program, Benaroya Research Institute at Virginia Mason, Seattle, Washington 98101, USA.
|
23
|
Missal K, Zhu X, Rose D, Deng W, Skogerbø G, Chen R, Stadler PF. Prediction of structured non-coding RNAs in the genomes of the nematodes Caenorhabditis elegans and Caenorhabditis briggsae. J Exp Zool B Mol Dev Evol 2006; 306:379-92. [PMID: 16425273 DOI: 10.1002/jez.b.21086] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Indexed: 11/07/2022]
Abstract
We present a survey for non-coding RNAs and other structured RNA motifs in the genomes of Caenorhabditis elegans and Caenorhabditis briggsae using the RNAz program. This approach explicitly evaluates comparative sequence information to detect stabilizing selection acting on RNA secondary structure. We detect 3,672 structured RNA motifs, of which only 678 are known non-translated RNAs (ncRNAs) or clear homologs of known C. elegans ncRNAs. Most of these signals are located in introns or at a distance from known protein-coding genes. With an estimated false positive rate of about 50% and a sensitivity on the order of 50%, we estimate that the nematode genomes contain between 3,000 and 4,000 RNAs with evolutionarily conserved secondary structures. Only a small fraction of these belong to the known RNA classes, including tRNAs, snoRNAs, snRNAs, or microRNAs. A relatively small class of ncRNA candidates is associated with previously observed RNA-specific upstream elements.
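The estimate of 3,000 to 4,000 structured RNAs can be checked with a back-of-the-envelope calculation. The Python sketch below assumes one reading of the abstract, namely that the "false positive rate" is the fraction of predicted motifs that are false; the function name `estimate_true_count` is hypothetical and the calculation is not taken from the paper.

```python
def estimate_true_count(hits, false_fraction, sensitivity):
    """Estimate the number of real elements in the genome.

    hits * (1 - false_fraction) is the expected number of true predictions;
    dividing by the sensitivity corrects for real elements that were missed.
    Assumption: false_fraction is the share of predictions that are false.
    """
    return hits * (1.0 - false_fraction) / sensitivity

# Figures quoted in the abstract: 3,672 motifs, ~50% false, ~50% sensitivity.
estimate = estimate_true_count(3672, 0.5, 0.5)
```

Under this reading the two 50% factors cancel, so the estimate equals the raw number of detected motifs and falls inside the range stated in the abstract.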
Affiliation(s)
- Kristin Missal
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
|
24
|
Perco P, Rapberger R, Siehs C, Lukas A, Oberbauer R, Mayer G, Mayer B. Transforming omics data into context: Bioinformatics on genomics and proteomics raw data. Electrophoresis 2006; 27:2659-75. [PMID: 16739231 DOI: 10.1002/elps.200600064] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]
Abstract
Differential gene expression analysis and proteomics have exerted significant impact on the elucidation of concerted cellular processes, as simultaneous measurement of hundreds to thousands of individual objects on the level of RNA and protein ensembles became technically feasible. The availability of such data sets has promised a profound understanding of phenomena on an aggregate level, expressed as the phenotypic response (observables) of cells, e.g., in the presence of drugs, or characterization of cells and tissue displaying distinct patho-physiological states. However, the step of transforming these data into context, i.e., linking distinct expression or abundance patterns with phenotypic observables - and furthermore enabling a sound biological interpretation on the level of reaction networks and concerted pathways, is still a major shortcoming. This finding is certainly based on the enormous complexity embedded in cellular reaction networks, but a variety of computational approaches have been developed over the last few years to overcome these issues. This review provides an overview on computational procedures for analysis of genomic and proteomic data introducing a sequential analysis workflow: Explorative statistics for deriving a first, from the purely statistical viewpoint, relevant candidate gene/protein list, followed by co-regulation and network analysis to biologically expand this core list toward functional networks and pathways. The review on these procedures is complemented by example applications tailored at identification of disease-associated proteins. Optimization of computational procedures involved, in conjunction with the continuous increase in additional biological data, clearly has the potential of boosting our understanding of processes on a cell-wide level.
Affiliation(s)
- Paul Perco
- Department of Nephrology, Medical University of Vienna, Austria
|
25
|
Lee AP, Koh EGL, Tay A, Brenner S, Venkatesh B. Highly conserved syntenic blocks at the vertebrate Hox loci and conserved regulatory elements within and outside Hox gene clusters. Proc Natl Acad Sci U S A 2006; 103:6994-9. [PMID: 16636282 PMCID: PMC1459007 DOI: 10.1073/pnas.0601492103] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022] Open
Abstract
Hox genes in vertebrates are clustered, and the organization of the clusters has been highly conserved during evolution. The conservation of Hox clusters has been attributed to enhancers located within and outside the Hox clusters that are essential for the coordinated "temporal" and "spatial" expression patterns of Hox genes in developing embryos. To identify evolutionarily conserved regulatory elements within and outside the Hox clusters, we obtained contiguous sequences for the conserved syntenic blocks from the seven Hox loci in fugu and carried out a systematic search for conserved noncoding sequences (CNS) in the human, mouse, and fugu Hox loci. Our analysis has uncovered unusually large conserved syntenic blocks at the HoxA and HoxD loci. The conserved syntenic blocks at the human and mouse HoxA and HoxD loci span 5.4 Mb and 4 Mb and contain 21 and 19 genes, respectively. The corresponding regions in fugu are 16- and 12-fold smaller. A large number of CNS was identified within the Hox clusters and outside the Hox clusters spread over large regions. The CNS include previously characterized enhancers and overlap with the 5' global control regions of HoxA and HoxD clusters. Most of the CNS are likely to be control regions involved in the regulation of Hox and other genes in these loci. We propose that the regulatory elements spread across large regions on either side of Hox clusters are a major evolutionary constraint that has maintained the exceptionally long syntenic blocks at the HoxA and HoxD loci.
Affiliation(s)
- Alison P. Lee
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673
- Esther G. L. Koh
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673
- Alice Tay
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673
- Sydney Brenner
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673
- Byrappa Venkatesh
- Institute of Molecular and Cell Biology, 61 Biopolis Drive, Singapore 138673
|
26
|
Bhatnagar V, Xu G, Hamilton BA, Truong DM, Eraly SA, Wu W, Nigam SK. Analyses of 5' regulatory region polymorphisms in human SLC22A6 (OAT1) and SLC22A8 (OAT3). J Hum Genet 2006; 51:575-580. [PMID: 16648942 DOI: 10.1007/s10038-006-0398-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Received: 01/06/2006] [Accepted: 02/23/2006] [Indexed: 11/28/2022]
Abstract
Kidney excretion of numerous organic anionic drugs and endogenous metabolites is carried out by a family of multispecific organic anion transporters (OATs). Two closely related transporters, SLC22A6, initially identified by us as NKT and also known as OAT1, and SLC22A8, also known as OAT3 and ROCT, are thought to mediate the initial steps in the transport of organic anionic drugs between the blood and proximal tubule cells of the kidney. Coding region polymorphisms in these genes are infrequent and pairing of these genes in the genome suggests they may be coordinately regulated. Hence, 5' regulatory regions of these genes may be important factors in human variation in organic anionic drug handling. We have analyzed novel single nucleotide polymorphisms in the evolutionarily conserved 5' regulatory regions of the SLC22A6 and SLC22A8 genes (phylogenetic footprints) in an ethnically diverse sample of 96 individuals (192 haploid genomes). Only one polymorphism was found in the SLC22A6 5' regulatory region. In contrast, seven polymorphisms were found in the SLC22A8 5' regulatory region, two of which were common to all ethnic groups studied. Computational analysis permitted phase and haplotype reconstruction. Proximity of these non-coding polymorphisms to transcriptional regulatory elements (including potential sex steroid response elements) suggests a potential influence on the level of transcription of the SLC22A6 and/or SLC22A8 genes and will help define their role in variation in human drug, metabolite and toxin excretion. The clustering of OAT genes in the genome raises the possibility that nucleotide polymorphisms in SLC22A6 could also affect SLC22A8 expression, and vice versa.
Affiliation(s)
- Vibha Bhatnagar
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA.
- Gang Xu
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
- Bruce A Hamilton
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
- David M Truong
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
- Satish A Eraly
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
- Wei Wu
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
- Sanjay K Nigam
- Departments of Pediatrics, Medicine, Cellular and Molecular Medicine, Family, Preventative Medicine, and the San Diego Veterans Administration Medical Center, University of California-San Diego, 9500 Gilman Dr., 0693, La Jolla, CA, 92093-0693, USA
|
27
|
Morgenstern B, Prohaska SJ, Pöhler D, Stadler PF. Multiple sequence alignment with user-defined anchor points. Algorithms Mol Biol 2006; 1:6. [PMID: 16722533 PMCID: PMC1481597 DOI: 10.1186/1748-7188-1-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 02/15/2006] [Accepted: 04/19/2006] [Indexed: 11/15/2022] Open
Abstract
Background Automated software tools for multiple alignment often fail to produce biologically meaningful results. In such situations, expert knowledge can help to improve the quality of alignments. Results Herein, we describe a semi-automatic version of the alignment program DIALIGN that can take pre-defined constraints into account. It is possible for the user to specify parts of the sequences that are assumed to be homologous and should therefore be aligned to each other. Our software program can use these sites as anchor points by creating a multiple alignment respecting these constraints. This way, our alignment method can produce alignments that are biologically more meaningful than alignments produced by fully automated procedures. As a demonstration of how our method works, we apply our approach to genomic sequences around the Hox gene cluster and to a set of DNA-binding proteins. As a by-product, we obtain insights about the performance of the greedy algorithm that our program uses for multiple alignment and about the underlying objective function. This information will be useful for the further development of DIALIGN. The described alignment approach has been integrated into the TRACKER software system.
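Anchor points only make sense as constraints if they are mutually consistent, meaning they do not overlap and they occur in the same order in both sequences. The Python sketch below illustrates that idea for two sequences by accepting anchors greedily in order of decreasing score and rejecting any that conflict with those already accepted. It is a simplified illustration, not DIALIGN's implementation; the `Anchor` type and the functions `compatible` and `greedy_consistent` are hypothetical names introduced here.

```python
from typing import List, NamedTuple

class Anchor(NamedTuple):
    start_a: int   # 0-based start in sequence A
    start_b: int   # 0-based start in sequence B
    length: int    # length of the gap-free anchored segment
    score: float   # user-assigned priority

def compatible(x: Anchor, y: Anchor) -> bool:
    """True if the anchors do not overlap and are collinear in both sequences."""
    if x.start_a > y.start_a:
        x, y = y, x  # make x the anchor that starts first in sequence A
    return (x.start_a + x.length <= y.start_a and
            x.start_b + x.length <= y.start_b)

def greedy_consistent(anchors: List[Anchor]) -> List[Anchor]:
    """Accept anchors by decreasing score, skipping any that conflict."""
    accepted: List[Anchor] = []
    for a in sorted(anchors, key=lambda a: a.score, reverse=True):
        if all(compatible(a, b) for b in accepted):
            accepted.append(a)
    return sorted(accepted, key=lambda a: a.start_a)

anchors = [
    Anchor(0, 0, 10, 5.0),
    Anchor(20, 5, 10, 1.0),   # crosses the first anchor in sequence B
    Anchor(15, 30, 10, 4.0),
    Anchor(40, 50, 5, 3.0),
]
chain = greedy_consistent(anchors)
```

In this toy input the low-scoring anchor at (20, 5) is dropped because it overlaps the first anchor in sequence B, and the three remaining anchors form a collinear chain.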
Affiliation(s)
- Burkhard Morgenstern
- Universität Göttingen, Institut für Mikrobiologie und Genetik, Abteilung für Bioinformatik, Goldschmidtstrasse 1, D-37077 Göttingen, Germany
- Sonja J Prohaska
- Universität Leipzig, Institut für Informatik und Interdisziplinäres Zentrum für Bioinformatik, Kreuzstrasse 7b, D-04103 Leipzig, Germany
- Dirk Pöhler
- Universität Göttingen, Institut für Mikrobiologie und Genetik, Abteilung für Bioinformatik, Goldschmidtstrasse 1, D-37077 Göttingen, Germany
- Peter F Stadler
- Universität Leipzig, Institut für Informatik und Interdisziplinäres Zentrum für Bioinformatik, Kreuzstrasse 7b, D-04103 Leipzig, Germany
|
28
|
Perco P, Kainz A, Mayer G, Lukas A, Oberbauer R, Mayer B. Detection of coregulation in differential gene expression profiles. Biosystems 2005; 82:235-47. [PMID: 16181729 DOI: 10.1016/j.biosystems.2005.08.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 03/22/2005] [Revised: 08/02/2005] [Accepted: 08/02/2005] [Indexed: 01/04/2023]
Abstract
Genomics and proteomics approaches generate distinct gene expression and protein profiles, listing individual genes embedded in broad functional terms as gene ontologies. However, interpretation of gene profiles in a regulatory and functional context remains a major issue. Elucidation of regulatory mechanisms at the gene expression level via analysis of promoter regions is a prominent procedure to decipher such gene regulatory networks. We propose a novel genetic algorithm (GA) to extract joint promoter modules in a set of coexpressed genes as resulting from differential gene expression experiments. Algorithm design has focused on the following constraints: (I) identification of the major promoter modules, which are (II) characterized by a maximum number of joint motifs and (III) are found in a maximum number of coexpressed genes. The capability of the GA in detecting multiple modules was evaluated on various test data sets, analyzing the impact of the number of motifs per promoter module, the number of genes associated with a module, as well as the total number of distinct promoter modules encoded in a sequence set. In addition to the test data sets, the GA was evaluated on two biological examples, namely a muscle-specific data set and the upstream sequences of the beta-actin gene (ACTB) derived from different species, complemented by a comparison to alternative promoter module identification routines.
Affiliation(s)
- Paul Perco
- Institute for Biomolecular Structural Chemistry, University of Vienna, Campus Vienna Biocenter 6, 1030 Vienna, Austria
29
Crow KD, Stadler PF, Lynch VJ, Amemiya C, Wagner GP. The "fish-specific" Hox cluster duplication is coincident with the origin of teleosts. Mol Biol Evol 2005; 23:121-36. [PMID: 16162861 DOI: 10.1093/molbev/msj020] [Citation(s) in RCA: 149] [Impact Index Per Article: 7.8] [Indexed: 11/14/2022] Open
Abstract
The Hox gene complement of zebrafish, medaka, and fugu differs from that of other gnathostome vertebrates. These fishes have seven to eight Hox clusters compared to the four Hox clusters described in sarcopterygians and shark. The clusters in different teleost lineages are orthologous, implying that a "fish-specific" Hox cluster duplication has occurred in the stem lineage leading to the most recent common ancestor of zebrafish and fugu. The timing of this event, however, is unknown. To address this question, we sequenced four Hox genes from taxa representing basal actinopterygian and teleost lineages and compared them to known sequences from shark, coelacanth, zebrafish, and other teleosts. The resulting gene genealogies suggest that the fish-specific Hox cluster duplication occurred coincident with the origin of crown group teleosts. In addition, we obtained evidence for an independent Hox cluster duplication in the sturgeon lineage (Acipenseriformes). Finally, results from HoxA11 suggest that duplicated Hox genes have experienced diversifying selection immediately after the duplication event. Taken together, these results support the notion that the duplicated Hox genes of teleosts were causally relevant to adaptive evolution during the initial teleost radiation.
Affiliation(s)
- Karen D Crow
- Department of Ecology and Evolutionary Biology, Yale University, USA.
30
Wagner GP, Takahashi K, Lynch V, Prohaska SJ, Fried C, Stadler PF, Amemiya C. Molecular evolution of duplicated ray finned fish HoxA clusters: increased synonymous substitution rate and asymmetrical co-divergence of coding and non-coding sequences. J Mol Evol 2005; 60:665-76. [PMID: 15983874 DOI: 10.1007/s00239-004-0252-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 08/10/2004] [Accepted: 11/22/2004] [Indexed: 01/01/2023]
Abstract
In this study the molecular evolution of duplicated HoxA genes in zebrafish and fugu has been investigated. All 18 duplicated HoxA genes studied have a higher non-synonymous substitution rate than the corresponding genes in either bichir or paddlefish, where these genes are not duplicated. The higher rate of evolution is not due solely to a higher non-synonymous-to-synonymous rate ratio but to an increase in both the non-synonymous as well as the synonymous substitution rate. The synonymous rate increase can be explained by a change in base composition, codon usage, or mutation rate. We found no changes in nucleotide composition or codon bias. Thus, we suggest that the HoxA genes may experience an increased mutation rate following cluster duplication. In the non-Hox nuclear gene RAG1 only an increase in non-synonymous substitutions could be detected, suggesting that the increased mutation rate is specific to duplicated Hox clusters and might be related to the structural instability of Hox clusters following duplication. The divergence among paralog genes tends to be asymmetric, with one paralog diverging faster than the other. In fugu, all b-paralogs diverge faster than the a-paralogs, while in zebrafish Hoxa-13a diverges faster. This asymmetry corresponds to the asymmetry in the divergence rate of conserved non-coding sequences, i.e., putative cis-regulatory elements. These results suggest that the 5' HoxA genes in the same cluster belong to a co-evolutionary unit in which genes have a tendency to diverge together.
Affiliation(s)
- Günter P Wagner
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520-8106, USA.
31
Pöhler D, Werner N, Steinkamp R, Morgenstern B. Multiple alignment of genomic sequences using CHAOS, DIALIGN and ABC. Nucleic Acids Res 2005; 33:W532-4. [PMID: 15980528 PMCID: PMC1160147 DOI: 10.1093/nar/gki386] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022] Open
Abstract
Comparative analysis of genomic sequences is a powerful approach to discover functional sites in these sequences. Herein, we present a WWW-based software system for multiple alignment of genomic sequences. We use the local alignment tool CHAOS to rapidly identify chains of pairwise similarities. These similarities are used as anchor points to speed up the DIALIGN multiple-alignment program. Finally, the visualization tool ABC is used for interactive graphical representation of the resulting multiple alignments. Our software is available at Göttingen Bioinformatics Compute Server (GOBICS) at
Affiliation(s)
- Burkhard Morgenstern
- To whom correspondence should be addressed. Tel: +49 551 39 14628; Fax: +49 551 39 14929;
32
Hoegg S, Meyer A. Hox clusters as models for vertebrate genome evolution. Trends Genet 2005; 21:421-4. [PMID: 15967537 DOI: 10.1016/j.tig.2005.06.004] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.3] [Received: 02/10/2005] [Revised: 04/11/2005] [Accepted: 06/06/2005] [Indexed: 11/21/2022]
Abstract
The surprising variation in the number of Hox clusters and the genomic architecture within vertebrate lineages, especially within the ray-finned fish, reflects a history of duplications and subsequent lineage-specific gene loss. Recent research on the evolution of conserved non-coding sequences (CNS) in Hox clusters promises to reveal interesting results for functional and phenotypic diversification.
Affiliation(s)
- Simone Hoegg
- Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
33
Stadler PF, Fried C, Prohaska SJ, Bailey WJ, Misof BY, Ruddle FH, Wagner GP. Evidence for independent Hox gene duplications in the hagfish lineage: a PCR-based gene inventory of Eptatretus stoutii. Mol Phylogenet Evol 2005; 32:686-94. [PMID: 15288047 DOI: 10.1016/j.ympev.2004.03.015] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.7] [Received: 10/24/2003] [Revised: 02/13/2004] [Indexed: 11/22/2022]
Abstract
Hox genes code for transcription factors that play a major role in the development of all animal phyla. In invertebrates these genes usually occur as a tightly linked cluster, with a few exceptions where the clusters have been dissolved. Only in vertebrates have multiple clusters been demonstrated, which arose by duplication from a single ancestral cluster. This history of Hox cluster duplications, in particular during the early elaboration of the vertebrate body plan, is still poorly understood. In this paper we report the results of a PCR survey on genomic DNA of the Pacific hagfish Eptatretus stoutii. Hagfishes are one of two clades of recent jawless fishes that are an offshoot of the early radiation of jawless vertebrates. Our data provide evidence for at least 33 distinct Hox genes in the hagfish genome, which is most compatible with the hypothesis of multiple Hox clusters. The largest number, seven, of distinct homeobox fragments could be assigned to paralog group 9, which could imply that the hagfish has more than four clusters. Quartet mapping reveals that within each paralog group the hagfish sequences are statistically more closely related to gnathostome Hox genes than to either amphioxus or lamprey genes. These results support two assumptions about the history of Hox genes: (1) The association of hagfish homeobox sequences with gnathostome sequences suggests that at least one Hox cluster duplication event happened in the stem of vertebrates, i.e., prior to the most recent common ancestor of jawed and jawless vertebrates. (2) The high number of paralog group 9 sequences in hagfish and the phylogenetic position of hagfish suggest that the hagfish lineage underwent additional independent Hox cluster/-gene duplication events.
Affiliation(s)
- Peter F Stadler
- Lehrstuhl für Bioinformatik, Institut für Informatik, Universität Leipzig, Kreuzstrasse 7b, D-04103 Leipzig, Germany.
34
Tanzer A, Amemiya CT, Kim CB, Stadler PF. Evolution of microRNAs located within Hox gene clusters. J Exp Zool B Mol Dev Evol 2005; 304:75-85. [PMID: 15643628 DOI: 10.1002/jez.b.21021] [Citation(s) in RCA: 128] [Impact Index Per Article: 6.7] [Indexed: 11/08/2022]
Abstract
MicroRNAs (miRNAs) form an abundant class of non-coding RNA genes that have an important function in post-transcriptional gene regulation and in particular modulate the expression of developmentally important transcription factors including Hox genes. Two families of microRNAs are genomically located in intergenic regions in the Hox clusters of vertebrates. Here we describe their evolution in detail. We show that the microRNAs closely follow the patterns of protein evolution in the Hox clusters, which is characterized by cluster duplications followed by differential gene loss.
Affiliation(s)
- Andrea Tanzer
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Kreuzstrasse 7b, D 04103 Leipzig, Germany.
35
Prohaska SJ, Fried C, Amemiya CT, Ruddle FH, Wagner GP, Stadler PF. The shark HoxN cluster is homologous to the human HoxD cluster. J Mol Evol 2004; 58:212-7. [PMID: 15042342 DOI: 10.1007/s00239-003-2545-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
Abstract
The statistical analysis of phylogenetic footprints in the two known horn shark Hox clusters and the four mammalian clusters shows that the shark HoxN cluster is HoxD-like. This finding implies that the most recent common ancestor of jawed vertebrates had at least four Hox clusters, including those which are orthologous to the four mammalian Hox clusters.
Affiliation(s)
- Sonja J Prohaska
- Bioinformatik, Institut für Informatik, Universität Leipzig, Kreuzstrasse 7b, D-04103 Leipzig, Germany
36
Schmollinger M, Nieselt K, Kaufmann M, Morgenstern B. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors. BMC Bioinformatics 2004; 5:128. [PMID: 15357879 PMCID: PMC520757 DOI: 10.1186/1471-2105-5-128] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Received: 03/23/2004] [Accepted: 09/09/2004] [Indexed: 11/30/2022] Open
Abstract
Background Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Results Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) Pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. Conclusions By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
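Point (a) of this abstract rests on the independence of the pairwise alignments, which can be illustrated with a minimal sketch. This is not the DIALIGN P implementation: the function names `pair_score` and `all_pairs` are hypothetical, the "alignment" is a toy longest-common-substring score, and a thread pool stands in for the multiple processors described in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def pair_score(a: str, b: str) -> int:
    """Toy stand-in for a pairwise alignment: longest common substring length."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run ending at the previous characters.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def all_pairs(seqs, workers=4):
    """Score every sequence pair; each pair is an independent task."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks share no state, so the result does not depend on how
        # they are scheduled across workers.
        scores = pool.map(lambda p: pair_score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))
```

Calling `all_pairs(seqs, workers=1)` and `all_pairs(seqs, workers=4)` on the same input should return identical dictionaries, which mirrors the abstract's statement that distribution has no effect on the output. For CPU-bound work in Python a process pool would normally replace the thread pool.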
Affiliation(s)
- Martin Schmollinger
- Wilhelm-Schickard-Institut für Informatik, Sand 14, 72076 Tübingen, Germany.
37
Brudno M, Steinkamp R, Morgenstern B. The CHAOS/DIALIGN WWW server for multiple alignment of genomic sequences. Nucleic Acids Res 2004; 32:W41-4. [PMID: 15215346 PMCID: PMC441499 DOI: 10.1093/nar/gkh361] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Indexed: 11/13/2022] Open
Abstract
Cross-species sequence comparison is a powerful approach to analyze functional sites in genomic sequences and many discoveries have been made based on genomic alignments. Herein, we present a WWW-based software system for multiple alignment of large genomic sequences. Our server utilizes the previously developed combination of CHAOS and DIALIGN to achieve both speed and alignment accuracy. CHAOS is a fast database search tool that creates a list of local sequence similarities. These are used by DIALIGN as anchor points to speed up the final alignment procedure. The resulting alignment is returned to the user in different formats together with a list of anchor points found by CHAOS. The CHAOS/DIALIGN software is freely available at http://dialign.gobics.de/chaos-dialign-submission.
Affiliation(s)
- Michael Brudno
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
38
Chiu CH, Dewar K, Wagner GP, Takahashi K, Ruddle F, Ledje C, Bartsch P, Scemama JL, Stellwag E, Fried C, Prohaska SJ, Stadler PF, Amemiya CT. Bichir HoxA cluster sequence reveals surprising trends in ray-finned fish genomic evolution. Genome Res 2004; 14:11-7. [PMID: 14707166 PMCID: PMC314268 DOI: 10.1101/gr.1712904] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Indexed: 11/24/2022]
Abstract
The study of Hox clusters and genes provides insights into the evolution of genomic regulation of development. Derived ray-finned fishes (Actinopterygii, Teleostei) such as zebrafish and pufferfish possess duplicated Hox clusters that have undergone considerable sequence evolution. Whether these changes are associated with the duplication(s) that produced extra Hox clusters is unresolved because comparison with basal lineages is unavailable. We sequenced and analyzed the HoxA cluster of the bichir (Polypterus senegalus), a phylogenetically basal actinopterygian. Independent lines of evidence indicate that bichir has one HoxA cluster that is mosaic in its patterns of noncoding sequence conservation and gene retention relative to the HoxA clusters of human and shark, and the HoxAalpha and HoxAbeta clusters of zebrafish, pufferfish, and striped bass. HoxA cluster noncoding sequences conserved between bichir and euteleosts indicate that novel cis-sequences were acquired in the stem actinopterygians and maintained after cluster duplication. Hence, in the earliest actinopterygians, evolution of the single HoxA cluster was already more dynamic than in human and shark. This tendency peaked among teleosts after HoxA cluster duplication.
Affiliation(s)
- Chi-Hua Chiu
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA.
39
Force A, Shashikant C, Stadler P, Amemiya CT. Comparative Genomics, cis-Regulatory Elements, and Gene Duplication. Methods Cell Biol 2004; 77:545-61. [PMID: 15602931 DOI: 10.1016/s0091-679x(04)77029-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/01/2023]
Affiliation(s)
- Allan Force
- Molecular Genetics Program, Benaroya Research Institute, Seattle, Washington 98101, USA