51. Feltes BC, Grisci BI, Poloni JDF, Dorn M. Perspectives and applications of machine learning for evolutionary developmental biology. Mol Omics 2018; 14:289-306. [PMID: 30168572 DOI: 10.1039/c8mo00111a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
Evolutionary Developmental Biology (Evo-Devo) is an ever-expanding field that aims to understand how development was modulated by the evolutionary process. In this sense, "omic" studies emerged as a powerful ally to unravel the molecular mechanisms underlying development. In this scenario, bioinformatics tools become necessary to analyze the growing amount of information. Among computational approaches, machine learning stands out as a promising field to generate knowledge and trace new research perspectives for bioinformatics. In this review, we aim to expose the current advances of machine learning applied to evolution and development. We draw clear perspectives and argue how evolution impacted machine learning techniques.
Affiliation(s)
- Bruno César Feltes
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
52. Mondelli ML, Magalhães T, Loss G, Wilde M, Foster I, Mattoso M, Katz D, Barbosa H, de Vasconcelos ATR, Ocaña K, Gadelha LMR. BioWorkbench: a high-performance framework for managing and analyzing bioinformatics experiments. PeerJ 2018; 6:e5551. [PMID: 30186700 PMCID: PMC6119457 DOI: 10.7717/peerj.5551] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/15/2018] [Accepted: 08/07/2018] [Indexed: 11/20/2022] [Open access]
Abstract
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives by using queries to a provenance and annotation database. Some of these queries are available as a pre-built feature of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the execution time of the case studies by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
Affiliation(s)
- Maria Luiza Mondelli
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
- Thiago Magalhães
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
- Guilherme Loss
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
- Michael Wilde
  - Computation Institute, Argonne National Laboratory/University of Chicago, Chicago, IL, USA
- Ian Foster
  - Computation Institute, Argonne National Laboratory/University of Chicago, Chicago, IL, USA
- Marta Mattoso
  - Computer and Systems Engineering Program, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
- Daniel Katz
  - National Center for Supercomputing Applications, University of Illinois, Urbana, IL, USA
- Helio Barbosa
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
  - Federal University of Juiz de Fora, Juiz de Fora, Minas Gerais, Brazil
- Kary Ocaña
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
- Luiz M R Gadelha
  - National Laboratory for Scientific Computing, Petrópolis, Rio de Janeiro, Brazil
53. De novo genome assembly of Oryza granulata reveals rapid genome expansion and adaptive evolution. Commun Biol 2018; 1:84. [PMID: 30271965 PMCID: PMC6123737 DOI: 10.1038/s42003-018-0089-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/04/2017] [Accepted: 06/08/2018] [Indexed: 12/18/2022] [Open access]
Abstract
The wild relatives of rice have adapted to different ecological environments and constitute a useful reservoir of agronomic traits for genetic improvement. Here we present the ~777 Mb de novo assembled genome sequence of Oryza granulata. Recent bursts of long-terminal repeat retrotransposons, especially RIRE2, led to a rapid twofold increase in genome size after O. granulata speciation. Universal centromeric tandem repeats are absent within its centromeres, while gypsy-type LTRs constitute the main centromere-specific repetitive elements. A total of 40,116 protein-coding genes were predicted in O. granulata, which is close to that of Oryza sativa. Both the copy number and function of genes involved in photosynthesis and energy production have undergone positive selection during the evolution of O. granulata, which might have facilitated its adaptation to low-light habitats. Together, our findings reveal the rapid genome expansion, distinctive centromere organization, and adaptive evolution of O. granulata. Zhigang Wu, Dongming Fang, Rui Yang, et al. present the genome assembly of a wild rice species, Oryza granulata, revealing critical insights about the rapid genome expansion and evolution observed in the Oryza genus. They find that recent bursts of LTR retrotransposons have led to the rapid increase in O. granulata genome size following speciation.
54. Burga A, Wang W, Ben-David E, Wolf PC, Ramey AM, Verdugo C, Lyons K, Parker PG, Kruglyak L. A genetic signature of the evolution of loss of flight in the Galapagos cormorant. Science 2018; 356:eaal3345. [PMID: 28572335 PMCID: PMC5567675 DOI: 10.1126/science.aal3345] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/04/2016] [Accepted: 04/13/2017] [Indexed: 01/19/2023]
Abstract
We have a limited understanding of the genetic and molecular basis of evolutionary changes in the size and proportion of limbs. We studied wing and pectoral skeleton reduction leading to flightlessness in the Galapagos cormorant (Phalacrocorax harrisi). We sequenced and de novo assembled the genomes of four cormorant species and applied a predictive and comparative genomics approach to find candidate variants that may have contributed to the evolution of flightlessness. These analyses and cross-species experiments in Caenorhabditis elegans and in chondrogenic cell lines implicated variants in genes necessary for transcriptional regulation and function of the primary cilium. Cilia are essential for Hedgehog signaling, and humans affected by skeletal ciliopathies suffer from premature bone growth arrest, mirroring skeletal features associated with loss of flight.
Affiliation(s)
- Alejandro Burga
  - Department of Human Genetics, Department of Biological Chemistry, and Howard Hughes Medical Institute, University of California, Los Angeles, CA, USA
- Weiguang Wang
  - Departments of Molecular, Cell and Developmental Biology and Orthopaedic Surgery, University of California and Orthopaedic Institute for Children, Los Angeles, CA, USA
- Eyal Ben-David
  - Department of Human Genetics, Department of Biological Chemistry, and Howard Hughes Medical Institute, University of California, Los Angeles, CA, USA
- Paul C Wolf
  - Wildlife Services, U.S. Department of Agriculture, Roseburg, OR, USA
- Andrew M Ramey
  - U.S. Geological Survey Alaska Science Center, Anchorage, AK, USA
- Claudio Verdugo
  - Instituto de Patología Animal, Facultad de Ciencias Veterinarias, Universidad Austral de Chile, Valdivia, Chile
- Karen Lyons
  - Departments of Molecular, Cell and Developmental Biology and Orthopaedic Surgery, University of California and Orthopaedic Institute for Children, Los Angeles, CA, USA
- Patricia G Parker
  - Department of Biology and Whitney Harris World Ecology Center, University of Missouri, St. Louis, MO, USA
  - WildCare Institute, Saint Louis Zoo, St. Louis, MO, USA
- Leonid Kruglyak
  - Department of Human Genetics, Department of Biological Chemistry, and Howard Hughes Medical Institute, University of California, Los Angeles, CA, USA
55. Monitoring transcription initiation activities in rat and dog. Sci Data 2017; 4:170173. [PMID: 29182598 PMCID: PMC5704677 DOI: 10.1038/sdata.2017.173] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/22/2016] [Accepted: 10/04/2017] [Indexed: 12/27/2022] [Open access]
Abstract
The promoter landscape of several non-human model organisms is far from complete. As a part of the FANTOM5 data collection, we generated 13 profiles of transcription initiation activities in dog and rat aortic smooth muscle cells, mesenchymal stem cells and hepatocytes by employing CAGE (Cap Analysis of Gene Expression) technology combined with single molecule sequencing. Our analyses show that the CAGE profiles recapitulate known transcription start sites (TSSs) consistently, in addition to uncovering novel TSSs. Our dataset can thus be used with high confidence to support gene annotation in dog and rat species. We identified 28,497 and 23,147 CAGE peaks, or promoter regions, for rat and dog respectively, and associated them with known genes. This approach could be seen as a standard method for improvement of existing gene models, as well as discovery of novel genes. Given that the FANTOM5 data collection includes dog and rat matched cell types in human and mouse as well, these data would also be useful for cross-species studies.
56. Oliveira A, Oliveira LC, Aburjaile F, Benevides L, Tiwari S, Jamal SB, Silva A, Figueiredo HCP, Ghosh P, Portela RW, De Carvalho Azevedo VA, Wattam AR. Insight of Genus Corynebacterium: Ascertaining the Role of Pathogenic and Non-pathogenic Species. Front Microbiol 2017; 8:1937. [PMID: 29075239 PMCID: PMC5643470 DOI: 10.3389/fmicb.2017.01937] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 06/08/2017] [Accepted: 09/21/2017] [Indexed: 11/22/2022] [Open access]
Abstract
This review gathers recent information about genomic and transcriptomic studies in the genus Corynebacterium, exploring, for example, the prediction of pathogenicity islands and the stress response in different pathogenic and non-pathogenic species. In addition, several phylogenetic studies of Corynebacterium are described, ranging from the identification of species to biological speciation within one species belonging to the genus. Important concepts associated with virulence are discussed, highlighting the role of the Pld protein and the Tox gene. Adhesion, a characteristic virulence factor, is described in terms of the sortase mechanism, which is associated with anchorage to the cell wall. In addition, survival inside the host cell and some diseases are addressed for pathogenic corynebacteria, while important biochemical pathways and biotechnological applications are the focus of this review for non-pathogenic corynebacteria. In conclusion, this review broadly explores characteristics of the genus Corynebacterium that have strong relevance in the medical, veterinary, and biotechnology fields.
Affiliation(s)
- Alberto Oliveira
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Leticia C Oliveira
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Flavia Aburjaile
  - Center of Genomics and System Biology, Federal University of Pará, Belém, Brazil
- Leandro Benevides
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Sandeep Tiwari
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Syed B Jamal
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Arthur Silva
  - Center of Genomics and System Biology, Federal University of Pará, Belém, Brazil
- Henrique C P Figueiredo
  - Aquacen, National Reference Laboratory for Aquatic Animal Diseases, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Preetam Ghosh
  - Department of Computational Science, Virginia Commonwealth University, Richmond, VA, United States
- Ricardo W Portela
  - Laboratory of Immunology and Molecular Biology, Health Sciences Institute, Federal University of Bahia, Salvador, Brazil
- Vasco A De Carvalho Azevedo
  - Molecular and Cellular Laboratory, General Biology Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Alice R Wattam
  - Biocomplexity Institute of Virginia Tech, Virginia Tech, Blacksburg, VA, United States
57. Jayaswal PK, Dogra V, Shanker A, Sharma TR, Singh NK. A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species. PLoS One 2017; 12:e0184276. [PMID: 28922368 PMCID: PMC5603157 DOI: 10.1371/journal.pone.0184276] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/01/2017] [Accepted: 08/21/2017] [Indexed: 01/07/2023] [Open access]
Abstract
Rapid advances in DNA sequencing technologies have resulted in the accumulation of large data sets in the public domain, facilitating comparative studies to provide novel insights into the evolution of life. Phylogenetic studies across the eukaryotic taxa have been reported but on the basis of a limited number of genes. Here we present a genome-wide analysis across different plant, fungal, protist, and animal species, with reference to the 36,002 expressed genes of the rice genome. Our analysis revealed 9831 genes unique to rice and 98 genes conserved across all 49 eukaryotic species analysed. The 98 genes conserved across diverse eukaryotes mostly exhibited binding and catalytic activities and shared common sequence motifs, and hence appeared to have a common origin. The 98 conserved genes belonged to 22 functional gene families including 26S protease, actin, ADP-ribosylation factor, ATP synthase, casein kinase, DEAD-box protein, DnaK, elongation factor 2, glyceraldehyde 3-phosphate, phosphatase 2A, ras-related protein, Ser/Thr protein phosphatase family protein, tubulin, ubiquitin and others. The consensus Bayesian eukaryotic tree of life developed in this study demonstrated widely separated clades of plants, fungi, and animals. Musa acuminata provided an evolutionary link between monocotyledons and dicotyledons, and Salpingoeca rosetta provided an evolutionary link between fungi and animals, indicating that protozoan species are close relatives of fungi and animals. The divergence times for 1176 species pairs were estimated accurately by integrating fossil information with synonymous substitution rates in the comprehensive set of 98 genes. The present study provides valuable insight into the evolution of eukaryotes.
Affiliation(s)
- Pawan Kumar Jayaswal
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
  - Banasthali University, Banasthali, Rajasthan, India
- Vivek Dogra
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
- Asheesh Shanker
  - Bioinformatics Programme, Centre for Biological Sciences, Central University of South Bihar, Patna, Bihar, India
- Tilak Raj Sharma
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
- Nagendra Kumar Singh
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
58. Moffat JG, Vincent F, Lee JA, Eder J, Prunotto M. Opportunities and challenges in phenotypic drug discovery: an industry perspective. Nat Rev Drug Discov 2017; 16:531-543. [PMID: 28685762 DOI: 10.1038/nrd.2017.111] [Citation(s) in RCA: 506] [Impact Index Per Article: 72.3] [Indexed: 11/10/2022]
Abstract
Phenotypic drug discovery (PDD) approaches do not rely on knowledge of the identity of a specific drug target or a hypothesis about its role in disease, in contrast to the target-based strategies that have been widely used in the pharmaceutical industry in the past three decades. However, in recent years, there has been a resurgence in interest in PDD approaches based on their potential to address the incompletely understood complexity of diseases and their promise of delivering first-in-class drugs, as well as major advances in the tools for cell-based phenotypic screening. Nevertheless, PDD approaches also have considerable challenges, such as hit validation and target deconvolution. This article focuses on the lessons learned by researchers engaged in PDD in the pharmaceutical industry and considers the impact of 'omics' knowledge in defining a cellular disease phenotype in the era of precision medicine, introducing the concept of a chain of translatability. We particularly aim to identify features and areas in which PDD can best deliver value to drug discovery portfolios and can contribute to the identification and the development of novel medicines, and to illustrate the challenges and uncertainties that are associated with PDD in order to help set realistic expectations with regard to its benefits and costs.
Affiliation(s)
- John G Moffat
  - Biochemical & Cellular Pharmacology, Genentech, South San Francisco, California 94080, USA
- Fabien Vincent
  - Discovery Sciences, Primary Pharmacology Group, Pfizer, Groton, Connecticut 06340, USA
- Jonathan A Lee
  - Department of Quantitative Biology, Eli Lilly and Company, Indianapolis, Indiana 46285, USA
- Jörg Eder
  - Novartis Institutes for Biomedical Research, 4002 Basel, Switzerland
- Marco Prunotto
  - Phenotype and Target ID, Chemical Biology, pRED, Roche, 4070 Basel, Switzerland
  - Present address: Office of Innovation, Immunology, Infectious Diseases & Ophthalmology (I2O), Roche Late Stage Development, 124 Grenzacherstrasse, 4070 Basel, Switzerland
59. Huber CD, Kim BY, Marsden CD, Lohmueller KE. Determining the factors driving selective effects of new nonsynonymous mutations. Proc Natl Acad Sci U S A 2017; 114:4465-4470. [PMID: 28400513 PMCID: PMC5410820 DOI: 10.1073/pnas.1619508114] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Indexed: 12/21/2022] [Open access]
Abstract
The distribution of fitness effects (DFE) of new mutations plays a fundamental role in evolutionary genetics. However, the extent to which the DFE differs across species has yet to be systematically investigated. Furthermore, the biological mechanisms determining the DFE in natural populations remain unclear. Here, we show that theoretical models emphasizing different biological factors at determining the DFE, such as protein stability, back-mutations, species complexity, and mutational robustness make distinct predictions about how the DFE will differ between species. Analyzing amino acid-changing variants from natural populations in a comparative population genomic framework, we find that humans have a higher proportion of strongly deleterious mutations than Drosophila melanogaster. Furthermore, when comparing the DFE across yeast, Drosophila, mice, and humans, the average selection coefficient becomes more deleterious with increasing species complexity. Last, pleiotropic genes have a DFE that is less variable than that of nonpleiotropic genes. Comparing four categories of theoretical models, only Fisher's geometrical model (FGM) is consistent with our findings. FGM assumes that multiple phenotypes are under stabilizing selection, with the number of phenotypes defining the complexity of the organism. Our results suggest that long-term population size and cost of complexity drive the evolution of the DFE, with many implications for evolutionary and medical genomics.
Affiliation(s)
- Christian D Huber
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
- Bernard Y Kim
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
- Clare D Marsden
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA 90095
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095
60. Cho S, Kim MS, Jeong Y, Lee BR, Lee JH, Kang SG, Cho BK. Genome-wide primary transcriptome analysis of H2-producing archaeon Thermococcus onnurineus NA1. Sci Rep 2017; 7:43044. [PMID: 28216628 PMCID: PMC5316973 DOI: 10.1038/srep43044] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/03/2016] [Accepted: 01/18/2017] [Indexed: 01/09/2023] [Open access]
Abstract
In spite of their pivotal roles in transcriptional and post-transcriptional processes, the regulatory elements of archaeal genomes are not yet fully understood. Here, we determine the primary transcriptome of the H2-producing archaeon Thermococcus onnurineus NA1. We identified 1,082 purine-rich transcription initiation sites along with a well-conserved TATA box, A-rich B recognition element (BRE), and promoter proximal element (PPE) motif in promoter regions, a high pyrimidine nucleotide content (T/C) at the -1 position, and Shine-Dalgarno (SD) motifs (GGDGRD) in 5' untranslated regions (5' UTRs). Along with differential transcript levels, 117 leaderless genes and 86 non-coding RNAs (ncRNAs) were identified, representing diverse cellular functions and potential regulatory functions under the different growth conditions. Interestingly, we observed low GC content in ncRNAs for RNA-based regulation via unstructured forms or interaction with other cellular components. Further comparative analysis of T. onnurineus upstream regulatory sequences with those of closely related archaeal genomes demonstrated that transcription of orthologous genes is initiated by highly conserved promoter sequences; however, their upstream sequences for transcriptional and translational regulation are largely diverse. These results provide the genetic information of T. onnurineus for its future application in metabolic engineering.
Affiliation(s)
- Suhyung Cho
  - Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
- Min-Sik Kim
  - Korea Institute of Ocean Science and Technology, Ansan 426-744, Republic of Korea
- Yujin Jeong
  - Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
- Bo-Rahm Lee
  - Intelligent Synthetic Biology Center, Daejeon 305-701, Republic of Korea
- Jung-Hyun Lee
  - Korea Institute of Ocean Science and Technology, Ansan 426-744, Republic of Korea
- Sung Gyun Kang
  - Korea Institute of Ocean Science and Technology, Ansan 426-744, Republic of Korea
- Byung-Kwan Cho
  - Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
  - Intelligent Synthetic Biology Center, Daejeon 305-701, Republic of Korea
61. Comparative analysis of DNA methylome and transcriptome of skeletal muscle in lean-, obese-, and mini-type pigs. Sci Rep 2017; 7:39883. [PMID: 28045116 PMCID: PMC5206674 DOI: 10.1038/srep39883] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 08/22/2016] [Accepted: 11/29/2016] [Indexed: 02/07/2023] [Open access]
Abstract
DNA methylation plays a pivotal role in biological processes by affecting gene expression. However, how DNA methylation mediates phenotypic differences in skeletal muscle between lean-, obese-, and mini-type pigs remains unclear. We systematically carried out a comparative analysis of skeletal muscle by integrating genome-wide DNA methylation, mRNA, lncRNA and miRNA profiles in three different pig breeds (obese-type Tongcheng, lean-type Landrace, and mini-type Wuzhishan pigs). We found that the differentially methylated genes (DMGs) were significantly associated with lipid metabolism, oxidative stress and muscle development. Among the identified DMGs, 253 genes were related to body size and obesity. A set of lncRNAs and mRNAs including UCP3, FHL1, ANK1, HDAC4, and HDAC5 exhibited inversely changed DNA methylation and expression level; these genes were associated with oxidation reduction, fatty acid metabolism and cell proliferation. Gene regulatory networks involved in phenotypic variation of skeletal muscle were related to lipid metabolism, cellular movement, skeletal muscle development, and the p38 MAPK signaling pathway. DNA methylation potentially influences the propensity for obesity and body size by affecting gene expression in skeletal muscle. Our findings provide abundant epigenome and transcriptome information that will be useful for animal breeding and biomedical research.
62. Zhou S, Treloar AE, Lupien M. Emergence of the Noncoding Cancer Genome: A Target of Genetic and Epigenetic Alterations. Cancer Discov 2016; 6:1215-1229. [PMID: 27807102 DOI: 10.1158/2159-8290.cd-16-0745] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/13/2016] [Accepted: 08/17/2016] [Indexed: 12/14/2022]
Abstract
The emergence of whole-genome annotation approaches is paving the way for the comprehensive annotation of the human genome across diverse cell and tissue types exposed to various environmental conditions. This has already unmasked the positions of thousands of functional cis-regulatory elements integral to transcriptional regulation, such as enhancers, promoters, and anchors of chromatin interactions that populate the noncoding genome. Recent studies have shown that cis-regulatory elements are commonly the targets of genetic and epigenetic alterations associated with aberrant gene expression in cancer. Here, we review these findings to showcase the contribution of the noncoding genome and its alteration in the development and progression of cancer. We also highlight the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease. SIGNIFICANCE The majority of genetic and epigenetic alterations accumulate in the noncoding genome throughout oncogenesis. Discriminating driver from passenger events is a challenge that holds great promise to improve our understanding of the etiology of different cancer types. Advancing our understanding of the noncoding cancer genome may thus identify new therapeutic opportunities and accelerate our capacity to find improved biomarkers to monitor various stages of cancer development. Cancer Discov; 6(11); 1215-29. ©2016 AACR.
Affiliation(s)
- Stanley Zhou
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Aislinn E Treloar
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Mathieu Lupien
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
63. Mingo J, Erramuzpe A, Luna S, Aurtenetxe O, Amo L, Diez I, Schepens JTG, Hendriks WJAJ, Cortés JM, Pulido R. One-Tube-Only Standardized Site-Directed Mutagenesis: An Alternative Approach to Generate Amino Acid Substitution Collections. PLoS One 2016; 11:e0160972. [PMID: 27548698 PMCID: PMC4993582 DOI: 10.1371/journal.pone.0160972] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/23/2016] [Accepted: 07/27/2016] [Indexed: 12/30/2022] [Open access]
Abstract
Site-directed mutagenesis (SDM) is a powerful tool to create defined collections of protein variants for experimental and clinical purposes, but effectiveness is compromised when a large number of mutations is required. We present here a one-tube-only standardized SDM approach that generates comprehensive collections of amino acid substitution variants, including scanning- and single site-multiple mutations. The approach combines unified mutagenic primer design with the mixing of multiple distinct primer pairs and/or plasmid templates to increase the yield of a single inverse-PCR mutagenesis reaction. Also, a user-friendly program for automatic design of standardized primers for Ala-scanning mutagenesis is made available. Experimental results were compared with a modeling approach together with stochastic simulation data. For single site-multiple mutagenesis purposes and for simultaneous mutagenesis in different plasmid backgrounds, combination of primer sets and/or plasmid templates in a single reaction tube yielded the distinct mutations in a stochastic fashion. For scanning mutagenesis, we found that a combination of overlapping primer sets in a single PCR reaction allowed the yield of different individual mutations, although this yield did not necessarily follow a stochastic trend. Double mutants were generated when the overlap of primer pairs was below 60%. Our results illustrate that one-tube-only SDM effectively reduces the number of reactions required in large-scale mutagenesis strategies, facilitating the generation of comprehensive collections of protein variants suitable for functional analysis.
Affiliation(s)
- Janire Mingo: Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Asier Erramuzpe: Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Sandra Luna: Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Olaia Aurtenetxe: Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Laura Amo: Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Ibai Diez: Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain
- Jan T. G. Schepens: Department of Cell Biology, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Wiljan J. A. J. Hendriks: Department of Cell Biology, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Jesús M. Cortés: Quantitative Biomedicine Unit, Biocruces Health Research Institute, Barakaldo, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
- Rafael Pulido: Biomarkers in Cancer Unit, Biocruces Health Research Institute, Barakaldo, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
64
Coppola CJ, Ramaker RC, Mendenhall EM. Identification and function of enhancers in the human genome. Hum Mol Genet 2016; 25:R190-R197. [PMID: 27402881 DOI: 10.1093/hmg/ddw216] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/27/2016] [Accepted: 06/30/2016] [Indexed: 12/31/2022] Open
Abstract
The study of gene regulation has rapidly advanced by leveraging next-generation sequencing to identify and characterize the cis and trans elements that are critical for defining cell identity. These advances have paralleled a movement towards whole genome sequencing in clinics. These two tracks have increasingly synergized to underscore the importance of cis-regulatory elements in development as well as to produce countless studies implicating these elements in human disease. Other studies have emphasized the clinical phenotypes associated with variation or mutations in trans factors, including non-coding RNAs and chromatin regulators. These studies highlight the importance of obtaining a comprehensive understanding of mammalian gene regulation for predicting the impact of genetic variation on patient phenotypes. Currently lagging behind the generation of vast datasets and annotations is our ability to examine these putative elements in the dynamic context of a developing organism.
Affiliation(s)
- Ryne C Ramaker: HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA; University of Alabama at Birmingham, Birmingham, AL, USA
- Eric M Mendenhall: University of Alabama in Huntsville, Huntsville, AL, USA; HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
65
66
Ågren JA, Huang HR, Wright SI. Transposable element evolution in the allotetraploid Capsella bursa-pastoris. AMERICAN JOURNAL OF BOTANY 2016; 103:1197-1202. [PMID: 27440791 DOI: 10.3732/ajb.1600103] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 03/03/2016] [Accepted: 06/20/2016] [Indexed: 06/06/2023]
Abstract
PREMISE OF THE STUDY: Shifts in ploidy affect the evolutionary dynamics of genomes in a myriad of ways. Population genetic theory predicts that transposable element (TE) proliferation may follow because the genomewide efficacy of selection should be reduced and the increase in gene copies may mask the deleterious effects of TE insertions. Moreover, in allopolyploids, TEs may further accumulate because of hybrid breakdown of TE silencing. However, to date the evidence of TE proliferation following an increase in ploidy is mixed, and the relative importance of relaxed selection vs. silencing breakdown remains unclear. METHODS: We used high-coverage whole-genome sequence data to evaluate the abundance, genomic distribution, and population frequencies of TEs in the self-fertilizing recent allotetraploid Capsella bursa-pastoris (Brassicaceae). We then compared the C. bursa-pastoris TE profile with that of its two parental diploid species, outcrossing C. grandiflora and self-fertilizing C. orientalis. KEY RESULTS: We found no evidence that C. bursa-pastoris has experienced a large genomewide proliferation of TEs relative to its parental species. However, when centromeric regions are excluded, we found evidence of significantly higher abundance of retrotransposons in C. bursa-pastoris along the gene-rich chromosome arms compared with C. grandiflora and C. orientalis. CONCLUSIONS: The lack of a genomewide effect of allopolyploidy on TE abundance, combined with the increased TE abundance in gene-rich regions, suggests that relaxed selection rather than hybrid breakdown of host silencing explains the TE accumulation in C. bursa-pastoris.
Affiliation(s)
- J Arvid Ågren: Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Hui-Run Huang: Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, the Chinese Academy of Sciences, China
- Stephen I Wright: Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
67
Lewis KN, Soifer I, Melamud E, Roy M, McIsaac RS, Hibbs M, Buffenstein R. Unraveling the message: insights into comparative genomics of the naked mole-rat. Mamm Genome 2016; 27:259-78. [PMID: 27364349 PMCID: PMC4935753 DOI: 10.1007/s00335-016-9648-5] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 03/03/2016] [Accepted: 05/09/2016] [Indexed: 12/21/2022]
Abstract
Animals have evolved to survive, and even thrive, in different environments. Genetic adaptations may have indirectly created phenotypes that also resulted in a longer lifespan. One example of this phenomenon is the preternaturally long-lived naked mole-rat. This strictly subterranean rodent tolerates hypoxia, hypercapnia, and soil-based toxins. Naked mole-rats also exhibit pronounced resistance to cancer and an attenuated decline of many physiological characteristics that often decline as mammals age. Elucidating mechanisms that give rise to their unique phenotypes will lead to better understanding of subterranean ecophysiology and biology of aging. Comparative genomics could be a useful tool in this regard. Since the publication of a naked mole-rat genome assembly in 2011, analyses of genomic and transcriptomic data have enabled a clearer understanding of mole-rat evolutionary history and suggested molecular pathways (e.g., NRF2-signaling activation and DNA damage repair mechanisms) that may explain the extraordinary longevity and unique health traits of this species. However, careful scrutiny and re-analysis suggest that some identified features result from incorrect or imprecise annotation and assembly of the naked mole-rat genome; in addition, some of these conclusions (e.g., genes involved in cancer resistance and hairlessness) are rejected when the analysis includes additional, more closely related species. We describe how the combination of better study design, improved genomic sequencing techniques, and new bioinformatic and data analytical tools will improve comparative genomics and ultimately bridge the gap between traditional model and nonmodel organisms.
Affiliation(s)
- Kaitlyn N Lewis: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
- Ilya Soifer: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
- Eugene Melamud: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
- Margaret Roy: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
- R Scott McIsaac: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
- Matthew Hibbs: Computer Science Department, Trinity University, San Antonio, TX, 78212, USA
- Rochelle Buffenstein: Calico Life Sciences LLC, 1170 Veterans Blvd, South San Francisco, CA, 94080, USA
68
Lee J, Hong WY, Cho M, Sim M, Lee D, Ko Y, Kim J. Synteny Portal: a web-based application portal for synteny block analysis. Nucleic Acids Res 2016; 44:W35-40. [PMID: 27154270 PMCID: PMC4987893 DOI: 10.1093/nar/gkw310] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 02/03/2016] [Accepted: 04/12/2016] [Indexed: 11/12/2022] Open
Abstract
Recent advances in next-generation sequencing technologies and genome assembly algorithms have enabled the accumulation of a huge volume of genome sequences from various species. This has provided new opportunities for large-scale comparative genomics studies. Identifying and utilizing synteny blocks, which are genomic regions conserved among multiple species, is key to understanding genomic architecture and the evolutionary history of genomes. However, the construction and visualization of such synteny blocks from multiple species are very challenging, especially for biologists with a lack of computational skills. Here, we present Synteny Portal, a versatile web-based application portal for constructing, visualizing and browsing synteny blocks. With Synteny Portal, users can easily (i) construct synteny blocks among multiple species by using prebuilt alignments in the UCSC genome browser database, (ii) visualize and download syntenic relationships as high-quality images, (iii) browse synteny blocks with genetic information and (iv) download the details of synteny blocks to be used as input for downstream synteny-based analyses, all in an intuitive and easy-to-use web-based interface. We believe that Synteny Portal will serve as a highly valuable tool that will enable biologists to easily perform comparative genomics studies by compensating for the limitations of existing tools. Synteny Portal is freely available at http://bioinfo.konkuk.ac.kr/synteny_portal.
Affiliation(s)
- Jongin Lee: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
- Woon-Young Hong: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
- Minah Cho: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
- Mikang Sim: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
- Daehwan Lee: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
- Younhee Ko: Department of Clinical Genetics, Department of Pediatrics, Yonsei University College of Medicine, Seoul 03722, South Korea
- Jaebum Kim: Department of Animal Biotechnology, Konkuk University, Seoul 05029, South Korea
69
Faisal FE, Meng L, Crawford J, Milenković T. The post-genomic era of biological network alignment. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2015; 2015:3. [PMID: 28194172 PMCID: PMC5270500 DOI: 10.1186/s13637-015-0022-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/21/2015] [Accepted: 05/18/2015] [Indexed: 11/10/2022]
Abstract
Biological network alignment aims to find regions of topological and functional (dis)similarities between molecular networks of different species. Then, network alignment can guide the transfer of biological knowledge from well-studied model species to less well-studied species between conserved (aligned) network regions, thus complementing valuable insights that have already been provided by genomic sequence alignment. Here, we review computational challenges behind the network alignment problem, existing approaches for solving the problem, ways of evaluating their alignment quality, and the approaches' biomedical applications. We discuss recent innovative efforts of improving the existing view of network alignment. We conclude with open research questions in comparative biological network research that could further our understanding of principles of life, evolution, disease, and therapeutics.
Affiliation(s)
- Fazle E Faisal: Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA; Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA; ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Lei Meng: Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA
- Joseph Crawford: Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA; Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA; ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
- Tijana Milenković: Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556 USA; Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556 USA; ECK Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556 USA
70
Dousse A, Junier T, Zdobnov EM. CEGA--a catalog of conserved elements from genomic alignments. Nucleic Acids Res 2015; 44:D96-100. [PMID: 26527719 PMCID: PMC4702837 DOI: 10.1093/nar/gkv1163] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 08/14/2015] [Accepted: 10/20/2015] [Indexed: 01/05/2023] Open
Abstract
By identifying genomic sequence regions conserved among several species, comparative genomics offers opportunities to discover putatively functional elements without any prior knowledge of what these functions might be. Comparative analyses across mammals estimated 4-5% of the human genome to be functionally constrained, a much larger fraction than the 1-2% occupied by annotated protein-coding or RNA genes. Such functionally constrained yet unannotated regions have been referred to as conserved non-coding sequences (CNCs) or ultra-conserved elements (UCEs), which remain largely uncharacterized but probably form a highly heterogeneous group of elements including enhancers, promoters, motifs, and others. To facilitate the study of such CNCs/UCEs, we present our resource of Conserved Elements from Genomic Alignments (CEGA), accessible from http://cega.ezlab.org. Harnessing the power of multiple species comparisons to detect genomic elements under purifying selection, CEGA provides a comprehensive set of CNCs identified at different radiations along the vertebrate lineage. Evolutionary constraint is identified using threshold-free phylogenetic modeling of unbiased and sensitive global alignments of genomic synteny blocks identified using protein orthology. We identified CNCs independently for five vertebrate clades, each referring to a different last common ancestor and therefore to an overlapping but varying set of CNCs with 24 488 in vertebrates, 241 575 in amniotes, 709 743 in Eutheria, 642 701 in Boreoeutheria and 612 364 in Euarchontoglires, spanning from 6 Mbp in vertebrates to 119 Mbp in Euarchontoglires. The dynamic CEGA web interface displays alignments, genomic locations, as well as biologically relevant data to help prioritize and select CNCs of interest for further functional investigations.
Affiliation(s)
- Aline Dousse: Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Thomas Junier: Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Evgeny M Zdobnov: Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
71
Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015; 2015:896176. [PMID: 26543497 PMCID: PMC4620428 DOI: 10.1155/2015/896176] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/17/2015] [Accepted: 06/15/2015] [Indexed: 12/20/2022]
Abstract
High-throughput technology has driven progress in omics studies, including genomics and transcriptomics. We review the improvements in comparative omic studies that are attributable to high-throughput measurement by next-generation sequencing technology. Comparative genomics has been successfully applied to evolutionary analysis, while comparative transcriptomics is used to compare the expression profiles of two subjects by differential expression or differential coexpression, which enables its application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting morphogenesis during development, and previous work has sought to identify the most conserved stages of embryonic development. Earlier measurements in these studies were based on morphological similarity at the macro level, whereas new technology enables detection of similarity in molecular mechanisms at the micro level. Evolutionary models of embryonic development, including the "funnel-like" model and the "hourglass" model, have been evaluated by combining these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has brought EVO-DEVO studies into a new era, technological and material limitations still exist, and further investigation requires more careful study design and procedures.
72
Tan MH, Gan HM, Gan HY, Lee YP, Croft LJ, Schultz MB, Miller AD, Austin CM. First comprehensive multi-tissue transcriptome of Cherax quadricarinatus (Decapoda: Parastacidae) reveals unexpected diversity of endogenous cellulase. ORG DIVERS EVOL 2015. [DOI: 10.1007/s13127-015-0237-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 01/02/2023]
73
Dey G, Meyer T. Phylogenetic Profiling for Probing the Modular Architecture of the Human Genome. Cell Syst 2015; 1:106-15. [PMID: 27135799 DOI: 10.1016/j.cels.2015.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/06/2015] [Revised: 08/03/2015] [Accepted: 08/10/2015] [Indexed: 12/22/2022]
Abstract
Information about functional connections between genes can be derived from patterns of coupled loss of their homologs across multiple species. This comparative approach, termed phylogenetic profiling, has been successfully used to infer genetic interactions in bacteria and eukaryotes. Rapid progress in sequencing eukaryotic species has enabled the recent phylogenetic profiling of the human genome, resulting in systematic functional predictions for uncharacterized human genes. Importantly, groups of co-evolving genes reveal widespread modularity in the underlying genetic network, facilitating experimental analyses in human cells as well as comparative studies of conserved functional modules across species. This strategy is particularly successful in identifying novel metabolic proteins and components of multi-protein complexes. The targeted sequencing of additional key eukaryotes and the incorporation of improved methods to generate and compare phylogenetic profiles will further boost the predictive power and utility of this evolutionary approach to the functional analysis of gene interaction networks.
Affiliation(s)
- Gautam Dey: Chemical and Systems Biology, Stanford University, Stanford CA 94305, USA
- Tobias Meyer: Chemical and Systems Biology, Stanford University, Stanford CA 94305, USA
74
Hongo JA, de Castro GM, Cintra LC, Zerlotini A, Lobo FP. POTION: an end-to-end pipeline for positive Darwinian selection detection in genome-scale data through phylogenetic comparison of protein-coding genes. BMC Genomics 2015; 16:567. [PMID: 26231214 PMCID: PMC4521464 DOI: 10.1186/s12864-015-1765-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 11/20/2014] [Accepted: 07/10/2015] [Indexed: 11/29/2022] Open
Abstract
Background: Detection of genes evolving under positive Darwinian evolution in genome-scale data is nowadays a prevailing strategy in comparative genomics studies to identify genes potentially involved in adaptation processes. Despite the large number of studies aiming to detect and contextualize such gene sets, there is virtually no software available to perform this task in a general, automatic, large-scale and reliable manner. This certainly occurs due to the computational challenges involved in this task, such as the appropriate modeling of data under analysis, the computation time to perform several of the required steps when dealing with genome-scale data and the highly error-prone nature of the sequence and alignment data structures needed for genome-wide positive selection detection. Results: We present POTION, an open source, modular and end-to-end software for genome-scale detection of positive Darwinian selection in groups of homologous coding sequences. Our software represents a key step towards genome-scale, automated detection of positive selection, from predicted coding sequences and their homology relationships to high-quality groups of positively selected genes. POTION reduces false positives through several sophisticated sequence and group filters based on numeric, phylogenetic, quality and conservation criteria to remove spurious data and through multiple hypothesis corrections, and considerably reduces computation time thanks to a parallelized design. Our software achieved a high classification performance when used to evaluate a curated dataset of Trypanosoma brucei paralogs previously surveyed for positive selection. When used to analyze predicted groups of homologous genes of 19 strains of Mycobacterium tuberculosis as a case study, we demonstrated that the filters implemented in POTION remove sources of error that commonly inflate errors in positive selection detection. A thorough literature review found no other software similar to POTION in terms of customization, scale and automation. Conclusion: To the best of our knowledge, POTION is the first tool to allow users to construct and check hypotheses regarding the occurrence of site-based evidence of positive selection in non-curated, genome-scale data within a feasible time frame and with no human intervention after initial configuration. POTION is available at http://www.lmb.cnptia.embrapa.br/share/POTION/. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1765-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jorge A Hongo: Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Campinas, São Paulo, 13083-886, Brazil
- Giovanni M de Castro: Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Campinas, São Paulo, 13083-886, Brazil
- Leandro C Cintra: Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Campinas, São Paulo, 13083-886, Brazil
- Adhemar Zerlotini: Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Campinas, São Paulo, 13083-886, Brazil
- Francisco P Lobo: Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Campinas, São Paulo, 13083-886, Brazil
75
Identification of cis-suppression of human disease mutations by comparative genomics. Nature 2015; 524:225-9. [PMID: 26123021 DOI: 10.1038/nature14497] [Citation(s) in RCA: 96] [Impact Index Per Article: 10.7] [Received: 10/26/2014] [Accepted: 04/23/2015] [Indexed: 11/08/2022]
Abstract
Patterns of amino acid conservation have served as a tool for understanding protein evolution. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity.
76
Grueber CE. Comparative genomics for biodiversity conservation. Comput Struct Biotechnol J 2015; 13:370-5. [PMID: 26106461 PMCID: PMC4475778 DOI: 10.1016/j.csbj.2015.05.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 01/29/2015] [Revised: 05/13/2015] [Accepted: 05/15/2015] [Indexed: 12/31/2022] Open
Abstract
Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem.
77
Villar D, Berthelot C, Aldridge S, Rayner TF, Lukk M, Pignatelli M, Park TJ, Deaville R, Erichsen JT, Jasinska AJ, Turner JMA, Bertelsen MF, Murchison EP, Flicek P, Odom DT. Enhancer evolution across 20 mammalian species. Cell 2015; 160:554-66. [PMID: 25635462 PMCID: PMC4313353 DOI: 10.1016/j.cell.2015.01.006] [Citation(s) in RCA: 467] [Impact Index Per Article: 51.9] [Received: 08/29/2014] [Revised: 10/31/2014] [Accepted: 12/15/2014] [Indexed: 12/21/2022]
Abstract
The mammalian radiation has corresponded with rapid changes in noncoding regions of the genome, but we lack a comprehensive understanding of regulatory evolution in mammals. Here, we track the evolution of promoters and enhancers active in liver across 20 mammalian species from six diverse orders by profiling genomic enrichment of H3K27 acetylation and H3K4 trimethylation. We report that rapid evolution of enhancers is a universal feature of mammalian genomes. Most of the recently evolved enhancers arise from ancestral DNA exaptation, rather than lineage-specific expansions of repeat elements. In contrast, almost all liver promoters are partially or fully conserved across these species. Our data further reveal that recently evolved enhancers can be associated with genes under positive selection, demonstrating the power of this approach for annotating regulatory adaptations in genomic sequences. These results provide important insight into the functional genetics underpinning mammalian regulatory evolution.
Collapse
Affiliation(s)
- Diego Villar
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
| | - Camille Berthelot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sarah Aldridge
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
| | - Tim F Rayner
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
| | - Margus Lukk
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
| | - Miguel Pignatelli
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Thomas J Park
- Department of Biological Sciences, University of Illinois at Chicago (UIC), 845 West Taylor Street, Chicago, IL 60607, USA
| | - Robert Deaville
- UK Cetacean Strandings Investigation Programme (CSIP) and Institute of Zoology, Zoological Society of London, Outer Circle, Regent's Park, London NW1 4RY, UK
| | - Jonathan T Erichsen
- School of Optometry and Vision Sciences, Cardiff University, Maindy Road, Cardiff CF24 4HQ, UK
| | - Anna J Jasinska
- UCLA Center for Neurobehavioral Genetics, 695 Charles E. Young Drive South, Los Angeles, CA 90095, USA
| | - James M A Turner
- Division of Stem Cell Biology and Developmental Genetics, MRC National Institute for Medical Research, Mill Hill, London NW7 1AA, UK
| | - Mads F Bertelsen
- Center for Zoo and Wild Animal Health, Copenhagen Zoo, Roskildevej 38, DK-2000 Frederiksberg, Denmark
| | - Elizabeth P Murchison
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| | - Duncan T Odom
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
78
|
Abstract
The Genome 10K Project was established in 2009 by a consortium of biologists and genome scientists determined to facilitate the sequencing and analysis of the complete genomes of 10,000 vertebrate species. Since then the number of selected and initiated species has risen from ∼26 to 277 sequenced or ongoing with funding, an approximately tenfold increase in five years. Here we summarize the advances and commitments that have occurred by mid-2014 and outline the achievements and present challenges of reaching the 10,000-species goal. We summarize the status of known vertebrate genome projects, recommend standards for pronouncing a genome as sequenced or completed, and provide our present and future vision of the landscape of Genome 10K. The endeavor is ambitious, bold, expensive, and uncertain, but together the Genome 10K Consortium of Scientists and the worldwide genomics community are moving toward their goal of delivering to the coming generation the gift of genome empowerment for many vertebrate species.
Affiliation(s)
- Klaus-Peter Koepfli
- Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, 199034 St. Petersburg, Russian Federation
|
79
|
Hanifin CT, Gilly WF. Evolutionary history of a complex adaptation: tetrodotoxin resistance in salamanders. Evolution 2014; 69:232-44. [PMID: 25346116 DOI: 10.1111/evo.12552] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 03/21/2014] [Accepted: 10/01/2014] [Indexed: 12/27/2022]
Abstract
Understanding the processes that generate novel adaptive phenotypes is central to evolutionary biology. We used comparative analyses to reveal the history of tetrodotoxin (TTX) resistance in TTX-bearing salamanders. Resistance to TTX is a critical component of the ability to use TTX defensively but the origin of the TTX-bearing phenotype is unclear. Skeletal muscle of TTX-bearing salamanders (modern newts, family: Salamandridae) is unaffected by TTX at doses far in excess of those that block action potentials in muscle and nerve of other vertebrates. Skeletal muscle of non-TTX-bearing salamandrids is also resistant to TTX but at lower levels. Skeletal muscle TTX resistance in the Salamandridae results from the expression of TTX-resistant variants of the voltage-gated sodium channel NaV 1.4 (SCN4a). We identified four substitutions in the coding region of salSCN4a that are likely responsible for the TTX resistance measured in TTX-bearing salamanders and variation at one of these sites likely explains variation in TTX resistance among other lineages. Our results suggest that exaptation has played a role in the evolution of the TTX-bearing phenotype and provide empirical evidence that complex physiological adaptations can arise through the accumulation of beneficial mutations in the coding region of conserved proteins.
|
80
|
Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B, Howard JT, Suh A, Weber CC, da Fonseca RR, Li J, Zhang F, Li H, Zhou L, Narula N, Liu L, Ganapathy G, Boussau B, Bayzid MS, Zavidovych V, Subramanian S, Gabaldón T, Capella-Gutiérrez S, Huerta-Cepas J, Rekepalli B, Munch K, Schierup M, Lindow B, Warren WC, Ray D, Green RE, Bruford MW, Zhan X, Dixon A, Li S, Li N, Huang Y, Derryberry EP, Bertelsen MF, Sheldon FH, Brumfield RT, Mello CV, Lovell PV, Wirthlin M, Schneider MPC, Prosdocimi F, Samaniego JA, Vargas Velazquez AM, Alfaro-Núñez A, Campos PF, Petersen B, Sicheritz-Ponten T, Pas A, Bailey T, Scofield P, Bunce M, Lambert DM, Zhou Q, Perelman P, Driskell AC, Shapiro B, Xiong Z, Zeng Y, Liu S, Li Z, Liu B, Wu K, Xiao J, Yinqi X, Zheng Q, Zhang Y, Yang H, Wang J, Smeds L, Rheindt FE, Braun M, Fjeldsa J, Orlando L, Barker FK, Jønsson KA, Johnson W, Koepfli KP, O'Brien S, Haussler D, Ryder OA, Rahbek C, Willerslev E, Graves GR, Glenn TC, McCormack J, Burt D, Ellegren H, Alström P, Edwards SV, Stamatakis A, Mindell DP, Cracraft J, Braun EL, Warnow T, Jun W, Gilbert MTP, Zhang G. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 2014; 346:1320-31. [PMID: 25504713 PMCID: PMC4405904 DOI: 10.1126/science.1253451] [Citation(s) in RCA: 1124] [Impact Index Per Article: 112.4] [Indexed: 12/12/2022]
Abstract
To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.
Affiliation(s)
- Erich D Jarvis
- Department of Neurobiology, Howard Hughes Medical Institute (HHMI), and Duke University Medical Center, Durham, NC 27710, USA.
| | - Siavash Mirarab
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
| | - Andre J Aberer
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | - Bo Li
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China. College of Medicine and Forensics, Xi'an Jiaotong University Xi'an 710061, China. Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Peter Houde
- Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA
| | - Cai Li
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China. Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Simon Y W Ho
- School of Biological Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
| | - Brant C Faircloth
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA. Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Benoit Nabholz
- CNRS UMR 5554, Institut des Sciences de l'Evolution de Montpellier, Université Montpellier II Montpellier, France
| | - Jason T Howard
- Department of Neurobiology, Howard Hughes Medical Institute (HHMI), and Duke University Medical Center, Durham, NC 27710, USA
| | - Alexander Suh
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala Sweden
| | - Claudia C Weber
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala Sweden
| | - Rute R da Fonseca
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Jianwen Li
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Fang Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Hui Li
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Long Zhou
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Nitish Narula
- Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA. Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology Onna-son, Okinawa 904-0495, Japan
| | - Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - Ganesh Ganapathy
- Department of Neurobiology, Howard Hughes Medical Institute (HHMI), and Duke University Medical Center, Durham, NC 27710, USA
| | - Bastien Boussau
- Laboratoire de Biométrie et Biologie Evolutive, Centre National de la Recherche Scientifique, Université de Lyon, F-69622 Villeurbanne, France
| | - Md Shamsuzzoha Bayzid
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
| | - Volodymyr Zavidovych
- Department of Neurobiology, Howard Hughes Medical Institute (HHMI), and Duke University Medical Center, Durham, NC 27710, USA
| | - Sankar Subramanian
- Environmental Futures Research Institute, Griffith University, Nathan, Queensland 4111, Australia
| | - Toni Gabaldón
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Dr. Aiguader 88, 08003 Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain. Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
| | - Salvador Capella-Gutiérrez
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Dr. Aiguader 88, 08003 Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
| | - Jaime Huerta-Cepas
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Dr. Aiguader 88, 08003 Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
| | - Bhanu Rekepalli
- Joint Institute for Computational Sciences, The University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Kasper Munch
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
| | - Mikkel Schierup
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
| | - Bent Lindow
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Wesley C Warren
- The Genome Institute, Washington University School of Medicine, St Louis, MO 63108, USA
| | - David Ray
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA. Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Richard E Green
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA
| | - Michael W Bruford
- Organisms and Environment Division, Cardiff School of Biosciences, Cardiff University Cardiff CF10 3AX, Wales, UK
| | - Xiangjiang Zhan
- Organisms and Environment Division, Cardiff School of Biosciences, Cardiff University Cardiff CF10 3AX, Wales, UK. Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Andrew Dixon
- International Wildlife Consultants, Carmarthen SA33 5YL, Wales, UK
| | - Shengbin Li
- College of Medicine and Forensics, Xi'an Jiaotong University Xi'an, 710061, China
| | - Ning Li
- State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing 100094, China
| | - Yinhua Huang
- State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing 100094, China
| | - Elizabeth P Derryberry
- Department of Ecology and Evolutionary Biology, Tulane University, New Orleans, LA 70118, USA. Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Mads Frost Bertelsen
- Center for Zoo and Wild Animal Health, Copenhagen Zoo Roskildevej 38, DK-2000 Frederiksberg, Denmark
| | - Frederick H Sheldon
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Robb T Brumfield
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Claudio V Mello
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR 97239, USA. Brazilian Avian Genome Consortium (CNPq/FAPESPA-SISBIO Aves), Federal University of Para, Belem, Para, Brazil
| | - Peter V Lovell
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR 97239, USA
| | - Morgan Wirthlin
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR 97239, USA
| | - Maria Paula Cruz Schneider
- Brazilian Avian Genome Consortium (CNPq/FAPESPA-SISBIO Aves), Federal University of Para, Belem, Para, Brazil. Institute of Biological Sciences, Federal University of Para, Belem, Para, Brazil
| | - Francisco Prosdocimi
- Brazilian Avian Genome Consortium (CNPq/FAPESPA-SISBIO Aves), Federal University of Para, Belem, Para, Brazil. Institute of Medical Biochemistry Leopoldo de Meis, Federal University of Rio de Janeiro, Rio de Janeiro RJ 21941-902, Brazil
| | - José Alfredo Samaniego
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Amhed Missael Vargas Velazquez
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Alonzo Alfaro-Núñez
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Paula F Campos
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Bent Petersen
- Centre for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark Kemitorvet 208, 2800 Kgs Lyngby, Denmark
| | - Thomas Sicheritz-Ponten
- Centre for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark Kemitorvet 208, 2800 Kgs Lyngby, Denmark
| | - An Pas
- Breeding Centre for Endangered Arabian Wildlife, Sharjah, United Arab Emirates
| | - Tom Bailey
- Dubai Falcon Hospital, Dubai, United Arab Emirates
| | - Paul Scofield
- Canterbury Museum Rolleston Avenue, Christchurch 8050, New Zealand
| | - Michael Bunce
- Trace and Environmental DNA Laboratory Department of Environment and Agriculture, Curtin University, Perth, Western Australia 6102, Australia
| | - David M Lambert
- Environmental Futures Research Institute, Griffith University, Nathan, Queensland 4111, Australia
| | - Qi Zhou
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Polina Perelman
- Laboratory of Genomic Diversity, National Cancer Institute Frederick, MD 21702, USA. Institute of Molecular and Cellular Biology, SB RAS and Novosibirsk State University, Novosibirsk, Russia
| | - Amy C Driskell
- Smithsonian Institution National Museum of Natural History, Washington, DC 20013, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA
| | - Zijun Xiong
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Yongli Zeng
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Shiping Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Zhenyu Li
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Binghang Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Kui Wu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Jin Xiao
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Xiong Yinqi
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Qiuemei Zheng
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Yong Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Jian Wang
- BGI-Shenzhen, Shenzhen 518083, China
| | - Linnea Smeds
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala Sweden
| | - Frank E Rheindt
- Department of Biological Sciences, National University of Singapore, Republic of Singapore
| | - Michael Braun
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Suitland, MD 20746, USA
| | - Jon Fjeldsa
- Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
| | - Ludovic Orlando
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - F Keith Barker
- Bell Museum of Natural History, University of Minnesota, Saint Paul, MN 55108, USA
| | - Knud Andreas Jønsson
- Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark. Department of Life Sciences, Natural History Museum, Cromwell Road, London SW7 5BD, UK. Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK
| | - Warren Johnson
- Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630, USA
| | - Klaus-Peter Koepfli
- Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC 20008, USA
| | - Stephen O'Brien
- Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia 199004. Oceanographic Center, Nova Southeastern University, Ft Lauderdale, FL 33004, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, UCSC, Santa Cruz, CA 95064, USA
| | - Oliver A Ryder
- San Diego Zoo Institute for Conservation Research, Escondido, CA 92027, USA
| | - Carsten Rahbek
- Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark. Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Gary R Graves
- Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark. Department of Vertebrate Zoology, MRC-116, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013, USA
| | - Travis C Glenn
- Department of Environmental Health Science, University of Georgia, Athens, GA 30602, USA
| | - John McCormack
- Moore Laboratory of Zoology and Department of Biology, Occidental College, Los Angeles, CA 90041, USA
| | - Dave Burt
- Department of Genomics and Genetics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
| | - Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala Sweden
| | - Per Alström
- Swedish Species Information Centre, Swedish University of Agricultural Sciences Box 7007, SE-750 07 Uppsala, Sweden. Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Alexandros Stamatakis
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany. Institute of Theoretical Informatics, Department of Informatics, Karlsruhe Institute of Technology, D- 76131 Karlsruhe, Germany
| | - David P Mindell
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94158, USA
| | - Joel Cracraft
- Department of Ornithology, American Museum of Natural History, New York, NY 10024, USA
| | - Edward L Braun
- Department of Biology and Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Tandy Warnow
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA. Departments of Bioengineering and Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
| | - Wang Jun
- BGI-Shenzhen, Shenzhen 518083, China. Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark. Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah 21589, Saudi Arabia. Macau University of Science and Technology, Avenida Wai long, Taipa, Macau 999078, China. Department of Medicine, University of Hong Kong, Hong Kong.
| | - M Thomas P Gilbert
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark. Trace and Environmental DNA Laboratory Department of Environment and Agriculture, Curtin University, Perth, Western Australia 6102, Australia.
| | - Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China. Centre for Social Evolution, Department of Biology, Universitetsparken 15, University of Copenhagen, DK-2100 Copenhagen, Denmark.
|
81
|
Gorlov IP, Moore JH, Peng B, Jin JL, Gorlova OY, Amos CI. SNP characteristics predict replication success in association studies. Hum Genet 2014; 133:1477-86. [PMID: 25273843 PMCID: PMC4384517 DOI: 10.1007/s00439-014-1493-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 07/10/2014] [Accepted: 09/25/2014] [Indexed: 02/03/2023]
Abstract
Successful independent replication is the most direct approach for distinguishing real genotype-disease associations from false discoveries in genome-wide association studies (GWAS). Selecting SNPs for replication has been primarily based on P values from the discovery stage, although additional characteristics of SNPs may be used to improve replication success. We used disease-associated SNPs from more than 2,000 published GWASs to identify predictors of SNP reproducibility. SNP reproducibility was defined as the proportion of successful replications among all replication attempts. The study reporting an association for the first time was considered the discovery study, and all subsequent studies targeting the same phenotype were considered replications. We found that -Log(P), where P is the P value from the discovery study, is the strongest predictor of SNP reproducibility. Other significant predictors include the type of SNP (e.g., missense vs intronic SNPs) and minor allele frequency. Features of the genes linked to the disease-associated SNP also predict SNP reproducibility. Based on empirically defined rules, we developed a reproducibility score (RS) to predict SNP reproducibility independently of -Log(P). We used data from two lung cancer GWAS studies as well as recently reported disease-associated SNPs to validate RS. -Log(P) outperforms RS when the very top SNPs are selected, while RS works better with relaxed selection criteria. In conclusion, we propose an empirical model to predict SNP reproducibility, which can be used to select SNPs for validation and prioritization.
Affiliation(s)
- Ivan P Gorlov
- Department of Community and Family Medicine, Geisel School of Medicine, Dartmouth College, 74 College Street Vail 7th Floor, HB 7260 Vail, Hanover, NH, 03755, USA
|
82
|
Chowanadisai W. Comparative genomic analysis of slc39a12/ZIP12: insight into a zinc transporter required for vertebrate nervous system development. PLoS One 2014; 9:e111535. [PMID: 25375179 PMCID: PMC4222902 DOI: 10.1371/journal.pone.0111535] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/15/2013] [Accepted: 10/04/2014] [Indexed: 01/23/2023] Open
Abstract
The zinc transporter ZIP12, which is encoded by the gene slc39a12, has previously been shown to be important for neuronal differentiation in mouse Neuro-2a neuroblastoma cells and primary mouse neurons and necessary for neurulation during Xenopus tropicalis embryogenesis. However, relatively little is known about the biochemical properties, cellular regulation, or the physiological role of this gene. The hypothesis that ZIP12 is a zinc transporter important for nervous system function and development guided a comparative genetics approach to uncover the presence of ZIP12 in various genomes and identify conserved sequences and expression patterns associated with ZIP12. Ortholog detection of slc39a12 was conducted with reciprocal BLAST hits with the amino acid sequence of human ZIP12 in comparison to the human paralog ZIP4 and conserved local synteny between genomes. ZIP12 is present in the genomes of almost all vertebrates examined, from humans and other mammals to most teleost fish. However, ZIP12 appears to be absent from the zebrafish genome. The discrimination of ZIP12 compared to ZIP4 was unsuccessful or inconclusive in other invertebrate chordates and deuterostomes. Splice variation, due to the inclusion or exclusion of a conserved exon, is present in humans, rats, and cows and likely has biological significance. ZIP12 also possesses many putative di-leucine and tyrosine motifs often associated with intracellular trafficking, which may control cellular zinc uptake activity through the localization of ZIP12 within the cell. These findings highlight multiple aspects of ZIP12 at the biochemical, cellular, and physiological levels with likely biological significance. ZIP12 appears to have conserved function as a zinc uptake transporter in vertebrate nervous system development. Consequently, the role of ZIP12 may be an important link to reported congenital malformations in numerous animal models and humans that are caused by zinc deficiency.
Affiliation(s)
- Winyoo Chowanadisai
- Department of Nutrition, University of California Davis, Davis, California, United States of America
|
83
|
Lineweaver CH, Davies PCW, Vincent MD. Targeting cancer's weaknesses (not its strengths): Therapeutic strategies suggested by the atavistic model. Bioessays 2014; 36:827-35. [PMID: 25043755 DOI: 10.1002/bies.201400070] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Indexed: 12/20/2022]
Abstract
In the atavistic model of cancer progression, tumor cell dedifferentiation is interpreted as a reversion to phylogenetically earlier capabilities. The more recently evolved capabilities are compromised first during cancer progression. This suggests a therapeutic strategy for targeting cancer: design challenges to cancer that can only be met by the recently evolved capabilities no longer functional in cancer cells. We describe several examples of this target-the-weakness strategy. Our most detailed example involves the immune system. The absence of adaptive immunity in immunosuppressed tumor environments is an irreversible weakness of cancer that can be exploited by creating a challenge that only the presence of adaptive immunity can meet. This leaves tumor cells more vulnerable than healthy tissue to pathogenic attack. Such a target-the-weakness therapeutic strategy has broad applications, and contrasts with current therapies that target the main strength of cancer: cell proliferation.
Affiliation(s)
- Charles H Lineweaver
- Planetary Science Institute, Research School of Astronomy and Astrophysics and the Research School of Earth Sciences, Australian National University, Canberra, ACT, Australia
|
84
|
Alvarez CE. Naturally Occurring Cancers in Dogs: Insights for Translational Genetics and Medicine. ILAR J 2014; 55:16-45. [DOI: 10.1093/ilar/ilu010] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 11/13/2022] Open
|
85
|
Villar D, Flicek P, Odom DT. Evolution of transcription factor binding in metazoans - mechanisms and functional implications. Nat Rev Genet 2014; 15:221-33. [PMID: 24590227 PMCID: PMC4175440 DOI: 10.1038/nrg3481] [Citation(s) in RCA: 151] [Impact Index Per Article: 15.1] [Indexed: 12/16/2022]
Abstract
Differences in transcription factor binding can contribute to organismal evolution by altering downstream gene expression programmes. Genome-wide studies in Drosophila melanogaster and mammals have revealed common quantitative and combinatorial properties of in vivo DNA binding, as well as marked differences in the rate and mechanisms of evolution of transcription factor binding in metazoans. Here, we review the recently discovered rapid 're-wiring' of in vivo transcription factor binding between related metazoan species and summarize general principles underlying the observed patterns of evolution. We then consider what might explain the differences in genome evolution between metazoan phyla and outline the conceptual and technological challenges facing this research field.
Affiliation(s)
- Diego Villar
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Duncan T Odom
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
|
86
|
Freire-de-Lima L. Sweet and sour: the impact of differential glycosylation in cancer cells undergoing epithelial-mesenchymal transition. Front Oncol 2014; 4:59. [PMID: 24724053 PMCID: PMC3971198 DOI: 10.3389/fonc.2014.00059] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Received: 11/27/2013] [Accepted: 03/11/2014] [Indexed: 01/11/2023] Open
Abstract
Glycosylation changes are a feature of disease states. One clear example is cancer cells, which commonly express glycans at atypical levels or with different structural attributes than those found in normal cells. Epithelial–mesenchymal transition (EMT) was initially recognized as an important step for morphogenesis during embryonic development, and is now shown to be one of the key steps promoting tumor metastasis. Cancer cells undergoing EMT are characterized by significant changes in glycosylation of the extracellular matrix (ECM) components and cell-surface glycoconjugates. Current scientific methodology enables all hallmarks of EMT to be monitored in vitro and this experimental model has been extensively used in oncology research during the last 10 years. Several studies have shown that cell-surface carbohydrates attached to proteins through the amino acids, serine, or threonine (O-glycans), are involved in tumor progression and metastasis, however, the impact of O-glycans on EMT is poorly understood. Recent studies have demonstrated that transforming growth factor-beta (TGF-β), a known EMT inducer, has the ability to promote the up-regulation of a site-specific O-glycosylation in the IIICS domain of human oncofetal fibronectin, a major ECM component expressed by cancer cells and embryonic tissues. Armed with the knowledge that cell-surface glycoconjugates play a major role in the maintenance of cell homeostasis and that EMT is closely associated with glycosylation changes, we may benefit from understanding how unusual glycans can govern the molecular pathways associated with cancer progression. This review initially focuses on some well-known changes found in O-glycans expressed by cancer cells, and then discusses how these alterations may modulate the EMT process.
Affiliation(s)
- Leonardo Freire-de-Lima
- Laboratório de Glicobiologia, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
87
Gladieux P, Ropars J, Badouin H, Branca A, Aguileta G, Vienne DM, Rodríguez de la Vega RC, Branco S, Giraud T. Fungal evolutionary genomics provides insight into the mechanisms of adaptive divergence in eukaryotes. Mol Ecol 2014; 23:753-73. [DOI: 10.1111/mec.12631] [Citation(s) in RCA: 151] [Impact Index Per Article: 15.1] [Received: 09/13/2013] [Accepted: 12/04/2013] [Indexed: 12/15/2022]
Affiliation(s)
- Pierre Gladieux
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
- Department of Plant and Microbial Biology University of California Berkeley CA 94720‐3102 USA
- Jeanne Ropars
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
- Hélène Badouin
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
- Antoine Branca
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
- Gabriela Aguileta
- Center for Genomic Regulation (CRG) Dr. Aiguader 88 Barcelona 08003 Spain
- Universitat Pompeu Fabra (UPF) Barcelona 08003 Spain
- Damien M. Vienne
- Center for Genomic Regulation (CRG) Dr. Aiguader 88 Barcelona 08003 Spain
- Universitat Pompeu Fabra (UPF) Barcelona 08003 Spain
- Laboratoire de Biométrie et Biologie Evolutive Université Lyon 1 CNRS UMR5558 Villeurbanne 69622 France
- Ricardo C. Rodríguez de la Vega
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
- Sara Branco
- Department of Plant and Microbial Biology University of California Berkeley CA 94720‐3102 USA
- Tatiana Giraud
- Ecologie, Systématique et Evolution UMR8079 University of Paris‐Sud Orsay 91405 France
- Ecologie, Systématique et Evolution CNRS UMR8079 Orsay 91405 France
88
Reno PL, McLean CY, Hines JE, Capellini TD, Bejerano G, Kingsley DM. A penile spine/vibrissa enhancer sequence is missing in modern and extinct humans but is retained in multiple primates with penile spines and sensory vibrissae. PLoS One 2013; 8:e84258. [PMID: 24367647 PMCID: PMC3868586 DOI: 10.1371/journal.pone.0084258] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 09/24/2013] [Accepted: 11/04/2013] [Indexed: 11/18/2022] Open
Abstract
Previous studies show that humans have a large genomic deletion downstream of the Androgen Receptor gene that eliminates an ancestral mammalian regulatory enhancer that drives expression in developing penile spines and sensory vibrissae. Here we use a combination of large-scale sequence analysis and PCR amplification to demonstrate that the penile spine/vibrissa enhancer is missing in all humans surveyed and in the Neandertal and Denisovan genomes, but is present in DNA samples of chimpanzees and bonobos, as well as in multiple other great apes and primates that maintain some form of penile integumentary appendage and facial vibrissae. These results further strengthen the association between the presence of the penile spine/vibrissa enhancer and the presence of penile spines and macro- or micro- vibrissae in non-human primates as well as show that loss of the enhancer is both a distinctive and characteristic feature of the human lineage.
Affiliation(s)
- Philip L. Reno
- Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- * E-mail: (PLR); (CYM)
- Cory Y. McLean
- Department of Computer Science, Stanford University, Stanford, California, United States of America
- * E-mail: (PLR); (CYM)
- Jasmine E. Hines
- Department of Anthropology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Terence D. Capellini
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, California, United States of America
- Gill Bejerano
- Department of Computer Science, Stanford University, Stanford, California, United States of America
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, California, United States of America
- David M. Kingsley
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, California, United States of America
- Howard Hughes Medical Institute, Stanford, California, United States of America
89
Kalbfleisch T, Heaton MP. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes. F1000Res 2013; 2:244. [PMID: 25075278 PMCID: PMC4103496 DOI: 10.12688/f1000research.2-244.v2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Accepted: 02/04/2014] [Indexed: 01/20/2023] Open
Abstract
Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species' reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site; 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1, with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping, could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene function.
Affiliation(s)
- Ted Kalbfleisch
- Department of Biochemistry and Molecular Biology, School of Medicine, University of Louisville, Louisville, KY, 40202, USA
- Intrepid Bioinformatics, Louisville, KY, 40202, USA
- Michael P Heaton
- USDA Meat Animal Research Center, Clay Center, Nebraska, 68933, USA
90
Kalbfleisch T, Heaton MP. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes. F1000Res 2013; 2:244. [PMID: 25075278 PMCID: PMC4103496 DOI: 10.12688/f1000research.2-244.v1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Accepted: 11/05/2013] [Indexed: 05/28/2024] Open
Abstract
Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species' reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site; 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding genome repeat regions and sex chromosomes, approximately 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1, with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping, could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene function.
Affiliation(s)
- Ted Kalbfleisch
- Department of Biochemistry and Molecular Biology, School of Medicine, University of Louisville, Louisville, KY, 40202, USA
- Intrepid Bioinformatics, Louisville, KY, 40202, USA
- Michael P Heaton
- USDA Meat Animal Research Center, Clay Center, Nebraska, 68933, USA