1
Margasyuk S, Kuznetsova A, Zavileyskiy L, Vlasenok M, Skvortsov D, Pervouchine D. Human introns contain conserved tissue-specific cryptic poison exons. NAR Genom Bioinform 2024; 6:lqae163. [PMID: 39664813 PMCID: PMC11632617 DOI: 10.1093/nargab/lqae163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 10/10/2024] [Accepted: 11/10/2024] [Indexed: 12/13/2024] Open
Abstract
Eukaryotic cells express a large number of transcripts from a single gene due to alternative splicing. Despite hundreds of thousands of splice isoforms being annotated in databases, it has been reported that the current exon catalogs remain incomplete. At the same time, introns of human protein-coding (PC) genes contain a large number of evolutionarily conserved elements with unknown function. Here, we explore the possibility that some of them represent cryptic exons that are expressed in rare conditions. We identified a group of cryptic exons that are similar to the annotated exons in terms of evolutionary conservation and RNA-seq read coverage in the Genotype-Tissue Expression dataset. Most of them were poison, i.e. generated a nonsense-mediated decay (NMD) isoform upon inclusion, and many showed signs of tissue-specific and cancer-specific expression and regulation. We performed RNA-seq in the A549 cell line treated with cycloheximide to inactivate NMD and confirmed using quantitative polymerase chain reaction that seven of eight exons tested are, indeed, expressed. This study shows that introns of human PC genes contain cryptic poison exons, which reside in conserved intronic regions and remain incompletely annotated due to insufficient representation in RNA-seq libraries.
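As a rough illustration of what "poison" means here, the sketch below checks whether including an extra exon introduces a premature stop codon. It assumes the widely used rule of thumb that a stop codon lying at least about 50 nt upstream of the last exon-exon junction marks a transcript for NMD; the function names, the toy sequences, and the 50 nt default are illustrative assumptions, not the authors' pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq):
    """Return the 0-based index of the first in-frame stop codon, or None.

    Assumes `seq` starts at the first base of the start codon (frame 0).
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def triggers_nmd(exons, min_distance=50):
    """Heuristic: True if the first stop codon ends at least `min_distance`
    nt upstream of the last exon-exon junction (assumed rule of thumb)."""
    spliced = "".join(exons)
    stop = first_in_frame_stop(spliced)
    if stop is None or len(exons) < 2:
        return False
    last_junction = len(spliced) - len(exons[-1])
    return last_junction - (stop + 3) >= min_distance


def is_poison_exon(exons, cryptic, insert_at):
    """True if the isoform is NMD-free without the cryptic exon but becomes
    an NMD target once the exon is spliced in at position `insert_at`."""
    with_cryptic = exons[:insert_at] + [cryptic] + exons[insert_at:]
    return (not triggers_nmd(exons)) and triggers_nmd(with_cryptic)


# Toy transcript: coding portions of three exons, stop codon in the last one.
toy_exons = ["ATGGCTGCT", "GCT" * 20, "GCTGCTTAA"]
print(is_poison_exon(toy_exons, "GCTTAAG", 1))  # exon carrying an in-frame stop
print(is_poison_exon(toy_exons, "GCTGCT", 1))   # frame-preserving exon, no stop
```

Real annotation work would start from transcript models and genomic coordinates rather than bare strings, and NMD sensitivity depends on more than this distance rule, so the check is best read as a first-pass filter.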
Affiliation(s)
- Sergey Margasyuk
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
- Antonina Kuznetsova
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
- Lev Zavileyskiy
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
- Maria Vlasenok
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
- Dmitry Skvortsov
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
  - Faculty of Chemistry, Moscow State University, Ul Kolmogorova, 1, 119991, Moscow, Russia
- Dmitri D Pervouchine
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar, 30, 121205, Moscow, Russia
2
Costa ISD, Junot T, Silva FL, Felix W, Cardozo Fh JL, Pereira de Araujo AF, Pais do Amaral C, Gonçalves S, Santos NC, Leite JRSA, Bloch C, Brand GD. Occurrence and evolutionary conservation analysis of α-helical cationic amphiphilic segments in the human proteome. FEBS J 2024; 291:547-565. [PMID: 37945538 DOI: 10.1111/febs.16997] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/07/2023] [Revised: 09/14/2023] [Accepted: 10/20/2023] [Indexed: 11/12/2023]
Abstract
The existence of encrypted fragments with antimicrobial activity in human proteins has been thoroughly demonstrated in the literature. Recently, algorithms for the large-scale identification of these segments in whole proteomes were developed, and the pervasiveness of this phenomenon was stated. These algorithms typically mine encrypted cationic and amphiphilic segments of proteins, which, when synthesized as individual polypeptide sequences, exert antimicrobial activity by membrane disruption. In the present report, the human reference proteome was submitted to the software kamal to uncover protein segments that correspond to putative intragenic antimicrobial peptides (IAPs). The identity of these segments, their frequency, the functional classes of their parent proteins, their structural relevance, and the evolutionary conservation of amino acid residues within their corresponding proteins were assessed in silico. Additionally, the antimicrobial and anticancer activity of six selected synthetic peptides was evaluated. Our results indicate that cationic and amphiphilic segments can be found in 2% of all human proteins, but are more common in transmembrane and peripheral membrane proteins. These segments are surface-exposed basic patches whose amino acid residues present similar conservation scores to other residues with similar solvent accessibility. Moreover, the antimicrobial and anticancer activity of the synthetic putative IAP sequences was irrespective of whether these are associated with membranes in the cellular setting. Our study discusses these findings in light of the current understanding of encrypted peptide sequences, offering some insights into the relevance of these segments to the organism in the context of their harboring proteins or as separate polypeptide sequences.
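To make "cationic and amphiphilic segment" concrete, the sketch below scans a protein with a sliding window and scores each window by net charge and by the helical hydrophobic moment (Eisenberg's formulation with 100° per residue). It is not the kamal software: the window length, both thresholds, and the simplified charge model (Lys/Arg +1, Asp/Glu -1, His and termini ignored) are illustrative assumptions, and the hydrophobicity values are the Eisenberg consensus scale as commonly quoted, which should be verified before real use.

```python
import math

# Eisenberg consensus hydrophobicity scale (as commonly quoted; verify before use).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def net_charge(seq):
    """Simplified net charge: +1 for K/R, -1 for D/E (assumption)."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq)


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment per residue for an ideal helix."""
    angle = math.radians(angle_deg)
    sin_sum = sum(EISENBERG[aa] * math.sin(i * angle) for i, aa in enumerate(seq))
    cos_sum = sum(EISENBERG[aa] * math.cos(i * angle) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


def scan(protein, window=18, min_charge=3, min_moment=0.4):
    """Return (start, segment, charge, moment) for windows passing both
    hypothetical thresholds."""
    hits = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        charge = net_charge(segment)
        moment = hydrophobic_moment(segment)
        if charge >= min_charge and moment >= min_moment:
            hits.append((start, segment, charge, round(moment, 3)))
    return hits
```

Calling `scan(sequence)` on a one-letter amino acid string returns the candidate windows; overlapping hits would normally be merged, and a real screen would add the secondary-structure and solvent-accessibility checks that the abstract describes.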
Affiliation(s)
- Igor S D Costa
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Tiago Junot
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Fernanda L Silva
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Wanessa Felix
  - Núcleo de Pesquisa em Morfologia e Imunologia Aplicada - NuPMIA, Faculdade de Medicina, Universidade de Brasília, Brazil
- José L Cardozo Fh
  - Laboratório de Espectrometria de Massa - LEM, Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil
- Antonio F Pereira de Araujo
  - Laboratório de Biofísica Teórica e Computacional, Departamento de Biologia Celular, Universidade de Brasília, Brazil
- Sónia Gonçalves
  - Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Portugal
- Nuno C Santos
  - Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Portugal
- José R S A Leite
  - Núcleo de Pesquisa em Morfologia e Imunologia Aplicada - NuPMIA, Faculdade de Medicina, Universidade de Brasília, Brazil
- Carlos Bloch
  - Laboratório de Espectrometria de Massa - LEM, Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil
- Guilherme D Brand
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
|
3
Scanlan JL, Robin C. Phylogenomics of the Ecdysteroid Kinase-like (EcKL) Gene Family in Insects Highlights Roles in Both Steroid Hormone Metabolism and Detoxification. Genome Biol Evol 2024; 16:evae019. [PMID: 38291829 PMCID: PMC10859841 DOI: 10.1093/gbe/evae019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 11/21/2023] [Accepted: 01/23/2024] [Indexed: 02/01/2024] Open
Abstract
The evolutionary dynamics of large gene families can offer important insights into the functions of their individual members. While the ecdysteroid kinase-like (EcKL) gene family has previously been linked to the metabolism of both steroid molting hormones and xenobiotic toxins, the functions of nearly all EcKL genes are unknown, and there is little information on their evolution across all insects. Here, we perform comprehensive phylogenetic analyses on a manually annotated set of EcKL genes from 140 insect genomes, revealing the gene family is comprised of at least 13 subfamilies that differ in retention and stability. Our results show the only two genes known to encode ecdysteroid kinases belong to different subfamilies and therefore ecdysteroid metabolism functions must be spread throughout the EcKL family. We provide comparative phylogenomic evidence that EcKLs are involved in detoxification across insects, with positive associations between family size and dietary chemical complexity, and we also find similar evidence for the cytochrome P450 and glutathione S-transferase gene families. Unexpectedly, we find that the size of the clade containing a known ecdysteroid kinase is positively associated with host plant taxonomic diversity in Lepidoptera, possibly suggesting multiple functional shifts between hormone and xenobiotic metabolism. Our evolutionary analyses provide hypotheses of function and a robust framework for future experimental studies of the EcKL gene family. They also open promising new avenues for exploring the genomic basis of dietary adaptation in insects, including the classically studied coevolution of butterflies with their host plants.
Affiliation(s)
- Jack L Scanlan
  - School of BioSciences, The University of Melbourne, Melbourne, VIC 3010, Australia
- Charles Robin
  - School of BioSciences, The University of Melbourne, Melbourne, VIC 3010, Australia
|
4
Daybog I, Kolodny O. A computational framework for resolving the microbiome diversity conundrum. Nat Commun 2023; 14:7977. [PMID: 38042865 PMCID: PMC10693575 DOI: 10.1038/s41467-023-42768-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 10/20/2023] [Indexed: 12/04/2023] Open
Abstract
Recent empirical studies offer conflicting findings regarding the relation between host fitness and the composition of its microbiome, a conflict which we term 'the microbial β-diversity conundrum'. The microbiome is crucial for host wellbeing and survival. Surprisingly, the microbiome compositions of different healthy individuals, even in the same population, often differ dramatically, contrary to the notion that a vital trait should be highly conserved. Moreover, gnotobiotic individuals exhibit highly deleterious phenotypes, supporting the view that the microbiome is paramount to host fitness. However, the introduction of almost arbitrarily selected microbiota into the system often achieves a significant rescue effect of the deleterious phenotypes. This is true even for microbiota from soil or phylogenetically distant host species, highlighting an apparent paradox. We suggest several solutions to the paradox using a computational framework, simulating the population dynamics of hosts and their microbiomes over multiple generations. The answers invoke factors such as host population size, the specific mode of microbial contribution to host fitness, and typical microbiome richness. They resolve the conundrum by highlighting scenarios in which, even when a host's fitness is determined in full by its microbiome composition, this composition has little effect on the natural selection dynamics of the population.
Affiliation(s)
- Itay Daybog
  - Department of Ecology, Evolution and Behavior, The A. Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
- Oren Kolodny
  - Department of Ecology, Evolution and Behavior, The A. Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel
|
5
Xu D, Tang L, Zhou J, Wang F, Cao H, Huang Y, Kapranov P. Evidence for widespread existence of functional novel and non-canonical human transcripts. BMC Biol 2023; 21:271. [PMID: 38001496 PMCID: PMC10675921 DOI: 10.1186/s12915-023-01753-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 10/31/2023] [Indexed: 11/26/2023] Open
Abstract
BACKGROUND: The fraction of functional sequence in the human genome remains a key unresolved question in biology and the subject of vigorous debate. While a plethora of studies have connected a significant fraction of human DNA to various biochemical processes, the classical definition of function requires evidence of effects on cellular or organismal fitness that such studies do not provide. Although multiple high-throughput reverse genetics screens have been developed to address this issue, they are limited to annotated genomic elements and suffer from non-specific effects, arguing for a strong need to develop additional functional genomics approaches.
RESULTS: In this work, we established a high-throughput lentivirus-based insertional mutagenesis strategy as a forward genetics screen tool in aneuploid cells. Application of this approach to human cell lines in multiple phenotypic screens suggested the presence of many as yet uncharacterized functional elements in the human genome, represented at least in part by novel exons of known and novel genes. The novel transcripts containing these exons can be massively induced, up to thousands-fold, by specific stresses, and at least some can represent bi-cistronic protein-coding mRNAs.
CONCLUSIONS: Altogether, these results argue that many unannotated and non-canonical human transcripts, including those that appear as aberrant splice products, have biological relevance under specific biological conditions.
Affiliation(s)
- Dongyang Xu
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Lu Tang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Junjun Zhou
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Fang Wang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Huifen Cao
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Yu Huang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Philipp Kapranov
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
|
6
Frisby TS, Langmead CJ. Identifying promising sequences for protein engineering using a deep transformer protein language model. Proteins 2023; 91:1471-1486. [PMID: 37337902 DOI: 10.1002/prot.26536] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/16/2023] [Revised: 05/10/2023] [Accepted: 05/23/2023] [Indexed: 06/21/2023]
Abstract
Protein engineers aim to discover and design novel sequences with targeted, desirable properties. Given the near limitless size of the protein sequence landscape, it is no surprise that these desirable sequences are often a relative rarity. This makes identifying such sequences a costly and time-consuming endeavor. In this work, we show how to use a deep transformer protein language model to identify sequences that have the most promise. Specifically, we use the model's self-attention map to calculate a Promise Score that weights the relative importance of a given sequence according to predicted interactions with a specified binding partner. This Promise Score can then be used to identify strong binders worthy of further study and experimentation. We use the Promise Score within two protein engineering contexts: Nanobody (Nb) discovery and protein optimization. With Nb discovery, we show how the Promise Score provides an effective way to select lead sequences from Nb repertoires. With protein optimization, we show how to use the Promise Score to select site-specific mutagenesis experiments that identify a high percentage of improved sequences. In both cases, we also show how the self-attention map used to calculate the Promise Score can indicate which regions of a protein are involved in intermolecular interactions that drive the targeted property. Finally, we describe how to fine-tune the transformer protein language model to learn a predictive model for the targeted property, and discuss the capabilities and limitations of fine-tuning with and without knowledge transfer within the context of protein engineering.
Affiliation(s)
- Trevor S Frisby
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
7
Parey E, Fernandez-Aroca D, Frost S, Uribarren A, Park TJ, Zöttl M, St John Smith E, Berthelot C, Villar D. Phylogenetic modeling of enhancer shifts in African mole-rats reveals regulatory changes associated with tissue-specific traits. Genome Res 2023; 33:1513-1526. [PMID: 37625847 PMCID: PMC10620049 DOI: 10.1101/gr.277715.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 08/24/2023] [Indexed: 08/27/2023]
Abstract
Changes in gene regulation are thought to underlie most phenotypic differences between species. For subterranean rodents such as the naked mole-rat, proposed phenotypic adaptations include hypoxia tolerance, metabolic changes, and cancer resistance. However, it is largely unknown what regulatory changes may associate with these phenotypic traits, and whether these are unique to the naked mole-rat, the mole-rat clade, or are also present in other mammals. Here, we investigate regulatory evolution in the heart and liver from two African mole-rat species and two rodent outgroups using genome-wide epigenomic profiling. First, we adapted and applied a phylogenetic modeling approach to quantitatively compare epigenomic signals at orthologous regulatory elements and identified thousands of promoter and enhancer regions with differential epigenomic activity in mole-rats. These elements associate with known mole-rat adaptations in metabolic and functional pathways and suggest candidate genetic loci that may underlie mole-rat innovations. Second, we evaluated ancestral and species-specific regulatory changes in the study phylogeny and report several candidate pathways experiencing stepwise remodeling during the evolution of mole-rats, such as the insulin and hypoxia response pathways. Third, we report nonorthologous regulatory elements overlap with lineage-specific repetitive elements and appear to modify metabolic pathways by rewiring of HNF4 and RAR/RXR transcription factor binding sites in mole-rats. These comparative analyses reveal how mole-rat regulatory evolution informs previously reported phenotypic adaptations. Moreover, the phylogenetic modeling framework we propose here improves upon the state of the art by addressing known limitations of inter-species comparisons of epigenomic profiles and has broad implications in the field of comparative functional genomics.
Affiliation(s)
- Elise Parey
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
- Diego Fernandez-Aroca
  - Blizard Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, United Kingdom
- Stephanie Frost
  - Blizard Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, United Kingdom
- Ainhoa Uribarren
  - Cambridge Institute, Cancer Research UK and University of Cambridge, Cambridge CB2 0RE, United Kingdom
- Thomas J Park
  - Department of Biological Sciences and Laboratory of Integrative Neuroscience, University of Illinois at Chicago, Chicago, Illinois 60607, USA
- Markus Zöttl
  - Department of Biology and Environmental Science, Linnaeus University, 44054 Kalmar, Sweden
- Ewan St John Smith
  - Department of Pharmacology, University of Cambridge, Cambridge CB2 1PD, United Kingdom
- Camille Berthelot
  - Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3525, INSERM UA12, Comparative Functional Genomics Group, F-75015 Paris, France
- Diego Villar
  - Blizard Institute, Faculty of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, United Kingdom
8
Maseko NN, Steenkamp ET, Wingfield BD, Wilken PM. An in Silico Approach to Identifying TF Binding Sites: Analysis of the Regulatory Regions of BUSCO Genes from Fungal Species in the Ceratocystidaceae Family. Genes (Basel) 2023; 14:848. [PMID: 37107606 PMCID: PMC10137650 DOI: 10.3390/genes14040848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 03/26/2023] [Accepted: 03/27/2023] [Indexed: 04/03/2023] Open
Abstract
Transcriptional regulation controls gene expression through regulatory promoter regions that contain conserved sequence motifs. These motifs, also known as regulatory elements, are critically important to expression, which is driving research efforts to identify and characterize them. Yeasts have been the focus of such studies in fungi, including in several in silico approaches. This study aimed to determine whether in silico approaches could be used to identify motifs in the Ceratocystidaceae family, and if present, to evaluate whether these correspond to known transcription factors. This study targeted the 1000 base-pair region upstream of the start codon of 20 single-copy genes from the BUSCO dataset for motif discovery. Using the MEME and Tomtom analysis tools, conserved motifs at the family level were identified. The results show that such in silico approaches could identify known regulatory motifs in the Ceratocystidaceae and other unrelated species. This study provides support to ongoing efforts to use in silico analyses for motif discovery.
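The first step described above, taking the 1000 base-pair region upstream of each start codon, can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are 0-based, `start_pos` is the genomic index of the A of ATG (on the minus strand, the index of the base that pairs with it), and the function names are hypothetical. The resulting sequences would then be written to FASTA and passed to a motif-discovery tool such as MEME.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(contig, start_pos, strand, length=1000):
    """Return up to `length` bases immediately upstream of a start codon,
    reported 5'->3' on the gene's own strand.

    The region is truncated if the contig ends before `length` bases.
    """
    if strand == "+":
        return contig[max(0, start_pos - length):start_pos]
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher genomic coordinates.
        region = contig[start_pos + 1:start_pos + 1 + length]
        return reverse_complement(region)
    raise ValueError("strand must be '+' or '-'")


# Toy contigs: ATG at index 6 on the plus strand; CAT at indices 3-5 encodes
# a minus-strand ATG whose A pairs with the base at index 5.
print(upstream_region("AAACCCATGGGG", 6, "+", length=3))
print(upstream_region("TTTCATGGGAAA", 5, "-", length=3))
```

Truncating at contig ends rather than padding keeps the extracted sequence free of artificial bases, which matters for fragmented fungal assemblies.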
Affiliation(s)
- Brenda D. Wingfield
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0083, South Africa (E.T.S.); (P.M.W.)
9
Mott TM, Ibarra JS, Kandula N, Senning EN. Mutagenesis studies of TRPV1 subunit interfaces informed by genomic variant analysis. Biophys J 2023; 122:322-332. [PMID: 36518076 PMCID: PMC9892609 DOI: 10.1016/j.bpj.2022.12.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/11/2022] [Accepted: 12/08/2022] [Indexed: 12/15/2022] Open
Abstract
Protein structures and mutagenesis studies have been instrumental in elucidating molecular mechanisms of ion channel function, but making informed choices about which residues to target for mutagenesis can be challenging. Therefore, we investigated the potential for using human population genomic data to further refine our selection of mutagenesis sites in TRPV1. Single nucleotide polymorphism data of TRPV1 from gnomAD 2.1.1 revealed a lower number of missense variants within buried residues of the ankyrin repeat domain and an increased number of variants between secondary structure elements of the transmembrane segments. We hypothesized that residues critical to interactions at interfaces between subunits or domains in the channel would exhibit a similar reduction in variants. We identified in the structure of ground squirrel TRPV1 (PDB: 7LQY) a possible electrostatic network between K155 and K160 in the N-terminal ankyrin repeat domain and E761 and D762 in the C-terminus (K-KED). Consistent with our hypothesis for residues at key interface sites, none of the four residues have any variants reported in gnomAD 2.1.1. Ca2+ imaging of TRPV1 K-KED mutants confirmed significant roles for these residues, but we found that the electrostatic interaction is not essential since channel function is still observed in total charge reversals on the C-terminal side of the interface (E761K/D762K). Interestingly, Ca2+ imaging responses for a charge swap experiment with K155D/D762K showed partially restored wild-type responses. Using electrophysiology, we found that charge reversals on either K155 or D762 increased the baseline currents of TRPV1, and the charge swapped double mutant, K155D/D762K, partially restored baseline currents to wild-type levels. We interpret these results to mean that contacts across residues in the K-KED interface shift the equilibria of conformations to closed pore states. Our study demonstrates the utility and applicability of a combined missense variant and structure targeted investigation of residues at TRPV1 subunit interfaces.
Affiliation(s)
- Taylor M Mott
  - Department of Neuroscience, The University of Texas at Austin, Austin, Texas 78712
- Jordan S Ibarra
  - Department of Neuroscience, The University of Texas at Austin, Austin, Texas 78712
- Nivitha Kandula
  - School of Medicine, University of Missouri-Kansas City, 5000 Holmes St, Kansas City, Missouri 64110
- Eric N Senning
  - Department of Neuroscience, The University of Texas at Austin, Austin, Texas 78712
10
Lima ER, Freire RP, Suzuki MF, Oliveira JE, Yosidaki VL, Peroni CN, Sevilhano T, Zorzeto M, Torati LS, Soares CRJ, Lima IDDM, Kronenberger T, Maltarollo VG, Bartolini P. Isolation and Characterization of the Arapaima gigas Growth Hormone (ag-GH) cDNA and Three-Dimensional Modeling of This Hormone in Comparison with the Human Hormone (hGH). Biomolecules 2023; 13:158. [PMID: 36671542 PMCID: PMC9855374 DOI: 10.3390/biom13010158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/15/2022] [Accepted: 12/30/2022] [Indexed: 01/15/2023] Open
Abstract
In a previous work, the cDNAs of the common gonadotrophic hormone α-subunit (ag-GTHα) and of the ag-FSH β- and ag-LH β-subunits were isolated from A. gigas pituitaries and characterized by our research group, and a preliminary synthesis of ag-FSH was carried out in human embryonic kidney 293 (HEK293) cells. In the present work, the cDNA sequence encoding the ag-growth hormone (ag-GH) has also been isolated from the same giant Arapaimidae Amazonian fish. The ag-GH consists of 208 amino acids with a putative 23 amino acid signal peptide and a 185 amino acid mature peptide. The highest identity, based on the amino acid sequences, was found with the Elopiformes (82.0%), followed by Anguilliformes (79.7%) and Acipenseriformes (74.5%). The identity with the corresponding human GH (hGH) amino acid sequence is remarkable (44.8%), and the two disulfide bonds present in both sequences were perfectly conserved. Three-dimensional (3D) models of ag-GH, in comparison with hGH, were generated using the threading modeling method followed by molecular dynamics. Our simulations suggest that the two proteins have similar structural properties without major conformational changes under the simulated conditions, even though they are separated from each other by a >100 Myr evolutionary period (1 Myr = 1 million years). The sequence found will be used for the biotechnological synthesis of ag-GH, while the ag-GH cDNA obtained will be utilized for preliminary gene therapy studies.
Affiliation(s)
- Eliana Rosa Lima
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Renan Passos Freire
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Miriam Fussae Suzuki
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- João Ezequiel Oliveira
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Vanessa Luna Yosidaki
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Cibele Nunes Peroni
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Thaís Sevilhano
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Moisés Zorzeto
  - Piscicultura Raça, Canabrava do Norte 78658-000, MT, Brazil
- Lucas Simon Torati
  - EMBRAPA Pesca e Aquicultura, Loteamento Água Fria, Palmas 77008-900, TO, Brazil
- Carlos Roberto Jorge Soares
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
- Igor Daniel de Miranda Lima
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Av. Presidente Antônio Carlos, 6627, Belo Horizonte 31270-901, MG, Brazil
- Thales Kronenberger
  - Institute of Pharmacy, Pharmaceutical and Medicinal Chemistry and Tübingen Center for Academic Drug Discovery, Eberhard Karls University Tübingen, Auf der Morgenstelle 8, 72076 Tübingen, Germany
  - Department of Oncology and Pneumonology, Internal Medicine VIII, University Hospital Tübingen, Otfried-Müller-Straße 10, DE, 72076 Tübingen, Germany
  - Tübingen Center for Academic Drug Discovery & Development (TüCAD2), 72076 Tübingen, Germany
  - School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, 70211 Kuopio, Finland
- Vinicius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Av. Presidente Antônio Carlos, 6627, Belo Horizonte 31270-901, MG, Brazil
- Paolo Bartolini
  - Instituto de Pesquisas Energéticas e Nucleares (IPEN-CNEN), Cidade Universitária, São Paulo 05508-000, SP, Brazil
11
Ahsan F, Yan Z, Precup D, Blanchette M. PhyloPGM: boosting regulatory function prediction accuracy using evolutionary information. Bioinformatics 2022; 38:i299-i306. [PMID: 35758792 PMCID: PMC9235490 DOI: 10.1093/bioinformatics/btac259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Motivation: The computational prediction of regulatory function associated with a genomic sequence is of utmost importance in -omics studies, as it facilitates our understanding of the mechanisms underpinning the vast gene regulatory network. Prominent examples in this area include the binding prediction of transcription factors in DNA regulatory regions, and predicting RNA–protein interaction in the context of post-transcriptional gene expression. However, existing computational methods have suffered from high false-positive rates and have seldom used any evolutionary information, despite the vast amount of available orthologous data across multitudes of extant and ancestral genomes, which readily present an opportunity to improve the accuracy of existing computational methods.
Results: In this study, we present a novel probabilistic approach called PhyloPGM that leverages previously trained TFBS or RNA–RBP binding predictors by aggregating their predictions from various orthologous regions, in order to boost the overall prediction accuracy on human sequences. Throughout our experiments, PhyloPGM has shown significant improvement over baselines such as the sequence-based RNA–RBP binding predictor RNATracker and the sequence-based TFBS predictor FactorNet. PhyloPGM is simple in principle and easy to implement, yet yields impressive results.
Availability and implementation: The PhyloPGM package is available at https://github.com/BlanchetteLab/PhyloPGM
Supplementary information: Supplementary data are available at Bioinformatics online.
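The general idea of pooling a trained predictor's outputs across orthologous regions can be illustrated with a deliberately simple stand-in: a weighted average of log-odds. This is not the PhyloPGM model, which the abstract describes as a probabilistic approach and whose details are not given here; the function names, the per-species weights, and the clipping constant are all assumptions made for illustration.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def aggregate(human_score, ortholog_scores, weights=None, human_weight=1.0):
    """Combine a human-sequence prediction with predictions made on
    orthologous sequences (hypothetical weighted log-odds average).

    ortholog_scores: dict mapping species name -> predicted probability.
    weights: optional dict mapping species name -> weight (default 1.0),
             e.g. smaller for more distant species.
    """
    weights = weights or {}
    total = human_weight * logit(human_score)
    norm = human_weight
    for species, score in ortholog_scores.items():
        w = weights.get(species, 1.0)
        total += w * logit(score)
        norm += w
    return 1.0 / (1.0 + math.exp(-total / norm))


# An uncertain human prediction is pulled up by confident ortholog calls.
print(aggregate(0.5, {"mouse": 0.9, "dog": 0.8}, weights={"dog": 0.5}))
```

Averaging in log-odds space keeps a single extreme score from being diluted linearly, but it ignores the tree structure relating the species, which is exactly the information a phylogeny-aware model would exploit.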
Affiliation(s)
- Faizy Ahsan
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
- Zichao Yan
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
- Doina Precup
- School of Computer Science, McGill University, Montreal H3A 0G4, Canada
|
12
|
Gong Y, Srinivasan SS, Zhang R, Kessenbrock K, Zhang J. scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data. Biomolecules 2022; 12:874. [PMID: 35883430 PMCID: PMC9312957 DOI: 10.3390/biom12070874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 06/16/2022] [Accepted: 06/16/2022] [Indexed: 02/04/2023] Open
Abstract
Recent advances in the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) allow cellular heterogeneity dissection and regulatory landscape reconstruction at unprecedented resolution. However, compared to bulk sequencing, its ultra-high missingness markedly reduces the usable reads in each cell type, resulting in broader, fuzzier peak boundary definitions and limiting our ability to pinpoint functional regions and interpret variant impacts precisely. We propose a weakly supervised learning method, scEpiLock, to directly identify core functional regions from coarse peak labels and quantify variant impacts in a cell-type-specific manner. First, scEpiLock uses a multi-label classifier to predict chromatin accessibility via a deep convolutional neural network. Then, its weakly supervised object detection module further refines the peak boundary definition using gradient-weighted class activation mapping (Grad-CAM). Finally, scEpiLock provides cell-type-specific variant impacts within a given peak region. We applied scEpiLock to various scATAC-seq datasets and found that it achieves an area under the receiver operating characteristic curve (AUC) of ~0.9 and an area under the precision-recall curve (AUPR) above 0.7. Moreover, scEpiLock's object detection condenses coarse peaks to only ⅓ of their original size while still reporting higher conservation scores. In addition, we applied scEpiLock to brain scATAC-seq data and reported several genome-wide association study (GWAS) variants disrupting regulatory elements around known risk genes for Alzheimer's disease, demonstrating its potential to provide cell-type-specific biological insights in disease studies.
Affiliation(s)
- Yanwen Gong
- Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA;
- Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA 92697, USA
- Ruiyi Zhang
- Department of Computer Science, University of California, Irvine, CA 92697, USA; (S.S.S.); (R.Z.)
- Kai Kessenbrock
- Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA;
- Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA 92697, USA
- Jing Zhang
- Department of Computer Science, University of California, Irvine, CA 92697, USA; (S.S.S.); (R.Z.)
|
13
|
Garrido-Gala J, Higuera JJ, Rodríguez-Franco A, Muñoz-Blanco J, Amil-Ruiz F, Caballero JL. A Comprehensive Study of the WRKY Transcription Factor Family in Strawberry. Plants 2022; 11:1585. [PMID: 35736736 PMCID: PMC9229891 DOI: 10.3390/plants11121585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 06/10/2022] [Accepted: 06/11/2022] [Indexed: 11/16/2022]
Abstract
WRKY transcription factors play critical roles in plant growth and development and in stress responses. Using up-to-date genomic data, a total of 64 and 257 WRKY genes have been identified in the diploid woodland strawberry, Fragaria vesca, and the more complex allo-octoploid commercial strawberry, Fragaria × ananassa cv. Camarosa, respectively. The completeness of the new genomes and annotations has enabled us to perform a more detailed evolutionary and functional study of the strawberry WRKY family members, particularly in the case of the cultivated hybrid, in which homoeologous and paralogous FaWRKY genes have been characterized. Analysis of the available expression profiles has revealed that many strawberry WRKY genes show preferential or tissue-specific expression. Furthermore, significant differential expression of several FaWRKY genes has been clearly detected in fruit receptacles and achenes during the ripening process and upon pathogen challenge, supporting a precise functional role of these strawberry genes in such processes. Further, an extensive analysis of predicted development-, stress- and hormone-responsive cis-acting elements in the strawberry WRKY family is shown. Our results provide a deeper and more comprehensive knowledge of the WRKY gene family in strawberry.
Affiliation(s)
- José-Javier Higuera
- Departamento de Bioquímica y Biología Molecular, Campus Universitario de Rabanales y Campus de Excelencia Internacional Agroalimentario ceiA3, Edificio Severo Ochoa-C6, Universidad de Córdoba, 14071 Córdoba, Spain; (J.-J.H.); (A.R.-F.); (J.M.-B.)
- Antonio Rodríguez-Franco
- Departamento de Bioquímica y Biología Molecular, Campus Universitario de Rabanales y Campus de Excelencia Internacional Agroalimentario ceiA3, Edificio Severo Ochoa-C6, Universidad de Córdoba, 14071 Córdoba, Spain; (J.-J.H.); (A.R.-F.); (J.M.-B.)
- Juan Muñoz-Blanco
- Departamento de Bioquímica y Biología Molecular, Campus Universitario de Rabanales y Campus de Excelencia Internacional Agroalimentario ceiA3, Edificio Severo Ochoa-C6, Universidad de Córdoba, 14071 Córdoba, Spain; (J.-J.H.); (A.R.-F.); (J.M.-B.)
- Francisco Amil-Ruiz
- Unidad de Bioinformática, Servicio Central de Apoyo a la Investigación (SCAI), Universidad de Córdoba, 14071 Córdoba, Spain;
- José L. Caballero
- Departamento de Bioquímica y Biología Molecular, Campus Universitario de Rabanales y Campus de Excelencia Internacional Agroalimentario ceiA3, Edificio Severo Ochoa-C6, Universidad de Córdoba, 14071 Córdoba, Spain; (J.-J.H.); (A.R.-F.); (J.M.-B.)
|
14
|
Bordeira-Carriço R, Teixeira J, Duque M, Galhardo M, Ribeiro D, Acemel RD, Firbas PN, Tena JJ, Eufrásio A, Marques J, Ferreira FJ, Freitas T, Carneiro F, Gómez-Skarmeta JL, Bessa J. Multidimensional chromatin profiling of zebrafish pancreas to uncover and investigate disease-relevant enhancers. Nat Commun 2022; 13:1945. [PMID: 35410466 PMCID: PMC9001708 DOI: 10.1038/s41467-022-29551-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2020] [Accepted: 03/17/2022] [Indexed: 11/26/2022] Open
Abstract
The pancreas is a central organ for human diseases. Most alleles uncovered by genome-wide association studies of pancreatic dysfunction traits overlap with non-coding sequences of DNA. Many contain epigenetic marks of cis-regulatory elements active in pancreatic cells, suggesting that alterations in these sequences contribute to pancreatic diseases. Animal models greatly help to understand the role of non-coding alterations in disease. However, interspecies identification of equivalent cis-regulatory elements faces fundamental challenges, including lack of sequence conservation. Here we combine epigenetic assays with reporter assays in zebrafish and human pancreatic cells to identify interspecies functionally equivalent cis-regulatory elements, regardless of sequence conservation. Among other potential disease-relevant enhancers, we identify a zebrafish ptf1a distal-enhancer whose deletion causes pancreatic agenesis, a phenotype previously found to be induced by mutations in a distal-enhancer of PTF1A in humans, further supporting the causality of this condition in vivo. This approach helps to uncover interspecies functionally equivalent cis-regulatory elements and their potential role in human disease. Alterations in cis-regulatory elements (CREs) can contribute to pancreatic diseases. Here the authors combine chromatin profiling and interaction points with in vivo reporter assays in zebrafish to uncover functionally equivalent human CREs, helping to predict disease-relevant enhancers.
|
15
|
Morales-Laverde L, Echeverz M, Trobos M, Solano C, Lasa I. Experimental Polymorphism Survey in Intergenic Regions of the icaADBCR Locus in Staphylococcus aureus Isolates from Periprosthetic Joint Infections. Microorganisms 2022; 10:600. [PMID: 35336176 PMCID: PMC8955882 DOI: 10.3390/microorganisms10030600] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/15/2022] [Revised: 03/03/2022] [Accepted: 03/04/2022] [Indexed: 12/18/2022] Open
Abstract
Staphylococcus aureus is a leading cause of prosthetic joint infections (PJI), which are characterized by bacterial biofilm formation and recalcitrance to immune-mediated clearance and antibiotics. The molecular events behind PJI are yet to be unraveled. In this context, identification of polymorphisms in bacterial genomes may help to establish associations between sequence variants and the ability of S. aureus to cause PJI. Here, we report an experimental nucleotide-level survey specifically aimed at the intergenic regions (IGRs) of the icaADBCR locus, which is responsible for the synthesis of the biofilm exopolysaccharide PIA/PNAG, in a collection of strains sampled from PJI and wounds. IGRs of the icaADBCR locus were highly conserved and no PJI-specific SNPs were found. Moreover, polymorphisms in these IGRs did not significantly affect transcription of the icaADBC operon under in vitro laboratory conditions. In contrast, an SNP within the icaR coding region, resulting in a V176E change in the transcriptional repressor IcaR, led to a significant increase in icaADBC operon transcription and PIA/PNAG production and a reduction in S. aureus virulence in a Galleria mellonella infection model. In conclusion, SNPs in icaADBCR IGRs of S. aureus isolates from PJI are not associated with icaADBC expression, PIA/PNAG production and adaptation to PJI.
Affiliation(s)
- Liliana Morales-Laverde
- Laboratory of Microbial Pathogenesis, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA, 31008 Pamplona, Spain; (L.M.-L.); (M.E.); (C.S.)
- Maite Echeverz
- Laboratory of Microbial Pathogenesis, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA, 31008 Pamplona, Spain; (L.M.-L.); (M.E.); (C.S.)
- Margarita Trobos
- Department of Biomaterials, Institute of Clinical Sciences, Sahlgrenska Academy at University of Gothenburg, 40530 Gothenburg, Sweden;
- Cristina Solano
- Laboratory of Microbial Pathogenesis, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA, 31008 Pamplona, Spain; (L.M.-L.); (M.E.); (C.S.)
- Iñigo Lasa
- Laboratory of Microbial Pathogenesis, Navarrabiomed, Hospital Universitario de Navarra (HUN), Universidad Pública de Navarra (UPNA), IdiSNA, 31008 Pamplona, Spain; (L.M.-L.); (M.E.); (C.S.)
|
16
|
Ponce LF, Leon K, Valiente PA. Unraveling a Conserved Conformation of the FG Loop upon the Binding of Natural Ligands to the Human and Murine PD1. J Phys Chem B 2022; 126:1441-1446. [PMID: 35167293 DOI: 10.1021/acs.jpcb.1c09463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The activation of T cells is normally accompanied by inhibitory mechanisms, among which the PD1 receptor stands out. Upon binding the ligands PDL1 and PDL2, PD1 drives T cells to an unresponsive state called exhaustion, characterized by a markedly decreased capacity to exert effector functions. For this reason, PD1 has become one of the most important targets in cancer immunotherapy. Despite the numerous studies on the modulation of PD1 signaling, how the PD1 signaling pathway is activated upon ligand binding remains an open question. In this work, we used molecular dynamics simulations to assess the differences in PD1 motion between the free state and the complexes with the ligands. We found that, in both human and murine systems, the binding of PDL1 and PDL2 stabilizes the conformation of the FG loop similarly. This result, combined with the conservation of the FG loop residues across species, suggests that the conformation of the FG loop is related to the signaling process. We also found a high similarity between the PD1-PDL1 structures and the variable region of an antibody structure, where the FG loop occupies a position similar to that of the CDR3 of the light chain.
Affiliation(s)
- Luis F Ponce
- Molecular System Biology Department, Center of Molecular Immunology, Havana, Havana 11600, Cuba; Center for Molecular Simulations, Biological Science Department, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Kalet Leon
- Molecular System Biology Department, Center of Molecular Immunology, Havana, Havana 11600, Cuba
- Pedro A Valiente
- Center for Protein Studies, Faculty of Biology, University of Havana, Havana, Havana 10400, Cuba; Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
|
17
|
Herman MA, Aiello BR, DeLong JD, Garcia-Ruiz H, González AL, Hwang W, McBeth C, Stojković EA, Trakselis MA, Yakoby N. A Unifying Framework for Understanding Biological Structures and Functions Across Levels of Biological Organization. Integr Comp Biol 2022; 61:2038-2047. [PMID: 34302339 PMCID: PMC8990247 DOI: 10.1093/icb/icab167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 07/12/2021] [Accepted: 07/14/2021] [Indexed: 12/14/2022] Open
Abstract
The relationship between structure and function is a major constituent of the rules of life. Structures and functions occur across all levels of biological organization. Current efforts to integrate conceptual frameworks and approaches to address new and old questions promise to allow a more holistic and robust understanding of how different biological functions are achieved across levels of biological organization. Here, we provide unifying and generalizable definitions of both structure and function that can be applied across all levels of biological organization. However, we find differences in the nature of structures at the organismal level and below as compared to above the level of the organism. We term these intrinsic and emergent structures, respectively. Intrinsic structures are directly under selection, contributing to the overall performance (fitness) of the individual organism. Emergent structures involve interactions among aggregations of organisms and are not directly under selection. Given this distinction, we argue that while the functions of many intrinsic structures remain unknown, functions of emergent structures are the result of the aggregate of processes of individual organisms. We then provide a detailed and unified framework of the structure-function relationship for intrinsic structures to explore how their unknown functions can be defined. We provide examples of how these scalable definitions applied to intrinsic structures provide a framework to address questions on structure-function relationships that can be approached simultaneously from all subdisciplines of biology. We propose that this will produce a more holistic and robust understanding of how different biological functions are achieved across levels of biological organization.
Affiliation(s)
- M A Herman
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE 68588-0118, USA
- B R Aiello
- Schools of Physics and Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- J D DeLong
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE 68588-0118, USA
- H Garcia-Ruiz
- Department of Plant Pathology, Nebraska Center for Virology, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- A L González
- Department of Biology and Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08103, USA
- W Hwang
- Departments of Biomedical Engineering, Materials Science and Engineering, and Physics and Astronomy, Texas A&M University, College Station, TX 77843, USA
- C McBeth
- Fraunhofer USA CMI and Boston University, Boston, MA 02215, USA
- E A Stojković
- Department of Biology, Northeastern Illinois University, Chicago, IL 60641, USA
- M A Trakselis
- Department of Chemistry and Biochemistry, Baylor University, One Bear Place #97348, Waco, TX 76798, USA
- N Yakoby
- Department of Biology and Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08103, USA
|
18
|
Peng Y, Kang H, Luo J, Zhang Y. A Comparative Analysis of Super-Enhancers and Broad H3K4me3 Domains in Pig, Human, and Mouse Tissues. Front Genet 2021; 12:701049. [PMID: 34899824 PMCID: PMC8652260 DOI: 10.3389/fgene.2021.701049] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/27/2021] [Accepted: 10/29/2021] [Indexed: 11/13/2022] Open
Abstract
Super-enhancers (SEs) and broad H3K4me3 domains (BDs) are crucial regulators in the control of tissue identity in human and mouse. However, their features in pig remain largely unknown. In this study, by integrative computational analyses of epigenomic and transcriptomic data, we have characterized SEs and BDs in six pig tissues and analyzed their conservation in comparison with human and mouse tissues. Similar to human and mouse, pig SEs and BDs display higher tissue specificity than their typical counterparts. Genes proximal to SEs and BDs are associated with tissue identity in most tissues. About 55-182 SEs (5-17% in total) and 99-309 BDs (8-16% in total) across pig tissues are considered functionally conserved elements because they have orthologous SEs and BDs in human and mouse. However, these elements do not necessarily exhibit sequence conservation. The functionally conserved SEs are correlated with tissue identity in the majority of pig tissues, while the conserved BDs are linked to tissue identity in only a few tissues. Our study provides resources for future gene regulatory studies in pig. It highlights that SEs are more effective in defining tissue identity than BDs, which contrasts with a previous study. It also provides novel insights into the sequence features of functionally conserved elements.
Affiliation(s)
- Yanling Peng
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Huifang Kang
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Jing Luo
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yubo Zhang
- Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
|
19
|
Li Y, Pu F, Wang J, Zhou Z, Zhang C, He F, Ma Z, Zhang J. Machine Learning Methods in Prediction of Protein Palmitoylation Sites: A Brief Review. Curr Pharm Des 2021; 27:2189-2198. [PMID: 33183190 DOI: 10.2174/1381612826666201112142826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/13/2020] [Accepted: 07/27/2020] [Indexed: 11/22/2022]
Abstract
Protein palmitoylation is a fundamental and reversible post-translational lipid modification that is involved in a series of biological processes. Although a large number of experimental studies have explored the molecular mechanism behind the palmitoylation process, computational methods have attracted much attention for their good performance in predicting palmitoylation sites compared with expensive and time-consuming biochemical experiments. The prediction of protein palmitoylation sites helps to reveal the biological mechanism of palmitoylation. Therefore, research on the application of machine learning methods to predict palmitoylation sites has become a hot topic in bioinformatics and has promoted development in related fields. In this review, we briefly introduce recent developments in predicting protein palmitoylation sites using machine learning-based methods and discuss their benefits and drawbacks. A perspective on machine learning-based methods for predicting palmitoylation sites is also provided. We hope this review can serve as a guide for related fields.
Affiliation(s)
- Yanwen Li
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Feng Pu
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Jingru Wang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Zhiguo Zhou
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Chunhua Zhang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Jingbo Zhang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
|
20
|
Ashokan A, Harisankar HS, Kameswaran M, Aradhyam GK. Critical APJ receptor residues in extracellular domains that influence effector selectivity. FEBS J 2021; 288:6543-6562. [PMID: 34076959 DOI: 10.1111/febs.16048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/28/2020] [Revised: 04/14/2021] [Accepted: 05/01/2021] [Indexed: 11/29/2022]
Abstract
Human APJ receptor/apelin receptor (APJR), activated by apelin peptide isoforms, regulates a wide range of physiological processes. The role of extracellular loop (ECL) domain residues of APJR in ligand binding and receptor activation has not been established yet. Based on multiple sequence alignment of APJ receptor from various organisms, we identified conserved residues in the extracellular domains. Alanine substitutions of specific residues were characterized to evaluate their ligand binding efficiency and Gq-, Gi-, and β-arrestin-mediated signaling. Mutation-dependent variation in ligand binding and signaling was observed. W197A in ECL2 and L276L277W279-AAA in ECL3 were deficient in Gi and β-arrestin signaling pathways with relatively preserved Gq-mediated signaling. T169T170-AA, Y182A, and T190A mutants in ECL2 showed impaired β-arrestin-dependent cell signaling while maintaining G protein-mediated signaling. Structural comparison with angiotensin II type I receptor revealed the importance of ECL2 and ECL3 residues in APJR ligand binding and signaling. Our results unequivocally confirm the specific role of these ECL residues in ligand binding and in orchestrating receptor conformations that are involved in preferential/biased signaling functions.
Affiliation(s)
- Anisha Ashokan
- Signal Transduction Laboratory, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Harikumar Sheela Harisankar
- Signal Transduction Laboratory, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Mythili Kameswaran
- Radiopharmaceuticals Division, Bhabha Atomic Research Centre, Mumbai, India
- Gopala Krishna Aradhyam
- Signal Transduction Laboratory, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
|
21
|
Fong SL, Capra JA. Modeling the evolutionary architectures of transcribed human enhancer sequences reveals distinct origins, functions, and associations with human-trait variation. Mol Biol Evol 2021; 38:3681-3696. [PMID: 33973014 PMCID: PMC8382917 DOI: 10.1093/molbev/msab138] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023] Open
Abstract
Despite the importance of gene regulatory enhancers in human biology and evolution, we lack a comprehensive model of enhancer evolution and function. This substantially limits our understanding of the genetic basis of species divergence and our ability to interpret the effects of noncoding variants on human traits. To explore enhancer sequence evolution and its relationship to regulatory function, we traced the evolutionary origins of transcribed human enhancer sequences with activity across diverse tissues and cellular contexts from the FANTOM5 consortium. The transcribed enhancers are enriched for sequences of a single evolutionary age (“simple” evolutionary architectures) compared with enhancers that are composites of sequences of multiple evolutionary ages (“complex” evolutionary architectures), likely indicating constraint against genomic rearrangements. Complex enhancers are older, more pleiotropic, and more active across species than simple enhancers. Genetic variants within complex enhancers are also less likely to associate with human traits and biochemical activity. Transposable-element-derived sequences (TEDS) have made diverse contributions to enhancers of both architectures; the majority of TEDS are found in enhancers with simple architectures, while a minority have remodeled older sequences to create complex architectures. Finally, we compare the evolutionary architectures of transcribed enhancers with histone-mark-defined enhancers. Our results reveal that most human transcribed enhancers are ancient sequences of a single age, and thus the evolution of most human enhancers was not driven by increases in evolutionary complexity over time. Our analyses further suggest that considering enhancer evolutionary histories provides context that can aid interpretation of the effects of variants on enhancer function. Based on these results, we propose a framework for analyzing enhancer evolutionary architecture.
Affiliation(s)
- Sarah L Fong
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
- John A Capra
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA; Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA; Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, USA
|
22
|
Lima I, Cino EA. Sequence similarity in 3D for comparison of protein families. J Mol Graph Model 2021; 106:107906. [PMID: 33848948 DOI: 10.1016/j.jmgm.2021.107906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2020] [Revised: 03/18/2021] [Accepted: 03/18/2021] [Indexed: 11/26/2022]
Abstract
Homologous proteins are often compared by pairwise sequence alignment, and structure superposition if the atomic coordinates are available. Unification of sequence and structure data is an important task in structural biology. Here, we present the Sequence Similarity 3D (SS3D) method of integrating sequence and structure information. SS3D is a distance and substitution matrix-based method for straightforward visualization of regions of similarity and difference between homologous proteins. This work details the SS3D approach, and demonstrates its utility through case studies comparing members of several protein families. The examples show that SS3D can effectively highlight biologically important regions of similarity and dissimilarity. We anticipate that the method will be useful for numerous structural biology applications, including, but not limited to, studies of binding specificity, structure-function relationships, and evolutionary pathways. SS3D is available with a manual and tutorial at https://github.com/0x462e41/SS3D/.
Affiliation(s)
- Igor Lima
- Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Elio A Cino
- Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, 31270-901, Brazil.
|
23
|
Giudicelli F, Roest Crollius H. On the importance of evolutionary constraint for regulatory sequence identification. Brief Funct Genomics 2021:elab015. [PMID: 33754633 DOI: 10.1093/bfgp/elab015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2020] [Revised: 01/15/2021] [Accepted: 02/19/2021] [Indexed: 11/13/2022] Open
Abstract
Regulation of gene expression relies on the activity of specialized genomic elements, enhancers or silencers, sometimes distributed over large distances from their target gene promoters. A significant part of vertebrate genomes consists of such regulatory elements, but their identification and that of their target genes remain challenging, due to the lack of a clear signature at the nucleotide level. For many years the main hallmark used for identifying functional elements has been their sequence conservation between genomes of distant species, indicative of purifying selection. More recently, genome-wide biochemical assays have opened new avenues for detecting regulatory regions, shifting attention away from evolutionary constraints. Here, we review the respective contributions of comparative genomics and biochemical assays to the definition of regulatory elements and their targets and advocate that both sequence conservation and preserved synteny, taken as signatures of functional constraint, remain essential tools in this task.
|
24
|
Abstract
The innate immune system relies on a germ-line-encoded repertoire of pattern recognition receptors (PRRs), activated by deeply conserved pathogen signatures, such as bacterial cell wall components or foreign nucleic acids. To enable effective defence against invading pathogens and prevent deleterious inflammation, PRR-driven immune responses are tightly controlled by a dense network of nuclear and cytoplasmic regulators. Long non-coding RNAs (lncRNAs) are increasingly recognized as important components of these regulatory circuitries, providing positive and negative control of PRR-induced innate immune responses. The present review provides an overview of the currently known roles of lncRNAs in human and murine innate antiviral and antibacterial immunity. The emerging roles in host defence and inflammation suggest that further mechanistic insights into the cellular functions of lncRNAs will decisively advance our molecular understanding of immune-associated diseases and open new avenues for therapeutic intervention.
Affiliation(s)
- Katharina Walther
- Institute for Lung Research, Philipps University Marburg, Marburg, Germany
- Leon N Schulte
- Institute for Lung Research, Philipps University Marburg, Marburg, Germany; German Center for Lung Research (DZL), Philipps University Marburg, Marburg, Germany
|
25
|
Gorlov IP, Xia X, Tsavachidis S, Gorlova OY, Amos CI. Tumor somatic mutations also existing as germline polymorphisms may help to identify functional SNPs from genome-wide association studies. Carcinogenesis 2020; 41:1353-1362. [PMID: 32681635 DOI: 10.1093/carcin/bgaa077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2020] [Revised: 07/06/2020] [Accepted: 07/15/2020] [Indexed: 11/12/2022]
Abstract
We hypothesized that a joint analysis of cancer risk-associated single-nucleotide polymorphisms (SNPs) and somatic mutations in tumor samples can predict functional and potentially causal SNPs from GWASs. We used mutations reported in the Catalog of Somatic Mutations in Cancer (COSMIC). Confirmed somatic mutations were subdivided into two groups: (1) mutations reported as SNPs, which we call mutational/SNPs and (2) somatic mutations that are not reported as SNPs, which we call mutational/noSNPs. It is generally accepted that the number of times a somatic mutation is reported in COSMIC correlates with its selective advantage to tumors, with more frequently reported mutations being more functional and providing a stronger selective advantage to the tumor cell. We found that mutations reported ≥10 times in COSMIC, termed frequent mutational/SNPs (fmSNPs), are likely to be functional. We identified 12 cancer risk-associated SNPs reported in the Catalog of published GWASs that were reported at least 10 times as confirmed somatic mutations and were therefore deemed to be functional. Additionally, we identified 42 SNPs that are tightly linked (R² ≥ 0.8) to SNPs reported in the Catalog of published GWASs as cancer risk associated and that are also reported as fmSNPs. As a result, 54 candidate functional/potentially causal cancer risk-associated SNPs were identified. We found that fmSNPs are more likely to be located in evolutionarily conserved regions compared with cancer risk-associated SNPs that are not fmSNPs. We also found that fmSNPs underwent positive selection, which can explain why they exist as population polymorphisms.
Affiliation(s)
- Ivan P Gorlov
- Department of Medicine, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM451, Houston, TX, USA
- Xiangjun Xia
- Department of Medicine, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM451, Houston, TX, USA
- Spiridon Tsavachidis
- Department of Medicine, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM451, Houston, TX, USA
- Olga Y Gorlova
- Department of Medicine, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM451, Houston, TX, USA
- Christopher I Amos
- Department of Medicine, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM451, Houston, TX, USA
|
26
|
Wang H, Ki JS. Molecular identification, differential expression and protective roles of iron/manganese superoxide dismutases in the green algae Closterium ehrenbergii against metal stress. Eur J Protistol 2020; 74:125689. [DOI: 10.1016/j.ejop.2020.125689] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/02/2019] [Revised: 02/05/2020] [Accepted: 02/26/2020] [Indexed: 12/12/2022]
|
27
|
Brzović Z, Šustar P. Postgenomics function monism. Studies in History and Philosophy of Biological and Biomedical Sciences 2020; 80:101243. [PMID: 31924514 DOI: 10.1016/j.shpsc.2019.101243] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/08/2019] [Revised: 10/08/2019] [Accepted: 12/27/2019] [Indexed: 06/10/2023]
Abstract
The ENCODE project has made important new estimates of human genome functionality, now revising the percentage considered functional to more than 80%, which is in stark contrast to the received view, which estimated that less than 10% of the conserved parts of the human genome are functional. ENCODE's unorthodox use of the notion of biological function has stirred the so-called ENCODE controversy, involving conflicting views about the correct notion of function in postgenomics. The debate hinges on the traditional philosophical contrast between the causal role (CR) and selected effects (SE) approaches. In this paper, we examine the ENCODE controversy in terms of the distinction between function monism and pluralism. We propose to apply a weak etiological account to genomic function ascriptions. In this approach, we can ascribe a function to a genomic structure of an organism if and only if performing the function persists in causally contributing to the organism's and its ancestors' fitness. In comparison to the strong etiological (i.e., the selected effects) approach, the present account does not require there to be selection for the structure in question. This is a monistic approach that enables us to avoid the main difficulties of CR, as well as SE's overdependence on natural selection, while still preserving an evolutionary-constrained notion of biological functions. Our proposal is much more moderate in accommodating the estimates of the functionality of the human genome than both ENCODE's proposal itself and the views of the critics relying on a version of the SE account of functions.
Affiliation(s)
- Zdenka Brzović
- Department of Philosophy, Faculty of Humanities and Social Sciences, University of Rijeka, Sveučilišna avenija 4, 51000, Rijeka, Croatia
- Predrag Šustar
- Department of Philosophy, Faculty of Humanities and Social Sciences, University of Rijeka, Sveučilišna avenija 4, 51000, Rijeka, Croatia
|
28
|
Davis LK. Intelligent Design of 14-3-3 Docking Proteins Utilizing Synthetic Evolution Artificial Intelligence (SYN-AI). ACS Omega 2019; 4:18948-18960. [PMID: 31763516 PMCID: PMC6868599 DOI: 10.1021/acsomega.8b03100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/06/2018] [Accepted: 07/10/2019] [Indexed: 05/13/2023]
Abstract
The ability to write DNA code from scratch will allow for the discovery of new and interesting chemistries as well as the rewiring of cell signal pathways. Herein, we have utilized synthetic evolution artificial intelligence (SYN-AI) to intelligently design a set of 14-3-3 docking genes. SYN-AI engineers synthetic genes utilizing a parental gene as an evolution template, wherein evolution is fast-forwarded by transforming template gene sequences to DNA secondary and tertiary codes based upon gene hierarchical structural levels. The DNA secondary code allows identification of genomic building blocks across an orthologous sequence space comprising multiple genomes, whereas the DNA tertiary code allows engineering of supersecondary structures. SYN-AI constructed a library of 10 million genes that was reduced to three structurally functional 14-3-3 docking genes by applying natural selection protocols. Synthetic protein identity was verified utilizing Clustal Omega sequence alignments and Phylogeny.fr phylogenetic analysis. We were able to confirm the three-dimensional structure utilizing I-TASSER and protein-ligand interactions utilizing COACH and Cofactor. The conservation of allosteric communications was confirmed utilizing elastic and anisotropic network models: we utilized elNemo and ANM2.1 to confirm conservation of the 14-3-3 ζ amphipathic groove. Notably, to the best of our knowledge, we report the first 14-3-3 docking genes to be written from scratch.
Affiliation(s)
- Leroy K. Davis
- Prairie View A&M University, Cooperative Agricultural Research Center (CARC), 700 University Drive, Prairie View, Texas 77446-0518, United States
- Gene Evolution Project, LLC, Baton Rouge, Louisiana 70835, United States
|
29
|
Mookerjee-Basu J, Hua X, Ge L, Nicolas E, Li Q, Czyzewicz P, Zhongping D, Peri S, FuxmanBass JI, Walhout AJM, Kappes DJ. Functional Conservation of a Developmental Switch in Mammals since the Jurassic Age. Mol Biol Evol 2019; 36:39-53. [PMID: 30295892 DOI: 10.1093/molbev/msy191] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
ThPOK is a "master regulator" of T lymphocyte lineage choice, whose presence or absence is sufficient to dictate development to the CD4 or CD8 lineages, respectively. Induction of ThPOK is transcriptionally regulated, via a lineage-specific silencer element, SilThPOK. Here, we take advantage of the available genome sequence data as well as site-specific gene targeting technology, to evaluate the functional conservation of ThPOK regulation across mammalian evolution, and assess the importance of motif grammar (order and orientation of TF binding sites) on SilThPOK function in vivo. We make three important points: First, the SilThPOK is present in marsupial and placental mammals, but is not found in available genome assemblies of nonmammalian vertebrates, indicating that it arose after divergence of mammals from other vertebrates. Secondly, by replacing the murine SilThPOK in situ with its marsupial equivalent using a knockin approach, we demonstrate that the marsupial SilThPOK supports correct CD4 T lymphocyte lineage-specification in mice. To our knowledge, this is the first in vivo demonstration of functional equivalency for a silencer element between marsupial and placental mammals using a definitive knockin approach. Finally, we show that alteration of the position/orientation of a highly conserved region within the murine SilThPOK is sufficient to destroy silencer activity in vivo, demonstrating that motif grammar of this "solid" synteny block is critical for silencer function. Dependence of SilThPOK function on motif grammar conserved since the mid-Jurassic age, 165 Ma, suggests that the SilThPOK operates as a silenceosome, by analogy with the previously proposed enhanceosome model.
Affiliation(s)
- Jayati Mookerjee-Basu
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Xiang Hua
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Lu Ge
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Emmanuelle Nicolas
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Qin Li
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Philip Czyzewicz
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Dai Zhongping
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Suraj Peri
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
- Juan I FuxmanBass
- Program in Systems Biology, Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA
- Albertha J M Walhout
- Program in Systems Biology, Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA
- Dietmar J Kappes
- Blood Cell Development and Cancer Program, Fox Chase Cancer Center, Philadelphia, PA
|
30
|
Hockenberry AJ, Jewett MC, Amaral LAN, Wilke CO. Within-Gene Shine-Dalgarno Sequences Are Not Selected for Function. Mol Biol Evol 2019; 35:2487-2498. [PMID: 30085185 DOI: 10.1093/molbev/msy150] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Abstract
The Shine-Dalgarno (SD) sequence motif facilitates translation initiation and is frequently found upstream of bacterial start codons. However, thousands of instances of this motif occur throughout the middle of protein coding genes in a typical bacterial genome. Here, we use comparative evolutionary analysis to test whether SD sequences located within genes are functionally constrained. We measure the conservation of SD sequences across Enterobacteriales, and find that they are significantly less conserved than expected. Further, the strongest SD sequences are the least conserved whereas we find evidence of conservation for the weakest possible SD sequences given amino acid constraints. Our findings indicate that most SD sequences within genes are likely to be deleterious and removed via selection. To illustrate the origin of these deleterious costs, we show that ATG start codons are significantly depleted downstream of SD sequences within genes, highlighting the constraint that these sequences impose on the surrounding nucleotides to minimize the potential for erroneous translation initiation.
Affiliation(s)
- Adam J Hockenberry
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
- Michael C Jewett
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL; Chemistry of Life Processes Institute, Northwestern University, Evanston, IL; Center for Synthetic Biology, Northwestern University, Evanston, IL; Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL; Simpson Querrey Institute, Northwestern University, Evanston, IL
- Luís A N Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL; Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL
- Claus O Wilke
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
|
31
|
Improved measures for evolutionary conservation that exploit taxonomy distances. Nat Commun 2019; 10:1556. [PMID: 30952844 PMCID: PMC6450959 DOI: 10.1038/s41467-019-09583-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 04/19/2018] [Accepted: 03/19/2019] [Indexed: 11/30/2022]
Abstract
Selective pressures on protein-coding regions that provide fitness advantages can lead to the regions' fixation and conservation in genome duplications and speciation events. Consequently, conservation analyses relying on sequence similarities are exploited by a myriad of applications across all biosciences to identify functionally important protein regions. While very potent, existing conservation measures based on multiple sequence alignments are so pervasive that improvements to solutions of many problems have become incremental. We introduce a new framework for evolutionary conservation with measures that exploit taxonomy distances across species. Results show that our taxonomy-based framework comfortably outperforms existing conservation measures in identifying deleterious variants observed in the human population, including variants located in non-abundant sequence domains such as intrinsically disordered regions. The predictive power of our approach emphasizes that the phenotypic effects of sequence variants can be taxonomy-level specific and thus, conservation needs to be interpreted accordingly. Information on protein sequence variability and conservation can be leveraged to identify functionally important regions. Here, the authors develop new conservation measures that exploit taxonomy distances and LIST, a tool for predicting deleteriousness of human variants.
|
32
|
How is structural divergence related to evolutionary information? Mol Phylogenet Evol 2018; 127:859-866. [DOI: 10.1016/j.ympev.2018.06.033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/18/2017] [Revised: 06/01/2018] [Accepted: 06/19/2018] [Indexed: 12/15/2022]
|
33
|
Law WD, Fogarty EA, Vester A, Antonellis A. A genome-wide assessment of conserved SNP alleles reveals a panel of regulatory SNPs relevant to the peripheral nerve. BMC Genomics 2018; 19:311. [PMID: 29716548 PMCID: PMC5930951 DOI: 10.1186/s12864-018-4692-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/07/2017] [Accepted: 04/17/2018] [Indexed: 12/29/2022]
Abstract
Background: Identifying functional non-coding variation is critical for defining the genetic contributions to human disease. While single-nucleotide polymorphisms (SNPs) within cis-acting transcriptional regulatory elements have been implicated in disease pathogenesis, not all cell types have been assessed and functional validations have been limited. In particular, the cells of the peripheral nervous system have been excluded from genome-wide efforts to link non-coding SNPs to altered gene function. Addressing this gap is essential for defining the genetic architecture of diseases that affect the peripheral nerve. We developed a computational pipeline to identify SNPs that affect regulatory function (rSNPs) and evaluated our predictions on a set of 144 regions in Schwann cells, motor neurons, and muscle cells.
Results: We identified 28 regions that display regulatory activity in at least one cell type and 13 SNPs that affect regulatory function. We then tailored our pipeline to one peripheral nerve cell type by incorporating SOX10 ChIP-Seq data; SOX10 is essential for Schwann cells. We prioritized 22 putative SOX10 response elements harboring a SNP and rapidly validated two rSNPs. We then selected one of these elements for further characterization to assess the biological relevance of our approach. Deletion of the element from the genome of cultured Schwann cells, followed by differential gene expression studies, revealed Tubb2b as a candidate target gene. Studying the enhancer in developing mouse embryos revealed activity in SOX10-positive cells including the dorsal root ganglia and melanoblasts.
Conclusions: Our efforts provide insight into the utility of employing strict conservation for rSNP discovery. This strategy, combined with functional analyses, can yield candidate target genes. In support of this, our efforts suggest that investigating the role of Tubb2b in SOX10-positive cells may reveal novel biology within these cell populations.
Affiliation(s)
- William D Law
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, USA
- Elizabeth A Fogarty
- Neuroscience Graduate Program, University of Michigan Medical School, Ann Arbor, MI, USA
- Aimée Vester
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, USA
- Anthony Antonellis
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, USA; Neuroscience Graduate Program, University of Michigan Medical School, Ann Arbor, MI, USA; Department of Neurology, University of Michigan Medical School, 3710A Medical Sciences II, 1241 E. Catherine St. SPC 5618, Ann Arbor, MI, 48109, USA
|
34
|
Raghuraman P, Sudandiradoss C. R516Q mutation in Melanoma differentiation-associated protein 5 (MDA5) and its pathogenic role towards rare Singleton-Merten syndrome; a signature associated molecular dynamics study. J Biomol Struct Dyn 2018; 37:750-765. [DOI: 10.1080/07391102.2018.1439770] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/24/2023]
Affiliation(s)
- P. Raghuraman
- Department of Biotechnology, School of Bioscience and Technology, VIT University, Vellore 632014, India
- C. Sudandiradoss
- Department of Biotechnology, School of Bioscience and Technology, VIT University, Vellore 632014, India
|
35
|
Berthelot C, Villar D, Horvath JE, Odom DT, Flicek P. Complexity and conservation of regulatory landscapes underlie evolutionary resilience of mammalian gene expression. Nat Ecol Evol 2018; 2:152-163. [PMID: 29180706 PMCID: PMC5733139 DOI: 10.1038/s41559-017-0377-2] [Citation(s) in RCA: 92] [Impact Index Per Article: 13.1] [Received: 07/03/2017] [Accepted: 10/10/2017] [Indexed: 02/02/2023]
Abstract
To gain insight into how mammalian gene expression is controlled by rapidly evolving regulatory elements, we jointly analysed promoter and enhancer activity with downstream transcription levels in liver samples from 15 species. Genes associated with complex regulatory landscapes generally exhibit high expression levels that remain evolutionarily stable. While the number of regulatory elements is the key driver of transcriptional output and resilience, regulatory conservation matters: elements active across mammals most effectively stabilize gene expression. In contrast, recently evolved enhancers typically contribute weakly, consistent with their high evolutionary plasticity. These effects are observed across the entire mammalian clade and are robust to potential confounders, such as the gene expression level. Using liver as a representative somatic tissue, our results illuminate how the evolutionary stability of gene expression is profoundly entwined with both the number and conservation of surrounding promoters and enhancers.
Affiliation(s)
- Camille Berthelot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Institut de Biologie de l'Ecole Normale Supérieure, Centre National de la Recherche Scientifique UMR8197, Institut National de la Santé et de la Recherche Médicale U1024, 46 Rue d'Ulm, 75230, Paris, Cedex 05, France
- Diego Villar
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
- Julie E Horvath
- Biological and Biomedical Sciences, North Carolina Central University, Durham, NC, 27707, USA
- North Carolina Museum of Natural Sciences, Raleigh, NC, 27601, USA
- Evolutionary Anthropology Department, Duke University, Durham, NC, 27707, USA
- Duncan T Odom
- University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
36
|
Trizzino M, Park Y, Holsbach-Beltrame M, Aracena K, Mika K, Caliskan M, Perry GH, Lynch VJ, Brown CD. Transposable elements are the primary source of novelty in primate gene regulation. Genome Res 2017; 27:1623-1633. [PMID: 28855262 PMCID: PMC5630026 DOI: 10.1101/gr.218149.116] [Citation(s) in RCA: 143] [Impact Index Per Article: 17.9] [Received: 11/07/2016] [Accepted: 08/17/2017] [Indexed: 12/11/2022]
Abstract
Gene regulation shapes the evolution of phenotypic diversity. We investigated the evolution of liver promoters and enhancers in six primate species using ChIP-seq (H3K27ac and H3K4me1) to profile cis-regulatory elements (CREs) and using RNA-seq to characterize gene expression in the same individuals. To quantify regulatory divergence, we compared CRE activity across species by testing differential ChIP-seq read depths directly measured for orthologous sequences. We show that the primate regulatory landscape is largely conserved across the lineage, with 63% of the tested human liver CREs showing similar activity across species. Conserved CRE function is associated with sequence conservation, proximity to coding genes, cell-type specificity, and transcription factor binding. Newly evolved CREs are enriched in immune response and neurodevelopmental functions. We further demonstrate that conserved CREs bind master regulators, suggesting that while CREs contribute to species adaptation to the environment, core functions remain intact. Newly evolved CREs are enriched in young transposable elements (TEs), including Long-Terminal-Repeats (LTRs) and SINE-VNTR-Alus (SVAs), that significantly affect gene expression. Conversely, only 16% of conserved CREs overlap TEs. We tested the cis-regulatory activity of 69 TE subfamilies by luciferase reporter assays, spanning all major TE classes, and showed that 95.6% of tested TEs can function as either transcriptional activators or repressors. In conclusion, we demonstrated the critical role of TEs in primate gene regulation and illustrated potential mechanisms underlying evolutionary divergence among the primate species through the noncoding genome.
Affiliation(s)
- Marco Trizzino
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- YoSon Park
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Marcia Holsbach-Beltrame
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Katherine Aracena
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Katelyn Mika
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
- Minal Caliskan
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- George H Perry
- Departments of Anthropology and Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Vincent J Lynch
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
- Christopher D Brown
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
37
|
Abstract
Human genome sequencing is routine and will soon be a staple in research and clinical genetics. However, the promise of sequencing is often just that, with genome data routinely failing to reveal useful insights about disease in general or a person's health in particular. Nowhere is this chasm between promise and progress more evident than in the designation, "variant of uncertain significance" (VUS). Although it serves an important role, careful consideration of VUS reveals it to be a nebulous description of genomic information and its relationship to disease, symptomatic of our inability to make even crude quantitative assertions about the disease risks conferred by many genetic variants. In this perspective, I discuss the challenge of "variant interpretation" and the value of comparative and functional genomic information in meeting that challenge. Although already essential, genomic annotations will become even more important as our analytical focus widens beyond coding exons. Combined with more genotype and phenotype data, they will help facilitate more quantitative and insightful assessments of the contributions of genetic variants to disease.
Affiliation(s)
- Gregory M Cooper
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
|
38
|
Lowdon RF, Jang HS, Wang T. Evolution of Epigenetic Regulation in Vertebrate Genomes. Trends Genet 2016; 32:269-283. [PMID: 27080453 PMCID: PMC4842087 DOI: 10.1016/j.tig.2016.03.001] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 11/03/2015] [Revised: 03/02/2016] [Accepted: 03/03/2016] [Indexed: 12/31/2022]
Abstract
Empirical models of sequence evolution have spurred progress in the field of evolutionary genetics for decades. We are now realizing the importance and complexity of the eukaryotic epigenome. While epigenome analysis has been applied to genomes from single-cell eukaryotes to human, comparative analyses are still relatively few and computational algorithms to quantify epigenome evolution remain scarce. Accordingly, a quantitative model of epigenome evolution remains to be established. We review here the comparative epigenomics literature and synthesize its overarching themes. We also suggest one mechanism, transcription factor binding site (TFBS) turnover, which relates sequence evolution to epigenetic conservation or divergence. Lastly, we propose a framework for how the field can move forward to build a coherent quantitative model of epigenome evolution.
Affiliation(s)
- Rebecca F Lowdon
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
- Hyo Sik Jang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
- Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
|
39
|
Kumar S, Kumari R, Sharma V. Coevolution mechanisms that adapt viruses to genetic code variations implemented in their hosts. J Genet 2016; 95:3-12. [PMID: 27019427 DOI: 10.1007/s12041-016-0612-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Affiliation(s)
- Sushil Kumar
- SKA Institution for Research, Education and Development, 4/11 SarvPriya Vihar, New Delhi 110016, India
|
40
|
Olgiati S, Quadri M, Bonifati V. Genetics of movement disorders in the next-generation sequencing era. Mov Disord 2016; 31:458-70. [DOI: 10.1002/mds.26521] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 09/08/2015] [Accepted: 11/29/2015] [Indexed: 12/15/2022]
Affiliation(s)
- Simone Olgiati
- Department of Clinical Genetics, Erasmus MC, Rotterdam, The Netherlands
- Marialuisa Quadri
- Department of Clinical Genetics, Erasmus MC, Rotterdam, The Netherlands
- Vincenzo Bonifati
- Department of Clinical Genetics, Erasmus MC, Rotterdam, The Netherlands
|
41
|
Aprea J, Calegari F. Long non-coding RNAs in corticogenesis: deciphering the non-coding code of the brain. EMBO J 2015; 34:2865-84. [PMID: 26516210 DOI: 10.15252/embj.201592655] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 07/24/2015] [Accepted: 10/05/2015] [Indexed: 01/17/2023]
Abstract
Evidence on the role of long non-coding (lnc) RNAs has been accumulating over decades, but it has been only recently that advances in sequencing technologies have allowed the field to fully appreciate their abundance and diversity. Despite this, only a handful of lncRNAs have been phenotypically or mechanistically studied. Moreover, novel lncRNAs and new classes of RNAs are being discovered at growing pace, suggesting that this class of molecules may have functions as diverse as protein-coding genes. Interestingly, the brain is the organ where lncRNAs have the most peculiar features including the highest number of lncRNAs that are expressed, proportion of tissue-specific lncRNAs and highest signals of evolutionary conservation. In this work, we critically review the current knowledge about the steps that have led to the identification of the non-coding transcriptome including the general features of lncRNAs in different contexts in terms of both their genomic organisation, evolutionary origin, patterns of expression, and function in the developing and adult mammalian brain.
Affiliation(s)
- Julieta Aprea
- DFG-Research Center and Cluster of Excellence for Regenerative Therapies, Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Federico Calegari
- DFG-Research Center and Cluster of Excellence for Regenerative Therapies, Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
|
42
|
AlloRep: A Repository of Sequence, Structural and Mutagenesis Data for the LacI/GalR Transcription Regulators. J Mol Biol 2015; 428:671-678. [PMID: 26410588 DOI: 10.1016/j.jmb.2015.09.015] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 06/30/2015] [Revised: 09/04/2015] [Accepted: 09/17/2015] [Indexed: 11/20/2022]
Abstract
Protein families evolve functional variation by accumulating point mutations at functionally important amino acid positions. Homologs in the LacI/GalR family of transcription regulators have evolved to bind diverse DNA sequences and allosteric regulatory molecules. In addition to playing key roles in bacterial metabolism, these proteins have been widely used as a model family for benchmarking structural and functional prediction algorithms. We have collected manually curated sequence alignments for >3000 sequences, in vivo phenotypic and biochemical data for >5750 LacI/GalR mutational variants, and noncovalent residue contact networks for 65 LacI/GalR homolog structures. Using this rich data resource, we compared the noncovalent residue contact networks of the LacI/GalR subfamilies to design and experimentally validate an allosteric mutant of a synthetic LacI/GalR repressor for use in biotechnology. The AlloRep database (freely available at www.AlloRep.org) is a key resource for future evolutionary studies of LacI/GalR homologs and for benchmarking computational predictions of functional change.
|
43
|
Gonen S, Bishop SC, Houston RD. Exploring the utility of cross-laboratory RAD-sequencing datasets for phylogenetic analysis. BMC Res Notes 2015; 8:299. [PMID: 26152111 PMCID: PMC4495686 DOI: 10.1186/s13104-015-1261-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/06/2015] [Accepted: 06/25/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Restriction site-Associated DNA sequencing (RAD-Seq) is widely applied to generate genome-wide sequence and genetic marker datasets. RAD-Seq has been extensively utilised, both at the population level and across species, for example in the construction of phylogenetic trees. However, the consistency of RAD-Seq data generated in different laboratories, and the potential use of cross-species orthologous RAD loci in the estimation of genetic relationships, have not been widely investigated. This study describes the use of SbfI RAD-Seq data for the estimation of evolutionary relationships amongst ten teleost fish species, using previously established phylogeny as a benchmark. RESULTS The number of orthologous SbfI RAD loci identified decreased with increasing evolutionary distance between the species, with several thousand loci conserved across five salmonid species (divergence ~50 MY), and several hundred conserved across the more distantly related teleost species (divergence ~100-360 MY). The majority (>70%) of loci identified between the more distantly related species were genic in origin, suggesting that the bias of SbfI towards genic regions is useful for identifying distant orthologs. Interspecific single nucleotide variants at each orthologous RAD locus were identified. Evolutionary relationships estimated using concatenated sequences of interspecific variants were congruent with previously published phylogenies, even for distantly (divergence up to ~360 MY) related species. CONCLUSION Overall, this study has demonstrated that orthologous SbfI RAD loci can be identified across closely and distantly related species. This has positive implications for the repeatability of SbfI RAD-Seq and its potential to address research questions beyond the scope of the original studies. 
Furthermore, the concordance in tree topologies and relationships estimated in this study with published teleost phylogenies suggests that similar meta-datasets could be utilised in the prediction of evolutionary relationships across populations and species with readily available RAD-Seq datasets, but for which relationships remain uncharacterised.
Affiliation(s)
- Serap Gonen
- The Roslin Institute, University of Edinburgh, Midlothian, EH25 9RG, Scotland, UK.
- Stephen C Bishop
- The Roslin Institute, University of Edinburgh, Midlothian, EH25 9RG, Scotland, UK.
- Ross D Houston
- The Roslin Institute, University of Edinburgh, Midlothian, EH25 9RG, Scotland, UK.
|
44
|
Gordon KL, Arthur RK, Ruvinsky I. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence. PLoS Genet 2015; 11:e1005268. [PMID: 26020930 PMCID: PMC4447282 DOI: 10.1371/journal.pgen.1005268] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/15/2014] [Accepted: 05/09/2015] [Indexed: 11/28/2022] Open
Abstract
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. To explore the phylogenetic limits of conservation of cis-regulatory elements, we used transgenesis to test the functions of enhancers of four genes from several species spanning the phylum Nematoda. While we found a striking degree of functional conservation among the examined cis elements, their DNA sequences lacked apparent conservation with the C. elegans orthologs. In fact, sequence similarity between C. elegans and the distantly related nematodes was no greater than would be expected by chance. Short motifs, similar to known regulatory sequences in C. elegans, can be detected in most of the cis elements. When tested, some of these sites appear to mediate regulatory function. However, they seem to have originated through motif turnover, rather than to have been preserved from a common ancestor. Our results suggest that gene regulatory networks are broadly conserved in the phylum Nematoda, but this conservation persists despite substantial reorganization of regulatory elements and could not be detected using naïve comparisons of sequence similarity.
Affiliation(s)
- Kacy L. Gordon
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (KLG); (IR)
- Robert K. Arthur
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (KLG); (IR)
|
45
|
Krueger T, Fisher PL, Becker S, Pontasch S, Dove S, Hoegh-Guldberg O, Leggat W, Davy SK. Transcriptomic characterization of the enzymatic antioxidants FeSOD, MnSOD, APX and KatG in the dinoflagellate genus Symbiodinium. BMC Evol Biol 2015; 15:48. [PMID: 25887897 PMCID: PMC4416395 DOI: 10.1186/s12862-015-0326-0] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 10/06/2014] [Accepted: 02/24/2015] [Indexed: 11/26/2022] Open
Abstract
Background The diversity of the symbiotic dinoflagellate Symbiodinium sp., as assessed by genetic markers, is well established. To what extent this diversity is reflected on the amino acid level of functional genes such as enzymatic antioxidants that play an important role in thermal stress tolerance of the coral-Symbiodinium symbiosis is, however, unknown. Here we present a predicted structural analysis and phylogenetic characterization of the enzymatic antioxidant repertoire of the genus Symbiodinium. We also report gene expression and enzymatic activity under short-term thermal stress in Symbiodinium of the B1 genotype. Results Based on eight different ITS2 types, covering six clades, multiple protein isoforms for three of the four investigated antioxidants (ascorbate peroxidase [APX], catalase peroxidase [KatG], manganese superoxide dismutase [MnSOD]) are present in the genus Symbiodinium. Amino acid sequences of both SOD metalloforms (Fe/Mn), as well as KatG, exhibited a number of prokaryotic characteristics that were also supported by the protein phylogeny. In contrast to the bacterial form, KatG in Symbiodinium is characterized by extended functionally important loops and a shortened C-terminal domain. Intercladal sequence variations were found to be much higher in both peroxidases, compared to SODs. For APX, these variable residues involve binding sites for substrates and cofactors, and might therefore differentially affect the catalytic properties of this enzyme between clades. While expression of antioxidant genes was successfully measured in Symbiodinium B1, it was not possible to assess the link between gene expression and protein activity due to high variability in expression between replicates, and little response in their enzymatic activity over the three-day experimental period. 
Conclusions The genus Symbiodinium has a diverse enzymatic antioxidant repertoire that has similarities to prokaryotes, potentially as a result of horizontal gene transfer or events of secondary endosymbiosis. Different degrees of sequence evolution between SODs and peroxidases might be the result of potential selective pressure on the conserved molecular function of SODs as the first line of defence. In contrast, genetic redundancy of hydrogen peroxide scavenging enzymes might permit the observed variations in peroxidase sequences. Our data and successful measurement of antioxidant gene expression in Symbiodinium will serve as basis for further studies of coral health. Electronic supplementary material The online version of this article (doi:10.1186/s12862-015-0326-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Thomas Krueger
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand; Laboratory for Biological Geochemistry, ENAC, École polytechnique fédérale de Lausanne (EPFL), Lausanne, 1015, Switzerland.
- Paul L Fisher
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand; School of Civil Engineering, University of Queensland, St Lucia, QLD 4072, Australia.
- Susanne Becker
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand.
- Stefanie Pontasch
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand.
- Sophie Dove
- School of Biological Sciences & ARC Centre of Excellence for Coral Reef Studies, University of Queensland, Brisbane, QLD, 4072, Australia.
- Ove Hoegh-Guldberg
- Global Change Institute, University of Queensland, Brisbane, QLD 4072, Australia.
- William Leggat
- Comparative Genomics Centre, School of Pharmacy and Molecular Sciences & ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811, Australia.
- Simon K Davy
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand.
|
46
|
Pervaiz T, Sun X, Zhang Y, Tao R, Zhang J, Fang J. Association between Chloroplast and Mitochondrial DNA sequences in Chinese Prunus genotypes (Prunus persica, Prunus domestica, and Prunus avium). BMC Plant Biol 2015; 15:4. [PMID: 0 PMCID: PMC4310034 DOI: 10.1186/s12870-014-0402-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/07/2014] [Accepted: 12/22/2014] [Indexed: 05/13/2023]
Abstract
BACKGROUND Nuclear DNA is conventionally used to assess the diversity and relatedness among different species, but variations at the organelle genome level have also been used to study the relationship among different organisms. In most species, mitochondrial and chloroplast genomes are inherited maternally; therefore it is anticipated that organelle DNA remains completely associated. Many research studies were conducted simultaneously on organelle genomes. The objective of this study was to analyze the genetic relationship between chloroplast and mitochondrial DNA in three Chinese Prunus genotypes, viz. Prunus persica, Prunus domestica, and Prunus avium. RESULTS We investigated the genetic diversity of Prunus genotypes using simple sequence repeat (SSR) markers relevant to the chloroplast and mitochondria. Most of the genotypes were genetically similar as revealed by phylogenetic analysis. The Y2 Wu Xing (Cherry) and L2 Hong Xin Li (Plum) genotypes have a high similarity index (0.89), followed by Zi Ye Li (0.85), whereas L1 Tai Yang Li (Plum) has the lowest genetic similarity (0.35). In the case of cpSSR, Hong Tao (Peach) and L1 Tai Yang Li (Plum) genotypes demonstrated a similarity index of 0.85 and Huang Tao has the lowest similarity index of 0.50. The mtSSR nucleotide sequence analysis revealed that each genotype has a similar amplicon length (509 bp) except M5Y1 (505 bp) with the CCB256 primer, while in the case of the NAD6 primer, all genotypes showed different sizes. The MEHO (Peach), MEY1 (Cherry), MEL2 (Plum) and MEL1 (Plum) have 586 bp, while MEY2 (Cherry), MEZI (Plum) and MEHU (Peach) have 585, 584 and 566 bp, respectively. The CCB256 primer showed highly conserved sequences and minute single polymorphic nucleotides with no deletion or mutation. The cpSSR (ARCP511) microsatellites showed harmonious amplicon lengths. The CZI (Plum), CHO (Peach) and CL1 (Plum) showed 182 bp, while CHU (Peach), CY2 (Cherry), CL2 (Plum) and CY1 (Cherry) showed 181 bp amplicon lengths.
CONCLUSIONS These results demonstrated high conservation in chloroplast and mitochondrial genomes among Prunus species during the evolutionary process. These findings are valuable for studying organelle DNA diversity in different species and genotypes of Prunus and provide in-depth insight into the mitochondrial and chloroplast genomes.
Affiliation(s)
- Tariq Pervaiz
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, P R China.
- Xin Sun
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, P R China.
- Yanyi Zhang
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, P R China.
- Ran Tao
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, P R China.
- Junhuan Zhang
- Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Science, Beijing, 100093, P R China.
- Jinggui Fang
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, P R China.
|
47
|
Kapusta A, Feschotte C. Volatile evolution of long noncoding RNA repertoires: mechanisms and biological implications. Trends Genet 2014; 30:439-52. [PMID: 25218058 PMCID: PMC4464757 DOI: 10.1016/j.tig.2014.08.004] [Citation(s) in RCA: 204] [Impact Index Per Article: 18.5] [Received: 07/09/2014] [Revised: 08/15/2014] [Accepted: 08/16/2014] [Indexed: 02/08/2023]
Abstract
Thousands of genes encoding long noncoding RNAs (lncRNAs) have been identified in all vertebrate genomes thus far examined. The list of lncRNAs partaking in arguably important biochemical, cellular, and developmental activities is steadily growing. However, it is increasingly clear that lncRNA repertoires are subject to weak functional constraint and rapid turnover during vertebrate evolution. We discuss here some of the factors that may explain this apparent paradox, including relaxed constraint on sequence to maintain lncRNA structure/function, extensive redundancy in the regulatory circuits in which lncRNAs act, as well as adaptive and non-adaptive forces such as genetic drift. We explore the molecular mechanisms promoting the birth and rapid evolution of lncRNA genes, with an emphasis on the influence of bidirectional transcription and transposable elements, two pervasive features of vertebrate genomes. Together these properties reveal a remarkably dynamic and malleable noncoding transcriptome which may represent an important source of robustness and evolvability.
Affiliation(s)
- Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA.
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA.
|
48
|
Barrière A, Ruvinsky I. Pervasive divergence of transcriptional gene regulation in Caenorhabditis nematodes. PLoS Genet 2014; 10:e1004435. [PMID: 24968346 PMCID: PMC4072541 DOI: 10.1371/journal.pgen.1004435] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 12/02/2013] [Accepted: 04/28/2014] [Indexed: 12/18/2022] Open
Abstract
Because there is considerable variation in gene expression even between closely related species, it is clear that gene regulatory mechanisms evolve relatively rapidly. Because primary sequence conservation is an unreliable proxy for functional conservation of cis-regulatory elements, their assessment must be carried out in vivo. We conducted a survey of cis-regulatory conservation between C. elegans and closely related species C. briggsae, C. remanei, C. brenneri, and C. japonica. We tested enhancers of eight genes from these species by introducing them into C. elegans and analyzing the expression patterns they drove. Our results support several notable conclusions. Most exogenous cis elements direct expression in the same cells as their C. elegans orthologs, confirming gross conservation of regulatory mechanisms. However, the majority of exogenous elements, when placed in C. elegans, also directed expression in cells outside endogenous patterns, suggesting functional divergence. Recurrent ectopic expression of different promoters in the same C. elegans cells may reflect biases in the directions in which expression patterns can evolve due to shared regulatory logic of coexpressed genes. The fact that, despite differences between individual genes, several patterns repeatedly emerged from our survey, encourages us to think that general rules governing regulatory evolution may exist and be discoverable.
Affiliation(s)
- Antoine Barrière
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (AB); (IR)
- Ilya Ruvinsky
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (AB); (IR)
|
49
|
Abstract
Transcription factor binding sites (TFBSs) on the DNA are generally accepted as the key nodes of gene control. However, the multitudes of TFBSs identified in genome-wide studies, some of them seemingly unconstrained in evolution, have prompted the view that in many cases TF binding may serve no biological function. Yet, insights from transcriptional biochemistry, population genetics and functional genomics suggest that rather than segregating into 'functional' or 'non-functional', TFBS inputs to their target genes may be generally cumulative, with varying degrees of potency and redundancy. As TFBS redundancy can be diminished by mutations and environmental stress, some of the apparently 'spurious' sites may turn out to be important for maintaining adequate transcriptional regulation under these conditions. This has significant implications for interpreting the phenotypic effects of TFBS mutations, particularly in the context of genome-wide association studies for complex traits.
|
50
|
Enard W. Comparative genomics of brain size evolution. Front Hum Neurosci 2014; 8:345. [PMID: 24904382 PMCID: PMC4033227 DOI: 10.3389/fnhum.2014.00345] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 03/25/2014] [Accepted: 05/06/2014] [Indexed: 01/12/2023] Open
Abstract
Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large search space of mammalian genomes. Hence, it is crucial to add functional information, for example by limiting the search space to genes and regulatory elements known to play a role in the relevant cell types during brain development. Similarly, it is crucial to experimentally follow up on hypotheses generated by such a comparative approach. Recent progress in understanding the molecular and cellular mechanisms of mammalian brain development, in genome sequencing and in genome editing, promises to make a close integration of evolutionary and experimental methods a fruitful approach to better understand the genetics of mammalian brain size evolution.
Affiliation(s)
- Wolfgang Enard
- Department of Biology II, Ludwig Maximilian University Munich, Munich, Germany
|