1
Beals J, Hu H, Li X. A survey of experimental and computational identification of small proteins. Brief Bioinform 2024; 25:bbae345. [PMID: 39007598 PMCID: PMC11247407 DOI: 10.1093/bib/bbae345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 05/27/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024] Open
Abstract
Small proteins (SPs) are typically characterized as eukaryotic proteins shorter than 100 amino acids and prokaryotic proteins shorter than 50 amino acids. Historically, they were disregarded because of the arbitrary size thresholds to define proteins. However, recent research has revealed the existence of many SPs and their crucial roles. Despite this, the identification of SPs and the elucidation of their functions are still in their infancy. To pave the way for future SP studies, we briefly introduce the limitations and advancements in experimental techniques for SP identification. We then provide an overview of available computational tools for SP identification, their constraints, and their evaluation. Additionally, we highlight existing resources for SP research. This survey aims to initiate further exploration into SPs and encourage the development of more sophisticated computational tools for SP identification in prokaryotes and microbiomes.
Affiliation(s)
- Joshua Beals
  - Burnett School of Biomedical Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
- Haiyan Hu
  - Department of Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
- Xiaoman Li
  - Burnett School of Biomedical Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States
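The length-based definition quoted in the abstract of entry 1 (eukaryotic proteins shorter than 100 amino acids, prokaryotic proteins shorter than 50) can be written down as a small check. This is only an illustrative sketch: the names `SP_MAX_LENGTH` and `is_small_protein` are hypothetical and do not come from the cited survey, and real annotation pipelines may apply different or additional criteria.

```python
# Length cutoffs as stated in the abstract of entry 1.
# The dictionary and function names are hypothetical, chosen for illustration.
SP_MAX_LENGTH = {"eukaryote": 100, "prokaryote": 50}


def is_small_protein(sequence: str, domain: str) -> bool:
    """Return True if the protein is shorter than the cutoff for its domain."""
    try:
        cutoff = SP_MAX_LENGTH[domain]
    except KeyError:
        raise ValueError(f"unknown domain: {domain!r}") from None
    # Do not count a trailing stop symbol ('*') as a residue.
    length = len(sequence.rstrip("*"))
    return length < cutoff
```

Note that the comparison is strict ("shorter than"), so a 50-residue prokaryotic protein is not classified as small under this reading of the definition.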
2
Yang H, Li Q, Stroup EK, Wang S, Ji Z. Widespread stable noncanonical peptides identified by integrated analyses of ribosome profiling and ORF features. Nat Commun 2024; 15:1932. [PMID: 38431639 PMCID: PMC10908861 DOI: 10.1038/s41467-024-46240-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 02/18/2024] [Indexed: 03/05/2024] Open
Abstract
Studies have revealed dozens of functional peptides in putative 'noncoding' regions and raised the question of how many proteins are encoded by noncanonical open reading frames (ORFs). Here, we comprehensively annotate genome-wide translated ORFs across five eukaryotes (human, mouse, zebrafish, worm, and yeast) by analyzing ribosome profiling data. We develop a logistic regression model named PepScore based on ORF features (expected length, encoded domain, and conservation) to calculate the probability that the encoded peptide is stable in humans. Systematic ectopic expression validates PepScore and shows that stable complex-associating microproteins can be encoded in 5'/3' untranslated regions and overlapping coding regions of mRNAs besides annotated noncoding RNAs. Stable noncanonical proteins follow conventional rules and localize to different subcellular compartments. Inhibition of proteasomal/lysosomal degradation pathways can stabilize some peptides especially those with moderate PepScores, but cannot rescue the expression of short ones with low PepScores suggesting they are directly degraded by cellular proteases. The majority of human noncanonical peptides with high PepScores show longer lengths but low conservation across species/mammals, and hundreds contain trait-associated genetic variants. Our study presents a statistical framework to identify stable noncanonical peptides in the genome and provides a valuable resource for functional characterization of noncanonical translation during development and disease.
Affiliation(s)
- Haiwang Yang
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Qianru Li
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Emily K Stroup
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Sheng Wang
  - Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, 60628, USA
- Zhe Ji
  - Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, 60628, USA
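The abstract of entry 2 describes PepScore as a logistic regression over ORF features (expected length, encoded domain, and conservation) that returns a probability of peptide stability. The general form of such a model can be sketched as follows. The intercept, weights, feature encodings, and the name `stability_probability` are all invented for illustration; they are not the fitted PepScore parameters, which the abstract does not give.

```python
import math

# Hypothetical coefficients for illustration only; NOT the published PepScore model.
INTERCEPT = -4.0
WEIGHTS = {"length_aa": 0.03, "has_domain": 1.5, "conservation": 2.0}


def stability_probability(length_aa: float, has_domain: bool, conservation: float) -> float:
    """Logistic regression: map a weighted sum of ORF features to a probability in (0, 1)."""
    z = (
        INTERCEPT
        + WEIGHTS["length_aa"] * length_aa
        + WEIGHTS["has_domain"] * float(has_domain)
        + WEIGHTS["conservation"] * conservation
    )
    # Logistic (sigmoid) link function.
    return 1.0 / (1.0 + math.exp(-z))
```

With positive weights, as assumed here, longer, domain-containing, and more conserved ORFs receive higher scores; a real model would learn the sign and size of each weight from training data.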
3
Malekos E, Carpenter S. Short open reading frame genes in innate immunity: from discovery to characterization. Trends Immunol 2022; 43:741-756. [PMID: 35965152 PMCID: PMC10118063 DOI: 10.1016/j.it.2022.07.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/22/2022] [Revised: 07/11/2022] [Accepted: 07/13/2022] [Indexed: 12/27/2022]
Abstract
Next-generation sequencing (NGS) technologies have greatly expanded the size of the known transcriptome. Many newly discovered transcripts are classified as long noncoding RNAs (lncRNAs) which are assumed to affect phenotype through sequence and structure and not via translated protein products despite the vast majority of them harboring short open reading frames (sORFs). Recent advances have demonstrated that the noncoding designation is incorrect in many cases and that sORF-encoded peptides (SEPs) translated from these transcripts are important contributors to diverse biological processes. Interest in SEPs is at an early stage and there is evidence for the existence of thousands of SEPs that are yet unstudied. We hope to pique interest in investigating this unexplored proteome by providing a discussion of SEP characterization generally and describing specific discoveries in innate immunity.
Affiliation(s)
- Eric Malekos
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Susan Carpenter
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA; Department of Molecular Cell and Developmental Biology, University of California Santa Cruz, Santa Cruz, CA, USA
4
Discovery of Unannotated Small Open Reading Frames in Streptococcus pneumoniae D39 Involved in Quorum Sensing and Virulence Using Ribosome Profiling. mBio 2022; 13:e0124722. [PMID: 35852327 PMCID: PMC9426450 DOI: 10.1128/mbio.01247-22] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/20/2022] Open
Abstract
Streptococcus pneumoniae, an opportunistic human pathogen, is the leading cause of community-acquired pneumonia and an agent of otitis media, septicemia, and meningitis. Although genomic and transcriptomic studies of S. pneumoniae have provided detailed perspectives on gene content and expression programs, they have lacked information pertaining to the translational landscape, particularly at a resolution that identifies commonly overlooked small open reading frames (sORFs), whose importance is increasingly realized in metabolism, regulation, and virulence. To identify protein-coding sORFs in S. pneumoniae, antibiotic-enhanced ribosome profiling was conducted. Using translation inhibitors, 114 novel sORFs were detected, and the expression of a subset of them was experimentally validated. Two loci associated with virulence and quorum sensing were examined in deeper detail. One such sORF, rio3, overlaps with the noncoding RNA srf-02 that was previously implicated in pathogenesis. Targeted mutagenesis parsing rio3 from srf-02 revealed that rio3 is responsible for the fitness defect seen in a murine nasopharyngeal colonization model. Additionally, two novel sORFs located adjacent to the quorum sensing receptor rgg1518 were found to impact regulatory activity. Our findings emphasize the importance of sORFs present in the genomes of pathogenic bacteria and underscore the utility of ribosome profiling for identifying the bacterial translatome.
5
Weidenbach K, Gutt M, Cassidy L, Chibani C, Schmitz RA. Small Proteins in Archaea, a Mainly Unexplored World. J Bacteriol 2022; 204:e0031321. [PMID: 34543104 PMCID: PMC8765429 DOI: 10.1128/jb.00313-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/12/2023] Open
Abstract
In recent years, increasing numbers of small proteins have moved into the focus of science. Small proteins have been identified and characterized in all three domains of life, but the majority remain functionally uncharacterized, lack secondary structure, and exhibit limited evolutionary conservation. While quite a few have already been described for bacteria and eukaryotic organisms, the number of known and functionally analyzed archaeal small proteins is still very limited. In this review, we compile the current state of research, show strategies for systematic approaches for global identification of small archaeal proteins, and address selected functionally characterized examples. In addition, we document, using one archaeon as an example, the tool development and optimization needed to identify small proteins using genome-wide approaches.
Affiliation(s)
- Katrin Weidenbach
  - Institute for General Microbiology, Christian Albrechts University, Kiel, Germany
- Miriam Gutt
  - Institute for General Microbiology, Christian Albrechts University, Kiel, Germany
- Liam Cassidy
  - AG Proteomics & Bioanalytics, Institute for Experimental Medicine, Christian Albrechts University, Kiel, Germany
- Cynthia Chibani
  - Institute for General Microbiology, Christian Albrechts University, Kiel, Germany
- Ruth A. Schmitz
  - Institute for General Microbiology, Christian Albrechts University, Kiel, Germany
6
Gerovac M, Vogel J, Smirnov A. The World of Stable Ribonucleoproteins and Its Mapping With Grad-Seq and Related Approaches. Front Mol Biosci 2021; 8:661448. [PMID: 33898526 PMCID: PMC8058203 DOI: 10.3389/fmolb.2021.661448] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/30/2021] [Accepted: 03/04/2021] [Indexed: 12/13/2022] Open
Abstract
Macromolecular complexes of proteins and RNAs are essential building blocks of cells. These stable supramolecular particles can be viewed as minimal biochemical units whose structural organization, i.e., the way the RNA and the protein interact with each other, is directly linked to their biological function. Whether those are dynamic regulatory ribonucleoproteins (RNPs) or integrated molecular machines involved in gene expression, the comprehensive knowledge of these units is critical to our understanding of key molecular mechanisms and cell physiology phenomena. Such is the goal of diverse complexomic approaches and in particular of the recently developed gradient profiling by sequencing (Grad-seq). By separating cellular protein and RNA complexes on a density gradient and quantifying their distributions genome-wide by mass spectrometry and deep sequencing, Grad-seq charts global landscapes of native macromolecular assemblies. In this review, we propose a function-based ontology of stable RNPs and discuss how Grad-seq and related approaches transformed our perspective of bacterial and eukaryotic ribonucleoproteins by guiding the discovery of new RNA-binding proteins and unusual classes of noncoding RNAs. We highlight some methodological aspects and developments that permit to further boost the power of this technique and to look for exciting new biology in understudied and challenging biological models.
Affiliation(s)
- Milan Gerovac
  - Institute of Molecular Infection Biology (IMIB), University of Würzburg, Würzburg, Germany
- Jörg Vogel
  - Institute of Molecular Infection Biology (IMIB), University of Würzburg, Würzburg, Germany
  - Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
- Alexandre Smirnov
  - UMR 7156 - Génétique Moléculaire, Génomique, Microbiologie (GMGM), University of Strasbourg, CNRS, Strasbourg, France
  - University of Strasbourg Institute for Advanced Study (USIAS), Strasbourg, France
7
Steinberg R, Koch HG. The largely unexplored biology of small proteins in pro- and eukaryotes. FEBS J 2021; 288:7002-7024. [PMID: 33780127 DOI: 10.1111/febs.15845] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 03/26/2021] [Indexed: 12/29/2022]
Abstract
The large abundance of small open reading frames (smORFs) in prokaryotic and eukaryotic genomes and the plethora of smORF-encoded small proteins became only apparent with the constant advancements in bioinformatic, genomic, proteomic, and biochemical tools. Small proteins are typically defined as proteins of < 50 amino acids in prokaryotes and of less than 100 amino acids in eukaryotes, and their importance for cell physiology and cellular adaptation is only beginning to emerge. In contrast to antimicrobial peptides, which are secreted by prokaryotic and eukaryotic cells for combatting pathogens and competitors, small proteins act within the producing cell mainly by stabilizing protein assemblies and by modifying the activity of larger proteins. Production of small proteins is frequently linked to stress conditions or environmental changes, and therefore, cells seem to use small proteins as intracellular modifiers for adjusting cell metabolism to different intra- and extracellular cues. However, the size of small proteins imposes a major challenge for the cellular machinery required for protein folding and intracellular trafficking and recent data indicate that small proteins can engage distinct trafficking pathways. In the current review, we describe the diversity of small proteins in prokaryotes and eukaryotes, highlight distinct and common features, and illustrate how they are handled by the protein trafficking machineries in prokaryotic and eukaryotic cells. Finally, we also discuss future topics of research on this fascinating but largely unexplored group of proteins.
Affiliation(s)
- Ruth Steinberg
  - Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
- Hans-Georg Koch
  - Institute for Biochemistry and Molecular Biology, Zentrum für Biochemie und Molekulare Medizin (ZMBZ), Faculty of Medicine, Albert-Ludwigs-Universität Freiburg, Germany
8
Abstract
The number of complete genome sequences explodes more and more with each passing year. Thus, methods for genome annotation need to be honed constantly to handle the deluge of information. Annotation of pseudogenes (i.e., gene copies that appear not to make a functional protein) in genomes is a persistent problem; here, we overview pseudogene annotation methods that are based on the detection of sequence homology in genomic DNA.
Affiliation(s)
- Paul M Harrison
  - Department of Biology, McGill University, Montreal, QC, Canada
9
Arginine-Rich Small Proteins with a Domain of Unknown Function, DUF1127, Play a Role in Phosphate and Carbon Metabolism of Agrobacterium tumefaciens. J Bacteriol 2020; 202:JB.00309-20. [PMID: 33093235 DOI: 10.1128/jb.00309-20] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/22/2020] [Accepted: 07/21/2020] [Indexed: 02/06/2023] Open
Abstract
In any given organism, approximately one-third of all proteins have a yet-unknown function. A widely distributed domain of unknown function is DUF1127. Approximately 17,000 proteins with such an arginine-rich domain are found in 4,000 bacteria. Most of them are single-domain proteins, and a large fraction qualifies as small proteins with fewer than 50 amino acids. We systematically identified and characterized the seven DUF1127 members of the plant pathogen Agrobacterium tumefaciens. They all give rise to authentic proteins and are differentially expressed as shown at the RNA and protein levels. The seven proteins fall into two subclasses on the basis of their length, sequence, and reciprocal regulation by the LysR-type transcription factor LsrB. The absence of all three short DUF1127 proteins caused a striking phenotype in later growth phases and increased cell aggregation and biofilm formation. Protein profiling and transcriptome sequencing (RNA-seq) analysis of the wild type and triple mutant revealed a large number of differentially regulated genes in late exponential and stationary growth. The most affected genes are involved in phosphate uptake, glycine/serine homeostasis, and nitrate respiration. The results suggest a redundant function of the small DUF1127 paralogs in nutrient acquisition and central carbon metabolism of A. tumefaciens. They may be required for diauxic switching between carbon sources when sugar from the medium is depleted. We end by discussing how DUF1127 might confer such a global impact on cell physiology and gene expression.
IMPORTANCE: Despite being prevalent in numerous ecologically and clinically relevant bacterial species, the biological role of proteins with a domain of unknown function, DUF1127, is unclear. Experimental models are needed to approach their elusive function. We used the phytopathogen Agrobacterium tumefaciens, a natural genetic engineer that causes crown gall disease, and focused on its three small DUF1127 proteins. They have redundant and pervasive roles in nutrient acquisition, cellular metabolism, and biofilm formation. The study shows that small proteins have important previously missed biological functions. How small basic proteins can have such a broad impact is a fascinating prospect of future research.
10
Khitun A, Slavoff SA. Proteomic Detection and Validation of Translated Small Open Reading Frames. Curr Protoc Chem Biol 2020; 11:e77. [PMID: 31750990 DOI: 10.1002/cpch.77] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Abstract
Small open reading frames (smORFs) encode previously unannotated polypeptides or short proteins that regulate translation in cis (eukaryotes) and/or are independently functional (prokaryotes and eukaryotes). Ongoing efforts for complete annotation and functional characterization of smORF-encoded proteins have yielded novel regulators and therapeutic targets. However, because they are excluded from protein databases, initiate at non-AUG start codons, and produce few unique tryptic peptides, unannotated small proteins cannot be detected with standard proteomic methods. Here, we outline a procedure for mass spectrometry-based detection of translated smORFs in cultured human cells from protein extraction, digestion, and LC-MS/MS, to database preparation and data analysis. Following proteomic detection, translation from a unique smORF may be validated via siRNA-based silencing or overexpression and epitope tagging. This is necessary to unambiguously assign a peptide to a smORF within a specific transcript isoform or genomic locus. Provided that sufficient starting material is available, this workflow can be applied to any cell type/organism and adjusted to study specific (patho)physiological contexts including, but not limited to, development, stress, and disease. © 2019 by John Wiley & Sons, Inc.
- Basic Protocol 1: Protein extraction, size selection, and trypsin digestion
- Alternate Protocol 1: In-solution C8 column size selection
- Support Protocol 1: Chloroform/methanol precipitation
- Support Protocol 2: Reduction, alkylation, and in-solution protease digestion
- Support Protocol 3: Peptide de-salting
- Basic Protocol 2: Two-dimensional LC-MS/MS with ERLIC fractionation
- Basic Protocol 3: Transcriptomic database construction
- Alternate Protocol 2: Transcriptomics database generation with gffread
- Basic Protocol 4: Non-annotated peptide identification from LC-MS/MS data
- Basic Protocol 5: Validation using isotopically labeled synthetic peptide standards and siRNA
- Basic Protocol 6: Transcript validation using transient overexpression
Affiliation(s)
- Alexandra Khitun
  - Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut
- Sarah A Slavoff
  - Department of Chemistry, Yale University, New Haven, Connecticut; Chemical Biology Institute, Yale University, West Haven, Connecticut; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
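The abstract of entry 10 notes that small proteins "produce few unique tryptic peptides". A minimal in silico digest makes the point concrete. The sketch below uses the commonly cited trypsin rule (cleave C-terminal to K or R, but not when the next residue is P). The function name `tryptic_peptides` and the default 7 to 30 residue length window are assumptions for illustration and are not taken from the cited protocol; missed cleavages are ignored.

```python
import re


def tryptic_peptides(protein: str, min_len: int = 7, max_len: int = 30) -> list[str]:
    """Return fully cleaved tryptic peptides whose length falls within [min_len, max_len].

    Cleavage rule: after K or R, except when followed by P.
    The length window is an assumed stand-in for what is readily observable by MS.
    """
    # Zero-width split points (supported by re.split in Python 3.7+).
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if min_len <= len(p) <= max_len]
```

For a protein of a few dozen residues, only a handful of cleavage products exist at all, and several may fall outside the length window, which is one reason a short protein can be missed even when it is present in the sample.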
11
Cao X, Slavoff SA. Non-AUG start codons: Expanding and regulating the small and alternative ORFeome. Exp Cell Res 2020; 391:111973. [PMID: 32209305 DOI: 10.1016/j.yexcr.2020.111973] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 10/01/2019] [Revised: 03/10/2020] [Accepted: 03/18/2020] [Indexed: 01/17/2023]
Abstract
Recent ribosome profiling and proteomic studies have revealed the presence of thousands of novel coding sequences, referred to as small open reading frames (sORFs), in prokaryotic and eukaryotic genomes. These genes have defied discovery via traditional genomic tools not only because they tend to be shorter than standard gene annotation length cutoffs, but also because they are, as a class, enriched in sequence properties previously assumed to be unusual, including non-AUG start codons. In this review, we summarize what is currently known about the incidence, efficiency, and mechanism of non-AUG start codon usage in prokaryotes and eukaryotes, and provide examples of regulatory and functional sORFs that initiate at non-AUG codons. While only a handful of non-AUG-initiated novel genes have been characterized in detail to date, their participation in important biological processes suggests that an improved understanding of this class of genes is needed.
Affiliation(s)
- Xiongwen Cao
  - Department of Chemistry, Yale University, New Haven, CT, 06520, United States; Chemical Biology Institute, Yale University, West Haven, CT, 06516, United States
- Sarah A Slavoff
  - Department of Chemistry, Yale University, New Haven, CT, 06520, United States; Chemical Biology Institute, Yale University, West Haven, CT, 06516, United States; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06529, United States
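The abstract of entry 11 gives two reasons sORFs evade traditional annotation: they fall below length cutoffs, and many initiate at non-AUG codons. A toy forward-strand scanner shows how relaxing the start-codon set changes what is found. The function name `find_sorfs`, the particular near-cognate codons included, and the 100-codon limit are assumptions for illustration; the sketch does not reproduce any tool discussed in the review.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# ATG plus a few near-cognate start codons (DNA alphabet).
# Which non-AUG codons to allow is an assumption made for this example.
START_CODONS = {"ATG", "CTG", "GTG", "TTG", "ACG"}


def find_sorfs(seq, max_codons=100, starts=START_CODONS):
    """Return (start, end, start_codon) for forward-strand ORFs of at most max_codons.

    'start' and 'end' are 0-based, end-exclusive, and 'end' includes the stop codon.
    The codon count includes the start codon and excludes the stop codon.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in starts:
            continue
        # Walk downstream in frame until the first stop codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                if (j - i) // 3 <= max_codons:
                    hits.append((i, j + 3, seq[i:i + 3]))
                break
    return hits
```

Calling `find_sorfs("CTGAAATAA")` reports one ORF starting at the CTG codon, whereas `find_sorfs("CTGAAATAA", starts={"ATG"})` reports none. In-frame starts that share a stop codon are reported separately, so real pipelines typically add evidence such as ribosome profiling or conservation to choose among them.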
12
De Nobrega AK, Lyons LC. Aging and the clock: Perspective from flies to humans. Eur J Neurosci 2020; 51:454-481. [PMID: 30269400 PMCID: PMC6441388 DOI: 10.1111/ejn.14176] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 07/07/2018] [Revised: 09/10/2018] [Accepted: 09/17/2018] [Indexed: 12/15/2022]
Abstract
Endogenous circadian oscillators regulate molecular, cellular and physiological rhythms, synchronizing tissues and organ function to coordinate activity and metabolism with environmental cycles. The technological nature of modern society with round-the-clock work schedules and heavy reliance on personal electronics has precipitated a striking increase in the incidence of circadian and sleep disorders. Circadian dysfunction contributes to an increased risk for many diseases and appears to have adverse effects on aging and longevity in animal models. From invertebrate organisms to humans, the function and synchronization of the circadian system weakens with age aggravating the age-related disorders and pathologies. In this review, we highlight the impacts of circadian dysfunction on aging and longevity and the reciprocal effects of aging on circadian function with examples from Drosophila to humans underscoring the highly conserved nature of these interactions. Additionally, we review the potential for using reinforcement of the circadian system to promote healthy aging and mitigate age-related pathologies. Advancements in medicine and public health have significantly increased human life span in the past century. With the demographics of countries worldwide shifting to an older population, there is a critical need to understand the factors that shape healthy aging. Drosophila melanogaster, as a model for aging and circadian interactions, has the capacity to facilitate the rapid advancement of research in this area and provide mechanistic insights for targeted investigations in mammals.
Affiliation(s)
- Aliza K De Nobrega
  - Program in Neuroscience, Department of Biological Science, Florida State University, Tallahassee, Florida
- Lisa C Lyons
  - Program in Neuroscience, Department of Biological Science, Florida State University, Tallahassee, Florida
13
14
Abascal F, Juan D, Jungreis I, Kellis M, Martinez L, Rigau M, Rodriguez JM, Vazquez J, Tress ML. Loose ends: almost one in five human genes still have unresolved coding status. Nucleic Acids Res 2019; 46:7070-7084. [PMID: 29982784 PMCID: PMC6101605 DOI: 10.1093/nar/gky587] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 05/15/2018] [Accepted: 06/18/2018] [Indexed: 12/16/2022] Open
Abstract
Seventeen years after the sequencing of the human genome, the human proteome is still under revision. One in eight of the 22 210 coding genes listed by the Ensembl/GENCODE, RefSeq and UniProtKB reference databases are annotated differently across the three sets. We have carried out an in-depth investigation on the 2764 genes classified as coding by one or more sets of manual curators and not coding by others. Data from large-scale genetic variation analyses suggests that most are not under protein-like purifying selection and so are unlikely to code for functional proteins. A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes. These potential non-coding genes also appear to be undergoing neutral evolution and have considerably less supporting transcript and protein evidence than other coding genes. We believe that the three reference databases currently overestimate the number of human coding genes by at least 2000, complicating and adding noise to large-scale biomedical experiments. Determining which potential non-coding genes do not code for proteins is a difficult but vitally important task since the human reference proteome is a fundamental pillar of most basic research and supports almost all large-scale biomedical projects.
Affiliation(s)
- Federico Abascal
  - Wellcome Trust Sanger Institute, Hinxton CB10 1SA, Cambridgeshire, UK
- David Juan
  - Comparative Genomics Lab, Instituto de Biologica Evolutiva, Universitat Pompeu Fabra, Barcelona, Spain
- Irwin Jungreis
  - MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA and Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Laura Martinez
  - Bioinformatics Unit, Spanish National Cancer Research Centre, Madrid, Spain
- Maria Rigau
  - Computational Biology Life Sciences Group, Barcelona Supercomputing Center, Barcelona, Spain
- Jose Manuel Rodriguez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares, Madrid, Spain
- Jesus Vazquez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares, Madrid, Spain
- Michael L Tress
  - Bioinformatics Unit, Spanish National Cancer Research Centre, Madrid, Spain
15
Scheidler CM, Kick LM, Schneider S. Ribosomal Peptides and Small Proteins on the Rise. Chembiochem 2019; 20:1479-1486. [PMID: 30648812 DOI: 10.1002/cbic.201800715] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/21/2018] [Indexed: 11/05/2022]
Abstract
Genetically encoded and ribosomally synthesised peptides and small proteins act as important regulators in fundamental cellular processes, including gene expression, development, signalling and metabolism. Moreover, they also play a crucial role in eukaryotic and prokaryotic defence against microorganisms. Extremely diverse in size and structure, they are often subject to extensive post-translational modification. Recent technological advances are now allowing the analysis of the whole cellular transcriptome and proteome, revealing the presence of hundreds of long-overlooked alternative and short open reading frames (short ORFs, or sORFs) in mRNA and supposedly noncoding RNAs. However, in many instances the biological roles of their translational products remain to be elucidated. Here we provide an overview on the intriguing structural and functional diversity of ribosomally synthesised peptides and newly discovered peptides and small proteins.
Affiliation(s)
- Christopher M Scheidler
  - Center for Integrated Protein Science at the Department of Chemistry, Chair of Biochemistry, Technical University of Munich, Lichtenbergstrasse 4, 85748, Garching, Germany
- Leonhard M Kick
  - Center for Integrated Protein Science at the Department of Chemistry, Chair of Biochemistry, Technical University of Munich, Lichtenbergstrasse 4, 85748, Garching, Germany
- Sabine Schneider
  - Center for Integrated Protein Science at the Department of Chemistry, Chair of Biochemistry, Technical University of Munich, Lichtenbergstrasse 4, 85748, Garching, Germany
16
Erdős G, Mészáros B, Reichmann D, Dosztányi Z. Large-Scale Analysis of Redox-Sensitive Conditionally Disordered Protein Regions Reveals Their Widespread Nature and Key Roles in High-Level Eukaryotic Processes. Proteomics 2019; 19:e1800070. [PMID: 30628183 DOI: 10.1002/pmic.201800070] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/09/2018] [Revised: 12/13/2018] [Indexed: 12/17/2022]
Abstract
Recently developed quantitative redox proteomic studies enable the direct identification of redox-sensing cysteine residues that regulate the functional behavior of target proteins in response to changing levels of reactive oxygen species. At the molecular level, redox regulation can directly modify the active sites of enzymes, although a growing number of examples indicate the importance of an additional underlying mechanism that involves conditionally disordered proteins. These proteins alter their functional behavior by undergoing a disorder-to-order transition in response to changing redox conditions. However, the extent to which this mechanism is used in various proteomes is currently unknown. Here, a recently developed sequence-based prediction tool incorporated into the IUPred2A web server is used to estimate redox-sensitive conditionally disordered regions at a large scale. It is shown that redox-sensitive conditional disorder is fairly widespread in various proteomes and that its presence strongly correlates with the expansion of specific domains in multicellular organisms that largely rely on extra stability provided by disulfide bonds or zinc ion binding. The analyses of yeast redox proteomes and human disease data further underlie the significance of this phenomenon in the regulation of a wide range of biological processes, as well as its biomedical importance.
Affiliation(s)
- Gábor Erdős
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
- Bálint Mészáros
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary; Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, 69117, Germany
- Dana Reichmann
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, Safra Campus Givat Ram, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
- Zsuzsanna Dosztányi
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
|
17
|
Shekari F, Baharvand H, Salekdeh GH. Organellar proteomics of embryonic stem cells. Advances in Protein Chemistry and Structural Biology 2018; 95:215-30. [PMID: 24985774 DOI: 10.1016/b978-0-12-800453-1.00007-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/07/2023]
Abstract
Embryonic stem cells (ESCs) are undifferentiated cells with two common remarkable features known as self-renewal and differentiation. Proteomics plays an increasingly important role in understanding molecular mechanisms underlying self-renewal and pluripotency of ESCs and their applications in cell therapy and developmental biology studies. As the function of a protein is strongly associated with its localization in cell, a complete and accurate picture of the proteome of ESCs cannot be achieved without knowing the subcellular locations of proteins. Subcellular fractionation allows enrichment of low abundant proteins and signaling complexes and reduces the complexity of the sample. It also provided insight into tracking proteins that shuttle between different compartments. Despite the substantial interest and efforts in ESC subcellular proteomics area, progress has been relatively limited. In this review, we present an overview on current status of ESCs organelle proteomics research and discuss challenges in subcellular proteomics.
Affiliation(s)
- Faezeh Shekari
  - Department of Molecular Systems Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran; Department of Developmental Biology, University of Science and Culture, ACECR, Tehran, Iran
- Hossein Baharvand
  - Department of Developmental Biology, University of Science and Culture, ACECR, Tehran, Iran; Department of Stem Cells and Developmental Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Ghasem Hosseini Salekdeh
  - Department of Molecular Systems Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran; Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran
|
18
|
De Martino M, Forzati F, Arra C, Fusco A, Esposito F. HMGA1-pseudogenes and cancer. Oncotarget 2017; 7:28724-35. [PMID: 26895108 PMCID: PMC5053758 DOI: 10.18632/oncotarget.7427] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 11/02/2015] [Accepted: 02/05/2016] [Indexed: 12/25/2022] Open
Abstract
Pseudogenes are DNA sequences with high homology to the corresponding functional gene, but, because of the accumulation of various mutations, they have lost their initial functions to code for proteins. Consequently, pseudogenes have been considered until few years ago dysfunctional relatives of the corresponding ancestral genes, and then useless in the course of genome evolution. However, several studies have recently established that pseudogenes are owners of key biological functions. Indeed, some pseudogenes control the expression of functional genes by competitively binding to the miRNAs, some of them generate small interference RNAs to negatively modulate the expression of functional genes, and some of them even encode functional mutated proteins. Here, we concentrate our attention on the pseudogenes of the HMGA1 gene, that codes for the HMGA1a and HMGA1b proteins having a critical role in development and cancer progression. In this review, we analyze the family of HMGA1 pseudogenes through three aspects: classification, characterization, and their possible function and involvement in cancer.
Affiliation(s)
- Marco De Martino
  - Istituto di Endocrinologia ed Oncologia Sperimentale del CNR c/o Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Scuola di Medicina e Chirurgia di Napoli, Università degli Studi di Napoli "Federico II", Naples, Italy
- Floriana Forzati
  - Istituto di Endocrinologia ed Oncologia Sperimentale del CNR c/o Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Scuola di Medicina e Chirurgia di Napoli, Università degli Studi di Napoli "Federico II", Naples, Italy
- Claudio Arra
  - Istituto Nazionale dei Tumori, Fondazione Pascale, Naples, Italy
- Alfredo Fusco
  - Istituto di Endocrinologia ed Oncologia Sperimentale del CNR c/o Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Scuola di Medicina e Chirurgia di Napoli, Università degli Studi di Napoli "Federico II", Naples, Italy
- Francesco Esposito
  - Istituto di Endocrinologia ed Oncologia Sperimentale del CNR c/o Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Scuola di Medicina e Chirurgia di Napoli, Università degli Studi di Napoli "Federico II", Naples, Italy
|
19
|
Vikram Singh A, Gharat T, Batuwangala M, Park B, Endlein T, Sitti M. Three‐dimensional patterning in biomedicine: Importance and applications in neuropharmacology. J Biomed Mater Res B Appl Biomater 2017; 106:1369-1382. [DOI: 10.1002/jbm.b.33922] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/18/2016] [Revised: 04/19/2017] [Accepted: 04/22/2017] [Indexed: 12/18/2022]
Affiliation(s)
- Ajay Vikram Singh
  - Department of Physical Intelligence, Max Planck Institute for Intelligent Systems, Heisenbergstr 3, 70569 Stuttgart, Germany
- Tanmay Gharat
  - Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, New York, New York 12180
- Madu Batuwangala
  - Department of Physical Intelligence, Max Planck Institute for Intelligent Systems, Heisenbergstr 3, 70569 Stuttgart, Germany
- Byung‐Wook Park
  - Department of Physical Intelligence, Max Planck Institute for Intelligent Systems, Heisenbergstr 3, 70569 Stuttgart, Germany
- Thomas Endlein
  - Department of Physical Intelligence, Max Planck Institute for Intelligent Systems, Heisenbergstr 3, 70569 Stuttgart, Germany
- Metin Sitti
  - Department of Physical Intelligence, Max Planck Institute for Intelligent Systems, Heisenbergstr 3, 70569 Stuttgart, Germany
|
20
|
Sou SN, Jedrzejewski PM, Lee K, Sellick C, Polizzi KM, Kontoravdi C. Model-based investigation of intracellular processes determining antibody Fc-glycosylation under mild hypothermia. Biotechnol Bioeng 2017; 114:1570-1582. [PMID: 27869292 PMCID: PMC5485029 DOI: 10.1002/bit.26225] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 03/31/2016] [Revised: 09/23/2016] [Accepted: 11/14/2016] [Indexed: 02/03/2023]
Abstract
Despite the positive effects of mild hypothermic conditions on monoclonal antibody (mAb) productivity (qmAb) during mammalian cell culture, the impact of reduced culture temperature on mAb Fc‐glycosylation and the mechanism behind changes in the glycan composition are not fully established. The lack of knowledge about the regulation of dynamic intracellular processes under mild hypothermia restricts bioprocess optimization. To address this issue, a mathematical model that quantitatively describes Chinese hamster ovary (CHO) cell behavior and metabolism, mAb synthesis and mAb N‐linked glycosylation profile before and after the induction of mild hypothermia is constructed. Results from this study show that the model is capable of representing experimental results well in all of the aspects mentioned above, including the N‐linked glycosylation profile of mAb produced under mild hypothermia. Most importantly, comparison between model simulation results for different culture temperatures suggests the reduced rates of nucleotide sugar donor production and galactosyltransferase (GalT) expression to be critical contributing factors that determine the variation in Fc‐glycan profiles between physiological and mild hypothermic conditions in stable CHO transfectants. This is then confirmed using experimental measurements of GalT expression levels, thereby closing the loop between the experimental and the computational system. The identification of bottlenecks within CHO cell metabolism under mild hypothermic conditions will aid bioprocess optimization, for example, by tailoring feeding strategies to improve NSD production, or manipulating the expression of specific glycosyltransferases through cell line engineering. Biotechnol. Bioeng. 2017;114: 1570–1582. © 2016 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals Inc.
Affiliation(s)
- Si Nga Sou
  - Department of Life Sciences, Imperial College London, London, United Kingdom; Centre for Synthetic Biology and Innovation, Imperial College London, London, United Kingdom; Department of Chemical Engineering, Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Philip M Jedrzejewski
  - Department of Chemical Engineering, Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Ken Lee
  - Cell Culture and Fermentation Sciences, MedImmune, Granta Park, Cambridge, United Kingdom
- Christopher Sellick
  - Cell Culture and Fermentation Sciences, MedImmune, Granta Park, Cambridge, United Kingdom
- Karen M Polizzi
  - Department of Life Sciences, Imperial College London, London, United Kingdom; Centre for Synthetic Biology and Innovation, Imperial College London, London, United Kingdom
- Cleo Kontoravdi
  - Department of Chemical Engineering, Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, United Kingdom
|
21
|
Wetmore BA, Merrick BA. Invited Review: Toxicoproteomics: Proteomics Applied to Toxicology and Pathology. Toxicol Pathol 2016; 32:619-42. [PMID: 15580702 DOI: 10.1080/01926230490518244] [Citation(s) in RCA: 122] [Impact Index Per Article: 15.3] [Indexed: 01/04/2023]
Abstract
Global measurement of proteins and their many attributes in tissues and biofluids defines the field of proteomics. Toxicoproteomics, as part of the larger field of toxicogenomics, seeks to identify critical proteins and pathways in biological systems that are affected by and respond to adverse chemical and environmental exposures using global protein expression technologies. Toxicoproteomics integrates 3 disciplinary areas: traditional toxicology and pathology, differential protein and gene expression analysis, and systems biology. Key topics to be reviewed are the evolution of proteomics, proteomic technology platforms and their capabilities with exemplary studies from biology and medicine, a review of over 50 recent studies applying proteomic analysis to toxicological research, and the recent development of databases designed to integrate -Omics technologies with toxicology and pathology. Proteomics is examined for its potential in discovery of new biomarkers and toxicity signatures, in mapping serum, plasma, and other biofluid proteomes, and in parallel proteomic and transcriptomic studies. The new field of toxicoproteomics is uniquely positioned toward an expanded understanding of protein expression during toxicity and environmental disease for the advancement of public health.
Affiliation(s)
- Barbara A Wetmore
  - National Center for Toxicogenomics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA
|
22
|
Zhou Y, Liu Z, Rothschild KJ, Lim MJ. Proteome-wide drug screening using mass spectrometric imaging of bead-arrays. Sci Rep 2016; 6:26125. [PMID: 27194112 PMCID: PMC4872124 DOI: 10.1038/srep26125] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 02/03/2016] [Accepted: 04/27/2016] [Indexed: 12/17/2022] Open
Abstract
A fundamental challenge in the drug discovery process is to develop compounds with high efficacy and minimal side-effects. We describe a new approach to proteome-wide drug screening for detection of on- and off-target binding which combines the advantages of mass spectrometry with microarray technology. The method involves matrix-assisted laser desorption/ionization mass spectrometric imaging (MALDI-MSI) of agarose micro-beads randomly arrayed at high-density in custom micro-well plates. Each bead carries a unique protein target and a corresponding photocleavable mass-tag for coding (PC-Mass-Tag). Compounds bound to specific protein beads and a photo-released coding PC-Mass-Tag are detected simultaneously using MALDI-MSI. As an initial demonstration of this approach, two kinase-targeted drugs, Dasatinib and Brigatinib (AP26113), were simultaneously screened against a model 50-member kinase-bead library. A MALDI-MSI scan performed at the equivalent density of 495,000 beads in the footprint of a microscope slide yielded 100% sensitivity for detecting known strong interactions with no false positives.
Affiliation(s)
- Ying Zhou
  - AmberGen, Inc., 313 Pleasant Street, Watertown, MA 02472, United States
- Ziying Liu
  - AmberGen, Inc., 313 Pleasant Street, Watertown, MA 02472, United States
- Kenneth J Rothschild
  - AmberGen, Inc., 313 Pleasant Street, Watertown, MA 02472, United States; Molecular Biophysics Laboratory, Department of Physics and Photonics Center, Boston University, Boston, MA 02215, United States
- Mark J Lim
  - AmberGen, Inc., 313 Pleasant Street, Watertown, MA 02472, United States
|
23
|
Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
24
|
Patrushev LI, Kovalenko TF. Functions of noncoding sequences in mammalian genomes. Biochemistry (Moscow) 2015; 79:1442-69. [PMID: 25749159 DOI: 10.1134/s0006297914130021] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Indexed: 12/31/2022]
Abstract
Most of the mammalian genome consists of nucleotide sequences not coding for proteins. Exons of genes make up only 3% of the human genome, while the significance of most other sequences remains unknown. Recent genome studies with high-throughput methods demonstrate that the so-called noncoding part of the genome may perform important functions. This hypothesis is supported by three groups of experimental data: 1) approximately 10% of the sequences, most of which are located in noncoding parts of the genome, is evolutionarily conserved and thus can be of functional importance; 2) up to 99% of the mammalian genome is being transcribed forming short and long noncoding RNAs in addition to common mRNA; and 3) mutations in noncoding parts of the genome can be accompanied by progression of pathological states of the organism. In the light of these data, in the review we consider the functional role of numerous known sequences of noncoding parts of the genome including introns, DNA methylation regions, enhancers and locus control regions, insulators, S/MAR sequences, pseudogenes, and genes of noncoding RNAs, as well as transposons and simple repeats of centromeric and telomeric regions of chromosomes. The assumption is made that the intergenic noncoding sequences without definite/clear functions can be involved in spatial organization of genetic loci in interphase nuclei.
Affiliation(s)
- L I Patrushev
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, 117997, Russia
|
25
|
Xu F, Zhang Y, Li J, Zhang Y, Xiang Z, Yu Z. Expression and function analysis of two naturally truncated MyD88 variants in the Pacific oyster Crassostrea gigas. Fish & Shellfish Immunology 2015; 45:510-516. [PMID: 25963623 DOI: 10.1016/j.fsi.2015.04.034] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/16/2015] [Revised: 04/20/2015] [Accepted: 04/27/2015] [Indexed: 06/04/2023]
Abstract
Myeloid differentiation factor 88 (MyD88) is the classic signaling adaptor that mediates Toll/interleukin-1 receptor (TIR/IL-1R) dependent activation of nuclear factor-kappa B (NF-κB). In this study, two naturally truncated MyD88 members were identified from the Pacific oyster (Crassostrea gigas), namely CgMyD88-T1 and CgMyD88-T2. The full-length cDNA of CgMyD88-T1, CgMyD88-T2 are 976 bp and 1038 bp in length, containing an ORF of 552 bp and 555 bp, respectively. The two ORF encode a putative protein of 183 and 184 amino acids, respectively, with a calculated molecular weight of about 21 and 22 kDa. When compared to complete MyD88 paralogues, we found that both CgMyD88-T1 and CgMyD88-T2 contain only TIR domain but lack DD (Death Domain), which share 90.8% of similarity and 71.7% of identity with each other. Phylogenetic tree demonstrated that CgMyD88-T1 and CgMyD88-T2 clustered together and belonged to mollusk branch. Meanwhile, genomic arrangement analysis displayed that the two truncated MyD88s were distributed in tandem in one scaffold, revealing that they may originate from one truncated MyD88 ancestor recently. Expression profile showed that both of CgMyD88 variants were ubiquitously expressed in all tested tissues with highest expression in the gills and hemocytes, respectively. Both truncated CgMyD88 mRNAs were significantly up-regulated in hemocytes under HKLM (heat-killed Listeria monocytogenes) and HKVA (heat-killed Vibrio alginolyticus) challenge. Moreover, either CgMyD88-T1 or CgMyD88-T2 were able to inhibit MyD88 activated Rel/NF-κB activity in HEK293 cell, demonstrating their negative role in regulating MyD88-mediated immune signaling.
Affiliation(s)
- Fengjiao Xu
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Zhang
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
- Jun Li
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
- Yuehuan Zhang
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
- Zhiming Xiang
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
- Ziniu Yu
  - Key Laboratory of Tropical Marine Bio-resources and Ecology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
|
26
|
Raabe CA, Brosius J. Does every transcript originate from a gene? Ann N Y Acad Sci 2015; 1341:136-48. [PMID: 25847549 DOI: 10.1111/nyas.12741] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/15/2014] [Revised: 02/05/2015] [Accepted: 02/11/2015] [Indexed: 12/20/2022]
Abstract
Outdated gene definitions favored regions corresponding to mature messenger RNAs, in particular, the open reading frame. In eukaryotes, the intergenic space was widely regarded nonfunctional and devoid of RNA transcription. Original concepts were based on the assumption that RNA expression was restricted to known protein-coding genes and a few so-called structural RNA genes, such as ribosomal RNAs or transfer RNAs. With the discovery of introns and, more recently, sensitive techniques for monitoring genome-wide transcription, this view had to be substantially modified. Tiling microarrays and RNA deep sequencing revealed myriads of transcripts, which cover almost entire genomes. The tremendous complexity of non-protein-coding RNA transcription has to be integrated into novel gene definitions. Despite an ever-growing list of functional RNAs, questions concerning the mass of identified transcripts are under dispute. Here, we examined genome-wide transcription from various angles, including evolutionary considerations, and suggest, in analogy to novel alternative splice variants that do not persist, that the vast majority of transcripts represent raw material for potential, albeit rare, exaptation events.
Affiliation(s)
- Carsten A Raabe
  - Institute of Experimental Pathology, ZMBE, University of Münster, Münster, Germany
|
27
|
Saha A, Mitchell JA, Nishida Y, Hildreth JE, Arribere JA, Gilbert WV, Garfinkel DJ. A trans-dominant form of Gag restricts Ty1 retrotransposition and mediates copy number control. J Virol 2015; 89:3922-38. [PMID: 25609815 PMCID: PMC4403431 DOI: 10.1128/jvi.03060-14] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 10/21/2014] [Accepted: 01/15/2015] [Indexed: 11/20/2022] Open
Abstract
Saccharomyces cerevisiae and Saccharomyces paradoxus lack the conserved RNA interference pathway and utilize a novel form of copy number control (CNC) to inhibit Ty1 retrotransposition. Although noncoding transcripts have been implicated in CNC, here we present evidence that a truncated form of the Gag capsid protein (p22) or its processed form (p18) is necessary and sufficient for CNC and likely encoded by Ty1 internal transcripts. Coexpression of p22/p18 and Ty1 decreases mobility more than 30,000-fold. p22/p18 cofractionates with Ty1 virus-like particles (VLPs) and affects VLP yield, protein composition, and morphology. Although p22/p18 and Gag colocalize in the cytoplasm, p22/p18 disrupts sites used for VLP assembly. Glutathione S-transferase (GST) affinity pulldowns also suggest that p18 and Gag interact. Therefore, this intrinsic Gag-like restriction factor confers CNC by interfering with VLP assembly and function and expands the strategies used to limit retroelement propagation. IMPORTANCE: Retrotransposons dominate the chromosomal landscape in many eukaryotes, can cause mutations by insertion or genome rearrangement, and are evolutionarily related to retroviruses such as HIV. Thus, understanding factors that limit transposition and retroviral replication is fundamentally important. The present work describes a retrotransposon-encoded restriction protein derived from the capsid gene of the yeast Ty1 element that disrupts virus-like particle assembly in a dose-dependent manner. This form of copy number control acts as a molecular rheostat, allowing high levels of retrotransposition when few Ty1 elements are present and inhibiting transposition as copy number increases. Thus, yeast and Ty1 have coevolved a form of copy number control that is beneficial to both "host and parasite." To our knowledge, this is the first Gag-like retrotransposon restriction factor described in the literature and expands the ways in which restriction proteins modulate retroelement replication.
Affiliation(s)
- Agniva Saha
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
- Jessica A Mitchell
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
- Yuri Nishida
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
- Jonathan E Hildreth
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
- Joshua A Arribere
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Wendy V Gilbert
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- David J Garfinkel
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
|
28
|
Petersson L, Dexlin-Mellby L, Bengtsson AA, Sturfelt G, Borrebaeck CAK, Wingren C. Multiplexing of miniaturized planar antibody arrays for serum protein profiling--a biomarker discovery in SLE nephritis. Lab on a Chip 2014; 14:1931-1942. [PMID: 24763547 DOI: 10.1039/c3lc51420j] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 06/03/2023]
Abstract
In the quest to decipher disease-associated biomarkers, miniaturized and multiplexed antibody arrays may play a central role in generating protein expression profiles, or protein maps, of crude serum samples. In this conceptual study, we explored a novel, 4-times larger pen design, enabling us to, in a unique manner, simultaneously print 48 different reagents (antibodies) as individual 78.5 μm² (10 μm in diameter) sized spots at a density of 38,000 spots cm⁻² using dip-pen nanolithography technology. The antibody array set-up was interfaced with a high-resolution fluorescent-based scanner for sensitive sensing. The performance and applicability of this novel 48-plex recombinant antibody array platform design was demonstrated in a first clinical application targeting SLE nephritis, a severe chronic autoimmune connective tissue disorder, as the model disease. To this end, crude, directly biotinylated serum samples were targeted. The results showed that the miniaturized and multiplexed array platform displayed adequate performance, and that SLE-associated serum biomarker panels reflecting the disease process could be deciphered, outlining the use of miniaturized antibody arrays for disease proteomics and biomarker discovery.
Affiliation(s)
- Linn Petersson
  - Dept. of Immunotechnology and CREATE Health, Medicon Village, Lund University, Medicon Village, Building no. 406, SE-22381 Lund, Sweden
|
29
|
Getting down to the core of histone modifications. Chromosoma 2014; 123:355-71. [PMID: 24789118 DOI: 10.1007/s00412-014-0465-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/17/2014] [Revised: 04/08/2014] [Accepted: 04/09/2014] [Indexed: 10/25/2022]
Abstract
The identification of an increasing number of posttranslationally modified residues within histone core domains is furthering our understanding of how nucleosome dynamics are regulated. In this review, we first discuss how the targeting of specific histone H3 core residues can directly influence the nucleosome structure and then apply this knowledge to provide functional reasoning for their localization to distinct genomic regions. While we focus mainly on transcriptional implications, the principles discussed in this review can also be applied to their roles in other cellular processes. Finally, we highlight some examples of how aberrant modifications of core histone residues can facilitate the pathogenesis of some diseases.
|
30
|
Chen L, Bush SJ, Tovar-Corona JM, Castillo-Morales A, Urrutia AO. Correcting for differential transcript coverage reveals a strong relationship between alternative splicing and organism complexity. Mol Biol Evol 2014; 31:1402-13. [PMID: 24682283 PMCID: PMC4032128 DOI: 10.1093/molbev/msu083] [Citation(s) in RCA: 96] [Impact Index Per Article: 9.6] [Indexed: 12/11/2022] Open
Abstract
What at the genomic level underlies organism complexity? Although several genomic features have been associated with organism complexity, in the case of alternative splicing, which has long been proposed to explain the variation in complexity, no such link has been established. Here, we analyzed over 39 million expressed sequence tags available for 47 eukaryotic species with fully sequenced genomes to obtain a comparable index of alternative splicing estimates, which corrects for the distorting effect of a variable number of transcripts per species—an important obstacle for comparative studies of alternative splicing. We find that alternative splicing has steadily increased over the last 1,400 My of eukaryotic evolution and is strongly associated with organism complexity, assayed as the number of cell types. Importantly, this association is not explained as a by-product of covariance between alternative splicing with other variables previously linked to complexity including gene content, protein length, proteome disorder, and protein interactivity. In addition, we found no evidence to suggest that the relationship of alternative splicing to cell type number is explained by drift due to reduced Ne in more complex species. Taken together, our results firmly establish alternative splicing as a significant predictor of organism complexity and are, in principle, consistent with an important role of transcript diversification through alternative splicing as a means of determining a genome’s functional information capacity.
Affiliation(s)
- Lu Chen
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Stephen J Bush
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Jaime M Tovar-Corona
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Araxi O Urrutia
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
31
|
Elliott DJ. Illuminating the Transcriptome through the Genome. Genes (Basel) 2014; 5:235-53. [PMID: 24705295 PMCID: PMC3978521 DOI: 10.3390/genes5010235] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/17/2014] [Revised: 03/03/2014] [Accepted: 03/05/2014] [Indexed: 02/01/2023] Open
Abstract
Sequencing the human genome was a huge milestone in genetic research that revealed almost the total DNA sequence required to create a human being. However, in order to function, the DNA genome needs to be expressed as an RNA transcriptome. This article reviews how knowledge of genome sequence information has led to fundamental discoveries in how the transcriptome is processed, with a focus on new system-wide insights into how pre-mRNAs that are encoded by split genes in the genome are rearranged by splicing into functional mRNAs. These advances have been made possible by the development of new post-genome technologies to probe splicing patterns. Transcriptome-wide approaches have characterised a "splicing code" that is embedded within and has a significant role in deciphering the genome, and is deciphered by RNA binding proteins. These analyses have also found that most human genes encode multiple mRNA isoforms, and in some cases proteins, leading in turn to a re-assessment of what exactly a gene is. Analysis of the transcriptome has given insights into how the genome is packaged and transcribed, and is helping to explain important aspects of genome evolution.
Affiliation(s)
- David J Elliott
- Institute of Genetic Medicine, Newcastle University, Newcastle, NE1 3BZ, UK.
|
32
|
Abstract
Small proteins, here defined as proteins of 50 amino acids or fewer in the absence of processing, have traditionally been overlooked due to challenges in their annotation and biochemical detection. In the past several years, however, increasing numbers of small proteins have been identified either through the realization that mutations in intergenic regions are actually within unannotated small protein genes or through the discovery that some small, regulatory RNAs encode small proteins. These insights, together with comparative sequence analysis, indicate that tens if not hundreds of small proteins are synthesized in a given organism. This review summarizes what has been learned about the functions of several of these bacterial small proteins, most of which act at the membrane, illustrating the astonishing range of processes in which these small proteins act and suggesting several general conclusions. Important questions for future studies of these overlooked proteins are also discussed.
Affiliation(s)
- Gisela Storz
- Cell Biology and Metabolism Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892-5430
|
33
|
Enriched protein screening of human bone marrow mesenchymal stromal cell secretions reveals MFAP5 and PENK as novel IL-10 modulators. Mol Ther 2014; 22:999-1007. [PMID: 24496384 DOI: 10.1038/mt.2014.17] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 07/15/2013] [Accepted: 01/30/2014] [Indexed: 01/10/2023] Open
Abstract
The secreted proteins from a cell constitute a natural biologic library that can offer significant insight into human health and disease. Discovering new secreted proteins from cells is bounded by the limitations of traditional separation and detection tools to physically fractionate and analyze samples. Here, we present a new method to systematically identify bioactive cell-secreted proteins that circumvent traditional proteomic methods by first enriching for protein candidates by differential gene expression profiling. The bone marrow stromal cell secretome was analyzed using enriched gene expression datasets in combination with potency assay testing. Four proteins expressed by stromal cells with previously unknown anti-inflammatory properties were identified, two of which provided a significant survival benefit to mice challenged with lethal endotoxic shock. Greater than 85% of secreted factors were recaptured that were otherwise undetected by proteomic methods, and remarkable hit rates of 18% in vitro and 9% in vivo were achieved.
|
34
|
Baker MA, Aitken RJ. Proteomic insights into spermatozoa: critiques, comments and concerns. Expert Rev Proteomics 2014; 6:691-705. [DOI: 10.1586/epr.09.76] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 12/15/2022]
|
35
|
Abstract
The number of complete genome sequences explodes more and more with each passing year. Thus, methods for genome annotation need to be honed constantly to handle the deluge of information. Annotation of pseudogenes (i.e., gene copies that appear not to make a functional protein) in genomes is a persistent problem; here, we overview pseudogene annotation methods that are based on the detection of sequence homology in genomic DNA.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Stewart Biology Building, 1205 Doctor Penfield Avenue, Montreal, QC, Canada, H3A 1B1
|
36
|
Hust M, Frenzel A, Schirrmann T, Dübel S. Selection of recombinant antibodies from antibody gene libraries. Methods Mol Biol 2014; 1101:305-20. [PMID: 24233787 DOI: 10.1007/978-1-62703-721-1_14] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
Abstract
Antibodies are indispensable detection reagents for research and diagnostics and represent the biggest class of biological therapeutics on the market. In vitro antibody selection systems offer many advantages over animal-based technologies because the whole selection process is independent of the in vivo immune response. In the last two decades antibody phage display has evolved to the most robust and widely used method and has already yielded thousands of antibodies. The selection of binders by phage display is also referred to as "panning" and based on the specific molecular interaction of antibody phage with an immobilized antigen thus allowing the enrichment and isolation of antigen-specific monoclonal binders from very large antibody gene libraries. Here, we give detailed protocols for the selection of recombinant antibody fragments from antibody gene libraries in microtiter plates.
Affiliation(s)
- Michael Hust
- Abteilung Biotechnologie, Institut für Biochemie, Biotechnologie und Bioinformatik, Technische Universität Braunschweig, Braunschweig, Germany
|
37
|
Frenzel A, Kügler J, Wilke S, Schirrmann T, Hust M. Construction of human antibody gene libraries and selection of antibodies by phage display. Methods Mol Biol 2014; 1060:215-243. [PMID: 24037844 DOI: 10.1007/978-1-62703-586-6_12] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 06/02/2023]
Abstract
Antibody phage display is the most commonly used in vitro selection technology and has yielded thousands of useful antibodies for research, diagnostics, and therapy. The prerequisite for successful generation and development of human recombinant antibodies using phage display is the construction of a high-quality antibody gene library. Here, we describe the methods for the construction of human immune and naive scFv gene libraries. The success also depends on the panning strategy for the selection of binders from these libraries. In this article, we describe a panning strategy that is high-throughput compatible and allows parallel selection in microtiter plates.
Affiliation(s)
- André Frenzel
- Abteilung Biotechnologie Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Braunschweig, Germany
|
38
|
Petriz BA, Franco OL. Application of Cutting-Edge Proteomics Technologies for Elucidating Host–Bacteria Interactions. Adv Protein Chem Struct Biol 2014; 95:1-24. [DOI: 10.1016/b978-0-12-800453-1.00001-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/07/2023]
|
39
|
Brown JB, Niijima S, Okuno Y. Compound-Protein Interaction Prediction Within Chemogenomics: Theoretical Concepts, Practical Usage, and Future Directions. Mol Inform 2013; 32:906-21. [DOI: 10.1002/minf.201300101] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/31/2013] [Accepted: 08/06/2013] [Indexed: 11/08/2022]
|
40
|
Liu D, Cai X. OsRRMh, a Spen-like gene, plays an important role during the vegetative to reproductive transition in rice. J Integr Plant Biol 2013; 55:876-87. [PMID: 23621499 DOI: 10.1111/jipb.12056] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/05/2013] [Accepted: 04/02/2013] [Indexed: 05/11/2023]
Abstract
OsRRMh, a homologue of OsRRM, encodes a Spen-like protein, and is composed of two N-terminal RNA recognition motifs (RRM) and one C-terminal Spen paralogue and an orthologue C-terminal domain (SPOC). The gene has been found to be constitutively expressed in the root, stem, leaf, spikelet, and immature seed, and alternative splicing patterns were confirmed in different tissues, which may indicate diverse functions for OsRRMh. The OsRRMh dsRNAi lines exhibited late-flowering and a larger panicle phenotype. When full-length OsRRMh and/or its SPOC domain were overexpressed, the fertility rate and number of spikelets per panicle were both markedly reduced. Also, overexpression of OsRRMh in the Arabidopsis fpa mutant did not restore the normal flowering time, and it delayed flowering in Col plants. Therefore, we propose that OsRRMh may confer one of its functions in the vegetative-to-reproductive transition in rice (Oryza sativa L. subsp. japonica cv. Zhonghua No. 11 (ZH11)).
Affiliation(s)
- Derui Liu
- National Key Laboratory of Plant Molecular Genetics, Institute of Plant Physiology and Ecology, the Chinese Academy of Sciences, Shanghai, 200032, China; University of Chinese Academy of Sciences, the Chinese Academy of Sciences, Beijing, 100049, China
|
41
|
Singh AV. Biotechnological applications of supersonic cluster beam-deposited nanostructured thin films: Bottom-up engineering to optimize cell-protein-surface interactions. J Biomed Mater Res A 2013; 101:2994-3008. [DOI: 10.1002/jbm.a.34601] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 11/01/2012] [Revised: 01/03/2013] [Accepted: 01/04/2013] [Indexed: 11/11/2022]
|
42
|
Jia B, Cheong GW, Zhang S. Multifunctional enzymes in archaea: promiscuity and moonlight. Extremophiles 2013; 17:193-203. [PMID: 23283522 DOI: 10.1007/s00792-012-0509-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/13/2012] [Accepted: 12/17/2012] [Indexed: 10/27/2022]
Abstract
Enzymes from many archaea colonizing extreme environments are of great interest because of their potential for various biotechnological processes and their scientific value for the study of evolution. Many enzymes from archaea have been reported to catalyze promiscuous reactions or moonlight in different functions. Here, we summarize known archaeal enzymes of both groups that include different kinds of proteins. Knowledge of their biochemical properties and three-dimensional structures has proved invaluable in understanding the mechanism, application, and evolutionary implications of this manifestation. In addition, the review also summarizes the methods used to unravel these extra functions, almost all of which were discovered serendipitously. The study of these amazing enzymes will provide clues to optimize protein engineering applications and to how enzymes might have evolved on Earth.
Affiliation(s)
- Baolei Jia
- College of Plant Sciences, Jilin University, Changchun, China.
|
43
|
Blaber M, Lee J, Longo L. Emergence of symmetric protein architecture from a simple peptide motif: evolutionary models. Cell Mol Life Sci 2012; 69:3999-4006. [PMID: 22790181 PMCID: PMC11115074 DOI: 10.1007/s00018-012-1077-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 05/17/2012] [Revised: 06/22/2012] [Accepted: 06/26/2012] [Indexed: 10/28/2022]
Abstract
Structural symmetry is observed in the majority of fundamental protein folds and gene duplication and fusion evolutionary processes are postulated to be responsible. However, convergent evolution leading to structural symmetry has also been proposed; additionally, there is debate regarding the extent to which exact primary structure symmetry is compatible with efficient protein folding. Issues of symmetry in protein evolution directly impact strategies for de novo protein design as symmetry can substantially simplify the design process. Additionally, when considering gene duplication and fusion in protein evolution, there are two competing models: "emergent architecture" and "conserved architecture". Recent experimental work has shed light on both the evolutionary process leading to symmetric protein folds as well as the ability of symmetric primary structure to efficiently fold. Such studies largely support a "conserved architecture" evolutionary model, suggesting that complex protein architecture was an early evolutionary achievement involving oligomerization of smaller polypeptides.
Affiliation(s)
- Michael Blaber
- Department of Biomedical Sciences, College of Medicine, Florida State University, 1115 West Call St., Tallahassee, FL, 32306-4300, USA
|
44
|
Fournier CT, Cherny JJ, Truncali K, Robbins-Pianka A, Lin MS, Krizanc D, Weir MP. Amino termini of many yeast proteins map to downstream start codons. J Proteome Res 2012; 11:5712-9. [PMID: 23140384 DOI: 10.1021/pr300538f] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Comprehensive knowledge of proteome complexity is crucial to understanding cell function. Amino termini of yeast proteins were identified through peptide mass spectrometry on glutaraldehyde-treated cell lysates as well as a parallel assessment of publicly deposited spectra. An unexpectedly large fraction of detected amino-terminal peptides (35%) mapped to translation initiation at AUG codons downstream of the annotated start codon. Many of the implicated genes have suboptimal sequence contexts for translation initiation near their annotated AUG, and their ribosome profiles show elevated tag densities consistent with translation initiation at downstream AUGs as well as their annotated AUGs. These data suggest that a significant fraction of the yeast proteome derives from initiation at downstream AUGs, increasing significantly the repertoire of encoded proteins and their potential functions and cellular localizations.
Affiliation(s)
- Claire T Fournier
- Department of Biology, Wesleyan University, Middletown, Connecticut 06459, United States
|
45
|
Lluis MW, Godfroy JI, Yin H. Protein engineering methods applied to membrane protein targets. Protein Eng Des Sel 2012; 26:91-100. [DOI: 10.1093/protein/gzs079] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022] Open
|
46
|
Reymond JL, Awale M. Exploring chemical space for drug discovery using the chemical universe database. ACS Chem Neurosci 2012; 3:649-57. [PMID: 23019491 DOI: 10.1021/cn3000422] [Citation(s) in RCA: 165] [Impact Index Per Article: 13.8] [Received: 04/18/2012] [Accepted: 04/25/2012] [Indexed: 01/20/2023] Open
Abstract
Herein we review our recent efforts in searching for bioactive ligands by enumeration and virtual screening of the unknown chemical space of small molecules. Enumeration from first principles shows that almost all small molecules (>99.9%) have never been synthesized and are still available to be prepared and tested. We discuss open access sources of molecules, the classification and representation of chemical space using molecular quantum numbers (MQN), its exhaustive enumeration in form of the chemical universe generated databases (GDB), and examples of using these databases for prospective drug discovery. MQN-searchable GDB, PubChem, and DrugBank are freely accessible at www.gdb.unibe.ch.
Affiliation(s)
- Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
| | - Mahendra Awale
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
47
|
Sirota FL, Batagov A, Schneider G, Eisenhaber B, Eisenhaber F, Maurer-Stroh S. Beware of moving targets: reference proteome content fluctuates substantially over the years. J Bioinform Comput Biol 2012; 10:1250020. [PMID: 22867629 DOI: 10.1142/s0219720012500205] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50-90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well studied proteins. The changes between the 2003 and recent human proteomes are in a similar order of magnitude as the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases as, for example, the fraction of proteins with functional annotation has increased over time, existing evidence implies that, apparently, the proteome content still remains incomplete, not just pertaining to isoforms/sequence variants but also to proteins and their families that are clearly distinct.
Affiliation(s)
- Fernanda L Sirota
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore.
|
48
|
Dowd WW. Challenges for Biological Interpretation of Environmental Proteomics Data in Non-model Organisms. Integr Comp Biol 2012; 52:705-20. [DOI: 10.1093/icb/ics093] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 01/17/2023] Open
|
49
|
Abstract
Alternative splicing, an unknown mechanism 20 years ago, is now recognized as a major mechanism for proteome and transcriptome diversity, particularly in mammals; some researchers conjecture that up to 90% of human genes are alternatively spliced. Despite much research on exon and intron evolution, little is known about the evolution of transcripts. In this paper, we present a model of transcript evolution and an associated algorithm to reconstruct transcript phylogenies. The evolution of the gene structure (exons and introns) is used as the basis for the reconstruction of transcript phylogenies. We apply our model and reconstruction algorithm to two well-studied genes, MAG and PAX6, obtaining results consistent with current knowledge and thereby providing evidence that a phylogenetic analysis of transcripts is feasible and likely to be informative.
Affiliation(s)
- Yann Christinat
- Laboratory of Computational Biology and Bioinformatics, EPFL, 1015 Lausanne, Switzerland.
|
50
|
Landry JP, Fei Y, Zhu X. Simultaneous measurement of 10,000 protein-ligand affinity constants using microarray-based kinetic constant assays. Assay Drug Dev Technol 2011; 10:250-9. [PMID: 22192305 DOI: 10.1089/adt.2011.0406] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 12/29/2022] Open
Abstract
Fluorescence-based endpoint detection of microarrays with 10,000 or more molecular targets is a most useful tool for high-throughput profiling of biomolecular interactions, including screening large molecular libraries for novel protein ligands. However, endpoint fluorescence data such as images of reacted microarrays contain little information on kinetic rate constants, and the reliability of endpoint data as measures of binding affinity depends on reaction conditions and postreaction processing. We here report a simultaneous measurement of binding curves of a protein probe with 10,000 molecular targets in a microarray with an ellipsometry-based (label-free) optical scanner. The reaction rate constants extracted from these curves (k_on, k_off, and k_a = k_on/k_off) are used to characterize the probe-target interactions instead of the endpoints. This work advances the microarray technology to a new milestone, namely, from an endpoint assay to a kinetic constant assay platform. The throughput of this binding curve assay platform is comparable to those at the National Institutes of Health Molecular Library Screening Centers, making it a practical method in screening compound libraries for novel ligands and for system-wide affinity profiling of proteins, viruses, or whole cells against diverse molecular targets.
Affiliation(s)
- James P Landry
- Department of Physics, University of California at Davis, Davis, CA 95616, USA
|