1
|
Bakker MJ, Gaffour A, Juhás M, Zapletal V, Stošek J, Bratholm LA, Pavlíková Přecechtělová J. Streamlining NMR Chemical Shift Predictions for Intrinsically Disordered Proteins: Design of Ensembles with Dimensionality Reduction and Clustering. J Chem Inf Model 2024; 64:6542-6556. [PMID: 39099394 PMCID: PMC11412307 DOI: 10.1021/acs.jcim.4c00809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/06/2024]
Abstract
By merging advanced dimensionality reduction (DR) and clustering algorithm (CA) techniques, our study advances the sampling procedure for predicting NMR chemical shifts (CS) in intrinsically disordered proteins (IDPs), making a significant leap forward in the field of protein analysis/modeling. We enhance NMR CS sampling by generating clustered ensembles that accurately reflect the different properties and phenomena encapsulated by the IDP trajectories. This investigation critically assessed different rapid CS predictors, both neural network (e.g., Sparta+ and ShiftX2) and database-driven (ProCS-15), and highlighted the need for more advanced quantum calculations and the subsequent need for more tractable-sized conformational ensembles. Although neural network CS predictors outperformed ProCS-15 for all atoms, all tools showed poor agreement with HN CSs, and the neural network CS predictors were unable to capture the influence of phosphorylated residues, highly relevant for IDPs. This study also addressed the limitations of using direct clustering with collective variables, such as the widespread implementation of the GROMOS algorithm. Clustered ensembles (CEs) produced by this algorithm showed poor performance with chemical shifts compared to sequential ensembles (SEs) of similar size. Instead, we implement a multiscale DR and CA approach and explore the challenges and limitations of applying these algorithms to obtain more robust and tractable CEs. The novel feature of this investigation is the use of solvent-accessible surface area (SASA) as one of the fingerprints for DR alongside previously investigated α carbon distance/angles or ϕ/ψ dihedral angles. The ensembles produced with SASA tSNE DR produced CEs better aligned with the experimental CS of between 0.17 and 0.36 r2 (0.18-0.26 ppm) depending on the system and replicate. Furthermore, this technique produced CEs with better agreement than traditional SEs in 85.7% of all ensemble sizes. This study investigates the quality of ensembles produced based on different input features, comparing latent spaces produced by linear vs nonlinear DR techniques and a novel integrated silhouette score scanning protocol for tSNE DR.
Collapse
Affiliation(s)
- Michael J Bakker
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
| | - Amina Gaffour
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
| | - Martin Juhás
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Department of Chemistry, Faculty of Science, University of Hradec Králové, Rokitanského 62, 500 03 Hradec Králové, Czech Republic
| | - Vojtěch Zapletal
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
| | - Jakub Stošek
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
- Department of Chemistry, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Lars A Bratholm
- School of Chemistry, University of Bristol, Cantock's Close, BS8 1TS Bristol, U.K
| | - Jana Pavlíková Přecechtělová
- Faculty of Pharmacy in Hradec Králové, Charles University, Akademika Heyrovského 1203/8, 500 05 Hradec Králové, Czech Republic
| |
Collapse
|
2
|
Mustor EM, Wohlfahrt J, Guergues J, Stevens SM, Shaw LN. A Simplified Method for Comprehensive Capture of the Staphylococcus aureus Proteome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.07.607079. [PMID: 39149396 PMCID: PMC11326305 DOI: 10.1101/2024.08.07.607079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 08/17/2024]
Abstract
Staphylococcus aureus is a major human pathogen causing myriad infections in both community and healthcare settings. Although well studied, a comprehensive exploration of its dynamic and adaptive proteome is still somewhat lacking. Herein, we employed streamlined liquid- and gas-phase fractionation with PASEF analysis on a TIMS-TOF instrument to expand coverage and explore the S. aureus dark proteome. In so doing, we captured the most comprehensive S. aureus proteome to date, totaling 2,231 proteins (85.6% coverage), using a significantly simplified process that demonstrated high reproducibility with minimal input material. We then showcase application of this library for differential expression profiling by investigating temporal dynamics of the S. aureus proteome. This revealed alterations in metabolic processes, ATP production, RNA processing, and stress-response proteins as cultures progressed to stationary growth. Notably, a significant portion of the library (94%) and proteome (80.5%) was identified by this single-shot, DIA-based analysis. Overall, our study shines new light on the hidden S. aureus proteome, generating a valuable new resource to facilitate further study of this dangerous pathogen.
Collapse
Affiliation(s)
- Emilee M Mustor
- Department of Molecular Biosciences, University of South Florida, Tampa, FL, USA
- Center for Antimicrobial Resistance, University of South Florida, Tampa, FL, USA
| | - Jessica Wohlfahrt
- Department of Molecular Biosciences, University of South Florida, Tampa, FL, USA
| | - Jennifer Guergues
- Department of Molecular Biosciences, University of South Florida, Tampa, FL, USA
| | - Stanley M Stevens
- Department of Molecular Biosciences, University of South Florida, Tampa, FL, USA
| | - Lindsey Neil Shaw
- Department of Molecular Biosciences, University of South Florida, Tampa, FL, USA
- Center for Antimicrobial Resistance, University of South Florida, Tampa, FL, USA
| |
Collapse
|
3
|
Xu S, Onoda A. Accurate and Fast Prediction of Intrinsically Disordered Protein by Multiple Protein Language Models and Ensemble Learning. J Chem Inf Model 2024; 64:2901-2911. [PMID: 37883249 DOI: 10.1021/acs.jcim.3c01202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2023]
Abstract
Intrinsically disordered proteins (IDPs) play a vital role in various biological processes and have attracted increasing attention in the past few decades. Predicting IDPs from the primary structures of proteins offers a rapid and facile means of protein analysis without necessitating crystal structures. In particular, machine learning methods have demonstrated their potential in this field. Recently, protein language models (PLMs) are emerging as a promising approach to extracting essential information from protein sequences and have been employed in protein modeling to utilize their advantages of precision and efficiency. In this article, we developed a novel IDP prediction method named IDP-ELM to predict the intrinsically disordered regions (IDRs) as well as their functions including disordered flexible linkers and disordered protein binding. This method utilizes high-dimensional representations extracted from several state-of-the-art PLMs and predicts IDRs by ensemble learning based on bidirectional recurrent neural networks. The performance of the method was evaluated on two independent test data sets from CAID (critical assessment of protein intrinsic disorder prediction) and CAID2, indicating notable improvements in terms of area under the receiver operating characteristic (AUC), Matthew's correlation coefficient (MCC), and F1 score. Moreover, IDP-ELM requires solely protein sequences as inputs and does not entail a time-consuming process of protein profile generation, which is a prerequisite for most existing state-of-the-art methods, enabling an accurate, fast, and convenient tool for proteome-level analysis. The corresponding reproducible source code and model weights are available at https://github.com/xu-shi-jie/idp-elm.
Collapse
Affiliation(s)
- Shijie Xu
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
| | - Akira Onoda
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
- Faculty of Environmental Earth Science, Hokkaido University, Sapporo 060-0810, Japan
| |
Collapse
|
4
|
Richardson R, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: Understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. eLife 2024; 12:RP93429. [PMID: 38546716 PMCID: PMC10977968 DOI: 10.7554/elife.93429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/01/2024] Open
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Collapse
Affiliation(s)
- Reese Richardson
- Interdisciplinary Biological Sciences, Northwestern UniversityEvanstonUnited States
- Department of Chemical and Biological Engineering, Northwestern UniversityEvanstonUnited States
| | - Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern UniversityEvanstonUnited States
- Northwestern Institute on Complex Systems, Northwestern UniversityEvanstonUnited States
| | - Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern UniversityEvanstonUnited States
- Northwestern Institute on Complex Systems, Northwestern UniversityEvanstonUnited States
- Department of Molecular Biosciences, Northwestern UniversityEvanstonUnited States
- Department of Physics and Astronomy, Northwestern UniversityEvanstonUnited States
| | - Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern UniversityEvanstonUnited States
- The Potocsnak Longevity Institute, Northwestern UniversityChicagoUnited States
- Simpson Querrey Lung Institute for Translational Science, Northwestern UniversityChicagoUnited States
| |
Collapse
|
5
|
Richardson RAK, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.02.28.530483. [PMID: 36909550 PMCID: PMC10002660 DOI: 10.1101/2023.02.28.530483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2023]
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of - omics studies. To promote the investigation of understudied genes we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Collapse
Affiliation(s)
- Reese AK Richardson
- Interdisciplinary Biological Sciences, Northwestern University
- Department of Chemical and Biological Engineering, Northwestern University
| | - Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
| | - Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
- Department of Physics and Astronomy, Northwestern University
- Department of Molecular Biosciences, Northwestern University
| | - Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University
- The Potocsnak Longevity Institute, Northwestern University
- Simpson Querrey Lung Institute for Translational Science, Northwestern University
| |
Collapse
|
6
|
Rocha JJ, Jayaram SA, Stevens TJ, Muschalik N, Shah RD, Emran S, Robles C, Freeman M, Munro S. Functional unknomics: Systematic screening of conserved genes of unknown function. PLoS Biol 2023; 21:e3002222. [PMID: 37552676 PMCID: PMC10409296 DOI: 10.1371/journal.pbio.3002222] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2023] [Accepted: 06/27/2023] [Indexed: 08/10/2023] Open
Abstract
The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable "Unknome database" that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.
Collapse
Affiliation(s)
- João J. Rocha
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | | | - Tim J. Stevens
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | | | - Rajen D. Shah
- Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
| | - Sahar Emran
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Cristina Robles
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Matthew Freeman
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Sir William Dunn School of Pathology, University of Oxford, Oxford, United Kingdom
| | - Sean Munro
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| |
Collapse
|
7
|
Edich M, Briggs DC, Kippes O, Gao Y, Thorn A. The impact of AlphaFold2 on experimental structure solution. Faraday Discuss 2022; 240:184-195. [PMID: 35943157 PMCID: PMC10231047 DOI: 10.1039/d2fd00072e] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2022] [Accepted: 05/03/2022] [Indexed: 01/09/2023]
Abstract
AlphaFold2 is a machine-learning based program that predicts a protein structure based on the amino acid sequence. In this article, we report on the current usages of this new tool and give examples from our work in the Coronavirus Structural Task Force. With its unprecedented accuracy, it can be utilized for the design of expression constructs, de novo protein design and the interpretation of Cryo-EM data with an atomic model. However, these methods are limited by their training data and are of limited use to predict conformational variability and fold flexibility; they also lack co-factors, post-translational modifications and multimeric complexes with oligonucleotides. They also are not always perfect in terms of chemical geometry. Nevertheless, machine learning-based fold prediction is a game changer for structural bioinformatics and experimentalists alike, with exciting developments ahead.
Collapse
Affiliation(s)
- Maximilian Edich
- Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany.
| | - David C Briggs
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
| | - Oliver Kippes
- Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany.
| | - Yunyun Gao
- Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany.
| | - Andrea Thorn
- Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany.
| |
Collapse
|
8
|
Jarnot P, Ziemska-Legiecka J, Grynberg M, Gruca A. Insights from analyses of low complexity regions with canonical methods for protein sequence comparison. Brief Bioinform 2022; 23:bbac299. [PMID: 35914952 PMCID: PMC9487646 DOI: 10.1093/bib/bbac299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Revised: 06/29/2022] [Accepted: 07/01/2022] [Indexed: 11/28/2022] Open
Abstract
Low complexity regions are fragments of protein sequences composed of only a few types of amino acids. These regions frequently occur in proteins and can play an important role in their functions. However, scientists are mainly focused on regions characterized by high diversity of amino acid composition. Similarity between regions of protein sequences frequently reflect functional similarity between them. In this article, we discuss strengths and weaknesses of the similarity analysis of low complexity regions using BLAST, HHblits and CD-HIT. These methods are considered to be the gold standard in protein similarity analysis and were designed for comparison of high complexity regions. However, we lack specialized methods that could be used to compare the similarity of low complexity regions. Therefore, we investigated the existing methods in order to understand how they can be applied to compare such regions. Our results are supported by exploratory study, discussion of amino acid composition and biological roles of selected examples. We show that existing methods need improvements to efficiently search for similar low complexity regions. We suggest features that have to be re-designed specifically for comparing low complexity regions: scoring matrix, multiple sequence alignment, e-value, local alignment and clustering based on a set of representative sequences. Results of this analysis can either be used to improve existing methods or to create new methods for the similarity analysis of low complexity regions.
Collapse
Affiliation(s)
- Patryk Jarnot
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 2A, 44-100, Gliwice, Poland
| | - Joanna Ziemska-Legiecka
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5A, 02-106, Warsaw, Poland
| | - Marcin Grynberg
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5A, 02-106, Warsaw, Poland
| | - Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 2A, 44-100, Gliwice, Poland
| |
Collapse
|
9
|
Halder A, Verma A, Biswas D, Srivastava S. Recent advances in mass-spectrometry based proteomics software, tools and databases. DRUG DISCOVERY TODAY. TECHNOLOGIES 2021; 39:69-79. [PMID: 34906327 DOI: 10.1016/j.ddtec.2021.06.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/10/2021] [Revised: 05/08/2021] [Accepted: 06/21/2021] [Indexed: 01/12/2023]
Abstract
The field of proteomics immensely depends on data generation and data analysis which are thoroughly supported by software and databases. There has been a massive advancement in mass spectrometry-based proteomics over the last 10 years which has compelled the scientific community to upgrade or develop algorithms, tools, and repository databases in the field of proteomics. Several standalone software, and comprehensive databases have aided the establishment of integrated omics pipeline and meta-analysis workflow which has contributed to understand the disease pathobiology, biomarker discovery and predicting new therapeutic modalities. For shotgun proteomics where Data Dependent Acquisition is performed, several user-friendly software are developed that can analyse the pre-processed data to provide mechanistic insights of the disease. Likewise, in Data Independent Acquisition, pipelines are emerged which can accomplish the task from building the spectral library to identify the therapeutic targets. Furthermore, in the age of big data analysis the implications of machine learning and cloud computing are appending robustness, rapidness and in-depth proteomics data analysis. The current review talks about the recent advancement, and development of software, tools, and database in the field of mass-spectrometry based proteomics.
Collapse
Affiliation(s)
- Ankit Halder
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Ayushi Verma
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Deeptarup Biswas
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Sanjeeva Srivastava
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.
| |
Collapse
|
10
|
Reynolds KA, Rosa-Molinar E, Ward RE, Zhang H, Urbanowicz BR, Settles AM. Accelerating biological insight for understudied genes. Integr Comp Biol 2021; 61:2233-2243. [PMID: 33970251 DOI: 10.1093/icb/icab029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
The rapid expansion of genome sequence data is increasing the discovery of protein-coding genes across all domains of life. Annotating these genes with reliable functional information is necessary to understand evolution, to define the full biochemical space accessed by nature, and to identify target genes for biotechnology improvements. The vast majority of proteins are annotated based on sequence conservation with no specific biological, biochemical, genetic, or cellular function identified. Recent technical advances throughout the biological sciences enable experimental research on these understudied protein-coding genes in a broader collection of species. However, scientists have incentives and biases to continue focusing on well documented genes within their preferred model organism. This perspective suggests a research model that seeks to break historic silos of research bias by enabling interdisciplinary teams to accelerate biological functional annotation. We propose an initiative to develop coordinated projects of collaborating evolutionary biologists, cell biologists, geneticists, and biochemists that will focus on subsets of target genes in multiple model organisms. Concurrent analysis in multiple organisms takes advantage of evolutionary divergence and selection, which causes individual species to be better suited as experimental models for specific genes. Most importantly, multisystem approaches would encourage transdisciplinary critical thinking and hypothesis testing that is inherently slow in current biological research.
Collapse
Affiliation(s)
- Kimberly A Reynolds
- The Green Center for Systems Biology and the Department of Biophysics, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Eduardo Rosa-Molinar
- Department of Pharmacology & Toxicology, The University of Kansas, Lawrence, KS 66047, USA
| | - Robert E Ward
- Department of Biology, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Hongbin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Breeanna R Urbanowicz
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602, USA
| | - A Mark Settles
- Bioengineering Branch, NASA Ames Research Center, Moffett Field, CA USA
| |
Collapse
|
11
|
Barderas R, Srivastava S, LaBaer J. Protein Microarray-Based Proteomics for Disease Analysis. Methods Mol Biol 2021; 2344:3-6. [PMID: 34115348 DOI: 10.1007/978-1-0716-1562-1_1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]
Abstract
As we approach the twentieth anniversary of completing the international Human Genome Project, the next (and arguably most significant) frontier in biology consists of functionally understanding the proteins, which are encoded by the genome and play a crucial role in all of biology and medicine. To accomplish this challenge, different proteomics strategies must be devised to examine the activities of gene products (proteins) at scale. Among them, protein microarrays have been used to accomplish a wide variety of investigations such as examining the binding of proteins and proteoforms to DNA, small molecules, and other proteins; characterizing humoral immune responses in health and disease; evaluating allergenic proteins; and profiling protein patterns as candidate disease-specific biomarkers. In Protein Microarray for Disease Analysis: Methods and Protocols, expert researchers involved in the field of protein microarrays provide concise descriptions of the methodologies that they currently use to fabricate microarrays and how they apply them to analyze protein interactions and responses of proteins to dissect human disease.
Collapse
Affiliation(s)
- Rodrigo Barderas
- Chronic Disease Programme, UFIEC, Instituto de Salud Carlos III, Madrid, Spain.
| | - Sanjeeva Srivastava
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra, India.
| | - Joshua LaBaer
- Biodesign Center for Personalized Diagnostics, Arizona State University, Tempe, AZ, USA.
| |
Collapse
|
12
|
de Brevern AG. Analysis of Protein Disorder Predictions in the Light of a Protein Structural Alphabet. Biomolecules 2020; 10:biom10071080. [PMID: 32698546 PMCID: PMC7408373 DOI: 10.3390/biom10071080] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2020] [Revised: 07/14/2020] [Accepted: 07/18/2020] [Indexed: 12/30/2022] Open
Abstract
Intrinsically-disordered protein (IDP) characterization was an amazing change of paradigm in our classical sequence-structure-function theory. Moreover, IDPs are over-represented in major disease pathways and are now often targeted using small molecules for therapeutic purposes. This has had created a complex continuum from order-that encompasses rigid and flexible regions-to disorder regions; the latter being not accessible through classical crystallographic methodologies. In X-ray structures, the notion of order is dictated by access to resolved atom positions, providing rigidity and flexibility information with low and high experimental B-factors, while disorder is associated with the missing (non-resolved) residues. Nonetheless, some rigid regions can be found in disorder regions. Using ensembles of IDPs, their local conformations were analyzed in the light of a structural alphabet. An entropy index derived from this structural alphabet allowed us to propose a continuum of states from rigidity to flexibility and finally disorder. In this study, the analysis was extended to comparing these results to disorder predictions, underlying a limited correlation, and so opening new ideas to characterize and predict disorder.
Collapse
Affiliation(s)
- Alexandre G de Brevern
- INSERM, UMR_S 1134, DSIMB, Univ Paris, INTS, Laboratoire d'Excellence GR-Ex, 75015 Paris, France
| |
Collapse
|
13
|
Dark Proteome Database: Studies on Disorder. High Throughput 2020; 9:ht9030015. [PMID: 32629790 PMCID: PMC7563470 DOI: 10.3390/ht9030015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2020] [Revised: 06/17/2020] [Accepted: 06/18/2020] [Indexed: 12/17/2022] Open
Abstract
There is a misconception that intrinsic disorder in proteins is equivalent to darkness. The present study aims to establish, in the scope of the Swiss-Prot and Dark Proteome databases, the relationship between disorder and darkness. Three distinct predictors were used to calculate the disorder of Swiss-Prot proteins. The analysis of the results obtained with the used predictors and visualization paradigms resulted in the same conclusion that was reached before: disorder is mostly unrelated to darkness.
Collapse
|
14
|
Haymond A, Davis JB, Espina V. Proteomics for cancer drug design. Expert Rev Proteomics 2019; 16:647-664. [PMID: 31353977 PMCID: PMC6736641 DOI: 10.1080/14789450.2019.1650025] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2019] [Accepted: 07/26/2019] [Indexed: 12/29/2022]
Abstract
Introduction: Signal transduction cascades drive cellular proliferation, apoptosis, immune, and survival pathways. Proteins have emerged as actionable drug targets because they are often dysregulated in cancer, due to underlying genetic mutations, or dysregulated signaling pathways. Cancer drug development relies on proteomic technologies to identify potential biomarkers, mechanisms-of-action, and to identify protein binding hot spots. Areas covered: Brief summaries of proteomic technologies for drug discovery include mass spectrometry, reverse phase protein arrays, chemoproteomics, and fragment based screening. Protein-protein interface mapping is presented as a promising method for peptide therapeutic development. The topic of biosimilar therapeutics is presented as an opportunity to apply proteomic technologies to this new class of cancer drug. Expert opinion: Proteomic technologies are indispensable for drug discovery. A suite of technologies including mass spectrometry, reverse phase protein arrays, and protein-protein interaction mapping provide complimentary information for drug development. These assays have matured into well controlled, robust technologies. Recent regulatory approval of biosimilar therapeutics provides another opportunity to decipher the molecular nuances of their unique mechanisms of action. The ability to identify previously hidden protein hot spots is expanding the gamut of potential drug targets. Proteomic profiling permits lead compound evaluation beyond the one drug, one target paradigm.
Collapse
Affiliation(s)
- Amanda Haymond
- Center for Applied Proteomics and Molecular Medicine, George Mason University , Manassas , VA , USA
| | - Justin B Davis
- Center for Applied Proteomics and Molecular Medicine, George Mason University , Manassas , VA , USA
| | - Virginia Espina
- Center for Applied Proteomics and Molecular Medicine, George Mason University , Manassas , VA , USA
| |
Collapse
|