1
Sánchez IE, Galpern EA, Ferreiro DU. Solvent constraints for biopolymer folding and evolution in extraterrestrial environments. Proc Natl Acad Sci U S A 2024; 121:e2318905121. [PMID: 38739787 PMCID: PMC11127021 DOI: 10.1073/pnas.2318905121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024] Open
Abstract
We propose that spontaneous folding and molecular evolution of biopolymers are two universal aspects that must concur for life to happen. These aspects are fundamentally related to the chemical composition of biopolymers and crucially depend on the solvent in which they are embedded. We show that molecular information theory and energy landscape theory allow us to explore the limits that solvents impose on biopolymer existence. We consider 54 solvents, including water, alcohols, hydrocarbons, halogenated solvents, aromatic solvents, and low molecular weight substances made up of elements abundant in the universe, which may potentially take part in alternative biochemistries. We find that along with water, there are many solvents for which the liquid regime is compatible with biopolymer folding and evolution. We present a ranking of the solvents in terms of biopolymer compatibility. Many of these solvents have been found in molecular clouds or may be expected to occur in extrasolar planets.
Affiliation(s)
- Ignacio E. Sánchez
- Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
- Ezequiel A. Galpern
- Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
- Diego U. Ferreiro
- Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
2
Pereira de Araújo AF. Sequence-dependent and -independent information in a combined random energy model for protein folding and coding. Proteins 2024; 92:679-687. [PMID: 38158239 DOI: 10.1002/prot.26658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 12/11/2023] [Accepted: 12/15/2023] [Indexed: 01/03/2024]
Abstract
Random energy models (REMs) provide a simple description of the energy landscapes that guide protein folding and evolution. The requirement of a large energy gap between the native structure and unfolded conformations, considered necessary for cooperative, protein-like, folding behavior, indicates that proteins differ markedly from random heteropolymers. It has been suggested, therefore, that natural selection might have acted to choose nonrandom amino acid sequences satisfying this particular condition, implying that a large fraction of possible, unselected random sequences, would not fold to any structure. From an informational perspective, however, this scenario could indicate that protein structures, regarded as messages to be transmitted through a communication channel, would not be efficiently encoded in amino acid sequences, regarded as the communication channel for this transmission, since a large fraction of possible channel states would not be used. Here, we use a combined REM for conformations and sequences, with previously estimated parameters for natural proteins, to explore an alternative possibility in which the appropriate shape of the landscape results mainly from the deviation from randomness of possible native structures instead of sequences. We observe that this situation emerges naturally if the distribution of conformational energies happens to arise from two independent contributions corresponding to sequence-dependent and -independent terms. This construction is consistent with the hypothesis of a protein burial folding code, with native structures being determined by a modest amount of sequence-dependent atomic burial information with sequence-independent constraints imposed by unspecific hydrogen bond formation. 
More generally, an appropriate combination of sequence-dependent and -independent information accommodates the possibility of an efficient structural encoding with the main physical requirement for folding, providing possible insight not only into the folding process but also into several aspects of sequence evolution such as neutral networks, conformational coverage, and de novo gene emergence.
Affiliation(s)
- Antônio F Pereira de Araújo
- Laboratório de Biofísica Teórica, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
3
Yang S, Liu D, Song Y, Liang Y, Yu H, Zuo Y. Designing a structure-function alphabet of helix based on reduced amino acid clusters. Arch Biochem Biophys 2024; 754:109942. [PMID: 38387828 DOI: 10.1016/j.abb.2024.109942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 02/24/2024]
Abstract
Several simple secondary structures could form complex and diverse functional proteins, meaning that secondary structures may contain a lot of hidden information and are arranged according to certain principles, to carry enough information of functional specificity and diversity. However, this inner information and these principles have not been understood systematically. In our study, we designed a structure-function alphabet of helix based on reduced amino acid clusters to describe the typical features of helices and delve into the information. Firstly, we selected 480 typical helices from membrane proteins, zymoproteins, transcription factors, and other proteins to define and calculate the interval range, and the helices are classified in terms of hydrophilicity, charge and length: (1) hydrophobic helix (≤43%), amphiphilic helix (43%–71%), and hydrophilic helix (≥71%); (2) positive helix, negative helix, electrically neutral helix and uncharged helix; (3) short helix (≤8 aa), medium-length helix (9-28 aa), and long helix (≥29 aa). Then, we designed an alphabet containing 36 triplet codes according to the above classification, so that the main features of each helix can be represented by only three letters. This alphabet not only preliminarily defined the helix characteristics, but also greatly reduced the informational dimension of protein structure. Finally, we present an application example to demonstrate the value of the structure-function alphabet in protein functional determination and differentiation.
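The classification described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the residue sets, the reading of "electrically neutral" (charged residues present, net charge zero) versus "uncharged" (no charged residues), the class labels, and the function name `classify_helix` are all assumptions made here for illustration. Only the percentage and length thresholds and the count of 36 codes (3 × 4 × 3) come from the abstract.

```python
from itertools import product
from typing import Tuple

# ASSUMPTION: illustrative residue sets; the abstract does not list them.
HYDROPHILIC = set("RKDENQHSTY")
POSITIVE = set("KR")
NEGATIVE = set("DE")

HYDRO_CLASSES = ("hydrophobic", "amphiphilic", "hydrophilic")
CHARGE_CLASSES = ("positive", "negative", "neutral", "uncharged")
LENGTH_CLASSES = ("short", "medium", "long")

# Every combination of the three features: 3 * 4 * 3 = 36 triplet codes.
ALL_CODES = list(product(HYDRO_CLASSES, CHARGE_CLASSES, LENGTH_CLASSES))


def classify_helix(seq: str) -> Tuple[str, str, str]:
    """Return a (hydrophilicity, charge, length) triplet for a helix sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty helix sequence")

    # Hydrophilicity: percentage of hydrophilic residues (thresholds from the abstract).
    pct = 100.0 * sum(r in HYDROPHILIC for r in seq) / len(seq)
    if pct <= 43:
        hydro = "hydrophobic"
    elif pct >= 71:
        hydro = "hydrophilic"
    else:
        hydro = "amphiphilic"

    # Charge: ASSUMPTION that "neutral" means balanced charges, "uncharged" means none.
    pos = sum(r in POSITIVE for r in seq)
    neg = sum(r in NEGATIVE for r in seq)
    if pos == 0 and neg == 0:
        charge = "uncharged"
    elif pos > neg:
        charge = "positive"
    elif neg > pos:
        charge = "negative"
    else:
        charge = "neutral"

    # Length: <=8 aa short, 9-28 aa medium, >=29 aa long (from the abstract).
    n = len(seq)
    if n <= 8:
        length = "short"
    elif n <= 28:
        length = "medium"
    else:
        length = "long"

    return hydro, charge, length
```

Under these assumed residue sets, `classify_helix("KKKKEEEEKKKK")` returns `("hydrophilic", "positive", "medium")`. A real application would replace the sets with the reduced amino acid clusters defined in the paper.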
Affiliation(s)
- Siqi Yang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
- Dongyang Liu
- Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China; University of Chinese Academy of Sciences, Beijing, China
- Yancheng Song
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
- Yuchao Liang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
- Haoyu Yu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
- Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, School of Life Sciences, Inner Mongolia University, Hohhot, 010021, China
4
Freiberger MI, Ruiz-Serra V, Pontes C, Romero-Durana M, Galaz-Davison P, Ramírez-Sarmiento CA, Schuster CD, Marti MA, Wolynes PG, Ferreiro DU, Parra RG, Valencia A. Local energetic frustration conservation in protein families and superfamilies. Nat Commun 2023; 14:8379. [PMID: 38104123 PMCID: PMC10725452 DOI: 10.1038/s41467-023-43801-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/23/2023] [Accepted: 11/20/2023] [Indexed: 12/19/2023] Open
Abstract
Energetic local frustration offers a biophysical perspective to interpret the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and identify differential frustration patterns in families with a common ancestry. We analyze these signals in very well studied protein families such as PDZ, SH3, α and β globins and RAS families. Recent advances in protein structure prediction make it possible to analyze a vast majority of the protein space. An automatic and unsupervised proteome-wide analysis on the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as perform unsupervised analysis of large datasets to shed light on the physicochemical signatures of poorly characterized proteins such as the ones belonging to emergent pathogens.
Affiliation(s)
- Maria I Freiberger
- Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, C1428EGA, Argentina
- Victoria Ruiz-Serra
- Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Camila Pontes
- Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Miguel Romero-Durana
- Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Pablo Galaz-Davison
- Institute for Biological and Medical Engineering, Schools of Engineering, Medicine, and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- ANID - Millennium Science Initiative Program - Millennium Institute for Integrative Biology (iBio), Santiago, 8331150, Chile
- Cesar A Ramírez-Sarmiento
- Institute for Biological and Medical Engineering, Schools of Engineering, Medicine, and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- ANID - Millennium Science Initiative Program - Millennium Institute for Integrative Biology (iBio), Santiago, 8331150, Chile
- Claudio D Schuster
- Laboratorio de Bioinformática, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EGA, Buenos Aires, Argentina
- Marcelo A Marti
- Laboratorio de Bioinformática, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, C1428EGA, Buenos Aires, Argentina
- Peter G Wolynes
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX, 77005, USA
- Diego U Ferreiro
- Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica - IQUIBICEN/CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, C1428EGA, Argentina
- R Gonzalo Parra
- Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Alfonso Valencia
- Computational Biology Group, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
5
Lipke PN, Ragonis-Bachar P. Sticking to the Subject: Multifunctionality in Microbial Adhesins. J Fungi (Basel) 2023; 9:jof9040419. [PMID: 37108873 PMCID: PMC10144551 DOI: 10.3390/jof9040419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 03/25/2023] [Accepted: 03/27/2023] [Indexed: 03/31/2023] Open
Abstract
Bacterial and fungal adhesins mediate microbial aggregation, biofilm formation, and adhesion to the host. We divide these proteins into two major classes: professional adhesins and moonlighting adhesins that have a non-adhesive activity that is evolutionarily conserved. A fundamental difference between the two classes is the dissociation rate. Whereas moonlighters, including cytoplasmic enzymes and chaperones, can bind with high affinity, they usually dissociate quickly. Professional adhesins often have unusually long dissociation times: minutes or hours. Each adhesin has at least three activities: cell surface association, binding to a ligand or adhesive partner protein, and serving as a microbial surface pattern for host recognition. We briefly discuss Bacillus subtilis TasA, pilin adhesins, Gram-positive MSCRAMMs, and yeast mating adhesins, lectins and flocculins, and Candida Awp and Als families. For these professional adhesins, multiple activities include binding to diverse ligands and binding partners, assembly into molecular complexes, maintenance of cell wall integrity, signaling for cellular differentiation in biofilms and in mating, surface amyloid formation, and anchorage of moonlighting adhesins. We summarize the structural features that lead to these diverse activities. We conclude that adhesins resemble other proteins with multiple activities, but they have unique structural features to facilitate multifunctionality.
Affiliation(s)
- Peter N. Lipke
- Biology Department, Brooklyn College of the City University of New York, Brooklyn, NY 11215, USA
- Peleg Ragonis-Bachar
- Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel