Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Dryden DTF, Thomson AR, White JH. How much of protein sequence space has been explored by life on Earth? J R Soc Interface 2008;5:953-6. [PMID: 18426772 PMCID: PMC2459213 DOI: 10.1098/rsif.2008.0085] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Cited by Other Article(s)

Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024:10.1038/s41586-024-07966-0. [PMID: 39322666 DOI: 10.1038/s41586-024-07966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]

Vila JA. Analysis of proteins in the light of mutations. EUROPEAN BIOPHYSICS JOURNAL : EBJ 2024;53:255-265. [PMID: 38955858 DOI: 10.1007/s00249-024-01714-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/09/2023] [Revised: 05/23/2024] [Accepted: 06/18/2024] [Indexed: 07/04/2024]

Joshi SHN, Jenkins C, Ulaeto D, Gorochowski TE. Accelerating Genetic Sensor Development, Scale-up, and Deployment Using Synthetic Biology. BIODESIGN RESEARCH 2024;6:0037. [PMID: 38919711 PMCID: PMC11197468 DOI: 10.34133/bdr.0037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2024] [Accepted: 04/23/2024] [Indexed: 06/27/2024] Open

Ndochinwa GO, Wang QY, Okoro NO, Amadi OC, Nwagu TN, Nnamchi CI, Moneke AN, Odiba AS. New advances in protein engineering for industrial applications: Key takeaways. Open Life Sci 2024;19:20220856. [PMID: 38911927 PMCID: PMC11193397 DOI: 10.1515/biol-2022-0856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2023] [Revised: 03/01/2024] [Accepted: 03/13/2024] [Indexed: 06/25/2024] Open

Tang X, Dai H, Knight E, Wu F, Li Y, Li T, Gerstein M. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform 2024;25:bbae338. [PMID: 39007594 PMCID: PMC11247410 DOI: 10.1093/bib/bbae338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2024] [Revised: 05/21/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024] Open

Caparotta M, Perez A. Advancing Molecular Dynamics: Toward Standardization, Integration, and Data Accessibility in Structural Biology. J Phys Chem B 2024;128:2219-2227. [PMID: 38418288 DOI: 10.1021/acs.jpcb.3c04823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/01/2024]

Abstract

Molecular dynamics (MD) simulations have become a valuable tool in structural biology, offering insights into complex biological systems that are difficult to obtain through experimental techniques alone. The lack of available data sets and structures in most published computational work has limited other researchers' use of these models. In recent years, the emergence of online sharing platforms and MD database initiatives has favored the deposition of ensembles and structures to accompany publications, encouraging reuse of the data sets. However, the lack of uniformity in metadata collection, formats, and the data that are deposited limits the impact of these data sets and their use by communities that are not necessarily experts in MD. This Perspective highlights the need for standardization and better resource sharing for processing and interpreting MD simulation results, akin to efforts in other areas of structural biology. As the field moves forward, we will see an increase in the popularity and benefits of MD-based integrative approaches that combine experimental data and simulations through probabilistic reasoning, but these too are limited by the lack of uniformity in experimental data availability and by choices on how the data are modeled that are not trivial to decipher from papers. Other fields have addressed similar challenges comprehensively by establishing task forces, with different degrees of success. The large scope and the number of communities needed to represent the breadth of MD simulation types complicate a parallel approach that would fit all. Thus, each group typically decides what data to upload, and in which format, on servers like Zenodo. Uploading data with FAIR (findable, accessible, interoperable, reusable) principles in mind, including optimal metadata collection, will make the data more accessible and actionable by the community. Such a wealth of simulation data will foster method development and infrastructure advancements, thus propelling the field forward.

Vila JA. Protein folding rate evolution upon mutations. Biophys Rev 2023;15:661-669. [PMID: 37681091 PMCID: PMC10480377 DOI: 10.1007/s12551-023-01088-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2023] [Accepted: 06/24/2023] [Indexed: 09/09/2023] Open

Sánchez IE, Galpern EA, Garibaldi MM, Ferreiro DU. Molecular Information Theory Meets Protein Folding. J Phys Chem B 2022;126:8655-8668. [PMID: 36282961 DOI: 10.1021/acs.jpcb.2c04532] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

Gaia as Solaris: An Alternative Default Evolutionary Trajectory. ORIGINS LIFE EVOL B 2022;52:129-147. [PMID: 35441955 DOI: 10.1007/s11084-022-09619-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Accepted: 01/21/2022] [Indexed: 01/23/2023]

Mokhtari DA, Appel MJ, Fordyce PM, Herschlag D. High throughput and quantitative enzymology in the genomic era. Curr Opin Struct Biol 2021;71:259-273. [PMID: 34592682 PMCID: PMC8648990 DOI: 10.1016/j.sbi.2021.07.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2021] [Accepted: 07/23/2021] [Indexed: 12/28/2022]

Buchholz PCF, van Loo B, Eenink BDG, Bornberg-Bauer E, Pleiss J. Ancestral sequences of a large promiscuous enzyme family correspond to bridges in sequence space in a network representation. J R Soc Interface 2021;18:20210389. [PMID: 34727710 DOI: 10.1098/rsif.2021.0389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Bitard-Feildel T. Navigating the amino acid sequence space between functional proteins using a deep learning framework. PeerJ Comput Sci 2021;7:e684. [PMID: 34616884 PMCID: PMC8459775 DOI: 10.7717/peerj-cs.684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 07/30/2021] [Indexed: 06/13/2023]

Abstract

MOTIVATION

Shedding light on the relationships between protein sequences and functions is a challenging task with many implications for protein evolution, the understanding of disease, and protein design. The mapping of protein sequence space to specific functions is, however, hard to comprehend because of its complexity. Generative models help to decipher complex systems thanks to their ability to learn and recreate data specificity. Applied to proteins, they can capture the sequence patterns associated with functions and point out important relationships between sequence positions. By learning these dependencies between sequences and functions, they can ultimately be used to generate new sequences and navigate through uncharted areas of molecular evolution.

RESULTS

This study presents an Adversarial Auto-Encoder (AAE) approach, an unsupervised generative model, to generate new protein sequences. AAEs are tested on three protein families known for their multiple functions: the sulfatase, HUP, and TPP families. Clustering results on the encoded sequences from the latent space computed by AAEs display a high level of homogeneity regarding the protein sequence functions. The study also reports and analyzes, for the first time, two sampling strategies based on latent space interpolation and latent space arithmetic to generate intermediate protein sequences that share sequence properties of original sequences linked to known functional properties from different families and functions. Sequences generated by interpolation between latent space data points demonstrate the ability of the AAE to generalize and produce meaningful biological sequences from an evolutionarily uncharted area of the biological sequence space. Finally, 3D structure models computed by comparative modelling using generated sequences and templates of different sub-families point to the ability of latent space arithmetic to successfully transfer protein sequence properties linked to function between different sub-families. All in all, this study confirms the ability of deep learning frameworks to model biological complexity and bring new tools to explore amino acid sequence and functional spaces.

About the Protein Space Vastness. Protein J 2020;39:472-475. [PMID: 33130957 DOI: 10.1007/s10930-020-09939-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/27/2020] [Indexed: 12/24/2022]

Thorvaldsen S, Hössjer O. Using statistical methods to model the fine-tuning of molecular machines and systems. J Theor Biol 2020;501:110352. [PMID: 32505827 DOI: 10.1016/j.jtbi.2020.110352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2019] [Revised: 05/26/2020] [Accepted: 05/27/2020] [Indexed: 10/24/2022]

Lespinats S, De Clerck O, Colange B, Gorelova V, Grando D, Maréchal E, Van Der Straeten D, Rébeillé F, Bastien O. Phylogeny and Sequence Space: A Combined Approach to Analyze the Evolutionary Trajectories of Homologous Proteins. The Case Study of Aminodeoxychorismate Synthase. Acta Biotheor 2020;68:139-156. [PMID: 31312977 DOI: 10.1007/s10441-019-09352-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2019] [Accepted: 07/10/2019] [Indexed: 11/27/2022]

Bauer TL, Buchholz PCF, Pleiss J. The modular structure of α/β-hydrolases. FEBS J 2019;287:1035-1053. [PMID: 31545554 DOI: 10.1111/febs.15071] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2019] [Revised: 08/15/2019] [Accepted: 09/19/2019] [Indexed: 12/22/2022]

Abstract

The α/β-hydrolase fold family is highly diverse in sequence, structure and biochemical function. To investigate the sequence-structure-function relationships, the Lipase Engineering Database (https://led.biocatnet.de) was updated. Overall, 280 638 protein sequences and 1557 protein structures were analysed. All α/β-hydrolases consist of the catalytically active core domain, but they might also contain additional structural modules, resulting in 12 different architectures: core domain only, additional lids at three different positions, three different caps, additional N- or C-terminal domains and combinations of N- and C-terminal domains with caps and lids respectively. In addition, the α/β-hydrolases were distinguished by their oxyanion hole signature (GX-, GGGX- and Y-types). The N-terminal domains show two different folds, the Rossmann fold or the β-propeller fold. The C-terminal domains show a β-sandwich fold. The N-terminal β-propeller domain and the C-terminal β-sandwich domain are structurally similar to carbohydrate-binding proteins such as lectins. The classification was applied to the newly discovered polyethylene terephthalate (PET)-degrading PETases and MHETases, which are core domain α/β-hydrolases of the GX- and the GGGX-type respectively. To investigate evolutionary relationships, sequence networks were analysed. The degree distribution followed a power law with a scaling exponent γ = 1.4, indicating a highly inhomogeneous network which consists of a few hubs and a large number of less connected sequences. The hub sequences have many functional neighbours and therefore are expected to be robust toward possible deleterious effects of mutations. The cluster size distribution followed a power law with an extrapolated scaling exponent τ = 2.6, which strongly supports the connectedness of the sequence space of α/β-hydrolases. 
DATABASE: Supporting data about domains from other proteins with structural similarity to the N- or C-terminal domains of α/β-hydrolases are available in Data Repository of the University of Stuttgart (DaRUS) under doi: https://doi.org/10.18419/darus-458.

Ćirković MM, Vukotić B, Stojanović M. Persistence of Technosignatures: A Comment on Lingam and Loeb. ASTROBIOLOGY 2019;19:1300-1302. [PMID: 31260327 DOI: 10.1089/ast.2019.2052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Marchi J, Galpern EA, Espada R, Ferreiro DU, Walczak AM, Mora T. Size and structure of the sequence space of repeat proteins. PLoS Comput Biol 2019;15:e1007282. [PMID: 31415557 PMCID: PMC6733475 DOI: 10.1371/journal.pcbi.1007282] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2019] [Revised: 09/09/2019] [Accepted: 07/24/2019] [Indexed: 11/18/2022] Open

Abstract

The coding space of protein sequences is shaped by evolutionary constraints set by requirements of function and stability. We show that the coding space of a given protein family (the total number of sequences in that family) can be estimated using maximum entropy models trained on multiple sequence alignments of naturally occurring amino acid sequences. We analyzed and calculated the size of three abundant repeat protein families, whose members are large proteins made of many repetitions of conserved portions of ∼30 amino acids. While amino acid conservation at each position of the alignment explains most of the reduction of diversity relative to completely random sequences, we found that correlations between amino acid usage at different positions significantly impact that diversity. We quantified the impact of different types of correlations, functional and evolutionary, on sequence diversity. Analysis of the detailed structure of the coding space of the families revealed a rugged landscape, with many local energy minima of varying sizes with a hierarchical structure, reminiscent of the frustrated energy landscapes of spin glasses in physics. This clustered structure indicates a multiplicity of subtypes within each family, and suggests new strategies for protein design.

Natural protein molecules are only a small subset of the possible strings of amino acids. This naturally raises the question of how many functional protein sequences theoretically exist, and how many have already been explored by nature. To help answer this question, we developed a statistical method to calculate the total potential number of protein sequences of a given family, focusing on three families of repeat proteins, which play important roles in a variety of cellular processes. The number of sequences that we compute is limited by functional interactions between the residues of the protein, as well as its evolutionary history. Applying techniques from the physics of disordered systems, we show that the space of sequences has a rugged structure, which could hinder their evolution. Individual proteins can be organised into distinct clusters corresponding to basins of attraction of the landscape, suggesting the existence of subfamilies within each family.

Gorelova V, Bastien O, De Clerck O, Lespinats S, Rébeillé F, Van Der Straeten D. Evolution of folate biosynthesis and metabolism across algae and land plant lineages. Sci Rep 2019;9:5731. [PMID: 30952916 PMCID: PMC6451014 DOI: 10.1038/s41598-019-42146-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Accepted: 03/25/2019] [Indexed: 11/09/2022] Open

Korenić A, Perović S, Ćirković MM, Miquel PA. Symmetry breaking and functional incompleteness in biological systems. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2019;150:1-12. [PMID: 30776381 DOI: 10.1016/j.pbiomolbio.2019.02.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/06/2018] [Revised: 12/31/2018] [Accepted: 02/05/2019] [Indexed: 12/20/2022]

Turjanski P, Ferreiro DU. On the Natural Structure of Amino Acid Patterns in Families of Protein Sequences. J Phys Chem B 2018;122:11295-11301. [PMID: 30239207 DOI: 10.1021/acs.jpcb.8b07206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]

The scale-free nature of protein sequence space. PLoS One 2018;13:e0200815. [PMID: 30067815 PMCID: PMC6070207 DOI: 10.1371/journal.pone.0200815] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2018] [Accepted: 07/03/2018] [Indexed: 11/19/2022] Open

Ćirković MM. Woodpeckers and Diamonds: Some Aspects of Evolutionary Convergence in Astrobiology. ASTROBIOLOGY 2018;18:491-502. [PMID: 29676927 DOI: 10.1089/ast.2017.1741] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Buchholz PCF, Fademrecht S, Pleiss J. Percolation in protein sequence space. PLoS One 2017;12:e0189646. [PMID: 29261740 PMCID: PMC5738032 DOI: 10.1371/journal.pone.0189646] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2017] [Accepted: 11/28/2017] [Indexed: 01/08/2023] Open

Reply to "The Curious Case of TEM-116". Antimicrob Agents Chemother 2016;60:7001. [PMID: 28045665 DOI: 10.1128/aac.01786-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Yu JF, Cao Z, Yang Y, Wang CL, Su ZD, Zhao YW, Wang JH, Zhou Y. Natural protein sequences are more intrinsically disordered than random sequences. Cell Mol Life Sci 2016;73:2949-57. [PMID: 26801222 PMCID: PMC4937073 DOI: 10.1007/s00018-016-2138-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2015] [Revised: 01/10/2016] [Accepted: 01/11/2016] [Indexed: 11/16/2022]

Louis AA. Contingency, convergence and hyper-astronomical numbers in biological evolution. STUDIES IN HISTORY AND PHILOSOPHY OF BIOLOGICAL AND BIOMEDICAL SCIENCES 2016;58:107-116. [PMID: 26868415 DOI: 10.1016/j.shpsc.2015.12.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/21/2015] [Accepted: 12/21/2015] [Indexed: 06/05/2023]

Brisendine JM, Koder RL. Fast, cheap and out of control--Insights into thermodynamic and informatic constraints on natural protein sequences from de novo protein design. BIOCHIMICA ET BIOPHYSICA ACTA 2016;1857:485-492. [PMID: 26498191 PMCID: PMC4856154 DOI: 10.1016/j.bbabio.2015.10.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/28/2015] [Accepted: 10/06/2015] [Indexed: 12/15/2022]

Uversky VN. Paradoxes and wonders of intrinsic disorder: Complexity of simplicity. INTRINSICALLY DISORDERED PROTEINS 2016;4:e1135015. [PMID: 28232895 DOI: 10.1080/21690707.2015.1135015] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/07/2015] [Accepted: 10/18/2015] [Indexed: 01/20/2023]

McLeish TCB. Are there ergodic limits to evolution? Ergodic exploration of genome space and convergence. Interface Focus 2015;5:20150041. [PMID: 26640648 DOI: 10.1098/rsfs.2015.0041] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022] Open

Abstract

We examine the analogy between evolutionary dynamics and statistical mechanics to include the fundamental question of ergodicity: the representative exploration of the space of possible states (in the case of evolution this is genome space). Several properties of evolutionary dynamics are identified that allow a generalization of the ergodic dynamics, familiar in dynamical systems theory, to evolution. Two classes of evolved biological structure then arise, differentiated by the qualitative duration of their evolutionary time scales. The first class has an ergodicity time scale (the time required for representative genome exploration) longer than available evolutionary time, and has incompletely explored the genotypic and phenotypic space of its possibilities. This case generates no expectation of convergence to an optimal phenotype or possibility of its prediction. The second, more interesting, class exhibits an evolutionary form of ergodicity: essentially all of the structural space within the constraints of slower evolutionary variables has been sampled; the ergodicity time scale for the system evolution is less than the evolutionary time. In this case, some convergence towards similar optima may be expected for equivalent systems in different species where both possess ergodic evolutionary dynamics. When the fitness maximum is set by physical, rather than co-evolved, constraints, it is additionally possible to make predictions of some properties of the evolved structures and systems. We propose four structures that emerge from evolution within genotypes whose fitness is induced from their phenotypes. Together, these result in an exponential speeding up of evolution, when compared with complete exploration of genomic space. We illustrate a possible case of application and a prediction of convergence together with attaining a physical fitness optimum in the case of invertebrate compound eye resolution.

Vermeij GJ. Forbidden phenotypes and the limits of evolution. Interface Focus 2015;5:20150028. [PMID: 26640643 DOI: 10.1098/rsfs.2015.0028] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open

Wagner A, Rosen W. Spaces of the possible: universal Darwinism and the wall between technological and biological innovation. J R Soc Interface 2014;11:20131190. [PMID: 24850903 DOI: 10.1098/rsif.2013.1190] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Dutilh BE. Metagenomic ventures into outer sequence space. BACTERIOPHAGE 2014;4:e979664. [PMID: 26458273 PMCID: PMC4588555 DOI: 10.4161/21597081.2014.979664] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 08/23/2014] [Revised: 10/14/2014] [Accepted: 10/17/2014] [Indexed: 11/19/2022]

Krick T, Verstraete N, Alonso LG, Shub DA, Ferreiro DU, Shub M, Sánchez IE. Amino Acid metabolism conflicts with protein diversity. Mol Biol Evol 2014;31:2905-12. [PMID: 25086000 PMCID: PMC4209132 DOI: 10.1093/molbev/msu228] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Widmann M, Pleiss J. Protein variants form a system of networks: microdiversity of IMP metallo-beta-lactamases. PLoS One 2014;9:e101813. [PMID: 25013948 PMCID: PMC4094381 DOI: 10.1371/journal.pone.0101813] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2014] [Accepted: 06/10/2014] [Indexed: 12/29/2022] Open

Abstract

Genome and metagenome sequencing projects support the view that only a tiny portion of the total protein microdiversity in the biosphere has been sequenced so far, while the vast majority of existing protein variants are still unknown. By using a network approach, the microdiversity of 42 metallo-β-lactamases of the IMP family was investigated. In the networks, the nodes are formed by the variants, while the edges correspond to single mutations between pairs of variants. The 42 variants were assigned to 7 separate networks. By analyzing the networks and their relationships, the structure of sequence space was studied and existing, but still unknown, functional variants were predicted. The largest network consists of 10 variants with IMP-1 in its center and includes two ubiquitous mutations, V67F and S262G. By relating the corresponding pairs of variants, the networks were integrated into a single system of networks. The largest network also included a quartet of variants: IMP-1, two single mutants, and the respective double mutant. The existence of quartets indicates that if two mutations resulted in functional enzymes, the double mutant may also be active and stable. Therefore, quartet construction from triplets was applied to predict 15 functional variants. Further functional mutants were predicted by applying the two ubiquitous mutations in all networks. In addition, since the networks are separated from each other by 10-15 mutations on average, it is expected that a subset of the theoretical intermediates are functional and therefore presumably exist in the biosphere. Finally, the network analysis helps to distinguish between epistatic and additive effects of mutations; while the presence of correlated mutations indicates a strong interdependency between the respective positions, the mutations V67F and S262G are ubiquitous and therefore background independent.
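
The network construction described in this abstract can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' code: the function names (`hamming`, `build_networks`, `predict_quartet_members`) and the four-residue toy sequences are hypothetical, and real IMP variants would be full-length aligned sequences. Variants are nodes, an edge joins two variants that differ by exactly one substitution, and each triplet (a variant with two single-mutant neighbours) is completed to a quartet by proposing the missing double mutant.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def build_networks(variants):
    """Return (adjacency, networks) for a dict mapping name -> sequence.

    An edge joins two variants separated by a single substitution;
    a network is a connected component of the resulting graph.
    """
    adjacency = {name: set() for name in variants}
    for u, v in combinations(variants, 2):
        if hamming(variants[u], variants[v]) == 1:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, networks = set(), []
    for start in variants:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        networks.append(component)
    return adjacency, networks


def predict_quartet_members(variants, adjacency):
    """Propose double mutants that would complete a triplet to a quartet."""
    known = set(variants.values())
    predicted = set()
    for centre, neighbours in adjacency.items():
        base = variants[centre]
        for a, b in combinations(sorted(neighbours), 2):
            seq_a, seq_b = variants[a], variants[b]
            pos_a = next(i for i in range(len(base)) if base[i] != seq_a[i])
            pos_b = next(i for i in range(len(base)) if base[i] != seq_b[i])
            if pos_a == pos_b:
                continue  # same position mutated twice: no double mutant
            double = list(base)
            double[pos_a] = seq_a[pos_a]
            double[pos_b] = seq_b[pos_b]
            double = "".join(double)
            if double not in known:
                predicted.add(double)
    return predicted


# Hypothetical toy data (not real IMP sequences).
toy = {"wt": "ACDE", "m1": "FCDE", "m2": "ACDG", "far": "WWWW"}
adjacency, networks = build_networks(toy)
print(networks)
print(predict_quartet_members(toy, adjacency))
```

In this toy set, `wt`, `m1`, and `m2` form one network and `far` forms a second; the sketch proposes the double mutant that combines the substitutions of `m1` and `m2`. Whether such a predicted variant is actually functional is an experimental question that the code does not address.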

Chiarabelli C, Stano P, Luisi PL. Chemical synthetic biology: a mini-review. Front Microbiol 2013;4:285. [PMID: 24065964 PMCID: PMC3779815 DOI: 10.3389/fmicb.2013.00285] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2013] [Accepted: 09/04/2013] [Indexed: 01/08/2023] Open

Uversky VN. A decade and a half of protein intrinsic disorder: biology still waits for physics. Protein Sci 2013;22:693-724. [PMID: 23553817 PMCID: PMC3690711 DOI: 10.1002/pro.2261] [Citation(s) in RCA: 364] [Impact Index Per Article: 33.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2013] [Revised: 03/23/2013] [Accepted: 03/25/2013] [Indexed: 12/28/2022]

Yanagawa H. Exploration of the Origin and Evolution of Globular Proteins by mRNA Display. Biochemistry 2013;52:3841-51. [DOI: 10.1021/bi301704x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]

Unusual biophysics of intrinsically disordered proteins. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2012;1834:932-51. [PMID: 23269364 DOI: 10.1016/j.bbapap.2012.12.008] [Citation(s) in RCA: 413] [Impact Index Per Article: 34.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/23/2012] [Revised: 11/21/2012] [Accepted: 12/12/2012] [Indexed: 02/08/2023]

Abstract

Research over the past decade and a half leaves no doubt that a complete understanding of protein functionality requires close consideration of the fact that many functional proteins do not have well-folded structures. These intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered protein regions (IDPRs) are highly abundant in nature and play a number of crucial roles in a living cell. Their functions, which are typically associated with a wide range of intermolecular interactions where IDPs possess remarkable binding promiscuity, complement the functional repertoire of ordered proteins. All this requires close attention to the peculiarities of the biophysics of these proteins. In this review, some key biophysical features of IDPs are covered. In addition to the peculiar sequence characteristics of IDPs, these biophysical features include sequential, structural, and spatiotemporal heterogeneity of IDPs; their rough and relatively flat energy landscapes; their ability to undergo both induced folding and induced unfolding; the ability to interact specifically with structurally unrelated partners; the ability to gain different structures on binding to different partners; and the ability to retain an essential amount of disorder even in the bound form. IDPs are also characterized by the "turned-out" response to changes in their environment, where they gain some structure under conditions resulting in denaturation or even unfolding of ordered proteins. It is proposed that the heterogeneous spatiotemporal structure of IDPs/IDPRs can be described as a set of foldons, inducible foldons, semi-foldons, non-foldons, and unfoldons. They may lose their function when folded, and activation of some IDPs is associated with the awakening of the dormant disorder. It is possible that IDPs represent "edge of chaos" systems, which operate in a region between order and complete randomness or chaos, where the complexity is maximal.
This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.

Bywater RP. On dating stages in prebiotic chemical evolution. Naturwissenschaften 2012;99:167-76. [DOI: 10.1007/s00114-012-0892-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2011] [Revised: 01/28/2012] [Accepted: 01/30/2012] [Indexed: 01/08/2023]

Holliday GL, Fischer JD, Mitchell JBO, Thornton JM. Characterizing the complexity of enzymes on the basis of their mechanisms and structures with a bio-computational analysis. FEBS J 2011;278:3835-45. [PMID: 21605342 PMCID: PMC3258480 DOI: 10.1111/j.1742-4658.2011.08190.x] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Liu X, Lv B, Guo W. The size distribution of protein families within different types of folds. Biochem Biophys Res Commun 2011;406:218-22. [PMID: 21303659 DOI: 10.1016/j.bbrc.2011.02.020] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2011] [Accepted: 02/03/2011] [Indexed: 11/19/2022]

Tanaka J, Doi N, Takashima H, Yanagawa H. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids. Protein Sci 2010;19:786-95. [PMID: 20162614 DOI: 10.1002/pro.358] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Marliere P. The farther, the safer: a manifesto for securely navigating synthetic species away from the old living world. SYSTEMS AND SYNTHETIC BIOLOGY 2009;3:77-84. [PMID: 19816802 PMCID: PMC2759432 DOI: 10.1007/s11693-009-9040-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 05/21/2009] [Revised: 08/05/2009] [Accepted: 08/07/2009] [Indexed: 11/06/2022]

Bywater RP. Membrane-spanning peptides and the origin of life. J Theor Biol 2009;261:407-13. [PMID: 19679140 DOI: 10.1016/j.jtbi.2009.08.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2008] [Revised: 07/31/2009] [Accepted: 08/04/2009] [Indexed: 11/29/2022]