Cited by Other Article(s)

Feng C, Wei H, Xu C, Feng B, Zhu X, Liu J, Zou Q. iProps: A Comprehensive Software Tool for Protein Classification and Analysis With Automatic Machine Learning Capabilities and Model Interpretation Capabilities. IEEE J Biomed Health Inform 2024;28:6237-6247. [PMID: 39008396 DOI: 10.1109/jbhi.2024.3425716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]

Abstract

Protein classification is a crucial field in bioinformatics. The development of a comprehensive tool that can perform feature evaluation, visualization, automated machine learning, and model interpretation would significantly advance research in protein classification. However, there is a significant gap in the literature regarding tools that integrate all these essential functionalities. This paper presents iProps, a novel Python-based software package, meticulously crafted to fulfill these multifaceted requirements. iProps is distinguished by its proficiency in feature extraction, evaluation, automated machine learning, and interpretation of classification models. Firstly, iProps fully leverages evolutionary information and amino acid reduction information to propose or extend several numerical protein features that are independent of sequence length, including SC-PSSM, ORDip, TRC, CTDC-E, CKSAAGP-E, and so forth; at the same time, it also implements the calculation of 17 other numerical features within the software. iProps also provides feature combination operations for the aforementioned features to generate more hybrid features, and has added data balancing sampling processing as well as built-in classifier settings, among other functionalities. Thus, it can discern the most effective protein class recognition feature from a multitude of candidates, utilizing three automated machine learning algorithms to identify the optimal classifiers and parameter settings. Furthermore, iProps generates a detailed explanatory report that includes 23 informative graphs derived from three interpretable models. To assess the performance of iProps, a series of numerical experiments were conducted using two well-established datasets. The results demonstrated that our software achieved superior recognition performance in every case. Beyond its contributions to bioinformatics, iProps broadens its applicability by offering robust data analysis tools that are beneficial across various disciplines, capitalizing on its automated machine learning and model interpretation capabilities. As an open-source platform, iProps is readily accessible and features an intuitive user interface, ensuring ease of use for individuals, even those without a background in programming.

Santos-Júnior CD, Torres MDT, Duan Y, Rodríguez Del Río Á, Schmidt TSB, Chong H, Fullam A, Kuhn M, Zhu C, Houseman A, Somborski J, Vines A, Zhao XM, Bork P, Huerta-Cepas J, de la Fuente-Nunez C, Coelho LP. Discovery of antimicrobial peptides in the global microbiome with machine learning. Cell 2024;187:3761-3778.e16. [PMID: 38843834 DOI: 10.1016/j.cell.2024.05.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Revised: 04/11/2024] [Accepted: 05/06/2024] [Indexed: 06/25/2024]

Affiliation(s)

Célio Dias Santos-Júnior Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Laboratory of Microbial Processes & Biodiversity - LMPB, Department of Hydrobiology, Universidade Federal de São Carlos - UFSCar, São Carlos, São Paulo 13565-905, Brazil
Marcelo D T Torres Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
Yiqian Duan Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Álvaro Rodríguez Del Río Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, Pozuelo de Alarcón, 28223 Madrid, Spain
Thomas S B Schmidt Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; APC Microbiome & School of Medicine, University College Cork, Cork, Ireland
Hui Chong Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Anthony Fullam Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Michael Kuhn Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Chengkai Zhu Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Amy Houseman Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Jelena Somborski Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Anna Vines Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
Xing-Ming Zhao Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai, China; State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
Peer Bork Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, Berlin, Germany; Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
Jaime Huerta-Cepas Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, Pozuelo de Alarcón, 28223 Madrid, Spain
Cesar de la Fuente-Nunez Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA.
Luis Pedro Coelho Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, QLD, Australia.

Santos-Júnior CD, Der Torossian Torres M, Duan Y, del Río ÁR, Schmidt TS, Chong H, Fullam A, Kuhn M, Zhu C, Houseman A, Somborski J, Vines A, Zhao XM, Bork P, Huerta-Cepas J, de la Fuente-Nunez C, Coelho LP. Computational exploration of the global microbiome for antibiotic discovery. bioRxiv 2023:2023.08.31.555663. [PMID: 37693522 PMCID: PMC10491242 DOI: 10.1101/2023.08.31.555663] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 09/12/2023]

Affiliation(s)

Célio Dias Santos-Júnior Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Marcelo Der Torossian Torres Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Yiqian Duan Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Álvaro Rodríguez del Río Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
Thomas S.B. Schmidt Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Hui Chong Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Anthony Fullam Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Michael Kuhn Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Chengkai Zhu Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Amy Houseman Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Jelena Somborski Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Anna Vines Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
Xing-Ming Zhao Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China; Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai, China; State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; International Human Phenome Institute, Shanghai, China
Peer Bork Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, Berlin, Germany; Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
Jaime Huerta-Cepas Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
Cesar de la Fuente-Nunez Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Luis Pedro Coelho Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China

Dousis A, Ravichandran K, Hobert EM, Moore MJ, Rabideau AE. An engineered T7 RNA polymerase that produces mRNA free of immunostimulatory byproducts. Nat Biotechnol 2023;41:560-568. [PMID: 36357718 PMCID: PMC10110463 DOI: 10.1038/s41587-022-01525-6] [Citation(s) in RCA: 53] [Impact Index Per Article: 53.0] [Received: 12/08/2021] [Accepted: 09/22/2022] [Indexed: 11/12/2022]

Sánchez IE, Galpern EA, Garibaldi MM, Ferreiro DU. Molecular Information Theory Meets Protein Folding. J Phys Chem B 2022;126:8655-8668. [PMID: 36282961 DOI: 10.1021/acs.jpcb.2c04532] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/11/2023]

Magi Meconi G, Sasselli IR, Bianco V, Onuchic JN, Coluzza I. Key aspects of the past 30 years of protein design. Rep Prog Phys 2022;85:086601. [PMID: 35704983 DOI: 10.1088/1361-6633/ac78ef] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/03/2021] [Accepted: 06/15/2022] [Indexed: 06/15/2023]

Liang Y, Yang S, Zheng L, Wang H, Zhou J, Huang S, Yang L, Zuo Y. Research progress of reduced amino acid alphabets in protein analysis and prediction. Comput Struct Biotechnol J 2022;20:3503-3510. [PMID: 35860409 PMCID: PMC9284397 DOI: 10.1016/j.csbj.2022.07.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/14/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 11/29/2022]

Zheng L, Liu D, Li YA, Yang S, Liang Y, Xing Y, Zuo Y. RaacFold: a webserver for 3D visualization and analysis of protein structure by using reduced amino acid alphabets. Nucleic Acids Res 2022;50:W633-W638. [PMID: 35639512 PMCID: PMC9252778 DOI: 10.1093/nar/gkac415] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/21/2022] [Revised: 04/23/2022] [Accepted: 05/09/2022] [Indexed: 12/11/2022]

Wan H, Zhang J, Ding Y, Wang H, Tian G. Immunoglobulin Classification Based on FC* and GC* Features. Front Genet 2022;12:827161. [PMID: 35140745 PMCID: PMC8819591 DOI: 10.3389/fgene.2021.827161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 12/22/2021] [Indexed: 11/13/2022]

Cooley NP, Wright ES. Accurate annotation of protein coding sequences with IDTAXA. NAR Genom Bioinform 2021;3:lqab080. [PMID: 34541527 PMCID: PMC8445202 DOI: 10.1093/nargab/lqab080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Revised: 07/07/2021] [Accepted: 08/25/2021] [Indexed: 11/12/2022]

Štambuk N, Konjevoda P, Pavan J. Antisense Peptide Technology for Diagnostic Tests and Bioengineering Research. Int J Mol Sci 2021;22:9106. [PMID: 34502016 PMCID: PMC8431130 DOI: 10.3390/ijms22179106] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/22/2021] [Revised: 08/10/2021] [Accepted: 08/13/2021] [Indexed: 01/01/2023]

ANPrAod: Identify Antioxidant Proteins by Fusing Amino Acid Clustering Strategy and N-Peptide Combination. Comput Math Methods Med 2021;2021:5518209. [PMID: 33927782 PMCID: PMC8049822 DOI: 10.1155/2021/5518209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Revised: 03/02/2021] [Accepted: 03/10/2021] [Indexed: 11/18/2022]

Iannuzzi R, Rossetti G, Spitaleri A, Bonnal RJP, Pagani M, Mollica L. A Simplified Amino Acidic Alphabet to Unveil the T-Cells Receptors Antigens: A Computational Perspective. Front Chem 2021;9:598802. [PMID: 33718327 PMCID: PMC7947793 DOI: 10.3389/fchem.2021.598802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2020] [Accepted: 01/19/2021] [Indexed: 11/15/2022]

Abstract

The exposure to pathogens triggers the activation of adaptive immune responses through antigens bound to surface receptors of antigen presenting cells (APCs). T cell receptors (TCR) are responsible for initiating the immune response through their direct physical interaction with antigen-bound receptors on the APC surface. The study of T cell interactions with antigens is considered of crucial importance for the comprehension of the role of immune responses in cancer growth and for the subsequent design of immunomodulating anticancer drugs. RNA sequencing experiments performed on T cells represented a major breakthrough for this branch of experimental molecular biology. Apart from the gene expression levels, the hypervariable CDR3α/β sequences of the TCR loops can now be easily determined and modelled in three dimensions, these being the portions of the TCR mainly responsible for the interaction with APC receptors. The most direct experimental method for the investigation of antigens would be based on peptide libraries, but their huge combinatorial nature, size, cost, and the difficulty of experimental fine tuning make this approach complicated, time-consuming, and costly. We have implemented an in silico methodology with the aim of moving from CDR3α/β sequences to a library of potentially antigenic peptides that can be used in immunologically oriented experiments to study T cells’ reactivity. To reduce the size of the library, we have verified the reproducibility of experimental benchmarks using the permutation of only six residues that can be considered representative of all ensembles of 20 natural amino acids. Such a simplified alphabet is able to correctly find the poses and chemical nature of original antigens within a small subset of ligands of potential interest. The newly generated library would have the advantage of leading to potentially antigenic ligands that would contribute to a better understanding of the chemical nature of TCR-antigen interactions. This step is crucial in the design of immunomodulators targeted towards the T-cell response as well as in understanding the first principles of an immune response in several diseases, from cancer to autoimmune disorders.

Wang H, Xi Q, Liang P, Zheng L, Hong Y, Zuo Y. IHEC_RAAC: a online platform for identifying human enzyme classes via reduced amino acid cluster strategy. Amino Acids 2021;53:239-251. [PMID: 33486591 DOI: 10.1007/s00726-021-02941-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2020] [Accepted: 01/11/2021] [Indexed: 12/18/2022]

Zheng L, Liu D, Yang W, Yang L, Zuo Y. RaacLogo: a new sequence logo generator by using reduced amino acid clusters. Brief Bioinform 2020;22:5855392. [PMID: 32524143 DOI: 10.1093/bib/bbaa096] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 03/08/2020] [Revised: 04/12/2020] [Accepted: 04/29/2020] [Indexed: 12/15/2022]

Zheng L, Huang S, Mu N, Zhang H, Zhang J, Chang Y, Yang L, Zuo Y. RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule. Database (Oxford) 2020;2019:5650975. [PMID: 31802128 PMCID: PMC6893003 DOI: 10.1093/database/baz131] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 06/20/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/12/2022]

Nerattini F, Tubiana L, Cardelli C, Bianco V, Dellago C, Coluzza I. Protein design under competing conditions for the availability of amino acids. Sci Rep 2020;10:2684. [PMID: 32060385 PMCID: PMC7021711 DOI: 10.1038/s41598-020-59401-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/27/2019] [Accepted: 12/08/2019] [Indexed: 11/09/2022]

Kaushik AC, Mehmood A, Khan MT, Kumar A, Dai X, Wei DQ. RETRACTED ARTICLE: Protein blueprint and their interactions while approachability struggle for amino acids. J Biomol Struct Dyn 2020;39:i-ix. [PMID: 31914855 DOI: 10.1080/07391102.2020.1713894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]

Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 2019. [PMID: 31779668 DOI: 10.1101/762302] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 05/15/2023]

Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 2019;20:257. [PMID: 31779668 PMCID: PMC6883579 DOI: 10.1186/s13059-019-1891-0] [Citation(s) in RCA: 2594] [Impact Index Per Article: 518.8] [Received: 09/20/2019] [Accepted: 11/18/2019] [Indexed: 02/06/2023]

Solis AD. Reduced alphabet of prebiotic amino acids optimally encodes the conformational space of diverse extant protein folds. BMC Evol Biol 2019;19:158. [PMID: 31362700 PMCID: PMC6668081 DOI: 10.1186/s12862-019-1464-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 11/12/2018] [Accepted: 06/19/2019] [Indexed: 11/10/2022]

Abstract

Background

There is wide agreement that only a subset of the twenty standard amino acids existed prebiotically in sufficient concentrations to form functional polypeptides. We ask how this subset, postulated as {A,D,E,G,I,L,P,S,T,V}, could have formed structures stable enough to found metabolic pathways. Inspired by alphabet reduction experiments, we undertook a computational analysis to measure the structural coding behavior of sequences simplified by reduced alphabets. We sought to discern characteristics of the prebiotic set that would endow it with unique properties relevant to structure, stability, and folding.

Results

Drawing on a large dataset of single-domain proteins, we employed an information-theoretic measure to assess how well the prebiotic amino acid set preserves fold information against all other possible ten-amino acid sets. An extensive virtual mutagenesis procedure revealed that the prebiotic set excellently preserves sequence-dependent information regarding both backbone conformation and tertiary contact matrix of proteins. We observed that information retention is fold-class dependent: the prebiotic set sufficiently encodes the structure space of α/β and α + β folds, and to a lesser extent, of all-α and all-β folds. The prebiotic set appeared insufficient to encode the small proteins. Assessing how well the prebiotic set discriminates native vs. incorrect sequence-structure matches, we found that α/β and α + β folds exhibit more pronounced energy gaps with the prebiotic set than with nearly all alternatives.

Conclusions

The prebiotic set optimally encodes local backbone structures that appear in the folded environment and near-optimally encodes the tertiary contact matrix of extant proteins. The fold-class-specific patterns observed from our structural analysis confirm the postulated timeline of fold appearance in proteogenesis derived from proteomic sequence analyses. Polypeptides arising in a prebiotic environment will likely form α/β and α + β-like folds if any at all. We infer that the progressive expansion of the alphabet allowed the increased conformational stability and functional specificity of later folds, including all-α, all-β, and small proteins. Our results suggest that prebiotic sequences are amenable to mutations that significantly lower native conformational energies and increase discrimination amidst incorrect folds. This property may have assisted the genesis of functional proto-enzymes prior to the expansion of the full amino acid alphabet.

Haldane A, Flynn WF, He P, Levy RM. Coevolutionary Landscape of Kinase Family Proteins: Sequence Probabilities and Functional Motifs. Biophys J 2019;114:21-31. [PMID: 29320688 DOI: 10.1016/j.bpj.2017.10.028] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/20/2017] [Revised: 09/11/2017] [Accepted: 10/17/2017] [Indexed: 01/25/2023]

Cardelli C, Nerattini F, Tubiana L, Bianco V, Dellago C, Sciortino F, Coluzza I. General Methodology to Identify the Minimum Alphabet Size for Heteropolymer Design. Adv Theory Simul 2019. [DOI: 10.1002/adts.201900031] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]

Zhang W, Pei J, Lai L. Statistical Analysis and Prediction of Covalent Ligand Targeted Cysteine Residues. J Chem Inf Model 2017;57:1453-1460. [DOI: 10.1021/acs.jcim.7b00163] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 01/24/2023]