Reference Citation Analysis

For: Chandonia JM, Fox NK, Brenner SE. SCOPe: classification of large macromolecular structures in the structural classification of proteins-extended database. Nucleic Acids Res 2019;47:D475-D481. [PMID: 30500919 PMCID: PMC6323910 DOI: 10.1093/nar/gky1134] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Received: 09/15/2018] [Accepted: 11/27/2018] [Indexed: 11/12/2022]

Cited by Other Article(s)

Johnson SR, Peshwa M, Sun Z. Sensitive remote homology search by local alignment of small positional embeddings from protein language models. eLife 2024;12:RP91415. [PMID: 38488154 PMCID: PMC10942778 DOI: 10.7554/elife.91415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2024]

van Kempen M, Kim SS, Tumescheit C, Mirdita M, Lee J, Gilchrist CLM, Söding J, Steinegger M. Fast and accurate protein structure search with Foldseek. Nat Biotechnol 2024;42:243-246. [PMID: 37156916 PMCID: PMC10869269 DOI: 10.1038/s41587-023-01773-0] [Citation(s) in RCA: 267] [Impact Index Per Article: 267.0] [Received: 02/17/2022] [Accepted: 03/30/2023] [Indexed: 05/10/2023]

Romei M, Carpentier M, Chomilier J, Lecointre G. Origins and Functional Significance of Eukaryotic Protein Folds. J Mol Evol 2023;91:854-864. [PMID: 38060007 DOI: 10.1007/s00239-023-10136-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 10/03/2023] [Indexed: 12/08/2023]

Michel F, Romero‐Romero S, Höcker B. Retracing the evolution of a modern periplasmic binding protein. Protein Sci 2023;32:e4793. [PMID: 37788980 PMCID: PMC10601554 DOI: 10.1002/pro.4793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/20/2023] [Accepted: 09/22/2023] [Indexed: 10/05/2023]

Shao J, Zhang Q, Yan K, Liu B. PreHom-PCLM: protein remote homology detection by combing motifs and protein cubic language model. Brief Bioinform 2023;24:bbad347. [PMID: 37833837 DOI: 10.1093/bib/bbad347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 08/14/2023] [Accepted: 09/14/2023] [Indexed: 10/15/2023]

Al-Masri C, Trozzi F, Lin SH, Tran O, Sahni N, Patek M, Cichonska A, Ravikumar B, Rahman R. Investigating the conformational landscape of AlphaFold2-predicted protein kinase structures. Bioinformatics Advances 2023;3:vbad129. [PMID: 37786533 PMCID: PMC10541651 DOI: 10.1093/bioadv/vbad129] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/18/2023] [Revised: 07/28/2023] [Accepted: 09/13/2023] [Indexed: 10/04/2023]

Wu F, Wu L, Radev D, Xu J, Li SZ. Integration of pre-trained protein language models into geometric deep learning networks. Commun Biol 2023;6:876. [PMID: 37626165 PMCID: PMC10457366 DOI: 10.1038/s42003-023-05133-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/23/2023] [Accepted: 07/11/2023] [Indexed: 08/27/2023]

Goldtzvik Y, Sen N, Lam SD, Orengo C. Protein diversification through post-translational modifications, alternative splicing, and gene duplication. Curr Opin Struct Biol 2023;81:102640. [PMID: 37354790 DOI: 10.1016/j.sbi.2023.102640] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/13/2023] [Revised: 05/05/2023] [Accepted: 05/24/2023] [Indexed: 06/26/2023]

Li F, Yang JJ, Sun ZY, Wang L, Qi LY, A S, Liu YQ, Zhang HM, Dang LF, Wang SJ, Luo CX, Nian WF, O’Conner S, Ju LZ, Quan WP, Li XK, Wang C, Wang DP, You HL, Cheng ZK, Yan J, Tang FC, Yang DC, Xia CW, Gao G, Wang Y, Zhang BC, Zhou YH, Guo X, Xiang SH, Liu H, Peng TB, Su XD, Chen Y, Ouyang Q, Wang DH, Zhang DM, Xu ZH, Hou HW, Bai SN, Li L. Plant-on-chip: Core morphogenesis processes in the tiny plant Wolffia australiana. PNAS Nexus 2023;2:pgad141. [PMID: 37181047 PMCID: PMC10169700 DOI: 10.1093/pnasnexus/pgad141] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 04/10/2023] [Accepted: 04/17/2023] [Indexed: 05/16/2023]

Affiliation(s)

Feng Li The High School Affiliated to Renmin University of China, Beijing 100080, China Center of Quantitative Biology, Peking University, Beijing 100871, China State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Jing-Jing Yang Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
Zong-Yi Sun GrandOmics Biosciences Ltd., Wuhan 430076, China
Lei Wang Department of Biological Sciences, Mississippi State University, Mississippi State, MS 39762, USA
Le-Yao Qi The High School Affiliated to Renmin University of China, Beijing 100080, China
Sina A The High School Affiliated to Renmin University of China, Beijing 100080, China
Yi-Qun Liu College of Life Sciences, Peking University, Beijing 100871, China
Hong-Mei Zhang College of Life Sciences, Peking University, Beijing 100871, China
Lei-Fan Dang College of Life Sciences, Peking University, Beijing 100871, China
Shu-Jing Wang Center of Quantitative Biology, Peking University, Beijing 100871, China
Chun-Xiong Luo Center of Quantitative Biology, Peking University, Beijing 100871, China
Wei-Feng Nian The High School Affiliated to Renmin University of China, Beijing 100080, China
Seth O’Conner Department of Biological Sciences, Mississippi State University, Mississippi State, MS 39762, USA
Long-Zhen Ju GrandOmics Biosciences Ltd., Wuhan 430076, China
Wei-Peng Quan GrandOmics Biosciences Ltd., Wuhan 430076, China
Xiao-Kang Li GrandOmics Biosciences Ltd., Wuhan 430076, China
Chao Wang GrandOmics Biosciences Ltd., Wuhan 430076, China
De-Peng Wang GrandOmics Biosciences Ltd., Wuhan 430076, China
Han-Li You Key Laboratory of Plant Functional Genomics of the Ministry of Education, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China
Zhu-Kuan Cheng Key Laboratory of Plant Functional Genomics of the Ministry of Education, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China
Jia Yan College of Life Sciences, Peking University, Beijing 100871, China
Fu-Chou Tang College of Life Sciences, Peking University, Beijing 100871, China
De-Chang Yang State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Beijing 100871, China Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
Chu-Wei Xia State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Beijing 100871, China Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
Ge Gao State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Beijing 100871, China Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
Yan Wang Key Laboratory of Plant Functional Genomics of the Ministry of Education, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China
Bao-Cai Zhang Key Laboratory of Plant Functional Genomics of the Ministry of Education, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China
Yi-Hua Zhou Key Laboratory of Plant Functional Genomics of the Ministry of Education, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China
Xing Guo State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen 518083, China
Sun-Huan Xiang State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen 518083, China
Huan Liu State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen 518083, China
Tian-Bo Peng State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Xiao-Dong Su State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Yong Chen PASTEUR, Département de chimie, École normale supérieure, PSL University, Sorbonne Université, CNRS, 24 rue Lhomond, Paris 75005, France
Qi Ouyang Center of Quantitative Biology, Peking University, Beijing 100871, China School of Physics, Peking University, Beijing 100871, China
Dong-Hui Wang State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Da-Ming Zhang Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
Zhi-Hong Xu State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Hong-Wei Hou Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
Shu-Nong Bai Center of Quantitative Biology, Peking University, Beijing 100871, China State Key Laboratory of Protein & Plant Gene Research, Peking University, Beijing 100871, China College of Life Sciences, Peking University, Beijing 100871, China
Ling Li Department of Biological Sciences, Mississippi State University, Mississippi State, MS 39762, USA

Nawaz MS, Fournier-Viger P, He Y, Zhang Q. PSAC-PDB: Analysis and classification of protein structures. Comput Biol Med 2023;158:106814. [PMID: 36989742 DOI: 10.1016/j.compbiomed.2023.106814] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/08/2022] [Revised: 03/09/2023] [Accepted: 03/20/2023] [Indexed: 03/29/2023]

Yu ZZ, Peng CX, Liu J, Zhang B, Zhou XG, Zhang GJ. DomBpred: Protein Domain Boundary Prediction Based on Domain-Residue Clustering Using Inter-Residue Distance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:912-922. [PMID: 35594218 DOI: 10.1109/tcbb.2022.3175905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Abstract

Domain boundary prediction is one of the most important problems in the study of protein structure and function, especially for large proteins. At present, most domain boundary prediction methods have low accuracy and limitations in dealing with multi-domain proteins. In this study, we develop a sequence-based protein domain boundary prediction method named DomBpred. In DomBpred, the input sequence is first classified as either a single-domain protein or a multi-domain protein through a sequence metric based on a constructed single-domain sequence library. For a multi-domain protein, a domain-residue clustering algorithm inspired by the Ising model is proposed to cluster spatially close residues according to inter-residue distance. The unclassified residues and the residues at the edge of each cluster are then tuned using the secondary structure to form potential cut points. Finally, a domain boundary scoring function is proposed to recursively evaluate the potential cut points to generate the domain boundary. DomBpred is tested on the large-scale FUpred test set comprising 2549 proteins. Experimental results show that DomBpred performs better than the state-of-the-art methods in classifying whether protein sequences are composed of single or multiple domains, with a Matthews correlation coefficient of 0.882. Moreover, on 849 multi-domain proteins, the domain boundary distance and normalised domain overlap scores of DomBpred are 0.523 and 0.824, which are 5.0% and 4.2% higher, respectively, than those of the best comparison method. Comparison with other methods on the given test set shows that DomBpred outperforms most state-of-the-art sequence-based methods and even achieves better results than the top-level template-based method. The executable program is freely available at https://github.com/iobio-zjut/DomBpred and the online server at http://zhanglab-bioinf.com/DomBpred/.

Luo Y, Wang P, Mou M, Zheng H, Hong J, Tao L, Zhu F. A novel strategy for designing the magic shotguns for distantly related target pairs. Brief Bioinform 2023;24:6984790. [PMID: 36631399 DOI: 10.1093/bib/bbac621] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/02/2022] [Revised: 11/09/2022] [Accepted: 12/17/2022] [Indexed: 01/13/2023]

Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan S, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 2023;51:D488-D508. [PMID: 36420884 PMCID: PMC9825554 DOI: 10.1093/nar/gkac1077] [Citation(s) in RCA: 155] [Impact Index Per Article: 155.0] [Received: 09/30/2022] [Revised: 10/17/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022]

Affiliation(s)

Stephen K Burley Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Charmi Bhikadiya Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Chunxiao Bi Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Sebastian Bittrich Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Henry Chao Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Li Chen Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Paul A Craig School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, NY 14623, USA
Gregg V Crichlow Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Kenneth Dalenberg Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Jose M Duarte Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Shuchismita Dutta Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
Maryam Fayazi Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Zukang Feng Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Justin W Flatt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Sai Ganesan Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
Sutapa Ghosh Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
David S Goodsell Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
Rachel Kramer Green Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Vladimir Guranovic Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Jeremy Henry Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Brian P Hudson Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Igor Khokhriakov Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Catherine L Lawson Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Yuhe Liang Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Robert Lowe Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Ezra Peisach Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Irina Persikova Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Dennis W Piehl Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Yana Rose Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Andrej Sali Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
Joan Segura Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Monica Sekharan Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Chenghua Shao Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Brinda Vallat Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Maria Voigt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Ben Webb Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
John D Westbrook Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
Shamara Whetstone Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Jasmine Y Young Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Arthur Zalevsky Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
Christine Zardecki Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA

Paquet E, Viktor HL, Madi K, Wu J. Deformable Protein Shape Classification Based on Deep Learning, and the Fractional Fokker-Planck and Kähler-Dirac Equations. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023;45:391-407. [PMID: 35085073 DOI: 10.1109/tpami.2022.3146796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2023]

Dapkūnas J, Margelevičius M. The COMER web server for protein analysis by homology. Bioinformatics 2022;39:6909010. [PMID: 36519835 PMCID: PMC9825750 DOI: 10.1093/bioinformatics/btac807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 11/04/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022]

Ibrahim AY, Khaodeuanepheng NP, Amarasekara DL, Correia JJ, Lewis KA, Fitzkee NC, Hough LE, Whitten ST. Intrinsically disordered regions that drive phase separation form a robustly distinct protein class. J Biol Chem 2022;299:102801. [PMID: 36528065 PMCID: PMC9860499 DOI: 10.1016/j.jbc.2022.102801] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/08/2022] [Revised: 11/29/2022] [Accepted: 12/09/2022] [Indexed: 12/23/2022]

Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank: Tools for visualizing and understanding biological macromolecules in 3D. Protein Sci 2022;31:e4482. [PMID: 36281733 PMCID: PMC9667899 DOI: 10.1002/pro.4482] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/26/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 12/14/2022]

Affiliation(s)

Stephen K. Burley Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Charmi Bhikadiya Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Chunxiao Bi Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Sebastian Bittrich Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Henry Chao Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Li Chen Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Paul A. Craig School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, New York, USA
Gregg V. Crichlow Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Kenneth Dalenberg Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Jose M. Duarte Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Shuchismita Dutta Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
Maryam Fayazi Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Zukang Feng Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Justin W. Flatt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Sai J. Ganesan Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
Sutapa Ghosh Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
David S. Goodsell Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
Rachel Kramer Green Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Vladimir Guranovic Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Jeremy Henry Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Brian P. Hudson Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Igor Khokhriakov Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Catherine L. Lawson Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Yuhe Liang Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Robert Lowe Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Ezra Peisach Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Irina Persikova Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Dennis W. Piehl Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Yana Rose Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Andrej Sali Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
Joan Segura Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
Monica Sekharan Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Chenghua Shao Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Brinda Vallat Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Maria Voigt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Benjamin Webb Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
John D. Westbrook Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Shamara Whetstone Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Jasmine Y. Young Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
Arthur Zalevsky Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
Christine Zardecki Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA

González-Delgado J, Bernadó P, Neuvial P, Cortés J. Statistical proofs of the interdependence between nearest neighbor effects on polypeptide backbone conformations. J Struct Biol 2022;214:107907. [PMID: 36272694 DOI: 10.1016/j.jsb.2022.107907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 10/03/2022] [Accepted: 10/09/2022] [Indexed: 11/06/2022]

Rocafort M, Bowen JK, Hassing B, Cox MP, McGreal B, de la Rosa S, Plummer KM, Bradshaw RE, Mesarich CH. The Venturia inaequalis effector repertoire is dominated by expanded families with predicted structural similarity, but unrelated sequence, to avirulence proteins from other plant-pathogenic fungi. BMC Biol 2022;20:246. [PMID: 36329441 PMCID: PMC9632046 DOI: 10.1186/s12915-022-01442-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open

Abstract

Background

Scab, caused by the biotrophic fungus Venturia inaequalis, is the most economically important disease of apples worldwide. During infection, V. inaequalis occupies the subcuticular environment, where it secretes virulence factors, termed effectors, to promote host colonization. Consistent with other plant-pathogenic fungi, many of these effectors are expected to be non-enzymatic proteins, some of which can be recognized by corresponding host resistance proteins to activate plant defences, thus acting as avirulence determinants. To develop durable control strategies against scab, a better understanding of the roles that these effector proteins play in promoting subcuticular growth by V. inaequalis, as well as in activating, suppressing, or circumventing resistance protein-mediated defences in apple, is required.

Results

We generated the first comprehensive RNA-seq transcriptome of V. inaequalis during colonization of apple. Analysis of this transcriptome revealed five temporal waves of gene expression that peaked during early, mid, or mid-late infection. While the number of genes encoding secreted, non-enzymatic proteinaceous effector candidates (ECs) varied in each wave, most belonged to waves that peaked in expression during mid-late infection. Spectral clustering based on sequence similarity determined that the majority of ECs belonged to expanded protein families. To gain insights into function, the tertiary structures of ECs were predicted using AlphaFold2. Strikingly, despite an absence of sequence similarity, many ECs were predicted to have structural similarity to avirulence proteins from other plant-pathogenic fungi, including members of the MAX, LARS, ToxA and FOLD effector families. In addition, several other ECs, including an EC family with sequence similarity to the AvrLm6 avirulence effector from Leptosphaeria maculans, were predicted to adopt a KP6-like fold. Thus, proteins with a KP6-like fold represent another structural family of effectors shared among plant-pathogenic fungi.

Conclusions

Our study reveals the transcriptomic profile underpinning subcuticular growth by V. inaequalis and provides an enriched list of ECs that can be investigated for roles in virulence and avirulence. Furthermore, our study supports the idea that numerous sequence-unrelated effectors across plant-pathogenic fungi share common structural folds. In doing so, our study gives weight to the hypothesis that many fungal effectors evolved from ancestral genes through duplication, followed by sequence diversification, to produce sequence-unrelated but structurally similar proteins.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12915-022-01442-9.

A family of unusual immunoglobulin superfamily genes in an invertebrate histocompatibility complex. Proc Natl Acad Sci U S A 2022;119:e2207374119. [PMID: 36161920 PMCID: PMC9546547 DOI: 10.1073/pnas.2207374119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Burley SK, Berman HM, Duarte JM, Feng Z, Flatt JW, Hudson BP, Lowe R, Peisach E, Piehl DW, Rose Y, Sali A, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Young JY, Zardecki C. Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students. Biomolecules 2022;12:1425. [PMID: 36291635 PMCID: PMC9599165 DOI: 10.3390/biom12101425] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 11/18/2022] Open

Affiliation(s)

Stephen K. Burley Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Helen M. Berman Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Jose M. Duarte Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Zukang Feng Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Justin W. Flatt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Brian P. Hudson Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Robert Lowe Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Ezra Peisach Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Dennis W. Piehl Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Yana Rose Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
Andrej Sali Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
Monica Sekharan Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Chenghua Shao Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Brinda Vallat Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
Maria Voigt Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
John D. Westbrook Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
Jasmine Y. Young Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
Christine Zardecki Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA

Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, Bhowmik D, Rost B. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022;44:7112-7127. [PMID: 34232869 DOI: 10.1109/tpami.2021.3095381] [Citation(s) in RCA: 344] [Impact Index Per Article: 172.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]

Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, Bhowmik D, Rost B. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022. [PMID: 34232869 DOI: 10.1101/2020.07.12.199554] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]

Taheri-Ledari M, Zandieh A, Shariatpanahi SP, Eslahchi C. Assignment of structural domains in proteins using diffusion kernels on graphs. BMC Bioinformatics 2022;23:369. [PMID: 36076174 PMCID: PMC9461149 DOI: 10.1186/s12859-022-04902-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 08/23/2022] [Indexed: 11/10/2022] Open

Abstract

Though proposing algorithmic approaches for protein domain decomposition has been of high interest, the inherent ambiguity to the problem makes it still an active area of research. Besides, accurate automated methods are in high demand as the number of solved structures for complex proteins is on the rise. While majority of the previous efforts for decomposition of 3D structures are centered on the developing clustering algorithms, employing enhanced measures of proximity between the amino acids has remained rather uncharted. If there exists a kernel function that in its reproducing kernel Hilbert space, structural domains of proteins become well separated, then protein structures can be parsed into domains without the need to use a complex clustering algorithm. Inspired by this idea, we developed a protein domain decomposition method based on diffusion kernels on protein graphs. We examined all combinations of four graph node kernels and two clustering algorithms to investigate their capability to decompose protein structures. The proposed method is tested on five of the most commonly used benchmark datasets for protein domain assignment plus a comprehensive non-redundant dataset. The results show a competitive performance of the method utilizing one of the diffusion kernels compared to four of the best automatic methods. Our method is also able to offer alternative partitionings for the same structure which is in line with the subjective definition of protein domain. With a competitive accuracy and balanced performance for the simple and complex structures despite relying on a relatively naive criterion to choose optimal decomposition, the proposed method revealed that diffusion kernels on graphs in particular, and kernel functions in general are promising measures to facilitate parsing proteins into domains and performing different structural analysis on proteins. 
The size and interconnectedness of the protein graphs make them promising targets for diffusion kernels as measures of affinity between amino acids. The versatility of our method allows the implementation of future kernels with higher performance. The source code of the proposed method is accessible at https://github.com/taherimo/kludo . Also, the proposed method is available as a web application from https://cbph.ir/tools/kludo .

Qin X, Zhang L, Liu M, Xu Z, Liu G. ASFold-DNN: Protein Fold Recognition Based on Evolutionary Features With Variable Parameters Using Full Connected Neural Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:2712-2722. [PMID: 34133282 DOI: 10.1109/tcbb.2021.3089168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Newaz K, Piland J, Clark PL, Emrich SJ, Li J, Milenković T. Multi-layer sequential network analysis improves protein 3D structural classification. Proteins 2022;90:1721-1731. [PMID: 35441395 PMCID: PMC9356989 DOI: 10.1002/prot.26349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2021] [Revised: 03/04/2022] [Accepted: 03/30/2022] [Indexed: 11/08/2022]

Low Complexity Induces Structure in Protein Regions Predicted as Intrinsically Disordered. Biomolecules 2022;12:biom12081098. [PMID: 36008992 PMCID: PMC9405754 DOI: 10.3390/biom12081098] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 08/02/2022] [Accepted: 08/06/2022] [Indexed: 01/02/2023] Open

Kuznetsova KG, Zvonareva SS, Ziganshin R, Mekhova ES, Dgebuadze P, Yen DTH, Nguyen THT, Moshkovskii SA, Fedosov AE. Vexitoxins: conotoxin-like venom peptides from predatory gastropods of the genus Vexillum. Proc Biol Sci 2022;289:20221152. [PMID: 35946162 PMCID: PMC9363990 DOI: 10.1098/rspb.2022.1152] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open

A closed Candidatus Odinarchaeum chromosome exposes Asgard archaeal viruses. Nat Microbiol 2022;7:948-952. [PMID: 35760836 PMCID: PMC9246712 DOI: 10.1038/s41564-022-01122-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Accepted: 04/06/2022] [Indexed: 12/11/2022]

Yu CC, Raj N, Chu JW. Edge weights in a protein elastic network reorganize collective motions and render long-range sensitivity responses. J Chem Phys 2022;156:245105. [DOI: 10.1063/5.0095107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

The Structural Rule Distinguishing a Superfold: A Case Study of Ferredoxin Fold and the Reverse Ferredoxin Fold. Molecules 2022;27:molecules27113547. [PMID: 35684484 PMCID: PMC9181952 DOI: 10.3390/molecules27113547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 05/24/2022] [Accepted: 05/28/2022] [Indexed: 01/27/2023] Open

Holm L. Dali server: structural unification of protein families. Nucleic Acids Res 2022;50:W210-W215. [PMID: 35610055 PMCID: PMC9252788 DOI: 10.1093/nar/gkac387] [Citation(s) in RCA: 364] [Impact Index Per Article: 182.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 12/26/2022] Open

Linhorst A, Lübke T. The Human Ntn-Hydrolase Superfamily: Structure, Functions and Perspectives. Cells 2022;11:cells11101592. [PMID: 35626629 PMCID: PMC9140057 DOI: 10.3390/cells11101592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 05/05/2022] [Accepted: 05/06/2022] [Indexed: 01/27/2023] Open

Zheng W, Wuyun Q, Zhou X, Li Y, Freddolino PL, Zhang Y. LOMETS3: integrating deep learning and profile alignment for advanced protein template recognition and function annotation. Nucleic Acids Res 2022;50:W454-W464. [PMID: 35420129 PMCID: PMC9252734 DOI: 10.1093/nar/gkac248] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 11/25/2022] Open

Aderinwale T, Bharadwaj V, Christoffer C, Terashi G, Zhang Z, Jahandideh R, Kagaya Y, Kihara D. Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 2022;5:316. [PMID: 35383281 PMCID: PMC8983703 DOI: 10.1038/s42003-022-03261-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2021] [Accepted: 03/11/2022] [Indexed: 11/17/2022] Open

Prabantu VM, Gadiyaram V, Vishveshwara S, Srinivasan N. Understanding structural variability in proteins using protein structural networks. Curr Res Struct Biol 2022;4:134-145. [PMID: 35586857 PMCID: PMC9108755 DOI: 10.1016/j.crstbi.2022.04.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2021] [Revised: 03/01/2022] [Accepted: 04/09/2022] [Indexed: 11/13/2022] Open

Abstract

Proteins perform their function by accessing a suitable conformer from the ensemble of available conformations. The conformational diversity of a chosen protein structure can be obtained by experimental methods under different conditions. A key issue is the accurate comparison of different conformations. A gold standard used for such a comparison is the root mean square deviation (RMSD) between the two structures. While extensive refinements of RMSD evaluation at the backbone level are available, a comprehensive framework including the side chain interaction is not well understood. Here we employ protein structure network (PSN) formalism, with the non-covalent interactions of side chain, explicitly treated. The PSNs thus constructed are compared through graph spectral method, which provides a comparison at the local and at the global structural level. In this work, PSNs of multiple crystal conformers of single-chain, single-domain proteins, are subject to pair-wise analysis to examine the dissimilarity in their network topologies and in order to determine the conformational diversity of their native structures. This information is utilized to classify the structural domains of proteins into different categories. It is observed that proteins typically tend to retain structure and interactions at the backbone level. However, some of them also depict variability in either their overall structure or only in their inter-residue connectivity at the sidechain level, or both. Variability of sub-networks based on solvent accessibility and secondary structure is studied. The types of specific interactions are found to contribute differently to structure variability. An ensemble analysis by computing the mathematical variance of edge-weights across multiple conformers provided information on the contribution to overall variability from each edge of the PSN. 
Interactions that are highly variable are identified and their impact on structure variability has been discussed with the help of a case study. The classification based on the present side-chain network-based studies provides a framework to correlate the structure-function relationships in protein structures.

• Monomeric, single domain protein structures can exhibit non-rigid behaviour and be highly variable.
• The comparison of protein structural networks can better discriminate conformations with similar backbones.
• Specific interactions between solvent accessible and inaccessible residues are poorly preserved.
• Network edge-variation offers insights on which interacting residues are likely to influence their dynamics and function.
• These side-chain network-based studies provide a framework to correlate protein structure-function relationships.

Zimmermann MT. Molecular Modeling is an Enabling Approach to Complement and Enhance Channelopathy Research. Compr Physiol 2022;12:3141-3166. [PMID: 35578963 DOI: 10.1002/cphy.c190047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Langenfeld F, Aderinwale T, Christoffer C, Shin WH, Terashi G, Wang X, Kihara D, Benhabiles H, Hammoudi K, Cabani A, Windal F, Melkemi M, Otu E, Zwiggelaar R, Hunter D, Liu Y, Sirugue L, Nguyen HNH, Nguyen TDH, Nguyen-Truong VT, Le D, Nguyen HD, Tran MT, Montès M. Surface-based protein domains retrieval methods from a SHREC2021 challenge. J Mol Graph Model 2022;111:108103. [PMID: 34959149 PMCID: PMC9746607 DOI: 10.1016/j.jmgm.2021.108103] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 11/29/2021] [Accepted: 12/04/2021] [Indexed: 12/15/2022]

Affiliation(s)

Florent Langenfeld Laboratoire de Génomique, Bio-informatique et Chimie Moléculaire (GBCM), EA 7528, Conservatoire National des Arts-et-Métiers, HESAM Université, 2, rue Conté, Paris, 75003, France (corresponding author)
Tunde Aderinwale Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
Charles Christoffer Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
Woong-Hee Shin Department of Chemical Science Education, Sunchon National University, Suncheon, 57922, Republic of Korea
Genki Terashi Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
Xiao Wang Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
Daisuke Kihara Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
Halim Benhabiles Univ. Lille, CNRS, Centrale Lille, Univ. Polytechnique Hauts-de-France, Junia, UMR 8520, IEMN - Institut d’Electronique de Microélectronique et de Nanotechnologie, F-59 000, Lille, France
Karim Hammoudi Université de Haute-Alsace, Department of Computer Science, IRIMAS, F-68 100, Mulhouse, France; Université de Strasbourg, France
Adnane Cabani Normandie University, UNIROUEN, ESIGELEC, IRSEEM, 76000, Rouen, France
Feryal Windal Univ. Lille, CNRS, Centrale Lille, Univ. Polytechnique Hauts-de-France, Junia, UMR 8520, IEMN - Institut d’Electronique de Microélectronique et de Nanotechnologie, F-59 000, Lille, France
Mahmoud Melkemi Université de Haute-Alsace, Department of Computer Science, IRIMAS, F-68 100, Mulhouse, France; Université de Strasbourg, France
Ekpo Otu Department of Computer Science, Aberystwyth University, Aberystwyth, SY23 3FL, UK
Reyer Zwiggelaar Department of Computer Science, Aberystwyth University, Aberystwyth, SY23 3FL, UK
David Hunter Department of Computer Science, Aberystwyth University, Aberystwyth, SY23 3FL, UK
Yonghuai Liu Department of Computer Science, Edge Hill University, Ormskirk, L39 4QP, UK
Léa Sirugue Laboratoire de Génomique, Bio-informatique et Chimie Moléculaire (GBCM), EA 7528, Conservatoire National des Arts-et-Métiers, HESAM Université, 2, rue Conté, Paris, 75003, France
Huu-Nghia H. Nguyen University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
Tuan-Duy H. Nguyen University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
Vinh-Thuyen Nguyen-Truong University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
Danh Le University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
Hai-Dang Nguyen University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
Minh-Triet Tran University of Science, VNU-HCM, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNU-HCM, Viet Nam
Matthieu Montès Laboratoire de Génomique, Bio-informatique et Chimie Moléculaire (GBCM), EA 7528, Conservatoire National des Arts-et-Métiers, HESAM Université, 2, rue Conté, Paris, 75003, France (corresponding author)


Wang L, Zhang J, Wang D, Song C. Membrane contact probability: An essential and predictive character for the structural and functional studies of membrane proteins. PLoS Comput Biol 2022;18:e1009972. [PMID: 35353812 PMCID: PMC9000120 DOI: 10.1371/journal.pcbi.1009972] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2021] [Revised: 04/11/2022] [Accepted: 02/25/2022] [Indexed: 11/20/2022] Open

Abstract

One of the unique traits of membrane proteins is that a significant fraction of their hydrophobic amino acids is exposed to the hydrophobic core of lipid bilayers rather than being embedded in the protein interior, which is often not explicitly considered in the protein structure and function predictions. Here, we propose a characteristic and predictive quantity, the membrane contact probability (MCP), to describe the likelihood of the amino acids of a given sequence being in direct contact with the acyl chains of lipid molecules. We show that MCP is complementary to solvent accessibility in characterizing the outer surface of membrane proteins, and it can be predicted for any given sequence with a machine learning-based method by utilizing a training dataset extracted from MemProtMD, a database generated from molecular dynamics simulations for the membrane proteins with a known structure. As the first of many potential applications, we demonstrate that MCP can be used to systematically improve the prediction precision of the protein contact maps and structures.

The distribution of residues on protein surfaces is largely determined by the surrounding environment. For soluble proteins, most of the residues on the outer surface are hydrophilic, and the quantity “solvent accessibility” is used to describe and predict these surface residues. In contrast, for membrane proteins that are embedded in a lipid bilayer, many of the surface residues are hydrophobic and membrane-contacting, but there is not yet a widely accepted quantity for describing or predicting this characteristic property. Here, we propose a new quantity termed “membrane contact probability (MCP)”, which can be used to describe and predict the membrane-contacting surface residues of proteins. We also propose a machine learning-based method to predict MCP from protein sequences, utilizing the dataset generated by physics-based computer simulations. We demonstrate that a quantity such as MCP is helpful for protein structure prediction, and we believe that it will find broad applications in the structure and function studies of membrane proteins.


Mining folded proteomes in the era of accurate structure prediction. PLoS Comput Biol 2022;18:e1009930. [PMID: 35333855 PMCID: PMC8986115 DOI: 10.1371/journal.pcbi.1009930] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Revised: 04/06/2022] [Accepted: 02/16/2022] [Indexed: 01/02/2023] Open

Abstract

Protein structure fundamentally underpins the function and processes of numerous biological systems. Fold recognition algorithms offer a sensitive and robust tool to detect structural, and thereby functional, similarities between distantly related homologs. In the era of accurate structure prediction owing to advances in machine learning techniques and a wealth of experimentally determined structures, previously curated sequence databases have become a rich source of biological information. Here, we use bioinformatic fold recognition algorithms to scan the entire AlphaFold structure database to identify novel protein family members, infer function and group predicted protein structures. As an example of the utility of this approach, we identify novel, previously unknown members of various pore-forming protein families, including MACPFs, GSDMs and aerolysin-like proteins.

Virtually every cellular process in all organisms on Earth is driven by molecular nano-machines known as proteins. The diverse functions of proteins are the result of the unique three-dimensional shape adopted by a given protein molecule. It is therefore important to determine the shape of a given protein, which unlike DNA and our genes, cannot be known from its sequence alone. Since two proteins with similar shapes typically have a similar function, knowing a protein shape provides crucial clues about its function. By virtue of decades of experimental work and advances in artificial intelligence, this complex shape can now be computationally predicted for any protein whose composition is known. Scientists have used these and other methods to produce enormous libraries of protein shapes consisting of nearly a million unique entries. However, these libraries are too large and too complex for researchers to ‘read’. We use shape-comparison algorithms to carefully check these shape-libraries to gain insight into the potential function and biological role of previously unknown proteins. Furthermore, we identified new members of protein families using this technique. We show that shape-matching algorithms and computationally generated shape-libraries can be used effectively together to yield new insights and expedite scientific endeavours.


Pražnikar J, Attygalle NT. Quantitative analysis of visual codewords of a protein distance matrix. PLoS One 2022;17:e0263566. [PMID: 35120181 PMCID: PMC8815937 DOI: 10.1371/journal.pone.0263566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2021] [Accepted: 01/24/2022] [Indexed: 12/02/2022] Open

Jin X, Luo X, Liu B. PHR-search: a search framework for protein remote homology detection based on the predicted protein hierarchical relationships. Brief Bioinform 2022;23:6520306. [PMID: 35134113 DOI: 10.1093/bib/bbab609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Revised: 12/14/2021] [Accepted: 12/30/2021] [Indexed: 11/13/2022] Open

Bhattacharya N, Thomas N, Rao R, Dauparas J, Koo PK, Baker D, Song YS, Ovchinnikov S. Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention. Pac Symp Biocomput 2022;27:34-45. [PMID: 34890134 PMCID: PMC8752338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]

Kaushik R, Zhang KYJ. ProFitFun: a protein tertiary structure fitness function for quantifying the accuracies of model structures. Bioinformatics 2022;38:369-376. [PMID: 34542606 DOI: 10.1093/bioinformatics/btab666] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2021] [Revised: 09/06/2021] [Accepted: 09/16/2021] [Indexed: 02/03/2023] Open

Abstract

MOTIVATION

Accurate estimation of the quality of protein model structures is a cornerstone of protein structure prediction. Despite the recent groundbreaking success in the field of protein structure prediction, there is room to improve model quality estimation at multiple stages of the prediction process and thus to further push prediction accuracy. Here, a novel approach, named ProFitFun, for assessing the quality of protein models is proposed by harnessing the sequence and structural features of experimental protein structures in terms of the preferences of backbone dihedral angles and relative surface accessibility of their amino acid residues at the tripeptide level. The proposed approach leverages the backbone dihedral angle and surface accessibility preferences of each residue by accounting for its N-terminal and C-terminal neighbors in the protein structure. These preferences are used to evaluate protein structures through a machine learning approach and are tested on an extensive dataset of diverse proteins.

RESULTS

The approach was extensively validated on a large test dataset (n = 25 005) of protein structures, comprising 23 661 models of 82 non-homologous proteins and 1344 non-homologous experimental structures. In addition, an external dataset of 40 000 models of 200 non-homologous proteins was also used for the validation of the proposed method. Both datasets were further used for benchmarking the proposed method with four different state-of-the-art methods for protein structure quality assessment. In the benchmarking, the proposed method outperformed some state-of-the-art methods in terms of Spearman's and Pearson's correlation coefficients, average GDT-TS loss, sum of z-scores and average absolute difference of predictions over corresponding observed values. The high accuracy of the proposed approach promises a potential use of the sequence and structural features in computational protein design.

AVAILABILITY AND IMPLEMENTATION

http://github.com/KYZ-LSB/ProTerS-FitFun.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


Kumar G, Srinivasan N, Sandhya S. Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection. Methods Mol Biol 2022;2449:149-167. [PMID: 35507261 DOI: 10.1007/978-1-0716-2095-3_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Guzzo A, Delarue P, Rojas A, Nicolaï A, Maisuradze GG, Senet P. Missense Mutations Modify the Conformational Ensemble of the α-Synuclein Monomer Which Exhibits a Two-Phase Characteristic. Front Mol Biosci 2021;8:786123. [PMID: 34912851 PMCID: PMC8667727 DOI: 10.3389/fmolb.2021.786123] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Accepted: 10/25/2021] [Indexed: 12/15/2022] Open

Abstract

α-Synuclein is an intrinsically disordered protein occurring in different conformations and prone to aggregate in β-sheet structures, which are the hallmark of Parkinson's disease. Missense mutations are associated with familial forms of this neuropathy. How these single amino-acid substitutions modify the conformations of wild-type α-synuclein is unclear. Here, using coarse-grained molecular dynamics simulations, we sampled the conformational space of the wild type and mutants (A30P, A53T, and E46K) of α-synuclein monomers for an effective time scale of 29.7 ms. To characterize the structures, we developed an algorithm, CUTABI (CUrvature and Torsion based of Alpha-helix and Beta-sheet Identification), to identify residues in the α-helix and β-sheet from C^α-coordinates. CUTABI was built from the results of the analysis of 14,652 selected protein structures using the Dictionary of Secondary Structure of Proteins (DSSP) algorithm. DSSP results are reproduced with 93% success at a 10-fold lower computational cost. A two-dimensional probability density map of α-synuclein as a function of the number of residues in the α-helix and β-sheet is computed for wild-type and mutated proteins from molecular dynamics trajectories. The density of conformational states reveals a two-phase characteristic with a homogeneous phase (state B, β-sheets) and a heterogeneous phase (state HB, mixture of α-helices and β-sheets). The B state represents 40% of the conformations for the wild-type, A30P, and E46K and only 25% for A53T. The density of conformational states of the B state for A53T and A30P mutants differs from the wild-type one. In addition, the mutant A53T has a larger propensity to form helices than the others. These findings indicate that the equilibrium between the different conformations of the α-synuclein monomer is modified by the missense mutations in a subtle way.
The α-helix and β-sheet contents are promising order parameters for intrinsically disordered proteins, whereas other structural properties such as average gyration radius, R_g, or probability distribution of R_g cannot discriminate significantly the conformational ensembles of the wild type and mutants. When separated in states B and HB, the distributions of R_g are more significantly different, indicating that global structural parameters alone are insufficient to characterize the conformational ensembles of the α-synuclein monomer.
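The two-dimensional probability map described in the abstract above, indexed by the number of residues in α-helix and in β-sheet, can be illustrated with a small sketch. Everything here is hypothetical rather than taken from the paper: the function names, the toy frame counts, and in particular the rule used to assign frames to states (B taken as sheet with no helix, HB as both present) are assumptions of this example, and the secondary-structure counts are assumed to come from some external assignment step.

```python
from collections import Counter


def state_probability_map(frames):
    """Estimate P(n_helix, n_sheet) from per-frame residue counts.

    `frames` is an iterable of (n_helix, n_sheet) tuples, one per
    trajectory frame. Returns a dict mapping each observed pair to its
    relative frequency.
    """
    counts = Counter(frames)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one frame is required")
    return {state: n / total for state, n in counts.items()}


def phase_fractions(prob_map):
    """Sum probabilities into B, HB and other states.

    Assumed rule for this sketch: B = sheet residues but no helix
    residues; HB = both helix and sheet residues present.
    """
    fractions = {"B": 0.0, "HB": 0.0, "other": 0.0}
    for (n_helix, n_sheet), p in prob_map.items():
        if n_sheet > 0 and n_helix == 0:
            fractions["B"] += p
        elif n_sheet > 0 and n_helix > 0:
            fractions["HB"] += p
        else:
            fractions["other"] += p
    return fractions


# Hypothetical per-frame (n_helix, n_sheet) counts for a short trajectory.
toy_frames = [(0, 12), (0, 12), (3, 8), (0, 10), (5, 0)]

if __name__ == "__main__":
    pmap = state_probability_map(toy_frames)
    print(pmap)
    print(phase_fractions(pmap))
```

Comparing such maps, or the resulting state fractions, between wild type and mutants is the kind of analysis the abstract refers to when it reports the share of conformations in state B.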


Machat M, Langenfeld F, Craciun D, Sirugue L, Labib T, Lagarde N, Maria M, Montes M. Comparative evaluation of shape retrieval methods on macromolecular surfaces: an application of computer vision methods in structural bioinformatics. Bioinformatics 2021;37:4375-4382. [PMID: 34247232 PMCID: PMC8652110 DOI: 10.1093/bioinformatics/btab511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2020] [Revised: 05/18/2021] [Accepted: 07/08/2021] [Indexed: 11/24/2022] Open

Chandonia JM, Guan L, Lin S, Yu C, Fox NK, Brenner SE. SCOPe: improvements to the structural classification of proteins - extended database to facilitate variant interpretation and machine learning. Nucleic Acids Res 2021;50:D553-D559. [PMID: 34850923 PMCID: PMC8728185 DOI: 10.1093/nar/gkab1054] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2021] [Revised: 10/14/2021] [Accepted: 11/30/2021] [Indexed: 11/14/2022] Open

Hou M, Peng C, Zhou X, Zhang B, Zhang G. Multi contact-based folding method for de novo protein structure prediction. Brief Bioinform 2021;23:6445108. [PMID: 34849573 DOI: 10.1093/bib/bbab463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2021] [Revised: 09/21/2021] [Accepted: 10/10/2021] [Indexed: 11/12/2022] Open

Li M, Hu J, Wang Y, Li Y, Zhang L, Liu Z. Challenging Reverse Screening: A Benchmark Study for Comprehensive Evaluation. Mol Inform 2021;41:e2100063. [PMID: 34787366 DOI: 10.1002/minf.202100063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2021] [Accepted: 10/15/2021] [Indexed: 11/08/2022]