1
Gupta A, Mardi P, Mishra PKK, Kumar A, Kumar R, Mahapatra A, Jena A, Behera PC. Evaluation of supplemented protein-L-isoaspartate-O-methyltransferase (PIMT) gene of Carica papaya and Ricinus communis in stress survival of Escherichia coli BL21(DE3) cells. Prep Biochem Biotechnol 2024; 54:882-895. [PMID: 38170207 DOI: 10.1080/10826068.2023.2297692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
In growing plant populations, stress is a disruptive factor that affects physiology, biochemistry, developmental growth, and yield. Protein-L-isoaspartate-O-methyltransferase (PIMT) is a broadly distributed protein repair enzyme that is activated under stressful conditions or during aging. Stress can mediate damage by converting protein-bound aspartate (Asp) residues to isoaspartate (iso-Asp). This spontaneous and deleterious conversion occurs under elevated stress and during aging. Iso-Asp formation is associated with protein inactivation and compromised cellular survival. PIMT can convert iso-Asp back to Asp, thereby repairing proteins and contributing to cellular survival. The present work describes the isolation, cloning, sequencing and expression of the PIMT genes of Carica papaya (Cp pimt) and Ricinus communis (Rc pimt). Using gene-specific primers, both pimt genes were amplified from their respective cDNAs and subsequently cloned into the prokaryotic expression vector pProEXHTa. E. coli BL21(DE3) cells were used as the expression host. The expression kinetics of both PIMTs were studied at various IPTG concentrations and at different time points. Finally, the PIMT-supplemented BL21(DE3) cells were evaluated against different stresses in comparison with their counterparts carrying the empty vector control.
Affiliation(s)
- Akanksha Gupta
  - Plant Biotechnology, Department of Genetics and Plant Breeding, Banaras Hindu University, Mirzapur, India
- Pragati Mardi
  - Plant Biotechnology, Department of Genetics and Plant Breeding, Banaras Hindu University, Mirzapur, India
- Prasanta Kumar Koustasa Mishra
  - Unit of Teaching Veterinary Clinical Complex, Faculty of Veterinary and Animal Sciences, Banaras Hindu University, Mirzapur, India
- Anshuman Kumar
  - Department of Animal Genetics and Breeding, Faculty of Veterinary and Animal Sciences, Banaras Hindu University, Mirzapur, India
- Rajesh Kumar
  - Plant Biotechnology, Department of Genetics and Plant Breeding, Banaras Hindu University, Mirzapur, India
- Archana Mahapatra
  - Department of Veterinary Anatomy, Faculty of Veterinary and Animal Sciences, Banaras Hindu University, Mirzapur, India
- Anupama Jena
  - Fisheries and Animal Resource Development Department, Bhubaneswar, India
- Prakash Chandra Behera
  - Department of Veterinary Biochemistry, College of Veterinary Science and Animal Husbandry, OUAT, Bhubaneshwar, India
2
Selvavinayagam ST, Sankar S, Yong YK, Murugesan A, Suvaithenamudhan S, Hemashree K, Rajeshkumar M, Kumaresan A, Pandey RP, Shanmugam S, Arthydevi P, Kumar MS, Gopalan N, Kannan M, Cheedarla N, Tan HY, Zhang Y, Larsson M, Balakrishnan P, Velu V, Byrareddy SN, Shankar EM, Raju S. Emergence of SARS-CoV-2 omicron variant JN.1 in Tamil Nadu, India - Clinical characteristics and novel mutations. Sci Rep 2024; 14:17476. [PMID: 39080396 PMCID: PMC11289243 DOI: 10.1038/s41598-024-68678-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 07/26/2024] [Indexed: 08/02/2024] [Open Access]
Abstract
In December 2023, we observed a notable shift in the COVID-19 landscape, when JN.1 omicron emerged as the predominant SARS-CoV-2 variant with a 95% incidence. We characterized the clinical profile, and genetic changes in JN.1, an emerging SARS-CoV-2 variant of interest. Whole genome sequencing was performed on SARS-CoV-2 positive clinical specimens, followed by sequence analysis. Mutations within the spike protein sequences were analysed and compared with the previously reported lineages and sub-lineages, to identify the potential impact of the unique mutations on protein structure and possible alterations in the functionality. Several unique and dynamic mutations were identified herein. Molecular docking analysis showed changes in the binding affinity, and key interacting residues of wild-type and mutated structures with key host cell receptors of SARS-CoV-2 entry viz., ACE2, CD147, CD209L and AXL. Our data provides key insights on the emergence of newer variants and highlights the necessity for robust and sustained global genomic surveillance of SARS-CoV-2.
Affiliation(s)
- Sivaprakasam T Selvavinayagam
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Sathish Sankar
  - Department of Microbiology, Centre for Infectious Diseases, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, 600 077, India
- Yean K Yong
  - Laboratory Center, Xiamen University Malaysia, 43900, Sepang, Selangor, Malaysia
  - Kelip-kelip! Center of Excellence for Light Enabling Technologies, Xiamen University Malaysia, 43900, Sepang, Selangor, Malaysia
- Amudhan Murugesan
  - Department of Microbiology, Government Theni Medical College and Hospital, Theni, 625 512, India
- Suvaiyarasan Suvaithenamudhan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620 024, India
- Kannan Hemashree
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Manivannan Rajeshkumar
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Anandhazhvar Kumaresan
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Ramendra P Pandey
  - School of Health Sciences and Technology, UPES, Dehradun, Uttarakhand, 248 007, India
- Saravanan Shanmugam
  - Center for Infectious Diseases, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, 602 105, India
- Parthiban Arthydevi
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Masilamani Senthil Kumar
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
- Natarajan Gopalan
  - Department of Epidemiology and Public Health, Central University of Tamil Nadu, Thiruvarur, 610 005, India
- Meganathan Kannan
  - Blood and Vascular Biology, Department of Biotechnology, Central University of Tamil Nadu, Thiruvarur, 610 005, India
- Narayanaiah Cheedarla
  - Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Division of Microbiology and Immunology, Emory National Primate Research Center, Emory Vaccine Center, Atlanta, GA, 30329, USA
- Hong Y Tan
  - School of Traditional Chinese Medicine, Xiamen University Malaysia, 43900, Sepang, Selangor, Malaysia
- Ying Zhang
  - Kelip-kelip! Center of Excellence for Light Enabling Technologies, Xiamen University Malaysia, 43900, Sepang, Selangor, Malaysia
  - Chemical Engineering, Xiamen University Malaysia, 43900, Sepang, Malaysia
- Marie Larsson
  - Division of Molecular Medicine and Virology, Department of Biomedical and Clinical Sciences, Linköping University, 58 185, Linköping, Sweden
- Pachamuthu Balakrishnan
  - Department of Research, Meenakshi Academy of Higher Education and Research (MAHER), Chennai, 600 078, India
- Vijayakumar Velu
  - Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Division of Microbiology and Immunology, Emory National Primate Research Center, Emory Vaccine Center, Atlanta, GA, 30329, USA
- Siddappa N Byrareddy
  - Department of Pharmacology and Experimental Neuroscience, University of Nebraska Medical Center, Omaha, NE, 68131, USA
- Esaki M Shankar
  - Infection and Inflammation, Department of Biotechnology, Central University of Tamil Nadu, Thiruvarur, 610 005, India
- Sivadoss Raju
  - State Public Health Laboratory, Directorate of Public Health and Preventive Medicine, DMS Campus, Teynampet, Chennai, Tamil Nadu, 600 006, India
3
Sanchez-Fernandez A, Poon JF, Leung AE, Prévost SF, Dicko C. Stabilization of Non-Native Folds and Programmable Protein Gelation in Compositionally Designed Deep Eutectic Solvents. ACS NANO 2024; 18:18314-18326. [PMID: 38949563 PMCID: PMC11256765 DOI: 10.1021/acsnano.4c01950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/02/2024]
Abstract
Proteins are adjustable units from which biomaterials with designed properties can be developed. However, non-native folded states with controlled topologies are hardly accessible in aqueous environments, limiting their prospects as building blocks. Here, we demonstrate the ability of a series of anhydrous deep eutectic solvents (DESs) to precisely control the conformational landscape of proteins. We reveal that systematic variations in the chemical composition of binary and ternary DESs dictate the stabilization of a wide range of conformations, that is, compact globular folds, intermediate folding states, or unfolded chains, as well as controlling their collective behavior. Besides, different conformational states can be visited by simply adjusting the composition of ternary DESs, allowing for the refolding of unfolded states and vice versa. Notably, we show that these intermediates can trigger the formation of supramolecular gels, also known as eutectogels, where their mechanical properties correlate to the folding state of the protein. Given the inherent vulnerability of proteins outside the native fold in aqueous environments, our findings highlight DESs as tailorable solvents capable of stabilizing various non-native conformations on demand through solvent design.
Affiliation(s)
- Adrian Sanchez-Fernandez
  - Center for Research in Biological Chemistry and Molecular Materials (CiQUS), Department of Chemical Engineering, Universidade de Santiago de Compostela, Santiago de Compostela 15705, Spain
- Jia-Fei Poon
  - European Spallation Source, Lund University, Lund SE-22100, Sweden
- Cedric Dicko
  - Pure and Applied Biochemistry, Department of Chemistry, Lund University, Lund SE-22100, Sweden
  - Lund Institute of Advanced Neutron and X-ray Science, Lund SE-22370, Sweden
4
de Crécy-Lagard V, Dias R, Friedberg I, Yuan Y, Swairjo MA. Limitations of Current Machine-Learning Models in Predicting Enzymatic Functions for Uncharacterized Proteins. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.01.601547. [PMID: 39005379 PMCID: PMC11244979 DOI: 10.1101/2024.07.01.601547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Thirty to seventy percent of proteins in any given genome have no assigned function and have been labeled as the protein "unknome". This large knowledge gap prevents the biological community from fully leveraging the plethora of genomic data that is now available. Machine-learning approaches are showing some promise in propagating functional knowledge from experimentally characterized proteins to the correct set of isofunctional orthologs. However, they largely fail to predict enzymatic functions unseen in the training set, as shown by dissecting the predictions made for 450 enzymes of unknown function from the model bacterium Escherichia coli using the DeepECTransformer platform. Lessons from these failures can help the community develop machine-learning methods that assist domain experts in making testable functional predictions for more members of the uncharacterized proteome.
5
Ahmed MH, Samia NSN, Singh G, Gupta V, Mishal MFM, Hossain A, Suman KH, Raza A, Dutta AK, Labony MA, Sultana J, Faysal EH, Alnasser SM, Alam P, Azam F. An immuno-informatics approach for annotation of hypothetical proteins and multi-epitope vaccine designed against the Mpox virus. J Biomol Struct Dyn 2024; 42:5288-5307. [PMID: 37519185 DOI: 10.1080/07391102.2023.2239921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Accepted: 06/09/2023] [Indexed: 08/01/2023]
Abstract
A worrying new outbreak of Monkeypox (Mpox) in humans is caused by the Mpox virus (MpoxV). The pathogen has roughly 28 hypothetical proteins of unknown structure, function, and pathogenicity. Using reliable bioinformatics tools, we attempted to analyze the MpoxV genome, identify the role of hypothetical proteins (HPs), and design a potential candidate vaccine. Out of 28, we identified seven hypothetical proteins using multi-server validation with high confidence for the occurrence of conserved domains. Their physical, chemical, and functional characterizations, including molecular weight, theoretical isoelectric point, 3D structures, GRAVY value, subcellular localization, functional motifs, antigenicity, and virulence factors, were performed. We predicted possible cytotoxic T cell (CTL), helper T cell (HTL), and linear and conformational B cell epitopes, which were combined in a 219 amino acid multi-epitope vaccine with human β defensin as a linker. This multi-epitope vaccine was structurally modelled and docked with toll-like receptor-3 (TLR-3). The vaccine-TLR-3 docked complexes exhibited stable interactions based on RMSD and RMSF analyses. Additionally, the modelled vaccine was cloned in silico in an E. coli host to check the appropriate expression of the final vaccine construct. Our results might correspond to an immunogenic and safe vaccine, which would require further experimental validation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Md Hridoy Ahmed
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Nure Sharaf Nower Samia
  - Department of Life Sciences (DLS), School of Environment and Life Sciences (SELS), Independent University, Dhaka, Bangladesh
- Gagandeep Singh
  - Kusuma School of Biological Sciences, Indian Institute of Technology, New Delhi, India
  - Section of Microbiology, Central Ayurveda Research Institute, Jhansi CCRAS, Ministry of Ayush, India
- Vandana Gupta
  - Department of Microbiology, Ram Lal Anand College, University of Delhi, New Delhi, India
- Alomgir Hossain
  - Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, Bangladesh
- Adnan Raza
  - Bioscience department, COMSATS University of Islamabad, Islamabad, Pakistan
- Amit Kumar Dutta
  - Department of Microbiology, University of Rajshahi, Rajshahi, Bangladesh
- Moriom Akhter Labony
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Jakia Sultana
  - Department of Botany, University of Rajshahi, Rajshahi, Bangladesh
- Sulaiman Mohammed Alnasser
  - Department of Pharmacology and Toxicology, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
- Prawez Alam
  - Department of Pharmacognosy, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Al Kharj, Saudi Arabia
- Faizul Azam
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
6
Zhang Y, Ma F, Chen J, Chen Y, Xu L, Li A, Liu Y, Ma R, Shi L. Controlled Refolding of Denatured IL-12 Using In Situ Antigen-Capturing Nanochaperone Remarkably Reduces the Systemic Toxicity and Enhances Cancer Immunotherapy. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2309927. [PMID: 38387609 DOI: 10.1002/adma.202309927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 01/27/2024] [Indexed: 02/24/2024]
Abstract
Cytokines are powerful agents in cancer immunotherapy; however, their therapeutic potential is limited by severe systemic toxicity. Here, a potent strategy is described to reduce the toxicity of systemic cytokine therapy by delivering the cytokine in its denatured form using a finely designed nanochaperone. It is demonstrated that even if the denatured protein cargos are occasionally released under normal physiological conditions, they remain misfolded, whereas in the tumor microenvironment they can effectively refold into their native states and be released to function. Consequently, the systemic toxicity of cytokines is nearly completely overcome. Moreover, an immunogenic cell death (ICD)-inducing chemotherapeutic is further loaded and delivered to the tumor using this nanochaperone to trigger the release of tumor-associated antigens (TAAs) that are subsequently captured in situ by the nanochaperone and then reflow into lymph nodes (LNs) to promote antigen cross-presentation. This optimized personalized nanochaperone-vaccine demonstrates unprecedented suppressive effects against large, advanced tumors, and in combination with immune checkpoint blockade (ICB) therapy results in a significant abscopal effect and inhibition of postoperative tumor recurrence and metastasis. Hence, this approach provides a simple and universal delivery strategy to reduce the systemic toxicities of cytokines, as well as a robust personalized cancer vaccination platform, which may find wide applications in cancer immunotherapy.
Affiliation(s)
- Yongxin Zhang
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Feihe Ma
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Jiajing Chen
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Yujie Chen
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Linlin Xu
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Ang Li
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Yang Liu
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Rujiang Ma
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
- Linqi Shi
  - Key Laboratory of Functional Polymer Materials of Ministry of Education, Institute of Polymer Chemistry, State Key Laboratory of Medicinal Chemical Biology, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin, 300071, P. R. China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin, 300192, P. R. China
7
Selvavinayagam ST, Sankar S, Yong YK, Murugesan A, Suvaithenamudhan S, Hemashree K, Rajeshkumar M, Kumaresan A, Pandey RP, Shanmugam S, Arthydevi P, Kumar MS, Gopalan N, Kannan M, Cheedarla N, Tan HY, Zhang Y, Larsson M, Balakrishnan P, Velu V, Byrareddy SN, Shankar EM, Raju S. Emergence of SARS-CoV-2 Omicron Variant JN.1 in Tamil Nadu, India - Clinical Characteristics and Novel Mutations. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.16.24305882. [PMID: 38699322 PMCID: PMC11065016 DOI: 10.1101/2024.04.16.24305882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
In December 2023, we observed a notable shift in the COVID-19 landscape, when JN.1 emerged as the predominant SARS-CoV-2 variant with a 95% incidence. We characterized the clinical profile and genetic changes of JN.1, an emerging SARS-CoV-2 variant of interest. Whole genome sequencing was performed on SARS-CoV-2 positive samples, followed by sequence analysis. Mutations within the spike protein sequences were analyzed and compared with the previous lineages and sublineages of SARS-CoV-2, to identify the potential impact of these unique mutations on protein structure and possible functionality. Several unique and dynamic mutations were identified herein. Our data provide key insights into the emergence of newer variants of SARS-CoV-2 in our region and highlight the need for robust and sustained genomic surveillance of SARS-CoV-2.
8
Yuan Q, Tian C, Yang Y. Genome-scale annotation of protein binding sites via language model and geometric deep learning. eLife 2024; 13:RP93695. [PMID: 38630609 PMCID: PMC11023698 DOI: 10.7554/elife.93695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024] [Open Access]
Abstract
Revealing protein binding sites with other molecules, such as nucleic acids, peptides, or small ligands, sheds light on disease mechanism elucidation and novel drug design. With the explosive growth of proteins in sequence databases, how to accurately and efficiently identify these binding sites from sequences becomes essential. However, current methods mostly rely on expensive multiple sequence alignments or experimental protein structures, limiting their genome-scale applications. Besides, these methods haven't fully explored the geometry of the protein structures. Here, we propose GPSite, a multi-task network for simultaneously predicting binding residues of DNA, RNA, peptide, protein, ATP, HEM, and metal ions on proteins. GPSite was trained on informative sequence embeddings and predicted structures from protein language models, while comprehensively extracting residual and relational geometric contexts in an end-to-end manner. Experiments demonstrate that GPSite substantially surpasses state-of-the-art sequence-based and structure-based approaches on various benchmark datasets, even when the structures are not well-predicted. The low computational cost of GPSite enables rapid genome-scale binding residue annotations for over 568,000 sequences, providing opportunities to unveil unexplored associations of binding sites with molecular functions, biological processes, and genetic variants. The GPSite webserver and annotation database can be freely accessed at https://bio-web1.nscc-gz.cn/app/GPSite.
Affiliation(s)
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Chong Tian
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
9
Malatesta M, Fornasier E, Di Salvo ML, Tramonti A, Zangelmi E, Peracchi A, Secchi A, Polverini E, Giachin G, Battistutta R, Contestabile R, Percudani R. One substrate many enzymes virtual screening uncovers missing genes of carnitine biosynthesis in human and mouse. Nat Commun 2024; 15:3199. [PMID: 38615009 PMCID: PMC11016064 DOI: 10.1038/s41467-024-47466-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 03/26/2024] [Indexed: 04/15/2024] [Open Access]
Abstract
The increasing availability of experimental and computational protein structures entices their use for function prediction. Here we develop an automated procedure to identify enzymes involved in metabolic reactions by assessing substrate conformations docked to a library of protein structures. By screening AlphaFold-modeled vitamin B6-dependent enzymes, we find that a metric based on catalytically favorable conformations at the enzyme active site performs best (AUROC Score=0.84) in identifying genes associated with known reactions. Applying this procedure, we identify the mammalian gene encoding hydroxytrimethyllysine aldolase (HTMLA), the second enzyme of carnitine biosynthesis. Upon experimental validation, we find that the top-ranked candidates, serine hydroxymethyl transferase (SHMT) 1 and 2, catalyze the HTMLA reaction. However, a mouse protein absent in humans (threonine aldolase; Tha1) catalyzes the reaction more efficiently. Tha1 did not rank highest based on the AlphaFold model, but its rank improved to second place using the experimental crystal structure we determined at 2.26 Å resolution. Our findings suggest that humans have lost a gene involved in carnitine biosynthesis, with HTMLA activity of SHMT partially compensating for its function.
Affiliation(s)
- Marco Malatesta
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Martino Luigi Di Salvo
  - Istituto Pasteur Italia-Fondazione Cenci Bolognetti and Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome, Italy
- Angela Tramonti
  - Institute of Molecular Biology and Pathology, Italian National Research Council, Rome, Italy
- Erika Zangelmi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Alessio Peracchi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Andrea Secchi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Eugenia Polverini
  - Department of Mathematical, Physical and Computer Sciences, University of Parma, Parma, Italy
- Gabriele Giachin
  - Department of Chemical Sciences, University of Padua, Padova, Italy
- Roberto Contestabile
  - Istituto Pasteur Italia-Fondazione Cenci Bolognetti and Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome, Italy
- Riccardo Percudani
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
10
Kumar N, Srivastava R. Deep learning in structural bioinformatics: current applications and future perspectives. Brief Bioinform 2024; 25:bbae042. [PMID: 38701422 PMCID: PMC11066934 DOI: 10.1093/bib/bbae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 01/05/2024] [Accepted: 01/18/2024] [Indexed: 05/05/2024] [Open Access]
Abstract
In this review article, we explore the transformative impact of deep learning (DL) on structural bioinformatics, emphasizing its pivotal role in a scientific revolution driven by extensive data, accessible toolkits and robust computing resources. As big data continue to advance, DL is poised to become an integral component in healthcare and biology, revolutionizing analytical processes. Our comprehensive review provides detailed insights into DL, featuring specific demonstrations of its notable applications in bioinformatics. We address challenges tailored for DL, spotlight recent successes in structural bioinformatics and present a clear exposition of DL, from basic shallow neural networks to advanced models such as convolutional, recurrent, artificial and transformer neural networks. This paper discusses the emerging use of DL for understanding biomolecular structures, anticipating ongoing developments and applications in the realm of structural bioinformatics.
Affiliation(s)
- Niranjan Kumar
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Rakesh Srivastava
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
11
Kutnowski N, Ghanim GE, Lee Y, Rio DC. Activity of zebrafish THAP9 transposase and zebrafish P element-like transposons. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.22.586318. [PMID: 38562726 PMCID: PMC10983969 DOI: 10.1101/2024.03.22.586318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Transposable elements are mobile DNA segments that are found ubiquitously across the three domains of life. One family of transposons, called P elements, was discovered in the fruit fly Drosophila melanogaster. Since their discovery, P element transposase-homologous genes (called THAP-domain containing 9 or THAP9) have been discovered in other animal genomes. Here, we show that the zebrafish (Danio rerio) genome contains both an active THAP9 transposase (zfTHAP9) and mobile P-like transposable elements (called Pdre). zfTHAP9 transposase can excise one of its own elements (Pdre2) and Drosophila P elements. Drosophila P element transposase (DmTNP) is also able to excise the zebrafish Pdre2 element, even though it is distinct from the Drosophila P element. However, zfTHAP9 cannot transpose Pdre2 or Drosophila P elements, indicating partial transposase activity. Characterization of the N-terminal THAP DNA binding domain of zfTHAP9 shows DNA binding site preferences distinct from those of DmTNP, and mutation of zfTHAP9, based on known mutations in DmTNP, generated a hyperactive protein. These results define an active vertebrate THAP9 transposase that can act on the endogenous zebrafish Pdre and Drosophila P elements.
Affiliation(s)
- Nitzan Kutnowski
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
- George E Ghanim
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
- Yeon Lee
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
- Donald C Rio
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
| |
|
12
|
Blake KS, Kumar H, Loganathan A, Williford EE, Diorio-Toth L, Xue YP, Tang WK, Campbell TP, Chong DD, Angtuaco S, Wencewicz TA, Tolia NH, Dantas G. Sequence-structure-function characterization of the emerging tetracycline destructase family of antibiotic resistance enzymes. Commun Biol 2024; 7:336. [PMID: 38493211 PMCID: PMC10944477 DOI: 10.1038/s42003-024-06023-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 03/07/2024] [Indexed: 03/18/2024] Open
Abstract
Tetracycline destructases (TDases) are flavin monooxygenases which can confer resistance to all generations of tetracycline antibiotics. The recent increase in the number and diversity of reported TDase sequences enables a deep investigation of the TDase sequence-structure-function landscape. Here, we evaluate the sequence determinants of TDase function through two complementary approaches: (1) constructing profile hidden Markov models to predict new TDases, and (2) using multiple sequence alignments to identify conserved positions important to protein function. Using the HMM-based approach we screened 50 high-scoring candidate sequences in Escherichia coli, leading to the discovery of 13 new TDases. The X-ray crystal structures of two new enzymes from Legionella species were determined, and the ability of anhydrotetracycline to inhibit their tetracycline-inactivating activity was confirmed. Using the MSA-based approach we identified 31 amino acid positions 100% conserved across all known TDase sequences. The roles of these positions were analyzed by alanine-scanning mutagenesis in two TDases, to study the impact on cell and in vitro activity, structure, and stability. These results expand the diversity of TDase sequences and provide valuable insights into the roles of important residues in TDases, and flavin monooxygenases more broadly.
Affiliation(s)
- Kevin S Blake
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Hirdesh Kumar
- Host-Pathogen Interactions and Structural Vaccinology section (HPISV), National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Bethesda, MD, USA
| | - Anisha Loganathan
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Emily E Williford
- Department of Chemistry, Washington University in St. Louis, St. Louis, MO, USA
| | - Luke Diorio-Toth
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Yao-Peng Xue
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Wai Kwan Tang
- Host-Pathogen Interactions and Structural Vaccinology section (HPISV), National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Bethesda, MD, USA
| | - Tayte P Campbell
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - David D Chong
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Steven Angtuaco
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Timothy A Wencewicz
- Department of Chemistry, Washington University in St. Louis, St. Louis, MO, USA.
| | - Niraj H Tolia
- Host-Pathogen Interactions and Structural Vaccinology section (HPISV), National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Bethesda, MD, USA.
| | - Gautam Dantas
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
- Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, MO, USA
- Department of Biomedical Engineering, Washington University School of Medicine, St. Louis, MO, USA
- Department of Pediatrics, Washington University School of Medicine, St. Louis, MO, USA
| |
|
13
|
Raghuraman P, Ramireddy S, Raman G, Park S, Sudandiradoss C. Understanding a point mutation signature D54K in the caspase activation recruitment domain of NOD1 capitulating concerted immunity via atomistic simulation. J Biomol Struct Dyn 2024:1-17. [PMID: 38415678 DOI: 10.1080/07391102.2024.2322618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 12/11/2023] [Indexed: 02/29/2024]
Abstract
Point mutation D54K in the human N-terminal caspase recruitment domain (CARD) of nucleotide-binding oligomerization domain -1 (NOD1) abrogates an imperative downstream interaction with receptor-interacting protein kinase (RIPK2) that entails combating bacterial infections and inflammatory dysfunction. Here, we addressed the molecular details concerning conformational changes and interaction patterns (monomeric-dimeric states) of D54K by signature-based molecular dynamics simulation. Initially, the sequence analysis prioritized D54K as a pathogenic mutation, among other variants, based on a sequence signature. Since the mutation is highly conserved, we derived the distant ortholog to predict the sequence and structural similarity between native and mutant. This analysis showed the utility of 33 communal core residues associated with structural-functional preservation and variations, concurrently served to infer the cryptic hotspots Cys39, Glu53, Asp54, Glu56, Ile57, Leu74, and Lys78 determining the inter helical fold forming homodimers for putative receptor interaction. Subsequently, the atomistic simulations with free energy (MM/PB(GB)SA) calculations predicted structural alteration that takes place in the N-terminal mutant CARD where coils changed to helices (45 α3- L4-α4-L6- α683) in contrast to native (45T2-L4-α4-L6-T483). Likewise, the C-terminal helices 93T1-α7105 connected to the loops distorted compared to native 93α6-L7105 may result in conformational misfolding that promotes functional regulation and activation. These structural perturbations of D54K possibly destabilize the flexible adaptation of critical homotypic NOD1CARD-CARDRIPK2 interactions (α4Asp42-Arg488α5 and α6Phe86-Lys471α4), which is consistent with earlier experimental reports. Altogether, our findings unveil the conformational plasticity of mutation-dependent immunomodulatory response and may aid in functional validation exploring clinical investigation on CARD-regulated immunotherapies to prevent systemic infection and inflammation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- P Raghuraman
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
| | - Sriroopreddy Ramireddy
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
- Department of Genetics and Molecular Biology, School of Health Sciences, The Apollo University, Chittoor, India
| | - Gurusamy Raman
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
| | - SeonJoo Park
- Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, Republic of Korea
| | - C Sudandiradoss
- Department of Biotechnology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
| |
|
14
|
Mandwal A, Bishop SL, Castellanos M, Westlund A, Chaconas G, Davidsen J, Lewis IA. MINNO: An Open Source Software for Refining Metabolic Networks and Investigating Complex Network Activity Using Empirical Metabolomics Data. Anal Chem 2024; 96:3382-3388. [PMID: 38359900 PMCID: PMC10902815 DOI: 10.1021/acs.analchem.3c04501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 12/18/2023] [Accepted: 01/19/2024] [Indexed: 02/17/2024]
Abstract
Metabolomics is a powerful tool for uncovering biochemical diversity in a wide range of organisms. Metabolic network modeling is commonly used to frame metabolomics data in the context of a broader biological system. However, network modeling of poorly characterized nonmodel organisms remains challenging due to gene homology mismatches which lead to network architecture errors. To address this, we developed the Metabolic Interactive Nodular Network for Omics (MINNO), a web-based mapping tool that uses empirical metabolomics data to refine metabolic networks. MINNO allows users to create, modify, and interact with metabolic pathway visualizations for thousands of organisms, in both individual and multispecies contexts. Herein, we illustrate the use of MINNO in elucidating the metabolic networks of understudied species, such as those of the Borrelia genus, which cause Lyme and relapsing fever diseases. Using a hybrid genomics-metabolomics modeling approach, we constructed species-specific metabolic networks for three Borrelia species. Using these empirically refined networks, we were able to metabolically differentiate these species via their nucleotide metabolism, which cannot be predicted from genomic networks. Additionally, using MINNO, we identified 18 missing reactions from the KEGG database, of which nine were supported by the primary literature. These examples illustrate the use of metabolomics for the empirical refining of genetically constructed networks and show how MINNO can be used to study nonmodel organisms.
Affiliation(s)
- Ayush Mandwal
- Department of Physics and Astronomy, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - Stephanie L. Bishop
- Alberta Centre for Advanced Diagnostics, Department of Biological Sciences, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - Mildred Castellanos
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - Anika Westlund
- Alberta Centre for Advanced Diagnostics, Department of Biological Sciences, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - George Chaconas
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
- Department of Microbiology, Immunology and Infectious Diseases, Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - Jörn Davidsen
- Department of Physics and Astronomy, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
- Hotchkiss Brain Institute, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| | - Ian A. Lewis
- Alberta Centre for Advanced Diagnostics, Department of Biological Sciences, University of Calgary, 2500 University Dr NW, Calgary T2N 1N4, Alberta, Canada
| |
|
15
|
Bonello J, Orengo C. FunPredCATH: An ensemble method for predicting protein function using CATH. BIOCHIMICA ET BIOPHYSICA ACTA. PROTEINS AND PROTEOMICS 2024; 1872:140985. [PMID: 38122964 DOI: 10.1016/j.bbapap.2023.140985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/05/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
MOTIVATION The growth of unannotated proteins in UniProt increases at a very high rate every year due to more efficient sequencing methods. However, the experimental annotation of proteins is a lengthy and expensive process. Using computational techniques to narrow the search can speed up the process by providing highly specific Gene Ontology (GO) terms. METHODOLOGY We propose an ensemble approach that combines three generic base predictors that predict Gene Ontology (BP, CC and MF) terms from sequences across different species. We train our models on UniProtGOA annotation data and use the CATH domain resources to identify the protein families. We then calculate a score based on the prevalence of individual GO terms in the functional families that is then used as an indicator of confidence when assigning the GO term to an uncharacterised protein. METHODS In the ensemble, we use a statistics-based method that scores the occurrence of GO terms in a CATH FunFam against a background set of proteins annotated by the same GO term. We also developed a set-based method that uses Set Intersection and Set Union to score the occurrence of GO terms within the same CATH FunFam. Finally, we also use FunFams-Plus, a predictor method developed by the Orengo Group at UCL to predict GO terms for uncharacterised proteins in the CAFA3 challenge. EVALUATION We evaluated the methods against the CAFA3 benchmark and DomFun. We used the Precision, Recall and Fmax metrics and the benchmark datasets that are used in CAFA3 to evaluate our models and compare them to the CAFA3 results. Our results show that FunPredCATH compares well with top CAFA methods in the different ontologies and benchmarks. CONTRIBUTIONS FunPredCATH compares well with other prediction methods on CAFA3, and the ensemble approach outperforms the base methods. 
We show that non-IEA models obtain higher Fmax scores than the IEA counterparts, while the models including IEA annotations have higher coverage at the expense of a lower Fmax score.
Affiliation(s)
- Joseph Bonello
- Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom; Department of Computer Information Systems, University of Malta, Faculty of ICT, Msida, MSD 2080, Malta.
| | - Christine Orengo
- Department of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
| |
|
16
|
Klawa SJ, Lee M, Riker KD, Jian T, Wang Q, Gao Y, Daly ML, Bhonge S, Childers WS, Omosun TO, Mehta AK, Lynn DG, Freeman R. Uncovering supramolecular chirality codes for the design of tunable biomaterials. Nat Commun 2024; 15:788. [PMID: 38278785 PMCID: PMC10817930 DOI: 10.1038/s41467-024-45019-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 01/12/2024] [Indexed: 01/28/2024] Open
Abstract
In neurodegenerative diseases, polymorphism and supramolecular assembly of β-sheet amyloids are implicated in many different etiologies and may adopt either a left- or right-handed supramolecular chirality. Yet, the underlying principles of how sequence regulates supramolecular chirality remain unknown. Here, we characterize the sequence specificity of the central core of amyloid-β 42 and design derivatives which enable chirality inversion at biologically relevant temperatures. We further find that C-terminal modifications can tune the energy barrier of a left-to-right chiral inversion. Leveraging this design principle, we demonstrate how temperature-triggered chiral inversion of peptides hosting therapeutic payloads modulates the dosed release of an anticancer drug. These results suggest a generalizable approach for fine-tuning supramolecular chirality that can be applied in developing treatments to regulate amyloid morphology in neurodegeneration as well as in other disease states.
Affiliation(s)
- Stephen J Klawa
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Michelle Lee
- Department of Chemistry, Emory University, Atlanta, GA, 30322, USA
| | - Kyle D Riker
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Tengyue Jian
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
- Broad Pharm, San Diego, California, 92121, USA
| | - Qunzhao Wang
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Yuan Gao
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Margaret L Daly
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Shreeya Bhonge
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - W Seth Childers
- Department of Chemistry, Emory University, Atlanta, GA, 30322, USA
- Department of Chemistry, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Tolulope O Omosun
- Department of Chemistry, Emory University, Atlanta, GA, 30322, USA
- U.S. Department of Justice, Chicago, IL, 60603, USA
| | - Anil K Mehta
- Department of Chemistry, Emory University, Atlanta, GA, 30322, USA
- The National High Magnetic Field Laboratory, University of Florida, Gainesville, FL, 32611, USA
| | - David G Lynn
- Department of Chemistry, Emory University, Atlanta, GA, 30322, USA
- Department of Biology, Emory University, Atlanta, GA, 30322, USA
| | - Ronit Freeman
- Department of Applied Physical Sciences, University of North Carolina, Chapel Hill, NC, 27599, USA.
| |
|
17
|
Ko S, Kim J, Lim J, Lee SM, Park JY, Woo J, Scott-Nevros ZK, Kim JR, Yoon H, Kim D. Blanket antimicrobial resistance gene database with structural information, BOARDS, provides insights on historical landscape of resistance prevalence and effects of mutations in enzyme structure. mSystems 2024; 9:e0094323. [PMID: 38085058 PMCID: PMC10871167 DOI: 10.1128/msystems.00943-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/02/2023] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance (AMR) in pathogenic bacteria poses a significant threat to public health, yet there is still a need for development in the tools to deeply understand AMR genes based on genetic or structural information. In this study, we present an interactive web database named Blanket Overarching Antimicrobial-Resistance gene Database with Structural information (BOARDS, sbml.unist.ac.kr), a database that comprehensively includes 3,943 reported AMR gene information for 1,997 extended spectrum beta-lactamase (ESBL) and 1,946 other genes as well as a total of 27,395 predicted protein structures. These structures, which include both wild-type AMR genes and their mutants, were derived from 80,094 publicly available whole-genome sequences. In addition, we developed the rapid analysis and detection tool of antimicrobial-resistance (RADAR), a one-stop analysis pipeline to detect AMR genes across whole-genome sequences (WGSs). By integrating BOARDS and RADAR, the AMR prevalence landscape for eight multi-drug resistant pathogens was reconstructed, leading to unexpected findings such as the pre-existence of the MCR genes before their official reports. Enzymatic structure prediction-based analysis revealed that the occurrence of mutations found in some ESBL genes was closely related to the binding affinities with their antibiotic substrates. Overall, BOARDS can play a significant role in performing in-depth analysis on AMR. IMPORTANCE While the increasing antibiotic resistance (AMR) in pathogens has been a burden on public health, effective tools for deep understanding of AMR based on genetic or structural information remain limited. In this study, a blanket overarching antimicrobial-resistance gene database with structure information (BOARDS), a web-based database that comprehensively collected AMR gene data with predictive protein structural information, was constructed. Additionally, we report the development of a RADAR pipeline that can analyze whole-genome sequences as well. BOARDS, which includes sequence and structural information, has shown the historical landscape and prevalence of the AMR genes and can provide insight into single-nucleotide polymorphism effects on antibiotic degrading enzymes within protein structures.
Affiliation(s)
- Seyoung Ko
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Jaehyung Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Jaewon Lim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Sang-Mok Lee
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Joon Young Park
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Jihoon Woo
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Zoe K. Scott-Nevros
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Jong R. Kim
- School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan
| | - Hyunjin Yoon
- Department of Molecular Science and Technology, Ajou University, Suwon, South Korea
| | - Donghyuk Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| |
|
18
|
Hussain A, Brooks III CL. Guiding discovery of protein sequence-structure-function modeling. Bioinformatics 2024; 40:btae002. [PMID: 38195719 PMCID: PMC10789314 DOI: 10.1093/bioinformatics/btae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 12/05/2023] [Accepted: 01/08/2024] [Indexed: 01/11/2024] Open
Abstract
MOTIVATION Protein engineering techniques are key in designing novel catalysts for a wide range of reactions. Although approaches vary in their exploration of the sequence-structure-function paradigm, they are often hampered by the labor-intensive steps of protein expression and screening. In this work, we describe the development and testing of a high-throughput in silico sequence-structure-function pipeline using AlphaFold2 and fast Fourier transform docking that is benchmarked with enantioselectivity and reactivity predictions for an ancestral sequence library of fungal flavin-dependent monooxygenases. RESULTS The predicted enantioselectivities and reactivities correlate well with previously described screens of an experimentally available subset of these proteins and capture known changes in enantioselectivity across the phylogenetic tree representing ancestral proteins from this family. With this pipeline established as our functional screen, we apply ensemble decision tree models and explainable AI techniques to build sequence-function models and extract critical residues within the binding site and the second-sphere residues around this site. We demonstrate that the top-identified key residues in the control of enantioselectivity and reactivity correspond to experimentally verified residues. The in silico sequence-to-function pipeline serves as an accelerated framework to inform protein engineering efforts from vast informative sequence landscapes contained in protein families, ancestral resurrects, and directed evolution campaigns. AVAILABILITY Jupyter notebooks detailing the sequence-structure-function pipeline are available at https://github.com/BrooksResearchGroup-UM/seq_struct_func.
Affiliation(s)
- Azam Hussain
- Department of Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor, MI 48109-1055, United States
| | - Charles L Brooks III
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109-1055, United States
| |
|
19
|
Song H, Zhao K, Jiang G, Sun S, Li J, Tu M, Wang L, Xie H, Chen D. Genome-Wide Identification and Expression Analysis of the SBP-Box Gene Family in Loquat Fruit Development. Genes (Basel) 2023; 15:23. [PMID: 38254913 PMCID: PMC10815216 DOI: 10.3390/genes15010023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 12/17/2023] [Accepted: 12/21/2023] [Indexed: 01/24/2024] Open
Abstract
The loquat (Eriobotrya japonica L.) is a special evergreen tree, and its fruit is of high medical and health value as well as having stable market demand around the world. In recent years, research on the accumulation of nutrients in loquat fruit, such as carotenoids, flavonoids, and terpenoids, has become a hotspot. The SBP-box gene family encodes transcription factors involved in plant growth and development. However, there has been no report on the SBP-box gene family in the loquat genome and their functions in carotenoid biosynthesis and fruit ripening. In this study, we identified 28 EjSBP genes in the loquat genome, which were unevenly distributed on 12 chromosomes. We also systematically investigated the phylogenetic relationship, collinearity, gene structure, conserved motifs, and cis-elements of EjSBP proteins. Most EjSBP genes showed high expression in the root, stem, leaf, and inflorescence, while only five EjSBP genes were highly expressed in the fruit. Gene expression analysis revealed eight differentially expressed EjSBP genes between yellow- and white-fleshed fruits, suggesting that the EjSBP genes play important roles in loquat fruit development at the breaker stage. Notably, EjSBP01 and EjSBP19 exhibited completely opposite expression patterns between white- and yellow-fleshed fruits during fruit development, and showed a close relationship with SlCnr involved in carotenoid biosynthesis and fruit ripening, indicating that these two genes may participate in the synthesis and accumulation of carotenoids in loquat fruit. In summary, this study provides comprehensive information about the SBP-box gene family in the loquat, and identified two EjSBP genes as candidates involved in carotenoid synthesis and accumulation during loquat fruit development.
Affiliation(s)
- Haiyan Song
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Ke Zhao
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Guoliang Jiang
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Shuxia Sun
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Jing Li
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Meiyan Tu
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Lingli Wang
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Hongjiang Xie
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| | - Dong Chen
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu 610066, China; (H.S.); (K.Z.); (G.J.); (S.S.); (J.L.); (M.T.); (L.W.); (H.X.)
- Key Laboratory of Horticultural Crop Biology and Germplasm Creation in Southwestern China of the Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
| |
|
20
|
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins are integral actors in essential life processes, making protein research a fundamental domain with the potential to propel advances in pharmaceuticals and disease investigation. Within protein research, there is a pressing need to uncover protein functions and untangle their intricate mechanistic underpinnings. Because experimental investigations are costly and limited in throughput, computational models offer a promising alternative for accelerating protein function annotation. In recent years, protein pre-training models have advanced notably across multiple prediction tasks, which suggests they may be effective for the intricate downstream task of protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Zhonghui Gu
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| |
|
21
|
Ribeiro AJM, Riziotis IG, Borkakoti N, Thornton JM. Enzyme function and evolution through the lens of bioinformatics. Biochem J 2023; 480:1845-1863. [PMID: 37991346 PMCID: PMC10754289 DOI: 10.1042/bcj20220405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 11/23/2023]
Abstract
Enzymes have been shaped by evolution over billions of years to catalyse the chemical reactions that support life on earth. Dispersed in the literature, or organised in online databases, knowledge about enzymes can be structured in distinct dimensions, either related to their quality as biological macromolecules, such as their sequence and structure, or related to their chemical functions, such as the catalytic site, kinetics, mechanism, and overall reaction. The evolution of enzymes can only be understood when each of these dimensions is considered. In addition, many of the properties of enzymes only make sense in the light of evolution. We start this review by outlining the main paradigms of enzyme evolution, including gene duplication and divergence, convergent evolution, and evolution by recombination of domains. In the second part, we overview the current collective knowledge about enzymes, as organised by different types of data and collected in several databases. We also highlight some increasingly powerful computational tools that can be used to close gaps in understanding, in particular for types of data that require laborious experimental protocols. We believe that recent advances in protein structure prediction will be a powerful catalyst for the prediction of binding, mechanism, and ultimately, chemical reactions. A comprehensive mapping of enzyme function and evolution may be attainable in the near future.
Affiliation(s)
- Antonio J. M. Ribeiro
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K
| | - Ioannis G. Riziotis
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K
| | - Neera Borkakoti
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K
| | - Janet M. Thornton
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K
| |
|
22
|
Zhang L, Yao L, Zhao F, Yu A, Zhou Y, Wen Q, Wang J, Zheng T, Chen P. Protein and Peptide-Based Nanotechnology for Enhancing Stability, Bioactivity, and Delivery of Anthocyanins. Adv Healthc Mater 2023; 12:e2300473. [PMID: 37537383 DOI: 10.1002/adhm.202300473] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/13/2023] [Revised: 05/18/2023] [Indexed: 08/05/2023]
Abstract
Anthocyanin, a unique natural polyphenol, is abundant in plants and widely utilized in biomedicine, cosmetics, and the food industry due to its excellent antioxidant, anticancer, antiaging, antimicrobial, and anti-inflammatory properties. However, the degradation of anthocyanin in extreme environments, such as alkaline pH, high temperatures, and the presence of metal ions, limits its physicochemical stability and bioavailability. Encapsulating anthocyanin or combining it with biomaterials could efficiently stabilize and protect it. Promisingly, natural or artificially designed proteins and peptides with favorable stabilities, excellent biocapacity, and wide sources are potential candidates to stabilize anthocyanin. This review focuses on recent progress, strategies, and perspectives on proteins and peptides for anthocyanin functionalization and delivery, i.e., formulation technologies, physicochemical stability enhancement, cellular uptake, bioavailability, and the development of biological activities. Interestingly, due to the simplicity and diversity of peptide structure, the interaction mechanisms between peptide and anthocyanin could be illustrated. This work sheds light on the mechanism of protein/peptide-anthocyanin nanoparticle construction and expands on potential applications of anthocyanin in nutrition and biomedicine.
Affiliation(s)
- Lei Zhang
- Department of Chemical Engineering and Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada
| | - Liang Yao
- College of Biotechnology, Sericultural Research Institute, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212018, China
| | - Feng Zhao
- Department of Chemical Engineering and Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada
| | - Alice Yu
- Schulich School of Medicine and Dentistry, Western University, Ontario, N6A 3K7, Canada
| | - Yueru Zhou
- Department of Chemical Engineering and Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada
| | - Qingmei Wen
- Guangzhou Institute of Energy Conversion, Chinese Academy of Sciences, Guangzhou, 510640, China
| | - Jun Wang
- College of Biotechnology, Sericultural Research Institute, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212018, China
| | - Tao Zheng
- Guangzhou Institute of Energy Conversion, Chinese Academy of Sciences, Guangzhou, 510640, China
| | - Pu Chen
- Department of Chemical Engineering and Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada
| |
|
23
|
Bora JR, Mahalakshmi R. Empowering canonical biochemicals with cross-linked novelty: Recursions in applications of protein cross-links. Proteins 2023. [PMID: 37589191 DOI: 10.1002/prot.26571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/02/2023] [Accepted: 08/03/2023] [Indexed: 08/18/2023]
Abstract
Diversity in the biochemical workhorses of the cell (that is, proteins) is achieved by the innumerable permutations offered primarily by the 20 canonical L-amino acids prevalent in all biological systems. Yet, proteins are known to additionally undergo unusual modifications for specialized functions. Of the various post-translational modifications known to occur in proteins, the recently identified non-disulfide cross-links are unique, residue-specific covalent modifications that confer additional structural stability and unique functional characteristics to these biomolecules. We review an exclusive class of amino acid cross-links encompassing aromatic and sulfur-containing side chains, which not only confer superior biochemical characteristics to the protein but also possess additional spectroscopic features that can be exploited as novel chromophores. Studies of their in vivo reaction mechanism have facilitated their specialized in vitro applications in hydrogels and protein anchoring in monolayer chips. Furthering the discovery of unique canonical cross-links through new chemical, structural, and bioinformatics tools will catalyze the development of protein-specific hyperstable nanostructures, superfoods, and biotherapeutics.
Affiliation(s)
- Jinam Ravindra Bora
- Department of Biological Sciences, Molecular Biophysics Laboratory, Indian Institute of Science Education and Research, Bhopal, India
| | - Radhakrishnan Mahalakshmi
- Department of Biological Sciences, Molecular Biophysics Laboratory, Indian Institute of Science Education and Research, Bhopal, India
| |
|
24
|
Van den Broeck L, Bhosale DK, Song K, Fonseca de Lima CF, Ashley M, Zhu T, Zhu S, Van De Cotte B, Neyt P, Ortiz AC, Sikes TR, Aper J, Lootens P, Locke AM, De Smet I, Sozzani R. Functional annotation of proteins for signaling network inference in non-model species. Nat Commun 2023; 14:4654. [PMID: 37537196 PMCID: PMC10400656 DOI: 10.1038/s41467-023-40365-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 07/25/2023] [Indexed: 08/05/2023] Open
Abstract
Molecular biology aims to understand cellular responses and regulatory dynamics in complex biological systems. However, these studies remain challenging in non-model species due to poor functional annotation of regulatory proteins. To overcome this limitation, we develop a multi-layer neural network that determines protein functionality directly from the protein sequence. We annotate kinases and phosphatases in Glycine max. We use the functional annotations from our neural network, Bayesian inference principles, and high resolution phosphoproteomics to infer phosphorylation signaling cascades in soybean exposed to cold, and identify Glyma.10G173000 (TOI5) and Glyma.19G007300 (TOT3) as key temperature regulators. Importantly, the signaling cascade inference does not rely upon known kinase motifs or interaction data, enabling de novo identification of kinase-substrate interactions. Finally, our neural network generalizes and scales well, so we extend our predictions to Oryza sativa, Zea mays, Sorghum bicolor, and Triticum aestivum. Taken together, we develop a signaling inference approach for non-model species leveraging our predicted kinases and phosphatases.
Affiliation(s)
- Lisa Van den Broeck
- Plant and Microbial Biology Department and NC Plant Sciences Initiative, North Carolina State University, Raleigh, NC, 27695, USA.
| | - Dinesh Kiran Bhosale
- Electrical and Computer Engineering Department, North Carolina State University, Raleigh, NC, 27695, USA
| | - Kuncheng Song
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, 27695, USA
| | - Cássio Flavio Fonseca de Lima
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Michael Ashley
- Electrical and Computer Engineering Department, North Carolina State University, Raleigh, NC, 27695, USA
| | - Tingting Zhu
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Shanshuo Zhu
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Brigitte Van De Cotte
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Pia Neyt
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Anna C Ortiz
- USDA-ARS Soybean & Nitrogen Fixation Research Unit, Raleigh, NC, 27607, USA
| | - Tiffany R Sikes
- USDA-ARS Soybean & Nitrogen Fixation Research Unit, Raleigh, NC, 27607, USA
| | - Jonas Aper
- Protealis NV, Technologiepark-Zwijnaarde 94, 9052, Ghent, Belgium
| | - Peter Lootens
- Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), 9090, Melle, Belgium
| | - Anna M Locke
- USDA-ARS Soybean & Nitrogen Fixation Research Unit, Raleigh, NC, 27607, USA
- Department of Crop and Soil Sciences and NC Plant Sciences Initiative, North Carolina State University, Raleigh, NC, 27695, USA
| | - Ive De Smet
- Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052, Ghent, Belgium
- VIB Center for Plant Systems Biology, B-9052, Ghent, Belgium
| | - Rosangela Sozzani
- Plant and Microbial Biology Department and NC Plant Sciences Initiative, North Carolina State University, Raleigh, NC, 27695, USA.
| |
|
25
|
Lomoio U, Puccio B, Tradigo G, Guzzi PH, Veltri P. SARS-CoV-2 protein structure and sequence mutations: Evolutionary analysis and effects on virus variants. PLoS One 2023; 18:e0283400. [PMID: 37471335 PMCID: PMC10358949 DOI: 10.1371/journal.pone.0283400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 07/04/2023] [Indexed: 07/22/2023] Open
Abstract
The structure and sequence of proteins strongly influence their biological functions. New models and algorithms can help researchers understand how the evolution of sequences and structures is related to changes in function. Recently, studies of SARS-CoV-2 Spike (S) protein structures have been performed to predict binding receptors and infection activity in COVID-19. This has driven scientific interest in the effects of virus mutations in relation to sequence, structure and vaccination. However, models and tools are needed to study the links between the evolution of S protein sequence, structure and function, virus transmissibility, and the effects of vaccination. Because studies on the S protein have generated a large amount of relevant information, we propose in this work to use Protein Contact Networks (PCNs) to relate protein structures to biological properties by means of network topology properties. Topological properties are used to compare structural changes with sequence changes. We find that both node centrality and community extraction analysis can be used to relate protein stability and functionality to sequence mutations. Starting from this, we compare structural evolution to sequence changes and study mutations from a temporal perspective, focusing on virus variants. Finally, by applying our model to the Omicron variant, we report a timeline correlation between Omicron and the vaccination campaign.
Affiliation(s)
- Ugo Lomoio
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, Italy
| | - Barbara Puccio
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, Italy
| | | | - Pietro Hiram Guzzi
- Department of Surgical and Medical Sciences, University of Catanzaro, Catanzaro, Italy
| | | |
|
26
|
Mandwal A, Bishop SL, Castellanos M, Westlund A, Chaconas G, Lewis I, Davidsen J. Metabolic Interactive Nodular Network for Omics (MINNO): Refining and investigating metabolic networks based on empirical metabolomics data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.14.548964. [PMID: 37503268 PMCID: PMC10370097 DOI: 10.1101/2023.07.14.548964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Metabolomics is a powerful tool for uncovering biochemical diversity in a wide range of organisms, and metabolic network modeling is commonly used to frame results in the context of a broader homeostatic system. However, network modeling of poorly characterized, non-model organisms remains challenging due to gene homology mismatches. To address this challenge, we developed Metabolic Interactive Nodular Network for Omics (MINNO), a web-based mapping tool that takes in empirical metabolomics data to refine metabolic networks for both model and unusual organisms. MINNO allows users to create and modify interactive metabolic pathway visualizations for thousands of organisms, in both individual and multi-species contexts. Herein, we demonstrate an important application of MINNO in elucidating the metabolic networks of understudied species, such as those of the Borrelia genus, which cause Lyme disease and relapsing fever. Using a hybrid genomics-metabolomics modeling approach, we constructed species-specific metabolic networks for three Borrelia species. Using these empirically refined networks, we were able to metabolically differentiate these genetically similar species via their nucleotide and nicotinate metabolic pathways that cannot be predicted from genomic networks. These examples illustrate the use of metabolomics for the empirical refining of genetically constructed networks and show how MINNO can be used to study non-model organisms.
Affiliation(s)
- Ayush Mandwal
- Department of Physics and Astronomy, University of Calgary, Calgary, AB, Canada
| | - Stephanie L. Bishop
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
| | - Mildred Castellanos
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary, Calgary, AB, Canada
| | - Anika Westlund
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
| | - George Chaconas
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, Snyder Institute for Chronic Diseases, University of Calgary, Calgary, AB, Canada
| | - Ian Lewis
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
| | - Jörn Davidsen
- Department of Physics and Astronomy, University of Calgary, Calgary, AB, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
| |
|
27
|
Cagiada M, Bottaro S, Lindemose S, Schenstrøm SM, Stein A, Hartmann-Petersen R, Lindorff-Larsen K. Discovering functionally important sites in proteins. Nat Commun 2023; 14:4175. [PMID: 37443362 PMCID: PMC10345196 DOI: 10.1038/s41467-023-39909-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/19/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023] Open
Abstract
Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.
Affiliation(s)
- Matteo Cagiada
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Sandro Bottaro
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Søren Lindemose
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Signe M Schenstrøm
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Amelie Stein
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Rasmus Hartmann-Petersen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| |
|
28
|
Yang F, Wang T, Guo Q, Zou Q, Yu S. The CmMYB3 transcription factors isolated from the Chrysanthemum morifolium regulate flavonol biosynthesis in Arabidopsis thaliana. PLANT CELL REPORTS 2023; 42:791-803. [PMID: 36840758 DOI: 10.1007/s00299-023-02991-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/25/2023] [Indexed: 06/18/2023]
Abstract
Chrysanthemum morifolium MYB3 factors are transcriptional activators for the regulation of flavonol biosynthesis. Flavonol is not only a critical secondary metabolite participating in the growth and development of plants but also the main active ingredient in medicinal chrysanthemum. However, few studies have revealed the transcriptional regulation of flavonol biosynthesis in Chrysanthemum morifolium. Here, we isolated two CmMYB3 transcription factors (CmMYB3a and CmMYB3b) from the capitulum of Chrysanthemum morifolium cv 'Hangju'. According to their sequence characteristics, CmMYB3a and CmMYB3b belong to R2R3-MYB subgroup 7, whose members have often been reported to regulate flavonol biosynthesis positively. A subcellular localization assay showed that CmMYB3a and CmMYB3b localize to the nucleus. In addition, both have obvious transcriptional self-activation activity at their C-terminus. After overexpression of the CmMYB3 genes in Nicotiana benthamiana and Arabidopsis thaliana, flavonol contents in the plants increased, and the expression of the AtCHS, AtCHI, AtF3H, and AtFLS genes in A. thaliana was also elevated. Interestingly, CmMYB3a had a stronger effect than CmMYB3b on flavonol contents and related gene expression levels. The interaction analysis between transcription factors and promoters suggested that CmMYB3 could bind and activate the promoters of the CmCHI and CmFLS genes in C. morifolium, with CmMYB3a again showing the stronger activity. Overall, these results indicate that CmMYB3a and CmMYB3b work as transcriptional activators in controlling flavonol biosynthesis.
Affiliation(s)
- Feng Yang
- Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
| | - Tao Wang
- Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
| | - Qiaosheng Guo
- Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China.
| | - Qingjun Zou
- Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
| | - Shuyan Yu
- Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
| |
|
29
|
Sun S, Wang M, Xiang J, Shao Y, Li L, Sedjoah RCAA, Wu G, Zhou J, Xin Z. BON domain-containing protein-mediated co-selection of antibiotic and heavy metal resistance in bacteria. Int J Biol Macromol 2023; 238:124062. [PMID: 36933600 DOI: 10.1016/j.ijbiomac.2023.124062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 03/12/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]
Abstract
The widespread antibiotic resistance of bacteria has become one of the most severe threats to public health. However, the mechanisms that allow microbial acquisition of resistance are still poorly understood. In the present study, a novel BON domain-containing protein was heterologously expressed in Escherichia coli. It functions as an efflux pump-like protein to confer resistance to various antibiotics, especially ceftazidime, with a >32-fold increase in minimum inhibitory concentration (MIC). The fluorescence spectroscopy experiment indicated that BON protein could interact with several metal ions, such as copper and silver, which has been associated with the induced co-regulation of antibiotic and heavy metal resistance in bacteria. Furthermore, the BON protein was demonstrated to spontaneously self-assemble into a trimer and generate a central pore-like architecture for antibiotic transport. A WXG motif acting as a molecular switch is essential for forming the transmembrane oligomeric pores and controls the interaction between BON protein and the cell membrane. Based on these findings, a mechanism termed "one-in, one-out" was proposed for the first time. The present study provides new insights into the structure and function of BON protein and a previously unidentified antibiotic resistance mechanism, filling the knowledge gap in understanding BON protein-mediated intrinsic antibiotic resistance.
Affiliation(s)
- Shengwei Sun
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Mengxi Wang
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Jiahui Xiang
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Yuting Shao
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Longxiang Li
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Rita-Cindy Aye-Ayire Sedjoah
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Guojun Wu
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Jingjie Zhou
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China
| | - Zhihong Xin
- Key Laboratory of Food Processing and Quality Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, PR China.
| |
|
30
|
Li M, Shi W, Zhang F, Zeng M, Li Y. A Deep Learning Framework for Predicting Protein Functions With Co-Occurrence of GO Terms. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:833-842. [PMID: 35476573 DOI: 10.1109/tcbb.2022.3170719] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
The understanding of protein functions is critical to many biological problems such as the development of new drugs and new crops. Many methods have been proposed to reduce the huge gap between the growth in protein sequences and the annotation of protein functions. These methods use Gene Ontology (GO) to classify the functions of proteins and consider one GO term as a class label. However, they ignore the co-occurrence of GO terms, which is helpful for protein function prediction. We propose a new deep learning model, named DeepPFP-CO, which uses a Graph Convolutional Network (GCN) to explore and capture the co-occurrence of GO terms to improve protein function prediction performance. In this way, we can further deduce the protein functions by fusing the predicted propensity of the center function and its co-occurrence functions. We use Fmax and AUPR to evaluate the performance of DeepPFP-CO and compare DeepPFP-CO with state-of-the-art methods such as DeepGOPlus and DeepGOA. The computational results show that DeepPFP-CO outperforms DeepGOPlus and other methods. Moreover, we further analyze our model at the protein level. The results demonstrate that DeepPFP-CO improves the performance of protein function prediction. DeepPFP-CO is available at https://csuligroup.com/DeepPFP/.
|
31
|
Tang FL, Zhang XG, Ke PY, Liu J, Zhang ZJ, Hu DM, Gu J, Zhang H, Guo HK, Zang QW, Huang R, Ma YL, Kwan P. MBD5 regulates NMDA receptor expression and seizures by inhibiting Stat1 transcription. Neurobiol Dis 2023; 181:106103. [PMID: 36997128 DOI: 10.1016/j.nbd.2023.106103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Revised: 02/21/2023] [Accepted: 03/23/2023] [Indexed: 03/31/2023] Open
Abstract
Epilepsy is considered to result from an imbalance between excitation and inhibition of the central nervous system. Pathogenic mutations in the methyl-CpG binding domain protein 5 gene (MBD5) are known to cause epilepsy. However, the function and mechanism of MBD5 in epilepsy remain elusive. Here, we found that MBD5 was mainly localized in the pyramidal cells and granular cells of mouse hippocampus, and its expression was increased in the brain tissues of mouse models of epilepsy. Exogenous overexpression of MBD5 inhibited the transcription of the signal transducer and activator of transcription 1 gene (Stat1), resulting in increased expression of N-methyl-d-aspartate receptor (NMDAR) subunit 1 (GluN1), 2A (GluN2A) and 2B (GluN2B), leading to aggravation of the epileptic behaviour phenotype in mice. The epileptic behavioural phenotype was alleviated by overexpression of STAT1 which reduced the expression of NMDARs, and by the NMDAR antagonist memantine. These results indicate that MBD5 accumulation affects seizures through STAT1-mediated inhibition of NMDAR expression in mice. Collectively, our findings suggest that the MBD5-STAT1-NMDAR pathway may be a new pathway that regulates the epileptic behavioural phenotype and may represent a new treatment target.
|
32
|
Zatorski N, Sun Y, Elmas A, Dallago C, Karl T, Stein D, Rost B, Huang KL, Walsh M, Schlessinger A. Structural Analysis of Genomic and Proteomic Signatures Reveal Dynamic Expression of Intrinsically Disordered Regions in Breast Cancer and Tissue. bioRxiv 2023:2023.02.23.529755. [PMID: 36865220 PMCID: PMC9980136 DOI: 10.1101/2023.02.23.529755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/26/2023]
Abstract
Structural features of proteins capture underlying information about protein evolution and function, which enhances the analysis of proteomic and transcriptomic data. Here we develop Structural Analysis of Gene and protein Expression Signatures (SAGES), a method that describes expression data using features calculated from sequence-based prediction methods and 3D structural models. We used SAGES, along with machine learning, to characterize tissues from healthy individuals and those with breast cancer. We analyzed gene expression data from 23 breast cancer patients and genetic mutation data from the COSMIC database as well as 17 breast tumor protein expression profiles. We identified prominent expression of intrinsically disordered regions in breast cancer proteins as well as relationships between drug perturbation signatures and breast cancer disease signatures. Our results suggest that SAGES is generally applicable to describe diverse biological phenomena including disease states and drug effects.
Affiliation(s)
- Nicole Zatorski
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Yifei Sun
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Abdulkadir Elmas
- Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Christian Dallago
- NVIDIA DE GmbH, Einsteinstraße 172, 81677 München, Germany
- Faculty of Informatics, Bioinformatics & Computational Biology, Technical University Munich (TUM), 85748 Garching, Germany
| | - Timothy Karl
- Faculty of Informatics, Bioinformatics & Computational Biology, Technical University Munich (TUM), 85748 Garching, Germany
| | - David Stein
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Burkhard Rost
- Faculty of Informatics, Bioinformatics & Computational Biology, Technical University Munich (TUM), 85748 Garching, Germany
| | - Kuan-Lin Huang
- Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Martin Walsh
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
| | - Avner Schlessinger
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levey Pl NY, NY 10029, USA
|
33
|
Villard J, Kılıç M, Rothlisberger U. Surrogate Based Genetic Algorithm Method for Efficient Identification of Low-Energy Peptide Structures. J Chem Theory Comput 2023; 19:1080-1097. [PMID: 36692853 PMCID: PMC9933449 DOI: 10.1021/acs.jctc.2c01078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Identification of the most stable structure(s) of a system is a prerequisite for the calculation of any of its properties from first-principles. However, even for relatively small molecules, exhaustive explorations of the potential energy surface (PES) are severely hampered by the dimensionality bottleneck. In this work, we address the challenging task of efficiently sampling realistic low-lying peptide coordinates by resorting to a surrogate based genetic algorithm (GA)/density functional theory (DFT) approach (sGADFT) in which promising candidates provided by the GA are ultimately optimized with DFT. We provide a benchmark of several computational methods (GAFF, AMOEBApro13, PM6, PM7, DFTB3-D3(BJ)) as possible prescanning surrogates and apply sGADFT to two test case systems that are (i) two isomer families of the protonated Gly-Pro-Gly-Gly tetrapeptide (Masson, A.; J. Am. Soc. Mass Spectrom.2015, 26, 1444-1454) and (ii) the doubly protonated cyclic decapeptide gramicidin S (Nagornova, N. S.; J. Am. Chem. Soc.2010, 132, 4040-4041). We show that our GA procedure can correctly identify low-energy minima in as little as a few hours. Subsequent refinement of surrogate low-energy structures within a given energy threshold (≤10 kcal/mol (i), ≤5 kcal/mol (ii)) via DFT relaxation invariably led to the identification of the most stable structures as determined from high-resolution infrared (IR) spectroscopy at low temperature. The sGADFT method therefore constitutes a highly efficient route for the screening of realistic low-lying peptide structures in the gas phase as needed for instance for the interpretation and assignment of experimental IR spectra.
|
34
|
Ribeiro AL, Sánchez M, Bosch S, Berenguer J, Hidalgo A. Stabilization of Enzymes by Using Thermophiles. Methods Mol Biol 2023; 2704:313-328. [PMID: 37642853 DOI: 10.1007/978-1-0716-3385-4_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Manufactured steroid compounds have many applications in the pharmaceutical industry. Due to the chemical complexity and chirality of steroids, there is an increasing demand for enzyme-based bioconversion processes to replace those based on chemical synthesis. In this context, thermostability of the involved enzymes is a highly desirable property as both the increased half-life of the enzyme and the enhanced solubility of substrates and products will improve the yield of the reactions. Metagenomic libraries from thermal environments are potential sources of thermostable enzymes of prokaryotic origin, but the number of expected hits could be quite low for enzymes handling substrates such as steroids, rarely found in prokaryotes. An alternative to metagenome screening is the selection of thermostable variants of well-known steroid-processing enzymes. Here we review and detail a protocol for such selection, where error-prone PCR (epPCR) is used to introduce random mutations into a gene to create a variants library for further selection of thermostable variants in the thermophile Thermus thermophilus. The method involves the use of folding interference vectors where the proper folding of the enzyme of interest at high temperature is linked to the folding of a reporter encoding a selectable property such as thermostable resistance to kanamycin, leading to a life-or-death selection of variants of reinforced folding independently of the activity of the enzyme.
Affiliation(s)
- Ana-Luisa Ribeiro
- Centro de Biología Molecular Severo Ochoa (UAM-CSIC). Facultad de Ciencias. Universidad Autónoma de Madrid, Madrid, Spain
| | - Mercedes Sánchez
- Centro de Biología Molecular Severo Ochoa (UAM-CSIC). Facultad de Ciencias. Universidad Autónoma de Madrid, Madrid, Spain
| | - Sandra Bosch
- Centro de Biología Molecular Severo Ochoa (UAM-CSIC). Facultad de Ciencias. Universidad Autónoma de Madrid, Madrid, Spain
| | - José Berenguer
- Centro de Biología Molecular Severo Ochoa (UAM-CSIC). Facultad de Ciencias. Universidad Autónoma de Madrid, Madrid, Spain
| | - Aurelio Hidalgo
- Centro de Biología Molecular Severo Ochoa (UAM-CSIC). Facultad de Ciencias. Universidad Autónoma de Madrid, Madrid, Spain.
|
35
|
Favourable Interfacial Characteristics of A2 Milk Protein Monolayer. J Membr Biol 2023; 256:35-41. [PMID: 35723704 PMCID: PMC9208347 DOI: 10.1007/s00232-022-00248-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/14/2021] [Accepted: 05/21/2022] [Indexed: 02/07/2023]
Abstract
Shielding a specific body organ with a biocompatible material helps prevent direct exposure of that organ to foreign entities responsible for infections. Here we show the potential of the A2 milk protein, recovered from the milk of cows of Indian origin, for possible prevention of direct exposure to other foreign molecules. We measured the surface pressure of the monolayers of different types of protein samples using Langmuir isotherm experiments. The surface pressure measurements for the monolayers of four types of protein macromolecules were carried out using a Wilhelmy plate micro pressure sensor. We studied the self-organization of the different protein macromolecules and their monolayer compression characteristics. The electrochemical behaviour was studied using electrochemical impedance spectroscopy. We found the highest surface pressure for the monolayer of A2 protein. Further, A2 protein exhibited the highest surface activity among the proteins tested. This property can be used effectively to form an envelope of A2 protein surrounding the targeted entity.
|
36
|
Fischer S, Gillis J. Defining the extent of gene function using ROC curvature. Bioinformatics 2022; 38:5390-5397. [PMID: 36271855 PMCID: PMC9750128 DOI: 10.1093/bioinformatics/btac692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 09/19/2022] [Accepted: 10/20/2022] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION Interactions between proteins help us understand how genes are functionally related and how they contribute to phenotypes. Experiments provide imperfect 'ground truth' information about a small subset of potential interactions in a specific biological context, which can then be extended to the whole genome across different contexts, such as conditions, tissues or species, through machine learning methods. However, evaluating the performance of these methods remains a critical challenge. Here, we propose to evaluate the generalizability of gene characterizations through the shape of performance curves. RESULTS We identify Functional Equivalence Classes (FECs), subsets of annotated and unannotated genes that jointly drive performance, by assessing the presence of straight lines in ROC curves built from gene-centric prediction tasks, such as function or interaction predictions. FECs are widespread across data types and methods, they can be used to evaluate the extent and context-specificity of functional annotations in a data-driven manner. For example, FECs suggest that B cell markers can be decomposed into shared primary markers (10-50 genes), and tissue-specific secondary markers (100-500 genes). In addition, FECs suggest the existence of functional modules that span a wide range of the genome, with marker sets spanning at most 5% of the genome and data-driven extensions of Gene Ontology sets spanning up to 40% of the genome. Simple to assess visually and statistically, the identification of FECs in performance curves paves the way for novel functional characterization and increased robustness in the definition of functional gene sets. AVAILABILITY AND IMPLEMENTATION Code for analyses and figures is available at https://github.com/yexilein/pyroc. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stephan Fischer
- Cold Spring Harbor Laboratory, Stanley Institute for Cognitive Genomics, Cold Spring Harbor, NY 11724, USA
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris F-75015, France
| | - Jesse Gillis
- Cold Spring Harbor Laboratory, Stanley Institute for Cognitive Genomics, Cold Spring Harbor, NY 11724, USA
- Department of Physiology, University of Toronto, Toronto, ON, Canada
|
37
|
Mon H, Sato M, Lee JM, Kusakabe T. Construction of gene co-expression networks in cultured silkworm cells and identification of previously uncharacterized lepidopteran-specific genes required for chromosome dynamics. Insect Biochem Mol Biol 2022; 151:103875. [PMID: 36410580 DOI: 10.1016/j.ibmb.2022.103875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 11/01/2022] [Accepted: 11/11/2022] [Indexed: 06/16/2023]
Abstract
Advances in sequencing technology and bioinformatics have accelerated gene discovery and homology-based functional annotation in many species, and numerous targeted gene studies have greatly expanded the understanding of gene functions. Nevertheless, there are still many genes that lack homology with genes in other evolutionary lineages and are left as genes with unknown functions. We constructed a gene co-expression network from the Bombyx mori ovary-derived cell line, BmN4, and attempted to infer the biological roles of uncharacterized genes based on the correlation between the function-known and unknown genes. Within this network, we focused on the co-expression modules involved in chromosome architecture, dynamics, and integrity, and selected the uncharacterized genes for subsequent RNAi-based phenotypic screening. This approach enabled the identification of 5 genes whose knockdown led to abnormalities in chromosome dynamics and spindle morphology in mitosis. One of them was a recently characterized gene, BmCenp-T, which plays a central role in building the kinetochore protein complex on the silkworm holocentric chromosomes. In this study, we suggest a method for constructing the gene co-expression network and selecting candidate genes for small-scale RNAi screening. This approach is complementary to homology-based annotation and may be useful for the analysis of lineage-specific uncharacterized genes such as orphan genes.
Affiliation(s)
- Hiroaki Mon
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka, 819-0395, Japan
| | - Masanao Sato
- Laboratory of Applied Molecular Entomology, Division of Applied Bioscience, Research Faculty of Agriculture, Hokkaido University, Sapporo, 060-8589, Japan
| | - Jae Man Lee
- Laboratory of Creative Science for Insect Industries, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka, 819-0395, Japan
| | - Takahiro Kusakabe
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka, 819-0395, Japan.
|
38
|
Kim HU, Jeong H, Chung JM, Jeoung D, Hyun J, Jung HS. Comparative analysis of human and bovine thyroglobulin structures. J Anal Sci Technol 2022. [DOI: 10.1186/s40543-022-00330-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
In biology, evolutionarily conserved protein sequences show homologous physiological phenotypes in their structures and functions. If a protein has a vital function, its sequence is usually conserved across species. However, even in highly conserved proteins, small differences remain across species. Upon protein–protein interaction (PPI), conserved proteins are observed to have different binding partners, which is thought to be caused by small sequence variations in a specific domain. Thyroglobulin (TG) is the most commonly found protein in the thyroid gland of vertebrates and serves as the precursor of the thyroid hormones tetraiodothyronine and triiodothyronine, which are critical for growth, development and metabolism in vertebrates. In this study, we comparatively analyzed the sequences and structures of the highly conserved regions of TG from two different species in relation to their PPIs. To do so, we employed SIM for sequence alignment, STRING for PPI analysis and cryo-electron microscopy for 3D structural analysis. Our cryo-EM model for TG of Bos taurus, determined at 7.1 Å resolution, fitted well with the previously published cryo-EM model for Homo sapiens TG. By demonstrating overall structural homology between TGs from different species, we propose that local amino acid sequence variation is sufficient to alter PPIs specific to the organism. We predict that our result will contribute to a deeper understanding of the evolutionary pattern applicable to many other proteins.
|
39
|
Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:6854872. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
- Correspondence address. Mihaly Varadi, PDBe team, Wellcome Trust Genome Campus, Saffron Walden CB10 1SA, UK.
| | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Clemente Borges
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Andras Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy; Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Swiss Cancer Center Leman, Lausanne 1005, Switzerland
| | - Tamas Hegedus
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Robbie Joosten
- Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
| | - Dmitry Molodenskiy
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Steven L Salzberg
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Markus J Sommer
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Martin Steinegger
- School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
| | - Erzsebet Suhajda
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Dmitri Svergun
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Luiggi Tenorio-Ku
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Silvio Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Andrew Mark Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Christine Orengo
- Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
|
40
|
Manriquez‐Sandoval E, Fried SD. DomainMapper: Accurate domain structure annotation including those with non-contiguous topologies. Protein Sci 2022; 31:e4465. [PMID: 36208126 PMCID: PMC9601794 DOI: 10.1002/pro.4465] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/12/2022] [Revised: 09/30/2022] [Accepted: 10/03/2022] [Indexed: 11/11/2022]
Abstract
Automated domain annotation is an important tool for structural informatics. These pipelines typically involve searching query sequences against hidden Markov model (HMM) profiles, yielding matches to profiles for various domains. However, domain annotation can be ambiguous or inaccurate when proteins contain domains with non-contiguous residue ranges, and especially when insertional domains are hosted within them. Here, we present DomainMapper, an algorithm that accurately assigns a unique domain structure annotation to a query sequence, including those with complex topologies. We validate our domain assignments using the AlphaFold database and confirm that non-contiguity is pervasive (10.74% of all domains in yeast and 4.52% in human). Using this resource, we find that certain folds have strong propensities to be non-contiguous or insertional across the Tree of Life. DomainMapper is freely available and can be run as a single command-line function.
Affiliation(s)
| | - Stephen D. Fried
- T. C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
- Department of Chemistry, Johns Hopkins University, Baltimore, MD, USA
|
41
|
Jagtap S, Pirayre A, Bidard F, Duval L, Malliaros FD. BRANEnet: embedding multilayer networks for omics data integration. BMC Bioinformatics 2022; 23:429. [PMID: 36245002 PMCID: PMC9575224 DOI: 10.1186/s12859-022-04955-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Accepted: 08/24/2022] [Indexed: 11/10/2022] Open
Abstract
Background Gene expression is regulated at different molecular levels, including chromatin accessibility, transcription, RNA maturation, and transport. These regulatory mechanisms have strong connections with cellular metabolism. In order to study the cellular system and its functioning, omics data at each molecular level can be generated and efficiently integrated. Here, we propose BRANEnet, a novel multi-omics integration framework for multilayer heterogeneous networks. BRANEnet is an expressive, scalable, and versatile method to learn node embeddings, leveraging random walk information within a matrix factorization framework. Our goal is to efficiently integrate multi-omics data to study different regulatory aspects of multilayered processes that occur in organisms. We evaluate our framework using multi-omics data of Saccharomyces cerevisiae, a well-studied yeast model organism. Results We test BRANEnet on transcriptomics (RNA-seq) and targeted metabolomics (NMR) data for wild-type yeast strain during a heat-shock time course of 0, 20, and 120 min. Our framework learns features for differentially expressed bio-molecules showing heat stress response. We demonstrate the applicability of the learned features for targeted omics inference tasks: transcription factor (TF)-target prediction, integrated omics network (ION) inference, and module identification. The performance of BRANEnet is compared to existing network integration methods. Our model outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04955-w.
Affiliation(s)
- Surabhi Jagtap
- Université Paris-Saclay, CentraleSupélec, Inria, 3 Rue Joliot Curie, 91190, Gif-Sur-Yvette, France; IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Aurélie Pirayre
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Frédérique Bidard
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Laurent Duval
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Fragkiskos D Malliaros
- Université Paris-Saclay, CentraleSupélec, Inria, 3 Rue Joliot Curie, 91190, Gif-Sur-Yvette, France.
|
42
|
Lai WY, Wong Z, Chang CH, Samian MR, Watanabe N, Teh AH, Noordin R, Ong EBB. Identifying Leptospira interrogans putative virulence factors with a yeast protein expression screen. Appl Microbiol Biotechnol 2022; 106:6567-6581. [DOI: 10.1007/s00253-022-12160-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 08/17/2022] [Accepted: 08/30/2022] [Indexed: 11/24/2022]
|
43
|
Fer E, McGrath KM, Guy L, Hockenberry AJ, Kaçar B. Early divergence of translation initiation and elongation factors. Protein Sci 2022; 31:e4393. [PMID: 36250475 PMCID: PMC9601768 DOI: 10.1002/pro.4393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 07/05/2022] [Accepted: 07/11/2022] [Indexed: 11/18/2022]
Abstract
Protein translation is a foundational attribute of all living cells. The translation function carried out by the ribosome critically depends on an assortment of protein interaction partners, collectively referred to as the translation machinery. Various studies suggest that the diversification of the translation machinery occurred prior to the last universal common ancestor, yet it is unclear whether the predecessors of the extant translation machinery factors were functionally distinct from their modern counterparts. Here we reconstructed the shared ancestral trajectory and subsequent evolution of essential translation factor GTPases, elongation factor EF-Tu (aEF-1A/eEF-1A), and initiation factor IF2 (aIF5B/eIF5B). Based upon their similar functions and structural homologies, it has been proposed that EF-Tu and IF2 emerged from an ancient common ancestor. We generated the phylogenetic tree of IF2 and EF-Tu proteins and reconstructed ancestral sequences corresponding to the deepest nodes in their shared evolutionary history, including the last common IF2 and EF-Tu ancestor. By identifying the residue and domain substitutions, as well as structural changes along the phylogenetic history, we developed an evolutionary scenario for the origins, divergence and functional refinement of EF-Tu and IF2 proteins. Our analyses suggest that the common ancestor of IF2 and EF-Tu was an IF2-like GTPase protein. Given the central importance of the translation machinery to all cellular life, its earliest evolutionary constraints and trajectories are key to characterizing the universal constraints and capabilities of cellular evolution.
Affiliation(s)
- Evrim Fer
- Department of Bacteriology, University of Wisconsin‐Madison, Madison, Wisconsin, USA
- Microbiology Doctoral Training Program, University of Wisconsin‐Madison, Madison, Wisconsin, USA
- NASA Center for Early Life and Evolution, University of Wisconsin‐Madison, Madison, Wisconsin, USA
| | - Kaitlyn M. McGrath
- Department of Bacteriology, University of Wisconsin‐Madison, Madison, Wisconsin, USA
- NASA Center for Early Life and Evolution, University of Wisconsin‐Madison, Madison, Wisconsin, USA
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
| | - Lionel Guy
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Adam J. Hockenberry
- Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, USA
| | - Betül Kaçar
- Department of Bacteriology, University of Wisconsin‐Madison, Madison, Wisconsin, USA
- NASA Center for Early Life and Evolution, University of Wisconsin‐Madison, Madison, Wisconsin, USA
|
44
|
Escudeiro P, Henry CS, Dias RP. Functional characterization of prokaryotic dark matter: the road so far and what lies ahead. Curr Res Microb Sci 2022; 3:100159. [PMID: 36561390 PMCID: PMC9764257 DOI: 10.1016/j.crmicr.2022.100159] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 07/18/2022] [Accepted: 08/05/2022] [Indexed: 12/25/2022] Open
Abstract
Eight-hundred thousand to one trillion prokaryotic species may inhabit our planet. Yet, fewer than two-hundred thousand prokaryotic species have been described. This uncharted fraction of microbial diversity, and its undisclosed coding potential, is known as the "microbial dark matter" (MDM). Next-generation sequencing has allowed to collect a massive amount of genome sequence data, leading to unprecedented advances in the field of genomics. Still, harnessing new functional information from the genomes of uncultured prokaryotes is often limited by standard classification methods. These methods often rely on sequence similarity searches against reference genomes from cultured species. This hinders the discovery of unique genetic elements that are missing from the cultivated realm. It also contributes to the accumulation of prokaryotic gene products of unknown function among public sequence data repositories, highlighting the need for new approaches for sequencing data analysis and classification. Increasing evidence indicates that these proteins of unknown function might be a treasure trove of biotechnological potential. Here, we outline the challenges, opportunities, and the potential hidden within the functional dark matter (FDM) of prokaryotes. We also discuss the pitfalls surrounding molecular and computational approaches currently used to probe these uncharted waters, and discuss future opportunities for research and applications.
Affiliation(s)
- Pedro Escudeiro
- BioISI - Instituto de Biosistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal
- Christopher S. Henry
- Argonne National Laboratory, Lemont, Illinois, USA; University of Chicago, Chicago, Illinois, USA
- Ricardo P.M. Dias
- BioISI - Instituto de Biosistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal; iXLab - Innovation for National Biological Resilience, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal; Corresponding author.
45
Sorokina M, Belapure J, Tüting C, Paschke R, Papasotiriou I, Rodrigues JP, Kastritis PL. An Electrostatically-steered Conformational Selection Mechanism Promotes SARS-CoV-2 Spike Protein Variation. J Mol Biol 2022; 434:167637. [PMID: 35595165 PMCID: PMC9112565 DOI: 10.1016/j.jmb.2022.167637] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2022] [Revised: 04/28/2022] [Accepted: 05/06/2022] [Indexed: 12/16/2022]
Abstract
Two years after the outbreak, the COVID-19 pandemic remains a global public health emergency. SARS-CoV-2 variants with substitutions on the spike (S) protein continue to emerge, increasing the risk of immune evasion and cross-species transmission. Here, we analyzed the evolution of the S protein as recorded in 276,712 samples collected before the start of vaccination efforts. Our analysis shows that most variants destabilize the S protein trimer, increase its conformational heterogeneity and improve the odds of recognition by the host cell receptor. The most frequent substitutions promote overall hydrophobicity by replacing charged amino acids, reducing stabilizing local interactions in the unbound S protein trimer. Moreover, our results identify "forbidden" regions that rarely show any sequence variation and that are related to conformational changes occurring upon fusion. These results are significant for understanding the structure and function of SARS-CoV-2-related proteins, which is a critical step in vaccine development and epidemiological surveillance.
Affiliation(s)
- Marija Sorokina
- Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle/Saale, Germany; RGCC International GmbH, Baarerstrasse 95, Zug 6300, Switzerland; BioSolutions GmbH, Weinbergweg 22, 06120 Halle/Saale, Germany
- Jaydeep Belapure
- Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3a, 06120 Halle/Saale, Germany
- Christian Tüting
- Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3a, 06120 Halle/Saale, Germany
- Reinhard Paschke
- BioSolutions GmbH, Weinbergweg 22, 06120 Halle/Saale, Germany; Biozentrum, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle/Saale, Germany
- Panagiotis L. Kastritis
- Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle/Saale, Germany; Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3a, 06120 Halle/Saale, Germany; Biozentrum, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle/Saale, Germany; Corresponding author at: Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle/Saale, Germany
46
Wang S, Wu R, Lu J, Jiang Y, Huang T, Cai YD. Protein-protein interaction networks as miners of biological discovery. Proteomics 2022; 22:e2100190. [PMID: 35567424 DOI: 10.1002/pmic.202100190] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/28/2021] [Revised: 03/28/2022] [Accepted: 04/29/2022] [Indexed: 11/12/2022]
Abstract
Protein-protein interactions (PPIs) form the basis of a myriad of biological pathways and mechanisms, such as the formation of protein complexes or the components of signaling cascades. Here, we reviewed experimental methods for identifying PPI pairs, including yeast two-hybrid, mass spectrometry, co-localization, and co-immunoprecipitation. Furthermore, a range of computational methods leveraging biochemical properties, evolutionary history, protein structures and more have enabled identification of additional PPIs. Given the wealth of known PPIs, we reviewed important network methods to construct and analyze networks of PPIs. These methods aid biological discovery by identifying hub genes and dynamic changes in the network, and have been widely applied in various fields of biological research. Lastly, we discussed the challenges and future direction of research utilizing the power of PPI networks.
Affiliation(s)
- Steven Wang
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Runxin Wu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jiaqi Lu
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN, USA
- Yijia Jiang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tao Huang
- Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
47
Challenges in Serologic Diagnostics of Neglected Human Systemic Mycoses: An Overview on Characterization of New Targets. Pathogens 2022; 11:pathogens11050569. [PMID: 35631090 PMCID: PMC9143782 DOI: 10.3390/pathogens11050569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2022] [Revised: 04/18/2022] [Accepted: 04/21/2022] [Indexed: 12/04/2022] Open
Abstract
Systemic mycoses have been viewed as neglected diseases and they are responsible for deaths and disabilities around the world. Rapid, low-cost, simple, highly-specific and sensitive diagnostic tests are critical components of patient care, disease control and active surveillance. However, the diagnosis of fungal infections represents a great challenge because of the decline in the expertise needed for identifying fungi, and a reduced number of instruments and assays specific to fungal identification. Unfortunately, time of diagnosis is one of the most important risk factors for mortality rates from many of the systemic mycoses. In addition, phenotypic and biochemical identification methods are often time-consuming, which has created an increasing demand for new methods of fungal identification. In this review, we discuss the current context of the diagnosis of the main systemic mycoses and propose alternative approaches for the identification of new targets for fungal pathogens, which can help in the development of new diagnostic tests.
48
Gisriel CJ, Brudvig GW. Comparison of PsbQ and Psb27 in photosystem II provides insight into their roles. PHOTOSYNTHESIS RESEARCH 2022; 152:177-191. [PMID: 35001227 PMCID: PMC9271139 DOI: 10.1007/s11120-021-00888-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/12/2021] [Accepted: 11/24/2021] [Indexed: 06/14/2023]
Abstract
Photosystem II (PSII) catalyzes the oxidation of water at its active site, which harbors a high-valent inorganic Mn4CaOx cluster called the oxygen-evolving complex (OEC). Extrinsic subunits generally serve to protect the OEC from reductants and stabilize the structure, but diversity in the extrinsic subunits exists between phototrophs. Recent cryo-electron microscopy experiments have provided new molecular structures of PSII with varied extrinsic subunits. We focus on the extrinsic subunit PsbQ, which binds to the mature PSII complex, and on Psb27, an extrinsic subunit involved in PSII biogenesis. PsbQ and Psb27 share a similar binding site and have a four-helix bundle tertiary structure, suggesting they are related. Here, we use sequence alignments, structural analyses, and binding simulations to compare PsbQ and Psb27 from different organisms. We find no evidence that PsbQ and Psb27 are related despite their similar structures and binding sites. Evolutionary divergence within PsbQ homologs from different lineages is high, probably due to their interactions with other extrinsic subunits that themselves exhibit vast diversity between lineages. This may result in functional variation, as exemplified by large differences in their calculated binding energies. Psb27 homologs generally exhibit less divergence, which may be due to stronger evolutionary selection for certain residues that maintain its function during PSII biogenesis; this is consistent with their more similar calculated binding energies between organisms. Previous experimental inconsistencies, low-confidence binding simulations, and recent structural data suggest that Psb27 is likely to exhibit flexibility that may be an important characteristic of its activity. The analysis provides insight into the functions and evolution of PsbQ and Psb27, and an unusual example of proteins with similar tertiary structures and binding sites that probably serve different roles.
Affiliation(s)
- Gary W Brudvig
- Department of Chemistry, Yale University, New Haven, CT, 06520, USA.
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA.
49
Mark Mondol S, Das D, Priom DM, Shaminur Rahman M, Rafiul Islam M, Rahaman MM. In Silico Identification and Characterization of a Hypothetical Protein From Rhodobacter capsulatus Revealing S-Adenosylmethionine-Dependent Methyltransferase Activity. Bioinform Biol Insights 2022; 16:11779322221094236. [PMID: 35478993 PMCID: PMC9036352 DOI: 10.1177/11779322221094236] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/07/2022] [Accepted: 03/25/2022] [Indexed: 11/15/2022] Open
Abstract
Rhodobacter capsulatus is a purple non-sulfur bacterium widely used as a model organism to study bacterial photosynthesis. It exhibits extensive metabolic activities and demonstrates other distinctive characteristics such as pleomorphism and nitrogen-fixing capability. It can act as a gene transfer agent (GTA). Its commercial importance rests on the production of the polyester polyhydroxyalkanoate (PHA), extracellular nucleic acids, and commercially critical single-cell proteins. These diverse features make the organism an exciting one to study, with both environmental and industrial importance. This study aimed to characterize, model, and annotate the function of a hypothetical protein (Accession no. CAA71016.1) of R capsulatus through computational analysis. The urf7 gene encodes the protein. The tertiary structure was predicted with MODELLER, followed by energy minimization and refinement with the YASARA Energy Minimization Server and GalaxyRefine tools. Analysis of sequence similarity, evolutionary relationship, and exploration of domain, family, and superfamily inferred that the protein has S-adenosylmethionine (SAM)-dependent methyltransferase activity. This was further verified by active site prediction with the CASTp server and by molecular docking of the predicted tertiary structure of the protein with its ligands (SAM and SAH) using the Autodock Vina tool and the PatchDock server. Normally, as gene products of the photosynthetic gene cluster (PGC), SAM-dependent methyltransferases have established roles in bacteriochlorophyll and carotenoid biosynthesis. But the STRING database unveiled its association with NADH-ubiquinone oxidoreductase (Complex I). The assembly and regulation of Complex I is mediated by the gene products of the nuo operon. As a part of this operon, the urf7 gene encodes a SAM-dependent methyltransferase. As a consequence of these findings, it is reasonable to propose that the hypothetical protein of interest in this study is a SAM-dependent methyltransferase associated with bacterial NADH-ubiquinone oxidoreductase assembly. Because Complex I is conserved from prokaryotes to eukaryotes, R capsulatus can serve as a model organism for understanding the common disorders linked to dysfunctions of Complex I.
Affiliation(s)
- Depro Das
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
- M Shaminur Rahman
- Department of Microbiology, Jashore University of Science and Technology, Jashore, Bangladesh. M Shaminur Rahman is now affiliated to Department of Microbiology, University of Dhaka, Dhaka, Bangladesh
- M Rafiul Islam
- Department of Microbiology, University of Dhaka, Dhaka, Bangladesh
50
Pascarelli S, Laurino P. Inter-paralog amino acid inversion events in large phylogenies of duplicated proteins. PLoS Comput Biol 2022; 18:e1010016. [PMID: 35377869 PMCID: PMC9009777 DOI: 10.1371/journal.pcbi.1010016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2022] [Revised: 04/14/2022] [Accepted: 03/12/2022] [Indexed: 11/25/2022] Open
Abstract
Connecting protein sequence to function is becoming increasingly relevant since high-throughput sequencing studies accumulate large amounts of genomic data. In order to go beyond the existing database annotation, it is fundamental to understand the mechanisms underlying functional inheritance and divergence. If the homology relationship between proteins is known, can we determine whether the function diverged? In this work, we analyze different possibilities of protein sequence evolution after gene duplication and identify “inter-paralog inversions”, i.e., sites where the relationship between the ancestry and the functional signal is decoupled. The amino acids in these sites are masked from being recognized by other prediction tools. Still, they play a role in functional divergence and could indicate a shift in protein function. We develop a method to specifically recognize inter-paralog amino acid inversions in a phylogeny and test it on real and simulated datasets. In a dataset built from the Epidermal Growth Factor Receptor (EGFR) sequences found in 88 fish species, we identify 19 amino acid sites that went through inversion after gene duplication, mostly located at the ligand-binding extracellular domain. Our work uncovers an outcome of protein duplications with direct implications in protein functional annotation and sequence evolution. The developed method is optimized to work with large protein datasets and can be readily included in a targeted protein analysis pipeline.
Proteins are critical components of living systems because they facilitate most biological processes like protein synthesis, DNA replication, chemical catalysis, etc. Proteins are encoded in their genes. During evolution, genes accumulate mutations that get translated at the protein level. These mutations can be “neutral” if they do not affect the protein function immediately and directly; otherwise, mutations can be functional if they directly modify protein function. An event that provides an opportunity to study protein function is gene duplication, namely when two copies of a gene encoding the same protein appear. One copy of the protein often retains the same function while the other is free to diverge and specialize to a different function. This work sheds light on an alternative outcome of gene duplication that might be critical to discern between neutral and functional mutations. By looking at 88 fish genomes, we found proteins in which the evolution of their sequences does not follow the expected pattern of divergence after gene duplication. In this case, the protein sequence of a subgroup of species diverges in the copy expected to retain its function, while the sequence is retained in the expectedly divergent one. We called this event “inter-paralog amino acid inversion”. Our data show that this “inversion” event is correlated to function, and its detection has to be considered for assigning protein functions correctly.
Affiliation(s)
- Stefano Pascarelli
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan
- Paola Laurino
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan