1
Geist JL, Lee CY, Strom JM, de Jesús Naveja J, Luck K. Generation of a high confidence set of domain-domain interface types to guide protein complex structure predictions by AlphaFold. Bioinformatics 2024; 40:btae482. [PMID: 39171834 PMCID: PMC11361816 DOI: 10.1093/bioinformatics/btae482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 07/10/2024] [Accepted: 08/20/2024] [Indexed: 08/23/2024] [Open Access]
Abstract
Motivation: While the release of AlphaFold (AF) represented a breakthrough for the prediction of protein complex structures, its sensitivity, especially when using full length protein sequences, still remains limited. Modeling success rates might increase if AF predictions were guided by likely interacting protein fragments. This approach requires available sets of highly confident protein-protein interface types. Computational resources, such as 3did, infer interacting globular domain types from observed contacts in protein structures. Assessing the accuracy of these predicted interface types is difficult because we lack hand-curated reference sets of verified domain-domain interface (DDI) types. Results: To improve protein complex modeling of DDIs by AF, we manually inspected 80 randomly selected DDI types from the 3did resource to generate a first reference set of DDI types. Identified cases of DDI type nonapproval (40%) primarily resulted from inaccurate Pfam domain matches, crystal contacts, and synthetic protein constructs. Using logistic regression, we predicted a subset of 2411 out of 5724 considered DDI types in 3did to be of high confidence, which we subsequently applied to 53 000 human protein interactions to predict DDIs followed by AF modeling. We obtained highly confident AF models for 604 out of 1129 predicted DDIs. Of note, for 47% of them no confident AF structural model could be obtained using full length protein sequences. Availability and implementation: Code is available at https://github.com/KatjaLuckLab/DDI_manuscript.
Affiliation(s)
- Chop Yan Lee
- Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
- José de Jesús Naveja
- Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
- 3rd Medical Department, University Medical Center, Johannes Gutenberg University Mainz, Mainz 55131, Germany
- University Cancer Center, University Medical Center, Johannes Gutenberg University Mainz, Mainz 55131, Germany
- Katja Luck
- Institute of Molecular Biology (IMB) gGmbH, Mainz 55128, Germany
2
Xiao MS, Damodaran AP, Kumari B, Dickson E, Xing K, On TA, Parab N, King HE, Perez AR, Guiblet WM, Duncan G, Che A, Chari R, Andresson T, Vidigal JA, Weatheritt RJ, Aregger M, Gonatopoulos-Pournatzis T. Genome-scale exon perturbation screens uncover exons critical for cell fitness. Mol Cell 2024; 84:2553-2572.e19. [PMID: 38917794 PMCID: PMC11246229 DOI: 10.1016/j.molcel.2024.05.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 04/04/2024] [Accepted: 05/24/2024] [Indexed: 06/27/2024]
Abstract
CRISPR-Cas technology has transformed functional genomics, yet understanding of how individual exons differentially shape cellular phenotypes remains limited. Here, we optimized and conducted massively parallel exon deletion and splice-site mutation screens in human cell lines to identify exons that regulate cellular fitness. Fitness-promoting exons are prevalent in essential and highly expressed genes and commonly overlap with protein domains and interaction interfaces. Conversely, fitness-suppressing exons are enriched in nonessential genes, exhibiting lower inclusion levels, and overlap with intrinsically disordered regions and disease-associated mutations. In-depth mechanistic investigation of the screen-hit TAF5 alternative exon-8 revealed that its inclusion is required for assembly of the TFIID general transcription initiation complex, thereby regulating global gene expression output. Collectively, our orthogonal exon perturbation screens established a comprehensive repository of phenotypically important exons and uncovered regulatory mechanisms governing cellular fitness and gene expression.
Affiliation(s)
- Mei-Sheng Xiao
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Arun Prasath Damodaran
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA.
- Bandana Kumari
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Ethan Dickson
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Kun Xing
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Tyler A On
- Molecular Targets Program, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Nikhil Parab
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Helen E King
- EMBL Australia and Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Alexendar R Perez
- Laboratory of Biochemistry and Molecular Biology, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD 20892, USA; Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, CA 94143, USA
- Wilfried M Guiblet
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA
- Gerard Duncan
- Protein Characterization Laboratory, Frederick National Laboratory for Cancer Research (FNLCR), Frederick, MD 21701, USA
- Anney Che
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research (FNLCR), Frederick, MD 21701, USA
- Raj Chari
- Genome Modification Core, Frederick National Laboratory for Cancer Research (FNLCR), Frederick, MD 21702, USA
- Thorkell Andresson
- Protein Characterization Laboratory, Frederick National Laboratory for Cancer Research (FNLCR), Frederick, MD 21701, USA
- Joana A Vidigal
- Laboratory of Biochemistry and Molecular Biology, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD 20892, USA
- Robert J Weatheritt
- EMBL Australia and Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2010, Australia
- Michael Aregger
- Molecular Targets Program, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA.
- Thomas Gonatopoulos-Pournatzis
- RNA Biology Laboratory, Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Frederick, MD 21702, USA.
3
Dapkūnas J, Timinskas A, Olechnovič K, Tomkuvienė M, Venclovas Č. PPI3D: a web server for searching, analyzing and modeling protein-protein, protein-peptide and protein-nucleic acid interactions. Nucleic Acids Res 2024; 52:W264-W271. [PMID: 38619046 PMCID: PMC11223826 DOI: 10.1093/nar/gkae278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 03/19/2024] [Accepted: 04/03/2024] [Indexed: 04/16/2024] [Open Access]
Abstract
Structure-resolved protein interactions with other proteins, peptides and nucleic acids are key for understanding molecular mechanisms. The PPI3D web server enables researchers to query preprocessed and clustered structural data, analyze the results and make homology-based inferences for protein interactions. PPI3D offers three interaction exploration modes: (i) all interactions for proteins homologous to the query, (ii) interactions between two proteins or their homologs and (iii) interactions within a specific PDB entry. The server allows interactive analysis of the identified interactions in both summarized and detailed manner. This includes protein annotations, structures, the interface residues and the corresponding contact surface areas. In addition, users can make inferences about residues at the interaction interface for the query protein(s) from the sequence alignments and homology models. The weekly updated PPI3D database includes all the interaction interfaces and binding sites from PDB, clustered based on both protein sequence and structural similarity, yielding non-redundant datasets without loss of alternative interaction modes. Consequently, the PPI3D users avoid being flooded with redundant information, a typical situation for intensely studied proteins. Furthermore, PPI3D provides a possibility to download user-defined sets of interaction interfaces and analyze them locally. The PPI3D web server is available at https://bioinformatics.lt/ppi3d.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Albertas Timinskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France
- Miglė Tomkuvienė
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
4
Su Z, Dhusia K, Wu Y. Encoding the space of protein-protein binding interfaces by artificial intelligence. Comput Biol Chem 2024; 110:108080. [PMID: 38643609 DOI: 10.1016/j.compbiolchem.2024.108080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/03/2024] [Accepted: 04/17/2024] [Indexed: 04/23/2024]
Abstract
The physical interactions between proteins are largely determined by the structural properties at their binding interfaces. It was found that the binding interfaces in distinctive protein complexes are highly similar. The structural properties underlying different binding interfaces could be further captured by artificial intelligence. In order to test this hypothesis, we broke protein-protein binding interfaces into pairs of interacting fragments. We employed a generative model to encode these interface fragment pairs in a low-dimensional latent space. After training, new conformations of interface fragment pairs were generated. We found that, by only using a small number of interface fragment pairs that were generated by artificial intelligence, we were able to guide the assembly of protein complexes into their native conformations. These results demonstrate that the conformational space of fragment pairs at protein-protein binding interfaces is highly degenerate. Features in this degenerate space can be well characterized by artificial intelligence. In summary, our machine learning method will be potentially useful to search for and predict the conformations of unknown protein-protein interactions.
Affiliation(s)
- Zhaoqian Su
- Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN 37212, USA
- Kalyani Dhusia
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA.
5
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. bioRxiv 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
- Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard College, Cambridge MA, USA
- Sabrina Phanor
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
- Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
- TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
- Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
- Department of Biology, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
6
Banik M, Paudel KR, Majumder R, Idrees S. Prediction of virus-host interactions and identification of hot spot residues of DENV-2 and SH3 domain interactions. Arch Microbiol 2024; 206:162. [PMID: 38483579 PMCID: PMC10940428 DOI: 10.1007/s00203-024-03892-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 02/08/2024] [Accepted: 02/08/2024] [Indexed: 03/17/2024]
Abstract
Dengue virus, particularly serotype 2 (DENV-2), poses a significant global health threat, and understanding the molecular basis of its interactions with host cell proteins is imperative for developing targeted therapeutic strategies. This study elucidated the interactions between proline-enriched motifs and Src homology 3 (SH3) domain. The SH3 domain is pivotal in mediating protein-protein interactions, particularly by recognizing and binding to proline-rich regions in partner proteins. Through a computational pipeline, we analyzed the interactions and binding modes of proline-enriched motifs with SH3 domains, identified new potential DENV-2 interactions with the SH3 domain, and revealed potential hot spot residues, underscoring their significance in the viral life cycle. This comprehensive analysis provides crucial insights into the molecular basis of DENV-2 infection, highlighting conserved and serotype-specific interactions. The identified hot spot residues offer potential targets for therapeutic intervention, laying the foundation for developing antiviral strategies against Dengue virus infection. These findings contribute to the broader understanding of viral-host interactions and provide a roadmap for future research on Dengue virus pathogenesis and treatment.
Affiliation(s)
- Mithila Banik
- Department of Bioinformatics and Biotechnology, Asian University for Women, Chattogram, Bangladesh
- Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, NSW, Australia
- Rajib Majumder
- Applied Bioscience, Macquarie University, Sydney, NSW, Australia
- Sobia Idrees
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, NSW, Australia.
7
Idrees S, Paudel KR. Proteome-wide assessment of human interactome as a source of capturing domain-motif and domain-domain interactions. J Cell Commun Signal 2024; 18:e12014. [PMID: 38545252 PMCID: PMC10964934 DOI: 10.1002/ccs3.12014] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/19/2023] [Accepted: 12/11/2023] [Indexed: 06/29/2024] [Open Access]
Abstract
Protein-protein interactions (PPIs) play a crucial role in various biological processes by establishing domain-motif (DMI) and domain-domain interactions (DDIs). While the existence of real DMIs/DDIs is generally assumed, it is rarely tested; therefore, this study extensively compared high-throughput methods and public PPI repositories as sources for DMI and DDI prediction based on the assumption that the human interactome provides sufficient data for the reliable identification of DMIs and DDIs. Different datasets from leading high-throughput methods (Yeast two-hybrid [Y2H], Affinity Purification coupled Mass Spectrometry [AP-MS], and Co-fractionation-coupled Mass Spectrometry) were assessed for their ability to capture DMIs and DDIs using known DMI/DDI information. High-throughput methods were not notably worse than PPI databases and, in some cases, appeared better. In conclusion, all PPI datasets demonstrated significant enrichment in DMIs and DDIs (p-value <0.001), establishing Y2H and AP-MS as reliable methods for predicting these interactions. This study provides valuable insights for biologists in selecting appropriate methods for predicting DMIs, ultimately aiding in SLiM discovery.
Affiliation(s)
- Sobia Idrees
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
- Keshav Raj Paudel
- Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
8
Desai S, Ahmad S, Bawaskar B, Rashmi S, Mishra R, Lakhwani D, Dutt A. Singleton mutations in large-scale cancer genome studies: uncovering the tail of cancer genome. NAR Cancer 2024; 6:zcae010. [PMID: 38487301 PMCID: PMC10939354 DOI: 10.1093/narcan/zcae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 02/23/2024] [Indexed: 03/17/2024] [Open Access]
Abstract
Singleton or low-frequency driver mutations are challenging to identify. We present a domain driver mutation estimator (DOME) to identify rare candidate driver mutations. DOME analyzes positions analogous to known statistical hotspots and resistant mutations in combination with their functional and biochemical residue context as determined by protein structures and somatic mutation propensity within conserved PFAM domains, integrating the CADD scoring scheme. Benchmarked against seven other tools, DOME exhibited superior or comparable accuracy compared to all evaluated tools in the prediction of functional cancer drivers, with the exception of one tool. DOME identified a unique set of 32 917 high-confidence predicted driver mutations from the analysis of whole proteome missense variants within domain boundaries across 1331 genes, including 1192 noncancer gene census genes, emphasizing its unique place in cancer genome analysis. Additionally, analysis of 8799 TCGA (The Cancer Genome Atlas) and in-house tumor samples revealed 847 potential driver mutations, with mutations in tyrosine kinase members forming the dominant burden, underscoring its higher significance in cancer. Overall, DOME complements current approaches for identifying novel, low-frequency drivers and resistant mutations in personalized therapy.
Affiliation(s)
- Sanket Desai
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai 400094, Maharashtra, India
- Suhail Ahmad
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai 400094, Maharashtra, India
- Bhargavi Bawaskar
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Sonal Rashmi
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Rohit Mishra
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Deepika Lakhwani
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Amit Dutt
- Integrated Cancer Genomics Laboratory, Advanced Centre for Treatment, Research, and Education in Cancer, Kharghar, Navi Mumbai 410210, Maharashtra, India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai 400094, Maharashtra, India
- Department of Genetics, University of Delhi, South Campus, New Delhi 110021, India
9
Hkimi C, Kamoun S, Khamessi O, Ghedira K. Mycobacterium tuberculosis-THP-1 like macrophages protein-protein interaction map revealed through dual RNA-seq analysis and a computational approach. J Med Microbiol 2024; 73. [PMID: 38314675 DOI: 10.1099/jmm.0.001803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024] [Open Access]
Abstract
Introduction. Infection caused by Mycobacterium tuberculosis (M. tb) is still a leading cause of mortality worldwide, with an estimated 1.4 million deaths annually. Hypothesis/Gap statement. Despite macrophages' ability to kill bacteria, M. tb can grow inside these innate immune cells, and the exploration of the infection has traditionally been characterized by a one-sided relationship, concentrating solely on the host or examining the pathogen in isolation. Aim. Because only a handful of M. tb-host interactions have been experimentally characterized, our main goal is to predict protein-protein interactions during the early phases of the infection. Methodology. In this work, we performed an integrative computational approach that exploits differentially expressed genes obtained from Dual RNA-seq analysis combined with known domain-domain interactions. Results. A total of 2381 and 7214 genes were identified as differentially expressed in M. tb and in THP-1-like macrophages, respectively, revealing different transcriptional profiles in response to infection. Over 48 h of infection, the host-pathogen network revealed 25 016 PPIs. Analysis of the resulting predicted network, based on cellular localization information of M. tb proteins, indicated the implication of interacting nodes including the bacterial PE/PPE/PE_PGRS family. In addition, M. tb proteins interacted with host proteins involved in the NF-kB signalling pathway as well as interfering with the host apoptosis ability via the potential interaction of M. tb TB16.3 with human TAB1 and M. tb GroEL2 with host protein kinase C delta, respectively. Conclusion. The prediction of the full range of interactions between M. tb and host will contribute to better understanding of the pathogenesis of this bacterium and may provide advanced approaches to explore new therapeutic targets against tuberculosis.
Affiliation(s)
- Chaima Hkimi
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR20IPT09), Pasteur Institute of Tunis, Tunis 1002, Tunisia
- Higher Institute of Biotechnology of Sidi Thabet, University of Manouba, Ariana BP-66, Manouba 2010, Tunisia
| | - Selim Kamoun
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR20IPT09), Pasteur Institute of Tunis, Tunis 1002, Tunisia
- Higher Institute of Biotechnology of Sidi Thabet, University of Manouba, Ariana BP-66, Manouba 2010, Tunisia
| | - Oussema Khamessi
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR20IPT09), Pasteur Institute of Tunis, Tunis 1002, Tunisia
- Higher Institute of Biotechnology of Sidi Thabet, University of Manouba, Ariana BP-66, Manouba 2010, Tunisia
| | - Kais Ghedira
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR20IPT09), Pasteur Institute of Tunis, Tunis 1002, Tunisia
| |
Collapse
|
10
|
Lee CY, Hubrich D, Varga JK, Schäfer C, Welzel M, Schumbera E, Djokic M, Strom JM, Schönfeld J, Geist JL, Polat F, Gibson TJ, Keller Valsecchi CI, Kumar M, Schueler-Furman O, Luck K. Systematic discovery of protein interaction interfaces using AlphaFold and experimental validation. Mol Syst Biol 2024; 20:75-97. [PMID: 38225382 PMCID: PMC10883280 DOI: 10.1038/s44320-023-00005-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 12/04/2023] [Accepted: 12/05/2023] [Indexed: 01/17/2024] Open
Abstract
Structural resolution of protein interactions enables mechanistic and functional studies as well as interpretation of disease variants. However, structural data is still missing for most protein interactions because we lack computational and experimental tools at scale. This is particularly true for interactions mediated by short linear motifs occurring in disordered regions of proteins. We find that AlphaFold-Multimer predicts with high sensitivity but limited specificity structures of domain-motif interactions when using small protein fragments as input. Sensitivity decreased substantially when using long protein fragments or full length proteins. We delineated a protein fragmentation strategy particularly suited for the prediction of domain-motif interfaces and applied it to interactions between human proteins associated with neurodevelopmental disorders. This enabled the prediction of highly confident and likely disease-related novel interfaces, which we further experimentally corroborated for FBXO23-STX1B, STX1B-VAMP2, ESRRG-PSMC5, PEX3-PEX19, PEX3-PEX16, and SNRPB-GIGYF1, providing novel molecular insights for diverse biological processes. Our work highlights exciting perspectives, but also reveals clear limitations and the need for future developments to maximize the power of AlphaFold-Multimer for interface predictions.
Affiliation(s)
- Chop Yan Lee
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Dalmira Hubrich
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Julia K Varga
  - Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, 9112001, Israel
- Mareen Welzel
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Eric Schumbera
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
  - Computational Biology and Data Mining Group Biozentrum I, 55128, Mainz, Germany
- Milena Djokic
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Joelle M Strom
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Jonas Schönfeld
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Johanna L Geist
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Feyza Polat
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
- Toby J Gibson
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, 69117, Germany
- Manjeet Kumar
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, 69117, Germany
- Ora Schueler-Furman
  - Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, 9112001, Israel
- Katja Luck
  - Institute of Molecular Biology (IMB) gGmbH, 55128, Mainz, Germany
11
Michalik I, Kuder KJ. Machine Learning Methods in Protein-Protein Docking. Methods Mol Biol 2024; 2780:107-126. [PMID: 38987466 DOI: 10.1007/978-1-0716-3985-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
An exponential increase in the number of publications that address artificial intelligence (AI) usage in life sciences has been noticed in recent years, while new modeling techniques are constantly being reported. The potential of these methods is vast, from understanding fundamental cellular processes to discovering new drugs and breakthrough therapies. Computational studies of protein-protein interactions, crucial for understanding the operation of biological systems, are no exception in this field. However, despite the rapid development of technology and the progress in developing new approaches, many aspects remain challenging to solve, such as predicting conformational changes in proteins, or more "trivial" issues such as obtaining high-quality data in huge quantities. Therefore, this chapter focuses on a short introduction to various AI approaches to study protein-protein interactions, followed by a description of the most up-to-date algorithms and programs used for this purpose. Yet, given the considerable pace of development in this hot area of computational science, at the time you read this chapter, the further development of the algorithms described, or the emergence of new (and better) ones, should come as no surprise.
Affiliation(s)
- Ilona Michalik
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, Kraków, Poland
- Kamil J Kuder
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, Kraków, Poland
12
Idrees S, Paudel KR. Bioinformatics prediction and screening of viral mimicry candidates through integrating known and predicted DMI data. Arch Microbiol 2023; 206:30. [PMID: 38117335 DOI: 10.1007/s00203-023-03764-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/28/2023] [Revised: 11/15/2023] [Accepted: 11/20/2023] [Indexed: 12/21/2023]
Abstract
Domain-motif interactions (DMIs) represent transient bonds formed when a Short Linear Motif (SLiM) engages a globular domain via a compact contact interface. Understanding the mechanics of DMIs is critical for understanding diverse regulatory processes and deciphering how various viruses hijack host cellular machinery. However, identifying DMIs through traditional in vitro and in vivo experiments is challenging due to their degenerate nature and small contact areas. Predictions often carry a high rate of false positives, necessitating rigorous in silico validation before embarking on experimental work. This study assessed the binding energy changes in predicted SLiM instances through in silico peptide exchange experiments, elucidating how they interact with known 3D DMI complexes. We identified a subset of potential mimicry candidates that exhibited effective binding affinities with native DMI structures, suggesting their potential to be true mimicry candidates. The identified viral SLiMs can be potential targets in developing therapeutics, opening new opportunities for innovative treatments that can be finely tuned to address the complex molecular underpinnings of various diseases. To gain a comprehensive understanding of the identified DMIs, it is imperative to conduct further validation through experimental approaches.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, Faculty of Science, School of Life Sciences, Sydney, NSW, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, Faculty of Science, School of Life Sciences, Sydney, NSW, Australia
13
Wang Y, Zhou B, Ru J, Meng X, Wang Y, Liu W. Advances in computational methods for identifying cancer driver genes. Math Biosci Eng 2023; 20:21643-21669. [PMID: 38124614 DOI: 10.3934/mbe.2023958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Cancer driver genes (CDGs) are crucial in cancer prevention, diagnosis and treatment. This study employed computational methods for identifying CDGs, categorizing them into four groups. The major frameworks for each of these four categories were summarized. Additionally, we systematically gathered data from public databases and biological networks, and we elaborated on computational methods for identifying CDGs using the aforementioned databases. Further, we summarized the algorithms, mainly involving statistics and machine learning, used for identifying CDGs. Notably, the performances of nine typical identification methods for eight types of cancer were compared to analyze the applicability areas of these methods. Finally, we discussed the challenges and prospects associated with methods for identifying CDGs. The present study revealed that the network-based algorithms and machine learning-based methods demonstrated superior performance.
Affiliation(s)
- Ying Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Bohao Zhou
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Jidong Ru
  - School of Textile Garment and Design, Changshu Institute of Technology, Changshu 215500, China
- Xianglian Meng
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
- Yundong Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Wenjie Liu
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
14
Rosilan NF, Waiho K, Fazhan H, Sung YY, Zakaria NH, Afiqah-Aleng N, Mohamed-Hussein ZA. Current trends of host-pathogen relationship in shrimp infectious disease via computational protein-protein interaction: A bibliometric analysis. Fish Shellfish Immunol 2023; 142:109171. [PMID: 37858788 DOI: 10.1016/j.fsi.2023.109171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 10/12/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023]
Abstract
Protein-protein interactions (PPIs) are essential for understanding cell physiology in normal and pathological conditions, as they may be involved in all cellular processes. PPIs have been widely used to elucidate the pathobiology of human and plant diseases. Therefore, they can also be used to unveil the pathobiology of infectious diseases in shrimp, which are among the high-risk factors influencing the success or failure of shrimp production. PPI network analysis, specifically host-pathogen PPI (HP-PPI), provides insights into the molecular interactions between the shrimp and pathogens. This review quantitatively analyzed the research trends within this field through bibliometric analysis using specific keywords, countries, authors, organizations, journals, and documents. The analysis screened 206 records from the Scopus database for eligibility, resulting in 179 papers that were retrieved for bibliometric analysis. The analysis revealed that China and Thailand were the driving forces behind this specific field of research and frequently collaborated with the United States. Aquaculture and Diseases of Aquatic Organisms were the prominent sources for publications in this field. The main keywords identified included "white spot syndrome virus," "WSSV," and "shrimp." We discovered that studies on HP-PPI are currently quite scarce. As a result, we further discussed the significance of HP-PPI by highlighting various approaches that have been previously adopted. These findings not only emphasize the importance of HP-PPI but also pave the way for future researchers to explore the pathogenesis of infectious diseases in shrimp. By doing so, preventative measures and enhanced treatment strategies can be identified.
Affiliation(s)
- Nur Fathiah Rosilan
  - Institute of Climate Adaptation and Marine Biotechnology (ICAMB), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia
- Khor Waiho
  - Higher Institution Centre of Excellence (HICoE), Institute of Tropical Aquaculture and Fisheries, Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia; Centre for Chemical Biology, Universiti Sains Malaysia, Minden, 11900, Penang, Malaysia; Department of Aquaculture, Faculty of Fisheries, Kasetsart University, 10900, Bangkok, Thailand
- Hanafiah Fazhan
  - Higher Institution Centre of Excellence (HICoE), Institute of Tropical Aquaculture and Fisheries, Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia; Centre for Chemical Biology, Universiti Sains Malaysia, Minden, 11900, Penang, Malaysia; Department of Aquaculture, Faculty of Fisheries, Kasetsart University, 10900, Bangkok, Thailand
- Yeong Yik Sung
  - Institute of Climate Adaptation and Marine Biotechnology (ICAMB), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia
- Nor Hafizah Zakaria
  - Institute of Climate Adaptation and Marine Biotechnology (ICAMB), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia
- Nor Afiqah-Aleng
  - Institute of Climate Adaptation and Marine Biotechnology (ICAMB), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia
- Zeti-Azura Mohamed-Hussein
  - UKM Medical Molecular Biology Institute, UKM Medical Centre, Jalan Yaacob Latiff, 56000, Cheras, Kuala Lumpur, Malaysia; Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Selangor, Malaysia
15
Varadi M, Tsenkov M, Velankar S. Challenges in bridging the gap between protein structure prediction and functional interpretation. Proteins 2023. [PMID: 37850517 DOI: 10.1002/prot.26614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]
Abstract
The rapid evolution of protein structure prediction tools has significantly broadened access to protein structural data. Although predicted structure models have the potential to accelerate and impact fundamental and translational research significantly, it is essential to note that they are not validated and cannot be considered the ground truth. Thus, challenges persist, particularly in capturing protein dynamics, predicting multi-chain structures, interpreting protein function, and assessing model quality. Interdisciplinary collaborations are crucial to overcoming these obstacles. Databases like the AlphaFold Protein Structure Database, the ESM Metagenomic Atlas, and initiatives like the 3D-Beacons Network provide FAIR access to these data, enabling their interpretation and application across a broader scientific community. Whilst substantial advancements have been made in protein structure prediction, further progress is required to address the remaining challenges. Developing training materials, nurturing collaborations, and ensuring open data sharing will be paramount in this pursuit. The continued evolution of these tools and methodologies will deepen our understanding of protein function and accelerate disease pathogenesis and drug development discoveries.
Affiliation(s)
- Mihaly Varadi
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maxim Tsenkov
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
16
Jayaprakash A, Roy A, Thanmalagan RR, Arunachalam A, P T V L. Understanding the mechanism of pathogenicity through interactome studies between Arachis hypogaea L. and Aspergillus flavus. J Proteomics 2023; 287:104975. [PMID: 37482270 DOI: 10.1016/j.jprot.2023.104975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 06/28/2023] [Accepted: 07/15/2023] [Indexed: 07/25/2023]
Abstract
Aspergillus flavus (A. flavus) infects peanut seeds during the pre- and post-harvest stages, destroying seed quality for human and livestock consumption. Even though many resistant varieties have been developed, the molecular mechanism of defense interactions of peanut against A. flavus still needs further investigation. Hence, an interologous host-pathogen protein interaction (HPPI) network was constructed to understand the subcellular level interaction mechanism between peanut and A. flavus. Out of the top 10 hub proteins of both organisms, protein phosphatase 2C, cyclic nucleotide-binding/kinase domain-containing protein and different ribosomal proteins were identified as candidate proteins involved in defense. Functional annotation and subcellular localization-based characterization of the HPPI identified protein SGT1 homolog, calmodulin and Rac-like GTP-binding proteins as involved in the defense response against the fungus. The relevance of the HPPI in infectious conditions was assessed using two transcriptome datasets, which identified the interplay of host kinase class R proteins, bHLH TFs and cell wall-related proteins to impart resistance against pathogen infection. Further, the pathogenicity analysis identified glycogen phosphorylase and molecular chaperone and allergen Mod-E/Hsp90/Hsp1 as potential pathogen targets to enhance the host defense mechanism. Hence, the computationally predicted host-pathogen PPI network could provide valuable support for molecular biology experiments to understand the host-pathogen interaction.
SIGNIFICANCE: Protein-protein interactions execute significant cellular interactions in an organism and are influenced majorly by stress conditions. Here we reported the host-pathogen protein-protein interaction between peanut and A. flavus, and a detailed network analysis based on function, subcellular localization, gene co-expression, and pathogenicity was performed. The network analysis identified key proteins such as host kinase class R proteins, calmodulin, SGT1 homolog, Rac-like GTP-binding proteins, bHLH TFs and cell wall-related proteins that impart resistance against pathogen infection. We observed predominantly the interplay of defense-related proteins and cell wall-related proteins, which could be subjected to further studies. The network analysis described in this study could be applied to understand other host-pathogen systems generally.
Affiliation(s)
- Aiswarya Jayaprakash
  - Department of Bioinformatics, School of Life Sciences, Pondicherry University, R. V. Nagar Kalapet, Pondicherry 605014, India
- Abhijeet Roy
  - Department of Bioinformatics, School of Life Sciences, Pondicherry University, R. V. Nagar Kalapet, Pondicherry 605014, India
- Raja Rajeswary Thanmalagan
  - Department of Bioinformatics, School of Life Sciences, Pondicherry University, R. V. Nagar Kalapet, Pondicherry 605014, India
- Annamalai Arunachalam
  - Department of Food Science & Technology, School of Life Sciences, Pondicherry University, R. V. Nagar Kalapet, Pondicherry 605014, India
- Lakshmi P T V
  - Department of Bioinformatics, School of Life Sciences, Pondicherry University, R. V. Nagar Kalapet, Pondicherry 605014, India
17
Mohseni Behbahani Y, Saighi P, Corsi F, Laine E, Carbone A. LEVELNET to visualize, explore, and compare protein-protein interaction networks. Proteomics 2023; 23:e2200159. [PMID: 37403279 DOI: 10.1002/pmic.202200159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 07/06/2023]
Abstract
Physical interactions between proteins are central to all biological processes. Yet, the current knowledge of who interacts with whom in the cell and in what manner relies on partial, noisy, and highly heterogeneous data. Thus, there is a need for methods comprehensively describing and organizing such data. LEVELNET is a versatile and interactive tool for visualizing, exploring, and comparing protein-protein interaction (PPI) networks inferred from different types of evidence. LEVELNET helps to break down the complexity of PPI networks by representing them as multi-layered graphs and by facilitating the direct comparison of their subnetworks toward biological interpretation. It focuses primarily on the protein chains whose 3D structures are available in the Protein Data Bank. We showcase some potential applications, such as investigating the structural evidence supporting PPIs associated with specific biological processes, assessing the co-localization of interaction partners, comparing the PPI networks obtained through computational experiments versus homology transfer, and creating PPI benchmarks with desired properties.
Affiliation(s)
- Yasser Mohseni Behbahani
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Paul Saighi
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Flavia Corsi
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Alessandra Carbone
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
18
Vitting-Seerup K. Most protein domains exist as variants with distinct functions across cells, tissues and diseases. NAR Genom Bioinform 2023; 5:lqad084. [PMID: 37745975 PMCID: PMC10516350 DOI: 10.1093/nargab/lqad084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/09/2023] [Accepted: 09/05/2023] [Indexed: 09/26/2023] Open
Abstract
Protein domains are the active subunits that provide proteins with specific functions through precise three-dimensional structures. Such domains facilitate most protein functions, including molecular interactions and signal transduction. Currently, these protein domains are described and analyzed as invariable molecular building blocks with fixed functions. Here, I show that most human protein domains exist as multiple distinct variants termed 'domain isotypes'. Domain isotypes are used in a cell, tissue and disease-specific manner and have surprisingly different 3D structures. Accordingly, domain isotypes, compared to each other, modulate or abolish the functionality of protein domains. These results challenge the current view of protein domains as invariable building blocks and have significant implications for both wet- and dry-lab workflows. The extensive use of protein domain isotypes within protein isoforms adds to the literature indicating we need to transition to an isoform-centric research paradigm.
Affiliation(s)
- Kristoffer Vitting-Seerup
  - The Bioinformatics Section, Department of Health Technology, The Technical University of Denmark (DTU), Denmark
19
Vadnala RN, Hannenhalli S, Narlikar L, Siddharthan R. Transcription factors organize into functional groups on the linear genome and in 3D chromatin. Heliyon 2023; 9:e18211. [PMID: 37520992 PMCID: PMC10382302 DOI: 10.1016/j.heliyon.2023.e18211] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/03/2022] [Revised: 07/11/2023] [Accepted: 07/11/2023] [Indexed: 08/01/2023] Open
Abstract
Transcription factors (TFs) and their binding sites have evolved to interact cooperatively or competitively with each other. Here we examine in detail, across multiple cell lines, such cooperation or competition among TFs both in sequential and spatial proximity (using chromatin conformation capture assays), considering in vivo binding data as well as TF binding motifs in DNA. We ascertain significantly co-occurring ("attractive") or avoiding ("repulsive") TF pairs using robust randomized models that retain the essential characteristics of the experimental data. Across human cell lines, TFs organize into two groups, with intra-group attraction and inter-group repulsion. This is true for both sequential and spatial proximity, and for both in vivo binding and sequence motifs. Attractive TF pairs exhibit significantly more physical interactions, suggesting an underlying mechanism. The two TF groups differ significantly in their genomic and network properties, as well as in their function: while one group regulates housekeeping functions, the other potentially regulates lineage-specific functions that are disrupted in cancer. Weaker binding sites tend to occur in spatially interacting regions of the genome. Our results suggest that a complex pattern of spatial cooperativity of TFs and chromatin has evolved with the genome to support housekeeping and lineage-specific functions.
Affiliation(s)
- Rakesh Netha Vadnala
  - The Institute of Mathematical Sciences, Chennai, India
  - Homi Bhabha National Institute, Mumbai, India
- Leelavati Narlikar
  - Department of Data Science, Indian Institute of Science Education and Research, Pune, India
- Rahul Siddharthan
  - The Institute of Mathematical Sciences, Chennai, India
  - Homi Bhabha National Institute, Mumbai, India
20
Kang JE, Jun JH, Kwon JH, Lee JH, Hwang K, Kim S, Jeong N. Arabidopsis Transcription Regulatory Factor Domain/Domain Interaction Analysis Tool-Liquid/Liquid Phase Separation, Oligomerization, GO Analysis: A Toolkit for Interaction Data-Based Domain Analysis. Genes (Basel) 2023; 14:1476. [PMID: 37510380 PMCID: PMC10379056 DOI: 10.3390/genes14071476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 07/04/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023] Open
Abstract
Although a large number of databases are available for regulatory elements, a bottleneck has been created by the lack of bioinformatics tools to predict the interaction modes of regulatory elements. To reduce this gap, we developed the Arabidopsis Transcription Regulatory Factor Domain/Domain Interaction Analysis Tool-liquid/liquid phase separation (LLPS), oligomerization, GO analysis (ART FOUNDATION-LOG), a useful toolkit for protein-nucleic acid interaction (PNI) and protein-protein interaction (PPI) analysis based on domain-domain interactions (DDIs). LLPS, protein oligomerization, the structural properties of protein domains, and protein modifications are major components in the orchestration of the spatiotemporal dynamics of PPIs and PNIs. Our goal is to integrate PPI/PNI information into the development of a prediction model for identifying important genetic variants in peaches. Our program unified interdatabase relational keys based on protein domains to facilitate inference from the model species. A key advantage of this program lies in the integrated information of related features, such as protein oligomerization, LOG analysis, structural characterizations of domains (e.g., domain linkers, intrinsically disordered regions, DDIs, domain-motif (peptide) interactions, beta sheets, and transmembrane helices), and post-translational modification. We provided simple tests to demonstrate how to use this program, which can be applied to other eukaryotic organisms.
Affiliation(s)
- Jee Eun Kang
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Ji Hae Jun
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Jung Hyun Kwon
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Ju-Hyun Lee
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Kidong Hwang
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Sungjong Kim
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Namhee Jeong
  - Fruit Research Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
21
Zheng J, Yang X, Huang Y, Yang S, Wuchty S, Zhang Z. Deep learning-assisted prediction of protein-protein interactions in Arabidopsis thaliana. Plant J 2023; 114:984-994. [PMID: 36919205 DOI: 10.1111/tpj.16188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 02/20/2023] [Accepted: 03/09/2023] [Indexed: 05/27/2023]
Abstract
Currently, the experimentally identified interactome of Arabidopsis (Arabidopsis thaliana) is still far from complete, suggesting that computational prediction methods can complement experimental techniques. Motivated by the prosperity and success of deep learning algorithms and natural language processing techniques, we introduce an integrative deep learning framework, DeepAraPPI, allowing us to predict protein-protein interactions (PPIs) of Arabidopsis utilizing sequence, domain and Gene Ontology (GO) information. Our current DeepAraPPI comprises: (i) a word2vec encoding-based Siamese recurrent convolutional neural network (RCNN) model; (ii) a Domain2vec encoding-based multiple-layer perceptron (MLP) model; and (iii) a GO2vec encoding-based MLP model. Finally, DeepAraPPI combines the prediction results of the three individual predictors through a logistic regression model. Compiling high-quality positive and negative training and test samples by applying strict filtering strategies, DeepAraPPI shows superior performance compared with existing state-of-the-art Arabidopsis PPI prediction methods. DeepAraPPI also provides better cross-species predictive ability in rice (Oryza sativa) than traditional machine learning methods, although the overall performance in cross-species prediction remains to be improved. DeepAraPPI is freely accessible at http://zzdlab.com/deeparappi/. In the meantime, we have also made the source code and data sets of DeepAraPPI available at https://github.com/zjy1125/DeepAraPPI.
Affiliation(s)
- Jingyan Zheng
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Xiaodi Yang
  - Department of Hematology, Peking University First Hospital, Beijing, 100034, China
- Yan Huang
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Miami, FL, 33146, USA
  - Department of Biology, University of Miami, Miami, FL, 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, 33136, USA
  - Institute of Data Science and Computing, University of Miami, Miami, FL, 33146, USA
- Ziding Zhang
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
22
Luther CH, Brandt P, Vylkova S, Dandekar T, Müller T, Dittrich M. Integrated analysis of SR-like protein kinases Sky1 and Sky2 links signaling networks with transcriptional regulation in Candida albicans. Front Cell Infect Microbiol 2023; 13:1108235. [PMID: 37082713 PMCID: PMC10111165 DOI: 10.3389/fcimb.2023.1108235] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/25/2022] [Accepted: 03/01/2023] [Indexed: 04/07/2023] Open
Abstract
Fungal infections are a major global health burden, and Candida albicans is among the most common fungal pathogens in humans and a common cause of invasive candidiasis. Fungal phenotypes, such as those related to morphology, proliferation and virulence, are mainly driven by gene expression, which is primarily regulated by kinase signaling cascades. Serine-arginine (SR) protein kinases are highly conserved among eukaryotes and are involved in major transcriptional processes in human and S. cerevisiae. Candida albicans harbors two SR protein kinases: while Sky2 is important for metabolic adaptation, Sky1 has similar functions as in S. cerevisiae. To investigate the role of these SR kinases in the regulation of transcriptional responses in C. albicans, we performed RNA sequencing of sky1Δ and sky2Δ and integrated a comprehensive phosphoproteome dataset of these mutants. Using a Systems Biology approach, we study transcriptional regulation in the context of kinase signaling networks. Transcriptomic enrichment analysis indicates that pathways involved in the regulation of gene expression are downregulated and mitochondrial processes are upregulated in sky1Δ. In sky2Δ, primarily metabolic processes are affected, especially for arginine, and we observed that arginine-induced hyphae formation is impaired in sky2Δ. In addition, our analysis identifies several transcription factors as potential drivers of the transcriptional response. Among these, a core set is shared between both kinase knockouts, but it appears to regulate different subsets of target genes. To elucidate these diverse regulatory patterns, we created network modules by integrating the data of site-specific protein phosphorylation and gene expression with kinase-substrate predictions and protein-protein interactions. These integrated signaling modules reveal shared parts but also highlight specific patterns characteristic of each kinase.
Interestingly, the modules contain many proteins involved in fungal morphogenesis and stress response. Accordingly, experimental phenotyping shows a higher resistance to Hygromycin B for sky1Δ. Thus, our study demonstrates that a combination of computational approaches with integration of experimental data can offer a new systems biological perspective on the complex network of signaling and transcription. With that, the investigation of the interface between signaling and transcriptional regulation in C. albicans provides a deeper insight into how cellular mechanisms can shape the phenotype.
Affiliation(s)
- Christian H. Luther
  - University of Würzburg, Department of Bioinformatics, Biocenter/Am Hubland 97074, Würzburg, Germany
- Philipp Brandt
  - Septomics Research Center, Friedrich Schiller University and Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute, Jena, Germany
- Slavena Vylkova
  - Septomics Research Center, Friedrich Schiller University and Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute, Jena, Germany
- Thomas Dandekar
  - University of Würzburg, Department of Bioinformatics, Biocenter/Am Hubland 97074, Würzburg, Germany
- Tobias Müller
  - University of Würzburg, Department of Bioinformatics, Biocenter/Am Hubland 97074, Würzburg, Germany
- Marcus Dittrich
  - University of Würzburg, Department of Bioinformatics, Biocenter/Am Hubland 97074, Würzburg, Germany
  - University of Würzburg, Institute of Human Genetics, Biocenter/Am Hubland 97074, Würzburg, Germany
  - *Correspondence: Marcus Dittrich
23
Duhan N, Kaundal R. HuCoPIA: An Atlas of Human vs. SARS-CoV-2 Interactome and the Comparative Analysis with Other Coronaviridae Family Viruses. Viruses 2023; 15:492. [PMID: 36851706 PMCID: PMC9962590 DOI: 10.3390/v15020492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 02/01/2023] [Accepted: 02/04/2023] [Indexed: 02/12/2023] Open
Abstract
SARS-CoV-2, a novel betacoronavirus strain, has caused a pandemic that has claimed the lives of nearly 6.7M people worldwide. Vaccines and medicines are being developed around the world to reduce the disease spread, fatality rates, and control the new variants. Understanding the protein-protein interaction mechanism of SARS-CoV-2 in humans, and their comparison with the previous SARS-CoV and MERS strains, is crucial for these efforts. These interactions might be used to assess vaccination effectiveness, diagnose exposure, and produce effective biotherapeutics. Here, we present the HuCoPIA database, which contains approximately 100,000 protein-protein interactions between humans and three strains (SARS-CoV-2, SARS-CoV, and MERS) of betacoronavirus. The interactions in the database are divided into common interactions between all three strains and those unique to each strain. It also contains relevant functional annotation information of human proteins. The HuCoPIA database contains SARS-CoV-2 (41,173), SARS-CoV (31,997), and MERS (26,862) interactions, with functional annotation of human proteins like subcellular localization, tissue-expression, KEGG pathways, and Gene ontology information. We believe HuCoPIA will serve as an invaluable resource to diverse experimental biologists, and will help to advance the research in better understanding the mechanism of betacoronaviruses.
Affiliation(s)
- Naveen Duhan
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Department of Computer Science, College of Science, Utah State University, Logan, UT 84322, USA
24
Karan B, Mahapatra S, Sahu SS, Pandey DM, Chakravarty S. Computational models for prediction of protein-protein interaction in rice and Magnaporthe grisea. Frontiers in Plant Science 2023; 13:1046209. [PMID: 36816487 PMCID: PMC9929577 DOI: 10.3389/fpls.2022.1046209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 12/28/2022] [Indexed: 06/18/2023]
Abstract
INTRODUCTION Plant-microbe interactions play a vital role in the development of strategies to manage pathogen-induced destructive diseases that cause enormous crop losses every year. Rice blast is one of the severe diseases to rice Oryza sativa (O. sativa) due to Magnaporthe grisea (M. grisea) fungus. Protein-protein interaction (PPI) between rice and fungus plays a key role in causing rice blast disease. METHODS In this paper, four genomic information-based models such as (i) the interolog, (ii) the domain, (iii) the gene ontology, and (iv) the phylogenetic-based model are developed for predicting the interaction between O. sativa and M. grisea in a whole-genome scale. RESULTS AND DISCUSSION A total of 59,430 interacting pairs between 1,801 rice proteins and 135 blast fungus proteins are obtained from the four models. Furthermore, a machine learning model is developed to assess the predicted interactions. Using composition-based amino acid composition (AAC) and conjoint triad (CT) features, an accuracy of 88% and 89% is achieved, respectively. When tested on the experimental dataset, the CT feature provides the highest accuracy of 95%. Furthermore, the specificity of the model is verified with other pathogen-host datasets where less accuracy is obtained, which confirmed that the model is specific to O. sativa and M. grisea. Understanding the molecular processes behind rice resistance to blast fungus begins with the identification of PPIs, and these predicted PPIs will be useful for drug design in the plant science community.
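The two sequence encodings named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the seven-class residue grouping is the one commonly used for conjoint triad encoding and is assumed here, since the abstract does not state it; the function names are hypothetical; and plain frequencies are returned without any further normalization the paper may apply.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven residue classes commonly used for conjoint triad (CT) encoding.
# Assumed grouping; the abstract does not specify the one used in the paper.
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CT_CLASSES) for aa in group}


def aac_features(sequence):
    """Amino acid composition: 20 fractions, one per standard residue."""
    counts = Counter(sequence)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def ct_features(sequence):
    """Conjoint triad: frequencies of the 7**3 = 343 class triads."""
    # Non-standard residues are dropped before triads are formed, so a
    # triad may span the position of a skipped residue.
    classes = [RESIDUE_TO_CLASS[aa] for aa in sequence if aa in RESIDUE_TO_CLASS]
    triads = Counter(zip(classes, classes[1:], classes[2:]))
    total = sum(triads.values())
    return [triads[key] / total if total else 0.0
            for key in product(range(7), repeat=3)]


def pair_features(host_seq, pathogen_seq):
    """Feature vector for a protein pair: the two CT vectors concatenated."""
    return ct_features(host_seq) + ct_features(pathogen_seq)


if __name__ == "__main__":
    print(len(aac_features("MKTAYIAKQR")), len(pair_features("MKTAYIAKQR", "MSDEC")))
```

Either vector, or the concatenated pair vector, could then be passed to any standard classifier to score a candidate interaction.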
Affiliation(s)
- Biswajit Karan
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Satyajit Mahapatra
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Sitanshu Sekhar Sahu
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Dev Mani Pandey
  - Department of Bioengineering and Biotechnology, Birla Institute of Technology, Ranchi, India
- Sumit Chakravarty
  - Department of Electrical and Computer Engineering, Kennesaw State University, Kennesaw, GA, United States
25
Towards a structurally resolved human protein interaction network. Nat Struct Mol Biol 2023; 30:216-225. [PMID: 36690744 PMCID: PMC9935395 DOI: 10.1038/s41594-022-00910-8] [Citation(s) in RCA: 70] [Impact Index Per Article: 70.0] [Received: 02/11/2022] [Accepted: 12/14/2022] [Indexed: 01/25/2023]
Abstract
Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.
26
Ozdemir ES, Nussinov R. Pathogen-driven cancers from a structural perspective: Targeting host-pathogen protein-protein interactions. Front Oncol 2023; 13:1061595. [PMID: 36910650 PMCID: PMC9997845 DOI: 10.3389/fonc.2023.1061595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 02/06/2023] [Indexed: 02/25/2023] Open
Abstract
Host-pathogen interactions (HPIs) affect and involve multiple mechanisms in both the pathogen and the host. Pathogen interactions disrupt homeostasis in host cells, with their toxins interfering with host mechanisms, resulting in infections, diseases, and disorders, extending from AIDS and COVID-19, to cancer. Studies of the three-dimensional (3D) structures of host-pathogen complexes aim to understand how pathogens interact with their hosts. They also aim to contribute to the development of rational therapeutics, as well as preventive measures. However, structural studies are fraught with challenges toward these aims. This review describes the state-of-the-art in protein-protein interactions (PPIs) between the host and pathogens from the structural standpoint. It discusses computational aspects of predicting these PPIs, including machine learning (ML) and artificial intelligence (AI)-driven, and overviews available computational methods and their challenges. It concludes with examples of how theoretical computational approaches can result in a therapeutic agent with a potential of being used in the clinics, as well as future directions.
Affiliation(s)
- Emine Sila Ozdemir
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health & Science University, Portland, OR, United States
- Ruth Nussinov
  - Cancer Innovation Laboratory, Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, MD, United States
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
27
Zheng J, Yang X, Zhang Z. Using PlaPPISite to Predict and Analyze Plant Protein-Protein Interaction Sites. Methods Mol Biol 2023; 2690:385-399. [PMID: 37450161 DOI: 10.1007/978-1-0716-3327-4_30] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
Proteome-wide characterization of protein-protein interactions (PPIs) is crucial to understand the functional roles of protein machinery within cells systematically. With the accumulation of PPI data in different plants, the interaction details of binary PPIs, such as the three-dimensional (3D) structural contexts of interaction sites/interfaces, are urgently demanded. To meet this requirement, we have developed a comprehensive and easy-to-use database called PlaPPISite ( http://zzdlab.com/plappisite/index.php ) to present interaction details for 13 plant interactomes. Here, we provide a clear guide on how to search and view protein interaction details through the PlaPPISite database. Firstly, the running environment of our database is introduced. Secondly, the input file format is briefly introduced. Moreover, we discussed which information related to interaction sites can be achieved through several examples. In addition, some notes about PlaPPISite are also provided. More importantly, we would like to emphasize the importance of interaction site information in plant systems biology through this user guide of PlaPPISite. In particular, the easily accessible 3D structures of PPIs in the coming post-AlphaFold2 era will definitely boost the application of plant interactome to decipher the molecular mechanisms of many fundamental biological issues.
Affiliation(s)
- Jingyan Zheng
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, China
- Xiaodi Yang
  - Department of Hematology, Peking University First Hospital, Beijing, China
- Ziding Zhang
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, China
28
Lio CT, Grabert G, Louadi Z, Fenn A, Baumbach J, Kacprowski T, List M, Tsoy O. Systematic analysis of alternative splicing in time course data using Spycone. Bioinformatics 2022; 39:6965022. [PMID: 36579860 PMCID: PMC9831059 DOI: 10.1093/bioinformatics/btac846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2022] [Revised: 11/16/2022] [Accepted: 12/28/2022] [Indexed: 12/30/2022] Open
Abstract
MOTIVATION During disease progression or organism development, alternative splicing may lead to isoform switches that demonstrate similar temporal patterns and reflect the alternative splicing co-regulation of such genes. Tools for dynamic process analysis usually neglect alternative splicing. RESULTS Here, we propose Spycone, a splicing-aware framework for time course data analysis. Spycone exploits a novel isoform switch (IS) detection algorithm and offers downstream analysis such as network and gene set enrichment. We demonstrate the performance of Spycone using simulated and real-world data of SARS-CoV-2 infection. AVAILABILITY AND IMPLEMENTATION The Spycone package is available as a PyPI package. The source code of Spycone is available under the GPLv3 license at https://github.com/yollct/spycone and the documentation at https://spycone.readthedocs.io/en/latest/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chit Tong Lio
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, Hamburg 22607, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising 85354, Germany
- Gordon Grabert
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig 38106, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig 38106, Germany
- Zakaria Louadi
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, Hamburg 22607, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising 85354, Germany
- Amit Fenn
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, Hamburg 22607, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising 85354, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, Hamburg 22607, Germany
  - Institute of Mathematics and Computer Science, University of Southern Denmark, Odense 5000, Denmark
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig 38106, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig 38106, Germany
- Olga Tsoy
  - To whom correspondence should be addressed.
29
Yu H, Li L, Huffman A, Beverley J, Hur J, Merrell E, Huang HH, Wang Y, Liu Y, Ong E, Cheng L, Zeng T, Zhang J, Li P, Liu Z, Wang Z, Zhang X, Ye X, Handelman SK, Sexton J, Eaton K, Higgins G, Omenn GS, Athey B, Smith B, Chen L, He Y. A new framework for host-pathogen interaction research. Front Immunol 2022; 13:1066733. [PMID: 36591248 PMCID: PMC9797517 DOI: 10.3389/fimmu.2022.1066733] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/11/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2022] Open
Abstract
COVID-19 often manifests with different outcomes in different patients, highlighting the complexity of the host-pathogen interactions involved in manifestations of the disease at the molecular and cellular levels. In this paper, we propose a set of postulates and a framework for systematically understanding complex molecular host-pathogen interaction networks. Specifically, we first propose four host-pathogen interaction (HPI) postulates as the basis for understanding molecular and cellular host-pathogen interactions and their relations to disease outcomes. These four postulates cover the evolutionary dispositions involved in HPIs, the dynamic nature of HPI outcomes, roles that HPI components may occupy leading to such outcomes, and HPI checkpoints that are critical for specific disease outcomes. Based on these postulates, an HPI Postulate and Ontology (HPIPO) framework is proposed to apply interoperable ontologies to systematically model and represent various granular details and knowledge within the scope of the HPI postulates, in a way that will support AI-ready data standardization, sharing, integration, and analysis. As a demonstration, the HPI postulates and the HPIPO framework were applied to study COVID-19 with the Coronavirus Infectious Disease Ontology (CIDO), leading to a novel approach to rational design of drug/vaccine cocktails aimed at interrupting processes occurring at critical host-coronavirus interaction checkpoints. Furthermore, the host-coronavirus protein-protein interactions (PPIs) relevant to COVID-19 were predicted and evaluated based on prior knowledge of curated PPIs and domain-domain interactions, and how such studies can be further explored with the HPI postulates and the HPIPO framework is discussed.
Affiliation(s)
- Hong Yu
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Li Li
  - Department of Genetics, Harvard Medical School, Boston, MA, United States
- Anthony Huffman
  - University of Michigan Medical School, Ann Arbor, MI, United States
- John Beverley
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
  - Asymmetric Operations Sector, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, United States
- Junguk Hur
  - Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, United States
- Eric Merrell
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
- Hsin-hui Huang
  - University of Michigan Medical School, Ann Arbor, MI, United States
  - Department of Biotechnology and Laboratory Science in Medicine, National Yang-Ming University, Taipei, Taiwan
- Yang Wang
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Yingtong Liu
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Edison Ong
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Liang Cheng
  - Department of Bioinformatics, Harbin Medical University, Harbin, Heilongjiang, China
- Tao Zeng
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Jingsong Zhang
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Pengpai Li
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhiping Liu
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhigang Wang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences and School of Basic Medicine, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Xiangyan Zhang
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Xianwei Ye
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Jonathan Sexton
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Kathryn Eaton
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Gerry Higgins
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Gilbert S. Omenn
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Brian Athey
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Barry Smith
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
- Luonan Chen
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Yongqun He
  - University of Michigan Medical School, Ann Arbor, MI, United States
30
Robert PA, Akbar R, Frank R, Pavlović M, Widrich M, Snapkov I, Slabodkin A, Chernigovskaya M, Scheffer L, Smorodina E, Rawat P, Mehta BB, Vu MH, Mathisen IF, Prósz A, Abram K, Olar A, Miho E, Haug DTT, Lund-Johansen F, Hochreiter S, Haff IH, Klambauer G, Sandve GK, Greiff V. Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction. Nature Computational Science 2022; 2:845-865. [PMID: 38177393 DOI: 10.1038/s43588-022-00372-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/16/2021] [Accepted: 11/09/2022] [Indexed: 01/06/2024]
Abstract
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
Affiliation(s)
- Philippe A Robert
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Rahmad Akbar
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Robert Frank
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Michael Widrich
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Igor Snapkov
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Andrei Slabodkin
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Maria Chernigovskaya
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Eva Smorodina
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Puneet Rawat
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Brij Bhushan Mehta
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mai Ha Vu
  - Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
- Aurél Prósz
  - Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
- Krzysztof Abram
  - The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
- Alex Olar
  - Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
- Enkelejda Miho
  - Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
  - aiNET GmbH, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Sepp Hochreiter
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
  - Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
- Günter Klambauer
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Victor Greiff
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
31
Gómez Borrego J, Torrent Burgas M. Analysis of Host–Bacteria Protein Interactions Reveals Conserved Domains and Motifs That Mediate Fundamental Infection Pathways. Int J Mol Sci 2022; 23:ijms231911489. [PMID: 36232803 PMCID: PMC9569774 DOI: 10.3390/ijms231911489] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 09/22/2022] [Accepted: 09/23/2022] [Indexed: 11/16/2022] Open
Abstract
Adhesion and colonization of host cells by pathogenic bacteria depend on protein–protein interactions (PPIs). These interactions are interesting from the pharmacological point of view since new molecules that inhibit host-pathogen PPIs would act as new antimicrobials. Most of these interactions are discovered using high-throughput methods that may display a high false positive rate. The absence of curation of these databases can make the available data unreliable. To address this issue, a comprehensive filtering process was developed to obtain a reliable list of domains and motifs that participate in PPIs between bacteria and human cells. From a structural point of view, our analysis revealed that human proteins involved in the interactions are rich in alpha helix and disordered regions and poorer in beta structure. Disordered regions in human proteins harbor short sequence motifs that are specifically recognized by certain domains in pathogenic proteins. The most relevant domain–domain interactions were validated by AlphaFold, showing that a proper analysis of host-pathogen PPI databases can reveal structurally conserved patterns. Domain–motif interactions, on the contrary, were more difficult to validate, since unstructured regions were involved, where AlphaFold could not make a good prediction. Moreover, these interactions are also likely accommodated by post-translational modifications, especially phosphorylation, which can potentially occur in 25–50% of host proteins. Hence, while common structural patterns are involved in host–pathogen PPIs and can be retrieved from available databases, more information is required to properly infer the full interactome. By resolving these issues, and in combination with new prediction tools like AlphaFold, new classes of antimicrobials could be discovered from a more detailed understanding of these interactions.
|
32
|
Sen N, Madhusudhan MS. A structural database of chain–chain and domain–domain interfaces of proteins. Protein Sci 2022. [DOI: 10.1002/pro.4406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Neeladri Sen
- Indian Institute of Science Education and Research Pune India
- Institute of Structural and Molecular Biology University College London London UK
|
33
|
A Regulatory Axis between Epithelial Splicing Regulatory Proteins and Estrogen Receptor α Modulates the Alternative Transcriptome of Luminal Breast Cancer. Int J Mol Sci 2022; 23:ijms23147835. [PMID: 35887187 PMCID: PMC9319905 DOI: 10.3390/ijms23147835] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/27/2022] [Revised: 07/13/2022] [Accepted: 07/14/2022] [Indexed: 11/17/2022] Open
Abstract
Epithelial splicing regulatory proteins 1 and 2 (ESRP1/2) control the splicing pattern during epithelial to mesenchymal transition (EMT) in a physiological context and in cancer, including breast cancer (BC). Here, we report that ESRP1, but not ESRP2, is overexpressed in luminal BCs of patients with poor prognosis and correlates with estrogen receptor α (ERα) levels. Analysis of ERα genome-binding profiles in cell lines and primary breast tumors showed its binding in the proximity of ESRP1 and ESRP2 genes, whose expression is strongly decreased by ERα silencing in hormone-deprived conditions. The combined knock-down of ESRP1/2 in MCF-7 cells, followed by RNA-Seq, revealed the dysregulation of 754 genes, with a widespread alteration of alternative splicing events (ASEs) of genes involved in cell signaling, metabolism, cell growth, and EMT. Functional network analysis of ASEs correlated with ESRP1/2 expression in ERα+ BCs showed RAC1 as the hub node in the protein-protein interactions altered by ESRP1/2 silencing. The comparison of ERα- and ESRP-modulated ASEs revealed 63 commonly regulated events, including 27 detected in primary BCs and endocrine-resistant cell lines. Our data support a functional implication of the ERα-ESRP1/2 axis in the onset and progression of BC by controlling the splicing patterns of related genes.
|
34
|
TritiKBdb: A Functional Annotation Resource for Deciphering the Complete Interaction Networks in Wheat-Karnal Bunt Pathosystem. Int J Mol Sci 2022; 23:ijms23137455. [PMID: 35806459 PMCID: PMC9267065 DOI: 10.3390/ijms23137455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 06/30/2022] [Accepted: 06/30/2022] [Indexed: 02/01/2023] Open
Abstract
The study of molecular interactions, especially inter-species protein-protein interactions, is crucial for understanding the mechanism of disease infection in plants. These interactions play an important role in disease infection and host immune responses against pathogen attack. Among various critical fungal diseases, the incidence of Karnal bunt (Tilletia indica) around the world has hindered the export of crops such as wheat from infected regions, thus causing substantial economic losses. Because information on T. indica is sparse, insight into the interaction mechanisms between the host and pathogen proteins during the infection process is limited. Here, we report the development of a comprehensive database and webserver, TritiKBdb, that implements various tools to study the protein-protein interactions in the Triticum species-Tilletia indica pathosystem. The novel ‘interactomics’ tool allows the user to visualize/compare the networks of the predicted interactions in an enriched manner. TritiKBdb is a user-friendly database that provides functional annotations such as subcellular localization, available domains, KEGG pathways, and GO terms of the host and pathogen proteins. Additionally, information about the host and pathogen proteins that serve as transcription factors and effectors, respectively, is also made available. We believe that TritiKBdb will serve as a beneficial resource for the research community, and aid the community in better understanding the infection mechanisms of Karnal bunt and its interactions with wheat. The database is freely available for public use at http://bioinfo.usu.edu/tritikbdb/.
|
35
|
Ghadie MA, Xia Y. Are transient protein-protein interactions more dispensable? PLoS Comput Biol 2022; 18:e1010013. [PMID: 35404956 PMCID: PMC9000134 DOI: 10.1371/journal.pcbi.1010013] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/13/2021] [Accepted: 03/11/2022] [Indexed: 12/12/2022] Open
Abstract
Protein-protein interactions (PPIs) are key drivers of cell function and evolution. While it is widely assumed that most permanent PPIs are important for cellular function, it remains unclear whether transient PPIs are equally important. Here, we estimate and compare dispensable content among transient PPIs and permanent PPIs in human. Starting with a human reference interactome mapped by experiments, we construct a human structural interactome by building three-dimensional structural models for PPIs, and then distinguish transient PPIs from permanent PPIs using several structural and biophysical properties. We map common mutations from healthy individuals and disease-causing mutations onto the structural interactome, and perform structure-based calculations of the probabilities for common mutations (assumed to be neutral) and disease mutations (assumed to be mildly deleterious) to disrupt transient PPIs and permanent PPIs. Using Bayes' theorem we estimate that a similarly small fraction (<~20%) of both transient and permanent PPIs are completely dispensable, i.e., effectively neutral upon disruption. Hence, transient and permanent interactions are subject to similarly strong selective constraints in the human interactome.
Affiliation(s)
- Yu Xia
- Department of Bioengineering, McGill University, Montreal, Canada
|
36
|
Fang H, Zhong C, Tang C. Predicting protein–protein interactions between banana and Fusarium oxysporum f. sp. cubense race 4 integrating sequence and domain homologous alignment and neural network verification. Proteome Sci 2022; 20:4. [PMID: 35351140 PMCID: PMC8962045 DOI: 10.1186/s12953-022-00186-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2021] [Accepted: 03/06/2022] [Indexed: 11/18/2022] Open
Abstract
Background: The banana pathogen Fusarium oxysporum f. sp. cubense race 4 (Foc4) infects almost all banana species and is the most destructive. The molecular mechanism of the interactions between Fusarium oxysporum and banana still needs to be further investigated. Methods: We use both the interolog and domain-domain methods to predict the protein–protein interactions (PPIs) between banana and Foc4. The predicted interacting protein sequences are encoded by the conjoint triad and autocovariance methods, respectively, to obtain continuous and discontinuous information from the protein sequences. This information is used as the input data of the neural network model. A Long Short-Term Memory (LSTM) neural network with five-fold cross-validation and independent tests is used to verify the predicted protein interaction sequences. To further confirm the PPIs between banana and Foc4, GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) functional annotation and interaction network analysis are carried out. Results: The experimental results show that the PPIs for banana and Foc4 predicted by our proposed method may interact with each other in terms of sequence structure, GO and KEGG functional annotation, and Foc4 protein plays a more active role in the process of Foc4 infecting banana. Conclusions: This study obtained the PPIs between banana and Foc4 by computational means for the first time, which will provide data support for molecular biology experiments. Supplementary Information: The online version contains supplementary material available at 10.1186/s12953-022-00186-2.
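The conjoint triad encoding named in the Methods can be illustrated with a minimal sketch. The seven residue classes below follow the grouping commonly used for this encoding and are an assumption here, since the abstract does not list them; the function name `conjoint_triad` and the plain frequency normalization are likewise illustrative and not taken from the paper.

```python
from itertools import product

# Seven residue classes commonly used for conjoint triad encoding.
# Assumption: the abstract does not state which grouping the authors used.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}

# All 7 * 7 * 7 = 343 ordered class triads, in a fixed order.
TRIADS = list(product(range(7), repeat=3))
TRIAD_INDEX = {triad: i for i, triad in enumerate(TRIADS)}


def conjoint_triad(sequence: str) -> list[float]:
    """Return a 343-dimensional triad frequency vector for one sequence.

    Residues outside the 20 standard amino acids are dropped before the
    sliding window is applied (a simplification chosen for this sketch).
    """
    classes = [RESIDUE_TO_CLASS[aa] for aa in sequence.upper() if aa in RESIDUE_TO_CLASS]
    counts = [0] * len(TRIADS)
    # Slide a window of three consecutive residues over the class sequence.
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(TRIADS)


vector = conjoint_triad("ARDC")  # two triads, each with frequency 0.5
```

A protein pair would then typically be represented by concatenating the two 343-dimensional vectors before passing them to a classifier; that step is omitted here.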
|
37
|
Deciphering the Host-Pathogen Interactome of the Wheat-Common Bunt System: A Step towards Enhanced Resilience in Next Generation Wheat. Int J Mol Sci 2022; 23:ijms23052589. [PMID: 35269732 PMCID: PMC8910311 DOI: 10.3390/ijms23052589] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/31/2022] [Accepted: 02/09/2022] [Indexed: 02/05/2023] Open
Abstract
Common bunt, caused by two fungal species, Tilletia caries and Tilletia laevis, is one of the most potentially destructive diseases of wheat. Despite the availability of synthetic chemicals against the disease, organic agriculture relies greatly on resistant cultivars. Using two computational approaches—interolog and domain-based methods—a total of approximately 58 M and 56 M probable PPIs were predicted in T. aestivum–T. caries and T. aestivum–T. laevis interactomes, respectively. We also identified 648 and 575 effectors in the interactions from T. caries and T. laevis, respectively. The major host hubs belonged to the serine/threonine protein kinase, hsp70, and mitogen-activated protein kinase families, which are actively involved in plant immune signaling during stress conditions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the host proteins revealed significant GO terms (O-methyltransferase activity, regulation of response to stimulus, and plastid envelope) and pathways (NF-kappa B signaling and the MAPK signaling pathway) related to plant defense against pathogens. Subcellular localization suggested that most of the pathogen proteins target the host in the plastid. Furthermore, a comparison between unique T. caries and T. laevis proteins was carried out. We also identified novel host candidates that are resistant to disease. Additionally, the host proteins that serve as transcription factors were also predicted.
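The interolog approach mentioned in this abstract can be sketched in a few lines: a host and a pathogen protein are predicted to interact when each has a homolog in a template interactome and those homologs are known to interact. The sketch assumes homolog assignments are already available (for example from a sequence search); all protein identifiers and the function name `predict_interologs` are hypothetical.

```python
def predict_interologs(host_homologs, pathogen_homologs, template_ppis):
    """Transfer template interactions to host-pathogen protein pairs.

    host_homologs / pathogen_homologs: dict mapping a protein to the set of
    its homologs in the template interactome.
    template_ppis: set of frozenset pairs of known template interactions.
    """
    predicted = set()
    for host, host_templates in host_homologs.items():
        for pathogen, pathogen_templates in pathogen_homologs.items():
            # Predict the pair if any homolog pair is a known interaction.
            if any(
                frozenset((a, b)) in template_ppis
                for a in host_templates
                for b in pathogen_templates
            ):
                predicted.add((host, pathogen))
    return predicted


# Hypothetical identifiers, for illustration only.
host_homologs = {"wheat_P1": {"T1"}, "wheat_P2": {"T3"}}
pathogen_homologs = {"fungus_Q1": {"T2"}, "fungus_Q2": {"T4"}}
template_ppis = {frozenset(("T1", "T2"))}

interologs = predict_interologs(host_homologs, pathogen_homologs, template_ppis)
```

The nested loop is quadratic in the number of proteins; at the scale reported in the abstract, an index from template proteins to their homologs would be needed instead.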
|
38
|
Li S, Wu S, Wang L, Li F, Jiang H, Bai F. Recent advances in predicting protein-protein interactions with the aid of artificial intelligence algorithms. Curr Opin Struct Biol 2022; 73:102344. [PMID: 35219216 DOI: 10.1016/j.sbi.2022.102344] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 08/20/2021] [Revised: 01/02/2022] [Accepted: 01/17/2022] [Indexed: 12/15/2022]
Abstract
Protein-protein interactions (PPIs) are essential in the regulation of biological functions and cell events; understanding PPIs has therefore become key to understanding molecular mechanisms and to drug design. Here we highlight the major developments in computational methods for predicting PPIs using artificial intelligence algorithms. The first part introduces the sources of experimental PPI data. The second part is devoted to PPI prediction methods based on sequence information. The third part covers representative methods that use structural information as the input feature. The last part covers methods that combine different types of features. For each part, state-of-the-art computational PPI prediction methods are reviewed. Finally, we discuss the shortcomings of current approaches and future directions for next-generation algorithms.
Affiliation(s)
- Shiwei Li
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Sanan Wu
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Lin Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Fenglei Li
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Hualiang Jiang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Pudong, Shanghai, 201203, China
- Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, Shanghai, China; School of Information Science and Technology, ShanghaiTech University, Shanghai, China.
|
39
|
Huang XT, Jia S, Gao L, Wu J. Reconstruction of human protein-coding gene functional association network based on machine learning. Brief Bioinform 2022; 23:6502555. [PMID: 35021191 DOI: 10.1093/bib/bbab552] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2021] [Revised: 11/13/2021] [Accepted: 12/02/2021] [Indexed: 01/02/2023] Open
Abstract
Networks consisting of molecular interactions are intrinsically dynamical systems of an organism. The interactions curated in molecular interaction databases are still incomplete and contain false positives introduced by high-throughput screening experiments. In this study, we propose a framework to integrate interactions of functionally associated protein-coding genes from 31 data sources to reconstruct a network with high coverage and quality. For each interaction, 369 features were constructed, including properties of both the interaction and the involved genes. The training and validation sets were built from pathway interactions as positives and potential negative instances resulting from our proposed semi-supervised strategy. A random forest classifier was then trained and applied multiple times to give a score for each interaction. After setting a threshold estimated from a binomial distribution, a Human protein-coding Gene Functional Association Network (HuGFAN) was reconstructed with 20 383 genes and 1 185 429 high-confidence interactions. HuGFAN was then compared with networks from other data sources with respect to network properties, suggesting that HuGFAN is more function- and pathway-related. Finally, HuGFAN was applied to identify cancer drivers with two well-known network-based methods (DriverNet and HotNet2) to demonstrate its performance relative to other networks. HuGFAN and other supplementary files are freely available at https://github.com/xthuang226/HuGFAN.
Affiliation(s)
- Xiao-Tai Huang
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Songwei Jia
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Jing Wu
- School of Mechanical Engineering, Dongguan University of Technology, Dongguan, 523808, Guangdong, China
|
40
|
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022] Open
|
41
|
Kataria R, Kaundal R. Deciphering the Crosstalk Mechanisms of Wheat-Stem Rust Pathosystem: Genome-Scale Prediction Unravels Novel Host Targets. Front Plant Sci 2022; 13:895480. [PMID: 35800602 PMCID: PMC9253690 DOI: 10.3389/fpls.2022.895480] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/13/2022] [Accepted: 05/31/2022] [Indexed: 05/04/2023]
Abstract
Triticum aestivum (wheat), a major staple food grain, is affected by various biotic stresses. Among these, fungal diseases cause about 15-20% of yield loss worldwide. In this study, we performed a comparative analysis of protein-protein interactions between two Puccinia graminis races (Pgt 21-0 and Pgt Ug99) that cause stem (black) rust in wheat. The available molecular techniques to study the host-pathogen interaction mechanisms are expensive and labor-intensive. We implemented two computational approaches (interolog and domain-based) for the prediction of PPIs and performed various functional analyses to determine the significant differences between the two pathogen races. The analysis revealed that T. aestivum-Pgt 21-0 and T. aestivum-Pgt Ug99 interactomes consisted of ∼90M and ∼56M putative PPIs, respectively. In the predicted PPIs, we identified 115 Pgt 21-0 and 34 Pgt Ug99 potential effectors that were highly involved in pathogen virulence and development. Functional enrichment analysis of the host proteins revealed significant GO terms and KEGG pathways such as O-methyltransferase activity (GO:0008171), regulation of signal transduction (GO:0009966), lignin metabolic process (GO:0009808), plastid envelope (GO:0009526), plant-pathogen interaction pathway (ko04626), and MAPK pathway (ko04016) that are actively involved in plant defense and immune signaling against the biotic stresses. Subcellular localization analysis anticipated the host plastid as a primary target for pathogen attack. The highly connected host hubs in the protein interaction network belonged to protein kinase domain including Ser/Thr protein kinase, MAPK, and cyclin-dependent kinase. We also identified 5,577 transcription factors in the interactions, associated with plant defense during biotic stress conditions. Additionally, novel host targets that are resistant to stem rust disease were also identified.
The present study elucidates the functional differences between Pgt 21-0 and Pgt Ug99, thus providing the researchers with strain-specific information for further experimental validation of the interactions, and the development of durable, disease-resistant crop lines.
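The domain-based prediction step described above can be sketched as follows: a host-pathogen pair is predicted to interact when the two proteins carry domains that form a known domain-domain interaction (DDI). This is only an illustration under stated assumptions: domain annotations and the DDI reference set are taken as given, and every identifier, including the function name `predict_by_domains`, is hypothetical.

```python
def predict_by_domains(host_domains, pathogen_domains, known_ddis):
    """Predict PPIs from shared domain-domain interactions.

    host_domains / pathogen_domains: dict mapping a protein to its set of
    annotated domains. known_ddis: set of frozenset domain pairs.
    Returns a dict mapping each predicted pair to its supporting DDIs.
    """
    predicted = {}
    for host, h_doms in host_domains.items():
        for pathogen, p_doms in pathogen_domains.items():
            support = {
                (x, y)
                for x in h_doms
                for y in p_doms
                if frozenset((x, y)) in known_ddis
            }
            if support:
                predicted[(host, pathogen)] = support
    return predicted


# Hypothetical domain labels and proteins, for illustration only.
host_domains = {"wheat_K1": {"DomA", "DomB"}, "wheat_K2": {"DomC"}}
pathogen_domains = {"rust_E1": {"DomX"}, "rust_E2": {"DomY"}}
known_ddis = {frozenset(("DomA", "DomX")), frozenset(("DomB", "DomX"))}

domain_ppis = predict_by_domains(host_domains, pathogen_domains, known_ddis)
```

Keeping the supporting DDIs per pair makes it possible to rank predictions by the number of independent domain pairs that back them, which is one plausible way to prioritize candidates for experimental validation.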
Affiliation(s)
- Raghav Kataria
- Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
- Rakesh Kaundal
- Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
- Bioinformatics Facility, Center for Integrated BioSystems, Utah State University, Logan, UT, United States
- Department of Computer Science, College of Science, Utah State University, Logan, UT, United States
- *Correspondence: Rakesh Kaundal,
|
42
|
Louadi Z, Elkjaer ML, Klug M, Lio CT, Fenn A, Illes Z, Bongiovanni D, Baumbach J, Kacprowski T, List M, Tsoy O. Functional enrichment of alternative splicing events with NEASE reveals insights into tissue identity and diseases. Genome Biol 2021; 22:327. [PMID: 34857024 PMCID: PMC8638120 DOI: 10.1186/s13059-021-02538-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2021] [Accepted: 11/10/2021] [Indexed: 01/27/2023] Open
Abstract
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with structural annotations of protein-protein interactions to functionally characterize AS events. We show in four application cases how NEASE can identify pathways contributing to tissue identity and cell type development, and how it highlights splicing-related biomarkers. With a unique view on AS, NEASE generates unique and meaningful biological insights complementary to classical pathways analysis.
Affiliation(s)
- Zakaria Louadi
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Maria L Elkjaer
- Department of Neurology, Odense University Hospital, Odense, Denmark
- Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Melissa Klug
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Department of Internal Medicine I, School of Medicine, University hospital rechts der Isar, Technical University of Munich, Munich, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Munich Heart Alliance, Munich, Germany
- Chit Tong Lio
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Amit Fenn
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Zsolt Illes
- Department of Neurology, Odense University Hospital, Odense, Denmark
- Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Dario Bongiovanni
- Department of Internal Medicine I, School of Medicine, University hospital rechts der Isar, Technical University of Munich, Munich, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Munich Heart Alliance, Munich, Germany
- Department of Cardiovascular Medicine, Humanitas Clinical and Research Center IRCCS and Humanitas University, Rozzano, Milan, Italy
- Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Institute of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5000, Odense, Denmark
- Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany.
- Olga Tsoy
- Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany.
|
43
|
Ma JX, Yang Y, Li G, Ma BG. Computationally Reconstructed Interactome of Bradyrhizobium diazoefficiens USDA110 Reveals Novel Functional Modules and Protein Hubs for Symbiotic Nitrogen Fixation. Int J Mol Sci 2021; 22:11907. [PMID: 34769335 PMCID: PMC8584416 DOI: 10.3390/ijms222111907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2021] [Revised: 10/22/2021] [Accepted: 10/28/2021] [Indexed: 11/16/2022] Open
Abstract
Symbiotic nitrogen fixation is an important part of the nitrogen biogeochemical cycles and the main nitrogen source of the biosphere. As a classical model system for symbiotic nitrogen fixation, rhizobium-legume systems have been studied elaborately for decades. Details about the molecular mechanisms of the communication and coordination between rhizobia and host plants are becoming clearer. For more systematic insights, there is an increasing demand for new studies integrating multiomics information. Here, we present a comprehensive computational framework integrating the reconstructed protein interactome of B. diazoefficiens USDA110 with its transcriptome and proteome data to study the complex protein-protein interaction (PPI) network involved in the symbiosis system. We reconstructed the interactome of B. diazoefficiens USDA110 by computational approaches. Based on the comparison of interactomes between B. diazoefficiens USDA110 and other rhizobia, we inferred that the slow growth of B. diazoefficiens USDA110 may be due to the requirement of more protein modifications, and we further identified 36 conserved functional PPI modules. Integrated with transcriptome and proteome data, interactomes representing free-living cell and symbiotic nitrogen-fixing (SNF) bacteroid were obtained. Based on the SNF interactome, a core-sub-PPI-network for symbiotic nitrogen fixation was determined and nine novel functional modules and eleven key protein hubs playing key roles in symbiosis were identified. The reconstructed interactome of B. diazoefficiens USDA110 may serve as a valuable reference for studying the mechanism underlying the SNF system of rhizobia and legumes.
Affiliation(s)
- Bin-Guang Ma
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; (J.-X.M.); (Y.Y.); (G.L.)
|
44
|
Arici MK, Tuncbag N. Performance Assessment of the Network Reconstruction Approaches on Various Interactomes. Front Mol Biosci 2021; 8:666705. [PMID: 34676243 PMCID: PMC8523993 DOI: 10.3389/fmolb.2021.666705] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2021] [Accepted: 07/14/2021] [Indexed: 01/04/2023] Open
Abstract
Beyond the list of molecules, there is a necessity to collectively consider multiple sets of omic data and to reconstruct the connections between the molecules. In particular, pathway reconstruction is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data accurately. In this study, we aim to comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on the coverage of each interactome in terms of cancer driver proteins, structural information of protein interactions, and the bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms including all-pair shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall scores when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches. The coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
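Of the reconstruction algorithms listed, the all-pair shortest path approach is the simplest to illustrate: for every pair of seed proteins, one shortest path through the reference interactome is found and the union of the path edges forms the reconstructed network. The sketch below uses an unweighted breadth-first search, which is an assumption because the abstract does not say whether edges were weighted; the toy graph and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        # Sorted neighbours make the choice among equal-length paths deterministic.
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def reconstruct(edges, seeds):
    """Union of shortest-path edges between all pairs of seed nodes."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    subnetwork = set()
    for s, t in combinations(sorted(seeds), 2):
        path = shortest_path(adjacency, s, t)
        if path:
            subnetwork.update(frozenset(pair) for pair in zip(path, path[1:]))
    return subnetwork


# Toy reference interactome: a short route A-B-C and a longer detour via E, F, D.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "F"), ("F", "D")]
network = reconstruct(edges, {"A", "C"})
```

Precision and recall, as used in the study, would then compare the reconstructed edge or node set against a curated pathway.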
Affiliation(s)
- M Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, Turkey; Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, Turkey; School of Medicine, Koc University, Istanbul, Turkey
|
45
|
Hollander M, Do T, Will T, Helms V. Detecting Rewiring Events in Protein-Protein Interaction Networks Based on Transcriptomic Data. Front Bioinform 2021; 1:724297. [PMID: 36303788 PMCID: PMC9581068 DOI: 10.3389/fbinf.2021.724297] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/12/2021] [Accepted: 08/23/2021] [Indexed: 12/25/2022] Open
Abstract
Proteins rarely carry out their cellular functions in isolation. Instead, eukaryotic proteins engage in about six interactions with other proteins on average. The aggregated protein interactome of an organism forms a “hairy ball”-type protein-protein interaction (PPI) network. Yet, in a typical human cell, only about half of all proteins are expressed at a particular time. Hence, it has become common practice to prune the full PPI network to the subset of expressed proteins. If RNAseq data is available, one can further resolve the specific protein isoforms present in a cell or tissue. Here, we review various approaches, software tools and webservices that enable users to construct context-specific or tissue-specific PPI networks and how these are rewired between two cellular conditions. We illustrate their different functionalities on the example of the interactions involving the human TNR6 protein. In an outlook, we describe how PPI networks may be integrated with epigenetic data or with data on the activity of splicing factors.
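The pruning step described here, restricting the full PPI network to the proteins expressed in a given condition and then comparing two conditions, can be sketched as follows. The expression cutoff of 1.0 (for example in TPM), the protein names, and both function names are assumptions made for illustration and do not come from the reviewed tools.

```python
def prune_to_expressed(ppi_edges, expression, threshold=1.0):
    """Keep only interactions whose two partners are both expressed.

    expression: dict mapping a protein to its abundance in one condition.
    threshold: assumed expression cutoff; the right value depends on the data.
    """
    expressed = {p for p, level in expression.items() if level >= threshold}
    return {(a, b) for a, b in ppi_edges if a in expressed and b in expressed}


def rewired_edges(edges_condition_1, edges_condition_2):
    """Return (gained, lost) interactions going from condition 1 to condition 2."""
    gained = edges_condition_2 - edges_condition_1
    lost = edges_condition_1 - edges_condition_2
    return gained, lost


# Hypothetical network and expression values.
full_network = {("P1", "P2"), ("P2", "P3"), ("P3", "P4")}
healthy = {"P1": 5.0, "P2": 3.2, "P3": 0.1, "P4": 7.5}
disease = {"P1": 0.0, "P2": 4.1, "P3": 2.6, "P4": 6.0}

net_healthy = prune_to_expressed(full_network, healthy)
net_disease = prune_to_expressed(full_network, disease)
gained, lost = rewired_edges(net_healthy, net_disease)
```

Isoform-level rewiring, which the review also discusses, would additionally require checking whether the expressed isoforms retain the interacting domains; that is beyond this sketch.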
|
46
|
Martins YC, Ziviani A, Nicolás MF, de Vasconcelos ATR. Large-Scale Protein Interactions Prediction by Multiple Evidence Analysis Associated With an In-Silico Curation Strategy. Front Bioinform 2021; 1:731345. [PMID: 36303787 PMCID: PMC9581021 DOI: 10.3389/fbinf.2021.731345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2021] [Accepted: 08/23/2021] [Indexed: 11/17/2022] Open
Abstract
Predicting the physical or functional associations through protein-protein interactions (PPIs) represents an integral approach for inferring novel protein functions and discovering new drug targets during repositioning analysis. Recent advances in high-throughput data generation and multi-omics techniques have enabled large-scale PPI predictions, thus promoting several computational methods based on different levels of biological evidence. However, integrating multiple results and strategies to optimize, automatically extract interaction features, and scale up the entire PPI prediction process is still challenging. Most procedures do not offer an in-silico validation process to evaluate the predicted PPIs. In this context, this paper presents the PredPrIn scientific workflow that enables PPI prediction based on multiple lines of evidence, including the structure, sequence, and functional annotation categories, by combining boosting and stacking machine learning techniques. We also present a pipeline (PPIVPro) for the validation process based on cellular co-localization filtering and a focused search of PPI evidence in scientific publications. Thus, our combined approach provides a means for large-scale training or prediction of new PPIs and a strategy to evaluate the prediction quality. PredPrIn and PPIVPro are publicly available at https://github.com/YasCoMa/predprin and https://github.com/YasCoMa/ppi_validation_process.
Affiliation(s)
- Yasmmin Côrtes Martins
  - Bioinformatics Laboratory, National Laboratory of Scientific Computing, Petrópolis, Brazil
- Artur Ziviani
  - Data Extreme Lab (DEXL), National Laboratory of Scientific Computing, Petrópolis, Brazil
- Marisa Fabiana Nicolás
  - Bioinformatics Laboratory, National Laboratory of Scientific Computing, Petrópolis, Brazil
- Ana Tereza Ribeiro de Vasconcelos
  - Bioinformatics Laboratory, National Laboratory of Scientific Computing, Petrópolis, Brazil
  - *Correspondence: Ana Tereza Ribeiro de Vasconcelos,
|
47
|
Alborzi SZ, Ahmed Nacer A, Najjar H, Ritchie DW, Devignes MD. PPIDomainMiner: Inferring domain-domain interactions from multiple sources of protein-protein interactions. PLoS Comput Biol 2021; 17:e1008844. [PMID: 34370723 PMCID: PMC8376228 DOI: 10.1371/journal.pcbi.1008844] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/25/2021] [Revised: 08/19/2021] [Accepted: 07/12/2021] [Indexed: 12/26/2022] Open
Abstract
Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called “PPIDM” (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described “CODAC” (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as “Gold-Standard” a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs in Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/. 
We revisit at a large scale the question of inferring DDIs from PPIs. Compared to previous studies, we take a unified approach across multiple sources of PPIs. This approach is a method for inferring new edges in a tripartite graph setting and can be compared to link prediction approaches in knowledge graphs. Aggregation of several sources is performed using an optimized weighted average of the individual scores calculated in each source. A huge dataset of over 84K DDIs is produced, which far exceeds previous datasets. We show that a significant portion of the PPIDM dataset covers a large number of PPIs from curated (IMEx) or non-curated (STRING) databases. Such a reservoir of DDIs deserves further exploration and can be combined with high-throughput methods such as cross-linking mass spectrometry to identify plausible protein partners of proteins of interest.
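The aggregation step mentioned in this summary (a weighted average of per-source scores) could be sketched roughly as follows. This is only an illustration, not the PPIDM implementation: the function name `aggregate_score`, the source labels, the weights, and the choice to skip sources that report no score are all assumptions made here, and the abstract does not state how the weights are optimized.

```python
from typing import Dict, Optional


def aggregate_score(
    source_scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-source scores for one candidate DDI.

    Assumption (not from the abstract): a source with no score for this
    candidate is skipped rather than counted as zero.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for source, weight in weights.items():
        score = source_scores.get(source)
        if score is None:
            continue  # this source has no evidence for the candidate
        weighted_sum += weight * score
        weight_total += weight
    # With no contributing source, fall back to a score of 0.0.
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# Hypothetical source labels and weights, for illustration only.
example_weights = {"source_a": 3.0, "source_b": 1.0, "source_c": 2.0}
example_scores = {"source_a": 0.8, "source_b": 0.4}  # source_c has no score
combined = aggregate_score(example_scores, example_weights)
```

Skipping missing sources keeps a candidate seen in few resources from being penalized twice; treating them as zero would be an equally plausible choice and would lower the combined score.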
|
48
|
Etzion-Fuchs A, Todd DA, Singh M. dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains. Nucleic Acids Res 2021; 49:e78. [PMID: 33999210 PMCID: PMC8287948 DOI: 10.1093/nar/gkab356] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2020] [Revised: 03/30/2021] [Accepted: 04/22/2021] [Indexed: 01/08/2023] Open
Abstract
Domains are instrumental in facilitating protein interactions with DNA, RNA, small molecules, ions and peptides. Identifying ligand-binding domains within sequences is a critical step in protein function annotation, and the ligand-binding properties of proteins are frequently analyzed based upon whether they contain one of these domains. To date, however, knowledge of whether and how protein domains interact with ligands has been limited to domains that have been observed in co-crystal structures; this leaves approximately two-thirds of human protein domain families uncharacterized with respect to whether and how they bind DNA, RNA, small molecules, ions and peptides. To fill this gap, we introduce dSPRINT, a novel ensemble machine learning method for predicting whether a domain binds DNA, RNA, small molecules, ions or peptides, along with the positions within it that participate in these types of interactions. In stringent cross-validation testing, we demonstrate that dSPRINT has an excellent performance in uncovering ligand-binding positions and domains. We also apply dSPRINT to newly characterize the molecular functions of domains of unknown function. dSPRINT's predictions can be transferred from domains to sequences, enabling predictions about the ligand-binding properties of 95% of human genes. The dSPRINT framework and its predictions for 6503 human protein domains are freely available at http://protdomain.princeton.edu/dsprint.
Affiliation(s)
- Anat Etzion-Fuchs
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544, USA
- David A Todd
  - Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08544, USA
- Mona Singh
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544, USA
  - Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08544, USA
|
49
|
Lang B, Yang JS, Garriga-Canut M, Speroni S, Aschern M, Gili M, Hoffmann T, Tartaglia GG, Maurer SP. Matrix-screening reveals a vast potential for direct protein-protein interactions among RNA binding proteins. Nucleic Acids Res 2021; 49:6702-6721. [PMID: 34133714 PMCID: PMC8266617 DOI: 10.1093/nar/gkab490] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/20/2020] [Revised: 04/23/2021] [Accepted: 05/20/2021] [Indexed: 01/02/2023] Open
Abstract
RNA-binding proteins (RBPs) are crucial factors of post-transcriptional gene regulation and their modes of action are intensely investigated. At the center of attention are RNA motifs that guide where RBPs bind. However, sequence motifs are often poor predictors of RBP-RNA interactions in vivo. It is hence believed that many RBPs recognize RNAs as complexes, to increase specificity and regulatory possibilities. To probe the potential for complex formation among RBPs, we assembled a library of 978 mammalian RBPs and used rec-Y2H matrix screening to detect direct interactions between RBPs, sampling > 600 K interactions. We discovered 1994 new interactions and demonstrate that interacting RBPs bind RNAs adjacently in vivo. We further find that the mRNA binding region and motif preferences of RBPs deviate, depending on their adjacently binding interaction partners. Finally, we reveal novel RBP interaction networks among major RNA processing steps and show that splicing-impairing RBP mutations observed in cancer rewire spliceosomal interaction networks. The dataset we provide will be a valuable resource for understanding the combinatorial interactions of RBPs with RNAs and the resulting regulatory outcomes.
Affiliation(s)
- Benjamin Lang
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Doctor Aiguader 88, Barcelona 08003, Spain
  - Department of Structural Biology and Center of Excellence for Data-Driven Discovery, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105, USA
- Jae-Seong Yang
  - Centre de Recerca en Agrigenòmica, Consortium CSIC-IRTA-UAB-UB (CRAG), Cerdanyola del Vallès, 08193 Barcelona, Spain
- Mireia Garriga-Canut
  - Division of Engineering, New York University Abu Dhabi (NYUAD), Abu Dhabi 129188, UAE
- Silvia Speroni
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Doctor Aiguader 88, Barcelona 08003, Spain
- Moritz Aschern
  - Centre de Recerca en Agrigenòmica, Consortium CSIC-IRTA-UAB-UB (CRAG), Cerdanyola del Vallès, 08193 Barcelona, Spain
- Maria Gili
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Doctor Aiguader 88, Barcelona 08003, Spain
- Tobias Hoffmann
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Doctor Aiguader 88, Barcelona 08003, Spain
- Gian Gaetano Tartaglia
  - Center for Human Technologies, Istituto Italiano di Tecnologia, Via Enrico Melen 83, 16152, Genoa, Italy
  - Biology and Biotechnology Department "Charles Darwin", Sapienza University of Rome, P.le A. Moro 5, Rome 00185, Italy
- Sebastian P Maurer
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Doctor Aiguader 88, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Department of Experimental and Health Sciences, Barcelona, Spain
|
50
|
Li WJ, Wang CW, Tao L, Yan YH, Zhang MJ, Liu ZX, Li YX, Zhao HQ, Li XM, He XD, Xue Y, Dong MQ. Insulin signaling regulates longevity through protein phosphorylation in Caenorhabditis elegans. Nat Commun 2021; 12:4568. [PMID: 34315882 PMCID: PMC8316574 DOI: 10.1038/s41467-021-24816-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 09/04/2020] [Accepted: 07/01/2021] [Indexed: 12/22/2022] Open
Abstract
Insulin/IGF-1 Signaling (IIS) is known to constrain longevity by inhibiting the transcription factor FOXO. How phosphorylation mediated by IIS kinases regulates lifespan beyond FOXO remains unclear. Here, we profile IIS-dependent phosphorylation changes in a large-scale quantitative phosphoproteomic analysis of wild-type and three IIS mutant Caenorhabditis elegans strains. We quantify more than 15,000 phosphosites and find that 476 of these are differentially phosphorylated in the long-lived daf-2/insulin receptor mutant. We develop a machine learning-based method to prioritize 25 potential lifespan-related phosphosites. We perform validations to show that AKT-1 pT492 inhibits DAF-16/FOXO and compensates for the loss of daf-2 function, that EIF-2α pS49 potently inhibits protein synthesis and daf-2 longevity, and that reduced phosphorylation of multiple germline proteins apparently transmits reduced DAF-2 signaling to the soma. In addition, an analysis of kinases with enriched substrates detects that casein kinase 2 (CK2) subunits negatively regulate lifespan. Our study reveals detailed functional insights into longevity.
Affiliation(s)
- Wen-Jun Li
  - School of Life Sciences, Peking University, Beijing, China
  - National Institute of Biological Sciences, Beijing, China
- Chen-Wei Wang
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, Jiangsu, China
- Li Tao
  - National Institute of Biological Sciences, Beijing, China
  - Department of Biology, Stanford University, Stanford, CA, USA
- Yong-Hong Yan
  - National Institute of Biological Sciences, Beijing, China
- Mei-Jun Zhang
  - National Institute of Biological Sciences, Beijing, China
  - Annoroad Gene Tech. Co., Ltd., Beijing, China
- Ze-Xian Liu
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Yu-Xin Li
  - National Institute of Biological Sciences, Beijing, China
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
- Han-Qing Zhao
  - National Institute of Biological Sciences, Beijing, China
- Xue-Mei Li
  - School of Life Sciences, Peking University, Beijing, China
  - National Institute of Biological Sciences, Beijing, China
- Xian-Dong He
  - National Institute of Biological Sciences, Beijing, China
- Yu Xue
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, Jiangsu, China
- Meng-Qiu Dong
  - National Institute of Biological Sciences, Beijing, China
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, China
|