1
Yuen HY, Jansson J. Normalized L3-based link prediction in protein-protein interaction networks. BMC Bioinformatics 2023; 24:59. [PMID: 36814208 PMCID: PMC9945744 DOI: 10.1186/s12859-023-05178-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Accepted: 02/08/2023] [Indexed: 02/24/2023] Open
Abstract
BACKGROUND: Protein-protein interaction (PPI) data is an important type of data used in functional genomics. However, high-throughput experiments are often insufficient to complete the PPI interactome of different organisms. Computational techniques are thus used to infer missing data. Link prediction is one such approach: it uses the structure of the network of PPIs known so far to identify non-edges whose addition to the network would make it more sound, according to some underlying assumptions. Recently, a new idea called the L3 principle introduced biological motivation into PPI link prediction, yielding predictors that are superior to general-purpose link predictors for complex networks. Interestingly, the L3 principle can be interpreted in another way, so that other signatures of PPI networks can also be characterized for PPI predictions. This alternative interpretation uncovers candidate PPIs that the current L3-based link predictors may not fully capture, which means they underutilize the L3 principle. RESULTS: In this article, we propose a formulation of link predictors that we call NormalizedL3 (L3N), which addresses certain missing elements within L3 predictors from the perspective of network modeling. Our computational validations show that the L3N predictors find missing PPIs more accurately (in terms of true positives among the predicted PPIs) than the previously proposed methods on several datasets from the literature, including BioGRID, STRING, MINT, and HuRI, at the cost of more computation time in some cases. In addition, we found that L3-based link predictors (including L3N) ranked a different pool of PPIs higher than the general-purpose link predictors did. This suggests that different types of PPIs can be predicted based on different topological assumptions, and that even better PPI link predictors may be obtained in the future by improved network modeling.
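The abstract does not spell out how an L3-based score is computed. As a rough illustration only, the sketch below assumes the widely used baseline form of the L3 score, in which a candidate pair (x, y) is scored by summing 1/sqrt(k_u * k_v) over all paths x-u-v-y of length 3, where k is node degree. It is not the L3N predictor proposed in the article, and the function name `l3_scores` and the toy network are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def l3_scores(edges):
    """Score non-adjacent pairs by degree-normalized paths of length 3.

    Assumed baseline L3 form, not the article's L3N formulation.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions in this sketch
            adj[a].add(b)
            adj[b].add(a)

    scores = defaultdict(float)
    for x in adj:
        for u in adj[x]:
            for v in adj[u]:
                if v == x:
                    continue
                weight = 1.0 / sqrt(len(adj[u]) * len(adj[v]))
                for y in adj[v]:
                    # Keep simple paths x-u-v-y that end at a non-neighbor of x.
                    if y == x or y == u or y in adj[x]:
                        continue
                    scores[frozenset((x, y))] += weight
    # Each undirected path is walked once from each end, so halve the totals.
    return {pair: total / 2 for pair, total in scores.items()}


# Hypothetical toy network: a simple chain A-B-C-D.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
toy_scores = l3_scores(toy_edges)
ranked = sorted(toy_scores.items(), key=lambda item: item[1], reverse=True)
```

Candidate pairs with the highest scores in `ranked` would be the top predictions under this assumed scoring rule.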
Affiliation(s)
- Ho Yin Yuen
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong, China.
- Jesper Jansson
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan.
2
Velásquez-Zapata V, Elmore JM, Fuerst G, Wise RP. An interolog-based barley interactome as an integration framework for immune signaling. Genetics 2022; 221:iyac056. [PMID: 35435213 PMCID: PMC9157089 DOI: 10.1093/genetics/iyac056] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/28/2022] [Accepted: 04/04/2022] [Indexed: 12/12/2022] Open
Abstract
The barley MLA nucleotide-binding leucine-rich-repeat (NLR) receptor and its orthologs confer recognition specificity to many fungal diseases, including powdery mildew, stem-, and stripe rust. We used interolog inference to construct a barley protein interactome (Hordeum vulgare predicted interactome, HvInt) comprising 66,133 edges and 7,181 nodes, as a foundation to explore signaling networks associated with MLA. HvInt was compared with the experimentally validated Arabidopsis interactome of 11,253 proteins and 73,960 interactions, verifying that the 2 networks share scale-free properties, including a power-law distribution and small-world network. Then, by successive layering of defense-specific "omics" datasets, HvInt was customized to model cellular response to powdery mildew infection. Integration of HvInt with expression quantitative trait loci (eQTL) enabled us to infer disease modules and responses associated with fungal penetration and haustorial development. Next, using HvInt and infection-time-course RNA sequencing of immune signaling mutants, we assembled resistant and susceptible subnetworks. The resulting differentially coexpressed (resistant - susceptible) interactome is essential to barley immunity, facilitates the flow of signaling pathways and is linked to mildew resistance locus a (Mla) through trans eQTL associations. Lastly, we anchored HvInt with new and previously identified interactors of the MLA coiled coil + nucleotide-binding domains and extended these to additional MLA alleles, orthologs, and NLR outgroups to predict receptor localization and conservation of signaling response. These results link genomic, transcriptomic, and physical interactions during MLA-specified immunity.
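The abstract names interolog inference without describing the procedure. A minimal sketch of the general idea follows, assuming the usual definition: if two proteins interact in a reference species and each has an ortholog in the target species, the orthologous pair is predicted to interact. The function name, data layout, and protein identifiers are hypothetical, and the article's actual pipeline (reference species, ortholog calling, filtering) is not reproduced here.

```python
def infer_interologs(reference_ppis, orthologs):
    """Transfer reference-species interactions to a target species.

    reference_ppis: iterable of (protein_a, protein_b) pairs in the reference species.
    orthologs: dict mapping a reference protein to a set of target-species orthologs.
    Both structures are illustrative assumptions, not the article's data format.
    """
    predicted = set()
    for a, b in reference_ppis:
        # Reference proteins without a known ortholog contribute nothing.
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                if target_a != target_b:  # skip self-interactions in this sketch
                    predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical identifiers for illustration only.
reference_ppis = [("RefA", "RefB"), ("RefB", "RefC")]
orthologs = {"RefA": {"TgtA1", "TgtA2"}, "RefB": {"TgtB"}}
predicted_edges = infer_interologs(reference_ppis, orthologs)
```

Here `RefC` has no ortholog entry, so the `RefB`-`RefC` interaction yields no prediction, while the one-to-many mapping of `RefA` yields two predicted edges.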
Affiliation(s)
- Valeria Velásquez-Zapata
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, IA 50011, USA
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- James Mitch Elmore
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Corn Insects and Crop Genetics Research, USDA-Agricultural Research Service, Ames, IA 50011, USA
- Gregory Fuerst
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Corn Insects and Crop Genetics Research, USDA-Agricultural Research Service, Ames, IA 50011, USA
- Roger P Wise
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, IA 50011, USA
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Corn Insects and Crop Genetics Research, USDA-Agricultural Research Service, Ames, IA 50011, USA
3
Naveed M, Makhdoom SI, Abbas G, Safdari M, Farhadi A, Habtemariam S, Shabbir MA, Jabeen K, Asif MF, Tehreem S. The Virulent Hypothetical Proteins: The Potential Drug Target Involved in Bacterial Pathogenesis. Mini Rev Med Chem 2022; 22:2608-2623. [DOI: 10.2174/1389557522666220413102107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Revised: 12/01/2021] [Accepted: 01/21/2022] [Indexed: 11/22/2022]
Abstract
Hypothetical proteins (HPs) are non-predicted sequences that are identified only by open reading frames in sequenced genomes, but their protein products remain uncharacterized by any experimental means. The genome of every species consists of HPs that are involved in various cellular processes and signaling pathways. Annotation of HPs is important as they play a key role in disease mechanisms, drug design, vaccine production, antibiotic production, and host adaptation. In the case of bacteria, 25-50% of the genome comprises HPs, which are involved in metabolic pathways and pathogenesis. The characterization of bacterial HPs helps to identify virulent proteins that are involved in pathogenesis. This can be done using in-silico studies, which provide sequence analogs, physicochemical properties, cellular or subcellular localization, structure and function validation, and protein-protein interactions. The most diverse types of virulent proteins are exotoxins, endotoxins, and adherent virulent factors that are encoded by virulent genes present on the chromosomal DNA of the bacteria. This review evaluates virulent HPs of pathogenic bacteria, such as Staphylococcus aureus, Chlamydia trachomatis, Fusobacterium nucleatum, and Yersinia pestis. The potential of these HPs as drug targets in bacteria-caused infectious diseases, along with their mode of action and treatment approaches, is discussed.
Affiliation(s)
- Muhammad Naveed
- Department of Biotechnology, Faculty of Life Sciences, University of Central Punjab, Pakistan
- Syeda Izma Makhdoom
- Department of Biotechnology, Faculty of Life Sciences, University of Central Punjab, Pakistan
- Ghulam Abbas
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Mohammadreza Safdari
- Department of Orthopedic Surgery, Faculty of Medicine, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Amin Farhadi
- Kavian Institute of Higher Education, Mashhad, Iran
- Solomon Habtemariam
- Pharmacognosy Research Laboratories & Herbal Analysis Services UK, University of Greenwich, Medway Campus-Science, Grenville Building (G102/G107), Central Avenue, Chatham-Maritime, Kent, ME4 4TB, UK
- Muhammad Aqib Shabbir
- Department of Biotechnology, Faculty of Life Sciences, University of Central Punjab, Pakistan
- Khizra Jabeen
- Department of Biotechnology, Faculty of Life Sciences, University of Central Punjab, Pakistan
- Muhammad Farrukh Asif
- Department of Biotechnology, Faculty of Life Sciences, University of Central Punjab, Pakistan
- Sana Tehreem
- State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan 430062, Hubei, China
4
Velásquez-Zapata V, Elmore JM, Banerjee S, Dorman KS, Wise RP. Next-generation yeast-two-hybrid analysis with Y2H-SCORES identifies novel interactors of the MLA immune receptor. PLoS Comput Biol 2021; 17:e1008890. [PMID: 33798202 PMCID: PMC8046355 DOI: 10.1371/journal.pcbi.1008890] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/14/2020] [Revised: 04/14/2021] [Accepted: 03/17/2021] [Indexed: 12/21/2022] Open
Abstract
Protein-protein interaction networks are one of the most effective representations of cellular behavior. In order to build these models, high-throughput techniques are required. Next-generation interaction screening (NGIS) protocols that combine yeast two-hybrid (Y2H) with deep sequencing are promising approaches to generate interactome networks in any organism. However, challenges remain in mining reliable information from these screens, which limits their broader implementation. Here, we present a computational framework, designated Y2H-SCORES, for analyzing high-throughput Y2H screens. Y2H-SCORES considers key aspects of NGIS experimental design and important characteristics of the resulting data that distinguish it from RNA-seq expression datasets. Three quantitative ranking scores were implemented to identify interacting partners, comprising: 1) significant enrichment under selection for positive interactions, 2) degree of interaction specificity among multi-bait comparisons, and 3) selection of in-frame interactors. Using simulation and an empirical dataset, we provide a quantitative assessment to predict interacting partners under a wide range of experimental scenarios, facilitating independent confirmation by one-to-one bait-prey tests. Simulation of Y2H-NGIS enabled us to identify conditions that maximize detection of true interactors, which can be achieved with protocols such as prey library normalization, maintenance of larger culture volumes and replication of experimental treatments. Y2H-SCORES can be implemented in different yeast-based interaction screenings, with performance equivalent or superior to that of existing methods. Proof-of-concept was demonstrated by discovery and validation of novel interactions between the barley nucleotide-binding leucine-rich repeat (NLR) immune receptor MLA6, and fourteen proteins, including those that function in signaling, transcriptional regulation, and intracellular trafficking.
Affiliation(s)
- Valeria Velásquez-Zapata
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, Iowa, United States of America
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, Iowa, United States of America
- J. Mitch Elmore
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, Iowa, United States of America
- Corn Insects and Crop Genetics Research, USDA-Agricultural Research Service, Ames, Iowa, United States of America
- Sagnik Banerjee
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, Iowa, United States of America
- Department of Statistics, Iowa State University, Ames, Iowa, United States of America
- Karin S. Dorman
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, Iowa, United States of America
- Department of Statistics, Iowa State University, Ames, Iowa, United States of America
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, Iowa, United States of America
- Roger P. Wise
- Program in Bioinformatics & Computational Biology, Iowa State University, Ames, Iowa, United States of America
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, Iowa, United States of America
- Corn Insects and Crop Genetics Research, USDA-Agricultural Research Service, Ames, Iowa, United States of America
5
Grbić M, Matić D, Kartelj A, Vračević S, Filipović V. A three-phase method for identifying functionally related protein groups in weighted PPI networks. Comput Biol Chem 2020; 86:107246. [PMID: 32339914 DOI: 10.1016/j.compbiolchem.2020.107246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2019] [Revised: 01/27/2020] [Accepted: 03/03/2020] [Indexed: 01/17/2023]
Abstract
Identifying significant protein groups is of great importance for further understanding protein functions. This paper introduces a novel three-phase heuristic method for identifying such groups in weighted PPI networks. In the first phase, a variable neighborhood search (VNS) algorithm is applied to a weighted PPI network in order to support protein complexes by adding a minimum number of new PPIs. In the second phase, proteins from different complexes are merged into larger protein groups. In the third phase, these groups are expanded by a number of 2-level neighbor proteins, favoring proteins that have higher average gene co-expression with the base group proteins. Experimental results show that: (i) the proposed VNS algorithm outperforms the existing approach described in the literature and (ii) the above-mentioned three-phase method identifies protein groups with very high statistical significance.
Affiliation(s)
- Milana Grbić
- University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Dragan Matić
- University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Aleksandar Kartelj
- University of Belgrade, Faculty of Mathematics, Studentski trg 16/IV 11 000, Belgrade, Serbia.
- Savka Vračević
- University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Vladimir Filipović
- University of Belgrade, Faculty of Mathematics, Studentski trg 16/IV 11 000, Belgrade, Serbia.