1
|
Nissan N, Hooker J, Arezza E, Dick K, Golshani A, Mimee B, Cober E, Green J, Samanfar B. Large-scale data mining pipeline for identifying novel soybean genes involved in resistance against the soybean cyst nematode. FRONTIERS IN BIOINFORMATICS 2023; 3:1199675. [PMID: 37409347 PMCID: PMC10319130 DOI: 10.3389/fbinf.2023.1199675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 05/31/2023] [Indexed: 07/07/2023] Open
Abstract
The soybean cyst nematode (SCN) [Heterodera glycines Ichinohe] is a devastating pathogen of soybean [Glycine max (L.) Merr.] that is rapidly becoming a global economic issue. Two loci conferring SCN resistance have been identified in soybean, Rhg1 and Rhg4; however, they offer declining protection. Therefore, it is imperative that we identify additional mechanisms for SCN resistance. In this paper, we develop a bioinformatics pipeline to identify protein-protein interactions related to SCN resistance by data mining massive-scale datasets. The pipeline combines two leading sequence-based protein-protein interaction predictors, the Protein-protein Interaction Prediction Engine (PIPE), PIPE4, and Scoring PRotein INTeractions (SPRINT) to predict high-confidence interactomes. First, we predicted the top soy interacting protein partners of the Rhg1 and Rhg4 proteins. Both PIPE4 and SPRINT overlap in their predictions with 58 soybean interacting partners, 19 of which had GO terms related to defense. Beginning with the top predicted interactors of Rhg1 and Rhg4, we implement a "guilt by association" in silico proteome-wide approach to identify novel soybean genes that may be involved in SCN resistance. This pipeline identified 1,082 candidate genes whose local interactomes overlap significantly with the Rhg1 and Rhg4 interactomes. Using GO enrichment tools, we highlighted many important genes including five genes with GO terms related to response to the nematode (GO:0009624), namely, Glyma.18G029000, Glyma.11G228300, Glyma.08G120500, Glyma.17G152300, and Glyma.08G265700. This study is the first of its kind to predict interacting partners of known resistance proteins Rhg1 and Rhg4, forming an analysis pipeline that enables researchers to focus their search on high-confidence targets to identify novel SCN resistance genes in soybean.
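The overlap step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the protein IDs, scores, the cut-off `k`, and the use of a Jaccard index as the "local interactome overlap" measure are all assumptions made for illustration.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring partner IDs."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def consensus_partners(pred_a, pred_b, k):
    """Partners that appear in the top k of BOTH predictors."""
    return top_k(pred_a, k) & top_k(pred_b, k)


def jaccard(a, b):
    """Jaccard overlap of two partner sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical scores from two predictors for one query protein.
# The IDs and values are invented; real PIPE4/SPRINT outputs differ.
predictor_a = {"P1": 0.95, "P2": 0.90, "P3": 0.40, "P4": 0.85}
predictor_b = {"P1": 30.0, "P2": 5.0, "P3": 25.0, "P4": 28.0}

query_partners = consensus_partners(predictor_a, predictor_b, k=3)

# "Guilt by association": a candidate whose own predicted partners
# overlap strongly with the query's partners becomes a candidate gene.
candidate_partners = {"P1", "P4", "P7"}
overlap = jaccard(query_partners, candidate_partners)
print(sorted(query_partners), round(overlap, 3))
```

Intersecting the two top-k lists keeps only partners that both predictors rank highly, which is one simple way to read "overlap in their predictions"; the published work may apply different thresholds and a different significance test.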
Affiliation(s)
- Nour Nissan
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, ON, Canada
| | - Julia Hooker
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, ON, Canada
| | - Eric Arezza
- Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
| | - Kevin Dick
- Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
| | - Ashkan Golshani
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, ON, Canada
| | - Benjamin Mimee
- Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu Research and Development Centre, Saint-Jean-sur-Richelieu, QC, Canada
| | - Elroy Cober
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
| | - James Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
| | - Bahram Samanfar
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, ON, Canada
- Department of Biology and Ottawa Institute of Systems Biology, Carleton University, Ottawa, ON, Canada
| |
|
2
|
Kazmirchuk TDD, Bradbury-Jost C, Withey TA, Gessese T, Azad T, Samanfar B, Dehne F, Golshani A. Peptides of a Feather: How Computation Is Taking Peptide Therapeutics under Its Wing. Genes (Basel) 2023; 14:1194. [PMID: 37372372 DOI: 10.3390/genes14061194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 06/29/2023] Open
Abstract
Leveraging computation in the development of peptide therapeutics has garnered increasing recognition as a valuable tool to generate novel therapeutics for disease-related targets. To this end, computation has transformed the field of peptide design through identifying novel therapeutics that exhibit enhanced pharmacokinetic properties and reduced toxicity. The process of in-silico peptide design involves the application of molecular docking, molecular dynamics simulations, and machine learning algorithms. Three primary approaches for peptide therapeutic design including structural-based, protein mimicry, and short motif design have been predominantly adopted. Despite the ongoing progress made in this field, there are still significant challenges pertaining to peptide design including: enhancing the accuracy of computational methods; improving the success rate of preclinical and clinical trials; and developing better strategies to predict pharmacokinetics and toxicity. In this review, we discuss past and present research pertaining to the design and development of in-silico peptide therapeutics in addition to highlighting the potential of computation and artificial intelligence in the future of disease therapeutics.
Affiliation(s)
- Thomas David Daniel Kazmirchuk
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Calvin Bradbury-Jost
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Taylor Ann Withey
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Tadesse Gessese
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Taha Azad
- Department of Microbiology and Infectious Diseases, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CHUS), Sherbrooke, QC J1H 5N4, Canada
| | - Bahram Samanfar
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre (ORDC), Ottawa, ON K1A 0C6, Canada
| | - Frank Dehne
- School of Computer Science, Carleton University, Ottawa, ON K1S 5B6, Canada
| | - Ashkan Golshani
- Department of Biology, and the Ottawa Institute of Systems Biology (OISB), Carleton University, Ottawa, ON K1S 5B6, Canada
| |
|
3
|
Chandra A, Tünnermann L, Löfstedt T, Gratz R. Transformer-based deep learning for predicting protein properties in the life sciences. eLife 2023; 12:82819. [PMID: 36651724 PMCID: PMC9848389 DOI: 10.7554/elife.82819] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 08/22/2022] [Accepted: 01/06/2023] [Indexed: 01/19/2023] Open
Abstract
Recent developments in deep learning, coupled with an increasing number of sequenced proteins, have led to a breakthrough in life science applications, in particular in protein property prediction. There is hope that deep learning can close the gap between the number of sequenced proteins and proteins with known properties based on lab experiments. Language models from the field of natural language processing have gained popularity for protein property predictions and have led to a new computational revolution in biology, where old prediction results are being improved regularly. Such models can learn useful multipurpose representations of proteins from large open repositories of protein sequences and can be used, for instance, to predict protein properties. The field of natural language processing is growing quickly because of developments in a class of models based on a particular model, the Transformer model. We review recent developments and the use of large-scale Transformer models in applications for predicting protein characteristics and how such models can be used to predict, for example, post-translational modifications. We review shortcomings of other deep learning models and explain how the Transformer models have quickly proven to be a very promising way to unravel information hidden in the sequences of amino acids.
Affiliation(s)
- Abel Chandra
- Department of Computing Science, Umeå University, Umeå, Sweden
| | - Laura Tünnermann
- Umeå Plant Science Centre (UPSC), Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, Sweden
| | - Tommy Löfstedt
- Department of Computing Science, Umeå University, Umeå, Sweden
| | - Regina Gratz
- Umeå Plant Science Centre (UPSC), Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, Umeå, Sweden
- Department of Forest Ecology and Management, Swedish University of Agricultural Sciences, Umeå, Sweden
| |
|
4
|
Small RNA Targets: Advances in Prediction Tools and High-Throughput Profiling. BIOLOGY 2022; 11:biology11121798. [PMID: 36552307 PMCID: PMC9775672 DOI: 10.3390/biology11121798] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/14/2022] [Revised: 11/27/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022]
Abstract
MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs that regulate gene expression at the post-transcriptional level. They are suggested to be involved in most biological processes of the cell primarily by targeting messenger RNAs (mRNAs) for cleavage or translational repression. Their binding to their target sites is mediated by the Argonaute (AGO) family of proteins. Thus, miRNA target prediction is pivotal for research and clinical applications. Moreover, transfer-RNA-derived fragments (tRFs) and other types of small RNAs have been found to be potent regulators of Ago-mediated gene expression. Their role in mRNA regulation is still to be fully elucidated, and advancements in the computational prediction of their targets are in their infancy. To shed light on these complex RNA-RNA interactions, the availability of good quality high-throughput data and reliable computational methods is of utmost importance. Even though the arsenal of computational approaches in the field has been enriched in the last decade, there is still a degree of discrepancy between the results they yield. This review offers an overview of the relevant advancements in the field of bioinformatics and machine learning and summarizes the key strategies utilized for small RNA target prediction. Furthermore, we report the recent development of high-throughput sequencing technologies, and explore the role of non-miRNA AGO driver sequences.
|
5
|
Baranwal M, Magner A, Saldinger J, Turali-Emre ES, Elvati P, Kozarekar S, VanEpps JS, Kotov NA, Violi A, Hero AO. Struct2Graph: a graph attention network for structure based predictions of protein–protein interactions. BMC Bioinformatics 2022; 23:370. [PMID: 36088285 PMCID: PMC9464414 DOI: 10.1186/s12859-022-04910-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/09/2022] [Accepted: 08/26/2022] [Indexed: 12/03/2022] Open
Abstract
Background Development of new methods for analysis of protein–protein interactions (PPIs) at molecular and nanometer scales gives insights into intracellular signaling pathways and will improve understanding of protein functions, as well as other nanoscale structures of biological and abiological origins. Recent advances in computational tools, particularly the ones involving modern deep learning algorithms, have been shown to complement experimental approaches for describing and rationalizing PPIs. However, most of the existing works on PPI predictions use protein-sequence information, and thus have difficulties in accounting for the three-dimensional organization of the protein chains. Results In this study, we address this problem and describe a PPI analysis based on a graph attention network, named Struct2Graph, for identifying PPIs directly from the structural data of folded protein globules. Our method is capable of predicting the PPI with an accuracy of 98.89% on the balanced set consisting of an equal number of positive and negative pairs. On the unbalanced set with the ratio of 1:10 between positive and negative pairs, Struct2Graph achieves a fivefold cross validation average accuracy of 99.42%. Moreover, Struct2Graph can potentially identify residues that likely contribute to the formation of the protein–protein complex. The identification of important residues is tested for two different interaction types: (a) Proteins with multiple ligands competing for the same binding area, (b) Dynamic protein–protein adhesion interaction. Struct2Graph identifies interacting residues with 30% sensitivity, 89% specificity, and 87% accuracy. Conclusions In this manuscript, we address the problem of prediction of PPIs using a first of its kind, 3D-structure-based graph attention network (code available at https://github.com/baranwa2/Struct2Graph). 
Furthermore, the novel mutual attention mechanism provides insights into likely interaction sites through its unsupervised knowledge selection process. This study demonstrates that a relatively low-dimensional feature embedding learned from graph structures of individual proteins outperforms other modern machine learning classifiers based on global protein features. In addition, through the analysis of single amino acid variations, the attention mechanism shows preference for disease-causing residue variations over benign polymorphisms, demonstrating that it is not limited to interface residues. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04910-9.
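The sensitivity, specificity, and accuracy figures quoted in this abstract are all derived from a confusion matrix. The sketch below shows the arithmetic on invented counts (they are not taken from the paper) chosen so that negatives dominate, which is how a low sensitivity can coexist with a high overall accuracy.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from raw counts."""
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    specificity = tn / (tn + fp)   # fraction of true negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 10 positive residues, 100 negative residues.
sens, spec, acc = confusion_metrics(tp=3, fn=7, tn=89, fp=11)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because accuracy weights every example equally, it tracks specificity closely when the negative class is much larger; this is worth keeping in mind when reading accuracy values reported on unbalanced sets.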
|
6
|
Reciprocal perspective as a super learner improves drug-target interaction prediction (MUSDTI). Sci Rep 2022; 12:13237. [PMID: 35918366 PMCID: PMC9344797 DOI: 10.1038/s41598-022-16493-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 07/11/2022] [Indexed: 11/08/2022] Open
Abstract
The identification of novel drug-target interactions (DTI) is critical to drug discovery and drug repurposing to address contemporary medical and public health challenges presented by emergent diseases. Historically, computational methods have framed DTI prediction as a binary classification problem (indicating whether or not a drug physically interacts with a given protein target); however, framing the problem instead as a regression-based prediction of the physiochemical binding affinity is more meaningful. With growing databases of experimentally derived drug-target interactions (e.g. Davis, Binding-DB, and Kiba), deep learning-based DTI predictors can be effectively leveraged to achieve state-of-the-art (SOTA) performance. In this work, we formulated a DTI competition as part of the coursework for a senior undergraduate machine learning course and challenged students to generate component DTI models that might surpass SOTA models and effectively combine these component models as part of a meta-model using the Reciprocal Perspective (RP) multi-view learning framework. Following 6 weeks of concerted effort, 28 student-produced component deep-learning DTI models were leveraged in this work to produce a new SOTA RP-DTI model, denoted the Meta Undergraduate Student DTI (MUSDTI) model. Through a series of experiments we demonstrate that (1) RP can considerably improve SOTA DTI prediction, (2) our new double-cold experimental design is more appropriate for emergent DTI challenges, (3) that our novel MUSDTI meta-model outperforms SOTA models, (4) that RP can improve upon individual models as an ensembling method, and finally, (5) RP can be utilized for low computation transfer learning. This work introduces a number of important revelations for the field of DTI prediction and sequence-based, pairwise prediction in general.
|
7
|
Assessing sequence-based protein-protein interaction predictors for use in therapeutic peptide engineering. Sci Rep 2022; 12:9610. [PMID: 35688894 PMCID: PMC9187631 DOI: 10.1038/s41598-022-13227-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 04/25/2022] [Indexed: 12/01/2022] Open
Abstract
Engineering peptides to achieve a desired therapeutic effect through the inhibition of a specific target activity or protein interaction is a non-trivial task. Few of the existing in silico peptide design algorithms generate target-specific peptides. Instead, many methods produce peptides that achieve a desired effect through an unknown mechanism. In contrast with resource-intensive high-throughput experiments, in silico screening is a cost-effective alternative that can prune the space of candidates when engineering target-specific peptides. Using a set of FDA-approved peptides we curated specifically for this task, we assess the applicability of several sequence-based protein–protein interaction predictors as a screening tool within the context of peptide therapeutic engineering. We show that similarity-based protein–protein interaction predictors are more suitable for this purpose than the state-of-the-art deep learning methods publicly available at the time of writing. We also show that this approach is mostly useful when designing new peptides against targets for which naturally-occurring interactors are already known, and that deploying it for de novo peptide engineering tasks may require gathering additional target-specific training data. Taken together, this work offers evidence that supports the use of similarity-based protein–protein interaction predictors for peptide therapeutic engineering, especially peptide analogs.
|
8
|
Dick K, Pattang A, Hooker J, Nissan N, Sadowski M, Barnes B, Tan LH, Burnside D, Phanse S, Aoki H, Babu M, Dehne F, Golshani A, Cober ER, Green JR, Samanfar B. Human-Soybean Allergies: Elucidation of the Seed Proteome and Comprehensive Protein-Protein Interaction Prediction. J Proteome Res 2021; 20:4925-4947. [PMID: 34582199 DOI: 10.1021/acs.jproteome.1c00138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
The soybean crop, Glycine max (L.) Merr., is consumed by humans, Homo sapiens, worldwide. While the respective bodies of literature and -omics data for each of these organisms are extensive, comparatively few studies investigate the molecular biological processes occurring between the two. We are interested in elucidating the network of protein-protein interactions (PPIs) involved in human-soybean allergies. To this end, we leverage state-of-the-art sequence-based PPI predictors amenable to predicting the enormous comprehensive interactome between human and soybean. A network-based analytical approach is proposed, leveraging similar interaction profiles to identify candidate allergens and proteins involved in the allergy response. Interestingly, the predicted interactome can be explored from two complementary perspectives: which soybean proteins are predicted to interact with specific human proteins and which human proteins are predicted to interact with specific soybean proteins. A total of eight proteins (six specific to the human proteome and two to the soy proteome) have been identified and supported by the literature to be involved in human health, specifically related to immunological and neurological pathways. This study, beyond generating the most comprehensive human-soybean interactome to date, elucidated a soybean seed interactome and identified several proteins putatively consequential to human health.
Affiliation(s)
- Kevin Dick
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Arezo Pattang
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Julia Hooker
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Nour Nissan
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Michael Sadowski
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Bradley Barnes
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Le Hoa Tan
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Daniel Burnside
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Sadhna Phanse
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada S4S 0A2
| | - Hiroyuki Aoki
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada S4S 0A2
| | - Mohan Babu
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada S4S 0A2
| | - Frank Dehne
- School of Computer Science, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Ashkan Golshani
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Elroy R Cober
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
| | - James R Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| | - Bahram Samanfar
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, Canada K1A 0C6
- Department of Biology and Institute of Biochemistry, and Ottawa Institute of Systems Biology, Carleton University, Ottawa, Ontario, Canada K1S 5B6
| |
|
9
|
Dick K, Chopra A, Biggar KK, Green JR. Multi-schema computational prediction of the comprehensive SARS-CoV-2 vs. human interactome. PeerJ 2021; 9:e11117. [PMID: 33868814 PMCID: PMC8029698 DOI: 10.7717/peerj.11117] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/07/2020] [Accepted: 02/24/2021] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Understanding the disease pathogenesis of the novel coronavirus, denoted SARS-CoV-2, is critical to the development of anti-SARS-CoV-2 therapeutics. The global propagation of the viral disease, denoted COVID-19 ("coronavirus disease 2019"), has unified the scientific community in searching for possible inhibitory small molecules or polypeptides. A holistic understanding of the SARS-CoV-2 vs. human inter-species interactome promises to identify putative protein-protein interactions (PPI) that may be considered targets for the development of inhibitory therapeutics. METHODS We leverage two state-of-the-art, sequence-based PPI predictors (PIPE4 & SPRINT) capable of generating the comprehensive SARS-CoV-2 vs. human interactome, comprising approximately 285,000 pairwise predictions. Three prediction schemas (all, proximal, RP-PPI) are leveraged to obtain our highest-confidence subset of PPIs and human proteins predicted to interact with each of the 14 SARS-CoV-2 proteins considered in this study. Notably, the use of the Reciprocal Perspective (RP) framework demonstrates improved predictive performance in multiple cross-validation experiments. RESULTS The all schema identified 279 high-confidence putative interactions involving 225 human proteins, the proximal schema identified 129 high-confidence putative interactions involving 126 human proteins, and the RP-PPI schema identified 539 high-confidence putative interactions involving 494 human proteins. The intersection of the three sets of predictions comprise the seven highest-confidence PPIs. Notably, the Spike-ACE2 interaction was the highest ranked for both the PIPE4 and SPRINT predictors with the all and proximal schemas, corroborating existing evidence for this PPI. Several other predicted PPIs are biologically relevant within the context of the original SARS-CoV virus. 
Furthermore, the PIPE-Sites algorithm was used to identify the putative subsequence that might mediate each interaction and thereby inform the design of inhibitory polypeptides intended to disrupt the corresponding host-pathogen interactions. CONCLUSION We publicly released the comprehensive sets of PPI predictions and their corresponding PIPE-Sites landscapes in the following DataVerse repository: https://www.doi.org/10.5683/SP2/JZ77XA. The information provided represents theoretical modeling only and caution should be exercised in its use. It is intended as a resource for the scientific community at large in furthering our understanding of SARS-CoV-2.
Affiliation(s)
- Kevin Dick
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada
- Institute for Data Science, Carleton University, Ottawa, Ontario, Canada
| | - Anand Chopra
- Institute of Biochemistry, Carleton University, Ottawa, Ontario, Canada
- Department of Biology, Carleton University, Ottawa, Ontario, Canada
| | - Kyle K. Biggar
- Institute of Biochemistry, Carleton University, Ottawa, Ontario, Canada
- Department of Biology, Carleton University, Ottawa, Ontario, Canada
| | - James R. Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada
- Institute for Data Science, Carleton University, Ottawa, Ontario, Canada
| |
|
10
|
Kyrollos DG, Reid B, Dick K, Green JR. RPmirDIP: Reciprocal Perspective improves miRNA targeting prediction. Sci Rep 2020; 10:11770. [PMID: 32678114 PMCID: PMC7366700 DOI: 10.1038/s41598-020-68251-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/03/2020] [Accepted: 06/15/2020] [Indexed: 12/16/2022] Open
Abstract
MicroRNAs (miRNAs) are short, non-coding RNAs that interact with messenger RNA (mRNA) to accomplish critical cellular activities such as the regulation of gene expression. Several machine learning methods have been developed to improve classification accuracy and reduce validation costs by predicting which miRNA will target which gene. Application of these predictors to large numbers of unique miRNA–gene pairs has resulted in datasets comprising tens of millions of scored interactions; the largest among these is mirDIP. We here demonstrate that miRNA target prediction can be significantly improved (p < 0.001) through the application of the Reciprocal Perspective (RP) method, a cascaded, semi-supervised machine learning method originally developed for protein-protein interaction prediction. The RP method, aptly named RPmirDIP, augments the original mirDIP prediction scores by leveraging local thresholds from the two complementary views available to each miRNA–gene pair, rather than applying a traditional global decision threshold. Application of this novel RPmirDIP predictor promises to help identify new, unexpected miRNA–gene interactions. A dataset of RPmirDIP-scored interactions is made available to the scientific community at cu-bic.ca/RPmirDIP and 10.5683/SP2/LD8JKJ.
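The idea of judging a pair from two local views rather than against one global threshold can be sketched as follows. This is a deliberately simplified illustration, not the RP method itself: it only computes the rank of each pair within the miRNA's view and within the gene's view, and the identifiers and scores are invented.

```python
from collections import defaultdict


def reciprocal_ranks(scores):
    """Map each (mirna, gene) pair to (rank in miRNA view, rank in gene view).

    Rank 1 is the highest score within that view.
    """
    by_mirna = defaultdict(list)
    by_gene = defaultdict(list)
    for (mirna, gene), score in scores.items():
        by_mirna[mirna].append((score, gene))
        by_gene[gene].append((score, mirna))

    rank_in_mirna_view = {}
    for mirna, items in by_mirna.items():
        ordered = sorted(items, key=lambda item: item[0], reverse=True)
        for rank, (_, gene) in enumerate(ordered, start=1):
            rank_in_mirna_view[(mirna, gene)] = rank

    rank_in_gene_view = {}
    for gene, items in by_gene.items():
        ordered = sorted(items, key=lambda item: item[0], reverse=True)
        for rank, (_, mirna) in enumerate(ordered, start=1):
            rank_in_gene_view[(mirna, gene)] = rank

    return {pair: (rank_in_mirna_view[pair], rank_in_gene_view[pair])
            for pair in scores}


# Hypothetical scores; names and values are illustrative only.
toy_scores = {
    ("miR-a", "G1"): 0.90,
    ("miR-a", "G2"): 0.20,
    ("miR-b", "G1"): 0.95,
    ("miR-b", "G2"): 0.10,
}
ranks = reciprocal_ranks(toy_scores)
print(ranks[("miR-a", "G2")])
```

In this toy data the pair (miR-a, G2) has a low absolute score yet is the best-scoring partner from G2's perspective, which a single global cut-off would miss. The published method builds further features and a cascaded classifier on top of such per-view information.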
Affiliation(s)
- Daniel G Kyrollos
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada
| | - Bradley Reid
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada
| | - Kevin Dick
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada
- Institute of Data Science, Carleton University, Ottawa, Canada
| | - James R Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada
- Institute of Data Science, Carleton University, Ottawa, Canada
| |
|
11
|
Poverennaya EV, Kiseleva OI, Ivanov AS, Ponomarenko EA. Methods of Computational Interactomics for Investigating Interactions of Human Proteoforms. BIOCHEMISTRY (MOSCOW) 2020; 85:68-79. [PMID: 32079518 DOI: 10.1134/s000629792001006x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
The human genome contains ca. 20,000 protein-coding genes that could be translated into millions of unique protein species (proteoforms). Proteoforms coded by a single gene often have different functions, which implies different protein partners. By interacting with each other, proteoforms create a network reflecting the dynamics of cellular processes in an organism. Perturbations of protein-protein interactions change the network topology, which often triggers pathological processes. Studying proteoforms is a relatively new research area in proteomics, and this is why there are comparatively few experimental studies on the interaction of proteoforms. Bioinformatics tools can facilitate such studies by providing valuable complementary information to the experimental data and, in particular, expanding the possibilities of the studies of proteoform interactions.
Affiliation(s)
| | - O I Kiseleva
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
| | - A S Ivanov
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
| | | |
|
12
|
Chen Y, Wang W, Liu J, Feng J, Gong X. Protein Interface Complementarity and Gene Duplication Improve Link Prediction of Protein-Protein Interaction Network. Front Genet 2020; 11:291. [PMID: 32300358 PMCID: PMC7142252 DOI: 10.3389/fgene.2020.00291] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/24/2020] [Accepted: 03/10/2020] [Indexed: 12/20/2022] Open
Abstract
Protein-protein interactions (PPIs) are the foundation of cellular life activities. At present, known PPIs account for only a small fraction of the total. With the development of experimental and computing technology, more and more PPI data are being mined and PPI networks are becoming increasingly dense, which makes it possible to predict PPIs from the perspective of network structure. Although there are many high-throughput experimental methods for detecting PPIs, such experiments are costly and time-consuming and carry a certain error rate. Network-based approaches can provide candidate protein pairs for high-throughput experiments and improve the accuracy rate. This paper presents a new link prediction approach, "Sim," for PPI networks, based on proteins' complementary interfaces and gene duplication. Integrating "Sim" with the state-of-the-art network-based approach "L3" improves prediction accuracy and robustness.
Affiliation(s)
- Yu Chen
  - School of Mathematics, Renmin University of China, Beijing, China
  - School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, China
  - Institute for Mathematical Sciences, Renmin University of China, Beijing, China
- Wei Wang
  - School of Mathematics, Renmin University of China, Beijing, China
- Jiale Liu
  - School of Mathematics, Renmin University of China, Beijing, China
  - Institute for Mathematical Sciences, Renmin University of China, Beijing, China
- Jinping Feng
  - School of Mathematics and Statistics, Henan University, Kaifeng, China
- Xinqi Gong
  - School of Mathematics, Renmin University of China, Beijing, China
  - Institute for Mathematical Sciences, Renmin University of China, Beijing, China

13
Dick K, Samanfar B, Barnes B, Cober ER, Mimee B, Tan LH, Molnar SJ, Biggar KK, Golshani A, Dehne F, Green JR. PIPE4: Fast PPI Predictor for Comprehensive Inter- and Cross-Species Interactomes. Sci Rep 2020; 10:1390. [PMID: 31996697 PMCID: PMC6989690 DOI: 10.1038/s41598-019-56895-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/24/2019] [Accepted: 12/13/2019] [Indexed: 02/06/2023] Open
Abstract
The need for larger-scale and increasingly complex protein-protein interaction (PPI) prediction tasks demands that state-of-the-art predictors be highly efficient and adapted to inter- and cross-species predictions. Furthermore, the ability to generate comprehensive interactomes has enabled the appraisal of each PPI in the context of all predictions leading to further improvements in classification performance in the face of extreme class imbalance using the Reciprocal Perspective (RP) framework. We here describe the PIPE4 algorithm. Adaptation of the PIPE3/MP-PIPE sequence preprocessing step led to upwards of 50x speedup and the new Similarity Weighted Score appropriately normalizes for window frequency when applied to any inter- and cross-species prediction schemas. Comprehensive interactomes for three prediction schemas are generated: (1) cross-species predictions, where Arabidopsis thaliana is used as a proxy to predict the comprehensive Glycine max interactome, (2) inter-species predictions between Homo sapiens-HIV1, and (3) a combined schema involving both cross- and inter-species predictions, where both Arabidopsis thaliana and Caenorhabditis elegans are used as proxy species to predict the interactome between Glycine max (the soybean legume) and Heterodera glycines (the soybean cyst nematode). Comparing PIPE4 with the state-of-the-art resulted in improved performance, indicative that it should be the method of choice for complex PPI prediction schemas.
Affiliation(s)
- Kevin Dick
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- Bahram Samanfar
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
  - Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
- Bradley Barnes
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- Elroy R Cober
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Benjamin Mimee
  - Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu Research and Development Centre, Saint-Jean-sur-Richelieu, J3B 3E6, Quebec, Canada
- Le Hoa Tan
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Stephen J Molnar
  - Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Kyle K Biggar
  - Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
- Ashkan Golshani
  - Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
  - Ottawa Institute of Systems Biology, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
- Frank Dehne
  - School of Computer Science, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- James R Green
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada

14
Lee LYH, Loscalzo J. Network Medicine in Pathobiology. THE AMERICAN JOURNAL OF PATHOLOGY 2019; 189:1311-1326. [PMID: 31014954 DOI: 10.1016/j.ajpath.2019.03.009] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/31/2019] [Accepted: 03/05/2019] [Indexed: 12/11/2022]
Abstract
The past decade has witnessed exponential growth in the generation of high-throughput human data across almost all known dimensions of biological systems. The discipline of network medicine has rapidly evolved in parallel, providing an unbiased, comprehensive biological framework through which to interrogate and integrate systematically these large-scale, multi-omic data to enhance our understanding of disease mechanisms and to design drugs that reflect a deep knowledge of molecular pathobiology. In this review, we discuss the key principles of network medicine and the human disease network and explore the latest applications of network medicine in this multi-omic era. We also highlight the current conceptual and technological challenges, which serve as exciting opportunities by which to improve and expand the network-based applications beyond the artificial boundaries of the current state of human pathobiology.
Affiliation(s)
- Joseph Loscalzo
  - Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts