1
Medvedeva A, Teimouri H, Kolomeisky AB. Predicting Antimicrobial Activity for Untested Peptide-Based Drugs Using Collaborative Filtering and Link Prediction. J Chem Inf Model 2023. [PMID: 37307501 DOI: 10.1021/acs.jcim.3c00137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The increase of bacterial resistance to currently available antibiotics has underlined the urgent need to develop new antibiotic drugs. Antimicrobial peptides (AMPs), alone or in combination with other peptides and/or existing antibiotics, have emerged as promising candidates for this task. However, given that there are thousands of known AMPs and an even larger number can be synthesized, it is impossible to comprehensively test all of them using standard wet lab experimental methods. These observations have stimulated the application of machine-learning methods to identify promising AMPs. Currently, machine-learning studies combine very different bacteria without considering bacteria-specific features or interactions with AMPs. In addition, the sparsity of current AMP data sets rules out the application of traditional machine-learning methods or makes their results unreliable. Here, we present a new approach, featuring neighborhood-based collaborative filtering, to predict with high accuracy a given bacterium's response to untested AMPs based on similarities between bacterial responses. Furthermore, we also developed a complementary bacteria-specific link prediction approach that can be used to visualize networks of AMP-antibiotic combinations, enabling us to propose new combinations that are likely to be effective.
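As a rough illustration of neighborhood-based collaborative filtering on a sparse bacteria-by-AMP table, the sketch below predicts one untested entry from the most similar rows. It is not the authors' implementation: the function names, the cosine similarity measure, the neighbor count `k`, and the toy numbers are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric lists (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_response(matrix, target_row, target_col, k=2):
    """Predict matrix[target_row][target_col] from the k most similar rows.

    matrix: list of rows (hypothetical bacteria), each a list of responses to
    AMPs, with None marking an untested combination.
    Returns None when no usable neighbor exists.
    """
    target = matrix[target_row]
    neighbours = []
    for i, row in enumerate(matrix):
        # A neighbor must be a different row that was tested on the target AMP.
        if i == target_row or row[target_col] is None:
            continue
        # Compare only on AMPs that both bacteria were tested against.
        shared = [j for j in range(len(row))
                  if target[j] is not None and row[j] is not None]
        if len(shared) < 2:
            continue
        sim = cosine([target[j] for j in shared], [row[j] for j in shared])
        if sim > 0:
            neighbours.append((sim, row[target_col]))
    if not neighbours:
        return None
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    total = sum(sim for sim, _ in top)
    # Similarity-weighted average of the neighbors' known responses.
    return sum(sim * value for sim, value in top) / total


# Toy data (made up): rows are bacteria, columns are AMPs.
responses = [
    [1.0, 2.0, None],   # target bacterium, third AMP untested
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 5.0],
    [2.0, -1.0, 9.0],   # orthogonal response profile, ignored as a neighbor
]
prediction = predict_response(responses, target_row=0, target_col=2)
```

Restricting the similarity to co-tested AMPs is one simple way to cope with the sparsity the abstract mentions; other similarity measures or neighbor thresholds would be equally plausible choices.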
Affiliation(s)
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Hamid Teimouri
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Anatoly B Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
  - Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
2
Howlett-Prieto Q, Oommen C, Carrithers MD, Wunsch DC, Hier DB. Subtypes of relapsing-remitting multiple sclerosis identified by network analysis. Front Digit Health 2023; 4:1063264. [PMID: 36714613 PMCID: PMC9874946 DOI: 10.3389/fdgth.2022.1063264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 12/22/2022] [Indexed: 01/12/2023] Open
Abstract
We used network analysis to identify subtypes of relapsing-remitting multiple sclerosis subjects based on their cumulative signs and symptoms. The electronic medical records of 113 subjects with relapsing-remitting multiple sclerosis were reviewed, signs and symptoms were mapped to classes in a neuro-ontology, and classes were collapsed into sixteen superclasses by subsumption. After normalization and vectorization of the data, bipartite (subject-feature) and unipartite (subject-subject) network graphs were created using NetworkX and visualized in Gephi. Degree and weighted degree were calculated for each node. Graphs were partitioned into communities using the modularity score. Feature maps visualized differences in features by community. Network analysis of the unipartite graph yielded a higher modularity score (0.49) than the bipartite graph (0.25). The bipartite network was partitioned into five communities which were named fatigue, behavioral, hypertonia/weakness, abnormal gait/sphincter, and sensory, based on feature characteristics. The unipartite network was partitioned into five communities which were named fatigue, pain, cognitive, sensory, and gait/weakness/hypertonia based on features. Although we did not identify pure subtypes (e.g., pure motor, pure sensory, etc.) in this cohort of multiple sclerosis subjects, we demonstrated that network analysis could partition these subjects into different subtype communities. Larger datasets and additional partitioning algorithms are needed to confirm these findings and elucidate their significance. This study contributes to the literature investigating subtypes of multiple sclerosis by combining feature reduction by subsumption with network analysis.
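The step from a bipartite subject-feature graph to a unipartite subject-subject graph, and the modularity score of a partition, can be sketched as follows. The study built its graphs with NetworkX and visualized them in Gephi; this sketch instead uses only the Python standard library so that it stays self-contained, and the function names, subject IDs, and feature sets are hypothetical.

```python
from itertools import combinations


def project_subjects(features_by_subject):
    """Project subject-feature data onto a weighted subject-subject graph.

    The edge weight between two subjects is the number of features they share
    (one simple weighting; the study's exact weighting is not stated here).
    """
    weights = {}
    for a, b in combinations(sorted(features_by_subject), 2):
        shared = len(features_by_subject[a] & features_by_subject[b])
        if shared:
            weights[(a, b)] = shared
    return weights


def modularity(weights, communities):
    """Newman modularity Q of a partition of a weighted undirected graph.

    Q = sum over communities of (internal weight / m) - (community degree / 2m)^2,
    where m is the total edge weight.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    community_of = {node: idx
                    for idx, members in enumerate(communities)
                    for node in members}
    internal = [0.0] * len(communities)
    degree = [0.0] * len(communities)
    for (a, b), w in weights.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in range(len(communities)))


# Made-up subjects with superclass-level features.
subjects = {
    "s1": {"fatigue", "pain"},
    "s2": {"fatigue", "pain"},
    "s3": {"sensory", "gait"},
    "s4": {"sensory", "gait"},
}
edges = project_subjects(subjects)
q_split = modularity(edges, [{"s1", "s2"}, {"s3", "s4"}])
q_merged = modularity(edges, [{"s1", "s2", "s3", "s4"}])
```

A partitioning algorithm searches for the community assignment that maximizes this score; the sketch only evaluates partitions that are supplied by hand.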
Affiliation(s)
- Quentin Howlett-Prieto
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Chelsea Oommen
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Michael D. Carrithers
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Donald C. Wunsch
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
- Daniel B. Hier
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
  - Correspondence: Daniel B. Hier
3
Wang S, Wu R, Lu J, Jiang Y, Huang T, Cai YD. Protein-protein interaction networks as miners of biological discovery. Proteomics 2022; 22:e2100190. [PMID: 35567424 DOI: 10.1002/pmic.202100190] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/28/2021] [Revised: 03/28/2022] [Accepted: 04/29/2022] [Indexed: 11/12/2022]
Abstract
Protein-protein interactions (PPIs) form the basis of a myriad of biological pathways and mechanisms, such as the formation of protein complexes or the components of signaling cascades. Here, we reviewed experimental methods for identifying PPI pairs, including yeast two-hybrid, mass spectrometry, co-localization, and co-immunoprecipitation. Furthermore, a range of computational methods leveraging biochemical properties, evolutionary history, protein structures and more have enabled identification of additional PPIs. Given the wealth of known PPIs, we reviewed important network methods to construct and analyze networks of PPIs. These methods aid biological discovery by identifying hub genes and dynamic changes in the network, and have been widely applied in various fields of biological research. Lastly, we discussed the challenges and future direction of research utilizing the power of PPI networks.
Affiliation(s)
- Steven Wang
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Runxin Wu
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jiaqi Lu
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN, USA
- Yijia Jiang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tao Huang
  - Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China
4
Yang J, Shu L, Duan H, Li H. A Visual Phenotype-Based Differential Diagnosis Process for Rare Diseases. Interdiscip Sci 2021; 14:331-348. [PMID: 34751921 DOI: 10.1007/s12539-021-00490-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2021] [Revised: 10/23/2021] [Accepted: 10/28/2021] [Indexed: 02/01/2023]
Abstract
PURPOSE: Phenotype-based rapid diagnosis can compensate for the time-consuming genetic sequencing diagnosis of rare diseases. However, the collected phenotypes of patients can sometimes be inaccurate or incomplete, which limits the accuracy of diagnostic results. To solve this problem, we designed a phenotype-based differential diagnosis process for rare diseases to achieve rapid and accurate diagnosis. METHODS: The core of the differential diagnosis of rare diseases is the optimization of the phenotype information of a specific patient and the visualized comparative analysis of diseases. To recommend additional phenotypes, replace the fuzzy phenotypes and filter the unexplained phenotypes for patients, we constructed a phenotype hierarchical network and a disease-phenotype differential network and calculated the phenotype co-occurrence relationship. In addition, we designed a visual comparative analysis method to explore the correlation and difference of disease phenotypes. RESULTS: The evaluation based on 10 published rare disease cases demonstrated that after the optimization of patient phenotype information through our differential diagnosis, the target disease often got a better ranking and recommendation score than before. We have deployed this scheme on the RDmap project ( http://rdmap.nbscn.org ). CONCLUSION: Compared to genetic and molecular analysis, phenotype-based diagnosis is faster, cheaper, and easier. The differential diagnosis process we designed can optimize the phenotype information of patients and better locate the target disease. It can also help to make screening decisions before genetic testing.
Affiliation(s)
- Jian Yang
  - The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, 310052, Zhejiang, China
  - The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang, China
- Liqi Shu
  - Rhode Island Hospital, Warren Alpert Medical School of Brown University, Rhode Island, USA
- Huilong Duan
  - The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang, China
- Haomin Li
  - The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Binsheng Road 3333#, Hangzhou, 310052, Zhejiang, China
5
Pourreza Shahri M, Kahanda I. Deep semi-supervised learning ensemble framework for classifying co-mentions of human proteins and phenotypes. BMC Bioinformatics 2021; 22:500. [PMID: 34656098 PMCID: PMC8520253 DOI: 10.1186/s12859-021-04421-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2021] [Accepted: 10/04/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Identifying human protein-phenotype relationships has attracted researchers in bioinformatics and biomedical natural language processing due to its importance in uncovering rare and complex diseases. Since experimental validation of protein-phenotype associations is prohibitive, automated tools capable of accurately extracting these associations from the biomedical text are in high demand. However, while the manual annotation of protein-phenotype co-mentions required for training such models is highly resource-consuming, extracting millions of unlabeled co-mentions is straightforward. Results: In this study, we propose a novel deep semi-supervised ensemble framework that combines deep neural networks, semi-supervised learning, and ensemble learning for classifying human protein-phenotype co-mentions with the help of unlabeled data. This framework makes it possible to incorporate an extensive collection of unlabeled sentence-level co-mentions of human proteins and phenotypes with a small labeled dataset to enhance overall performance. We develop PPPredSS, a prototype of our proposed semi-supervised framework that combines sophisticated language models, convolutional networks, and recurrent networks. Our experimental results demonstrate that the proposed approach provides a new state-of-the-art performance in classifying human protein-phenotype co-mentions by outperforming other supervised and semi-supervised counterparts. Furthermore, we highlight the utility of PPPredSS in powering a curation assistant system through case studies involving a group of biologists. Conclusions: This article presents a novel approach for human protein-phenotype co-mention classification based on deep, semi-supervised, and ensemble learning. The insights and findings from this work have implications for biomedical researchers, biocurators, and the text mining community working on biomedical relationship extraction.
Affiliation(s)
- Indika Kahanda
  - School of Computing, University of North Florida, Jacksonville, USA
6
Saikia SJ, Nirmala SR. Identification of disease genes and assessment of eye-related diseases caused by disease genes using JMFC and GDLNN. Comput Methods Biomech Biomed Engin 2021; 25:359-370. [PMID: 34384296 DOI: 10.1080/10255842.2021.1955358] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/09/2023]
Abstract
Early detection of disease genes helps humans to recover from certain gene-related diseases, like genetic eye diseases. This work identifies the possibility of eye diseases for the disease genes utilizing a Gaussian-activation function (G)-centric deep learning neural network (GDLNN) model. In this work, human genes are selected by computing structural similarity and genes are clustered as disease genes and normal genes by using the JMFC clustering algorithm. Levy flight and Crossover and Mutation (LCM) centric Chicken Swarm Optimization (LCM-CSO) is employed for feature selection and GDLNN classifies the eye-related diseases for the input genes using the selected features.
Affiliation(s)
- Samar Jyoti Saikia
  - Department of Electronics and Communication Engineering, Gauhati University, Guwahati, Assam, India
  - Department of Electronics and Communication Engineering, Assam Don Bosco University, Guwahati, Assam, India
- S R Nirmala
  - Department of Electronics and Communication Engineering, Gauhati University, Guwahati, Assam, India
  - School of Electronics and Communication Engineering, KLE Technological University, Hubli, Karnataka, India
7
Díaz-Santiago E, Claros MG, Yahyaoui R, de Diego-Otero Y, Calvo R, Hoenicka J, Palau F, Ranea JAG, Perkins JR. Decoding Neuromuscular Disorders Using Phenotypic Clusters Obtained From Co-Occurrence Networks. Front Mol Biosci 2021; 8:635074. [PMID: 34046427 PMCID: PMC8147726 DOI: 10.3389/fmolb.2021.635074] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/29/2020] [Accepted: 02/15/2021] [Indexed: 12/19/2022] Open
Abstract
Neuromuscular disorders (NMDs) represent an important subset of rare diseases associated with elevated morbidity and mortality whose diagnosis can take years. Here we present a novel approach using systems biology to produce functionally-coherent phenotype clusters that provide insight into the cellular functions and phenotypic patterns underlying NMDs, using the Human Phenotype Ontology as a common framework. Gene and phenotype information was obtained for 424 NMDs in OMIM and 126 NMDs in Orphanet, and 335 and 216 phenotypes were identified as typical for NMDs, respectively. ‘Elevated serum creatine kinase’ was the most specific to NMDs, in agreement with the clinical test of elevated serum creatine kinase that is conducted on NMD patients. The approach to obtain co-occurring NMD phenotypes was validated based on co-mention in PubMed abstracts. A total of 231 (OMIM) and 150 (Orphanet) clusters of highly connected co-occurrent NMD phenotypes were obtained. In parallel, a tripartite network based on phenotypes, diseases and genes was used to associate NMD phenotypes with functions, an approach also validated by literature co-mention, with KEGG pathways showing proportionally higher overlap than Gene Ontology and Reactome. Phenotype-function pairs were crossed with the co-occurrent NMD phenotype clusters to obtain 40 (OMIM) and 72 (Orphanet) functionally coherent phenotype clusters. As expected, many of these overlapped with known diseases and confirmed existing knowledge. Other clusters revealed interesting new findings, indicating informative phenotypes for differential diagnosis, providing deeper knowledge of NMDs, and pointing towards specific cell dysfunction caused by pleiotropic genes.
This work is an example of reproducible research that i) can help better understand NMDs and support their diagnosis by providing a new tool that exploits existing information to obtain novel clusters of functionally-related phenotypes, and ii) takes us another step towards personalised medicine for NMDs.
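To make the idea of a phenotype co-occurrence network concrete, the sketch below counts how often two phenotypes are annotated to the same disease, which gives the weighted edges of such a network. This is plain pair counting chosen for illustration; the abstract does not state which association measure the authors used, and the function name, disease labels, and annotations are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(phenotypes_by_disease):
    """Count, for each phenotype pair, the number of diseases annotated with both.

    phenotypes_by_disease: dict mapping a disease label to an iterable of
    phenotype terms. Pairs are stored in sorted order so (a, b) == (b, a).
    """
    counts = Counter()
    for phenotypes in phenotypes_by_disease.values():
        for pair in combinations(sorted(set(phenotypes)), 2):
            counts[pair] += 1
    return counts


# Invented annotations, for illustration only.
annotations = {
    "disease_1": ["Muscle weakness", "Elevated serum creatine kinase", "Ptosis"],
    "disease_2": ["Muscle weakness", "Elevated serum creatine kinase"],
    "disease_3": ["Ptosis"],
}
pairs = cooccurrence_counts(annotations)
```

In practice the raw counts would usually be normalized or tested for significance before clustering, since frequent phenotypes co-occur with many others by chance.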
Affiliation(s)
- Elena Díaz-Santiago
  - Department of Molecular Biology and Biochemistry, Universidad de Málaga, Málaga, Spain
- M Gonzalo Claros
  - Department of Molecular Biology and Biochemistry, Universidad de Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Madrid, Spain
  - Institute of Biomedical Research in Malaga (IBIMA), IBIMA-RARE, Málaga, Spain
  - Institute for Mediterranean and Subtropical Horticulture "La Mayora" (IHSM-UMA-CSIC), Málaga, Spain
- Raquel Yahyaoui
  - Institute of Biomedical Research in Malaga (IBIMA), IBIMA-RARE, Málaga, Spain
  - Laboratory of Metabolopathies and Neonatal Screening, Málaga Regional University Hospital, Málaga, Spain
- Rocío Calvo
  - Institute of Biomedical Research in Malaga (IBIMA), IBIMA-RARE, Málaga, Spain
  - Laboratory of Metabolopathies and Neonatal Screening, Málaga Regional University Hospital, Málaga, Spain
- Janet Hoenicka
  - CIBER de Enfermedades Raras (CIBERER), Madrid, Spain
  - Sant Joan de Déu Hospital and Research Institute, Barcelona, Spain
- Francesc Palau
  - CIBER de Enfermedades Raras (CIBERER), Madrid, Spain
  - Sant Joan de Déu Hospital and Research Institute, Barcelona, Spain
  - Hospital Clínic and University of Barcelona School of Medicine and Health Sciences, Barcelona, Spain
- Juan A G Ranea
  - Department of Molecular Biology and Biochemistry, Universidad de Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Madrid, Spain
  - Institute of Biomedical Research in Malaga (IBIMA), IBIMA-RARE, Málaga, Spain
- James R Perkins
  - Department of Molecular Biology and Biochemistry, Universidad de Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Madrid, Spain
  - Institute of Biomedical Research in Malaga (IBIMA), IBIMA-RARE, Málaga, Spain
8
Gu C, Shi X, Dang X, Chen J, Chen C, Chen Y, Pan X, Huang T. Identification of Common Genes and Pathways in Eight Fibrosis Diseases. Front Genet 2021; 11:627396. [PMID: 33519923 PMCID: PMC7844395 DOI: 10.3389/fgene.2020.627396] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 11/09/2020] [Accepted: 12/15/2020] [Indexed: 01/05/2023] Open
Abstract
Acute and chronic inflammation often leads to fibrosis, which is also the common and final pathological outcome of chronic inflammatory diseases. To explore the common genes and pathogenic pathways among different fibrotic diseases, we collected all the reported genes of eight fibrotic diseases: eye fibrosis, heart fibrosis, hepatic fibrosis, intestinal fibrosis, lung fibrosis, pancreas fibrosis, renal fibrosis, and skin fibrosis. We calculated the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment scores of all fibrotic disease genes. Each gene was encoded using KEGG and GO enrichment scores, which reflected how much a gene can affect a given function. For each fibrotic disease, the key KEGG and GO features were identified by comparing the KEGG and GO enrichment scores between reported disease genes and other genes using the Monte Carlo feature selection (MCFS) method. We compared the gene overlaps among the eight fibrotic diseases, and connective tissue growth factor (CTGF) was identified as the common key molecule. The key KEGG and GO features of the eight fibrotic diseases were all screened by the MCFS method. Moreover, we found overlaps of pathways between renal fibrosis and skin fibrosis, such as GO:1901890-positive regulation of cell junction assembly, as well as common regulatory genes, such as CTGF, which is the key molecule regulating fibrogenesis. We hope to offer new insight into the cellular and molecular mechanisms underlying fibrosis and thereby help lead to the development of new drugs that specifically delay or even improve the symptoms of fibrosis.
Affiliation(s)
- Chang Gu
  - Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
  - Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Xin Shi
  - Department of Cardiology, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Xuening Dang
  - Department of Colorectal and Anal Surgery, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Shanghai Colorectal Cancer Research Center, Shanghai, China
- Jiafei Chen
  - Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Chunji Chen
  - Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Yumei Chen
  - Department of Nuclear Medicine, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xufeng Pan
  - Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China