1
He L, Sun H, Mo Q, Xiao Q, Yang K, Chen X, Zhu H, Tong X, Yao X, Chen J, Yao Z. A multi-module structure labelled molecular network orients the chemical profiles of traditional Chinese medicine prescriptions: Xiaoyao San, as an example. J Chromatogr A 2024; 1715:464613. [PMID: 38184988 DOI: 10.1016/j.chroma.2023.464613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 12/22/2023] [Accepted: 12/26/2023] [Indexed: 01/09/2024]
Abstract
Ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS) technology has emerged as a crucial tool for identifying components in traditional Chinese medicine (TCM). However, characterizing the chemical profiles of TCM prescriptions (TCMPs), which often consist of multiple herbal medicines and contain diverse structural types, presents several challenges, such as overlapping components and time-consuming analysis. In this study, a novel strategy known as the multi-module structure labelled molecular network (MSLMN), which integrates molecular networking, database annotation, and cluster analysis techniques, is proposed. It facilitates the identification of chemical constituents by leveraging a high-structural-similarity ion list derived from the MSLMN. The strategy was applied to analyze the chemical profile of Xiaoyao San (XYS), a classical TCMP. Through the MSLMN method, a total of 302 chemical constituents were identified, covering nine structural types in XYS. Furthermore, a validated quantitative analytical method using UHPLC-QqQ-MS/MS technology was developed for 31 identified chemicals, encompassing all eight herbal medicines present in XYS, and the developed analytical approach was applied to investigate the content distribution across 40 different batches of commercially available XYS. Overall, the proposed strategy has practical significance for improving insight into the chemical profile of XYS and serves as a valuable approach for handling complex system data based on UHPLC-MS, particularly for TCMPs.
Affiliation(s)
- Liangliang He
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Heng Sun
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Qingmei Mo
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Qiang Xiao
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Kefeng Yang
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Xintong Chen
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Haodong Zhu
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Xupeng Tong
- Hangzhou Chenfeng Qingxing Technology Co., Ltd, Hangzhou 310000, China
- Xinsheng Yao
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China
- Jiaxu Chen
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China; Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, School of Traditional Chinese Medicine, Jinan University, Guangzhou 510632, China.
- Zhihong Yao
- International Cooperative Laboratory of Traditional Chinese Medicine Modernization and Innovative Drug Development of Ministry of Education (MOE) of China, State Key Laboratory of Bioactive Molecules and Druggability Assessment, and Institute of Traditional Chinese Medicine & Natural Products, College of Pharmacy, Jinan University, Guangzhou 510632, China; Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, School of Traditional Chinese Medicine, Jinan University, Guangzhou 510632, China.
2
Wang Y, Zou J, Wang K, Liu C, Yuan X. Semi-supervised deep embedded clustering with pairwise constraints and subset allocation. Neural Netw 2023; 164:310-322. [PMID: 37163847 DOI: 10.1016/j.neunet.2023.04.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/29/2022] [Revised: 03/08/2023] [Accepted: 04/11/2023] [Indexed: 05/12/2023]
Abstract
Semi-supervised deep clustering methods attract much attention due to their excellent performance on the end-to-end clustering task. However, it is hard to obtain satisfying clustering results since many overlapping samples in industrial text datasets strongly and incorrectly influence the learning process. Existing methods incorporate prior knowledge in the form of pairwise constraints or class labels, which not only largely ignores the correlation between these two kinds of supervision information but also causes the problem of weak-supervised constraints or incorrect strong-supervised label guidance. To tackle these problems, we propose a semi-supervised method based on pairwise constraints and subset allocation (PCSA-DEC). We redefine the similarity-based constraint loss by forcing the similarity of samples in the same class to be much higher than that of other samples, and we design a novel subset allocation loss to precisely learn the strong-supervised information contained in labels that are consistent with unlabeled data. Experimental results on the two industrial text datasets show that our method can yield an 8.2%-8.7% improvement in accuracy and a 13.4%-19.8% improvement in normalized mutual information over the state-of-the-art method.
Affiliation(s)
- Yalin Wang
- School of Automation, Central South University, Changsha, 410083, Hunan, China.
- Jiangfeng Zou
- School of Automation, Central South University, Changsha, 410083, Hunan, China.
- Kai Wang
- School of Automation, Central South University, Changsha, 410083, Hunan, China.
- Chenliang Liu
- School of Automation, Central South University, Changsha, 410083, Hunan, China.
- Xiaofeng Yuan
- School of Automation, Central South University, Changsha, 410083, Hunan, China.
3
Wang J, Meng S, Lin K, Yi X, Sun Y, Xu X, He N, Zhang Z, Hu H, Qie X, Zhang D, Tang Y, Huang WE, He J, Song Y. Leveraging single-cell Raman spectroscopy and single-cell sorting for the detection and identification of yeast infections. Anal Chim Acta 2023; 1239:340658. [PMID: 36628751 DOI: 10.1016/j.aca.2022.340658] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Revised: 11/21/2022] [Accepted: 11/21/2022] [Indexed: 11/26/2022]
Abstract
Invasive fungal infection poses a great threat to human health. Discrimination between fungal and bacterial infections at the earliest stage is vital for effective clinical practice; however, traditional culture-dependent microscopic diagnosis of fungal infection usually requires several days, while culture-independent immunological and molecular methods are limited by the detectable types of pathogens and by high false-positive rates. In this study, we propose a novel culture-independent phenotyping method based on single-cell Raman spectroscopy for the rapid discrimination between fungal and bacterial infections. Three Raman biomarkers, namely cytochrome c, peptidoglycan, and nucleic acid, were identified through hierarchical clustering analysis of Raman spectra across 12 of the most common yeast and bacterial pathogens. Compared to those of bacterial pathogens, the single cells of yeast pathogens demonstrated significantly stronger Raman peaks for cytochrome c, but weaker signals for peptidoglycan and nucleic acid. A two-step protocol combining the three biomarkers was established and was able to differentiate fungal infections from bacterial infections with an overall accuracy of 94.9%. Our approach was also used to test ten raw urinary tract infection samples. Successful identification of fungi was achieved within half an hour after sample collection. We further demonstrated accurate fungal species taxonomy achieved with Raman-assisted cell ejection. Our findings demonstrate that Raman-based fungal identification is a novel, facile, and reliable approach with broad coverage that has great potential to be adopted in routine clinical practice to reduce the turn-around time of invasive fungal disease (IFD) diagnostics.
Affiliation(s)
- Jingkai Wang
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Division of Life Sciences and Medicine, School of Biomedical Engineering (Suzhou), University of Science and Technology of China, Suzhou, 215163, China
- Siyu Meng
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Kaicheng Lin
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Xiaofei Yi
- Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, 200040, China; National Clinical Research Center for Aging and Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Yixiang Sun
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Xiaogang Xu
- Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, 200040, China; National Clinical Research Center for Aging and Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Na He
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Zhiqiang Zhang
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Huijie Hu
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Division of Life Sciences and Medicine, School of Biomedical Engineering (Suzhou), University of Science and Technology of China, Suzhou, 215163, China
- Xingwang Qie
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Dayi Zhang
- College of New Energy and Environment, Jilin University, Changchun, 130021, PR China
- Yuguo Tang
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Wei E Huang
- Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, UK
- Jian He
- State Key Laboratory of Oncogenes and Related Genes, Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Yizhi Song
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Division of Life Sciences and Medicine, School of Biomedical Engineering (Suzhou), University of Science and Technology of China, Suzhou, 215163, China.
4
Molecular docking, network pharmacology and experimental verification to explore the mechanism of Wulongzhiyangwan in the treatment of pruritus. Sci Rep 2023; 13:361. [PMID: 36611103 PMCID: PMC9825397 DOI: 10.1038/s41598-023-27593-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 01/04/2023] [Indexed: 01/09/2023] Open
Abstract
Wulongzhiyangwan (WLZYW) is a Chinese prescription medicine for the treatment of pruritus, but its mechanism has not been clarified. The purpose of this study was to explore the mechanism of WLZYW in pruritus through network pharmacology analysis and experimental validation. The active components and corresponding targets of WLZYW were obtained from the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database. Pruritus-related targets were obtained from the GeneCards, TTD (Therapeutic Target Database), and DrugBank databases. The key compounds, core targets, main biological processes and signaling pathways related to WLZYW were identified by constructing and analyzing related networks. The binding affinity between WLZYW components and core targets was validated with AutoDock Vina software. RBL-2H3 cells were used to construct a degranulation model to simulate histamine-dependent pruritus. In total, 10 chemical constituents and 235 targets of WLZYW, together with 3606 pruritus-related targets, were obtained. Subsequently, 26 core targets were identified through analysis; VEGFA and AKT1 were the main candidates. A pathway enrichment analysis showed that overlapping targets were significantly enriched in the PI3K/AKT signaling pathway. A molecular docking analysis revealed tight binding of VEGF to three core compounds: kaempferol, luteolin and quercetin. Experiments showed that WLZYW inhibited mast cell degranulation and regulated VEGFa mRNA and protein expression levels by inhibiting PI3K/AKT and ERK1/2 signaling pathway activation. The mechanism of WLZYW in pruritus may involve regulating VEGFa expression. Network pharmacology assays suggested that WLZYW downregulates VEGFa expression by regulating the PI3K/AKT and ERK1/2 signaling pathways in pruritus treatment.
5
Manipur I, Giordano M, Piccirillo M, Parashuraman S, Maddalena L. Community Detection in Protein-Protein Interaction Networks and Applications. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:217-237. [PMID: 34951849 DOI: 10.1109/tcbb.2021.3138142] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
The ability to identify and characterize not only the protein-protein interactions but also their internal modular organization through network analysis is fundamental for understanding the mechanisms of biological processes at the molecular level. Indeed, the detection of the network communities can enhance our understanding of the molecular basis of disease pathology, and promote drug discovery and disease treatment in personalized medicine. This work gives an overview of recent computational methods for the detection of protein complexes and functional modules in protein-protein interaction networks, also providing a focus on some of its applications. We propose a systematic reformulation of frequently adopted taxonomies for these methods, also proposing new categories to keep up with the most recent research. We review the literature of the last five years (2017-2021) and provide links to existing data and software resources. Finally, we survey recent works exploiting module identification and analysis, in the context of a variety of disease processes for biomarker identification and therapeutic target detection. Our review provides the interested reader with an up-to-date and self-contained view of the existing research, with links to state-of-the-art literature and resources, as well as hints on open issues and future research directions in complex detection and its applications.
6
Pan X, Hu L, Hu P, You ZH. Identifying Protein Complexes From Protein-Protein Interaction Networks Based on Fuzzy Clustering and GO Semantic Information. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2882-2893. [PMID: 34242171 DOI: 10.1109/tcbb.2021.3095947] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/13/2023]
Abstract
Protein complexes provide valuable insights into the mechanisms of the biological processes in which proteins take part. A variety of computational algorithms have thus been proposed to identify protein complexes in a protein-protein interaction network. However, few of them take into account both network topology and protein attribute information in a unified fuzzy-based clustering framework. Since proteins in the same complex are similar in terms of their attribute information, and fuzzy clustering also makes it possible to identify overlapping complexes, we propose such a fuzzy-based clustering framework, namely FCAN-PCI, for improved identification accuracy. To do so, the semantic similarity between the attribute information of proteins is calculated and then integrated, together with the network topology, into a well-established fuzzy clustering model. After that, a momentum method is adopted to accelerate the clustering procedure. Finally, FCAN-PCI applies a heuristic search strategy to identify overlapping protein complexes. A series of extensive experiments have been conducted to evaluate the performance of FCAN-PCI by comparing it with state-of-the-art identification algorithms, and the results demonstrate the promising performance of FCAN-PCI.
7
Chen Z, Liang B, Wu Y, Zhou H, Wang Y, Wu H. Identifying driver modules based on multi-omics biological networks in prostate cancer. IET Syst Biol 2022; 16:187-200. [PMID: 36039671 PMCID: PMC9675413 DOI: 10.1049/syb2.12050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2022] [Revised: 07/31/2022] [Accepted: 08/13/2022] [Indexed: 01/11/2023] Open
Abstract
The development of sequencing technology has promoted the expansion of cancer genome data. It is necessary to identify the pathogenesis of cancer at the molecular level and to explore reliable treatment methods and precise drug targets by identifying carcinogenic functional modules in massive multi-omics data. However, identifying carcinogenic driver modules from genetic characteristics alone still has limitations. Therefore, this study proposes a computational method, NetAP, to identify driver modules in prostate cancer. Firstly, high mutual exclusivity, high coverage, and high topological similarity between genes are integrated to construct a weight function, which calculates the weight of gene pairs in a biological network. Secondly, the random walk method is utilised to reevaluate the strength of interaction among genes. Finally, the optimal driver modules are identified by utilising the affinity propagation algorithm. According to the results, the authors' method identifies more validated driver genes and driver modules than previous methods. Thus, the proposed NetAP method can identify carcinogenic driver modules effectively and reliably, and the experimental results provide a powerful basis for cancer diagnosis, treatment and drug target discovery.
Affiliation(s)
- Zhongli Chen
- Tibet Center for Disease Control and Prevention, Lhasa, China; School of Software, Shandong University, Jinan, China; School of Information Engineering, Northwest A&F University, Yangling, China
- Biting Liang
- School of Information Engineering, Northwest A&F University, Yangling, China
- Yingfu Wu
- School of Information Engineering, Northwest A&F University, Yangling, China
- Haoru Zhou
- School of Information Engineering, Northwest A&F University, Yangling, China
- Yuchen Wang
- School of Software, Shandong University, Jinan, China
- Hao Wu
- School of Software, Shandong University, Jinan, China
8
Bonomo M, Giancarlo R, Greco D, Rombo SE. Topological ranks reveal functional knowledge encoded in biological networks: a comparative analysis. Brief Bioinform 2022; 23:6563936. [PMID: 35381599 DOI: 10.1093/bib/bbac101] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/25/2021] [Revised: 01/31/2022] [Accepted: 02/28/2022] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION: The topology of biological networks yields important insights into biological function, occurrence of diseases and drug design. In the last few years, different types of topological measures have been introduced and applied to infer the biological relevance of network components/interactions, according to their position within the network structure. Although comparisons of such measures have been previously proposed, to what extent the topology per se may lead to the extraction of novel biological knowledge has never been critically examined or formalized in the literature. RESULTS: We present a comparative analysis of nine outstanding topological measures, based on compact views obtained from the rank they induce on a given input biological network. The goal is to understand their ability to correctly position nodes/edges in the rank, according to the functional knowledge implicitly encoded in biological networks. To this aim, both internal and external (gold standard) validation criteria are taken into account, and six networks involving three different organisms (yeast, worm and human) are included in the comparison. The results show that a distinct handful of best-performing measures can be identified for each of the considered organisms, independently of the reference gold standard. AVAILABILITY: Input files and code for the computation of the considered topological measures and K-haus distance are available at https://gitlab.com/MaryBonomo/ranking. CONTACT: simona.rombo@unipa.it. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Mariella Bonomo
- Department of Engineering, University of Palermo, Palermo, 90121, Italy
- Raffaele Giancarlo
- Department of Mathematics and Computer Science, University of Palermo, Palermo, 90121, Italy
- Daniele Greco
- Department of Mathematics and Computer Science, University of Palermo, Palermo, 90121, Italy
- Simona E Rombo
- Department of Mathematics and Computer Science, University of Palermo, Palermo, 90121, Italy
9
Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, mining and effectively utilizing the module structure remains challenging in problems such as disease-gene prediction. RESULTS: We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. Through a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods, as well as its further performance improvement derived from the parameter estimation. CONCLUSIONS: The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in problems such as disease-gene prediction.
Affiliation(s)
- Ju Xiang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
- Xiangmao Meng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Fang-Xiang Wu
- Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
10
Liu D, Qiu M. Immune and Metabolic Dysregulated Coding and Non-coding RNAs Reveal Survival Association in Uterine Corpus Endometrial Carcinoma. Front Genet 2021; 12:673192. [PMID: 34249094 PMCID: PMC8264798 DOI: 10.3389/fgene.2021.673192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2021] [Accepted: 04/14/2021] [Indexed: 11/13/2022] Open
Abstract
Uterine corpus endometrial carcinoma (UCEC) is one of the most common gynecologic malignancies, but only a few biomarkers have been proven to be effective in clinical practice. Previous studies have demonstrated the important roles of non-coding RNAs (ncRNAs) in diagnosis, prognosis, and therapy selection in UCEC and suggested the significance of integrating molecules at different levels for interpreting the underlying molecular mechanism. In this study, we collected transcriptome data, including long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and messenger RNAs (mRNAs), from 570 samples, comprising 537 UCEC samples and 33 normal samples. First, differentially expressed lncRNAs, miRNAs, and mRNAs that distinguished invasive carcinoma samples from normal samples were identified, and further analysis showed that these RNAs were enriched in cancer- and metabolism-related functions. Next, an integrated, dysregulated, and scale-free biological network consisting of differentially expressed lncRNAs, miRNAs, and mRNAs was constructed. Protein-coding and ncRNA genes in this network showed potential immune and metabolic functions. A further analysis revealed two clinically related modules that showed a close correlation with metabolic and immune functions. RNAs in the two modules were functionally validated to be associated with UCEC. The findings of this study demonstrate an important clinical application for improving outcome prediction for UCEC.
Affiliation(s)
- Da Liu
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Min Qiu
- Department of Orthopedics, Shengjing Hospital of China Medical University, Shenyang, China
11
Wang HC, Chou MC, Wu CC, Chan LP, Moi SH, Pan MR, Liu TC, Yang CH. Application of the Interaction between Tissue Immunohistochemistry Staining and Clinicopathological Factors for Evaluating the Risk of Oral Cancer Progression by Hierarchical Clustering Analysis: A Case-Control Study in a Taiwanese Population. Diagnostics (Basel) 2021; 11:diagnostics11060925. [PMID: 34063938 PMCID: PMC8224004 DOI: 10.3390/diagnostics11060925] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/27/2021] [Revised: 05/12/2021] [Accepted: 05/17/2021] [Indexed: 01/31/2023] Open
Abstract
The aim of this single-center case-control study is to investigate the feasibility and accuracy of oral cancer protein risk stratification (OCPRS) to analyze the risk of cancer progression. All patients diagnosed with oral cancer in Taiwan, between 2012 and 2014, and who underwent surgical intervention were selected for the study. The tissue was further processed for immunohistochemistry (IHC) for 21 target proteins. Analyses were performed using the results of IHC staining, clinicopathological characteristics, and survival outcomes. Novel stratifications with a hierarchical clustering approach and combinations were applied using the Cox proportional hazard regression model. Of the 163 participants recruited, 102 patients were analyzed, and OCPRS successfully identified patients with different progression-free survival (PFS) profiles in high-risk (53 subjects) versus low-risk (49 subjects) groups (p = 0.012). OCPRS was composed of cytoplasmic PLK1, phosphoMet, and SGK2 IHC staining. After controlling for the influence of clinicopathological features, high-risk patients were 2.33 times more likely to experience cancer progression than low-risk patients (p = 0.020). In the multivariate model, patients with extranodal extension (HR = 2.66, p = 0.045) demonstrated a significantly increased risk for disease progression. Risk stratification with OCPRS provided distinct PFS groups for patients with oral cancer after surgical intervention. OCPRS appears suitable for routine clinical use for progression and prognosis estimation.
Affiliation(s)
- Hui-Ching Wang
  - Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Division of Hematology and Oncology, Department of Internal Medicine, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Faculty of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung 807, Taiwan
- Meng-Chun Chou
  - Department of Nursing, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 807, Taiwan
- Chun-Chieh Wu
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Department of Pathology, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 807, Taiwan
- Leong-Perng Chan
  - Faculty of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Department of Otolaryngology-Head and Neck Surgery, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Department of Otorhinolaryngology-Head and Neck Surgery, Kaohsiung Municipal Ta-Tung Hospital and Kaohsiung Medical University Hospital, Kaohsiung 807, Taiwan
- Sin-Hua Moi
  - Center of Cancer Program Development, E-Da Cancer Hospital, I-Shou University, Kaohsiung 824, Taiwan
- Mei-Ren Pan
  - Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Correspondence: (M.-R.P.); (T.-C.L.); (C.-H.Y.); Tel.: +886-7-3121101-5092-34 (M.-R.P.); +886-4-781-3888 (T.-C.L.); +886-7-381-4526 (C.-H.Y.); Fax: +886-7-3218309 (M.-R.P.)
- Ta-Chih Liu
  - Department of Hematology-Oncology, Chang Bing Show Chwan Memorial Hospital, Changhua 505, Taiwan
  - Correspondence: (M.-R.P.); (T.-C.L.); (C.-H.Y.); Tel.: +886-7-3121101-5092-34 (M.-R.P.); +886-4-781-3888 (T.-C.L.); +886-7-381-4526 (C.-H.Y.); Fax: +886-7-3218309 (M.-R.P.)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 807, Taiwan
  - Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung 807, Taiwan
  - Correspondence: (M.-R.P.); (T.-C.L.); (C.-H.Y.); Tel.: +886-7-3121101-5092-34 (M.-R.P.); +886-4-781-3888 (T.-C.L.); +886-7-381-4526 (C.-H.Y.); Fax: +886-7-3218309 (M.-R.P.)
12
CorGO: An Integrated Method for Clustering Functionally Similar Genes. Interdiscip Sci 2021; 13:624-637. [PMID: 33761117 DOI: 10.1007/s12539-021-00424-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2020] [Revised: 02/23/2021] [Accepted: 03/05/2021] [Indexed: 10/21/2022]
Abstract
Identification of groups of co-expressed or co-regulated genes is critical for exploring the underlying mechanism behind a particular disease like cancer. Condition-specific (disease-specific) gene-expression profiles acquired from different platforms are widely utilized by researchers to gain insight into the regulatory mechanism of the disease. Several clustering algorithms have been developed that use gene expression profiles to identify groups of similar genes. These algorithms are computationally efficient but are not able to capture the functional similarity between the genes, which is very important from a biological perspective. In this study, an algorithm named CorGO is introduced that specifically deals with the identification of functionally similar gene-clusters. Two types of relationships are calculated for this purpose. Firstly, the Correlation (Cor) between the genes is captured from the gene-expression data, which helps in deciphering the relationship between genes based on their expression across several diseased samples. Secondly, Gene Ontology (GO)-based semantic similarity information available for the genes is utilized, which adds biological relevance to the identified gene-clusters. A similarity measure is defined by integrating these two components, which helps in the identification of homogeneous and functionally similar groups of genes. CorGO is applied to four gene expression profiles from different types of cancer. Gene-clusters identified by CorGO are further validated by pathway enrichment, disease enrichment, and network analysis. These biological analyses demonstrated significant connectivity and functional relatedness among the genes of the same cluster. A comparative study with commonly used clustering algorithms is also performed to show the efficacy of the proposed method.
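As a rough illustration of the integration idea in this abstract, the sketch below blends an expression-based correlation with a GO-based semantic similarity into a single score. The weighted-sum form, the `alpha` parameter, and the function names are assumptions made for illustration only; the abstract does not state how CorGO combines the two components, and the GO similarity is taken here as a precomputed number in [0, 1].

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)

def combined_similarity(expr_a, expr_b, go_sim, alpha=0.5):
    """Hypothetical blend: alpha * |correlation| + (1 - alpha) * GO similarity.

    alpha is an assumed mixing weight, not a parameter taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * abs(pearson(expr_a, expr_b)) + (1 - alpha) * go_sim
```

A pairwise matrix of such scores could then be passed to any standard clustering routine; which routine CorGO itself uses is not stated in the abstract.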
13
Li C, Liao C, Meng X, Chen H, Chen W, Wei B, Zhu P. Effective Analysis of Inpatient Satisfaction: The Random Forest Algorithm. Patient Prefer Adherence 2021; 15:691-703. [PMID: 33854303 PMCID: PMC8039189 DOI: 10.2147/ppa.s294402] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/27/2020] [Accepted: 03/10/2021] [Indexed: 12/17/2022] Open
Abstract
PURPOSE To identify the factors influencing inpatient satisfaction by fitting the optimal discriminant model. PATIENTS AND METHODS A cross-sectional survey of inpatient satisfaction was conducted with 3888 patients in 16 large public hospitals in Zhejiang Province. Independent variables were screened by single-factor analysis, and the importance of all variables was comprehensively evaluated. The relationship between patients' overall satisfaction and influencing factors was established, the relative risk was evaluated by marginal benefit, and the optimal model was fitted using the receiver operating characteristic curve. RESULTS Patients' overall satisfaction was 79.73%. The five most influential factors on inpatient satisfaction, in this order, were: patients' right to know, timely nursing response, satisfaction with medical staff service, integrity of medical staff, and accuracy of diagnosis. The prediction accuracy of the random forest model was higher than that of the multiple logistic regression and naive Bayesian models. CONCLUSION Inpatient satisfaction is related to healthcare quality, diagnosis, and treatment process. Rapid identification and active improvement of the factors affecting patient satisfaction can reduce public hospital operating costs and improve patient experiences and the efficiency of health resource allocation. Public hospitals should strengthen the exchange of medical information between doctors and patients, shorten waiting time, and improve the level of medical technology, service attitude, and transparency of information disclosure.
Affiliation(s)
- Chengcheng Li
  - School of Humanities and Social Sciences, Guangxi Medical University, Nanning, 530021, People’s Republic of China
- Conghui Liao
  - School of Public Health, Sun Yat-Sen University, Guangzhou, 510080, People’s Republic of China
- Xuehui Meng
  - Department of Health Service Management, Humanities and Management School, Zhejiang Chinese Medical University, Hangzhou, 310000, People’s Republic of China
- Honghua Chen
  - School of Basic Medicine, Guangxi Medical University, Nanning, 530021, People’s Republic of China
- Weiling Chen
  - School of Basic Medicine, Guangxi Medical University, Nanning, 530021, People’s Republic of China
- Bo Wei
  - School of Information and Management, Guangxi Medical University, Nanning, 530021, People’s Republic of China
- Pinghua Zhu
  - School of Humanities and Social Sciences, Guangxi Medical University, Nanning, 530021, People’s Republic of China
  - Correspondence: Pinghua Zhu
14
Elahi A, Babamir SM. Identification of Protein Complexes Based on Core-Attachment Structure and Combination of Centrality Measures and Biological Properties in PPI Weighted Networks. Protein J 2020; 39:681-702. [PMID: 33040223 DOI: 10.1007/s10930-020-09922-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/28/2020] [Indexed: 02/02/2023]
Abstract
In protein interaction networks, a complex is a group of proteins that causes a biological process to take place. The correct identification of complexes can help to better understand the function of cells for therapeutic purposes, such as drug discovery. This paper uses core-attachment structure, centrality measures, and biological properties of proteins to identify protein complexes, with the aim of enhancing prediction accuracy compared to related work. We used the inherent organization of complexes for their identification, whereas most methods have not considered such properties. Clustering methods are the common approach for identifying complexes in protein interaction networks; however, we propose a method for more accurate identification of complexes. Using this method, we determined the core center of each complex and its attachment proteins using centrality measures, biological properties and weight density, whereby the weight of each interaction was calculated using the protein information in the gene ontology. The proposed approach to weighting the network and measuring the importance of proteins builds on our previous work. To compare with other methods, we used the DIP, Collins, Krogan, and Human datasets. The results show that the performance of our method was significantly improved, compared to other methods, in terms of detecting protein complexes. Using the p-value concept, we show the biological significance of our predicted complexes. The proposed method could identify an acceptable number of protein complexes, with the highest proportion of biological significance in collaborating on the functional annotation of proteins.
15
Overton IM, Sims AH, Owen JA, Heale BSE, Ford MJ, Lubbock ALR, Pairo-Castineira E, Essafi A. Functional Transcription Factor Target Networks Illuminate Control of Epithelial Remodelling. Cancers (Basel) 2020; 12:cancers12102823. [PMID: 33007944 PMCID: PMC7652213 DOI: 10.3390/cancers12102823] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/09/2020] [Revised: 09/16/2020] [Accepted: 09/24/2020] [Indexed: 12/15/2022] Open
Abstract
Cell identity is governed by gene expression, regulated by transcription factor (TF) binding at cis-regulatory modules. Decoding the relationship between TF binding patterns and gene regulation is nontrivial, remaining a fundamental limitation in understanding cell decision-making. We developed the NetNC software to predict functionally active regulation of TF targets; demonstrated on nine datasets for the TFs Snail, Twist, and modENCODE Highly Occupied Target (HOT) regions. Snail and Twist are canonical drivers of epithelial to mesenchymal transition (EMT), a cell programme important in development, tumour progression and fibrosis. Predicted "neutral" (non-functional) TF binding always accounted for the majority (50% to 95%) of candidate target genes from statistically significant peaks and HOT regions had higher functional binding than most of the Snail and Twist datasets examined. Our results illuminated conserved gene networks that control epithelial plasticity in development and disease. We identified new gene functions and network modules including crosstalk with notch signalling and regulation of chromatin organisation, evidencing networks that reshape Waddington's epigenetic landscape during epithelial remodelling. Expression of orthologous functional TF targets discriminated breast cancer molecular subtypes and predicted novel tumour biology, with implications for precision medicine. Predicted invasion roles were validated using a tractable cell model, supporting our approach.
Affiliation(s)
- Ian M. Overton
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
  - Department of Systems Biology, Harvard University, Boston, MA 02115, USA
  - Centre for Synthetic and Systems Biology (SynthSys), University of Edinburgh, Edinburgh EH9 3BF, UK
  - Patrick G Johnston Centre for Cancer Research, Queen’s University Belfast, Belfast BT9 7AE, UK
  - Corresponding author
- Andrew H. Sims
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Jeremy A. Owen
  - Department of Systems Biology, Harvard University, Boston, MA 02115, USA
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Bret S. E. Heale
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Matthew J. Ford
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Alexander L. R. Lubbock
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Erola Pairo-Castineira
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Abdelkader Essafi
  - MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
16
Zhang J, Liu X, Zhou W, Cheng G, Wu J, Guo S, Jia S, Liu Y, Li B, Zhang X, Wang M. A bioinformatics investigation into molecular mechanism of Yinzhihuang granules for treating hepatitis B by network pharmacology and molecular docking verification. Sci Rep 2020; 10:11448. [PMID: 32651427 PMCID: PMC7351787 DOI: 10.1038/s41598-020-68224-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/19/2020] [Accepted: 06/22/2020] [Indexed: 12/17/2022] Open
Abstract
Yinzhihuang granules (YZHG) is a patented Chinese medicine for the treatment of hepatitis B. This study aimed to investigate the intrinsic mechanisms of YZHG in the treatment of hepatitis B and to provide new evidence and insights for its clinical application. The chemical compounds of YZHG were searched in the CNKI and PUBMED databases, and their putative targets were then predicted through a search of the SuperPred and Swiss Target Prediction databases. In addition, the targets of hepatitis B were obtained from TTD, PharmGKB and DisGeNET. The abovementioned data were visualized using Cytoscape 3.7.1, and network construction identified a total of 13 potential targets of YZHG in the treatment of hepatitis B. Molecular docking verification showed that CDK6, CDK2, TP53 and BRCA1 might be strongly correlated with hepatitis B treatment. Furthermore, GO and KEGG analyses indicated that the treatment of hepatitis B by YZHG might be related to positive regulation of transcription, positive regulation of gene expression, the hepatitis B pathway and the viral carcinogenesis pathway. Network pharmacology intuitively shows the multicomponent, multitarget and multichannel pharmacological effects of YZHG in the treatment of hepatitis B and provides a scientific basis for its mechanism of action.
Affiliation(s)
- Jingyuan Zhang
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Xinkui Liu
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Wei Zhou
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Guoliang Cheng
  - State Key Laboratory of Generic Manufacture Technology of Chinese Traditional Medicine, Linyi, 276000, China
- Jiarui Wu
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Siyu Guo
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Shanshan Jia
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Yingying Liu
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Bingbing Li
  - State Key Laboratory of Generic Manufacture Technology of Chinese Traditional Medicine, Linyi, 276000, China
- Xiaomeng Zhang
  - Beijing University of Chinese Medicine, Beijing, 100102, China
- Miaomiao Wang
  - Beijing University of Chinese Medicine, Beijing, 100102, China
17
Li G, Li M, Wang J, Li Y, Pan Y. United Neighborhood Closeness Centrality and Orthology for Predicting Essential Proteins. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1451-1458. [PMID: 30596582 DOI: 10.1109/tcbb.2018.2889978] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 05/06/2023]
Abstract
Identifying essential proteins plays an important role in disease study, drug design, and understanding the minimal requirement for cellular life. Computational methods for essential proteins discovery overcome the disadvantages of biological experimental methods that are often time-consuming, expensive, and inefficient. The topological features of protein-protein interaction (PPI) networks are often used to design computational prediction methods, such as Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), and Neighborhood Centrality (NC). However, the prediction accuracies of these individual methods still have space to be improved. Studies show that additional information, such as orthologous relations, helps discover essential proteins. Many researchers have proposed different methods by combining multiple information sources to gain improvement of prediction accuracy. In this study, we find that essential proteins appear in triangular structure in PPI network significantly more often than nonessential ones. Based on this phenomenon, we propose a novel pure centrality measure, so-called Neighborhood Closeness Centrality (NCC). Accordingly, we develop a new combination model, Extended Pareto Optimality Consensus model, named EPOC, to fuse NCC and Orthology information and a novel essential proteins identification method, NCCO, is fully proposed. Compared with seven existing classic centrality methods (DC, BC, IC, CC, SC, EC, and NC) and three consensus methods (PeC, ION, and CSC), our results on S.cerevisiae and E.coli datasets show that NCCO has clear advantages. As a consensus method, EPOC also yields better performance than the random walk model.
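The observation that essential proteins sit in triangular structures more often than nonessential ones can be made concrete with a simple triangle count. The sketch below only counts, for each node of an undirected network, the triangles it participates in; it is not the NCC formula, which the abstract does not give, and the function name is a hypothetical choice.

```python
from itertools import combinations

def triangles_per_node(edges):
    """Count the triangles each node belongs to in an undirected simple graph."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    counts = {}
    for node, neighbours in adjacency.items():
        # A triangle through `node` is a pair of its neighbours that are
        # themselves connected by an edge.
        counts[node] = sum(
            1 for a, b in combinations(neighbours, 2) if b in adjacency[a]
        )
    return counts
```

For the toy edge list `[("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]`, nodes A, B and C each lie in one triangle and D lies in none.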
18
Li M, Meng X, Zheng R, Wu FX, Li Y, Pan Y, Wang J. Identification of Protein Complexes by Using a Spatial and Temporal Active Protein Interaction Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:817-827. [PMID: 28885159 DOI: 10.1109/tcbb.2017.2749571] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/07/2023]
Abstract
The rapid development of proteomics and high-throughput technologies has produced a large amount of Protein-Protein Interaction (PPI) data, which makes it possible for considering dynamic properties of protein interaction networks (PINs) instead of static properties. Identification of protein complexes from dynamic PINs becomes a vital scientific problem for understanding cellular life in the post genome era. Up to now, plenty of models or methods have been proposed for the construction of dynamic PINs to identify protein complexes. However, most of the constructed dynamic PINs just focus on the temporal dynamic information and thus overlook the spatial dynamic information of the complex biological systems. To address the limitation of the existing dynamic PIN analysis approaches, in this paper, we propose a new model-based scheme for the construction of the Spatial and Temporal Active Protein Interaction Network (ST-APIN) by integrating time-course gene expression data and subcellular location information. To evaluate the efficiency of ST-APIN, the commonly used classical clustering algorithm MCL is adopted to identify protein complexes from ST-APIN and the other three dynamic PINs, NF-APIN, DPIN, and TC-PIN. The experimental results show that, the performance of MCL on ST-APIN outperforms those on the other three dynamic PINs in terms of matching with known complexes, sensitivity, specificity, and f-measure. Furthermore, we evaluate the identified protein complexes by Gene Ontology (GO) function enrichment analysis. The validation shows that the identified protein complexes from ST-APIN are more biologically significant. This study provides a general paradigm for constructing the ST-APINs, which is essential for further understanding of molecular systems and the biomedical mechanism of complex diseases.
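One way to picture the combination of temporal and spatial information described here is to keep an interaction at a given time point only when both proteins are active at that time and share a subcellular compartment. The sketch below does exactly that; the fixed expression threshold, the data layout, and the function name are assumptions for illustration, and the actual ST-APIN model may define activity differently.

```python
def active_edges(edges, expression, locations, threshold, t):
    """Return the interactions considered active at time index t.

    edges      : iterable of (protein_u, protein_v) pairs
    expression : dict protein -> list of expression values over time
    locations  : dict protein -> set of subcellular compartments
    threshold  : assumed activity cut-off (hypothetical, same for all proteins)
    """
    kept = []
    for u, v in edges:
        both_active = (
            expression[u][t] >= threshold and expression[v][t] >= threshold
        )
        co_located = bool(locations.get(u, set()) & locations.get(v, set()))
        if both_active and co_located:
            kept.append((u, v))
    return kept
```

Running the filter once per time point yields a sequence of networks, each of which could then be clustered, for example with MCL as in the abstract.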
19
Brain-wide functional architecture remodeling by alcohol dependence and abstinence. Proc Natl Acad Sci U S A 2020; 117:2149-2159. [PMID: 31937658 DOI: 10.1073/pnas.1909915117] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Indexed: 01/26/2023] Open
Abstract
Alcohol abuse and alcohol dependence are key factors in the development of alcohol use disorder, which is a pervasive societal problem with substantial economic, medical, and psychiatric consequences. Although our understanding of the neurocircuitry that underlies alcohol use has improved, novel brain regions that are involved in alcohol use and novel biomarkers of alcohol use need to be identified. The present study used a single-cell whole-brain imaging approach to 1) assess whether abstinence from alcohol in an animal model of alcohol dependence alters the functional architecture of brain activity and modularity, 2) validate our current knowledge of the neurocircuitry of alcohol abstinence, and 3) discover brain regions that may be involved in alcohol use. Alcohol abstinence resulted in the whole-brain reorganization of functional architecture in mice and a pronounced decrease in modularity that was not observed in nondependent moderate drinkers. Structuring of the alcohol abstinence network revealed three major brain modules: 1) extended amygdala module, 2) midbrain striatal module, and 3) cortico-hippocampo-thalamic module, reminiscent of the three-stage theory. Many hub brain regions that control this network were identified, including several that have been previously overlooked in alcohol research. These results identify brain targets for future research and demonstrate that alcohol use and dependence remodel brain-wide functional architecture to decrease modularity. Further studies are needed to determine whether the changes in coactivation and modularity that are associated with alcohol abstinence are causal features of alcohol dependence or a consequence of excessive drinking and alcohol exposure.
20
Krzhizhanovskaya VV, Závodszky G, Lees MH, Dongarra JJ, Sloot PMA, Brissos S, Teixeira J. On the Planarity of Validated Complexes of Model Organisms in Protein-Protein Interaction Networks. Lecture Notes in Computer Science 2020. [PMCID: PMC7302240 DOI: 10.1007/978-3-030-50371-0_48] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
Abstract
Leveraging protein-protein interaction networks to identify groups of proteins and their common functionality is an important problem in bioinformatics. Systems-level analysis of protein-protein interactions is made possible through network science and modeling of high-throughput data. From these analyses, small protein complexes are traditionally represented graphically as complete graphs or dense clusters of nodes. However, there are certain graph theoretic properties that have not been extensively studied in PPI networks, especially as they pertain to cluster discovery, such as planarity. Planarity of graphs have been used to reflect the physical constraints of real-world systems outside of bioinformatics, in areas such as mapping and imaging. Here, we investigate the planarity property in network models of protein complexes. We hypothesize that complexes represented as PPI subgraphs will tend to be planar, reflecting the actual physical interface and limits of components in the complex. When testing the planarity of known complex subgraphs in S. cerevisiae and selected mammalian PPIs, we find that a majority of validated complexes possess this planar property. We discuss the biological motivation of planar versus nonplanar subgraphs, observing that planar subgraphs tend to have longer protein components. Functional classification of planar versus nonplanar complex subgraphs reveals differences in annotation of these groups relating to cellular component organization, structural molecule activity, catalytic activity, and nucleic acid binding. These results provide a new quantitative and biologically motivated measure of real protein complexes in the network model, important for the development of future complex-finding algorithms in PPIs. Accounting for this property paves the way to new means for discovering new protein complexes and uncovering the functionality of unknown or novel proteins.
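A quick screen for the planarity property discussed here follows from Euler's formula: a simple planar graph with V ≥ 3 nodes has at most 3V − 6 edges. The sketch below applies only this necessary condition, so a `True` result does not prove planarity (K3,3 passes the bound yet is nonplanar), and a full planarity test would still be needed. The function name is hypothetical, and the abstract does not say which test the authors used.

```python
def may_be_planar(num_nodes, edges):
    """Necessary (not sufficient) planarity check via the bound E <= 3V - 6."""
    # Collapse duplicate edges and drop self-loops so the bound applies
    # to the underlying simple graph.
    simple_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    if num_nodes < 3:
        return True  # graphs on fewer than three nodes are always planar
    return len(simple_edges) <= 3 * num_nodes - 6
```

For example, the complete graph on five nodes has 10 edges, which exceeds 3·5 − 6 = 9, so it is ruled out as nonplanar, while the complete graph on four nodes (6 edges) meets the bound.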
21
Song J, Peng W, Wang F, Wang J. Identifying driver genes involving gene dysregulated expression, tissue-specific expression and gene-gene network. BMC Med Genomics 2019; 12:168. [PMID: 31888619 PMCID: PMC6936147 DOI: 10.1186/s12920-019-0619-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/06/2019] [Accepted: 11/11/2019] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Cancer, a disease of genomic alteration, claims many lives each year. The biggest challenge in overcoming cancer is to identify the driver genes that promote cancer development among a huge number of passenger mutations that have no effect on the selective growth advantage of cancer. To address this problem, some researchers have started to focus on identifying driver genes by integrating networks with other biological information. However, more effort is needed to improve prediction performance. METHODS Driver genes affect the expression of their downstream genes, they likely interact with each other to form functional modules, and those modules tend to be expressed similarly in the same tissue. Based on these observations, we proposed a novel model named DyTidriver to identify driver genes by incorporating gene dysregulated expression, tissue-specific expression and variation frequency into the human functional interaction network (human FIN). RESULTS This method was applied to 974 breast, 316 prostate and 230 lung cancer patients. The results show that our method outperformed five other existing methods in terms of F-score, precision and recall. The enrichment and cociter analyses illustrate that DyTidriver not only identifies driver genes enriched in some significant pathways but can also uncover some unknown driver genes. CONCLUSION The final results imply that driver genes are those that impact more dysregulated genes and are expressed similarly in the same tissue.
Affiliation(s)
- Junrong Song
  - Faculty of Management and Economics/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan, 650500, People's Republic of China
- Wei Peng
  - Faculty of Management and Economics/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan, 650500, People's Republic of China
- Feng Wang
  - Faculty of Management and Economics/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan, 650500, People's Republic of China
- Jianxin Wang
  - School of Information Science and Engineering, Central South University, Changsha, Hunan, 410083, People's Republic of China
22
The Eminence of Co-Expressed Ties in Schizophrenia Network Communities. Data 2019. [DOI: 10.3390/data4040149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Exploring gene networks is crucial for identifying significant biological interactions occurring in a disease condition. These interactions can be acknowledged by modeling the tie structure of networks. Such tie orientations are often detected within embedded community structures. However, most of the prevailing community detection modules are intended to capture information from nodes and its attributes, usually ignoring the ties. In this study, a modularity maximization algorithm is proposed based on nonlinear representation of local tangent space alignment (LTSA). Initially, the tangent coordinates are computed locally to identify k-nearest neighbors across the genes. These local neighbors are further optimized by generating a nonlinear network embedding function for detecting gene communities based on eigenvector decomposition. Experimental results suggest that this algorithm detects gene modules with a better modularity index of 0.9256, compared to other traditional community detection algorithms. Furthermore, co-expressed genes across these communities are identified by discovering the characteristic tie structures. These detected ties are known to have substantial biological influence in the progression of schizophrenia, thereby signifying the influence of tie patterns in biological networks. This technique can be extended logically on other diseases networks for detecting substantial gene “hotspots”.
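For readers unfamiliar with modularity scores such as the 0.9256 quoted above, the sketch below computes the standard Newman modularity Q = Σ_c [L_c/m − (d_c/2m)²] for a given partition of an unweighted, undirected network, where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. That the paper's "modularity index" is exactly this quantity is an assumption, and the function name is hypothetical.

```python
def modularity(edges, community):
    """Newman modularity of a partition.

    edges     : list of (u, v) pairs of an undirected, unweighted graph
    community : dict node -> community label
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}      # edges with both endpoints in the same community
    degree_sum = {}    # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )
```

Two triangles joined by a single bridge edge, with each triangle treated as one community, give Q = 5/14 ≈ 0.357.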
23
Wu Z, Liao Q, Liu B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein–protein interaction networks. Brief Bioinform 2019; 21:1531-1548. [DOI: 10.1093/bib/bbz085] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 04/08/2019] [Revised: 06/17/2019] [Accepted: 06/17/2019] [Indexed: 02/04/2023] Open
Abstract
Protein complexes are the fundamental units for many cellular processes. Identifying protein complexes accurately is critical for understanding the functions and organizations of cells. With the increment of genome-scale protein–protein interaction (PPI) data for different species, various computational methods focus on identifying protein complexes from PPI networks. In this article, we give a comprehensive and updated review on the state-of-the-art computational methods in the field of protein complex identification, especially focusing on the newly developed approaches. The computational methods are organized into three categories, including cluster-quality-based methods, node-affinity-based methods and ensemble clustering methods. Furthermore, the advantages and disadvantages of different methods are discussed, and then, the performance of 17 state-of-the-art methods is evaluated on two widely used benchmark data sets. Finally, the bottleneck problems and their potential solutions in this important field are discussed.
Affiliation(s)
- Zhourun Wu: School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Qing Liao: School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
|
24
|
Xie D, Yi Y, Zhou J, Li X, Wu H. A novel temporal protein complexes identification framework based on density–distance and heuristic algorithm. Neural Comput Appl 2019. [DOI: 10.1007/s00521-018-3660-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
25
|
Wang R, Wang C, Sun L, Liu G. A seed-extended algorithm for detecting protein complexes based on density and modularity with topological structure and GO annotations. BMC Genomics 2019; 20:637. [PMID: 31390979 PMCID: PMC6686515 DOI: 10.1186/s12864-019-5956-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/27/2019] [Accepted: 07/04/2019] [Indexed: 12/28/2022]
Abstract
Background: The detection of protein complexes is of great significance for researching mechanisms underlying complex diseases and developing new drugs. Thus, various computational algorithms have been proposed for protein complex detection. However, most of these methods are based only on topological information and are sensitive to the reliability of interactions. As a result, their performance is affected by false-positive interactions in protein-protein interaction networks (PPINs). Moreover, these methods consider only density and modularity and ignore protein complexes with various densities and modularities. Results: To address these challenges, we propose SE-DMTG, a Seed-Extended algorithm based on Density and Modularity with Topological structure and GO annotations, to improve the accuracy of protein complex detection in PPINs. First, we use common neighbors and GO annotations to construct a weighted PPIN. Second, we define a new seed selection strategy to select seed nodes. Third, we design a new fitness function to detect protein complexes with various densities and modularities. We compare the performance of SE-DMTG with that of thirteen state-of-the-art algorithms on several real datasets. Conclusion: The experimental results show that SE-DMTG not only outperforms some classical algorithms on yeast PPINs in terms of the F-measure and Jaccard index but also achieves an ideal performance in terms of functional enrichment. Furthermore, we apply SE-DMTG to PPINs of several other species and demonstrate outstanding accuracy and matching ratio in detecting protein complexes compared with other algorithms.
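The first step described above, weighting edges by shared neighbors, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the SE-DMTG weighting formula, so the Jaccard coefficient of closed neighborhoods is used here as a stand-in, the GO-annotation term is omitted, and the name `neighbor_weights` is hypothetical.

```python
def neighbor_weights(edges):
    """Weight each edge by the Jaccard overlap of its endpoints' closed neighborhoods.

    Stand-in for a common-neighbor weighting; not the formula used by SE-DMTG.
    edges: list of (u, v) pairs, each undirected interaction listed once.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        closed_u = adjacency[u] | {u}
        closed_v = adjacency[v] | {v}
        weights[(u, v)] = len(closed_u & closed_v) / len(closed_u | closed_v)
    return weights


# Toy network (hypothetical): a triangle a-b-c with a pendant protein d attached to c.
toy_ppin = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
for edge, weight in neighbor_weights(toy_ppin).items():
    print(edge, round(weight, 2))
```

Edges inside the triangle receive higher weights than the pendant edge, which is the intended effect of down-weighting interactions with little topological support.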
Affiliation(s)
- Rongquan Wang: College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Caixia Wang: School of International Economics, China Foreign Affairs University, 24 Zhanlanguan Road, Xicheng District, Beijing, 100037, China
- Liyan Sun: College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
- Guixia Liu: College of Computer Science and Technology, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, No. 2699 Qianjin Street, Changchun, 130012, China
|
26
|
Chen W, Li W, Huang G, Flavel M. The Applications of Clustering Methods in Predicting Protein Functions. Curr Proteomics 2019. [DOI: 10.2174/1570164616666181212114612] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Background: The understanding of protein function is essential to the study of biological processes. However, the prediction of protein function has been a difficult task for bioinformatics to overcome. This has resulted in many scholars focusing on the development of computational methods to address this problem.
Objective: In this review, we introduce the recently developed computational methods of protein function prediction and assess the validity of these methods. We then introduce the applications of clustering methods in predicting protein functions.
Affiliation(s)
- Weiyang Chen: College of Information, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Weiwei Li: College of Information, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
- Guohua Huang: College of Information Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
- Matthew Flavel: School of Life Sciences, La Trobe University, Bundoora, Vic 3083, Australia
|
27
|
Saha S, Sengupta K, Chatterjee P, Basu S, Nasipuri M. Analysis of protein targets in pathogen-host interaction in infectious diseases: a case study on Plasmodium falciparum and Homo sapiens interaction network. Brief Funct Genomics 2019; 17:441-450. [PMID: 29028886 DOI: 10.1093/bfgp/elx024] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/11/2022]
Abstract
Infection and disease progression are the outcome of protein interactions between pathogen and host. The pathogen, the key player in infection, is becoming a severe threat to life because of its adaptability to drugs and its evolutionary dynamism. Identifying protein targets by analyzing protein interactions between host and pathogen is therefore key. Proteins with higher degree and with some topologically significant graph-theoretical measures are found to be drug targets. On the other hand, exceptional nodes may be involved in the infection mechanism because of some pathway process and biologically unknown factors. In this article, we attempt to investigate characteristics of host-pathogen protein interactions by presenting a comprehensive review of computational approaches applied to different infectious diseases. As an illustration, we analyze a case study on the infectious disease malaria, with its causative agent Plasmodium falciparum acting as 'Bait' and the host, Homo sapiens, acting as 'Prey'. In this pathogen-host interaction network, based on interconnectivity and centrality properties, proteins are viewed as central, peripheral, hub and non-hub nodes, and their significance for the infection process is examined. Besides, it is observed that because of the sparseness of the pathogen-host interaction network, there may be some topologically unimportant but biologically significant proteins, which can also act as Bait/Prey. Functional similarity or gene ontology mapping can help to identify these proteins.
Affiliation(s)
- Sovan Saha: Department of Computer Science and Engineering at Dr Sudhir Chandra Sur Degree Engineering College, India
- Kaustav Sengupta: Department of Computer Science and Engineering, Jadavpur University, India
- Piyali Chatterjee: Department of Computer Science and Engineering, Netaji Subhash Engineering College, Garia, India
- Subhadip Basu: Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri: Department of Computer Science and Engineering, Jadavpur University, India
|
28
|
Clustering analysis using a novel locality-informed grey wolf-inspired clustering approach. Knowl Inf Syst 2019. [DOI: 10.1007/s10115-019-01358-x] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 11/30/2022]
|
29
|
Xiao Q, Luo P, Li M, Wang J, Wu FX. A Novel Core-Attachment-Based Method to Identify Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks. Proteomics 2019; 19:e1800129. [PMID: 30650262 DOI: 10.1002/pmic.201800129] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/29/2018] [Revised: 12/09/2018] [Indexed: 11/06/2022]
Abstract
Cellular functions are always performed by protein complexes. At present, many approaches have been proposed to identify protein complexes from protein-protein interaction (PPI) networks. Some approaches focus on detecting local dense subgraphs in PPI networks, which are regarded as protein-complex cores, and then identify protein complexes by including local neighbors. However, gene expression profiles at different time points or in different tissues show that proteins are dynamic. Therefore, identifying dynamic protein complexes is important and meaningful. In this study, a novel core-attachment-based method named CO-DPC is presented to detect dynamic protein complexes. First, CO-DPC selects active proteins according to gene expression profiles and the 3-sigma principle, and constructs dynamic PPI networks based on the co-expression principle and PPI networks. Second, CO-DPC detects local dense subgraphs as the cores of protein complexes and then attaches close neighbors of these cores to form protein complexes. To evaluate the method, it and the existing algorithms are applied to yeast PPI networks. The experimental results show that CO-DPC performs much better than the existing methods. In addition, the identified dynamic protein complexes match very well and are thus more meaningful for future biological study.
Affiliation(s)
- Qianghua Xiao: School of Mathematics and Physics, University of South China, Hengyang, 421001, P. R. China
- Ping Luo: Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li: School of Information Science and Engineering, Central South University, Changsha, 410083, P. R. China
- Jianxin Wang: School of Information Science and Engineering, Central South University, Changsha, 410083, P. R. China
- Fang-Xiang Wu: Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
|
30
|
Rzeznik-Orignac J, Puisay A, Derelle E, Peru E, Le Bris N, Galand PE. Co-occurring nematodes and bacteria in submarine canyon sediments. PeerJ 2018; 6:e5396. [PMID: 30083476 PMCID: PMC6074754 DOI: 10.7717/peerj.5396] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/01/2018] [Accepted: 07/17/2018] [Indexed: 01/17/2023]
Abstract
In submarine canyon sediments, bacteria and nematodes dominate the benthic biomass and play a key role in nutrient cycling and energy transfer. The diversity of these communities remains, however, poorly studied. This work aims to describe the composition of bacterial and nematode communities in the Lacaze-Duthiers submarine canyon in the north-western Mediterranean Sea. We targeted three sediment depths for two consecutive years and investigated the communities using nuclear markers (18S rRNA and 16S rRNA genes). High-throughput sequencing combined with maximal information coefficient (MIC) statistical analysis allowed us to identify, for the first time, at the same small scale, the community structures and the co-occurrence of nematode and bacterial Operational Taxonomic Units across the sediment cores. The associations detected by MIC revealed marked patterns of co-occurrence between the bacteria and nematodes in the sediment of the canyon and could be linked to the ecological requirements of individual bacteria and nematodes. For the bacterial community, Delta- and Gammaproteobacteria sequences were the most abundant, as seen in some canyons earlier, although Acidobacteria, Actinobacteria and Planctomycetes have been prevalent in other canyon sediments. The 20 identified nematode genera included bacterial feeders such as Terschellingia, Eubostrichus, Geomonhystera, Desmoscolex and Leptolaimus. The present study provides new data on the diversity of bacterial and nematode communities in the Lacaze-Duthiers canyon and further highlights the importance of small-scale sampling for an accurate view of deep-sea communities.
Affiliation(s)
- Jadwiga Rzeznik-Orignac: Laboratoire d'Ecogéochimie des Environnements Benthiques, LECOB, Sorbonne Université, CNRS, Banyuls-sur-Mer, France
- Antoine Puisay: Laboratoire d'Ecogéochimie des Environnements Benthiques, LECOB, Sorbonne Université, CNRS, Banyuls-sur-Mer, France; Criobe, Laboratoire d'Excellence "Corail", PSL Research University: EPHE-UPVD-CNRS, Papetoai, French Polynesia
- Evelyne Derelle: Laboratoire de Biologie Intégrative des Organismes Marins, Sorbonne Université, CNRS, Banyuls-sur-Mer, France; LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, IUEM, Plouzané, France
- Erwan Peru: Laboratoire d'Ecogéochimie des Environnements Benthiques, LECOB, Sorbonne Université, CNRS, Banyuls-sur-Mer, France
- Nadine Le Bris: Laboratoire d'Ecogéochimie des Environnements Benthiques, LECOB, Sorbonne Université, CNRS, Banyuls-sur-Mer, France
- Pierre E Galand: Laboratoire d'Ecogéochimie des Environnements Benthiques, LECOB, Sorbonne Université, CNRS, Banyuls-sur-Mer, France
|
31
|
Detection of Protein Complexes Based on Penalized Matrix Decomposition in a Sparse Protein-Protein Interaction Network. Molecules 2018; 23:1460. [PMID: 29914123 PMCID: PMC6100434 DOI: 10.3390/molecules23061460] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/21/2018] [Revised: 06/11/2018] [Accepted: 06/12/2018] [Indexed: 01/20/2023]
Abstract
High-throughput technology has generated large-scale protein interaction data, which is crucial to our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein–protein interaction (PPI) networks. In this study, a novel method based on penalized matrix decomposition (PMD) for the identification of protein complexes (PMDpc) was developed to detect protein complexes in the human protein interaction network. The method consists of three main steps. First, the adjacency matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMDpc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on the factor matrices. Finally, the results of our method are compared with those of other methods on the human PPI network. Experimental results show that our method not only outperforms classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but also achieves an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
|
32
|
Seidpisheh M, Mohammadpour A. Hierarchical clustering of heavy-tailed data using a new similarity measure. Intell Data Anal 2018. [DOI: 10.3233/ida-173371] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Affiliation(s)
- Mohammad Seidpisheh: Department of Statistics, Faculty of Mathematical Sciences and Computer, Allameh Tabataba’i University, Tehran, Iran
- Adel Mohammadpour: Department of Statistics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
|
33
|
Liu X, Wu J, Zhang D, Wang K, Duan X, Meng Z, Zhang X. Network Pharmacology-Based Approach to Investigate the Mechanisms of Hedyotis diffusa Willd. in the Treatment of Gastric Cancer. Evid Based Complement Alternat Med 2018; 2018:7802639. [PMID: 29853970 PMCID: PMC5954954 DOI: 10.1155/2018/7802639] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 02/04/2018] [Revised: 03/27/2018] [Accepted: 04/01/2018] [Indexed: 12/20/2022]
Abstract
BACKGROUND Hedyotis diffusa Willd. (HDW) is one of the renowned herbs often used in the treatment of gastric cancer (GC). However, its curative mechanism has not been fully elucidated. OBJECTIVE To systematically investigate the mechanisms of HDW in GC. METHODS A network pharmacology approach mainly comprising target prediction, network construction, and module analysis was adopted in this study. RESULTS A total of 353 targets of the 32 bioactive compounds in HDW were obtained. The network analysis showed that CA isoenzymes, p53, PIK3CA, CDK2, P27Kip1, cyclin D1, cyclin B1, cyclin A2, AKT1, BCL2, MAPK1, and VEGFA were identified as key targets of HDW in the treatment of GC. The functional enrichment analysis indicated that HDW probably produced the therapeutic effects against GC by synergistically regulating many biological pathways, such as nucleotide excision repair, apoptosis, cell cycle, PI3K/AKT/mTOR signaling pathway, VEGF signaling pathway, and Ras signaling pathway. CONCLUSIONS This study holistically illuminates the fact that the pharmacological mechanisms of HDW in GC might be strongly associated with its synergic modulation of apoptosis, cell cycle, differentiation, proliferation, migration, invasion, and angiogenesis.
Affiliation(s)
- Xinkui Liu: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Jiarui Wu: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Dan Zhang: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Kaihuan Wang: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Xiaojiao Duan: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Ziqi Meng: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
- Xiaomeng Zhang: Department of Clinical Chinese Pharmacy, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100102, China
|
34
|
Wang X, Li G, Luo Q, Gan C. Identification of crucial genes associated with esophageal squamous cell carcinoma by gene expression profile analysis. Oncol Lett 2018; 15:8983-8990. [PMID: 29844815 PMCID: PMC5958829 DOI: 10.3892/ol.2018.8464] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/05/2016] [Accepted: 11/16/2017] [Indexed: 12/14/2022]
Abstract
To uncover the genes associated with the development of esophageal squamous cell carcinoma (ESCC), an ESCC microarray dataset was used to identify genes differentially expressed between ESCC and normal control tissues. The dataset GSE17351 was downloaded from the Gene Expression Omnibus, containing 5 tumor esophageal mucosa samples and 5 adjacent normal esophageal mucosa samples from 5 male patients with ESCC. The differentially expressed genes (DEGs) were identified using the Linear Models for Microarray Data R package. Then, a co-expression network was constructed using the Weighted Correlation Network Analysis (WGCNA) package, and co-expression network modules were obtained with a hierarchical clustering algorithm. Additionally, functional enrichment analyses for DEGs in the top 2 modules with the highest significance were conducted using the WGCNA package and the clusterProfiler package, respectively. In total, 487 upregulated and 468 downregulated DEGs were identified. A total of 24 modules were obtained from the co-expression network, and the top 2 modules with the highest significance, designated as 'blue4' and 'magenta', were further analyzed. In the module 'blue4', DEGs were significantly enriched in a number of Gene Ontology terms, including 'spindle organization' [e.g., ubiquitin conjugating enzyme E2 C (UBE2C) and SAC3 domain containing 1] and 'cell cycle process' [e.g., UBE2C, minichromosome maintenance complex component 6 (MCM6) and cell division cycle 20 (CDC20)]. Furthermore, a number of DEGs (e.g., UBE2C, CDC20 and MCM6) were enriched in the 'cell cycle' and 'ubiquitin mediated proteolysis' pathways. In the module 'magenta', a number of DEGs [e.g., transferrin receptor (TFRC) and TEA domain transcription factor 4 (TEAD4)] were enriched in the primary metabolic process and intracellular membrane-bounded organelle. Additionally, 308 upregulated genes and 215 downregulated genes were differentially expressed in the same pattern in another dataset, GSE20347, including UBE2C, CDC20, MCM6, TFRC, TEAD4, protein phosphatase 1 regulatory subunit 3C and MAL, T-cell differentiation protein. These DEGs may function in the development of ESCC.
Affiliation(s)
- Xuehai Wang: Department of Thoracic Surgery, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, Sichuan 610072, P.R. China
- Gang Li: Department of Thoracic Surgery, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, Sichuan 610072, P.R. China
- Qingsong Luo: Department of Thoracic Surgery, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, Sichuan 610072, P.R. China
- Chongzhi Gan: Department of Thoracic Surgery, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, Sichuan 610072, P.R. China
|
35
|
Pazos F, Chagoyen M. Characteristics and evolution of the ecosystem of software tools supporting research in molecular biology. Brief Bioinform 2018; 20:1329-1336. [DOI: 10.1093/bib/bby001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/02/2017] [Revised: 12/20/2017] [Indexed: 11/14/2022]
Abstract
Daily work in molecular biology presently depends on a large number of computational tools. An in-depth, large-scale study of that ‘ecosystem’ of Web tools, its characteristics, interconnectivity, patterns of usage/citation, temporal evolution and rate of decay is crucial for understanding the forces that shape it and for informing initiatives aimed at its funding, long-term maintenance and improvement. In particular, the long-term maintenance of these tools is compromised because of their specific development model. Hundreds of published studies become irreproducible de facto, as the software tools used to conduct them become unavailable. In this study, we present a large-scale survey of >5400 publications describing Web servers within the two main bibliographic resources for disseminating new software developments in molecular biology. For all these servers, we studied their citation patterns, the subjects they address, their citation networks and the temporal evolution of these factors. We also analysed how these factors affect the availability of these servers (whether they are alive). Our results show that this ecosystem of tools is highly interconnected and adapts to the ‘trendy’ subjects of the moment. The servers present characteristic temporal patterns of citation/usage, and there is a worrying rate of server ‘death’, which is influenced by factors such as the server's popularity and the institution that hosts it. These results can inform initiatives aimed at the long-term maintenance of these resources.
|
36
|
Cao B, Deng S, Luo J, Ding P, Wang S. Identification of overlapping protein complexes by fuzzy K-medoids clustering algorithm in yeast protein-protein interaction networks. J Intell Fuzzy Syst 2018. [DOI: 10.3233/jifs-17026] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- Buwen Cao: School of Information Science and Engineering, Hunan City University, Yiyang, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Shuguang Deng: College of Communication and Electronic Engineering, Hunan City University, Yiyang, China
- Jiawei Luo: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Pingjian Ding: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Shulin Wang: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
37
|
Protein Complexes Prediction Method Based on Core-Attachment Structure and Functional Annotations. Int J Mol Sci 2017; 18:1910. [PMID: 28878201 PMCID: PMC5618559 DOI: 10.3390/ijms18091910] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/25/2017] [Revised: 08/31/2017] [Accepted: 09/01/2017] [Indexed: 11/17/2022]
Abstract
Recent advances in high-throughput laboratory techniques have captured large-scale protein–protein interaction (PPI) data, making it possible to create a detailed map of protein interaction networks and thus to detect protein complexes from these PPI networks. However, most current state-of-the-art methods still have problems, for instance, an inability to identify overlapping clusters, a failure to consider the inherent organization within protein complexes, and neglect of the biological meaning of complexes. Therefore, we present a novel overlapping protein complex prediction method based on core–attachment structure and function annotations (CFOCM), which proceeds in two stages. First, it detects protein complex cores with the maximum value of our defined cluster closeness function, in which the proteins are also closely related to at least one common function. Then it appends attachment proteins to these detected cores to form the returned complexes. For performance evaluation, CFOCM and six classical methods were used to identify protein complexes on three different yeast PPI networks, with three sets of real complexes selected as benchmark sets: the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD) and the Catalogues of Yeast protein Complexes (CYC2008). The results show that CFOCM is effective and robust, achieving the highest F-measure values in all tests.
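The F-measure evaluation mentioned above can be sketched as follows. The abstract does not state how predicted and benchmark complexes are matched, so this sketch assumes the neighborhood-affinity score that is commonly used in this literature, with a matching threshold of 0.2; both the threshold and the function names are assumptions, not details taken from the CFOCM paper.

```python
def neighborhood_affinity(predicted, reference):
    """Overlap score |P ∩ R|^2 / (|P| * |R|) between two non-empty protein sets."""
    overlap = len(predicted & reference)
    return overlap * overlap / (len(predicted) * len(reference))


def f_measure(predicted_complexes, benchmark_complexes, threshold=0.2):
    """Harmonic mean of precision and recall over matched complexes.

    A predicted complex counts as correct if it matches at least one benchmark
    complex with affinity >= threshold (assumed value), and vice versa for recall.
    """
    if not predicted_complexes or not benchmark_complexes:
        return 0.0
    matched_predicted = sum(
        any(neighborhood_affinity(p, b) >= threshold for b in benchmark_complexes)
        for p in predicted_complexes
    )
    matched_benchmark = sum(
        any(neighborhood_affinity(p, b) >= threshold for p in predicted_complexes)
        for b in benchmark_complexes
    )
    precision = matched_predicted / len(predicted_complexes)
    recall = matched_benchmark / len(benchmark_complexes)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): one of two predictions matches one of two benchmark complexes.
predicted = [{"a", "b", "c"}, {"x", "y"}]
benchmark = [{"a", "b", "c", "d"}, {"m", "n"}]
print(f_measure(predicted, benchmark))
```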
|
38
|
Mehranfar A, Ghadiri N, Kouhsar M, Golshani A. A Type-2 fuzzy data fusion approach for building reliable weighted protein interaction networks with application in protein complex detection. Comput Biol Med 2017; 88:18-31. [DOI: 10.1016/j.compbiomed.2017.06.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/23/2017] [Revised: 06/04/2017] [Accepted: 06/19/2017] [Indexed: 02/02/2023]
|
39
|
Li M, Li D, Tang Y, Wu F, Wang J. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks. Int J Mol Sci 2017; 18:1880. [PMID: 28858211 PMCID: PMC5618529 DOI: 10.3390/ijms18091880] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 08/07/2017] [Revised: 08/22/2017] [Accepted: 08/23/2017] [Indexed: 12/15/2022]
Abstract
Cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial for displaying the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model) and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily extended, so that more clustering algorithms and functions can be added to the plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times from the Cytoscape App Store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
Affiliation(s)
- Min Li: School of Information Science and Engineering, Central South University, Changsha 410083, China
- Dongyan Li: School of Software, Central South University, Changsha 410083, China
- Yu Tang: School of Information Science and Engineering, Central South University, Changsha 410083, China
- Fangxiang Wu: School of Information Science and Engineering, Central South University, Changsha 410083, China; Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Jianxin Wang: School of Information Science and Engineering, Central South University, Changsha 410083, China
|
40
|
Zhu X, Qiu J, Xie M, Wang J. A multi-objective biclustering algorithm based on fuzzy mathematics. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.01.095] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
|
41
|
Lei X, Liang J. Neighbor Affinity-Based Core-Attachment Method to Detect Protein Complexes in Dynamic PPI Networks. Molecules 2017; 22:molecules22071223. [PMID: 28737728 PMCID: PMC6151993 DOI: 10.3390/molecules22071223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/28/2017] [Revised: 07/14/2017] [Accepted: 07/18/2017] [Indexed: 12/13/2022]
Abstract
Protein complexes play significant roles in cellular processes. Identifying protein complexes from protein-protein interaction (PPI) networks is an effective strategy to understand biological processes and cellular functions. A number of methods have recently been proposed to detect protein complexes. However, most methods predict protein complexes from static PPI networks and usually overlook the inherent dynamics and topological properties of protein complexes. In this paper, we propose a novel method, called NABCAM (Neighbor Affinity-Based Core-Attachment Method), to identify protein complexes from dynamic PPI networks. Firstly, the centrality score of every protein is calculated. The proteins with the highest centrality scores are regarded as the seed proteins. Secondly, the seed proteins are expanded to complex cores by calculating the similarity values between the seed proteins and their neighboring proteins. Thirdly, the attachments are appended to their corresponding protein complex cores by comparing the affinity among neighbors inside the core against that outside the core. Finally, filtering processes are carried out to obtain the final clustering result. The results on the DIP database show that the NABCAM algorithm can predict protein complexes effectively in comparison with other state-of-the-art methods. Moreover, many protein complexes predicted by our method are biologically significant.
Affiliation(s)
- Xiujuan Lei: School of Computer Science, Shaanxi Normal University, Xi'an 710119, China.
- Jing Liang: School of Computer Science, Shaanxi Normal University, Xi'an 710119, China.
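The steps summarized in the NABCAM abstract above (seed selection, core expansion, attachment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: degree stands in for the centrality score, Jaccard similarity of closed neighborhoods stands in for the similarity value, the attachment rule (more neighbors inside the core than outside) is an assumed reading of the affinity comparison, and the names `grow_core`, `add_attachments`, `detect_complex` and `sim_threshold` are hypothetical. It works on a single static snapshot and omits the final filtering step.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected PPI network as adjacency sets


def jaccard(graph: Graph, u: str, v: str) -> float:
    """Jaccard similarity of closed neighborhoods (assumed similarity measure)."""
    nu, nv = graph[u] | {u}, graph[v] | {v}
    return len(nu & nv) / len(nu | nv)


def grow_core(graph: Graph, seed: str, sim_threshold: float = 0.5) -> Set[str]:
    """Expand a seed into a core using neighbors that are similar enough to it."""
    return {seed} | {v for v in graph[seed] if jaccard(graph, seed, v) >= sim_threshold}


def add_attachments(graph: Graph, core: Set[str]) -> Set[str]:
    """Attach bordering proteins with more neighbors inside the core than outside."""
    border = set().union(*(graph[u] for u in core)) - core
    attached = {v for v in border if len(graph[v] & core) > len(graph[v] - core)}
    return core | attached


def detect_complex(graph: Graph, sim_threshold: float = 0.5) -> Set[str]:
    """Use the highest-degree protein as the seed, then grow core and attachments."""
    seed = max(sorted(graph), key=lambda n: len(graph[n]))
    return add_attachments(graph, grow_core(graph, seed, sim_threshold))


# Toy network: a dense group A-B-C-D with a tail D-E-F.
toy_ppi: Graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
print(sorted(detect_complex(toy_ppi)))
```

On the toy network the seed is `D`, the dense group joins the core, and `E` is left out because it has as many neighbors outside the core as inside.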
|
42
|
Huo M, Wang Z, Wu D, Zhang Y, Qiao Y. Using Coexpression Protein Interaction Network Analysis to Identify Mechanisms of Danshensu Affecting Patients with Coronary Heart Disease. Int J Mol Sci 2017. [PMID: 28629174 PMCID: PMC5486119 DOI: 10.3390/ijms18061298] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 02/06/2023]
Abstract
Salvia miltiorrhiza, known as Danshen, has attracted worldwide interest for its substantial effects on coronary heart disease (CHD). Danshensu (DSS) is one of the main active ingredients of Danshen acting on CHD. Although it has been proven to have a good clinical effect on CHD, its mechanisms of action remain elusive. In the current study, a coexpression network-based approach was used to illustrate the beneficial properties of DSS in the context of CHD. By integrating gene expression profile data and protein-protein interaction (PPI) data, two coexpression protein interaction networks (CePIN), one in a CHD state (CHD CePIN) and one in a non-CHD state (non-CHD CePIN), were generated. Then, shared nodes and unique nodes in the CHD CePIN were obtained by comparing the CHD CePIN with the non-CHD CePIN. By calculating the topological parameters of each shared node and unique node in the networks, and comparing the differentially expressed genes, target proteins involved in disease regulation were obtained. Then, Gene Ontology (GO) enrichment was used to identify biological processes associated with the target proteins. The results suggest that the treatment of CHD with DSS may be partly attributed to the regulation of immunization and blood circulation. They also indicate that sodium/hydrogen exchanger 3 (SLC9A3), prostaglandin G/H synthase 2 (PTGS2), oxidized low-density lipoprotein receptor 1 (OLR1), and fibrinogen gamma chain (FGG) may be potential therapeutic targets for CHD. In summary, this study provides a novel coexpression protein interaction network approach to explain the mechanisms of DSS on CHD and to identify key proteins that may be potential therapeutic targets for CHD.
Affiliation(s)
- Mengqi Huo: Key Laboratory of Traditional Chinese Medicine Information Engineer of State Administration of Traditional Chinese Medicine; School of Chinese Material Medica, Beijing University of Chinese Medicine, Beijing 100102, China.
- Zhixin Wang: Key Laboratory of Traditional Chinese Medicine Information Engineer of State Administration of Traditional Chinese Medicine; School of Chinese Material Medica, Beijing University of Chinese Medicine, Beijing 100102, China.
- Dongxue Wu: Key Laboratory of Traditional Chinese Medicine Information Engineer of State Administration of Traditional Chinese Medicine; School of Chinese Material Medica, Beijing University of Chinese Medicine, Beijing 100102, China.
- Yanling Zhang: Key Laboratory of Traditional Chinese Medicine Information Engineer of State Administration of Traditional Chinese Medicine; School of Chinese Material Medica, Beijing University of Chinese Medicine, Beijing 100102, China.
- Yanjiang Qiao: Key Laboratory of Traditional Chinese Medicine Information Engineer of State Administration of Traditional Chinese Medicine; School of Chinese Material Medica, Beijing University of Chinese Medicine, Beijing 100102, China.
|
43
|
Wu M, Ou-Yang L, Li XL. Protein Complex Detection via Effective Integration of Base Clustering Solutions and Co-Complex Affinity Scores. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:733-739. [PMID: 27071190 DOI: 10.1109/tcbb.2016.2552176] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
With the increasing availability of protein interaction data, various computational methods have been developed to predict protein complexes. However, different computational methods may have their own advantages and limitations. Ensemble clustering has thus been studied to minimize the potential bias and risk of individual methods and generate prediction results with better coverage and accuracy. In this paper, we extend traditional ensemble clustering by taking into account the co-complex affinity scores and present an Ensemble Hierarchical Clustering framework (EnsemHC) to detect protein complexes. First, we construct co-cluster matrices by integrating the clustering results with the co-complex evidence. Second, we sum up the constructed co-cluster matrices to derive a final ensemble matrix via a novel iterative weighting scheme. Finally, we apply hierarchical clustering to generate protein complexes from the final ensemble matrix. Experimental results demonstrate that EnsemHC performs better than its base clustering methods and various existing integrative methods. In addition, we observed that integrating the clusters and co-complex affinity scores from different data sources improves the prediction performance; e.g., integrating the clusters from TAP data and co-complex affinities from binary PPI data achieved the best performance in our experiments.
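The first two EnsemHC steps described above (building co-cluster matrices and summing them into an ensemble matrix) can be sketched in Python as follows. This is only an illustration under stated assumptions: the weights are fixed by hand rather than learned by the paper's iterative weighting scheme, the co-complex affinity scores are not integrated, the final hierarchical clustering is omitted, and the function names `co_cluster_matrix` and `ensemble_matrix` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # protein pair stored in sorted order


def co_cluster_matrix(clusters: List[Set[str]]) -> Dict[Pair, float]:
    """Sparse co-cluster matrix: 1.0 for every protein pair that shares a cluster."""
    matrix: Dict[Pair, float] = {}
    for cluster in clusters:
        members = sorted(cluster)
        for i, u in enumerate(members):
            for v in members[i + 1:]:
                matrix[(u, v)] = 1.0
    return matrix


def ensemble_matrix(solutions: List[List[Set[str]]], weights: List[float]) -> Dict[Pair, float]:
    """Weighted sum of the co-cluster matrices of all base clustering solutions."""
    total: Dict[Pair, float] = {}
    for clusters, weight in zip(solutions, weights):
        for pair, value in co_cluster_matrix(clusters).items():
            total[pair] = total.get(pair, 0.0) + weight * value
    return total


# Two toy base clustering solutions over five proteins (assumed data).
base_a = [{"P1", "P2", "P3"}, {"P4", "P5"}]
base_b = [{"P1", "P2"}, {"P3", "P4", "P5"}]
ensemble = ensemble_matrix([base_a, base_b], weights=[0.5, 0.5])
print(ensemble[("P1", "P2")], ensemble[("P1", "P3")])
```

Pairs grouped together by both base solutions receive the full weight, while pairs grouped by only one receive half, which is the signal a subsequent hierarchical clustering step would operate on.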
|
44
|
Peng W, Li M, Chen L, Wang L. Predicting Protein Functions by Using Unbalanced Random Walk Algorithm on Three Biological Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:360-369. [PMID: 28368814 DOI: 10.1109/tcbb.2015.2394314] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 06/07/2023]
Abstract
As the gap between sequence data and their functional annotations grows increasingly wide, many computational methods have been proposed to annotate the functions of unknown proteins. However, designing effective methods that make good use of various biological resources is still a big challenge for researchers because of the functional diversity of proteins. In this work, we propose a new method named ThrRW, which takes several steps of random walking on three different biological networks, namely the protein interaction network (PIN), the domain co-occurrence network (DCN), and the functional interrelationship network (FIN), so as to infer functional information from neighbors in the corresponding networks. Because of the topological and structural differences among the three networks, the number of walking steps differs from network to network. In the course of walking, the functional information is transferred from one network to another according to the associations between the nodes in different networks. Experimental results on S. cerevisiae data show that our method achieves better prediction performance than both the methods that consider PIN data together with GO term similarities and the methods that use PIN data together with protein domain information, which verifies the effectiveness of our method in integrating multiple biological data sources.
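The basic operation behind ThrRW, taking a chosen number of random-walk steps on a network so that information spreads to neighbors, can be sketched in Python. This is a simplified illustration, not the published algorithm: it walks on a single unweighted network, does not model the transfer of information between the PIN, DCN and FIN, and the names `walk_step` and `random_walk` are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected network as adjacency sets


def walk_step(graph: Graph, scores: Dict[str, float]) -> Dict[str, float]:
    """One random-walk step: each node passes its score equally to its neighbors."""
    updated = {node: 0.0 for node in graph}
    for node, score in scores.items():
        if not graph[node]:
            updated[node] += score  # an isolated node keeps its score
            continue
        share = score / len(graph[node])
        for neighbor in graph[node]:
            updated[neighbor] += share
    return updated


def random_walk(graph: Graph, start: str, steps: int) -> Dict[str, float]:
    """Distribution of a walker that starts at `start` after `steps` steps."""
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(steps):
        scores = walk_step(graph, scores)
    return scores


# Toy path network P1 - P2 - P3 (assumed data).
path_net: Graph = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}}
print(random_walk(path_net, "P1", steps=2))
```

Using a different `steps` value per network is one simple way to express the "unbalanced" walk lengths mentioned in the abstract.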
|
45
|
Li M, Lu Y, Niu Z, Wu FX. United Complex Centrality for Identification of Essential Proteins from PPI Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:370-380. [PMID: 28368815 DOI: 10.1109/tcbb.2015.2394487] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Indexed: 06/07/2023]
Abstract
Essential proteins are indispensable for the survival or reproduction of an organism. Identifying essential proteins is not only necessary for understanding the minimal requirements for cellular life, but also important for disease study and drug design. With the development of high-throughput techniques, a large amount of protein-protein interaction data has become available, which promotes the study of essential proteins at the network level. Although a series of computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a new method, United complex Centrality (UC), to identify essential proteins by integrating protein complexes with the topological features of protein-protein interaction (PPI) networks. By analyzing the relationship between the essential proteins and the known protein complexes of S. cerevisiae and human, we find that proteins in complexes are more likely to be essential than proteins not included in any complex, and that proteins appearing in multiple complexes are more likely to be essential than those appearing in only a single complex. Considering that some protein complexes generated by computational methods are inaccurate, we also provide a modified version of UC with a parameter alpha, named UC-P. The experimental results show that protein complex information can help identify essential proteins more accurately for the PPI networks of both S. cerevisiae and human. The proposed method UC performs clearly better than eight previously proposed methods (DC, IC, EC, SC, BC, CC, NC, and LAC) for identifying essential proteins.
|
46
|
Qi Y, Luo J. Prediction of Essential Proteins Based on Local Interaction Density. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:1170-1182. [PMID: 26701891 DOI: 10.1109/tcbb.2015.2509989] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
Predicting essential proteins computationally from high-throughput data is more efficient than time-consuming and expensive experimental approaches. Many computational approaches have been reported; however, they are usually sensitive to network structure, so their robustness is generally poor. In this paper, a novel topological centrality measure for predicting essential proteins based on local interaction density, named LID, is proposed. Unlike previous measures, LID derives the essentiality of a node from the interaction densities among its neighbors, through topological analyses of real proteins in a protein complex set, and is the first to do so from the viewpoint of biological modules. LID is applied to four different yeast protein interaction networks, obtained from the DIP database and the BioGRID database. The experimental results show that the number of essential proteins detected by LID exceeds or approximates the best performance of 10 other topological centrality measures in all 24 comparisons on the four networks: DC, BC, ClusterC, CloseC, MNC, SoECC(NC), LAC, SC, EigC, and InfoC. The better robustness of LID across multiple data sets should make it a new core topological centrality measure for improving prediction performance on the protein interaction networks of more species.
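The idea of scoring a node by the interaction density among its neighbors can be illustrated with a short Python sketch. The abstract does not give the LID formula, so the measure below is an assumed stand-in (the fraction of neighbor pairs that are themselves connected, i.e. the local clustering coefficient), and the name `neighbor_density` is hypothetical; it is not the published LID definition.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected network as adjacency sets


def neighbor_density(graph: Graph, node: str) -> float:
    """Fraction of pairs of `node`'s neighbors that interact with each other."""
    neighbors = sorted(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pairs to evaluate
    links = sum(
        1
        for i, u in enumerate(neighbors)
        for v in neighbors[i + 1:]
        if v in graph[u]
    )
    return links / (k * (k - 1) / 2)


# Toy network: triangle A-B-C with a pendant node D attached to A (assumed data).
tri_net: Graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
ranking = sorted(tri_net, key=lambda n: neighbor_density(tri_net, n), reverse=True)
print(ranking)
```

Ranking proteins by such a score and taking the top candidates mirrors how topological centrality measures are typically compared against a list of known essential proteins.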
|
47
|
Cao B, Luo J, Liang C, Wang S, Ding P. PCE-FR: A Novel Method for Identifying Overlapping Protein Complexes in Weighted Protein-Protein Interaction Networks Using Pseudo-Clique Extension Based on Fuzzy Relation. IEEE Trans Nanobioscience 2016; 15:728-738. [PMID: 27662678 DOI: 10.1109/tnb.2016.2611683] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/05/2022]
Abstract
Identifying overlapping protein complexes in protein-protein interaction (PPI) networks can provide insight into cellular functional organization and thus elucidate underlying cellular mechanisms. Recently, various algorithms for protein complex detection have been developed for PPI networks. However, the majority of these algorithms depend primarily on network topological features and/or gene expression profiles, failing to consider the inherent biological meaning of protein pairs. In this paper, we propose a novel method to detect protein complexes using pseudo-clique extension based on fuzzy relation (PCE-FR). Our algorithm operates in three stages: it first forms nonoverlapping protein substructures based on fuzzy relation, and then expands each substructure by adding neighbor proteins to maximize the cohesive score. Finally, highly overlapped candidate protein complexes are merged to form the final protein complex set. In particular, our algorithm employs the biological significance hidden in protein pairs to construct edge weights for protein interaction networks. The experimental results show that our method not only outperforms classical algorithms such as CFinder, ClusterONE, CMC, RRW, HC-PIN, and ProRank+, but also achieves ideal overall performance on most of the yeast PPI datasets in terms of a composite score consisting of precision, accuracy, and separation. We further apply our method to a human PPI network from the HPRD dataset and demonstrate that it is very effective in detecting protein complexes compared to other algorithms.
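The last PCE-FR stage described above, merging highly overlapped candidate complexes, can be sketched in Python. The abstract does not state the overlap criterion, so this sketch assumes a commonly used overlap score, |A ∩ B|² / (|A| · |B|), and a hypothetical `threshold` of 0.5; the function names are illustrative, and candidate sets are assumed to be non-empty.

```python
from typing import Iterable, List, Set


def overlap_score(a: Set[str], b: Set[str]) -> float:
    """Overlap of two non-empty sets: |a & b|^2 / (|a| * |b|), in [0, 1]."""
    return len(a & b) ** 2 / (len(a) * len(b))


def merge_overlapping(candidates: Iterable[Set[str]], threshold: float = 0.5) -> List[Set[str]]:
    """Greedily merge candidate complexes whose overlap score reaches the threshold."""
    merged: List[Set[str]] = []
    for candidate in candidates:
        current = set(candidate)
        absorbed = True
        while absorbed:  # repeat, since a grown set may now overlap other sets
            absorbed = False
            for other in merged:
                if overlap_score(current, other) >= threshold:
                    merged.remove(other)
                    current |= other
                    absorbed = True
                    break
        merged.append(current)
    return merged


# Toy candidates (assumed data): the first two overlap heavily, the third is separate.
toy_candidates = [{"A", "B", "C"}, {"A", "B", "C", "D"}, {"X", "Y"}]
final_complexes = merge_overlapping(toy_candidates)
print(final_complexes)
```

The first two candidates score 9/12 = 0.75 and are merged, while the third shares no proteins with them and is kept as its own complex.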
|
48
|
Ou-Yang L, Zhang XF, Dai DQ, Wu MY, Zhu Y, Liu Z, Yan H. Protein complex detection based on partially shared multi-view clustering. BMC Bioinformatics 2016; 17:371. [PMID: 27623844 PMCID: PMC5022186 DOI: 10.1186/s12859-016-1164-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/11/2015] [Accepted: 07/23/2016] [Indexed: 01/05/2023]
Abstract
Background: Protein complexes are the key molecular entities that perform many essential biological functions. In recent years, high-throughput experimental techniques have generated a large amount of protein interaction data. As a consequence, computational analysis of such data for protein complex detection has received increased attention in the literature. However, most existing works focus on predicting protein complexes from a single type of data, either physical interaction data or co-complex interaction data. These two types of data provide compatible and complementary information, so it is necessary to integrate them to discover the underlying structures and obtain better performance in complex detection.
Results: In this study, we propose a novel multi-view clustering algorithm, called the Partially Shared Multi-View Clustering model (PSMVC), to carry out such an integrated analysis. Unlike traditional multi-view learning algorithms that focus on mining either consistent or complementary information embedded in the multi-view data, PSMVC can jointly explore the shared and specific information inherent in different views. In our experiments, we compare the complexes detected by PSMVC from a single data source with those detected from multiple data sources. We observe that jointly analyzing multi-view data benefits the detection of protein complexes. Furthermore, extensive experimental results demonstrate that PSMVC performs much better than 16 state-of-the-art complex detection techniques, including ensemble clustering and data integration techniques.
Conclusions: In this work, we demonstrate that when integrating multiple data sources, using the partially shared multi-view clustering model can help to identify protein complexes that are not readily identifiable by conventional single-view-based methods and other integrative analysis methods. All the results and source code are available at https://github.com/Oyl-CityU/PSMVC.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1164-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Le Ou-Yang: College of Information Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China; Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China.
- Xiao-Fei Zhang: School of Mathematics and Statistics and Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan, 430079, China.
- Dao-Qing Dai: Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xin Gang Road West, Guangzhou, 510275, China.
- Meng-Yun Wu: School of Statistics and Management, Shanghai University of Finance and Economics, Guoding Road, Shanghai, 200433, China.
- Yuan Zhu: School of Automation, China University of Geosciences, Wuhan, China.
- Zhiyong Liu: Shenzhen Polytechnic, Shenzhen, 518055, China.
- Hong Yan: Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China.
|
49
|
|
50
|
Li G, Li M, Wang J, Wu J, Wu FX, Pan Y. Predicting essential proteins based on subcellular localization, orthology and PPI networks. BMC Bioinformatics 2016; 17 Suppl 8:279. [PMID: 27586883 PMCID: PMC5009824 DOI: 10.1186/s12859-016-1115-5] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 01/15/2023]
Abstract
Background: Essential proteins play an indispensable role in cellular survival and development. There is a series of biological experimental methods for finding essential proteins; however, they are time-consuming, expensive and inefficient. To overcome the shortcomings of biological experimental methods, many computational methods have been proposed to predict essential proteins. The computational methods can be roughly divided into two categories, topology-based methods and sequence-based ones. The former use the topological features of protein-protein interaction (PPI) networks, while the latter use the sequence features of proteins to predict essential proteins. Nevertheless, it is still challenging to improve the prediction accuracy of the computational methods.
Results: Compared with nonessential proteins, essential proteins appear more frequently in certain subcellular locations and are more evolutionarily conserved. By integrating the information of subcellular localization, orthologous proteins and PPI networks, we propose a novel essential protein prediction method, named SON, in this study. The experimental results on S. cerevisiae data show that the prediction accuracy of SON clearly exceeds that of nine competing methods: DC, BC, IC, CC, SC, EC, NC, PeC and ION.
Conclusions: We demonstrate that, by integrating the information of subcellular localization and orthologous proteins with PPI networks, the accuracy of predicting essential proteins can be improved. Our proposed method SON is effective for predicting essential proteins.
Affiliation(s)
- Gaoshi Li: School of Information Science and Engineering, Central South University, Changsha, 410083, Hunan, People's Republic of China; Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, 541004, Guangxi, People's Republic of China.
- Min Li: School of Information Science and Engineering, Central South University, Changsha, 410083, Hunan, People's Republic of China.
- Jianxin Wang: School of Information Science and Engineering, Central South University, Changsha, 410083, Hunan, People's Republic of China.
- Jingli Wu: Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, 541004, Guangxi, People's Republic of China.
- Fang-Xiang Wu: School of Information Science and Engineering, Central South University, Changsha, 410083, Hunan, People's Republic of China; Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, S7N 5A9, SK, Canada.
- Yi Pan: School of Information Science and Engineering, Central South University, Changsha, 410083, Hunan, People's Republic of China; Department of Computer Science, Georgia State University, Atlanta, 30302-4110, GA, USA.
|