1
Chen S, Huang C, Wang L, Zhou S. A disease-related essential protein prediction model based on the transfer neural network. Front Genet 2023; 13:1087294. [PMID: 36685976; PMCID: PMC9845409; DOI: 10.3389/fgene.2022.1087294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 12/14/2022] [Indexed: 01/06/2023]
Abstract
Essential proteins play important roles in the development and survival of organisms, and their mutations have been shown to drive common internal diseases with high prevalence rates. Because traditional biological experiments are costly, this paper first designs an improved Transfer Neural Network (TNN) to extract raw features from multiple kinds of biological information about proteins and then, based on the newly constructed TNN, proposes a novel computational model called TNNM to infer essential proteins. Unlike a traditional Markov chain, the TNN uses the gradient descent algorithm to obtain the transition probability matrix automatically, which greatly improves the prediction accuracy of TNNM. Moreover, an additional antecedent memory coefficient and a bias term are introduced into the TNN, which further enhance both the robustness and the non-linear expressive ability of TNNM. Finally, to evaluate the identification performance of TNNM, intensive experiments were run separately on two well-known public databases, and the results show that TNNM outperforms representative state-of-the-art prediction models in terms of both predictive accuracy and the decline rate of accuracy. Therefore, TNNM may play an important role in key protein prediction in the future.
Affiliation(s)
- Sisi Chen: The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
- Chiguo Huang: Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Lei Wang: The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China; Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Shunxian Zhou: The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China; Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China; College of Information Science and Engineering, Hunan Women’s University, Changsha, Hunan, China
*Correspondence: Chiguo Huang; Lei Wang; Shunxian Zhou
2
Li Y, Wu FX, Ngom A. A review on machine learning principles for multi-view biological data integration. Brief Bioinform 2019; 19:325-340. [PMID: 28011753; DOI: 10.1093/bib/bbw113] [Citation(s) in RCA: 126] [Impact Index Per Article: 25.2] [Received: 07/26/2016] [Indexed: 01/08/2023]
Abstract
Driven by high-throughput sequencing techniques, modern genomic and clinical studies are in a strong need of integrative machine learning models for better use of vast volumes of heterogeneous information in the deep understanding of biological systems and the development of predictive models. How data from multiple sources (called multi-view data) are incorporated in a learning system is a key step for successful analysis. In this article, we provide a comprehensive review on omics and clinical data integration techniques, from a machine learning perspective, for various analyses such as prediction, clustering, dimension reduction and association. We shall show that Bayesian models are able to use prior information and model measurements with various distributions; tree-based methods can either build a tree with all features or collectively make a final decision based on trees learned from each view; kernel methods fuse the similarity matrices learned from individual views together for a final similarity matrix or learning model; network-based fusion methods are capable of inferring direct and indirect associations in a heterogeneous network; matrix factorization models have potential to learn interactions among features from different views; and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.
Affiliation(s)
- Yifeng Li: Information and Communications Technologies, National Research Council Canada, Ottawa, Ontario, Canada
- Fang-Xiang Wu: Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Alioune Ngom: School of Computer Science, University of Windsor, Windsor, Ontario, Canada
3
Liu W, Ma L, Jeon B, Chen L, Chen B. A Network Hierarchy-Based method for functional module detection in protein-protein interaction networks. J Theor Biol 2018; 455:26-38. [PMID: 29981337; DOI: 10.1016/j.jtbi.2018.06.026] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/28/2018] [Revised: 06/27/2018] [Accepted: 06/29/2018] [Indexed: 02/02/2023]
Abstract
In the post-genomic era, one important task is to identify protein complexes and functional modules from high-throughput protein-protein interaction (PPI) data, so that we can systematically analyze and understand the molecular functions and biological processes of cells. Although many functional module detection methods have been proposed, designing correct and efficient functional module detection algorithms is still a challenging and important scientific problem in computational biology. In this paper, we present a novel Network Hierarchy-Based method to detect functional modules in PPI networks (named NHB-FMD). NHB-FMD first constructs the hierarchy tree corresponding to the PPI network and then encodes the tree so that a genetic algorithm can be employed to obtain the hierarchy tree with maximum likelihood. Functional module partitioning is then performed based on this tree, and the best partitioning is selected as the result. Experimental results on real PPI networks show that the proposed algorithm not only significantly outperforms the state-of-the-art methods but also detects protein modules more effectively and accurately.
Affiliation(s)
- Wei Liu: College of Information Engineering of Yangzhou University, Yangzhou 225127, China; The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huaiyin 223002, China; School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon, South Korea
- Liangyu Ma: College of Information Engineering of Yangzhou University, Yangzhou 225127, China
- Byeungwoo Jeon: School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon, South Korea
- Ling Chen: College of Information Engineering of Yangzhou University, Yangzhou 225127, China
- Bolun Chen: The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huaiyin 223002, China
4
|
|
5
Luo P, Tian LP, Chen B, Xiao Q, Wu FX. Predicting Gene-Disease Associations with Manifold Learning. Bioinformatics Research and Applications 2018. [DOI: 10.1007/978-3-319-94968-0_26] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023]
6
Wang J, Zheng W, Qian Y, Liang J. A Seed Expansion Graph Clustering Method for Protein Complexes Detection in Protein Interaction Networks. Molecules 2017; 22:molecules22122179. [PMID: 29292776; PMCID: PMC6150027; DOI: 10.3390/molecules22122179] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/09/2017] [Revised: 12/03/2017] [Accepted: 12/03/2017] [Indexed: 02/02/2023]
Abstract
Most proteins perform their biological functions while interacting as complexes. The detection of protein complexes is an important task not only for understanding the relationship between the functions and structures of biological networks, but also for predicting the function of unknown proteins. We present a new nodal metric that integrates a node's local topological information. The metric reflects how well the node represents a cluster within a larger local neighborhood of a protein-protein interaction (PPI) network. Based on the metric, we propose a seed-expansion graph clustering algorithm (SEGC) for protein complex detection in PPI networks. A roulette wheel strategy is used in the selection of the seed to enhance the diversity of clustering. For a candidate node u, we define its closeness to a cluster C, denoted as NC(u, C), by combining the density of the cluster C and the connection between the node u and C. In SEGC, a cluster, which initially consists of only a seed node, is extended by recursively adding nodes from its neighbors according to the closeness, until all neighbors fail the process of expansion. We compare the F-measure and accuracy of the proposed SEGC algorithm with other algorithms on Saccharomyces cerevisiae protein interaction networks. The experimental results show that SEGC outperforms other algorithms under full coverage.
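The seed-and-expand loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the formula for NC(u, C) or the nodal metric, so the `closeness` function here (the fraction of cluster members adjacent to u), the caller-supplied roulette weights, and the `threshold` parameter are all hypothetical stand-ins, not the SEGC definitions.

```python
import random


def roulette_pick(weights, rng):
    """Roulette wheel selection: pick a node with probability proportional
    to its weight. The weights stand in for the paper's nodal metric."""
    nodes = sorted(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


def closeness(adj, u, cluster):
    """Hypothetical NC(u, C): fraction of cluster members adjacent to u."""
    return len(adj[u] & cluster) / len(cluster)


def expand(adj, seed, threshold=0.5):
    """Grow a cluster from a seed, adding the closest neighbor each round,
    until every remaining neighbor falls below the threshold."""
    cluster = {seed}
    while True:
        # Neighbors of the current cluster that are not yet members.
        frontier = set().union(*(adj[v] for v in cluster)) - cluster
        scored = [(closeness(adj, u, cluster), u) for u in sorted(frontier)]
        passing = [item for item in scored if item[0] >= threshold]
        if not passing:
            return cluster
        cluster.add(max(passing)[1])


if __name__ == "__main__":
    # Toy undirected network as an adjacency map (assumed input format).
    adj = {
        "A": {"B", "C"},
        "B": {"A", "C"},
        "C": {"A", "B", "D"},
        "D": {"C", "E"},
        "E": {"D"},
    }
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    seed = roulette_pick(degree, random.Random(0))
    print(seed, sorted(expand(adj, seed, threshold=0.6)))
```

Weighting the wheel by degree is just one plausible choice; any positive per-node score could be passed to `roulette_pick` instead.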
Affiliation(s)
- Jie Wang: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
- Wenping Zheng: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
- Yuhua Qian: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
- Jiye Liang: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
7
Identifying protein complex by integrating characteristic of core-attachment into dynamic PPI network. PLoS One 2017; 12:e0186134. [PMID: 29045465; PMCID: PMC5646790; DOI: 10.1371/journal.pone.0186134] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/18/2017] [Accepted: 09/26/2017] [Indexed: 11/19/2022]
Abstract
Identifying protein complexes is an important and challenging task in proteomics, and it would contribute greatly to our knowledge of the molecular mechanisms of cell life activities. However, the inherent organization and dynamic characteristics of cell systems have rarely been incorporated into the existing algorithms for detecting protein complexes because of the limitations of protein-protein interaction (PPI) data produced by high-throughput techniques. The availability of time-course gene expression profiles enables us to uncover the dynamics of molecular networks and improve the detection of protein complexes. To achieve this goal, this paper proposes a novel algorithm, DCA (Dynamic Core-Attachment). It detects a protein-complex core comprising continually expressed and highly connected proteins in a dynamic PPI network, and then forms the protein complex by adding attachments with high adhesion to the core. The integration of the core-attachment feature into the dynamic PPI network is responsible for the superiority of our algorithm. DCA has been applied to two different yeast dynamic PPI networks, and the experimental results show that it performs significantly better than the state-of-the-art techniques in terms of prediction accuracy, hF-measure and statistical significance in biology. In addition, the identified complexes with strong biological significance provide potential candidate complexes for biologists to validate.
8
Xu B, Wang Y, Wang Z, Zhou J, Zhou S, Guan J. An effective approach to detecting both small and large complexes from protein-protein interaction networks. BMC Bioinformatics 2017; 18:419. [PMID: 29072136; PMCID: PMC5657047; DOI: 10.1186/s12859-017-1820-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Abstract
Background: Predicting protein complexes from protein-protein interaction (PPI) networks has been studied for a decade. Various methods have been proposed to address some challenging issues of this problem, including overlapping clusters, high false positive/negative rates of PPI data and diverse complex structures. It is well known that most current methods can effectively detect only complexes of size ≥ 3, which account for only about half of the total existing complexes. Recently, a method was proposed specifically for finding small complexes (size = 2 and 3) from PPI networks. However, up to now there is no effective approach that can predict both small (size ≤ 3) and large (size > 3) complexes from PPI networks. Results: In this paper, we propose a novel method, called CPredictor2.0, that can detect both small and large complexes under a unified framework. Concretely, we first group proteins of similar functions. Then, the Markov clustering algorithm is employed to discover clusters in each group. Finally, we merge all discovered clusters that overlap with each other to a certain degree, and the merged clusters as well as the remaining clusters constitute the set of detected complexes. Extensive experiments have shown that the new method can more effectively predict both small and large complexes, in comparison with the state-of-the-art methods. Conclusions: The proposed method, CPredictor2.0, can be applied to accurately predict both small and large protein complexes.
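The final merging step described above (clusters that "overlap with each other to a certain degree" are merged) might look roughly like the sketch below. The abstract does not state the overlap measure or cutoff, so the `overlap` function (shared proteins divided by the size of the smaller cluster) and the `threshold` value are assumptions made for illustration, not the CPredictor2.0 definitions; the grouping and Markov clustering steps are not shown.

```python
def overlap(a, b):
    """Hypothetical overlap score: shared members relative to the smaller cluster."""
    return len(a & b) / min(len(a), len(b))


def merge_overlapping(clusters, threshold=0.5):
    """Repeatedly merge any pair of clusters whose overlap reaches the
    threshold, until no such pair remains. Unmerged clusters are kept as-is."""
    merged = [set(c) for c in clusters if c]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlap(merged[i], merged[j]) >= threshold:
                    merged[i] |= merged[j]   # absorb cluster j into cluster i
                    del merged[j]
                    changed = True
                    break
            if changed:
                break                        # restart the scan after every merge
    return merged


if __name__ == "__main__":
    clusters = [{"P1", "P2", "P3"}, {"P2", "P3", "P4"}, {"P5", "P6"}]
    for complex_ in merge_overlapping(clusters, threshold=0.6):
        print(sorted(complex_))
```

Restarting the scan after each merge keeps the logic simple at the cost of speed, which seems acceptable for a sketch on small cluster sets.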
Affiliation(s)
- Bin Xu: Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai, 201804, China
- Yang Wang: School of Software, Jiangxi Normal University, 99 Ziyang Avenue, Nanchang, 330022, China
- Zewei Wang: Shanghai Southwest Model Middle School, 67 Huicheng Vallige-1, Baise Road, Shanghai, 200237, China
- Jiaogen Zhou: The Institute of Subtropical Agriculture, Chinese Academy of Sciences, 444 Yuandaer Road, Mapoling, Changsha, 410125, China
- Shuigeng Zhou: Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai, 200433, China; The Bioinformatics Lab at Changzhou NO. 7 People's Hospital, Changzhou, Jiangsu, 213011, China
- Jihong Guan: Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai, 201804, China
9
Protein Complexes Prediction Method Based on Core-Attachment Structure and Functional Annotations. Int J Mol Sci 2017; 18:ijms18091910. [PMID: 28878201; PMCID: PMC5618559; DOI: 10.3390/ijms18091910] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/25/2017] [Revised: 08/31/2017] [Accepted: 09/01/2017] [Indexed: 11/17/2022]
Abstract
Recent advances in high-throughput laboratory techniques have captured large-scale protein–protein interaction (PPI) data, making it possible to create a detailed map of protein interaction networks and thus to detect protein complexes from these PPI networks. However, most current state-of-the-art studies still have some problems: for instance, they cannot identify overlapping clusters, do not consider the inherent organization within protein complexes, and overlook the biological meaning of complexes. Therefore, we present a novel overlapping protein complexes prediction method based on core–attachment structure and function annotations (CFOCM), which works in two stages. First, it detects protein complex cores with the maximum value of our defined cluster closeness function, in which the proteins are also closely related to at least one common function. Then, it appends attachment proteins to these detected cores to form the returned complexes. For performance evaluation, CFOCM and six classical methods were used to identify protein complexes on three different yeast PPI networks, with three sets of real complexes selected as benchmark sets: the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD) and the Catalogues of Yeast protein Complexes (CYC2008). The results show that CFOCM is effective and robust, achieving the highest F-measure values in all tests.
10
Ji J, Lv J, Yang C, Zhang A. Detecting Functional Modules Based on a Multiple-Grain Model in Large-Scale Protein-Protein Interaction Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:610-622. [PMID: 26394434; DOI: 10.1109/tcbb.2015.2480066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
Detecting functional modules from a Protein-Protein Interaction (PPI) network is a fundamental and hot issue in proteomics research, where many computational approaches have played an important role in recent years. However, how to effectively and efficiently detect functional modules in large-scale PPI networks is still a challenging problem. We present a new framework, based on a multiple-grain model of PPI networks, to detect functional modules in PPI networks. First, we give a multiple-grain representation model of a PPI network, which has a smaller scale with super nodes. Second, we design the protein grain partitioning method, which employs a functional similarity or a structural similarity to merge some proteins layer by layer. Third, a refining mechanism with border node tests is proposed to address the overlapping of proteins across different modules during the grain eliminating process. Finally, systematic experiments are conducted on five large-scale yeast and human networks. The results show that the framework not only significantly reduces the running time of functional module detection, but also effectively identifies overlapping modules while keeping competitive performance; thus, it is well suited to detecting functional modules in large-scale PPI networks.
11
Mining Temporal Protein Complex Based on the Dynamic PIN Weighted with Connected Affinity and Gene Co-Expression. PLoS One 2016; 11:e0153967. [PMID: 27100396; PMCID: PMC4839750; DOI: 10.1371/journal.pone.0153967] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/30/2015] [Accepted: 04/06/2016] [Indexed: 11/19/2022]
Abstract
The identification of temporal protein complexes would contribute greatly to our knowledge of the dynamic organization characteristics in protein interaction networks (PINs). Recent studies have focused on integrating gene expression data into a static PIN to construct a dynamic PIN that reveals the dynamic evolutionary procedure of protein interactions, but in practice they fail to recognize the active time points of proteins with low or high expression levels. We construct a Time-Evolving PIN (TEPIN) with a novel method called Deviation Degree, which is designed to identify the active time points of proteins based on the deviation degree of their own expression values. Moreover, because protein interactions differ from one another, we weight TEPIN with connected affinity and gene co-expression to quantify the degree of these interactions. To validate the efficiency of our methods, the ClusterONE, CAMSE and MCL algorithms are applied to the TEPIN, DPIN (a dynamic PIN constructed with the state-of-the-art three-sigma method) and SPIN (the original static PIN) to detect temporal protein complexes. Each algorithm on our TEPIN outperforms that on other networks in terms of match degree, sensitivity, specificity, F-measure, function enrichment, etc. In conclusion, our Deviation Degree method successfully eliminates the disadvantages of the previous state-of-the-art dynamic PIN construction methods. Moreover, the biological nature of protein interactions can be well described in our weighted network. Weighted TEPIN is a useful approach for detecting temporal protein complexes and revealing the dynamic protein assembly process for cellular organization.
12
Lei X, Wang F, Wu FX, Zhang A, Pedrycz W. Protein complex identification through Markov clustering with firefly algorithm on dynamic protein–protein interaction networks. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2015.09.028] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 02/02/2023]
13
Cao B, Luo J, Liang C, Wang S, Song D. MOEPGA: A novel method to detect protein complexes in yeast protein-protein interaction networks based on MultiObjective Evolutionary Programming Genetic Algorithm. Comput Biol Chem 2015; 58:173-81. [PMID: 26298638; DOI: 10.1016/j.compbiolchem.2015.06.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 02/02/2015] [Revised: 06/02/2015] [Accepted: 06/22/2015] [Indexed: 02/02/2023]
Abstract
The identification of protein complexes in protein-protein interaction (PPI) networks has greatly advanced our understanding of biological organisms. Existing computational methods to detect protein complexes are usually based on specific network topological properties of PPI networks. However, due to the inherent complexity of the network structures, the identification of protein complexes may not be fully addressed by using a single network topological property. In this study, we propose a novel MultiObjective Evolutionary Programming Genetic Algorithm (MOEPGA), which integrates multiple network topological features to detect biologically meaningful protein complexes. Our approach first systematically analyzes the multiobjective problem of identifying protein complexes from PPI networks, and then constructs the objective function of the iterative algorithm based on three common topological properties of protein complexes from the benchmark dataset. Finally, we describe our algorithm, which mainly consists of three steps: population initialization, subgraph mutation and subgraph selection. To show the utility of our method, we compared MOEPGA with several state-of-the-art algorithms on two yeast PPI datasets. The experimental results demonstrate that the proposed method can not only find more protein complexes but also achieve higher accuracy in terms of fscore. Moreover, our approach can cover a certain number of proteins in the input PPI network in terms of the normalized clustering score. Taken together, our method can serve as a powerful framework to detect protein complexes in yeast PPI networks, thereby facilitating the identification of the underlying biological functions.
Affiliation(s)
- Buwen Cao: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China
- Jiawei Luo: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China
- Cheng Liang: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China
- Shulin Wang: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China
- Dan Song: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province, China
14
Zhang W, Zou X. A New Method for Detecting Protein Complexes based on the Three Node Cliques. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:879-886. [PMID: 26357329; DOI: 10.1109/tcbb.2014.2386314] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
The identification of protein complexes in protein-protein interaction (PPI) networks is fundamental for understanding biological processes and cellular molecular mechanisms. Many graph computational algorithms have been proposed to identify protein complexes from PPI networks by detecting densely connected groups of proteins. These algorithms assess the density of subgraphs through evaluation of the sum of individual edges or nodes; thus, incomplete and inaccurate measures may miss meaningful biological protein complexes with functional significance. In this study, we propose a novel method for assessing the compactness of local subnetworks by measuring the number of three node cliques. The present method detects each optimal cluster by growing a seed and maximizing the compactness function. To demonstrate the efficacy of the new proposed method, we evaluate its performance using five PPI networks on three reference sets of yeast protein complexes with five different measurements and compare the performance of the proposed method with four state-of-the-art methods. The results show that the protein complexes generated by the proposed method are of better quality than those generated by four classic methods. Therefore, the new proposed method is effective and useful for detecting protein complexes in PPI networks.
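As a rough illustration of measuring compactness through three node cliques, the sketch below counts the triangles inside a candidate subnetwork. The abstract does not give the actual compactness function, so `compactness` here (triangles found divided by the number of possible node triples) is a hypothetical stand-in, and the adjacency-map input format is likewise an assumption.

```python
from itertools import combinations


def count_triangles(adj, nodes):
    """Count three node cliques whose vertices all lie in `nodes`."""
    count = 0
    for u, v, w in combinations(sorted(set(nodes)), 3):
        # A triple is a clique only if all three edges are present.
        if v in adj[u] and w in adj[u] and w in adj[v]:
            count += 1
    return count


def compactness(adj, nodes):
    """Hypothetical score: fraction of possible node triples that form triangles."""
    n = len(set(nodes))
    possible = n * (n - 1) * (n - 2) // 6
    return count_triangles(adj, nodes) / possible if possible else 0.0


if __name__ == "__main__":
    # Toy undirected network: triangles A-B-C and A-C-D share the edge A-C.
    adj = {
        "A": {"B", "C", "D"},
        "B": {"A", "C"},
        "C": {"A", "B", "D"},
        "D": {"A", "C"},
    }
    print(count_triangles(adj, adj), compactness(adj, adj))
```

Enumerating all triples is cubic in the subnetwork size, which is fine for small candidate clusters but would need a neighbor-intersection approach on large ones.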
15
Dai HL. Imbalanced Protein Data Classification Using Ensemble FTM-SVM. IEEE Trans Nanobioscience 2015; 14:350-359. [DOI: 10.1109/tnb.2015.2431292] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
16
Wu L, Shen Y, Li M, Wu FX. Network output controllability-based method for drug target identification. IEEE Trans Nanobioscience 2015; 14:184-91. [PMID: 25643411; DOI: 10.1109/tnb.2015.2391175] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
Biomolecules do not perform their functions alone, but interact with one another to form so-called biomolecular networks. It is well known that a complex disease stems from the malfunction of the corresponding biomolecular networks. Therefore, one important task is to identify drug targets from biomolecular networks. In this study, drug target identification is formulated as the problem of finding steering nodes in biomolecular networks, and the concept of network output controllability is applied to this problem. By applying control signals to these steering nodes, the biomolecular networks are expected to transition from one state to another. A graph-theoretic algorithm is proposed to find a minimum set of steering nodes in biomolecular networks, which can be a potential set of drug targets. Results of applying the method to real biomolecular networks show that the identified potential drug targets agree with existing research results. This indicates that the method can generate testable predictions and provide insights into the experimental design of drug discovery.
17
Chen B, Wang J, Li M, Wu FX. Identifying disease genes by integrating multiple data sources. BMC Med Genomics 2014; 7 Suppl 2:S2. [PMID: 25350511; PMCID: PMC4243092; DOI: 10.1186/1755-8794-7-s2-s2] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Multiple types of data are now available for identifying disease genes. These data include gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, gene expression profiles, etc. It is believed that integrating different kinds of biological data is an effective method to identify disease genes. RESULTS: In this paper, we propose a multiple data integration method based on the theory of Markov random field (MRF) and the method of Bayesian analysis for identifying human disease genes. The proposed method is not only flexible in easily incorporating different kinds of data, but also reliable in predicting candidate disease genes. CONCLUSIONS: Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when integrating all those biological data in our experiments.
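For readers unfamiliar with the evaluation named in this abstract, leave-one-out scoring followed by an AUC computation could be sketched as below. The scorer is deliberately trivial and hypothetical: `neighbor_vote` (the fraction of a gene's network neighbors that are known disease genes) merely stands in for the paper's MRF and Bayesian model, which is not reproduced here, and the toy network and labels are invented for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a random positive
    outranks a random negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def neighbor_vote(adj, gene, known):
    """Hypothetical scorer: share of the gene's neighbors that are known disease genes."""
    nbrs = adj[gene]
    return len(nbrs & known) / len(nbrs) if nbrs else 0.0


def leave_one_out(adj, disease_genes, score_fn=neighbor_vote):
    """Score each gene with its own label hidden from the scorer."""
    scores, labels = [], []
    for gene in sorted(adj):
        known = disease_genes - {gene}       # hold out this gene's label
        scores.append(score_fn(adj, gene, known))
        labels.append(1 if gene in disease_genes else 0)
    return scores, labels


if __name__ == "__main__":
    adj = {
        "g1": {"g2", "g3"},
        "g2": {"g1", "g3"},
        "g3": {"g1", "g2", "g4"},
        "g4": {"g3", "g5"},
        "g5": {"g4"},
    }
    scores, labels = leave_one_out(adj, {"g1", "g2"})
    print(round(auc(scores, labels), 3))
```

Passing the scorer as `score_fn` keeps the hold-out loop independent of the model, so a more realistic predictor could be dropped in without changing the evaluation code.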