1
Pan L, Wang H, Yang B, Li W. A protein network refinement method based on module discovery and biological information. BMC Bioinformatics 2024; 25:157. [PMID: 38643108 PMCID: PMC11031909 DOI: 10.1186/s12859-024-05772-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 04/10/2024] [Indexed: 04/22/2024] Open
Abstract
BACKGROUND: Identifying essential proteins helps clarify the minimum requirements for cell survival and development, which in turn supports drug target discovery and disease prevention. Node ranking methods are a common way to identify essential proteins, but the poor data quality of the underlying protein-protein interaction network (PIN) limits their identification accuracy. Researchers have therefore constructed refined networks that take into account certain biological properties of interacting protein pairs to improve the performance of node ranking methods. Studies show that proteins in a complex are more likely to be essential than proteins outside any complex. However, existing PIN refinement methods usually ignore modularity.
METHODS: We propose a network refinement method based on module discovery and biological information. The method first extracts the maximal connected subgraph of the PIN and divides it into modules using the Fast-unfolding algorithm; it then detects critical modules according to the orthologous information, subcellular localization information and topology information within each module; finally, it constructs a more refined network (CM-PIN) from the identified critical modules.
RESULTS: To evaluate the proposed method, we used 12 typical node ranking methods (LAC, DC, DMNC, NC, TP, LID, CC, BC, PR, LR, PeC, WDC) to compare performance on the CM-PIN with that on the S-PIN, D-PIN and RD-PIN. The experimental results showed that the CM-PIN performed best in terms of the number of essential proteins identified, the precision-recall curve, the jackknife method and other criteria, and can help identify essential proteins more accurately.
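The first step named in the METHODS paragraph above, extracting the maximal connected subgraph, can be illustrated with a minimal sketch. It assumes that "maximal connected subgraph" means the largest connected component of an undirected graph. The function name, the toy edge list and the protein labels are hypothetical, and the later steps (Fast-unfolding module discovery and critical-module scoring) are not shown.

```python
from collections import deque


def largest_connected_component(edges):
    """Return the node set of the largest connected component (undirected graph)."""
    # Build an adjacency map from the edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy interactions; the protein names are placeholders.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P3", "P4"), ("P5", "P6")]
core = largest_connected_component(toy_edges)

# Keep only interactions whose endpoints both lie in the largest component.
refined_edges = [(u, v) for u, v in toy_edges if u in core and v in core]
```

In this toy graph the isolated pair P5-P6 is dropped, so any later module discovery would run on a single connected network.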
Affiliation(s)
- Li Pan
  - Hunan Institute of Science and Technology, Yueyang, 414006, China
  - Hunan Engineering Research Center of Multimodal Health Sensing and Intelligent Analysis, Yueyang, 414006, China
- Haoyue Wang
  - Hunan Institute of Science and Technology, Yueyang, 414006, China
- Bo Yang
  - Hunan Institute of Science and Technology, Yueyang, 414006, China
  - Hunan Engineering Research Center of Multimodal Health Sensing and Intelligent Analysis, Yueyang, 414006, China
- Wenbin Li
  - Hunan Institute of Science and Technology, Yueyang, 414006, China
2
Cote AC, Young HE, Huckins LM. Comparison of confound adjustment methods in the construction of gene co-expression networks. Genome Biol 2022; 23:44. [PMID: 35115012 PMCID: PMC8812044 DOI: 10.1186/s13059-022-02606-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 01/03/2022] [Indexed: 11/23/2022] Open
Abstract
Adjustment for confounding sources of expression variation is an important preprocessing step in large gene expression studies, but the effect of confound adjustment on co-expression network analysis has not been well-characterized. Here, we demonstrate that the choice of confound adjustment method can have a considerable effect on the architecture of the resulting co-expression network. We compare standard and alternative confound adjustment methods and provide recommendations for their use in the construction of gene co-expression networks from bulk tissue RNA-seq datasets.
Affiliation(s)
- Alanna C Cote
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Hannah E Young
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Laura M Huckins
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters Department of Veterans Affairs Medical Center, Bronx, NY, 10468, USA
3
Du Q, Campbell MT, Yu H, Liu K, Walia H, Zhang Q, Zhang C. Gene Co-expression Network Analysis and Linking Modules to Phenotyping Response in Plants. Methods Mol Biol 2022; 2539:261-268. [PMID: 35895209 DOI: 10.1007/978-1-0716-2537-8_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Environmental factors, including different stresses, can affect the expression of genes and subsequently the phenotype and development of plants. Since a large number of genes are involved in the response to environmental perturbation, identifying groups of co-expressed genes is meaningful. Gene co-expression network models can be used for the exploration, interpretation, and identification of genes responding to environmental changes. Once a gene co-expression network is constructed, one can determine gene modules and the association of gene modules with the phenotypic response. To link modules to phenotype, one approach is to find the correlated eigengenes of given modules or to integrate all eigengenes in a regularized linear model. This manuscript describes the workflow from construction of the co-expression network, through module discovery and association between modules and phenotypic data, to annotation/visualization.
Affiliation(s)
- Qian Du
  - School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
- Malachy T Campbell
  - Department of Agronomy and Horticulture, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
- Huihui Yu
  - School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
- Kan Liu
  - School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
- Harkamal Walia
  - Department of Agronomy and Horticulture, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
- Qi Zhang
  - Department of Mathematics and Statistics, College of Engineering and Physical Sciences (CEPS), University of New Hampshire, Durham, NH, USA
- Chi Zhang
  - School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, NE, USA
4
Chow J, Jensen M, Amini H, Hormozdiari F, Penn O, Shifman S, Girirajan S, Hormozdiari F. Dissecting the genetic basis of comorbid epilepsy phenotypes in neurodevelopmental disorders. Genome Med 2019; 11:65. [PMID: 31653223 DOI: 10.1186/s13073-019-0678-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/28/2019] [Accepted: 10/15/2019] [Indexed: 12/22/2022] Open
Abstract
Background: Neurodevelopmental disorders (NDDs) such as autism spectrum disorder, intellectual disability, developmental disability, and epilepsy are characterized by abnormal brain development that may affect cognition, learning, behavior, and motor skills. High co-occurrence (comorbidity) of NDDs indicates a shared, underlying biological mechanism. The genetic heterogeneity and overlap observed in NDDs make it difficult to identify the genetic causes of specific clinical symptoms, such as seizures.
Methods: We present a computational method, MAGI-S, to discover modules or groups of highly connected genes that together potentially perform a similar biological function. MAGI-S integrates protein-protein interaction and co-expression networks to form modules centered around the selection of a single “seed” gene, yielding modules consisting of genes that are highly co-expressed with the seed gene. We aim to dissect the epilepsy phenotype from a general NDD phenotype by providing MAGI-S with high confidence NDD seed genes with varying degrees of association with epilepsy, and we assess the enrichment of de novo mutation, NDD-associated genes, and relevant biological function of constructed modules.
Results: The newly identified modules account for the increased rate of de novo non-synonymous mutations in autism, intellectual disability, developmental disability, and epilepsy, and enrichment of copy number variations (CNVs) in developmental disability. We also observed that modules seeded with genes strongly associated with epilepsy tend to have a higher association with epilepsy phenotypes than modules seeded at other neurodevelopmental disorder genes. Modules seeded with genes strongly associated with epilepsy (e.g., SCN1A, GABRA1, and KCNB1) are significantly associated with synaptic transmission, long-term potentiation, and calcium signaling pathways. On the other hand, modules found with seed genes that are not associated or weakly associated with epilepsy are mostly involved with RNA regulation and chromatin remodeling.
Conclusions: In summary, our method identifies modules enriched with de novo non-synonymous mutations and can capture specific networks that underlie the epilepsy phenotype and display distinct enrichment in relevant biological processes. MAGI-S is available at https://github.com/jchow32/magi-s.
5
Xiao Q, Luo J, Liang C, Cai J, Li G, Cao B. CeModule: an integrative framework for discovering regulatory patterns from genomic data in cancer. BMC Bioinformatics 2019; 20:67. [PMID: 30732558 PMCID: PMC6367773 DOI: 10.1186/s12859-019-2654-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/06/2017] [Accepted: 01/24/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Non-coding RNAs (ncRNAs) are emerging as key regulators and play critical roles in tumorigenesis. Recent studies have suggested that long non-coding RNAs (lncRNAs) can interact with microRNAs (miRNAs) and indirectly regulate miRNA targets through competing interactions. Therefore, uncovering the competing endogenous RNA (ceRNA) regulatory mechanism of lncRNAs, miRNAs and mRNAs at the post-transcriptional level will aid in deciphering the underlying pathogenesis of human polygenic diseases and may unveil new diagnostic and therapeutic opportunities. However, the functional roles of the vast majority of cancer-specific ncRNAs and their combinational regulation patterns are still insufficiently understood.
RESULTS: Here we develop an integrative framework called CeModule to discover lncRNA, miRNA and mRNA-associated regulatory modules. We fully utilize the matched expression profiles of lncRNAs, miRNAs and mRNAs and establish a model based on joint orthogonality non-negative matrix factorization for identifying modules. We also incorporate the experimentally verified miRNA-lncRNA interactions, the validated miRNA-mRNA interactions and the weighted gene-gene network into this framework to improve module accuracy through network-based penalties. Sparse regularizations are used to help the model obtain modular sparse solutions. Finally, an iterative multiplicative updating algorithm is adopted to solve the optimization problem.
CONCLUSIONS: We applied CeModule to two cancer datasets, ovarian cancer (OV) and uterine corpus endometrial carcinoma (UCEC), obtained from TCGA. The modular analysis indicated that the identified modules involving lncRNAs, miRNAs and mRNAs are significantly associated and functionally enriched in cancer-related biological processes and pathways, which may provide new insights into the complex regulatory mechanism of human diseases at the system level.
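To give a feel for the "iterative multiplicative updating" idea mentioned in the abstract above, here is a minimal sketch of plain non-negative matrix factorization with the standard Lee-Seung multiplicative updates for the squared Frobenius error. This is a deliberate simplification and not the CeModule model: the joint factorization across data types, the orthogonality constraints, the network-based penalties and the sparse regularizations are all omitted. The function names and the toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xi - yi) ** 2 for xr, yr in zip(x, wh) for xi, yi in zip(xr, yr))


def nmf(x, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random initialisation.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(x, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
        errors.append(frobenius_error(x, w, h))
    return w, h, errors


# Hypothetical toy matrix with two rough blocks (rows and columns are placeholders).
toy_x = [
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 3.0, 4.0],
    [0.0, 1.0, 4.0, 3.0],
]
w, h, errors = nmf(toy_x, rank=2)
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative without any explicit projection step, which is the main appeal of multiplicative updates.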
Affiliation(s)
- Qiu Xiao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, 410081, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Cheng Liang
  - College of Information Science and Engineering, Shandong Normal University, Jinan, 250000, China
- Jie Cai
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Guanghui Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Buwen Cao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China