1
Sebastian S, Roy S, Kalita J. A generic parallel framework for inferring large-scale gene regulatory networks from expression profiles: application to Alzheimer's disease network. Brief Bioinform 2023; 24:6868522. [PMID: 36534961 DOI: 10.1093/bib/bbac482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 09/14/2022] [Accepted: 10/11/2022] [Indexed: 12/23/2022] Open
Abstract
The inference of large-scale gene regulatory networks is essential for understanding comprehensive interactions among genes. Most existing methods are limited to reconstructing networks with a few hundred nodes. Therefore, parallel computing paradigms must be leveraged to construct large networks. We propose a generic parallel framework that enables any existing method, without re-engineering, to infer large networks in parallel while guaranteeing quality output. The framework is tested on 15 inference methods, though it is not limited to them, employing in silico benchmarks and real-world large expression matrices, followed by qualitative and speedup assessment. The framework does not compromise the quality of the base serial inference method. We rank the candidate methods and use the top-performing method to infer an Alzheimer's disease (AD)-affected network from large expression profiles of a triple transgenic mouse model consisting of 45,101 genes. The resultant network is further explored to obtain hub genes that emerge as functionally related to the disease. We partition the network into 41 modules and conduct pathway enrichment analysis, revealing that a good number of participating genes are collectively responsible for several brain disorders, including AD. Finally, we extract the interactions of a few known AD genes and observe that they are periphery genes connected to the network's hub genes. Availability: The R implementation of the framework is downloadable from https://github.com/Netralab/GenericParallelFramework.
Affiliation(s)
- Softya Sebastian
  - Network Reconstruction and Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, 6th Mile, Gangtok, 737102, Sikkim, India
- Swarup Roy
  - Network Reconstruction and Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, 6th Mile, Gangtok, 737102, Sikkim, India
- Jugal Kalita
  - Department of Computer Science, University of Colorado at Colorado Springs, CO 80918, USA
2
Guebila MB, Morgan DC, Glass K, Kuijjer ML, DeMeo DL, Quackenbush J. gpuZoo: Cost-effective estimation of gene regulatory networks using the Graphics Processing Unit. NAR Genom Bioinform 2022; 4:lqac002. [PMID: 35156023 PMCID: PMC8826808 DOI: 10.1093/nargab/lqac002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 12/28/2021] [Accepted: 02/02/2022] [Indexed: 11/14/2022] Open
Abstract
Gene regulatory network inference allows for the modeling of genome-scale regulatory processes that are altered during development, in disease, and in response to perturbations. Our group has developed a collection of tools to model various regulatory processes, including transcriptional (PANDA, SPIDER) and post-transcriptional (PUMA) gene regulation, as well as gene regulation in individual samples (LIONESS). These methods work by postulating a network structure and then optimizing that structure to be consistent with multiple lines of biological evidence through repeated operations on data matrices. Although our methods are widely used, the corresponding computational complexity, and the associated costs and run times, do limit some applications. To improve the cost/time performance of these algorithms, we developed gpuZoo which implements GPU-accelerated calculations, dramatically improving the performance of these algorithms. The runtime of the gpuZoo implementation in MATLAB and Python is up to 61 times faster and 28 times less expensive than multi-core CPU implementation of the same methods. gpuZoo is available in MATLAB through the netZooM package https://github.com/netZoo/netZooM and in Python through the netZooPy package https://github.com/netZoo/netZooPy.
Affiliation(s)
- Marouen Ben Guebila
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Daniel C Morgan
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Kimberly Glass
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Marieke L Kuijjer
  - Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
  - Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
  - Leiden Center for Computational Oncology, Leiden University Medical Center, Leiden, The Netherlands
- Dawn L DeMeo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- John Quackenbush
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
3
Tao W, Radstake TRDJ, Pandit A. RegEnrich gene regulator enrichment analysis reveals a key role of the ETS transcription factor family in interferon signaling. Commun Biol 2022; 5:31. [PMID: 35017649 PMCID: PMC8752721 DOI: 10.1038/s42003-021-02991-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/29/2021] [Accepted: 11/29/2021] [Indexed: 12/13/2022] Open
Abstract
Changes in a few key transcriptional regulators can lead to different biological states. Extracting the key gene regulators governing a biological state allows us to gain mechanistic insights. Most current tools perform pathway/GO enrichment analysis to identify key genes and regulators but tend to overlook the gene/protein regulatory interactions. Here we present RegEnrich, an open-source Bioconductor R package, which combines differential expression analysis, data-driven gene regulatory network inference, enrichment analysis, and gene regulator ranking to identify key regulators using gene/protein expression profiling data. By benchmarking using multiple gene expression datasets of gene silencing studies, we found that RegEnrich using the GSEA method to rank the regulators performed the best. Further, RegEnrich was applied to 21 publicly available datasets on in vitro interferon-stimulation of different cell types. Collectively, RegEnrich can accurately identify key gene regulators from the cells under different biological states, which can be valuable in mechanistically studying cell differentiation, cell response to drug stimulation, disease development, and ultimately drug development.
Affiliation(s)
- Weiyang Tao
  - Center for Translational Immunology, Department of Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Department of Rheumatology and Clinical Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Timothy R D J Radstake
  - Center for Translational Immunology, Department of Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Department of Rheumatology and Clinical Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Aridaman Pandit
  - Center for Translational Immunology, Department of Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Department of Rheumatology and Clinical Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
4
ClustMMRA v2: A Scalable Computational Pipeline for the Identification of MicroRNA Clusters Acting Cooperatively on Tumor Molecular Subgroups. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1385:259-279. [DOI: 10.1007/978-3-031-08356-3_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
5
García-Cortés D, Hernández-Lemus E, Espinal-Enríquez J. Luminal A Breast Cancer Co-expression Network: Structural and Functional Alterations. Front Genet 2021; 12:629475. [PMID: 33959148 PMCID: PMC8096206 DOI: 10.3389/fgene.2021.629475] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/14/2020] [Accepted: 03/17/2021] [Indexed: 12/20/2022] Open
Abstract
Luminal A is the most common breast cancer molecular subtype in women worldwide. These tumors have characteristic yet heterogeneous alterations at the genomic and transcriptomic level. Gene co-expression networks (GCNs) have contributed to a better characterization of the cancerous phenotype. We have previously shown an imbalance in the proportion of intra-chromosomal (cis-) over inter-chromosomal (trans-) interactions when comparing cancer and healthy tissue GCNs. In particular, for breast cancer molecular subtypes (Luminal A included), the majority of high co-expression interactions connect gene pairs on the same chromosome, a phenomenon that we have called loss of trans- co-expression. Although this phenomenon has been described, the functional implication of this specific network topology has not yet been studied. To understand the biological role that communities of co-expressed genes may have, we constructed GCNs for healthy and Luminal A phenotypes. Network modules were obtained based on their connectivity patterns and were classified according to their chromosomal homophily (proportion of cis-/trans- interactions). A functional overrepresentation analysis was performed on communities in both networks to observe the significantly enriched processes for each community. We also investigated possible mechanisms by which the loss of trans- co-expression emerges in the cancer GCN. To this end, we evaluated transcription factor binding sites, CTCF binding sites, differential gene expression and copy number alterations (CNAs) in the cancer GCN. We found that trans- communities in Luminal A present more significantly enriched categories than cis- ones. Processes such as angiogenesis, cell proliferation, or cell adhesion were found in trans- modules. The differential expression analysis showed that the FOXM1, CENPA, and CIITA transcription factors exert a major regulatory role on their communities by regulating expression of their target genes on other chromosomes. Finally, identification of CNAs displayed a high enrichment of deletion peaks in cis- communities. With this approach, we demonstrate that network topology determines, to a certain extent, the function in the Luminal A breast cancer network. Furthermore, several mechanisms seem to be acting together to avoid trans- co-expression. Since this phenomenon has been observed in other cancer tissues, a remaining question is whether the loss of long-distance co-expression is a novel hallmark of cancer.
Affiliation(s)
- Diana García-Cortés
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Jesús Espinal-Enríquez
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
6
Uriarte-Navarrete I, Hernández-Lemus E, de Anda-Jáuregui G. Gene-Microbiome Co-expression Networks in Colon Cancer. Front Genet 2021; 12:617505. [PMID: 33659025 PMCID: PMC7917223 DOI: 10.3389/fgene.2021.617505] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/14/2020] [Accepted: 01/22/2021] [Indexed: 12/27/2022] Open
Abstract
It is known that cancer onset and development arise from complex, multi-factorial phenomena spanning from the molecular, functional, micro-environmental, and cellular up to the tissular and organismal levels. Important advances have been made in the systematic analysis of the molecular dimension (mostly genomic and transcriptomic) within large studies of high-throughput data such as The Cancer Genome Atlas (TCGA) collaboration. However, the role of the microbiome in the induction of biological changes needed to reach these pathological states remains to be explored, largely because of scarce experimental data. In recent work, a non-standard bioinformatics strategy was used to indirectly quantify microbial abundance from TCGA RNA-seq data, allowing the evaluation of the microbiome in well-characterized cancer patients, thus opening the way to studies incorporating the molecular and microbiome dimensions together. In this work, we used such recently described approaches to quantify microbial species alongside gene expression. With this, we reconstructed bipartite networks linking microbial abundance and gene expression in the context of colon cancer, using network reconstruction based on measures from information theory. The rationale is that microbial communities may induce biological changes important for the cancerous state. We analyzed changes in microbiome-gene interactions in the context of early (stages I and II) and late (stages III and IV) colon cancer, studied changes in network descriptors, and identified key discriminating features for early- and late-stage colon cancer. We found that the early-stage bipartite network is associated with the establishment of structural features in the tumor cells, whereas the late-stage network is related to more advanced signaling and metabolic features. This functional divergence thus arises as a consequence of changes in the organization of the corresponding gene-microorganism co-expression networks.
Affiliation(s)
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Guillermo de Anda-Jáuregui
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Conacyt Research Chairs, National Council on Science and Technology, Mexico City, Mexico
7
Abstract
Genomics is both a data- and compute-intensive discipline. The success of genomics depends on an adequate informatics infrastructure that can address growing data demands and enable a diverse range of resource-intensive computational activities. Designing a suitable infrastructure is a challenging task, and its success largely depends on its adoption by users. In this article, we take a user-centric view of genomics, where users are bioinformaticians, computational biologists, and data scientists. We try to take their point of view on how traditional computational activities for genomics are expanding due to data growth, as well as the introduction of big data and cloud technologies. The changing landscape of computational activities and new user requirements will influence the design of future genomics infrastructures.
Affiliation(s)
- Ritesh Krishna
  - IBM Research Europe, The Hartree Centre STFC Laboratory, Warrington WA4 4AD, UK
- Vadim Elisseev
  - IBM Research Europe, The Hartree Centre STFC Laboratory, Warrington WA4 4AD, UK
8
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Laura Scalambra
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Luca Triboli
  - Centre for Integrative Biology (CIBIO), University of Trento, Italy
- Forest Ray
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
- Federico M Giorgi
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
9
Cantini L, Bertoli G, Cava C, Dubois T, Zinovyev A, Caselle M, Castiglioni I, Barillot E, Martignetti L. Identification of microRNA clusters cooperatively acting on epithelial to mesenchymal transition in triple negative breast cancer. Nucleic Acids Res 2019; 47:2205-2215. [PMID: 30657980 PMCID: PMC6412120 DOI: 10.1093/nar/gkz016] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 11/07/2018] [Revised: 12/17/2018] [Accepted: 01/08/2019] [Indexed: 12/19/2022] Open
Abstract
MicroRNAs play important roles in many biological processes. Their aberrant expression can have an oncogenic or tumor-suppressor function, directly participating in carcinogenesis, malignant transformation, invasiveness and metastasis. Indeed, miRNA profiles can not only distinguish between normal and cancerous tissue but can also successfully classify different subtypes of a particular cancer. Here, we focus on a particular class of transcripts encoding polycistronic miRNA genes that yield multiple miRNA components. We describe 'clustered MiRNA Master Regulator Analysis (ClustMMRA)', a fully redesigned release of the MMRA computational pipeline (MiRNA Master Regulator Analysis), developed to search for clustered miRNAs potentially driving cancer molecular subtyping. Genomically clustered miRNAs are frequently co-expressed to target different components of pro-tumorigenic signaling pathways. By applying ClustMMRA to breast cancer patient data, we identified key miRNA clusters driving the phenotype of different tumor subgroups. The pipeline was applied to two independent breast cancer datasets, providing statistically concordant results between the two analyses. In cell lines, we validated miR-199/miR-214 as a novel cluster of miRNAs promoting the triple negative breast cancer (TNBC) phenotype through its control of proliferation and EMT.
Affiliation(s)
- Laura Cantini
  - Institut Curie, 26 rue d'Ulm, F-75005 Paris, France
  - PSL Research University, F-75005 Paris, France
  - Inserm, U900, F-75005 Paris, France
  - Mines Paris Tech, F-77305 cedex Fontainebleau, France
  - Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, 75005 Paris, France
- Gloria Bertoli
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Italy
- Claudia Cava
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Italy
- Thierry Dubois
  - Institut Curie, 26 rue d'Ulm, F-75005 Paris, France
  - PSL Research University, F-75005 Paris, France
  - Institut Curie, PSL Research University, Department of Translational Research, Breast Cancer Biology Group, Paris, France
- Andrei Zinovyev
  - Institut Curie, 26 rue d'Ulm, F-75005 Paris, France
  - PSL Research University, F-75005 Paris, France
  - Inserm, U900, F-75005 Paris, France
  - Mines Paris Tech, F-77305 cedex Fontainebleau, France
- Michele Caselle
  - Department of Physics and INFN, Università degli Studi di Torino, Turin, Italy
- Isabella Castiglioni
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Italy
- Emmanuel Barillot
  - Institut Curie, 26 rue d'Ulm, F-75005 Paris, France
  - PSL Research University, F-75005 Paris, France
  - Inserm, U900, F-75005 Paris, France
  - Mines Paris Tech, F-77305 cedex Fontainebleau, France
- Loredana Martignetti
  - Institut Curie, 26 rue d'Ulm, F-75005 Paris, France
  - PSL Research University, F-75005 Paris, France
  - Inserm, U900, F-75005 Paris, France
  - Mines Paris Tech, F-77305 cedex Fontainebleau, France