1
Cingiz MÖ. k-Strong Inference Algorithm: A Hybrid Information Theory Based Gene Network Inference Algorithm. Mol Biotechnol 2024; 66:3213-3225. [PMID: 37950851 DOI: 10.1007/s12033-023-00929-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 10/05/2023] [Indexed: 11/13/2023]
Abstract
Gene networks allow researchers to understand the underlying mechanisms between diseases and genes while reducing the need for wet lab experiments. Numerous gene network inference (GNI) algorithms have been presented in the literature to infer accurate gene networks. We proposed a hybrid GNI algorithm, the k-Strong Inference Algorithm (ksia), to infer more reliable and robust gene networks from omics datasets. To increase reliability, ksia integrates Pearson correlation coefficient (PCC) and Spearman rank correlation coefficient (SCC) scores when determining mutual information scores between molecules, which increases the diversity of relation predictions. To infer a more robust gene network, ksia applies three different elimination steps to remove redundant and spurious relations between genes. The performance of ksia was evaluated on a microbe microarrays database in an overlap analysis with other GNI algorithms, namely ARACNE, C3NET, CLR, and MRNET. Ksia inferred fewer relations because of its strict elimination steps. However, ksia generally performed better on Escherichia coli (E. coli) and Saccharomyces cerevisiae (yeast) gene expression datasets according to F-measure and precision values. The integration of association estimator scores and three elimination stages slightly increases the performance of ksia-based gene networks. Users can access the ksia R package and its user manual via https://github.com/ozgurcingiz/ksia .
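As a rough illustration of how two association estimators might be combined into one mutual-information-style score, the sketch below computes PCC and SCC for a gene pair and converts the stronger of the two into a Gaussian mutual information estimate, MI = -0.5 ln(1 - rho^2). The combination rule (taking the larger absolute correlation) and all function names are assumptions made for this example; the abstract does not state how ksia integrates the scores, and the package itself is written in R.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def gaussian_mi(rho):
    """Mutual information of a bivariate Gaussian with correlation rho."""
    rho_sq = min(rho * rho, 1 - 1e-12)  # cap to keep the logarithm finite
    return -0.5 * math.log(1 - rho_sq)


def hybrid_score(x, y):
    """Hypothetical hybrid: use the stronger of |PCC| and |SCC|."""
    return gaussian_mi(max(abs(pearson(x, y)), abs(spearman(x, y))))


# Usage: a monotone but non-linear relation, where SCC exceeds PCC.
gene_a = [1, 2, 3, 4, 5]
gene_b = [1, 4, 9, 16, 25]
print(pearson(gene_a, gene_b), spearman(gene_a, gene_b), hybrid_score(gene_a, gene_b))
```

Taking the maximum lets a monotone non-linear relation that PCC underestimates still receive a high score, which is one plausible reading of "increasing the diversity of relation predictions".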
Affiliation(s)
- Mustafa Özgür Cingiz
- Computer Engineering Department, Faculty of Engineering and Natural Sciences, Bursa Technical University, Mimar Sinan Campus, Yildirim, 16310, Bursa, Turkey.
2
Pfau SJ, Langen UH, Fisher TM, Prakash I, Nagpurwala F, Lozoya RA, Lee WCA, Wu Z, Gu C. Characteristics of blood-brain barrier heterogeneity between brain regions revealed by profiling vascular and perivascular cells. Nat Neurosci 2024; 27:1892-1903. [PMID: 39210068 PMCID: PMC11452347 DOI: 10.1038/s41593-024-01743-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 07/30/2024] [Indexed: 09/04/2024]
Abstract
The blood-brain barrier (BBB) protects the brain and maintains neuronal homeostasis. BBB properties can vary between brain regions to support regional functions, yet how BBB heterogeneity occurs is poorly understood. Here, we used single-cell and spatial transcriptomics to compare the mouse median eminence, one of the circumventricular organs that has naturally leaky blood vessels, with the cortex. We identified hundreds of molecular differences in endothelial cells (ECs) and perivascular cells, including astrocytes, pericytes and fibroblasts. Using electron microscopy and an aqueous-based tissue-clearing method, we revealed distinct anatomical specializations and interaction patterns of ECs and perivascular cells in these regions. Finally, we identified candidate regionally enriched EC-perivascular cell ligand-receptor pairs. Our results indicate that both molecular specializations in ECs and unique EC-perivascular cell interactions contribute to BBB functional heterogeneity. This platform can be used to investigate BBB heterogeneity in other regions and may facilitate the development of central nervous system region-specific therapeutics.
Affiliation(s)
- Sarah J Pfau
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Urs H Langen
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Roche Pharma Research and Early Development, Neuroscience and Rare Diseases Discovery and Translational Area, Roche Innovation Center Basel, Basel, Switzerland
- Theodore M Fisher
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Indumathi Prakash
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Faheem Nagpurwala
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Ricardo A Lozoya
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Wei-Chung Allen Lee
- F.M. Kirby Neurobiology Center, Boston Children's Hospital and Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Zhuhao Wu
- Helen and Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
- Chenghua Gu
- Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
3
Segura-Ortiz A, García-Nieto J, Aldana-Montes JF, Navas-Delgado I. Multi-objective context-guided consensus of a massive array of techniques for the inference of Gene Regulatory Networks. Comput Biol Med 2024; 179:108850. [PMID: 39013340 DOI: 10.1016/j.compbiomed.2024.108850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 07/03/2024] [Accepted: 07/03/2024] [Indexed: 07/18/2024]
Abstract
BACKGROUND AND OBJECTIVE Gene Regulatory Network (GRN) inference is a fundamental task in biology and medicine, as it enables a deeper understanding of the intricate mechanisms of gene expression present in organisms. This bioinformatics problem has been addressed in the literature through multiple computational approaches. Techniques developed for inferring from expression data have employed Bayesian networks, ordinary differential equations (ODEs), machine learning, information theory measures and neural networks, among others. The diversity of implementations and their respective customization have led to the emergence of many tools and multiple specialized domains derived from them, understood as subsets of networks with specific characteristics that are challenging to detect a priori. This specialization has introduced significant uncertainty when choosing the most appropriate technique for a particular dataset. This proposal, named MO-GENECI, builds upon the basic idea of the previous proposal GENECI and optimizes consensus among different inference techniques, through a carefully refined multi-objective evolutionary algorithm guided by various objective functions, linked to the biological context at hand. METHODS MO-GENECI has been tested on an extensive and diverse academic benchmark of 106 gene regulatory networks from multiple sources and sizes. The evaluation of MO-GENECI compared its performance to individual techniques using key metrics (AUROC and AUPR) for gene regulatory network inference. Friedman's statistical ranking provided an ordered classification, followed by non-parametric Holm tests to determine statistical significance. RESULTS MO-GENECI's Pareto front approximation facilitates easy selection of an appropriate solution based on generic input data characteristics. 
The best solution consistently emerged as the winner in all statistical tests, and in many cases, the median precision solution showed no statistically significant difference compared to the winner. CONCLUSIONS MO-GENECI has not only achieved more accurate results than individual techniques, but has also overcome the uncertainty associated with the initial choice thanks to its flexibility and adaptability. It is shown to intelligently select the most suitable techniques for each case. The source code is hosted in a public repository at GitHub under MIT license: https://github.com/AdrianSeguraOrtiz/MO-GENECI. Moreover, to facilitate its installation and use, the software associated with this implementation has been encapsulated in a Python package available at PyPI: https://pypi.org/project/geneci/.
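The consensus idea described in this abstract can be pictured with a minimal sketch: each inference technique returns confidence scores for candidate regulatory edges, and a weight vector determines how much each technique contributes to the combined network. The function name, the dictionary representation and the fixed weights below are assumptions made for illustration only. They do not reproduce the API of the `geneci` package, and in MO-GENECI the weights are searched by a multi-objective evolutionary algorithm rather than set by hand.

```python
def weighted_consensus(edge_lists, weights):
    """Combine per-technique edge confidences into one consensus network.

    edge_lists: list of dicts mapping (regulator, target) -> confidence in [0, 1]
    weights:    one non-negative weight per technique (normalised internally)
    An edge missing from a technique's output counts as confidence 0.
    """
    if len(edge_lists) != len(weights):
        raise ValueError("need exactly one weight per technique")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    consensus = {}
    for edges, weight in zip(edge_lists, weights):
        share = weight / total
        for edge, confidence in edges.items():
            consensus[edge] = consensus.get(edge, 0.0) + share * confidence
    return consensus


# Usage with two hypothetical techniques and hand-picked weights.
technique_a = {("g1", "g2"): 0.8, ("g1", "g3"): 0.2}
technique_b = {("g1", "g2"): 0.4}
network = weighted_consensus([technique_a, technique_b], [3, 1])
for edge, score in sorted(network.items(), key=lambda item: -item[1]):
    print(edge, round(score, 3))
```

A candidate weight vector of this kind would then be scored against the chosen objective functions, which is the part the evolutionary search handles.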
Affiliation(s)
- Adrián Segura-Ortiz
- Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain.
- José García-Nieto
- Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
- José F Aldana-Montes
- Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
- Ismael Navas-Delgado
- Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
4
del Giudice G, Serra A, Pavel A, Torres Maia M, Saarimäki LA, Fratello M, Federico A, Alenius H, Fadeel B, Greco D. A Network Toxicology Approach for Mechanistic Modelling of Nanomaterial Hazard and Adverse Outcomes. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2400389. [PMID: 38923832 PMCID: PMC11348149 DOI: 10.1002/advs.202400389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 05/10/2024] [Indexed: 06/28/2024]
Abstract
Hazard assessment is the first step in evaluating the potential adverse effects of chemicals. Traditionally, toxicological assessment has focused on the exposure, overlooking the impact of the exposed system on the observed toxicity. However, systems toxicology emphasizes how system properties significantly contribute to the observed response. Hence, systems theory states that interactions store more information than individual elements, leading to the adoption of network based models to represent complex systems in many fields of life sciences. Here, they develop a network-based approach to characterize toxicological responses in the context of a biological system, inferring biological system specific networks. They directly link molecular alterations to the adverse outcome pathway (AOP) framework, establishing direct connections between omics data and toxicologically relevant phenotypic events. They apply this framework to a dataset including 31 engineered nanomaterials with different physicochemical properties in two different in vitro and one in vivo models and demonstrate how the biological system is the driving force of the observed response. This work highlights the potential of network-based methods to significantly improve their understanding of toxicological mechanisms from a systems biology perspective and provides relevant considerations and future data-driven approaches for the hazard assessment of nanomaterials and other advanced materials.
Affiliation(s)
- Giusy del Giudice
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00790, Finland
- Angela Serra
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00790, Finland
- Tampere Institute for Advanced Study, Tampere University, Tampere 33100, Finland
- Alisa Pavel
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Marcella Torres Maia
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Laura Aliisa Saarimäki
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00790, Finland
- Michele Fratello
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Antonio Federico
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00790, Finland
- Tampere Institute for Advanced Study, Tampere University, Tampere 33100, Finland
- Harri Alenius
- Human Microbiome Research Program (HUMI), University of Helsinki, Helsinki 00014, Finland
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm 171 77, Sweden
- Bengt Fadeel
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm 171 77, Sweden
- Dario Greco
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00790, Finland
- Tampere Institute for Advanced Study, Tampere University, Tampere 33100, Finland
- Institute of Biotechnology, University of Helsinki, Helsinki 00790, Finland
5
Chee FT, Harun S, Mohd Daud K, Sulaiman S, Nor Muhammad NA. Exploring gene regulation and biological processes in insects: Insights from omics data using gene regulatory network models. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2024; 189:1-12. [PMID: 38604435 DOI: 10.1016/j.pbiomolbio.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/18/2023] [Accepted: 04/03/2024] [Indexed: 04/13/2024]
Abstract
A gene regulatory network (GRN) comprises complicated yet intertwined gene-regulator relationships. Understanding GRN dynamics will unravel the complexity behind observed gene expression. Insect gene regulation is often complicated by complex life cycles and diverse ecological adaptations. The main aim of this review is to provide an update on the current mathematical modelling methods of GRNs used to explain insect science. Several popular GRN architecture models are discussed, together with examples of applications in insect science. In the last part of this review, the models are compared on different aspects, including network scalability, computational complexity, robustness to noise and biological relevance.
Affiliation(s)
- Fong Ting Chee
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
- Sarahani Harun
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
- Kauthar Mohd Daud
- Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Selangor, Malaysia
- Suhaila Sulaiman
- FGV R&D Sdn Bhd, FGV Innovation Center, PT23417 Lengkuk Teknologi, Bandar Baru Enstek, 71760 Nilai, Negeri Sembilan, Malaysia
- Nor Azlan Nor Muhammad
- Institute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia.
6
Federico A, Möbus L, Al-Abdulraheem Z, Pavel A, Fortino V, Del Giudice G, Alenius H, Fyhrquist N, Greco D. Integrative network analysis suggests prioritised drugs for atopic dermatitis. J Transl Med 2024; 22:64. [PMID: 38229087 PMCID: PMC10792836 DOI: 10.1186/s12967-024-04879-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 01/10/2024] [Indexed: 01/18/2024]
Abstract
BACKGROUND Atopic dermatitis (AD) is a prevalent chronic inflammatory skin disease whose pathophysiology involves the interplay between genetic and environmental factors, ultimately leading to dysfunction of the epidermis. While several treatments are effective in symptom management, many existing therapies offer only temporary relief and often come with side effects. For this reason, the formulation of an effective therapeutic plan is challenging and there is a need for more effective and targeted treatments that address the root causes of the condition. Here, we hypothesise that modelling the complexity of the molecular buildup of atopic dermatitis can be a concrete means to drive drug discovery. METHODS We preprocessed, harmonised and integrated publicly available transcriptomics datasets of lesional and non-lesional skin from AD patients. We inferred co-expression network models of both AD lesional and non-lesional skin and exploited their interactional properties by integrating them with a priori knowledge in order to extrapolate a robust AD disease module. Pharmacophore-based virtual screening was then utilised to build a tailored library of compounds potentially active for AD. RESULTS In this study, we identified a core disease module for AD, pinpointing known and unknown molecular determinants underlying the skin lesions. We identified skin- and immune-cell type signatures expressed by the disease module, and characterised the impaired cellular functions underlying the complex phenotype of atopic dermatitis. Therefore, by investigating the connectivity of genes belonging to the AD module, we prioritised novel putative biomarkers of the disease. Finally, we defined a tailored compound library by characterising the therapeutic potential of drugs targeting genes within the disease module to facilitate and tailor future drug discovery efforts towards novel pharmacological strategies for AD.
CONCLUSIONS Overall, our study reveals a core disease module providing unprecedented information about genetic, transcriptional and pharmacological relationships that foster drug discovery in atopic dermatitis.
Affiliation(s)
- Antonio Federico
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland
- Tampere Institute for Advanced Study, Tampere University, 33100, Tampere, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, 00100, Helsinki, Finland
- Lena Möbus
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland
- Zeyad Al-Abdulraheem
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland
- Alisa Pavel
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland
- Vittorio Fortino
- Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Giusy Del Giudice
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, 00100, Helsinki, Finland
- Harri Alenius
- Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
- Institute of Environmental Medicine (IMM), Karolinska Institutet, Stockholm, Sweden
- Nanna Fyhrquist
- Faculty of Medicine, Human Microbiome Research Program, University of Helsinki, Helsinki, Finland
- Institute of Environmental Medicine (IMM), Karolinska Institutet, Stockholm, Sweden
- Dario Greco
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, 33100, Tampere, Finland.
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, 00100, Helsinki, Finland.
- Institute of Biotechnology, University of Helsinki, 00100, Helsinki, Finland.
7
Piras IS, Braccagni G, Huentelman MJ, Bortolato M. A preliminary transcriptomic analysis of the orbitofrontal cortex of antisocial individuals. CNS Neurosci Ther 2023; 29:3173-3182. [PMID: 37269073 PMCID: PMC10580340 DOI: 10.1111/cns.14283] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/05/2023] [Revised: 05/18/2023] [Accepted: 05/19/2023] [Indexed: 06/04/2023]
Abstract
AIMS Antisocial personality disorder (ASPD) and conduct disorder (CD) are characterized by a persistent pattern of violations of societal norms and others' rights. Ample evidence shows that orbitofrontal cortex (OFC) alterations contribute to the pathophysiology of these disorders, yet the underlying molecular mechanisms remain elusive. To address this knowledge gap, we performed the first-ever RNA sequencing study of postmortem OFC samples from subjects with a lifetime diagnosis of ASPD and/or CD. METHODS The transcriptomic profiles of OFC samples from subjects with ASPD and/or CD were compared to those of unaffected age-matched controls (n = 9/group). RESULTS The OFC of ASPD/CD-affected subjects displayed significant differences in the expression of 328 genes. Further gene-ontology analyses revealed an extensive downregulation of excitatory neuron transcripts and upregulation of astrocyte transcripts. These alterations were paralleled by significant modifications in synaptic regulation and glutamatergic neurotransmission pathways. CONCLUSION These preliminary findings suggest that ASPD and CD feature a complex array of functional deficits in the pyramidal neurons and astrocytes of the OFC. In turn, these aberrations may contribute to the reduced OFC connectivity observed in antisocial subjects. Future analyses on larger cohorts are needed to validate these results.
Affiliation(s)
- Ignazio S. Piras
- Neurogenomics Division, Translational Genomics Research Institute (TGen), Phoenix, Arizona, USA
- Giulia Braccagni
- Department of Pharmacology and Toxicology, College of Pharmacy, University of Utah, Salt Lake City, Utah, USA
- Matthew J. Huentelman
- Neurogenomics Division, Translational Genomics Research Institute (TGen), Phoenix, Arizona, USA
- Marco Bortolato
- Department of Pharmacology and Toxicology, College of Pharmacy, University of Utah, Salt Lake City, Utah, USA
8
Saarimäki LA, Morikka J, Pavel A, Korpilähde S, del Giudice G, Federico A, Fratello M, Serra A, Greco D. Toxicogenomics Data for Chemical Safety Assessment and Development of New Approach Methodologies: An Adverse Outcome Pathway-Based Approach. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2203984. [PMID: 36479815 PMCID: PMC9839874 DOI: 10.1002/advs.202203984] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/11/2022] [Revised: 11/09/2022] [Indexed: 05/25/2023]
Abstract
Mechanistic toxicology provides a powerful approach to inform on the safety of chemicals and the development of safe-by-design compounds. Although toxicogenomics supports mechanistic evaluation of chemical exposures, its implementation into the regulatory framework is hindered by uncertainties in the analysis and interpretation of such data. The use of mechanistic evidence through the adverse outcome pathway (AOP) concept is promoted for the development of new approach methodologies (NAMs) that can reduce animal experimentation. However, to unleash the full potential of AOPs and build confidence into toxicogenomics, robust associations between AOPs and patterns of molecular alteration need to be established. Systematic curation of molecular events to AOPs will create the much-needed link between toxicogenomics and systemic mechanisms depicted by the AOPs. This, in turn, will introduce novel ways of benefitting from the AOPs, including predictive models and targeted assays, while also reducing the need for multiple testing strategies. Hence, a multi-step strategy to annotate AOPs is developed, and the resulting associations are applied to successfully highlight relevant adverse outcomes for chemical exposures with strong in vitro and in vivo convergence, supporting chemical grouping and other data-driven approaches. Finally, a panel of AOP-derived in vitro biomarkers for pulmonary fibrosis (PF) is identified and experimentally validated.
Affiliation(s)
- Laura Aliisa Saarimäki
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Jack Morikka
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Alisa Pavel
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Seela Korpilähde
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Giusy del Giudice
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Antonio Federico
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Michele Fratello
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Angela Serra
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Tampere Institute for Advanced Study, Tampere University, Kalevantie 4, Tampere 33100, Finland
- Dario Greco
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere 33520, Finland
- Institute of Biotechnology, University of Helsinki, P.O. Box 56, Helsinki, Uusimaa 00014, Finland
9
Federico A, Pavel A, Möbus L, McKean D, Del Giudice G, Fortino V, Niehues H, Rastrick J, Eyerich K, Eyerich S, van den Bogaard E, Smith C, Weidinger S, de Rinaldis E, Greco D. The integration of large-scale public data and network analysis uncovers molecular characteristics of psoriasis. Hum Genomics 2022; 16:62. [PMID: 36437479 PMCID: PMC9703794 DOI: 10.1186/s40246-022-00431-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Accepted: 11/07/2022] [Indexed: 11/29/2022]
Abstract
In recent years, a growing interest in the characterization of the molecular basis of psoriasis has been observed. However, despite the availability of a large amount of molecular data, many pathogenic mechanisms of psoriasis are still poorly understood. In this study, we performed an integrated analysis of 23 public transcriptomic datasets encompassing both lesional and uninvolved skin samples from psoriasis patients. We defined comprehensive gene co-expression network models of psoriatic lesions and uninvolved skin. Moreover, we curated and exploited a wide range of functional information from multiple public sources in order to systematically annotate the inferred networks. The integrated analysis of transcriptomics data and co-expression networks highlighted genes that are frequently dysregulated and show aberrant patterns of connectivity in the psoriatic lesion compared with the unaffected skin. Our approach also allowed us to identify plausible, previously unknown actors in the expression of the psoriasis phenotype. Finally, we characterized communities of co-expressed genes associated with relevant molecular functions and expression signatures of specific immune cell types associated with the psoriasis lesion. Overall, integrating experimentally driven results with curated functional information from public repositories represents an efficient approach to empower knowledge generation about psoriasis and may be applicable to other complex diseases.
Affiliation(s)
- Antonio Federico
- Faculty of Medicine and Health Technology, Tampere University, Kauppi Campus, Arvo Ylpön Katu 34, 33520, Tampere, Finland
- BioMeditech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Tampere Institute for Advanced Studies, Tampere University, Tampere, Finland
| | - Alisa Pavel
- Faculty of Medicine and Health Technology, Tampere University, Kauppi Campus, Arvo Ylpön Katu 34, 33520, Tampere, Finland
- BioMeditech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Lena Möbus
- Faculty of Medicine and Health Technology, Tampere University, Kauppi Campus, Arvo Ylpön Katu 34, 33520, Tampere, Finland
- BioMeditech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - David McKean
- Sanofi Immunology and Inflammation Research Therapeutic Area, Precision Immunology Cluster, Cambridge, Massachusetts, USA
| | - Giusy Del Giudice
- Faculty of Medicine and Health Technology, Tampere University, Kauppi Campus, Arvo Ylpön Katu 34, 33520, Tampere, Finland
- BioMeditech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Vittorio Fortino
- Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
| | - Hanna Niehues
- Department of Dermatology, Radboud University Medical Center, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
| | - Joe Rastrick
- Immunology Therapeutic Area, UCB Pharma, Slough, UK
| | - Kilian Eyerich
- Department of Dermatology and Allergy, Technical University of Munich, Munich, Germany
- Unit of Dermatology and Venerology, Department of Medicine, Karolinska Institute, Karolinska University Hospital, Stockholm, Sweden
| | - Stefanie Eyerich
- ZAUM-Center of Allergy and Environment, Technical University and Helmholtz Center Munich, Munich, Germany
| | - Ellen van den Bogaard
- Department of Dermatology, Radboud University Medical Center, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
| | - Catherine Smith
- St. John's Institute of Dermatology, King's College London, London, UK
| | | | - Emanuele de Rinaldis
- Sanofi Immunology and Inflammation Research Therapeutic Area, Precision Immunology Cluster, Cambridge, Massachusetts, USA
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, Kauppi Campus, Arvo Ylpön Katu 34, 33520, Tampere, Finland.
- BioMeditech Institute, Tampere University, Tampere, Finland.
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland.
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
10
Seçilmiş D, Hillerton T, Tjärnberg A, Nelander S, Nordling TEM, Sonnhammer ELL. Knowledge of the perturbation design is essential for accurate gene regulatory network inference. Sci Rep 2022; 12:16531. [PMID: 36192495 PMCID: PMC9529923 DOI: 10.1038/s41598-022-19005-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/12/2021] [Accepted: 08/23/2022] [Indexed: 11/08/2022]
Abstract
The gene regulatory network (GRN) of a cell executes genetic programs in response to environmental and internal cues. Two distinct classes of methods are used to infer regulatory interactions from gene expression: those that only use observed changes in gene expression, and those that use both the observed changes and the perturbation design, i.e. the targets used to cause the changes in gene expression. Considering that the GRN by definition converts input cues to changes in gene expression, it may be conjectured that the latter methods would yield more accurate inferences but this has not previously been investigated. To address this question, we evaluated a number of popular GRN inference methods that either use the perturbation design or not. For the evaluation we used targeted perturbation knockdown gene expression datasets with varying noise levels generated by two different packages, GeneNetWeaver and GeneSpider. The accuracy was evaluated on each dataset using a variety of measures. The results show that on all datasets, methods using the perturbation design matrix consistently and significantly outperform methods not using it. This was also found to be the case on a smaller experimental dataset from E. coli. Targeted gene perturbations combined with inference methods that use the perturbation design are indispensable for accurate GRN inference.
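As a rough illustration of why the perturbation design can help, the sketch below assumes a linear steady-state model A Y + P = 0, where A is the network matrix, Y the expression responses, and P the perturbation design. The model, the function name `infer_with_design`, and the toy network are assumptions made for illustration and are not taken from the paper. Under that model, a least-squares estimate of the network is A = -P Y⁺, with Y⁺ the pseudoinverse of Y.

```python
import numpy as np


def infer_with_design(Y: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Least-squares estimate of A under the assumed model A @ Y + P = 0."""
    return -P @ np.linalg.pinv(Y)


# Hypothetical 5-gene network: self-degradation on the diagonal plus three links.
n_genes = 5
A_true = -np.eye(n_genes)
A_true[0, 1] = 0.5    # gene 1 activates gene 0
A_true[2, 0] = -0.4   # gene 0 represses gene 2
A_true[4, 3] = 0.3    # gene 3 activates gene 4

# Design matrix: each experiment perturbs exactly one gene.
P = np.eye(n_genes)

# Noise-free steady-state response implied by the assumed model.
Y = -np.linalg.inv(A_true) @ P

A_hat = infer_with_design(Y, P)
```

With noise-free data and every gene perturbed once, the estimate reproduces the toy network. A method that sees only Y has no access to P and has to fall back on associations between expression profiles instead.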
Affiliation(s)
- Deniz Seçilmiş
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121, Solna, Sweden
| | - Thomas Hillerton
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121, Solna, Sweden
| | - Andreas Tjärnberg
- Center for Developmental Genetics, New York University, New York, USA
| | - Sven Nelander
- Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, 75185, Uppsala, Sweden
| | - Torbjörn E M Nordling
- Department of Mechanical Engineering, National Cheng Kung University, Tainan, 701, Taiwan, ROC
- Department of Applied Physics and Electronics, Umeå University, 90187, Umeå, Sweden
| | - Erik L L Sonnhammer
- Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121, Solna, Sweden.
11
Rapier-Sharman N, Clancy J, Pickett BE. Joint Secondary Transcriptomic Analysis of Non-Hodgkin's B-Cell Lymphomas Predicts Reliance on Pathways Associated with the Extracellular Matrix and Robust Diagnostic Biomarkers. Journal of Bioinformatics and Systems Biology: Open Access 2022; 5:119-135. [PMID: 36873459 PMCID: PMC9980876 DOI: 10.26502/jbsb.5107040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
Approximately 450,000 cases of Non-Hodgkin's lymphoma are annually diagnosed worldwide, resulting in ~240,000 deaths. An augmented understanding of the common mechanisms of pathology among larger numbers of B-cell Non-Hodgkin's Lymphoma (BCNHL) patients is sorely needed. We consequently performed a large joint secondary transcriptomic analysis of the available BCNHL RNA-sequencing projects from GEO, consisting of 322 relevant samples across ten distinct public studies, to find common underlying mechanisms and biomarkers across multiple BCNHL subtypes and patient subpopulations; limitations may include lack of diversity in certain ethnicities and age groups and limited clinical subtype diversity due to sample availability. We found ~10,400 significant differentially expressed genes (FDR-adjusted p-value < 0.05) and 33 significantly modulated pathways (Bonferroni-adjusted p-value < 0.05) when comparing BCNHL samples to non-diseased B-cell samples. Our findings included a significant class of proteoglycans not previously associated with lymphomas as well as significant modulation of genes that code for extracellular matrix-associated proteins. Our drug repurposing analysis predicted new candidates for repurposed drugs including ocriplasmin and collagenase. We also used a machine learning approach to identify robust BCNHL biomarkers that include YES1, FERMT2, and FAM98B, which have not previously been associated with BCNHL in the literature, but together provide ~99.9% combined specificity and sensitivity for differentiating lymphoma cells from healthy B-cells based on measurement of transcript expression levels in B-cells. This analysis supports past findings and validates existing knowledge while providing novel insights into the inner workings and mechanisms of transformed B-cell lymphomas that could give rise to improved diagnostics and/or therapeutics.
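The two multiple-testing corrections named in the abstract can be sketched as follows. This is a minimal illustration of the standard Bonferroni and Benjamini-Hochberg (FDR) adjustments, not the authors' pipeline; the function names and the example p-values are hypothetical.

```python
import numpy as np


def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * p.size, 1.0)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR adjustment, returned in the original input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted


example_p = [0.01, 0.04, 0.03, 0.005]   # made-up p-values
fdr = benjamini_hochberg(example_p)
bonf = bonferroni(example_p)
significant_fdr = fdr < 0.05
```

Bonferroni controls the family-wise error rate and is the stricter of the two, which is consistent with using it for the smaller set of pathway tests and FDR for the much larger set of gene-level tests.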
Affiliation(s)
- Naomi Rapier-Sharman
- Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
| | - Jeffrey Clancy
- Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
| | - Brett E Pickett
- Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
12
Lee AJ, Reiter T, Doing G, Oh J, Hogan DA, Greene CS. Using genome-wide expression compendia to study microorganisms. Comput Struct Biotechnol J 2022; 20:4315-4324. [PMID: 36016717 PMCID: PMC9396250 DOI: 10.1016/j.csbj.2022.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/07/2022] [Accepted: 08/07/2022] [Indexed: 11/30/2022]
Abstract
A gene expression compendium is a heterogeneous collection of gene expression experiments assembled from data collected for diverse purposes. The widely varied experimental conditions and genetic backgrounds across samples create a tremendous opportunity for gaining a systems-level understanding of the transcriptional responses that influence phenotypes. Variety in experimental design is particularly important for studying microbes, where the transcriptional responses integrate many signals and demonstrate plasticity across strains, including responses to which nutrients are available and which microbes are present. Advances in high-throughput measurement technology have made it feasible to construct compendia for many microbes. In this review we discuss how these compendia are constructed and analyzed to reveal transcriptional patterns.
Affiliation(s)
- Alexandra J. Lee
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA, USA
| | - Taylor Reiter
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO, USA
| | - Georgia Doing
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Julia Oh
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Deborah A. Hogan
- Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth, Hanover, NH, USA
| | - Casey S. Greene
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO, USA
13
Seçilmiş D, Nelander S, Sonnhammer ELL. Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference. Front Genet 2022; 13:855770. [PMID: 35923701 PMCID: PMC9340570 DOI: 10.3389/fgene.2022.855770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 05/30/2022] [Indexed: 11/25/2022]
Abstract
Accurate inference of gene regulatory networks (GRNs) is important to unravel unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false-positives and -negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. Lacking criteria for sparsity selection, a simplistic solution is to pick the GRN that has a certain number of links per gene, which is guessed to be reasonable. However, this does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a “GRN information criterion” (GRNIC) that is inspired by two commonly used model selection criteria, Akaike and Bayesian Information Criterion (AIC and BIC) but adapted to GRN inference. The results show that the approach can, in most cases, find the GRN whose sparsity is close to the true sparsity and close to as accurate as possible with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/.
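The idea of choosing a sparsity level with an information criterion can be sketched with a plain BIC. This is not the GRNIC used by SPA, whose exact form is not given here. The steady-state model A Y + P = 0, the magnitude-based thresholding, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def sparsify(A_dense, n_links):
    """Keep the n_links largest-magnitude entries of A_dense and zero the rest."""
    flat = np.abs(A_dense).ravel()
    keep = np.argsort(flat)[::-1][:n_links]
    mask = np.zeros(flat.size, dtype=bool)
    mask[keep] = True
    return np.where(mask.reshape(A_dense.shape), A_dense, 0.0)


def bic_score(A_sparse, Y, P):
    """Generic BIC: n * log(RSS / n) + k * log(n), with k the number of links."""
    n_obs = Y.size
    n_links = int(np.count_nonzero(A_sparse))
    rss = float(np.sum((A_sparse @ Y + P) ** 2))
    return n_obs * np.log(rss / n_obs + 1e-12) + n_links * np.log(n_obs)


def select_sparsity(A_dense, Y, P, candidates):
    """Score every candidate link count and return the one with the lowest BIC."""
    scores = {k: bic_score(sparsify(A_dense, k), Y, P) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: 5 genes, each perturbed twice, with small noise.
rng = np.random.default_rng(1)
n_genes = 5
A_true = -np.eye(n_genes)
A_true[0, 1], A_true[2, 0], A_true[4, 3] = 0.5, -0.4, 0.3
P = np.hstack([np.eye(n_genes), np.eye(n_genes)])
Y = -np.linalg.inv(A_true) @ P + 0.05 * rng.standard_normal(P.shape)

A_dense = -P @ np.linalg.pinv(Y)   # dense least-squares estimate
candidates = list(range(5, n_genes * n_genes + 1))
best_k, scores = select_sparsity(A_dense, Y, P, candidates)
```

The penalty term grows with every retained link, so a candidate is preferred only if the extra links reduce the residual enough to pay for themselves. That trade-off is what replaces the guessed "links per gene" cutoff described in the abstract.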
Affiliation(s)
- Deniz Seçilmiş
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden
| | - Sven Nelander
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Erik L. L. Sonnhammer
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden
- *Correspondence: Erik L. L. Sonnhammer
14
Obayashi T, Hibara H, Kagaya Y, Aoki Y, Kinoshita K. ATTED-II v11: A Plant Gene Coexpression Database Using a Sample Balancing Technique by Subagging of Principal Components. Plant & Cell Physiology 2022; 63:869-881. [PMID: 35353884 DOI: 10.1093/pcp/pcac041] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 09/16/2021] [Revised: 02/06/2022] [Accepted: 03/29/2022] [Indexed: 05/25/2023]
Abstract
ATTED-II (https://atted.jp) is a gene coexpression database for nine plant species based on publicly available RNAseq and microarray data. One of the challenges in constructing condition-independent coexpression data based on publicly available gene expression data is managing the inherent sampling bias. Here, we report ATTED-II version 11, wherein we adopted a coexpression calculation methodology to balance the samples using principal component analysis and ensemble calculation. This approach has two advantages. First, omitting principal components with low contribution rates reduces the main contributors of noise. Second, balancing large differences in contribution rates enables considering various sample conditions entirely. In addition, based on RNAseq- and microarray-based coexpression data, we provide species-representative, integrated coexpression information to enhance the efficiency of interspecies comparison of the coexpression data. These coexpression data are provided as a standardized z-score to facilitate integrated analysis with different data sources. We believe that with these improvements, ATTED-II is more valuable and powerful for supporting interspecies comparative studies and integrated analyses using heterogeneous data.
Affiliation(s)
- Takeshi Obayashi
- Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
| | - Himiko Hibara
- Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
| | - Yuki Kagaya
- Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
| | - Yuichi Aoki
- Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, 980-8573 Japan
| | - Kengo Kinoshita
- Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679 Japan
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, 980-8573 Japan
- Institute of Development, Aging, and Cancer, Tohoku University, 4-1 Seiryo-machi, Aoba-ku, Sendai, 980-8575 Japan
15
Bense S, Witte J, Preuße M, Koska M, Pezoldt L, Dröge A, Hartmann O, Müsken M, Schulze J, Fiebig T, Bähre H, Felgner S, Pich A, Häussler S. Pseudomonas aeruginosa post-translational responses to elevated c-di-GMP levels. Mol Microbiol 2022; 117:1213-1226. [PMID: 35362616 DOI: 10.1111/mmi.14902] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/05/2021] [Revised: 03/22/2022] [Accepted: 03/27/2022] [Indexed: 11/29/2022]
Abstract
C-di-GMP signaling can directly influence bacterial behavior by affecting the functionality of c-di-GMP-binding proteins. In addition, c-di-GMP can exert a global effect on gene transcription or translation, e.g., via riboswitches or by binding to transcription factors. In this study, we investigated the effects of changes in intracellular c-di-GMP levels on gene expression and protein production in the opportunistic pathogen Pseudomonas aeruginosa. We induced c-di-GMP production via an ectopically introduced diguanylate cyclase and recorded the transcriptional, translational, and proteomic profiles of the cells. We demonstrate that rising levels of c-di-GMP under growth conditions otherwise characterized by low c-di-GMP levels caused a switch to a non-motile, auto-aggregative P. aeruginosa phenotype. This phenotypic switch became apparent before any c-di-GMP-dependent effect on transcription, translation, or protein abundance was observed. Our results suggest that rising global c-di-GMP pools first affect the motility phenotype of P. aeruginosa by altering protein functionality and only then global gene transcription.
Affiliation(s)
- Sarina Bense
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany
| | - Julius Witte
- Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany; Research Core Unit Proteomics and Institute for Toxicology, Hannover Medical School, Hannover, Germany
| | - Matthias Preuße
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany
| | - Michal Koska
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany
| | - Lorena Pezoldt
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany
| | - Astrid Dröge
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany
| | - Oliver Hartmann
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany
| | - Mathias Müsken
- Central Facility for Microscopy, Helmholtz Center for Infection Research, Braunschweig, Germany
| | - Julia Schulze
- Institute of Clinical Biochemistry, Hannover Medical School, Hannover, Germany
| | - Timm Fiebig
- Institute of Clinical Biochemistry, Hannover Medical School, Hannover, Germany
| | - Heike Bähre
- Research Core Unit Metabolomics and Institute of Pharmacology, Hannover Medical School, Hannover, Germany. Infection Research, Hannover, Germany
| | - Sebastian Felgner
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany
| | - Andreas Pich
- Research Core Unit Proteomics and Institute for Toxicology, Hannover Medical School, Hannover, Germany
| | - Susanne Häussler
- Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig, Germany; Institute for Molecular Bacteriology, TWINCORE GmbH, Center of Clinical and Experimental Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Center for Infection Research, Hannover, Germany; Department of Clinical Microbiology, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark; Cluster of Excellence RESIST (EXC 2155), Hannover Medical School, Hannover, Germany
16
Cholico GN, Nault R, Zacharewski TR. Genome-Wide ChIPseq Analysis of AhR, COUP-TF, and HNF4 Enrichment in TCDD-Treated Mouse Liver. Int J Mol Sci 2022; 23:1558. [PMID: 35163483 PMCID: PMC8836158 DOI: 10.3390/ijms23031558] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/18/2021] [Revised: 01/19/2022] [Accepted: 01/27/2022] [Indexed: 02/01/2023]
Abstract
The aryl hydrocarbon receptor (AhR) is a ligand-activated transcription factor known for mediating the toxicity of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) and related compounds. Although the canonical mechanism of AhR activation involves heterodimerization with the aryl hydrocarbon receptor nuclear translocator, other transcriptional regulators that interact with AhR have been identified. Enrichment analysis of motifs in AhR-bound genomic regions implicated co-operation with COUP transcription factor (COUP-TF) and hepatocyte nuclear factor 4 (HNF4). The present study investigated AhR, HNF4α and COUP-TFII genomic binding and effects on gene expression associated with liver-specific function and cell differentiation in response to TCDD. Hepatic ChIPseq data from male C57BL/6 mice at 2 h after oral gavage with 30 µg/kg TCDD were integrated with bulk RNA-sequencing (RNAseq) time-course (2-72 h) and dose-response (0.01-30 µg/kg) datasets to assess putative AhR, HNF4α and COUP-TFII interactions associated with differential gene expression. Functional enrichment analysis of differentially expressed genes (DEGs) identified differential binding enrichment for AhR, COUP-TFII, and HNF4α to regions within liver-specific genes, suggesting intersections associated with the loss of liver-specific functions and hepatocyte differentiation. The analysis found that repression of liver-specific genes, HNF4α target genes, and hepatocyte differentiation genes involved increased AhR and HNF4α binding together with decreased COUP-TFII binding. Collectively, these results suggested that TCDD-elicited loss of liver-specific functions and markers of hepatocyte differentiation involved interactions between AhR, COUP-TFII and HNF4α.
Affiliation(s)
| | | | - Tim R. Zacharewski
- Biochemistry & Molecular Biology, Institute for Integrative Toxicology, Michigan State University, East Lansing, MI 48824, USA
17
Development of sexual structures influences metabolomic and transcriptomic profiles in Aspergillus flavus. Fungal Biol 2022; 126:187-200. [DOI: 10.1016/j.funbio.2022.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 01/02/2023]
18
Sweany RR, Mack BM, Moore GG, Gilbert MK, Cary JW, Lebar MD, Rajasekaran K, Damann Jr. KE. Genetic Responses and Aflatoxin Inhibition during Co-Culture of Aflatoxigenic and Non-Aflatoxigenic Aspergillus flavus. Toxins (Basel) 2021; 13:794. [PMID: 34822579 PMCID: PMC8618995 DOI: 10.3390/toxins13110794] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/24/2021] [Revised: 10/30/2021] [Accepted: 11/05/2021] [Indexed: 11/16/2022]
Abstract
Aflatoxin is a carcinogenic mycotoxin produced by Aspergillus flavus. Non-aflatoxigenic (Non-tox) A. flavus isolates are deployed in corn fields as biocontrol because they substantially reduce aflatoxin contamination via direct replacement and additionally via direct contact or touch with toxigenic (Tox) isolates and secretion of inhibitory/degradative chemicals. To understand touch inhibition, HPLC analysis and RNA sequencing examined aflatoxin production and gene expression of Non-tox isolate 17 and Tox isolate 53 mono-cultures and during their interaction in co-culture. Aflatoxin production was reduced by 99.7% in 72 h co-cultures. Fewer than expected unique reads were assigned to Tox 53 during co-culture, indicating its growth and/or gene expression was inhibited in response to Non-tox 17. Predicted secreted proteins and genes involved in oxidation/reduction were enriched in Non-tox 17 and co-cultures compared to Tox 53. Five secondary metabolite (SM) gene clusters and kojic acid synthesis genes were upregulated in Non-tox 17 compared to Tox 53 and a few were further upregulated in co-cultures in response to touch. These results suggest Non-tox strains can inhibit growth and aflatoxin gene cluster expression in Tox strains through touch. Additionally, upregulation of other SM genes and redox genes during the biocontrol interaction demonstrates a potential role of inhibitory SMs and antioxidants as additional biocontrol mechanisms and deserves further exploration to improve biocontrol formulations.
Affiliation(s)
- Rebecca R. Sweany
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Brian M. Mack
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Geromy G. Moore
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Matthew K. Gilbert
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Jeffrey W. Cary
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Matthew D. Lebar
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Kanniah Rajasekaran
- Food and Feed Safety Research Unit, Southern Regional Research Center, US Department of Agriculture, New Orleans, LA 70124, USA
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
| | - Kenneth E. Damann Jr.
- Department of Plant Pathology and Crop Physiology, Louisiana State University, Baton Rouge, LA 70808, USA
19
Karaduta O, Glazko G, Dvanajscak Z, Arthur J, Mackintosh S, Orr L, Rahmatallah Y, Yeruva L, Tackett A, Zybailov B. Resistant starch slows the progression of CKD in the 5/6 nephrectomy mouse model. Physiol Rep 2021; 8:e14610. [PMID: 33038060 PMCID: PMC7547583 DOI: 10.14814/phy2.14610] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/20/2020] [Revised: 08/31/2020] [Accepted: 09/21/2020] [Indexed: 01/02/2023]
Abstract
Background: Resistant starch (RS) improves CKD outcomes. In this report, we study how RS modulates host-microbiome interactions in CKD by measuring changes in the abundance of proteins and bacteria in the gut. In addition, we demonstrate RS-mediated reduction in CKD-induced kidney damage.
Methods: Eight mice underwent 5/6 nephrectomy to induce CKD and eight served as healthy controls. CKD and Healthy (H) groups were further split into those receiving RS (CKDRS, n = 4; HRS, n = 4) and those on normal diet (CKD, n = 4; H, n = 4). Kidney injury was evaluated by measuring BUN/creatinine and by histopathological evaluation. Cecal contents were analyzed using mass spectrometry-based metaproteomics and de novo sequencing using PEAKS. All the data were analyzed using R/Bioconductor packages.
Results: The 5/6 nephrectomy compromised kidney function, as seen by an increase in BUN/creatinine compared to healthy groups. Histopathology of kidney sections showed reduced tubulointerstitial injury in the CKDRS versus CKD group, while no significant difference in BUN/creatinine was observed between the two CKD groups. Identified proteins point toward a higher population of butyrate-producing bacteria and reduced abundance of mucin-degrading bacteria in the RS-fed groups, and to the downregulation of indole metabolism in CKD groups.
Conclusion: RS slows the progression of chronic kidney disease. Resistant starch supplementation leads to active bacterial proliferation and the reduction of harmful bacterial metabolites.
Affiliation(s)
- Oleg Karaduta
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA
| | - Galina Glazko
- Department of Biomedical Informatics, UAMS, Little Rock, AR, USA
| | | | - John Arthur
- Division of Nephrology, UAMS, Little Rock, AR, USA
| | - Samuel Mackintosh
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA; Proteomics Core Facility, UAMS, Little Rock, AR, USA
| | - Lisa Orr
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA
| | | | - Laxmi Yeruva
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA; Arkansas Children's Nutrition Center, Little Rock, AR, USA; Department of Pediatrics, UAMS, Little Rock, AR, USA
| | - Alan Tackett
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA; Proteomics Core Facility, UAMS, Little Rock, AR, USA; Arkansas Children's Research Institute, Little Rock, AR, USA
| | - Boris Zybailov
- Department of Biochemistry and Molecular Biology, UAMS, Little Rock, AR, USA
20
Trinh HC, Kwon YK. A novel constrained genetic algorithm-based Boolean network inference method from steady-state gene expression data. Bioinformatics 2021; 37:i383-i391. [PMID: 34252959 PMCID: PMC8275338 DOI: 10.1093/bioinformatics/btab295] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Accepted: 04/24/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION: It is a challenging problem in systems biology to infer both the network structure and dynamics of a gene regulatory network from steady-state gene expression data. Some methods based on Boolean or differential equation models have been proposed, but they were not efficient in the inference of large-scale networks. Therefore, it is necessary to develop a method to infer the network structure and dynamics accurately on large-scale networks using steady-state expression.
RESULTS: In this study, we propose a novel constrained genetic algorithm-based Boolean network inference (CGA-BNI) method where a Boolean canalyzing update rule scheme was employed to capture coarse-grained dynamics. Given steady-state gene expression data as an input, CGA-BNI identifies a set of path consistency-based constraints by comparing the gene expression level between the wild-type and the mutant experiments. It then searches for Boolean networks which satisfy the constraints and induce attractors most similar to steady-state expressions. We devised a heuristic mutation operation for faster convergence and implemented a parallel evaluation routine for execution time reduction. Through extensive simulations on the artificial and the real gene expression datasets, CGA-BNI showed better performance than four other existing methods in terms of both structural and dynamics prediction accuracies. Taken together, CGA-BNI is a promising tool to predict both the structure and the dynamics of a gene regulatory network when the highest accuracy is needed at the cost of execution time.
AVAILABILITY AND IMPLEMENTATION: Source code and data are freely available at https://github.com/csclab/CGA-BNI.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
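To make the terms "canalyzing update rule" and "attractor" concrete, the sketch below simulates a tiny synchronous Boolean network and compares its attractor with an observed steady state. It is a toy illustration only: the three-gene rules, the helper names, and the observed profile are hypothetical, and the genetic-algorithm search and path-consistency constraints of CGA-BNI are not reproduced.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int, int]


def canalyzing(x: int, canalyzing_value: int, canalized_output: int, fallback: int) -> int:
    """If input x takes its canalyzing value, the output is fixed; otherwise use fallback."""
    return canalized_output if x == canalyzing_value else fallback


def step(state: State) -> State:
    """Synchronous update of a hypothetical three-gene network (A, B, C)."""
    a, b, c = state
    new_a = canalyzing(c, 1, 0, a)   # C = 1 switches A off; otherwise A keeps its value
    new_b = canalyzing(a, 1, 1, 0)   # A = 1 switches B on; otherwise B is off
    new_c = canalyzing(b, 0, 0, a)   # B = 0 switches C off; otherwise C copies A
    return (new_a, new_b, new_c)


def find_attractor(start: State) -> List[State]:
    """Iterate until a state repeats and return the cycle (a fixed point has length 1)."""
    seen: Dict[State, int] = {}
    trajectory: List[State] = []
    state = start
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    return trajectory[seen[state]:]


def hamming(x: State, y: State) -> int:
    """Number of genes whose Boolean value differs between two states."""
    return sum(xi != yi for xi, yi in zip(x, y))


attractor = find_attractor((1, 0, 0))
observed_steady_state: State = (0, 1, 0)   # made-up binarized expression profile
mismatch = min(hamming(s, observed_steady_state) for s in attractor)
```

A candidate network whose attractors sit closer to the observed profile (a smaller mismatch) would score better in a search of this kind. Here the trajectory from (1, 0, 0) settles into the fixed point (0, 0, 0), which differs from the made-up observation in one gene.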
Affiliation(s)
- Hung-Cuong Trinh
- Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh 758307, Vietnam
- Yung-Keun Kwon
- Department of IT Convergence, University of Ulsan, Ulsan 680-749, Korea
21
Grimes T, Datta S. SeqNet: An R Package for Generating Gene-Gene Networks and Simulating RNA-Seq Data. J Stat Softw 2021; 98:10.18637/jss.v098.i12. [PMID: 34321962 PMCID: PMC8315007 DOI: 10.18637/jss.v098.i12] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022] Open
Abstract
Gene expression data provide an abundant resource for inferring connections in gene regulatory networks. While methodologies developed for this task have shown success, a challenge remains in comparing the performance among methods. Gold-standard datasets are scarce and limited in use. And while tools for simulating expression data are available, they are not designed to resemble the data obtained from RNA-seq experiments. SeqNet is an R package that provides tools for generating a rich variety of gene network structures and simulating RNA-seq data from them. This produces in silico RNA-seq data for benchmarking and assessing gene network inference methods. The package is available on CRAN and on GitHub at https://github.com/tgrimes/SeqNet.
Affiliation(s)
- Tyler Grimes
- University of Florida, Department of Biostatistics
22
Ma JH, Feng Z, Wu JY, Zhang Y, Di W. Learning from imbalanced fetal outcomes of systemic lupus erythematosus in artificial neural networks. BMC Med Inform Decis Mak 2021; 21:127. [PMID: 33845834 PMCID: PMC8042715 DOI: 10.1186/s12911-021-01486-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/11/2021] [Accepted: 03/31/2021] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE: To explore an effective algorithm, based on an artificial neural network, that correctly picks out the minority of pregnant women with SLE who suffer fetal loss from the majority with live births, and to train a well-behaved model as a clinical decision assistant. METHODS: We integrated the ideas of comparative and focused study into the artificial neural network and presented an effective algorithm aimed at imbalanced learning on small datasets. RESULTS: We collected 469 non-trivial pregnant patients with SLE, of whom 420 had live-birth outcomes and the other 49 ended in fetal loss. A well-trained imbalanced-learning model had a high sensitivity of 19/21 for the identification of patients with fetal loss outcomes. DISCUSSION: The misprediction of the two patients was explainable. Algorithm improvements in the artificial neural network framework enhanced identification in imbalanced learning problems, and the external validation increased the reliability of the algorithm. CONCLUSION: The well-trained model was fully qualified to assist healthcare providers in making timely and accurate decisions.
Affiliation(s)
- Jing-Hang Ma
- Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Zhen Feng
- First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Jia-Yue Wu
- Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yu Zhang
- Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Wen Di
- Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
- Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
- State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
23
Xie J, Yin Y, Yang F, Sun J, Wang J. Differential Network Analysis Reveals Regulatory Patterns in Neural Stem Cell Fate Decision. Interdiscip Sci 2021; 13:91-102. [PMID: 33439459 DOI: 10.1007/s12539-020-00415-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2020] [Revised: 12/11/2020] [Accepted: 12/22/2020] [Indexed: 11/30/2022]
Abstract
Deciphering regulatory patterns of neural stem cell (NSC) differentiation with multiple stages is essential to understand NSC differentiation mechanisms. Recent single-cell transcriptome datasets became available at individual differentiation. However, a systematic and integrative analysis of multiple datasets at multiple temporal stages of NSC differentiation is lacking. In this study, we propose a new method integrating prior information to construct three gene regulatory networks at pair-wise stages of transcriptome and apply this method to investigate five NSC differentiation paths on four different single-cell transcriptome datasets. By constructing gene regulatory networks for each path, we delineate their regulatory patterns via differential topology and network diffusion analyses. We find 12 common differentially expressed genes among the five NSC differentiation paths, with one common regulatory pattern (Gsk3b_App_Cdk5) shared by all paths. The identified regulatory pattern, partly supported by previous experimental evidence, is essential to all differentiation paths, but it plays a different role in each path when regulating other genes. Together, our integrative analysis provides both common and specific regulatory mechanisms for each of the five NSC differentiation paths.
Affiliation(s)
- Jiang Xie
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Yiting Yin
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Fuzhang Yang
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Jiamin Sun
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Jiao Wang
- School of Life Sciences, Shanghai University, Shanghai, China.
24
Kimura S, Fukutomi R, Tokuhisa M, Okada M. Inference of Genetic Networks From Time-Series and Static Gene Expression Data: Combining a Random-Forest-Based Inference Method With Feature Selection Methods. Front Genet 2021; 11:595912. [PMID: 33384716 PMCID: PMC7770182 DOI: 10.3389/fgene.2020.595912] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/17/2020] [Accepted: 11/23/2020] [Indexed: 11/17/2022] Open
Abstract
Several researchers have focused on random-forest-based inference methods because of their excellent performance. Some of these inference methods also have a useful ability to analyze both time-series and static gene expression data. However, they are only of use in ranking all of the candidate regulations by assigning them confidence values. None have been capable of detecting the regulations that actually affect a gene of interest. In this study, we propose a method to remove unpromising candidate regulations by combining the random-forest-based inference method with a series of feature selection methods. In addition to detecting unpromising regulations, our proposed method uses outputs from the feature selection methods to adjust the confidence values of all of the candidate regulations that have been computed by the random-forest-based inference method. Numerical experiments showed that the combined application with the feature selection methods improved the performance of the random-forest-based inference method on 99 of the 100 trials performed on the artificial problems. However, the improvement tends to be small, since our combined method succeeded in removing only 19% of the candidate regulations at most. The combined application with the feature selection methods moreover makes the computational cost higher. While a bigger improvement at a lower computational cost would be ideal, we see no impediments to our investigation, given that our aim is to extract as much useful information as possible from a limited amount of gene expression data.
Affiliation(s)
- Shuhei Kimura
- Faculty of Engineering, Tottori University, Tottori, Japan
- Ryo Fukutomi
- Graduate School of Sustainability Science, Tottori University, Tottori, Japan
- Mariko Okada
- Laboratory of Cell Systems, Institute of Protein Research, Osaka University, Osaka, Japan
25
Almeida-Silva F, Moharana KC, Machado FB, Venancio TM. Exploring the complexity of soybean (Glycine max) transcriptional regulation using global gene co-expression networks. Planta 2020; 252:104. [PMID: 33196909 DOI: 10.1007/s00425-020-03499-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/09/2020] [Accepted: 10/15/2020] [Indexed: 06/11/2023]
Abstract
MAIN CONCLUSION: We report a soybean gene co-expression network built with data from 1284 RNA-Seq experiments, which was used to identify important regulators and modules and to elucidate the fates of gene duplicates. Soybean (Glycine max (L.) Merr.) is one of the most important crops worldwide, constituting a major source of protein and edible oil. Gene co-expression networks (GCN) have been extensively used to study transcriptional regulation and evolution of genes and genomes. Here, we report a soybean GCN using 1284 publicly available RNA-Seq samples from 15 distinct tissues. We found modules that are differentially regulated in specific tissues, comprising processes such as photosynthesis, gluconeogenesis, lignin metabolism, and response to biotic stress. We identified transcription factors among intramodular hubs, which probably integrate different pathways and shape the transcriptional landscape in different conditions. The top hubs for each module tend to encode proteins with critical roles, such as succinate dehydrogenase and RNA polymerase subunits. Importantly, gene essentiality was strongly correlated with degree centrality and essential hubs were enriched in genes involved in nucleic acids metabolism and regulation of cell replication. Using a guilt-by-association approach, we predicted functions for 93 of 106 hubs without functional description in soybean. Most of the duplicated genes had different transcriptional profiles, supporting their functional divergence, although paralogs originating from whole-genome duplications (WGD) are more often preserved in the same module than those from other mechanisms. Together, our results highlight the importance of GCN analysis in unraveling key functional aspects of the soybean genome, in particular those associated with hub genes and WGD events.
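As a rough illustration of how degree centrality and hubs arise in a co-expression network, the Python sketch below links genes whose absolute Pearson correlation passes a hard threshold and counts each gene's edges. This is a simplified stand-in, not the authors' pipeline: the toy expression values, the 0.8 threshold, and the function names are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def coexpression_degrees(expr, threshold=0.8):
    """Return {gene: degree} for an unsigned, hard-thresholded network."""
    degree = {gene: 0 for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            degree[g1] += 1
            degree[g2] += 1
    return degree


# Toy expression profiles (hypothetical genes, four samples each).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.1, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
degrees = coexpression_degrees(expr)
hubs = sorted(degrees, key=degrees.get, reverse=True)  # highest degree first
```

Genes at the front of `hubs` are the hub candidates; in a real analysis the threshold choice (or a soft-thresholding scheme) strongly affects which genes qualify.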
Affiliation(s)
- Fabricio Almeida-Silva
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Av. Alberto Lamego 2000, P5, sala 217, Campos dos Goytacazes, RJ, Brazil
- Kanhu C Moharana
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Av. Alberto Lamego 2000, P5, sala 217, Campos dos Goytacazes, RJ, Brazil
- Fabricio B Machado
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Av. Alberto Lamego 2000, P5, sala 217, Campos dos Goytacazes, RJ, Brazil
- Thiago M Venancio
- Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Av. Alberto Lamego 2000, P5, sala 217, Campos dos Goytacazes, RJ, Brazil.
26
Manjang K, Tripathi S, Yli-Harja O, Dehmer M, Emmert-Streib F. Graph-based exploitation of gene ontology using GOxploreR for scrutinizing biological significance. Sci Rep 2020; 10:16672. [PMID: 33028846 PMCID: PMC7542435 DOI: 10.1038/s41598-020-73326-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/20/2020] [Accepted: 08/17/2020] [Indexed: 12/12/2022] Open
Abstract
Gene ontology (GO) is an eminent knowledge base frequently used for providing biological interpretations for the analysis of genes or gene sets from biological, medical and clinical problems. Unfortunately, the interpretation of such results is challenging due to the large number of GO terms, their hierarchical and connected organization as directed acyclic graphs (DAGs) and the lack of tools allowing to exploit this structural information explicitly. For this reason, we developed the R package GOxploreR. The main features of GOxploreR are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs including visualization capabilities and (IV) prioritizing of GO-terms. The underlying idea of GOxploreR is to exploit a graph-theoretical perspective of GO as manifested by its DAG-structure and the containing hierarchy levels for cumulating semantic information. That means all these features enhance the utilization of structural information of GO and complement existing analysis tools. Overall, GOxploreR provides exploratory as well as confirmatory tools for complementing any kind of analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our R package GOxploreR is freely available from CRAN.
Affiliation(s)
- Kalifa Manjang
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Shailesh Tripathi
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Olli Yli-Harja
- Computational Systems Biology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, USA
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Matthias Dehmer
- Department of Biomedical Computer Science and Mechatronics, UMIT-The Health and Life Science University, 6060, Hall in Tyrol, Austria
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
27
Díez-Villanueva A, Sanz-Pamplona R, Carreras-Torres R, Moratalla-Navarro F, Alonso M, Paré-Brunet L, Aussó S, Guinó E, Solé X, Cordero D, Salazar R, Berdasco M, Peinado MA, Moreno V. DNA methylation events in transcription factors and gene expression changes in colon cancer. Epigenomics 2020; 12:1593-1610. [DOI: 10.2217/epi-2020-0029] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/14/2023] Open
Abstract
Aim: To gain insight into the role of DNA methylation in the malignant growth of colon cancer. Patients & methods: Methylation and gene expression from 90 adjacent-tumor paired tissues and 48 healthy tissues were analyzed. Tumor genes whose change in expression was explained by changes in methylation were identified using linear models adjusted for tumor stromal content. Results: No differences in methylation were found between adjacent and healthy tissues, but clear differences were found between adjacent and tumor samples. We identified hypermethylated CpG islands located in promoter regions that drive differential gene expression of transcription factors and their target genes. Conclusion: Changes in methylation of a few genes provoke important changes in gene expression by propagating the signal through transcription activation/repression.
Affiliation(s)
- Anna Díez-Villanueva
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Rebeca Sanz-Pamplona
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Robert Carreras-Torres
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Ferran Moratalla-Navarro
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, 08907 Barcelona, Spain
- M Henar Alonso
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, 08907 Barcelona, Spain
- Laia Paré-Brunet
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Susanna Aussó
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Elisabet Guinó
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Xavier Solé
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- David Cordero
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Ramón Salazar
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Cancer (CIBERONC), 28029 Madrid, Spain
- Medical Oncology Service, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Maria Berdasco
- Cancer Epigenetics & Biology Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Epigenetic Therapies Group, Experimental & Clinical Hematology Program (PHEC), Josep Carreras Leukaemia Research Institute, 08916 Badalona, Barcelona, Spain
- Miguel A Peinado
- Program of Predictive and Personalized Medicine of Cancer, Germans Trias i Pujol Research Institute (PMPPC-IGTP), 08916 Badalona, Barcelona, Spain
- Victor Moreno
- Unit of Biomarkers & Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Barcelona, Spain
- Biomedical Research Centre Network for Epidemiology & Public Health (CIBERESP), 28029 Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, 08907 Barcelona, Spain
28
Saint-Antoine MM, Singh A. Network inference in systems biology: recent developments, challenges, and applications. Curr Opin Biotechnol 2020; 63:89-98. [PMID: 31927423 PMCID: PMC7308210 DOI: 10.1016/j.copbio.2019.12.002] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 10/31/2019] [Accepted: 12/03/2019] [Indexed: 12/12/2022]
Abstract
One of the most interesting, difficult, and potentially useful topics in computational biology is the inference of gene regulatory networks (GRNs) from expression data. Although researchers have been working on this topic for more than a decade and much progress has been made, it remains an unsolved problem and even the most sophisticated inference algorithms are far from perfect. In this paper, we review the latest developments in network inference, including state-of-the-art algorithms like PIDC, Phixer, and more. We also discuss unsolved computational challenges, including the optimal combination of algorithms, integration of multiple data sources, and pseudo-temporal ordering of static expression data. Lastly, we discuss some exciting applications of network inference in cancer research, and provide a list of useful software tools for researchers hoping to conduct their own network inference analyses.
Affiliation(s)
- Michael M Saint-Antoine
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware 19716, USA
- Abhyudai Singh
- Electrical and Computer Engineering, University of Delaware, Newark, Delaware 19716, USA.
29
Conrad EC, Bernabei JM, Kini LG, Shah P, Mikhail F, Kheder A, Shinohara RT, Davis KA, Bassett DS, Litt B. The sensitivity of network statistics to incomplete electrode sampling on intracranial EEG. Netw Neurosci 2020; 4:484-506. [PMID: 32537538 PMCID: PMC7286312 DOI: 10.1162/netn_a_00131] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/01/2019] [Accepted: 02/10/2020] [Indexed: 12/12/2022] Open
Abstract
Network neuroscience applied to epilepsy holds promise to map pathological networks, localize seizure generators, and inform targeted interventions to control seizures. However, incomplete sampling of the epileptic brain because of sparse placement of intracranial electrodes may affect model results. In this study, we evaluate the sensitivity of several published network measures to incomplete spatial sampling and propose an algorithm using network subsampling to determine confidence in model results. We retrospectively evaluated intracranial EEG data from 28 patients implanted with grid, strip, and depth electrodes during evaluation for epilepsy surgery. We recalculated global and local network metrics after randomly and systematically removing subsets of intracranial EEG electrode contacts. We found that sensitivity to incomplete sampling varied significantly across network metrics. This sensitivity was largely independent of whether seizure onset zone contacts were targeted or spared from removal. We present an algorithm using random subsampling to compute patient-specific confidence intervals for network localizations. Our findings highlight the difference in robustness between commonly used network metrics and provide tools to assess confidence in intracranial network localization. We present these techniques as an important step toward translating personalized network models of seizures into rigorous, quantitative approaches to invasive therapy.
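The idea of using random subsampling to gauge confidence in a network metric can be sketched in Python as follows. This is an illustrative approximation, not the authors' algorithm: the metric (mean edge weight among retained nodes), the 80% retention fraction, the percentile interval, and all function names are assumptions chosen to keep the example small.

```python
import random


def mean_connectivity(adj, nodes):
    """Average edge weight among the retained nodes (a simple global metric)."""
    pairs = [(i, j) for pos, i in enumerate(nodes) for j in nodes[pos + 1:]]
    if not pairs:
        return 0.0
    return sum(adj[i][j] for i, j in pairs) / len(pairs)


def subsample_interval(adj, keep_fraction=0.8, n_iter=500, alpha=0.05, seed=0):
    """Percentile interval of the metric under random node removal."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(adj)
    k = min(n, max(2, round(keep_fraction * n)))  # nodes kept per iteration
    values = sorted(
        mean_connectivity(adj, rng.sample(range(n), k)) for _ in range(n_iter)
    )
    lower = values[int((alpha / 2) * (n_iter - 1))]
    upper = values[int((1 - alpha / 2) * (n_iter - 1))]
    return lower, upper


# Toy symmetric network: connection strength decays with index distance.
adj = [[0.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(6)]
       for i in range(6)]
full_value = mean_connectivity(adj, list(range(6)))
lower, upper = subsample_interval(adj)
```

A narrow interval around `full_value` suggests the metric is robust to missing nodes, while a wide one signals sensitivity to incomplete sampling.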
Affiliation(s)
- Erin C. Conrad
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- John M. Bernabei
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Lohith G. Kini
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Preya Shah
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Fadi Mikhail
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Ammar Kheder
- Department of Neurology, Emory University, Atlanta, GA, USA
- Russell T. Shinohara
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Penn Statistics in Imaging and Visualization Center, University of Pennsylvania, Philadelphia, PA, USA
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, PA, USA
- Kathryn A. Davis
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Danielle S. Bassett
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Electrical and Systems Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Physics and Astronomy, College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychiatry, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Brian Litt
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
- Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Neurosurgery, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
30
Palombo V, Milanesi M, Sferra G, Capomaccio S, Sgorlon S, D'Andrea M. PANEV: an R package for a pathway-based network visualization. BMC Bioinformatics 2020; 21:46. [PMID: 32028885 PMCID: PMC7006390 DOI: 10.1186/s12859-020-3371-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/08/2019] [Accepted: 01/15/2020] [Indexed: 11/18/2022] Open
Abstract
Background: During the last decade, with the aim of solving the challenge of post-genomic and transcriptomic data mining, a plethora of tools have been developed to create, edit and analyze metabolic pathways. In particular, when a complex phenomenon is considered, the creation of a network of multiple interconnected pathways of interest could be useful to investigate the underlying biology and ultimately identify functional candidate genes affecting the trait under investigation. Results: PANEV (PAthway NEtwork Visualizer) is an R package set for gene/pathway-based network visualization. Based on information available on KEGG, it visualizes genes within a network of multiple levels (from 1 to n) of interconnected upstream and downstream pathways. The network graph visualization helps to interpret functional profiles of a cluster of genes. Conclusions: The suite has no species constraints and is ready to analyze genomic or transcriptomic outcomes. Users need to supply the list of candidate genes and specify the target pathway(s) and the number of interconnected downstream and upstream pathways (levels) required for the investigation. The package is available at https://github.com/vpalombo/PANEV.
Affiliation(s)
- Valentino Palombo
- Dipartimento Agricoltura, Ambiente e Alimenti, Università degli Studi del Molise, 86100, Campobasso, Italy
- Marco Milanesi
- Department of Support, Production and Animal Health, School of Veterinary Medicine, São Paulo State University, Araçatuba, São Paulo, 16050-680, Brazil
- Istituto di Zootecnica, Università Cattolica del Sacro Cuore, 29122, Piacenza, Italy
| | - Gabriella Sferra
- Dipartimento di Bioscienze e Territorio, Università degli Studi del Molise, 86090, Pesche, IS, Italy
| | - Stefano Capomaccio
- Istituto di Zootecnica, Università Cattolica del Sacro Cuore, 29122, Piacenza, Italy.,Dipartimento di Medicina Veterinaria, Università di Perugia, 06126, Perugia, Italy
| | - Sandy Sgorlon
- Dipartimento di Scienze Agrarie ed Ambientali, Università degli Studi di Udine, 33100, Udine, Italy
| | - Mariasilvia D'Andrea
- Dipartimento Agricoltura, Ambiente e Alimenti, Università degli Studi del Molise, 86100, Campobasso, Italy.
| |
Collapse
|
31
|
Law SR, Kellgren TG, Björk R, Ryden P, Keech O. Centralization Within Sub-Experiments Enhances the Biological Relevance of Gene Co-expression Networks: A Plant Mitochondrial Case Study. Frontiers in Plant Science 2020; 11:524. [PMID: 32582224 PMCID: PMC7287149 DOI: 10.3389/fpls.2020.00524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2019] [Accepted: 04/07/2020] [Indexed: 05/07/2023]
Abstract
Gene co-expression networks (GCNs) can be prepared using a variety of mathematical approaches based on data sampled across diverse developmental processes, tissue types, pathologies, mutant backgrounds, and stress conditions. These networks are used to identify genes with similar expression dynamics but are prone to introducing false-positive and false-negative relationships, especially in large and heterogeneous datasets. With the aim of optimizing the relevance of edges in GCNs and enhancing global biological insight, we propose a novel approach that involves a data-centering step performed simultaneously per gene and per sub-experiment, called centralization within sub-experiments (CSE). Using a gene set encoding the plant mitochondrial proteome as a case study, our results show that all CSE-based GCNs assessed had significantly more edges within the majority of the considered functional sub-networks, such as the mitochondrial electron transport chain and its complexes, than GCNs not using CSE. This demonstrates that CSE-based GCNs are efficient at predicting canonical functions and associated pathways, here referred to as the core gene network. Furthermore, we show that correlation analyses using CSE-processed data can be used to fine-tune prediction of the function of uncharacterized genes, while their use in combination with analyses based on non-CSE data can augment conventional stress analyses with the innate connections underpinning the dynamic system being examined. Therefore, CSE is an effective alternative to conventional batch correction approaches, particularly when dealing with large and heterogeneous datasets. The method is easy to implement in a pre-existing GCN analysis pipeline and can provide enhanced biological relevance to conventional GCNs by allowing users to delineate a core gene network.
Author summary: Gene co-expression networks (GCNs) are the product of a variety of mathematical approaches that identify causal relationships in gene expression dynamics but are prone to false positives and false negatives, especially in large and heterogeneous datasets. In light of the burgeoning output of next-generation sequencing projects performed on a variety of species and developmental or clinical conditions, the statistical power and complexity of these networks will undoubtedly increase, while their biological relevance will be fiercely challenged. Here, we propose a novel approach to generate a "core" GCN with enhanced biological relevance. Our method involves a data-centering step that effectively removes all primary treatment/tissue effects; it is simple to employ and can be easily implemented in pre-existing GCN analysis pipelines. The gain in biological relevance resulting from the adoption of this approach was assessed using a plant mitochondrial case study.
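The centering step described in this abstract can be illustrated with a minimal sketch. It assumes that CSE amounts to subtracting, for each gene, the mean of its values within each sub-experiment before correlations are computed; the function names, the toy data, and the use of Pearson correlation are illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict
from statistics import mean


def centralize_within_subexperiments(expression, subexperiment_of):
    """Center each gene within each sub-experiment (assumed form of CSE).

    expression: dict mapping gene -> list of values, one per sample
    subexperiment_of: list with the sub-experiment label of each sample
    """
    # Group sample indices by sub-experiment label.
    groups = defaultdict(list)
    for idx, label in enumerate(subexperiment_of):
        groups[label].append(idx)

    centered = {}
    for gene, values in expression.items():
        out = list(values)
        for indices in groups.values():
            group_mean = mean(values[i] for i in indices)
            for i in indices:
                out[i] = values[i] - group_mean
        centered[gene] = out
    return centered


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy data: two sub-experiments separated by a large offset.
labels = ["A", "A", "A", "B", "B", "B"]
expr = {
    "g1": [1.0, 2.0, 3.0, 11.0, 12.0, 13.0],
    "g2": [3.0, 2.0, 1.0, 13.0, 12.0, 11.0],
}
centered = centralize_within_subexperiments(expr, labels)
raw_r = pearson(expr["g1"], expr["g2"])          # dominated by the offset
cse_r = pearson(centered["g1"], centered["g2"])  # within-group relationship
```

In this toy case the raw correlation is strongly positive because both genes shift together between sub-experiments, whereas the centered data expose the opposite trend inside each sub-experiment.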
Affiliation(s)
- Simon R. Law
  - Department of Plant Physiology, Umeå Plant Science Centre, Umeå Universitet, Umeå, Sweden
- Therese G. Kellgren
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
- Rafael Björk
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
- Patrik Ryden
  - Department of Mathematics and Mathematical Statistics, Umeå Universitet, Umeå, Sweden
  - *Correspondence: Patrik Ryden
- Olivier Keech
  - Department of Plant Physiology, Umeå Plant Science Centre, Umeå Universitet, Umeå, Sweden
  - *Correspondence: Olivier Keech

32
Schubert M, Colomé-Tatché M, Foijer F. Gene networks in cancer are biased by aneuploidies and sample impurities. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194444. [PMID: 31654805 DOI: 10.1016/j.bbagrm.2019.194444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/31/2019] [Revised: 09/05/2019] [Accepted: 10/14/2019] [Indexed: 12/14/2022]
Abstract
Gene regulatory network inference is a standard technique for obtaining structured regulatory information from, for instance, gene expression measurements. Methods performing this task have been extensively evaluated on synthetic and, to a lesser extent, real data sets. In contrast to these test evaluations, applications to gene expression data of human cancers are often limited by fewer samples and more potential regulatory links, and are biased by copy number aberrations as well as cell mixtures and sample impurities. Here, we take networks inferred from TCGA cohorts as an example to show that (1) transcription factor annotations are essential to obtain reliable networks, and (2) even for state-of-the-art methods, we expect that between 20 and 80% of edges are caused by copy number changes and cell mixtures rather than transcription factor regulation.
Affiliation(s)
- Michael Schubert
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands; Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Maria Colomé-Tatché
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands; Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany; TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Floris Foijer
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands

33
Erdmann J, Thöming JG, Pohl S, Pich A, Lenz C, Häussler S. The Core Proteome of Biofilm-Grown Clinical Pseudomonas aeruginosa Isolates. Cells 2019; 8:E1129. [PMID: 31547513 PMCID: PMC6829490 DOI: 10.3390/cells8101129] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/26/2019] [Revised: 09/18/2019] [Accepted: 09/19/2019] [Indexed: 12/13/2022]
Abstract
Comparative genomics has greatly facilitated the identification of shared as well as unique features among individual cells or tissues, and thus offers the potential to find disease markers. While proteomics is recognized for its potential to generate quantitative maps of protein expression, comparative proteomics in bacteria has been largely restricted to the comparison of single cell lines or mutant strains. In this study, we used a data independent acquisition (DIA) technique, which enables global protein quantification of large sample cohorts, to record the proteome profiles of a total of 27 whole genome sequenced and transcriptionally profiled clinical isolates of the opportunistic pathogen Pseudomonas aeruginosa. Analysis of the proteome profiles across the 27 clinical isolates grown under planktonic and biofilm growth conditions led to the identification of a core biofilm-associated protein profile. Furthermore, we found that protein-to-mRNA ratios between different P. aeruginosa strains are well correlated, indicating conserved patterns of post-transcriptional regulation. Uncovering core regulatory pathways, which drive biofilm formation and associated antibiotic tolerance in bacterial pathogens, promises to give clues to interactions between bacterial species and their environment and could provide useful targets for new clinical interventions to combat biofilm-associated infections.
Affiliation(s)
- Jelena Erdmann
  - Institute for Molecular Bacteriology, TWINCORE GmbH, Centre for Experimental and Clinical Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover 30625, Germany.
  - Research Core Unit Proteomics and Institute of Toxicology, Hannover Medical School, Hannover 30625, Germany.
- Janne G Thöming
  - Institute for Molecular Bacteriology, TWINCORE GmbH, Centre for Experimental and Clinical Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover 30625, Germany.
- Sarah Pohl
  - Institute for Molecular Bacteriology, TWINCORE GmbH, Centre for Experimental and Clinical Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover 30625, Germany.
  - Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig 38124, Germany.
- Andreas Pich
  - Research Core Unit Proteomics and Institute of Toxicology, Hannover Medical School, Hannover 30625, Germany.
- Christof Lenz
  - Institute of Clinical Chemistry, Bioanalytics, University Medical Center Göttingen, Göttingen 37075, Germany.
  - Max Planck Institute for Biophysical Chemistry, Bioanalytical Mass Spectrometry, Göttingen 37077, Germany.
- Susanne Häussler
  - Institute for Molecular Bacteriology, TWINCORE GmbH, Centre for Experimental and Clinical Infection Research, a joint venture of the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover 30625, Germany.
  - Department of Molecular Bacteriology, Helmholtz Center for Infection Research, Braunschweig 38124, Germany.

34
Muldoon JJ, Yu JS, Fassia MK, Bagheri N. Network inference performance complexity: a consequence of topological, experimental and algorithmic determinants. Bioinformatics 2019; 35:3421-3432. [PMID: 30932143 PMCID: PMC6748731 DOI: 10.1093/bioinformatics/btz105] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/12/2018] [Revised: 01/24/2019] [Accepted: 02/11/2019] [Indexed: 12/21/2022]
Abstract
Motivation: Network inference algorithms aim to uncover key regulatory interactions governing cellular decision-making, disease progression and therapeutic interventions. Having an accurate blueprint of this regulation is essential for understanding and controlling cell behavior. However, the utility and impact of these approaches are limited because the ways in which various factors shape inference outcomes remain largely unknown. Results: We identify and systematically evaluate determinants of performance (including network properties, experimental design choices and data processing) by developing new metrics that quantify confidence across algorithms in comparable terms. We conducted a multifactorial analysis that demonstrates how stimulus target, regulatory kinetics, induction and resolution dynamics, and noise differentially impact widely used algorithms in significant and previously unrecognized ways. The results show that even when high-quality data are paired with high-performing algorithms, inferred models can still lead to misleading conclusions. Lastly, we validate these findings and the utility of the confidence metrics using realistic in silico gene regulatory networks. This new characterization approach provides a way to more rigorously interpret how algorithms infer regulation from biological datasets. Availability and implementation: Code is available at http://github.com/bagherilab/networkinference/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joseph J Muldoon
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL, USA
- Jessica S Yu
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
- Mohammad-Kasim Fassia
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Department of Biomedical Engineering, Northwestern University, Evanston, IL, USA
- Neda Bagheri
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
  - Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
  - Chemistry of Life Processes Institute, Northwestern University, Evanston, IL, USA
  - Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA

35
Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference. Machine Learning and Knowledge Extraction 2019. [DOI: 10.3390/make1030054] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
Abstract
Statistical hypothesis testing is among the most misunderstood quantitative analysis methods in data science. Despite its seeming simplicity, it has complex interdependencies between its procedural components. In this paper, we discuss the underlying logic behind statistical hypothesis testing, the formal meaning of its components and their connections. Our presentation is applicable to all statistical hypothesis tests as a generic backbone and is, hence, useful across all application domains in data science and artificial intelligence.
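As a concrete illustration of how the components of a hypothesis test fit together (null hypothesis, test statistic, null distribution, p-value, significance level, decision), the following sketch runs an exact two-sample permutation test. The choice of test, the function name and the toy data are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_test(group_a, group_b, alpha=0.05):
    """Exact two-sided permutation test for a difference in means.

    Null hypothesis: group labels are exchangeable (no group effect).
    Test statistic: absolute difference of the group means.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    # Null distribution: the statistic over every relabeling of the samples.
    as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1

    p_value = as_extreme / total
    reject_null = p_value < alpha  # decision at significance level alpha
    return observed, p_value, reject_null


# Hypothetical measurements for two groups.
observed, p_value, reject_null = exact_permutation_test(
    [1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]
)
```

With four samples per group there are 70 possible relabelings, and only the observed split and its mirror image reach the observed statistic, so the p-value is 2/70 and the null hypothesis is rejected at the 0.05 level.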

36
Azam MF, Musa A, Dehmer M, Yli-Harja OP, Emmert-Streib F. Global Genetics Research in Prostate Cancer: A Text Mining and Computational Network Theory Approach. Front Genet 2019; 10:70. [PMID: 30838019 PMCID: PMC6383410 DOI: 10.3389/fgene.2019.00070] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/21/2018] [Accepted: 01/28/2019] [Indexed: 11/13/2022]
Abstract
Prostate cancer is the most common cancer type in men in Finland and the second most common worldwide. In this paper, we analyze almost 150,000 published papers about prostate cancer, authored by tens of thousands of scientists worldwide, with an integrated text mining and computational network theory approach. We demonstrate how to integrate text mining with network analysis to investigate the research contributions of countries and collaborations within and between countries. Furthermore, we study the time evolution of individually and collectively studied genes. Finally, we investigate a collaboration network of Finland and compare studied genes with globally studied genes in prostate cancer genetics. Overall, our results provide a global overview of prostate cancer research in genetics. In addition, we present a specific discussion for Finland. Our results shed light on trends within the last 30 years and are useful for translational researchers across the full range from genetics to public health management and health policy.
Affiliation(s)
- Md Facihul Azam
  - Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
- Aliyu Musa
  - Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
- Matthias Dehmer
  - Faculty for Management, Institute for Intelligent Production, University of Applied Sciences Upper Austria, Steyr, Austria; Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria; College of Computer and Control Engineering, Nankai University, Tianjin, China
- Olli P Yli-Harja
  - Institute of Biosciences and Medical Technology, Tampere, Finland; Computational Systems Biology, Faculty of Biomedical Engineering, Tampere University, Tampere, Finland; Institute for Systems Biology, Seattle, WA, United States
- Frank Emmert-Streib
  - Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland

37
Moore D, Simoes RDM, Dehmer M, Emmert-Streib F. Prostate Cancer Gene Regulatory Network Inferred from RNA-Seq Data. Curr Genomics 2019; 20:38-48. [PMID: 31015790 PMCID: PMC6446481 DOI: 10.2174/1389202919666181107122005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/27/2018] [Revised: 09/29/2018] [Accepted: 10/22/2018] [Indexed: 12/26/2022]
Abstract
Background: Cancer is a complex disease with a lucid etiology and in understanding the causation, we need to appreciate this complexity. Objective: Here we are aiming to gain insights into the genetic associations of prostate cancer through a network-based systems approach using the BC3Net algorithm. Methods: Specifically, we infer a prostate cancer Gene Regulatory Network (GRN) from a large-scale gene expression data set of 333 patient RNA-seq profiles obtained from The Cancer Genome Atlas (TCGA) database. Results: We analyze the functional components of the inferred network by extracting subnetworks based on biological process information and interpret the role of known cancer genes within each process. Furthermore, we investigate the local landscape of prostate cancer genes and discuss pathological associations that may be relevant in the development of new targeted cancer therapies. Conclusion: Our network-based analysis provides a practical systems biology approach to reveal the collective gene-interactions of prostate cancer. This allows a close interpretation of biological activity in terms of the hallmarks of cancer.
Affiliation(s)
- Frank Emmert-Streib
  - Address correspondence to this author at the Department of Signal Processing, Predictive Medicine and Data Analytics Laboratory, Tampere University of Technology, Tampere 33720, Finland; Tel: +358503015353

38
Defining Data Science by a Data-Driven Quantification of the Community. Machine Learning and Knowledge Extraction 2018. [DOI: 10.3390/make1010015] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/20/2022]
Abstract
Data science is a new academic field that has received much attention in recent years. One reason for this is that our increasingly digitalized society generates more and more data in all areas of our lives and science, and we are desperately seeking solutions to deal with this problem. In this paper, we investigate the academic roots of data science. We use Google Scholar data on scientists who have an interest in data science, together with their citations, to perform a quantitative analysis of the data science community. Furthermore, to decompose the data science community into its major defining factors, corresponding to the most important research fields, we introduce a statistical regression model that is fully automatic and robust with respect to a subsampling of the data. This statistical model allows us to define the ‘importance’ of a field as its predictive ability. Overall, our method provides an objective answer to the question ‘What is data science?’.

39
Colby SM, McClure RS, Overall CC, Renslow RS, McDermott JE. Improving network inference algorithms using resampling methods. BMC Bioinformatics 2018; 19:376. [PMID: 30314469 PMCID: PMC6186128 DOI: 10.1186/s12859-018-2402-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/21/2018] [Accepted: 09/27/2018] [Indexed: 11/10/2022]
Abstract
Background: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data, which in turn can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes that the generalization from statistical theory holds true in biological network inference applications. Results: We evaluated the effect of bootstrap aggregation on networks inferred with commonly applied network inference methods in terms of stability (resilience to perturbations in the underlying expression data), a metric for accuracy, and functional enrichment of edge interactions. Conclusion: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement in accuracy as assessed by each method’s ability to link genes in the same functional pathway.
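Bootstrap aggregation of an inferred network can be sketched as follows. The abstract does not name the inference methods here, so a simple Pearson-correlation threshold stands in for them; the function names, thresholds, number of replicates and toy data are all hypothetical choices for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def infer_edges(expression, sample_idx, threshold):
    """Stand-in inference method: keep gene pairs with |r| >= threshold."""
    genes = sorted(expression)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            x = [expression[g][k] for k in sample_idx]
            y = [expression[h][k] for k in sample_idx]
            if abs(pearson(x, y)) >= threshold:
                edges.add((g, h))
    return edges


def bagged_network(expression, n_boot=200, threshold=0.8,
                   min_support=0.9, seed=0):
    """Keep edges that recur in at least min_support of bootstrap replicates."""
    rng = random.Random(seed)
    n_samples = len(next(iter(expression.values())))
    counts = {}
    for _ in range(n_boot):
        # Resample the samples (columns) with replacement.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        for edge in infer_edges(expression, idx, threshold):
            counts[edge] = counts.get(edge, 0) + 1
    support = {edge: c / n_boot for edge, c in counts.items()}
    return {edge: s for edge, s in support.items() if s >= min_support}


# Hypothetical toy data: g2 tracks g1 exactly, g3 is only weakly related.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "g2": [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0],
    "g3": [5.0, 1.0, 4.0, 2.0, 5.0, 1.0, 4.0, 2.0],
}
stable_edges = bagged_network(expr)
```

Edges that appear only in a few resampled data sets receive low support and are dropped, which is the intuition behind the stability gain reported in the abstract.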
Affiliation(s)
- Sean M Colby
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Ryan S McClure
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Christopher C Overall
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA; Present Address: Center for Brain Immunology and Glia, University of Virginia, Charlottesville, Virginia, USA
- Ryan S Renslow
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Jason E McDermott
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA

40
Liu L, Lu Y, Wei L, Yu H, Cao Y, Li Y, Yang N, Song Y, Liang C, Wang T. Transcriptomics analyses reveal the molecular roadmap and long non-coding RNA landscape of sperm cell lineage development. The Plant Journal: for Cell and Molecular Biology 2018; 96:421-437. [PMID: 30047180 DOI: 10.1111/tpj.14041] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 03/26/2018] [Accepted: 07/19/2018] [Indexed: 06/08/2023]
Abstract
Sperm cell (SC) lineage development from the haploid microspore to SCs represents a unique biological process in which the microspore generates a larger vegetative cell (VC) and a smaller generative cell (GC) enclosed in the VC, then the GC further develops to functionally specified SCs in the VC for double fertilization. Understanding the mechanisms of SC lineage development remains a critical goal in plant biology. We isolated individual cells of the three cell types, and characterized the genome-wide atlas of long non-coding (lnc) RNAs and mRNAs of haploid SC lineage cells. Sperm cell lineage development involves global repression of genes for pluripotency, somatic development and metabolism following asymmetric microspore division and coordinated upregulation of GC/SC preferential genes. This process is accompanied by progressive loss of the active marks H3K4me3 and H3K9ac, and accumulation of the repressive methylation mark H3K9. The SC lineage has a higher ratio of lncRNAs to mRNAs and preferentially expresses a larger percentage of lncRNAs than does the non-SC lineage. A co-expression network showed that the largest set of lncRNAs in these nodes, with more than 100 links, are GC-preferential, and a small proportion of lncRNAs co-express with their neighboring genes. Single molecular fluorescence in situ hybridization showed that several candidate genes may be markers distinguishing the three cell types of the SC lineage. Our findings reveal the molecular programming and potential roles of lncRNAs in SC lineage development.
Affiliation(s)
- Lingtong Liu
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Yunlong Lu
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Liqin Wei
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Hua Yu
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
  - Research Center for Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Yinghao Cao
  - Research Center for Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Yan Li
  - Research Center for Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Ning Yang
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Yunyun Song
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Chengzhi Liang
  - Research Center for Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Tai Wang
  - Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China

41
Inference of Genome-Scale Gene Regulatory Networks: Are There Differences in Biological and Clinical Validations? Machine Learning and Knowledge Extraction 2018. [DOI: 10.3390/make1010008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/14/2022]
Abstract
Causal networks, e.g., gene regulatory networks (GRNs) inferred from gene expression data, contain a wealth of information but are defying simple, straightforward and low-budget experimental validations. In this paper, we elaborate on this problem and discuss distinctions between biological and clinical validations. As a result, validation differences for GRNs reflect known differences between basic biological and clinical research questions making the validations context specific. Hence, the meaning of biologically and clinically meaningful GRNs can be very different. For a concerted approach to a problem of this size, we suggest the establishment of the HUMAN GENE REGULATORY NETWORK PROJECT which provides the information required for biological and clinical validations alike.

42
Baltakys K, Kanniainen J, Emmert-Streib F. Multilayer Aggregation with Statistical Validation: Application to Investor Networks. Sci Rep 2018; 8:8198. [PMID: 29844512 PMCID: PMC5974194 DOI: 10.1038/s41598-018-26575-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 11/03/2017] [Accepted: 05/16/2018] [Indexed: 11/26/2022]
Abstract
Multilayer networks are attracting growing attention in many fields, including finance. In this paper, we develop a new tractable procedure for multilayer aggregation based on statistical validation, which we apply to investor networks. Moreover, we propose two other improvements to their analysis: transaction bootstrapping and investor categorization. The aggregation procedure can be used to integrate security-wise and time-wise information about investor trading networks, but it is not limited to finance. In fact, it can be used for different applications, such as gene, transportation, and social networks, whether they are inferred or observable. Additionally, in the investor network inference, we use transaction bootstrapping for better statistical validation. Investor categorization allows for constant-size networks and more observations for each node, which is important in the inference, especially for less liquid securities. Furthermore, we observe that the window size used for averaging has a substantial effect on the number of inferred relationships. We apply this procedure by analyzing a unique data set of Finnish shareholders during the period 2004-2009. We find that households in the capital have high centrality in investor networks, which, under the theory of information channels in investor networks, suggests that they are well-informed investors.
Affiliation(s)
- Kęstutis Baltakys
  - Laboratory of Industrial and Information Management, Tampere University of Technology, Tampere, Finland
- Juho Kanniainen
  - Laboratory of Industrial and Information Management, Tampere University of Technology, Tampere, Finland
- Frank Emmert-Streib
  - Predictive Medicine and Data Analytics Lab, Faculty of Biomedical Sciences and Engineering, Tampere University of Technology, Tampere, Finland
  - Institute of Biosciences and Medical Technology, Tampere, Finland

43
Kawalia SB, Raschka T, Naz M, de Matos Simoes R, Senger P, Hofmann-Apitius M. Analytical Strategy to Prioritize Alzheimer's Disease Candidate Genes in Gene Regulatory Networks Using Public Expression Data. J Alzheimers Dis 2018; 59:1237-1254. [PMID: 28800327 PMCID: PMC5611835 DOI: 10.3233/jad-170011] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022]
Abstract
Alzheimer’s disease (AD) progressively destroys cognitive abilities in the aging population with tremendous effects on memory. Despite recent progress in understanding the underlying mechanisms, high drug attrition rates have called into question our knowledge of its etiology. Re-evaluation of past studies could help us to elucidate molecular-level details of this disease. Several methods exist to infer gene regulatory networks (GRNs), but most of them do not elaborate on context specificity and completeness of the generated networks, missing out on lesser-known candidates. In this study, we present a novel strategy that corroborates common mechanistic patterns across large-scale AD gene expression studies and further prioritizes potential biomarker candidates. To infer GRNs, we applied an optimized version of the BC3Net algorithm, named BC3Net10, capable of deriving robust and coherent patterns. In principle, this approach initially leverages the power of literature knowledge to extract AD-specific genes for generating viable networks. Our findings suggest that AD GRNs show significant enrichment for key signaling mechanisms involved in neurotransmission. Among the prioritized genes, well-known AD genes were prominent in synaptic transmission, implicated in cognitive deficits. Moreover, less intensively studied AD candidates (STX2, HLA-F, HLA-C, RAB11FIP4, ARAP3, AP2A2, ATP2B4, ITPR2, and ATP2A3) are also involved in neurotransmission, providing new insights into the underlying mechanism. To our knowledge, this is the first study to generate knowledge-instructed GRNs that demonstrates an effective way of combining literature-based knowledge and data-driven analysis to identify lesser-known candidates embedded in stable and robust functional patterns across disparate datasets.
Affiliation(s)
- Shweta Bagewadi Kawalia
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany; Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
- Tamara Raschka
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany; University of Applied Sciences Koblenz, RheinAhrCampus, Remagen, Germany
- Mufassra Naz
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany; Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
- Philipp Senger
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
- Martin Hofmann-Apitius
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany; Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
44
Yu H, Jiao B, Lu L, Wang P, Chen S, Liang C, Liu W. NetMiner-an ensemble pipeline for building genome-wide and high-quality gene co-expression network using massive-scale RNA-seq samples. PLoS One 2018; 13:e0192613. [PMID: 29425247 PMCID: PMC5806890 DOI: 10.1371/journal.pone.0192613] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 08/21/2017] [Accepted: 01/27/2018] [Indexed: 01/10/2023] Open
Abstract
Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and varied phenotypes. The recent availability of high-throughput RNA-seq has made genome-wide detection and quantification of novel, rare, and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression networks have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building a genome-scale, high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN in one monocot species, rice. The quality of the network obtained by our method was verified and evaluated against curated gene functional association data sets, on which it clearly outperformed each single method. In addition, the network's capability for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method for predicting the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes, and circular RNA (circRNA) genes. Our results provide a valuable and highly reliable data source for selecting key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
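The abstract does not say how NetMiner combines its three inference algorithms. As a minimal sketch of one common ensemble strategy, the Python below averages per-method edge ranks. The function name `aggregate_edge_ranks`, the method labels, and the toy scores are all hypothetical and are not taken from the NetMiner source code.

```python
from collections import defaultdict

def aggregate_edge_ranks(score_tables):
    """Average-rank ensemble over several edge-score dictionaries.

    score_tables maps a method label to {edge: score}, where a higher
    score means stronger support. An edge missing from a method is
    assigned that method's worst rank plus one.
    Returns a list of (edge, mean_rank) sorted from best to worst.
    """
    all_edges = set()
    for scores in score_tables.values():
        all_edges.update(scores)

    rank_sums = defaultdict(float)
    for scores in score_tables.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        ranks = {edge: i + 1 for i, edge in enumerate(ordered)}
        worst = len(ordered) + 1
        for edge in all_edges:
            rank_sums[edge] += ranks.get(edge, worst)

    n_methods = len(score_tables)
    mean_ranks = {edge: total / n_methods for edge, total in rank_sums.items()}
    # Sort by mean rank, breaking ties by edge name for determinism.
    return sorted(mean_ranks.items(), key=lambda item: (item[1], item[0]))

# Toy example: three hypothetical methods scoring undirected edges.
toy_scores = {
    "method_a": {("g1", "g2"): 0.9, ("g1", "g3"): 0.4, ("g2", "g3"): 0.1},
    "method_b": {("g1", "g2"): 0.8, ("g2", "g3"): 0.7},
    "method_c": {("g1", "g2"): 0.6, ("g1", "g3"): 0.5, ("g2", "g3"): 0.2},
}
consensus = aggregate_edge_ranks(toy_scores)
```

Working on ranks rather than raw scores sidesteps the fact that different inference methods report scores on incomparable scales; whether NetMiner itself does this is an assumption of the sketch.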
Affiliation(s)
- Hua Yu
- Nantong Medical College and School of Pharmacy, Nantong University, Nantong, China
- State Key Laboratory of Plant Genomics, Institute of Genetic and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Bingke Jiao
- State Key Laboratory of Plant Genomics, Institute of Genetic and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Lu Lu
- Nantong Polytechnic College, Nantong, China
- Pengfei Wang
- Nantong Medical College and School of Pharmacy, Nantong University, Nantong, China
- Shuangcheng Chen
- Nantong Medical College and School of Pharmacy, Nantong University, Nantong, China
- Chengzhi Liang
- State Key Laboratory of Plant Genomics, Institute of Genetic and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Wei Liu
- Nantong Medical College and School of Pharmacy, Nantong University, Nantong, China
45
Weishaupt H, Johansson P, Engström C, Nelander S, Silvestrov S, Swartling FJ. Loss of Conservation of Graph Centralities in Reverse-engineered Transcriptional Regulatory Networks. Methodol Comput Appl Probab 2017. [DOI: 10.1007/s11009-017-9554-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/02/2023]
46
Jung HC, Kim SH, Lee JH, Kim JH, Han SW. Gene Regulatory Network Analysis for Triple-Negative Breast Neoplasms by Using Gene Expression Data. J Breast Cancer 2017; 20:240-245. [PMID: 28970849 PMCID: PMC5620438 DOI: 10.4048/jbc.2017.20.3.240] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/20/2017] [Accepted: 07/31/2017] [Indexed: 01/22/2023] Open
Abstract
Purpose: To better identify the physiology of triple-negative breast neoplasm (TNBN), we analyzed the TNBN gene regulatory network using gene expression data. Methods: We collected TNBN gene expression data from The Cancer Genome Atlas to construct a TNBN gene regulatory network using least absolute shrinkage and selection operator regression. In addition, we constructed a triple-positive breast neoplasm (TPBN) network for comparison. Furthermore, survival analysis based on gene expression levels and differentially expressed gene (DEG) analysis were carried out to support and compare the network analysis results, respectively. Results: The TNBN gene regulatory network, which followed a power-law distribution, had 10,237 vertices and 17,773 edges, with an average vertex-to-vertex distance of 8.6. The genes ZDHHC20 and RAPGEF6 were identified by centrality analysis to be important vertices. However, in the DEG analysis, we could not find meaningful fold changes in ZDHHC20 and RAPGEF6 between the TPBN and TNBN gene expression data. In the multivariate survival analysis, the hazard ratios for ZDHHC20 and RAPGEF6 were 1.677 (1.192–2.357) and 1.676 (1.222–2.299), respectively. Conclusion: Our TNBN gene regulatory network was a scale-free one, which means that the network would be easily destroyed if the hub vertices were attacked. Thus, it is important to identify the hub vertices in the network analysis. In the TNBN gene regulatory network, ZDHHC20 and RAPGEF6 were found to be oncogenes. Further study of these genes could help to reveal a novel method for treating TNBN in the future.
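The Methods sentence above names least absolute shrinkage and selection operator (LASSO) regression as the network construction step. One common way to use it, sketched below under stated assumptions, is to regress each gene on all other genes and keep the predictors with nonzero coefficients as edges. The penalty value, the function names, and the synthetic data are illustrative choices, not details from the paper.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + alpha * ||b||_1 by cyclic updates."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    col_norms = (X ** 2).sum(axis=0) / n_samples
    for _ in range(n_iter):
        for j in range(n_features):
            if col_norms[j] == 0.0:
                continue
            # Partial residual with feature j's contribution added back.
            residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ residual / n_samples
            beta[j] = soft_threshold(rho, alpha) / col_norms[j]
    return beta

def lasso_network(expression, alpha=0.1):
    """Return W where W[i, j] != 0 means gene j was selected as a
    predictor of gene i. Columns are standardised first."""
    Z = (expression - expression.mean(axis=0)) / expression.std(axis=0)
    n_genes = Z.shape[1]
    W = np.zeros((n_genes, n_genes))
    for target in range(n_genes):
        predictors = [g for g in range(n_genes) if g != target]
        W[target, predictors] = lasso_coordinate_descent(
            Z[:, predictors], Z[:, target], alpha
        )
    return W

# Synthetic data (hypothetical): gene 2 is driven by gene 0,
# while genes 1 and 3 are generated independently.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))
data[:, 2] = 1.5 * data[:, 0] + 0.3 * rng.normal(size=200)
W = lasso_network(data, alpha=0.1)
edges = [(i, j) for i in range(4) for j in range(4) if W[i, j] != 0.0]
```

The L1 penalty drives most coefficients exactly to zero, which is what makes the resulting network sparse; a larger `alpha` yields fewer edges.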
Affiliation(s)
- Hee Chan Jung
- Department of Internal Medicine, Eulji University College of Medicine, Seoul, Korea
- Sung Hwan Kim
- Department of Statistics, Keimyung University, Daegu, Korea
- Jeong Hoon Lee
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
- Ju Han Kim
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
- Sung Won Han
- Division of Fusion Data Analytics Laboratory, School of Industrial Management Engineering, Korea University, Seoul, Korea
47
Tripathi S, Lloyd-Price J, Ribeiro A, Yli-Harja O, Dehmer M, Emmert-Streib F. sgnesR: An R package for simulating gene expression data from an underlying real gene network structure considering delay parameters. BMC Bioinformatics 2017; 18:325. [PMID: 28676075 PMCID: PMC5496254 DOI: 10.1186/s12859-017-1731-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/23/2016] [Accepted: 06/15/2017] [Indexed: 01/04/2023] Open
Abstract
Background: sgnesR (Stochastic Gene Network Expression Simulator in R) is an R package that provides an interface to simulate gene expression data from a given gene network using the stochastic simulation algorithm (SSA). The package allows various options for delay parameters, which can easily be included in reactions for promoter delay, RNA delay, and protein delay. A user can tune these parameters to model various types of reactions within a cell. As examples, we present two network models to generate expression profiles. We also demonstrate the inference of networks and the evaluation of association measures of edge and non-edge components from the generated expression profiles. Results: The purpose of sgnesR is to enable an easy-to-use and quick implementation for generating realistic gene expression data from biologically relevant networks that can be user selected. Conclusions: sgnesR is freely available for academic use. The R package has been tested for R 3.2.0 under Linux, Windows, and Mac OS X. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1731-8) contains supplementary material, which is available to authorized users.
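For readers unfamiliar with the stochastic simulation algorithm mentioned above, the following is a minimal Python sketch of the Gillespie direct method for a single gene with mRNA and protein. It does not reproduce the sgnesR interface and omits the delay parameters the package supports; the reaction set, rate constants, and function name are assumptions made for illustration.

```python
import random

def gillespie_gene_expression(k_tx, k_tl, d_m, d_p, t_end, seed=0):
    """Direct-method SSA for mRNA (m) and protein (p) counts of one gene.

    Reactions: 0 -> m      (rate k_tx)
               m -> m + p  (rate k_tl * m)
               m -> 0      (rate d_m * m)
               p -> 0      (rate d_p * p)
    Returns a list of (time, m, p) states, one per reaction event.
    """
    rng = random.Random(seed)
    t, m, p = 0.0, 0, 0
    trajectory = [(t, m, p)]
    while True:
        propensities = [k_tx, k_tl * m, d_m * m, d_p * p]
        total = sum(propensities)
        if total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the reaction with probability proportional to its propensity.
        reaction = rng.choices(range(4), weights=propensities)[0]
        if reaction == 0:
            m += 1
        elif reaction == 1:
            p += 1
        elif reaction == 2:
            m -= 1
        else:
            p -= 1
        trajectory.append((t, m, p))
    return trajectory

# Hypothetical rate constants chosen only to produce a short run.
trajectory = gillespie_gene_expression(
    k_tx=2.0, k_tl=5.0, d_m=1.0, d_p=0.5, t_end=20.0
)
```

Adding delays, as sgnesR does, would mean scheduling a product to appear some time after its reaction fires instead of updating the counts immediately.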
Affiliation(s)
- Shailesh Tripathi
- Predictive Medicine and Data Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Jason Lloyd-Price
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Harvard University, Boston, USA; Laboratory of Biosystem Dynamics, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Andre Ribeiro
- Laboratory of Biosystem Dynamics, Department of Signal Processing, Tampere University of Technology, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
- Olli Yli-Harja
- Institute of Biosciences and Medical Technology, Tampere, Finland; Computational Systems Biology, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Matthias Dehmer
- Institute for Theoretical Informatics, Mathematics and Operations Research, Department of Computer Science, Universität der Bundeswehr München, Munich, Germany
- Frank Emmert-Streib
- Predictive Medicine and Data Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
48
Monneret G, Jaffrézic F, Rau A, Zerjal T, Nuel G. Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 2017; 12:e0171142. [PMID: 28301504 PMCID: PMC5354375 DOI: 10.1371/journal.pone.0171142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/17/2016] [Accepted: 01/01/2017] [Indexed: 11/29/2022] Open
Abstract
Causal network inference is an important methodological challenge in biology as well as other areas of application. Although several causal network inference methods have been proposed in recent years, they are typically applicable for only a small number of genes, due to the large number of parameters to be estimated and the limited number of biological replicates available. In this work, we consider the specific case of transcriptomic studies made up of both observational and interventional data in which a single gene of biological interest is knocked out. We focus on a marginal causal estimation approach, based on the framework of Gaussian directed acyclic graphs, to infer causal relationships between the knocked-out gene and a large set of other genes. In a simulation study, we found that our proposed method accurately differentiates between downstream causal relationships and those that are upstream or simply associative. It also enables an estimation of the total causal effects between the gene of interest and the remaining genes. Our method performed very similarly to a classical differential analysis for experiments with a relatively large number of biological replicates, but has the advantage of providing a formal causal interpretation. Our proposed marginal causal approach is computationally efficient and may be applied to several thousands of genes simultaneously. In addition, it may help highlight subsets of genes of interest for a more thorough subsequent causal network inference. The method is implemented in an R package called MarginalCausality (available on GitHub).
Affiliation(s)
- Gilles Monneret
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- LPMA, UMR CNRS 7599, UPMC, Sorbonne Universités, 4 place Jussieu, 75005 Paris, France
- Florence Jaffrézic
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- Andrea Rau
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- Tatiana Zerjal
- UMR GABI, AgroParisTech, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- Grégory Nuel
- LPMA, UMR CNRS 7599, UPMC, Sorbonne Universités, 4 place Jussieu, 75005 Paris, France
49
Wouters J, Kalender Atak Z, Aerts S. Decoding transcriptional states in cancer. Curr Opin Genet Dev 2017; 43:82-92. [PMID: 28129557 DOI: 10.1016/j.gde.2017.01.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/09/2016] [Revised: 01/05/2017] [Accepted: 01/09/2017] [Indexed: 12/27/2022]
Abstract
Gene regulatory networks determine cellular identity. In cancer, aberrations of gene networks are caused by driver mutations that often affect transcription factors and chromatin modifiers. Nevertheless, gene transcription in cancer follows the same cis-regulatory rules as normal cells, and cancer cells have served as convenient model systems to study transcriptional regulation. Tumours often show regulatory heterogeneity, with subpopulations of cells in different transcriptional states, which has important therapeutic implications. Here, we review recent experimental and computational techniques to reverse engineer cancer gene networks using transcriptome and epigenome data. New algorithms, data integration strategies, and increasing amounts of single cell genomics data provide exciting opportunities to model dynamic regulatory states at unprecedented resolution.
Affiliation(s)
- Jasper Wouters
- Laboratory of Computational Biology, VIB Center for Brain & Disease Research, Leuven, Belgium; Department of Human Genetics, KU Leuven (University of Leuven), Leuven, Belgium
- Zeynep Kalender Atak
- Laboratory of Computational Biology, VIB Center for Brain & Disease Research, Leuven, Belgium; Department of Human Genetics, KU Leuven (University of Leuven), Leuven, Belgium
- Stein Aerts
- Laboratory of Computational Biology, VIB Center for Brain & Disease Research, Leuven, Belgium; Department of Human Genetics, KU Leuven (University of Leuven), Leuven, Belgium
50
Guo S, Jiang Q, Chen L, Guo D. Gene regulatory network inference using PLS-based methods. BMC Bioinformatics 2016; 17:545. [PMID: 28031031 PMCID: PMC5192600 DOI: 10.1186/s12859-016-1398-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 04/12/2016] [Accepted: 12/01/2016] [Indexed: 12/12/2022] Open
Abstract
Background: Inferring the topology of gene regulatory networks (GRNs) from microarray gene expression data has many potential applications, such as identifying candidate drug targets and providing valuable insights into biological processes. It remains a challenge because the data are noisy and high dimensional, and there is a large number of potential interactions. Results: We introduce an ensemble gene regulatory network inference method, PLSNET, which decomposes the GRN inference problem with p genes into p subproblems and solves each subproblem using a partial least squares (PLS)-based feature selection algorithm. A statistical technique is then used to refine the predictions. The proposed method was evaluated on the DREAM4 and DREAM5 benchmark datasets and achieved higher accuracy than the winners of those competitions and other state-of-the-art GRN inference methods. Conclusions: The superior accuracy achieved on different benchmark datasets, including both in silico and in vivo networks, shows that PLSNET reaches state-of-the-art performance. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1398-6) contains supplementary material, which is available to authorized users.
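The decomposition described above (one regression subproblem per target gene, each scored with a PLS-based feature selector) can be sketched as follows. This is an illustrative approximation only: it uses a standard PLS1 NIPALS fit with VIP-style importance scores, which may differ from the scoring PLSNET actually uses, and it omits both the ensemble resampling and the statistical refinement step. The function names, component count, and synthetic data are assumptions.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """VIP-style importance of each column of X for predicting y,
    computed from a PLS1 fit via NIPALS."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n_features = X.shape[1]
    weights, explained = [], []
    for _ in range(n_components):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = w / norm
        t = X @ w                           # latent score vector
        tt = t @ t
        q = (y @ t) / tt                    # regression of y on the score
        X = X - np.outer(t, X.T @ t / tt)   # deflate predictors
        y = y - q * t                       # deflate response
        weights.append(w)
        explained.append(q * q * tt)        # y variation captured by this component
    if not weights:
        return np.zeros(n_features)
    W = np.array(weights)                   # shape (components, features)
    s = np.array(explained)
    return np.sqrt(n_features * (s @ (W ** 2)) / s.sum())

def infer_network(expression, n_components=2):
    """Solve one subproblem per target gene; scores[i, j] is the importance
    of gene j as a candidate regulator of gene i."""
    n_genes = expression.shape[1]
    scores = np.zeros((n_genes, n_genes))
    for target in range(n_genes):
        regulators = [g for g in range(n_genes) if g != target]
        scores[target, regulators] = pls1_vip(
            expression[:, regulators], expression[:, target], n_components
        )
    return scores

# Synthetic data (hypothetical): gene 2 is driven by gene 0.
rng = np.random.default_rng(1)
data = rng.normal(size=(100, 5))
data[:, 2] = 2.0 * data[:, 0] + 0.2 * rng.normal(size=100)
scores = infer_network(data)
```

Ranking all off-diagonal entries of `scores` gives a candidate edge list; splitting the problem by target gene keeps each subproblem small and lets the p fits run independently.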
Affiliation(s)
- Shun Guo
- Department of Electronic Engineering, Xiamen University, Fujian, 361005, China; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
- Qingshan Jiang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
- Lifei Chen
- School of Mathematics and Computer Science, Fujian Normal University, Fujian, 350117, China
- Donghui Guo
- Department of Electronic Engineering, Xiamen University, Fujian, 361005, China