51
A gene expression atlas for different kinds of stress in the mouse brain. Sci Data 2020; 7:437. [PMID: 33328476 PMCID: PMC7744580 DOI: 10.1038/s41597-020-00772-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 02/27/2020] [Accepted: 11/25/2020] [Indexed: 12/17/2022]
Abstract
Stressful experiences are part of everyday life and animals have evolved physiological and behavioral responses aimed at coping with stress and maintaining homeostasis. However, repeated or intense stress can induce maladaptive reactions leading to behavioral disorders. Adaptations in the brain, mediated by changes in gene expression, have a crucial role in the stress response. Recent years have seen a tremendous increase in studies on the transcriptional effects of stress. The input raw data are freely available from public repositories and represent a wealth of information for further global and integrative retrospective analyses. We downloaded from the Sequence Read Archive 751 samples (SRA-experiments), from 18 independent BioProjects studying the effects of different stressors on the brain transcriptome in mice. We performed a massive bioinformatics re-analysis applying a single, standardized pipeline for computing differential gene expression. This data mining allowed the identification of novel candidate stress-related genes and specific signatures associated with different stress conditions. The large amount of computational results produced was systematized in the interactive “Stress Mice Portal”.
52
Fernández LP, Gómez de Cedrón M, Ramírez de Molina A. Alterations of Lipid Metabolism in Cancer: Implications in Prognosis and Treatment. Front Oncol 2020; 10:577420. [PMID: 33194695 PMCID: PMC7655926 DOI: 10.3389/fonc.2020.577420] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Received: 06/29/2020] [Accepted: 09/14/2020] [Indexed: 01/06/2023]
Abstract
Cancer remains the second leading cause of mortality worldwide. In the course of this multistage and multifactorial disease, a set of alterations takes place, with genetic and environmental factors modulating tumorigenesis and disease progression. Metabolic alterations of tumors are well recognized and are considered one of the hallmarks of cancer. Cancer cells adapt their metabolic competences in order to efficiently supply their novel demands of energy to sustain cell proliferation and metastasis. At present, there is a growing interest in understanding the metabolic switch that occurs during tumorigenesis. Together with the Warburg effect and the increased glutaminolysis, lipid metabolism has emerged as essential for tumor development and progression. Indeed, several investigations have demonstrated the consequences of lipid metabolism alterations in cell migration, invasion, and angiogenesis, three basic steps occurring during metastasis. In addition, obesity and associated metabolic alterations have been shown to augment the risk of cancer and to worsen its prognosis. Consequently, an extensive collection of tumorigenic steps has been shown to be modulated by lipid metabolism, not only affecting the growth of primary tumors, but also mediating progression and metastasis. Besides, key enzymes involved in lipid-metabolic pathways have been associated with cancer survival and have been proposed as prognostic biomarkers of cancer. In this review, we will analyze the impact of obesity and related tumor microenvironment alterations as modifiable risk factors in cancer, focusing on the lipid alterations co-occurring during tumorigenesis. The value of precision technologies and their application to target lipid metabolism in cancer will also be discussed. The degree to which lipid alterations, together with current therapies and intake of specific dietary components, affect risk of cancer is now under investigation, and innovative therapeutic or preventive applications must be explored.
Affiliation(s)
- Lara P Fernández
  - Precision Nutrition and Cancer Program, Molecular Oncology Group, IMDEA Food Institute, Campus of International Excellence (CEI) University Autonomous of Madrid (UAM) + CSIC, Madrid, Spain
- Marta Gómez de Cedrón
  - Precision Nutrition and Cancer Program, Molecular Oncology Group, IMDEA Food Institute, Campus of International Excellence (CEI) University Autonomous of Madrid (UAM) + CSIC, Madrid, Spain
- Ana Ramírez de Molina
  - Precision Nutrition and Cancer Program, Molecular Oncology Group, IMDEA Food Institute, Campus of International Excellence (CEI) University Autonomous of Madrid (UAM) + CSIC, Madrid, Spain
53
Sielemann K, Hafner A, Pucker B. The reuse of public datasets in the life sciences: potential risks and rewards. PeerJ 2020; 8:e9954. [PMID: 33024631 PMCID: PMC7518187 DOI: 10.7717/peerj.9954] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022]
Abstract
The 'big data' revolution has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the associated challenges, limitations and risks. Possible solutions to issues and research integrity considerations are also discussed. Due to the prominence, abundance and wide distribution of sequencing data, we focus on the reuse of publicly available sequence datasets. We define 'successful reuse' as the use of previously published data to enable novel scientific findings. By using selected examples of successful reuse from different disciplines, we illustrate the enormous potential of the practice, while acknowledging the respective limitations and risks. A checklist to determine the reuse value and potential of a particular dataset is also provided. The open discussion of data reuse and the establishment of this practice as a norm has the potential to benefit all stakeholders in the life sciences.
Affiliation(s)
- Katharina Sielemann
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
  - Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, Germany
- Alenka Hafner
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
  - Current Affiliation: Intercollege Graduate Degree Program in Plant Biology, Penn State University, University Park, State College, PA, United States of America
- Boas Pucker
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
  - Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
54
Hounkpe BW, Benatti RDO, Carvalho BDS, De Paula EV. Identification of common and divergent gene expression signatures in patients with venous and arterial thrombosis using data from public repositories. PLoS One 2020; 15:e0235501. [PMID: 32780732 PMCID: PMC7418995 DOI: 10.1371/journal.pone.0235501] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/03/2019] [Accepted: 06/17/2020] [Indexed: 12/31/2022]
Abstract
STRENGTHS AND LIMITATIONS OF THIS STUDY
- Our results represent the first comparison of venous and arterial thrombosis at the transcriptomic level.
- Our main result was the demonstration that immunothrombosis pathways are important to the pathophysiology of these conditions, also at the transcriptomic level.
- A specific signature for venous and arterial thrombosis was described, and validated in independent cohorts.
- The limited number of public repositories with gene expression data from patients with venous thromboembolism limits the representation of these patients in our analyses.
- In order to gather a meaningful number of studies with gene expression data we had to include patients in different time-points since the index thrombotic event, which might have increased the heterogeneity of our population.
Affiliation(s)
- Benilton de Sá Carvalho
  - Department of Statistics, Institute of Mathematics, Statistics and Scientific Computing, University of Campinas, Campinas, SP, Brazil
- Erich Vinicius De Paula
  - School of Medical Sciences, University of Campinas, Campinas, SP, Brazil
  - Hematology and Hemotherapy Center, University of Campinas, Campinas, SP, Brazil
55
Microarray Normalization Revisited for Reproducible Breast Cancer Biomarkers. BIOMED RESEARCH INTERNATIONAL 2020; 2020:1363827. [PMID: 32832541 PMCID: PMC7428878 DOI: 10.1155/2020/1363827] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/04/2019] [Revised: 03/30/2020] [Accepted: 05/11/2020] [Indexed: 11/21/2022]
Abstract
Precision medicine for breast cancer relies on biomarkers to select therapies. However, the reliability of biomarkers drawn from gene expression arrays has been questioned and calls for reassessment, in particular for large datasets. We revisit widely used data-normalization procedures and evaluate differences in outcome in order to pinpoint the most reliable reprocessing methods biomarkers can be based upon. We generated a database of 3753 breast cancer patients out of 38 studies by downloading and curating patient samples from NCBI-GEO. As gene-expression biomarkers, we select the assessment of receptor status and breast cancer subtype classification. Each normalization procedure is applied separately, and biomarkers are then evaluated for each patient. Differences between normalization pipelines are quantified as percentages of patients having outcomes different for each pipeline. Some normalization procedures lead to quite consistent biomarkers, differing only in 1-2% of patients. Other normalization procedures, some of which have been used in many clinical studies, end up with discrepancies large enough to undermine trust (10% and more). A good deal of doubt regarding the reliability of microarrays may be rooted in the haphazard application of inadequate preprocessing pipelines. Several modes of batch correction are evaluated regarding a possible improvement of receptor prediction from gene expression versus the gold standard of immunohistochemistry. Finally, we nominate those normalization methods yielding consistent and trustworthy results. Adequate bioinformatics data preprocessing is crucial for any subsequent statistics to arrive at trustworthy results. We conclude with a suggestion for future bioinformatics development to further increase the reliability of cancer biomarkers.
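The comparison metric described in this abstract (the percentage of patients whose biomarker outcome differs between two normalization pipelines) can be illustrated with a minimal sketch. The function names, pipeline labels, and the ten receptor-status calls below are hypothetical and are not taken from the paper; the sketch only assumes that each pipeline yields one categorical call per patient, in the same patient order.

```python
from itertools import combinations


def discordance_percent(calls_a, calls_b):
    """Percentage of patients whose biomarker call differs between two pipelines."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both pipelines must cover the same patients, in the same order")
    if not calls_a:
        raise ValueError("no patients to compare")
    differing = sum(a != b for a, b in zip(calls_a, calls_b))
    return 100.0 * differing / len(calls_a)


def pairwise_discordance(calls_by_pipeline):
    """Discordance percentage for every unordered pair of pipelines."""
    names = sorted(calls_by_pipeline)
    return {
        (p, q): discordance_percent(calls_by_pipeline[p], calls_by_pipeline[q])
        for p, q in combinations(names, 2)
    }


# Hypothetical receptor-status calls for ten patients (illustrative data only).
calls = {
    "pipeline_A": ["pos", "pos", "neg", "pos", "neg", "pos", "pos", "neg", "pos", "pos"],
    "pipeline_B": ["pos", "pos", "neg", "pos", "pos", "pos", "pos", "neg", "pos", "pos"],
}
result = pairwise_discordance(calls)
print(result)
```

In this toy input one of the ten calls differs, so the pair is 10% discordant, which is the order of magnitude the abstract reports for the less consistent procedures.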
56
Chen P, Song M, Wang Y, Deng S, Hong W, Zhang X, Yu B. Identification of key genes of human bone marrow stromal cells adipogenesis at an early stage. PeerJ 2020; 8:e9484. [PMID: 32742785 PMCID: PMC7380279 DOI: 10.7717/peerj.9484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2020] [Accepted: 06/15/2020] [Indexed: 11/20/2022]
Abstract
Background: Bone marrow adipocytes (BMAs), closely associated with bone degeneration, share common progenitors with the osteoblastic lineage. However, the intrinsic mechanism of cell fate commitment between BMA and osteogenic lineage remains unclear.
Methods: The publicly available Gene Expression Omnibus (GEO) dataset GSE107789 was downloaded and analyzed. Differentially expressed genes (DEGs) were analyzed using GEO2R. Functional and pathway enrichment analyses of Gene Ontology and Kyoto Encyclopedia of Genes and Genomes were conducted with The Database for Annotation, Visualization and Integrated Discovery and Gene set enrichment analysis software. A protein-protein interaction (PPI) network was obtained using the STRING database, then visualized and clustered with Cytoscape software. Transcriptional levels of key genes were verified by real-time quantitative PCR in vitro in bone marrow stromal cells (BMSCs) undergoing adipogenic differentiation at day 7 and in vivo in an ovariectomized mouse model.
Results: A total of 2,869 DEGs, including 1,357 up-regulated and 1,512 down-regulated ones, were screened out from the transcriptional profile of human BMSCs undergoing adipogenic induction at day 7 vs. day 0. Functional and pathway enrichment analysis, combined with module analysis of the PPI network, highlighted ACSL1, sphingosine 1-phosphate receptor 3 (S1PR3), ZBTB16 and glypican 3 as key genes up-regulated at the early stage of BMSC adipogenic differentiation. Furthermore, up-regulated mRNA expression levels of ACSL1, S1PR3 and ZBTB16 were confirmed both in vitro and in vivo.
Conclusion: ACSL1, S1PR3 and ZBTB16 may play crucial roles in early regulation of BMSC adipogenic differentiation.
Affiliation(s)
- Pengyu Chen
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Mingrui Song
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Yutian Wang
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Songyun Deng
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Weisheng Hong
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Xianrong Zhang
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Bin Yu
  - Department of Orthopaedics, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Bone and Cartilage Regenerative Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
57
Xie J, Xu Y, Chen H, Chi M, He J, Li M, Liu H, Xia J, Guan Q, Guo Z, Yan H. Identification of population-level differentially expressed genes in one-phenotype data. Bioinformatics 2020; 36:4283-4290. [PMID: 32428201 PMCID: PMC7520039 DOI: 10.1093/bioinformatics/btaa523] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/02/2020] [Revised: 04/15/2020] [Accepted: 05/14/2020] [Indexed: 01/01/2023]
Abstract
Motivation: For some specific tissues, such as the heart and brain, normal controls are difficult to obtain. Thus, studies with only a particular type of disease samples (one phenotype) cannot be analyzed using common methods, such as significance analysis of microarrays, edgeR and limma. The RankComp algorithm, which was mainly developed to identify individual-level differentially expressed genes (DEGs), can be applied to identify population-level DEGs for the one-phenotype data but cannot identify the dysregulation directions of DEGs.
Results: Here, we optimized the RankComp algorithm and termed the optimized version PhenoComp. Compared with RankComp, PhenoComp provided the dysregulation directions of DEGs and had more robust detection power in both simulated and real one-phenotype data. Moreover, using the DEGs detected by common methods as the ‘gold standard’, the results showed that the DEGs detected by PhenoComp using only one-phenotype data were comparable to those identified by common methods using case-control samples, independent of the measurement platform. PhenoComp also exhibited good performance for data with weak differential expression signals.
Availability and implementation: The PhenoComp algorithm is available on the web at https://github.com/XJJ-student/PhenoComp.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiajing Xie
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Yang Xu
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Haifeng Chen
  - Department of General Surgery, Fuzhou Second Hospital Affiliated to Xiamen University, Fuzhou 350007, China
- Meirong Chi
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Jun He
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Meifeng Li
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Hui Liu
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Jie Xia
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Qingzhou Guan
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Zheng Guo
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
- Haidan Yan
  - Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, The School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350122, China; Key Laboratory of Medical Bioinformatics, Fujian Province, Fuzhou 350122, China
58
Lei Z, Yu S, Ding Y, Liang J, Halifu Y, Xiang F, Zhang D, Wang H, Hu W, Li T, Wang Y, Zou X, Zhang K, Kang X. Identification of key genes and pathways involved in vitiligo development based on integrated analysis. Medicine (Baltimore) 2020; 99:e21297. [PMID: 32756109 PMCID: PMC7402735 DOI: 10.1097/md.0000000000021297] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/28/2022]
Abstract
Vitiligo is a chronic skin condition characterized by a lack of melanocytes. However, its aetiology and pathogenesis are still under debate. This study aimed to explore the key genes and pathways associated with the occurrence and development of vitiligo. Weighted gene coexpression network analysis (WGCNA) was applied to reanalyze the gene expression dataset GSE65127 systematically. Functional enrichment of these modules was carried out using gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), gene set variation analysis (GSVA), and gene set enrichment analysis (GSEA). Then, a map of the regulatory network was delineated according to pivot analysis and drug prediction. In addition, hub genes and crucial pathways were validated by an independent dataset GSE75819. The expression of hub genes in modules was also tested by quantitative real-time polymerase chain reaction (qRT-PCR). Eight coexpressed modules were identified by WGCNA based on 5794 differentially expressed genes of vitiligo. Three modules were found to be significantly correlated with Lesional, Peri-Lesional, and Non-Lesional, respectively. The persistent maladjusted genes included 269 upregulated genes and 82 downregulated genes. The enrichments showed module genes were implicated in immune response, p53 signaling pathway, etc. According to GSEA and GSVA, dysregulated pathways were activated incessantly from Non-Lesional to Peri-Lesional and then to Lesional, 4 of which were verified by an independent dataset GSE75819. Finally, 42 transcription factors and 228 drugs were spotted. Focusing on the persistent maladjusted genes, a map of the regulatory network was delineated. Hub genes (CACTIN, DCTN1, GPR143, HADH, MRPL47, NKTR, NUF2) and transcription factors (ITGAV, SYK, PDPK1) were validated by an independent dataset GSE75819. In addition, hub genes (CACTIN, DCTN1, GPR143, MRPL47, NKTR) were also confirmed by qRT-PCR. The present study might provide an integrated and in-depth insight for exploring the underlying mechanism of vitiligo and predicting potential diagnostic biomarkers and therapeutic targets.
Affiliation(s)
- Shirong Yu
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Yuan Ding
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Junqin Liang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Yilinuer Halifu
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Fang Xiang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Dezhi Zhang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Hongjuan Wang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Wen Hu
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Tingting Li
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Yunying Wang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Xuelian Zou
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Kunjie Zhang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
- Xiaojing Kang
  - Department of Dermatology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, Xinjiang, China
59
Bazile J, Jaffrezic F, Dehais P, Reichstadt M, Klopp C, Laloe D, Bonnet M. Molecular signatures of muscle growth and composition deciphered by the meta-analysis of age-related public transcriptomics data. Physiol Genomics 2020; 52:322-332. [PMID: 32657225 DOI: 10.1152/physiolgenomics.00020.2020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022]
Abstract
The lean-to-fat ratio is a major issue in the beef meat industry from both carcass and meat production perspectives. This industrial perspective has motivated meat physiologists to use transcriptomics technologies to decipher mechanisms behind fat deposition within muscle during the time course of muscle growth. However, a synthesis of the biological information in this volume of data remains to be produced to identify mechanisms shared across breeds and rearing practices. We conducted a meta-analysis on 10 transcriptomic data sets stored in public databases, from the longissimus thoracis of five different bovine breeds differing in age. We updated gene identifiers to the latest version of the bovine genome (UCD1.2), and the 715 genes common to the 10 studies were subjected to the meta-analysis. Among the 238 differentially expressed genes (DEGs), we identified a transcriptional signature of the dynamic regulation of glycolytic and oxidative metabolisms that agrees with a known shift between those two pathways from puberty onward. We proposed some master genes of myogenesis, namely MYOG and MAPK14, as probable regulators of the glycolytic and oxidative metabolisms. We also identified overexpressed genes related to lipid metabolism (APOE, LDLR, MXRA8, and HSP90AA1) that may contribute to the expected enhanced marbling as age increases. Lastly, we proposed a transcriptional signature related to the induction (YBX1) or repression (MAPK14, YWAH, ERBB2) of the commitment of myogenic progenitors into the adipogenic lineage. The relationships between the abundance of the identified mRNAs and marbling values remain to be analyzed from a marbling biomarker discovery perspective.
Affiliation(s)
- Jeanne Bazile
  - INRAE, UMR Herbivores, Université Clermont Auvergne, VetAgro Sup, Saint-Genès-Champanelle, France
- Florence Jaffrezic
  - INRAE, UMR1313 Génétique Animale et Biologie Intégrative, Jouy-en-Josas, France
- Patrice Dehais
  - Plate-forme bio-informatique Genotoul, Mathématiques et Informatique Appliquées de Toulouse, INRAE, Castanet Tolosan, France; SIGENAE, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Matthieu Reichstadt
  - INRAE, UMR Herbivores, Université Clermont Auvergne, VetAgro Sup, Saint-Genès-Champanelle, France
- Christophe Klopp
  - Plate-forme bio-informatique Genotoul, Mathématiques et Informatique Appliquées de Toulouse, INRAE, Castanet Tolosan, France; SIGENAE, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Denis Laloe
  - INRAE, UMR1313 Génétique Animale et Biologie Intégrative, Jouy-en-Josas, France
- Muriel Bonnet
  - INRAE, UMR Herbivores, Université Clermont Auvergne, VetAgro Sup, Saint-Genès-Champanelle, France
60
Reanalysis and integration of public microarray datasets reveals novel host genes modulated in leprosy. Mol Genet Genomics 2020; 295:1355-1368. [PMID: 32661593 DOI: 10.1007/s00438-020-01705-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/06/2020] [Accepted: 07/01/2020] [Indexed: 01/24/2023]
Abstract
Due to multiple hypothesis testing with often limited sample size, microarrays and other omics technologies can sometimes produce irreproducible findings. Complementary to better experimental design, reanalysis and integration of gene expression datasets may help overcome reproducibility issues by identifying consistent differentially expressed genes from independent studies. In this work, after a systematic search, nine microarray datasets evaluating host gene expression in leprosy were reanalyzed and the information was integrated to strengthen evidence of differential expression for several genes. Our results are relevant in prioritizing genes and pathways for further investigation, whether in functional studies or in biomarker discovery. Reanalysis of individual datasets revealed several differentially expressed genes (DEGs) in accordance with original reports. Then, five integration methods (P value and effect size based) were tested. In the end, a random-effects model and ratio association were selected as the main methods to pinpoint DEGs. Overall, classic pathways were found, corroborating previous findings and validating this approach. Also, we identified some novel DEGs involved especially in skin development processes (AQP3, AKR1C3, CYP27B1, LTB, VDR) and keratinocyte biology (CSTA, DSG1, KRT14, KRT5, PKP1, IVL), both still poorly understood in the context of leprosy. In addition, here we provide aggregated evidence towards some gene candidates that should be prioritized in further leprosy research, as they are likely important in immunopathogenesis. Altogether, these data are useful in better understanding host responses to the disease and, at the same time, provide a list of potential host biomarkers that could be useful in complementing leprosy diagnosis based on transcriptional levels.
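As a rough illustration of the effect-size integration mentioned in this abstract, the sketch below pools per-study effect sizes for one gene with a random-effects model. The abstract does not say which estimator the authors used; the DerSimonian-Laird estimator of between-study variance is assumed here, and the function name and input values are hypothetical.

```python
import math


def random_effects_meta(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need matching effects and variances from at least two studies")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) estimate, used to compute Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


# Hypothetical log fold changes for one gene in two studies, each with variance 1.
pooled, se, tau2 = random_effects_meta([0.0, 2.0], [1.0, 1.0])
print(pooled, se, tau2)
```

When the studies agree closely, the estimated between-study variance is truncated to zero and the result reduces to the ordinary inverse-variance (fixed-effect) estimate.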
61
Li B, Dai C, Wang L, Deng H, Li Y, Guan Z, Ni H. A novel drug repurposing approach for non-small cell lung cancer using deep learning. PLoS One 2020; 15:e0233112. [PMID: 32525938 PMCID: PMC7289363 DOI: 10.1371/journal.pone.0233112] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/19/2019] [Accepted: 04/28/2020] [Indexed: 01/02/2023]
Abstract
Drug repurposing is an attractive and pragmatic approach that offers reduced risks and development time in the complicated process of drug discovery. In the past, drug repurposing has been largely accidental and serendipitous. The most successful examples so far have not involved a systematic approach. Nowadays, remarkable advances in knowledge of drugs, diseases and bioinformatics are offering great opportunities for designing novel drug repurposing approaches through comprehensive understanding of drug information. In this study, we introduced a novel drug repurposing approach based on transcriptomic data and chemical structures using deep learning. One strong candidate for repurposing has been identified. Pimozide is an anti-dyskinesia agent that is used for the suppression of motor and phonic tics in patients with Tourette's Disorder. However, our pipeline proposed it as a strong candidate for treating non-small cell lung cancer. The cytotoxicity of pimozide against the A549 cell line has been validated.
Affiliation(s)
- Bingrui Li
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
- Chan Dai
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
- Lijun Wang
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
- Hailong Deng
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
- Yingying Li
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
  - * E-mail: (YL); (ZG); (HN)
- Zheng Guan
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
  - * E-mail: (YL); (ZG); (HN)
- Haihong Ni
  - Beijing Deep Intelligent Pharma Technologies Co., Ltd, Beijing, China
  - * E-mail: (YL); (ZG); (HN)
|
62
|
Jo K, Santos-Buitrago B, Kim M, Rhee S, Talcott C, Kim S. Logic-based analysis of gene expression data predicts association between TNF, TGFB1 and EGF pathways in basal-like breast cancer. Methods 2020; 179:89-100. [PMID: 32445696 DOI: 10.1016/j.ymeth.2020.05.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/13/2020] [Revised: 04/30/2020] [Accepted: 05/13/2020] [Indexed: 12/16/2022] Open
Abstract
For breast cancer, clinically important subtypes are well characterized at the molecular level in terms of gene expression profiles. In addition, signaling pathways in breast cancer have been extensively studied as therapeutic targets due to their roles in tumor growth and metastasis. However, it is challenging to put signaling pathways and gene expression profiles together to characterize biological mechanisms of breast cancer subtypes, since many signaling events result from post-translational modifications rather than gene expression differences. We designed a logic-based computational framework to explain the differences in gene expression profiles among breast cancer subtypes using Pathway Logic and transcriptional network information. Pathway Logic is a rewriting-logic-based formal system for modeling biological pathways, including post-translational modifications. Our method demonstrated its utility by constructing subtype-specific paths from key receptors (TNFR, TGFBR1 and EGFR) to key transcription factor (TF) regulators (RELA, ATF2, SMAD3 and ELK1) and by identifying potential associations between pathways via TFs in basal-specific paths, which could provide novel insight into aggressive breast cancer subtypes. Code and results are available at http://epigenomics.snu.ac.kr/PL/.
Affiliation(s)
- Kyuri Jo
  - Department of Computer Engineering, Chungbuk National University, Cheongju, Republic of Korea
- Beatriz Santos-Buitrago
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Minsu Kim
  - Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Sungmin Rhee
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Sun Kim
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea; Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea; Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea.
|
63
|
Obayashi T, Kagaya Y, Aoki Y, Tadaka S, Kinoshita K. COXPRESdb v7: a gene coexpression database for 11 animal species supported by 23 coexpression platforms for technical evaluation and evolutionary inference. Nucleic Acids Res 2020; 47:D55-D62. [PMID: 30462320 PMCID: PMC6324053 DOI: 10.1093/nar/gky1155] [Citation(s) in RCA: 92] [Impact Index Per Article: 23.0] [Received: 09/15/2018] [Accepted: 11/02/2018] [Indexed: 12/11/2022] Open
Abstract
The advent of RNA-sequencing and microarray technologies has led to rapid growth of transcriptome data generated for a wide range of organisms, under various cellular, organ and individual conditions. Since the number of possible combinations of intercellular and extracellular conditions is almost unlimited, cataloging all transcriptome conditions would be an immeasurable challenge. Gene coexpression refers to the similarity of gene expression patterns under various conditions, such as disease states, tissue types, and developmental stages. Since the quality of gene coexpression data depends on the quality and quantity of transcriptome data, timely usage of the growing data is key to promoting individual research in molecular biology. COXPRESdb (http://coxpresdb.jp) is a database providing coexpression information for 11 animal species. One characteristic feature of COXPRESdb is its ability to compare multiple coexpression data derived from different transcriptomics technologies and different species, which strongly reduces false positive relationships in individual gene coexpression data. Here, we summarized the current version of this database, including 23 coexpression platforms with the highest-level quality till date. Using various functionalities in COXPRESdb, the new coexpression data would support a broader area of research from molecular biology to medical sciences.
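The abstract above defines gene coexpression as similarity of expression patterns across conditions. A minimal sketch of that idea follows, assuming Pearson correlation as the similarity measure; the abstract does not state which measure COXPRESdb uses, and the function names and the toy gene labels in the usage note are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return sxy / math.sqrt(sxx * syy)


def rank_coexpressed(expr, query):
    """Rank all other genes by correlation with the query gene.

    expr  -- dict mapping gene name to a list of expression values,
             one value per sample or condition
    query -- name of the gene of interest
    """
    scores = [(gene, pearson(expr[query], profile))
              for gene, profile in expr.items() if gene != query]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

For example, with `expr = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 3, 2, 1]}`, calling `rank_coexpressed(expr, "geneA")` places `geneB` first and `geneC` last.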
Affiliation(s)
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8679, Japan
- Yuki Kagaya
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8679, Japan
- Yuichi Aoki
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai 980-8573, Japan
- Shu Tadaka
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai 980-8573, Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8679, Japan
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai 980-8573, Japan
  - Institute of Development, Aging, and Cancer, Tohoku University, Sendai 980-8575, Japan
  - To whom correspondence should be addressed. Tel: +81 22 795 7179; Fax: +81 22 795 7179.
|
64
|
Khan S, Taverna F, Rohlenova K, Treps L, Geldhof V, de Rooij L, Sokol L, Pircher A, Conradi LC, Kalucka J, Schoonjans L, Eelen G, Dewerchin M, Karakach T, Li X, Goveia J, Carmeliet P. EndoDB: a database of endothelial cell transcriptomics data. Nucleic Acids Res 2020; 47:D736-D744. [PMID: 30357379 PMCID: PMC6324065 DOI: 10.1093/nar/gky997] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 08/14/2018] [Accepted: 10/09/2018] [Indexed: 12/29/2022] Open
Abstract
Endothelial cells (ECs) line blood vessels and regulate homeostatic processes (blood flow, immune cell trafficking), but are also involved in many prevalent diseases. The increasing use of high-throughput technologies such as gene expression microarrays and (single cell) RNA sequencing has generated a wealth of data on the molecular basis of EC (dys-)function. Extracting biological insight from these datasets is challenging for scientists who are not proficient in bioinformatics. To facilitate the re-use of publicly available EC transcriptomics data, we developed the endothelial database EndoDB, a web-accessible collection of expert curated, quality assured and pre-analyzed data collected from 360 datasets comprising a total of 4741 bulk and 5847 single cell endothelial transcriptomes from six different organisms. Unlike other added-value databases, EndoDB allows users to easily retrieve and explore data of specific studies, determine under which conditions genes and pathways of interest are deregulated, and assess reprogramming of metabolism via principal component analysis, differential gene expression analysis, gene set enrichment analysis, heatmaps and metabolic and transcription factor analysis, while single cell data are visualized as gene expression color-coded t-SNE plots. Plots and tables in EndoDB are customizable, downloadable and interactive. EndoDB is freely available at https://vibcancer.be/software-tools/endodb, and will be updated to include new studies.
Affiliation(s)
- Shawez Khan
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, Guangdong, P.R. China
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Federico Taverna
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Katerina Rohlenova
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Lucas Treps
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Vincent Geldhof
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Laura de Rooij
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Liliana Sokol
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Andreas Pircher
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Lena-Christin Conradi
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Joanna Kalucka
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Luc Schoonjans
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, Guangdong, P.R. China
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Guy Eelen
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Mieke Dewerchin
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Tobias Karakach
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
- Xuri Li
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, Guangdong, P.R. China
  - To whom correspondence should be addressed. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Jermaine Goveia. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Xuri Li. Tel: +86 20 8733 1815; Fax: +86 20 8733 1815.
- Jermaine Goveia
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
  - To whom correspondence should be addressed. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Jermaine Goveia. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Xuri Li. Tel: +86 20 8733 1815; Fax: +86 20 8733 1815.
- Peter Carmeliet
  - Department of Oncology and Leuven Cancer Institute (LKI), Laboratory of Angiogenesis and Vascular Metabolism, KU Leuven, 3000 Leuven, Belgium
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, Guangdong, P.R. China
  - Laboratory of Angiogenesis and Vascular Metabolism, Center for Cancer Biology, VIB, 3000 Leuven, Belgium
  - To whom correspondence should be addressed. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Jermaine Goveia. Tel: +32 16 373 204; Fax: +32 16 372 585. Correspondence may also be addressed to Xuri Li. Tel: +86 20 8733 1815; Fax: +86 20 8733 1815.
|
65
|
Lim HGM, Gladys Lee YC. A Cross-Platform Comparison of Affymetrix, Agilent, and Illumina Microarray Reveals Functional Genomics in Colorectal Cancer Progression. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2019:252-255. [PMID: 31945889 DOI: 10.1109/embc.2019.8857806] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/09/2022]
Abstract
Colorectal cancer is one of the most common cancers, with the second highest mortality rate in the world. Microarrays can be used to collect information on gene expression alterations from many tissue samples, which is useful for understanding colorectal cancer at the molecular level. However, the mechanism behind the progression from normal tissue to cancer is not fully understood. Here, a cross-platform comparison among three common microarray platforms (Affymetrix, Agilent, and Illumina) was applied. As a result, we found that purine metabolism and the p53 signaling pathway were significantly associated with colorectal cancer progression. Purine metabolism can regulate cell proliferation, which involves hydro-lyase activity in the organelle lumen. Meanwhile, genetic alterations in the p53 signaling pathway could control some hallmarks of cancer. These two pathways might play important roles in transforming normal colorectal cells into cancer cells.
|
66
|
Yan J, Wu L, Jia C, Yu S, Lu Z, Sun Y, Chen J. Development of a four-gene prognostic model for pancreatic cancer based on transcriptome dysregulation. Aging (Albany NY) 2020; 12:3747-3770. [PMID: 32081836 PMCID: PMC7066910 DOI: 10.18632/aging.102844] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/02/2019] [Accepted: 02/04/2020] [Indexed: 12/14/2022]
Abstract
We systematically developed a prognostic model for pancreatic cancer that was compatible across different transcriptomic platforms and patient cohorts. After performing quality control measures, we used seven microarray datasets and two RNA sequencing datasets to identify consistently dysregulated genes in pancreatic cancer patients. Weighted gene co-expression network analysis was performed to explore the associations between gene expression patterns and clinical features. The least absolute shrinkage and selection operator (LASSO) and Cox regression were used to construct a prognostic model. We tested the predictive power of the model by determining the area under the curve of the risk score for time-dependent survival. Most of the differentially expressed genes in pancreatic cancer were enriched in functions pertaining to the tumor immune microenvironment. The transcriptome profiles were found to be associated with overall survival, and four genes were identified as independent prognostic factors. A prognostic risk score was then proposed, which displayed moderate accuracy in the training and self-validation cohorts. Furthermore, patients in two independent microarray cohorts were successfully stratified into high- and low-risk prognostic groups. Thus, we constructed a reliable prognostic model for pancreatic cancer, which should be beneficial for clinical therapeutic decision-making.
Affiliation(s)
- Jie Yan
  - Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Liangcai Wu
  - Department of Obstetrics and Gynecology, Obstetrics and Gynecology Hospital of Fudan University, Shanghai 200011, China
- Congwei Jia
  - Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Shuangni Yu
  - Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Zhaohui Lu
  - Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Yueping Sun
  - Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100020, China
- Jie Chen
  - Department of Pathology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
|
67
|
Lee K, Lehmann M, Paul MV, Wang L, Luckner M, Wanner G, Geigenberger P, Leister D, Kleine T. Lack of FIBRILLIN6 in Arabidopsis thaliana affects light acclimation and sulfate metabolism. New Phytol 2020; 225:1715-1731. [PMID: 31596965 DOI: 10.1111/nph.16246] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/19/2019] [Accepted: 10/02/2019] [Indexed: 06/10/2023]
Abstract
Arabidopsis thaliana contains 13 fibrillins (FBNs), which are all localized to chloroplasts. FBN1 and FBN2 are involved in photoprotection of photosystem II, and FBN4 and FBN5 are thought to be involved in plastoquinone transport and biosynthesis, respectively. The functions of the other FBNs remain largely unknown. To gain insight into the function of FBN6, we performed coexpression and Western analyses, conducted fluorescence and transmission electron microscopy, stained reactive oxygen species (ROS), measured photosynthetic parameters and glutathione levels, and applied transcriptomics and metabolomics. Using coexpression analyses, FBN6 was identified as a photosynthesis-associated gene. FBN6 is localized to thylakoid and envelope membranes, and its knockout results in stunted plants. The delayed-growth phenotype cannot be attributed to altered basic photosynthesis parameters or a reduced CO2 assimilation rate. Under moderate light stress, primary leaves of fbn6 plants begin to bleach and contain enlarged plastoglobules. RNA sequencing and metabolomics analyses point to an alteration in sulfate reduction in fbn6. Indeed, glutathione content is higher in fbn6, which in turn confers cadmium tolerance of fbn6 seedlings. We conclude that loss of FBN6 leads to perturbation of ROS homeostasis. FBN6 enables plants to cope with moderate light stress and affects cadmium tolerance.
Affiliation(s)
- Kwanuk Lee
  - Plant Molecular Biology (Botany), Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Martin Lehmann
  - Plant Molecular Biology (Botany), Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Melanie V Paul
  - Plant Metabolism, Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Liangsheng Wang
  - Plant Molecular Biology (Botany), Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Manja Luckner
  - Ultrastrukturforschung, Department Biology I, Ludwig-Maximilians-University München, 81252, Planegg-Martinsried, Germany
- Gerhard Wanner
  - Ultrastrukturforschung, Department Biology I, Ludwig-Maximilians-University München, 81252, Planegg-Martinsried, Germany
- Peter Geigenberger
  - Plant Metabolism, Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Dario Leister
  - Plant Molecular Biology (Botany), Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
- Tatjana Kleine
  - Plant Molecular Biology (Botany), Department Biology I, Ludwig-Maximilians-University München, 82152, Martinsried, Germany
|
68
|
McCarthy FM, Pendarvis K, Cooksey AM, Gresham CR, Bomhoff M, Davey S, Lyons E, Sonstegard TS, Bridges SM, Burgess SC. Chickspress: a resource for chicken gene expression. Database (Oxford) 2020; 2019:5512474. [PMID: 31210271 PMCID: PMC6556980 DOI: 10.1093/database/baz058] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/13/2018] [Revised: 03/07/2019] [Accepted: 04/15/2019] [Indexed: 12/12/2022]
Abstract
High-throughput sequencing and proteomics technologies are markedly increasing the amount of RNA and peptide data that are available to researchers, which are typically made publicly available via data repositories such as the NCBI Sequence Read Archive and proteome archives, respectively. These data sets contain valuable information about when and where gene products are expressed, but this information is not readily obtainable from archived data sets. Here we report Chickspress (http://geneatlas.arl.arizona.edu), the first publicly available gene expression resource for chicken tissues. Since there is no single source of chicken gene models, Chickspress incorporates both NCBI and Ensembl gene models and links these gene sets with experimental gene expression data and QTL information. By linking gene models from both NCBI and Ensembl gene prediction pipelines, researchers can, for the first time, easily compare gene models from each of these prediction workflows to available experimental data for these products. We use Chickspress data to show the differences between these gene annotation pipelines. Chickspress also provides rapid search, visualization and download capacity for chicken gene sets based upon tissue type, developmental stage and experiment type. This first Chickspress release contains 161 gene expression data sets, including expression of mRNAs, miRNAs, proteins and peptides. We provide several examples demonstrating how researchers may use this resource.
Affiliation(s)
- Fiona M McCarthy
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson AZ, USA
- Ken Pendarvis
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson AZ, USA
- Amanda M Cooksey
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson AZ, USA
- Cathy R Gresham
  - Institute of Genomics, Biocomputing & Biotechnology, Mississippi State University, Starkville MS, USA
- Matt Bomhoff
  - School of Plant Sciences, CyVerse, University of Arizona, Tucson AZ, USA
- Sean Davey
  - School of Plant Sciences, CyVerse, University of Arizona, Tucson AZ, USA
- Eric Lyons
  - School of Plant Sciences, CyVerse, University of Arizona, Tucson AZ, USA
- Tad S Sonstegard
  - United States Department of Agriculture Agricultural Research Service Beltsville Agricultural Research Center, Beltsville MD, USA
- Susan M Bridges
  - Department of Computer Science and Engineering, Mississippi State University, Starkville MS, USA
- Shane C Burgess
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson AZ, USA
|
69
|
Carvajal-Lopez P, Von Borstel FD, Torres A, Rustici G, Gutierrez J, Romero-Vivas E. Microarray-Based Quality Assessment as a Supporting Criterion for de novo Transcriptome Assembly Selection. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:198-206. [PMID: 30059314 DOI: 10.1109/tcbb.2018.2860997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
RNA-Sequencing and de novo assembly have enabled the analysis of species without available reference transcriptomes, although intrinsic features (biological and technical) induce errors in the reconstruction. One strategy to resolve these errors consists of varying the assembly process parameters to generate multiple reconstructions. However, selecting the best assembly remains a challenge. Quantitative metrics for quality assessment have been inconsistent when compared with pertinent references. In this paper, a criterion for supporting assembly selection based on mapping DNA microarray hybridized probes to assembly sets is proposed. Mouse and fruit fly RNA-Seq datasets were assembled with standard de novo procedures. Quality was assessed using quantitative metrics and the proposed criterion. The assembly that best mapped to the available reference transcriptomes of these model species provided the highest quality assembly. The hybridized probes identified the best assemblies, whereas quantitative metrics remained inconsistent. For example, a subtle but statistically significant probe mapping difference of 0.25 percent (ANOVA, p < 0.05) enabled an assembly selection that identified 3,719 more contigs and mapped 1,049 further contigs to the mouse reference transcriptome. The availability of microarray data for non-model species makes the proposed criterion suitable for quality assessment of multiple de novo assembly strategies.
|
70
|
Hodgson SH, Muller J, Lockstone HE, Hill AVS, Marsh K, Draper SJ, Knight JC. Use of gene expression studies to investigate the human immunological response to malaria infection. Malar J 2019; 18:418. [PMID: 31835999 PMCID: PMC6911278 DOI: 10.1186/s12936-019-3035-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/15/2019] [Accepted: 11/26/2019] [Indexed: 01/02/2023] Open
Abstract
Background: Transcriptional profiling of the human immune response to malaria has been used to identify diagnostic markers, understand the pathogenicity of severe disease and dissect the mechanisms of naturally acquired immunity (NAI). However, interpreting this body of work is difficult given considerable variation in study design, definition of disease, patient selection and methodology employed. This work details a comprehensive review of gene expression profiling (GEP) of the human immune response to malaria to determine how this technology has been applied to date, instances where this has advanced understanding of NAI and the extent of variability in methodology between studies to allow informed comparison of data and interpretation of results. Methods: Datasets from the Gene Expression Omnibus (GEO) matching the search terms ‘plasmodium’ or ‘malaria’ or ‘sporozoite’ or ‘merozoite’ or ‘gametocyte’ and ‘Homo sapiens’ were identified and publications analysed. Datasets of gene expression changes in relation to malaria vaccines were excluded. Results: Twenty-three GEO datasets and 25 related publications were included in the final review. All datasets related to Plasmodium falciparum infection, except two that related to Plasmodium vivax infection. The majority of datasets included samples from individuals infected with malaria ‘naturally’ in the field (n = 13, 57%), however some related to controlled human malaria infection (CHMI) studies (n = 6, 26%), or cells stimulated with Plasmodium in vitro (n = 6, 26%). The majority of studies examined gene expression changes relating to the blood stage of the parasite. Significant heterogeneity between datasets was identified in terms of study design, sample type, platform used and method of analysis. Seven datasets specifically investigated transcriptional changes associated with NAI to malaria, with evidence supporting suppression of the innate pro-inflammatory response as an important mechanism for this in the majority of these studies. However, further interpretation of this body of work was limited by heterogeneity between studies and small sample sizes. Conclusions: GEP in malaria is a potentially powerful tool, but to date studies have been hypothesis generating with small sample sizes and widely varying methodology. As CHMI studies are increasingly performed in endemic settings, there will be growing opportunity to use GEP to understand detailed time-course changes in host response and understand in greater detail the mechanisms of NAI.
Affiliation(s)
- Susanne H Hodgson
  - The Jenner Institute, University of Oxford, Old Road Campus Road Building, Off Roosevelt Drive, Oxford, OX3 7DQ, UK
  - Department of Infectious Diseases & Microbiology, Oxford University Hospitals Trust, Oxford, UK
- Julius Muller
  - The Jenner Institute, University of Oxford, Old Road Campus Road Building, Off Roosevelt Drive, Oxford, OX3 7DQ, UK
- Helen E Lockstone
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Adrian V S Hill
  - The Jenner Institute, University of Oxford, Old Road Campus Road Building, Off Roosevelt Drive, Oxford, OX3 7DQ, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Kevin Marsh
  - Department of Tropical Medicine, University of Oxford, Oxford, UK
- Simon J Draper
  - The Jenner Institute, University of Oxford, Old Road Campus Road Building, Off Roosevelt Drive, Oxford, OX3 7DQ, UK
- Julian C Knight
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
|
71
|
Zhang Y, Ma L. Identification of key genes and pathways in calcific aortic valve disease by bioinformatics analysis. J Thorac Dis 2019; 11:5417-5426. [PMID: 32030260 DOI: 10.21037/jtd.2019.11.57] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/30/2023]
Abstract
Background Calcific aortic valve disease (CAVD) is the most common type of valvular heart disease in the elderly. This study aimed to explore the molecular mechanisms of CAVD via bioinformatics analysis. Methods The gene expression profiles of GSE51472 (including 5 normal and 5 calcified aortic valves) and GSE83453 (including 8 normal and 19 calcified aortic valves) were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened using the MetaDE package in R. Functional and pathway enrichment analyses were performed based on the Gene Ontology (GO) and KEGG pathway databases. Then, the STRING database, Cytoscape and MCODE were applied to construct the protein-protein interaction (PPI) network and screen hub genes. Pathway enrichment analysis was further performed for hub genes and gene clusters identified via module analysis. Results A total of 107 DEGs were identified in CAVD (53 up-regulated and 54 down-regulated genes), and they were mainly enriched in the terms immune response, extracellular matrix organization, leukocyte transendothelial migration, cell adhesion molecules (CAMs), and fatty acid metabolism. Five hub genes, VCAM1, MMP9, ITGB2, RAC2, and vWF, were identified via the PPI network and were mainly enriched in leukocyte transendothelial migration and cell adhesion. An independently down-regulated protein cluster containing ALDH2, HIBCH, ACADVL, ECHDC2, VAT1L, and MAOA was also identified via the PPI network. Conclusions The present study identified VCAM1, MMP9, ITGB2, RAC2, vWF and ALDH2 as key genes in the progression of CAVD. Immune cell infiltration might play a key role in the progression of CAVD, while the ALDH2-mediated detoxification effect might play a protective role in CAVD. Further studies are needed to elucidate the pathogenesis of CAVD.
Affiliation(s)
- Yiran Zhang
- Department of Cardiovascular Surgery, the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Liang Ma
- Department of Cardiovascular Surgery, the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| |
|
72
|
Zhu J, Wang Z, Chen F, Liu C. Identification of genes and functional coexpression modules closely related to ulcerative colitis by gene datasets analysis. PeerJ 2019; 7:e8061. [PMID: 31741804 PMCID: PMC6858811 DOI: 10.7717/peerj.8061] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2019] [Accepted: 10/20/2019] [Indexed: 02/06/2023] Open
Abstract
Background Ulcerative colitis (UC) is a type of inflammatory bowel disease that poses a great threat to public health worldwide. Previously, gene expression studies of mucosal colonic biopsies have provided some insight into the pathophysiological mechanisms of ulcerative colitis; however, the exact pathogenesis is unclear. The purpose of this study is to identify the genes and pathways most closely related to UC by bioinformatics, so as to reveal the core of the pathogenesis. Methods Genome-wide gene expression datasets involving ulcerative colitis patients were collected from the Gene Expression Omnibus database. To identify the most closely related genes, an integrated analysis of gene expression signatures was performed using the robust rank aggregation method. We used weighted gene co-expression network analysis to explore the functional modules involved in ulcerative colitis pathogenesis. In addition, the biological processes and pathways of the co-expression modules were identified by gene ontology enrichment analysis using Metascape. Results A total of 328 ulcerative colitis patients and 138 healthy controls from 14 datasets were included. The 150 most significant differentially expressed genes are likely to include causative genes of the disease, and further studies are needed to demonstrate this. Seven main functional modules were identified, which pathway enrichment analysis indicated were associated with many biological processes. Pathways such as ‘extracellular matrix, immune inflammatory response, cell cycle, material metabolism’ are consistent with the core mechanism of ulcerative colitis. However, ‘defense response to virus’ and ‘herpes simplex infection’ suggest that viral infection is one of the aetiological agents. In addition, ‘Signaling by Receptor Tyrosine Kinases’ and ‘pathway in cancer’ provide new clues for the study of the risk and process of ulcerative colitis cancerization.
Affiliation(s)
- Jie Zhu
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Zheng Wang
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Fengzhe Chen
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China
| | - Changhong Liu
- Department of Gastroenterology, Shandong Provincial Qianfoshan Hospital, the First Hospital Affiliated with Shandong First Medical University, Jinan, Shandong, China
| |
|
73
|
Zhou YY, Chen LP, Zhang Y, Hu SK, Dong ZJ, Wu M, Chen QX, Zhuang ZZ, Du XJ. Integrated transcriptomic analysis reveals hub genes involved in diagnosis and prognosis of pancreatic cancer. Mol Med 2019; 25:47. [PMID: 31706267 PMCID: PMC6842480 DOI: 10.1186/s10020-019-0113-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2019] [Accepted: 09/20/2019] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND The hunt for molecular markers with high specificity and sensitivity has been an active area in tumor treatment. Due to the poor diagnosis and prognosis of pancreatic cancer (PC), the excision rate is often low, which makes it more urgent to find ideal tumor markers. METHODS The Robust Rank Aggregation (RRA) method was first applied to identify the differentially expressed genes (DEGs) between PC tissues and normal tissues from GSE28735, GSE15471, GSE16515, and GSE101448. Among these DEGs, highly correlated genes were clustered using WGCNA. Co-expression networks and the molecular complex detection (MCODE) Cytoscape app were then used to find sub-clusters and confirm 35 candidate genes. For these genes, a least absolute shrinkage and selection operator (lasso) regression model was applied and validated to build a diagnostic risk score model. Cox proportional hazards regression analysis was used and validated to build a prognostic model. RESULTS Based on integrated transcriptomic analysis, we identified a 19-gene module (SYCN, PNLIPRP1, CAP2, GNMT, MAT1A, ABAT, GPT2, ADHFE1, PHGDH, PSAT1, ERP27, PDIA2, MT1H, COMP, COL5A2, FN1, COL1A2, FAP and POSTN) as a specific predictive signature for the diagnosis of PC. Based on two considerations, accuracy and feasibility, we simplified the diagnostic risk model to a four-gene model: 0.3034*log2(MAT1A) - 0.1526*log2(MT1H) + 0.4645*log2(FN1) - 0.2244*log2(FAP), with each term computed as log2(gene count). In addition, a four-hub-gene module was identified as a prognostic model: -1.400*log2(CEL) + 1.321*log2(CPA1) + 0.454*log2(POSTN) + 1.011*log2(PM20D1), again computed as log2(gene count). CONCLUSION Integrated transcriptomic analysis identifies two four-hub-gene modules as specific predictive signatures for the diagnosis and prognosis of PC, which may bring new insight for the clinical practice of PC.
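The two four-gene models quoted in this abstract are plain weighted sums of log2-transformed gene counts, so they can be evaluated in a few lines. The sketch below is only an illustration: the coefficients are copied from the abstract, but the function name `risk_score`, the dictionary layout, and the example counts are assumptions made here (the counts are hypothetical powers of two chosen so the arithmetic is easy to follow), and the abstract does not state how counts were normalized or which cutoff was applied to the score.

```python
import math

# Coefficients as printed in the abstract (Zhou et al., Mol Med 2019).
DIAGNOSTIC_COEFS = {"MAT1A": 0.3034, "MT1H": -0.1526, "FN1": 0.4645, "FAP": -0.2244}
PROGNOSTIC_COEFS = {"CEL": -1.400, "CPA1": 1.321, "POSTN": 0.454, "PM20D1": 1.011}


def risk_score(counts, coefs):
    """Return the weighted sum of log2(gene count) over the genes in `coefs`.

    `counts` maps gene symbol -> positive expression count. How the counts
    should be normalized is NOT specified in the abstract; that is left to
    the caller here.
    """
    missing = [gene for gene in coefs if gene not in counts]
    if missing:
        raise KeyError(f"missing counts for: {', '.join(missing)}")
    return sum(weight * math.log2(counts[gene]) for gene, weight in coefs.items())


# Hypothetical counts for one sample (illustrative only, not real data).
example_counts = {"MAT1A": 8, "MT1H": 4, "FN1": 16, "FAP": 2}
diagnostic_score = risk_score(example_counts, DIAGNOSTIC_COEFS)
print(f"diagnostic risk score: {diagnostic_score:.4f}")
```

The same helper evaluates the prognostic model by passing `PROGNOSTIC_COEFS` together with counts for CEL, CPA1, POSTN and PM20D1.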
Affiliation(s)
- Yang-Yang Zhou
- Department of Rheumatology and Immunology, The Second Affiliated Hospital and Yuying Children’s Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| | - Li-Ping Chen
- Department of Rheumatology and Immunology, The Second Affiliated Hospital and Yuying Children’s Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
- Chemical Biology Research Center, College of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325000 Zhejiang China
| | - Yi Zhang
- Chemical Biology Research Center, College of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325000 Zhejiang China
| | - Sun-Kuan Hu
- Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| | - Zhao-Jun Dong
- Chemical Biology Research Center, College of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325000 Zhejiang China
| | - Ming Wu
- Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| | - Qiu-Xiang Chen
- Department of Ultrasound, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| | - Zhi-Zhi Zhuang
- Department of Rheumatology and Immunology, The Second Affiliated Hospital and Yuying Children’s Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| | - Xiao-Jing Du
- Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000 Zhejiang Province China
| |
|
74
|
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Laura Scalambra
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Luca Triboli
- Centre for Integrative Biology (CIBIO), University of Trento, Italy
| | - Forest Ray
- Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
| | - Federico M Giorgi
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
| |
|
75
|
Napolitano F, Carrella D, Gao X, di Bernardo D. gep2pep: a Bioconductor package for the creation and analysis of pathway-based expression profiles. Bioinformatics 2019; 36:btz803. [PMID: 31647521 PMCID: PMC7703749 DOI: 10.1093/bioinformatics/btz803] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2019] [Revised: 09/03/2019] [Accepted: 10/21/2019] [Indexed: 11/13/2022] Open
Abstract
SUMMARY Pathway-based expression profiles allow for high-level interpretation of transcriptomic data and systematic comparison of dysregulated cellular programs. We have previously demonstrated the efficacy of pathway-based approaches with two different applications: the Drug Set Enrichment Analysis and the Gene2drug analysis. Here we present a software tool that makes it easy to convert gene-based profiles to pathway-based profiles and analyze them within the popular R framework. We also provide pre-computed profiles derived from the original Connectivity Map and its next generation release, i.e. the LINCS database. AVAILABILITY AND IMPLEMENTATION The tool is implemented as the R/Bioconductor package gep2pep and can be freely downloaded from https://bioconductor.org/packages/gep2pep. SUPPLEMENTARY INFORMATION Supplementary data are available at http://dsea.tigem.it/lincs.
Affiliation(s)
- Francesco Napolitano
- Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, NA 80078, Italy
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Diego Carrella
- Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, NA 80078, Italy
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Diego di Bernardo
- Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, NA 80078, Italy
| |
|
76
|
Wang Z, Zhu J, Liu C, Ma L. Identification of key genes and pathways associated with Crohn's disease by bioinformatics analysis. Scand J Gastroenterol 2019; 54:1205-1213. [PMID: 31526198 DOI: 10.1080/00365521.2019.1665096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Aims: Crohn's disease (CD) is a type of inflammatory bowel disease. The present study aimed to identify key genes and significant signaling pathways associated with CD by bioinformatics analysis. A total of 179 CD patients and 94 healthy controls from nine genome-wide gene expression datasets were included. Results: MMP1 and CLDN8 were two key genes screened from the differentially expressed genes. Connectivity Map predicted several small molecules as possible adjuvant drugs to treat CD. In addition, we used weighted gene coexpression network analysis to explore the functional modules involved in CD pathogenesis. Seven main functional modules were identified, of which the black module showed the highest correlation with CD. The genes in the black module were mainly enriched in interferon signaling and defense response to virus. The blue module was another important module and was enriched in several signaling pathways, including extracellular matrix organization, inflammatory response and blood vessel development. Conclusions: This study identified a number of key genes and pathways involved in CD and potential drugs to combat it, which might offer insights into CD pathogenesis and provide a clue to potential treatments.
Affiliation(s)
- Zheng Wang
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, China
| | - Jie Zhu
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, China
| | - Changhong Liu
- Department of Gastroenterology, Shandong Provincial Qianfoshan Hospital, The First Hospital Affiliated with Shandong First Medical University, Jinan, China
| | - Lixian Ma
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, China
| |
|
77
|
Chaudhari P, Agarwal H, Bhateja V. Data augmentation for cancer classification in oncogenomics: an improved KNN based approach. EVOLUTIONARY INTELLIGENCE 2019. [DOI: 10.1007/s12065-019-00283-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
78
|
Liu Y, Liu X, Jia J, Zheng J, Yan T. Comprehensive analysis of aberrantly expressed profiles of mRNA and its relationship with serum galactose-deficient IgA1 level in IgA nephropathy. J Transl Med 2019; 17:320. [PMID: 31547815 PMCID: PMC6757375 DOI: 10.1186/s12967-019-2064-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2018] [Accepted: 09/09/2019] [Indexed: 11/10/2022] Open
Abstract
Background Immunoglobulin A nephropathy (IgAN) is the leading cause of end-stage kidney disease. Previous mRNA microarray profiling studies of IgAN revealed inconsistent data. We sought to identify the aberrantly expressed genes and biological pathways by integrating IgAN gene expression datasets in blood cells and performing systematic experimental validation. We also explored the relationship between target genes and galactose-deficient IgA1 (Gd-IgA1) in IgAN. Methods We retrieved Gene Expression Omnibus (GEO) datasets of IgAN. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used for functional analysis. Deep sequencing of RNA isolated from B cells was used for microarray validation. The relationship between target mRNA expression and Gd-IgA1 levels in serum was also studied. Results Three studies with microarray expression profiling datasets met our inclusion criteria. We identified 655 dysregulated genes, including 319 up-regulated and 336 down-regulated genes, in three GEO datasets with a total of 35 patients with IgAN and 19 healthy controls. Based on GO biological process terms, these dysregulated genes are mainly related to pentose-phosphate shunt, non-oxidative branch, post-embryonic camera-type eye development and leukocyte activation. KEGG pathway analysis of microarray data revealed that these aberrantly expressed genes were enriched in human T-cell leukemia virus 1 infection, proteoglycans in cancer, intestinal immune network for IgA production and autophagy. We further performed deep sequencing of mRNAs isolated from B cells of an independent set of five patients with IgAN and three healthy persons with the same clinical and demographic characteristics. Seventy-seven genes overlapped with the 655 differentially regulated genes mentioned above, including 43 up-regulated and 34 down-regulated genes. We next investigated whether the expression of these genes correlated with Gd-IgA1 levels in IgAN patients. Pearson correlation analyses showed that PTEN (phosphatase and tensin homolog) was the gene most strongly negatively correlated with Gd-IgA1 levels. Conclusions These results demonstrated that dysregulated genes in patients with IgAN were enriched in the intestinal immune network for IgA production and in the autophagy process, and that PTEN in B cells might be involved in the mechanism of Gd-IgA1 production.
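The correlation step reported in this abstract relies on the Pearson coefficient, which can be written out directly from its definition. The following is a minimal sketch, not the authors' analysis code: the function name `pearson_r` is chosen here, and the two short vectors are made-up toy values (not measurements from the study) that are perfectly inversely related, simply to show what a strong negative correlation looks like numerically.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and the two root sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Toy values only (hypothetical): expression of one gene vs. a serum marker.
toy_expression = [1.0, 2.0, 3.0, 4.0]
toy_marker = [8.0, 6.0, 4.0, 2.0]
r = pearson_r(toy_expression, toy_marker)
print(f"r = {r:.3f}")
```

In practice a p-value would be reported alongside r; that is omitted here because the abstract gives no detail on how significance was assessed.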
Affiliation(s)
- Youxia Liu
- Department of Nephrology, Tianjin Medical University General Hospital, NO. 154, Anshan Road, Heping District, Tianjin, People's Republic of China.
| | - Xiangchun Liu
- Department of Nephrology, The Second Hospital of Shandong University, Jinan, People's Republic of China
| | - Junya Jia
- Department of Nephrology, Tianjin Medical University General Hospital, NO. 154, Anshan Road, Heping District, Tianjin, People's Republic of China
| | - Jie Zheng
- Radiology Department, Tianjin Medical University General Hospital, Tianjin, People's Republic of China
| | - Tiekun Yan
- Department of Nephrology, Tianjin Medical University General Hospital, NO. 154, Anshan Road, Heping District, Tianjin, People's Republic of China.
| |
|
79
|
Ju B, Kim Y. The formation of research ethics for data sharing by biological scientists: an empirical analysis. ASLIB J INFORM MANAG 2019. [DOI: 10.1108/ajim-12-2018-0296] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Purpose
The purpose of this paper is to investigate how biological scientists form research ethics for data sharing, and what the major factors affecting biological scientists’ formation of research ethics for data sharing are.
Design/methodology/approach
A research model for data sharing was developed based on the consequential theorists’ perspective of ethics. An online survey of 577 participants was administered, and the proposed research model was validated with a structural equation modeling technique.
Findings
The results show that egoism factors (perceived reputation, perceived risk, perceived effort), utilitarianism factors (perceived community benefit and perceived reciprocity) and norm of practice factors (perceived pressure by funding agency, perceived pressure by journal and norm of data sharing) all contribute to the formation of research ethics for data sharing.
Research limitations/implications
This research employed the consequentialist perspective of ethics for its research model development, and the proposed research model nicely explained how egoism, utilitarianism and norm of practice factors influence biological scientists’ research ethics for data sharing, which eventually leads to their data sharing intentions.
Practical implications
This research provides important practical implications for examining scientists’ data sharing behaviors from the perspective of research ethics. This research suggests that scientists’ data sharing behaviors can be better facilitated by emphasizing their egoism, utilitarianism and normative factors involved in research ethics for data sharing.
Originality/value
The ethical perspectives in data sharing research have been under-studied; this research sheds light on biological scientists’ formation of research ethics for data sharing, which can be applied in promoting scientists’ data sharing behaviors across different disciplines.
|
80
|
Nie X, Wei J, Hao Y, Tao J, Li Y, Liu M, Xu B, Li B. Consistent Biomarkers and Related Pathogenesis Underlying Asthma Revealed by Systems Biology Approach. Int J Mol Sci 2019; 20:ijms20164037. [PMID: 31430856 PMCID: PMC6720652 DOI: 10.3390/ijms20164037] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2019] [Revised: 08/14/2019] [Accepted: 08/17/2019] [Indexed: 12/13/2022] Open
Abstract
Asthma is a common chronic airway disease worldwide. Due to its clinical and genetic heterogeneity, the cellular and molecular processes in asthma are highly complex and relatively unknown. To discover novel biomarkers and the molecular mechanisms underlying asthma, several studies have focused on gene expression patterns in epithelium through microarray analysis. However, few robust specific biomarkers were identified and some inconsistent results were observed. Therefore, it is imperative to conduct a robust analysis to solve these problems. Herein, an integrated gene expression analysis of ten independent, publicly available microarray datasets of bronchial epithelial cells from 348 asthmatic patients and 208 healthy controls was performed. As a result, 78 up- and 75 down-regulated genes were identified in the bronchial epithelium of asthmatics. Comprehensive functional enrichment and pathway analysis revealed that response to chemical stimulus, extracellular region, pathways in cancer, and arachidonic acid metabolism were the four most significantly enriched terms. In the protein-protein interaction network, three main communities associated with cytoskeleton, response to lipid, and regulation of response to stimulus were established, and the 6 most highly ranked hub genes (up-regulated CD44, KRT6A, CEACAM5, SERPINB2, and down-regulated LTF and MUC5B) were identified and should be considered as new biomarkers. Pathway cross-talk analysis highlights that signaling pathways mediated by IL-4/13 and the transcription factors HIF-1α and FOXA1 play crucial roles in the pathogenesis of asthma. Interestingly, three chemicals, the polyphenol catechin, the antibiotic lomefloxacin, and the natural alkaloid boldine, were predicted and may be potential drugs for asthma treatment. Taken together, our findings shed new light on the common molecular pathogenesis mechanisms of asthma and provide theoretical support for further clinical therapeutic studies.
Affiliation(s)
- Xiner Nie
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
| | - Jinyi Wei
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
| | - Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
| | - Jingxin Tao
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
| | - Yinghong Li
- School of Biological Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
| | - Mingwei Liu
- College of Laboratory Medicine, Chongqing Medical University, Chongqing 400046, China
| | - Boying Xu
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
| | - Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing 401331, China.
| |
|
81
|
Abstract
The amount of omics data in the public domain is increasing every year. Modern science has become a data-intensive discipline. Innovative solutions for data management, data sharing, and for discovering novel datasets are therefore increasingly required. In 2016, we released the first version of the Omics Discovery Index (OmicsDI) as a light-weight system to aggregate datasets across multiple public omics data resources. OmicsDI aggregates genomics, transcriptomics, proteomics, metabolomics and multiomics datasets, as well as computational models of biological processes. Here, we propose a set of novel metrics to quantify the attention and impact of biomedical datasets. A complete framework (now integrated into OmicsDI) has been implemented in order to provide and evaluate those metrics. Finally, we propose a set of recommendations for authors, journals and data resources to promote an optimal quantification of the impact of datasets. The increasing amounts of public omics data are an important and valuable resource for the research community. Here, the authors develop a set of metrics to quantify the attention and impact of biomedical datasets and integrate them into the framework of the Omics Discovery Index (OmicsDI).
|
82
|
Zhang YW, Lin Y, Yu HY, Tian RN, Li F. Characteristic genes in THP‑1 derived macrophages infected with Mycobacterium tuberculosis H37Rv strain identified by integrating bioinformatics methods. Int J Mol Med 2019; 44:1243-1254. [PMID: 31364746 PMCID: PMC6713430 DOI: 10.3892/ijmm.2019.4293] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2019] [Accepted: 06/06/2019] [Indexed: 12/11/2022] Open
Abstract
Mycobacterium tuberculosis (M. tb) is a highly successful pathogen that has co-existed with humans for thousands of years. As the cornerstone of the immune system, macrophages are a key part of innate immunity. They ingest and degrade foreign substances including aging cells and microorganisms, coordinate the inflammatory process, and are the first line of defense against M. tb infection. Recent advances in cellular mycobacteriology have indicated that M. tb uses a remarkably complex strategy to disrupt macrophage function, in order to counteract the antimicrobial mechanisms of the innate and adaptive immune responses, thereby achieving immune escape. With the popularity of microarray technology, public platforms have provided a variety of gene expression data associated with physiological and disease conditions. Meta-analysis can systematically and quantitatively analyze multiple independent datasets concerning the same disease, greatly improving the statistical significance and credibility of the gene expression data analysis performed. In the present study, 6 microarray expression datasets of the human acute monocytic leukemia THP-1 cell line infected by the M. tb H37Rv strain were collected from the GEO database. A total of 4 high-quality datasets were identified using meta-analysis methods in R, and 306 differentially expressed genes with statistical significance were obtained. Then, a protein-protein interaction (PPI) network of these differentially expressed genes was constructed with the Search Tool for the Retrieval of Interacting Genes/Proteins Database online tool and visualized with Cytoscape v. 3.6.1. Using the CentiScape and MCODE plugins in Cytoscape to mine the functional modules associated with the M. tb infection process, 32 characteristic genes were identified. Gene ontology and Kyoto Encyclopedia of Genes and Genomes analysis was performed on the 32 characteristic genes, and it was demonstrated that these genes were primarily associated with the type I interferon (IFN) pathway. In the established model of THP-1-derived macrophages infected by M. tb, the actual differential expression levels of IFN-stimulated gene 15 (ISG15), 2′-5′-oligoadenylate synthetase like (OASL), IFN regulatory factor 7 (IRF7) and DExD/H-box helicase 58 (DDX58), the first 4 genes of the 32 characteristic genes, were verified by reverse transcription quantitative polymerase chain reaction. The results were consistent with the results of microarray analysis. The association between ISG15, OASL and IRF7 and TB infection was also verified. Although a number of studies have identified that the type I IFN pathway may assist M. tb to achieve immune escape, the present study used a meta-analysis of microarray data and PPI network analysis to examine some of the novel genes identified in the IFN pathway. The results furthered the understanding of the molecular mechanisms of the TB immune response and provided a novel perspective for future therapeutic goals.
Affiliation(s)
- Yu-Wei Zhang
- Department of Pathogen Biology, The Key Laboratory of Zoonosis, Chinese Ministry of Education, College of Basic Medicine, Jilin University, Changchun, Jilin 130021, P.R. China
| | - Yan Lin
- Department of Pathogen Biology, The Key Laboratory of Zoonosis, Chinese Ministry of Education, College of Basic Medicine, Jilin University, Changchun, Jilin 130021, P.R. China
| | - Hui-Yuan Yu
- School of Bethune Medical, Jilin University, Changchun, Jilin 130021, P.R. China
| | - Ruo-Nan Tian
- Department of Pathogen Biology, The Key Laboratory of Zoonosis, Chinese Ministry of Education, College of Basic Medicine, Jilin University, Changchun, Jilin 130021, P.R. China
| | - Fan Li
- Department of Pathogen Biology, The Key Laboratory of Zoonosis, Chinese Ministry of Education, College of Basic Medicine, Jilin University, Changchun, Jilin 130021, P.R. China
|
83
|
Bioinformatics-based discovery of PYGM and TNNC2 as potential biomarkers of head and neck squamous cell carcinoma. Biosci Rep 2019; 39:BSR20191612. [PMID: 31324732 PMCID: PMC6663994 DOI: 10.1042/bsr20191612] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/21/2019] [Revised: 07/11/2019] [Accepted: 07/18/2019] [Indexed: 12/12/2022] Open
Abstract
Head and neck squamous cell carcinoma (HNSCC) is an aggressive malignancy with high morbidity and mortality rates and ranks as the sixth most common cancer worldwide. Despite numerous advances in therapeutic methods, the prognosis of HNSCC patients remains poor. Therefore, there is an urgent need to better understand the molecular mechanisms underlying HNSCC progression and to identify essential genes that could serve as effective biomarkers and potential treatment targets. In the present study, original data from three independent datasets were downloaded from the Gene Expression Omnibus (GEO) database and the R language was used to screen for differentially expressed genes (DEGs). PYGM and TNNC2 were finally selected from the overlapping DEGs of the three datasets for further analyses. Transcriptional and survival data related to PYGM and TNNC2 were retrieved from multiple online databases such as Oncomine, Gene Expression Profiling Interactive Analysis (GEPIA), cBioportal, and UALCAN. Quantitative real-time polymerase chain reaction (qPCR) analysis was used to validate PYGM and TNNC2 mRNA levels in HNSCC tissues and cell lines. Survival curves were plotted to evaluate the association of these two genes with HNSCC prognosis. PYGM and TNNC2 were significantly down-regulated in HNSCC, and the aberrant expression of PYGM and TNNC2 was correlated with HNSCC prognosis, implying their potential as therapeutic targets for HNSCC treatment or as biomarkers for diagnosis and prognosis.
|
84
|
Vijayakumar S, Conway M, Lió P, Angione C. Seeing the wood for the trees: a forest of methods for optimization and omic-network integration in metabolic modelling. Brief Bioinform 2019; 19:1218-1235. [PMID: 28575143 DOI: 10.1093/bib/bbx053] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/13/2017] [Indexed: 11/13/2022] Open
Abstract
Metabolic modelling has entered a mature phase with dozens of methods and software implementations available to the practitioner and the theoretician. It is not easy for a modeller to be able to see the wood (or the forest) for the trees. Driven by this analogy, we here present a 'forest' of principal methods used for constraint-based modelling in systems biology. This provides a tree-based view of methods available to prospective modellers, also available in interactive version at http://modellingmetabolism.net, where it will be kept updated with new methods after the publication of the present manuscript. Our updated classification of existing methods and tools highlights the most promising in the different branches, with the aim to develop a vision of how existing methods could hybridize and become more complex. We then provide the first hands-on tutorial for multi-objective optimization of metabolic models in R. We finally discuss the implementation of multi-view machine learning approaches in poly-omic integration. Throughout this work, we demonstrate the optimization of trade-offs between multiple metabolic objectives, with a focus on omic data integration through machine learning. We anticipate that the combination of a survey, a perspective on multi-view machine learning and a step-by-step R tutorial should be of interest for both the beginner and the advanced user.
Affiliation(s)
| | - Max Conway
- Computer Laboratory, University of Cambridge, UK
| | - Pietro Lió
- Computer Laboratory, University of Cambridge, UK
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, UK
|
85
|
Liu L, Wang S, Cen C, Peng S, Chen Y, Li X, Diao N, Li Q, Ma L, Han P. Identification of differentially expressed genes in pancreatic ductal adenocarcinoma and normal pancreatic tissues based on microarray datasets. Mol Med Rep 2019; 20:1901-1914. [PMID: 31257501 DOI: 10.3892/mmr.2019.10414] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/15/2018] [Accepted: 05/01/2019] [Indexed: 11/06/2022] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive malignant tumor with rapid progression and poor prognosis. In the present study, 11 high‑quality microarray datasets, comprising 334 tumor samples and 151 non‑tumor samples from the Gene Expression Omnibus, were screened, and integrative meta‑analysis of expression data was used to identify gene signatures that differentiate between PDAC and normal pancreatic tissues. Following the identification of differentially expressed genes (DEGs), two‑way hierarchical clustering analysis was performed for all DEGs using the gplots package in R software. Hub genes were then determined through protein‑protein interaction network analysis using NetworkAnalyst. In addition, functional annotation and pathway enrichment analyses of all DEGs were conducted in the Database for Annotation, Visualization, and Integrated Discovery. The expression levels and Kaplan‑Meier analysis of the top 10 upregulated and downregulated genes were verified in The Cancer Genome Atlas. A total of 1,587 DEGs, including 1,004 upregulated and 583 downregulated genes, were obtained by comparing PDAC with normal tissues. Of these, hematological and neurological expressed 1, integrin subunit α2 (ITGA2) and S100 calcium‑binding protein A6 (S100A6) were the top upregulated genes, and kinesin family member 1A, Dymeclin and β‑secretase 1 were the top downregulated genes. Reverse transcription‑quantitative PCR was performed to examine the expression levels of S100A6, KRT19 and GNG7, and the results suggested that S100A6 was significantly upregulated in PDAC compared with normal pancreatic tissues. ITGA2 overexpression was significantly associated with shorter overall survival times, whereas family with sequence similarity 46 member C overexpression was strongly associated with longer overall survival times. 
In addition, network‑based meta‑analysis confirmed growth factor receptor‑bound protein 2 and histone deacetylase 5 as pivotal hub genes in PDAC compared with normal tissue. In conclusion, the results of the present meta‑analysis identified PDAC‑related gene signatures, providing new perspectives and potential targets for PDAC diagnosis and treatment.
Affiliation(s)
- Liying Liu
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Siqi Wang
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Chunyuan Cen
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Shuyi Peng
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Yan Chen
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Xin Li
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Nan Diao
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Qian Li
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
| | - Ling Ma
- Advanced Application Team, GE Healthcare, Shanghai 201203, P.R. China
| | - Ping Han
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430022, P.R. China
|
86
|
|
87
|
Zhu J, Wang Z, Chen F. Association of Key Genes and Pathways with Atopic Dermatitis by Bioinformatics Analysis. Med Sci Monit 2019; 25:4353-4361. [PMID: 31184315 PMCID: PMC6582687 DOI: 10.12659/msm.916525] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022] Open
Abstract
Background Atopic dermatitis is a chronic inflammatory disease of the skin. It has a high prevalence worldwide and affected persons are prone to recurrent attacks, which seriously affect the physical and mental health of patients. The exact etiology of the disease is still unclear. Material/Methods There are 7 datasets on atopic dermatitis in the Gene Expression Omnibus database, including 142 lesional and 134 non-lesional skin biopsy samples. Differential analysis was performed after the datasets were integrated by the robust multi-array average method. Functional modules of GSE99802 were explored by weighted gene co-expression network analysis. The 4 most important modules were analyzed for pathway enrichment with Metascape. Results Significantly differentially expressed genes included 41 upregulated and 10 downregulated genes. The following 5 of the most important upregulated genes had the strongest association with atopic dermatitis. SERPINB3 and SERPINB4 promote inflammation and impaired skin barrier function in the early stage of atopic dermatitis. S100A9 aggravates the inflammatory response by inducing the activation of toll-like receptor 4, neutrophil chemotaxis, neutrophilic inflammation, and the amplification of interleukin-8. MMP1 is the key protease of skin collagen degradation, keeping the extracellular matrix in dynamic balance. MMP12 induces the aggregation of various inflammatory cells into inflammatory tissue. The enriched pathways of each module mainly include Cellular responses to external stimuli, Metabolism of RNA and Translation, and Infectious disease. Conclusions The associated pathways and genes not only help us understand the molecular mechanism of the disease, but also provide research directions or targets for accurate diagnosis and treatment.
Affiliation(s)
- Jie Zhu
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
| | - Zheng Wang
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
| | - Fengzhe Chen
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
|
88
|
Wang Z, Zhu J, Chen F, Ma L. Weighted Gene Coexpression Network Analysis Identifies Key Genes and Pathways Associated with Idiopathic Pulmonary Fibrosis. Med Sci Monit 2019; 25:4285-4304. [PMID: 31177264 PMCID: PMC6582683 DOI: 10.12659/msm.916828] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Idiopathic pulmonary fibrosis (IPF) is a life-threatening disease with an unknown etiology. Gene expression microarray data have provided some insights into the molecular mechanisms of IPF. This study aimed to identify key genes and significant signaling pathways involved in IPF using bioinformatics analysis. MATERIAL AND METHODS Differentially expressed genes (DEGs) were identified using integrated analysis of gene expression data with a robust rank aggregation (RRA) method. The Connectivity Map (CMAP) was used to identify gene-expression signatures associated with IPF. Weighted gene coexpression network analysis (WGCNA) was used to explore the functional modules involved in the pathogenesis of IPF. RESULTS A total of 191 patients with IPF and 101 normal controls from six genome-wide expression datasets were included. CMAP predicted several small molecular agents as potential gene targets in IPF. Several functional modules were detected that showed the highest correlation with IPF, including an extracellular matrix (ECM) component, and a myeloid leukocyte migration and activation component involved in the immune response. Hub genes were identified in the key functional modules that might have a role in the progression of IPF. CONCLUSIONS WGCNA was used to identify functional modules and hub genes involved in the pathogenesis of IPF.
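The robust rank aggregation (RRA) step named in the methods can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so this is an assumption-laden illustration rather than the authors' pipeline: it assumes every dataset yields a full ranking of the same genes, and it scores each gene by the smallest order-statistic p-value of its normalized ranks. The gene names, the `aggregate` helper, and the toy rankings are hypothetical, and the multiple-testing correction applied by published RRA implementations is omitted.

```python
from math import comb


def beta_order_cdf(x, k, n):
    """P(k-th smallest of n independent uniform(0, 1) draws <= x), as a binomial tail."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rra_score(norm_ranks):
    """Smallest order-statistic p-value over k = 1..n (uncorrected)."""
    r = sorted(norm_ranks)
    n = len(r)
    return min(beta_order_cdf(r[k - 1], k, n) for k in range(1, n + 1))


def aggregate(rankings):
    """rankings: gene lists ordered from most to least significant.

    Assumption: all lists contain exactly the same genes.
    Returns (gene, score) pairs, most consistently top-ranked first.
    """
    scores = {}
    for gene in rankings[0]:
        norm = [(lst.index(gene) + 1) / len(lst) for lst in rankings]
        scores[gene] = rra_score(norm)
    return sorted(scores.items(), key=lambda kv: kv[1])


# Hypothetical rankings from three datasets.
toy_rankings = [
    ["G1", "G2", "G3", "G4", "G5"],
    ["G1", "G3", "G2", "G5", "G4"],
    ["G1", "G4", "G5", "G2", "G3"],
]
for gene, score in aggregate(toy_rankings):
    print(gene, round(score, 4))
```

A gene that ranks near the top in every list receives a small score, while genes whose ranks look like random draws receive scores close to 1.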
Affiliation(s)
- Zheng Wang
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
| | - Jie Zhu
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
| | - Fengzhe Chen
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
| | - Lixian Ma
- Department of Infectious Diseases, Qilu Hospital, Shandong University, Jinan, Shandong, China (mainland)
|
89
|
Fang HY, Wang Q, Zhang JZ, Huang H. Prognostic value of expression of HOXB7 in gastric cancer. Shijie Huaren Xiaohua Zazhi 2019; 27:671-675. [DOI: 10.11569/wcjd.v27.i11.671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Gastric cancer (GC) is the fifth most common cancer and the third leading cause of cancer death worldwide. Identifying new targets for the treatment and predictive evaluation of GC is of great significance, especially for improving the prognosis. Few studies have focused on the clinical significance of homeobox B7 (HOXB7) expression in GC.
AIM To assess the prognostic value of HOXB7 expression in GC.
METHODS HOXB7 data were retrieved from the Oncomine GC database. The prognostic value of HOXB7 was assessed using an online survival analysis tool (KM Plotter database).
RESULTS Based on the Oncomine database, HOXB7 expression in GC was significantly higher than that in normal tissue (P < 0.05). Further analysis revealed that expression of the HOXB7 gene in both intestinal and diffuse GCs was significantly higher than that in normal tissue. Moreover, KM Plotter analysis of overall survival indicated that high HOXB7 expression was closely associated with poor survival in GC (P < 0.05). Furthermore, high HOXB7 expression was also associated with overall survival in the different Lauren subtypes of GC (P < 0.05).
CONCLUSION High HOXB7 expression might be an important biological event during gastric oncogenesis, and could be a novel prognostic predictive factor for GC.
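The study relies on the KM Plotter web tool, but the Kaplan-Meier estimate behind such survival curves can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the follow-up times, event flags, group split, and the `kaplan_meier` helper are hypothetical and are not data or code from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns [(t, S(t))] at each distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical follow-up (months) for a high-expression and a low-expression group.
high_curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
low_curve = kaplan_meier([10, 20, 25, 30, 36], [0, 1, 0, 0, 0])
print(high_curve)
print(low_curve)
```

Comparing the two curves, for example with a log-rank test (not shown here), is what supports a statement that high expression is associated with poorer survival.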
Affiliation(s)
- Hong-Yan Fang
- Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
| | - Qun Wang
- Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
| | - Jiang-Zhou Zhang
- Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
| | - Hui Huang
- Department of Oncology, Wuhan Fifth Hospital, Wuhan 430050, Hubei Province, China
|
90
|
Navarro FCP, Mohsen H, Yan C, Li S, Gu M, Meyerson W, Gerstein M. Genomics and data science: an application within an umbrella. Genome Biol 2019; 20:109. [PMID: 31142351 PMCID: PMC6540394 DOI: 10.1186/s13059-019-1724-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 01/09/2023] Open
Abstract
Data science allows the extraction of practical insights from large-scale data. Here, we contextualize it as an umbrella term, encompassing several disparate subdomains. We focus on how genomics fits as a specific application subdomain, in terms of well-known 3 V data and 4 M process frameworks (volume-velocity-variety and measurement-mining-modeling-manipulation, respectively). We further analyze the technical and cultural “exports” and “imports” between genomics and other data-science subdomains (e.g., astronomy). Finally, we discuss how data value, privacy, and ownership are pressing issues for data science applications, in general, and are especially relevant to genomics, due to the persistent nature of DNA.
Affiliation(s)
- Fábio C P Navarro
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA
| | - Hussein Mohsen
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA
| | - Chengfei Yan
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA
| | - Shantao Li
- Department of Computer Science, Stanford University, Stanford, CA, 94305, USA.,Department of Biomedical Data Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Mengting Gu
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA
| | - William Meyerson
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA. .,Department of Molecular Biophysics and Biochemistry, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA. .,Department of Computer Science, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA. .,Department of Statistics and Data Science, Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT, 06520, USA.
|
91
|
Mahi NA, Najafabadi MF, Pilarczyk M, Kouril M, Medvedovic M. GREIN: An Interactive Web Platform for Re-analyzing GEO RNA-seq Data. Sci Rep 2019; 9:7580. [PMID: 31110304 PMCID: PMC6527554 DOI: 10.1038/s41598-019-43935-8] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Received: 11/14/2018] [Accepted: 05/05/2019] [Indexed: 12/31/2022] Open
Abstract
The vast amount of RNA-seq data deposited in Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA) is still a grossly underutilized resource for biomedical research. To remove technical roadblocks for reusing these data, we have developed a web-application GREIN (GEO RNA-seq Experiments Interactive Navigator) which provides user-friendly interfaces to manipulate and analyze GEO RNA-seq data. GREIN is powered by the back-end computational pipeline for uniform processing of RNA-seq data and the large number (>6,500) of already processed datasets. The front-end user interfaces provide a wealth of user-analytics options including sub-setting and downloading processed data, interactive visualization, statistical power analyses, construction of differential gene expression signatures and their comprehensive functional characterization, and connectivity analysis with LINCS L1000 data. The combination of the massive amount of back-end data and front-end analytics options driven by user-friendly interfaces makes GREIN a unique open-source resource for re-using GEO RNA-seq data. GREIN is accessible at: https://shiny.ilincs.org/grein , the source code at: https://github.com/uc-bd2k/grein , and the Docker container at: https://hub.docker.com/r/ucbd2k/grein .
Affiliation(s)
- Naim Al Mahi
- Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, 3223 Eden Avenue, Cincinnati, OH, 45220, USA
| | - Mehdi Fazel Najafabadi
- Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, 3223 Eden Avenue, Cincinnati, OH, 45220, USA
| | - Marcin Pilarczyk
- Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, 3223 Eden Avenue, Cincinnati, OH, 45220, USA
| | - Michal Kouril
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Mario Medvedovic
- Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, 3223 Eden Avenue, Cincinnati, OH, 45220, USA.
|
92
|
Vijayakumar P, Bakyaraj S, Singaravadivelan A, Vasanthakumar T, Suresh R. Meta-analysis of mammary RNA seq datasets reveals the molecular understanding of bovine lactation biology. Genome 2019; 62:489-501. [PMID: 31071269 DOI: 10.1139/gen-2018-0144] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
A better understanding of the biology of lactation, both in terms of gene expression and the identification of candidate genes for the production of milk and its components, is made possible by recent advances in RNA seq technology. The purpose of this study was to understand the synthesis of milk components and the molecular pathways involved, as well as to identify candidate genes for milk production traits within whole mammary transcriptomic datasets. We performed a meta-analysis of publicly available RNA seq transcriptome datasets of mammary tissue/milk somatic cells. In total, 11 562 genes were commonly identified across all RNA seq based mammary gland transcriptomes. Functional annotation of commonly expressed genes revealed the molecular processes that contribute to the synthesis of fats, proteins, and lactose in mammary secretory cells and the molecular pathways responsible for milk synthesis. In addition, we identified several candidate genes responsible for milk production traits and constructed a gene regulatory network for RNA seq data. In conclusion, this study provides a basic understanding of the lactation biology of cows at the gene expression level.
Affiliation(s)
- Periyasamy Vijayakumar
- a Veterinary College and Research Institute, TANUVAS, Orathanadu-614 625, Thanjavur, Tamil Nadu, India
| | - Sanniyasi Bakyaraj
- b College of Poultry Production and Management, TANUVAS, Hosur-635 110, Krishnagiri, Tamil Nadu, India
| | | | - Thangavelu Vasanthakumar
- a Veterinary College and Research Institute, TANUVAS, Orathanadu-614 625, Thanjavur, Tamil Nadu, India
| | - Ramalingam Suresh
- a Veterinary College and Research Institute, TANUVAS, Orathanadu-614 625, Thanjavur, Tamil Nadu, India
|
93
|
Djordjevic D, Tang JYS, Chen YX, Kwan SLS, Ling RWK, Qian G, Woo CYY, Ellis SJ, Ho JWK. Discovery of perturbation gene targets via free text metadata mining in Gene Expression Omnibus. Comput Biol Chem 2019; 80:152-158. [PMID: 30959271 DOI: 10.1016/j.compbiolchem.2019.03.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/08/2019] [Accepted: 03/23/2019] [Indexed: 10/27/2022]
Abstract
There exist over 2.5 million publicly available gene expression samples across 101,000 data series in NCBI's Gene Expression Omnibus (GEO) database. Because GEO's free text metadata does not use standardised ontology terms to annotate the experimental type and sample type, this database remains difficult to harness computationally without significant manual intervention. In this work, we present an interactive R/Shiny tool called GEOracle that utilises text mining and machine learning techniques to automatically identify perturbation experiments, group treatment and control samples and perform differential expression analysis. We present applications of GEOracle to discover conserved signalling pathway target genes and identify an organ specific gene regulatory network. GEOracle is effective in discovering perturbation gene targets in GEO by harnessing its free text metadata. Its effectiveness and applicability have been demonstrated by cross validation and two real-life case studies. It opens up new avenues to unlock the gene regulatory information embedded inside large biological databases such as GEO. GEOracle is available at https://github.com/VCCRI/GEOracle.
Affiliation(s)
- Djordje Djordjevic
- Victor Chang Cardiac Research Institute, Sydney, Australia; University of New South Wales, Sydney, Australia
| | - Joshua Y S Tang
- Victor Chang Cardiac Research Institute, Sydney, Australia; University of New South Wales, Sydney, Australia
| | - Yun Xin Chen
- Victor Chang Cardiac Research Institute, Sydney, Australia
| | | | | | - Gordon Qian
- Victor Chang Cardiac Research Institute, Sydney, Australia
| | | | - Samuel J Ellis
- Victor Chang Cardiac Research Institute, Sydney, Australia
| | - Joshua W K Ho
- Victor Chang Cardiac Research Institute, Sydney, Australia; University of New South Wales, Sydney, Australia; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
|
94
|
Abstract
The identification of genes that are differentially expressed provides a molecular foothold onto biological questions of interest. Whether some genes are more likely to be differentially expressed than others, and to what degree, has never been assessed on a global scale. Here, we reanalyze more than 600 studies and find that knowledge of a gene’s prior probability of differential expression (DE) allows for accurate prediction of DE hit lists, regardless of the biological question. This result suggests redundancy in transcriptomics experiments that both informs gene set interpretation and highlights room for growth within the field. Differential expression (DE) is commonly used to explore molecular mechanisms of biological conditions. While many studies report significant results between their groups of interest, the degree to which results are specific to the question at hand is not generally assessed, potentially leading to inaccurate interpretation. This could be particularly problematic for metaanalysis where replicability across datasets is taken as strong evidence for the existence of a specific, biologically relevant signal, but which instead may arise from recurrence of generic processes. To address this, we developed an approach to predict DE based on an analysis of over 600 studies. A predictor based on empirical prior probability of DE performs very well at this task (mean area under the receiver operating characteristic curve, ∼0.8), indicating that a large fraction of DE hit lists are nonspecific. In contrast, predictors based on attributes such as gene function, mutation rates, or network features perform poorly. Genes associated with sex, the extracellular matrix, the immune system, and stress responses are prominent within the “DE prior.” In a series of control studies, we show that these patterns reflect shared biology rather than technical artifacts or ascertainment biases. 
Finally, we demonstrate the application of the DE prior to data interpretation in three use cases: (i) breast cancer subtyping, (ii) single-cell genomics of pancreatic islet cells, and (iii) metaanalysis of lung adenocarcinoma and renal transplant rejection transcriptomics. In all cases, we find hallmarks of generic DE, highlighting the need for nuanced interpretation of gene phenotypic associations.
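The idea of predicting a study's DE hit list from an empirical prior can be illustrated with a minimal Python sketch. It assumes the input is a binary matrix of DE calls (studies by genes), builds the prior for a held-out study from the remaining studies, and scores the prediction with the area under the ROC curve. The toy matrix and the `de_prior` and `auroc` helpers are hypothetical; the abstract does not describe the authors' implementation at this level of detail.

```python
def de_prior(de_calls, held_out):
    """Fraction of the other studies in which each gene was called DE."""
    others = [study for i, study in enumerate(de_calls) if i != held_out]
    n_genes = len(de_calls[0])
    return [sum(study[g] for study in others) / len(others) for g in range(n_genes)]


def auroc(scores, labels):
    """AUROC via the Mann-Whitney statistic; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical DE calls: 4 studies (rows) x 5 genes (columns), 1 = called DE.
toy_calls = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
]
prior = de_prior(toy_calls, held_out=3)
print(prior)
print(auroc(prior, toy_calls[3]))
```

An AUROC near 0.5 would mean the prior carries no information about the held-out hit list; values well above 0.5 indicate that the hit list is largely predictable without knowing the biological question.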
|
95
|
Hu Z, Olatoye MO, Marla S, Morris GP. An Integrated Genotyping-by-Sequencing Polymorphism Map for Over 10,000 Sorghum Genotypes. The Plant Genome 2019; 12:180044. [PMID: 30951089 DOI: 10.3835/plantgenome2018.06.0044] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Indexed: 05/20/2023]
Abstract
Mining crop genomic variation can facilitate the genetic research of complex traits and molecular breeding. In sorghum [Sorghum bicolor L. (Moench)], several large-scale single nucleotide polymorphism (SNP) datasets have been generated using genotyping-by-sequencing of ApeKI reduced representation libraries. However, data reuse has been impeded by differences in reference genome coordinates among datasets. To facilitate reuse of these data, we constructed and characterized an integrated 459,304-SNP dataset for 10,323 sorghum genotypes on the version 3.1 reference genome. The SNP distribution showed high enrichment in subtelomeric chromosome arms and in genic regions (48% of SNPs) and was highly correlated ( = 0.82) to the distribution of ApeKI restriction sites. The genetic structure reflected population differences by botanical race, as well as familial structure among recombinant inbred lines (RILs). Faster linkage disequilibrium decay was observed in the diversity panel than in the RILs, as expected, given the greater opportunity for recombination in diverse populations. To validate the quality and utility of the integrated SNP dataset, we used genome-wide association studies (GWAS) of genebank phenotype data, precisely mapping several known genes (e.g and ) and identifying novel associations for other traits. We further validated the dataset with GWAS of new and published plant height and flowering time data in a nested association mapping population, precisely mapping known genes and identifying epistatic interactions underlying both traits. These findings validate this integrated SNP dataset as a useful genomics resource for sorghum genetics and breeding.
|
96
|
Moradzadeh K, Gheisari Y. The analysis of a time-course transcriptome profile by systems biology approaches reveals key molecular processes in acute kidney injury. Journal of Research in Medical Sciences 2019; 24:3. [PMID: 30815016 PMCID: PMC6383344 DOI: 10.4103/jrms.jrms_690_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/27/2018] [Revised: 09/14/2018] [Accepted: 10/07/2018] [Indexed: 11/22/2022]
Abstract
Background: Acute kidney injury is a common debilitating disease with no curative treatment. The recent growth of big biological data is expected to expand our understanding of the disorder if the data are appropriately analyzed to generate translational knowledge. Here, we re-analyzed a time-course microarray dataset on mRNA expression in rat kidneys exposed to ischemia-reperfusion to identify key underlying biological processes. Materials and Methods: The dataset was quality controlled by principal component analysis and hierarchical clustering. Using the limma R package, differentially expressed (DE) genes were detected and then clustered according to their expression trajectories. The biological processes related to each cluster were identified using gene ontology enrichment analysis. In addition, the interaction map of proteins encoded by the DE genes was constructed, and the functions related to network central genes were determined. Furthermore, signaling pathways related to the DE genes were identified using pathway enrichment analysis. Results: We found 8139 DE genes that drive critical processes such as the control of blood circulation, reactive species metabolism, mitochondrial respiration, apoptosis, cell proliferation, as well as inflammatory and immunological reactions. A role for less recognized pathways such as olfactory signaling in acute kidney injury is also proposed and remains to be investigated in future studies. Conclusion: Using a top-down systems biology approach, we have suggested novel potential genes and pathways to be targeted for kidney regeneration.
Affiliation(s)
- Kobra Moradzadeh
  - Department of Genetics and Molecular Biology, Isfahan University of Medical Sciences, Isfahan, Iran
- Yousof Gheisari
  - Department of Genetics and Molecular Biology, Isfahan University of Medical Sciences, Isfahan, Iran; Regenerative Medicine Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
|
97
|
Moretto M, Sonego P, Villaseñor-Altamirano AB, Engelen K. First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_. BMC Bioinformatics 2019; 20:54. [PMID: 30691411 PMCID: PMC6348648 DOI: 10.1186/s12859-019-2643-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/05/2018] [Accepted: 01/22/2019] [Indexed: 11/30/2022] Open
Abstract
Background: Exploring cellular responses to stimuli using extensive gene expression profiles has become routine. Raw and processed data from these studies are available in public databases, but the opportunity to fully exploit such rich datasets is limited by the large heterogeneity of data formats. In recent years, several approaches have been proposed to integrate gene expression data effectively for analysis and exploration at a broader level. Despite the different goals and approaches to gene expression data integration, the first step is common to every proposed method: data acquisition. Although extracting valuable information from a set of downloaded files is seemingly straightforward, things can rapidly get complicated, especially as the number of experiments grows. Transcriptomic datasets are deposited in public databases with little regard to data format, so retrieving raw data can become a challenging task. For RNA-seq experiments this problem is partially mitigated by the fact that raw reads are generally available in databases such as the NCBI SRA. For microarray experiments, standards are not equally well established or enforced during submission, and a multitude of data formats has therefore emerged.
Results: COMMAND>_ is a specialized tool meant to simplify gene expression data acquisition. It is a flexible multi-user web application that allows users to search and download gene expression experiments, extract only the relevant information from experiment files, re-annotate microarray platforms, and present data in a simple and coherent data model for subsequent analysis.
Conclusions: COMMAND>_ facilitates the creation of local datasets of gene expression data from both microarray and RNA-seq experiments and may be a more efficient tool for building integrated gene expression compendia. COMMAND>_ is free and open-source software, with publicly available tutorials and documentation.
Affiliation(s)
- Marco Moretto
  - Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010, San Michele all'Adige, Italy
- Paolo Sonego
  - Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010, San Michele all'Adige, Italy
- Ana B Villaseñor-Altamirano
  - Laboratorio Internacional de Investigación Sobre el Genoma Humano, Universidad Nacional Autónoma De México, 76230, Juriquilla, Querétaro, Mexico; Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, 62210, Cuernavaca, Morelos, Mexico
- Kristof Engelen
  - Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach, via E. Mach 1, 38010, San Michele all'Adige, Italy
|
98
|
Castro DM, de Veaux NR, Miraldi ER, Bonneau R. Multi-study inference of regulatory networks for more accurate models of gene regulation. PLoS Comput Biol 2019; 15:e1006591. [PMID: 30677040 PMCID: PMC6363223 DOI: 10.1371/journal.pcbi.1006591] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 03/06/2018] [Revised: 02/05/2019] [Accepted: 10/23/2018] [Indexed: 12/16/2022] Open
Abstract
Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integrating data across different studies, however, raises numerous technical concerns. Hence, a common approach in network inference, and broadly in genomics research, is to separately learn models from each dataset and combine the results. Individual models, however, often suffer from under-sampling, poor generalization and limited network recovery. In this study, we explore previous integration strategies, such as batch-correction and model ensembles, and introduce a new multitask learning approach for joint network inference across several datasets. Our method initially estimates the activities of transcription factors, and subsequently, infers the relevant network topology. As regulatory interactions are context-dependent, we estimate model coefficients as a combination of both dataset-specific and conserved components. In addition, adaptive penalties may be used to favor models that include interactions derived from multiple sources of prior knowledge including orthogonal genomics experiments. We evaluate generalization and network recovery using examples from Bacillus subtilis and Saccharomyces cerevisiae, and show that sharing information across models improves network reconstruction. Finally, we demonstrate robustness to both false positives in the prior information and heterogeneity among datasets. Due to increasing availability of biological data, methods to properly integrate data generated across the globe become essential for extracting reproducible insights into relevant research questions. 
In this work, we developed a framework to reconstruct gene regulatory networks from expression datasets generated in separate studies—and thus, because of technical variation (different dates, handlers, laboratories, protocols etc…), challenging to integrate. Since regulatory mechanisms are often shared across conditions, we hypothesized that drawing conclusions from various data sources would improve performance of gene regulatory network inference. By transferring knowledge among regulatory models, our method is able to detect weaker patterns that are conserved across datasets, while also being able to detect dataset-unique interactions. We also allow incorporation of prior knowledge on network structure to favor models that are somewhat similar to the prior itself. Using two model organisms, we show that joint network inference outperforms inference from a single dataset. We also demonstrate that our method is robust to false edges in the prior and to low condition overlap across datasets, and that it can outperform current data integration strategies.
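The idea of estimating each dataset's coefficients as a combination of a conserved and a dataset-specific component can be illustrated with a minimal sketch. This is not the authors' implementation: ridge penalties are swapped in here purely so that the fit has a closed form, and the function name `fit_shared_plus_specific`, the penalty weights, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np

def fit_shared_plus_specific(Xs, ys, lam_shared=1.0, lam_specific=10.0):
    """Ridge fit of w_d = s + b_d for each dataset d (illustrative only)."""
    n_datasets = len(Xs)
    p = Xs[0].shape[1]
    # Build one stacked design whose columns are [s | b_1 | ... | b_D].
    rows = []
    for d, X in enumerate(Xs):
        block = np.zeros((X.shape[0], p * (n_datasets + 1)))
        block[:, :p] = X                          # shared (conserved) component
        block[:, p * (d + 1): p * (d + 2)] = X    # dataset-specific component
        rows.append(block)
    A = np.vstack(rows)
    y = np.concatenate(ys)
    # Separate ridge penalties for shared and dataset-specific coefficients
    # (an assumption of this sketch, not the penalty used in the paper).
    penalty = np.concatenate([
        np.full(p, lam_shared),
        np.full(p * n_datasets, lam_specific),
    ])
    theta = np.linalg.solve(A.T @ A + np.diag(penalty), A.T @ y)
    shared = theta[:p]
    specific = theta[p:].reshape(n_datasets, p)
    return shared, specific

# Toy example: two datasets that agree on two coefficients and differ on one.
rng = np.random.default_rng(0)
true_weights = np.array([[2.0, 0.0, -1.0], [2.0, 0.5, -1.0]])
Xs = [rng.normal(size=(50, 3)) for _ in true_weights]
ys = [X @ w for X, w in zip(Xs, true_weights)]
shared, specific = fit_shared_plus_specific(Xs, ys, lam_shared=1e-3, lam_specific=1e-3)
per_dataset = shared + specific  # row d holds the coefficients for dataset d
```

Raising `lam_specific` relative to `lam_shared` pushes the fit toward a single conserved model, while lowering it lets each dataset keep its own interactions.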
Affiliation(s)
- Nicholas R de Veaux
  - Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
- Emily R Miraldi
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Divisions of Immunobiology & Biomedical Informatics, Cincinnati Children's Hospital, Cincinnati, OH 45229, USA
- Richard Bonneau
  - New York University, New York, NY 10003, USA; Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
|
99
|
Lee YS, Krishnan A, Oughtred R, Rust J, Chang CS, Ryu J, Kristensen VN, Dolinski K, Theesfeld CL, Troyanskaya OG. A Computational Framework for Genome-wide Characterization of the Human Disease Landscape. Cell Syst 2019; 8:152-162.e6. [PMID: 30685436 PMCID: PMC7374759 DOI: 10.1016/j.cels.2018.12.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/08/2018] [Revised: 10/16/2018] [Accepted: 12/20/2018] [Indexed: 01/21/2023]
Abstract
A key challenge for the diagnosis and treatment of complex human diseases is identifying their molecular basis. Here, we developed a unified computational framework, URSAHD (Unveiling RNA Sample Annotation for Human Diseases), that leverages machine learning and the hierarchy of anatomical relationships present among diseases to integrate thousands of clinical gene expression profiles and identify molecular characteristics specific to each of the hundreds of complex diseases. URSAHD can distinguish between closely related diseases more accurately than literature-validated genes or traditional differential-expression-based computational approaches and is applicable to any disease, including rare and understudied ones. We demonstrate the utility of URSAHD in classifying related nervous system cancers and experimentally verifying novel neuroblastoma-associated genes identified by URSAHD. We highlight the applications for potential targeted drug-repurposing and for quantitatively assessing the molecular response to clinical therapies. URSAHD is freely available for public use, including the use of underlying models, at ursahd.princeton.edu.
Affiliation(s)
- Young-Suk Lee
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA; School of Biological Sciences, Seoul National University, Seoul, South Korea
- Arjun Krishnan
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA; Departments of Computational Mathematics, Science, and Engineering and Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Rose Oughtred
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Jennifer Rust
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Christie S Chang
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Joseph Ryu
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Vessela N Kristensen
  - Department of Genetics, Institute of Cancer Research, Oslo University Hospital, Radiumhospitalet, Oslo, Norway; Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway; Department of Clinical Molecular Biology (EpiGen), Division of Medicine, Akershus University Hospital, Lørenskog, Norway
- Kara Dolinski
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Chandra L Theesfeld
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Olga G Troyanskaya
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA; Flatiron Institute, Simons Foundation, New York, NY, USA
|
100
|
Transferrin receptor-involved HIF-1 signaling pathway in cervical cancer. Cancer Gene Ther 2019; 26:356-365. [DOI: 10.1038/s41417-019-0078-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/31/2018] [Accepted: 12/28/2018] [Indexed: 12/17/2022]
|