1
Selvaraj G, Kaliamurthi S, Peslherbe GH, Wei DQ. Identifying potential drug targets and candidate drugs for COVID-19: biological networks and structural modeling approaches. F1000Res 2021; 10:127. [PMID: 33968364 PMCID: PMC8080978 DOI: 10.12688/f1000research.50850.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 02/01/2021] [Indexed: 02/05/2023] Open
Abstract
Background: Coronavirus (CoV) is an emerging human pathogen causing severe acute respiratory syndrome (SARS) around the world. Earlier identification of biomarkers for SARS can facilitate detection and reduce the mortality rate of the disease. Thus, using an integrated network analysis and structural modeling approach, we aimed to explore potential drug targets and candidate drugs for coronavirus-mediated SARS. Methods: Differential expression (DE) analysis of the expression profiles of CoV-infected host genes (HGs) was conducted using Limma. Highly integrated DE-CoV-HGs were selected to construct the protein-protein interaction (PPI) network. Results: Using the Walktrap algorithm, highly interconnected modules, including module 1 (202 nodes), module 2 (126 nodes), and module 3 (121 nodes), were retrieved from the PPI network. MYC, HDAC9, NCOA3, CEBPB, VEGFA, BCL3, SMAD3, SMURF1, KLHL12, CBL, ERBB4, and CRKL were identified as potential drug targets (PDTs), which are highly expressed in the human respiratory system after CoV infection. The functional terms growth factor receptor binding, c-type lectin receptor signaling, interleukin-1 mediated signaling, TAP-dependent antigen processing and presentation of peptide antigen via MHC class I, stimulatory T cell receptor signaling, innate immune response signaling, signal transduction, and cytokine immune signaling pathways were enriched in the modules. Protein-protein docking results demonstrated the strong binding affinity (-314.57 kcal/mol) of the ERBB4-3cLpro complex, which was selected as a drug target. In addition, molecular dynamics simulations indicated the structural stability and flexibility of the ERBB4-3cLpro complex. Further, Wortmannin was proposed as a candidate drug targeting ERBB4 to control SARS-CoV-2 pathogenesis by inhibiting receptor tyrosine kinase-dependent macropinocytosis, MAPK signaling, and NF-κB signaling pathways, which regulate host cell entry, replication, and modulation of the host immune system. Conclusion: We conclude that the CoV drug target "ERBB4" and the candidate drug "Wortmannin" provide insights into possible personalized therapeutics for emerging COVID-19.
Affiliation(s)
- Gurudeeban Selvaraj
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
- Satyavani Kaliamurthi
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
- Gilles H. Peslherbe
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
- Dong-Qing Wei
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
  - The State Key Laboratory of Microbial Metabolism, College of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, Shanghai, 200240, China
  - IASIA (International Association of Scientists in the Interdisciplinary Areas), 125 Boul. de Bromont, Quebec, J2L 2K7, Canada
2
Selvaraj G, Kaliamurthi S, Peslherbe GH, Wei DQ. Identifying potential drug targets and candidate drugs for COVID-19: biological networks and structural modeling approaches. F1000Res 2021; 10:127. [PMID: 33968364 PMCID: PMC8080978 DOI: 10.12688/f1000research.50850.3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 05/10/2021] [Indexed: 02/05/2023] Open
Abstract
Background: Coronavirus (CoV) is an emerging human pathogen causing severe acute respiratory syndrome (SARS) around the world. Earlier identification of biomarkers for SARS can facilitate detection and reduce the mortality rate of the disease. Thus, using an integrated network analysis and structural modeling approach, we aimed to explore potential drug targets and candidate drugs for coronavirus-mediated SARS. Methods: Differential expression (DE) analysis of the expression profiles of CoV-infected host genes (HGs) was conducted using Limma. Highly integrated DE-CoV-HGs were selected to construct the protein-protein interaction (PPI) network. Results: Using the Walktrap algorithm, highly interconnected modules, including module 1 (202 nodes), module 2 (126 nodes), and module 3 (121 nodes), were retrieved from the PPI network. MYC, HDAC9, NCOA3, CEBPB, VEGFA, BCL3, SMAD3, SMURF1, KLHL12, CBL, ERBB4, and CRKL were identified as potential drug targets (PDTs), which are highly expressed in the human respiratory system after CoV infection. The functional terms growth factor receptor binding, c-type lectin receptor signaling, interleukin-1 mediated signaling, TAP-dependent antigen processing and presentation of peptide antigen via MHC class I, stimulatory T cell receptor signaling, innate immune response signaling, signal transduction, and cytokine immune signaling pathways were enriched in the modules. Protein-protein docking results demonstrated the strong binding affinity (-314.57 kcal/mol) of the ERBB4-3cLpro complex, which was selected as a drug target. In addition, molecular dynamics simulations indicated the structural stability and flexibility of the ERBB4-3cLpro complex. Further, Wortmannin was proposed as a candidate drug targeting ERBB4 to control SARS-CoV-2 pathogenesis by inhibiting receptor tyrosine kinase-dependent macropinocytosis, MAPK signaling, and NF-κB signaling pathways, which regulate host cell entry, replication, and modulation of the host immune system. Conclusion: We conclude that the CoV drug target "ERBB4" and the candidate drug "Wortmannin" provide insights into possible personalized therapeutics for emerging COVID-19.
Affiliation(s)
- Gurudeeban Selvaraj
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
- Satyavani Kaliamurthi
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
- Gilles H. Peslherbe
  - Centre for Research in Molecular Modeling, Concordia University, Montreal, Quebec, H4B 1R6, Canada
- Dong-Qing Wei
  - Centre of Interdisciplinary Science-Computational Life Sciences, College of Chemistry and Chemical Engineering, Henan University of Technology, Zhengzhou, Henan, 450001, China
  - The State Key Laboratory of Microbial Metabolism, College of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, Shanghai, 200240, China
  - IASIA (International Association of Scientists in the Interdisciplinary Areas), 125 Boul. de Bromont, Quebec, J2L 2K7, Canada
3
Mallik S, Zhao Z. Multi-Objective Optimized Fuzzy Clustering for Detecting Cell Clusters from Single-Cell Expression Profiles. Genes (Basel) 2019; 10:E611. [PMID: 31412637 PMCID: PMC6723724 DOI: 10.3390/genes10080611] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/21/2019] [Revised: 07/30/2019] [Accepted: 08/07/2019] [Indexed: 02/06/2023] Open
Abstract
Rapid advances in single-cell RNA sequencing (scRNA-seq) allow measurement of gene expression at single-cell resolution in complex diseases or tissues. While many methods have been developed to detect cell clusters from scRNA-seq data, this task remains a major challenge. We propose a multi-objective optimization-based fuzzy clustering approach for detecting cell clusters from scRNA-seq data. First, we conducted initial filtering and SCnorm normalization. We considered various case studies by selecting different cluster numbers (cl = 2 to a user-defined number) and applied the fuzzy c-means clustering algorithm to each case individually. For each case, we evaluated the scores of four cluster validity indices: Partition Entropy (PE), Partition Coefficient (PC), Modified Partition Coefficient (MPC), and Fuzzy Silhouette Index (FSI). Next, we set the first measure as a minimization objective (↓) and the remaining three as maximization objectives (↑), and then applied a multi-objective decision-making technique, TOPSIS, to identify the optimal solution. The solution (case study) with the highest TOPSIS score was selected as the final optimal clustering. Finally, we obtained differentially expressed genes (DEGs) using Limma by comparing the expression of the samples in each resultant cluster with that in the remaining clusters. We applied our approach to a scRNA-seq dataset for the rare intestinal cell type in mice [GEO ID: GSE62270, 23,630 features (genes) and 288 cells]. The optimal cluster result (TOPSIS optimal score = 0.858) comprised two clusters, one with 115 cells and the other with 91 cells. The scores of the four cluster validity indices, FSI, PE, PC, and MPC, for the optimized fuzzy clustering were 0.482, 0.578, 0.607, and 0.215, respectively. The Limma analysis identified 1240 DEGs (cluster 1 vs. cluster 2). The top ten gene markers were Rps21, Slc5a1, Crip1, Rpl15, Rpl3, Rpl27a, Khk, Rps3a1, Aldob and Rps17. In this list, Khk (encoding ketohexokinase) is a novel marker for the rare intestinal cell type. In summary, this method is useful for detecting cell clusters from scRNA-seq data.
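The selection step described in this abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `modified_partition_coefficient` and `topsis` are hypothetical, equal criterion weights are assumed, and the MPC formula used is the commonly cited definition MPC = 1 - c/(c-1) * (1 - PC), which the abstract does not spell out. Only the row for cl = 2 uses the values reported above; the rows for cl = 3 and cl = 4 are invented for illustration, so the resulting scores are not expected to reproduce the reported TOPSIS score of 0.858.

```python
import math


def modified_partition_coefficient(pc, n_clusters):
    """Rescale PC to [0, 1] (assumed standard definition, not given in the abstract)."""
    return 1.0 - n_clusters / (n_clusters - 1) * (1.0 - pc)


def topsis(matrix, benefit, weights=None):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    benefit[j] is True when criterion j is maximized, False when minimized.
    Returns one closeness score per row in [0, 1]; higher is better.
    """
    n_crit = len(benefit)
    weights = weights or [1.0 / n_crit] * n_crit  # equal weights are an assumption
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_crit)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # The ideal best and worst points depend on each criterion's direction.
    cols = list(zip(*scaled))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in scaled:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Columns: PE (minimize), PC, MPC, FSI (maximize).
cases = {
    2: [0.578, 0.607, 0.215, 0.482],  # values reported in the abstract
    3: [0.950, 0.450, 0.175, 0.300],  # hypothetical
    4: [1.200, 0.360, 0.147, 0.250],  # hypothetical
}
scores = topsis(list(cases.values()), benefit=[False, True, True, True])
best_cl = max(zip(cases, scores), key=lambda pair: pair[1])[0]
print(best_cl, [round(s, 3) for s in scores])
```

As a consistency check under the assumed formula, `modified_partition_coefficient(0.607, 2)` gives about 0.214, which agrees with the reported MPC of 0.215 up to rounding of PC.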
Affiliation(s)
- Saurav Mallik
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
4
Sundararajan Z, Knoll R, Hombach P, Becker M, Schultze JL, Ulas T. Shiny-Seq: advanced guided transcriptome analysis. BMC Res Notes 2019; 12:432. [PMID: 31319888 PMCID: PMC6637470 DOI: 10.1186/s13104-019-4471-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/18/2019] [Accepted: 07/12/2019] [Indexed: 12/11/2022] Open
Abstract
Objective: A comprehensive analysis of RNA-Seq data uses a wide range of different tools and algorithms, which are normally limited to R users only. While several tools and advanced analysis pipelines are available, some require programming skills and others lack support for many important features that enable a more comprehensive data analysis. There is thus a need for a guided and easy-to-use comprehensive RNA-Seq data platform that integrates a state-of-the-art analysis workflow. Results: We present the tool Shiny-Seq, which provides a guided and easy-to-use comprehensive RNA-Seq data analysis pipeline. Its features include batch effect estimation and removal, quality checks with several visualization options, enrichment analysis with multiple biological databases, identification of patterns using advanced methods such as weighted gene co-expression network analysis, and one-click export of the analysis summary as a PowerPoint presentation and of all results as tables. The source code is published on GitHub (https://github.com/schultzelab/Shiny-Seq) and licensed under GPLv3. Shiny-Seq is written in R using the Shiny framework. In addition, the application is hosted on a public website on the shinyapps.io server (https://schultzelab.shinyapps.io/Shiny-Seq/) and is available as a Docker image (https://hub.docker.com/r/makaho/shiny-seq).
Affiliation(s)
- Zenitha Sundararajan
  - Genomics and Immunoregulation, LIMES Institute, University of Bonn, Carl-Troll-Str. 31, 53113, Bonn, Germany
- Rainer Knoll
  - Genomics and Immunoregulation, LIMES Institute, University of Bonn, Carl-Troll-Str. 31, 53113, Bonn, Germany
- Peter Hombach
  - Genomics and Immunoregulation, LIMES Institute, University of Bonn, Carl-Troll-Str. 31, 53113, Bonn, Germany
- Matthias Becker
  - Platform for Single Cell Genomics and Epigenomics (PRECISE) at the German Center for Neurodegenerative Diseases and the University of Bonn, Venusberg-Campus 1, Gebäude 99, 53127, Bonn, Germany
- Joachim L Schultze
  - Genomics and Immunoregulation, LIMES Institute, University of Bonn, Carl-Troll-Str. 31, 53113, Bonn, Germany
  - Platform for Single Cell Genomics and Epigenomics (PRECISE) at the German Center for Neurodegenerative Diseases and the University of Bonn, Venusberg-Campus 1, Gebäude 99, 53127, Bonn, Germany
- Thomas Ulas
  - Genomics and Immunoregulation, LIMES Institute, University of Bonn, Carl-Troll-Str. 31, 53113, Bonn, Germany
  - Platform for Single Cell Genomics and Epigenomics (PRECISE) at the German Center for Neurodegenerative Diseases and the University of Bonn, Venusberg-Campus 1, Gebäude 99, 53127, Bonn, Germany
5
Moore SG, Ericsson AC, Behura SK, Lamberson WR, Evans TJ, McCabe MS, Poock SE, Lucy MC. Concurrent and long-term associations between the endometrial microbiota and endometrial transcriptome in postpartum dairy cows. BMC Genomics 2019; 20:405. [PMID: 31117952 DOI: 10.1186/s12864-019-5797-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/11/2019] [Accepted: 05/14/2019] [Indexed: 12/11/2022] Open
Abstract
Background: Fertility in dairy cows depends on ovarian cyclicity and on uterine involution. Ovarian cyclicity and uterine involution are delayed when there is uterine dysbiosis (overgrowth of pathogenic bacteria). Fertility in dairy cows may involve a mechanism through which the uterine microbiota affects ovarian cyclicity as well as the transcriptome of the endometrium within the involuting uterus. The hypothesis was that the transcriptome of the endometrium in postpartum cows would be associated with the cyclicity status of the cow as well as the microbiota during uterine involution. The endometrium of first lactation dairy cows was sampled at 1, 5, and 9 weeks postpartum. All cows were allowed to return to cyclicity without intervention until week 5 and were then treated with an ovulation synchronization protocol so that sampling at week 9 was on day 13 of the estrous cycle. The endometrial microbiota was measured by 16S rRNA gene sequencing and principal component analysis. The endometrial transcriptome was measured by mRNA sequencing, differential gene expression analysis, and Ingenuity Pathway Analysis. Results: The endometrial microbiota changed from week 1 to week 5, but the week 5 and week 9 microbiota were similar. The endometrial transcriptome differed between cows that were cycling and cows that were not cycling at week 5, and cyclicity status depended in part on the endometrial microbiota. Compared with cows cycling at week 5, there were large changes in the transcriptome of cows that progressed from non-cycling at week 5 to cycling at week 9. There was evidence for concurrent and longer-term associations between the endometrial microbiota and transcriptome. The week 1 endometrial microbiota had the greatest effect on the subsequent endometrial transcriptome, and this effect was greatest at week 5 and diminished by week 9. Conclusions: The cumulative response of the endometrial transcriptome to the microbiota represented the combination of past microbial exposure and current microbial exposure. The endometrial transcriptome in postpartum cows, therefore, depended on the immediate and longer-term effects of the uterine microbiota that acted directly on the uterus. There may also be an indirect mechanism through which the microbiome affects the transcriptome through the restoration of ovarian cyclicity postpartum.
6
LaRese TP, Rheaume BA, Abraham R, Eipper BA, Mains RE. Sex-Specific Gene Expression in the Mouse Nucleus Accumbens Before and After Cocaine Exposure. J Endocr Soc 2019; 3:468-487. [PMID: 30746506 PMCID: PMC6364626 DOI: 10.1210/js.2018-00313] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/27/2018] [Accepted: 01/09/2019] [Indexed: 12/18/2022] Open
Abstract
The nucleus accumbens plays a major role in the response of mammals to cocaine. In animal models and human studies, the addictive effects of cocaine and relapse probability have been shown to be greater in females. Sex-specific differential expression of key transcripts at baseline and after prolonged withdrawal could underlie these differences. To distinguish between these possibilities, gene expression was analyzed in four groups of mice (cycling females, ovariectomized females treated with estradiol or placebo, and males) 28 days after they had received seven daily injections of saline or cocaine. As expected, sensitization to the locomotor effects of cocaine was most pronounced in the ovariectomized mice receiving estradiol, was greater in cycling females than in males, and failed to occur in ovariectomized/placebo mice. After the 28-day withdrawal period, RNA prepared from the nucleus accumbens of the individual cocaine- or saline-injected mice was subjected to RNA sequencing analysis. Baseline expression of 3% of the nucleus accumbens transcripts differed in the cycling female mice compared with the male mice. Expression of a similar number of transcripts was altered by ovariectomy or was responsive to estradiol treatment. Nucleus accumbens transcripts differentially expressed in cycling female mice withdrawn from cocaine exhibited substantial overlap with those differentially expressed in cocaine-withdrawn male mice. A small set of transcripts were similarly affected by cocaine in the placebo- or estradiol-treated ovariectomized mice. Sex and hormonal status have profound effects on RNA expression in the nucleus accumbens of naive mice. Prolonged withdrawal from cocaine alters the expression of a much smaller number of common and sex hormone-specific transcripts.
Affiliation(s)
- Taylor P LaRese
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Bruce A Rheaume
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Ron Abraham
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Betty A Eipper
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Richard E Mains
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
7
Abstract
Background: Gene signatures are important for representing the molecular changes in disease genomes or in cells under specific conditions, and have often been used to separate samples into different groups for better research or clinical treatment. While many methods and applications are available in the literature, powerful ones that can account for the complexity of the data and detect the most informative signatures are still lacking. Methods: In this article, we present a new framework for identifying gene signatures using Pareto-optimal cluster size identification for RNA-seq data. We first performed pre-filtering steps and normalization, then utilized the empirical Bayes test in the Limma package to identify the differentially expressed genes (DEGs). Next, we applied a multi-objective optimization technique, “Multi-objective optimization for collecting cluster alternatives” (the MOCCA R package), to these DEGs to find the Pareto-optimal cluster size, and then applied k-means clustering to the RNA-seq data based on the optimal cluster size. The best cluster was obtained by computing the average pairwise Spearman’s correlation score among all genes belonging to the module. The best cluster is treated as the signature for the respective disease or cellular condition. Results: We applied our framework to a cervical cancer RNA-seq dataset, which included 253 squamous cell carcinoma (SCC) samples and 22 adenocarcinoma (ADENO) samples. We identified a total of 582 DEGs by Limma analysis of SCC versus ADENO samples. Among them, 260 are up-regulated genes and 322 are down-regulated genes. Using MOCCA, we obtained seven Pareto-optimal clusters. The best cluster has a total of 35 DEGs, all of which are up-regulated. For validation, we ran the PAMR (prediction analysis for microarrays) classifier on the selected best cluster and assessed the classification performance. Our evaluation, measured by sensitivity, specificity, precision, and accuracy, showed high confidence. Conclusions: Our framework identified a multi-objective based cluster that is treated as a signature and that can classify the disease and control groups of samples with high classification performance (accuracy 0.935) for the corresponding disease. Our method is useful for finding signatures in any RNA-seq or microarray data.
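The cluster-ranking criterion mentioned in this abstract, the average pairwise Spearman correlation among the genes of a cluster, can be illustrated as follows. This is a minimal pure-Python sketch under stated assumptions: the helper names are hypothetical, the expression profiles are toy numbers, and it is assumed that signed (not absolute) correlations are averaged, which the abstract does not specify.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_spearman(profiles):
    """Average Spearman correlation over all gene pairs in one cluster."""
    pairs = list(combinations(profiles, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


# Toy clusters: each inner list is one gene's expression across five samples.
clusters = {
    "A": [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, 3, 2, 5, 4]],
    "B": [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [2, 1, 4, 3, 5]],
}
best = max(clusters, key=lambda name: mean_pairwise_spearman(clusters[name]))
print(best)
```

In this toy example cluster A is selected because its genes rise and fall together across samples, whereas cluster B mixes positively and negatively correlated genes.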
Affiliation(s)
- Saurav Mallik
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, 77030, TX, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, 77030, TX, USA
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, 37232, TN, USA
8
Abstract
Microarray data have accumulated vastly over the past two decades. Owing to their high-throughput nature, microarray techniques have shifted biological studies from specific genes to the transcriptome level and have greatly advanced many fields of biology. While microarrays offer great advantages for expression profiling, they also pose many challenges for computational analysis. In this chapter, we demonstrate how to perform standard analysis including data preprocessing, quality assessment, differential expression analysis, and general downstream analyses.
Affiliation(s)
- Ming-An Sun
  - Epigenomics and Computational Biology Lab, Biocomplexity Institute of Virginia Tech, Blacksburg, VA, USA
- Xiaojian Shao
  - Department of Human Genetics, McGill University, Montréal, Canada
  - The McGill University and Génome Québec Innovation Centre, Montréal, QC, Canada
- Yejun Wang
  - Department of Cell Biology and Genetics, School of Basic Medicine, Shenzhen University Health Science Center, Shenzhen, China
9
Mallik S, Zhao Z. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis. Genes (Basel) 2017; 9:E7. [PMID: 29283433 PMCID: PMC5793160 DOI: 10.3390/genes9010007] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/02/2017] [Revised: 12/12/2017] [Accepted: 12/12/2017] [Indexed: 01/18/2023] Open
Abstract
For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it is well suited to finding cause-effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which consists of both singular and complex markers, depends on the corresponding condensed gene sets in either the antecedent or the consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule pair, and the resultant scores were used for clustering to identify the co-expressed rule modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
Affiliation(s)
- Saurav Mallik
  - Department of Computer Science & Engineering, Aliah University, Newtown, WB-700156, India
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
10
Torres-Oliva M, Almudi I, McGregor AP, Posnien N. A robust (re-)annotation approach to generate unbiased mapping references for RNA-seq-based analyses of differential expression across closely related species. BMC Genomics 2016; 17:392. [PMID: 27220689 PMCID: PMC4877740 DOI: 10.1186/s12864-016-2646-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/23/2015] [Accepted: 04/22/2016] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND: RNA-seq based on short reads generated by next generation sequencing technologies has become the main approach to study differential gene expression. Until now, the main applications of this technique have been to study the variation of gene expression in a whole organism, tissue or cell type under different conditions or at different developmental stages. However, RNA-seq also has great potential for use in evolutionary studies that investigate gene expression divergence in closely related species. RESULTS: We show that the published genomes and annotations of the three closely related Drosophila species D. melanogaster, D. simulans and D. mauritiana have limitations for inter-specific gene expression studies. This is due to missing gene models in at least one of the genome annotations, unclear orthology assignments and significant gene length differences between the species. A comprehensive evaluation of four statistical frameworks (DESeq2, DESeq2 with length correction, RPKM-limma and RPKM-voom-limma) shows that none of these methods sufficiently accounts for inter-specific gene length differences, which inevitably results in false positive candidate genes. We propose that published reference genomes should be re-annotated before using them as references for RNA-seq experiments to include as many genes as possible and to account for a potential length bias. We present a straightforward reciprocal re-annotation pipeline that allows reliable comparison of expression for nearly all genes annotated in D. melanogaster. CONCLUSIONS: We conclude that our reciprocal re-annotation of previously published genomes facilitates the analysis of significantly more genes in an inter-specific differential gene expression study. We propose that the established pipeline can easily be applied to re-annotate other genomes of closely related animals and plants to improve comparative expression analyses.
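The length bias discussed in this abstract follows from simple arithmetic, which the sketch below illustrates. It uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads); the function name `rpkm` and all numbers are hypothetical and not taken from the study. It shows only why annotated gene length matters for raw counts; as the abstract reports, none of the four evaluated frameworks fully removed the bias in practice.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical ortholog annotated with different lengths in two species,
# equal expression per base, and libraries of 10 million mapped reads each.
library_size = 10_000_000
species_a = {"reads": 600, "length_bp": 3000}
species_b = {"reads": 300, "length_bp": 1500}

# Raw counts suggest a two-fold difference that only reflects gene length.
raw_ratio = species_a["reads"] / species_b["reads"]

# Length-normalized values are identical in this idealized case.
rpkm_a = rpkm(species_a["reads"], species_a["length_bp"], library_size)
rpkm_b = rpkm(species_b["reads"], species_b["length_bp"], library_size)
print(raw_ratio, rpkm_a, rpkm_b)
```

The idealized case assumes the annotated lengths are correct. If one species' gene model is incomplete or mis-annotated, the denominator is wrong and the normalized values diverge, which is the situation the re-annotation pipeline is meant to address.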
Affiliation(s)
- Montserrat Torres-Oliva
  - Georg-August-Universität Göttingen, Johann-Friedrich-Blumenbach-Institut für Zoologie und Anthropologie, Abteilung für Entwicklungsbiologie, GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
  - Göttingen Center for Molecular Biosciences (GZMB), GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
- Isabel Almudi
  - Department of Biological and Medical Sciences, Oxford Brookes University, Gipsy Lane, Oxford, OX3 0BP UK
  - Andalusian Centre of Developmental Biology, carretera de Utrera, km.1, 41013 Seville, Spain
- Alistair P. McGregor
  - Department of Biological and Medical Sciences, Oxford Brookes University, Gipsy Lane, Oxford, OX3 0BP UK
- Nico Posnien
  - Georg-August-Universität Göttingen, Johann-Friedrich-Blumenbach-Institut für Zoologie und Anthropologie, Abteilung für Entwicklungsbiologie, GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
  - Göttingen Center for Molecular Biosciences (GZMB), GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany
11
Da Rocha MS, Arnold LL, Dodmane PR, Pennington KL, Qiu F, De Camargo JLV, Cohen SM. Diuron metabolites and urothelial cytotoxicity: in vivo, in vitro and molecular approaches. Toxicology 2013; 314:238-46. [PMID: 24172598 DOI: 10.1016/j.tox.2013.10.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 08/14/2013] [Revised: 10/08/2013] [Accepted: 10/18/2013] [Indexed: 10/26/2022]
Abstract
Diuron is carcinogenic to the rat urinary bladder at high dietary levels. The proposed mode of action (MOA) for diuron is urothelial cytotoxicity and necrosis followed by regenerative urothelial hyperplasia. Diuron-induced urothelial cytotoxicity is not due to urinary solids. Diuron is extensively metabolized, and in rats, N-(3,4-dichlorophenyl)urea (DCPU) and 4,5-dichloro-2-hydroxyphenyl urea (2-OH-DCPU) were the predominant urinary metabolites; lesser metabolites included N-(3,4-dichlorophenyl)-3-methylurea (DCPMU) and trace levels of 3,4-dichloroaniline (DCA). In humans, DCPMU and DCPU have been found in the urine after a case of product abuse. To aid in elucidating the MOA of diuron and to evaluate the metabolites that are responsible for the diuron toxicity in the bladder epithelium, we investigated the urinary concentrations of metabolites in male Wistar rats treated with 2500ppm of diuron, the urothelial cytotoxicity in vitro of the metabolites and their gene expression profiles. DCPU was found in rat urine at concentrations substantially greater than the in vitro IC50 and induced more gene expression alterations than the other metabolites tested. 2-OH-DCPU was present in urine at a concentration approximately half of the in vitro IC50, whereas DCPMU and DCA were present in urine at concentrations well below the IC50. For the diuron-induced MOA for the rat bladder, we suggest that DCPU is the primary metabolite responsible for the urothelial cytotoxicity with some contribution also by 2-OH-DCPU. This study supports a MOA for diuron-induced bladder effects in rats consisting of metabolism to DCPU (and 2-OH-DCPU to a lesser extent), concentration and excretion in urine, urothelial cytotoxicity, and regenerative proliferation.
Affiliation(s)
- Mitscheli S Da Rocha
- Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, NE, USA; Center for the Evaluation of the Environmental Impact on Human Health (TOXICAM), Department of Pathology, Botucatu Medical School, UNESP - São Paulo State University, Botucatu, São Paulo, Brazil.