1
Ma G, Kang J, Yu T. Bayesian functional analysis for untargeted metabolomics data with matching uncertainty and small sample sizes. Brief Bioinform 2024; 25:bbae141. [PMID: 38581417 PMCID: PMC10998539 DOI: 10.1093/bib/bbae141] [Received: 12/06/2023] [Revised: 02/28/2024] [Accepted: 03/13/2024] [Indexed: 04/08/2024]
Abstract
Untargeted metabolomics based on liquid chromatography-mass spectrometry technology is quickly gaining widespread application, given its ability to depict the global metabolic pattern in biological samples. However, the data are noisy and plagued by the lack of clear identity of data features measured from samples. Multiple potential matchings exist between data features and known metabolites, while the truth can only be one-to-one matches. Some existing methods attempt to reduce the matching uncertainty, but are far from being able to remove the uncertainty for most features. The existence of the uncertainty causes major difficulty in downstream functional analysis. To address these issues, we develop a novel approach for Bayesian Analysis of Untargeted Metabolomics data (BAUM) to integrate previously separate tasks into a single framework, including matching uncertainty inference, metabolite selection and functional analysis. By incorporating the knowledge graph between variables and using relatively simple assumptions, BAUM can analyze datasets with small sample sizes. By allowing different confidence levels of feature-metabolite matching, the method is applicable to datasets in which feature identities are partially known. Simulation studies demonstrate that, compared with other existing methods, BAUM achieves better accuracy in selecting important metabolites that tend to be functionally consistent and assigning confidence scores to feature-metabolite matches. We analyze a COVID-19 metabolomics dataset and a mouse brain metabolomics dataset using BAUM. Even with a very small sample size of 16 mice per group, BAUM is robust and stable. It finds pathways that conform to existing knowledge, as well as novel pathways that are biologically plausible.
Affiliation(s)
- Guoxuan Ma
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Tianwei Yu
  - Shenzhen Research Institute of Big Data, School of Data Science, The Chinese University of Hong Kong - Shenzhen (CUHK-Shenzhen), Shenzhen, Guangdong 518172, China
2
Moon JH, Lee S, Pak M, Hur B, Kim S. MLDEG: A Machine Learning Approach to Identify Differentially Expressed Genes Using Network Property and Network Propagation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2356-2364. [PMID: 33750713 DOI: 10.1109/tcbb.2021.3067613] [Indexed: 06/12/2023]
Abstract
MOTIVATION: Identifying differentially expressed genes (DEGs) in transcriptome data is a very important task. However, the performance of existing DEG methods varies significantly across data sets measured in different conditions, and no single statistical or machine learning model for DEG detection performs consistently well on data sets of different traits. In addition, setting a cutoff value for the significance of differential expression is one of the confounding factors in determining DEGs. RESULTS: We address these problems by developing an ensemble model that refines the heterogeneous and inconsistent results of the existing methods by taking into account network information such as network propagation and network property. DEG candidates that are predicted with weak evidence by the existing tools are re-classified by our proposed ensemble model for the transcriptome data. Tested on 10 RNA-seq datasets downloaded from the Gene Expression Omnibus (GEO), our method ranked first in detecting ground truth (GT) genes in eight datasets and found almost all GT genes in six datasets. In contrast, the performance of all existing methods varied significantly across the 10 data sets. Because of its design principle, our method can naturally accommodate any new DEG method. AVAILABILITY: The source code of our method is available at https://github.com/jihmoon/MLDEG.
3
Tan Y, Jiang C, Jia Q, Wang J, Huang G, Tang F. A novel oncogenic seRNA promotes nasopharyngeal carcinoma metastasis. Cell Death Dis 2022; 13:401. [PMID: 35461306 PMCID: PMC9035166 DOI: 10.1038/s41419-022-04846-1] [Received: 07/09/2021] [Revised: 03/30/2022] [Accepted: 04/07/2022] [Indexed: 12/24/2022]
Abstract
Nasopharyngeal carcinoma (NPC) is a common malignant cancer in southern China that has highly invasive and metastatic features and causes high mortality, but the underlying mechanisms of this malignancy remain unclear. In this study, we utilized ChIP-Seq to identify metastasis-specific super enhancers (SEs) and found that the SE of LOC100506178 existed only in metastatic NPC cells and powerfully aggravated NPC metastasis. This metastatic SE was transcribed into lncRNA LOC100506178, which was verified as a seRNA through GRO-Seq. Furthermore, SE-derived seRNA LOC100506178 was found to be highly expressed in metastatic NPC cells and NPC lymph node metastatic tissues. Knockdown of seRNA LOC100506178 arrested the invasion and metastasis of NPC cells in vitro and in vivo, demonstrating that seRNA LOC100506178 accelerates the acquisition of the NPC malignant phenotype. Mechanistic studies revealed that seRNA LOC100506178 specifically interacted with the transcription factor hnRNPK and modulated the expression of hnRNPK. Further, hnRNPK in combination with the promoter region of MICAL2 increased MICAL2 transcription. Knockdown of seRNA LOC100506178 or hnRNPK markedly repressed MICAL2, Vimentin and Snail expression and upregulated E-cadherin expression. Overexpression of seRNA LOC100506178 or hnRNPK markedly increased MICAL2, Vimentin and Snail expression and decreased E-cadherin expression. Therefore, seRNA LOC100506178 may promote MICAL2 expression by upregulating hnRNPK, subsequently enhancing the EMT process and accelerating the invasion and metastasis of NPC cells. seRNA LOC100506178 has the potential to serve as a novel prognostic biomarker and therapeutic target in NPC patients.
Affiliation(s)
- Yuan Tan
  - Clinical Laboratory of Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Hunan Key Laboratory of Oncotarget Gene, Changsha, China
  - Institute of Medical Technology, Peking University Health Science Center, Beijing, China
- Chonghua Jiang
  - Affiliated Haikou Hospital of Xiangya Medical College, Central South University, Haikou, China
- Qunying Jia
  - Clinical Laboratory of Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Hunan Key Laboratory of Oncotarget Gene, Changsha, China
- Jing Wang
  - Clinical Laboratory of Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Hunan Key Laboratory of Oncotarget Gene, Changsha, China
- Ge Huang
  - Clinical Laboratory of Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Hunan Key Laboratory of Oncotarget Gene, Changsha, China
- Faqing Tang
  - Clinical Laboratory of Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Hunan Key Laboratory of Oncotarget Gene, Changsha, China
4
Jin Z, Kang J, Yu T. Feature selection and classification over the network with missing node observations. Stat Med 2022; 41:1242-1262. [PMID: 34816464 PMCID: PMC9773124 DOI: 10.1002/sim.9267] [Received: 04/16/2021] [Revised: 09/14/2021] [Accepted: 10/29/2021] [Indexed: 12/25/2022]
Abstract
Jointly analyzing transcriptomic data and the existing biological networks can yield more robust and informative feature selection results, as well as better understanding of the biological mechanisms. Selecting and classifying node features over genome-scale networks has become increasingly important in genomic biology and genomic medicine. Existing methods have two critical drawbacks. First, they do not allow flexible modeling of different subtypes of selected nodes. Second, they ignore nodes with missing values, which is very likely to increase bias in estimation. To address these limitations, we propose a general modeling framework for Bayesian node classification (BNC) with missing values. A new prior model is developed for the class indicators incorporating the network structure. For posterior computation, we resort to the Swendsen-Wang algorithm for efficiently updating class indicators. BNC can naturally handle missing values in the Bayesian modeling framework, which improves the node classification accuracy and reduces the bias in estimating gene effects. We demonstrate the advantages of our methods via extensive simulation studies and the analysis of the cutaneous melanoma dataset from The Cancer Genome Atlas.
Affiliation(s)
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Tianwei Yu
  - School of Data Science and Warshel Institute, The Chinese University of Hong Kong - Shenzhen, and Shenzhen Research Institute of Big Data, Shenzhen, China
5
Coolen JPM, Wolters F, Tostmann A, van Groningen LFJ, Bleeker-Rovers CP, Tan ECTH, van der Geest-Blankert N, Hautvast JLA, Hopman J, Wertheim HFL, Rahamat-Langendoen JC, Storch M, Melchers WJG. SARS-CoV-2 whole-genome sequencing using reverse complement PCR: For easy, fast and accurate outbreak and variant analysis. J Clin Virol 2021; 144:104993. [PMID: 34619382 PMCID: PMC8487099 DOI: 10.1016/j.jcv.2021.104993] [Received: 03/02/2021] [Revised: 09/21/2021] [Accepted: 09/29/2021] [Indexed: 12/28/2022]
Abstract
During the course of the SARS-CoV-2 pandemic, reports emerged of mutations with effects on spreading and vaccine effectiveness. Large-scale mutation analysis using rapid SARS-CoV-2 Whole Genome Sequencing (WGS) is often unavailable but could support public health organizations and hospitals in monitoring transmission and rising levels of mutant strains. Here we report a novel WGS technique for SARS-CoV-2, the EasySeq™ RC-PCR SARS-CoV-2 WGS kit. By applying a reverse complement polymerase chain reaction (RC-PCR), an Illumina library preparation is obtained in a single PCR, thereby saving time and resources and facilitating high-throughput screening. Using this WGS technique, we evaluated SARS-CoV-2 diversity and possible transmission within a group of 173 patients and healthcare workers (HCW) of the Radboud university medical center during 2020. Due to the emergence of variants of concern, we screened SARS-CoV-2 positive samples in 2021 for identification of mutations and lineages. Using the EasySeq™ RC-PCR SARS-CoV-2 WGS kit, we were able to obtain reliable results to confirm outbreak clusters and additionally identify new, previously unassociated links in a considerably easier workaround compared to current methods. Furthermore, various SARS-CoV-2 variants of interest were detected among samples and validated against an Oxford Nanopore sequencing amplicon strategy, which illustrates that this technique is suitable for surveillance and monitoring of currently circulating variants.
Affiliation(s)
- Jordy P M Coolen
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Femke Wolters
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Alma Tostmann
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Chantal P Bleeker-Rovers
  - Department of Internal Medicine, division of Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Edward C T H Tan
  - Department of Emergency Medicine, Radboud university medical center, Nijmegen, The Netherlands; Department of Surgery, Radboud university medical center, Nijmegen, The Netherlands
- Joost Hopman
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Heiman F L Wertheim
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Janette C Rahamat-Langendoen
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
- Marko Storch
  - London Biofoundry, Imperial College Translation & Innovation Hub, White City Campus, 84 Wood Lane, London, W12 0BZ, United Kingdom of Great Britain and Northern Ireland, United Kingdom
- Willem J G Melchers
  - Department of Medical Microbiology and Radboudumc Center for Infectious Diseases, Radboud university medical center, Nijmegen, The Netherlands
6
Gan Y, Liang S, Wei Q, Zou G. Identification of Differential Gene Groups From Single-Cell Transcriptomes Using Network Entropy. Front Cell Dev Biol 2020; 8:588041. [PMID: 33195248 PMCID: PMC7649823 DOI: 10.3389/fcell.2020.588041] [Received: 07/28/2020] [Accepted: 09/14/2020] [Indexed: 11/13/2022]
Abstract
A complex tissue contains a variety of cells with distinct molecular signatures. Single-cell RNA sequencing has characterized the transcriptomes of different cell types and enables researchers to discover the underlying mechanisms of cellular heterogeneity. A critical task in single-cell transcriptome studies is to uncover transcriptional differences among specific cell types. However, the intercellular transcriptional variation is usually confounded with a high level of technical noise, which masks the important biological signals. Here, we propose a new computational method, DiffGE, for differential analysis, adopting network entropy to measure the expression dynamics of gene groups among different cell types and to identify the highly differential gene groups. To evaluate the effectiveness of our proposed method, DiffGE is applied to three independent single-cell RNA-seq datasets to identify the highly dynamic gene groups that exhibit distinctive expression patterns in different cell types. We compare the results of our method with those of three widely applied algorithms. Further, the gene function analysis indicates that these detected differential gene groups are significantly related to cellular regulation processes. The results demonstrate the power of our method in evaluating the transcriptional dynamics and identifying highly differential gene groups among different cell types.
Affiliation(s)
- Yanglan Gan
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Shanshan Liang
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Qingting Wei
  - School of Software, Nanchang University, Nanchang, China
- Guobing Zou
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
7
Identification of Differentially Expressed Genes in Different Types of Broiler Skeletal Muscle Fibers Using the RNA-seq Technique. BioMed Research International 2020; 2020:9478949. [PMID: 32695825 PMCID: PMC7362283 DOI: 10.1155/2020/9478949] [Received: 03/29/2020] [Revised: 06/10/2020] [Accepted: 06/27/2020] [Indexed: 11/17/2022]
Abstract
The difference in muscle fiber types is very important to the muscle development and meat quality of broilers. At present, the molecular regulation mechanisms of skeletal muscle fiber-type transformation in broilers are still unclear. In this study, differentially expressed genes (DEGs) between breast and leg muscles in broilers were analyzed using RNA-seq. A total of 767 DEGs were identified. Compared with leg muscle, there were 429 upregulated genes and 338 downregulated genes in breast muscle. Gene Ontology (GO) enrichment indicated that these DEGs were mainly involved in cellular processes, single organism processes, cells, and cellular components, as well as binding and catalytic activity. KEGG analysis showed that a total of 230 DEGs were mapped to 126 KEGG pathways and were significantly enriched in four pathways: glycolysis/gluconeogenesis, starch and sucrose metabolism, insulin signalling, and the biosynthesis of amino acids. Quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) was used to verify the differential expression of 7 selected DEGs, and the results were consistent with the RNA-seq data. In addition, the expression profile of MyHC isoforms in chicken skeletal muscle cells showed that, with the extension of differentiation time, the expression of fast fiber subunits (types IIA and IIB) gradually increased, while slow muscle fiber subunits (type I) showed a downward trend after 4 days of differentiation. The differential genes screened in this study will provide some new ideas for further understanding the molecular mechanism of skeletal muscle fiber transformation in broilers.
8
Wei YC, Huang GH. CONY: A Bayesian procedure for detecting copy number variations from sequencing read depths. Sci Rep 2020; 10:10493. [PMID: 32591545 PMCID: PMC7319969 DOI: 10.1038/s41598-020-64353-1] [Received: 07/11/2019] [Accepted: 04/15/2020] [Indexed: 12/26/2022]
Abstract
Copy number variations (CNVs) are genomic structural mutations consisting of abnormal numbers of fragment copies. Next-generation sequencing of read-depth signals mirrors these variants. Some tools used to predict CNVs by depth have been published, but most of these tools can be applied to only a specific data type due to modeling limitations. We develop a tool for copy number variation detection by a Bayesian procedure, i.e., CONY, that adopts a Bayesian hierarchical model and an efficient reversible-jump Markov chain Monte Carlo inference algorithm for whole genome sequencing of read-depth data. CONY can be applied not only to individual samples for estimating the absolute number of copies but also to case-control pairs for detecting patient-specific variations. We evaluate the performance of CONY and compare CONY with competing approaches through simulations and by using experimental data from the 1000 Genomes Project. CONY outperforms the other methods in terms of accuracy in both single-sample and paired-samples analyses. In addition, CONY performs well regardless of whether the data coverage is high or low. CONY is useful for detecting both absolute and relative CNVs from read-depth data sequences. The package is available at https://github.com/weiyuchung/CONY.
Affiliation(s)
- Yu-Chung Wei
  - Graduate Institute of Statistics and Information Science, National Changhua University of Education, No.1 Jinde Road, Changhua City, Changhua County, 50007, Taiwan
- Guan-Hua Huang
  - Institute of Statistics, National Chiao Tung University, 1001 University Road, Hsinchu, 30010, Taiwan
9
Feng CM, Xu Y, Liu JX, Gao YL, Zheng CH. Supervised Discriminative Sparse PCA for Com-Characteristic Gene Selection and Tumor Classification on Multiview Biological Data. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:2926-2937. [PMID: 30802874 DOI: 10.1109/tnnls.2019.2893190] [Indexed: 06/09/2023]
Abstract
Principal component analysis (PCA) has been used to study the pathogenesis of diseases. To enhance the interpretability of classical PCA, various improved PCA methods have been proposed to date. Among these, a typical method is the so-called sparse PCA, which focuses on seeking sparse loadings. However, the performance of these methods is still far from satisfactory because they are limited to unsupervised learning; moreover, the class ambiguity within the sample is high. To overcome this problem, this paper developed a new PCA method, named the supervised discriminative sparse PCA (SDSPCA). The main innovation of this method is the incorporation of discriminative information and sparsity into the PCA model. Specifically, in contrast to the traditional sparse PCA, which imposes sparsity on the loadings, here, sparse components are obtained to represent the data. Furthermore, via the linear transformation, the sparse components approximate the given label information. On the one hand, sparse components improve interpretability over the traditional PCA, while on the other hand, they have discriminative abilities suitable for classification purposes. A simple algorithm is developed, and its convergence proof is provided. SDSPCA has been applied to common-characteristic gene selection and tumor classification on multiview biological data. The sparsity and classification performance of SDSPCA are empirically verified via abundant, reasonable, and effective experiments, and the obtained results demonstrate that SDSPCA outperforms other state-of-the-art methods.
10
Li Y, Wu Y, Zhang X, Bai Y, Akthar LM, Lu X, Shi M, Zhao J, Jiang Q, Li Y. SCIA: A Novel Gene Set Analysis Applicable to Data With Different Characteristics. Front Genet 2019; 10:598. [PMID: 31293623 PMCID: PMC6603225 DOI: 10.3389/fgene.2019.00598] [Received: 05/05/2019] [Accepted: 06/05/2019] [Indexed: 01/06/2023]
Abstract
Gene set analysis is commonly used in functional enrichment and molecular pathway analyses. Most of the present methods are based on competitive testing, which assumes that each gene is independent of the others. However, the false discovery rates of competitive methods are amplified when they are applied to datasets with high inter-gene correlations. Self-contained testing methods could solve this problem, but they impose other restrictions on data characteristics. Therefore, a statistically rigorous testing method applicable to different datasets with various complex characteristics is needed to obtain unbiased and comparable results. We propose a self-contained and competitive incorporated analysis (SCIA) to alleviate the bias caused by the limited application scope of existing gene set analysis methods. This is accomplished through a novel permutation strategy using a priori biological networks to selectively permute gene labels with different probabilities. In simulation studies, SCIA was compared with four representative analysis methods (GSEA, CAMERA, ROAST, and NES), and produced the best performance in both false discovery rate and sensitivity under most conditions with different parameter settings. Further, in the KEGG pathway analysis of two real lung cancer datasets, SCIA produced considerably more results than GSEA in both datasets, and most of them could be supported by the literature. Overall, SCIA promises to offer researchers more reliable and comparable results across different datasets.
Affiliation(s)
- Yiqun Li
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Ying Wu
  - Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou, China
- Xiaohan Zhang
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Yunfan Bai
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Luqman Muhammad Akthar
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Xin Lu
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Ming Shi
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Jianxiang Zhao
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Qinghua Jiang
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Yu Li
  - Department of Laboratory of Cancer Biology, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
11
LaRese TP, Rheaume BA, Abraham R, Eipper BA, Mains RE. Sex-Specific Gene Expression in the Mouse Nucleus Accumbens Before and After Cocaine Exposure. J Endocr Soc 2019; 3:468-487. [PMID: 30746506 PMCID: PMC6364626 DOI: 10.1210/js.2018-00313] [Received: 09/27/2018] [Accepted: 01/09/2019] [Indexed: 12/18/2022]
Abstract
The nucleus accumbens plays a major role in the response of mammals to cocaine. In animal models and human studies, the addictive effects of cocaine and relapse probability have been shown to be greater in females. Sex-specific differential expression of key transcripts at baseline and after prolonged withdrawal could underlie these differences. To distinguish between these possibilities, gene expression was analyzed in four groups of mice (cycling females, ovariectomized females treated with estradiol or placebo, and males) 28 days after they had received seven daily injections of saline or cocaine. As expected, sensitization to the locomotor effects of cocaine was most pronounced in the ovariectomized mice receiving estradiol, was greater in cycling females than in males, and failed to occur in ovariectomized/placebo mice. After the 28-day withdrawal period, RNA prepared from the nucleus accumbens of the individual cocaine- or saline-injected mice was subjected to RNA sequencing analysis. Baseline expression of 3% of the nucleus accumbens transcripts differed in the cycling female mice compared with the male mice. Expression of a similar number of transcripts was altered by ovariectomy or was responsive to estradiol treatment. Nucleus accumbens transcripts differentially expressed in cycling female mice withdrawn from cocaine exhibited substantial overlap with those differentially expressed in cocaine-withdrawn male mice. A small set of transcripts were similarly affected by cocaine in the placebo- or estradiol-treated ovariectomized mice. Sex and hormonal status have profound effects on RNA expression in the nucleus accumbens of naive mice. Prolonged withdrawal from cocaine alters the expression of a much smaller number of common and sex hormone-specific transcripts.
Affiliation(s)
- Taylor P LaRese
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Bruce A Rheaume
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Ron Abraham
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Betty A Eipper
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
- Richard E Mains
  - Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut
12
Angelin-Bonnet O, Biggs PJ, Vignes M. Gene Regulatory Networks: A Primer in Biological Processes and Statistical Modelling. Methods Mol Biol 2019; 1883:347-383. [PMID: 30547408 DOI: 10.1007/978-1-4939-8882-2_15] [Indexed: 06/09/2023]
Abstract
Modelling gene regulatory networks requires not only a thorough understanding of the biological system depicted, but also the ability to accurately represent this system from a mathematical perspective. Throughout this chapter, we aim to familiarize the reader with the biological processes and molecular factors at play in the process of gene expression regulation. We first describe the different interactions controlling each step of the expression process, from transcription to mRNA and protein decay. In the second section, we provide statistical tools to accurately represent this biological complexity in the form of mathematical models. Among other considerations, we discuss the topological properties of biological networks, the application of deterministic and stochastic frameworks, and the quantitative modelling of regulation. We particularly focus on the use of such models for the simulation of expression data that can serve as a benchmark for the testing of network inference algorithms.
Affiliation(s)
- Olivia Angelin-Bonnet
  - Institute of Fundamental Sciences, Palmerston North, New Zealand
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
- Patrick J Biggs
  - Institute of Fundamental Sciences, Palmerston North, New Zealand
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
- Matthieu Vignes
  - Institute of Fundamental Sciences, Palmerston North, New Zealand
  - School of Veterinary Science, Massey University, Palmerston North, New Zealand
13
Wang T, Nabavi S. SigEMD: A powerful method for differential gene expression analysis in single-cell RNA sequencing data. Methods 2018; 145:25-32. [PMID: 29702224 DOI: 10.1016/j.ymeth.2018.04.017] [Received: 02/01/2018] [Revised: 04/13/2018] [Accepted: 04/19/2018] [Indexed: 10/17/2022]
Abstract
Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Since scRNAseq exhibits multimodality, large amounts of zero counts, and sparsity, it is different from the traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large amounts of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method has an overall powerful performance in terms of precision in detection, sensitivity, and specificity.
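For readers unfamiliar with the Earth Mover's Distance named in this abstract, the following is a minimal sketch of the one-dimensional case only, computed as the area between two empirical cumulative distribution functions. It is not the SigEMD implementation: the function name `emd_1d` and the toy expression values are hypothetical, and the imputation, regression, and network-adjustment steps described in the abstract are not shown.

```python
from bisect import bisect_right


def emd_1d(xs, ys):
    """Earth Mover's Distance between two 1-D empirical samples.

    Computed as the integral of |F_x(t) - F_y(t)| over t, where F_x and
    F_y are the empirical CDFs of the two samples.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    # The CDFs only change at observed values, so integrate piecewise
    # between consecutive distinct points of the pooled sample.
    points = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_x = bisect_right(xs, left) / len(xs)
        cdf_y = bisect_right(ys, left) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


# Hypothetical expression values for one gene in two groups of cells.
group_a = [0.0, 0.0, 1.5, 2.0]
group_b = [0.0, 3.0, 3.5, 4.0]
distance = emd_1d(group_a, group_b)
```

Because the distance compares whole distributions rather than means, it can separate groups whose expression is multimodal, which is the property the abstract relies on.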
Affiliation(s)
- Tianyu Wang
  - Computer Science and Engineering Department, University of Connecticut, Storrs, CT, USA
- Sheida Nabavi
  - Computer Science and Engineering Department and Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
14
Pathway and Network Analysis of Differentially Expressed Genes in Transcriptomes. Methods Mol Biol 2018. [PMID: 29508288 DOI: 10.1007/978-1-4939-7710-9_3] [Indexed: 12/01/2023]
Abstract
In recent years, transcriptome sequencing has become very popular, encompassing a wide variety of applications from simple mRNA profiling to discovery and analysis of the entire transcriptome. One of the most common aims of transcriptome sequencing is to identify genes that are differentially expressed (DE) between two or more biological conditions, and to infer associated pathways and gene networks from expression profiles. It can provide avenues for further systematic investigation into potential biologic mechanisms. Gene Set (GS) enrichment analysis is a popular approach to identify pathways or sets of genes that are significantly enriched in the context of differentially expressed genes. However, the approach considers a pathway as a simple gene collection disregarding knowledge of gene or protein interactions. In contrast, topology-based methods integrate the topological structure of a pathway and gene network into the analysis. To provide a panoramic view of such approaches, this chapter demonstrates several recent computational workflows, including gene set enrichment and topology-based methods, for analysis of the DE pathways and gene networks from transcriptome-wide sequencing data.
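One common form of gene set enrichment analysis is over-representation analysis, which treats a pathway as a plain gene collection, exactly the simplification this abstract points out. The sketch below shows the underlying one-sided hypergeometric test as an illustration; it is not taken from the chapter's workflows, and the function name `enrichment_p_value` and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_set, n_de, n_overlap):
    """One-sided hypergeometric p-value for gene set over-representation.

    n_universe: number of genes tested overall
    n_set:      number of genes in the pathway / gene set
    n_de:       number of differentially expressed (DE) genes
    n_overlap:  number of DE genes that fall in the gene set

    Returns P(X >= n_overlap) when n_de genes are drawn at random,
    without replacement, from the universe.
    """
    upper = min(n_set, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 of 200 DE genes fall in a 100-gene pathway,
# out of 10,000 genes tested (about 2 would be expected by chance).
p = enrichment_p_value(10_000, 100, 200, 20)
```

The test uses only set membership, so two pathways with the same overlap counts receive the same p-value regardless of how their genes interact; topology-based methods are designed to address that limitation.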