1
Sekula M, Gaskins J, Datta S. Single-Cell Differential Network Analysis with Sparse Bayesian Factor Models. Front Genet 2022; 12:810816. [PMID: 35186014 PMCID: PMC8855158 DOI: 10.3389/fgene.2021.810816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open
Abstract
Differential network analysis plays an important role in learning how gene interactions change under different biological conditions, and the high resolution of single-cell RNA sequencing (scRNA-seq) provides new opportunities to explore these changing gene-gene interactions. Here, we present a sparse hierarchical Bayesian factor model to identify differences across network structures from different biological conditions in scRNA-seq data. Our methodology utilizes latent factors to impact gene expression values for each cell to help account for zero-inflation, increased cell-to-cell variability, and overdispersion that are unique characteristics of scRNA-seq data. Condition-dependent parameters determine which latent factors are activated in a gene, which allows for not only the calculation of gene-gene co-expression within each group but also the calculation of the co-expression differences between groups. We highlight our methodology’s performance in detecting differential gene-gene associations across groups by analyzing simulated datasets and a SARS-CoV-2 case study dataset.
Affiliation(s)
- Michael Sekula
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, United States
- Jeremy Gaskins
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, United States
- Susmita Datta
  - Department of Biostatistics, University of Florida, Gainesville, FL, United States
  - *Correspondence: Susmita Datta
2
Liu J, Wang H, Sun W, Liu Y. Prioritizing Autism Risk Genes using Personalized Graphical Models Estimated from Single Cell RNA-seq Data. J Am Stat Assoc 2022; 117:38-51. [PMID: 35529781 PMCID: PMC9070996 DOI: 10.1080/01621459.2021.1933495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Hundreds of autism risk genes have been reported recently, mainly based on genetic studies where these risk genes have more de novo mutations in autism subjects than in healthy controls. However, as a complex disease, autism is likely associated with more risk genes and many of them may not be identifiable through de novo mutations. We hypothesize that more autism risk genes can be identified through their connections with known autism risk genes in personalized gene-gene interaction graphs. We estimate such personalized graphs using single cell RNA sequencing (scRNA-seq) while appropriately modeling the cell dependence and possible zero-inflation in the scRNA-seq data. The sample size, which is the number of cells per individual, ranges from 891 to 1,241 in our case study using scRNA-seq data in autism subjects and controls. We consider 1,500 genes in our analysis. Since the number of genes is larger than or comparable to the sample size, we perform penalized estimation. We score each gene's relevance by applying a simple graph kernel smoothing method to each personalized graph. The molecular functions of the top-scored genes are related to autism diseases. For example, a candidate gene RYR2 that encodes protein ryanodine receptor 2 is involved in neurotransmission, a process that is impaired in ASD patients. While our method provides a systemic and unbiased approach to prioritize autism risk genes, the relevance of these genes needs to be further validated in functional studies.
Affiliation(s)
- Jianyu Liu
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill
- Haodong Wang
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill
- Wei Sun
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Yufeng Liu
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill; Department of Genetics, Department of Biostatistics, Carolina Center for Genome Science, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill
3
Network Biology and Artificial Intelligence Drive the Understanding of the Multidrug Resistance Phenotype in Cancer. Drug Resist Updat 2022; 60:100811. [DOI: 10.1016/j.drup.2022.100811] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/29/2021] [Revised: 01/22/2022] [Accepted: 01/24/2022] [Indexed: 02/07/2023]
4
Liu C, Cai D, Zeng W, Huang Y. Inferring Differential Networks by Integrating Gene Expression Data With Additional Knowledge. Front Genet 2021; 12:760155. [PMID: 34858477 PMCID: PMC8632038 DOI: 10.3389/fgene.2021.760155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 10/13/2021] [Indexed: 11/23/2022] Open
Abstract
Evidence increasingly indicates the involvement of gene network rewiring in disease development and cell differentiation. With the accumulation of high-throughput gene expression data, it is now possible to infer the changes of gene networks between two different states or cell types via computational approaches. However, the distribution diversity of multi-platform gene expression data and the sparseness and high noise rate of single-cell RNA sequencing (scRNA-seq) data raise new challenges for existing differential network estimation methods. Furthermore, most existing methods rely purely on gene expression data and ignore the additional information provided by existing biological knowledge. In this study, to address these challenges, we propose a general framework, named weighted joint sparse penalized D-trace model (WJSDM), to infer differential gene networks by integrating multi-platform gene expression data and multiple sources of prior biological knowledge. First, a non-paranormal graphical model is employed to tackle gene expression data with missing values. Then we propose a weighted group bridge penalty to integrate multi-platform gene expression data and various existing biological knowledge. Experimental results on synthetic data demonstrate the effectiveness of our method in inferring differential networks. We apply our method to the gene expression data of ovarian cancer and the scRNA-seq data of circulating tumor cells of prostate cancer, and infer the differential networks associated with platinum resistance of ovarian cancer and anti-androgen resistance of prostate cancer. By analyzing the estimated differential networks, we gain some important biological insights into the mechanisms underlying platinum resistance of ovarian cancer and anti-androgen resistance of prostate cancer.
Affiliation(s)
- Chen Liu
  - Department of Chemotherapy, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Dehan Cai
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
- WuCha Zeng
  - Department of Chemotherapy, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Yun Huang
  - Department of Geriatric Medicine, The First Affiliated Hospital of Fujian Medical University, Fuzhou, China
5
Wang Q, Zhang B, Yue Z. Disentangling the Molecular Pathways of Parkinson's Disease using Multiscale Network Modeling. Trends Neurosci 2021; 44:182-188. [PMID: 33358606 PMCID: PMC10942661 DOI: 10.1016/j.tins.2020.11.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/11/2020] [Revised: 10/28/2020] [Accepted: 11/19/2020] [Indexed: 12/14/2022]
Abstract
Parkinson's disease (PD) is a complex neurodegenerative disorder. The identification of genetic variants has shed light on the molecular pathways for inherited PD, while the disease mechanism for idiopathic PD remains elusive, partly due to a lack of robust tools. The complexity of PD arises from the heterogeneity of clinical symptoms, pathologies, environmental insults contributing to the disease, and disease comorbidities. Molecular networks have been increasingly used to identify molecular pathways and drug targets in complex human diseases. Here, we review recent advances in molecular network approaches and their application to PD. We discuss how network modeling can predict functions of PD genetic risk factors through network context and assist in the discovery of network-based therapeutics for neurodegenerative diseases.
Affiliation(s)
- Qian Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA; Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA; Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029-6501, USA; Department of Neurology and Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA; Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA; Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, NY 10029, USA; Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029-6501, USA
- Zhenyu Yue
  - Department of Neurology and Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, NY 10029, USA
6
Wu N, Yin F, Ou-Yang L, Zhu Z, Xie W. Joint learning of multiple gene networks from single-cell gene expression data. Comput Struct Biotechnol J 2020; 18:2583-2595. [PMID: 33033579 PMCID: PMC7527714 DOI: 10.1016/j.csbj.2020.09.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2020] [Revised: 08/31/2020] [Accepted: 09/01/2020] [Indexed: 11/24/2022] Open
Abstract
Inferring gene networks from gene expression data is important for understanding functional organizations within cells. With the accumulation of single-cell RNA sequencing (scRNA-seq) data, it is possible to infer gene networks at the single-cell level. However, due to the characteristics of scRNA-seq data, such as cellular heterogeneity and high sparsity caused by dropout events, traditional network inference methods may not be suitable for scRNA-seq data. In this study, we introduce a novel joint Gaussian copula graphical model (JGCGM) to jointly estimate multiple gene networks for multiple cell subgroups from scRNA-seq data. Our model can deal with non-Gaussian data with missing values and identify the common and unique network structures of multiple cell subgroups, which makes it suitable for scRNA-seq data. Extensive experiments on synthetic data demonstrate that our proposed model outperforms other state-of-the-art network inference models. We apply our model to real scRNA-seq data sets to infer gene networks of different cell subgroups. Hub genes in the estimated gene networks are found to be biologically significant.
Affiliation(s)
- Nuosi Wu
  - College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
- Fu Yin
  - College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
- Le Ou-Yang
  - College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
  - Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen, China
  - Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China
- Zexuan Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Weixin Xie
  - College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
7
Yang X, Kui L, Tang M, Li D, Wei K, Chen W, Miao J, Dong Y. High-Throughput Transcriptome Profiling in Drug and Biomarker Discovery. Front Genet 2020; 11:19. [PMID: 32117438 PMCID: PMC7013098 DOI: 10.3389/fgene.2020.00019] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Received: 10/16/2019] [Accepted: 01/07/2020] [Indexed: 01/26/2023] Open
Abstract
The development of new drugs is multidisciplinary, systematic work. High-throughput techniques based on “-omics” have driven the discovery of biomarkers in diseases and therapeutic targets of drugs. A transcriptome is the complete set of all RNAs transcribed by certain tissues or cells at a specific stage of development or physiological condition. Transcriptome research can demonstrate gene functions and structures at a global level and reveal the molecular mechanisms of specific biological processes in diseases. Currently, gene expression microarrays and high-throughput RNA-sequencing are widely used in biological, medical, clinical, and drug research. The former has been applied in drug screening and biomarker detection of drugs due to its high throughput, fast detection speed, simple analysis, and relatively low price. With the further development of detection technology and the improvement of analytical methods, the detection flux of RNA-seq is much higher but the price is lower; hence, it has powerful advantages in detecting biomarkers and in drug discovery. Compared with traditional RNA-seq, scRNA-seq has higher accuracy and efficiency; in particular, gene expression pattern analysis at the single-cell level can provide more information for drug and biomarker discovery. Therefore, (sc)RNA-seq has broader application prospects, especially in the field of drug discovery. In this overview, we review the application of these technologies in drug (especially natural drug) and biomarker discovery and development. Emerging applications of scRNA-seq and third-generation RNA-sequencing tools are also discussed.
Affiliation(s)
- Xiaonan Yang
  - Guangxi Key Laboratory of Medicinal Resources Protection and Genetic Improvement, Guangxi Botanical Garden of Medicinal Plants, Nanning, China
- Ling Kui
  - Dana-Farber Cancer Institute, Harvard Medical School, Brookline, MA, United States
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang, China
- Dawei Li
  - College of Biological Big Data, Yunnan Agricultural University, Kunming, China; State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
- Kunhua Wei
  - Guangxi Key Laboratory of Medicinal Resources Protection and Genetic Improvement, Guangxi Botanical Garden of Medicinal Plants, Nanning, China; School of Pharmacy, Guangxi Medical University, Nanning, China
- Wei Chen
  - College of Biological Big Data, Yunnan Agricultural University, Kunming, China; State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
- Jianhua Miao
  - Guangxi Key Laboratory of Medicinal Resources Protection and Genetic Improvement, Guangxi Botanical Garden of Medicinal Plants, Nanning, China; School of Pharmacy, Guangxi Medical University, Nanning, China
- Yang Dong
  - Guangxi Key Laboratory of Medicinal Resources Protection and Genetic Improvement, Guangxi Botanical Garden of Medicinal Plants, Nanning, China; College of Biological Big Data, Yunnan Agricultural University, Kunming, China; State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
8
Rossi E, Zamarchi R. Single-Cell Analysis of Circulating Tumor Cells: How Far Have We Come in the -Omics Era? Front Genet 2019; 10:958. [PMID: 31681412 PMCID: PMC6811661 DOI: 10.3389/fgene.2019.00958] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 03/15/2019] [Accepted: 09/09/2019] [Indexed: 12/11/2022] Open
Abstract
Tumor cells detach from the primary tumor or metastatic sites and enter the peripheral blood, often causing metastasis. These cells, named Circulating Tumor Cells (CTCs), display the same spatial and temporal heterogeneity as the primary tumor. Since CTCs are involved in tumor progression, they represent a privileged window to disclose mechanisms of metastases, while -omic analyses at the single-cell level allow dissection of the complex relationships between the tumor subpopulations and the surrounding normal tissue. However, in addition to reporting the proof of concept that we can query CTCs to reveal tumor evolution throughout the continuum of treatment for early detection of resistance to therapy, the scientific literature has also been highlighting the disadvantages of CTCs, which hamper routine use of this approach in clinical practice. To date, an increasing number of CTC technologies, as well as -omics methods, have been employed, mostly lacking strong comparative analyses. The rarity of CTCs also represents a major challenge, because there is no consensus regarding the minimal criteria necessary and sufficient to define an event as a CTC; moreover, we often cannot compare data from one study with those of another. Finally, the availability of an individual tumor profile undermines the traditional histology-based treatment. Applying molecular data for patient benefit implies a collective effort by biologists, bioengineers, and clinicians to create tools to interpret molecular data and manage precision medicine in every single patient. Herein, we focus on the most recent findings in CTC -omics to learn how far we have come.
Affiliation(s)
- Elisabetta Rossi
  - Department of Surgery, Oncology and Gastroenterology, University of Padova, Padova, Italy; Veneto Institute of Oncology IOV-IRCCS, Padua, Italy
- Rita Zamarchi
  - Veneto Institute of Oncology IOV-IRCCS, Padua, Italy
9
Wang K, Liu X, Guo Y, Wu Z, Zhi D, Ruan J, Zhao Z. The International Conference on Intelligent Biology and Medicine (ICIBM) 2018: systems biology on diverse data types. BMC Systems Biology 2018; 12:125. [PMID: 30577731 PMCID: PMC6302362 DOI: 10.1186/s12918-018-0648-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Between June 10–12, 2018, the International Conference on Intelligent Biology and Medicine (ICIBM 2018) was held in Los Angeles, California, USA. The conference included 11 scientific sessions, four tutorials, one poster session, four keynote talks and four eminent scholar talks that covered a wide range of topics in 3D genome structure analysis and visualization, next generation sequencing analysis, computational drug discovery, medical informatics, cancer genomics and systems biology. Systems biology has been a main theme in ICIBM 2018, with exciting advances presented in many areas of systems biology, covering various different data types such as gene regulation, circular RNAs expression, single-cell RNA-Seq, inter-chromosomal interactions, metabolomics, proteomics and phosphoproteomics. Here, we describe ten high quality papers to be published in BMC Systems Biology.
Affiliation(s)
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA; Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Xiaoming Liu
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA; College of Public Health, University of South Florida, Tampa, FL, 33612, USA
- Yan Guo
  - Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, 87131, USA
- Zhijin Wu
  - Department of Biostatistics, Brown University, Providence, RI, 02912, USA
- Degui Zhi
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Jianhua Ruan
  - Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA