1
Shabestary K, Klemm C, Carling B, Marshall J, Savigny J, Storch M, Ledesma-Amaro R. Phenotypic heterogeneity follows a growth-viability tradeoff in response to amino acid identity. Nat Commun 2024; 15:6515. [PMID: 39095345 PMCID: PMC11297284 DOI: 10.1038/s41467-024-50602-8] [Received: 02/02/2024] [Accepted: 07/16/2024] [Indexed: 08/04/2024]
Abstract
In their natural environments, microorganisms mainly operate at suboptimal growth conditions with fluctuations in nutrient abundance. The resulting cellular adaptation is subject to conflicting tasks: growth or survival maximisation. Here, we study this adaptation by systematically measuring the impact of a nitrogen downshift to 24 nitrogen sources on cellular metabolism at the single-cell level. Saccharomyces lineages grown in rich media and exposed to a nitrogen downshift gradually differentiate to form two subpopulations of different cell sizes where one favours growth while the other favours viability with an extended chronological lifespan. This differentiation is asymmetrical with daughter cells representing the new differentiated state with increased viability. We characterise the metabolic response of the subpopulations using RNA sequencing, metabolic biosensors and a transcription factor-tagged GFP library coupled to high-throughput microscopy, imaging more than 800,000 cells. We find that the subpopulation with increased viability is associated with a dormant quiescent state displaying differences in MAPK signalling. Depending on the identity of the nitrogen source present, differentiation into the quiescent state can be actively maintained, attenuated, or aborted. These results establish amino acids as important signalling molecules for the formation of genetically identical subpopulations, involved in chronological lifespan and growth rate determination.
Affiliation(s)
- Kiyan Shabestary
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK
- Cinzia Klemm
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK
- Benedict Carling
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK
  - London Biofoundry, Imperial College Translation & Innovation Hub, London, UK
- James Marshall
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK
  - London Biofoundry, Imperial College Translation & Innovation Hub, London, UK
- Juline Savigny
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK
- Marko Storch
  - London Biofoundry, Imperial College Translation & Innovation Hub, London, UK
  - Department of Infectious Disease, Imperial College London, London, SW7 2AZ, UK
- Rodrigo Ledesma-Amaro
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London, SW7 2AZ, UK

2
Huo Q, Song R, Ma Z. Recent advances in exploring transcriptional regulatory landscape of crops. Front Plant Sci 2024; 15:1421503. [PMID: 38903438 PMCID: PMC11188431 DOI: 10.3389/fpls.2024.1421503] [Received: 04/22/2024] [Accepted: 05/23/2024] [Indexed: 06/22/2024]
Abstract
Crop breeding entails developing and selecting plant varieties with improved agronomic traits. Modern molecular techniques, such as genome editing, enable more efficient manipulation of plant phenotype by altering the expression of particular regulatory or functional genes. Hence, it is essential to thoroughly comprehend the transcriptional regulatory mechanisms that underpin these traits. In the multi-omics era, a large amount of omics data has been generated for diverse crop species, including genomics, epigenomics, transcriptomics, proteomics, and single-cell omics. The abundant data resources and the emergence of advanced computational tools offer unprecedented opportunities for obtaining a holistic view and profound understanding of the regulatory processes linked to desirable traits. This review focuses on integrated network approaches that utilize multi-omics data to investigate gene expression regulation. Various types of regulatory networks and their inference methods are discussed, focusing on recent advancements in crop plants. The integration of multi-omics data has been proven to be crucial for the construction of high-confidence regulatory networks. With the refinement of these methodologies, they will significantly enhance crop breeding efforts and contribute to global food security.
Affiliation(s)
- Zeyang Ma
  - State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, China

3
Moyung K, Li Y, Hartemink AJ, MacAlpine DM. Genome-wide nucleosome and transcription factor responses to genetic perturbations reveal chromatin-mediated mechanisms of transcriptional regulation. bioRxiv 2024:2024.05.24.595391. [PMID: 38826400 PMCID: PMC11142231 DOI: 10.1101/2024.05.24.595391] [Indexed: 06/04/2024]
Abstract
Epigenetic mechanisms contribute to gene regulation by altering chromatin accessibility through changes in transcription factor (TF) and nucleosome occupancy throughout the genome. Despite numerous studies focusing on changes in gene expression, the intricate chromatin-mediated regulatory code remains largely unexplored on a comprehensive scale. We address this by employing a factor-agnostic, reverse-genetics approach that uses MNase-seq to capture genome-wide TF and nucleosome occupancies in response to the individual deletion of 201 transcriptional regulators in Saccharomyces cerevisiae, thereby assaying nearly one million mutant-gene interactions. We develop a principled approach to identify and quantify chromatin changes genome-wide, observing differences in TF and nucleosome occupancy that recapitulate well-established pathways identified by gene expression data. We also discover distinct chromatin signatures associated with the up- and downregulation of genes, and use these signatures to reveal regulatory mechanisms previously unexplored in expression-based studies. Finally, we demonstrate that chromatin features are predictive of transcriptional activity and leverage these features to reconstruct chromatin-based transcriptional regulatory networks. Overall, these results illustrate the power of an approach combining genetic perturbation with high-resolution epigenomic profiling; the latter enables a close examination of the interplay between TFs and nucleosomes genome-wide, providing a deeper, more mechanistic understanding of the complex relationship between chromatin organization and transcription.
Affiliation(s)
- Kevin Moyung
  - Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710
- Yulong Li
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710
  - Department of Computer Science, Duke University, Durham, NC 27708
- Alexander J. Hartemink
  - Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708
  - Department of Computer Science, Duke University, Durham, NC 27708
- David M. MacAlpine
  - Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710

4
Kim H, Chang W, Chae SJ, Park JE, Seo M, Kim JK. scLENS: data-driven signal detection for unbiased scRNA-seq data analysis. Nat Commun 2024; 15:3575. [PMID: 38678050 DOI: 10.1038/s41467-024-47884-3] [Received: 10/18/2023] [Accepted: 04/14/2024] [Indexed: 04/29/2024]
Abstract
High dimensionality and noise have limited the new biological insights that can be discovered in scRNA-seq data. While dimensionality reduction tools have been developed to extract biological signals from the data, they often require manual determination of signal dimension, introducing user bias. Furthermore, a common data preprocessing method, log normalization, can unintentionally distort signals in the data. Here, we develop scLENS, a dimensionality reduction tool that circumvents the long-standing issues of signal distortion and manual input. Specifically, we identify the primary cause of signal distortion during log normalization and effectively address it by uniformizing cell vector lengths with L2 normalization. Furthermore, we utilize random matrix theory-based noise filtering and a signal robustness test to enable data-driven determination of the threshold for signal dimensions. Our method outperforms 11 widely used dimensionality reduction tools and performs particularly well for challenging scRNA-seq datasets with high sparsity and variability. To facilitate the use of scLENS, we provide a user-friendly package that automates accurate signal detection of scRNA-seq data without manual time-consuming tuning.
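The normalization step this abstract describes (log normalization followed by L2 normalization so that every cell vector has the same length) can be illustrated with a minimal NumPy sketch. This is not the scLENS package or its actual pipeline; the function name `log_then_l2_normalize` and the `scale` factor of 10,000 are assumptions chosen for illustration.

```python
import numpy as np

def log_then_l2_normalize(counts, scale=1e4):
    """Log-normalize a cells x genes count matrix, then rescale each
    cell (row) to unit Euclidean length.

    Hypothetical helper for illustration only; `scale` is an assumed
    library-size scaling factor, not a value taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    # Library size per cell; guard against empty cells to avoid 0/0.
    lib = counts.sum(axis=1, keepdims=True)
    lib[lib == 0] = 1.0
    # Conventional log normalization: scale to a common depth, then log(1 + x).
    logged = np.log1p(counts / lib * scale)
    # L2 normalization: give every cell vector the same length (1.0).
    norms = np.linalg.norm(logged, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return logged / norms

# Toy example: 5 cells x 50 genes of sparse Poisson counts.
rng = np.random.default_rng(0)
demo_counts = rng.poisson(lam=0.3, size=(5, 50))
demo_normalized = log_then_l2_normalize(demo_counts)
```

After this step, cells with very different sequencing depth and sparsity all lie on the unit sphere, which is the "uniformized cell vector length" property the abstract refers to.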
Affiliation(s)
- Hyun Kim
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Won Chang
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH, 45221, USA
- Seok Joo Chae
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea
  - Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea
- Jong-Eun Park
  - Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, Republic of Korea
- Minseok Seo
  - Department of Computer and Information Science, Korea University, Sejong, 30019, Republic of Korea
- Jae Kyoung Kim
  - Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea
  - Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea

5
Skok Gibbs C, Mahmood O, Bonneau R, Cho K. PMF-GRN: a variational inference approach to single-cell gene regulatory network inference using probabilistic matrix factorization. Genome Biol 2024; 25:88. [PMID: 38589899 PMCID: PMC11003171 DOI: 10.1186/s13059-024-03226-6] [Received: 03/30/2023] [Accepted: 03/26/2024] [Indexed: 04/10/2024]
Abstract
Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.
Affiliation(s)
- Omar Mahmood
  - Center for Data Science, New York University, New York, NY, 10011, USA
- Richard Bonneau
  - Center for Data Science, New York University, New York, NY, 10011, USA
  - Prescient Design, Genentech, New York, NY, 10010, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
- Kyunghyun Cho
  - Center for Data Science, New York University, New York, NY, 10011, USA
  - Prescient Design, Genentech, New York, NY, 10010, USA

6
Nadal-Ribelles M, Solé C, de Nadal E, Posas F. The rise of single-cell transcriptomics in yeast. Yeast 2024; 41:158-170. [PMID: 38403881 DOI: 10.1002/yea.3934] [Received: 09/29/2023] [Revised: 01/24/2024] [Accepted: 02/15/2024] [Indexed: 02/27/2024]
Abstract
The field of single-cell omics has transformed our understanding of biological processes and is constantly advancing both experimentally and computationally. One of the most significant developments is the ability to measure the transcriptome of individual cells by single-cell RNA-seq (scRNA-seq), which was pioneered in higher eukaryotes. While yeast has served as a powerful model organism in which to test and develop transcriptomic technologies, the implementation of scRNA-seq has been significantly delayed in this organism, mainly because of technical constraints associated with its intrinsic characteristics, namely the presence of a cell wall, a small cell size and small amounts of RNA. In this review, we examine the current technologies for scRNA-seq in yeast and highlight their strengths and weaknesses. Additionally, we explore opportunities for developing novel technologies and the potential outcomes of implementing single-cell transcriptomics and extension to other modalities. Undoubtedly, scRNA-seq will be invaluable for both basic and applied yeast research, providing unique insights into fundamental biological processes.
Affiliation(s)
- Mariona Nadal-Ribelles
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Carme Solé
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Eulalia de Nadal
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Francesc Posas
  - Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain

7
Brettner L, Eder R, Schmidlin K, Geiler-Samerotte K. An ultra high-throughput, massively multiplexable, single-cell RNA-seq platform in yeasts. Yeast 2024; 41:242-255. [PMID: 38282330 PMCID: PMC11146634 DOI: 10.1002/yea.3927] [Received: 05/10/2023] [Revised: 01/04/2024] [Accepted: 01/16/2024] [Indexed: 01/30/2024]
Abstract
Yeasts are naturally diverse, genetically tractable, and easy to grow such that researchers can investigate any number of genotypes, environments, or interactions thereof. However, studies of yeast transcriptomes have been limited by the processing capabilities of traditional RNA sequencing techniques. Here we optimize a powerful, high-throughput single-cell RNA sequencing (scRNAseq) platform, SPLiT-seq (Split Pool Ligation-based Transcriptome sequencing), for yeasts and apply it to 43,388 cells of multiple species and ploidies. This platform utilizes a combinatorial barcoding strategy to enable massively parallel RNA sequencing of hundreds of yeast genotypes or growth conditions at once. This method can be applied to most species or strains of yeast for a fraction of the cost of traditional scRNAseq approaches. Thus, our technology permits researchers to leverage "the awesome power of yeast" by allowing us to survey the transcriptome of hundreds of strains and environments in a short period of time and with no specialized equipment. The key to this method is that sequential barcodes are probabilistically appended to cDNA copies of RNA while the molecules remain trapped inside of each cell. Thus, the transcriptome of each cell is labeled with a unique combination of barcodes. Since SPLiT-seq uses the cell membrane as a container for this reaction, many cells can be processed together without the need to physically isolate them from one another in separate wells or droplets. Further, the first barcode in the sequence can be chosen intentionally to identify samples from different environments or genetic backgrounds, enabling multiplexing of hundreds of unique perturbations in a single experiment. In addition to greater multiplexing capabilities, our method also facilitates a deeper investigation of biological heterogeneity, given its single-cell nature. 
For example, in the data presented here, we detect transcriptionally distinct cell states related to cell cycle, ploidy, metabolic strategies, and so forth, all within clonal yeast populations grown in the same environment. Hence, our technology has two obvious and impactful applications for yeast research: the first is the general study of transcriptional phenotypes across many strains and environments, and the second is investigating cell-to-cell heterogeneity across the entire transcriptome.
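The combinatorial barcoding idea in this abstract (each cell receives a random sequence of barcodes, so most cells end up with a unique combination) can be made concrete with a small calculation. The sketch below is an illustration, not the authors' protocol or software: the function names are hypothetical, the 96 wells and 3 rounds are assumed example values, and barcodes are assumed to be assigned uniformly and independently. Only the cell count of 43,388 comes from the abstract.

```python
def barcode_combinations(wells_per_round, rounds):
    """Number of distinct barcode combinations after `rounds` rounds of
    split-pool barcoding with `wells_per_round` barcodes per round."""
    return wells_per_round ** rounds

def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells that share their barcode combination
    with at least one other cell, assuming each cell draws a combination
    uniformly and independently (a simplifying assumption)."""
    if n_cells < 1:
        return 0.0
    # Probability that none of the other (n_cells - 1) cells drew the same combination.
    p_unique = (1.0 - 1.0 / n_combinations) ** (n_cells - 1)
    return 1.0 - p_unique

# Assumed example layout: 3 rounds of 96 barcodes each.
combos = barcode_combinations(wells_per_round=96, rounds=3)
# 43,388 is the number of cells reported in the abstract.
collision = expected_collision_fraction(n_cells=43_388, n_combinations=combos)
print(f"{combos} combinations, expected collision fraction {collision:.3f}")
```

Adding a round, or more barcodes per round, grows the combination space multiplicatively, which is why many cells can share one reaction without physical isolation while still being distinguishable.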
Affiliation(s)
- Leandra Brettner
  - Biodesign Institute Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona, USA
- Rachel Eder
  - Biodesign Institute Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona, USA
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
- Kara Schmidlin
  - Biodesign Institute Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona, USA
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
- Kerry Geiler-Samerotte
  - Biodesign Institute Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona, USA
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA

8
Sun H, Qu H, Duan K, Du W. scMGCN: A Multi-View Graph Convolutional Network for Cell Type Identification in scRNA-seq Data. Int J Mol Sci 2024; 25:2234. [PMID: 38396909 PMCID: PMC10889820 DOI: 10.3390/ijms25042234] [Received: 12/06/2023] [Revised: 02/07/2024] [Accepted: 02/09/2024] [Indexed: 02/25/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) data reveal the complexity and diversity of cellular ecosystems and molecular interactions in various biomedical research. Hence, identifying cell types from large-scale scRNA-seq data using existing annotations is challenging and requires stable and interpretable methods. However, the current cell type identification methods have limited performance, mainly due to the intrinsic heterogeneity among cell populations and extrinsic differences between datasets. Here, we present a robust graph artificial intelligence model, a multi-view graph convolutional network model (scMGCN) that integrates multiple graph structures from raw scRNA-seq data and applies graph convolutional networks with attention mechanisms to learn cell embeddings and predict cell labels. We evaluate our model on single-dataset, cross-species, and cross-platform experiments and compare it with other state-of-the-art methods. Our results show that scMGCN outperforms the other methods regarding stability, accuracy, and robustness to batch effects. Our main contributions are as follows: Firstly, we introduce multi-view learning and multiple graph construction methods to capture comprehensive cellular information from scRNA-seq data. Secondly, we construct a scMGCN that combines graph convolutional networks with attention mechanisms to extract shared, high-order information from cells. Finally, we demonstrate the effectiveness and superiority of the scMGCN on various datasets.
Affiliation(s)
- Wei Du
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (H.S.); (H.Q.); (K.D.)

9
Walls AW, Rosenthal AZ. Bacterial phenotypic heterogeneity through the lens of single-cell RNA sequencing. Transcription 2024; 15:48-62. [PMID: 38532542 DOI: 10.1080/21541264.2024.2334110] [Received: 12/17/2023] [Accepted: 03/19/2024] [Indexed: 03/28/2024]
Abstract
Bacterial transcription is not monolithic. Microbes exist in a wide variety of cell states that help them adapt to their environment, acquire and produce essential nutrients, and engage in both competition and cooperation with their neighbors. While we typically think of bacterial adaptation as a group behavior, where all cells respond in unison, there is often a mixture of phenotypic responses within a bacterial population, where distinct cell types arise. A primary phenomenon driving these distinct cell states is transcriptional heterogeneity. Given that bacterial mRNA transcripts are extremely short-lived compared to eukaryotes, their transcriptional state is closely associated with their physiology, and thus the transcriptome of a bacterial cell acts as a snapshot of the behavior of that bacterium. Therefore, the application of single-cell transcriptomics to microbial populations will provide novel insight into cellular differentiation and bacterial ecology. In this review, we provide an overview of transcriptional heterogeneity in microbial systems, discuss the findings already provided by single-cell approaches, and plot new avenues of inquiry in transcriptional regulation, cellular biology, and mechanisms of heterogeneity that are made possible when microbial communities are analyzed at single-cell resolution.
Affiliation(s)
- Alex W Walls
  - Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, NC, USA
- Adam Z Rosenthal
  - Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, NC, USA

10
Tjärnberg A, Beheler-Amass M, Jackson CA, Christiaen LA, Gresham D, Bonneau R. Structure-primed embedding on the transcription factor manifold enables transparent model architectures for gene regulatory network and latent activity inference. Genome Biol 2024; 25:24. [PMID: 38238840 PMCID: PMC10797903 DOI: 10.1186/s13059-023-03134-1] [Received: 02/28/2023] [Accepted: 11/30/2023] [Indexed: 01/22/2024]
Abstract
BACKGROUND: Modeling of gene regulatory networks (GRNs) is limited due to a lack of direct measurements of genome-wide transcription factor activity (TFA) making it difficult to separate covariance and regulatory interactions. Inference of regulatory interactions and TFA requires aggregation of complementary evidence. Estimating TFA explicitly is problematic as it disconnects GRN inference and TFA estimation and is unable to account for, for example, contextual transcription factor-transcription factor interactions, and other higher order features. Deep-learning offers a potential solution, as it can model complex interactions and higher-order latent features, although does not provide interpretable models and latent features.
RESULTS: We propose a novel autoencoder-based framework, StrUcture Primed Inference of Regulation using latent Factor ACTivity (SupirFactor) for modeling, and a metric, explained relative variance (ERV), for interpretation of GRNs. We evaluate SupirFactor with ERV in a wide set of contexts. Compared to current state-of-the-art GRN inference methods, SupirFactor performs favorably. We evaluate latent feature activity as an estimate of TFA and biological function in S. cerevisiae as well as in peripheral blood mononuclear cells (PBMC).
CONCLUSION: Here we present a framework for structure-primed inference and interpretation of GRNs, SupirFactor, demonstrating interpretability using ERV in multiple biological and experimental settings. SupirFactor enables TFA estimation and pathway analysis using latent factor activity, demonstrated here on two large-scale single-cell datasets, modeling S. cerevisiae and PBMC. We find that the SupirFactor model facilitates biological analysis acquiring novel functional and regulatory insight.
Affiliation(s)
- Andreas Tjärnberg
  - Center for Developmental Genetics, New York University, New York, NY, 10003, USA
  - Center For Genomics and Systems Biology, NYU, New York, NY, 10008, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
  - Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, 10010, USA
  - Department of Neuro-Science, University of Wisconsin-Madison - Waisman Center, Madison, USA
- Maggie Beheler-Amass
  - Center For Genomics and Systems Biology, NYU, New York, NY, 10008, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
- Christopher A Jackson
  - Center For Genomics and Systems Biology, NYU, New York, NY, 10008, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
- Lionel A Christiaen
  - Center for Developmental Genetics, New York University, New York, NY, 10003, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
  - Department of Heart Disease, Haukeland University Hospital, Bergen, Norway
- David Gresham
  - Center For Genomics and Systems Biology, NYU, New York, NY, 10008, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
- Richard Bonneau
  - Center For Genomics and Systems Biology, NYU, New York, NY, 10008, USA
  - Department of Biology, NYU, New York, NY, 10008, USA
  - Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, 10010, USA
  - Courant Institute of Mathematical Sciences, Computer Science Department, New York University, New York, NY, 10003, USA
  - Center For Data Science, NYU, New York, NY, 10008, USA
  - Prescient Design, a Genentech accelerator, New York, NY, 10010, USA

11
Martini L, Baek SH, Lo I, Raby BA, Silverman E, Weiss S, Glass K, Halu A. Detecting and dissecting signaling crosstalk via the multilayer network integration of signaling and regulatory interactions. Nucleic Acids Res 2024; 52:e5. [PMID: 37953325 PMCID: PMC10783515 DOI: 10.1093/nar/gkad1035] [Received: 10/28/2022] [Revised: 06/27/2023] [Accepted: 10/23/2023] [Indexed: 11/14/2023]
Abstract
The versatility of cellular response arises from the communication, or crosstalk, of signaling pathways in a complex network of signaling and transcriptional regulatory interactions. Understanding the various mechanisms underlying crosstalk on a global scale requires untargeted computational approaches. We present a network-based statistical approach, MuXTalk, that uses high-dimensional edges called multilinks to model the unique ways in which signaling and regulatory interactions can interface. We demonstrate that the signaling-regulatory interface is located primarily in the intermediary region between signaling pathways where crosstalk occurs, and that multilinks can differentiate between distinct signaling-transcriptional mechanisms. Using statistically over-represented multilinks as proxies of crosstalk, we infer crosstalk among 60 signaling pathways, expanding currently available crosstalk databases by more than five-fold. MuXTalk surpasses existing methods in terms of model performance metrics, identifies additions to manual curation efforts, and pinpoints potential mediators of crosstalk. Moreover, it accommodates the inherent context-dependence of crosstalk, allowing future applications to cell type- and disease-specific crosstalk.
Affiliation(s)
- Leonardo Martini
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Rome, 00185, Italy
- Seung Han Baek
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Ian Lo
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Benjamin A Raby
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Arda Halu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA

12
Boocock J, Alexander N, Tapia LA, Walter-McNeill L, Munugala C, Bloom JS, Kruglyak L. Single-cell eQTL mapping in yeast reveals a tradeoff between growth and reproduction. bioRxiv 2023:2023.12.07.570640. [PMID: 38106186 PMCID: PMC10723400 DOI: 10.1101/2023.12.07.570640] [Indexed: 12/19/2023]
Abstract
Expression quantitative trait loci (eQTLs) provide a key bridge between noncoding DNA sequence variants and organismal traits. The effects of eQTLs can differ among tissues, cell types, and cellular states, but these differences are obscured by gene expression measurements in bulk populations. We developed a one-pot approach to map eQTLs in Saccharomyces cerevisiae by single-cell RNA sequencing (scRNA-seq) and applied it to over 100,000 single cells from three crosses. We used scRNA-seq data to genotype each cell, measure gene expression, and classify the cells by cell-cycle stage. We mapped thousands of local and distant eQTLs and identified interactions between eQTL effects and cell-cycle stages. We took advantage of single-cell expression information to identify hundreds of genes with allele-specific effects on expression noise. We used cell-cycle stage classification to map 20 loci that influence cell-cycle progression. One of these loci influenced the expression of genes involved in the mating response. We showed that the effects of this locus arise from a common variant (W82R) in the gene GPA1, which encodes a signaling protein that negatively regulates the mating pathway. The 82R allele increases mating efficiency at the cost of slower cell-cycle progression and is associated with a higher rate of outcrossing in nature. Our results provide a more granular picture of the effects of genetic variants on gene expression and downstream traits.
Affiliation(s)
- James Boocock
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Noah Alexander
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Leslie Alamo Tapia
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Laura Walter-McNeill
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Chetan Munugala
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Joshua S Bloom
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Leonid Kruglyak
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
13
Saha E, Fanfani V, Mandros P, Ben-Guebila M, Fischer J, Hoff-Shutta K, Glass K, DeMeo DL, Lopes-Ramos C, Quackenbush J. Bayesian Optimized sample-specific Networks Obtained By Omics data (BONOBO). bioRxiv 2023:2023.11.16.567119. [PMID: 38014256 PMCID: PMC10680741 DOI: 10.1101/2023.11.16.567119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Gene regulatory networks (GRNs) are effective tools for inferring complex interactions between molecules that regulate biological processes and hence can provide insights into drivers of biological systems. Inferring co-expression networks is a critical element of GRN inference as the correlation between expression patterns may indicate that genes are coregulated by common factors. However, methods that estimate co-expression networks generally derive an aggregate network representing the mean regulatory properties of the population and so fail to fully capture population heterogeneity. To address these concerns, we introduce BONOBO (Bayesian Optimized Networks Obtained By assimilating Omics data), a scalable Bayesian model for deriving individual sample-specific co-expression networks by recognizing variations in molecular interactions across individuals. For every sample, BONOBO assumes a Gaussian distribution on the log-transformed centered gene expression and a conjugate prior distribution on the sample-specific co-expression matrix constructed from all other samples in the data. Combining the sample-specific gene expression with the prior distribution, BONOBO yields a closed-form solution for the posterior distribution of the sample-specific co-expression matrices, thus making the method extremely scalable. We demonstrate the utility of BONOBO in several contexts, including analyzing gene regulation in yeast transcription factor knockout studies, prognostic significance of miRNA-mRNA interaction in human breast cancer subtypes, and sex differences in gene regulation within human thyroid tissue. We find that BONOBO outperforms other sample-specific co-expression network inference methods and provides insight into individual differences in the drivers of biological processes.
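The closed-form update described in this abstract can be illustrated with a minimal sketch. It assumes the conjugate prior is an inverse-Wishart distribution (the standard conjugate family for the covariance of a zero-mean Gaussian; the abstract does not name the prior) whose mean is the covariance of all other samples. The function names and the degrees-of-freedom parameter `nu` are hypothetical and are not taken from the BONOBO software.

```python
def outer(x):
    """Outer product x x^T of a vector given as a list."""
    return [[a * b for b in x] for a in x]


def leave_one_out_covariance(rows, skip):
    """Covariance of all samples except `skip`.

    Rows are assumed to be log-transformed and already centered,
    so the mean is treated as zero.
    """
    others = [r for i, r in enumerate(rows) if i != skip]
    p = len(rows[0])
    cov = [[0.0] * p for _ in range(p)]
    for r in others:
        for j in range(p):
            for k in range(p):
                cov[j][k] += r[j] * r[k] / len(others)
    return cov


def posterior_mean_covariance(rows, i, nu):
    """Posterior mean of the sample-specific covariance for sample i.

    Assumed prior: inverse-Wishart with `nu` degrees of freedom (hypothetical
    tuning parameter) and mean equal to the leave-one-out covariance S.
    After observing the single centered sample x, the posterior mean is
    ((nu - p - 1) * S + x x^T) / (nu - p).
    """
    p = len(rows[0])
    if nu <= p + 1:
        raise ValueError("nu must exceed p + 1 for the prior mean to exist")
    prior_mean = leave_one_out_covariance(rows, i)
    xxt = outer(rows[i])
    weight = (nu - p - 1) / (nu - p)
    return [
        [weight * prior_mean[j][k] + xxt[j][k] / (nu - p) for k in range(p)]
        for j in range(p)
    ]
```

Under these assumptions, a larger `nu` pulls each sample-specific estimate toward the covariance of the remaining samples, while a smaller `nu` lets the individual sample dominate. No iterative fitting is involved, which is consistent with the scalability the abstract attributes to the closed-form posterior.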
Affiliation(s)
- Enakshi Saha
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Viola Fanfani
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Panagiotis Mandros
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Marouen Ben-Guebila
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Jonas Fischer
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Katherine Hoff-Shutta
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Kimberly Glass
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Dawn Lisa DeMeo
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Camila Lopes-Ramos
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- John Quackenbush
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, USA
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
14
Dumeaux V, Massahi S, Bettauer V, Mottola A, Dukovny A, Khurdia SS, Costa ACBP, Omran RP, Simpson S, Xie JL, Whiteway M, Berman J, Hallett MT. Candida albicans exhibits heterogeneous and adaptive cytoprotective responses to antifungal compounds. eLife 2023; 12:e81406. [PMID: 37888959 PMCID: PMC10699808 DOI: 10.7554/elife.81406] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/25/2022] [Accepted: 10/26/2023] [Indexed: 10/28/2023] Open
Abstract
Candida albicans, an opportunistic human pathogen, poses a significant threat to human health and is associated with significant socio-economic burden. Current antifungal treatments fail, at least in part, because C. albicans can initiate a strong drug tolerance response that allows some cells to grow at drug concentrations above their minimal inhibitory concentration. To better characterize this cytoprotective tolerance program at the molecular single-cell level, we used a nanoliter droplet-based transcriptomics platform to profile thousands of individual fungal cells and establish their subpopulation characteristics in the absence and presence of antifungal drugs. Profiles of untreated cells exhibit heterogeneous expression that correlates with cell cycle stage with distinct metabolic and stress responses. At 2 days post-fluconazole exposure (a time when tolerance is measurable), surviving cells bifurcate into two major subpopulations: one characterized by the upregulation of genes encoding ribosomal proteins, rRNA processing machinery, and mitochondrial cellular respiration capacity, termed the Ribo-dominant (Rd) state; and the other enriched for genes encoding stress responses and related processes, termed the Stress-dominant (Sd) state. This bifurcation persists at 3 and 6 days post-treatment. We provide evidence that the ribosome assembly stress response (RASTR) is activated in these subpopulations and may facilitate cell survival.
Affiliation(s)
- Vanessa Dumeaux
  - Department of Anatomy and Cell Biology, Western University, London, Canada
- Samira Massahi
  - Department of Biology, Concordia University, Montreal, Canada
- Van Bettauer
  - Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
- Austin Mottola
  - Shmunis School of Biomedical and Cancer Research, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv-Yafo, Israel
- Anna Dukovny
  - Shmunis School of Biomedical and Cancer Research, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv-Yafo, Israel
- Shawn Simpson
  - Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
- Jinglin Lucy Xie
  - Department of Chemical and Systems Biology, Stanford University, Stanford, United States
- Judith Berman
  - Shmunis School of Biomedical and Cancer Research, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv-Yafo, Israel
15
Wu Y, Qian B, Wang A, Dong H, Zhu E, Ma B. iLSGRN: inference of large-scale gene regulatory networks based on multi-model fusion. Bioinformatics 2023; 39:btad619. [PMID: 37851379 PMCID: PMC10589915 DOI: 10.1093/bioinformatics/btad619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/04/2023] [Accepted: 10/17/2023] [Indexed: 10/19/2023] Open
Abstract
MOTIVATION Gene regulatory networks (GRNs) are a way of describing the interaction between genes, which contribute to revealing the different biological mechanisms in the cell. Reconstructing GRNs based on gene expression data has been a central computational problem in systems biology. However, due to the high dimensionality and non-linearity of large-scale GRNs, accurately and efficiently inferring GRNs is still a challenging task. RESULTS In this article, we propose a new approach, iLSGRN, to reconstruct large-scale GRNs from steady-state and time-series gene expression data based on non-linear ordinary differential equations. Firstly, the regulatory gene recognition algorithm calculates the Maximal Information Coefficient between genes and excludes redundant regulatory relationships to achieve dimensionality reduction. Then, the feature fusion algorithm constructs a model leveraging the feature importance derived from XGBoost (eXtreme Gradient Boosting) and RF (Random Forest) models, which can effectively train the non-linear ordinary differential equations model of GRNs and improve the accuracy and stability of the inference algorithm. The extensive experiments on different scale datasets show that our method makes sensible improvement compared with the state-of-the-art methods. Furthermore, we perform cross-validation experiments on the real gene datasets to validate the robustness and effectiveness of the proposed method. AVAILABILITY AND IMPLEMENTATION The proposed method is written in the Python language, and is available at: https://github.com/lab319/iLSGRN.
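The feature-fusion idea in this abstract, combining importance scores from two tree-based models to rank candidate regulators, can be sketched as follows. The abstract does not specify the fusion rule, so this sketch assumes a simple weighted average of normalised scores; the function names, the `weight_a` parameter, and the example scores are all hypothetical and are not taken from the iLSGRN repository.

```python
def normalise(scores):
    """Scale importance scores so they sum to 1 (all zeros stay zero)."""
    total = sum(scores.values())
    if total == 0:
        return {gene: 0.0 for gene in scores}
    return {gene: s / total for gene, s in scores.items()}


def fuse_importances(imp_a, imp_b, weight_a=0.5):
    """Rank candidate regulators by a weighted average of two importance sets.

    imp_a and imp_b map regulator names to raw importance scores, for
    example from a gradient-boosting model and a random-forest model.
    The averaging rule is an assumption made for illustration only.
    """
    a, b = normalise(imp_a), normalise(imp_b)
    genes = set(a) | set(b)
    fused = {
        g: weight_a * a.get(g, 0.0) + (1.0 - weight_a) * b.get(g, 0.0)
        for g in genes
    }
    # Highest fused importance first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Normalising before averaging keeps one model from dominating merely because its raw scores are on a larger scale.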
Affiliation(s)
- Yiming Wu
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Bing Qian
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Anqi Wang
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong 999077, China
- Heng Dong
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Enqiang Zhu
  - Institution of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Baoshan Ma
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
16
Şapcı AOB, Lu S, Yan S, Ay F, Tastan O, Keleş S. MuDCoD: multi-subject community detection in personalized dynamic gene networks from single-cell RNA sequencing. Bioinformatics 2023; 39:btad592. [PMID: 37740957 PMCID: PMC10564618 DOI: 10.1093/bioinformatics/btad592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 08/24/2023] [Accepted: 09/21/2023] [Indexed: 09/25/2023] Open
Abstract
MOTIVATION With the wide availability of single-cell RNA-seq (scRNA-seq) technology, population-scale scRNA-seq datasets across multiple individuals and time points are emerging. While the initial investigations of these datasets tend to focus on standard analysis of clustering and differential expression, leveraging the power of scRNA-seq data at the personalized dynamic gene co-expression network level has the potential to unlock subject and/or time-specific network-level variation, which is critical for understanding phenotypic differences. Community detection from co-expression networks of multiple time points or conditions has been well-studied; however, none of the existing settings included networks from multiple subjects and multiple time points simultaneously. To address this, we develop Multi-subject Dynamic Community Detection (MuDCoD) for multi-subject community detection in personalized dynamic gene networks from scRNA-seq. MuDCoD builds on the spectral clustering framework and promotes information sharing among the networks of the subjects as well as networks at different time points. It clusters genes in the personalized dynamic gene networks and reveals gene communities that are variable or shared not only across time but also among subjects. RESULTS Evaluation and benchmarking of MuDCoD against existing approaches reveal that MuDCoD effectively leverages apparent shared signals among networks of the subjects at individual time points, and performs robustly when there is no or little information sharing among the networks. Applications to population-scale scRNA-seq datasets of human-induced pluripotent stem cells during dopaminergic neuron differentiation and CD4+ T cell activation indicate that MuDCoD enables robust inference for identifying time-varying personalized gene modules. Our results illustrate how personalized dynamic community detection can aid in the exploration of subject-specific biological processes that vary across time. 
AVAILABILITY AND IMPLEMENTATION MuDCoD is publicly available at https://github.com/bo1929/MuDCoD as a Python package. Implementation includes simulation and real-data experiments together with extensive documentation.
Affiliation(s)
- Ali Osman Berk Şapcı
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA 92093, United States
  - Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul 34956, Turkey
- Shan Lu
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States
- Shuchen Yan
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States
- Ferhat Ay
  - Department of Pediatrics, University of California San Diego, La Jolla, CA 92093, United States
  - Centers for Autoimmunity, Inflammation and Cancer Immunotherapy, La Jolla Institute for Immunology, La Jolla, CA 92037, United States
- Oznur Tastan
  - Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul 34956, Turkey
- Sündüz Keleş
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, United States
17
Jackson CA, Beheler-Amass M, Tjärnberg A, Suresh I, Hickey ASM, Bonneau R, Gresham D. Simultaneous estimation of gene regulatory network structure and RNA kinetics from single cell gene expression. bioRxiv 2023:2023.09.21.558277. [PMID: 37790443 PMCID: PMC10542544 DOI: 10.1101/2023.09.21.558277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Cells respond to environmental and developmental stimuli by remodeling their transcriptomes through regulation of both mRNA transcription and mRNA decay. A central goal of biology is identifying the global set of regulatory relationships between factors that control mRNA production and degradation and their target transcripts and construct a predictive model of gene expression. Regulatory relationships are typically identified using transcriptome measurements and causal inference algorithms. RNA kinetic parameters are determined experimentally by employing run-on or metabolic labeling (e.g. 4-thiouracil) methods that allow transcription and decay rates to be separately measured. Here, we develop a deep learning model, trained with single-cell RNA-seq data, that both infers causal regulatory relationships and estimates RNA kinetic parameters. The resulting in silico model predicts future gene expression states and can be perturbed to simulate the effect of transcription factor changes. We acquired model training data by sequencing the transcriptomes of 175,000 individual Saccharomyces cerevisiae cells that were subject to an external perturbation and continuously sampled over a one hour period. The rate of change for each transcript was calculated on a per-cell basis to estimate RNA velocity. We then trained a deep learning model with transcriptome and RNA velocity data to calculate time-dependent estimates of mRNA production and decay rates. By separating RNA velocity into transcription and decay rates, we show that rapamycin treatment causes existing ribosomal protein transcripts to be rapidly destabilized, while production of new transcripts gradually slows over the course of an hour. The neural network framework we present is designed to explicitly model causal regulatory relationships between transcription factors and their genes, and shows superior performance to existing models on the basis of recovery of known regulatory relationships. 
We validated the predictive power of the model by perturbing transcription factors in silico and comparing transcriptome-wide effects with experimental data. Our study represents the first step in constructing a complete, predictive, biophysical model of gene expression regulation.
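The separation of RNA velocity into transcription and decay rates can be illustrated with a minimal sketch. It assumes the common first-order kinetic model, in which velocity equals production minus decay rate times abundance; the abstract does not state the functional form used by the authors, and all names and parameter values below are hypothetical.

```python
import math


def velocity(production, decay_rate, abundance):
    """Rate of change of a transcript under assumed first-order decay."""
    return production - decay_rate * abundance


def half_life(decay_rate):
    """Transcript half-life implied by a first-order decay rate."""
    return math.log(2) / decay_rate


def simulate(abundance, production, decay_rate, dt, steps):
    """Forward-Euler trajectory of transcript abundance over time."""
    trajectory = [abundance]
    for _ in range(steps):
        abundance += velocity(production, decay_rate, abundance) * dt
        trajectory.append(abundance)
    return trajectory
```

Under this assumed model, the same negative velocity can arise from a drop in production or a rise in the decay rate, which is why the two rates have to be estimated separately. For instance, `simulate(4.0, 2.0, 1.0, 0.1, 50)` starts a transcript at the steady state for a decay rate of 0.5 and then doubles the decay rate while holding production fixed, so abundance falls toward the new steady state of 2.0.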
Affiliation(s)
- Christopher A Jackson
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
- Maggie Beheler-Amass
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
- Andreas Tjärnberg
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
- Ina Suresh
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
- Angela Shang-mei Hickey
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
- David Gresham
  - Center For Genomics and Systems Biology, New York University, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
18
Vande Zande P, Zhou X, Selmecki A. The Dynamic Fungal Genome: Polyploidy, Aneuploidy and Copy Number Variation in Response to Stress. Annu Rev Microbiol 2023; 77:341-361. [PMID: 37307856 PMCID: PMC10599402 DOI: 10.1146/annurev-micro-041320-112443] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Fungal species have dynamic genomes and often exhibit genomic plasticity in response to stress. This genome plasticity often comes with phenotypic consequences that affect fitness and resistance to stress. Fungal pathogens exhibit genome plasticity in both clinical and agricultural settings and often during adaptation to antifungal drugs, posing significant challenges to human health. Therefore, it is important to understand the rates, mechanisms, and impact of large genomic changes. This review addresses the prevalence of polyploidy, aneuploidy, and copy number variation across diverse fungal species, with special attention to prominent fungal pathogens and model species. We also explore the relationship between environmental stress and rates of genomic changes and highlight the mechanisms underlying genotypic and phenotypic changes. A comprehensive understanding of these dynamic fungal genomes is needed to identify novel solutions for the increase in antifungal drug resistance.
Affiliation(s)
- Pétra Vande Zande
  - Department of Microbiology and Immunology, University of Minnesota, Minneapolis, Minnesota, USA
- Xin Zhou
  - Department of Microbiology and Immunology, University of Minnesota, Minneapolis, Minnesota, USA
- Anna Selmecki
  - Department of Microbiology and Immunology, University of Minnesota, Minneapolis, Minnesota, USA
19
Tan F, Xuan Y, Long L, Yu Y, Zhang C, Liang P, Wang Y, Chen M, Wen J, Chen G. Single-cell analysis of human prepuce reveals dynamic changes in gene regulation and cellular communications. BMC Genomics 2023; 24:514. [PMID: 37658288 PMCID: PMC10474653 DOI: 10.1186/s12864-023-09615-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 08/22/2023] [Indexed: 09/03/2023] Open
Abstract
BACKGROUND The cellular and molecular dynamics of human prepuce are crucial for understanding its biological and physiological functions, as well as the prevention of related genital diseases. However, the cellular compositions and heterogeneity of human prepuce at single-cell resolution are still largely unknown. Here we systematically dissected the prepuce of children and adults based on the single-cell RNA-seq data of 90,770 qualified cells. RESULTS We identified 15 prepuce cell subtypes, including fibroblast, smooth muscle cells, T/natural killer cells, macrophages, vascular endothelial cells, and dendritic cells. The proportions of these cell types varied among different individuals as well as between children and adults. Moreover, we detected cell-type-specific gene regulatory networks (GRNs), which could contribute to the unique functions of related cell types. The GRNs were also highly dynamic between the prepuce cells of children and adults. Our cell-cell communication network analysis among different cell types revealed a set of child-specific (e.g., CD96, EPO, IFN-1, and WNT signaling pathways) and adult-specific (e.g., BMP10, NEGR, ncWNT, and NPR1 signaling pathways) signaling pathways. The variations of GRNs and cellular communications could be closely associated with prepuce development in children and prepuce maintenance in adults. CONCLUSIONS Collectively, we systematically analyzed the cellular variations and molecular changes of the human prepuce at single-cell resolution. Our results gained insights into the heterogeneity of prepuce cells and shed light on the underlying molecular mechanisms of prepuce development and maintenance.
Affiliation(s)
- Fei Tan
  - School of Medicine, Shanghai Skin Disease Hospital, Tongji University, Shanghai, 200443, China
  - Shanghai Skin Disease Clinical College, The Fifth Clinical Medical College, Anhui Medical University, Shanghai Skin Disease Hospital, Shanghai, 200443, China
- Yuan Xuan
  - Shanghai Skin Disease Clinical College, The Fifth Clinical Medical College, Anhui Medical University, Shanghai Skin Disease Hospital, Shanghai, 200443, China
- Lan Long
  - Longgang District Maternity & Child Healthcare Hospital of Shenzhen City, Shenzhen, 518172, China
- Yang Yu
  - Department of Urology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, 200072, China
- Chunhua Zhang
  - Department of Dermatology, Shanghai Baoshan Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai, 201999, China
- Pengchen Liang
  - School of Microelectronics, Shanghai University, Shanghai, 201800, China
- Yaoqun Wang
  - Shanghai Skin Disease Clinical College, The Fifth Clinical Medical College, Anhui Medical University, Shanghai Skin Disease Hospital, Shanghai, 200443, China
- Meiyu Chen
  - Shanghai Skin Disease Clinical College, The Fifth Clinical Medical College, Anhui Medical University, Shanghai Skin Disease Hospital, Shanghai, 200443, China
- Jiling Wen
  - Department of Urology, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, 200120, China
- Geng Chen
  - School of Medicine, Shanghai Skin Disease Hospital, Tongji University, Shanghai, 200443, China
  - Center for Bioinformatics and Computational Biology, School of Life Sciences, East China Normal University, Shanghai, 200241, China
20
Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023; 19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/12/2023] Open
Abstract
Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
Affiliation(s)
- Malvina Marku
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
  - Barcelona Supercomputing Center, Barcelona, Spain
21
Littman R, Cheng M, Wang N, Peng C, Yang X. SCING: Inference of robust, interpretable gene regulatory networks from single cell and spatial transcriptomics. iScience 2023; 26:107124. [PMID: 37434694 PMCID: PMC10331489 DOI: 10.1016/j.isci.2023.107124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 03/31/2023] [Accepted: 06/09/2023] [Indexed: 07/13/2023] Open
Abstract
Gene regulatory network (GRN) inference is an integral part of understanding physiology and disease. Single cell/nuclei RNA-seq (scRNA-seq/snRNA-seq) data has been used to elucidate cell-type GRNs; however, the accuracy and speed of current scRNAseq-based GRN approaches are suboptimal. Here, we present Single Cell INtegrative Gene regulatory network inference (SCING), a gradient boosting and mutual information-based approach for identifying robust GRNs from scRNA-seq, snRNA-seq, and spatial transcriptomics data. Performance evaluation using Perturb-seq datasets, held-out data, and the mouse cell atlas combined with the DisGeNET database demonstrates the improved accuracy and biological interpretability of SCING compared to existing methods. We applied SCING to the entire mouse single cell atlas, human Alzheimer's disease (AD), and mouse AD spatial transcriptomics. SCING GRNs reveal unique disease subnetwork modeling capabilities, have intrinsic capacity to correct for batch effects, retrieve disease relevant genes and pathways, and are informative on spatial specificity of disease pathogenesis.
Affiliation(s)
- Russell Littman
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Michael Cheng
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Ning Wang
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Chao Peng
  - Department of Neurology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Xia Yang
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
  - Institute for Quantitative and Computational Biosciences (QCBio), Los Angeles, CA, USA
  - Molecular Biology Institute (MBI), Los Angeles, CA, USA
  - Brain Research Institute (BRI), Los Angeles, CA, USA
22
Meng X, Xu P, Tao F. RespectM revealed metabolic heterogeneity powers deep learning for reshaping the DBTL cycle. iScience 2023; 26:107069. [PMID: 37426353 PMCID: PMC10329182 DOI: 10.1016/j.isci.2023.107069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 03/18/2023] [Accepted: 06/05/2023] [Indexed: 07/11/2023] Open
Abstract
Synthetic biology, relying on Design-Build-Test-Learn (DBTL) cycle, aims to solve medicine, manufacturing, and agriculture problems. However, the DBTL cycle's Learn (L) step lacks predictive power for the behavior of biological systems, resulting from the incompatibility between sparse testing data and chaotic metabolic networks. Herein, we develop a method, "RespectM," based on mass spectrometry imaging, which is able to detect metabolites at a rate of 500 cells per hour with high efficiency. In this study, 4,321 single cell level metabolomics data were acquired, representing metabolic heterogeneity. An optimizable deep neural network was applied to learn from metabolic heterogeneity and a "heterogeneity-powered learning (HPL)" based model was trained as well. By testing the HPL based model, we suggest minimal operations to achieve high triglyceride production for engineering. The HPL strategy could revolutionize rational design and reshape the DBTL cycle.
Collapse
Affiliation(s)
- Xuanlin Meng
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
| | - Ping Xu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
- Fei Tao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
23
Theodosiou L, Farr AD, Rainey PB. Barcoding Populations of Pseudomonas fluorescens SBW25. J Mol Evol 2023; 91:254-262. [PMID: 37186220 PMCID: PMC10275814 DOI: 10.1007/s00239-023-10103-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/26/2022] [Accepted: 03/13/2023] [Indexed: 05/17/2023]
Abstract
In recent years, evolutionary biologists have developed an increasing interest in the use of barcoding strategies to study eco-evolutionary dynamics of lineages within evolving populations and communities. Although barcoded populations can deliver unprecedented insight into evolutionary change, barcoding microbes presents specific technical challenges. Here, strategies are described for barcoding populations of the model bacterium Pseudomonas fluorescens SBW25, including the design and cloning of barcoded regions, preparation of libraries for amplicon sequencing, and quantification of resulting barcoded lineages. In so doing, we hope to aid the design and implementation of barcoding methodologies in a broad range of model and non-model organisms.
Affiliation(s)
- Loukas Theodosiou
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany.
- Department of Comparative Development and Genetics, Max Planck Institute for Plant Breeding, Cologne, Germany.
- Andrew D Farr
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Paul B Rainey
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Laboratory of Biophysics and Evolution, CBI, ESPCI Paris, Université PSL, CNRS, Paris, France
24
Li K, Sun YH, Ouyang Z, Negi S, Gao Z, Zhu J, Wang W, Chen Y, Piya S, Hu W, Zavodszky MI, Yalamanchili H, Cao S, Gehrke A, Sheehan M, Huh D, Casey F, Zhang X, Zhang B. scRNASequest: an ecosystem of scRNA-seq analysis, visualization, and publishing. BMC Genomics 2023; 24:228. [PMID: 37131143 PMCID: PMC10155351 DOI: 10.1186/s12864-023-09332-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 04/25/2023] [Indexed: 05/04/2023] Open
Abstract
BACKGROUND Single-cell RNA sequencing is a state-of-the-art technology to understand gene expression in complex tissues. With the growing amount of data being generated, the standardization and automation of data analysis are critical to generating hypotheses and discovering biological insights. RESULTS Here, we present scRNASequest, a semi-automated single-cell RNA-seq (scRNA-seq) data analysis workflow which allows (1) preprocessing from raw UMI count data, (2) harmonization by one or multiple methods, (3) reference-dataset-based cell type label transfer and embedding projection, (4) multi-sample, multi-condition single-cell level differential gene expression analysis, and (5) seamless integration with cellxgene VIP for visualization and with CellDepot for data hosting and sharing by generating compatible h5ad files. CONCLUSIONS We developed scRNASequest, an end-to-end pipeline for single-cell RNA-seq data analysis, visualization, and publishing. The source code under MIT open-source license is provided at https://github.com/interactivereport/scRNASequest . We also prepared a bookdown tutorial for the installation and detailed usage of the pipeline: https://interactivereport.github.io/scRNAsequest/tutorial/docs/ . Users have the option to run it on a local computer with a Linux/Unix system including MacOS, or interact with SGE/Slurm schedulers on high-performance computing (HPC) clusters.
Affiliation(s)
- Kejie Li
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Yu H Sun
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Soumya Negi
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Zhen Gao
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Jing Zhu
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Wanli Wang
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Yirui Chen
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Sarbottam Piya
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Wenxing Hu
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Maria I Zavodszky
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Hima Yalamanchili
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Shaolong Cao
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Andrew Gehrke
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Mark Sheehan
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Dann Huh
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Fergal Casey
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA
- Xinmin Zhang
- Data Science, BioInfoRx Inc., Madison, WI, 53719, USA
- Baohong Zhang
- Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA.
25
Wayman JA, Thomas A, Bejjani A, Katko A, Almanan M, Godarova A, Korinfskaya S, Cazares TA, Yukawa M, Kottyan LC, Barski A, Chougnet CA, Hildeman DA, Miraldi ER. An atlas of gene regulatory networks for memory CD4+ T cells in youth and old age. bioRxiv 2023:2023.03.07.531590. [PMID: 36945549 PMCID: PMC10028906 DOI: 10.1101/2023.03.07.531590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2023]
Abstract
Aging profoundly affects immune-system function, promoting susceptibility to pathogens, cancers and chronic inflammation. We previously identified a population of IL-10-producing, T follicular helper-like cells ("Tfh10"), linked to suppressed vaccine responses in aged mice. Here, we integrate single-cell (sc)RNA-seq, scATAC-seq and genome-scale modeling to characterize Tfh10 - and the full CD4+ memory T cell (CD4+ TM) compartment - in young and old mice. We identified 13 CD4+ TM populations, which we validated through cross-comparison to prior scRNA-seq studies. We built gene regulatory networks (GRNs) that predict transcription-factor control of gene expression in each T-cell population and how these circuits change with age. Through integration with pan-cell aging atlases, we identified intercellular-signaling networks driving age-dependent changes in CD4+ TM. Our atlas of finely resolved CD4+ TM subsets, GRNs and cell-cell communication networks is a comprehensive resource of predicted regulatory mechanisms operative in memory T cells, presenting new opportunities to improve immune responses in the elderly.
26
McCalla SG, Fotuhi Siahpirani A, Li J, Pyne S, Stone M, Periyasamy V, Shin J, Roy S. Identifying strengths and weaknesses of methods for computational network inference from single-cell RNA-seq data. G3 (Bethesda) 2023; 13:jkad004. [PMID: 36626328 PMCID: PMC9997554 DOI: 10.1093/g3journal/jkad004] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/09/2022] [Revised: 11/09/2022] [Accepted: 12/16/2022] [Indexed: 01/11/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) offers unparalleled insight into the transcriptional programs of different cellular states by measuring the transcriptome of thousands of individual cells. An emerging problem in the analysis of scRNA-seq is the inference of transcriptional gene regulatory networks and a number of methods with different learning frameworks have been developed to address this problem. Here, we present an expanded benchmarking study of eleven recent network inference methods on seven published scRNA-seq datasets in human, mouse, and yeast considering different types of gold standard networks and evaluation metrics. We evaluate methods based on their computing requirements as well as on their ability to recover the network structure. We find that, while most methods have a modest recovery of experimentally derived interactions based on global metrics such as Area Under the Precision Recall curve, methods are able to capture targets of regulators that are relevant to the system under study. Among the top performing methods that use only expression were SCENIC, PIDC, MERLIN or Correlation. Addition of prior biological knowledge and the estimation of transcription factor activities resulted in the best overall performance with the Inferelator and MERLIN methods that use prior knowledge outperforming methods that use expression alone. We found that imputation for network inference did not improve network inference accuracy and could be detrimental. Comparisons of inferred networks for comparable bulk conditions showed that the networks inferred from scRNA-seq datasets are often better or at par with the networks inferred from bulk datasets. Our analysis should be beneficial in selecting methods for network inference. At the same time, this highlights the need for improved methods and better gold standards for regulatory network inference from scRNAseq datasets.
Affiliation(s)
- Sunnie Grace McCalla
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jiaxin Li
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Saptarshi Pyne
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Matthew Stone
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Viswesh Periyasamy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
- Junha Shin
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
27
Advances in Mass Spectrometry-Based Single Cell Analysis. Biology 2023; 12:biology12030395. [PMID: 36979087 PMCID: PMC10045136 DOI: 10.3390/biology12030395] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/30/2022] [Revised: 02/27/2023] [Accepted: 03/01/2023] [Indexed: 03/06/2023]
Abstract
Technological developments and improvements in single-cell isolation and analytical platforms allow for advanced molecular profiling at the single-cell level, which reveals cell-to-cell variation within the admixture cells in complex biological or clinical systems. This helps to understand the cellular heterogeneity of normal or diseased tissues and organs. However, most studies focused on the analysis of nucleic acids (e.g., DNA and RNA) and mass spectrometry (MS)-based analysis for proteins and metabolites of a single cell lagged until recently. Undoubtedly, MS-based single-cell analysis will provide a deeper insight into cellular mechanisms related to health and disease. This review summarizes recent advances in MS-based single-cell analysis methods and their applications in biology and medicine.
28
Kamrad S, Correia-Melo C, Szyrwiel L, Aulakh SK, Bähler J, Demichev V, Mülleder M, Ralser M. Metabolic heterogeneity and cross-feeding within isogenic yeast populations captured by DILAC. Nat Microbiol 2023; 8:441-454. [PMID: 36797484 PMCID: PMC9981460 DOI: 10.1038/s41564-022-01304-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 06/30/2022] [Accepted: 12/13/2022] [Indexed: 02/18/2023]
Abstract
Genetically identical cells are known to differ in many physiological parameters such as growth rate and drug tolerance. Metabolic specialization is believed to be a cause of such phenotypic heterogeneity, but detection of metabolically divergent subpopulations remains technically challenging. We developed a proteomics-based technology, termed differential isotope labelling by amino acids (DILAC), that can detect producer and consumer subpopulations of a particular amino acid within an isogenic cell population by monitoring peptides with multiple occurrences of the amino acid. We reveal that young, morphologically undifferentiated yeast colonies contain subpopulations of lysine producers and consumers that emerge due to nutrient gradients. Deconvoluting their proteomes using DILAC, we find evidence for in situ cross-feeding where rapidly growing cells ferment and provide the more slowly growing, respiring cells with ethanol. Finally, by combining DILAC with fluorescence-activated cell sorting, we show that the metabolic subpopulations diverge phenotypically, as exemplified by a different tolerance to the antifungal drug amphotericin B. Overall, DILAC captures previously unnoticed metabolic heterogeneity and provides experimental evidence for the role of metabolic specialization and cross-feeding interactions as a source of phenotypic heterogeneity in isogenic cell populations.
Affiliation(s)
- Stephan Kamrad
- Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK
- Clara Correia-Melo
- Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK
- Lukasz Szyrwiel
- Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
- Simran Kaur Aulakh
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK
- Jürg Bähler
- Institute of Healthy Ageing and Department of Genetics, Evolution and Environment, University College London, London, UK
- Vadim Demichev
- Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK
- Michael Mülleder
- Core Facility-High-Throughput Mass Spectrometry, Charité Universitätsmedizin Berlin, Berlin, Germany
- Markus Ralser
- Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany.
- Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK.
- The Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK.
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
29
Abid D, Brent MR. NetProphet 3: a machine learning framework for transcription factor network mapping and multi-omics integration. Bioinformatics 2023; 39:7000334. [PMID: 36692138 PMCID: PMC9912366 DOI: 10.1093/bioinformatics/btad038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 01/11/2023] [Accepted: 01/18/2023] [Indexed: 01/25/2023] Open
Abstract
MOTIVATION Many methods have been proposed for mapping the targets of transcription factors (TFs) from gene expression data. It is known that combining outputs from multiple methods can improve performance. To date, outputs have been combined by using either simplistic formulae, such as geometric mean, or carefully hand-tuned formulae that may not generalize well to new inputs. Finally, the evaluation of accuracy has been challenging due to the lack of genome-scale, ground-truth networks. RESULTS We developed NetProphet3, which combines scores from multiple analyses automatically, using a tree boosting algorithm trained on TF binding location data. We also developed three independent, genome-scale evaluation metrics. By these metrics, NetProphet3 is more accurate than other commonly used packages, including NetProphet 2.0, when gene expression data from direct TF perturbations are available. Furthermore, its integration mode can forge a consensus network from gene expression data and TF binding location data. AVAILABILITY AND IMPLEMENTATION All data and code are available at https://zenodo.org/record/7504131#.Y7Wu3i-B2x8. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dhoha Abid
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
- Michael R Brent
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
30
Tjärnberg A, Beheler-Amass M, Jackson CA, Christiaen LA, Gresham D, Bonneau R. Structure primed embedding on the transcription factor manifold enables transparent model architectures for gene regulatory network and latent activity inference. bioRxiv 2023:2023.02.02.526909. [PMID: 36778259 PMCID: PMC9915715 DOI: 10.1101/2023.02.02.526909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
The modeling of gene regulatory networks (GRNs) is limited due to a lack of direct measurements of regulatory features in genome-wide screens. Most GRN inference methods are therefore forced to model relationships between regulatory genes and their targets with expression as a proxy for the upstream independent features, complicating validation and predictions produced by modeling frameworks. Separating covariance and regulatory influence requires aggregation of independent and complementary sets of evidence, such as transcription factor (TF) binding and target gene expression. However, the complete regulatory state of the system, e.g. TF activity (TFA) is unknown due to a lack of experimental feasibility, making regulatory relations difficult to infer. Some methods attempt to account for this by modeling TFA as a latent feature, but these models often use linear frameworks that are unable to account for non-linearities such as saturation, TF-TF interactions, and other higher order features. Deep learning frameworks may offer a solution, as they are capable of modeling complex interactions and capturing higher-order latent features. However, these methods often discard central concepts in biological systems modeling, such as sparsity and latent feature interpretability, in favor of increased model complexity. We propose a novel deep learning autoencoder-based framework, StrUcture Primed Inference of Regulation using latent Factor ACTivity (SupirFactor), that scales to single cell genomic data and maintains interpretability to perform GRN inference and estimate TFA as a latent feature. We demonstrate that SupirFactor outperforms current leading GRN inference methods, predicts biologically relevant TFA and elucidates functional regulatory pathways through aggregation of TFs.
Affiliation(s)
- Andreas Tjärnberg
- Center for Developmental Genetics, New York University, New York 10003 NY, USA
- Center For Genomics and Systems Biology, NYU, New York, NY 10008, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, 10010, USA
- Maggie Beheler-Amass
- Center For Genomics and Systems Biology, NYU, New York, NY 10008, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Christopher A Jackson
- Center For Genomics and Systems Biology, NYU, New York, NY 10008, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Lionel A Christiaen
- Center for Developmental Genetics, New York University, New York 10003 NY, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- Department of Heart Disease, Haukeland University Hospital, Bergen, Norway
- David Gresham
- Center For Genomics and Systems Biology, NYU, New York, NY 10008, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Richard Bonneau
- Center For Genomics and Systems Biology, NYU, New York, NY 10008, USA
- Department of Biology, NYU, New York, NY 10008, USA
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY 10010, USA
- Courant Institute of Mathematical Sciences, Computer Science Department, New York University, New York, NY 10003, USA
- Center For Data Science, NYU, New York, NY 10008, USA
- Prescient Design, a Genentech accelerator, New York, NY, 10010, USA
31
Fan Z, Luo Y, Lu H, Wang T, Feng Y, Zhao W, Kim P, Zhou X. SPASCER: spatial transcriptomics annotation at single-cell resolution. Nucleic Acids Res 2023; 51:D1138-D1149. [PMID: 36243975 PMCID: PMC9825565 DOI: 10.1093/nar/gkac889] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/15/2022] [Revised: 09/21/2022] [Accepted: 10/13/2022] [Indexed: 01/30/2023] Open
Abstract
In recent years, the explosive growth of spatial technologies has enabled the characterization of spatial heterogeneity of tissue architectures. Compared to traditional sequencing, spatial transcriptomics reserves the spatial information of each captured location and provides novel insights into diverse spatially related biological contexts. Even though two spatial transcriptomics databases exist, they provide limited analytical information. Information such as spatial heterogeneity of genes and cells, cell-cell communication activities in space, and the cell type compositions in the microenvironment are critical clues to unveil the mechanism of tumorigenesis and embryo differentiation. Therefore, we constructed a new spatial transcriptomics database, named SPASCER (https://ccsm.uth.edu/SPASCER), designed to help understand the heterogeneity of tissue organizations, region-specific microenvironment, and intercellular interactions across tissue architectures at multiple levels. SPASCER contains datasets from 43 studies, including 1082 sub-datasets from 16 organ types across four species. scRNA-seq was integrated to deconvolve/map spatial transcriptomics, and processed with spatial cell-cell interaction, gene pattern and pathway enrichment analysis. Cell-cell interactions and gene regulation network of scRNA-seq from matched spatial transcriptomics were performed as well. The application of SPASCER will provide new insights into tissue architecture and a solid foundation for the mechanistic understanding of many biological processes in healthy and diseased tissues.
Affiliation(s)
- Zhiwei Fan
- West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu 610041, China
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yangyang Luo
- West China Hospital, Sichuan University, Chengdu 610041, China
- Huifen Lu
- West China Hospital, Sichuan University, Chengdu 610041, China
- Tiangang Wang
- School of Life Science and Technology, Xidian University, Xi’an 710126, China
- YuZhou Feng
- West China Hospital, Sichuan University, Chengdu 610041, China
- Weiling Zhao
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Pora Kim
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
32
A two-phase gene selection method using anomaly detection and genetic algorithm for microarray data. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2022.110249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
33
Galindez G, Sadegh S, Baumbach J, Kacprowski T, List M. Network-based approaches for modeling disease regulation and progression. Comput Struct Biotechnol J 2022; 21:780-795. [PMID: 36698974 PMCID: PMC9841310 DOI: 10.1016/j.csbj.2022.12.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022] Open
Abstract
Molecular interaction networks lay the foundation for studying how biological functions are controlled by the complex interplay of genes and proteins. Investigating perturbed processes using biological networks has been instrumental in uncovering mechanisms that underlie complex disease phenotypes. Rapid advances in omics technologies have prompted the generation of high-throughput datasets, enabling large-scale, network-based analyses. Consequently, various modeling techniques, including network enrichment, differential network extraction, and network inference, have proven to be useful for gaining new mechanistic insights. We provide an overview of recent network-based methods and their core ideas to facilitate the discovery of disease modules or candidate mechanisms. Knowledge generated from these computational efforts will benefit biomedical research, especially drug development and precision medicine. We further discuss current challenges and provide perspectives in the field, highlighting the need for more integrative and dynamic network approaches to model disease development and progression.
Affiliation(s)
- Gihanna Galindez
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany; Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Sepideh Sadegh
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany; Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany; Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany; Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
34
Cervantes-Pérez SA, Thibivilliers S, Tennant S, Libault M. Review: Challenges and perspectives in applying single nuclei RNA-seq technology in plant biology. Plant Sci 2022; 325:111486. [PMID: 36202294 DOI: 10.1016/j.plantsci.2022.111486] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/31/2022] [Revised: 09/12/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Plant single-cell RNA-seq technology quantifies the abundance of plant transcripts at a single-cell resolution. Deciphering the transcriptomes of each plant cell, their regulation during plant cell development, and their response to environmental stresses will support the functional study of genes, the establishment of precise transcriptional programs, the prediction of more accurate gene regulatory networks, and, in the long term, the design of de novo gene pathways to enhance selected crop traits. In this review, we will discuss the opportunities, challenges, and problems, and share tentative solutions associated with the generation and analysis of plant single-cell transcriptomes. We will discuss the benefit and limitations of using plant protoplasts vs. nuclei to conduct single-cell RNA-seq experiments on various plant species and organs, the functional annotation of plant cell types based on their transcriptomic profile, the characterization of the dynamic regulation of the plant genes during cell development or in response to environmental stress, the need to characterize and integrate additional layers of -omics datasets to capture new molecular modalities at the single-cell level and reveal their causalities, the deposition and access to single-cell datasets, and the accessibility of this technology to plant scientists.
Affiliation(s)
- Sergio Alan Cervantes-Pérez
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
- Sandra Thibivilliers
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA; Center for Biotechnology, University of Nebraska, Lincoln, NE 68588, USA; Single Cell Genomics Core Facility, University of Nebraska-Lincoln, NE 68588, USA
- Sutton Tennant
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
| | - Marc Libault
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA; Center for Biotechnology, University of Nebraska, Lincoln, NE 68588, USA; Single Cell Genomics Core Facility, University of Nebraska-Lincoln, NE 68588, USA.
| |
|
35
|
Yang Y, Chaffin TA, Ahkami AH, Blumwald E, Stewart CN. Plant synthetic biology innovations for biofuels and bioproducts. Trends Biotechnol 2022; 40:1454-1468. [PMID: 36241578 DOI: 10.1016/j.tibtech.2022.09.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/23/2022] [Revised: 08/26/2022] [Accepted: 09/15/2022] [Indexed: 01/21/2023]
Abstract
Plant-based biosynthesis of fuels, chemicals, and materials promotes environmental sustainability, which includes decreases in greenhouse gas emissions, water pollution, and loss of biodiversity. Advances in plant synthetic biology (synbio) should improve precision and efficacy of genetic engineering for sustainability. Applicable synbio innovations include genome editing, gene circuit design, synthetic promoter development, gene stacking technologies, and the design of environmental sensors. Moreover, recent advancements in developing spatially resolved and single-cell omics contribute to the discovery and characterization of cell-type-specific mechanisms and spatiotemporal gene regulations in distinct plant tissues for the expression of cell- and tissue-specific genes, resulting in improved bioproduction. This review highlights recent plant synbio progress and new single-cell molecular profiling towards sustainable biofuel and biomaterial production.
Affiliation(s)
- Yongil Yang
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, USA; Department of Plant Sciences, University of Tennessee, Knoxville, TN, USA
| | - Timothy Alexander Chaffin
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, USA; Department of Plant Sciences, University of Tennessee, Knoxville, TN, USA
| | - Amir H Ahkami
- Environmental Molecular Sciences Laboratory (EMSL), Pacific Northwest National Laboratory (PNNL), Richland, WA, USA
| | - Eduardo Blumwald
- Department of Plant Sciences, University of California, Davis, CA, USA
| | - Charles Neal Stewart
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, USA; Department of Plant Sciences, University of Tennessee, Knoxville, TN, USA; Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
| |
|
36
|
Moreno M, Vilaça R, Ferreira PG. Scalable transcriptomics analysis with Dask: applications in data science and machine learning. BMC Bioinformatics 2022; 23:514. [PMID: 36451115 PMCID: PMC9710082 DOI: 10.1186/s12859-022-05065-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 11/16/2022] [Indexed: 12/02/2022] Open
Abstract
Background: Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. Methods: In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. Results: This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask. Conclusion: By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.
Affiliation(s)
- Marta Moreno
- Department of Computer Science, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal; Laboratory of Artificial Intelligence and Decision Support, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
| | - Ricardo Vilaça
- High-Assurance Software Laboratory, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal; Department of Informatics, Minho Advanced Computing Center, University of Minho, Gualtar, 4710-070 Braga, Portugal
| | - Pedro G. Ferreira
- Department of Computer Science, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal; Laboratory of Artificial Intelligence and Decision Support, INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal; Institute of Molecular Pathology and Immunology of the University of Porto, Institute for Research and Innovation in Health (i3s), R. Alfredo Allen 208, 4200-135 Porto, Portugal
| |
|
37
|
High-throughput approaches to functional characterization of genetic variation in yeast. Curr Opin Genet Dev 2022; 76:101979. [PMID: 36075138 DOI: 10.1016/j.gde.2022.101979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/20/2022]
Abstract
Expansion of sequencing efforts to include thousands of genomes is providing a fundamental resource for determining the genetic diversity that exists in a population. Now, high-throughput approaches are necessary to begin to understand the role these genotypic changes play in affecting phenotypic variation. Saccharomyces cerevisiae maintains its position as an excellent model system to determine the function of unknown variants with its exceptional genetic diversity, phenotypic diversity, and reliable genetic manipulation tools. Here, we review strategies and techniques developed in yeast that scale classic approaches of assessing variant function. These approaches improve our ability to better map quantitative trait loci at a higher resolution, even for rare variants, and are already providing greater insight into the role that different types of mutations play in phenotypic variation and evolution not just in yeast but across taxa.
|
38
|
Kartha VK, Duarte FM, Hu Y, Ma S, Chew JG, Lareau CA, Earl A, Burkett ZD, Kohlway AS, Lebofsky R, Buenrostro JD. Functional inference of gene regulation using single-cell multi-omics. Cell Genomics 2022; 2. [PMID: 36204155 PMCID: PMC9534481 DOI: 10.1016/j.xgen.2022.100166] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Indexed: 01/21/2023]
Abstract
Cells require coordinated control over gene expression when responding to environmental stimuli. Here we apply scATAC-seq and single-cell RNA sequencing (scRNA-seq) in resting and stimulated human blood cells. Collectively, we generate ~91,000 single-cell profiles, allowing us to probe the cis-regulatory landscape of the immunological response across cell types, stimuli, and time. Advancing tools to integrate multi-omics data, we develop functional inference of gene regulation (FigR), a framework to computationally pair scATAC-seq with scRNA-seq cells, connect distal cis-regulatory elements to genes, and infer gene-regulatory networks (GRNs) to identify candidate transcription factor (TF) regulators. Utilizing these paired multi-omics data, we define domains of regulatory chromatin (DORCs) of immune stimulation and find that cells alter chromatin accessibility and gene expression at timescales of minutes. Construction of the stimulation GRN elucidates TF activity at disease-associated DORCs. Overall, FigR enables elucidation of regulatory interactions across single-cell data, providing new opportunities to understand the function of cells within tissues. Single-cell methods for measuring chromatin accessibility (ATAC-seq) and gene expression (RNA-seq) are rapidly evolving, but tools to integrate data and infer gene-regulatory relationships remain limited. Here we generate multi-omics data of resting and stimulated human blood cells and present a new computational framework for constructing gene-regulatory networks (GRNs). Specifically, we describe functional inference of gene regulation (FigR), a workflow to (1) pair scATAC-seq with scRNA-seq, (2) connect cis-regulatory elements to target genes, and (3) identify TF-gene relationships.
Affiliation(s)
- Vinay K. Kartha
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Fabiana M. Duarte
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Yan Hu
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Sai Ma
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Caleb A. Lareau
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Andrew Earl
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | | | | | - Jason D. Buenrostro
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Corresponding author
| |
|
39
|
Galán-Vásquez E, Gómez-García MDC, Pérez-Rueda E. A landscape of gene regulation in the parasitic amoebozoa Entamoeba spp. PLoS One 2022; 17:e0271640. [PMID: 35913975 PMCID: PMC9342746 DOI: 10.1371/journal.pone.0271640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Accepted: 07/05/2022] [Indexed: 11/27/2022] Open
Abstract
Entamoeba are amoeboid extracellular parasites that represent an important group of organisms for which the regulatory networks must be examined to better understand how genes and functional processes are interrelated. In this work, we inferred the gene regulatory networks (GRNs) in four Entamoeba species, E. histolytica, E. dispar, E. nuttalli, and E. invadens, and the GRN topological properties and the corresponding biological functions were evaluated. From these analyses, we determined that transcription factors (TFs) of E. histolytica, E. dispar, and E. nuttalli are associated mainly with the LIM family, while the TFs in E. invadens are associated with the RRM_1 family. In addition, we identified that EHI_044890 regulates 121 genes in E. histolytica, EDI_297980 regulates 284 genes in E. dispar, ENU1_120230 regulates 195 genes in E. nuttalli, and EIN_249270 regulates 257 genes in E. invadens. Finally, we identified that three types of processes, Macromolecule metabolic process, Cellular macromolecule metabolic process, and Cellular nitrogen compound metabolic process, are the main biological processes for each network. The results described in this work can be used as a basis for the study of gene regulation in these organisms.
Affiliation(s)
- Edgardo Galán-Vásquez
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México, México
- Corresponding authors: EG-V, EP-R
| | - María del Consuelo Gómez-García
- Laboratorio de Biomedicina Molecular, Escuela Nacional de Medicina y Homeopatía, Instituto Politécnico Nacional, Ciudad de México, México
| | - Ernesto Pérez-Rueda
- Unidad Académica Yucatán, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mérida, Yucatán, México
- Corresponding authors: EG-V, EP-R
| |
|
40
|
Brettner L, Ho WC, Schmidlin K, Apodaca S, Eder R, Geiler-Samerotte K. Challenges and potential solutions for studying the genetic and phenotypic architecture of adaptation in microbes. Curr Opin Genet Dev 2022; 75:101951. [PMID: 35797741 DOI: 10.1016/j.gde.2022.101951] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/01/2022] [Revised: 06/01/2022] [Accepted: 06/14/2022] [Indexed: 11/29/2022]
Abstract
All organisms are defined by the makeup of their DNA. Over billions of years, the structure and information contained in that DNA, often referred to as genetic architecture, have been honed by a multitude of evolutionary processes. Mutations that cause genetic elements to change in a way that results in beneficial phenotypic change are more likely to survive and propagate through the population in a process known as adaptation. Recent work reveals that the genetic targets of adaptation are varied and can change with genetic background. Further, seemingly similar adaptive mutations, even within the same gene, can have diverse and unpredictable effects on phenotype. These challenges represent major obstacles in predicting adaptation and evolution. In this review, we cover these concepts in detail and identify three emerging synergistic solutions: higher-throughput evolution experiments combined with updated genotype-phenotype mapping strategies and physiological models. Our review largely focuses on recent literature in yeast, and the field seems to be on the cusp of a new era with regard to studying the predictability of evolution.
|
41
|
Morphogen-directed cell fate boundaries: slow passage through bifurcation and the role of folded saddles. J Theor Biol 2022; 549:111220. [PMID: 35839857 DOI: 10.1016/j.jtbi.2022.111220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 06/24/2022] [Accepted: 07/06/2022] [Indexed: 11/21/2022]
Abstract
One of the fundamental mechanisms in embryogenesis is the process by which cells differentiate and create tissues and structures important for functioning as a multicellular organism. Morphogenesis involves diffusive process of chemical signalling involving morphogens that pre-pattern the tissue. These morphogens influence cell fate through a highly nonlinear process of transcriptional signalling. In this paper, we consider this multiscale process in an idealised model for a growing domain. We focus on intracellular processes that lead to robust differentiation into two cell lineages through interaction of a single morphogen species with a cell fate variable that undergoes a bifurcation from monostability to bistability. In particular, we investigate conditions that result in successful and robust pattern formation into two well-separated domains, as well as conditions where this fails and produces a pinned boundary wave where only one part of the domain grows. We show that successful and unsuccessful patterning scenarios can be characterised in terms of presence or absence of a folded saddle singularity for a system with two slow variables and one fast variable; this models the interaction of slow morphogen diffusion, slow parameter drift through bifurcation and fast transcription dynamics. We illustrate how this approach can successfully model acquisition of three cell fates to produce three-domain "French flag" patterning, as well as for a more realistic model of the cell fate dynamics in terms of two mutually inhibiting transcription factors.
|
42
|
Cano R, Lenz AR, Galan-Vasquez E, Ramirez-Prado JH, Perez-Rueda E. Gene Regulatory Network Inference and Gene Module Regulating Virulence in Fusarium oxysporum. Front Microbiol 2022; 13:861528. [PMID: 35722316 PMCID: PMC9201490 DOI: 10.3389/fmicb.2022.861528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2022] [Accepted: 05/09/2022] [Indexed: 11/20/2022] Open
Abstract
In this work, we inferred the gene regulatory network (GRN) of the fungus Fusarium oxysporum by using the regulatory networks of Aspergillus nidulans FGSC A4, Neurospora crassa OR74A, Saccharomyces cerevisiae S288c, and Fusarium graminearum PH-1 as templates for sequence comparisons. Topological properties to infer the role of transcription factors (TFs) and to identify functional modules were calculated in the GRN. From these analyses, five TFs were identified as hubs, including FOXG_04688 and FOXG_05432, which regulate 2,404 and 1,864 target genes, respectively. In addition, 16 communities were identified in the GRN, where the largest contains 1,923 genes and the smallest contains 227 genes. Finally, the genes associated with virulence were extracted from the GRN and exhaustively analyzed, and we identified a giant module with ten TFs and 273 target genes, where the most highly connected node corresponds to the transcription factor FOXG_05265, homologous to the putative bZip transcription factor CPTF1 of Claviceps purpurea, which is involved in ergotism disease that affects cereal crops and grasses. The results described in this work can be used for the study of gene regulation in this organism and open the possibility to explore putative genes associated with virulence against their host.
Affiliation(s)
- Regnier Cano
- Centro de Investigaciones Científicas de Yucatán, Mérida, Mexico
| | - Alexandre Rafael Lenz
- Departamento de Ciências Exatas e da Terra, Universidade do Estado da Bahia, Salvador, Brazil
| | - Edgardo Galan-Vasquez
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico, Mexico
| | | | - Ernesto Perez-Rueda
- Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Unidad Académica Yucatán Universidad Nacional Autónoma de México, Mérida, Mexico
| |
|
43
|
Lasri A, Shahrezaei V, Sturrock M. Benchmarking imputation methods for network inference using a novel method of synthetic scRNA-seq data generation. BMC Bioinformatics 2022; 23:236. [PMID: 35715748 PMCID: PMC9204969 DOI: 10.1186/s12859-022-04778-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 05/31/2022] [Indexed: 11/30/2022] Open
Abstract
Background: Single cell RNA-sequencing (scRNA-seq) has very rapidly become the new workhorse of modern biology providing an unprecedented global view on cellular diversity and heterogeneity. In particular, the structure of gene-gene expression correlation contains information on the underlying gene regulatory networks. However, interpretation of scRNA-seq data is challenging due to specific experimental error and biases that are unique to this kind of data including drop-out (or technical zeros). Methods: To deal with this problem several methods for imputation of zeros for scRNA-seq have been developed. However, it is not clear how these processing steps affect inference of genetic networks from single cell data. Here, we introduce Biomodelling.jl, a tool for generation of synthetic scRNA-seq data using multiscale modelling of stochastic gene regulatory networks in growing and dividing cells. Results: Our tool produces realistic transcription data with a known ground truth network topology that can be used to benchmark different approaches for gene regulatory network inference. Using this tool we investigate the impact of different imputation methods on the performance of several network inference algorithms. Conclusions: Biomodelling.jl provides a versatile and useful tool for future development and benchmarking of network inference approaches using scRNA-seq data. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04778-9.
Affiliation(s)
- Ayoub Lasri
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Vahid Shahrezaei
- Department of Mathematics, Faculty of Natural Sciences, Imperial College London, London, SW7 2AZ, UK
| | - Marc Sturrock
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, Ireland.
| |
|
44
|
Gonçalves LO, Pulido AFV, Mathias FAS, Enes AES, Carvalho MGR, de Melo Resende D, Polak ME, Ruiz JC. Expression Profile of Genes Related to the Th17 Pathway in Macrophages Infected by Leishmania major and Leishmania amazonensis: The Use of Gene Regulatory Networks in Modeling This Pathway. Front Cell Infect Microbiol 2022; 12:826523. [PMID: 35774406 PMCID: PMC9239034 DOI: 10.3389/fcimb.2022.826523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Accepted: 03/09/2022] [Indexed: 11/13/2022] Open
Abstract
Leishmania amazonensis and Leishmania major are the causative agents of cutaneous and mucocutaneous diseases. The infections' outcome depends on host–parasite interactions and Th1/Th2 response, and in cutaneous form, regulation of Th17 cytokines has been reported to maintain inflammation in lesions. Despite that, the Th17 regulatory scenario remains unclear. With the aim to gain a better understanding of the transcription factors (TFs) and genes involved in Th17 induction, in this study, the role of inducing factors of the Th17 pathway in Leishmania–macrophage infection was addressed through computational modeling of gene regulatory networks (GRNs). The Th17 GRN modeling integrated experimentally validated data available in the literature and gene expression data from a time-series RNA-seq experiment (4, 24, 48, and 72 h post-infection). The generated model comprises a total of 10 TFs, 22 coding genes, and 16 cytokines related to the Th17 immune modulation. Addressing the Th17 induction in infected and uninfected macrophages, an increase of 2- to 3-fold in 4–24 h was observed in the former. However, there was a decrease in basal levels at 48–72 h for both groups. In order to evaluate the possible outcomes triggered by GRN component modulation in the Th17 pathway. The generated GRN models promoted an integrative and dynamic view of Leishmania–macrophage interaction over time that extends beyond the analysis of single-gene expression.
Affiliation(s)
- Leilane Oliveira Gonçalves
- Programa de Pós-graduação em Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | - Andrés F. Vallejo Pulido
- Systems Immunology Group, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
| | | | - Alexandre Estevão Silvério Enes
- Programa de Pós-graduação em Biologia Computacional e Sistemas, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | | | - Daniela de Melo Resende
- Grupo Genômica Funcional de Parasitos, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
| | - Marta E. Polak
- Systems Immunology Group, Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- *Correspondence: Jeronimo C. Ruiz; Marta E. Polak
| | - Jeronimo C. Ruiz
- Grupo Informática de Biossistemas, Instituto René Rachou, Fiocruz Minas, Belo Horizonte, Brazil
- *Correspondence: Jeronimo C. Ruiz; Marta E. Polak
| |
|
45
|
Zhang Y, He Y, Chen Q, Yang Y, Gong M. Fusion prior gene network for high reliable single-cell gene regulatory network inference. Comput Biol Med 2022; 143:105279. [PMID: 35134605 DOI: 10.1016/j.compbiomed.2022.105279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Revised: 01/25/2022] [Accepted: 01/29/2022] [Indexed: 11/03/2022]
Abstract
Single-cell RNA sequencing technology provides an opportunity to discover the gene regulatory networks (GRNs) that control cell differentiation and drive cell type transformation. However, it faces the challenge of high data loss and high noise in the sequencing data, and the inferred networks contain many pseudo-connections. To solve these problems, we propose a framework called Fusion prior gene network for Gene Regulatory Network inference Accuracy Enhancement (FGRNAE) to infer a highly reliable gene regulatory network. Specifically, based on the Single-Cell RNA-sequencing Network Propagation and network Fusion (scNPF) preprocessing framework, we employ the Random Walk with Restart on the prior gene network to interpolate the missing data. Furthermore, we infer the network using the Random Forest algorithm with the results achieved above. In addition, we apply data from the Co-Function Network to build a meta-gene network and select the regulatory connection with the Markov Random Field. Extensive experiments based on datasets from BEELINE validate the effectiveness of our framework for improving the accuracy of inference.
Affiliation(s)
- Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Yuchen He
- School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
| | - Qingyuan Chen
- School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
| | - Yihan Yang
- International College, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Meiqin Gong
- West China Second University Hospital, Sichuan University, Chengdu, 610041, China.
| |
|
46
|
Applications of cell- and tissue-specific 'omics to improve plant productivity. Emerg Top Life Sci 2022; 6:163-173. [PMID: 35293572 PMCID: PMC9023014 DOI: 10.1042/etls20210286] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/20/2021] [Revised: 02/21/2022] [Accepted: 02/25/2022] [Indexed: 01/05/2023]
Abstract
The individual tissues and cell types of plants each have characteristic properties that contribute to the function of the plant as a whole. These are reflected by unique patterns of gene expression, protein and metabolite content, which enable cell-type-specific patterns of growth, development and physiology. Gene regulatory networks act within the cell types to govern the production and activity of these components. For the broader organism to grow and reproduce successfully, cell-type-specific activity must also function within the context of surrounding cell types, which is achieved by coordination of signalling pathways. We can investigate how gene regulatory networks are constructed and function using integrative ‘omics technologies. Historically such experiments in plant biological research have been performed at the bulk tissue level, to organ resolution at best. In this review, we describe recent advances in cell- and tissue-specific ‘omics technologies that allow investigation at much improved resolution. We discuss the advantages of these approaches for fundamental and translational plant biology, illustrated through the examples of specialised metabolism in medicinal plants and seed germination. We also discuss the challenges that must be overcome for such approaches to be adopted widely by the community.
|
47
|
|
48
|
Hegenbarth JC, Lezzoche G, De Windt LJ, Stoll M. Perspectives on Bulk-Tissue RNA Sequencing and Single-Cell RNA Sequencing for Cardiac Transcriptomics. Frontiers in Molecular Medicine 2022; 2:839338. [PMID: 39086967 PMCID: PMC11285642 DOI: 10.3389/fmmed.2022.839338] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/19/2021] [Accepted: 01/31/2022] [Indexed: 08/02/2024]
Abstract
The heart has been the center of numerous transcriptomic studies in the past decade. Even though our knowledge of the key organ in our cardiovascular system has significantly increased over the last years, it is still not fully understood yet. In recent years, extensive efforts were made to understand the genetic and transcriptomic contribution to cardiac function and failure in more detail. The advent of Next Generation Sequencing (NGS) technologies has brought many discoveries but it is unable to comprehend the finely orchestrated interactions between and within the various cell types of the heart. With the emergence of single-cell sequencing more than 10 years ago, researchers gained a valuable new tool to enable the exploration of new subpopulations of cells, cell-cell interactions, and integration of multi-omic approaches at a single-cell resolution. Despite this innovation, it is essential to make an informed choice regarding the appropriate technique for transcriptomic studies, especially when working with myocardial tissue. Here, we provide a primer for researchers interested in transcriptomics using NGS technologies.
Affiliation(s)
- Jana-Charlotte Hegenbarth
- Department of Molecular Genetics, Faculty of Science and Engineering, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
- Giuliana Lezzoche
- Department of Molecular Genetics, Faculty of Science and Engineering, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
- Leon J. De Windt
- Department of Molecular Genetics, Faculty of Science and Engineering, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
- Monika Stoll
- Department of Biochemistry, CARIM School for Cardiovascular Diseases, Maastricht University, Maastricht, Netherlands
- Department of Genetic Epidemiology, Institute of Human Genetics, University Hospital Münster, Münster, Germany
|
49
|
Gibbs CS, Jackson CA, Saldi GA, Tjärnberg A, Shah A, Watters A, De Veaux N, Tchourine K, Yi R, Hamamsy T, Castro DM, Carriero N, Gorissen BL, Gresham D, Miraldi ER, Bonneau R. High performance single-cell gene regulatory network inference at scale: The Inferelator 3.0. Bioinformatics 2022; 38:2519-2528. [PMID: 35188184 PMCID: PMC9048651 DOI: 10.1093/bioinformatics/btac117] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 05/25/2021] [Revised: 12/08/2021] [Accepted: 02/17/2022] [Indexed: 12/04/2022] Open
Abstract
Motivation: Gene regulatory networks define regulatory relationships between transcription factors and target genes within a biological system, and reconstructing them is essential for understanding cellular growth and function. Methods for inferring and reconstructing networks from genomics data have evolved rapidly over the last decade in response to advances in sequencing technology and machine learning. The scale of data collection has increased dramatically; the largest genome-wide gene expression datasets have grown from thousands of measurements to millions of single cells, and new technologies are on the horizon to increase to tens of millions of cells and above. Results: In this work, we present the Inferelator 3.0, which has been significantly updated to integrate data from distinct cell types to learn context-specific regulatory networks and aggregate them into a shared regulatory network, while retaining the functionality of the previous versions. The Inferelator is able to integrate the largest single-cell datasets and learn cell-type-specific gene regulatory networks. Compared to other network inference methods, the Inferelator learns new and informative Saccharomyces cerevisiae networks from single-cell gene expression data, measured by recovery of a known gold standard. We demonstrate its scaling capabilities by learning networks for multiple distinct neuronal and glial cell types in the developing Mus musculus brain at E18 from a large (1.3 million) single-cell gene expression dataset with paired single-cell chromatin accessibility data. Availability and implementation: The inferelator software is available on GitHub (https://github.com/flatironinstitute/inferelator) under the MIT license and has been released as python packages with associated documentation (https://inferelator.readthedocs.io/). Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claudia Skok Gibbs
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, USA; Center For Data Science, NYU, New York, NY, USA
- Christopher A Jackson
- Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA
- Giuseppe-Antonio Saldi
- Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA
- Andreas Tjärnberg
- Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA
- Aashna Shah
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, USA
- Aaron Watters
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, USA
- Nicholas De Veaux
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, USA
- Ren Yi
- Courant Institute of Mathematical Sciences, Computer Science Department, NYU, New York, NY, USA
- Dayanne M Castro
- Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA
- Nicholas Carriero
- Flatiron Institute, Scientific Computing Core, Simons Foundation, New York, NY, USA
- Bram L Gorissen
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- David Gresham
- Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA
- Emily R Miraldi
- Divisions of Immunobiology and Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Richard Bonneau
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, USA; Center For Data Science, NYU, New York, NY, USA; Center For Genomics and Systems Biology, NYU, New York, NY, USA; Department of Biology, NYU, New York, NY, USA; Courant Institute of Mathematical Sciences, Computer Science Department, NYU, New York, NY, USA
|
50
|
Jackson CA, Vogel C. New horizons in the stormy sea of multimodal single-cell data integration. Mol Cell 2022; 82:248-259. [PMID: 35063095 PMCID: PMC8830781 DOI: 10.1016/j.molcel.2021.12.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/11/2021] [Revised: 12/08/2021] [Accepted: 12/13/2021] [Indexed: 01/22/2023]
Abstract
While measurements of RNA expression have dominated the world of single-cell analyses, new single-cell techniques increasingly allow collection of different data modalities, measuring different molecules, structural connections, and intermolecular interactions. Integrating the resulting multimodal single-cell datasets is a new bioinformatics challenge. Equally important, it is a new experimental design challenge for the bench scientist, who is not only choosing from a myriad of techniques for each data modality but also faces new challenges in experimental design. The ultimate goal is to design, execute, and analyze multimodal single-cell experiments that are more than just descriptive but enable the learning of new causal and mechanistic biology. This objective requires strict consideration of the goals behind the analysis, which might range from mapping the heterogeneity of a cellular population to assembling system-wide causal networks that can further our understanding of cellular functions and eventually lead to models of tissues and organs. We review steps and challenges toward this goal. Single-cell transcriptomics is now a mature technology, and methods to measure proteins, lipids, small-molecule metabolites, and other molecular phenotypes at the single-cell level are rapidly developing. Integrating these single-cell readouts so that each cell has measurements of multiple types of data, e.g., transcriptomes, proteomes, and metabolomes, is expected to allow identification of highly specific cellular subpopulations and to provide the basis for inferring causal biological mechanisms.
Affiliation(s)
- Christopher A Jackson
- New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
- Christine Vogel
- New York University, Department of Biology, Center for Genomics and Systems Biology, New York, NY, USA
|