1
Qin H, Shi X, Zhou H. scSwinFormer: A Transformer-Based Cell-Type Annotation Method for scRNA-Seq Data Using Smooth Gene Embedding and Global Features. J Chem Inf Model 2024; 64:6316-6323. [PMID: 39101690 DOI: 10.1021/acs.jcim.4c00616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/06/2024]
Abstract
Single-cell omics techniques have made it possible to analyze individual cells in biological samples, providing a more detailed understanding of cellular heterogeneity and biological systems. Accurate identification of cell types is critical for single-cell RNA sequencing (scRNA-seq) analysis. However, scRNA-seq data are usually high dimensional and sparse, which makes them challenging to analyze. Existing cell-type annotation methods are either constrained in modeling scRNA-seq data or lack consideration of long-term dependencies of characterized genes. In this work, we developed a Transformer-based deep learning method, scSwinFormer, for the cell-type annotation of large-scale scRNA-seq data. Sequence modeling of scRNA-seq data is performed using the smooth gene embedding module, and the potential dependencies of genes are then captured by the self-attention module. Subsequently, the global information inherent in scRNA-seq data is synthesized using the Cell Token, thereby facilitating accurate cell-type annotation. We evaluated the performance of our model against current state-of-the-art scRNA-seq cell-type annotation methods on multiple real data sets. scSwinFormer outperforms these methods in both external and benchmark data set experiments.
Affiliation(s)
- Hengyu Qin
- School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Xiumin Shi
- School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Han Zhou
- School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
2
Zhao M, Li J, Liu X, Ma K, Tang J, Guo F. A gene regulatory network-aware graph learning method for cell identity annotation in single-cell RNA-seq data. Genome Res 2024; 34:1036-1051. [PMID: 39134412 PMCID: PMC11368180 DOI: 10.1101/gr.278439.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 07/23/2024] [Indexed: 08/22/2024]
Abstract
Cell identity annotation for single-cell transcriptome data is a crucial process for constructing cell atlases, unraveling pathogenesis, and inspiring therapeutic approaches. Currently, the efficacy of existing methodologies is contingent upon specific data sets. Nevertheless, such data are often sourced from various batches, sequencing technologies, tissues, and even species. Notably, the gene regulatory relationship remains unaffected by the aforementioned factors, highlighting the extensive gene interactions within organisms. Therefore, we propose scHGR, an automated annotation tool designed to leverage gene regulatory relationships in constructing gene-mediated cell communication graphs for single-cell transcriptome data. This strategy helps reduce noise from diverse data sources while establishing distant cellular connections, yielding valuable biological insights. Experiments involving 22 scenarios demonstrate that scHGR precisely and consistently annotates cell identities, benchmarked against state-of-the-art methods. Crucially, scHGR uncovers novel subtypes within peripheral blood mononuclear cells, specifically from CD4+ T cells and cytotoxic T cells. Furthermore, by characterizing a cell atlas comprising 56 cell types for COVID-19 patients, scHGR identifies vital factors like IL1 and calcium ions, offering insights for targeted therapeutic interventions.
Affiliation(s)
- Mengyuan Zhao
- College of Computer Science and Control Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100190, China
- Jiawei Li
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Xiaoyi Liu
- Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29208, USA
- Ke Ma
- College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Jijun Tang
- College of Computer Science and Control Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
3
Zhang B, Zhang S, Zhang S. Whole brain alignment of spatial transcriptomics between humans and mice with BrainAlign. Nat Commun 2024; 15:6302. [PMID: 39080277 PMCID: PMC11289418 DOI: 10.1038/s41467-024-50608-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 07/10/2024] [Indexed: 08/02/2024]
Abstract
The increasing utilization of mouse models in human neuroscience research places higher demands on computational methods to translate findings from the mouse brain to the human one. In this study, we develop BrainAlign, a self-supervised learning approach, for the whole brain alignment of spatial transcriptomics (ST) between humans and mice. BrainAlign encodes spots and genes simultaneously in two separated shared embedding spaces by a heterogeneous graph neural network. We demonstrate that BrainAlign could integrate cross-species spots into the embedding space and reveal the conserved brain regions supported by ST information, which facilitates the detection of homologous regions between humans and mice. Genomic analysis further presents gene expression connections between humans and mice and reveals similar expression patterns for marker genes. Moreover, BrainAlign can accurately map spatially similar homologous regions or clusters onto a unified spatial structural domain while preserving their relative positions.
Affiliation(s)
- Biao Zhang
- School of Mathematical Sciences, Fudan University, Shanghai, China
- Shuqin Zhang
- School of Mathematical Sciences, Fudan University, Shanghai, China.
- Key Laboratory of Mathematics for Nonlinear Science, Fudan University, Ministry of Education, Shanghai, China.
- Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai, China.
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China.
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, China.
4
Tian Y, Wu L, Huang CC, Wang L. Identify Regulatory eQTLs by Multiome Sequencing in Prostate Single Cells. bioRxiv 2024:2024.06.19.599704. [PMID: 38948854 PMCID: PMC11213234 DOI: 10.1101/2024.06.19.599704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
While genome-wide association studies and expression quantitative trait loci (eQTL) analysis have made significant progress in identifying noncoding variants associated with prostate cancer risk and bulk tissue transcriptome changes, the regulatory effect of these genetic elements on gene expression remains largely unknown. Recent developments in single-cell sequencing have made it possible to perform ATAC-seq and RNA-seq profiling simultaneously to capture functional associations between chromatin accessibility and gene expression. In this study, we tested our hypothesis that this multiome single-cell approach allows for mapping regulatory elements and their target genes at prostate cancer risk loci. We applied a 10X Multiome ATAC + Gene Expression platform to encapsulate Tn5 transposase-tagged nuclei from multiple prostate cell lines for a total of 65,501 high quality single cells from RWPE1, RWPE2, PrEC, BPH1, DU145, PC3, 22Rv1 and LNCaP cell lines. To address data sparsity commonly seen in the single-cell sequencing, we performed targeted sequencing to enrich sequencing data at prostate cancer risk loci involving 2,730 candidate germline variants and 273 associated genes. Although not increasing the number of captured cells, the targeted multiome data did improve eQTL gene expression abundance by about 20% and chromatin accessibility abundance by about 5%. Based on this multiomic profiling, we further associated RNA expression alterations with chromatin accessibility of germline variants at single cell levels. Cross validation analysis showed high overlaps between the multiome associations and the bulk eQTL findings from GTEx prostate cohort. We found that about 20% of GTEx eQTLs were covered within the significant multiome associations (p-value ≤ 0.05, gene abundance percentage ≥ 5%), and roughly 10% of the multiome associations could be identified by significant GTEx eQTLs. 
We also analyzed accessible regions with available heterozygous SNP reads and observed more frequent association in genomic regions with allelically accessible variants (p = 0.0055). Among these findings were previously reported regulatory variants including rs60464856-RUVBL1 (multiome p-value = 0.0099 in BPH1) and rs7247241-SPINT2 (multiome p-value = 0.0002-0.0004 in 22Rv1). We also functionally validated a new regulatory SNP and its target gene rs2474694-VPS53 (multiome p-value = 0.00956 in BPH1 and 0.00625 in DU145) by reporter assay and SILAC proteomics sequencing. Taken together, our data demonstrated the feasibility of the multiome single-cell approach for identifying regulatory SNPs and their regulated genes.
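The two overlap percentages quoted in this abstract (about 20% of GTEx eQTLs covered by multiome associations, and roughly 10% of multiome associations recovered by GTEx eQTLs) are two different ratios over the same shared set. The sketch below only illustrates that arithmetic. It is not the authors' pipeline: the function name `overlap_fractions` and the variant-gene pairs are made up for illustration, and the toy set sizes are chosen so the fractions come out at 0.2 and 0.1.

```python
def overlap_fractions(multiome_pairs, gtex_pairs):
    """Return (share of GTEx pairs found in multiome, share of multiome pairs found in GTEx).

    Both inputs are iterables of (variant, gene) tuples. The names and the
    pair-level matching rule are assumptions for this illustration.
    """
    multiome = set(multiome_pairs)
    gtex = set(gtex_pairs)
    shared = multiome & gtex
    frac_gtex = len(shared) / len(gtex) if gtex else 0.0
    frac_multiome = len(shared) / len(multiome) if multiome else 0.0
    return frac_gtex, frac_multiome


# Toy, fabricated identifiers: 5 GTEx pairs, 10 multiome pairs, 1 shared pair.
gtex_pairs = [("rsA", "GENE1"), ("rsB", "GENE2"), ("rsC", "GENE3"),
              ("rsD", "GENE4"), ("rsE", "GENE5")]
multiome_pairs = [("rsA", "GENE1")] + [(f"rsX{i}", f"GENEX{i}") for i in range(9)]

frac_gtex, frac_multiome = overlap_fractions(multiome_pairs, gtex_pairs)
print(f"GTEx covered by multiome: {frac_gtex:.0%}")
print(f"Multiome recovered by GTEx: {frac_multiome:.0%}")
```

Because the denominators differ, the two percentages need not match even though the numerator (the shared pairs) is the same.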
Affiliation(s)
- Yijun Tian
- Department of Tumor Biology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL 33612, United States
- Lang Wu
- Population Sciences in the Pacific Program, University of Hawaiʻi Cancer Center, University of Hawaiʻi at Mānoa, Honolulu, HI 96813, USA
- Chang-Ching Huang
- Zilber College of Public Health, University of Wisconsin, Milwaukee, WI 53226, United States
- Liang Wang
- Department of Tumor Biology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL 33612, United States
5
Gonzalez-Ferrer J, Lehrer J, O'Farrell A, Paten B, Teodorescu M, Haussler D, Jonsson VD, Mostajo-Radji MA. SIMS: A deep-learning label transfer tool for single-cell RNA sequencing analysis. Cell Genomics 2024; 4:100581. [PMID: 38823397 PMCID: PMC11228957 DOI: 10.1016/j.xgen.2024.100581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 04/02/2024] [Accepted: 05/09/2024] [Indexed: 06/03/2024]
Abstract
Cell atlases serve as vital references for automating cell labeling in new samples, yet existing classification algorithms struggle with accuracy. Here we introduce SIMS (scalable, interpretable machine learning for single cell), a low-code data-efficient pipeline for single-cell RNA classification. We benchmark SIMS against datasets from different tissues and species. We demonstrate SIMS's efficacy in classifying cells in the brain, achieving high accuracy even with small training sets (<3,500 cells) and across different samples. SIMS accurately predicts neuronal subtypes in the developing brain, shedding light on genetic changes during neuronal differentiation and postmitotic fate refinement. Finally, we apply SIMS to single-cell RNA datasets of cortical organoids to predict cell identities and uncover genetic variations between cell lines. SIMS identifies cell-line differences and misannotated cell lineages in human cortical organoids derived from different pluripotent stem cell lines. Altogether, we show that SIMS is a versatile and robust tool for cell-type classification from single-cell datasets.
Affiliation(s)
- Jesus Gonzalez-Ferrer
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Julian Lehrer
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Applied Mathematics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Ash O'Farrell
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Benedict Paten
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Mircea Teodorescu
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Electrical and Computer Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- David Haussler
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Vanessa D Jonsson
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Applied Mathematics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.
- Mohammed A Mostajo-Radji
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.
6
Ma F, Zheng C. Single-cell phylotranscriptomics of developmental and cell type evolution. Trends Genet 2024; 40:495-510. [PMID: 38490933 DOI: 10.1016/j.tig.2024.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 02/16/2024] [Accepted: 02/16/2024] [Indexed: 03/17/2024]
Abstract
Single-cell phylotranscriptomics is an emerging tool to reveal the molecular and cellular mechanisms of evolution. We summarize its utility in studying the hourglass pattern of ontogenetic evolution and for understanding the evolutionary history of cell types. The developmental hourglass model suggests that the mid-embryonic stage is the most conserved period of development across species, which is supported by morphological and molecular studies. Single-cell phylotranscriptomic analysis has revealed previously underappreciated heterogeneity in transcriptome ages among lineages and cell types throughout development, and has identified the lineages and tissues that drive the whole-organism hourglass pattern. Single-cell transcriptome age analyses also provide important insights into the origin of germ layers, the different selective forces on tissues during adaptation, and the evolutionary relationships between cell types.
Affiliation(s)
- Fuqiang Ma
- School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
- Chaogu Zheng
- School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China.
7
Jiang J, Li J, Huang S, Jiang F, Liang Y, Xu X, Wang J. CACIMAR: cross-species analysis of cell identities, markers, regulations, and interactions using single-cell RNA sequencing data. Brief Bioinform 2024; 25:bbae283. [PMID: 38856169 PMCID: PMC11163379 DOI: 10.1093/bib/bbae283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/10/2024] [Accepted: 05/30/2024] [Indexed: 06/11/2024]
Abstract
Transcriptomic analysis across species is increasingly used to reveal conserved gene regulations which implicate crucial regulators. Cross-species analysis of single-cell RNA sequencing (scRNA-seq) data provides new opportunities to identify the cellular and molecular conservations, especially for cell types and cell type-specific gene regulations. However, few methods have been developed to analyze cross-species scRNA-seq data to uncover both molecular and cellular conservations. Here, we built a tool called CACIMAR, which can perform cross-species analysis of cell identities, markers, regulations, and interactions using scRNA-seq profiles. Based on the weighted sum models of the conserved features, we developed different conservation scores to measure the conservation of cell types, regulatory networks, and intercellular interactions. Using publicly available scRNA-seq data on retinal regeneration in mice, zebrafish, and chick, we demonstrated four main functions of CACIMAR. First, CACIMAR allows the identification of conserved cell types even in evolutionarily distant species. Second, the tool facilitates the identification of evolutionarily conserved or species-specific marker genes. Third, CACIMAR enables the identification of conserved intracellular regulations, including cell type-specific regulatory subnetworks and regulators. Lastly, CACIMAR provides a unique feature for identifying conserved intercellular interactions. Overall, CACIMAR facilitates the identification of evolutionarily conserved cell types, marker genes, intracellular regulations, and intercellular interactions, providing insights into the cellular and molecular mechanisms of species evolution.
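The abstract states that the conservation scores are based on weighted sum models of conserved features, but it does not give the features or the weights. As a rough illustration of what a weighted-sum score looks like in general, here is a minimal sketch. The feature names, weights, values, and the normalisation by total weight are all assumptions made for this example and are not taken from CACIMAR.

```python
def conservation_score(features, weights):
    """Weighted sum of per-feature conservation values, divided by the total weight.

    `features` and `weights` are dicts keyed by feature name. Feature values
    are assumed to lie in [0, 1]; the normalisation keeps the score in [0, 1].
    """
    if set(features) != set(weights):
        raise ValueError("features and weights must have the same keys")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted = sum(weights[name] * features[name] for name in features)
    return weighted / total_weight


# Hypothetical features for one cell-type pair across two species.
features = {"shared_marker_fraction": 0.5, "expression_similarity": 0.75}
weights = {"shared_marker_fraction": 1.0, "expression_similarity": 3.0}

score = conservation_score(features, weights)
print(score)
```

With these toy numbers the score is (1.0 × 0.5 + 3.0 × 0.75) / 4.0, so the more heavily weighted feature dominates the result.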
Affiliation(s)
- Junyao Jiang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- School of Life Sciences, Westlake University, No. 600 Dunyu Road, Xihu District, Hangzhou, 310030, China
- Jinlian Li
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- University of Chinese Academy of Sciences, No. 1 Yanqihu East Road, Huairou District, Beijing 101408, China
- Sunan Huang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- Fan Jiang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- Yanran Liang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- University of Chinese Academy of Sciences, No. 1 Yanqihu East Road, Huairou District, Beijing 101408, China
- Xueli Xu
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- Jie Wang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
- University of Chinese Academy of Sciences, No. 1 Yanqihu East Road, Huairou District, Beijing 101408, China
- China-New Zealand Joint Laboratory on Biomedicine and Health, No. 190 Kaiyuan Road, Huangpu District, Guangzhou 510530, China
8
Mihai IS, Chafle S, Henriksson J. Representing and extracting knowledge from single-cell data. Biophys Rev 2024; 16:29-56. [PMID: 38495441 PMCID: PMC10937862 DOI: 10.1007/s12551-023-01091-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/22/2023] [Accepted: 06/28/2023] [Indexed: 03/19/2024]
Abstract
Single-cell analysis is currently one of the most high-resolution techniques to study biology. The large complex datasets that have been generated have spurred numerous developments in computational biology, in particular the use of advanced statistics and machine learning. This review attempts to explain the deeper theoretical concepts that underpin current state-of-the-art analysis methods. Single-cell analysis is covered from cell, through instruments, to current and upcoming models. The aim of this review is to spread concepts which are not yet in common use, especially from topology and generative processes, and how new statistical models can be developed to capture more of biology. This opens epistemological questions regarding our ontology and models, and some pointers will be given to how natural language processing (NLP) may help overcome our cognitive limitations for understanding single-cell data.
Affiliation(s)
- Ionut Sebastian Mihai
- The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå, Sweden
- Umeå Centre for Microbial Research (UCMR), Department of Molecular Biology, Umeå University, Umeå, Sweden
- Industrial Doctoral School, Umeå University, Umeå, Sweden
- Sarang Chafle
- The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå, Sweden
- Umeå Centre for Microbial Research (UCMR), Department of Molecular Biology, Umeå University, Umeå, Sweden
- Johan Henriksson
- The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå, Sweden
- Umeå Centre for Microbial Research (UCMR), Department of Molecular Biology, Umeå University, Umeå, Sweden
9
Mancuso CA, Johnson KA, Liu R, Krishnan A. Joint representation of molecular networks from multiple species improves gene classification. PLoS Comput Biol 2024; 20:e1011773. [PMID: 38198480 PMCID: PMC10805316 DOI: 10.1371/journal.pcbi.1011773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 01/23/2024] [Accepted: 12/20/2023] [Indexed: 01/12/2024]
Abstract
Network-based machine learning (ML) has the potential for predicting novel genes associated with nearly any health and disease context. However, this approach often uses network information from only the single species under consideration even though networks for most species are noisy and incomplete. While some recent methods have begun addressing this shortcoming by using networks from more than one species, they lack one or more key desirable properties: handling networks from more than two species simultaneously, incorporating many-to-many orthology information, or generating a network representation that is reusable across different types of and newly-defined prediction tasks. Here, we present GenePlexusZoo, a framework that casts molecular networks from multiple species into a single reusable feature space for network-based ML. We demonstrate that this multi-species network representation improves both gene classification within a single species and knowledge-transfer across species, even in cases where the inter-species correspondence is undetectable based on shared orthologous genes. Thus, GenePlexusZoo enables effectively leveraging the high evolutionary molecular, functional, and phenotypic conservation across species to discover novel genes associated with diverse biological contexts.
Affiliation(s)
- Christopher A. Mancuso
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Kayla A. Johnson
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan, United States of America
- Renming Liu
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan, United States of America
- Arjun Krishnan
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan, United States of America
10
Brynildsen JK, Rajan K, Henderson MX, Bassett DS. Network models to enhance the translational impact of cross-species studies. Nat Rev Neurosci 2023; 24:575-588. [PMID: 37524935 PMCID: PMC10634203 DOI: 10.1038/s41583-023-00720-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 06/17/2023] [Indexed: 08/02/2023]
Abstract
Neuroscience studies are often carried out in animal models for the purpose of understanding specific aspects of the human condition. However, the translation of findings across species remains a substantial challenge. Network science approaches can enhance the translational impact of cross-species studies by providing a means of mapping small-scale cellular processes identified in animal model studies to larger-scale inter-regional circuits observed in humans. In this Review, we highlight the contributions of network science approaches to the development of cross-species translational research in neuroscience. We lay the foundation for our discussion by exploring the objectives of cross-species translational models. We then discuss how the development of new tools that enable the acquisition of whole-brain data in animal models with cellular resolution provides unprecedented opportunity for cross-species applications of network science approaches for understanding large-scale brain networks. We describe how these tools may support the translation of findings across species and imaging modalities and highlight future opportunities. Our overarching goal is to illustrate how the application of network science tools across human and animal model studies could deepen insight into the neurobiology that underlies phenomena observed with non-invasive neuroimaging methods and could simultaneously further our ability to translate findings across species.
Affiliation(s)
- Julia K Brynildsen
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
- Kanaka Rajan
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael X Henderson
- Parkinson's Disease Center, Department of Neurodegenerative Science, Van Andel Institute, Grand Rapids, MI, USA
- Dani S Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Electrical & Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Physics & Astronomy, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA.
- Santa Fe Institute, Santa Fe, NM, USA.
11
Mo S, Qu K, Huang J, Li Q, Zhang W, Yen K. Cross-species transcriptomics reveals bifurcation point during the arterial-to-hemogenic transition. Commun Biol 2023; 6:827. [PMID: 37558796 PMCID: PMC10412572 DOI: 10.1038/s42003-023-05190-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 07/28/2023] [Indexed: 08/11/2023]
Abstract
Hemogenic endothelium (HE) with hematopoietic stem cell (HSC)-forming potential emerges from specialized arterial endothelial cells (AECs) undergoing the endothelial-to-hematopoietic transition (EHT) in the aorta-gonad-mesonephros (AGM) region. The characteristics of this AEC subpopulation, and whether this phenomenon is conserved across species, remain unclear. Here we introduce HomologySeeker, a cross-species method that leverages refined mouse information to explore under-studied human EHT. Utilizing single-cell transcriptomic ensembles of EHT, HomologySeeker reveals a parallel developmental relationship between these two species, with minimal pre-HSC signals observed in human cells. The pre-HE stage contains a conserved bifurcation point between the two species, where cells progress towards HE or late AECs. By harnessing human spatial transcriptomics, we identify ligand modules that contribute to the bifurcation choice and validate CXCL12 in promoting hemogenic choice using a human in vitro differentiation system. Our findings advance the understanding of the human arterial-to-hemogenic transition and offer valuable insights for manipulating HSC generation using in vitro models.
Affiliation(s)
- Shaokang Mo
- Division of Cell, Developmental and Integrative Biology, School of Medicine, South China University of Technology, Guangzhou, China
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
- Tianjin Institutes of Health Science, Tianjin, 301600, China
- Kengyuan Qu
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
- Tianjin Institutes of Health Science, Tianjin, 301600, China
- Junfeng Huang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China.
- Tianjin Institutes of Health Science, Tianjin, 301600, China.
- Qiwei Li
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
- Tianjin Institutes of Health Science, Tianjin, 301600, China
| | - Wenqing Zhang
- Division of Cell, Developmental and Integrative Biology, School of Medicine, South China University of Technology, Guangzhou, China
| | - Kuangyu Yen
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Haihe Laboratory of Cell Ecosystem, Institute of Hematology & Blood Diseases Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, 300020, China
- Tianjin Institutes of Health Science, Tianjin, 301600, China
|
12
|
Biharie K, Michielsen L, Reinders MJT, Mahfouz A. Cell type matching across species using protein embeddings and transfer learning. Bioinformatics 2023; 39:i404-i412. [PMID: 37387141 DOI: 10.1093/bioinformatics/btad248] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Knowing the relation between cell types is crucial for translating experimental results from mice to humans. Establishing cell type matches, however, is hindered by the biological differences between the species. A substantial amount of evolutionary information between genes that could be used to align the species is discarded by most current methods, since they use only one-to-one orthologous genes. Some methods try to retain this information by explicitly including the relation between genes, but not without caveats. RESULTS: In this work, we present a model to transfer and align cell types in cross-species analysis (TACTiCS). First, TACTiCS uses a natural language processing model to match genes using their protein sequences. Next, TACTiCS employs a neural network to classify cell types within a species. Afterward, TACTiCS uses transfer learning to propagate cell type labels between species. We applied TACTiCS to scRNA-seq data of the primary motor cortex of human, mouse, and marmoset. Our model can accurately match and align cell types on these datasets. Moreover, our model outperforms Seurat and the state-of-the-art method SAMap. Finally, we show that our gene matching method results in better cell type matches than BLAST in our model. AVAILABILITY AND IMPLEMENTATION: The implementation is available on GitHub (https://github.com/kbiharie/TACTiCS). The preprocessed datasets and trained models can be downloaded from Zenodo (https://doi.org/10.5281/zenodo.7582460).
Affiliation(s)
- Kirti Biharie
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628XE, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
| | - Lieke Michielsen
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628XE, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
| | - Marcel J T Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628XE, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
| | - Ahmed Mahfouz
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628XE, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden 2333ZC, The Netherlands
|