1
Tung CC, Kuo SC, Yang CL, Yu JH, Huang CE, Liou PC, Sun YH, Shuai P, Su JC, Ku C, Lin YCJ. Single-cell transcriptomics unveils xylem cell development and evolution. Genome Biol 2023; 24:3. [PMID: 36624504 PMCID: PMC9830878 DOI: 10.1186/s13059-022-02845-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 06/30/2022] [Accepted: 12/31/2022] [Indexed: 01/11/2023]
Abstract
BACKGROUND Xylem, the most abundant tissue on Earth, is responsible for lateral growth in plants. Typical xylem has a radial system composed of ray parenchyma cells and an axial system of fusiform cells. In most angiosperms, fusiform cells comprise vessel elements for water transportation and libriform fibers for mechanical support, while both functions are performed by tracheids in other vascular plants such as gymnosperms. Little is known about the developmental programs and evolutionary relationships of these xylem cell types. RESULTS Through both single-cell and laser capture microdissection transcriptomic profiling, we determine the developmental lineages of ray and fusiform cells in stem-differentiating xylem across four divergent woody angiosperms. Based on cross-species analyses of single-cell clusters and overlapping trajectories, we reveal highly conserved ray, yet variable fusiform, lineages across angiosperms. Core eudicots Populus trichocarpa and Eucalyptus grandis share nearly identical fusiform lineages, whereas the more basal angiosperm Liriodendron chinense has a fusiform lineage distinct from that in core eudicots. The tracheids in the basal eudicot Trochodendron aralioides, an evolutionarily reversed trait, exhibit strong transcriptomic similarity to vessel elements rather than libriform fibers. CONCLUSIONS This evo-devo framework provides a comprehensive understanding of the formation of xylem cell lineages across multiple plant species spanning over a hundred million years of evolutionary history.
Affiliation(s)
- Chia-Chun Tung
  - Department of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Shang-Che Kuo
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei, 10617, Taiwan
- Chia-Ling Yang
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 11529, Taiwan
- Jhong-He Yu
  - Institute of Plant Biology, National Taiwan University, Taipei, 10617, Taiwan
- Chia-En Huang
  - Institute of Plant Biology, National Taiwan University, Taipei, 10617, Taiwan
- Pin-Chien Liou
  - Institute of Plant Biology, National Taiwan University, Taipei, 10617, Taiwan
- Ying-Hsuan Sun
  - Department of Forestry, National Chung Hsing University, Taichung, 40227, Taiwan
- Peng Shuai
  - College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Jung-Chen Su
  - Department of Pharmacy, National Yang Ming Chiao Tung University, Taipei, 11221, Taiwan
- Chuan Ku
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei, 10617, Taiwan
  - Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 11529, Taiwan
- Ying-Chung Jimmy Lin
  - Department of Life Science, National Taiwan University, Taipei, 10617, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei, 10617, Taiwan
  - Institute of Plant Biology, National Taiwan University, Taipei, 10617, Taiwan
2
Zhang J, Merikangas KR, Li H, Shou H. Two-sample tests for multivariate repeated measurements of histogram objects with applications to wearable device data. Ann Appl Stat 2022; 16:2396-2416. [PMID: 38037595 PMCID: PMC10688324 DOI: 10.1214/21-aoas1596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Repeated observations have become increasingly common in biomedical research and longitudinal studies. For instance, wearable sensor devices are deployed to continuously track physiological and biological signals from each individual over multiple days. It remains of great interest to appropriately evaluate how the daily distribution of biosignals might differ across disease groups and demographics. Hence, these data could be formulated as multivariate complex object data, such as probability densities, histograms, and observations on a tree. Traditional statistical methods would often fail to apply, as they are sampled from an arbitrary non-Euclidean metric space. In this paper we propose novel, nonparametric, graph-based two-sample tests for object data with the same structure of repeated measures. We treat the repeatedly measured object data as multivariate object data, which requires the same number of repeated observations per individual but eliminates any assumptions on the errors of the repeated observations. A set of test statistics are proposed to capture various possible alternatives. We derive their asymptotic null distributions under the permutation null. These tests exhibit substantial power improvements over the existing methods while controlling the type I errors under finite samples as shown through simulation studies. The proposed tests are demonstrated to provide additional insights on the location, inter- and intra-individual variability of the daily physical activity distributions in a sample of studies for mood disorders.
Affiliation(s)
- Jingru Zhang
  - Division of Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine
- Kathleen R. Merikangas
  - Genetic Epidemiology Research Branch, National Institute of Mental Health, National Institutes of Health
- Hongzhe Li
  - Division of Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine
- Haochang Shou
  - Division of Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine
3
Wang R, Fan W, Wang X. A hyperbolic divergence based nonparametric test for two‐sample multivariate distributions. Can J Stat 2022. [DOI: 10.1002/cjs.11736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Roulin Wang
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, Anhui 230026, China
- Wei Fan
  - School of the Gifted Young, University of Science and Technology of China, Hefei, Anhui 230026, China
- Xueqin Wang
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, Anhui 230026, China
  - International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, Anhui 230026, China
4
Liu L, Meng Y, Wu X, Ying Z, Zheng T. Log-Rank-Type Tests for Equality of Distributions in High-Dimensional Spaces. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2051530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Linxi Liu
  - Department of Statistics, University of Pittsburgh
- Yang Meng
  - Department of Statistics, Columbia University
- Tian Zheng
  - Department of Statistics, Columbia University
5
Anchang B, Mendez-Giraldez R, Xu X, Archer TK, Chen Q, Hu G, Plevritis SK, Motsinger-Reif AA, Li JL. Visualization, benchmarking and characterization of nested single-cell heterogeneity as dynamic forest mixtures. Brief Bioinform 2022; 23:6534382. [PMID: 35192692 PMCID: PMC8921621 DOI: 10.1093/bib/bbac017] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2021] [Revised: 11/19/2021] [Accepted: 01/13/2022] [Indexed: 11/13/2022]
Abstract
A major topic of debate in developmental biology centers on whether development is continuous, discontinuous, or a mixture of both. Pseudo-time trajectory models, optimal for visualizing cellular progression, model cell transitions as continuous state manifolds and do not explicitly model real-time, complex, heterogeneous systems and are challenging for benchmarking with temporal models. We present a data-driven framework that addresses these limitations with temporal single-cell data collected at discrete time points as inputs and a mixture of dependent minimum spanning trees (MSTs) as outputs, denoted as dynamic spanning forest mixtures (DSFMix). DSFMix uses decision-tree models to select genes that account for variations in multimodality, skewness and time. The genes are subsequently used to build the forest using tree agglomerative hierarchical clustering and dynamic branch cutting. We first motivate the use of forest-based algorithms compared to single-tree approaches for visualizing and characterizing developmental processes. We next benchmark DSFMix to pseudo-time and temporal approaches in terms of feature selection, time correlation, and network similarity. Finally, we demonstrate how DSFMix can be used to visualize, compare and characterize complex relationships during biological processes such as epithelial-mesenchymal transition, spermatogenesis, stem cell pluripotency, early transcriptional response from hormones and immune response to coronavirus disease. Our results indicate that the expression of genes during normal development exhibits a high proportion of non-uniformly distributed profiles that are mostly right-skewed and multimodal; the latter being a characteristic of major steady states during development. Our study also identifies and validates gene signatures driving complex dynamic processes during somatic or germline differentiation.
Affiliation(s)
- Benedict Anchang
  - Corresponding author: Benedict Anchang, Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences. 111 T W Alexander Dr, Research Triangle Park, NC 27709, USA and Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA. Tel +1 984-287-3350; E-mail:
- Raul Mendez-Giraldez
  - Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Stanford, California, USA
- Xiaojiang Xu
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, Stanford, California, USA
- Trevor K Archer
  - Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA
- Qing Chen
  - Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA
- Guang Hu
  - Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA
- Sylvia K Plevritis
  - Department of Biomedical Data Science, Center for Cancer Systems Biology, Stanford University, Stanford, California, USA
- Alison Anne Motsinger-Reif
  - Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Stanford, California, USA
- Jian-Liang Li
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, Stanford, California, USA
6
Tang L, Li J. Combining dependent tests based on data depth with applications to the two-sample problem for data of arbitrary types. J Nonparametr Stat 2022. [DOI: 10.1080/10485252.2021.2025371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Linli Tang
  - Department of Statistics, University of California, Riverside, CA, USA
- Jun Li
  - Department of Statistics, University of California, Riverside, CA, USA
7
Some clustering-based exact distribution-free k-sample tests applicable to high dimension, low sample size data. J Multivariate Anal 2021. [DOI: 10.1016/j.jmva.2021.104897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
8
Chen H, Xia Y. A Normality Test for High-dimensional Data Based on the Nearest Neighbor Approach. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1953507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Hao Chen
  - Department of Statistics, University of California at Davis, CA
- Yin Xia
  - Department of Statistics, School of Management, Fudan University
9
Banerjee T, Bhattacharya BB, Mukherjee G. A nearest-neighbor based nonparametric test for viral remodeling in heterogeneous single-cell proteomic data. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
10
Kim I, Balakrishnan S, Wasserman L. Robust multivariate nonparametric tests via projection averaging. Ann Stat 2020. [DOI: 10.1214/19-aos1936] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/19/2022]
11
Mukherjee S, Agarwal D, Zhang NR, Bhattacharya BB. Distribution-Free Multisample Tests Based on Optimal Matchings With Applications to Single Cell Genomics. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1791131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Somabha Mukherjee
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA
- Divyansh Agarwal
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA
- Nancy R. Zhang
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA
12
Mukhopadhyay S, Wang K. A nonparametric approach to high-dimensional k-sample comparison problems. Biometrika 2020. [DOI: 10.1093/biomet/asaa015] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]
Abstract
Summary
High-dimensional $k$-sample comparison is a common task in applications. We construct a class of easy-to-implement distribution-free tests based on new nonparametric tools and unexplored connections with spectral graph theory. The test is shown to have various desirable properties and a characteristic exploratory flavour that has practical consequences for statistical modelling. Numerical examples show that the proposed method works surprisingly well across a broad range of realistic situations.
Affiliation(s)
- Subhadeep Mukhopadhyay
  - Department of Statistical Science, Temple University, Philadelphia, Pennsylvania 19122, U.S.A.
- Kaijun Wang
  - Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Seattle, Washington 98109, U.S.A.
13
Li J. Asymptotic distribution-free change-point detection based on interpoint distances for high-dimensional data. J Nonparametr Stat 2020. [DOI: 10.1080/10485252.2019.1710505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
- Jun Li
  - Department of Statistics, University of California, Riverside, CA, USA
14
Bhattacharya BB. A general asymptotic framework for distribution-free graph-based two-sample tests. J R Stat Soc Series B Stat Methodol 2019. [DOI: 10.1111/rssb.12319] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
15
Chu L, Chen H. Asymptotic distribution-free change-point detection for multivariate and non-Euclidean data. Ann Stat 2019. [DOI: 10.1214/18-aos1691] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/19/2022]