1. Marconato L, Palla G, Yamauchi KA, Virshup I, Heidari E, Treis T, Vierdag WM, Toth M, Stockhaus S, Shrestha RB, Rombaut B, Pollaris L, Lehner L, Vöhringer H, Kats I, Saeys Y, Saka SK, Huber W, Gerstung M, Moore J, Theis FJ, Stegle O. SpatialData: an open and universal data framework for spatial omics. Nat Methods 2024:10.1038/s41592-024-02212-x. [PMID: 38509327 DOI: 10.1038/s41592-024-02212-x] [Received: 05/15/2023] [Accepted: 02/14/2024] [Indexed: 03/22/2024]
Abstract
Spatially resolved omics technologies are transforming our understanding of biological tissues. However, the handling of uni- and multimodal spatial omics datasets remains a challenge owing to large data volumes, heterogeneity of data types and the lack of flexible, spatially aware data structures. Here we introduce SpatialData, a framework that establishes a unified and extensible multiplatform file-format, lazy representation of larger-than-memory data, transformations and alignment to common coordinate systems. SpatialData facilitates spatial annotations and cross-modal aggregation and analysis, the utility of which is illustrated in the context of multiple vignettes, including integrative analysis on a multimodal Xenium and Visium breast cancer study.
Affiliation(s)
- Luca Marconato
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - Division of Computational Genomics and System Genetics, German Cancer Research Center, Heidelberg, Germany
  - Collaboration for joint PhD degree between EMBL and Heidelberg University, Faculty of Biosciences, Heidelberg, Germany
- Giovanni Palla
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Kevin A Yamauchi
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Basel, Switzerland
- Isaac Virshup
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Elyas Heidari
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - Division of Computational Genomics and System Genetics, German Cancer Research Center, Heidelberg, Germany
  - Division of Artificial Intelligence in Oncology, German Cancer Research Center, Heidelberg, Germany
- Tim Treis
  - Division of Computational Genomics and System Genetics, German Cancer Research Center, Heidelberg, Germany
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Marcella Toth
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Sonja Stockhaus
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
- Rahul B Shrestha
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Benjamin Rombaut
  - Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - VIB Center for AI and Computational Biology, Ghent, Belgium
- Lotte Pollaris
  - Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - VIB Center for AI and Computational Biology, Ghent, Belgium
- Laurens Lehner
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
- Harald Vöhringer
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - Molecular Medicine Partnership Unit, Heidelberg, Germany
  - Department of Medicine V, Hematology, Oncology, and Rheumatology, University of Heidelberg, Heidelberg, Germany
- Ilia Kats
  - Division of Computational Genomics and System Genetics, German Cancer Research Center, Heidelberg, Germany
- Yvan Saeys
  - Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - VIB Center for AI and Computational Biology, Ghent, Belgium
- Sinem K Saka
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Wolfgang Huber
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Moritz Gerstung
  - Division of Artificial Intelligence in Oncology, German Cancer Research Center, Heidelberg, Germany
- Josh Moore
  - German BioImaging - Gesellschaft für Mikroskopie und Bildanalyse e.V, Konstanz, Germany
  - Open Microscopy Environment Consortium, Munich, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
  - Cellular Genetics Programme, Wellcome Sanger Institute, Cambridge, UK
- Oliver Stegle
  - European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
  - Division of Computational Genomics and System Genetics, German Cancer Research Center, Heidelberg, Germany
  - Cellular Genetics Programme, Wellcome Sanger Institute, Cambridge, UK
2. Vierdag WMAM, Saka SK. A perspective on FAIR quality control in multiplexed imaging data processing. Frontiers in Bioinformatics 2024; 4:1336257. [PMID: 38405548 PMCID: PMC10885342 DOI: 10.3389/fbinf.2024.1336257] [Received: 11/10/2023] [Accepted: 01/26/2024] [Indexed: 02/27/2024] Open
Abstract
Multiplexed imaging approaches are getting increasingly adopted for imaging of large tissue areas, yielding big imaging datasets both in terms of the number of samples and the size of image data per sample. The processing and analysis of these datasets is complex owing to frequent technical artifacts and heterogeneous profiles from a high number of stained targets. To streamline the analysis of multiplexed images, automated pipelines making use of state-of-the-art algorithms have been developed. In these pipelines, the output quality of one processing step is typically dependent on the output of the previous step and errors from each step, even when they appear minor, can propagate and confound the results. Thus, rigorous quality control (QC) at each of these different steps of the image processing pipeline is of paramount importance both for the proper analysis and interpretation of the analysis results and for ensuring the reusability of the data. Ideally, QC should become an integral and easily retrievable part of the imaging datasets and the analysis process. Yet, limitations of the currently available frameworks make integration of interactive QC difficult for large multiplexed imaging data. Given the increasing size and complexity of multiplexed imaging datasets, we present the different challenges for integrating QC in image analysis pipelines as well as suggest possible solutions that build on top of recent advances in bioimage analysis.
Affiliation(s)
- Sinem K. Saka
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
3. Chen JG, Chávez-Fuentes JC, O'Brien M, Xu J, Ruiz E, Wang W, Amin I, Sarfraz I, Guckhool P, Sistig A, Yuan GC, Dries R. Giotto Suite: a multi-scale and technology-agnostic spatial multi-omics analysis ecosystem. bioRxiv: the preprint server for biology 2023:2023.11.26.568752. [PMID: 38077085 PMCID: PMC10705291 DOI: 10.1101/2023.11.26.568752] [Indexed: 12/24/2023]
Abstract
Emerging spatial omics technologies continue to advance the molecular mapping of tissue architecture and the investigation of gene regulation and cellular crosstalk, which in turn provide new mechanistic insights into a wide range of biological processes and diseases. Such technologies provide an increasingly large amount of information content at multiple spatial scales. However, representing and harmonizing diverse spatial datasets efficiently, including combining multiple modalities or spatial scales in a scalable and flexible manner, remains a substantial challenge. Here, we present Giotto Suite, a suite of open-source software packages that underlies a fully modular and integrated spatial data analysis toolbox. At its core, Giotto Suite is centered around an innovative and technology-agnostic data framework embedded in the R software environment, which allows the representation and integration of virtually any type of spatial omics data at any spatial resolution. In addition, Giotto Suite provides both scalable and extensible end-to-end solutions for data analysis, integration, and visualization. Giotto Suite integrates molecular, morphology, spatial, and annotated feature information to create a responsive and flexible workflow for multi-scale, multi-omic data analyses, as demonstrated here by applications to several state-of-the-art spatial technologies. Furthermore, Giotto Suite builds upon interoperable interfaces and data structures that bridge the established fields of genomics and spatial data science, thereby enabling independent developers to create custom-engineered pipelines. As such, Giotto Suite creates an immersive ecosystem for spatial multi-omic data analysis.
Affiliation(s)
- Jiaji George Chen
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Matthew O'Brien
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Junxiang Xu
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Edward Ruiz
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Wen Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Iqra Amin
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Irzam Sarfraz
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
- Pratishtha Guckhool
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Adriana Sistig
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Guo-Cheng Yuan
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ruben Dries
  - Section of Hematology and Medical Oncology, Boston University School of Medicine and Boston Medical Center, Boston, MA 02118, USA
  - Division of Computational Biomedicine, Boston University School of Medicine, Boston, MA 02118, USA