1
Luo SH, Pan SQ, Chen GY, Xie Y, Ren B, Liu GK, Tian ZQ. Revealing the Denoising Principle of Zero-Shot N2N-Based Algorithm from 1D Spectrum to 2D Image. Anal Chem 2024; 96:4086-4092. [PMID: 38412039 DOI: 10.1021/acs.analchem.3c04608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
Denoising is a necessary step in image analysis to extract weak signals, especially those that can hardly be identified by the naked eye. Unlike data-driven deep-learning denoising algorithms that rely on a clean image as the reference, Noise2Noise (N2N) can denoise a noisy image, provided that sufficient noisy images of the same subject with randomly distributed noise are available. Further, by introducing data augmentation to create a large data set and regularization to prevent model overfitting, zero-shot N2N-based denoising was proposed, in which only a single noisy image is needed. Although various high-performance N2N-based denoising algorithms have been developed, their complicated black-box operation has hindered lightweight implementations. Therefore, to reveal the working function of the zero-shot N2N-based algorithm, we proposed a lightweight Peak2Peak algorithm (P2P) and qualitatively and quantitatively analyzed its denoising behavior on the 1D spectrum and 2D image. We found that the high-performance denoising originates from the trade-off between the loss function and regularization in the denoising module, where regularization is the switch of denoising. Meanwhile, the signal extraction comes mainly from the self-supervised characteristic learning in the data augmentation module. Further, the lightweight P2P improved the denoising speed by at least ten times with little performance loss compared with current N2N-based algorithms. In general, the visualization of P2P provides a reference for revealing the working function of zero-shot N2N-based algorithms, which would pave the way for the application of these algorithms toward real-time (in situ, in vivo, and operando) research, improving both temporal and spatial resolutions. P2P is open-source at https://github.com/3331822w/Peak2Peak and will be accessible online at https://ramancloud.xmu.edu.cn/tutorial.
Affiliation(s)
- Si-Heng Luo
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Si-Qi Pan
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Gan-Yu Chen
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Yi Xie
  - Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, Xiamen, Fujian 361005, China
  - Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, China
- Bin Ren
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Zhong-Qun Tian
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, China
2
Zormpas E, Queen R, Comber A, Cockell SJ. Mapping the transcriptome: Realizing the full potential of spatial data analysis. Cell 2023; 186:5677-5689. [PMID: 38065099 DOI: 10.1016/j.cell.2023.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 09/04/2023] [Accepted: 11/02/2023] [Indexed: 12/24/2023]
Abstract
RNA sequencing in situ allows for whole-transcriptome characterization at high resolution, while retaining spatial information. These data present an analytical challenge for bioinformatics: how to leverage spatial information effectively? Properties of data with a spatial dimension require special handling, which necessitates a different set of statistical and inferential considerations when compared to non-spatial data. The geographical sciences primarily use spatial data and have developed methods to analyze them. Here we discuss the challenges associated with spatial analysis and examine how we can take advantage of practice from the geographical sciences to realize the full potential of spatial information in transcriptomic datasets.
Affiliation(s)
- Eleftherios Zormpas
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Rachel Queen
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK; Bioinformatics Support Unit, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Alexis Comber
  - School of Geography and Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
- Simon J Cockell
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK; School of Biomedical, Nutritional and Sport Sciences, Faculty of Medical Sciences, Newcastle upon Tyne NE2 4HH, UK
3
Su J, Reynier JB, Fu X, Zhong G, Jiang J, Escalante RS, Wang Y, Aparicio L, Izar B, Knowles DA, Rabadan R. Smoother: a unified and modular framework for incorporating structural dependency in spatial omics data. Genome Biol 2023; 24:291. [PMID: 38110959 PMCID: PMC10726548 DOI: 10.1186/s13059-023-03138-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 12/04/2023] [Indexed: 12/20/2023] Open
Abstract
Spatial omics technologies can help identify spatially organized biological processes, but existing computational approaches often overlook structural dependencies in the data. Here, we introduce Smoother, a unified framework that integrates positional information into non-spatial models via modular priors and losses. In simulated and real datasets, Smoother enables accurate data imputation, cell-type deconvolution, and dimensionality reduction with remarkable efficiency. In colorectal cancer, Smoother-guided deconvolution reveals plasma cell and fibroblast subtype localizations linked to tumor microenvironment restructuring. Additionally, joint modeling of spatial and single-cell human prostate data with Smoother allows for spatial mapping of reference populations with significantly reduced ambiguity.
Affiliation(s)
- Jiayu Su
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - New York Genome Center, New York, NY, USA
- Jean-Baptiste Reynier
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Xi Fu
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Guojie Zhong
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Jiahao Jiang
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Yiping Wang
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Division of Hematology/Oncology, Department of Medicine, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Luis Aparicio
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Benjamin Izar
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Division of Hematology/Oncology, Department of Medicine, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- David A Knowles
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Department of Computer Science, Columbia University, New York, NY, USA
- Raul Rabadan
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
4
Lee AJ, Cahill R, Abbasi-Asl R. Machine Learning for Uncovering Biological Insights in Spatial Transcriptomics Data. ARXIV 2023:arXiv:2303.16725v1. [PMID: 37033464 PMCID: PMC10081350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]
Abstract
Development and homeostasis in multicellular systems both require exquisite control over spatial molecular pattern formation and maintenance. Advances in spatially-resolved and high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to rapid development of innovative machine learning (ML) tools primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of a rapidly expanding toolbox of analytical tools in ST. To address this, we summarize major ST analysis goals that ML can help address and current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in their choices of the right tools for the right biological questions.