1. Nabi IR, Cardoen B, Khater IM, Gao G, Wong TH, Hamarneh G. AI analysis of super-resolution microscopy: Biological discovery in the absence of ground truth. J Cell Biol 2024; 223:e202311073. [PMID: 38865088 PMCID: PMC11169916 DOI: 10.1083/jcb.202311073] [Received: 11/15/2023] [Revised: 04/02/2024] [Accepted: 05/21/2024] [Indexed: 06/13/2024] Open
Abstract
Super-resolution microscopy, or nanoscopy, enables the use of fluorescence-based molecular localization tools to study molecular structure at the nanoscale level in the intact cell, bridging the mesoscale gap to classical structural biology methodologies. Analysis of super-resolution data by artificial intelligence (AI), such as machine learning, offers tremendous potential for the discovery of new biology that, by definition, is not known and lacks ground truth. Herein, we describe the application of weakly supervised paradigms to super-resolution microscopy and its potential to enable the accelerated exploration of the nanoscale architecture of subcellular macromolecules and organelles.
Affiliation(s)
- Ivan R. Nabi
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
  - School of Biomedical Engineering, University of British Columbia, Vancouver, Canada
- Ben Cardoen
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
- Ismail M. Khater
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
  - Department of Electrical and Computer Engineering, Faculty of Engineering and Technology, Birzeit University, Birzeit, Palestine
- Guang Gao
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
- Timothy H. Wong
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
- Ghassan Hamarneh
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
2. Zhang X, Venkatachalapathy S, Paysan D, Schaerer P, Tripodo C, Uhler C, Shivashankar GV. Unsupervised representation learning of chromatin images identifies changes in cell state and tissue organization in DCIS. Nat Commun 2024; 15:6112. [PMID: 39030176 DOI: 10.1038/s41467-024-50285-1] [Received: 04/04/2023] [Accepted: 07/05/2024] [Indexed: 07/21/2024] Open
Abstract
Ductal carcinoma in situ (DCIS) is a pre-invasive tumor that can progress to invasive breast cancer, a leading cause of cancer death. We generate a large-scale tissue microarray dataset of chromatin images, from 560 samples from 122 female patients in 3 disease stages and 11 phenotypic categories. Using representation learning on chromatin images alone, without multiplexed staining or high-throughput sequencing, we identify eight morphological cell states and tissue features marking DCIS. All cell states are observed in all disease stages with different proportions, indicating that cell states enriched in invasive cancer exist in small fractions in normal breast tissue. Tissue-level analysis reveals significant changes in the spatial organization of cell states across disease stages, which is predictive of disease stage and phenotypic category. Taken together, we show that chromatin imaging represents a powerful measure of cell state and disease stage of DCIS, providing a simple and effective tumor biomarker.
Affiliation(s)
- Xinyi Zhang
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, USA
  - Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, USA
- Saradha Venkatachalapathy
  - Department of Health Sciences and Technology, ETH Zurich, Switzerland
  - Laboratory of Nanoscale Biology, Paul Scherrer Institute, Villigen, Switzerland
- Daniel Paysan
  - Department of Health Sciences and Technology, ETH Zurich, Switzerland
  - Laboratory of Nanoscale Biology, Paul Scherrer Institute, Villigen, Switzerland
- Paulina Schaerer
  - Department of Health Sciences and Technology, ETH Zurich, Switzerland
  - Laboratory of Nanoscale Biology, Paul Scherrer Institute, Villigen, Switzerland
- Claudio Tripodo
  - Tumor Immunology Unit, University of Palermo, Palermo, Italy
  - IFOM, FIRC Institute of Molecular Oncology, Milan, Italy
- Caroline Uhler
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, USA
  - Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, USA
- G V Shivashankar
  - Department of Health Sciences and Technology, ETH Zurich, Switzerland
  - Laboratory of Nanoscale Biology, Paul Scherrer Institute, Villigen, Switzerland
3. Ivanov IE, Hirata-Miyasaki E, Chandler T, Cheloor-Kovilakam R, Liu Z, Pradeep S, Liu C, Bhave M, Khadka S, Arias C, Leonetti MD, Huang B, Mehta SB. Mantis: high-throughput 4D imaging and analysis of the molecular and physical architecture of cells. bioRxiv 2024:2023.12.19.572435. [PMID: 38187521 PMCID: PMC10769231 DOI: 10.1101/2023.12.19.572435] [Indexed: 01/09/2024]
Abstract
High-throughput dynamic imaging of cells and organelles is essential for understanding complex cellular responses. We report Mantis, a high-throughput 4D microscope that integrates two complementary, gentle, live-cell imaging technologies: remote-refocus label-free microscopy and oblique light-sheet fluorescence microscopy. Additionally, we report shrimPy, an open-source software for high-throughput imaging, deconvolution, and single-cell phenotyping of 4D data. Using Mantis and shrimPy, we achieved high-content correlative imaging of molecular dynamics and the physical architecture of 20 cell lines every 15 minutes over 7.5 hours. This platform also facilitated detailed measurements of the impacts of viral infection on the architecture of host cells and host proteins. The Mantis platform can enable high-throughput profiling of intracellular dynamics, long-term imaging and analysis of cellular responses to perturbations, and live-cell optical screens to dissect gene regulatory networks.
Affiliation(s)
- Ivan E. Ivanov
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Talon Chandler
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Rasmi Cheloor-Kovilakam
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, United States
- Ziwen Liu
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Soorya Pradeep
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Chad Liu
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Madhura Bhave
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Sudip Khadka
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Carolina Arias
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
- Bo Huang
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, United States
- Shalin B. Mehta
  - Chan Zuckerberg Biohub San Francisco, San Francisco, United States
4. Lo MCK, Siu DMD, Lee KCM, Wong JSJ, Yeung MCF, Hsin MKY, Ho JCM, Tsia KK. Information-Distilled Generative Label-Free Morphological Profiling Encodes Cellular Heterogeneity. Adv Sci (Weinh) 2024:e2307591. [PMID: 38864546 DOI: 10.1002/advs.202307591] [Received: 10/11/2023] [Revised: 05/17/2024] [Indexed: 06/13/2024]
Abstract
Image-based cytometry faces challenges due to technical variations arising from different experimental batches and conditions, such as differences in instrument configurations or image acquisition protocols, impeding genuine biological interpretation of cell morphology. Existing solutions, often necessitating extensive pre-existing data knowledge or control samples across batches, have proved limited, especially with complex cell image data. To overcome this, "Cyto-Morphology Adversarial Distillation" (CytoMAD), a self-supervised multi-task learning strategy that distills biologically relevant cellular morphological information from batch variations, is introduced to enable integrated analysis across multiple data batches without complex data assumptions or extensive manual annotation. Unique to CytoMAD is its "morphology distillation", symbiotically paired with deep-learning image-contrast translation, offering additional interpretable insights into label-free cell morphology. The versatile efficacy of CytoMAD is demonstrated in augmenting the power of biophysical imaging cytometry. It allows integrated label-free classification of human lung cancer cell types and accurately recapitulates their progressive drug responses, even when trained without the drug concentration information. CytoMAD also allows joint analysis of tumor biophysical cellular heterogeneity, linked to epithelial-mesenchymal plasticity, that standard fluorescence markers overlook. CytoMAD can substantiate the wide adoption of biophysical cytometry for cost-effective diagnosis and screening.
Affiliation(s)
- Michelle C K Lo
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 000000, Hong Kong
  - Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, New Territories, Hong Kong, 000000, Hong Kong
- Dickson M D Siu
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 000000, Hong Kong
  - Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, New Territories, Hong Kong, 000000, Hong Kong
- Kelvin C M Lee
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 000000, Hong Kong
  - Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, New Territories, Hong Kong, 000000, Hong Kong
- Justin S J Wong
  - Conzeb Limited, Hong Kong Science Park, New Territories, Hong Kong, 000000, Hong Kong
- Maximus C F Yeung
  - Department of Pathology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam Road, Hong Kong, 000000, Hong Kong
- Michael K Y Hsin
  - Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam Road, Hong Kong, 000000, Hong Kong
- James C M Ho
  - Department of Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam Road, Hong Kong, 000000, Hong Kong
- Kevin K Tsia
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 000000, Hong Kong
  - Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, New Territories, Hong Kong, 000000, Hong Kong
5. Shpigler A, Kolet N, Golan S, Weisbart E, Zaritsky A. Anomaly detection for high-content image-based phenotypic cell profiling. bioRxiv 2024:2024.06.01.595856. [PMID: 38895267 PMCID: PMC11185510 DOI: 10.1101/2024.06.01.595856] [Indexed: 06/21/2024]
Abstract
High-content image-based phenotypic profiling combines automated microscopy and analysis to identify phenotypic alterations in cell morphology and provide insight into the cell's physiological state. Classical representations of the phenotypic profile cannot capture the full underlying complexity in cell organization, while recent weakly supervised machine-learning-based representation-learning methods are hard to interpret biologically. We used the abundance of control wells to learn the in-distribution of control experiments and used it to formulate a self-supervised reconstruction anomaly-based representation that encodes the intricate morphological inter-feature dependencies while preserving the representation interpretability. The performance of our anomaly-based representations was evaluated for downstream tasks with respect to two classical representations across four public Cell Painting datasets. Anomaly-based representations improved reproducibility and Mechanism of Action classification, and complemented classical representations. Unsupervised explainability of autoencoder-based anomalies identified specific inter-feature dependencies causing anomalies. The general concept of anomaly-based representations can be adapted to other applications in cell biology.
Affiliation(s)
- Alon Shpigler
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Naor Kolet
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Shahar Golan
  - Department of Computer Science, Jerusalem College of Technology, 91160 Jerusalem, Israel
- Erin Weisbart
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge (MA), USA
- Assaf Zaritsky
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
6. Rafelski SM, Theriot JA. Establishing a conceptual framework for holistic cell states and state transitions. Cell 2024; 187:2633-2651. [PMID: 38788687 DOI: 10.1016/j.cell.2024.04.035] [Received: 03/13/2024] [Revised: 04/10/2024] [Accepted: 04/24/2024] [Indexed: 05/26/2024]
Abstract
Cell states were traditionally defined by how they looked, where they were located, and what functions they performed. In this post-genomic era, the field is largely focused on a molecular view of cell state. Moving forward, we anticipate that the observables used to define cell states will evolve again as single-cell imaging and analytics are advancing at a breakneck pace via the collection of large-scale, systematic cell image datasets and the application of quantitative image-based data science methods. This is, therefore, a key moment in the arc of cell biological research to develop approaches that integrate the spatiotemporal observables of the physical structure and organization of the cell with molecular observables toward the concept of a holistic cell state. In this perspective, we propose a conceptual framework for holistic cell states and state transitions that is data-driven, practical, and useful to enable integrative analyses and modeling across many data types.
Affiliation(s)
- Susanne M Rafelski
  - Allen Institute for Cell Science, 615 Westlake Avenue N, Seattle, WA 98125, USA
- Julie A Theriot
  - Department of Biology and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
7. Kilgore HR, Chinn I, Mikhael PG, Mitnikov I, Van Dongen C, Zylberberg G, Afeyan L, Banani S, Wilson-Hawken S, Lee TI, Barzilay R, Young RA. Protein codes promote selective subcellular compartmentalization. bioRxiv 2024:2024.04.15.589616. [PMID: 38659952 PMCID: PMC11042338 DOI: 10.1101/2024.04.15.589616] [Indexed: 04/26/2024]
Abstract
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must efficiently assemble. Here, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. A protein language model, ProtGPS, was developed that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in targeted subcellular compartments. ProtGPS also identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized code governing their distribution in specific cellular compartments.
Affiliation(s)
- Henry R. Kilgore
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Itamar Chinn
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Peter G. Mikhael
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Ilan Mitnikov
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Guy Zylberberg
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Lena Afeyan
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Salman Banani
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Susana Wilson-Hawken
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Program of Computational & Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tong Ihn Lee
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Regina Barzilay
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Richard A. Young
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
8. Reicher A, Reiniš J, Ciobanu M, Růžička P, Malik M, Siklos M, Kartysh V, Tomek T, Koren A, Rendeiro AF, Kubicek S. Pooled multicolour tagging for visualizing subcellular protein dynamics. Nat Cell Biol 2024; 26:745-756. [PMID: 38641660 PMCID: PMC11098740 DOI: 10.1038/s41556-024-01407-w] [Received: 07/25/2023] [Accepted: 03/18/2024] [Indexed: 04/21/2024]
Abstract
Imaging-based methods are widely used for studying the subcellular localization of proteins in living cells. While routine for individual proteins, global monitoring of protein dynamics following perturbation typically relies on arrayed panels of fluorescently tagged cell lines, limiting throughput and scalability. Here, we describe a strategy that combines high-throughput microscopy, computer vision and machine learning to detect perturbation-induced changes in multicolour tagged visual proteomics cell (vpCell) pools. We use genome-wide and cancer-focused intron-targeting sgRNA libraries to generate vpCell pools and a large, arrayed collection of clones each expressing two different endogenously tagged fluorescent proteins. Individual clones can be identified in vpCell pools by image analysis using the localization patterns and expression level of the tagged proteins as visual barcodes, enabling simultaneous live-cell monitoring of large sets of proteins. To demonstrate broad applicability and scale, we test the effects of antiproliferative compounds on a pool with cancer-related proteins, on which we identify widespread protein localization changes and new inhibitors of the nuclear import/export machinery. The time-resolved characterization of changes in subcellular localization and abundance of proteins upon perturbation in a pooled format highlights the power of the vpCell approach for drug discovery and mechanism-of-action studies.
Affiliation(s)
- Andreas Reicher
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Jiří Reiniš
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Maria Ciobanu
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Pavel Růžička
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Monika Malik
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Marton Siklos
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Victoria Kartysh
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Tatjana Tomek
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Anna Koren
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- André F Rendeiro
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Stefan Kubicek
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
9. Razdaibiedina A, Brechalov A, Friesen H, Mattiazzi Usaj M, Masinas MPD, Garadi Suresh H, Wang K, Boone C, Ba J, Andrews B. PIFiA: self-supervised approach for protein functional annotation from single-cell imaging data. Mol Syst Biol 2024; 20:521-548. [PMID: 38472305 PMCID: PMC11066028 DOI: 10.1038/s44320-024-00029-6] [Received: 08/15/2023] [Revised: 02/27/2024] [Accepted: 02/28/2024] [Indexed: 03/14/2024] Open
Abstract
Fluorescence microscopy data describe protein localization patterns at single-cell resolution and have the potential to reveal whole-proteome functional information with remarkable precision. Yet, extracting biologically meaningful representations from cell micrographs remains a major challenge. Existing approaches often fail to learn robust and noise-invariant features or rely on supervised labels for accurate annotations. We developed PIFiA (Protein Image-based Functional Annotation), a self-supervised approach for protein functional annotation from single-cell imaging data. We imaged the global yeast ORF-GFP collection and applied PIFiA to generate protein feature profiles from single-cell images of fluorescently tagged proteins. We show that PIFiA outperforms existing approaches for molecular representation learning and describe a range of downstream analysis tasks to explore the information content of the feature profiles. Specifically, we cluster extracted features into a hierarchy of functional organization, study cell population heterogeneity, and develop techniques to distinguish multi-localizing proteins and identify functional modules. Finally, we confirm new PIFiA predictions using a colocalization assay, suggesting previously unappreciated biological roles for several proteins. Paired with a fully interactive website (https://thecellvision.org/pifia/), PIFiA is a resource for the quantitative analysis of protein organization within the cell.
Affiliation(s)
- Anastasia Razdaibiedina
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Alexander Brechalov
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Helena Friesen
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Mojca Mattiazzi Usaj
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Chemistry and Biology, Toronto Metropolitan University, Toronto, ON, Canada
- Kyle Wang
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Charles Boone
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama, Japan
- Jimmy Ba
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Brenda Andrews
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
10. Cui X, Chen X, Li Z, Gao Z, Chen S, Jiang R. Discrete latent embedding of single-cell chromatin accessibility sequencing data for uncovering cell heterogeneity. Nat Comput Sci 2024; 4:346-359. [PMID: 38730185 DOI: 10.1038/s43588-024-00625-4] [Received: 09/25/2023] [Accepted: 04/05/2024] [Indexed: 05/12/2024]
Abstract
Single-cell epigenomic data have been growing continuously at an unprecedented pace, but their characteristics, such as high dimensionality and sparsity, pose substantial challenges to downstream analysis. Although deep learning models, especially variational autoencoders, have been widely used to capture low-dimensional feature embeddings, the prevalent Gaussian assumption somewhat disagrees with real data, and these models tend to struggle to incorporate reference information from abundant cell atlases. Here we propose CASTLE, a deep generative model based on the vector-quantized variational autoencoder framework to extract discrete latent embeddings that interpretably characterize single-cell chromatin accessibility sequencing data. We validate the performance and robustness of CASTLE for accurate cell-type identification and reasonable visualization compared with state-of-the-art methods. We demonstrate the advantages of CASTLE for effective incorporation of existing massive reference datasets in a weakly supervised or supervised manner. We further demonstrate CASTLE's capacity for intuitively distilling cell-type-specific feature spectra that unveil cell heterogeneity and biological implications quantitatively.
Affiliation(s)
- Xuejian Cui
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Xiaoyang Chen
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Zhen Li
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Zijing Gao
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Shengquan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Rui Jiang
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
11. Bhushan V, Nita-Lazar A. Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology. J Proteome Res 2024. [PMID: 38451675 DOI: 10.1021/acs.jproteome.3c00839] [Indexed: 03/08/2024]
Abstract
The mammalian cell is a complex entity, with membrane-bound and membrane-less organelles playing vital roles in regulating cellular homeostasis. Organellar protein niches drive discrete biological processes and cell functions, thus maintaining cell equilibrium. Cellular processes such as signaling, growth, proliferation, motility, and programmed cell death require dynamic protein movements between cell compartments. Aberrant protein localization is associated with a wide range of diseases. Therefore, analyzing the subcellular proteome of the cell can provide a comprehensive overview of cellular biology. With recent advancements in mass spectrometry, imaging technology, computational tools, and deep machine learning algorithms, studies pertaining to subcellular protein localization and their dynamic distributions are gaining momentum. These studies reveal changing interaction networks because of "moonlighting proteins" and serve as a discovery tool for disease network mechanisms. Consequently, this review aims to provide a comprehensive repository for recent advancements in subcellular proteomics subcontexting methods, challenges, and future perspectives for method developers. In summary, subcellular proteomics is crucial to the understanding of the fundamental cellular mechanisms and the associated diseases.
Affiliation(s)
- Vanya Bhushan
  - Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Aleksandra Nita-Lazar
  - Functional Cellular Networks Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
12. Burgess J, Nirschl JJ, Zanellati MC, Lozano A, Cohen S, Yeung-Levy S. Orientation-invariant autoencoders learn robust representations for shape profiling of cells and organelles. Nat Commun 2024; 15:1022. [PMID: 38310122 PMCID: PMC10838319 DOI: 10.1038/s41467-024-45362-4] [Received: 07/09/2023] [Accepted: 01/19/2024] [Indexed: 02/05/2024] Open
Abstract
Cell and organelle shape are driven by diverse genetic and environmental factors and thus accurate quantification of cellular morphology is essential to experimental cell biology. Autoencoders are a popular tool for unsupervised biological image analysis because they learn a low-dimensional representation that maps images to feature vectors to generate a semantically meaningful embedding space of morphological variation. The learned feature vectors can also be used for clustering, dimensionality reduction, outlier detection, and supervised learning problems. Shape properties do not change with orientation, and thus we argue that representation learning methods should encode this orientation invariance. We show that conventional autoencoders are sensitive to orientation, which can lead to suboptimal performance on downstream tasks. To address this, we develop O2-variational autoencoder (O2-VAE), an unsupervised method that learns robust, orientation-invariant representations. We use O2-VAE to discover morphology subgroups in segmented cells and mitochondria, detect outlier cells, and rapidly characterise cellular shape and texture in large datasets, including in a newly generated synthetic benchmark.
Affiliation(s)
- James Burgess
  - Institute for Computational & Mathematical Engineering, Stanford University, Stanford, CA, USA
- Jeffrey J Nirschl
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Maria-Clara Zanellati
  - Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Alejandro Lozano
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Sarah Cohen
  - Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Serena Yeung-Levy
  - Departments of Biomedical Data Science, Computer Science, and Electrical Engineering, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub - San Francisco, San Francisco, CA, USA
  - Clinical Excellence Research Center, School of Medicine, Stanford University, Stanford, CA, USA
13
Zingman I, Stierstorfer B, Lempp C, Heinemann F. Learning image representations for anomaly detection: Application to discovery of histological alterations in drug development. Med Image Anal 2024; 92:103067. [PMID: 38141454 DOI: 10.1016/j.media.2023.103067] [Received: 10/21/2022] [Revised: 12/01/2023] [Accepted: 12/19/2023] [Indexed: 12/25/2023]
Abstract
We present a system for anomaly detection in histopathological images. In histology, normal samples are usually abundant, whereas anomalous (pathological) cases are scarce or not available. Under such settings, one-class classifiers trained on healthy data can detect out-of-distribution anomalous samples. Such approaches combined with pre-trained Convolutional Neural Network (CNN) representations of images were previously employed for anomaly detection (AD). However, pre-trained off-the-shelf CNN representations may not be sensitive to abnormal conditions in tissues, while natural variations of healthy tissue may result in distant representations. To adapt representations to relevant details in healthy tissue we propose training a CNN on an auxiliary task that discriminates healthy tissue of different species, organs, and staining reagents. Almost no additional labeling workload is required, since healthy samples come automatically with aforementioned labels. During training we enforce compact image representations with a center-loss term, which further improves representations for AD. The proposed system outperforms established AD methods on a published dataset of liver anomalies. Moreover, it provided comparable results to conventional methods specifically tailored for quantification of liver anomalies. We show that our approach can be used for toxicity assessment of candidate drugs at early development stages and thereby may reduce expensive late-stage drug attrition.
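The center-loss idea mentioned above can be sketched in a few lines. The sketch assumes the commonly used form of center loss, half the mean squared Euclidean distance between each feature vector and the center of its class. The abstract does not give the paper's exact formulation, weighting, or center-update rule, so the functions `class_centers` and `center_loss`, the class labels, and the toy features are all hypothetical. Centers are taken here as plain class means, whereas in training they would typically be updated alongside the network.

```python
def class_centers(features, labels):
    """Mean feature vector per class (assumption: centers = class means)."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def center_loss(features, labels, centers):
    """Half the mean squared distance of each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((a - b) ** 2 for a, b in zip(x, c))
    return 0.5 * total / len(features)


# Hypothetical auxiliary-task labels (for example, organ of origin).
labels = ["liver", "liver", "kidney", "kidney"]

# Compact representation: every sample sits on its class center.
compact = [[1.0, 1.0], [1.0, 1.0], [5.0, 5.0], [5.0, 5.0]]
# Spread representation: samples scatter around the same centers.
spread = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]

compact_loss = center_loss(compact, labels, class_centers(compact, labels))
spread_loss = center_loss(spread, labels, class_centers(spread, labels))
```

The compact features give a loss of zero and the spread features a larger one, so minimizing this term alongside the auxiliary classification loss pushes healthy-tissue representations of the same class together.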
Affiliation(s)
- Igor Zingman
  - Drug Discovery Sciences, Boehringer Ingelheim Pharma GmbH and Co., Biberach an der Riß, Germany
- Birgit Stierstorfer
  - Non-Clinical Drug Safety, Boehringer Ingelheim Pharma GmbH and Co., Biberach an der Riß, Germany
- Charlotte Lempp
  - Drug Discovery Sciences, Boehringer Ingelheim Pharma GmbH and Co., Biberach an der Riß, Germany
- Fabian Heinemann
  - Drug Discovery Sciences, Boehringer Ingelheim Pharma GmbH and Co., Biberach an der Riß, Germany
14
Xun D, Wang R, Zhang X, Wang Y. Microsnoop: A generalist tool for microscopy image representation. Innovation (N Y) 2024; 5:100541. [PMID: 38235187 PMCID: PMC10794109 DOI: 10.1016/j.xinn.2023.100541] [Received: 05/31/2023] [Accepted: 11/17/2023] [Indexed: 01/19/2024] [Open access]
Abstract
Accurate profiling of microscopy images from small scale to high throughput is an essential procedure in basic and applied biological research. Here, we present Microsnoop, a novel deep learning-based representation tool trained on large-scale microscopy images using masked self-supervised learning. Microsnoop can process various complex and heterogeneous images, and we classified images into three categories: single-cell, full-field, and batch-experiment images. Our benchmark study on 10 high-quality evaluation datasets, containing over 2,230,000 images, demonstrated Microsnoop's robust and state-of-the-art microscopy image representation ability, surpassing existing generalist and even several custom algorithms. Microsnoop can be integrated with other pipelines to perform tasks such as superresolution histopathology image and multimodal analysis. Furthermore, Microsnoop can be adapted to various hardware and can be easily deployed on local or cloud computing platforms. We will regularly retrain and reevaluate the model using community-contributed data to consistently improve Microsnoop.
Affiliation(s)
- Dejin Xun
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Rui Wang
  - State Key Lab of Computer-Aided Design & Computer Graphics, Zhejiang University, Hangzhou 310058, China
- Xingcai Zhang
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
- Yi Wang
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China
  - National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
15
Chen S, Yin Y, Pang X, Wang C, Wang L, Wang J, Jia J, Liu X, Xu S, Luo X. Light and endogenous enzyme triggered plasmonic antennas for accurate subcellular molecular imaging with enhanced spatial resolution. Chem Sci 2024; 15:566-572. [PMID: 38179540 PMCID: PMC10762929 DOI: 10.1039/d3sc05728c] [Received: 10/26/2023] [Accepted: 11/30/2023] [Indexed: 01/06/2024] [Open access]
Abstract
Developing accurate tumor-specific molecular imaging approaches holds great potential for evaluating cancer progression. However, traditional molecular imaging approaches still suffer from restricted tumor specificity due to the "off-tumor" signal leakage. In this work, we proposed light and endogenous APE1-triggered plasmonic antennas for accurate tumor-specific subcellular molecular imaging with enhanced spatial resolution. Light activation ensures subcellular molecular imaging and endogenous enzyme activation ensures tumor-specific molecular imaging. In addition, combined with the introduction of plasmon enhanced fluorescence (PEF), off-tumor signal leakage at the subcellular level was effectively reduced, resulting in the significantly enhanced discrimination ratio of tumor/normal cells (∼11.57-fold) which is better than in previous reports, demonstrating great prospects of these plasmonic antennas triggered by light and endogenous enzymes for tumor-specific molecular imaging at the subcellular level.
Affiliation(s)
- Shuwei Chen
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Yue Yin
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Xiaozhe Pang
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Congkai Wang
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Lei Wang
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Junqi Wang
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Jiangfei Jia
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Xinxue Liu
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Shenghao Xu
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
- Xiliang Luo
  - Key Laboratory of Optic-Electric Sensing and Analytical Chemistry for Life Science, MOE, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, P. R. China
16
Lu AX, Moses AM. Using Dimensionality Reduction to Visualize Phenotypic Changes in High-Throughput Microscopy. Methods Mol Biol 2024; 2800:217-229. [PMID: 38709487 DOI: 10.1007/978-1-0716-3834-7_15] [Indexed: 05/07/2024]
Abstract
High-throughput microscopy has enabled screening of cell phenotypes at unprecedented scale. Systematic identification of cell phenotype changes (such as cell morphology and protein localization changes) is a major analysis goal. Because cell phenotypes are high-dimensional, unbiased approaches to detect and visualize the changes in phenotypes are still needed. Here, we suggest that changes in cellular phenotype can be visualized in reduced dimensionality representations of the image feature space. We describe a freely available analysis pipeline to visualize changes in protein localization in feature spaces obtained from deep learning. As an example, we use the pipeline to identify changes in subcellular localization after the yeast GFP collection was treated with hydroxyurea.
Affiliation(s)
- Alex X Lu
  - Microsoft Research New England, Cambridge, MA, USA
- Alan M Moses
  - Department of Cell & Systems Biology, University of Toronto, Toronto, Canada
17
Schrader E, Ali HR. Charting multicellular tissue structure cell-to-cell. Nat Genet 2024; 56:14-15. [PMID: 38135722 DOI: 10.1038/s41588-023-01624-3] [Indexed: 12/24/2023]
Affiliation(s)
- Ellen Schrader
  - CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
- H Raza Ali
  - CRUK Cambridge Institute, University of Cambridge, Cambridge, UK
  - Department of Histopathology, Addenbrookes Hospital, Cambridge, UK
18
Xu L, Kan S, Yu X, Liu Y, Fu Y, Peng Y, Liang Y, Cen Y, Zhu C, Jiang W. Deep learning enables stochastic optical reconstruction microscopy-like superresolution image reconstruction from conventional microscopy. iScience 2023; 26:108145. [PMID: 37867953 PMCID: PMC10587619 DOI: 10.1016/j.isci.2023.108145] [Received: 04/24/2023] [Revised: 08/05/2023] [Accepted: 10/02/2023] [Indexed: 10/24/2023] [Open access]
Abstract
Despite its remarkable potential for transforming low-resolution images, deep learning faces significant challenges in achieving high-quality superresolution microscopy imaging from wide-field (conventional) microscopy. Here, we present X-Microscopy, a computational tool comprising two deep learning subnets, UR-Net-8 and X-Net, which enables STORM-like superresolution microscopy image reconstruction from wide-field images with input-size flexibility. X-Microscopy was trained using samples of various subcellular structures, including cytoskeletal filaments, dot-like, beehive-like, and nanocluster-like structures, to generate prediction models capable of producing images of comparable quality to STORM-like images. In addition to enabling multicolour superresolution image reconstructions, X-Microscopy also facilitates superresolution image reconstruction from different conventional microscopic systems. The capabilities of X-Microscopy offer promising prospects for making superresolution microscopy accessible to a broader range of users, going beyond the confines of well-equipped laboratories.
Affiliation(s)
- Lei Xu
  - Department of Etiology and Carcinogenesis and State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
  - Key Laboratory of Molecular and Cellular Systems Biology, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Shichao Kan
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Xiying Yu
  - Department of Etiology and Carcinogenesis and State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
- Ye Liu
  - HAMD (Ningbo) Intelligent Medical Technology Co., Ltd, Ningbo 315194, China
- Yuxia Fu
  - Department of Etiology and Carcinogenesis and State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
- Yiqiang Peng
  - HAMD (Ningbo) Intelligent Medical Technology Co., Ltd, Ningbo 315194, China
- Yanhui Liang
  - HAMD (Ningbo) Intelligent Medical Technology Co., Ltd, Ningbo 315194, China
- Yigang Cen
  - Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
- Changjun Zhu
  - Key Laboratory of Molecular and Cellular Systems Biology, College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Wei Jiang
  - Department of Etiology and Carcinogenesis and State Key Laboratory of Molecular Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
19
Yue T, Wang Y, Zhang L, Gu C, Xue H, Wang W, Lyu Q, Dun Y. Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models. Int J Mol Sci 2023; 24:15858. [PMID: 37958843 PMCID: PMC10649223 DOI: 10.3390/ijms242115858] [Received: 09/30/2023] [Revised: 10/24/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] [Open access]
Abstract
The data explosion driven by advancements in genomic research, such as high-throughput sequencing techniques, is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in various fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning, since we expect a superhuman intelligence that explores beyond our knowledge to interpret the genome from deep learning. A powerful deep learning model should rely on the insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with proper deep learning-based architecture, and we remark on practical considerations of developing deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research and point out current challenges and potential research directions for future genomics applications. We believe the collaborative use of ever-growing diverse data and the fast iteration of deep learning models will continue to contribute to the future of genomics.
Affiliation(s)
- Tianwei Yue
  - School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yuanxin Wang
  - School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Longxiang Zhang
  - School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Chunming Gu
  - Department of Biomedical Engineering, School of Medicine, Johns Hopkins University, Baltimore, MD 21218, USA
- Haoru Xue
  - The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Wenping Wang
  - School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Qi Lyu
  - Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yujie Dun
  - School of Information and Communications Engineering, Xi’an Jiaotong University, Xi’an 710049, China
20
Aleksandrovych M, Strassberg M, Melamed J, Xu M. Polarization differential interference contrast microscopy with physics-inspired plug-and-play denoiser for single-shot high-performance quantitative phase imaging. Biomed Opt Express 2023; 14:5833-5850. [PMID: 38021115 PMCID: PMC10659786 DOI: 10.1364/boe.499316] [Received: 06/30/2023] [Revised: 08/31/2023] [Accepted: 09/15/2023] [Indexed: 12/01/2023]
Abstract
We present single-shot high-performance quantitative phase imaging with a physics-inspired plug-and-play denoiser for polarization differential interference contrast (PDIC) microscopy. The quantitative phase is recovered by the alternating direction method of multipliers (ADMM), balancing total variance regularization and a pre-trained dense residual U-net (DRUNet) denoiser. The custom DRUNet uses the Tanh activation function to guarantee the symmetry requirement for phase retrieval. In addition, we introduce an adaptive strategy accelerating convergence and explicitly incorporating measurement noise. After validating this deep denoiser-enhanced PDIC microscopy on simulated data and phantom experiments, we demonstrated high-performance phase imaging of histological tissue sections. The phase retrieval by the denoiser-enhanced PDIC microscopy achieves significantly higher quality and accuracy than the solution based on Fourier transforms or the iterative solution with total variance regularization alone.
Affiliation(s)
- Mariia Aleksandrovych
  - Dept. of Physics and Astronomy, Hunter College and the Graduate Center, The City University of New York, 695 Park Ave, New York, NY 10065, USA
- Mark Strassberg
  - Dept. of Physics and Astronomy, Hunter College and the Graduate Center, The City University of New York, 695 Park Ave, New York, NY 10065, USA
- Jonathan Melamed
  - Department of Pathology, New York University Langone School of Medicine, New York, NY 10016, USA
- Min Xu
  - Dept. of Physics and Astronomy, Hunter College and the Graduate Center, The City University of New York, 695 Park Ave, New York, NY 10065, USA
21
Kim S, Kamarulzaman L, Taniguchi Y. Recent methodological advances towards single-cell proteomics. Proc Jpn Acad Ser B Phys Biol Sci 2023; 99:306-327. [PMID: 37673661 PMCID: PMC10749393 DOI: 10.2183/pjab.99.021] [Received: 02/14/2023] [Accepted: 07/20/2023] [Indexed: 09/08/2023]
Abstract
Studying the central dogma at the single-cell level has gained increasing attention to reveal hidden cell lineages and functions that cannot be studied using traditional bulk analyses. Nonetheless, most single-cell studies exploiting genomic and transcriptomic levels fail to address information on proteins that are central to many important biological processes. Single-cell proteomics enables understanding of the functional status of individual cells and is particularly crucial when the specimen is composed of heterogeneous entities of cells. With the growing importance of this field, significant methodological advancements have emerged recently. These include miniaturized and automated sample preparation, multi-omics analyses, and combined analyses of multiple techniques such as mass spectrometry and microscopy. Moreover, artificial intelligence and single-molecule detection technologies have advanced throughput and improved sensitivity limitations, respectively, over conventional methods. In this review, we summarize cutting-edge methodologies for single-cell proteomics and relevant emerging technologies that have been reported in the last 5 years, and provide an outlook on this research field.
Affiliation(s)
- Sooyeon Kim
  - Laboratory for Cell Systems Control, Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan
  - Institute for Integrated Cell-Material Sciences (iCeMS), Kyoto University, Sakyo-ku, Kyoto, Japan
- Latiefa Kamarulzaman
  - Laboratory for Cell Systems Control, Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan
  - Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, Japan
- Yuichi Taniguchi
  - Laboratory for Cell Systems Control, Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan
  - Institute for Integrated Cell-Material Sciences (iCeMS), Kyoto University, Sakyo-ku, Kyoto, Japan
  - Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, Japan
22
Wang J, Horlacher M, Cheng L, Winther O. RNA trafficking and subcellular localization-a review of mechanisms, experimental and predictive methodologies. Brief Bioinform 2023; 24:bbad249. [PMID: 37466130 PMCID: PMC10516376 DOI: 10.1093/bib/bbad249] [Received: 03/17/2023] [Revised: 05/30/2023] [Accepted: 06/16/2023] [Indexed: 07/20/2023] [Open access]
Abstract
RNA localization is essential for regulating spatial translation, where RNAs are trafficked to their target locations via various biological mechanisms. In this review, we discuss RNA localization in the context of molecular mechanisms, experimental techniques and machine learning-based prediction tools. Three main types of molecular mechanisms that control the localization of RNA to distinct cellular compartments are reviewed, including directed transport, protection from mRNA degradation, as well as diffusion and local entrapment. Advances in experimental methods, both image and sequence based, provide substantial data resources, which allow for the design of powerful machine learning models to predict RNA localizations. We review the publicly available predictive tools to serve as a guide for users and inspire developers to build more effective prediction models. Finally, we provide an overview of multimodal learning, which may provide a new avenue for the prediction of RNA localization.
Affiliation(s)
- Jun Wang
  - Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
- Marc Horlacher
  - Computational Health Center, Helmholtz Center, Munich, Germany
- Lixin Cheng
  - Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China
- Ole Winther
  - Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
  - Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark
  - Section for Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby 2800, Denmark
23
Zhang H, AbdulJabbar K, Grunewald T, Akarca AU, Hagos Y, Sobhani F, Lecat CSY, Patel D, Lee L, Rodriguez-Justo M, Yong K, Ledermann JA, Le Quesne J, Hwang ES, Marafioti T, Yuan Y. Self-supervised deep learning for highly efficient spatial immunophenotyping. EBioMedicine 2023; 95:104769. [PMID: 37672979 PMCID: PMC10493897 DOI: 10.1016/j.ebiom.2023.104769] [Received: 02/14/2023] [Revised: 08/07/2023] [Accepted: 08/08/2023] [Indexed: 09/08/2023] [Open access]
Abstract
BACKGROUND: Efficient biomarker discovery and clinical translation depend on the fast and accurate analytical output from crucial technologies such as multiplex imaging. However, reliable cell classification often requires extensive annotations. Label-efficient strategies are urgently needed to reveal diverse cell distribution and spatial interactions in large-scale multiplex datasets.
METHODS: This study proposed Self-supervised Learning for Antigen Detection (SANDI) for accurate cell phenotyping while mitigating the annotation burden. The model first learns intrinsic pairwise similarities in unlabelled cell images, followed by a classification step to map learnt features to cell labels using a small set of annotated references. We acquired four multiplex immunohistochemistry datasets and one imaging mass cytometry dataset, comprising 2825 to 15,258 single-cell images to train and test the model.
FINDINGS: With 1% annotations (18-114 cells), SANDI achieved weighted F1-scores ranging from 0.82 to 0.98 across the five datasets, which was comparable to the fully supervised classifier trained on 1828-11,459 annotated cells (-0.002 to -0.053 of averaged weighted F1-score, Wilcoxon rank-sum test, P = 0.31). Leveraging the immune checkpoint markers stained in ovarian cancer slides, SANDI-based cell identification reveals spatial expulsion between PD1-expressing T helper cells and T regulatory cells, suggesting an interplay between PD1 expression and T regulatory cell-mediated immunosuppression.
INTERPRETATION: By striking a fine balance between minimal expert guidance and the power of deep learning to learn similarity within abundant data, SANDI presents new opportunities for efficient, large-scale learning for histology multiplex imaging data.
FUNDING: This study was funded by the Royal Marsden/ICR National Institute of Health Research Biomedical Research Centre.
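The second stage described in the methods, mapping learnt features to cell labels with a handful of annotated references, can be sketched as nearest-reference classification. This is an assumption made for illustration: the abstract does not specify SANDI's network or classifier, so the cosine-similarity rule, the function names `cosine` and `classify`, the two-dimensional embeddings, and the labels `type_A` and `type_B` are all hypothetical. The embeddings stand in for the output of a self-supervised encoder and are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(embedding, references):
    """Assign the label of the most similar annotated reference.

    references: list of (embedding, label) pairs, i.e. the small
    annotated set (hypothetical stand-in for the ~1% annotations).
    """
    best = max(references, key=lambda ref: cosine(embedding, ref[0]))
    return best[1]


# Hypothetical embeddings from a self-supervised encoder.
references = [([1.0, 0.1], "type_A"), ([0.1, 1.0], "type_B")]
unlabelled = [[0.9, 0.2], [0.2, 0.8], [0.7, 0.1]]

predicted = [classify(e, references) for e in unlabelled]
```

Because the label lookup only needs one annotated embedding per class, the annotation effort in a scheme like this scales with the number of cell types rather than the number of cells.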
Affiliation(s)
- Hanyun Zhang
  - Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Khalid AbdulJabbar
  - Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Tami Grunewald
  - Department of Oncology, UCL Cancer Institute, University College London, London, UK
- Ayse U Akarca
  - Department of Cellular Pathology, University College London Hospital, London, UK
- Yeman Hagos
  - Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Faranak Sobhani
  - Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Catherine S Y Lecat
  - Research Department of Hematology, Cancer Institute, University College London, UK
- Dominic Patel
  - Research Department of Hematology, Cancer Institute, University College London, UK
- Lydia Lee
  - Research Department of Hematology, Cancer Institute, University College London, UK
- Kwee Yong
  - Research Department of Hematology, Cancer Institute, University College London, UK
- Jonathan A Ledermann
  - Department of Oncology, UCL Cancer Institute, University College London, London, UK
- John Le Quesne
  - School of Cancer Sciences, University of Glasgow, Glasgow, UK; CRUK Beatson Institute, Garscube Estate, Glasgow, UK; Department of Histopathology, Queen Elizabeth University Hospital, Glasgow, UK
- E Shelley Hwang
  - Department of Surgery, Duke University Medical Center, Durham, NC, USA
- Teresa Marafioti
  - Department of Cellular Pathology, University College London Hospital, London, UK
- Yinyin Yuan
  - Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
24
Gunawan I, Vafaee F, Meijering E, Lock JG. An introduction to representation learning for single-cell data analysis. Cell Rep Methods 2023; 3:100547. [PMID: 37671013 PMCID: PMC10475795 DOI: 10.1016/j.crmeth.2023.100547] [Indexed: 09/07/2023]
Abstract
Single-cell-resolved systems biology methods, including omics- and imaging-based measurement modalities, generate a wealth of high-dimensional data characterizing the heterogeneity of cell populations. Representation learning methods are routinely used to analyze these complex, high-dimensional data by projecting them into lower-dimensional embeddings. This facilitates the interpretation and interrogation of the structures, dynamics, and regulation of cell heterogeneity. Reflecting their central role in analyzing diverse single-cell data types, a myriad of representation learning methods exist, with new approaches continually emerging. Here, we contrast general features of representation learning methods spanning statistical, manifold learning, and neural network approaches. We consider key steps involved in representation learning with single-cell data, including data pre-processing, hyperparameter optimization, downstream analysis, and biological validation. Interdependencies and contingencies linking these steps are also highlighted. This overview is intended to guide researchers in the selection, application, and optimization of representation learning strategies for current and future single-cell research applications.
Affiliation(s)
- Ihuan Gunawan
  - School of Biomedical Sciences, Faculty of Medicine and Health, University of New South Wales, Sydney, NSW, Australia
  - School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, NSW, Australia
- Fatemeh Vafaee
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, University of New South Wales, Sydney, NSW, Australia
  - UNSW Data Science Hub, University of New South Wales, Sydney, NSW, Australia
- Erik Meijering
  - School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, NSW, Australia
- John George Lock
  - School of Biomedical Sciences, Faculty of Medicine and Health, University of New South Wales, Sydney, NSW, Australia
  - UNSW Data Science Hub, University of New South Wales, Sydney, NSW, Australia
  - Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia
| |
|
25
|
Sansbury SE, Serebrenik YV, Lapidot T, Burslem GM, Shalem O. Pooled tagging and hydrophobic targeting of endogenous proteins for unbiased mapping of unfolded protein responses. bioRxiv 2023:2023.07.13.548611. [PMID: 37503003 PMCID: PMC10370017 DOI: 10.1101/2023.07.13.548611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
System-level understanding of proteome organization and function requires methods for direct visualization and manipulation of proteins at scale. We developed an approach enabled by high-throughput gene tagging for the generation and analysis of complex cell pools with endogenously tagged proteins. Proteins are tagged with HaloTag to enable visualization or direct perturbation. Fluorescent labeling followed by in situ sequencing and deep learning-based image analysis identifies the localization pattern of each tag, providing a bird's-eye-view of cellular organization. Next, we use a hydrophobic HaloTag ligand to misfold tagged proteins, inducing spatially restricted proteotoxic stress that is read out by single cell RNA sequencing. By integrating optical and perturbation data, we map compartment-specific responses to protein misfolding, revealing inter-compartment organization and direct crosstalk, and assigning proteostasis functions to uncharacterized genes. Altogether, we present a powerful and efficient method for large-scale studies of proteome dynamics, function, and homeostasis.
|
26
|
|
27
|
Doron M, Moutakanni T, Chen ZS, Moshkov N, Caron M, Touvron H, Bojanowski P, Pernice WM, Caicedo JC. Unbiased single-cell morphology with self-supervised vision transformers. bioRxiv 2023:2023.06.16.545359. [PMID: 37398158 PMCID: PMC10312751 DOI: 10.1101/2023.06.16.545359] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 07/04/2023]
Abstract
Accurately quantifying cellular morphology at scale could substantially empower existing single-cell approaches. However, measuring cell morphology remains an active field of research, which has inspired multiple computer vision algorithms over the years. Here, we show that DINO, a vision-transformer based, self-supervised algorithm, has a remarkable ability for learning rich representations of cellular morphology without manual annotations or any other type of supervision. We evaluate DINO on a wide variety of tasks across three publicly available imaging datasets of diverse specifications and biological focus. We find that DINO encodes meaningful features of cellular morphology at multiple scales, from subcellular and single-cell resolution, to multi-cellular and aggregated experimental groups. Importantly, DINO successfully uncovers a hierarchy of biological and technical factors of variation in imaging datasets. The results show that DINO can support the study of unknown biological variation, including single-cell heterogeneity and relationships between samples, making it an excellent tool for image-based biological discovery.
Affiliation(s)
- Michael Doron
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Nikita Moshkov
- Synthetic and Systems Biology Unit, Biological Research Centre (BRC), Szeged, Hungary
| | | | | | | | - Wolfgang M. Pernice
- Department of Neurology, Columbia University Medical Center, New York, NY, USA
| | | |
|
28
|
Li J, Zou Q, Yuan L. A review from biological mapping to computation-based subcellular localization. Mol Ther Nucleic Acids 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify clues linking viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases (such as Alzheimer's disease and colon cancer). To date, only reviews addressing the subcellular localization of proteins have been published, and these are now outdated, while reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field. Extensive and complete methods for constructing subcellular localization models are also summarized, which can help readers understand the changes in the application of biotechnology and computer science in subcellular localization research and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to the transformation patterns, interrelationships, differences, and causality of protein and RNA subcellular localization.
Affiliation(s)
- Jing Li
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
| | - Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
| | - Lei Yuan
- Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100 Minjiang Main Road, Quzhou, Zhejiang 324000, China
| |
|
29
|
Spitzer H, Berry S, Donoghoe M, Pelkmans L, Theis FJ. Learning consistent subcellular landmarks to quantify changes in multiplexed protein maps. Nat Methods 2023:10.1038/s41592-023-01894-z. [PMID: 37248388 PMCID: PMC10333128 DOI: 10.1038/s41592-023-01894-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 05/12/2022] [Accepted: 04/25/2023] [Indexed: 05/31/2023]
Abstract
Highly multiplexed imaging holds enormous promise for understanding how spatial context shapes the activity of the genome and its products at multiple length scales. Here, we introduce a deep learning framework called CAMPA (Conditional Autoencoder for Multiplexed Pixel Analysis), which uses a conditional variational autoencoder to learn representations of molecular pixel profiles that are consistent across heterogeneous cell populations and experimental perturbations. Clustering these pixel-level representations identifies consistent subcellular landmarks, which can be quantitatively compared in terms of their size, shape, molecular composition and relative spatial organization. Using high-resolution multiplexed immunofluorescence, this reveals how subcellular organization changes upon perturbation of RNA synthesis, RNA processing or cell size, and uncovers links between the molecular composition of membraneless organelles and cell-to-cell variability in bulk RNA synthesis rates. By capturing interpretable cellular phenotypes, we anticipate that CAMPA will greatly accelerate the systematic mapping of multiscale atlases of biological organization to identify the rules by which context shapes physiology and disease.
Affiliation(s)
- Hannah Spitzer
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
| | - Scott Berry
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- EMBL Australia Node in Single Molecule Science, School of Biomedical Sciences, University of New South Wales, Sydney, New South Wales, Australia
| | - Mark Donoghoe
- Stats Central, Mark Wainwright Analytical Centre, University of New South Wales, Sydney, New South Wales, Australia
| | - Lucas Pelkmans
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- School of Computation, Information and Technology CIT, Technical University of Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| |
|
30
|
Husain SS, Ong EJ, Minskiy D, Bober-Irizar M, Irizar A, Bober M. Single-cell subcellular protein localisation using novel ensembles of diverse deep architectures. Commun Biol 2023; 6:489. [PMID: 37147530 PMCID: PMC10163260 DOI: 10.1038/s42003-023-04840-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 04/12/2023] [Indexed: 05/07/2023] Open
Abstract
Unravelling protein distributions within individual cells is vital to understanding their function and state and indispensable to developing new treatments. Here we present the Hybrid subCellular Protein Localiser (HCPL), which learns from weakly labelled data to robustly localise single-cell subcellular protein patterns. It comprises innovative DNN architectures exploiting wavelet filters and learnt parametric activations that successfully tackle drastic cell variability. HCPL features correlation-based ensembling of novel architectures that boosts performance and aids generalisation. Large-scale data annotation is made feasible by our AI-trains-AI approach, which determines the visual integrity of cells and emphasises reliable labels for efficient training. In the Human Protein Atlas context, we demonstrate that HCPL is best performing in the single-cell classification of protein localisation patterns. To better understand the inner workings of HCPL and assess its biological relevance, we analyse the contributions of each system component and dissect the emergent features from which the localisation predictions are derived.
Affiliation(s)
| | - Eng-Jon Ong
- CVSSP, University of Surrey, Guildford, GU27XH, Surrey, UK
| | - Dmitry Minskiy
- CVSSP, University of Surrey, Guildford, GU27XH, Surrey, UK
| | - Mikel Bober-Irizar
- CVSSP, University of Surrey, Guildford, GU27XH, Surrey, UK
- ForecomAI, London, W1W 5PF, UK
| | | | - Miroslaw Bober
- CVSSP, University of Surrey, Guildford, GU27XH, Surrey, UK
- ForecomAI, London, W1W 5PF, UK
| |
|
31
|
Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023; 12:foods12071498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023] Open
Abstract
Umami peptides enhance the umami taste of food and have good food processing properties, nutritional value, and numerous potential applications. Wet testing for the identification of umami peptides is a time-consuming and expensive process. Here, we report iUmami-DRLF, which applies a logistic regression (LR) method solely to features extracted from the peptide sequences by a pre-trained deep learning feature extraction method, unified representation (UniRep, based on a multiplicative LSTM). The findings demonstrate that deep representation learning significantly enhanced the models' ability to identify umami peptides, and their predictive precision, based solely on peptide sequence information. The newly validated taste sequences were also used to test iUmami-DRLF and other predictors, and the results indicate that iUmami-DRLF has better robustness and accuracy and remains valid at higher probability thresholds. The iUmami-DRLF method can aid further studies on enhancing the umami flavor of food to satisfy the need for an umami-flavored diet.
Affiliation(s)
- Jici Jiang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jiayu Li
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Junxian Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Wu Yuzhang Honors College, Sichuan University, Chengdu 610065, China
| | - Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| |
|
32
|
Dai W, Cui Y, Wang P, Wu H, Zhang L, Bian Y, Li Y, Li Y, Hu H, Zhao J, Xu D, Kong D, Wang Y, Xu L. Classification regularized dimensionality reduction improves ultrasound thyroid nodule diagnostic accuracy and inter-observer consistency. Comput Biol Med 2023; 154:106536. [PMID: 36708654 DOI: 10.1016/j.compbiomed.2023.106536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 12/20/2022] [Accepted: 01/10/2023] [Indexed: 01/13/2023]
Abstract
PROBLEM: Convolutional Neural Networks (CNNs) for medical image analysis usually output only a probability value, providing no further information about the original image or inter-relationships between different images. Dimensionality Reduction Techniques (DRTs) are used for visualization of high-dimensional medical image data, but they are not intended for discriminative classification analysis. AIM: We develop an interactive phenotype distribution field visualization system for medical images to accurately reflect the pathological characteristics of lesions and their similarity, to assist radiologists in diagnosis and medical research. METHODS: We propose a novel method, Classification Regularized Uniform Manifold Approximation and Projection (UMAP), referred to as CReUMAP, which combines the advantages of CNNs and DRTs. It projects the extracted feature vector, fused with the malignancy probability predicted by a CNN, to a two-dimensional space, and then applies a spatial segmentation classifier trained on 2614 ultrasound images for prediction of thyroid nodule malignancy and guidance to radiologists. RESULTS: The CReUMAP embedding correlates well with the TI-RADS categories of thyroid nodules. The parametric version, which embeds an external test dataset of 303 images in the presence of the training data with known pathological diagnosis, significantly improves the benign and malignant nodule diagnostic accuracy (p-value = 0.016) and confidence (p-value = 1.902 × 10⁻⁶) of eight radiologists of different experience levels, as well as their inter-observer agreement (kappa ≥ 0.75). CReUMAP achieves 90.8% accuracy, 92.1% sensitivity and 88.6% specificity on the test set. CONCLUSION: The CReUMAP embedding is well correlated with the pathological diagnosis of thyroid nodules, and helps radiologists achieve more accurate, confident and consistent diagnoses. It allows a medical center to generate its locally adapted embedding using an already-trained classification model in an updateable manner on an ever-growing local database, as long as the extracted feature vectors and predicted diagnostic probabilities of the corresponding classification model can be output.
Affiliation(s)
- Wenli Dai
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Yan Cui
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Peiyi Wang
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Hao Wu
- Department of Ultrasound, The Second Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China
| | - Lei Zhang
- Department of Ultrasound, The Second Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China
| | - Yeping Bian
- Department of Ultrasonography, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer, Chinese Academy of Sciences, Hangzhou, China
| | - Yingying Li
- Department of Special Examinations, Hangzhou Third People's Hospital, Hangzhou, China
| | - Yutao Li
- Department of Ultrasound, Hangzhou First People's Hospital Affiliated to Medical College of Zhejiang University, Hangzhou, China
| | - Hairong Hu
- Demetics Medical Technology, Hangzhou, China
| | - Jiaqi Zhao
- Department of Ultrasound, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Dong Xu
- Department of Ultrasonography, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer, Chinese Academy of Sciences, Hangzhou, China
| | - Dexing Kong
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China; Zhejiang Qiushi Institute for Mathematical Medicine, Hangzhou, China
| | - Yajuan Wang
- Department of Geriatric Medicine & Key Laboratory of Cardiovascular Proteomics of Shandong Province, Qilu Hospital of Shandong University, Jinan, China
| | - Lei Xu
- Zhejiang Qiushi Institute for Mathematical Medicine, Hangzhou, China
| |
|
33
|
Razdaibiedina A, Brechalov A, Friesen H, Usaj MM, Masinas MPD, Suresh HG, Wang K, Boone C, Ba J, Andrews B. PIFiA: Self-supervised Approach for Protein Functional Annotation from Single-Cell Imaging Data. bioRxiv 2023:2023.02.24.529975. [PMID: 36909656 PMCID: PMC10002629 DOI: 10.1101/2023.02.24.529975] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/03/2023]
Abstract
Fluorescence microscopy data describe protein localization patterns at single-cell resolution and have the potential to reveal whole-proteome functional information with remarkable precision. Yet, extracting biologically meaningful representations from cell micrographs remains a major challenge. Existing approaches often fail to learn robust and noise-invariant features or rely on supervised labels for accurate annotations. We developed PIFiA (Protein Image-based Functional Annotation), a self-supervised approach for protein functional annotation from single-cell imaging data. We imaged the global yeast ORF-GFP collection and applied PIFiA to generate protein feature profiles from single-cell images of fluorescently tagged proteins. We show that PIFiA outperforms existing approaches for molecular representation learning and describe a range of downstream analysis tasks to explore the information content of the feature profiles. Specifically, we cluster extracted features into a hierarchy of functional organization, study cell population heterogeneity, and develop techniques to distinguish multi-localizing proteins and identify functional modules. Finally, we confirm new PIFiA predictions using a colocalization assay, suggesting previously unappreciated biological roles for several proteins. Paired with a fully interactive website (https://thecellvision.org/pifia/), PIFiA is a resource for the quantitative analysis of protein organization within the cell.
Affiliation(s)
- Anastasia Razdaibiedina
- Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
- Vector Institute for Artificial Intelligence, Toronto ON, Canada
| | - Alexander Brechalov
- Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
| | - Helena Friesen
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
| | | | | | | | - Kyle Wang
- Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
| | - Charles Boone
- Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
- RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama, Japan
| | - Jimmy Ba
- Department of Computer Science, University of Toronto, Toronto ON, Canada
- Vector Institute for Artificial Intelligence, Toronto ON, Canada
| | - Brenda Andrews
- Department of Molecular Genetics, University of Toronto, Toronto ON, Canada
- The Donnelly Centre, University of Toronto, Toronto ON, Canada
| |
|
34
|
Mou M, Pan Z, Lu M, Sun H, Wang Y, Luo Y, Zhu F. Application of Machine Learning in Spatial Proteomics. J Chem Inf Model 2022; 62:5875-5895. [PMID: 36378082 DOI: 10.1021/acs.jcim.2c01161] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/16/2022]
Abstract
Spatial proteomics is an interdisciplinary field that investigates the localization and dynamics of proteins, and it has gained extensive attention in recent years, especially subcellular proteomics. Considerable evidence indicates that the subcellular localization of proteins is associated with various cellular processes and disease progression. Mass spectrometry (MS)-based and imaging-based experimental approaches have been developed to acquire large-scale spatial proteomic data. To allow the reliable analysis of increasingly complex spatial proteomics data, machine learning (ML) methods have been widely used in both MS-based and imaging-based spatial proteomic data analysis pipelines. Here, we comprehensively survey the applications of ML in spatial proteomics from the following aspects: (1) data resources for the spatial proteome are comprehensively introduced; (2) the roles of different ML algorithms in data analysis pipelines are elaborated; (3) successful applications of spatial proteomics and several analytical tools integrating ML methods are presented; (4) challenges existing in modern ML-based spatial proteomics studies are discussed. This review provides guidelines for researchers seeking to apply ML methods to analyze spatial proteomic data and can facilitate insightful understanding of cell biology as well as future research in the medical and drug discovery communities.
Affiliation(s)
- Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Mingkun Lu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Huaicheng Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yongchao Luo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| |
|
35
|
Funk L, Su KC, Ly J, Feldman D, Singh A, Moodie B, Blainey PC, Cheeseman IM. The phenotypic landscape of essential human genes. Cell 2022; 185:4634-4653.e22. [PMID: 36347254 PMCID: PMC10482496 DOI: 10.1016/j.cell.2022.10.017] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 06/07/2022] [Revised: 09/01/2022] [Accepted: 10/14/2022] [Indexed: 11/09/2022]
Abstract
Understanding the basis for cellular growth, proliferation, and function requires determining the roles of essential genes in diverse cellular processes, including visualizing their contributions to cellular organization and morphology. Here, we combined pooled CRISPR-Cas9-based functional screening of 5,072 fitness-conferring genes in human HeLa cells with microscopy-based imaging of DNA, the DNA damage response, actin, and microtubules. Analysis of >31 million individual cells identified measurable phenotypes for >90% of gene knockouts, implicating gene targets in specific cellular processes. Clustering of phenotypic similarities based on hundreds of quantitative parameters further revealed co-functional genes across diverse cellular activities, providing predictions for gene functions and associations. By conducting pooled live-cell screening of ∼450,000 cell division events for 239 genes, we additionally identified diverse genes with functional contributions to chromosome segregation. Our work establishes a resource detailing the consequences of disrupting core cellular processes that represents the functional landscape of essential human genes.
Affiliation(s)
- Luke Funk
- Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA; Harvard-MIT Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Kuan-Chung Su
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Jimmy Ly
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - David Feldman
- Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
| | - Avtar Singh
- Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
| | - Brittania Moodie
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Paul C Blainey
- Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Koch Institute for Integrative Cancer Research at MIT, Cambridge, MA 02142, USA
| | - Iain M Cheeseman
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| |
|
36
|
Multiple Parallel Fusion Network for Predicting Protein Subcellular Localization from Stimulated Raman Scattering (SRS) Microscopy Images in Living Cells. Int J Mol Sci 2022; 23:ijms231810827. [PMID: 36142736 PMCID: PMC9504098 DOI: 10.3390/ijms231810827] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/17/2022] [Revised: 09/10/2022] [Accepted: 09/13/2022] [Indexed: 11/23/2022] Open
Abstract
Stimulated Raman scattering (SRS) microscopy is a powerful tool for detailed, label-free recognition and investigation of the cellular and subcellular structures of living cells. Determining subcellular protein localization from cell-level SRS images is one of the basic goals of cell biology; it can not only provide useful clues to protein functions and biological processes but also help to prioritize and select appropriate targets for drug development. However, the bottleneck in predicting subcellular protein locations from SRS cell imaging lies in modeling the complicated relationships concealed beneath the original cell imaging data, owing to the spectral overlap of information from different protein molecules. In this work, a multiple parallel fusion network, MPFnetwork, is proposed to study subcellular locations from SRS images. This model uses a multiple parallel fusion model to construct feature representations and combines multiple nonlinear decomposing algorithms as the automated subcellular detection method. Our experimental results showed that the MPFnetwork could achieve a Dice correlation of over 0.93 between estimated and true fractions on SRS lung cancer cell datasets. In addition, we applied the MPFnetwork method to cell images for label-free prediction of several different subcellular components simultaneously, rather than using several fluorescent labels. These results open up a new method for the time-resolved study of subcellular components in different cells, especially cancer cells.
|
37
|
Baldus M. Biological solid-state NMR: Integrative across different scientific disciplines. J Struct Biol X 2022; 6:100075. [PMID: 36185734 PMCID: PMC9523391 DOI: 10.1016/j.yjsbx.2022.100075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/18/2022] [Accepted: 09/25/2022] [Indexed: 11/29/2022] Open
|