1
Mansouri K, Taylor K, Auerbach S, Ferguson S, Frawley R, Hsieh JH, Jahnke G, Kleinstreuer N, Mehta S, Moreira-Filho JT, Parham F, Rider C, Rooney AA, Wang A, Sutherland V. Unlocking the Potential of Clustering and Classification Approaches: Navigating Supervised and Unsupervised Chemical Similarity. Environmental Health Perspectives 2024; 132:85002. [PMID: 39106156 DOI: 10.1289/ehp14001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2024]
Abstract
BACKGROUND: The field of toxicology has witnessed substantial advancements in recent years, particularly with the adoption of new approach methodologies (NAMs) to understand and predict chemical toxicity. Class-based methods such as clustering and classification are key to NAMs development and application, aiding the understanding of hazard and risk concerns associated with groups of chemicals without additional laboratory work. Advances in computational chemistry, data generation and availability, and machine learning algorithms represent important opportunities for continued improvement of these techniques to optimize their utility for specific regulatory and research purposes. However, due to their intricacy, deep understanding and careful selection are imperative to align the adequate methods with their intended applications.
OBJECTIVES: This commentary aims to deepen the understanding of class-based approaches by elucidating the pivotal role of chemical similarity (structural and biological) in clustering and classification approaches (CCAs). It addresses the dichotomy between general end point-agnostic similarity, often entailing unsupervised analysis, and end point-specific similarity necessitating supervised learning. The goal is to highlight the nuances of these approaches, their applications, and common misuses.
DISCUSSION: Understanding similarity is pivotal in toxicological research involving CCAs. The effectiveness of these approaches depends on the right definition and measure of similarity, which varies based on context and objectives of the study. This choice is influenced by how chemical structures are represented and the respective labels indicating biological activity, if applicable. The distinction between unsupervised clustering and supervised classification methods is vital, requiring the use of end point-agnostic vs. end point-specific similarity definition. Separate use or combination of these methods requires careful consideration to prevent bias and ensure relevance for the goal of the study. Unsupervised methods use end point-agnostic similarity measures to uncover general structural patterns and relationships, aiding hypothesis generation and facilitating exploration of datasets without the need for predefined labels or explicit guidance. Conversely, supervised techniques demand end point-specific similarity to group chemicals into predefined classes or to train classification models, allowing accurate predictions for new chemicals. Misuse can arise when unsupervised methods are applied to end point-specific contexts, like analog selection in read-across, leading to erroneous conclusions. This commentary provides insights into the significance of similarity and its role in supervised classification and unsupervised clustering approaches. https://doi.org/10.1289/EHP14001.
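The misuse described in the abstract (picking read-across analogs by end point-agnostic similarity alone) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name a similarity measure, so Tanimoto similarity on binary fingerprints is swapped in as one common choice, and the chemical names, fingerprint bits, and activity labels are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    return len(a & b) / len(union)


# Hypothetical library: name -> (fingerprint bits, activity label for one end point)
chemicals = {
    "chem_A": (frozenset({1, 2, 3, 4}), "active"),
    "chem_B": (frozenset({1, 2, 3, 5}), "inactive"),
    "chem_C": (frozenset({7, 8, 9}), "active"),
}


def nearest_neighbor(query: str, library: dict) -> str:
    """Return the most structurally similar other chemical, ignoring labels."""
    query_fp = library[query][0]
    scored = [
        (tanimoto(query_fp, fp), name)
        for name, (fp, _label) in library.items()
        if name != query
    ]
    return max(scored)[1]


analog = nearest_neighbor("chem_A", chemicals)
# The structurally closest analog need not share the activity label.
print(analog, chemicals[analog][1], "vs", chemicals["chem_A"][1])
```

In this toy library the end point-agnostic neighbor of `chem_A` carries a different label, which is the situation the commentary warns about; an end point-specific approach would use the labels when defining similarity.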
Affiliation(s)
- Kamel Mansouri, Kyla Taylor, Scott Auerbach, Stephen Ferguson, Rachel Frawley, Jui-Hua Hsieh, Gloria Jahnke, Nicole Kleinstreuer, Suril Mehta, José T Moreira-Filho, Fred Parham, Cynthia Rider, Andrew A Rooney, Amy Wang, Vicki Sutherland
- Division of Translational Toxicology, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
2
Singhal N, Maurya AK, Virdi JS. Bacterial Whole Cell Protein Profiling: Methodology, Applications and Constraints. Curr Proteomics 2019. [DOI: 10.2174/1570164615666180905102253] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Background: In the era of modern microbiology, several methods are available for identification and typing of bacteria, including whole genome sequencing. However, in microbiological laboratories or hospitals where genomic based molecular typing methods and/or trained manpower are unavailable, whole cell protein profiling using sodium dodecyl sulfate polyacrylamide gel electrophoresis might be a useful alternative/supplementary method for bacterial identification, strain typing and epidemiology. Whole cell protein profiling by SDS-PAGE is based on the principle that under standard growth conditions, a bacterial strain expresses the same set of proteins, the pattern of which can be used for bacterial identification.
Objective: The objective of this review is to assess the current status of whole cell protein profiling by SDS-PAGE and its advantages and constraints for bacterial identification and typing.
Results and Conclusions: Several earlier and recent studies prove the potential and utility of this technique as an adjunct or supplementary method for bacterial identification, strain typing and epidemiology. There is no denying the fact that utility of this technique as an adjunct or supplementary method for bacterial identification and typing has already been demonstrated and its practical applications need to be evaluated further.
Affiliation(s)
- Neelja Singhal, Anay Kumar Maurya, Jugsharan Singh Virdi
- Department of Microbiology, University of Delhi South Campus, Benito Juarez Road, New Delhi-110021, India
4
Ricciardi A, Parente E, Tramutola A, Guidone A, Ianniello RG, Pavlidis D, Tsakalidou E, Zotta T. Evaluation of a differential medium for the preliminary identification of members of the Lactobacillus plantarum and Lactobacillus casei groups. Ann Microbiol 2014. [DOI: 10.1007/s13213-014-1004-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022] Open
9
Darling EM, Guilak F. A neural network model for cell classification based on single-cell biomechanical properties. Tissue Eng Part A 2009; 14:1507-15. [PMID: 18620486 DOI: 10.1089/ten.tea.2008.0180] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 02/05/2023] Open
Abstract
The potential success of tissue engineering or other cell-based therapies is dependent on factors such as the purity and homogeneity of the source cell populations. The ability to enrich cell harvests for specific phenotypes can have significant effects on the overall success of such therapies. While most techniques for cell sorting or enrichment have relied on cell surface markers, recent studies have shown that single-cell mechanical properties can serve as identifying markers of phenotype. In this study, a neural network modeling approach was developed to classify mesenchymal-derived primary and stem cells based on their biomechanical properties. Cell sorting was simulated using previously published data characterizing the mechanical properties of several different cell types as measured by atomic force microscopy. Neural networks were trained using combined data sets, with the resultant groupings analyzed for their purity, efficiency, and enrichment. Heterogeneous populations of zonal chondrocytes, chondrosarcoma cells, and mesenchymal-lineage cells, respectively, could all be classified into enriched subpopulations. Additionally, adult stem cells (adipose-derived or bone marrow-derived) separated disproportionately into nodes associated with the three primary mesenchymal lineages examined. These findings suggest that mathematical approaches such as neural network modeling, in combination with novel measures of cell properties, may provide a means of classifying and eventually sorting mixed populations of cells that are otherwise difficult to identify using more established techniques. In this respect, the identification of biomechanically based cell properties that increase the percentage of stem cells capable of differentiating into predictable lineages may improve the overall success of cell-based therapies.
Affiliation(s)
- Eric M Darling
- Department of Surgery, Duke University Medical Center, Durham, North Carolina, USA
10
Alvarez-Guerra M, González-Piñuela C, Andrés A, Galán B, Viguri JR. Assessment of Self-Organizing Map artificial neural networks for the classification of sediment quality. Environment International 2008; 34:782-790. [PMID: 18313753 DOI: 10.1016/j.envint.2008.01.006] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 11/06/2007] [Revised: 01/09/2008] [Accepted: 01/10/2008] [Indexed: 05/26/2023]
Abstract
The application of mathematical tools in initial steps of sediment quality assessment frameworks can be useful to provide an integrated interpretation of multiple measured variables. This study reveals that the Self-Organizing Map (SOM) artificial neural network can be an effective tool for the integration of multiple physical, chemical and ecotoxicological variables in order to classify different sites under study according to their similar sediment quality. Sediment samples from 40 sites of 3 estuaries of Cantabria (Spain) were classified with respect to 13 physical, chemical and toxicological variables using the SOM. Results obtained with the SOM, when compared to those of traditional multivariate statistical techniques commonly used in the field of sediment quality (principal component analysis (PCA) and hierarchical cluster analysis (HCA)), provided a more useful classification for further assessment steps. Especially, the powerful visualization tools of the SOM, which offer more information and in an easier way than HCA and PCA, facilitate the task of establishing an order of priority between the distinguished groups of sites depending on their need for further investigations or remediation actions in subsequent management steps.
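To make the Self-Organizing Map idea concrete, here is a minimal sketch of a one-dimensional SOM in plain Python. It is not the configuration used in the study: the grid size, learning-rate and neighbourhood schedules, the function names `best_matching_unit` and `train_som`, and the two-variable toy "sites" are all assumptions chosen for brevity, and the inputs are assumed to be scaled to [0, 1] beforehand.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest (squared Euclidean) to x."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1-D SOM on rows of `data` (values assumed scaled to [0, 1])."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weight vectors, one per node on the 1-D grid.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighbourhood width shrinks
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the grid, centred on the winning node.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Hypothetical, pre-scaled measurements for four sites (two variables each).
sites = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.9]]
som = train_som(sites)
groups = [best_matching_unit(som, s) for s in sites]
print(groups)
```

Sites that map to the same node, or to neighbouring nodes on the grid, are treated as having similar profiles; the study applies the same idea to 13 variables measured at 40 sites.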
Affiliation(s)
- Manuel Alvarez-Guerra
- Department of Chemical Engineering and Inorganic Chemistry, ETSIIT, University of Cantabria, Avda. de los Castros s/n 39005, Santander, Spain
12
Urease production by Streptococcus thermophilus. Food Microbiol 2007; 25:113-9. [PMID: 17993384 DOI: 10.1016/j.fm.2007.07.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 05/31/2007] [Revised: 07/12/2007] [Accepted: 07/19/2007] [Indexed: 11/21/2022]
Abstract
In order to identify potential alternative sources of urease for the removal of urea from alcoholic beverages, 205 strains of lactic acid bacteria belonging to 27 different species were screened for urease production. Only Streptococcus thermophilus produced urease. Cell permeabilization with toluene increased activity significantly. Optimal pH for urease activity in whole and permeabilized cells and in cell free extracts differed slightly, but was in the range 6.0-7.0. Significant activity was retained at pH 3.0 and 8.0, and, for cell free extracts, at pH 4.0 in the presence of ethanol. Urease production was evaluated in fermentations with pH control (5.25-6.5) and without pH control. Very little urease was produced in the absence of urea, which at 5 g/l slowed growth significantly in fermentations without pH control, but prevented a decrease in pH below 5.1 and resulted in higher final biomass. Optimal pH for growth was between 6.0 and 6.5 but specific urease activity was higher for fermentations at low pH at the beginning of the exponential phase. However, a higher total urease activity was obtained at pH 6.0 and 6.5 because of higher biomass. Potential technological applications of urease production by S. thermophilus are discussed.