1. Whelan A, Elsayed R, Bellofiore A, Anastasiu DC. Selective Partitioned Regression for Accurate Kidney Health Monitoring. Ann Biomed Eng 2024; 52:1448-1462. [PMID: 38413512 PMCID: PMC10995075 DOI: 10.1007/s10439-024-03470-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 02/06/2024] [Indexed: 02/29/2024]
Abstract
The number of people diagnosed with advanced stages of kidney disease has been rising every year. Early detection and constant monitoring are the only minimally invasive means to prevent severe kidney damage or kidney failure. We propose a cost-effective machine learning-based testing system that can facilitate inexpensive yet accurate kidney health checks. Our proposed framework, which was developed into an iPhone application, uses a camera-based bio-sensor and state-of-the-art classical machine learning and deep learning techniques to predict the concentration of creatinine in the sample, based on the colorimetric change in the test strip. The predicted creatinine concentration is then used to classify the severity of the kidney disease as healthy, intermediate, or critical. In this article, we focus on the effectiveness of machine learning models in translating the colorimetric reaction to a kidney health prediction. In this setting, we thoroughly evaluated the effectiveness of our novel proposed models against state-of-the-art classical machine learning and deep learning approaches. Additionally, we executed a number of ablation studies to measure the performance of our model when trained using different meta-parameter choices. Our evaluation results indicate that our selective partitioned regression (SPR) model, using histogram of colors-based features and a histogram gradient boosted trees underlying estimator, exhibits much better overall prediction performance compared to state-of-the-art methods. Our initial study indicates that SPR can be an effective tool for detecting the severity of kidney disease using inexpensive lateral flow assay test strips and a smartphone-based application. Additional work is needed to verify the performance of the model in various settings.
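The two ends of the workflow described in this abstract (color histogram features in, severity class out) can be sketched roughly as follows. This is a minimal illustration, not the authors' SPR model: the function names, the bin count, and the cutoff parameters are all hypothetical, the regression step in between is omitted, and the sketch assumes that a higher predicted concentration maps to a more severe class.

```python
def color_histogram(pixels, bins=4):
    """Concatenate normalized per-channel histograms of 8-bit RGB pixels.

    `pixels` is a list of (r, g, b) tuples; `bins` is a hypothetical choice.
    """
    width = 256 // bins
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp to the last bin when `bins` does not divide 256 evenly.
            hist[channel * bins + min(value // width, bins - 1)] += 1.0
    n = len(pixels)
    return [h / n for h in hist] if n else hist


def severity(concentration, low_cut, high_cut):
    """Map a predicted concentration to one of the three classes.

    The cutoffs are caller-supplied placeholders, not clinical thresholds.
    """
    if concentration < low_cut:
        return "healthy"
    if concentration < high_cut:
        return "intermediate"
    return "critical"


# Toy pixels standing in for a cropped test-strip image (illustration only).
pixels = [(250, 10, 10), (240, 20, 30), (10, 200, 100), (0, 0, 0)]
features = color_histogram(pixels, bins=4)
```

In a full pipeline, a regressor trained on such feature vectors would supply the `concentration` argument to `severity`.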
Affiliation(s)
- Alex Whelan
  - Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, CA, 95053, USA
- Ragwa Elsayed
  - Biomedical Engineering, San José State University, 1 Washington Sq, San Jose, CA, 95192, USA
- Alessandro Bellofiore
  - Biomedical Engineering, San José State University, 1 Washington Sq, San Jose, CA, 95192, USA
- David C Anastasiu
  - Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, CA, 95053, USA
2. Balakrishnan N, Katkar R, Pham PV, Downey T, Kashyap P, Anastasiu DC, Ramasubramanian AK. Prospection of Peptide Inhibitors of Thrombin from Diverse Origins Using a Machine Learning Pipeline. Bioengineering (Basel) 2023; 10:1300. [PMID: 38002424 PMCID: PMC10669389 DOI: 10.3390/bioengineering10111300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/30/2023] [Accepted: 11/04/2023] [Indexed: 11/26/2023] Open
Abstract
Thrombin is a key enzyme involved in the development and progression of many cardiovascular diseases. Direct thrombin inhibitors (DTIs), with their minimal off-target effects and immediacy of action, have greatly improved the treatment of these diseases. However, the risk of bleeding, pharmacokinetic issues, and thrombotic complications remain major concerns. In an effort to increase the effectiveness of the DTI discovery pipeline, we developed a two-stage machine learning pipeline to identify and rank peptide sequences based on their effective thrombin inhibitory potential. The positive dataset for our model consisted of thrombin inhibitor peptides and their binding affinities (KI) curated from published literature, and the negative dataset consisted of peptides with no known thrombin inhibitory or related activity. The first stage of the model identified thrombin inhibitory sequences with a Matthews correlation coefficient (MCC) of 83.6%. The second stage of the model, which covers an eight-order-of-magnitude range in KI values, predicted the binding affinity of new sequences with a log root mean square error (RMSE) of 1.114. These models also revealed physicochemical and structural characteristics that are hidden but unique to thrombin inhibitor peptides. Using the model, we classified more than 10 million peptides from diverse sources and identified unique short peptide sequences (<15 aa) of interest, based on their predicted KI. Based on the binding energies of the interaction of the peptide with thrombin, we identified a promising set of putative DTI candidates. The prediction pipeline is available on a web server.
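The classify-then-rank structure described in this abstract can be sketched as below. This is only an outline under stated assumptions: `is_inhibitor` and `predict_log_ki` are hypothetical stand-ins for the trained first-stage and second-stage models, the toy lambdas carry no biological meaning, and the log RMSE is assumed here to be computed on base-10 logarithms of KI.

```python
import math


def two_stage_rank(peptides, is_inhibitor, predict_log_ki):
    """Stage 1 keeps predicted inhibitors; stage 2 ranks them by predicted log KI.

    A lower KI means tighter binding, so the list is sorted in ascending order.
    """
    hits = [p for p in peptides if is_inhibitor(p)]
    return sorted(hits, key=predict_log_ki)


def log_rmse(true_ki, pred_ki):
    """RMSE between log10 of the true and predicted KI values (assumed base 10)."""
    errors = [(math.log10(t) - math.log10(p)) ** 2 for t, p in zip(true_ki, pred_ki)]
    return math.sqrt(sum(errors) / len(errors))


# Toy stand-ins for the two trained models (placeholders, not real predictors).
toy_classifier = lambda seq: "R" in seq
toy_regressor = lambda seq: -float(len(seq))

ranked = two_stage_rank(["GPRP", "AAAA", "FPRPGGG"], toy_classifier, toy_regressor)
error = log_rmse([1e-9, 1e-6], [1e-8, 1e-6])
```

Working in log space keeps errors comparable across a KI range that spans many orders of magnitude.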
Affiliation(s)
- Nivedha Balakrishnan
  - Department of Chemical and Materials Engineering, San José State University, San Jose, CA 95192, USA
- Rahul Katkar
  - Department of Chemical and Materials Engineering, San José State University, San Jose, CA 95192, USA
- Peter V. Pham
  - Department of Chemical and Materials Engineering, San José State University, San Jose, CA 95192, USA
- Taylor Downey
  - Department of Computer Science and Engineering, Santa Clara University, Santa Clara, CA 95053, USA
- Prarthna Kashyap
  - Department of Chemical and Materials Engineering, San José State University, San Jose, CA 95192, USA
- David C. Anastasiu
  - Department of Computer Science and Engineering, Santa Clara University, Santa Clara, CA 95053, USA
- Anand K. Ramasubramanian
  - Department of Chemical and Materials Engineering, San José State University, San Jose, CA 95192, USA
3. Li Y, Nguyen J, Anastasiu DC, Arriaga EA. CosTaL: an accurate and scalable graph-based clustering algorithm for high-dimensional single-cell data analysis. Brief Bioinform 2023; 24:bbad157. [PMID: 37150778 PMCID: PMC10199777 DOI: 10.1093/bib/bbad157] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/10/2022] [Revised: 03/28/2023] [Accepted: 04/02/2023] [Indexed: 05/09/2023] Open
Abstract
With the aim of analyzing large multidimensional single-cell datasets, we describe a method for Cosine-based Tanimoto similarity-refined graph for community detection using Leiden's algorithm (CosTaL). As a graph-based clustering method, CosTaL transforms the cells with high-dimensional features into a weighted k-nearest-neighbor (kNN) graph. The cells are represented by the vertices of the graph, while an edge between two vertices in the graph represents the close relatedness between the two cells. Specifically, CosTaL builds an exact kNN graph using cosine similarity and uses the Tanimoto coefficient as the refining strategy to re-weight the edges in order to improve the effectiveness of clustering. We demonstrate that CosTaL generally achieves equivalent or higher effectiveness scores on seven benchmark cytometry datasets and six single-cell RNA-sequencing datasets using six different evaluation metrics, compared with other state-of-the-art graph-based clustering methods, including PhenoGraph, Scanpy and PARC. As indicated by the combined evaluation metrics, CosTaL has high efficiency with small datasets and acceptable scalability for large datasets, which is beneficial for large-scale analysis.
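The graph construction summarized in this abstract can be illustrated with a small brute-force sketch. It is not the CosTaL implementation: the function names are hypothetical, the Tanimoto coefficient is assumed here to be computed between the feature vectors of the two endpoint cells (the abstract does not say what it is applied to), and the Leiden community detection step is left out.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tanimoto(a, b):
    # Continuous Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)
    ab = dot(a, b)
    denom = dot(a, a) + dot(b, b) - ab
    return ab / denom if denom else 0.0


def knn_graph(cells, k):
    """Exact kNN by brute force: each cell keeps its k most cosine-similar cells."""
    graph = {}
    for i, cell_i in enumerate(cells):
        sims = [(cosine(cell_i, cell_j), j) for j, cell_j in enumerate(cells) if j != i]
        sims.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by index
        graph[i] = [j for _, j in sims[:k]]
    return graph


def reweight(cells, graph):
    """Replace each kNN edge weight with the Tanimoto coefficient of its endpoints."""
    return {(i, j): tanimoto(cells[i], cells[j]) for i, nbrs in graph.items() for j in nbrs}


# Four toy "cells" with two features each (illustration only).
cells = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(cells, k=1)
weights = reweight(cells, graph)
```

The resulting weighted edge list is what a community detection routine would then consume; the brute-force search shown is quadratic in the number of cells and only serves to make the two similarity measures concrete.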
Affiliation(s)
- Yijia Li
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 420 Washington Ave. S.E., Minneapolis, 55455, Minnesota, USA
- Jonathan Nguyen
  - Department of Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, 95053, California, USA
- David C Anastasiu
  - Department of Computer Science and Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, 95053, California, USA
- Edgar A Arriaga
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 420 Washington Ave. S.E., Minneapolis, 55455, Minnesota, USA
  - Department of Chemistry, University of Minnesota, Smith Hall, 139 Smith Hall, Pleasant St SE, Minneapolis, 55455, Minnesota, USA
4. Bose B, Downey T, Ramasubramanian AK, Anastasiu DC. Identification of Distinct Characteristics of Antibiofilm Peptides and Prospection of Diverse Sources for Efficacious Sequences. Front Microbiol 2022; 12:783284. [PMID: 35185814 PMCID: PMC8856603 DOI: 10.3389/fmicb.2021.783284] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/25/2021] [Accepted: 12/30/2021] [Indexed: 01/15/2023] Open
Abstract
A majority of microbial infections are associated with biofilms. Targeting biofilms is considered an effective strategy to limit microbial virulence while minimizing the development of antibiotic resistance. Toward this need, antibiofilm peptides are an attractive arsenal since they are bestowed with properties orthogonal to small molecule drugs. In this work, we developed machine learning models to identify the distinguishing characteristics of known antibiofilm peptides, and to mine peptide databases from diverse habitats to classify new peptides with potential antibiofilm activities. Additionally, we used the reported minimum biofilm inhibitory/eradication concentration (MBIC/MBEC) of the antibiofilm peptides to create a regression model on top of the classification model to predict the effectiveness of new antibiofilm peptides. We used a positive dataset containing 242 antibiofilm peptides, and a negative dataset which, unlike previous datasets, contains peptides that are likely to promote biofilm formation. Our model achieved a classification accuracy greater than 98% and harmonic mean of precision-recall (F1) and Matthews correlation coefficient (MCC) scores greater than 0.90; the regression model achieved an MCC score greater than 0.81. We utilized our classification-regression pipeline to evaluate 135,015 peptides from diverse sources for potential antibiofilm activity, and we identified 185 candidates that are likely to be effective against preformed biofilms at micromolar concentrations. Structural analysis of the top 37 hits revealed a larger distribution of helices and coils than sheets, and common functional motifs. Sequence alignment of these hits with known antibiofilm peptides revealed that, while some of the hits showed relatively high sequence similarity with known peptides, some others did not, indicating the presence of antibiofilm activity in novel sources or sequences. Further, some of the hits had previously recognized therapeutic properties or host defense traits suggestive of drug repurposing applications. Taken together, this work demonstrates a new in silico approach to predicting antibiofilm efficacy, and identifies promising new candidates for biofilm eradication.
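For readers unfamiliar with the two classification scores quoted in this abstract, the standard definitions can be written out as a short sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for illustration only.
tp, tn, fp, fn = 45, 50, 0, 5
f1 = f1_score(tp, fp, fn)
m = mcc(tp, tn, fp, fn)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it more informative when the positive and negative sets differ in size.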
Affiliation(s)
- Bipasa Bose
  - Department of Biomedical Engineering, San Jose State University, San Jose, CA, United States
- Taylor Downey
  - Department of Computer Science and Engineering, Santa Clara University, Santa Clara, CA, United States
- Anand K. Ramasubramanian
  - Department of Chemical and Materials Engineering, San Jose State University, San Jose, CA, United States
  - *Correspondence: Anand K. Ramasubramanian
- David C. Anastasiu
  - Department of Computer Science and Engineering, Santa Clara University, Santa Clara, CA, United States
  - *Correspondence: David C. Anastasiu