1
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
- Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
- P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
- Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
2
Manrique-Castano D, Bhaskar D, ElAli A. Dissecting glial scar formation by spatial point pattern and topological data analysis. Sci Rep 2024; 14:19035. [PMID: 39152163 PMCID: PMC11329771 DOI: 10.1038/s41598-024-69426-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 08/05/2024] [Indexed: 08/19/2024] Open
Abstract
Glial scar formation represents a fundamental response to central nervous system (CNS) injuries. It is mainly characterized by a well-defined spatial rearrangement of reactive astrocytes and microglia. The mechanisms underlying glial scar formation have been extensively studied, yet quantitative descriptors of the spatial arrangement of reactive glial cells remain limited. Here, we present a novel approach using point pattern analysis (PPA) and topological data analysis (TDA) to quantify spatial patterns of reactive glial cells after experimental ischemic stroke in mice. We provide open and reproducible tools using R and Julia to quantify spatial intensity, cell covariance and conditional distribution, cell-to-cell interactions, and short/long-scale arrangement, which collectively disentangle the arrangement patterns of the glial scar. This approach unravels a substantial divergence in the distribution of GFAP+ and IBA1+ cells after injury that conventional analysis methods cannot fully characterize. PPA and TDA are valuable tools for studying the complex spatial arrangement of reactive glia and other nervous cells following CNS injuries and have potential applications for evaluating glial-targeted restorative therapies.
Affiliation(s)
- Daniel Manrique-Castano
- Neuroscience Axis, Research Center of CHU de Québec-Université Laval, Quebec City, QC, Canada
- Department of Psychiatry and Neuroscience, Faculty of Medicine, Université Laval, Quebec City, QC, Canada
- Ayman ElAli
- Neuroscience Axis, Research Center of CHU de Québec-Université Laval, Quebec City, QC, Canada
- Department of Psychiatry and Neuroscience, Faculty of Medicine, Université Laval, Quebec City, QC, Canada
3
Kadavath MRK, Nasor M, Imran A. Enhanced Hand Gesture Recognition with Surface Electromyogram and Machine Learning. Sensors (Basel) 2024; 24:5231. [PMID: 39204927 PMCID: PMC11359667 DOI: 10.3390/s24165231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 07/30/2024] [Accepted: 08/08/2024] [Indexed: 09/04/2024]
Abstract
This study delves into decoding hand gestures using surface electromyography (EMG) signals collected via a precision Myo-armband sensor, leveraging machine learning algorithms. The research entails rigorous data preprocessing to extract features and labels from raw EMG data. Following partitioning into training and testing sets, four traditional machine learning models are scrutinized for their efficacy in classifying finger movements across seven distinct gestures. The analysis includes meticulous parameter optimization and five-fold cross-validation to evaluate model performance. Among the models assessed, the Random Forest emerges as the top performer, consistently delivering superior precision, recall, and F1-score values across gesture classes, with ROC-AUC scores surpassing 99%. These findings underscore the Random Forest model as the optimal classifier for our EMG dataset, promising significant advancements in healthcare rehabilitation engineering and enhancing human-computer interaction technologies.
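The five-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch of how samples are partitioned into folds. The function name `k_fold_indices` and the toy sample count are assumptions made for illustration; the authors' actual preprocessing, shuffling, and classifier settings are not given in this record.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper: every sample lands in exactly one test fold,
    and fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

In practice the sample order would usually be shuffled (and often stratified by gesture class) before splitting; this sketch keeps contiguous folds so the partitioning logic stays easy to follow.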
Affiliation(s)
- Mohamed Nasor
- College of Engineering and Information Technology, Ajman University, Ajman P.O. Box 346, United Arab Emirates
- Ahmed Imran
- College of Engineering and Information Technology, Ajman University, Ajman P.O. Box 346, United Arab Emirates
4
Khatibi SMH, Ali J. Harnessing the power of machine learning for crop improvement and sustainable production. Front Plant Sci 2024; 15:1417912. [PMID: 39188546 PMCID: PMC11346375 DOI: 10.3389/fpls.2024.1417912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 07/15/2024] [Indexed: 08/28/2024]
Abstract
Crop improvement and production domains encounter large amounts of expanding data with multi-layer complexity that forces researchers to use machine-learning approaches to establish predictive and informative models to understand the sophisticated mechanisms underlying these processes. All machine-learning approaches aim to fit models to target data; nevertheless, it should be noted that a wide range of specialized methods might initially appear confusing. The principal objective of this study is to offer researchers an explicit introduction to some of the essential machine-learning approaches and their applications, comprising the most modern and utilized methods that have gained widespread adoption in crop improvement or similar domains. This article explicitly explains how different machine-learning methods could be applied for given agricultural data, highlights newly emerging techniques for machine-learning users, and lays out technical strategies for agri/crop research practitioners and researchers.
Affiliation(s)
- Jauhar Ali
- Rice Breeding Platform, International Rice Research Institute, Los Baños, Laguna, Philippines
5
Chu H, Liu T. Comprehensive Research on Druggable Proteins: From PSSM to Pre-Trained Language Models. Int J Mol Sci 2024; 25:4507. [PMID: 38674091 PMCID: PMC11049818 DOI: 10.3390/ijms25084507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/15/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Identification of druggable proteins can greatly reduce the cost of discovering new potential drugs. Traditional experimental approaches to exploring these proteins are often costly, slow, and labor-intensive, making them impractical for large-scale research. In response, recent decades have seen a rise in computational methods. These alternatives support drug discovery by creating advanced predictive models. In this study, we proposed a fast and precise classifier for the identification of druggable proteins using a protein language model (PLM) with fine-tuned evolutionary scale modeling 2 (ESM-2) embeddings, achieving 95.11% accuracy on the benchmark dataset. Furthermore, we made a careful comparison to examine the predictive abilities of ESM-2 embeddings and position-specific scoring matrix (PSSM) features by using the same classifiers. The results suggest that ESM-2 embeddings outperformed PSSM features in terms of accuracy and efficiency. Recognizing the potential of language models, we also developed an end-to-end model based on the generative pre-trained transformers 2 (GPT-2) with modifications. To our knowledge, this is the first time a large language model (LLM) GPT-2 has been deployed for the recognition of druggable proteins. Additionally, a more up-to-date dataset, known as Pharos, was adopted to further validate the performance of the proposed model.
Affiliation(s)
- Taigang Liu
- College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
6
Michael-Pitschaze T, Cohen N, Ofer D, Hoshen Y, Linial M. Detecting anomalous proteins using deep representations. NAR Genom Bioinform 2024; 6:lqae021. [PMID: 38486884 PMCID: PMC10939404 DOI: 10.1093/nargab/lqae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 11/17/2023] [Accepted: 02/23/2024] [Indexed: 03/17/2024] Open
Abstract
Many advances in biomedicine can be attributed to identifying unusual proteins and genes. Many of these proteins' unique properties were discovered by manual inspection, which is becoming infeasible at the scale of modern protein datasets. Here, we propose to tackle this challenge using anomaly detection methods that automatically identify unexpected properties. We adopt a state-of-the-art anomaly detection paradigm from computer vision, to highlight unusual proteins. We generate meaningful representations without labeled inputs, using pretrained deep neural network models. We apply these protein language models (pLM) to detect anomalies in function, phylogenetic families, and segmentation tasks. We compute protein anomaly scores to highlight human prion-like proteins, distinguish viral proteins from their host proteome, and mark non-classical ion/metal binding proteins and enzymes. Other tasks concern segmentation of protein sequences into folded and unstructured regions. We provide candidates for rare functionality (e.g. prion proteins). Additionally, we show the anomaly score is useful in 3D folding-related segmentation. Our novel method shows improved performance over strong baselines and has objectively high performance across a variety of tasks. We conclude that the combination of pLM and anomaly detection techniques is a valid method for discovering a range of global and local protein characteristics.
Affiliation(s)
- Tomer Michael-Pitschaze
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Niv Cohen
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dan Ofer
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Yedid Hoshen
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Linial
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
7
Wang H, Lu H, Sun J, Safo SE. Interpretable deep learning methods for multiview learning. BMC Bioinformatics 2024; 25:69. [PMID: 38350879 PMCID: PMC11265116 DOI: 10.1186/s12859-024-05679-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 01/29/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Technological advances have enabled the generation of unique and complementary types of data or views (e.g. genomics, proteomics, metabolomics) and opened up a new era in multiview learning research with the potential to lead to new biomedical discoveries. RESULTS We propose iDeepViewLearn (Interpretable Deep Learning Method for Multiview Learning) to learn nonlinear relationships in data from multiple views while achieving feature selection. iDeepViewLearn combines deep learning flexibility with the statistical benefits of data and knowledge-driven feature selection, giving interpretable results. Deep neural networks are used to learn view-independent low-dimensional embedding through an optimization problem that minimizes the difference between observed and reconstructed data, while imposing a regularization penalty on the reconstructed data. The normalized Laplacian of a graph is used to model bilateral relationships between variables in each view, therefore, encouraging selection of related variables. iDeepViewLearn is tested on simulated and three real-world data for classification, clustering, and reconstruction tasks. For the classification tasks, iDeepViewLearn had competitive classification results with state-of-the-art methods in various settings. For the clustering task, we detected molecular clusters that differed in their 10-year survival rates for breast cancer. For the reconstruction task, we were able to reconstruct handwritten images using a few pixels while achieving competitive classification accuracy. The results of our real data application and simulations with small to moderate sample sizes suggest that iDeepViewLearn may be a useful method for small-sample-size problems compared to other deep learning methods for multiview learning. CONCLUSION iDeepViewLearn is an innovative deep learning model capable of capturing nonlinear relationships between data from multiple views while achieving feature selection. 
It is fully open source and is freely available at https://github.com/lasandrall/iDeepViewLearn .
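The normalized graph Laplacian referred to in this abstract has a standard definition, L = I - D^(-1/2) A D^(-1/2), which the following minimal sketch computes for a small variable graph. The function names, the quadratic-form penalty, and the assumption of a symmetric adjacency matrix without self-loops are illustrative choices; the exact penalty used by iDeepViewLearn is not specified in this record and may differ.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalized_laplacian(adjacency: Matrix) -> Matrix:
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix.

    Assumes no self-loops; isolated nodes get an all-zero row.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if degrees[i] > 0 else 0.0
            elif adjacency[i][j] != 0 and degrees[i] > 0 and degrees[j] > 0:
                lap[i][j] = -adjacency[i][j] / math.sqrt(degrees[i] * degrees[j])
    return lap


def laplacian_penalty(weights: List[float], lap: Matrix) -> float:
    """Quadratic form w^T L w; small when connected variables carry
    similar degree-scaled weights (illustrative smoothness penalty)."""
    n = len(weights)
    return sum(weights[i] * lap[i][j] * weights[j] for i in range(n) for j in range(n))
```

A penalty of this shape is one common way a graph can favor selecting connected variables together: weights that vary smoothly across edges are penalized less than weights concentrated on a single node.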
Affiliation(s)
- Hengkang Wang
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, 55455, USA
- Han Lu
- Division of Biostatistics and Health Data Science, University of Minnesota, Minneapolis, 55414, USA
- Ju Sun
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, 55455, USA
- Sandra E Safo
- Division of Biostatistics and Health Data Science, University of Minnesota, Minneapolis, 55414, USA
8
Bajiya N, Choudhury S, Dhall A, Raghava GPS. AntiBP3: A Method for Predicting Antibacterial Peptides against Gram-Positive/Negative/Variable Bacteria. Antibiotics (Basel) 2024; 13:168. [PMID: 38391554 PMCID: PMC10885866 DOI: 10.3390/antibiotics13020168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/03/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
Most of the existing methods developed for predicting antibacterial peptides (ABPs) are mostly designed to target either gram-positive or gram-negative bacteria. In this study, we describe a method that allows us to predict ABPs against gram-positive, gram-negative, and gram-variable bacteria. Firstly, we developed an alignment-based approach using BLAST to identify ABPs and achieved poor sensitivity. Secondly, we employed a motif-based approach to predict ABPs and obtained high precision with low sensitivity. To address the issue of poor sensitivity, we developed alignment-free methods for predicting ABPs using machine/deep learning techniques. In the case of alignment-free methods, we utilized a wide range of peptide features that include different types of composition, binary profiles of terminal residues, and fastText word embedding. In this study, a five-fold cross-validation technique has been used to build machine/deep learning models on training datasets. These models were evaluated on an independent dataset with no common peptide between training and independent datasets. Our machine learning-based model developed using the amino acid binary profile of terminal residues achieved maximum AUC 0.93, 0.98, and 0.94 for gram-positive, gram-negative, and gram-variable bacteria, respectively, on an independent dataset. Our method performs better than existing methods when compared with existing approaches on an independent dataset. A user-friendly web server, standalone package and pip package have been developed to facilitate peptide-based therapeutics.
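A binary profile of terminal residues, as named in this abstract, is commonly built by one-hot encoding a fixed number of residues from one end of the peptide. The sketch below shows that encoding under stated assumptions: the function name, the default window of five residues, and the alphabetical amino-acid ordering are hypothetical and are not taken from the AntiBP3 package.

```python
from typing import List

# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def terminal_binary_profile(peptide: str, n_residues: int = 5, terminus: str = "N") -> List[int]:
    """One-hot encode the first (N) or last (C) n_residues of a peptide.

    Returns a flat vector of length 20 * n_residues.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    peptide = peptide.upper()
    if len(peptide) < n_residues:
        raise ValueError("peptide is shorter than the requested window")
    window = peptide[:n_residues] if terminus == "N" else peptide[-n_residues:]
    profile: List[int] = []
    for residue in window:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue: {residue}")
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1
        profile.extend(one_hot)
    return profile
```

Because every peptide yields a vector of the same length regardless of its full sequence length, such profiles can be fed directly to fixed-input classifiers.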
Affiliation(s)
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Shubham Choudhury
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
9
Zhang Y, Wu H, Xu R, Wang Y, Chen L, Wei C. Machine learning modeling for the prediction of phosphorus and nitrogen removal efficiency and screening of crucial microorganisms in wastewater treatment plants. Sci Total Environ 2024; 907:167730. [PMID: 37852495 DOI: 10.1016/j.scitotenv.2023.167730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 10/08/2023] [Accepted: 10/08/2023] [Indexed: 10/20/2023]
Abstract
The effectiveness of wastewater treatment plants (WWTPs) is largely determined by the microbial community structure in their activated sludge (AS). Interactions among microbial communities in AS systems and their indirect effects on water quality changes are crucial for WWTP performance. However, there is currently no quantitative method to evaluate the contribution of microorganisms to the operating efficiency of WWTPs. Traditional assessments of WWTP performance are limited by experimental conditions, methods, and other factors, resulting in increased costs and experimental pollutants. Therefore, an effective method is needed to predict WWTP efficiency based on AS community structure and quantitatively evaluate the contribution of microorganisms in the AS system. This study evaluated and compared microbial communities and water quality changes from WWTPs worldwide by meta-analysis of published high-throughput sequencing data. Six machine learning (ML) models were utilized to predict the efficiency of phosphorus and nitrogen removal in WWTPs; among them, XGBoost showed the highest prediction accuracy. Cross-entropy was used to screen the crucial microorganisms related to phosphorus and nitrogen removal efficiency, and the modeling confirmed the reasonableness of the results. Thirteen genera with nitrogen and phosphorus cycling pathways obtained from the screening were considered highly appropriate for the simultaneous removal of phosphorus and nitrogen. The results showed that the microbes Haliangium, Vicinamibacteraceae, Tolumonas, and SWB02 are potentially crucial for phosphorus and nitrogen removal, as they may be involved in the process of phosphorus and nitrogen removal in sewage treatment plants. Overall, these findings have deepened our understanding of the relationship between microbial community structure and performance of WWTPs, indicating that microbial data should play a critical role in the future design of sewage treatment plants. 
The ML model of this study can efficiently screen crucial microbes associated with WWTP system performance, and it is promising for the discovery of potential microbial metabolic pathways.
Affiliation(s)
- Yinan Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Haizhen Wu
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Rui Xu
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Ying Wang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Liping Chen
- School of Environment and Energy, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, PR China
- Chaohai Wei
- School of Environment and Energy, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, PR China
10
Avila Santos AP, de Almeida BLS, Bonidia RP, Stadler PF, Stefanic P, Mandic-Mulec I, Rocha U, Sanches DS, de Carvalho ACPLF. BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification. RNA Biol 2024; 21:1-12. [PMID: 38528797 DOI: 10.1080/15476286.2024.2329451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/23/2024] [Indexed: 03/27/2024] Open
Abstract
The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification. This study presents BioDeepFuse, a hybrid deep learning framework integrating convolutional neural networks (CNN) or bidirectional long short-term memory (BiLSTM) networks with handcrafted features for enhanced accuracy. This framework employs a combination of k-mer one-hot, k-mer dictionary, and feature extraction techniques for input representation. Extracted features, when embedded into the deep network, enable optimal utilization of spatial and sequential nuances of ncRNA sequences. Using benchmark datasets and real-world RNA samples from bacterial organisms, we evaluated the performance of BioDeepFuse. Results exhibited high accuracy in ncRNA classification, underscoring the robustness of our tool in addressing complex ncRNA sequence data challenges. The effective melding of CNN or BiLSTM with external features heralds promising directions for future research, particularly in refining ncRNA classifiers and deepening insights into ncRNAs in cellular processes and disease manifestations. In addition to its original application in the context of bacterial organisms, the methodologies and techniques integrated into our framework can potentially render BioDeepFuse effective in various and broader domains.
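The k-mer dictionary and k-mer one-hot input representations named in this abstract can be sketched as follows. This is a generic illustration, not the BioDeepFuse implementation: the function names, the RNA alphabet `ACGU`, and the conversion of `T` to `U` are assumptions made here.

```python
from itertools import product
from typing import Dict, List

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA input is converted T -> U below


def build_kmer_dictionary(k: int) -> Dict[str, int]:
    """Map every possible k-mer over ALPHABET to an integer index."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_indices(sequence: str, k: int, vocab: Dict[str, int]) -> List[int]:
    """Slide a window of width k over the sequence and look up each k-mer.

    Raises KeyError for k-mers with symbols outside ALPHABET (e.g. 'N').
    """
    seq = sequence.upper().replace("T", "U")
    return [vocab[seq[i:i + k]] for i in range(len(seq) - k + 1)]


def kmer_one_hot(sequence: str, k: int) -> List[List[int]]:
    """Return one one-hot row (length 4**k) per overlapping k-mer."""
    vocab = build_kmer_dictionary(k)
    rows: List[List[int]] = []
    for idx in kmer_indices(sequence, k, vocab):
        row = [0] * len(vocab)
        row[idx] = 1
        rows.append(row)
    return rows
```

The integer-index form suits embedding layers, while the one-hot matrix can be passed to convolutional layers directly; the one-hot width grows as 4^k, so small k is typical.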
Affiliation(s)
- Anderson P Avila Santos
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil
- Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony, Germany
- Breno L S de Almeida
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil
- Robson P Bonidia
- Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil
- Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio, Brazil
- Peter F Stadler
- Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony, Germany
- Polonca Stefanic
- Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Ines Mandic-Mulec
- Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Ulisses Rocha
- Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research - UFZ GmbH, Leipzig, Saxony, Germany
- Danilo S Sanches
- Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio, Brazil
11
Blau T, Chades I, Ong CS. Machine Learning for Biological Design. Methods Mol Biol 2024; 2760:319-344. [PMID: 38468097 DOI: 10.1007/978-1-0716-3658-9_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
We briefly present machine learning approaches for designing better biological experiments. These approaches build on machine learning predictors and provide additional tools to guide scientific discovery. There are two different kinds of objectives when designing better experiments: to improve the predictive model or to improve the experimental outcome. We survey five different approaches for adaptive experimental design that iteratively search the space of possible experiments while adapting to measured data. The approaches are Bayesian optimization, bandits, reinforcement learning, optimal experimental design, and active learning. These machine learning approaches have shown promise in various areas of biology, and we provide broad guidelines to the practitioner and links to further resources.
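Of the five adaptive design approaches listed in this abstract, bandits are the simplest to sketch. The example below uses UCB1, a standard bandit rule chosen here for illustration (the chapter itself may present other variants). The function name `ucb1`, the `run_experiment` callback, and the assumption that outcomes lie in [0, 1] are hypothetical.

```python
import math
from typing import Callable, List


def ucb1(run_experiment: Callable[[int], float], n_arms: int, n_rounds: int) -> List[int]:
    """Choose among n_arms candidate experiments for n_rounds using UCB1.

    run_experiment(arm) returns the measured outcome (assumed in [0, 1]).
    Returns how many times each arm was chosen.
    """
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # try every candidate once before comparing them
        else:
            # Mean outcome plus an exploration bonus that shrinks as an arm is sampled more.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = run_experiment(arm)
        counts[arm] += 1
        totals[arm] += reward
    return counts
```

For instance, calling `ucb1(lambda arm: [0.2, 0.5, 0.9][arm], n_arms=3, n_rounds=1000)` with these made-up, noise-free yields concentrates most rounds on the third candidate while still occasionally revisiting the others.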
Affiliation(s)
- Tom Blau
- CSIRO, Data61, Eveleigh, NSW, Australia
12
Yan H, Zhang Y, Shan X, Li H, Liu F, Xie G, Li P, Guo W. Altered interhemispheric functional connectivity in patients with obsessive-compulsive disorder and its potential in therapeutic response prediction. J Neurosci Res 2024; 102. [PMID: 38284840 DOI: 10.1002/jnr.25272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Revised: 10/23/2023] [Accepted: 10/25/2023] [Indexed: 01/30/2024]
Abstract
The trajectory of voxel-mirrored homotopic connectivity (VMHC) after medical treatment in obsessive-compulsive disorder (OCD) and its value in prediction of treatment response remains unclear. This study aimed to investigate the pathophysiological mechanism of OCD, as well as biomarkers for prediction of pharmacological efficacy. Medication-free patients with OCD and healthy controls (HCs) underwent magnetic resonance imaging. The patients were scanned again after a 4-week treatment with paroxetine. The acquired data were subjected to VMHC, support vector regression (SVR), and correlation analyses. Compared with HCs (36 subjects), patients with OCD (34 subjects after excluding two subjects with excessive head movement) exhibited significantly lower VMHC in the bilateral superior parietal lobule (SPL), postcentral gyrus, and calcarine cortex, and VMHC in the postcentral gyrus was positively correlated with cognitive function. After treatment, the patients showed increased VMHC in the bilateral posterior cingulate cortex/precuneus (PCC/PCu) with the improvement of symptoms. SVR results showed that VMHC in the postcentral gyrus at baseline could aid to predict a change in the scores of OCD scales. This study revealed that SPL, postcentral gyrus, and calcarine cortex participate in the pathophysiological mechanism of OCD while PCC/PCu participate in the pharmacological mechanism. VMHC in the postcentral gyrus is a potential predictive biomarker of the treatment effects in OCD.
Affiliation(s)
- Haohao Yan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Yingying Zhang
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Xiaoxiao Shan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Huabing Li
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, China
- Feng Liu
- Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China
- Guojun Xie
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, China
- Ping Li
- Department of Psychiatry, Qiqihar Medical University, Qiqihar, China
- Wenbin Guo
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
13
Mitrović K, Savić AM, Radojičić A, Daković M, Petrušić I. Machine learning approach for Migraine Aura Complexity Score prediction based on magnetic resonance imaging data. J Headache Pain 2023; 24:169. [PMID: 38105182 PMCID: PMC10726649 DOI: 10.1186/s10194-023-01704-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 12/05/2023] [Indexed: 12/19/2023] Open
Abstract
BACKGROUND Previous studies have developed the Migraine Aura Complexity Score (MACS) system. MACS shows great potential in studying the complexity of migraine with aura (MwA) pathophysiology especially when implemented in neuroimaging studies. The use of sophisticated machine learning (ML) algorithms, together with deep profiling of MwA, could bring new knowledge in this field. We aimed to test several ML algorithms to study the potential of structural cortical features for predicting the MACS and therefore gain a better insight into MwA pathophysiology. METHODS The data set used in this research consists of 340 MRI features collected from 40 MwA patients. Average MACS score was obtained for each subject. Feature selection for ML models was performed using several approaches, including a correlation test and a wrapper feature selection methodology. Regression was performed with the Support Vector Machine (SVM), Linear Regression, and Radial Basis Function network. RESULTS SVM achieved a 0.89 coefficient of determination score with a wrapper feature selection. The results suggest a set of cortical features, located mostly in the parietal and temporal lobes, that show changes in MwA patients depending on aura complexity. CONCLUSIONS The SVM algorithm demonstrated the best potential in average MACS prediction when using a wrapper feature selection methodology. The proposed method achieved promising results in determining MwA complexity, which can provide a basis for future MwA studies and the development of MwA diagnosis and treatment.
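The coefficient of determination reported above can be illustrated with a short Python sketch. This is only a generic definition of R² (one minus the ratio of residual to total sum of squares); the function name and the toy values are assumptions for illustration and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical scores and predictions (not from the paper)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
```

A model that always predicts the mean of the observed values scores 0 under this definition, so a value such as 0.89 means most of the variance is accounted for by the regression.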
Affiliation(s)
- Katarina Mitrović
- Department of Information Technologies, Faculty of Technical Sciences Čačak, University of Kragujevac, 65 Svetog Save, Čačak, 32000, Serbia.
| | - Andrej M Savić
- Science and Research Centre, University of Belgrade - School of Electrical Engineering, University of Belgrade, 73 Bulevar kralja Aleksandra, Belgrade, 11000, Serbia
| | - Aleksandra Radojičić
- Headache Center, Neurology Clinic, University Clinical Centre of Serbia, 6 dr Subotića starijeg, Belgrade, 11000, Serbia
- Faculty of Medicine, University of Belgrade, 8 dr Subotića starijeg, Belgrade, 11000, Serbia
| | - Marko Daković
- Laboratory for Advanced Analysis of Neuroimages, Faculty of Physical Chemistry, University of Belgrade, 12-16 Studentski trg, Belgrade, 11000, Serbia
| | - Igor Petrušić
- Laboratory for Advanced Analysis of Neuroimages, Faculty of Physical Chemistry, University of Belgrade, 12-16 Studentski trg, Belgrade, 11000, Serbia
| |
|
14
|
Horr NK, Mousavi B, Han K, Li A, Tang R. Human behavior in free search online shopping scenarios can be predicted from EEG activation using Hjorth parameters. Front Neurosci 2023; 17:1191213. [PMID: 38027474 PMCID: PMC10667477 DOI: 10.3389/fnins.2023.1191213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 10/20/2023] [Indexed: 12/01/2023] Open
Abstract
The present work investigates whether and how decisions in real-world online shopping scenarios can be predicted based on brain activation. Potential customers were asked to search through product pages on e-commerce platforms and decide which products to buy, while their EEG signal was recorded. Machine learning algorithms were then trained to distinguish between EEG activation when viewing products that are later bought or put into the shopping cart as opposed to products that are later discarded. We find that Hjorth parameters extracted from the raw EEG can be used to predict purchase choices to a high level of accuracy. Above-chance predictions based on Hjorth parameters are achieved via different standard machine learning methods, with random forest models showing the best performance of above 80% prediction accuracy in both 2-class (bought or put into cart vs. not bought) and 3-class (bought vs. put into cart vs. not bought) classification. While conventional EEG signal analysis commonly employs frequency domain features such as alpha or theta power and phase, Hjorth parameters use time domain signals, which can be calculated rapidly with little computational cost. Given the presented evidence that Hjorth parameters are suitable for the prediction of complex behaviors, their potential and remaining challenges for implementation in real-time applications are discussed.
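To show why Hjorth parameters are cheap to compute in the time domain, here is a minimal Python sketch of the three standard parameters (activity, mobility, complexity) using only variances of the signal and its successive differences. The function names and the synthetic sine-wave input are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
import math
from statistics import pvariance


def _diff(signal):
    """First-order difference of a sequence (discrete derivative)."""
    return [b - a for a, b in zip(signal, signal[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) for a 1-D signal."""
    d1 = _diff(signal)   # first derivative
    d2 = _diff(d1)       # second derivative
    activity = pvariance(signal)                    # signal power
    mobility = math.sqrt(pvariance(d1) / activity)  # proxy for mean frequency
    # Complexity: mobility of the derivative relative to mobility of the signal
    complexity = math.sqrt(pvariance(d2) / pvariance(d1)) / mobility
    return activity, mobility, complexity


# Synthetic stand-in for one EEG channel: a sine wave, 100 samples per cycle
wave = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
activity, mobility, complexity = hjorth_parameters(wave)
```

For a pure sine wave the complexity is close to 1, its minimum; less regular signals give larger values. Each parameter needs only one pass over the samples, with no Fourier transform.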
|
15
|
Le AV, Větrovský T, Barucic D, Saraiva JP, Dobbler PT, Kohout P, Pospíšek M, da Rocha UN, Kléma J, Baldrian P. Improved recovery and annotation of genes in metagenomes through the prediction of fungal introns. Mol Ecol Resour 2023; 23:1800-1811. [PMID: 37561110 DOI: 10.1111/1755-0998.13852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 06/27/2023] [Accepted: 07/31/2023] [Indexed: 08/11/2023]
Abstract
Metagenomics provides a tool to assess the functional potential of environmental and host-associated microbiomes based on the analysis of environmental DNA: assembly, gene prediction and annotation. While gene prediction is straightforward for most bacterial and archaeal taxa, it has limited applicability in the majority of eukaryotic organisms, including fungi that contain introns in gene coding sequences. As a consequence, eukaryotic genes are underrepresented in metagenomics datasets and our understanding of the contribution of fungi and other eukaryotes to microbiome functioning is limited. Here, we developed a machine intelligence-based algorithm that predicts fungal introns in environmental DNA with reasonable precision and used it to improve the annotation of environmental metagenomes. Intron removal increased the number of predicted genes by up to 9.1% and improved the annotation of several others. The proportion of newly predicted genes increased with the share of eukaryotic genes in the metagenome and, within fungal taxa, increased with the number of introns per gene. Our approach provides a tool named SVMmycointron for improved metagenome annotation, especially of microbiomes with a high proportion of eukaryotes. The scripts described in the paper are made publicly available and can be readily utilized by microbiome researchers analysing metagenomics data.
Affiliation(s)
- Anh Vu Le
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
| | - Tomáš Větrovský
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
| | - Denis Barucic
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
| | - Joao Pedro Saraiva
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
| | - Priscila Thiago Dobbler
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
| | - Petr Kohout
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
| | - Martin Pospíšek
- Department of Genetics and Microbiology, Charles University, Praha, Czech Republic
| | - Ulisses Nunes da Rocha
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
| | - Jiří Kléma
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
| | - Petr Baldrian
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
| |
|
16
|
Shaker B, Lee J, Lee Y, Yu MS, Lee HM, Lee E, Kang HC, Oh KS, Kim HW, Na D. A machine learning-based quantitative model (LogBB_Pred) to predict the blood-brain barrier permeability (logBB value) of drug compounds. Bioinformatics 2023; 39:btad577. [PMID: 37713469 PMCID: PMC10560102 DOI: 10.1093/bioinformatics/btad577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/30/2023] [Accepted: 09/14/2023] [Indexed: 09/17/2023] Open
Abstract
MOTIVATION Efficient assessment of the blood-brain barrier (BBB) penetration ability of a drug compound is one of the major hurdles in central nervous system drug discovery since experimental methods are costly and time-consuming. To advance and elevate the success rate of neurotherapeutic drug discovery, it is essential to develop an accurate computational quantitative model to determine the absolute logBB value (a logarithmic ratio of the concentration of a drug in the brain to its concentration in the blood) of a drug candidate. RESULTS Here, we developed a quantitative model (LogBB_Pred) capable of predicting the logBB value of a query compound. The model achieved an R² of 0.61 on an independent test dataset and outperformed other publicly available quantitative models. When compared with the available qualitative (classification) models that only classify whether a compound is BBB-permeable or not, our model achieved the same accuracy (0.85) as the best qualitative model and far outperformed other qualitative models (accuracies between 0.64 and 0.70). For further evaluation, our model, other quantitative models, and the qualitative models were evaluated on a real-world central nervous system drug screening library. Our model showed an accuracy of 0.97 while the other models showed accuracies in the range of 0.29-0.83. Consequently, our model can accurately classify BBB-permeable compounds as well as predict the absolute logBB values of drug candidates. AVAILABILITY AND IMPLEMENTATION The web server is freely available at http://ssbio.cau.ac.kr/software/logbb_pred/. The data used in this study are available to download at http://ssbio.cau.ac.kr/software/logbb_pred/dataset.zip.
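The logBB quantity defined in this abstract can be written out as a small Python sketch. It assumes the conventional base-10 logarithm. The function names, the example concentrations, and the permeability cutoff of -1.0 are assumptions for illustration; the abstract does not state which cutoff LogBB_Pred uses for its classification results.

```python
import math


def log_bb(conc_brain, conc_blood):
    """logBB = log10(C_brain / C_blood); both concentrations in the same units."""
    if conc_brain <= 0 or conc_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_brain / conc_blood)


def is_permeable(logbb, threshold=-1.0):
    """Binary call from a logBB value.

    The -1.0 default is an illustrative assumption, not a value
    taken from the abstract.
    """
    return logbb >= threshold


# Hypothetical compound whose brain concentration is ten times its blood level
value = log_bb(conc_brain=10.0, conc_blood=1.0)
permeable = is_permeable(value)
```

A quantitative model that predicts logBB can be reused for classification in this way, which is how the abstract compares a regression model against qualitative ones.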
Affiliation(s)
- Bilal Shaker
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| | - Jingyu Lee
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| | - Yunhyeok Lee
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| | - Myeong-Sang Yu
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| | - Hyang-Mi Lee
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| | - Eunee Lee
- Division of Pediatric Neurology, Department of Pediatrics, Severance Children’s Hospital, Yonsei University College of Medicine, Epilepsy Research Institute, Seoul 03722, Republic of Korea
| | - Hoon-Chul Kang
- Department of Anatomy College of Medicine, Yonsei University, Seoul 03722, Republic of Korea
| | - Kwang-Seok Oh
- Convergence Drug Research Center, Korea Research Institute of Chemical Technology, Daejeon 34114, Republic of Korea
| | - Hyung Wook Kim
- Department of Bio-integrated Science and Technology, College of Life Sciences, Sejong University, Seoul 05006, Republic of Korea
| | - Dokyun Na
- Department of Biomedical Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| |
|
17
|
Procopio A, Cesarelli G, Donisi L, Merola A, Amato F, Cosentino C. Combined mechanistic modeling and machine-learning approaches in systems biology - A systematic literature review. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 240:107681. [PMID: 37385142 DOI: 10.1016/j.cmpb.2023.107681] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/17/2023] [Revised: 06/14/2023] [Accepted: 06/14/2023] [Indexed: 07/01/2023]
Abstract
BACKGROUND AND OBJECTIVE Mechanistic-based Model simulations (MM) are an effective approach commonly employed, for research and learning purposes, to better investigate and understand the inherent behavior of biological systems. Recent advancements in modern technologies and the large availability of omics data allowed the application of Machine Learning (ML) techniques to different research fields, including systems biology. However, the availability of information regarding the analyzed biological context, sufficient experimental data, as well as the degree of computational complexity, represent some of the issues that both MMs and ML techniques could present individually. For this reason, several recent studies suggest overcoming or significantly reducing these drawbacks by combining the two methods. In the wake of the growing interest in this hybrid analysis approach, with the present review, we want to systematically investigate the studies available in the scientific literature in which both MMs and ML have been combined to explain biological processes at genomics, proteomics, and metabolomics levels, or the behavior of entire cellular populations. METHODS Elsevier Scopus®, Clarivate Web of Science™ and National Library of Medicine PubMed® databases were queried using the queries reported in Table 1, resulting in 350 scientific articles. RESULTS Only 14 of the 350 documents returned by the comprehensive search conducted on the three major online databases met our search criteria, i.e. presented a hybrid approach consisting of the synergistic combination of MMs and ML to treat a particular aspect of systems biology. CONCLUSIONS Despite the recent interest in this methodology, a careful analysis of the selected papers showed that examples of integration between MMs and ML are already present in systems biology, highlighting the great potential of this hybrid approach at both micro and macro biological scales.
Affiliation(s)
- Anna Procopio
- Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italia
| | - Giuseppe Cesarelli
- Department of Electrical Engineering and Information Technology, Università degli Studi di Napoli Federico II, Napoli, 80125, Italy
| | - Leandro Donisi
- Department of Advanced Medical and Surgical Sciences, Università della Campania Luigi Vanvitelli, Napoli, 80138, Italy
| | - Alessio Merola
- Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italia
| | - Francesco Amato
- Department of Electrical Engineering and Information Technology, Università degli Studi di Napoli Federico II, Napoli, 80125, Italy.
| | - Carlo Cosentino
- Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italia.
| |
|
18
|
Li C, Wang T, Lin X. Analyzing omics data by feature combinations based on kernel functions. J Bioinform Comput Biol 2023; 21:2350021. [PMID: 37852788 DOI: 10.1142/s021972002350021x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2023]
Abstract
Defining meaningful feature (molecule) combinations can enhance the study of disease diagnosis and prognosis. However, feature combinations are complex and varied in biosystems, and the existing methods examine the feature cooperation in a single, fixed pattern for all feature pairs, such as linear combination. To identify the appropriate combination between two features and evaluate feature combination more comprehensively, this paper adopts kernel functions to study feature relationships and proposes a new omics data analysis method, KF-k-TSP. Besides linear combination, KF-k-TSP also explores the nonlinear combination of features, and allows hybridizing multiple kernel functions to evaluate feature interaction from multiple views. KF-k-TSP selects k > 0 top-scoring pairs to build an ensemble classifier. Experimental results show that KF-k-TSP with multiple kernel functions, which evaluates feature combinations from multiple views, is better than that with only one kernel function. Meanwhile, KF-k-TSP performs better than TSP family algorithms and the previous methods based on conversion strategy in most cases. It performs similarly to the popular machine learning methods in omics data analysis, but involves fewer feature pairs. In the procedure of physiological and pathological changes, molecular interactions can be both linear and nonlinear. Hence, KF-k-TSP, which can measure molecular combination from multiple perspectives, can help to mine information closely related to physiological and pathological changes and study disease mechanism.
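For readers unfamiliar with the TSP family mentioned in this abstract, the following Python sketch shows the classic top-scoring pair score: the absolute difference, between two classes, in how often one feature is smaller than another. It does not reproduce the kernel-based scoring proposed in the paper; the function names and the toy data are assumptions for illustration only.

```python
from itertools import combinations


def tsp_score(samples, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for one feature pair."""
    rates = []
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        # Fraction of samples in this class where feature i is below feature j
        rates.append(sum(1 for s in rows if s[i] < s[j]) / len(rows))
    return abs(rates[0] - rates[1])


def top_scoring_pairs(samples, labels, k=1):
    """Return the k feature-index pairs with the highest TSP score."""
    n_features = len(samples[0])
    scored = [(tsp_score(samples, labels, i, j), (i, j))
              for i, j in combinations(range(n_features), 2)]
    scored.sort(key=lambda item: -item[0])  # stable sort keeps index order on ties
    return [pair for _, pair in scored[:k]]


# Toy data: feature 0 < feature 1 in class 0 and the reverse in class 1
X = [[1, 5, 3], [2, 6, 0], [7, 3, 9], [8, 2, 1]]
y = [0, 0, 1, 1]
best = top_scoring_pairs(X, y, k=1)
```

Because the score depends only on the ordering within each pair, it is a fixed, rank-based pattern of comparison; the abstract's point is that kernel functions allow other, including nonlinear, pairwise relationships to be scored.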
Affiliation(s)
- Chao Li
- School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning 116024, P. R. China
| | - Tianxiang Wang
- School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning 116024, P. R. China
| | - Xiaohui Lin
- School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning 116024, P. R. China
| |
|
19
|
Jing H, Zhang C, Yan H, Li X, Liang J, Liang W, Ou Y, Wu W, Guo H, Deng W, Xie G, Guo W. Deviant spontaneous neural activity as a potential early-response predictor for therapeutic interventions in patients with schizophrenia. Front Neurosci 2023; 17:1243168. [PMID: 37727324 PMCID: PMC10505796 DOI: 10.3389/fnins.2023.1243168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 08/18/2023] [Indexed: 09/21/2023] Open
Abstract
Objective Previous studies have established significant differences in the neuroimaging characteristics between healthy controls (HCs) and patients with schizophrenia (SCZ). However, the relationship between homotopic connectivity and clinical features in patients with SCZ is not yet fully understood. Furthermore, there are currently no established neuroimaging biomarkers available for the diagnosis of SCZ or for predicting early treatment response. The aim of this study is to investigate the association between regional homogeneity and specific clinical features in SCZ patients. Methods We conducted a longitudinal investigation involving 56 patients with SCZ and 51 HCs. The SCZ patients underwent a 3-month antipsychotic treatment. Resting-state functional magnetic resonance imaging (fMRI), regional homogeneity (ReHo), support vector machine (SVM), and support vector regression (SVR) were used for data acquisition and analysis. Results In comparison to HCs, individuals with SCZ demonstrated reduced ReHo values in the right postcentral/precentral gyrus, left postcentral/inferior parietal gyrus, left middle/inferior occipital gyrus, and right middle temporal/inferior occipital gyrus, and increased ReHo values in the right putamen. Notably, ReHo values in the right inferior parietal gyrus were decreased after treatment compared with baseline. Conclusion The observed decrease in ReHo values in the sensorimotor network and increase in ReHo values in the right putamen may represent distinctive neurobiological characteristics of patients with SCZ, as well as a potential neuroimaging biomarker for distinguishing between patients with SCZ and HCs. Furthermore, ReHo values in the sensorimotor network and right putamen may serve as predictive indicators for early treatment response in patients with SCZ.
Affiliation(s)
- Huan Jing
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Chunguo Zhang
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Haohao Yan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Xiaoling Li
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Jiaquan Liang
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Wenting Liang
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Yangpan Ou
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Weibin Wu
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Huagui Guo
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Wen Deng
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Guojun Xie
- Department of Psychiatry, The Third People's Hospital of Foshan, Foshan, Guangdong, China
| | - Wenbin Guo
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, and National Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
| |
|
20
|
Hamadani A, Ganai NA. Artificial intelligence algorithm comparison and ranking for weight prediction in sheep. Sci Rep 2023; 13:13242. [PMID: 37582936 PMCID: PMC10427635 DOI: 10.1038/s41598-023-40528-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 08/11/2023] [Indexed: 08/17/2023] Open
Abstract
In a rapidly transforming world, farm data is growing exponentially. Realizing the importance of this data, researchers are looking for new solutions to analyse this data and make farming predictions. Artificial Intelligence, with its capacity to handle big data, is rapidly becoming popular. In addition, it can also handle non-linear, noisy data and is not limited by the conditions required for conventional data analysis. This study was therefore undertaken to compare the most popular machine learning (ML) algorithms and rank them as per their ability to make predictions on sheep farm data spanning 11 years. Data cleaning and preparation were done before analysis. Winsorization was done for outlier removal. Principal component analysis (PCA) and feature selection (FS) were done and, based on that, three datasets were created for bodyweight prediction, viz. PCA (wherein only PCA was used), PCA+FS (both techniques used for dimensionality reduction), and FS (only feature selection used). Among the 11 ML algorithms that were evaluated, the correlations between true and predicted values for MARS algorithm, Bayesian ridge regression, Ridge regression, Support Vector Machines, Gradient boosting algorithm, Random forests, XgBoost algorithm, Artificial neural networks, Classification and regression trees, Polynomial regression, K nearest neighbours and Genetic Algorithms were 0.993, 0.992, 0.991, 0.991, 0.991, 0.99, 0.99, 0.984, 0.984, 0.957, 0.949, 0.734 respectively for bodyweights. The top five algorithms for the prediction of bodyweights were MARS, Bayesian ridge regression, Ridge regression, Support Vector Machines and Gradient boosting algorithm. A total of 12 machine learning models were developed for the prediction of bodyweights in sheep in the present study.
It may be said that machine learning techniques can perform predictions with reasonable accuracies and can thus help in drawing inferences and making futuristic predictions on farms for their economic prosperity, performance improvement and subsequently food security.
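Two steps named in this abstract, Winsorization of outliers and ranking models by the correlation between true and predicted values, can be sketched in plain Python. Winsorization clamps extreme values to percentile bounds rather than deleting them. The nearest-rank percentile rule, the 10th/90th percentile bounds, and the sample weights below are assumptions for illustration; the abstract does not give the settings the authors used.

```python
import math


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clamp values to percentile bounds (nearest-rank percentiles)."""
    ordered = sorted(values)
    n = len(ordered)

    def nearest_rank(pct):
        rank = max(1, math.ceil(pct * n / 100))
        return ordered[rank - 1]

    low, high = nearest_rank(lower_pct), nearest_rank(upper_pct)
    return [min(max(v, low), high) for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical bodyweights in kg with one implausible entry (95)
weights = [30, 31, 29, 32, 30, 28, 31, 33, 30, 95]
clamped = winsorize(weights, lower_pct=10, upper_pct=90)
```

Here the outlying 95 is pulled down to 33, the 90th-percentile value, while the number of records stays the same. Calling `pearson(true_values, predicted_values)` for each model then gives the kind of figure used for the ranking above.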
Affiliation(s)
| | - Nazir Ahmad Ganai
- Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Kashmir, India
| |
|
21
|
Damavandi S, Shiri F, Emamjomeh A, Pirhadi S, Beyzaei H. A study of the interaction space of two lactate dehydrogenase isoforms (LDHA and LDHB) and some of their inhibitors using proteochemometrics modeling. BMC Chem 2023; 17:70. [PMID: 37415191 DOI: 10.1186/s13065-023-00991-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Accepted: 06/30/2023] [Indexed: 07/08/2023] Open
Abstract
Lactate dehydrogenase (LDH) is a tetramer enzyme that converts pyruvate to lactate reversibly. This enzyme is important because it is associated with diseases such as cancers, heart disease, liver problems, and most importantly, corona disease. As a system-based method, proteochemometrics does not require knowledge of the protein's three-dimensional structure, but rather depends on the amino acid sequence and protein descriptors. Here, we applied this methodology to model a set of LDHA and LDHB isoenzyme inhibitors. To implement the proteochemometrics method, the camb package in the RStudio Server programming environment was used. The activity of 312 compounds of LDHA and LDHB isoenzyme inhibitors was retrieved from the valid BindingDB database. The proteochemometrics method was applied with three machine learning algorithms (gradient amplification model, random forest, and support vector machine) as regression methods to find the best model. Through the combination of different models into an ensemble (greedy and stacking optimization), we explored the possibility of improving the performance of models. For the RF best ensemble model of inhibitors of LDHA and LDHB isoenzymes, and were 0.66 and 0.62, respectively. LDH inhibitory activation is influenced by Morgan fingerprints and topological structure descriptors.
Affiliation(s)
- Sedigheh Damavandi
- Department of Bioinformatics, Laboratory of Computational Biotechnology and Bioinformatics (CBB Lab), University of Zabol, Zabol, Iran
| | - Fereshteh Shiri
- Department of Chemistry, Faculty of Science, University of Zabol, Zabol, Iran.
| | - Abbasali Emamjomeh
- Department of Bioinformatics, Laboratory of Computational Biotechnology and Bioinformatics (CBB Lab), University of Zabol, Zabol, Iran
- Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
| | - Somayeh Pirhadi
- Medicinal and Natural Products Chemistry Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Hamid Beyzaei
- Department of Chemistry, Faculty of Science, University of Zabol, Zabol, Iran
| |
|
22
|
Tognon M, Giugno R, Pinello L. A survey on algorithms to characterize transcription factor binding sites. Brief Bioinform 2023; 24:bbad156. [PMID: 37099664 PMCID: PMC10422928 DOI: 10.1093/bib/bbad156] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/13/2023] [Revised: 03/27/2023] [Accepted: 04/01/2023] [Indexed: 04/28/2023] Open
Abstract
Transcription factors (TFs) are key regulatory proteins that control the transcriptional rate of cells by binding short DNA sequences called transcription factor binding sites (TFBS) or motifs. Identifying and characterizing TFBS is fundamental to understanding the regulatory mechanisms governing the transcriptional state of cells. During the last decades, several experimental methods have been developed to recover DNA sequences containing TFBS. In parallel, computational methods have been proposed to discover and identify TFBS motifs based on these DNA sequences. This is one of the most widely investigated problems in bioinformatics and is referred to as the motif discovery problem. In this manuscript, we review classical and novel experimental and computational methods developed to discover and characterize TFBS motifs in DNA sequences, highlighting their advantages and drawbacks. We also discuss open challenges and future perspectives that could fill the remaining gaps in the field.
Affiliation(s)
- Manuel Tognon
- Computer Science Department, University of Verona, Verona, Italy
- Molecular Pathology Unit, Center for Computational and Integrative Biology and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Rosalba Giugno
- Computer Science Department, University of Verona, Verona, Italy
| | - Luca Pinello
- Molecular Pathology Unit, Center for Computational and Integrative Biology and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Pathology, Harvard Medical School, Boston, Massachusetts, United States of America
| |
|
23
|
Zhang Z, Wei X. Artificial intelligence-assisted selection and efficacy prediction of antineoplastic strategies for precision cancer therapy. Semin Cancer Biol 2023; 90:57-72. [PMID: 36796530 DOI: 10.1016/j.semcancer.2023.02.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/19/2022] [Revised: 01/12/2023] [Accepted: 02/13/2023] [Indexed: 02/16/2023]
Abstract
The rapid development of artificial intelligence (AI) technologies in the context of the vast amount of collectable data obtained from high-throughput sequencing has led to an unprecedented understanding of cancer and accelerated the advent of a new era of clinical oncology with a tone of precision treatment and personalized medicine. However, the gains achieved by a variety of AI models in clinical oncology practice are far from what one would expect, and in particular, there are still many uncertainties in the selection of clinical treatment options that pose significant challenges to the application of AI in clinical oncology. In this review, we summarize emerging approaches, relevant datasets and open-source software of AI and show how to integrate them to address problems from clinical oncology and cancer research. We focus on the principles and procedures for identifying different antitumor strategies with the assistance of AI, including targeted cancer therapy, conventional cancer therapy, and cancer immunotherapy. In addition, we also highlight the current challenges and directions of AI in clinical oncology translation. Overall, we hope this article will provide researchers and clinicians with a deeper understanding of the role and implications of AI in precision cancer therapy, and help AI move more quickly into accepted cancer guidelines.
Affiliation(s)
- Zhe Zhang
- Laboratory of Aging Research and Cancer Drug Target, State Key Laboratory of Biotherapy and Cancer Center, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610041, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu 610041, PR China
| | - Xiawei Wei
- Laboratory of Aging Research and Cancer Drug Target, State Key Laboratory of Biotherapy and Cancer Center, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610041, PR China.
| |
Collapse
|
24
|
Liang Z. Novel method combining multiscale attention entropy of overnight blood oxygen level and machine learning for easy sleep apnea screening. Digit Health 2023; 9:20552076231211550. [PMID: 37936958 PMCID: PMC10627021 DOI: 10.1177/20552076231211550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 10/16/2023] [Indexed: 11/09/2023] Open
Abstract
Objective: Sleep apnea is a common sleep disorder affecting a significant portion of the population, but many apnea patients remain undiagnosed because existing clinical tests are invasive and expensive. This study aimed to develop a method for easy sleep apnea screening. Methods: Three supervised machine learning algorithms (logistic regression, support vector machine, and light gradient boosting machine) were applied to develop apnea screening models at two apnea-hypopnea index cutoff thresholds: ≥ 5 and ≥ 30 events/hour. The SpO2 recordings of the Sleep Heart Health Study database (N = 5786) were used for model training, validation, and testing. Multiscale entropy analysis was performed to derive a set of multiscale attention entropy features from the SpO2 recordings. Demographic features including age, sex, body mass index, and blood pressure were also used. The dependency among the multiscale attention entropy features was handled with independent component analysis. Results: For the cutoff of ≥ 5 events/hour, the logistic regression model achieved the highest Matthews correlation coefficient (0.402) and area under the curve (0.747), and reasonably good sensitivity (75.38%), specificity (74.02%), and positive predictive value (92.94%). For the cutoff of ≥ 30 events/hour, the support vector machine model achieved the highest Matthews correlation coefficient (0.545) and area under the curve (0.823), and good sensitivity (82.00%), specificity (82.69%), and negative predictive value (95.53%). Conclusions: Our models achieved better performance than existing methods and have the potential to be integrated with home-use pulse oximeters.
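For readers less familiar with the metrics reported in this abstract, the following is a minimal sketch of how sensitivity, specificity, predictive values, and the Matthews correlation coefficient are derived from confusion-matrix counts. The function name `screening_metrics` and the counts passed to it are hypothetical illustrations, not values or code from the study.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Compute binary screening metrics from confusion-matrix counts.

    Assumes every row and column of the confusion matrix is non-empty,
    so none of the ratios below divides by zero.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    # Matthews correlation coefficient: ranges from -1 to 1 and stays
    # informative when the two classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = screening_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the predictive values depend on class prevalence while sensitivity and specificity do not, a screening model is usually better judged on all of these figures together than on any one of them.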
Affiliation(s)
- Zilu Liang
- Kyoto University of Advanced Science (KUAS), Japan
25
Aero engines remaining useful life prediction based on enhanced adaptive guided differential evolution. Evolutionary Intelligence 2022. [DOI: 10.1007/s12065-022-00805-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
26
Zhao XJG, Cao H. Linking research of biomedical datasets. Brief Bioinform 2022; 23:6712704. [PMID: 36151775 DOI: 10.1093/bib/bbac373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 12/14/2022] Open
Abstract
Biomedical data preprocessing and efficient computing can be as important as the statistical methods used to fit the data; data processing needs to consider application scenarios, data acquisition and individual rights and interests. We review common principles, knowledge and methods of integrated research according to the whole-pipeline processing mechanism: diverse, coherent, sharing, auditable and ecological. First, neuromorphic and native algorithms integrate diverse datasets, providing linear scalability and high visualization. Second, the choice mechanism of different preprocessing, analysis and transaction methods from raw to neuromorphic is summarized for the node and coordinator platforms. Third, the combination of node, network, cloud, edge, swarm and graph builds an ecosystem of cohort integrated research and clinical diagnosis and treatment. Looking forward, it is vital to simultaneously combine deep computing, mass data storage and massively parallel communication.
Affiliation(s)
- Xiu-Ju George Zhao
- Wuhan Institute of Physics and Mathematics (WIPM), China; Wuhan Polytechnic University, China
- Hui Cao
- Wuhan Polytechnic University, China
27
Garabaghi FH, Benzer R, Benzer S, Günal Ç. Effect of polynomial, radial basis, and Pearson VII function kernels in support vector machine algorithm for classification of crayfish. Ecol Inform 2022. [DOI: 10.1016/j.ecoinf.2022.101911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
28
Yang Y, Xu B, Murray J, Haverstick J, Chen X, Tripp RA, Zhao Y. Rapid and quantitative detection of respiratory viruses using surface-enhanced Raman spectroscopy and machine learning. Biosens Bioelectron 2022; 217:114721. [PMID: 36152394 DOI: 10.1016/j.bios.2022.114721] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 06/23/2022] [Revised: 08/29/2022] [Accepted: 09/11/2022] [Indexed: 12/23/2022]
Abstract
Rapid and sensitive pathogen detection is important for prevention and control of disease. Here, we report a label-free diagnostic platform that combines surface-enhanced Raman scattering (SERS) and machine learning for the rapid and accurate detection of thirteen respiratory virus species including SARS-CoV-2, common human coronaviruses, influenza viruses, and others. Virus detection and measurement have been performed using highly sensitive SiO2 coated silver nanorod array substrates, allowing for detection and identification of their characteristic SERS peaks. Using appropriate spectral processing procedures and machine learning algorithms (MLAs) including support vector machine (SVM), k-nearest neighbor, and random forest, the virus species as well as strains and variants have been differentiated and classified and a differentiation accuracy of >99% has been obtained. Utilizing SVM-based regression, quantitative calibration curves have been constructed to accurately estimate the unknown virus concentrations in buffer and saliva. This study shows that using a combination of SERS, MLA, and regression, it is possible to classify and quantify the virus in saliva, which could aid medical diagnosis and therapeutic intervention.
Affiliation(s)
- Yanjun Yang
- School of Electrical and Computer Engineering, College of Engineering, The University of Georgia, Athens, GA, 30602, USA.
- Beibei Xu
- Department of Statistics, The University of Georgia, Athens, GA, 30602, USA
- Jackelyn Murray
- Department of Infectious Diseases, College of Veterinary Medicine, The University of Georgia, Athens, GA, 30602, USA
- James Haverstick
- Department of Physics and Astronomy, The University of Georgia, Athens, GA, 30602, USA
- Xianyan Chen
- Department of Statistics, The University of Georgia, Athens, GA, 30602, USA
- Ralph A Tripp
- Department of Infectious Diseases, College of Veterinary Medicine, The University of Georgia, Athens, GA, 30602, USA
- Yiping Zhao
- Department of Physics and Astronomy, The University of Georgia, Athens, GA, 30602, USA.
29
Baranwal M, Magner A, Saldinger J, Turali-Emre ES, Elvati P, Kozarekar S, VanEpps JS, Kotov NA, Violi A, Hero AO. Struct2Graph: a graph attention network for structure based predictions of protein-protein interactions. BMC Bioinformatics 2022; 23:370. [PMID: 36088285 PMCID: PMC9464414 DOI: 10.1186/s12859-022-04910-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 06/09/2022] [Accepted: 08/26/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Development of new methods for analysis of protein-protein interactions (PPIs) at molecular and nanometer scales gives insights into intracellular signaling pathways and will improve understanding of protein functions, as well as other nanoscale structures of biological and abiological origins. Recent advances in computational tools, particularly the ones involving modern deep learning algorithms, have been shown to complement experimental approaches for describing and rationalizing PPIs. However, most of the existing works on PPI predictions use protein-sequence information, and thus have difficulties in accounting for the three-dimensional organization of the protein chains. RESULTS In this study, we address this problem and describe a PPI analysis based on a graph attention network, named Struct2Graph, for identifying PPIs directly from the structural data of folded protein globules. Our method is capable of predicting the PPI with an accuracy of 98.89% on the balanced set consisting of an equal number of positive and negative pairs. On the unbalanced set with the ratio of 1:10 between positive and negative pairs, Struct2Graph achieves a fivefold cross validation average accuracy of 99.42%. Moreover, Struct2Graph can potentially identify residues that likely contribute to the formation of the protein-protein complex. The identification of important residues is tested for two different interaction types: (a) Proteins with multiple ligands competing for the same binding area, (b) Dynamic protein-protein adhesion interaction. Struct2Graph identifies interacting residues with 30% sensitivity, 89% specificity, and 87% accuracy. CONCLUSIONS In this manuscript, we address the problem of prediction of PPIs using a first-of-its-kind 3D-structure-based graph attention network (code available at https://github.com/baranwa2/Struct2Graph). Furthermore, the novel mutual attention mechanism provides insights into likely interaction sites through its unsupervised knowledge selection process. This study demonstrates that a relatively low-dimensional feature embedding learned from graph structures of individual proteins outperforms other modern machine learning classifiers based on global protein features. In addition, through the analysis of single amino acid variations, the attention mechanism shows preference for disease-causing residue variations over benign polymorphisms, demonstrating that it is not limited to interface residues.
Affiliation(s)
- Mayank Baranwal
- Division of Data and Decision Sciences, Tata Consultancy Services Research, Mumbai, India
- Systems and Control Engineering Group, Indian Institute of Technology, Bombay, India
- Abram Magner
- Department of Computer Science, University of Albany, SUNY, Albany, USA
- Jacob Saldinger
- Department of Chemical Engineering, University of Michigan, Ann Arbor, USA
- Paolo Elvati
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, USA
- Shivani Kozarekar
- Department of Chemical Engineering, University of Michigan, Ann Arbor, USA
- J. Scott VanEpps
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, USA
- Department of Emergency Medicine, University of Michigan, Ann Arbor, USA
- Biointerfaces Institute, University of Michigan, Ann Arbor, USA
- Nicholas A. Kotov
- Department of Chemical Engineering, University of Michigan, Ann Arbor, USA
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, USA
- Biointerfaces Institute, University of Michigan, Ann Arbor, USA
- Department of Materials Science and Engineering, University of Michigan, Ann Arbor, USA
- Angela Violi
- Department of Chemical Engineering, University of Michigan, Ann Arbor, USA
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, USA
- Biophysics Program, University of Michigan, Ann Arbor, USA
- Alfred O. Hero
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, USA
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
- Department of Statistics, University of Michigan, Ann Arbor, USA
- Program in Applied Interdisciplinary Mathematics, University of Michigan, Ann Arbor, USA
- Program in Bioinformatics, University of Michigan, Ann Arbor, USA
30
Gnocis: An integrated system for interactive and reproducible analysis and modelling of cis-regulatory elements in Python 3. PLoS One 2022; 17:e0274338. [PMID: 36084008 PMCID: PMC9462789 DOI: 10.1371/journal.pone.0274338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/07/2021] [Accepted: 08/25/2022] [Indexed: 11/23/2022] Open
Abstract
Gene expression is regulated through cis-regulatory elements (CREs), among which are promoters, enhancers, Polycomb/Trithorax Response Elements (PREs), silencers and insulators. Computational prediction of CREs can be achieved using a variety of statistical and machine learning methods combined with different feature space formulations. Although Python packages for DNA sequence feature sets and for machine learning are available, no existing package facilitates the combination of DNA sequence feature sets with machine learning methods for the genome-wide prediction of candidate CREs. We here present Gnocis, a Python package that streamlines the analysis and the modelling of CRE sequences by providing extensible APIs and implementing the glue required for combining feature sets and models for genome-wide prediction. Gnocis implements a variety of base feature sets, including motif pair occurrence frequencies and the k-spectrum mismatch kernel. It integrates with Scikit-learn and TensorFlow for state-of-the-art machine learning. Gnocis additionally implements a broad suite of tools for the handling and preparation of sequence, region and curve data, which can be useful for general DNA bioinformatics in Python. We also present Deep-MOCCA, a neural network architecture inspired by SVM-MOCCA that achieves moderate to high generalization without prior motif knowledge. To demonstrate the use of Gnocis, we applied multiple machine learning methods to the modelling of D. melanogaster PREs, including a Convolutional Neural Network (CNN), making this the first study to model PREs with CNNs. The models are readily adapted to new CRE modelling problems and to other organisms. In order to produce a high-performance, compiled package for Python 3, we implemented Gnocis in Cython. Gnocis can be installed using the PyPI package manager by running ‘pip install gnocis’. The source code is available on GitHub, at https://github.com/bjornbredesen/gnocis.
31
Dall'Alba G, Casa PL, Abreu FPD, Notari DL, de Avila E Silva S. A Survey of Biological Data in a Big Data Perspective. Big Data 2022; 10:279-297. [PMID: 35394342 DOI: 10.1089/big.2020.0383] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
The amount of available data is continuously growing. This phenomenon has given rise to a new concept, named big data. The key technologies related to big data are cloud computing (infrastructure) and Not Only SQL (NoSQL; data storage). In addition, for data analysis, machine learning algorithms such as decision trees, support vector machines, artificial neural networks, and clustering techniques present promising results. In a biological context, big data has many applications due to the large number of biological databases available. Some limitations of biological big data are related to the inherent features of these data, such as high degrees of complexity and heterogeneity, since biological systems provide information from an atomic level to interactions between organisms or their environment. Such characteristics make most bioinformatic-based applications difficult to build, configure, and maintain. Although the rise of big data is relatively recent, it has contributed to a better understanding of the underlying mechanisms of life. The main goal of this article is to provide a concise and reliable survey of the application of big data-related technologies in biology. As such, some fundamental concepts of information technology, including storage resources, analysis, and data sharing, are described along with their relation to biological data.
Affiliation(s)
- Gabriel Dall'Alba
- Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Genome Science and Technology Program, Faculty of Science, The University of British Columbia, Vancouver, Canada
- Pedro Lenz Casa
- Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Fernanda Pessi de Abreu
- Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Daniel Luis Notari
- Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
- Scheila de Avila E Silva
- Computational Biology and Bioinformatics Laboratory, Biotechnology Institute, Department of Life Sciences, University of Caxias do Sul, Caxias do Sul, Brazil
32
Shui W, Zhang Y, Wang X, Liu Y, Wang Q, Duan F, Wu C, Shui W. Does Tibetan Household Livelihood Capital Enhance Tourism Participation Sustainability? Evidence from China’s Jiaju Tibetan Village. Int J Environ Res Public Health 2022; 19:9183. [PMID: 35954539 PMCID: PMC9368086 DOI: 10.3390/ijerph19159183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 07/21/2022] [Accepted: 07/26/2022] [Indexed: 12/04/2022]
Abstract
Identifying effective transformations to reduce poverty and approach rural sustainability is at the core of the first sustainable development goal of the United Nations. This article offers scientific support for continued efforts in sustaining rural development and livelihood resilience. Many studies have examined drivers of livelihood transition from farming to non-farm activities, especially participation in tourism against the backdrop of rural tourism development. However, few studies have identified ways to measure the level of tourism participation or have discussed how household-level capital influences decisions regarding tourism participation made by Tibetan ethnic households. This article assesses the role of livelihood capital in the adoption of tourism activities at the household level in Jiaju Tibetan Village, an ethnic region that is experiencing a struggling agricultural business and a developing tourism sector. Using household survey data, this study presents an ordinal logistic regression model to identify the determinants of the household tourism participation level. The results showed that households’ tourism participation was influenced by physical capital (e.g., proximity to major roads, odds ratio = 2.83 at p = 0.024; fixed capital, odds ratio = 101.19 at p = 0.039), human capital (e.g., availability of family labor, odds ratio = 0.25 at p = 0.004; availability of skilled members, odds ratio = 2.91 at p = 0.002), and social capital (e.g., relatives in governmental sectors, odds ratio = 5.22 at p = 0.044; government payments, odds ratio = 8.78 at p = 0.04), while the influence of financial capital was not significant. The proximity to major roads, availability of skilled members, fixed assets, and direct and indirect support from the government to households were significantly and positively associated with tourism participation level. The effects of household labor availability and annual family income remain unclear. Overall, household livelihood capital plays a critical role in the enhancement of tourism participation in Jiaju Tibetan Village. Our findings have implications for understanding the shift of on-farm occupation to off-farm activities in tourism and for the pursuit of policies contributing to poverty reduction and rural revitalization in China as well as to the Sustainable Development Goals.
Affiliation(s)
- Wei Shui
- College of Environment and Safety Engineering, Fuzhou University, Fuzhou 350116, China
- Yiyi Zhang
- Department of Geography, McGill University, Montréal, QC H4G 2Y8, Canada
- Xinggui Wang
- School of Historical Culture and Tourism, Sichuan Minzu College, Kangding 626300, China
- Yuanmeng Liu
- College of Environment and Safety Engineering, Fuzhou University, Fuzhou 350116, China
- Qianfeng Wang
- College of Environment and Safety Engineering, Fuzhou University, Fuzhou 350116, China
- Fei Duan
- College of Urban and Environmental Sciences, Peking University, Beijing 100871, China
- Chaowei Wu
- College of Environment and Safety Engineering, Fuzhou University, Fuzhou 350116, China
- Wanyu Shui
- College of Water Resource and Hydropower, Sichuan University, Chengdu 610065, China
33
Yan H, Shan X, Li H, Liu F, Guo W. Abnormal spontaneous neural activity in hippocampal-cortical system of patients with obsessive-compulsive disorder and its potential for diagnosis and prediction of early treatment response. Front Cell Neurosci 2022; 16:906534. [PMID: 35910254 PMCID: PMC9334680 DOI: 10.3389/fncel.2022.906534] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/28/2022] [Accepted: 06/30/2022] [Indexed: 11/25/2022] Open
Abstract
Early brain functional changes induced by pharmacotherapy in patients with obsessive-compulsive disorder (OCD) in relation to drugs per se or because of the impact of such drugs on the improvement of OCD remain unclear. Moreover, no neuroimaging biomarkers are available for diagnosis of OCD and prediction of early treatment response. We performed a longitudinal study involving 34 patients with OCD and 36 healthy controls (HCs). Patients with OCD received 5-week treatment with paroxetine (40 mg/d). Resting-state functional magnetic resonance imaging (fMRI), regional homogeneity (ReHo), support vector machine (SVM), and support vector regression (SVR) were applied to acquire and analyze the imaging data. Compared with HCs, patients with OCD had higher ReHo values in the right superior temporal gyrus and bilateral hippocampus/parahippocampus/fusiform gyrus/cerebellum at baseline. ReHo values in the left hippocampus and parahippocampus decreased significantly after treatment. The reduction rate (RR) of ReHo values was positively correlated with the RRs of the scores of Yale-Brown Obsessive-Compulsive Scale (Y-BOCS) and obsession. Abnormal ReHo values at baseline could serve as potential neuroimaging biomarkers for OCD diagnosis and prediction of early therapeutic response. This study highlighted the important role of the hippocampal-cortical system in the neuropsychological mechanism underlying OCD, pharmacological mechanism underlying OCD treatment, and the possibility of building models for diagnosis and prediction of early treatment response based on spontaneous activity in the hippocampal-cortical system.
Affiliation(s)
- Haohao Yan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Xiaoxiao Shan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Huabing Li
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, China
- Feng Liu
- Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China
- Wenbin Guo
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Department of Psychiatry, The Third People’s Hospital of Foshan, Foshan, China
- Department of Psychiatry, Qiqihar Medical University, Qiqihar, China
34
Yan H, Shan X, Li H, Liu F, Guo W. Abnormal spontaneous neural activity as a potential predictor of early treatment response in patients with obsessive-compulsive disorder. J Affect Disord 2022; 309:27-36. [PMID: 35472471 DOI: 10.1016/j.jad.2022.04.125] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/04/2022] [Revised: 04/17/2022] [Accepted: 04/19/2022] [Indexed: 11/28/2022]
Abstract
BACKGROUND We aimed to explore the value of early improvement in obsessive-compulsive disorder (OCD) along with potential imaging changes after treatment with paroxetine in building diagnostic models and predicting treatment response. METHODS The clinical symptoms of patients with OCD were assessed at baseline and post-treatment (four weeks). Resting-state functional magnetic resonance imaging, the fractional amplitude of low-frequency fluctuations (fALFF) indicator, support vector machine (SVM), support vector regression (SVR), and correlation analysis were used to acquire and analyze the data. RESULTS In comparison with healthy controls, OCD patients at baseline had abnormal fALFF in several brain regions. The abnormal fALFF in the left precuneus/posterior cingulate cortex (PCC) (r = -0.526, p = 0.001) and right middle cingulate cortex (MCC) (r = -0.588, p < 0.001) were negatively correlated with the severity of compulsions. Patients with OCD showed significant clinical improvement along with significantly decreased fALFF in the left precuneus after treatment. The SVM analysis showed that the classifier had an accuracy of 90.00% based on the fALFF in the right precentral gyrus and right MCC at baseline. The SVR analysis showed that the actual remission of OCD was positively correlated with the predicted remission based on the fALFF in the left precuneus/PCC and right MCC at baseline. LIMITATIONS This was a monocentric study with a relatively small sample size, which might restrict the generalizability of the results to other centers. CONCLUSIONS Abnormal spontaneous neural activities in patients with OCD could serve as potential neuroimaging biomarkers for diagnosis and prediction of early treatment response.
Affiliation(s)
- Haohao Yan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha 410011, Hunan, China
- Xiaoxiao Shan
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha 410011, Hunan, China
- Huabing Li
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410011, China
- Feng Liu
- Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China
- Wenbin Guo
- Department of Psychiatry, National Clinical Research Center for Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha 410011, Hunan, China; Department of Psychiatry, The Third People's Hospital of Foshan, Foshan 528000, Guangdong, China.
35
Zhang YH, Li ZD, Zeng T, Chen L, Huang T, Cai YD. Screening gene signatures for clinical response subtypes of lung transplantation. Mol Genet Genomics 2022; 297:1301-1313. [PMID: 35780439 DOI: 10.1007/s00438-022-01918-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2021] [Accepted: 06/12/2022] [Indexed: 11/30/2022]
Abstract
The lung is the most important organ in the human respiratory system, and its normal function is essential for human beings. Under certain pathological conditions, normal lung function can no longer be maintained, and lung transplantation is generally applied to ease patients' breathing and prolong their lives. However, several risk factors exist during and after lung transplantation, including bleeding, infection, and transplant rejection. In particular, transplant rejection is difficult to predict or prevent, leading to the most dangerous complications and severe status in patients undergoing lung transplantation. Given that the most common monitoring and validation methods for lung transplant rejection may take a long time and have low reproducibility, new technologies and methods are required to improve the efficacy and accuracy of rejection monitoring after lung transplantation. Recently, a previous study established the gene expression profiles of patients who underwent lung transplantation. However, it did not provide a tool to predict lung transplantation responses. Here, a further in-depth investigation was conducted on these profiling data. A computational framework incorporating several machine learning algorithms, such as feature selection methods and classification algorithms, was built to establish an effective prediction model that classifies patients into different clinical subgroups corresponding to different rejection responses after lung transplantation. Furthermore, the framework also screened essential genes with functional enrichments and created quantitative rules for distinguishing patients with different rejection responses to lung transplantation. The outcome of this contribution could provide guidelines for clinical treatment of each rejection subtype and help reveal the complicated rejection mechanisms of lung transplantation.
Affiliation(s)
- Yu-Hang Zhang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Zhan Dong Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, 130052, China
- Tao Zeng
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China.
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, 200444, China.
36
Rodriguez J, Schulz S, Voss A, Giraldo BF. Recurrence Plot-based Classification of Ischemic and Dilated Cardiomyopathy Patients. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1394-1397. [PMID: 36086596 DOI: 10.1109/embc48229.2022.9871298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
A large portion of the elderly population is affected by cardiovascular diseases. The early prognosis of cardiomyopathies is still a challenge. The aim of this study was to classify cardiomyopathy patients by their etiology as a function of significant indexes extracted from the characterization of the recurrence plots of the systems involved. Thirty-nine cardiomyopathy patients (CMP), classified as ischemic (ICM, 24 patients) or dilated (DCM, 15 patients), were considered. In addition, thirty-nine control subjects (CON) were used as reference. The beat-to-beat interval (BBI) time series from the electrocardiographic signal, the systolic (SBP) and diastolic (DBP) time series from the blood pressure signal, and the respiratory time (FLW) from the respiratory flow signal were extracted. The recurrence plot of each signal was calculated and characterized by a total of 12 indexes. The best classifiers were used to build support vector machine models. The optimal model to classify ICM versus DCM patients achieved 92.3% accuracy, 95.8% sensitivity, and 86.6% specificity. When comparing CMP patients and CON subjects, the best model achieved 85.8% accuracy, 92.3% sensitivity, and 80.1% specificity. Our results suggest a more deterministic behavior in DCM patients. Clinical Relevance - This study explores the recurrence plot for the classification of ICM and DCM patients.
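The three figures reported for the ICM versus DCM model can be related to each other through a confusion matrix. The sketch below is only illustrative: the abstract does not give the confusion matrix, so the counts are an assumption (ICM taken as the positive class, 24 ICM and 15 DCM patients) chosen because they reproduce the reported percentages up to rounding. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# Assumed counts (not stated in the abstract): 24 ICM patients as positives,
# 15 DCM patients as negatives, with 1 ICM and 2 DCM patients misclassified.
metrics = binary_metrics(tp=23, fn=1, tn=13, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.923
# sensitivity: 0.958
# specificity: 0.867
```

Under these assumed counts the specificity is 13/15, which rounds to 86.7%; the abstract's 86.6% is consistent with truncation rather than rounding.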
37
Ahn JC, Noh YK, Rattan P, Buryska S, Wu T, Kezer CA, Choi C, Arunachalam SP, Simonetto DA, Shah VH, Kamath PS. Machine Learning Techniques Differentiate Alcohol-Associated Hepatitis From Acute Cholangitis in Patients With Systemic Inflammation and Elevated Liver Enzymes. Mayo Clin Proc 2022; 97:1326-1336. [PMID: 35787859 DOI: 10.1016/j.mayocp.2022.01.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Revised: 10/12/2021] [Accepted: 01/14/2022] [Indexed: 11/19/2022]
Abstract
OBJECTIVE To develop machine learning algorithms (MLAs) that can differentiate patients with acute cholangitis (AC) and alcohol-associated hepatitis (AH) using simple laboratory variables. METHODS A study was conducted of 459 adult patients admitted to Mayo Clinic, Rochester, with AH (n=265) or AC (n=194) from January 1, 2010, to December 31, 2019. Ten laboratory variables (white blood cell count, hemoglobin, mean corpuscular volume, platelet count, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, total bilirubin, direct bilirubin, albumin) were collected as input variables. Eight supervised MLAs (decision tree, naive Bayes, logistic regression, k-nearest neighbor, support vector machine, artificial neural networks, random forest, gradient boosting) were trained and tested for classification of AC vs AH. External validation was performed with patients with AC (n=213) and AH (n=92) from the MIMIC-III database. A feature selection strategy was used to choose the best 5-variable combination. There were 143 physicians who took an online quiz to distinguish AC from AH using the same 10 laboratory variables alone. RESULTS The MLAs demonstrated excellent performances with accuracies up to 0.932 and area under the curve (AUC) up to 0.986. In external validation, the MLAs showed comparable accuracy up to 0.909 and AUC up to 0.970. Feature selection in terms of information-theoretic measures was effective, and the choice of the best 5-variable subset produced high performance with an AUC up to 0.994. Physicians did worse, with mean accuracy of 0.790. CONCLUSION Using a few routine laboratory variables, MLAs can differentiate patients with AC and AH and may serve valuable adjunctive roles in cases of diagnostic uncertainty.
Affiliation(s)
- Joseph C Ahn
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN
- Yung-Kyun Noh
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN; Department of Computer Science, Hanyang University, Seoul, South Korea
- Puru Rattan
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN
- Seth Buryska
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN
- Tiffany Wu
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN
- Chansong Choi
  - Division of Internal Medicine, Mayo Clinic, Rochester, MN
- Vijay H Shah
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN; Division of Internal Medicine, Mayo Clinic, Rochester, MN
- Patrick S Kamath
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN
38
Zhang M, Holowko MB, Hayman Zumpe H, Ong CS. Machine Learning Guided Batched Design of a Bacterial Ribosome Binding Site. ACS Synth Biol 2022; 11:2314-2326. [PMID: 35704784 PMCID: PMC9295160 DOI: 10.1021/acssynbio.2c00015] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Optimization of gene expression levels is an essential part of the organism design process. Fine control of this process can be achieved by engineering transcription and translation control elements, including the ribosome binding site (RBS). Unfortunately, the design of specific genetic parts remains challenging because of the lack of reliable design methods. To address this problem, we have created a machine learning guided Design-Build-Test-Learn (DBTL) cycle for the experimental design of bacterial RBSs to demonstrate how small genetic parts can be reliably designed using relatively small, high-quality data sets. We used Gaussian Process Regression for the Learn phase of the cycle and the Upper Confidence Bound multiarmed bandit algorithm for the Design of genetic variants to be tested in vivo. We have integrated these machine learning algorithms with laboratory automation and high-throughput processes for reliable data generation. Notably, by Testing a total of 450 RBS variants in four DBTL cycles, we have experimentally validated RBSs with high translation initiation rates equaling or exceeding our benchmark RBS by up to 34%. Overall, our results show that machine learning is a powerful tool for designing RBSs, and they pave the way toward more complicated genetic devices.
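The Upper Confidence Bound idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' method: the study pairs UCB with Gaussian Process Regression and recommends designs in batches, whereas the code below is plain UCB1 over independent arms with no surrogate model. The function name `ucb1`, the three "designs", and their noise-free reward values are all hypothetical.

```python
import math


def ucb1(pull, n_arms, rounds):
    """Run UCB1 for `rounds` pulls and return how often each arm was chosen."""
    counts = [0] * n_arms    # number of pulls per arm
    totals = [0.0] * n_arms  # summed reward per arm
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise its mean
        else:
            # Choose the arm with the highest mean reward plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += pull(arm)
        counts[arm] += 1
    return counts


# Hypothetical, noise-free "expression strengths" of three candidate designs.
strengths = [0.2, 0.5, 0.8]
counts = ucb1(lambda arm: strengths[arm], n_arms=3, rounds=2000)
print(counts)
```

The exploration bonus shrinks as an arm is sampled more often, so most of the budget ends up on the strongest design while weaker ones are still revisited occasionally.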
Affiliation(s)
- Mengyan Zhang
  - Machine Learning and Artificial Intelligence Future Science Platform, CSIRO, Canberra, ACT 2601, Australia
  - Department of Computer Science, Australian National University, Canberra, ACT 2601, Australia
  - Data61, CSIRO, Canberra, ACT 2601, Australia
- Maciej Bartosz Holowko
  - Synthetic Biology Future Science Platform, CSIRO, Canberra, ACT 2601, Australia
  - Land and Water, CSIRO, Canberra, ACT 2601, Australia
- Huw Hayman Zumpe
  - Synthetic Biology Future Science Platform, CSIRO, Canberra, ACT 2601, Australia
  - Land and Water, CSIRO, Canberra, ACT 2601, Australia
- Cheng Soon Ong
  - Machine Learning and Artificial Intelligence Future Science Platform, CSIRO, Canberra, ACT 2601, Australia
  - Department of Computer Science, Australian National University, Canberra, ACT 2601, Australia
  - Data61, CSIRO, Canberra, ACT 2601, Australia
39
Di Credico A, Perpetuini D, Izzicupo P, Gaggi G, Cardone D, Filippini C, Merla A, Ghinassi B, Di Baldassarre A. Estimation of Heart Rate Variability Parameters by Machine Learning Approaches Applied to Facial Infrared Thermal Imaging. Front Cardiovasc Med 2022; 9:893374. [PMID: 35656402 PMCID: PMC9152459 DOI: 10.3389/fcvm.2022.893374] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/10/2022] [Accepted: 04/04/2022] [Indexed: 01/18/2023] [Open Access]
Abstract
Heart rate variability (HRV) is a reliable tool for the evaluation of several physiological factors modulating the heart rate (HR). Importantly, variations of HRV parameters may be indicative of cardiac diseases and altered psychophysiological conditions. Recently, several studies focused on procedures for contactless HR measurements from facial videos. However, the performances of these methods decrease when illumination is poor. Infrared thermography (IRT) could be useful to overcome this limitation. In fact, IRT can measure the infrared radiations emitted by the skin, working properly even in no visible light illumination conditions. This study investigated the capability of facial IRT to estimate HRV parameters through a face tracking algorithm and a cross-validated machine learning approach, employing photoplethysmography (PPG) as the gold standard for the HR evaluation. The results demonstrated a good capability of facial IRT in estimating HRV parameters. Particularly, strong correlations between the estimated and measured HR (r = 0.7), RR intervals (r = 0.67), TINN (r = 0.71), and pNN50 (%) (r = 0.70) were found, whereas moderate correlations for RMSSD (r = 0.58), SDNN (r = 0.44), and LF/HF (r = 0.48) were discovered. The proposed procedure allows for a contactless estimation of the HRV that could be beneficial for evaluating both cardiac and general health status in subjects or conditions where contact probe sensors cannot be used.
Affiliation(s)
- Andrea Di Credico
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- David Perpetuini
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Pascal Izzicupo
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Giulia Gaggi
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Daniela Cardone
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Chiara Filippini
  - Department of Neurosciences, Imaging and Clinical Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Arcangelo Merla
  - Department of Engineering and Geology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Barbara Ghinassi
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
- Angela Di Baldassarre
  - Department of Medicine and Aging Sciences, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
  - Reprogramming and Cell Differentiation Lab, Center for Advanced Studies and Technology, University "G. d'Annunzio" of Chieti - Pescara, Chieti, Italy
40
Jiang W, Merhar SL, Zeng Z, Zhu Z, Yin W, Zhou Z, Wang L, He L, Vannest J, Lin W. Neural alterations in opioid-exposed infants revealed by edge-centric brain functional networks. Brain Commun 2022; 4:fcac112. [PMID: 35602654 PMCID: PMC9117006 DOI: 10.1093/braincomms/fcac112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/29/2022] [Accepted: 05/03/2022] [Indexed: 12/02/2022] [Open Access]
Abstract
Prenatal opioid exposure has been linked to adverse effects spanning multiple neurodevelopmental domains, including cognition, motor development, attention, and vision. However, the neural basis of these abnormalities is largely unknown. A total of 49 infants, including 21 opioid-exposed and 28 controls, were enrolled and underwent MRI (43 ± 6 days old) after birth, including resting state functional MRI. Edge-centric functional networks based on dynamic functional connections were constructed, and machine-learning methods were employed to identify neural features distinguishing opioid-exposed infants from unexposed controls. An accuracy of 73.6% (sensitivity 76.25% and specificity 69.33%) was achieved using 10 times 10-fold cross-validation, which substantially outperformed those obtained using conventional static functional connections (accuracy 56.9%). More importantly, we identified that prenatal opioid exposure preferentially affects inter- rather than intra-network dynamic functional connections, particularly with the visual, subcortical, and default mode networks. Consistent results at the brain regional and connection levels were also observed, where the brain regions and connections associated with visual and higher order cognitive functions played pivotal roles in distinguishing opioid-exposed infants from controls. Our findings support the clinical phenotype of infants exposed to opioids in utero and may potentially explain the higher rates of visual and emotional problems observed in this population. Finally, our findings suggested that edge-centric networks could better capture the neural differences between opioid-exposed infants and controls by abstracting the intrinsic co-fluctuation along edges, which may provide a promising tool for future studies focusing on investigating the effects of prenatal opioid exposure on neurodevelopment.
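The "10 times 10-fold cross-validation" protocol named in this abstract can be sketched as an index generator. This is a minimal illustration, not the study's pipeline: the abstract does not say whether folds were stratified by group, so the splits below are plain shuffled folds, and the function name `repeated_kfold` and the seed are assumptions. The sample size of 49 matches the number of infants enrolled.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        folds = [order[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test


# 49 subjects, as in the abstract; 10 repeats x 10 folds gives 100 splits.
splits = list(repeated_kfold(49))
print(len(splits))  # 100
```

Averaging a metric over all 100 test folds reduces the dependence of the estimate on any single random partition, which matters for a sample of this size.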
Affiliation(s)
- Weixiong Jiang
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
- Stephanie L. Merhar
  - Perinatal Institute, Division of Neonatology, Cincinnati Children’s Hospital and University of Cincinnati Department of Pediatrics, Cincinnati OH, United States
- Zhuohao Zeng
  - East Chapel Hill High School, Chapel Hill, North Carolina, United States
- Ziliang Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
- Weiyan Yin
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
- Zhen Zhou
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
- Li Wang
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
- Lili He
  - Department of Radiology, Cincinnati Children’s Hospital and University of Cincinnati, Cincinnati OH, United States
- Jennifer Vannest
  - Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati OH, United States
- Weili Lin
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
  - Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
41
Kumar R, Sharma A, Alexiou A, Bilgrami AL, Kamal MA, Ashraf GM. DeePred-BBB: A Blood Brain Barrier Permeability Prediction Model With Improved Accuracy. Front Neurosci 2022; 16:858126. [PMID: 35592264 PMCID: PMC9112838 DOI: 10.3389/fnins.2022.858126] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 01/19/2022] [Accepted: 03/14/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
The blood-brain barrier (BBB) is a selective and semipermeable boundary that maintains homeostasis inside the central nervous system (CNS). The BBB permeability of compounds is an important consideration during CNS-acting drug development and is difficult to formulate in a succinct manner. Clinical experiments are the most accurate method of measuring BBB permeability. However, they are time-consuming and labor-intensive. Therefore, numerous efforts have been made to predict the BBB permeability of compounds using computational methods. However, the accuracy of BBB permeability prediction models has always been an issue. To improve the accuracy of the BBB permeability prediction, we applied deep learning and machine learning algorithms to a dataset of 3,605 diverse compounds. Each compound was encoded with 1,917 features containing 1,444 physicochemical (1D and 2D) properties, 166 molecular access system fingerprints (MACCS), and 307 substructure fingerprints. The prediction performance metrics of the developed models were compared and analyzed. The prediction accuracy of the deep neural network (DNN), one-dimensional convolutional neural network, and convolutional neural network by transfer learning was found to be 98.07, 97.44, and 97.61%, respectively. The best performing DNN-based model was selected for the development of the “DeePred-BBB” model, which can predict the BBB permeability of compounds using their simplified molecular input line entry system (SMILES) notations. It could be useful in the screening of compounds based on their BBB permeability at the preliminary stages of drug development. The DeePred-BBB is made available at https://github.com/12rajnish/DeePred-BBB.
Affiliation(s)
- Rajnish Kumar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow, India
- Anju Sharma
  - Department of Applied Science, Indian Institute of Information Technology Allahabad, Prayagraj, India
- Athanasios Alexiou
  - Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
  - AFNP Med Austria, Vienna, Austria
- Anwar L. Bilgrami
  - Department of Entomology, Rutgers, The State University of New Jersey, New Brunswick, NJ, United States
  - Deanship of Scientific Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Mohammad Amjad Kamal
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
  - Enzymoics, Hebersham, NSW, Australia
  - Novel Global Community Educational Foundation, Hebersham, NSW, Australia
- Ghulam Md Ashraf
  - Pre-Clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - *Correspondence: Ghulam Md Ashraf
42
Villavicencio CN, Macrohon JJ, Inbaraj XA, Jeng JH, Hsieh JG. Development of a Machine Learning Based Web Application for Early Diagnosis of COVID-19 Based on Symptoms. Diagnostics (Basel) 2022; 12:821. [PMID: 35453869 PMCID: PMC9026809 DOI: 10.3390/diagnostics12040821] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/17/2022] [Revised: 03/10/2022] [Accepted: 03/24/2022] [Indexed: 12/04/2022] [Open Access]
Abstract
Detecting the presence of a disease requires laboratory tests, testing kits, and devices; however, these are not always on hand. This study proposes a new approach to disease detection that uses machine learning algorithms to analyze the symptoms experienced by a person, without requiring laboratory tests. Six supervised machine learning algorithms (J48 decision tree, random forest, support vector machine, k-nearest neighbors, naïve Bayes, and artificial neural networks) were applied to the “COVID-19 Symptoms and Presence Dataset” from Kaggle. Through hyperparameter optimization and 10-fold cross validation, we attained the highest possible performance of each algorithm. A comparative analysis was performed according to accuracy, sensitivity, specificity, and area under the ROC curve. Results show that random forest, support vector machine, k-nearest neighbors, and artificial neural networks outperformed the other algorithms by attaining 98.84% accuracy, 100% sensitivity, 98.79% specificity, and 98.84% area under the ROC curve. Finally, we developed a web application that allows users to select the symptoms they are currently experiencing and uses them to predict the presence of COVID-19 through the developed prediction model. Based on this mechanism, the proposed method can effectively predict the presence or absence of COVID-19 in a person in real time, without using laboratory tests, kits, or devices.
Affiliation(s)
- Charlyn Nayve Villavicencio
  - Department of Information Engineering, I-Shou University, Kaohsiung City 84001, Taiwan
  - College of Information and Communications Technology, Bulacan State University, Malolos City 3000, Philippines
  - Correspondence: Tel.: +886-958-450-028
- Julio Jerison Macrohon
  - Department of Information Engineering, I-Shou University, Kaohsiung City 84001, Taiwan
- Xavier Alphonse Inbaraj
  - Department of Information Engineering, I-Shou University, Kaohsiung City 84001, Taiwan
- Jyh-Horng Jeng
  - Department of Information Engineering, I-Shou University, Kaohsiung City 84001, Taiwan
- Jer-Guang Hsieh
  - Department of Electrical Engineering, I-Shou University, Kaohsiung City 84001, Taiwan
43
Vinayagam A, Othman ML, Veerasamy V, Saravan Balaji S, Ramaiyan K, Radhakrishnan P, Raman MD, Abdul Wahab NI. A random subspace ensemble classification model for discrimination of power quality events in solar PV microgrid power network. PLoS One 2022; 17:e0262570. [PMID: 35085307 PMCID: PMC8794120 DOI: 10.1371/journal.pone.0262570] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/22/2021] [Accepted: 12/29/2021] [Indexed: 11/18/2022] [Open Access]
Abstract
This study proposes an SVM-based Random Subspace (RS) ensemble classifier to discriminate different Power Quality Events (PQEs) in a photovoltaic (PV) connected Microgrid (MG) model. The MG model is developed and simulated in the presence of different PQEs (voltage- and harmonic-related signals and distinctive transients) in both on-grid and off-grid modes of the MG network. In the pre-classification stage, features are extracted from numerous PQE signals by Discrete Wavelet Transform (DWT) analysis, and the extracted features are used to train the classifiers at the final stage. In this study, three kernel types of SVM classifiers (Linear, Quadratic, and Cubic) are first used to predict the different PQEs. The results show that the Cubic kernel SVM classifier offers higher accuracy and better performance than the other kernel types (Linear and Quadratic). Further, to enhance the accuracy of the SVM classifiers, an SVM-based RS ensemble model is proposed, and its effectiveness is verified against the results of the kernel-based SVM classifiers under the standard test condition (STC) and under varying solar irradiance of the PV in real time. From the final results, it can be concluded that the proposed method is more robust and offers higher classification accuracy than the kernel-based SVM classifiers.
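The random subspace idea in this abstract (train each base classifier on a random subset of the features, then combine the predictions) can be sketched in a few lines. This is an illustration under stated assumptions, not the study's model: a nearest-centroid learner stands in for the SVM base classifiers to keep the example dependency-free, majority voting is assumed as the combination rule, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y, features):
    """Nearest-centroid base learner restricted to the given feature indices."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, features, row):
    """Return the label whose centroid is closest on the selected features."""
    def dist2(center):
        return sum((row[f] - center[k]) ** 2 for k, f in enumerate(features))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def fit_random_subspace(X, y, n_learners=15, subspace_size=2, seed=0):
    """Train each base learner on its own random subset of the features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble


def predict(ensemble, row):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroid(c, f, row) for f, c in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes with four features.
X = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.0, 0.1],
     [1.0, 1.0, 1.0, 1.0], [0.9, 0.8, 1.0, 0.9]]
y = [0, 0, 1, 1]
ensemble = fit_random_subspace(X, y)
print(predict(ensemble, [0.1, 0.1, 0.1, 0.1]), predict(ensemble, [0.8, 0.9, 0.7, 1.0]))
# 0 1
```

Because each learner sees different features, their errors tend to be less correlated than those of learners trained on the full feature set, which is the usual motivation for this kind of ensemble.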
Affiliation(s)
- Arangarajan Vinayagam
  - Department of Electrical and Electronics Engineering, New Horizon College of Engineering, Bangalore, India
- Mohammad Lutfi Othman
  - Advanced Lightning, Power and Energy Research (ALPER), Department of Electrical and Electronics Engineering, Universiti Putra Malaysia (UPM), Selangor, Malaysia
- Veerapandiyan Veerasamy
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Suganthi Saravan Balaji
  - Department of Information Technology, College of Engineering and Computer Science, Lebanese French University, Erbil, Kurdistan Region, Iraq
- Kalaivani Ramaiyan
  - Department of Electrical and Electronics Engineering, Rajalakshmi Engineering College, Chennai, India
- Padmavathi Radhakrishnan
  - Department of Electrical and Electronics Engineering, Rajalakshmi Engineering College, Chennai, India
- Mohan Das Raman
  - Department of Electrical and Electronics Engineering, New Horizon College of Engineering, Bangalore, India
- Noor Izzri Abdul Wahab
  - Advanced Lightning, Power and Energy Research (ALPER), Department of Electrical and Electronics Engineering, Universiti Putra Malaysia (UPM), Selangor, Malaysia
44
Shahraki MF, Atanaki FF, Ariaeenejad S, Ghaffari MR, Norouzi‐Beirami MH, Maleki M, Salekdeh GH, Kavousi K. A computational learning paradigm to targeted discovery of biocatalysts from metagenomic data: a case study of lipase identification. Biotechnol Bioeng 2022; 119:1115-1128. [DOI: 10.1002/bit.28037] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/11/2021] [Revised: 08/18/2021] [Accepted: 12/01/2021] [Indexed: 11/09/2022]
Affiliation(s)
- Mehdi Foroozandeh Shahraki
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Fereshteh Fallah Atanaki
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Shohreh Ariaeenejad
  - Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran
- Mohammad Reza Ghaffari
  - Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran
- Mohammad Hossein Norouzi‐Beirami
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
  - Department of Computer Engineering, Osku Branch, Islamic Azad University, Osku, Iran
- Morteza Maleki
  - Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran
- Ghasem Hosseini Salekdeh
  - Department of Systems and Synthetic Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
- Kaveh Kavousi
  - Laboratory of Complex Biological Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
45
Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol 2022; 23:40-55. [PMID: 34518686 DOI: 10.1038/s41580-021-00407-0] [Citation(s) in RCA: 564] [Impact Index Per Article: 282.0] [Accepted: 07/23/2021] [Indexed: 02/08/2023]
Abstract
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.
Affiliation(s)
- Joe G Greener
  - Department of Computer Science, University College London, London, UK
- Shaun M Kandathil
  - Department of Computer Science, University College London, London, UK
- Lewis Moffat
  - Department of Computer Science, University College London, London, UK
- David T Jones
  - Department of Computer Science, University College London, London, UK
46
Niu H, Li W, Wang G, Hu Q, Hao R, Li T, Zhang F, Cheng T. Performances of whole-brain dynamic and static functional connectivity fingerprinting in machine learning-based classification of major depressive disorder. Front Psychiatry 2022; 13:973921. [PMID: 35958666 PMCID: PMC9360427 DOI: 10.3389/fpsyt.2022.973921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 07/08/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
BACKGROUND Alterations in static and dynamic functional connectivity during resting state have been widely reported in major depressive disorder (MDD). The objective of this study was to compare the performances of whole-brain dynamic and static functional connectivity combined with machine learning approach in differentiating MDD patients from healthy controls at the individual subject level. Given the dynamic nature of brain activity, we hypothesized that dynamic connectivity would outperform static connectivity in the classification. METHODS Seventy-one MDD patients and seventy-one well-matched healthy controls underwent resting-state functional magnetic resonance imaging scans. Whole-brain dynamic and static functional connectivity patterns were calculated and utilized as classification features. Linear kernel support vector machine was employed to design the classifier and a leave-one-out cross-validation strategy was used to assess classifier performance. RESULTS Experimental results of dynamic functional connectivity-based classification showed that MDD patients could be discriminated from healthy controls with an excellent accuracy of 100% irrespective of whether or not global signal regression (GSR) was performed (permutation test with P < 0.0002). Brain regions with the most discriminating dynamic connectivity were mainly and reliably located within the default mode network, cerebellum, and subcortical network. In contrast, the static functional connectivity-based classifiers exhibited unstable classification performances, i.e., a low accuracy of 38.0% without GSR (P = 0.9926) while a high accuracy of 96.5% with GSR (P < 0.0002); moreover, there was a considerable variability in the distribution of brain regions with static connectivity most informative for classification. 
CONCLUSION These findings suggest the superiority of dynamic functional connectivity in machine learning-based classification of depression, which may be helpful for a better understanding of the neural basis of MDD as well as for the development of effective computer-aided diagnosis tools in clinical settings.
Affiliation(s)
- Heng Niu: Department of MRI, Shanxi Cardiovascular Hospital, Taiyuan, China
- Weirong Li: Department of Neurology, Shanxi Cardiovascular Hospital, Taiyuan, China
- Guiquan Wang: Department of Neurology, Shanxi Cardiovascular Hospital, Taiyuan, China
- Qiong Hu: Department of Neurology, Shanxi Cardiovascular Hospital, Taiyuan, China
- Rui Hao: Department of Neurology, Shanxi Cardiovascular Hospital, Taiyuan, China
- Tianliang Li: Department of Ultrasound, Shanxi Cardiovascular Hospital, Taiyuan, China
- Fan Zhang: Department of Medical Imaging, Shanxi Traditional Chinese Medical Hospital, Taiyuan, China
- Tao Cheng: Department of Neurology, Shanxi Cardiovascular Hospital, Taiyuan, China
|
47
|
Banerjee D, Jindra MA, Linot AJ, Pfleger BF, Maranas CD. EnZymClass: Substrate specificity prediction tool of plant acyl-ACP thioesterases based on ensemble learning. Current Research in Biotechnology 2022. [DOI: 10.1016/j.crbiot.2021.12.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/09/2023]
|
48
|
Panja S, Rahem S, Chu CJ, Mitrofanova A. Big Data to Knowledge: Application of Machine Learning to Predictive Modeling of Therapeutic Response in Cancer. Curr Genomics 2021; 22:244-266. [PMID: 35273457 PMCID: PMC8822229 DOI: 10.2174/1389202921999201224110101] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/02/2020] [Revised: 09/16/2020] [Accepted: 09/30/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND In recent years, the availability of high-throughput technologies, the establishment of large molecular patient data repositories, and advances in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that allows effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design.
OBJECTIVE In this review, we discuss machine learning techniques used for modeling treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We outline their mathematical foundations and discuss their limitations and alternative approaches in light of their application to therapeutic response modeling in cancer.
CONCLUSION We hypothesize that the increase in the number of patient profiles and the potential temporal monitoring of patient data will establish even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.
Affiliation(s)
- Antonina Mitrofanova (corresponding author): Department of Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, Newark, NJ 07107, USA
|
49
|
Sharma A, Kumar R, Varadwaj PK. OBPred: feature-fusion-based deep neural network classifier for odorant-binding protein prediction. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06347-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
|
50
|
Sraitih M, Jabrane Y, Hajjam El Hassani A. An Automated System for ECG Arrhythmia Detection Using Machine Learning Techniques. J Clin Med 2021; 10:5450. [PMID: 34830732 PMCID: PMC8618527 DOI: 10.3390/jcm10225450] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/23/2021] [Revised: 11/12/2021] [Accepted: 11/15/2021] [Indexed: 12/29/2022]
Abstract
Recent advances in multiple types of devices and in machine learning models make it possible for automatic computer-aided diagnosis (CAD) systems for ECG classification to be practicable in an actual clinical environment. This requires ECG arrhythmia classification methods that are inter-patient. In this paper, we aim to design and investigate an automatic classification system that applies an inter-patient separation paradigm to a new comprehensive ECG database, in order to improve detection of the minority arrhythmia classes without performing any feature extraction. We investigated four supervised machine learning models: support vector machine (SVM), k-nearest neighbors (KNN), Random Forest (RF), and the ensemble of these three methods. We tested the performance of these techniques in classifying Normal beat (NOR), Left Bundle Branch Block Beat (LBBB), Right Bundle Branch Block Beat (RBBB), Premature Atrial Contraction (PAC), and Premature Ventricular Contraction (PVC), using inter-patient real ECG records from MIT-DB after segmentation and normalization of the data, and measured four metrics: accuracy, precision, recall, and F1-score. The experimental results showed that, without any complicated data pre-processing or feature engineering, the SVM classifier outperforms the other methods under the proposed inter-patient paradigm on all metrics used in the experiments, achieving an accuracy of 0.83, and also in terms of computational cost, which remains a very important factor in implementing classification models for ECG arrhythmia. This method is more realistic in a clinical environment, where a variety of ECG signals are collected from different patients.
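As a rough illustration of the ensemble and the four metrics named in this abstract, the sketch below combines per-beat labels from three models and scores the result. It assumes the ensemble is a simple majority vote, which the abstract does not state; the predictions are made-up toy values and the function names are hypothetical.

```python
from collections import Counter

def majority_vote(*model_predictions):
    """Combine per-beat labels from several models.
    Assumption: plain majority voting; ties go to the earliest model listed."""
    combined = []
    for votes in zip(*model_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == top))
    return combined

def accuracy(y_true, y_pred):
    """Share of beats whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def class_metrics(y_true, y_pred, label):
    """Precision, recall and F1-score for one beat class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy labels for six beats (hypothetical; not taken from the ECG records).
truth = ["NOR", "NOR", "LBBB", "RBBB", "PAC", "PVC"]
svm = ["NOR", "NOR", "LBBB", "RBBB", "NOR", "PVC"]
knn = ["NOR", "PAC", "LBBB", "NOR", "PAC", "PVC"]
rf = ["NOR", "NOR", "RBBB", "RBBB", "NOR", "NOR"]

ensemble = majority_vote(svm, knn, rf)
ensemble_accuracy = accuracy(truth, ensemble)
nor_precision, nor_recall, nor_f1 = class_metrics(truth, ensemble, "NOR")
pac_precision, pac_recall, pac_f1 = class_metrics(truth, ensemble, "PAC")
```

In this toy case the single PAC beat is voted into NOR, so overall accuracy stays high while PAC recall drops to zero, which is why per-class metrics matter for the minority arrhythmia classes.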
Affiliation(s)
- Mohamed Sraitih: MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco
- Younes Jabrane (corresponding author; Tel.: +212-524-434-745): MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco
- Amir Hajjam El Hassani: Nanomedicine Imagery & Therapeutics Laboratory, EA4662—UBFC, UTBM, 90000 Belfort, France
|