1
|
Identifying a Correlation among Qualitative Non-Numeric Parameters in Natural Fish Microbe Dataset Using Machine Learning. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12125927] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
Abstract
Recent technical innovations and developments in computer-based technology have enabled bioscience researchers to acquire comprehensive datasets and identify unique parameters within experimental datasets. However, field researchers may face datasets that exhibit few associations among measurement results (e.g., from analytical instruments, phenotype observations, and field environmental data) and that contain non-numerical, qualitative parameters, which make statistical analyses difficult. Here, we propose an advanced analysis scheme that combines two machine learning steps to mine association rules between non-numerical parameters. The aim of this analysis is to identify relationships between variables and to enable the visualization of association rules in data from field-collected samples, which show few correlations among genetic, physical, and non-numerical qualitative parameters. The analysis scheme presented here may increase the potential to identify important characteristics of big datasets.
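As a rough illustration of what mining association rules between non-numerical parameters can look like, the sketch below computes support, confidence, and lift for pairwise rules over categorical records. It is not the authors' pipeline: the sample records, the parameter names (`habitat`, `diet`, `microbe`), the helper names, and the thresholds are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter
from itertools import combinations

# Hypothetical field samples: each record maps a qualitative parameter to a label.
samples = [
    {"habitat": "river", "diet": "herbivore", "microbe": "A"},
    {"habitat": "river", "diet": "herbivore", "microbe": "A"},
    {"habitat": "sea", "diet": "carnivore", "microbe": "B"},
    {"habitat": "sea", "diet": "carnivore", "microbe": "B"},
    {"habitat": "sea", "diet": "herbivore", "microbe": "A"},
    {"habitat": "river", "diet": "carnivore", "microbe": "B"},
]


def to_items(record):
    """Encode a record as a set of 'parameter=value' items (one transaction)."""
    return frozenset(f"{key}={value}" for key, value in record.items())


def pairwise_rules(records, min_support=0.3, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence, lift) for single-item rules lhs -> rhs."""
    transactions = [to_items(r) for r in records]
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        frozenset(pair) for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n  # fraction of samples containing both items
        if support < min_support:
            continue
        a, b = sorted(pair)
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]     # estimate of P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # > 1 suggests positive association
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence, lift))
    # Strongest lift first; ties broken alphabetically for a stable order.
    return sorted(rules, key=lambda r: (-r[4], r[0], r[1]))


for lhs, rhs, support, confidence, lift in pairwise_rules(samples):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

Encoding each label as a `parameter=value` item is what lets purely qualitative fields be treated as transactions; real datasets would typically need larger itemsets and a dedicated library rather than this pairwise enumeration.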
|
2
|
Shiokawa Y, Date Y, Kikuchi J. Application of kernel principal component analysis and computational machine learning to exploration of metabolites strongly associated with diet. Sci Rep 2018; 8:3426. [PMID: 29467421 PMCID: PMC5821832 DOI: 10.1038/s41598-018-20121-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/28/2017] [Accepted: 01/08/2018] [Indexed: 12/13/2022] Open
Abstract
Computer-based technological innovation provides advancements in sophisticated and diverse analytical instruments, enabling massive amounts of data collection with relative ease. This is accompanied by a fast-growing demand for technological progress in data mining methods for analysis of big data derived from chemical and biological systems. From this perspective, use of a general “linear” multivariate analysis alone limits interpretations due to “non-linear” variations in metabolic data from living organisms. Here we describe a kernel principal component analysis (KPCA)-incorporated analytical approach for extracting useful information from metabolic profiling data. To overcome the limitation of important variable (metabolite) determinations, we incorporated a random forest conditional variable importance measure into our KPCA-based analytical approach to demonstrate the relative importance of metabolites. Using a market basket analysis, hippurate, the most important variable detected in the importance measure, was associated with high levels of some vitamins and minerals present in foods eaten the previous day, suggesting a relationship between increased hippurate and intake of a wide variety of vegetables and fruits. Therefore, the KPCA-incorporated analytical approach described herein enabled us to capture input–output responses, and should be useful not only for metabolic profiling but also for profiling in other areas of biological and environmental systems.
Affiliation(s)
- Yuka Shiokawa
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 235-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Japan
- Yasuhiro Date
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 235-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Japan
- Jun Kikuchi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 235-0045, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Japan; Graduate School of Bioagricultural Sciences and School of Agricultural Sciences, Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
|