1. Compositional PLS biplot based on pivoting balances: an application to explore the association between 24-h movement behaviours and adiposity. Comput Stat 2023. [DOI: 10.1007/s00180-023-01324-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/29/2023]
Abstract
Movement behaviour data are compositional in nature, therefore the logratio methodology has been demonstrated appropriate for their statistical analysis. Compositional data can be mapped into the ordinary real space through new sets of variables (orthonormal logratio coordinates) representing balances between the original compositional parts. Geometric rotation between orthonormal logratio coordinates systems can be used to extract relevant information from any of them. We exploit this idea to introduce the concept of pivoting balances, which facilitates the construction and use of interpretable balances according to the purpose of the data analysis. Moreover, graphical representation through ternary diagrams has been ordinarily used to explore time-use compositions consisting of, or being amalgamated into, three parts. Data dimension reduction techniques can however serve well for visualisation and facilitate understanding in the case of larger compositions. We here develop suitable pivoting balance coordinates that in combination with an adapted formulation of compositional partial least squares regression biplots enable meaningful visualisation of more complex time-use patterns and their relationships with an outcome variable. The use and features of the proposed method are illustrated in a study examining the association between movement behaviours and adiposity from a sample of Czech school-aged girls. The results suggest that an adequate strategy for obesity prevention in this group would be to focus on achieving a positive balance of vigorous physical activity in combination with sleep against the other daily behaviours.
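As a rough illustration of the orthonormal logratio coordinates mentioned in this abstract, the sketch below computes standard pivot coordinates, in which the first coordinate is the balance of the first part against the geometric mean of the remaining parts. It is not the pivoting-balance construction proposed in the paper; the function name `pivot_coordinates` and the four-part time-use composition are assumptions made up for the example.

```python
import numpy as np

def pivot_coordinates(x):
    """Standard pivot (orthonormal logratio) coordinates of one composition.

    Coordinate i contrasts part i with the geometric mean of the parts
    that follow it, scaled so that the coordinate system is orthonormal.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1 or x.size < 2:
        raise ValueError("expected a 1-D composition with at least two parts")
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    log_x = np.log(x)
    d = log_x.size
    z = np.empty(d - 1)
    for i in range(d - 1):
        rest = d - i - 1  # number of parts after part i
        # log(part i) minus the log geometric mean of the remaining parts
        z[i] = np.sqrt(rest / (rest + 1)) * (log_x[i] - log_x[i + 1:].mean())
    return z

# Hypothetical daily time-use composition in hours (illustrative values only):
# sleep, sedentary time, light activity, moderate-to-vigorous activity.
day = np.array([9.0, 10.0, 4.0, 1.0])
print(pivot_coordinates(day))
```

Reordering the parts so that a different behaviour comes first yields a first coordinate that isolates that behaviour against all the others, which is the sense in which such coordinate systems are rotations of one another.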
2. Geographical Origin Identification of Chinese Tomatoes Using Long-Wave Fourier-Transform Near-Infrared Spectroscopy Combined with Deep Learning Methods. FOOD ANAL METHOD 2023. [DOI: 10.1007/s12161-023-02444-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
3. Shen Y, Li B, Li G, Lang C, Wang H, Zhu J, Jia N, Liu L. Rapid identification of producing area of wheat using terahertz spectroscopy combined with chemometrics. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 269:120694. [PMID: 34922288 DOI: 10.1016/j.saa.2021.120694] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/17/2021] [Revised: 11/12/2021] [Accepted: 11/30/2021] [Indexed: 06/14/2023]
Abstract
Wheat from different producing areas has different flavors and properties, so identifying the producing area of wheat is important for assuring its quality. The traditional method of determining the producing area of wheat is time-consuming and complex, and requires extensive pretreatment. The purpose of this research is to develop a new method for determining wheat producing areas by terahertz time-domain spectroscopy in combination with chemometrics. First, a total of 240 wheat samples from Shandong, Shaanxi, Henan, Hebei and Anhui Provinces of China were collected, and the time-domain spectral signals, frequency-domain spectral signals, and absorption coefficient spectral signals of the samples were obtained. Then, four preprocessing methods, namely Savitzky-Golay (S-G), multiplicative scatter correction (MSC), mean centering, and standard normal variate (SNV), were applied to the absorption coefficient spectral signals, and uninformative variable elimination (UVE) was used for variable selection of the THz spectral data, in order to develop an effective prediction model. Finally, chemometric methods, including partial least squares discriminant analysis (PLS-DA), back propagation neural network (BPNN) and least squares support vector machine (LS-SVM) qualitative models, were used for model building, and the discrimination results obtained with these models were compared. According to the test results, the comprehensive discrimination accuracy of wheat from different origins by the SNV-LS-SVM model reached 96.76%. These results demonstrate that an accurate qualitative analysis of the producing area of wheat samples can be achieved by terahertz time-domain spectroscopy combined with chemometrics, which can provide a fast and accurate solution for grain security detection and origin tracing.
Affiliation(s)
- Yin Shen
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; College of Engineering and Technology, Southwest University, Chongqing 400715, China
- Bin Li
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Guanglin Li
  - College of Engineering and Technology, Southwest University, Chongqing 400715, China
- Chongchong Lang
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Haifeng Wang
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Jun Zhu
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Nan Jia
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Lirong Liu
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
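Of the preprocessing steps named in the abstract above, standard normal variate (SNV) is the simplest to show concretely: each spectrum is centred on its own mean and scaled by its own standard deviation. The following is a minimal sketch under that textbook definition, not the authors' pipeline; the function name `snv`, the use of the sample standard deviation (`ddof=1`), and the toy spectra are assumptions for illustration.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: row-wise centring and scaling.

    `spectra` is a 2-D array with one spectrum per row and one
    frequency (or wavelength) channel per column.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    # Sample standard deviation per spectrum (ddof=1 is an assumed convention).
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Toy data: the second spectrum is the first one with a gain and an offset,
# the kind of multiplicative and additive effect SNV is meant to remove.
toy = np.array([[1.0, 2.0, 3.0, 4.0],
                [3.0, 5.0, 7.0, 9.0]])
print(snv(toy))
```

Because the correction uses only the spectrum's own statistics, it can be applied to new samples without reference to the training set.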
4. Alenazi A. A review of compositional data analysis and recent advances. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.2014890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Abdulaziz Alenazi
  - Department of Mathematics, College of Science, Northern Border University, Arar, Saudi Arabia
5. Štefelová N, Palarea‐Albaladejo J, Hron K. Weighted pivot coordinates for partial least squares‐based marker discovery in high‐throughput compositional data. Stat Anal Data Min 2021. [DOI: 10.1002/sam.11514] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Affiliation(s)
- Karel Hron
  - Faculty of Science, Palacký University Olomouc, Czech Republic
6. Data Analysis Strategies for Microbiome Studies in Human Populations-a Systematic Review of Current Practice. mSystems 2021; 6:6/1/e01154-20. [PMID: 33622856 PMCID: PMC8573962 DOI: 10.1128/msystems.01154-20] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022] Open
Abstract
Reproducibility is a major issue in microbiome studies, which is partly caused by missing consensus about data analysis strategies. The complex nature of microbiome data, which are high-dimensional, zero-inflated, and compositional, makes them challenging to analyze, as they often violate assumptions of classic statistical methods. With advances in human microbiome research, research questions and study designs increase in complexity so that more sophisticated data analysis concepts are applied. To improve current practice of the analysis of microbiome studies, it is important to understand what kind of research questions are asked and which tools are used to answer these questions. We conducted a systematic literature review considering all publications focusing on the analysis of human microbiome data from June 2018 to June 2019. Of 1,444 studies screened, 419 fulfilled the inclusion criteria. Information about research questions, study designs, and analysis strategies were extracted. The results confirmed the expected shift to more advanced research questions, as one-third of the studies analyzed clustered data. Although heterogeneity in the methods used was found at any stage of the analysis process, it was largest for differential abundance testing. Especially if the underlying data structure was clustered, we identified a lack of use of methods that appropriately addressed the underlying data structure while taking into account additional dependencies in the data. Our results confirm considerable heterogeneity in analysis strategies among microbiome studies; increasingly complex research questions require better guidance for analysis strategies. IMPORTANCE The human microbiome has emerged as an important factor in the development of health and disease. Growing interest in this topic has led to an increasing number of studies investigating the human microbiome using high-throughput sequencing methods. 
However, the development of suitable analytical methods for analyzing microbiome data has not kept pace with the rapid progression in the field. It is crucial to understand current practice to identify the scope for development. Our results highlight the need for an extensive evaluation of the strengths and shortcomings of existing methods in order to guide the choice of proper analysis strategies. We have identified where new methods could be designed to address more advanced research questions while taking into account the complex structure of the data.
7. Gu J, Cui B, Lu S. A classification framework for multivariate compositional data with Dirichlet feature embedding. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
8. Robust distance measure to detect outliers for categorical data. Soft comput 2020. [DOI: 10.1007/s00500-019-04340-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
9. Chen J, Zhang X, Hron K. Partial least squares regression with compositional response variables and covariates. J Appl Stat 2020; 48:3130-3149. [DOI: 10.1080/02664763.2020.1795813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Affiliation(s)
- Jiajia Chen
  - School of Statistics, Shanxi University of Finance and Economics, Taiyuan, People's Republic of China
- Xiaoqin Zhang
  - School of Statistics, Shanxi University of Finance and Economics, Taiyuan, People's Republic of China
- Karel Hron
  - Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University, Olomouc, Czech Republic
10. Interpretable Log Contrasts for the Classification of Health Biomarkers: a New Approach to Balance Selection. mSystems 2020; 5:5/2/e00230-19. [PMID: 32265314 PMCID: PMC7141889 DOI: 10.1128/msystems.00230-19] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023] Open
Abstract
Since the turn of the century, technological advances have made it possible to obtain the molecular profile of any tissue in a cost-effective manner. Among these advances are sophisticated high-throughput assays that measure the relative abundances of microorganisms, RNA molecules, and metabolites. While these data are most often collected to gain new insights into biological systems, they can also be used as biomarkers to create clinically useful diagnostic classifiers. How best to classify high-dimensional -omics data remains an area of active research. However, few explicitly model the relative nature of these data and instead rely on cumbersome normalizations. This report (i) emphasizes the relative nature of health biomarkers, (ii) discusses the literature surrounding the classification of relative data, and (iii) benchmarks how different transformations perform for regularized logistic regression across multiple biomarker types. We show how an interpretable set of log contrasts, called balances, can prepare data for classification.
We propose a simple procedure, called discriminative balance analysis, to select groups of 2 and 3 bacteria that can together discriminate between experimental conditions. Discriminative balance analysis is a fast, accurate, and interpretable alternative to data normalization. IMPORTANCE High-throughput sequencing provides an easy and cost-effective way to measure the relative abundance of bacteria in any environmental or biological sample. When these samples come from humans, the microbiome signatures can act as biomarkers for disease prediction. However, because bacterial abundance is measured as a composition, the data have unique properties that make conventional analyses inappropriate. To overcome this, analysts often use cumbersome normalizations. This article proposes an alternative method that identifies pairs and trios of bacteria whose stoichiometric presence can differentiate between diseased and nondiseased samples. By using interpretable log contrasts called balances, we developed an entirely normalization-free classification procedure that reduces the feature space and improves the interpretability, without sacrificing classifier performance.
11. Wang H, Wang Z, Wang S. Sliced inverse regression method for multivariate compositional data modeling. Stat Pap (Berl) 2019. [DOI: 10.1007/s00362-019-01093-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
12.
13. Lee SC, Tang MS, Lim YAL, Choy SH, Kurtz ZD, Cox LM, Gundra UM, Cho I, Bonneau R, Blaser MJ, Chua KH, Loke P. Helminth colonization is associated with increased diversity of the gut microbiota. PLoS Negl Trop Dis 2014; 8:e2880. [PMID: 24851867 PMCID: PMC4031128 DOI: 10.1371/journal.pntd.0002880] [Citation(s) in RCA: 247] [Impact Index Per Article: 24.7] [Received: 04/25/2013] [Accepted: 04/08/2014] [Indexed: 01/01/2023] Open
Abstract
Soil-transmitted helminths colonize more than 1.5 billion people worldwide, yet little is known about how they interact with bacterial communities in the gut microbiota. Differences in the gut microbiota between individuals living in developed and developing countries may be partly due to the presence of helminths, since they predominantly infect individuals from developing countries, such as the indigenous communities in Malaysia we examine in this work. We compared the composition and diversity of bacterial communities from the fecal microbiota of 51 people from two villages in Malaysia, of which 36 (70.6%) were infected by helminths. The 16S rRNA V4 region was sequenced at an average of nineteen thousand sequences per sample. Helminth-colonized individuals had greater species richness and number of observed OTUs with enrichment of Paraprevotellaceae, especially with Trichuris infection. We developed a new approach of combining centered log-ratio (clr) transformation for OTU relative abundances with sparse Partial Least Squares Discriminant Analysis (sPLS-DA) to enable more robust predictions of OTU interrelationships. These results suggest that helminths may have an impact on the diversity, bacterial community structure and function of the gut microbiota. Soil-transmitted helminths are carried by large numbers of people in developing countries. These parasites live in the gut and may interact with bacterial communities in the gut, also called the gut microbiota. To determine whether there are alterations to the gut microbiota that are associated with helminth infections, we examined the types of bacteria present in fecal samples from rural Malaysians, many of whom are helminth-positive, and find it likely that helminth colonization alters the gut microbiota for rural Malaysians.
Affiliation(s)
- Soo Ching Lee
  - Department of Parasitology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Mei San Tang
  - Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Johor Bahru, Johor, Malaysia
- Yvonne A. L. Lim
  - Department of Parasitology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
  - * E-mail: (YALL); (PL)
- Seow Huey Choy
  - Department of Parasitology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Zachary D. Kurtz
  - Department of Medicine, New York University School of Medicine, New York, New York, United States of America
  - Department of Microbiology, New York University School of Medicine, New York, New York, United States of America
- Laura M. Cox
  - Department of Medicine, New York University School of Medicine, New York, New York, United States of America
  - Department of Microbiology, New York University School of Medicine, New York, New York, United States of America
- Uma Mahesh Gundra
  - Department of Microbiology, New York University School of Medicine, New York, New York, United States of America
- Ilseung Cho
  - Department of Medicine, New York University School of Medicine, New York, New York, United States of America
- Richard Bonneau
  - Department of Biology, Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Martin J. Blaser
  - Department of Medicine, New York University School of Medicine, New York, New York, United States of America
  - Department of Microbiology, New York University School of Medicine, New York, New York, United States of America
- Kek Heng Chua
  - Department of Biomedical Science, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- P'ng Loke
  - Department of Microbiology, New York University School of Medicine, New York, New York, United States of America
  - * E-mail: (YALL); (PL)
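The centered log-ratio (clr) transformation named in the abstract above can be sketched in a few lines: each abundance is log-transformed and the mean log abundance of its sample is subtracted. This is the textbook definition only, not the authors' code; the function name `clr`, the pseudocount used to avoid taking the log of zero counts, and the toy OTU table are assumptions for illustration, and the paper may have handled zeros differently.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples-by-OTUs table.

    A pseudocount (an assumed zero-handling choice) is added before
    taking logs, then each row is centred on its mean log value.
    """
    shifted = np.asarray(counts, dtype=float) + pseudocount
    if np.any(shifted <= 0):
        raise ValueError("all entries must be positive after adding the pseudocount")
    log_counts = np.log(shifted)
    return log_counts - log_counts.mean(axis=1, keepdims=True)

# Toy OTU count table: two samples (rows), four OTUs (columns).
otu_counts = np.array([[120, 30, 0, 50],
                       [ 10, 45, 5, 40]])
print(clr(otu_counts))
```

Each transformed row sums to zero, and with a pseudocount of zero the result does not change when a sample is rescaled, which is why the transform suits relative abundances of differing sequencing depth.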
14. Raseetha S, Oey I, Burritt D, Hamid N. Monitoring colour, volatiles in the headspace and enzyme activity to assess the quality of broccoli florets (Brassica oleracea L. italica cv. Bellstar and Legacy) during postharvest storage. Int J Food Sci Technol 2013. [DOI: 10.1111/ijfs.12447] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
Affiliation(s)
- Siva Raseetha
  - Department of Food Science; University of Otago; PO Box 56 Dunedin 9054 New Zealand
  - Department of Botany; University of Otago; PO Box 56 Dunedin 9054 New Zealand
- Indrawati Oey
  - Department of Food Science; University of Otago; PO Box 56 Dunedin 9054 New Zealand
- David Burritt
  - Department of Botany; University of Otago; PO Box 56 Dunedin 9054 New Zealand
- Nazimah Hamid
  - Faculty of Health and Environment Sciences; School of Applied Sciences; Auckland University of Technology; Private Bag 92006 Auckland 1142 New Zealand
15. A new strategy to assess the quality of broccoli (Brassica oleracea L. italica) based on enzymatic changes and volatile mass ion profile using Proton Transfer Reaction Mass Spectrometry (PTR-MS). INNOV FOOD SCI EMERG 2011. [DOI: 10.1016/j.ifset.2010.12.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]