1
|
Structure and metabolic potential of the prokaryotic communities from the hydrothermal system of Paleochori Bay, Milos, Greece. Front Microbiol 2023; 13:1060168. [PMID: 36687571 PMCID: PMC9852839 DOI: 10.3389/fmicb.2022.1060168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Accepted: 12/01/2022] [Indexed: 01/09/2023]
Abstract
Introduction Shallow hydrothermal systems share many characteristics with their deep-sea counterparts, but their accessibility facilitates their study. One of the most studied shallow hydrothermal vent fields lies at Paleochori Bay off the coast of Milos in the Aegean Sea (Greece). It has been studied through extensive mapping and its physical and chemical processes have been characterized over the past decades. However, a thorough description of the microbial communities inhabiting the bay is still missing. Methods We present the first in-depth characterization of the prokaryotic communities of Paleochori Bay by sampling eight different seafloor types that are distributed along the entire gradient of hydrothermal influence. We used deep sequencing of the 16S rRNA marker gene and complemented the analysis with qPCR quantification of the 16S rRNA gene and several functional genes to gain insights into the metabolic potential of the communities. Results We found that the microbiome of the bay is strongly influenced by the hydrothermal venting, with a succession of various groups dominating the sediments from the coldest to the warmest zones. Prokaryotic diversity and abundance decrease with increasing temperature, and thermophilic archaea overtake the community. Discussion Relevant geochemical cycles of the Bay are discussed. This study expands our limited understanding of subsurface microbial communities in acidic shallow-sea hydrothermal systems and the contribution of their microbial activity to biogeochemical cycling.
|
2
|
Principal microbial groups: compositional alternative to phylogenetic grouping of microbiome data. Brief Bioinform 2022; 23:6675749. [PMID: 36007229 DOI: 10.1093/bib/bbac328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 07/19/2022] [Accepted: 07/20/2022] [Indexed: 11/13/2022]
Abstract
Statistical and machine learning techniques based on relative abundances have been used to predict health conditions and to identify microbial biomarkers. However, high dimensionality, sparsity and the compositional nature of microbiome data represent statistical challenges. On the other hand, taxon grouping allows microbiome abundance to be summarized at a coarser resolution in a lower dimension, but it presents new challenges when correlating taxa with a disease. In this work, we present a novel approach that groups Operational Taxonomical Units (OTUs) based only on relative abundances as an alternative to taxon grouping. The proposed procedure acknowledges the compositional nature of the data by making use of principal balances. The identified groups are called Principal Microbial Groups (PMGs). The procedure reduces the need for user-defined aggregation of OTUs and offers the possibility of working with coarse groups of OTUs that are not present in a phylogenetic tree. PMGs can be used for two different goals: (1) as a dimensionality reduction method for compositional data, and (2) as an aggregation procedure that provides an alternative to taxon grouping for the construction of microbial balances that are afterward used for disease prediction. We illustrate the procedure with data from a cirrhosis study. PMGs provide a coherent data analysis for the search of biomarkers in human microbiota. The source code and demo data for PMGs are available at: https://github.com/asliboyraz/PMGs.
|
3
|
Compositional baseline assessments to address soil pollution: An application in Langreo, Spain. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 812:152383. [PMID: 34952083 DOI: 10.1016/j.scitotenv.2021.152383] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/17/2021] [Revised: 12/09/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
Potentially Toxic Elements (PTEs) are contaminants with high toxicity and complex geochemical behaviour and, therefore, high PTE contents in soil may affect ecosystems and/or human health. However, before addressing the measurement of soil pollution, it is necessary to understand what is meant by pollution-free soil. Often, this background, or pollution baseline, is undefined or only partially known. Since the concentration of chemical elements is compositional, as the attributes vary together, here we present a novel approach to build compositional indicators based on Compositional Data (CoDa) principles. The steps of this new methodology are: 1) Exploratory data analysis through variation matrix, biplots or CoDa dendrograms; 2) Selection of geological background in terms of a trimmed subsample that can be assumed as non-pollutant; 3) Computing the spread Aitchison distance from each sample point to the trimmed sample; 4) Performing a compositional balance able to predict the Aitchison distance computed in step 3. Identifying a compositional balance, including pollutant and non-pollutant elements, with sparsity and simplicity as properties, is crucial for the construction of a Compositional Pollution Indicator (CI). Here we explored a database of 150 soil samples and 37 chemical elements from the contaminated region of Langreo, Northwestern Spain. Three CIs were obtained: the first two using elements obtained through CoDa analysis, and the third one selecting a list of pollutants and non-pollutants based on expert knowledge and previous studies. The three indicators went through a Stochastic Sequential Gaussian simulation. The results of the 100 computed simulations are summarized through mean image maps and probability maps of exceeding a given threshold, thus allowing characterization of the spatial distribution and variability of the CIs. A better understanding of the trends of relative enrichment and PTE fate is discussed.
|
4
|
Chronic kidney disease of unknown origin is associated with environmental urbanisation in Belfast, UK. ENVIRONMENTAL GEOCHEMISTRY AND HEALTH 2021; 43:2597-2614. [PMID: 32583129 PMCID: PMC8275563 DOI: 10.1007/s10653-020-00618-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/18/2019] [Accepted: 06/08/2020] [Indexed: 05/02/2023]
Abstract
Chronic kidney disease (CKD), a collective term for many causes of progressive renal failure, is increasing worldwide due to ageing, obesity and diabetes. However, these factors cannot explain the many environmental clusters of renal disease that are known to occur globally. This study uses data from the UK Renal Registry (UKRR), including CKD of uncertain aetiology (CKDu), to investigate environmental factors in Belfast, UK. Urbanisation has been reported to have an increasing impact on soils. Using an urban soil geochemistry database of elemental concentrations of potentially toxic elements (PTEs), we investigated the association of the standardised incidence rates (SIRs) of both CKD and CKDu with environmental factors (PTEs), controlling for social deprivation. A compositional data analysis approach was used through balances (a special class of log contrasts) to identify elemental balances associated with CKDu. A statistically significant relationship was observed between CKD and the social deprivation measures of employment, income and education (significance levels of 0.001, 0.01 and 0.001, respectively), which have been used as a proxy for socio-economic factors such as smoking. Using three alternative regression methods (linear, generalised linear and Tweedie models), the elemental balances of Cr/Ni and As/Mo were found to produce the largest correlation with CKDu. Geogenic and atmospheric pollution deposition, traffic and brake wear emissions have been cited as sources for these PTEs, which have been linked to kidney damage. This research thus sheds light on the increasing global burden of CKD and, in particular, the environmental and anthropogenic factors that may be linked to CKDu, particularly environmental PTEs linked to urbanisation.
|
5
|
Some thoughts on counts in sequencing studies. NAR Genom Bioinform 2021; 2:lqaa094. [PMID: 33575638 PMCID: PMC7679068 DOI: 10.1093/nargab/lqaa094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2020] [Revised: 10/01/2020] [Accepted: 10/27/2020] [Indexed: 11/12/2022]
Abstract
Measurements in sequencing studies are mostly based on counts. There is a lack of theoretical developments for the analysis and modelling of this type of data. Some thoughts in this direction are presented, which might serve as a seed. The main issues addressed are the compositional character of multinomial probabilities and the corresponding representation in orthogonal (isometric) coordinates, and modelling distributions for sequencing data taking into account possible effects of amplification techniques.
|
6
|
Peter Filzmoser, Karel Hron, Matthias Templ: Applied compositional data analysis, with worked examples in R. Stat Pap (Berl) 2020. [DOI: 10.1007/s00362-020-01163-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
7
|
Abstract
BACKGROUND Fecal microbiota transplantation (FMT) has been recently approved by the FDA for the treatment of refractory recurrent clostridial colitis (rCDI). The success of FMT in the treatment of rCDI led to a number of studies investigating the effectiveness of its application in other gastrointestinal diseases. However, in the majority of studies the effects of FMT were evaluated in patients with initially altered microbiota. The aim of our study was to estimate the effects of FMT on the gut microbiota composition in healthy volunteers and to monitor its long-term outcomes. RESULTS We performed a combined analysis of three healthy volunteers before and after capsule FMT by evaluating their general condition, adverse clinical effects, changes of basic laboratory parameters, and several immune markers. Intestinal microbiota samples were evaluated by 16S rRNA gene and shotgun sequencing. The data analysis demonstrated a profound shift towards the donor microbiota taxonomic composition in all volunteers. Following FMT, all the volunteers exhibited gut colonization with donor gut bacteria, and this effect persisted for almost 1 year of observation. Transient changes of immune parameters were consistent with suppression of T-cell cytotoxicity. FMT was well tolerated with mild gastrointestinal adverse events; however, one volunteer developed a systemic inflammatory response syndrome. CONCLUSIONS FMT leads to significant long-term changes of the gut microbiota in healthy volunteers, with a shift towards donor microbiota composition, and represents a relatively safe procedure for the recipients without long-term adverse events.
|
8
|
Survey Data on Perceptions of Contraceptive Methods as Compositional Tables. REVISTA LATINOAMERICANA DE PSICOLOGIA 2018. [DOI: 10.14349/rlp.2018.v50.n3.5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
|
9
|
Exploration of geochemical data with compositional canonical biplots. JOURNAL OF GEOCHEMICAL EXPLORATION 2018; 194:120-133. [PMID: 33510550 PMCID: PMC7839972 DOI: 10.1016/j.gexplo.2018.07.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
The study of the relationships between two compositions is of paramount importance in geochemical data analysis. This paper develops a compositional version of canonical correlation analysis, called CoDA-CCO, for this purpose. We consider two approaches, using the centred log-ratio transformation and the calculation of all possible pairwise log-ratios within sets. The relationships between both approaches are pointed out, and their merits are discussed. The related covariance matrices are structurally singular, and this is efficiently dealt with by using generalized inverses. We develop compositional canonical biplots and detail their properties. The canonical biplots are shown to be powerful tools for discovering the most salient relationships between two compositions. Some guidelines for compositional canonical biplot construction are discussed. A geochemical data set with X-ray fluorescence spectrometry measurements on major oxides and trace elements of European floodplains is used to illustrate the proposed method. The relationships between an analysis based on centred log-ratios and on isometric log-ratios are also shown.
|
10
|
Abstract
High-throughput sequencing technologies have revolutionized microbiome research by allowing the relative quantification of microbiome composition and function in different environments. In this work we focus on the identification of microbial signatures, groups of microbial taxa that are predictive of a phenotype of interest. We do this by acknowledging the compositional nature of the microbiome and the fact that it carries relative information. Thus, instead of defining a microbial signature as a linear combination in real space corresponding to the abundances of a group of taxa, we consider microbial signatures given by the geometric means of data from two groups of taxa whose relative abundances, or balance, are associated with the response variable of interest. In this work we present selbal, a greedy stepwise algorithm for selection of balances or microbial signatures that preserves the principles of compositional data analysis. We illustrate the algorithm with 16S rRNA abundance data from a Crohn's microbiome study and an HIV microbiome study. IMPORTANCE We propose a new algorithm for the identification of microbial signatures. These microbial signatures can be used for diagnosis, prognosis, or prediction of therapeutic response based on an individual's specific microbiota.
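The balance described in this abstract can be illustrated with a minimal sketch. It assumes the standard normalised log-ratio of geometric means used in compositional data analysis; it is not the selbal implementation, and the function names and taxon labels are hypothetical.

```python
import math

def geometric_mean(values):
    # Geometric mean of strictly positive values.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(sample, numerator, denominator):
    # Normalised log-ratio between the geometric means of two taxon groups.
    # Zeros must be replaced beforehand, since log(0) is undefined.
    num = [sample[t] for t in numerator]
    den = [sample[t] for t in denominator]
    r, s = len(num), len(den)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(num) / geometric_mean(den))

# Hypothetical abundances for one sample.
sample = {"taxon_a": 40.0, "taxon_b": 10.0, "taxon_c": 5.0, "taxon_d": 20.0}
b = balance(sample, ["taxon_a", "taxon_b"], ["taxon_c"])

# Rescaling every abundance (e.g. a different sequencing depth) leaves the balance unchanged.
scaled = {k: 3.0 * v for k, v in sample.items()}
b_scaled = balance(scaled, ["taxon_a", "taxon_b"], ["taxon_c"])
```

Because only ratios enter the calculation, `b` and `b_scaled` agree up to floating-point error, which is the sense in which a balance carries relative information only.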
|
11
|
Abstract
With compositional data, ordinary covariation indexes, designed for real random variables, fail to describe dependence. There is a need for compositional alternatives to covariance and correlation. Based on the Euclidean structure of the simplex, called Aitchison geometry, compositional association is identified with a linear restriction of the sample space when a log-contrast is constant. In order to simplify interpretation, a sparse and simple version of compositional association is defined in terms of balances which are constant across the sample. It is called b-association. This kind of association of compositional variables is extended to association between groups of compositional variables. In practice, exact b-association seldom occurs, and measures of the degree of b-association are reviewed based on those previously proposed. Also, some techniques for testing b-association are studied. These techniques are applied to available oral microbiome data to illustrate both their advantages and difficulties. Both testing and measurements of b-association appear to be quite sensitive to heterogeneities in the studied populations and to outliers.
|
12
|
Microbiome Datasets Are Compositional: And This Is Not Optional. Front Microbiol 2017; 8:2224. [PMID: 29187837 PMCID: PMC5695134 DOI: 10.3389/fmicb.2017.02224] [Citation(s) in RCA: 1170] [Impact Index Per Article: 167.1] [Received: 07/10/2017] [Accepted: 10/30/2017] [Indexed: 12/11/2022]
Abstract
Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.
|
13
|
|
14
|
Changing the Reference Measure in the Simplex and its Weighting Effects. AUSTRIAN JOURNAL OF STATISTICS 2016. [DOI: 10.17713/ajs.v45i4.126] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
Standard analysis of compositional data under the assumption that the Aitchison geometry holds assumes a uniform distribution as reference measure of the space. Weighting of parts can be done changing the reference measure. The changes that appear in the algebraic-geometric structure of the simplex are analysed, as a step towards understanding the implications for elementary statistics of random compositions. Some of the standard tools in exploratory analysis of compositional data analysis, such as center, variation matrix and biplots are studied in some detail, although further research is still needed. The main conclusion is that down-weighting some parts is approaching the geometry of the corresponding subcomposition, thus preserving a kind of coherence between standard and down-weighted analyses.
|
15
|
|
16
|
Proportionality: a valid alternative to correlation for relative data. PLoS Comput Biol 2015; 11:e1004075. [PMID: 25775355 PMCID: PMC4361748 DOI: 10.1371/journal.pcbi.1004075] [Citation(s) in RCA: 139] [Impact Index Per Article: 15.4] [Received: 06/29/2014] [Accepted: 12/08/2014] [Indexed: 11/18/2022]
Abstract
In the life sciences, many measurement methods yield only the relative abundances of different components in a sample. With such relative (or compositional) data, differential expression needs careful interpretation, and correlation, a statistical workhorse for analyzing pairwise relationships, is an inappropriate measure of association. Using yeast gene expression data we show how correlation can be misleading and present proportionality as a valid alternative for relative data. We show how the strength of proportionality between two variables can be meaningfully and interpretably described by a new statistic ϕ which can be used instead of correlation as the basis of familiar analyses and visualisation methods, including co-expression networks and clustered heatmaps. While the main aim of this study is to present proportionality as a means to analyse relative data, it also raises intriguing questions about the molecular mechanisms underlying the proportional regulation of a range of yeast genes.
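A proportionality statistic of this kind can be sketched as follows. The sketch assumes the form ϕ(x, y) = var(log(x/y)) / var(log x) applied to two strictly positive series; the abstract does not spell out the formula or any prior log-ratio transformation, so both the form and the example values are assumptions for illustration.

```python
import math
from statistics import variance

def phi(x, y):
    # Variance of the log-ratio relative to the variance of log(x).
    # Values near 0 indicate that y is (almost) proportional to x.
    # Note the statistic is not symmetric in x and y.
    log_x = [math.log(v) for v in x]
    log_ratio = [math.log(a / b) for a, b in zip(x, y)]
    return variance(log_ratio) / variance(log_x)

# Hypothetical measurements across four samples.
x = [1.0, 2.0, 4.0, 8.0]
y_prop = [3.0, 6.0, 12.0, 24.0]    # exactly 3 * x
y_other = [3.0, 5.0, 12.0, 20.0]   # ratio to x varies between samples

phi_prop = phi(x, y_prop)    # close to zero
phi_other = phi(x, y_other)  # clearly positive
```

When `y` is an exact multiple of `x` the log-ratio is constant, so its variance, and hence the statistic, vanishes.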
|
17
|
Differential effects of genetic vs. environmental quality in Drosophila melanogaster suggest multiple forms of condition dependence. Ecol Lett 2015; 18:317-26. [PMID: 25649176 DOI: 10.1111/ele.12412] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 07/16/2014] [Revised: 08/07/2014] [Accepted: 01/03/2015] [Indexed: 01/17/2023]
Abstract
Condition is a central concept in evolutionary ecology, but the roles of genetic and environmental quality in condition-dependent trait expression remain poorly understood. Theory suggests that condition integrates genetic, epigenetic and somatic factors, and therefore predicts alignment between the phenotypic effects of genetic and environmental quality. To test this key prediction, we manipulated both genetic (mutational) and environmental (dietary) quality in Drosophila melanogaster and examined responses in morphological and chemical (cuticular hydrocarbon, CHC) traits in both sexes. While the phenotypic effects of diet were consistent among genotypes, effects of mutation load varied in magnitude and direction. Average effects of diet and mutation were aligned for most morphological traits, but non-aligned for the male sexcombs and CHCs in both sexes. Our results suggest the existence of distinct forms of condition dependence, one integrating both genetic and environmental effects and the other purely environmental. We propose a model to account for these observations.
|
18
|
Abstract
Compositional data analysis usually deals with relative information between parts where the total (abundances, mass, amount, etc.) is unknown or uninformative. This article addresses the question of what to do when the total is known and is of interest. Tools used in this case are reviewed and analysed, in particular the relationship between the positive orthant of D-dimensional real space, the product space of the real line times the D-part simplex, and their Euclidean space structures. The first alternative corresponds to data analysis taking logarithms on each component, and the second one to treat a log-transformed total jointly with a composition describing the distribution of component amounts. Real data about total abundances of phytoplankton in an Australian river motivated the present study and are used for illustration.
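The two alternatives named in this abstract can be sketched as below. The sketch assumes the total is the plain sum of the component amounts; the abstract does not say how the total is defined, and the function names and numbers are hypothetical.

```python
import math

def log_components(amounts):
    # First alternative: take logarithms of each component separately.
    return [math.log(a) for a in amounts]

def log_total_and_composition(amounts):
    # Second alternative: a log-transformed total (assumed here to be the sum)
    # treated jointly with the composition, i.e. the shares of that total.
    total = sum(amounts)
    return math.log(total), [a / total for a in amounts]

# Hypothetical abundances of three groups in one sample.
amounts = [2.0, 6.0, 12.0]
logs = log_components(amounts)
log_total, composition = log_total_and_composition(amounts)

# The original amounts can be recovered from the second representation.
recovered = [math.exp(log_total) * share for share in composition]
```

Both representations keep the information about the total; they differ in whether size and relative structure are analysed together or as separate coordinates.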
|
19
|
|
20
|
|
21
|
|
22
|
Discriminating geodynamical regimes of tin ore formation using trace element composition of cassiterite: the Sikhote’Alin case (Far Eastern Russia). ACTA ACUST UNITED AC 2006. [DOI: 10.1144/gsl.sp.2006.264.01.04] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022]
Abstract
A possible interpretation of the Sikhote’Alin accretion system (Asian margin of the Pacific Ocean) assumes that this region underwent an alternation of subduction and transform tectogenesis (here called the tectogenetic switch hypothesis). This palaeotectonic model fits well with the observed complexity of ore districts and deposits of the region. In this contribution, several statistical analyses are applied to a compositional dataset of trace elements in cassiterite obtained from this area. The goal is to assess the reliability of the tectogenetic switch hypothesis, based solely on cassiterite compositional information. First, biplots are used to get an insight into the variability of the data. Secondly, cluster analysis is applied to detect the existence of natural groups of samples, without using the existing geological information. Finally, discriminant analysis uncovers the main differences in the composition of cassiterite from the different groups obtained. Results highlight the contrast between areas formed under different tectogenetic environments, with subduction-related cassiterite being richer in siderophile elements (In, Fe, Sc, W, Cr) and transform-related cassiterite richer in lithophile elements (Mn, Zr). Further natural groups discriminate cassiterite samples depending on their V/Be ratio, which might be related to the age of the deposit. These results suggest that sources of ore magmas and fluids within the region might have a mixed or varied mantle-crust origin and support the tectogenetic switch assumption.
|
23
|
Abstract
The main features of the Aitchison geometry of the simplex of D parts are reviewed. Compositions are positive vectors in which the relevant information is contained in the ratios between their components or parts. They can be represented in the simplex of D parts by closing them to a constant sum, e.g. percentages, or parts per million. Perturbation and powering in the simplex of D parts are, respectively, an internal operation playing the role of a sum, and an external product by real numbers or scalars. These operations give the simplex of D parts the structure of a (D − 1)-dimensional vector space. An inner product, norm and distance, compatible with perturbation and powering, complete the structure of the simplex, a structure known in mathematical terms as a Euclidean space. This general structure allows the representation of compositions by coordinates with respect to a basis of the space, particularly an orthonormal basis. The interpretation of the so-called balances, coordinates with respect to orthonormal bases associated with groups of parts, is stressed. Subcompositions and balances are interpreted as orthogonal projections. Finally, log-ratio transformations (alr, clr and ilr) are considered in this geometric context.
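The operations listed in this abstract can be sketched in a few lines. This is a minimal illustration using the usual textbook definitions of closure, perturbation, powering and the centred log-ratio (clr) transformation; the function names and example compositions are hypothetical, and strictly positive parts are assumed.

```python
import math

def closure(x, kappa=1.0):
    # Rescale a positive vector so that its parts sum to the constant kappa.
    total = sum(x)
    return [kappa * v / total for v in x]

def perturb(x, y):
    # Perturbation: component-wise product followed by closure (the "sum").
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    # Powering: component-wise power followed by closure (the scalar product).
    return closure([a ** alpha for a in x])

def clr(x):
    # Centred log-ratio: logs minus their mean, i.e. log of each part over the geometric mean.
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def aitchison_inner(x, y):
    # Inner product in the simplex, computed through clr coefficients.
    return sum(a * b for a, b in zip(clr(x), clr(y)))

x = [0.2, 0.3, 0.5]
y = [0.5, 0.25, 0.25]

clr_sum = [a + b for a, b in zip(clr(x), clr(y))]
clr_of_perturbation = clr(perturb(x, y))   # matches clr_sum
clr_of_power = clr(power(x, 2.0))          # matches 2 * clr(x)
norm_x = math.sqrt(aitchison_inner(x, x))
```

In clr coefficients, perturbation becomes ordinary addition and powering becomes multiplication by a scalar, which is one way to see the vector-space structure the abstract refers to.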
|
24
|
Abstract
Compositional data are those which contain only relative information. They are parts of some whole. In most cases they are recorded as closed data, i.e. data summing to a constant, such as 100%, with whole-rock geochemical data being classic examples. Compositional data have important and particular properties that preclude the application of standard statistical techniques on such data in raw form. Standard techniques are designed to be used with data that are free to range from −∞ to +∞. Compositional data are always positive and range only from 0 to 100, or any other constant, when given in closed form. If one component increases, others must, perforce, decrease, whether or not there is a genetic link between these components. This means that the results of standard statistical analysis of the relationships between raw components or parts in a compositional dataset are clouded by spurious effects. Although such analyses may give apparently interpretable results, they are, at best, approximations and need to be treated with considerable circumspection. The methods outlined in this volume are based on the premise that it is the relative variation of components which is of interest, rather than absolute variation. Log-ratios of components provide the natural means of studying compositional data. In this contribution the basic terms and operations are introduced using simple numerical examples to illustrate their computation and to familiarize the reader with their use.
|
25
|
Abstract
The chemical composition of natural waters is derived from many different sources of solutes, including gases and aerosols from the atmosphere, weathering and erosion of rocks and soil, solution or precipitation reactions occurring below the land surface, and effects resulting from human activities. The chemical composition of the crustal rocks of the Earth, as well as the composition of the ocean and the atmosphere, are important in evaluating sources of solutes. Data used in the investigation of natural and non-natural contributions are usually obtained from chemical analysis of water samples, which may be statistically evaluated with the aim of summarizing the contained information. However, as these data are compositional and thus constrained to move in the simplex, application of usual statistical methodologies may lead to incorrect evaluations and/or interpretations. This paper focuses on how to draw information on natural processes by modelling univariate and multivariate frequency distributions using water data. The chemical composition of 977 samples collected in wells from Vulcano island (Italy) is used as a case study. The methodological approach can be transferred to the investigation of other geochemical or constrained data.
|
26
|
|
27
|
Latent Compositional Factors in The Llobregat River Basin (Spain) Hydrogeochemistry. ACTA ACUST UNITED AC 2005. [DOI: 10.1007/s11004-005-7375-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
|
28
|
Relative vs. absolute statistical analysis of compositions: a comparative study of surface waters of a Mediterranean river. WATER RESEARCH 2005; 39:1404-1414. [PMID: 15862341 DOI: 10.1016/j.watres.2005.01.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 08/05/2003] [Revised: 11/16/2004] [Accepted: 01/11/2005] [Indexed: 05/24/2023]
Abstract
Most hydrogeological research includes some sort of statistical study, which is generally conducted on the raw measures of chemical variables, though there are several theoretical and practical studies warning against this practice. Arguments refer mainly to the positive character of this type of data, and to the fact that they carry only information about the relative abundance of each component within the whole, which makes techniques based on correlation, like the widely used Principal Component Analysis (PCA), lose their meaning. The solution proposed by Aitchison (1982, Journal of the Royal Statistical Society, Series B 44(2), 139-177), based on working with log-ratios of observations, is equivalent to defining a new distance between compositions and adapting usual statistical techniques to it. To illustrate its effect, our study compares the performance of the biplot, a PCA graphical technique, under the usual Euclidean distance and under the Aitchison distance. The study is conducted on a set of 14 molarities measured monthly through the years 1997-1999 at 30 different stations along the Llobregat River and its tributaries (Barcelona, NE Spain). Ordinary analysis, implicitly based on a Euclidean distance, presents some deficiencies, mainly because it only captures major ion variations and the inferred relationship between them actually depends on other non-relevant variables, such as water mass. An analysis based on compositional distances captures variations of all the ions; it is robust against the inclusion of non-relevant variables in the analysis; and it offers a way to build factors expressed as equilibrium equations. In our case, two promising factors are extracted, showing the different anthropogenic and geological pollution sources of the rivers.
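The contrast between the two distances can be sketched with a small example. It assumes the Aitchison distance is computed as the Euclidean distance between centred log-ratio (clr) coefficients, and it demonstrates only one property, invariance to rescaling a sample; the values below are hypothetical and are not taken from the Llobregat data.

```python
import math

def clr(x):
    # Centred log-ratio coefficients of a strictly positive vector.
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def euclidean_distance(x, y):
    # Ordinary distance on the raw values.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def aitchison_distance(x, y):
    # Euclidean distance between clr coefficients.
    return euclidean_distance(clr(x), clr(y))

# Hypothetical ion amounts; `rescaled` has the same proportions in twice the amount.
sample = [10.0, 20.0, 70.0]
rescaled = [2.0 * v for v in sample]
other = [20.0, 20.0, 60.0]

d_euclidean = euclidean_distance(sample, rescaled)        # large, driven by total amount only
d_aitchison = aitchison_distance(sample, rescaled)        # approximately zero
d_aitchison_other = aitchison_distance(sample, other)     # positive: proportions differ
```

The raw Euclidean distance reacts to the change in total amount even though the proportions are identical, whereas the log-ratio distance responds only when the proportions themselves change.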
|
29
|
Zero Replacement in Compositional Data Sets. STUDIES IN CLASSIFICATION, DATA ANALYSIS, AND KNOWLEDGE ORGANIZATION 2000. [DOI: 10.1007/978-3-642-59789-3_25] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Indexed: 01/26/2023]
|