1
|
Weiß CH. Measuring Dispersion and Serial Dependence in Ordinal Time Series Based on the Cumulative Paired ϕ-Entropy. ENTROPY 2021; 24:e24010042. [PMID: 35052068 PMCID: PMC8774592 DOI: 10.3390/e24010042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 12/22/2021] [Accepted: 12/23/2021] [Indexed: 11/16/2022]
Abstract
The family of cumulative paired ϕ-entropies offers a wide variety of ordinal dispersion measures, covering many well-known dispersion measures as special cases. After a comprehensive analysis of this family of entropies, we consider the corresponding sample versions and derive their asymptotic distributions for stationary ordinal time series data. Based on an investigation of their asymptotic bias, we propose a family of signed serial dependence measures, which can be understood as weighted types of Cohen’s κ, with the weights being related to the actual choice of ϕ. Again, the asymptotic distribution of the corresponding sample κϕ is derived and applied to test for serial dependence in ordinal time series. Using numerical computations and simulations, the practical relevance of the dispersion and dependence measures is investigated. We conclude with an environmental data example, where the novel ϕ-entropy-related measures are applied to an ordinal time series on the daily level of air quality.
Affiliation(s)
- Christian H Weiß
- Department of Mathematics and Statistics, Helmut Schmidt University, 22043 Hamburg, Germany
|
2
|
Affiliation(s)
- Christian H. Weiß
- Department of Mathematics and Statistics, Helmut Schmidt University, Hamburg, Germany
|
3
|
Hu W, Shah SL, Chen T. Framework for a smart data analytics platform towards process monitoring and alarm management. Comput Chem Eng 2018. [DOI: 10.1016/j.compchemeng.2017.10.010] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/29/2022]
|
4
|
Woods T, Preeprem T, Lee K, Chang W, Vidakovic B. Characterizing exons and introns by regularity of nucleotide strings. Biol Direct 2016; 11:6. [PMID: 26857564 PMCID: PMC4745173 DOI: 10.1186/s13062-016-0108-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/12/2015] [Accepted: 01/28/2016] [Indexed: 11/13/2022] Open
Abstract
Background: Translation of nucleotides into a numeric form has been approached in many ways and has allowed researchers to investigate the properties of protein-coding sequences and noncoding sequences. Typically, more pronounced long-range correlations and increased regularity were found in intron-containing genes and in non-transcribed regulatory DNA sequences, compared to cDNA sequences or intron-less genes. The regularity is assessed by spectral tools defined on numerical translates. In most popular approaches to numerical translation, the resulting spectra depend on the assignment of numerical values to nucleotides. Our contribution is to propose and illustrate a spectrum that remains invariant to the translation rules used in traditional approaches.
Results: We outline a methodology for representing sequences of DNA nucleotides as numeric matrices in order to analytically investigate important structural characteristics of DNA. This representation allows us to compute the 2-dimensional wavelet transformation and assess regularity characteristics of the sequence via the slope of the wavelet spectra. In addition to computing a global slope measure for a sequence, we can apply our methodology to overlapping sections of nucleotides to obtain an “evolutionary slope.” To illustrate our methodology, we analyzed 376 gene sequences from the first chromosome of the honeybee.
Conclusion: For the genes analyzed, we find that introns are significantly more regular (lead to more negative spectral slopes) than exons, which agrees with results from the literature where regularity is measured on “DNA walks”. However, unlike DNA walks, where the nucleotides are assigned numerical values depending on nucleotide characteristics (purine-pyrimidine, weak-strong hydrogen bonds, keto-amino, etc.) or other spatial assignments, the proposed spectral tool is invariant to the assignment of nucleotides. Thus, ambiguity in numerical translation of nucleotides is eliminated.
Reviewers: This article was reviewed by Dr. Vladimir Kuznetsov, Professor Marek Kimmel and Dr. Natsuhiro Ichinose (nominated by Professor Masanori Arita).
Electronic supplementary material: The online version of this article (doi:10.1186/s13062-016-0108-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Tonya Woods
- H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, 30332, USA.
- Thanawadee Preeprem
- Faculty of Pharmaceutical Sciences, Ubon Ratchathani University, Ubon Ratchathani, Thailand.
- Brani Vidakovic
- H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, 30332, USA.
|
5
|
Xu S, Baldea M, Edgar TF, Wojsznis W, Blevins T, Nixon M. Root Cause Diagnosis of Plant-Wide Oscillations Based on Information Transfer in the Frequency Domain. Ind Eng Chem Res 2016. [DOI: 10.1021/acs.iecr.5b03068] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 11/29/2022]
Affiliation(s)
- Shu Xu
- McKetta Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
- Michael Baldea
- McKetta Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
- Thomas F. Edgar
- McKetta Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
- Willy Wojsznis
- Process Systems and Solutions, Emerson Process Management, Round Rock, Texas 78759, United States
- Terrence Blevins
- Process Systems and Solutions, Emerson Process Management, Round Rock, Texas 78759, United States
- Mark Nixon
- Process Systems and Solutions, Emerson Process Management, Round Rock, Texas 78759, United States
|
6
|
Chaley M, Kutyrkin V. Stochastic model of homogeneous coding and latent periodicity in DNA sequences. J Theor Biol 2016; 390:106-16. [DOI: 10.1016/j.jtbi.2015.11.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/19/2015] [Revised: 09/18/2015] [Accepted: 11/14/2015] [Indexed: 11/24/2022]
|
7
|
Chaley M, Kutyrkin V. Spectral-Statistical Approach for Revealing Latent Regular Structures in DNA Sequence. Methods Mol Biol 2016; 1415:315-340. [PMID: 27115640 DOI: 10.1007/978-1-4939-3572-7_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Methods of the spectral-statistical approach (2S-approach) for revealing latent periodicity in DNA sequences are described. Results are presented from an analysis of the HeteroGenome database, which collects sequences similar to approximate tandem repeats in the genomes of model organisms. As a further development of the spectral-statistical approach, techniques for recognizing latent profile periodicity are considered. These techniques are based on an extension of the notion of approximate tandem repeat. Examples are given of how latent profile periodicity revealed in CDSs correlates with structural-functional properties of the proteins.
Affiliation(s)
- Maria Chaley
- Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Institutskaya st., 4, 142290, Pushchino, Russia.
- Vladimir Kutyrkin
- Department of Computational Mathematics and Mathematical Physics, Moscow State Technical University n.a. N.E. Bauman, 2nd Baumanskaya st., 5, 105005, Moscow, Russia
|
8
|
Biswas A, del Carmen Pardo M, Guha A. Auto-association measures for stationary time series of categorical data. TEST-SPAIN 2014. [DOI: 10.1007/s11749-014-0364-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
|
9
|
Duan P, Chen T, Shah SL, Yang F. Methods for root cause diagnosis of plant-wide oscillations. AIChE J 2014. [DOI: 10.1002/aic.14391] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Indexed: 11/07/2022]
Affiliation(s)
- Ping Duan
- Dept. of Electrical and Computer Engineering; University of Alberta; Edmonton AB Canada T6G 2V4
- Tongwen Chen
- Dept. of Electrical and Computer Engineering; University of Alberta; Edmonton AB Canada T6G 2V4
- Sirish L. Shah
- Dept. of Chemical and Materials Engineering; University of Alberta; Edmonton AB Canada T6G 2G6
- Fan Yang
- Tsinghua National Laboratory for Information Science and Technology and Dept. of Automation, Tsinghua University; Beijing 100084 China
|
10
|
Krafty RT, Xiong S, Stoffer DS, Buysse DJ, Hall M. Enveloping Spectral Surfaces: Covariate Dependent Spectral Analysis of Categorical Time Series. JOURNAL OF TIME SERIES ANALYSIS 2012; 33:797-806. [PMID: 24790257 PMCID: PMC4002131 DOI: 10.1111/j.1467-9892.2011.00773.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Motivated by problems in Sleep Medicine and Circadian Biology, we present a method for the analysis of cross-sectional categorical time series collected from multiple subjects where the effect of static continuous-valued covariates is of interest. Toward this goal, we extend the spectral envelope methodology for the frequency domain analysis of a single categorical process to cross-sectional categorical processes that are possibly covariate dependent. The analysis introduces an enveloping spectral surface for describing the association between the frequency domain properties of qualitative time series and covariates. The resulting surface offers an intuitively interpretable measure of association between covariates and a qualitative time series by finding the maximum possible conditional power at a given frequency from scalings of the qualitative time series conditional on the covariates. The optimal scalings that maximize the power provide scientific insight by identifying the aspects of the qualitative series which have the most pronounced periodic features at a given frequency conditional on the value of the covariates. To facilitate the assessment of the dependence of the enveloping spectral surface on the covariates, we include a theory for analyzing the partial derivatives of the surface. Our approach is entirely nonparametric, and we present estimation and asymptotics in the setting of local polynomial smoothing.
Affiliation(s)
- Robert T. Krafty
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA 15260
- Shuangyan Xiong
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA 15260
- David S. Stoffer
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA 15260
- Daniel J. Buysse
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA 15213
- Martica Hall
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA 15213
|
11
|
Uldry L, Van Zaen J, Prudat Y, Kappenberger L, Vesin JM. Measures of spatiotemporal organization differentiate persistent from long-standing atrial fibrillation. Europace 2012; 14:1125-31. [PMID: 22308083 DOI: 10.1093/europace/eur436] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022] Open
Abstract
Aims: This study presents an automatic diagnostic method for the discrimination between persistent and long-standing atrial fibrillation (AF) based on the surface electrocardiogram (ECG).
Methods and results: Standard 12-lead ECG recordings were acquired in 53 patients with either persistent (N = 20) or long-standing AF (N = 33), the latter including both long-standing persistent and permanent AF. A combined frequency analysis of multiple ECG leads followed by the computation of standard complexity measures provided a method for the quantification of spatiotemporal AF organization. All possible pairs of precordial ECG leads were analysed by this method and the resulting organization measures were used for automatic classification of persistent and long-standing AF signals. Correct classification rates of 84.9% were obtained, with a predictive value for long-standing AF of 93.1%. Spatiotemporal organization as measured in lateral precordial leads V5 and V6 was shown to be significantly lower during long-standing AF than persistent AF, suggesting that time-related alterations in left atrial electrical activity can be detected in the ECG.
Conclusion: Accurate discrimination between persistent and long-standing AF based on standard surface recordings was demonstrated. This information could help optimize the management of sustained AF, permitting appropriate therapeutic decisions and thereby providing substantial clinical cost savings.
Affiliation(s)
- Laurent Uldry
- Applied Signal Processing Group, Swiss Federal Institute of Technology, EPFL STI GR-SCI-STI SCI-STI-JMV, ELD 234-Bâtiment ELD, CH-1015 Lausanne, Switzerland.
|
12
|
Abeysundera M, Field C, Gu H. Phylogenetic analysis based on spectral methods. Mol Biol Evol 2011; 29:579-97. [PMID: 21880577 DOI: 10.1093/molbev/msr205] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Abstract
Whole-genome or multiple gene phylogenetic analysis is of interest since single gene analysis often results in poorly resolved trees. Here, the use of spectral techniques for analyzing multigene data sets is explored. The protein sequences are treated as categorical time series, and a measure of similarity between a pair of sequences, the spectral covariance, is based on the common periodicity between these two sequences. Unlike the other methods, the spectral covariance method focuses on the relationship between the sites of genetic sequences. By properly scaling the dissimilarity measures derived from different genes between a pair of species, we can use the mean of these scaled dissimilarity measures as a summary statistic to measure the taxonomic distances across multiple genes. The methods are applied to three different data sets, one noncontroversial and two with some dispute over the correct placement of the taxa in the tree. Trees are constructed using two distance-based methods, BIONJ and FITCH. A variation of block bootstrap sampling method is used for inference. The methods are able to recover all major clades in the corresponding reference trees with moderate to high bootstrap support. Through simulations, we show that the covariance-based methods effectively capture phylogenetic signal even when structural information is not fully retained. Comparisons of simulation results with the bootstrap permutation results indicate that the covariance-based methods are fairly robust under perturbations in sequence similarity but more sensitive to perturbations in structural similarity.
Affiliation(s)
- Melanie Abeysundera
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada.
|
13
|
Field CA. Modeling biological data: Several vignettes. CAN J STAT 2008. [DOI: 10.1002/cjs.5550360301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
14
|
|