Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Skutkova H, Vitek M, Babula P, Kizek R, Provaznik I. Classification of genomic signals using dynamic time warping. BMC Bioinformatics 2013;14 Suppl 10:S1. [PMID: 24267034 PMCID: PMC3750471 DOI: 10.1186/1471-2105-14-s10-s1] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

For:	Skutkova H, Vitek M, Babula P, Kizek R, Provaznik I. Classification of genomic signals using dynamic time warping. BMC Bioinformatics 2013;14 Suppl 10:S1. [PMID: 24267034 PMCID: PMC3750471 DOI: 10.1186/1471-2105-14-s10-s1] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Number

Cited by Other Article(s)

Ray D, Parrinello M. Data-driven classification of ligand unbinding pathways. Proc Natl Acad Sci U S A 2024;121:e2313542121. [PMID: 38412121 PMCID: PMC10927508 DOI: 10.1073/pnas.2313542121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Accepted: 01/26/2024] [Indexed: 02/29/2024] Open

Barriga Rubio RH, Otero M. Stochastic modeling of Dalbulus maidis, vector of maize diseases. Theor Popul Biol 2023;154:51-66. [PMID: 37669715 DOI: 10.1016/j.tpb.2023.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2022] [Revised: 08/01/2023] [Accepted: 08/18/2023] [Indexed: 09/07/2023]

Abstract

We developed a simple linear stochastic model for Dalbulus maidis dependent exclusively on temperature, whose parameters were determined from published field and laboratory studies performed at different temperatures. This model takes into account the principal stages and events of the life cycle of this pest, which is vector of maize diseases. We implemented the effect of distributed delays or Linear Chain Trick (LCT) considering a fixed number of sub-stages for egg and nymph stages of Dalbulus maidis in order to accurately represent what is observed in nature. A sensitivity analysis allows us to observe that the speed of the dynamics is sensitive to changes in the development rates, but not to the longevity of each stage or the fecundity, which almost exclusively affect insect abundance. We used our model to study its predictive and explanatory capacity considering a published experiment as a case study. Although the simulation results show a behavior qualitatively equivalent to that observed in the experimental results it is not possible to explain accurately the magnitude, nor the times in which the maximum abundances of second-generation nymphs and adults are reached. Therefore, we evaluated three possible scenarios for the insect that allow us to glimpse some of the advantages of having a computational model in order to find out what processes, taken into account in the model, may explain the differences observed between published experimental results and model results. The three proposed scenarios, based on variations in the parameterized rates of the model, can satisfactorily explain the experimental observations. We observed that in order to better simulate the experimental results it is not necessary to modify fecundity or mortality rates. However, it is necessary to accelerate the average development rates of our model by 20 to 40 %, compatible with extreme values of the rates close to the upper edges of the confidence bands of our parameterization rate curves, according to insects with faster development rates already reported in literature.

Collapse

Chan NB, Li W, Aung T, Bazuaye E, Montero RM. Machine Learning-Based Time in Patterns for Blood Glucose Fluctuation Pattern Recognition in Type 1 Diabetes Management: Development and Validation Study. JMIR AI 2023;2:e45450. [PMID: 38875568 PMCID: PMC11041419 DOI: 10.2196/45450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/01/2023] [Revised: 02/15/2023] [Accepted: 02/24/2023] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Continuous glucose monitoring (CGM) for diabetes combines noninvasive glucose biosensors, continuous monitoring, cloud computing, and analytics to connect and simulate a hospital setting in a person's home. CGM systems inspired analytics methods to measure glycemic variability (GV), but existing GV analytics methods disregard glucose trends and patterns; hence, they fail to capture entire temporal patterns and do not provide granular insights about glucose fluctuations.

OBJECTIVE

This study aimed to propose a machine learning-based framework for blood glucose fluctuation pattern recognition, which enables a more comprehensive representation of GV profiles that could present detailed fluctuation information, be easily understood by clinicians, and provide insights about patient groups based on time in blood fluctuation patterns.

METHODS

Overall, 1.5 million measurements from 126 patients in the United Kingdom with type 1 diabetes mellitus (T1DM) were collected, and prevalent blood fluctuation patterns were extracted using dynamic time warping. The patterns were further validated in 225 patients in the United States with T1DM. Hierarchical clustering was then applied on time in patterns to form 4 clusters of patients. Patient groups were compared using statistical analysis.

RESULTS

In total, 6 patterns depicting distinctive glucose levels and trends were identified and validated, based on which 4 GV profiles of patients with T1DM were found. They were significantly different in terms of glycemic statuses such as diabetes duration (P=.04), glycated hemoglobin level (P<.001), and time in range (P<.001) and thus had different management needs.

CONCLUSIONS

The proposed method can analytically extract existing blood fluctuation patterns from CGM data. Thus, time in patterns can capture a rich view of patients' GV profile. Its conceptual resemblance with time in range, along with rich blood fluctuation details, makes it more scalable, accessible, and informative to clinicians.

Collapse

Ao C, Jiao S, Wang Y, Yu L, Zou Q. Biological Sequence Classification: A Review on Data and General Methods. RESEARCH (WASHINGTON, D.C.) 2022;2022:0011. [PMID: 39285948 PMCID: PMC11404319 DOI: 10.34133/research.0011] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Accepted: 10/25/2022] [Indexed: 09/19/2024]

Power spectrum and dynamic time warping for DNA sequences classification. EVOLVING SYSTEMS 2020. [DOI: 10.1007/s12530-019-09306-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Ranjard L, Wong TKF, Rodrigo AG. Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage. BMC Bioinformatics 2019;20:654. [PMID: 31829137 PMCID: PMC6907241 DOI: 10.1186/s12859-019-3287-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2019] [Accepted: 11/20/2019] [Indexed: 01/20/2023] Open

Abstract

BACKGROUND

In short-read DNA sequencing experiments, the read coverage is a key parameter to successfully assemble the reads and reconstruct the sequence of the input DNA. When coverage is very low, the original sequence reconstruction from the reads can be difficult because of the occurrence of uncovered gaps. Reference guided assembly can then improve these assemblies. However, when the available reference is phylogenetically distant from the sequencing reads, the mapping rate of the reads can be extremely low. Some recent improvements in read mapping approaches aim at modifying the reference according to the reads dynamically. Such approaches can significantly improve the alignment rate of the reads onto distant references but the processing of insertions and deletions remains challenging.

RESULTS

Here, we introduce a new algorithm to update the reference sequence according to previously aligned reads. Substitutions, insertions and deletions are performed in the reference sequence dynamically. We evaluate this approach to assemble a western-grey kangaroo mitochondrial amplicon. Our results show that more reads can be aligned and that this method produces assemblies of length comparable to the truth while limiting error rate when classic approaches fail to recover the correct length. Finally, we discuss how the core algorithm of this method could be improved and combined with other approaches to analyse larger genomic sequences.

CONCLUSIONS

We introduced an algorithm to perform dynamic alignment of reads on a distant reference. We showed that such approach can improve the reconstruction of an amplicon compared to classically used bioinformatic pipelines. Although not portable to genomic scale in the current form, we suggested several improvements to be investigated to make this method more flexible and allow dynamic alignment to be used for large genome assemblies.

Collapse

Alakus TB, Das B, Turkoglu I. DNA encoding with entropy based numerical mapping technique for phylogenetic analysis. 2019 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP) 2019. [DOI: 10.1109/idap.2019.8875937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]

Bi JH, Tong YF, Qiu ZW, Yang XF, Minna J, Gazdar AF, Song K. ClickGene: an open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration. BioData Min 2019;12:12. [PMID: 31391866 PMCID: PMC6595587 DOI: 10.1186/s13040-019-0202-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2018] [Accepted: 06/17/2019] [Indexed: 12/15/2022] Open

Abstract

Tremendous amount of whole-genome sequencing data have been provided by large consortium projects such as TCGA (The Cancer Genome Atlas), COSMIC and so on, which creates incredible opportunities for functional gene research and cancer associated mechanism uncovering. While the existing web servers are valuable and widely used, many whole genome analysis functions urgently needed by experimental biologists are still not adequately addressed. A cloud-based platform, named CG (ClickGene), therefore, was developed for DIY analyzing of user's private in-house data or public genome data without any requirement of software installation or system configuration. CG platform provides key interactive and customized functions including Bee-swarm plot, linear regression analyses, Mountain plot, Directional Manhattan plot, Deflection plot and Volcano plot. Using these tools, global profiling or individual gene distributions for expression and copy number variation (CNV) analyses can be generated by only mouse button clicking. The easy accessibility of such comprehensive pan-cancer genome analysis greatly facilitates data mining in wide research areas, such as therapeutic discovery process. Therefore, it fills in the gaps between big cancer genomics data and the delivery of integrated knowledge to end-users, thus helping unleash the value of the current data resources. More importantly, unlike other R-based web platforms, Dubbo, a cloud distributed service governance framework for 'big data' stream global transferring, was used to develop CG platform. After being developed, CG is run on an independent cloud-server, which ensures its steady global accessibility. More than 2 years running history of CG proved that advanced plots for hundreds of whole-genome data can be created through it within seconds by end-users anytime and anywhere. CG is available at http://www.clickgenome.org/.

Collapse

Randhawa GS, Hill KA, Kari L. ML-DSP: Machine Learning with Digital Signal Processing for ultrafast, accurate, and scalable genome classification at all taxonomic levels. BMC Genomics 2019;20:267. [PMID: 30943897 PMCID: PMC6448311 DOI: 10.1186/s12864-019-5571-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2018] [Accepted: 02/27/2019] [Indexed: 11/11/2022] Open

Abstract

Background

Although software tools abound for the comparison, analysis, identification, and classification of genomic sequences, taxonomic classification remains challenging due to the magnitude of the datasets and the intrinsic problems associated with classification. The need exists for an approach and software tool that addresses the limitations of existing alignment-based methods, as well as the challenges of recently proposed alignment-free methods.

Results

We propose a novel combination of supervised Machine Learning with Digital Signal Processing, resulting in ML-DSP: an alignment-free software tool for ultrafast, accurate, and scalable genome classification at all taxonomic levels. We test ML-DSP by classifying 7396 full mitochondrial genomes at various taxonomic levels, from kingdom to genus, with an average classification accuracy of >97%.

A quantitative comparison with state-of-the-art classification software tools is performed, on two small benchmark datasets and one large 4322 vertebrate mtDNA genomes dataset. Our results show that ML-DSP overwhelmingly outperforms the alignment-based software MEGA7 (alignment with MUSCLE or CLUSTALW) in terms of processing time, while having comparable classification accuracies for small datasets and superior accuracies for the large dataset. Compared with the alignment-free software FFP (Feature Frequency Profile), ML-DSP has significantly better classification accuracy, and is overall faster.

We also provide preliminary experiments indicating the potential of ML-DSP to be used for other datasets, by classifying 4271 complete dengue virus genomes into subtypes with 100% accuracy, and 4,710 bacterial genomes into phyla with 95.5% accuracy.

Lastly, our analysis shows that the “Purine/Pyrimidine”, “Just-A” and “Real” numerical representations of DNA sequences outperform ten other such numerical representations used in the Digital Signal Processing literature for DNA classification purposes.

Conclusions

Due to its superior classification accuracy, speed, and scalability to large datasets, ML-DSP is highly relevant in the classification of newly discovered organisms, in distinguishing genomic signatures and identifying their mechanistic determinants, and in evaluating genome integrity.

Collapse

Skutkova H, Maderankova D, Sedlar K, Jugas R, Vitek M. A degeneration-reducing criterion for optimal digital mapping of genetic codes. Comput Struct Biotechnol J 2019;17:406-414. [PMID: 30984363 PMCID: PMC6444178 DOI: 10.1016/j.csbj.2019.03.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2018] [Revised: 02/07/2019] [Accepted: 03/15/2019] [Indexed: 01/08/2023] Open

Maderankova D, Jugas R, Sedlar K, Vitek M, Skutkova H. Rapid Bacterial Species Delineation Based on Parameters Derived From Genome Numerical Representations. Comput Struct Biotechnol J 2019;17:118-126. [PMID: 30728919 PMCID: PMC6352304 DOI: 10.1016/j.csbj.2018.12.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2018] [Revised: 12/07/2018] [Accepted: 12/20/2018] [Indexed: 01/29/2023] Open

Retrieval of Similar Evolution Patterns from Satellite Image Time Series. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8122435] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Rifaioglu AS, Doğan T, Saraç ÖS, Ersahin T, Saidi R, Atalay MV, Martin MJ, Cetin-Atalay R. Large-scale automated function prediction of protein sequences and an experimental case study validation on PTEN transcript variants. Proteins 2017;86:135-151. [PMID: 29098713 DOI: 10.1002/prot.25416] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2017] [Revised: 10/24/2017] [Accepted: 11/01/2017] [Indexed: 12/24/2022]

Zhang G, Dai M, Yang L, Li W, Li H, Xu C, Shi X, Dong X, Fu F. Fast detection and data compensation for electrodes disconnection in long-term monitoring of dynamic brain electrical impedance tomography. Biomed Eng Online 2017;16:7. [PMID: 28086909 PMCID: PMC5234124 DOI: 10.1186/s12938-016-0294-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2016] [Accepted: 12/04/2016] [Indexed: 11/18/2022] Open

Abstract

Background

Electrode disconnection is a common occurrence during long-term monitoring of brain electrical impedance tomography (EIT) in clinical settings. The data acquisition system suffers remarkable data loss which results in image reconstruction failure. The aim of this study was to: (1) detect disconnected electrodes and (2) account for invalid data.

Methods

Weighted correlation coefficient for each electrode was calculated based on the measurement differences between well-connected and disconnected electrodes. Disconnected electrodes were identified by filtering out abnormal coefficients with discrete wavelet transforms. Further, previously valid measurements were utilized to establish grey model. The invalid frames after electrode disconnection were substituted with the data estimated by grey model. The proposed approach was evaluated on resistor phantom and with eight patients in clinical settings.

Results

The proposed method was able to detect 1 or 2 disconnected electrodes with an accuracy of 100%; to detect 3 and 4 disconnected electrodes with accuracy of 92 and 84% respectively. The time cost of electrode detection was within 0.018 s. Further, the proposed method was capable to compensate at least 60 subsequent frames of data and restore the normal image reconstruction within 0.4 s and with a mean relative error smaller than 0.01%.

Conclusions

In this paper, we proposed a two-step approach to detect multiple disconnected electrodes and to compensate the invalid frames of data after disconnection. Our method is capable of detecting more disconnected electrodes with higher accuracy compared to methods proposed in previous studies. Further, our method provides estimations during the faulty measurement period until the medical staff reconnects the electrodes. This work would improve the clinical practicability of dynamic brain EIT and contribute to its further promotion.

Collapse

Hou W, Pan Q, Peng Q, He M. A new method to analyze protein sequence similarity using Dynamic Time Warping. Genomics 2016;109:123-130. [PMID: 27974244 PMCID: PMC7125777 DOI: 10.1016/j.ygeno.2016.12.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2016] [Revised: 12/06/2016] [Accepted: 12/10/2016] [Indexed: 12/05/2022]

Progressive alignment of genomic signals by multiple dynamic time warping. J Theor Biol 2015;385:20-30. [PMID: 26300069 DOI: 10.1016/j.jtbi.2015.08.007] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2014] [Revised: 07/21/2015] [Accepted: 08/03/2015] [Indexed: 11/22/2022]

Sedlar K, Skutkova H, Vitek M, Provaznik I. Set of rules for genomic signal downsampling. Comput Biol Med 2015;69:308-14. [PMID: 26078051 DOI: 10.1016/j.compbiomed.2015.05.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2014] [Revised: 05/25/2015] [Accepted: 05/26/2015] [Indexed: 12/14/2022]

Relationship of Bacteria Using Comparison of Whole Genome Sequences in Frequency Domain. ACTA ACUST UNITED AC 2014. [DOI: 10.1007/978-3-319-06593-9_35] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/21/2023]