1
Assessing the Effect of Data Quality on Distance Estimation in Smartphone-Based Outdoor 6MWT. SENSORS (BASEL, SWITZERLAND) 2024; 24:2632. [PMID: 38676249 PMCID: PMC11054500 DOI: 10.3390/s24082632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/18/2024] [Accepted: 03/29/2024] [Indexed: 04/28/2024]
Abstract
As a result of technological advancements, functional capacity assessments, such as the 6-minute walk test, can be performed remotely, at home and in the community. Current studies, however, tend to overlook the crucial aspect of data quality, often limiting their focus to idealised scenarios. Challenging conditions may arise when performing a test given the risk of collecting poor-quality GNSS signal, which can undermine the reliability of the results. This work shows the impact of applying filtering rules to avoid noisy samples in common algorithms that compute the walked distance from positioning data. Then, based on signal features, we assess the reliability of the distance estimation using logistic regression from the following two perspectives: error-based analysis, which relates to the estimated distance error, and user-based analysis, which distinguishes conventional from unconventional tests based on users' previous annotations. We highlight the impact of features associated with walked path irregularity and direction changes to establish data quality. We evaluate features within a binary classification task and reach an F1-score of 0.93 and an area under the curve of 0.97 for the user-based classification. Identifying unreliable tests is helpful to clinicians, who receive the recorded test results accompanied by quality assessments, and to patients, who can be given the opportunity to repeat tests classified as not following the instructions.
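The idea of computing walked distance from positioning data while skipping noisy samples can be sketched as follows. This is a minimal illustration, not the filtering rules or algorithms evaluated in the paper: the `(lat, lon, accuracy_m)` tuple layout, the function names, and the 10 m accuracy threshold are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def walked_distance_m(fixes, max_accuracy_m=10.0):
    """Sum segment lengths between consecutive fixes that pass the filter.

    fixes: iterable of (lat, lon, accuracy_m) tuples (hypothetical layout).
    max_accuracy_m: hypothetical threshold; fixes with a larger reported
    horizontal accuracy radius are treated as noisy and skipped.
    """
    kept = [(lat, lon) for lat, lon, acc in fixes if acc <= max_accuracy_m]
    return sum(haversine_m(*p, *q) for p, q in zip(kept, kept[1:]))


# Three good fixes along the equator plus one noisy outlier.
fixes = [(0.0, 0.0, 5.0), (0.0, 0.001, 5.0), (0.01, 0.001, 50.0), (0.0, 0.002, 5.0)]
filtered = walked_distance_m(fixes)                        # outlier skipped
unfiltered = walked_distance_m(fixes, max_accuracy_m=100)  # outlier kept
```

With the outlier kept, the estimated path detours far off the straight line, which is the kind of overestimation a sample filter is meant to prevent.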
2
Processes and the Electronic Health Record: Challenges and Difficulties Faced when Creating an OB Quality Dashboard. Stud Health Technol Inform 2024; 310:1408-1409. [PMID: 38269670 DOI: 10.3233/shti231218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
Healthcare quality, as defined by the National Academy of Medicine, is "the degree to which health care services for individuals and populations increase the likelihood of desired health outcomes" [1]. While building a QI dashboard to improve the maternal health of our patient population, data quality issues were discovered that hindered the progress of the project. This paper will discuss the challenges and difficulties faced while creating an OB quality dashboard at a regional perinatal.
3
"We don't trust all data coming from all facilities": factors influencing the quality of care network data quality in Ethiopia. Glob Health Action 2023; 16:2279856. [PMID: 38018430 PMCID: PMC10795578 DOI: 10.1080/16549716.2023.2279856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 11/01/2023] [Indexed: 11/30/2023]
Abstract
BACKGROUND Good quality data are a key to quality health care. In 2017, WHO launched the Quality of Care Network (QCN) to reduce maternal, newborn and stillbirth mortality via learning and sharing networks. Guided by the principle of equity and dignity, the network members agreed to implement the programme in 2017-2021. OBJECTIVE This paper seeks to explore how QCN has contributed to improving data quality and to identify factors influencing quality of data in Ethiopia. METHODS We conducted a qualitative study in selected QCN facilities in Ethiopia using key informant interview and observation methods. We interviewed 40 people at national, sub-national and facility levels. Non-participant observations were carried out in four purposively selected health facilities; we accessed monthly reports from 41 QCN learning facilities. A codebook was prepared following a deductive and inductive analytical approach, coded using Nvivo 12 and thematically analysed. RESULTS There was a general perception that QCN had improved health data documentation and use in the learning facilities, achieved through coaching, learning and building from pre-existing initiatives. QCN also enhanced the data elements available by introducing a broader set of quality indicators. However, the perception of poor data quality persisted. Factors negatively affecting data quality included a lack of integration of QCN data within routine health system activities, the perception that QCN was a pilot, plus a lack of inclusive engagement at different levels. Both individual and system capabilities needed to be strengthened. CONCLUSION There is evidence of QCN's contribution to improving data awareness. But a lack of inclusive engagement of actors, alignment and limited skill for data collection and analysis continued to affect data quality and use. In the absence of new resources, integration of new data activities within existing routine health information systems emerged as the most important potential action for positive change.
4
Small Stochastic Data Compactification Concept Justified in the Entropy Basis. ENTROPY (BASEL, SWITZERLAND) 2023; 25:1567. [PMID: 38136447 PMCID: PMC10742484 DOI: 10.3390/e25121567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/15/2023] [Accepted: 11/18/2023] [Indexed: 12/24/2023]
Abstract
Measurement is a typical way of gathering information about an investigated object, generalized by a finite set of characteristic parameters. The result of each iteration of the measurement is an instance of the class of the investigated object in the form of a set of values of characteristic parameters. An ordered set of instances forms a collection whose dimensionality for a real object is a factor that cannot be ignored. Managing the dimensionality of data collections, as well as classification, regression, and clustering, are fundamental problems for machine learning. Compactification is the approximation of the original data collection by an equivalent collection (with a reduced dimension of characteristic parameters) with the control of accompanying information capacity losses. Related to compactification is the data completeness verifying procedure, which is characteristic of the data reliability assessment. If there are stochastic parameters among the initial data collection characteristic parameters, the compactification procedure becomes more complicated. To take this into account, this study proposes a model of a structured collection of stochastic data defined in terms of relative entropy. The compactification of such a data model is formalized by an iterative procedure aimed at maximizing the relative entropy of sequential implementation of direct and reverse projections of data collections, taking into account the estimates of the probability distribution densities of their attributes. The procedure for approximating the relative entropy function of compactification to reduce the computational complexity of the latter is proposed. To qualitatively assess compactification this study undertakes a formal analysis that uses data collection information capacity and the absolute and relative share of information losses due to compaction as its metrics. 
Taking into account the semantic connection of compactification and completeness, the proposed metric is also relevant for the task of assessing data reliability. Testing the proposed compactification procedure proved both its stability and efficiency in comparison with previously used analogues, such as the principal component analysis method and the random projection method.
5
Quality of child anthropometric data from SISVAN, Brazil, 2008-2017. Rev Saude Publica 2023; 57:62. [PMID: 37878848 PMCID: PMC10519688 DOI: 10.11606/s1518-8787.2023057004655] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/24/2022] [Accepted: 12/19/2022] [Indexed: 10/27/2023]
Abstract
OBJECTIVE To evaluate the quality of anthropometric data of children recorded in the Food and Nutrition Surveillance System (SISVAN) from 2008 to 2017. METHOD Descriptive study on the quality of anthropometric data of children under five years of age admitted in primary care services of the Unified Health System, from the individual databases of SISVAN. Data quality was annually assessed using the indicators: coverage, completeness, sex ratio, age distribution, weight and height digit preference, implausible z-score values, standard deviation, and normality of z-scores. RESULTS In total, 73,745,023 records and 29,852,480 children were identified. Coverage increased from 17.7% in 2008 to 45.4% in 2017. Completeness of birth date, weight, and height corresponded to almost 100% in all years. The sex ratio was balanced and approximately similar to the expected ratio, ranging from 0.8 to 1. The age distribution revealed higher percentages of registrations from the ages of two to four years until mid-2015. A preference for terminal digits "zero" and "five" was identified among weight and height records. The percentages of implausible z-scores exceeded 1% for all anthropometric indices, with values decreasing from 2014 onwards. A high dispersion of z-scores, including standard deviations between 1.2 and 1.6, was identified mainly in the indices including height and in the records of children under two years of age and residents in the North, Northeast, and Midwest regions. The distribution of z-scores was symmetric for all indices and platykurtic for height/age and weight/age. CONCLUSIONS The quality of SISVAN anthropometric data for children under five years of age has improved substantially between 2008 and 2017. Some indicators require attention, particularly for height measurements, whose quality was lower especially among groups more vulnerable to nutritional problems.
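One of the indicators listed above, terminal digit preference, can be illustrated with a short sketch. It assumes measurements recorded to one decimal place (for example weight in kg) and uses a hypothetical function name; it is not the procedure used in the study. If terminal digits were evenly spread, each digit would account for about 10% of records, so a combined share for "zero" and "five" well above 20% suggests rounding by the person taking the measurement.

```python
from collections import Counter


def terminal_digit_shares(values, decimals=1):
    """Return the share of records ending in each digit 0-9.

    values: measurements recorded to `decimals` decimal places (assumption).
    """
    if not values:
        raise ValueError("values must not be empty")
    scale = 10 ** decimals
    # round() guards against binary floating-point error before taking the last digit.
    digits = [round(v * scale) % 10 for v in values]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(10)}


# Illustrative weights in kg (made-up values, not SISVAN data).
weights = [12.0, 12.5, 13.0, 11.3, 14.5, 10.0, 9.7, 15.0]
shares = terminal_digit_shares(weights)
zero_five_share = shares[0] + shares[5]
```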
6
NeoStarling: An Efficient and Scalable Collaborative Blockchain-Enabled Obstacle Mapping Solution for Vehicular Environments. SENSORS (BASEL, SWITZERLAND) 2023; 23:7500. [PMID: 37687956 PMCID: PMC10490777 DOI: 10.3390/s23177500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 08/10/2023] [Accepted: 08/18/2023] [Indexed: 09/10/2023]
Abstract
The Vehicular Self-Organizing Network (VANET) is a burgeoning research topic within Intelligent Transportation Systems, holding promise in enhancing safety and convenience for drivers. In general, VANETs require large amounts of data to be shared among vehicles within the network, which raises two challenges. First, data security, privacy, and reliability need to be ensured. Second, data management and security solutions must be very scalable, because current and future transportation systems are very dense. However, existing Vehicle-to-Vehicle solutions fall short of guaranteeing the veracity of crucial traffic and vehicle safety data and of identifying and excluding malicious vehicles. The introduction of blockchain technology in VANETs seeks to address these issues. But blockchain-enabled solutions, such as the Starling system, are too computationally heavy to be scalable enough. Our proposed NeoStarling system focuses on providing scalable, efficient, secure, and reliable obstacle mapping using blockchain. An opportunistic mutual authentication protocol, based on hash functions, is only triggered when vehicles travel a certain distance. Lightweight cryptography and an optimized message exchange enable improved scalability. The evaluation results show that our collaborative approach reduces the frequency of authentications and increases system efficiency by 35%. In addition, scalability is improved by 50% compared to previous mechanisms.
7
Sisvan food intake markers: structure and measurement invariance in Brazil. Rev Saude Publica 2023; 57:52. [PMID: 37585951 PMCID: PMC10421608 DOI: 10.11606/s1518-8787.2023057004896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 11/21/2022] [Indexed: 08/18/2023]
Abstract
OBJECTIVE To characterize the internal structure of the Food and Nutrition Surveillance System (Sisvan) form of food intake markers for individuals over 2 years of age and to investigate measurement invariance between Brazilian macro-regions, life stages and over the years. METHODS A parallel analysis with factor estimation was carried out, complemented with exploratory factor analysis using all Sisvan records with valid responses in the country in 2015 (n = 298,253). Only the first record per individual was considered. Next, multigroup confirmatory factor analysis was used to investigate configural, metric and scalar invariance between the five macro-regions (Midwest, Northeast, North, Southeast, South) and life stages (children, adolescents, adults, elderly) in the same reference year. Invariance was evaluated longitudinally using valid individual records from 2015 to 2019 (n = 4,578,960). The adequacy of fit indices was observed at each step. RESULTS Acceptable fit indices and adequate factor loadings were found for a two-dimensional model, which grouped ultra-processed foods (factor 1) and unprocessed or minimally processed foods (factor 2). The two-dimensional structure, with the respective items in each factor underlying the set of markers, was equivalent across macro-regions, life stages and longitudinally, confirming the configural invariance. The weights of each item and its scale were homogeneous for all groups of interest, confirming metric and scalar invariances. CONCLUSIONS The internal structure of the Sisvan form of food intake markers adequately reflected its conceptual foundation, with stability of factors related to healthy and unhealthy eating in configuration, weights and scale in the investigated categories. These findings qualify food and nutritional surveillance actions, enhancing the use of Sisvan food intake markers in research, monitoring, individual guidance, and care production in the Brazilian Unified Health System.
8
Data Quality Degradation on Prediction Models Generated From Continuous Activity and Heart Rate Monitoring: Exploratory Analysis Using Simulation. JMIR Cardio 2023; 7:e40524. [PMID: 37133921 DOI: 10.2196/40524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 11/10/2022] [Accepted: 11/30/2022] [Indexed: 05/04/2023]
Abstract
BACKGROUND Limited data accuracy is often cited as a reason for caution in the integration of physiological data obtained from consumer-oriented wearable devices in care management pathways. The effect of decreasing accuracy on predictive models generated from these data has not been previously investigated. OBJECTIVE The aim of this study is to simulate the effect of data degradation on the reliability of prediction models generated from those data and thus determine the extent to which lower device accuracy might or might not limit their use in clinical settings. METHODS Using the Multilevel Monitoring of Activity and Sleep in Healthy People data set, which includes continuous free-living step count and heart rate data from 21 healthy volunteers, we trained a random forest model to predict cardiac competence. Model performance in 75 perturbed data sets with increasing missingness, noisiness, bias, and a combination of all 3 perturbations was compared to model performance for the unperturbed data set. RESULTS The unperturbed data set achieved a mean root mean square error (RMSE) of 0.079 (SD 0.001) in predicting cardiac competence index. For all types of perturbations, RMSE remained stable up to 20%-30% perturbation. Above this level, RMSE started increasing and reached the point at which the model was no longer predictive at 80% for noise, 50% for missingness, and 35% for the combination of all perturbations. Introducing systematic bias in the underlying data had no effect on RMSE. CONCLUSIONS In this proof-of-concept study, the performance of predictive models for cardiac competence generated from continuously acquired physiological data was relatively stable with declining quality of the source data. As such, lower accuracy of consumer-oriented wearable devices might not be an absolute contraindication for their use in clinical prediction models.
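The three perturbation types named in the methods (missingness, noisiness, bias) and the RMSE metric can be sketched in a few lines. This is only an illustration of the general idea under stated assumptions: the function names, the way noise is scaled by the series' standard deviation, and the use of `None` for missing values are all hypothetical, and the study's random forest model is not reproduced here.

```python
import math
import random
import statistics


def add_missingness(values, fraction, rng):
    """Replace a given fraction of the values with None (treated as missing)."""
    n_missing = round(fraction * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def add_noise(values, level, rng):
    """Add zero-mean Gaussian noise whose SD is `level` times the series SD."""
    sd = statistics.pstdev(values)
    return [v + rng.gauss(0.0, level * sd) for v in values]


def add_bias(values, offset):
    """Shift every value by a constant offset (systematic bias)."""
    return [v + offset for v in values]


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # fixed seed so the perturbations are repeatable
heart_rate = [62.0, 64.0, 70.0, 85.0, 90.0, 88.0, 75.0, 66.0, 63.0, 61.0]  # made-up series
with_gaps = add_missingness(heart_rate, 0.3, rng)
noisy = add_noise(heart_rate, 0.2, rng)
biased = add_bias(heart_rate, 5.0)
```

A simulation in this style would retrain or re-evaluate the model on each perturbed copy and compare its RMSE with that of the unperturbed data set.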
9
Attitudes and perspectives of nurses and physicians in South Korea towards the clinical use of person-generated health data. Digit Health 2023; 9:20552076231218133. [PMID: 38033521 PMCID: PMC10685775 DOI: 10.1177/20552076231218133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/10/2023] [Indexed: 12/02/2023]
Abstract
This study aimed to explore the adoption of person-generated health data in clinical settings and discern the factors influencing clinicians' willingness to use it. A web-based survey containing 48 questions was developed based on prior research and the Unified Theory of Acceptance and Use of Technology 2 model. The survey was administered to a convenience sample of 486 nurses and physicians in South Korea recruited through an online community and snowball sampling. Of these, 70.7% were physicians. While 65% had used mobile health apps and devices, only 12.8% were familiar with person-generated health data. Still, a promising 73.3% expressed interest in incorporating person-generated health data into patient care, particularly data on blood glucose and vital signs. The findings of the study also indicated that clinicians specializing in internal medicine (OR: 1.9, CI: 1.16-3.19), familiar with person-generated health data (OR: 2.6, CI: 1.58-4.29), with a positive view of information and communication technology adoption (OR: 2.6, CI: 1.65-4.13), and who see the value in person-generated health data (OR: 3.9, CI: 2.55-6.09) showed higher inclination to utilize it. However, those in outpatient settings (OR: 0.4, CI: 0.19-0.73) showed less enthusiasm. The findings of this study suggest that despite the willingness of clinicians to use person-generated health data, various barriers must be addressed first, including a lack of knowledge regarding its use, concerns about data reliability and quality, and a lack of provider incentives. Overcoming these challenges demands concerted organizational or policy support. This research underscores person-generated health data's untapped potential in healthcare and the pressing need for strategies that facilitate its clinical integration.
10
A Deep Learning Based Data Recovery Approach for Missing and Erroneous Data of IoT Nodes. SENSORS (BASEL, SWITZERLAND) 2022; 23:170. [PMID: 36616766 PMCID: PMC9824676 DOI: 10.3390/s23010170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Revised: 12/15/2022] [Accepted: 12/21/2022] [Indexed: 06/17/2023]
Abstract
Internet of things (IoT) nodes are deployed in large-scale automated monitoring applications to capture massive amounts of data from various locations in a time-series manner. The captured data are affected by several factors, such as device malfunctioning, unstable communication, environmental factors, synchronization problems, and unreliable nodes, which results in data inconsistency. Data recovery approaches are one of the best solutions to reduce data inconsistency. This research provides a missing data recovery approach based on spatial-temporal (ST) correlation between the IoT nodes in the network. The proposed approach has a clustering (CL) phase and a data recovery (DR) phase. In the CL phase, the nodes are clustered based on their spatial and temporal relationship, and common neighbors are extracted. In the DR phase, missing data are recovered with the help of neighbor nodes using the ST-hierarchical long short-term memory (ST-HLSTM) algorithm. The proposed algorithm has been verified on real-world IoT-based hydraulic test rig data sets gathered from the ThingSpeak real-time cloud platform. The algorithm shows approximately 98.5% reliability as compared with the other existing algorithms due to its spatial-temporal features based on deep neural network architecture.
11
IoT-Chain and Monitoring-Chain Using Multilevel Blockchain for IoT Security. SENSORS (BASEL, SWITZERLAND) 2022; 22:8271. [PMID: 36365971 PMCID: PMC9654779 DOI: 10.3390/s22218271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 10/23/2022] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
In general, the Internet of Things (IoT) relies on centralized servers due to limited computing power and storage capacity. These server-based architectures have vulnerabilities such as DDoS attacks, single-point errors, and data forgery, and cannot guarantee stability and reliability. Blockchain technology can guarantee reliability and stability with a P2P network-based consensus algorithm and distributed ledger technology. However, it requires the high storage capacity of the existing blockchain and the computational power of the consensus algorithm. Therefore, blockchain nodes for IoT data management are maintained through an external cloud, an edge node. As a result, the vulnerability of the existing centralized structure cannot be guaranteed, and reliability cannot be guaranteed in the process of storing IoT data on the blockchain. In this paper, we propose a multi-level blockchain structure and consensus algorithm to solve the vulnerability. A multi-level blockchain operates on IoT devices, and there is an IoT chain layer that stores sensor data to ensure reliability. In addition, there is a hyperledger fabric-based monitoring chain layer that operates the access control for the metadata and data of the IoT chain to lighten the weight. We propose an export consensus method between the two blockchains, the Schnorr signature method, and a random-based lightweight consensus algorithm within the IoT-Chain. Experiments to measure the blockchain size, propagation time, consensus delay time, and transactions per second (TPS) were conducted using IoT. The blockchain did not exceed a certain size, and the delay time was reduced by 96% to 99% on average compared to the existing consensus algorithm. In the throughput tests, the maximum was 1701 TPS and the minimum was 1024 TPS.
12
Safety Monitoring System of CAVs Considering the Trade-Off between Sampling Interval and Data Reliability. SENSORS 2022; 22:s22103611. [PMID: 35632019 PMCID: PMC9147509 DOI: 10.3390/s22103611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 05/06/2022] [Accepted: 05/07/2022] [Indexed: 12/04/2022]
Abstract
The safety of urban transportation systems is considered a public health issue worldwide, and many researchers have contributed to improving it. Connected automated vehicles (CAVs) and cooperative intelligent transportation systems (C-ITSs) are considered solutions to ensure the safety of urban transportation systems using various sensors and communication devices. However, realizing a data flow framework, including data collection, data transmission, and data processing, in South Korea is challenging, as CAVs produce a massive amount of data every minute, which cannot be transmitted via existing communication networks. Thus, raw data must be sampled and transmitted to the server for further processing. The data acquired must be highly accurate to ensure the safety of the different agents in C-ITS. On the other hand, raw data must be reduced through sampling to ensure transmission using existing communication systems. Thus, in this study, C-ITS architecture and data flow are designed, including messages and protocols for the safety monitoring system of CAVs, and the optimal sampling interval determined for data transmission while considering the trade-off between communication efficiency and accuracy of the safety performance indicators. Three safety performance indicators were introduced: severe deceleration, lateral position variance, and inverse time to collision. A field test was conducted to collect data from various sensors installed in the CAV, determining the optimal sampling interval. In addition, the Kolmogorov–Smirnov test was conducted to ensure statistical consistency between the sampled and raw datasets. The effects of the sampling interval on message delay, data accuracy, and communication efficiency in terms of the data compression ratio were analyzed. Consequently, a sampling interval of 0.2 s is recommended for optimizing the system’s overall efficiency.
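The consistency check between sampled and raw data described above can be illustrated with a two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical distribution functions. The sketch below is a plain standard-library version written for illustration; the function names, the made-up deceleration series, and the down-sampling step are assumptions and do not reflect the study's sensors, raw sampling rate, or exact procedure.

```python
from bisect import bisect_right


def downsample(samples, step):
    """Keep every `step`-th sample, which lengthens the sampling interval."""
    return samples[::step]


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        # Empirical CDF value = fraction of samples less than or equal to x.
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


# Made-up deceleration readings (m/s^2), not data from the field test.
raw = [0.1, 0.3, 0.2, 0.8, 1.5, 1.2, 0.4, 0.2, 0.1, 0.6, 0.9, 0.3]
sampled = downsample(raw, 2)         # hypothetical step
d_stat = ks_statistic(raw, sampled)  # small values mean similar distributions
```

A full test would also convert the statistic into a p-value for the two sample sizes; only the statistic is shown here.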
13
MICRO-CONTROLLED THERMAL STIMULATOR FOR DETECTING FINE FIBER CHANGES IN PATIENTS WITH DIABETES MELLITUS: A DIAGNOSTIC ACCURACY STUDY. Prim Care Diabetes 2021; 15:548-553. [PMID: 33541822 DOI: 10.1016/j.pcd.2021.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Revised: 12/27/2020] [Accepted: 01/09/2021] [Indexed: 11/22/2022]
Abstract
AIM To evaluate sensitivity and specificity of the micro-controlled thermal stimulator (MTS) for detecting pathological changes in fine fiber of neuropathy patients with DM. METHODS A diagnostic accuracy study including 84 patients, aged 15-75 years was conducted. A patient's foot was subjected to dermatological, musculoskeletal, vascular, and neurological evaluations. The latter was performed through the perception of a sharp touch with a toothpick (pinprick), thermal sensitivity (cold and hot temperature sensations measured using Diapason Handle and MTS, respectively), vibratory sensitivity (128 Hz Diapason Handle), 10 g Semmes-Weinstein monofilament, and a reflex test. Statistical analyses were performed using Stata® software version 13.0. The sensitivity, specificity, positive and negative predictive values, likelihood ratios, AUC, Kappa index, and accuracy of the diagnostic instruments were evaluated. RESULTS Of the 84 volunteers, 66.7% were female, with an average age of 54 years. We observed that 17% of the total patients were positive for pain sensations in the foot, 13% for cold-temperature sensations, and 21% for hot-temperature sensations. The MTS (hot temperature) obtained 97.6% sensitivity and 90% specificity, agreeing on 87.5% (Kappa index) with the Diapason Handle (cold temperature) (AUC > 0.937; p < 0.05). CONCLUSION MTS is an accurate, sensitive, and specific instrument for the evaluation of diabetic neuropathy as compared with the tuning fork as the standard method and, consequently, it could be of help for the early diagnosis of diabetic neuropathy.
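The diagnostic accuracy measures named above follow directly from a 2x2 table of test results against the reference standard. The sketch below shows the standard formulas; the function name and the counts in the usage example are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute common diagnostic accuracy measures from 2x2 table counts.

    tp/fn: reference-positive cases the test got right/wrong.
    tn/fp: reference-negative cases the test got right/wrong.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts chosen for easy arithmetic.
metrics = diagnostic_metrics(tp=8, fp=2, fn=2, tn=18)
```

Note that the likelihood ratios are undefined when specificity is exactly 1 (positive ratio) or 0 (negative ratio); this sketch does not handle those edge cases.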
14
Abstract
Aim To determine whether our E-diary can be used to diagnose migraine and provide more reliable migraine-related frequency numbers compared to patients’ self-reported estimates. Methods We introduced a self-developed E-diary including automated algorithms differentiating headache and migraine days, indicating whether a patient has migraine. Reliability of the E-diary diagnosis in combination with two previously validated E-questionnaires was compared to a physician’s diagnosis as gold standard in headache patients referred to the Leiden Headache Clinic (n = 596). In a subset of patients with migraine (n = 484), self-estimated migraine-related frequencies were compared to diary-based results. Results The first migraine screening approach including an E-headache questionnaire, and the E-diary revealed a sensitivity of 98% and specificity of 17%. In the second approach, an E-migraine questionnaire was added, resulting in a sensitivity of 79% and specificity of 69%. Mean self-estimated monthly migraine days, non-migrainous headache days and days with acute medication use were different from E-diary-based results (absolute mean difference ± standard deviation respectively 4.7 ± 5.0, 6.2 ± 6.6 and 4.3 ± 4.8). Conclusion The E-diary including algorithms differentiating headache and migraine days showed usefulness in diagnosing migraine. The use emphasised the need for E-diaries to obtain reliable information, as patients do not reliably recall numbers of migraine days and acute medication intake. Adding E-diaries will be helpful in future headache telemedicine.
15
Optimizing Accuracy and Depth of Protein Quantification in Experiments Using Isobaric Carriers. J Proteome Res 2020; 20:880-887. [PMID: 33190502 DOI: 10.1021/acs.jproteome.0c00675] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/15/2022]
Abstract
The isobaric carrier approach, which combines small isobarically labeled samples with a larger isobarically labeled carrier sample, finds diverse applications in ultrasensitive mass spectrometry analysis of very small samples, such as single cells. To enhance the growing use of isobaric carriers, we characterized the trade-offs of using isobaric carriers in controlled experiments with complex human proteomes. The data indicate that isobaric carriers directly enhance peptide sequence identification without simultaneously increasing the number of protein copies sampled from small samples. The results also indicate strategies for optimizing the amount of isobaric carrier and analytical parameters, such as ion accumulation time, for different priorities such as improved quantification or an increased number of identified proteins. Balancing these trade-offs enables adapting isobaric carrier experiments to different applications, such as quantifying proteins from limited biopsies or organoids, building single-cell atlases, or modeling protein networks in single cells. In all cases, the reliability of protein quantification should be estimated and incorporated in all subsequent analyses. We expect that these guidelines will aid in explicit incorporation of the characterized trade-offs in experimental designs and transparent error propagation in data analysis.
16
Assessing the Reliability of Relevant Tweets and Validation Using Manual and Automatic Approaches for Flood Risk Communication. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2020; 9:532. [PMID: 33511044 PMCID: PMC7839990 DOI: 10.3390/ijgi9090532] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
While Twitter has been touted as a preeminent source of up-to-date information on hazard events, the reliability of tweets is still a concern. Our previous publication extracted relevant tweets containing information about the 2013 Colorado flood event and its impacts. Using the relevant tweets, this research further examined the reliability (accuracy and trueness) of the tweets by examining the text and image content and comparing them to other publicly available data sources. Both manual identification of text information and automated (Google Cloud Vision, application programming interface (API)) extraction of images were implemented to balance accurate information verification and efficient processing time. The results showed that both the text and images contained useful information about damaged/flooded roads/streets. This information will help emergency response coordination efforts and informed allocation of resources when enough tweets contain geocoordinates or location/venue names. This research will identify reliable crowdsourced risk information to facilitate near real-time emergency response through better use of crowdsourced risk communication platforms.
|
17
|
Investigating Perceptual Biases, Data Reliability, and Data Discovery in a Methodology for Collecting Speech Errors From Audio Recordings. LANGUAGE AND SPEECH 2019; 62:281-317. [PMID: 29623769 DOI: 10.1177/0023830918765012] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
This work describes a methodology of collecting speech errors from audio recordings and investigates how some of its assumptions affect data quality and composition. Speech errors of all types (sound, lexical, syntactic, etc.) were collected by eight data collectors from audio recordings of unscripted English speech. Analysis of these errors showed that: (i) different listeners find different errors in the same audio recordings, but (ii) the frequencies of error patterns are similar across listeners; (iii) errors collected "online" using on-the-spot observational techniques are more likely to be affected by perceptual biases than "offline" errors collected from audio recordings; and (iv) datasets built from audio recordings can be explored and extended in a number of ways that traditional corpus studies cannot.
|
18
|
Understanding of researcher behavior is required to improve data reliability. Gigascience 2019; 8:giz017. [PMID: 30715291 PMCID: PMC6528747 DOI: 10.1093/gigascience/giz017] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2018] [Revised: 01/20/2019] [Accepted: 01/25/2019] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND: A lack of data reproducibility ("reproducibility crisis") has been extensively debated across many academic disciplines. RESULTS: Although a reproducibility crisis is widely perceived, conclusive data on the scale of the problem and the underlying reasons are largely lacking. The debate is primarily focused on methodological issues. However, examples such as the use of misidentified cell lines illustrate that the availability of reliable methods does not guarantee good practice. Moreover, research is often characterized by a lack of established methods. Despite the crucial importance of researcher conduct, research and conclusive data on the determinants of researcher behavior are widely missing. CONCLUSION: Meta-research that establishes an understanding of the factors that determine researcher behavior is urgently needed. This knowledge can then be used to implement and iteratively improve measures that incentivize researchers to apply the highest standards, resulting in high-quality data.
|
19
|
Systems Approach to Human Hair Fibers: Interdependence Between Physical, Mechanical, Biochemical and Geometric Properties of Natural Healthy Hair. Front Physiol 2019; 10:112. [PMID: 30846943 PMCID: PMC6393780 DOI: 10.3389/fphys.2019.00112] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2017] [Accepted: 01/30/2019] [Indexed: 11/13/2022] Open
Abstract
Contextual interpretation of hair fiber data is often blind to the effects of the dynamic complexity between different fiber properties. This intrinsic complexity requires systems thinking to decipher hair fiber accurately. Hair research, studied by various disciplines, follows a reductionist research approach, where elements of interest are studied from a local context with a certain amount of detachment from other elements or contexts. Following a systems approach, the authors are currently developing a cross-disciplinary taxonomy to provide a holistic view of fiber constituents and their interactions within large-scale dynamics. Based on the development process, this paper presents a review that explores the associated features, interrelationships and interactive complexities between physical, mechanical, biochemical and geometric features of natural, healthy hair fibers. Through the review, the importance of an appropriate taxonomy for interpreting hair fiber data across different disciplines is revealed. The review also demonstrates how seemingly unrelated fiber constituents are indeed interdependent and that these interdependencies may affect the behavior of the fiber. Finally, the review highlights how a non-integrative approach may have a negative impact on the reliability of hair data interpretation.
|
20
|
A heuristic multi-criteria classification approach incorporating data quality information for choropleth mapping. CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE 2016; 44:246-258. [PMID: 28286426 PMCID: PMC5342899 DOI: 10.1080/15230406.2016.1145072] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Despite conceptual and technological advancements in cartography over the decades, choropleth map design and classification fail to address a fundamental issue: estimates that are statistically indifferent may be assigned to different classes on maps or vice versa. Recently, the class separability concept was introduced as a map classification criterion to evaluate the likelihood that estimates in two classes are statistically different. Unfortunately, choropleth maps created according to the separability criterion usually have highly unbalanced classes. To produce reasonably separable but more balanced classes, we propose a heuristic classification approach that considers not just the class separability criterion but also other classification criteria such as evenness and intra-class variability. A geovisual-analytic package was developed to support the heuristic mapping process, to evaluate the trade-off between relevant criteria, and to select the most preferable classification. Class break values can be adjusted to improve the performance of a classification.
|
21
|
The extent of food waste generation across EU-27: different calculation methods and the reliability of their results. WASTE MANAGEMENT & RESEARCH : THE JOURNAL OF THE INTERNATIONAL SOLID WASTES AND PUBLIC CLEANSING ASSOCIATION, ISWA 2014; 32:683-94. [PMID: 25161274 DOI: 10.1177/0734242x14545374] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 05/18/2023]
Abstract
The reduction of food waste is seen as an important societal issue with considerable ethical, ecological and economic implications. The European Commission aims to cut food waste by half by 2020. However, implementing effective prevention measures requires knowledge of the reasons for and the scale of food waste generation along the food supply chain. The available data for Europe are very heterogeneous, and doubts about their reliability are legitimate. This mini-review gives an overview of available data on food waste generation in EU-27 and discusses their reliability against the results of our own model calculations. These calculations are based on a methodology developed on behalf of the Food and Agriculture Organization of the United Nations and provide data on food waste generation for each of the EU-27 member states, broken down by the individual stages of the food chain and differentiated by product group. The analysis shows that the results differ significantly, depending on the data sources chosen and the assumptions made. Further research is much needed to improve the data stock, which forms the basis for the monitoring and management of food waste.
|
22
|
Overview and quality assurance for the oral health component of the National Health and Nutrition Examination Survey (NHANES), 2009-2010. J Public Health Dent 2014; 74:248-56. [PMID: 24849242 DOI: 10.1111/jphd.12056] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 11/30/2022]
Abstract
OBJECTIVE: In 2009-2010, the oral health component for the National Health and Nutrition Examination Survey (NHANES) focused on adult periodontal health and included a full mouth periodontal examination as well as a series of questions administered during the home interview. During this period, intraoral assessments were conducted by dental hygienists. METHODS: This report provides oral health content information and results of dental examiner reliability for data collected during NHANES 2009-2010 on 7,189 persons aged 3-19 years and 30 years and older representing the US civilian, noninstitutionalized population in these age groups. RESULTS: For caries and dental sealant assessments, Kappa statistics ranged from 0.71 to 1.00. The Kappa score for moderate and severe periodontitis using the Centers for Disease Control and Prevention/American Academy of Periodontology case definition guidelines was 0.70, but scores were lower for other periodontal status definitions. When defining moderate or severe periodontitis based on the NHANES 2003-2004 study protocols using data from only three facial periodontal sites, the Kappa scores were 0.64 and 0.55. Interclass correlation coefficients (ICCs) for mean attachment loss were 0.80 or higher for both examiners. Site-specific mean attachment loss ICCs were generally higher for interproximal measurements compared with mid-facial and mid-lingual measurements. CONCLUSION: Overall, the data reliability analyses conducted for 2009-2010 indicate an acceptable level of data quality and that examiner (dental hygienist) performance in this data collection cycle is similar to prior survey periods since the NHANES continuous survey began in 1999.
|