1
Coutinho-Almeida J, Saez C, Correia R, Rodrigues PP. Development and initial validation of a data quality evaluation tool in obstetrics real-world data through HL7-FHIR interoperable Bayesian networks and expert rules. JAMIA Open 2024; 7:ooae062. [PMID: 39070966 PMCID: PMC11283181 DOI: 10.1093/jamiaopen/ooae062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 06/05/2024] [Accepted: 06/19/2024] [Indexed: 07/30/2024]
Abstract
Background The increasing prevalence of electronic health records (EHRs) in healthcare systems globally has underscored the importance of data quality for clinical decision-making and research, particularly in obstetrics. High-quality data is vital for an accurate representation of patient populations and to avoid erroneous healthcare decisions. However, existing studies have highlighted significant challenges in EHR data quality, necessitating innovative tools and methodologies for effective data quality assessment and improvement. Objective This article addresses the critical need for data quality evaluation in obstetrics by developing a novel tool. The tool uses Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standards in conjunction with Bayesian networks and expert rules to assess data quality in real-world obstetrics data. Methods A harmonized framework focusing on completeness, plausibility, and conformance underpins our methodology. We employed Bayesian networks for advanced probabilistic modeling, integrated outlier detection methods, and a rule-based system grounded in domain-specific knowledge. The development and validation of the tool were based on obstetrics data from 9 Portuguese hospitals, spanning the years 2019-2020. Results The developed tool demonstrated strong potential for identifying data quality issues in obstetrics EHRs. The Bayesian networks used in the tool showed high performance for various features, with area under the receiver operating characteristic curve (AUROC) between 75% and 97%. The tool's infrastructure and interoperable format as a FHIR Application Programming Interface (API) enable possible deployment of real-time data quality assessment in obstetrics settings. Our initial assessments show promise: even when compared with physicians' assessment of real records, the tool can reach an AUROC of 88%, depending on the threshold defined.
Discussion Our results also show that obstetrics clinical records are difficult to assess in terms of quality, and assessments like ours could benefit from more categorical approaches to ranking between bad and good quality. Conclusion This study contributes significantly to the field of EHR data quality assessment, with a specific focus on obstetrics. The combination of HL7-FHIR interoperability, machine learning techniques, and expert knowledge presents a robust, adaptable solution to the challenges of healthcare data quality. Future research should explore tailored data quality evaluations for different healthcare contexts, as well as further validation of the tool's capabilities, enhancing the tool's utility across diverse medical domains.
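As an illustration of the physician comparison reported in this abstract, the following is a minimal sketch of how an AUROC can be computed from per-record scores and binary reference labels. It is not the authors' implementation: the function name `auroc`, the example scores, the physician flags, and the 0.5 cutoff are all hypothetical, and the abstract does not describe how the tool combines Bayesian network and rule outputs into a score.

```python
def auroc(scores, labels):
    """Rank-based AUROC: the probability that a randomly chosen positive
    record scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: a quality-issue score per record from the tool, and
# whether a physician judged the same record to be of poor quality (1) or not (0).
tool_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
physician_flags = [1, 1, 1, 0, 0, 0]

value = auroc(tool_scores, physician_flags)

# Hypothetical cutoff for turning scores into per-record flags.
threshold = 0.5
flagged = [s >= threshold for s in tool_scores]
```

The `flagged` list shows one way a cutoff could turn scores into per-record flags for review.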
Affiliation(s)
- João Coutinho-Almeida
- CINTESIS@RISE—Centre for Health Technologies and Services Research, University of Porto, 4200-319 Porto, Portugal
- MEDCIDS—Faculty of Medicine of University of Porto, 4200-319 Porto, Portugal
- Health Data Science PhD Program, Faculty of Medicine of the University of Porto, 4200-319 Porto, Portugal
- Carlos Saez
- Instituto Universitario de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas, Universitat Politècnica de València, 46022 Valencia, Spain
- Ricardo Correia
- CINTESIS@RISE—Centre for Health Technologies and Services Research, University of Porto, 4200-319 Porto, Portugal
- MEDCIDS—Faculty of Medicine of University of Porto, 4200-319 Porto, Portugal
- Health Data Science PhD Program, Faculty of Medicine of the University of Porto, 4200-319 Porto, Portugal
- Pedro Pereira Rodrigues
- CINTESIS@RISE—Centre for Health Technologies and Services Research, University of Porto, 4200-319 Porto, Portugal
- MEDCIDS—Faculty of Medicine of University of Porto, 4200-319 Porto, Portugal
- Health Data Science PhD Program, Faculty of Medicine of the University of Porto, 4200-319 Porto, Portugal
2
Lotspeich SC, Amorim GGC, Shaw PA, Tao R, Shepherd BE. Optimal multiwave validation of secondary use data with outcome and exposure misclassification. Can J Stat 2024; 52:532-554. [PMID: 39629097 PMCID: PMC11610482 DOI: 10.1002/cjs.11772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 12/23/2022] [Indexed: 04/03/2023]
Abstract
Observational databases provide unprecedented opportunities for secondary use in biomedical research. However, these data can be error-prone and must be validated before use. It is usually unrealistic to validate the whole database because of resource constraints. A cost-effective alternative is a two-phase design that validates a subset of records enriched for information about a particular research question. We consider odds ratio estimation under differential outcome and exposure misclassification and propose optimal designs that minimize the variance of the maximum likelihood estimator. Our adaptive grid search algorithm can locate the optimal design in a computationally feasible manner. Because the optimal design relies on unknown parameters, we introduce a multiwave strategy to approximate the optimal design. We demonstrate the proposed design's efficiency gains through simulations and two large observational studies.
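To make the two-phase idea concrete, here is a minimal sketch in which the error-prone outcome and exposure are available for every record (phase 1) and a validation subset is drawn from strata defined by those two variables (phase 2). It uses a simple balanced design, not the optimal design proposed in the paper, and the function name `balanced_phase2_sample`, the record layout, and the counts are hypothetical.

```python
import random
from collections import defaultdict


def balanced_phase2_sample(records, n_validate, seed=0):
    """Pick indices to validate, taking an equal number from each
    (error-prone outcome, error-prone exposure) stratum.

    If a stratum has fewer records than its share, all of them are taken
    and the shortfall is not redistributed.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, (y_star, x_star) in enumerate(records):
        strata[(y_star, x_star)].append(idx)
    per_stratum = n_validate // len(strata)
    chosen = []
    for key in sorted(strata):
        members = strata[key]
        chosen.extend(rng.sample(members, min(per_stratum, len(members))))
    return sorted(chosen)


# Hypothetical phase 1 data: (error-prone outcome, error-prone exposure) pairs.
phase1 = [(0, 0)] * 70 + [(0, 1)] * 15 + [(1, 0)] * 10 + [(1, 1)] * 5

to_validate = balanced_phase2_sample(phase1, n_validate=20)
```

Sampling equally from each stratum over-represents the rare outcome and exposure combinations relative to simple random sampling, which is the general motivation for enriching the validation subset.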
Affiliation(s)
- Sarah C. Lotspeich
- Department of Statistical Sciences, Wake Forest University, Winston-Salem, 27109, North Carolina, U.S.A
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, 37203, Tennessee, U.S.A
- Gustavo G. C. Amorim
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, 37203, Tennessee, U.S.A
- Pamela A. Shaw
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, U.S.A
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, 98101, Washington, U.S.A
- Ran Tao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, 37203, Tennessee, U.S.A
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, 37232, Tennessee, U.S.A
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, 37203, Tennessee, U.S.A
3
Sousa B, Chiale S, Bryant H, Dulli L, Medrano T. Adopting Data to Care to Identify and Address Gaps in Services for Children and Adolescents Living With HIV in Mozambique. Global Health: Science and Practice 2024; 12:e2300130. [PMID: 38443100 PMCID: PMC11057801 DOI: 10.9745/ghsp-d-23-00130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 02/06/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND The Data to Care (D2C) strategy uses multiple sources of complementary data on HIV clients and related services to identify individuals with gaps in HIV treatment. Although D2C has been widely used in the United States, there is no evidence on its use in other settings, such as countries most affected by the epidemic. STRATEGY IMPLEMENTATION The D2C strategy was implemented within the context of a project that provided community-based support to children and adolescents living with HIV (C/ALHIV) in Mozambique. A data tracking tool and a standard operating procedure manual for local partner community organizations and health care facilities were developed to support the effort. Project staff met with local project implementing partners to discuss and coordinate the intervention in pilot health facilities. STRATEGY PILOTING The project initiated a pilot D2C intervention in 2019, working with 14 health facilities across 5 additional districts within 1 province. COVida project data were compared with clinical data from facilities serving C/ALHIV. The D2C intervention identified gaps in HIV treatment for a substantial number of C/ALHIV, and targeted support services were provided to address those gaps. Viral load (VL) monitoring was added in March 2020. Before the intervention, 71% of C/ALHIV reported to be on HIV treatment by their caregivers were documented as on treatment in health facilities. Support interventions targeted those not on treatment, and this proportion increased to 96% within 1 year of implementation. Additionally, 12 months later, the proportion of C/ALHIV with a documented VL test increased from 52% to 72%. CONCLUSION Introducing the D2C pilot intervention was associated with substantial improvements in HIV treatment for C/ALHIV, including increased linkage to and continuity in treatment and increased VL testing. D2C may be a useful approach to improve health outcomes for C/ALHIV in settings outside of the United States.
4
Lotspeich SC, Shepherd BE, Kariuki MA, Wools-Kaloustian K, McGowan CC, Musick B, Semeere A, Crabtree Ramírez BE, Mkwashapi DM, Cesar C, Ssemakadde M, Machado DM, Ngeresa A, Ferreira FF, Lwali J, Marcelin A, Cardoso SW, Luque MT, Otero L, Cortés CP, Duda SN. Lessons learned from over a decade of data audits in international observational HIV cohorts in Latin America and East Africa. J Clin Transl Sci 2023; 7:e245. [PMID: 38033704 PMCID: PMC10685260 DOI: 10.1017/cts.2023.659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 10/13/2023] [Accepted: 10/16/2023] [Indexed: 12/02/2023]
Abstract
Introduction Routine patient care data are increasingly used for biomedical research, but such "secondary use" data have known limitations, including their quality. When leveraging routine care data for observational research, developing audit protocols that can maximize informational return and minimize costs is paramount. Methods For more than a decade, the Latin America and East Africa regions of the International epidemiology Databases to Evaluate AIDS (IeDEA) consortium have been auditing the observational data drawn from participating human immunodeficiency virus clinics. Since our earliest audits, where external auditors used paper forms to record audit findings from paper medical records, we have streamlined our protocols to obtain more efficient and informative audits that keep up with advancing technology while reducing travel obligations and associated costs. Results We present five key lessons learned from conducting data audits of secondary-use data from resource-limited settings for more than 10 years and share eight recommendations for other consortia looking to implement data quality initiatives. Conclusion After completing multiple audit cycles in both the Latin America and East Africa regions of the IeDEA consortium, we have established a rich reference for data quality in our cohorts, as well as large, audited analytical datasets that can be used to answer important clinical questions with confidence. By sharing our audit processes and how they have been adapted over time, we hope that others can develop protocols informed by our lessons learned from more than a decade of experience in these large, diverse cohorts.
Affiliation(s)
- Sarah C. Lotspeich
- Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC, USA
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Kara Wools-Kaloustian
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- Catherine C. McGowan
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Beverly Musick
- Department of Biostatistics, Indiana University School of Medicine, Indianapolis, IN, USA
- Aggrey Semeere
- Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Brenda E. Crabtree Ramírez
- Department of Infectious Diseases, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
- Denna M. Mkwashapi
- Sexual and Reproductive Health Program, National Institute for Medical Research Mwanza, United Republic of Tanzania, Mwanza, Tanzania
- Daisy Maria Machado
- Departamento de Pediatria, Universidade Federal de São Paulo, São Paulo, Brazil
- Antony Ngeresa
- Academic Model Providing Access to Health Care (AMPATH), Eldoret, Kenya
- Jerome Lwali
- Tumbi Hospital HIV Care and Treatment Clinic, United Republic of Tanzania, Kibaha, Tanzania
- Adias Marcelin
- Le Groupe Haïtien d’Etude du Sarcome de Kaposi et des Infections Opportunistes, Port-au-Prince, Haiti
- Marco Tulio Luque
- Instituto Hondureño de Seguridad Social and Hospital Escuela Universitario, Tegucigalpa, Honduras
- Larissa Otero
- Instituto de Medicina Tropical Alexander von Humboldt, Universidad Peruana Cayetano Heredia, Lima, Peru
- School of Medicine, Universidad Peruana Cayetano Heredia, Lima, Peru
- Stephany N. Duda
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
5
Shepherd BE, Han K, Chen T, Bian A, Pugh S, Duda SN, Lumley T, Heerman WJ, Shaw PA. Multiwave validation sampling for error-prone electronic health records. Biometrics 2023; 79:2649-2663. [PMID: 35775996 PMCID: PMC10525037 DOI: 10.1111/biom.13713] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/01/2022] [Accepted: 06/16/2022] [Indexed: 11/29/2022]
Abstract
Electronic health record (EHR) data are increasingly used for biomedical research, but these data have recognized data quality challenges. Data validation is necessary to use EHR data with confidence, but limited resources typically make complete data validation impossible. Using EHR data, we illustrate prospective, multiwave, two-phase validation sampling to estimate the association between maternal weight gain during pregnancy and the risks of her child developing obesity or asthma. The optimal validation sampling design depends on the unknown efficient influence functions of regression coefficients of interest. In the first wave of our multiwave validation design, we estimate the influence function using the unvalidated (phase 1) data to determine our validation sample; then in subsequent waves, we re-estimate the influence function using validated (phase 2) data and update our sampling. For efficiency, estimation combines obesity and asthma sampling frames while calibrating sampling weights using generalized raking. We validated 996 of 10,335 mother-child EHR dyads in six sampling waves. Estimated associations between childhood obesity/asthma and maternal weight gain, as well as other covariates, are compared to naïve estimates that only use unvalidated data. In some cases, estimates markedly differ, underscoring the importance of efficient validation sampling to obtain accurate estimates incorporating validated data.
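The multiwave idea in this abstract, where the sampling is updated as validated data accumulate, can be sketched with Neyman allocation, a standard rule that assigns validation effort to strata in proportion to stratum size times within-stratum standard deviation. Using this rule here is an assumption made for illustration: the abstract does not state the allocation rule, and the function name `neyman_allocation`, the strata, and all numeric values below are invented.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n_total):
    """Split n_total validations across strata proportionally to N_h * S_h.

    Fractions are rounded down and capped at the stratum size, so the
    allocations may sum to slightly less than n_total.
    """
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs a positive size and SD")
    return [min(int(n_total * w / total), size)
            for w, size in zip(weights, stratum_sizes)]


# Hypothetical strata of unvalidated (phase 1) records.
sizes = [1000, 500, 100]

# Wave 1: per-stratum SDs of the influence function estimated from
# unvalidated data (invented values).
wave1 = neyman_allocation(sizes, [1.0, 2.0, 5.0], n_total=100)

# Wave 2: SDs re-estimated from the records validated so far (invented
# values), applied to the records that have not been validated yet.
remaining = [size - n for size, n in zip(sizes, wave1)]
wave2 = neyman_allocation(remaining, [1.0, 1.0, 10.0], n_total=100)
```

In this toy run the third stratum receives more of the second-wave budget because its re-estimated variability is larger, which is the kind of adjustment a multiwave design allows.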
Affiliation(s)
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, USA
- Kyunghee Han
- Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago
- Tong Chen
- Department of Statistics, University of Auckland
- Aihua Bian
- Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, USA
- Shannon Pugh
- Department of Emergency Medicine, Vanderbilt University Medical Center
- Stephany N. Duda
- Department of Biomedical Informatics, Vanderbilt University Medical Center
- Pamela A. Shaw
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute
6
Lotspeich SC, Shepherd BE, Amorim GGC, Shaw PA, Tao R. Efficient odds ratio estimation under two-phase sampling using error-prone data from a multi-national HIV research cohort. Biometrics 2022; 78:1674-1685. [PMID: 34213008 PMCID: PMC8720323 DOI: 10.1111/biom.13512] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/08/2020] [Revised: 05/19/2021] [Accepted: 06/17/2021] [Indexed: 12/30/2022]
Abstract
Persons living with HIV engage in routine clinical care, generating large amounts of data in observational HIV cohorts. These data are often error-prone, and directly using them in biomedical research could bias estimation and give misleading results. A cost-effective solution is the two-phase design, under which the error-prone variables are observed for all patients during Phase I, and that information is used to select patients for data auditing during Phase II. For example, the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) selected a random sample from each site for data auditing. Herein, we consider efficient odds ratio estimation with partially audited, error-prone data. We propose a semiparametric approach that uses all information from both phases and accommodates a number of error mechanisms. We allow both the outcome and covariates to be error-prone and these errors to be correlated, and selection of the Phase II sample can depend on Phase I data in an arbitrary manner. We devise a computationally efficient, numerically stable EM algorithm to obtain estimators that are consistent, asymptotically normal, and asymptotically efficient. We demonstrate the advantages of the proposed methods over existing ones through extensive simulations. Finally, we provide applications to the CCASAnet cohort.
Affiliation(s)
- Sarah C. Lotspeich
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, U.S.A
- Bryan E. Shepherd
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, U.S.A
- Gustavo G. C. Amorim
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, U.S.A
- Pamela A. Shaw
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, U.S.A
- Ran Tao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, U.S.A
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, U.S.A
7
Yu H, Yu Q, Nie Y, Xu W, Pu Y, Dai W, Wei X, Shi Q. Data Quality of Longitudinally Collected Patient-Reported Outcomes After Thoracic Surgery: Comparison of Paper- and Web-Based Assessments. J Med Internet Res 2021; 23:e28915. [PMID: 34751657 PMCID: PMC8663677 DOI: 10.2196/28915] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2021] [Revised: 05/21/2021] [Accepted: 10/03/2021] [Indexed: 01/05/2023]
Abstract
Background High-frequency patient-reported outcome (PRO) assessments are used to measure patients' symptoms after surgery for surgical research; however, the quality of those longitudinal PRO data has seldom been discussed. Objective The aim of this study was to determine data quality-influencing factors and to profile error trajectories of data longitudinally collected via paper-and-pencil (P&P) or web-based assessment (electronic PRO [ePRO]) after thoracic surgery. Methods We extracted longitudinal PRO data from 678 patients scheduled for lung surgery from an observational study (n=512) and a randomized clinical trial (n=166) on the evaluation of different perioperative care strategies. PROs were assessed by the MD Anderson Symptom Inventory Lung Cancer Module and single-item Quality of Life Scale before surgery and then daily after surgery until discharge or up to 14 days of hospitalization. Patient compliance and data errors were identified and compared between P&P and ePRO. A generalized estimating equations model and a 2-piecewise model were used to describe trajectories of error incidence over time and to identify the risk factors. Results Among 678 patients, 629 had at least 2 PRO assessments; of these, 440 completed 3347 P&P assessments and 189 completed 1291 ePRO assessments. In total, 49.4% of patients had at least one error, including (1) missing items (64.69%, 1070/1654), (2) modifications without signatures (27.99%, 463/1654), (3) selection of multiple options (3.02%, 50/1654), (4) missing patient signatures (2.54%, 42/1654), (5) missing researcher signatures (1.45%, 24/1654), and (6) missing completion dates (0.30%, 5/1654). Patients who completed ePRO had fewer errors than those who completed P&P assessments (ePRO: 30.2% [57/189] vs. P&P: 57.7% [254/440]; P<.001). Compared with ePRO patients, those using P&P were older, less educated, and sicker.
Common risk factors for having errors were a lower education level (P&P: odds ratio [OR] 1.39, 95% CI 1.20-1.62; P<.001; ePRO: OR 1.82, 95% CI 1.22-2.72; P=.003), being treated in a provincial hospital (P&P: OR 3.34, 95% CI 2.10-5.33; P<.001; ePRO: OR 4.73, 95% CI 2.18-10.25; P<.001), and having severe disease (P&P: OR 1.63, 95% CI 1.33-1.99; P<.001; ePRO: OR 2.70, 95% CI 1.53-4.75; P<.001). Errors peaked on postoperative day (POD) 1 for P&P, and on POD 2 for ePRO. Conclusions It is possible to improve data quality of longitudinally collected PRO through ePRO, compared with P&P. However, ePRO-related sampling bias needs to be considered when designing clinical research using longitudinal PROs as major outcomes.
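The error breakdown reported in this abstract can be recomputed from the counts it gives. The short sketch below does only that arithmetic; the variable names are illustrative and nothing here comes from the authors' analysis code.

```python
# Error counts by type, as reported in the abstract.
error_counts = {
    "missing items": 1070,
    "modifications without signatures": 463,
    "selection of multiple options": 50,
    "missing patient signatures": 42,
    "missing researcher signatures": 24,
    "missing completion dates": 5,
}

total_errors = sum(error_counts.values())

# Share of each error type among all errors, in percent.
shares = {name: round(100 * n / total_errors, 2)
          for name, n in error_counts.items()}

# Patients with at least one error, by assessment mode (counts from the abstract).
epro_rate = round(100 * 57 / 189, 1)
pp_rate = round(100 * 254 / 440, 1)
overall_rate = round(100 * (57 + 254) / (189 + 440), 1)
```

The six counts sum to the 1654 errors used as the denominator, and the per-mode counts reproduce the 49.4% overall figure.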
Affiliation(s)
- Hongfan Yu
- School of Public Health and Management, Chongqing Medical University, Chongqing, China
- Qingsong Yu
- School of Public Health and Management, Chongqing Medical University, Chongqing, China
- Yuxian Nie
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Wei Xu
- School of Public Health and Management, Chongqing Medical University, Chongqing, China
- Yang Pu
- School of Public Health and Management, Chongqing Medical University, Chongqing, China
- Wei Dai
- Department of Thoracic Surgery, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Xing Wei
- Department of Thoracic Surgery, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Qiuling Shi
- School of Public Health and Management, Chongqing Medical University, Chongqing, China
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Department of Thoracic Surgery, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
8
Data quality methods through remote source data verification auditing: results from the Congenital Cardiac Research Collaborative. Cardiol Young 2021; 31:1829-1834. [PMID: 33726868 DOI: 10.1017/s1047951121000974] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
Abstract
BACKGROUND Multicentre research databases can provide insights into healthcare processes to improve outcomes and make practice recommendations for novel approaches. Effective audits can establish a framework for reporting research efforts, ensuring accurate reporting, and spearheading quality improvement. Although a variety of data auditing models and standards exist, barriers to effective auditing including costs, regulatory requirements, travel, and design complexity must be considered. MATERIALS AND METHODS The Congenital Cardiac Research Collaborative (CCRC) conducted a virtual data training initiative and remote source data verification audit on a retrospective multicentre dataset. CCRC investigators across nine institutions were trained to extract and enter data into a robust dataset on patients with tetralogy of Fallot who required neonatal intervention. Centres provided de-identified source files for a randomised 10% patient sample audit. Key auditing variables, discrepancy types, and severity levels were analysed across two study groups, primary repair and staged repair. RESULTS Of the total 572 study patients, data from 58 patients (31 staged repairs and 27 primary repairs) were source data verified. Amongst the 1790 variables audited, 45 discrepancies were discovered, resulting in an overall accuracy rate of 97.5%. Accuracy rates were consistently high across all CCRC institutions, ranging from 94.6% to 99.4%, and were reported for both the minor (1.5%) and major (1.1%) discrepancy classifications. CONCLUSION Findings indicate that implementing a virtual multicentre training initiative and remote source data verification audit can identify data quality concerns and produce a reliable, high-quality dataset. Remote auditing capacity is especially important during the current COVID-19 pandemic.
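The headline audit figures follow directly from the counts in this abstract. The sketch below recomputes them; the variable names are illustrative only.

```python
# Counts reported in the abstract.
audited_variables = 1790
discrepancies = 45
study_patients = 572
audited_patients = 31 + 27  # staged repairs + primary repairs

# Overall accuracy: share of audited variables with no discrepancy, in percent.
accuracy_pct = round(100 * (1 - discrepancies / audited_variables), 1)

# Share of study patients whose data were source data verified, in percent.
sample_pct = round(100 * audited_patients / study_patients, 1)
```

The accuracy works out to the reported 97.5%, and the audited sample is close to the planned 10% of patients.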
9
Gillespie BW, Laurin LP, Zinsser D, Lafayette R, Marasa M, Wenderfer SE, Vento S, Poulton C, Barisoni L, Zee J, Helmuth M, Lugani F, Kamel M, Hill-Callahan P, Hewitt SM, Mariani LH, Smoyer WE, Greenbaum LA, Gipson DS, Robinson BM, Gharavi AG, Guay-Woodford LM, Trachtman H. Improving data quality in observational research studies: Report of the Cure Glomerulonephropathy (CureGN) network. Contemp Clin Trials Commun 2021; 22:100749. [PMID: 33851061 PMCID: PMC8039553 DOI: 10.1016/j.conctc.2021.100749] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2020] [Revised: 01/16/2021] [Accepted: 02/09/2021] [Indexed: 12/21/2022]
Abstract
Background High data quality is of crucial importance to the integrity of research projects. In the conduct of multi-center observational cohort studies with increasing types and quantities of data, maintaining data quality is challenging, with few published guidelines. Methods The Cure Glomerulonephropathy (CureGN) Network has established numerous quality control procedures to manage the 70 participating sites in the United States, Canada, and Europe. This effort is supported and guided by the activities of several committees, including Data Quality, Recruitment and Retention, and Central Review, that work in tandem with the Data Coordinating Center to monitor the study. We have implemented coordinator training and feedback channels, data queries of questionable or missing data, and developed performance metrics for recruitment, retention, visit completion, data entry, recording of patient-reported outcomes, collection, shipping and accessing of biological samples and pathology materials, and processing, cataloging and accessing genetic data and materials. Results We describe the development of data queries and site Report Cards, and their use in monitoring and encouraging excellence in site performance. We demonstrate improvements in data quality and completeness over 4 years after implementing these activities. We describe quality initiatives addressing specific challenges in collecting and cataloging whole slide images and other kidney pathology data, and novel methods of data quality assessment. Conclusions This paper reports the CureGN experience in optimizing data quality and underscores the importance of general and study-specific data quality initiatives to maintain excellence in the research measures of a multi-center observational study.
Affiliation(s)
- Brenda W Gillespie
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Louis-Philippe Laurin
- Division of Nephrology, Maisonneuve-Rosemont Hospital, Department of Medicine, University of Montreal, Montreal, Quebec, Canada
- Dawn Zinsser
- Arbor Research Collaborative for Health, Ann Arbor, MI, 48104, USA
- Maddalena Marasa
- Department of Medicine, Division of Nephrology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
- Suzanne Vento
- NYU Langone Health, Department of Pediatrics, Division of Nephrology, New York, NY, USA
- Caroline Poulton
- Kidney Center, Division of Nephrology and Hypertension, Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Laura Barisoni
- Department of Pathology, Division of AI and Computational Pathology, Department of Medicine, Division of Nephrology, Duke University, Durham, NC, USA
- Jarcy Zee
- Arbor Research Collaborative for Health, Ann Arbor, MI, 48104, USA
- Margaret Helmuth
- Arbor Research Collaborative for Health, Ann Arbor, MI, 48104, USA
- Francesca Lugani
- Laboratory of Molecular Nephrology, Istituto Giannina Gaslini, IRCCS, Genoa, Italy
- Margret Kamel
- Emory University, Department of Pediatrics, Division of Nephrology, Atlanta, GA, USA
- Stephen M Hewitt
- Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Laura H Mariani
- University of Michigan, Division of Nephrology, Ann Arbor, MI, USA
- William E Smoyer
- Center for Clinical and Translational Research, the Research Institute at Nationwide Children's Hospital, The Ohio State University, Columbus, OH, USA
- Larry A Greenbaum
- Emory University and Children's Healthcare of Atlanta, Atlanta, GA, USA
- Debbie S Gipson
- University of Michigan, Division of Nephrology, Department of Pediatrics, Ann Arbor, MI, USA
- Ali G Gharavi
- Department of Medicine, Division of Nephrology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
- Lisa M Guay-Woodford
- Center for Translational Research, Children's National Hospital, Washington, DC, USA
- Howard Trachtman
- NYU Langone Health, Department of Pediatrics, Division of Nephrology, New York, NY, USA
10
Garces A, MacGuire E, Franklin HL, Alfaro N, Arroyo G, Figueroa L, Goudar SS, Saleem S, Esamai F, Patel A, Chomba E, Tshefu A, Haque R, Patterson JK, Liechty EA, Derman RJ, Carlo WA, Petri W, Koso-Thomas M, McClure EM, Goldenberg RL, Hibberd P, Krebs NF. Looking beyond the numbers: quality assurance procedures in the Global Network for Women's and Children's Health Research Maternal Newborn Health Registry. Reprod Health 2020; 17:159. [PMID: 33256778 PMCID: PMC7708152 DOI: 10.1186/s12978-020-01009-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/20/2020] [Accepted: 10/05/2020] [Indexed: 02/08/2023]
Abstract
Background Quality assurance (QA) is a process that should be an integral part of research to protect the rights and safety of study participants and to reduce the likelihood that the results are affected by bias in data collection. Most QA plans include processes related to study preparation and regulatory compliance, data collection, data analysis and publication of study results. However, little detailed information is available on the specific procedures associated with QA processes to ensure high-quality data in multi-site studies. Methods The Global Network for Women’s and Children’s Health Maternal Newborn Health Registry (MNHR) is a prospective population-based registry of pregnancies and deliveries that is carried out in 8 international sites. Since its inception, QA procedures have been utilized to ensure the quality of the data. More recently, a training and certification process was developed to ensure that standardized, scientifically accurate clinical definitions are used consistently across sites. Staff complete a web-based training module that reviews the MNHR study protocol, study forms and clinical definitions developed by MNHR investigators and are certified through a multiple choice examination prior to initiating study activities and every six months thereafter. A standardized procedure for supervision and evaluation of field staff is carried out to ensure that research activities are conducted according to the protocol across all the MNHR sites. Conclusions We developed standardized QA processes for training, certification and supervision of the MNHR, a multisite research registry. It is expected that these activities, together with ongoing QA processes, will help to further optimize data quality for this protocol.
Affiliation(s)
- Ana Garces
- Instituto de Nutrición de Centroamérica y Panamá, Guatemala, Guatemala
- Norma Alfaro
- Instituto de Nutrición de Centroamérica y Panamá, Guatemala, Guatemala
- Gustavo Arroyo
- Instituto de Nutrición de Centroamérica y Panamá, Guatemala, Guatemala
- Lester Figueroa
- Instituto de Nutrición de Centroamérica y Panamá, Guatemala, Guatemala
- Shivaprasad S Goudar
- KLE Academy Higher Education and Research, J N Medical College, Belagavi, Karnataka, India
- Antoinette Tshefu
- University of Kinshasa School of Public Health, Kinshasa, Democratic Republic of the Congo
- Rashidul Haque
- International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Dhaka, Bangladesh
- Edward A Liechty
- Indiana School of Medicine, University of Indiana, Indianapolis, IN, USA
- Robert L Goldenberg
- Department of Obstetrics and Gynecology, Columbia University School of Medicine, New York, NY, USA
- Nancy F Krebs
- University of Colorado School of Medicine, Denver, CO, USA