1
Ravikanth M, Korra S, Mamidisetti G, Goutham M, Bhaskar T. An efficient learning based approach for automatic record deduplication with benchmark datasets. Sci Rep 2024; 14:16254. [PMID: 39009682 PMCID: PMC11251143 DOI: 10.1038/s41598-024-63242-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 05/27/2024] [Indexed: 07/17/2024] Open
Abstract
With technological innovations, enterprises manage every iota of data because it can be mined to derive business intelligence (BI). However, when data comes from multiple sources, duplicate records may result. Because data is given paramount importance, it is also important to eliminate duplicate entities for the sake of data integration, performance, and resource optimization. For building reliable record deduplication systems, deep learning has lately offered promising learning-based approaches. Deep ER is one of the deep learning-based methods used recently for eliminating duplicates in structured data. Using it as a reference model, in this paper we propose a framework known as Enhanced Deep Learning-based Record Deduplication (EDL-RD) to improve performance further. To this end, we exploited a variant of Long Short Term Memory (LSTM) along with various attribute compositions, similarity metrics, and numerical and null value resolution. We proposed an algorithm known as Efficient Learning based Record Deduplication (ELbRD), which extends the reference model with the aforementioned enhancements. An empirical study revealed that the proposed framework with extensions outperforms existing methods.
Affiliation(s)
- M Ravikanth
  - Department of CSE, Malla Reddy University, Maisammaguda, Kompally, Hyderabad, India
- Sampath Korra
  - Department of CSE, Sri Indu College of Engineering and Technology (A), Sheriguda, Ibrahimpatnam, Hyderabad, T.S, 501510, India
- Gowtham Mamidisetti
  - Department of CSE, Malla Reddy University, Maisammaguda, Kompally, Hyderabad, 500100, India
- Maganti Goutham
  - Department of CSE, Malla Reddy University, Maisammaguda, Kompally, Hyderabad, 500100, India
- T Bhaskar
  - Department of CSE, CMR College of Engineering and Technology, Kandlakoya, Medchal, Hyderabad, TS, 50140, India
2
Silva AF, Dourado I, Lua I, Jesus GS, Guimarães NS, Morais GAS, Anderle RVR, Pescarini JM, Machado DB, Santos CAST, Ichihara MY, Barreto ML, Magno L, Souza LE, Macinko J, Rasella D. Income determines the impact of cash transfers on HIV/AIDS: cohort study of 22.7 million Brazilians. Nat Commun 2024; 15:1307. [PMID: 38346964 PMCID: PMC10861499 DOI: 10.1038/s41467-024-44975-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 01/10/2024] [Indexed: 02/15/2024] Open
Abstract
Living with extremely low income is an important risk factor for HIV/AIDS and can be mitigated by conditional cash transfers. Using a cohort of 22.7 million low-income individuals followed over 9 years, we evaluated the effects of the world's largest conditional cash transfer, the Programa Bolsa Família, on HIV/AIDS-related outcomes. Exposure to Programa Bolsa Família was associated with reductions in AIDS incidence of 41% (RR: 0.59; 95% CI: 0.57-0.61), in mortality of 39% (RR: 0.61; 95% CI: 0.57-0.64), and in case fatality rates of 25% (RR: 0.75; 95% CI: 0.66-0.85) in the cohort. Programa Bolsa Família effects were considerably stronger among individuals of extremely low income [reductions of 55% for incidence (RR: 0.45; 95% CI: 0.42-0.47), 54% for mortality (RR: 0.46; 95% CI: 0.42-0.49), and 37% for case fatality (RR: 0.63; 95% CI: 0.51-0.76)], decreasing gradually until having no effect in individuals with higher incomes. Similar effects were observed on HIV notification. Programa Bolsa Família impact was also stronger among women and adolescents. Several sensitivity and triangulation analyses demonstrated the robustness of the results. Conditional cash transfers can significantly reduce AIDS morbidity and mortality in extremely vulnerable populations and should be considered an essential intervention to achieve AIDS-related sustainable development goals by 2030.
Affiliation(s)
- Andréa F Silva
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Inês Dourado
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Iracema Lua
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Gabriela S Jesus
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Faculty of Medicine, Federal University of Bahia (UFBA), Salvador, Brazil
- Nathalia S Guimarães
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Gabriel A S Morais
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Rodrigo V R Anderle
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Julia M Pescarini
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Daiane B Machado
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
  - Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA, USA
- Carlos A S T Santos
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Maria Y Ichihara
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Mauricio L Barreto
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Laio Magno
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Department of Life Sciences, State University of Bahia (UNEB), Salvador, Brazil
- Luis E Souza
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- James Macinko
  - Departments of Health Policy and Management and Community Health Sciences, UCLA Fielding School of Public Health, Los Angeles, CA, USA
- Davide Rasella
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
  - Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
  - ISGlobal, Hospital Clinic - Universitat de Barcelona, Barcelona, Spain
3
Guimarães JMN, Pescarini JM, de Sousa Filho JF, Ferreira A, de Almeida MDCC, Gabrielli L, dos-Santos-Silva I, Santos G, Barreto ML, Aquino EML. Income Segregation, Conditional Cash Transfers, and Breast Cancer Mortality Among Women in Brazil. JAMA Netw Open 2024; 7:e2353100. [PMID: 38270952 PMCID: PMC10811554 DOI: 10.1001/jamanetworkopen.2023.53100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 12/04/2023] [Indexed: 01/26/2024] Open
Abstract
Importance: Women living in income-segregated areas are less likely to receive adequate breast cancer care and access community resources, which may heighten breast cancer mortality risk.
Objective: To investigate the association between income segregation and breast cancer mortality and whether this association is attenuated by receipt of the Bolsa Família program (BFP), the world's largest conditional cash-transfer program.
Design, Setting, and Participants: This cohort study was conducted using data from the 100 Million Brazilian Cohort, which were linked with nationwide mortality registries (2004-2015). Data were analyzed from December 2021 to June 2023. Study participants were women aged 18 to 100 years.
Exposure: Women's income segregation (high, medium, or low) at the municipality level was obtained using income data from the 2010 Brazilian census and assessed using dissimilarity index values in tertiles (low [0.01-0.25], medium [0.26-0.32], and high [0.33-0.73]).
Main Outcomes and Measures: The main outcome was breast cancer mortality. Mortality rate ratios (MRRs) for the association of segregation with breast cancer deaths were estimated using Poisson regression adjusted for age, race, education, municipality area size, population density, area of residence (rural or urban), and year of enrollment. Multiplicative interactions of segregation and BFP receipt (yes or no) in the association with mortality (2004-2015) were assessed.
Results: Data on 21 680 930 women (mean [SD] age, 36.1 [15.3] years) were analyzed. Breast cancer mortality was greater among women living in municipalities with high (adjusted MRR [aMRR], 1.18; 95% CI, 1.13-1.24) and medium (aMRR, 1.08; 95% CI, 1.03-1.12) compared with low segregation. Women who did not receive BFP had higher breast cancer mortality than BFP recipients (aMRR, 1.17; 95% CI, 1.12-1.22). By BFP strata, women who did not receive BFP and lived in municipalities with high income segregation had a 24% greater risk of death from breast cancer compared with those living in municipalities with low income segregation (aMRR, 1.24; 95% CI, 1.14-1.34); women who received BFP and were living in areas with high income segregation had a 13% higher risk of death from breast cancer compared with those living in municipalities with low income segregation (aMRR, 1.13; 95% CI, 1.07-1.19; P for interaction = .008). Stratified by the amount of time receiving the benefit, segregation (high vs low) was associated with an increase in mortality risk for women receiving BFP for less time but not for those receiving it for more time (<4 years: aMRR, 1.16; 95% CI, 1.07-1.27; 4-11 years: aMRR, 1.09; 95% CI, 1.00-1.17; P for interaction <.001).
Conclusions and Relevance: These findings suggest that place-based inequities in breast cancer mortality associated with income segregation may be mitigated with BFP receipt, possibly via improved income and access to preventive cancer care services among women, which may be associated with early detection and treatment and ultimately reduced mortality.
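As a rough illustration of the exposure measure named in this abstract, the sketch below computes a two-group dissimilarity index using its standard textbook form, D = 0.5 * sum(|a_i/A - b_i/B|), over the sub-areas of one municipality. The abstract does not say how sub-areas or income groups were defined, so the inputs here are hypothetical counts, and the helper names are assumptions. Only the tertile cut points are taken from the abstract.

```python
from typing import Sequence


def dissimilarity_index(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standard two-group dissimilarity index over matching sub-areas.

    group_a[i] and group_b[i] are the counts of each group in sub-area i
    (hypothetical inputs; the study's actual units are not given in the abstract).
    """
    if len(group_a) != len(group_b):
        raise ValueError("both groups need one count per sub-area")
    total_a, total_b = sum(group_a), sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs at least one member")
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b))


def segregation_tertile(d: float) -> str:
    """Map an index value to the tertile labels reported in the abstract."""
    d = round(d, 2)  # the abstract reports cut points to two decimals
    if d <= 0.25:
        return "low"
    if d <= 0.32:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Two sub-areas; low-income households concentrated in the first one.
    d = dissimilarity_index([75, 25], [25, 75])
    print(d, segregation_tertile(d))
```

A value of 0 means both groups are spread identically across sub-areas, and 1 means they never share a sub-area.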
Affiliation(s)
- Julia M. Pescarini
  - Center for Data and Knowledge Integration for Health, Fiocruz, Salvador, Brazil
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Andrea Ferreira
  - Center for Data and Knowledge Integration for Health, Fiocruz, Salvador, Brazil
  - Ubuntu Center on Racism, Global Movements and Population Health Equity, Drexel University Dornsife School of Public Health, Philadelphia, Pennsylvania
- Ligia Gabrielli
  - Secretaria de Saúde do Estado da Bahia, Centro de Diabetes e Endocrinologia da Bahia, Salvador, Brazil
- Isabel dos-Santos-Silva
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Gervasio Santos
  - Center for Data and Knowledge Integration for Health, Fiocruz, Salvador, Brazil
- Mauricio L. Barreto
  - Center for Data and Knowledge Integration for Health, Fiocruz, Salvador, Brazil
  - Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
- Estela M. L. Aquino
  - Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
4
Pinto PFPS, Teixeira CSS, Ichihara MY, Rasella D, Nery JS, Sena SOL, Brickley EB, Barreto ML, Sanchez MN, Pescarini JM. Incidence and risk factors of tuberculosis among 420 854 household contacts of patients with tuberculosis in the 100 Million Brazilian Cohort (2004-18): a cohort study. The Lancet Infectious Diseases 2024; 24:46-56. [PMID: 37591301 PMCID: PMC10733584 DOI: 10.1016/s1473-3099(23)00371-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/23/2023] [Revised: 05/26/2023] [Accepted: 06/05/2023] [Indexed: 08/19/2023]
Abstract
BACKGROUND: Although household contacts of patients with tuberculosis are known to be particularly vulnerable to tuberculosis, the published evidence focused on this group at high risk within the low-income and middle-income country context remains sparse. Using nationwide data from Brazil, we aimed to estimate the incidence and investigate the socioeconomic and clinical determinants of tuberculosis in a cohort of contacts of tuberculosis patients.
METHODS: In this cohort study, we linked individual socioeconomic and demographic data from the 100 Million Brazilian Cohort to mortality data and tuberculosis registries, identified contacts of tuberculosis index patients diagnosed from Jan 1, 2004 to Dec 31, 2018, and followed up the contacts until the contact's subsequent tuberculosis diagnosis, the contact's death, or Dec 31, 2018. We investigated factors associated with active tuberculosis using multilevel Poisson regressions, allowing for municipality-level and household-level random effects.
FINDINGS: We studied 420 854 household contacts of 137 131 tuberculosis index patients. During the 15 years of follow-up (median 4·4 years [IQR 1·9-7·6]), we detected 8953 contacts with tuberculosis. The tuberculosis incidence among contacts was 427·8 per 100 000 person-years at risk (95% CI 419·1-436·8), 16 times higher than the incidence in the general population (26·2 [26·1-26·3]), and the risk was prolonged. Tuberculosis incidence was associated with the index patient being preschool aged (<5 years; adjusted risk ratio 4·15 [95% CI 3·26-5·28]) or having pulmonary tuberculosis (2·84 [2·55-3·17]).
INTERPRETATION: The high and sustained risk of tuberculosis among contacts reinforces the need to systematically expand and strengthen contact tracing and preventive treatment policies in Brazil in order to achieve national and international targets for tuberculosis elimination.
FUNDING: Wellcome Trust and Brazilian Ministry of Health.
Affiliation(s)
- Priscila F P S Pinto
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
- Camila S S Teixeira
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
- Maria Yury Ichihara
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
- Davide Rasella
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
  - Institute of Global Health (ISGlobal), Hospital Clínic-Universitat de Barcelona, Barcelona, Spain
- Joilda S Nery
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
  - Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
- Samila O L Sena
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
- Elizabeth B Brickley
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
- Maurício L Barreto
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
  - Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
- Mauro N Sanchez
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
  - Núcleo de Medicina Tropical, Universidade de Brasília (UnB), Brasília, Brazil
- Julia M Pescarini
  - Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
5
Christen V, Häntschel T, Christen P, Rahm E. Privacy-preserving record linkage using autoencoders. International Journal of Data Science and Analytics 2022. [DOI: 10.1007/s41060-022-00377-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
Privacy-preserving record linkage (PPRL) is the process aimed at identifying records that represent the same real-world entity across different data sources while guaranteeing the privacy of sensitive information about these entities. A popular PPRL method is to encode sensitive plain-text data into Bloom filters (BFs), bit vectors that enable the efficient calculation of similarities between records that is required for PPRL. However, BF encoding cannot completely prevent the re-identification of plain-text values because sets of BFs can contain bit patterns that can be mapped to plain-text values using cryptanalysis attacks. Various hardening techniques have therefore been proposed that modify the bit patterns in BFs with the aim of preventing such attacks. However, it has been shown that even hardened BFs can still be vulnerable to attacks. To avoid any such attacks, we propose a novel encoding technique for PPRL based on autoencoders that transforms BFs into vectors of real numbers. To achieve a high comparison quality of the generated numerical vectors, we propose a method that guarantees the comparability of encodings generated by the different data owners. Experiments on real-world data sets show that our technique achieves high linkage quality and prevents known cryptanalysis attacks on BF encoding.
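To make the baseline this abstract starts from more concrete, here is a minimal sketch of Bloom filter encoding for record comparison: each value is split into q-grams, every q-gram is hashed to a few bit positions, and two encodings are compared with the Dice coefficient. This illustrates only the plain BF step, not the autoencoder technique the paper proposes. The q-gram length, filter size, number of hash functions, and the seeded SHA-256 hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


def qgrams(value: str, q: int = 2) -> set[str]:
    """Split a padded, lower-cased string into overlapping q-grams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(value: str, size: int = 1024, num_hashes: int = 2) -> set[int]:
    """Return the positions of the bits set to 1 in the value's Bloom filter."""
    positions = set()
    for gram in qgrams(value):
        for seed in range(num_hashes):
            # Seeded SHA-256 stands in for a family of independent hash functions.
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).hexdigest()
            positions.add(int(digest, 16) % size)
    return positions


def dice_similarity(a: set[int], b: set[int]) -> float:
    """Dice coefficient on the sets of 1-bit positions of two Bloom filters."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    smith = bloom_encode("Smith")
    print(dice_similarity(smith, bloom_encode("Smyth")))  # similar spelling
    print(dice_similarity(smith, bloom_encode("Jones")))  # unrelated name
```

Because similar strings share most q-grams, they share most set bits, which is what lets parties compare encodings without exchanging plain text. The same regularity is what the cryptanalysis attacks mentioned in the abstract exploit.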
6
Araujo JD, Santos-e-Silva JC, Costa-Martins AG, Sampaio V, de Castro DB, de Souza RF, Giddaluru J, Ramos PIP, Pita R, Barreto ML, Barral-Netto M, Nakaya HI. Tucuxi-BLAST: Enabling fast and accurate record linkage of large-scale health-related administrative databases through a DNA-encoded approach. PeerJ 2022; 10:e13507. [PMID: 35846888 PMCID: PMC9281601 DOI: 10.7717/peerj.13507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Accepted: 05/06/2022] [Indexed: 01/17/2023] Open
Abstract
Background: Public health research frequently requires the integration of information from different data sources. However, errors in the records and the high computational costs involved make linking large administrative databases using record linkage (RL) methodologies a major challenge.
Methods: We present Tucuxi-BLAST, a versatile tool for probabilistic RL that utilizes a DNA-encoded approach to encrypt, analyze and link massive administrative databases. Tucuxi-BLAST encodes the identification records into DNA. The BLASTn algorithm is then used to align the sequences between databases. We tested and benchmarked on a simulated database containing records for 300 million individuals and also on four large administrative databases containing real data on Brazilian patients.
Results: Our method was able to overcome misspellings and typographical errors in administrative databases. In processing the RL of the largest simulated dataset (200k records), the state-of-the-art method took 5 days and 7 h to perform the RL, while Tucuxi-BLAST took only 23 h. When compared with five existing RL tools applied to a gold-standard dataset from real health-related databases, Tucuxi-BLAST had the highest accuracy and speed. By repurposing genomic tools, Tucuxi-BLAST can improve data-driven medical research and provide a fast and accurate way to link individual information across several administrative databases.
Affiliation(s)
- José Deney Araujo
  - Department of Clinical and Toxicological Analyses, Universidade de São Paulo, São Paulo, SP, Brazil
- André Guilherme Costa-Martins
  - Department of Clinical and Toxicological Analyses, Universidade de São Paulo, São Paulo, SP, Brazil
  - Scientific Platform Pasteur USP, São Paulo, SP, Brazil
- Vanderson Sampaio
  - Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Manaus, Brazil
  - Instituto Todos pela Saúde, São Paulo, SP, Brazil
- Robson F. de Souza
  - Departamento de Microbiologia, Universidade de São Paulo, São Paulo, Brazil
- Jeevan Giddaluru
  - Department of Clinical and Toxicological Analyses, Universidade de São Paulo, São Paulo, SP, Brazil
- Helder I. Nakaya
  - Department of Clinical and Toxicological Analyses, Universidade de São Paulo, São Paulo, SP, Brazil
  - Scientific Platform Pasteur USP, São Paulo, SP, Brazil
  - Instituto Todos pela Saúde, São Paulo, SP, Brazil
  - Hospital Israelita Albert Einstein, São Paulo, SP, Brazil
7
Canali S, Leonelli S. Reframing the environment in data-intensive health sciences. Studies in History and Philosophy of Science 2022; 93:203-214. [PMID: 35576883 DOI: 10.1016/j.shpsa.2022.04.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/14/2021] [Revised: 02/25/2022] [Accepted: 04/20/2022] [Indexed: 06/15/2023]
Abstract
In this paper, we analyse the relation between the use of environmental data in contemporary health sciences and related conceptualisations and operationalisations of the notion of environment. We consider three case studies that exemplify a different selection of environmental data and mode of data integration in data-intensive epidemiology. We argue that the diversification of data sources, their increase in scale and scope, and the application of novel analytic tools have brought about three significant conceptual shifts. First, we discuss the EXPOsOMICS project, an attempt to integrate genomic and environmental data which suggests a reframing of the boundaries between external and internal environments. Second, we explore the MEDMI platform, whose efforts to combine health, environmental and climate data instantiate a reframing and expansion of environmental exposure. Third, we illustrate how extracting epidemiological insights from extensive social data collected by the CIDACS institute yields innovative attributions of causal power to environmental factors. Identifying these shifts highlights the benefits and opportunities of new environmental data, as well as the challenges that such tools bring to understanding and fostering health. It also emphasises the constraints that data selection and accessibility pose to scientific imagination, including how researchers frame key concepts in health-related research.
Affiliation(s)
- Stefano Canali
  - Department of Electronics, Information and Bioengineering and META - Social Sciences and Humanities for Science and Technology, Politecnico di Milano, Milan, Italy
- Sabina Leonelli
  - Department of Sociology, Philosophy and Anthropology and Exeter Centre for the Study of the Life Sciences (Egenis), University of Exeter, Exeter, UK
8
FIRLA: A Fast Incremental Record Linkage Algorithm. J Biomed Inform 2022; 130:104094. [PMID: 35550929 DOI: 10.1016/j.jbi.2022.104094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 05/02/2022] [Accepted: 05/04/2022] [Indexed: 11/23/2022]
Abstract
Record linkage is an important problem studied widely in many domains including biomedical informatics. A standard version of this problem is to cluster records from several datasets, such that each cluster has records pertinent to just one individual. Typically, datasets are huge in size. Hence, existing record linkage algorithms take a very long time. It is thus essential to develop novel fast algorithms for record linkage. The incremental version of this problem is to link previously clustered records with new records added to the input datasets. A novel algorithm has been created to efficiently perform standard and incremental record linkage. This algorithm leverages a set of efficient techniques that significantly restrict the number of record pair comparisons and distance computations. Our algorithm shows an average speed-up of 2.4x (up to 4x) for the standard linkage problem as compared to the state-of-the-art, without any drop in linkage performance at all. On average, our algorithm can incrementally link records in just 33% of the time required for linking them from scratch. Our algorithms achieve comparable or superior linkage performance and outperform the state-of-the-art in terms of linking time in all cases where the number of comparison attributes is greater than two. In practice, more than two comparison attributes are quite common. The proposed algorithm is very efficient and could be used in practice for record linkage applications especially when records are being added over time and linkage output needs to be updated frequently.
9
Barreto ML, Ichihara MY, Pescarini JM, Ali MS, Borges GL, Fiaccone RL, Ribeiro-Silva RDC, Teles CA, Almeida D, Sena S, Carreiro RP, Cabral L, Almeida BA, Barbosa GCG, Pita R, Barreto ME, Mendes AAF, Ramos DO, Brickley EB, Bispo N, Machado DB, Paixao ES, Rodrigues LC, Smeeth L. Cohort Profile: The 100 Million Brazilian Cohort. Int J Epidemiol 2022; 51:e27-e38. [PMID: 34922344 PMCID: PMC9082797 DOI: 10.1093/ije/dyab213] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 04/07/2021] [Accepted: 09/17/2021] [Indexed: 11/16/2022] Open
Affiliation(s)
- Mauricio L Barreto
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Maria Yury Ichihara
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Julia M Pescarini
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
- M Sanni Ali
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
  - Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Center for Statistics in Medicine, University of Oxford, Oxford, UK
- Gabriela L Borges
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Rosemeire L Fiaccone
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Department of Statistics, Federal University of Bahia, Salvador, Brazil
- Rita de Cássia Ribeiro-Silva
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Department of Nutrition, Federal University of Bahia, Salvador, Brazil
- Carlos A Teles
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Daniela Almeida
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Samila Sena
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Roberto P Carreiro
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Liliana Cabral
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Bethania A Almeida
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- George C G Barbosa
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Robespierre Pita
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Marcos E Barreto
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Department of Statistics, London School of Economics and Political Science, London, UK
- Andre A F Mendes
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Dandara O Ramos
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Elizabeth B Brickley
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
- Nivea Bispo
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Department of Statistics, Federal University of Bahia, Salvador, Brazil
- Daiane B Machado
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
- Enny S Paixao
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
- Laura C Rodrigues
  - Centre for Data and Knowledge Integration for Health (CIDACS), Fundação Oswaldo Cruz, Salvador, Brazil
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
- Liam Smeeth
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
10
Vaiwsri S, Ranbaduge T, Christen P, Schnell R. Accurate privacy-preserving record linkage for databases with missing values. Inform Syst 2022. [DOI: 10.1016/j.is.2021.101959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
11
Machado DB, Williamson E, Pescarini JM, Alves FJO, Castro-de-Araujo LFS, Ichihara MY, Rodrigues LC, Araya R, Patel V, Barreto ML. Relationship between the Bolsa Família national cash transfer programme and suicide incidence in Brazil: A quasi-experimental study. PLoS Med 2022; 19:e1004000. [PMID: 35584178 PMCID: PMC9162363 DOI: 10.1371/journal.pmed.1004000] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/13/2021] [Revised: 06/02/2022] [Accepted: 04/26/2022] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Socioeconomic factors have been consistently associated with suicide, and economic recessions are linked to rising suicide rates. However, evidence on the impact of socioeconomic interventions to reduce suicide rates is limited. This study investigates the association of the world's largest conditional cash transfer programme with suicide rates in a cohort of half of the Brazilian population. METHODS AND FINDINGS We used data from the 100 Million Brazilian Cohort, covering a 12-year period (2004 to 2015). It comprises socioeconomic and demographic information on 114,008,317 individuals, linked to the "Bolsa Família" programme (BFP) payroll database, and nationwide death registration data. BFP was implemented by the Brazilian government in 2004. We estimated the association of BFP using inverse probability of treatment weighting, estimating the weights for BFP beneficiaries (weight = 1) and nonbeneficiaries by the inverse probability of receiving treatment (weight = E(ps)/(1-E(ps))). We used an average treatment effect on the treated (ATT) estimator and fitted Poisson models to estimate the incidence rate ratios (IRRs) for suicide associated with BFP experience. At the cohort baseline, BFP beneficiaries were younger (median age 27.4 versus 35.4), had higher unemployment rates (56% versus 32%), a lower level of education, resided in rural areas, and experienced worse household conditions. There were 36,742 suicide cases among the 76,532,158 individuals aged 10 years, or older, followed for 489,500,000 person-years at risk. Suicide rates among beneficiaries and nonbeneficiaries were 5.4 (95% CI = 5.32, 5.47, p < 0.001) and 10.7 (95% CI = 10.51, 10.87, p < 0.001) per 100,000 individuals, respectively. BFP beneficiaries had a lower suicide rate than nonbeneficiaries (IRR = 0.44, 95% CI = 0.42, 0.45, p < 0.001). 
This association was stronger among women (IRR = 0.36, 95% CI = 0.33, 0.38, p < 0.001), and individuals aged between 25 and 59 (IRR = 0.41, 95% CI = 0.40, 0.43, p < 0.001). Study limitations include a lack of control for previous mental disorders and access to means of suicide, and the possible under-registration of suicide cases due to stigma. CONCLUSIONS We observed that BFP was associated with lower suicide rates, with similar results in all sensitivity analyses. These findings should help to inform policymakers and health authorities to better design suicide prevention strategies. Targeting social determinants using cash transfer programmes could be important in limiting suicide, which is predicted to rise with the economic recession, consequent to the Coronavirus Disease 2019 (COVID-19) pandemic.
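The weighting scheme described in this abstract can be sketched in a few lines of Python. This is only a minimal illustration, assuming the propensity scores have already been estimated; the function names and the toy inputs are hypothetical, and the crude weighted rate ratio stands in for the fitted Poisson models that the study actually reports.

```python
def att_weights(treated, propensity):
    """ATT weights: 1 for beneficiaries, ps / (1 - ps) for nonbeneficiaries."""
    weights = []
    for is_treated, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if is_treated else ps / (1.0 - ps))
    return weights


def weighted_rate(events, person_years, weights):
    """Weighted events divided by weighted person-years at risk."""
    weighted_events = sum(w * e for w, e in zip(weights, events))
    weighted_time = sum(w * t for w, t in zip(weights, person_years))
    return weighted_events / weighted_time


def weighted_irr(treated, events, person_years, propensity):
    """Crude weighted incidence rate ratio (treated versus untreated).

    Assumption: a simple ratio of weighted rates, not a Poisson regression.
    """
    weights = att_weights(treated, propensity)
    groups = {True: ([], [], []), False: ([], [], [])}
    for is_treated, e, t, w in zip(treated, events, person_years, weights):
        ev, py, ws = groups[bool(is_treated)]
        ev.append(e)
        py.append(t)
        ws.append(w)
    return weighted_rate(*groups[True]) / weighted_rate(*groups[False])
```

With this weighting, nonbeneficiaries who resemble beneficiaries (high propensity score) count more, so the comparison group is reweighted toward the treated population.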
Affiliation(s)
- Daiane Borges Machado
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Department of Global Health and Social Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Elizabeth Williamson
- Department of Medical Statistics and Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine (LSHTM), London, United Kingdom
| | - Julia M. Pescarini
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Department of Medical Statistics and Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine (LSHTM), London, United Kingdom
| | - Flavia J. O. Alves
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
| | - Luís F. S. Castro-de-Araujo
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Department of Psychiatry, The University of Melbourne, Austin Health, Heidelberg, Victoria, Australia
| | - Maria Yury Ichihara
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
| | - Laura C. Rodrigues
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Department of Medical Statistics and Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine (LSHTM), London, United Kingdom
| | - Ricardo Araya
- Centre for Global Mental Health, Institute of Psychiatry, Psychology & Neuroscience, King’s College, London, United Kingdom
| | - Vikram Patel
- Department of Global Health and Social Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Global Health and Population, Chan School of Public Health, Harvard, United States of America
| | - Maurício L. Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
- Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil
|
12
|
Rasella D, Morais GADS, Anderle RV, da Silva AF, Lua I, Coelho R, Rubio FA, Magno L, Machado D, Pescarini J, Souza LE, Macinko J, Dourado I. Evaluating the impact of social determinants, conditional cash transfers and primary health care on HIV/AIDS: Study protocol of a retrospective and forecasting approach based on the data integration with a cohort of 100 million Brazilians. PLoS One 2022; 17:e0265253. [PMID: 35316304 PMCID: PMC8939793 DOI: 10.1371/journal.pone.0265253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 02/25/2022] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Despite the great progress made over the last decades, stronger structural interventions are needed to end the HIV/AIDS pandemic in Low and Middle-Income Countries (LMIC). Brazil is one of the largest and data-richest LMIC, with rapidly changing socioeconomic characteristics and an important HIV/AIDS burden. Over the last two decades Brazil has also implemented the world's largest Conditional Cash Transfer program, the Bolsa Familia Program (BFP), and one of the most consolidated Primary Health Care (PHC) interventions, the Family Health Strategy (FHS). OBJECTIVE We will evaluate the effects of socioeconomic determinants, BFP exposure and FHS coverage on HIV/AIDS incidence, treatment adherence, hospitalizations, case fatality, and mortality using unprecedentedly large aggregate and individual-level longitudinal data. Moreover, we will integrate the retrospective datasets and estimated parameters with comprehensive forecasting models to project HIV/AIDS incidence, prevalence and mortality scenarios up to 2030 according to future socioeconomic conditions and alternative policy implementations. METHODS AND ANALYSIS We will combine individual-level data from all national HIV/AIDS registries with large-scale databases, including the "100 Million Brazilian Cohort", over a 19-year period (2000-2018). Several approaches will be used for the retrospective quasi-experimental impact evaluations, such as Regression Discontinuity Design (RDD), Random Administrative Delays (RAD) and Propensity Score Matching (PSM), combined with multivariable Poisson regressions for cohort analyses. Moreover, we will explore in depth lagged and long-term effects of changes in living conditions and in exposures to BFP and FHS. We will also investigate the effects of the interventions in a wide range of subpopulations. Finally, we will integrate such retrospective analyses with microsimulation, compartmental and agent-based models to forecast future HIV/AIDS scenarios.
CONCLUSION The unprecedented datasets, analyzed through state-of-the-art quasi-experimental methods and innovative mathematical models, will provide essential evidence for understanding and controlling the HIV/AIDS epidemic in LMICs such as Brazil.
Affiliation(s)
- Davide Rasella
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
- ISGlobal, Hospital Clínic - Universitat de Barcelona, Barcelona, Spain
- Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
| | - Iracema Lua
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
| | - Ronaldo Coelho
- Department of Chronic Conditions and Sexually Transmitted Infections/Department of Health Surveillance/Ministry of Health (DCCI/SVS/MS), Brasília, Brazil
| | - Felipe Alves Rubio
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
| | - Laio Magno
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
- Life Science Department, University of the State of Bahia, Salvador, Brazil
| | - Daiane Machado
- Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
| | - Julia Pescarini
- Center for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil
| | - Luis Eugênio Souza
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
| | - James Macinko
- UCLA Fielding School of Public Health, University of California at Los Angeles (UCLA), Los Angeles, California, United States of America
| | - Inês Dourado
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
|
13
|
Jesus GS, Pescarini JM, Silva AF, Torrens A, Carvalho WM, Junior EPP, Ichihara MY, Barreto ML, Rebouças P, Macinko J, Sanchez M, Rasella D. The effect of primary health care on tuberculosis in a nationwide cohort of 7·3 million Brazilian people: a quasi-experimental study. Lancet Glob Health 2022; 10:e390-e397. [PMID: 35085514 PMCID: PMC8847211 DOI: 10.1016/s2214-109x(21)00550-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/01/2021] [Revised: 11/02/2021] [Accepted: 11/11/2021] [Indexed: 02/01/2023]
Abstract
BACKGROUND Universal health coverage is one of the WHO End TB Strategy priority interventions and could be achieved, particularly in low-income and middle-income countries, through the expansion of primary health care. We evaluated the effects of one of the largest primary health-care programmes in the world, the Brazilian Family Health Strategy (FHS), on tuberculosis morbidity and mortality using a nationwide cohort of 7·3 million individuals over a 10-year study period. METHODS We analysed individuals who entered the 100 Million Brazilians Cohort during the period Jan 1, 2004, to Dec 31, 2013, and compared residents in municipalities with no FHS coverage with residents in municipalities with full FHS coverage. We used a cohort design with multivariable Poisson regressions, adjusted for all relevant demographic and socioeconomic variables and weighted with inverse probability of treatment weighting, to estimate the effect of FHS on tuberculosis incidence, mortality, cure, and case fatality. We also performed a range of stratifications and sensitivity analyses. FINDINGS FHS exposure was associated with lower tuberculosis incidence (rate ratio [RR] 0·78, 95% CI 0·72-0·84) and mortality (0·72, 0·55-0·94), and was positively associated with tuberculosis cure rates (1·04, 1·00-1·08). FHS was also associated with a decrease in tuberculosis case-fatality rates, although this was not statistically significant (RR 0·84, 95% CI 0·55-1·30). FHS associations were stronger among the poorest individuals for all the tuberculosis indicators. INTERPRETATION Community-based primary health care could strongly reduce tuberculosis morbidity and mortality and decrease the unequal distribution of the tuberculosis burden in the most vulnerable populations.
During the current marked rise in global poverty due to the COVID-19 pandemic, investments in primary health care could help protect against the expected increases in tuberculosis incidence worldwide and contribute to the attainment of the End TB Strategy goals. FUNDING TB Modelling and Analysis Consortium (Bill & Melinda Gates Foundation), Wellcome Trust, and Brazilian Ministry of Health. TRANSLATION For the Portuguese translation of the abstract see Supplementary Materials section.
Affiliation(s)
- Gabriela S Jesus
- Faculty of Medicine, Federal University of Bahia, Salvador, Brazil; Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Julia M Pescarini
- Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Andrea F Silva
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil
| | - Ana Torrens
- Vital Strategies, Civil Registration and Vital Statistics Improvement and Data Impact Programs, São Paulo, Brazil
| | - Elzo P P Junior
- Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Maria Y Ichihara
- Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Mauricio L Barreto
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil; Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Poliana Rebouças
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil; Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - James Macinko
- Departments of Health Policy and Management and Community Health Sciences, University of California, Los Angeles Fielding School of Public Health, Los Angeles, CA, USA
| | - Mauro Sanchez
- Department of Public Health, University of Brasilia, Brasilia, Brazil
| | - Davide Rasella
- Institute of Collective Health, Federal University of Bahia, Salvador, Brazil; Centre for Data and Knowledge Integration for Health, Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil; ISGlobal, Hospital Clínic- Universitat de Barcelona, Barcelona, Spain.
|
14
|
Lucas ADP, de Oliveira Ferreira M, Lucas TDP, Salari P. The intergenerational relationship between conditional cash transfers and newborn health. BMC Public Health 2022; 22:201. [PMID: 35094683 PMCID: PMC8801108 DOI: 10.1186/s12889-022-12565-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/22/2021] [Accepted: 01/12/2022] [Indexed: 12/17/2022] Open
Abstract
Background Lack of nutrition, inadequate housing, low education and limited access to quality care can negatively affect children’s health over their lifetime. Implemented in 2003, the Bolsa Familia (“Family Stipend”) Program (PBF) is a conditional cash transfer program targeting poor households in Brazil. This study investigates the long-term benefits of cash transfers through intergenerational transmission of health and poverty by assessing the early life exposure of the mother to the PBF. Methods We used data from the 100M SINASC-SIM cohort compiled and managed by the Center for Data and Knowledge Integration for Health (CIDACS), containing information about participation in the PBF and socioeconomic and health indicators. We analyzed five measures of newborn health: low (less than 2,500 g) and very low (less than 1,500 g) birth weight, premature (less than 37 weeks of gestation) and very premature (less than 28 weeks of gestation) birth, and the presence of some type of malformation (according to ICD-10 codes). Furthermore, we measured the early life exposure to the PBF of the mother as PBF coverage in the previous decade in the city where the mother was born. We applied multilevel logistic regression models to assess the associations between birth outcomes and PBF exposures. Results Results showed that children born in a household where the mother received BF were less likely to have low birth weight (OR 0.93, CI 0.92-0.94) or very low birth weight (OR 0.87, CI 0.84-0.89), and less likely to be born before 37 weeks of gestation (OR 0.98, CI 0.97-0.99) or before 28 weeks of gestation (OR 0.93, CI 0.88-0.97). There were no significant associations between households where the mother received BF and congenital malformation. On average, the higher the early life exposure to the PBF of the mother, the lower was the prevalence of low birth weight, very low birth weight and congenital malformation of the newborn. No trend was noted for preterm birth.
Conclusion The PBF might have indirect intergenerational effects on children’s health. These results provide important implications for policymakers who have to decide how to effectively allocate resources to improve child health. Supplementary Information The online version contains supplementary material available at 10.1186/s12889-022-12565-7.
|
15
|
Ferreira AJF, Pescarini J, Sanchez M, Flores-Ortiz RJ, Teixeira CS, Fiaccone R, Ichihara MY, Oliveira R, Aquino EML, Smeeth L, Craig P, Ali S, Leyland AH, Barreto ML, Ribeiro RDC, Katikireddi SV. Evaluating the health effect of a Social Housing programme, Minha Casa Minha Vida, using the 100 million Brazilian Cohort: a natural experiment study protocol. BMJ Open 2021; 11:e041722. [PMID: 33649053 PMCID: PMC8098948 DOI: 10.1136/bmjopen-2020-041722] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2020] [Revised: 12/03/2020] [Accepted: 01/22/2021] [Indexed: 01/02/2023] Open
Abstract
INTRODUCTION Social housing programmes have been shown to influence health, but their effects on cardiovascular mortality and incidence of infectious diseases, such as leprosy and tuberculosis, are unknown. We will use individual administrative data to evaluate the effect of the Brazilian housing programme Minha Casa Minha Vida (MCMV) on cardiovascular disease (CVD) mortality and incidence of leprosy and tuberculosis. METHODS AND ANALYSIS We will link the baseline of the 100 Million Brazilian Cohort (2001-2015), which includes information on socioeconomic and demographic variables, to the MCMV (2009-2015), CVD mortality (2007-2015), leprosy (2007-2015) and tuberculosis (2007-2015) registries. We will define our exposed population as individuals who signed the contract to receive a house from MCMV, and our non-exposed group will be comparable individuals within the cohort who have not signed a contract for a house at that time. We will estimate the effect of MCMV on health outcomes using different propensity score approaches to control for observed confounders. Follow-up time of individuals will begin at the date of exposure ascertainment and will end at the time a specific outcome occurs, date of death or end of follow-up (31 December 2015). In addition, we will conduct stratified analyses by the follow-up time, age group, race/ethnicity, gender and socioeconomic position. ETHICS AND DISSEMINATION The study was approved by the ethics committees of the Instituto Gonçalo Moniz-Oswaldo Cruz Foundation and the University of Glasgow Medical, Veterinary and Life Sciences College. Data analysis will be carried out using an anonymised dataset, accessed by researchers in a secure computational environment according to the Centre for Integration of Data and Health Knowledge procedures. Study findings will be published in high quality peer-reviewed research journals and will also be disseminated to policy makers through stakeholder events and policy briefs.
Affiliation(s)
- Andrêa J F Ferreira
- Public Health Institute, Federal University of Bahia, Salvador, Brazil
- Centro de Integração de Dados e Conhecimentos Para Saúde (Cidacs), Fiocruz Bahia, Salvador, Brazil
| | - Julia Pescarini
- Centro de Integração de Dados e Conhecimentos Para Saúde (Cidacs), Fundação Oswaldo Cruz, Salvador, Brazil
| | - Mauro Sanchez
- Public Health, Universidade de Brasília, Brasilia, Brazil
| | - Renzo Joel Flores-Ortiz
- Center for Integration of Data and Health Knowledge (Cidacs), Fiocruz Bahia, Salvador, Brazil
| | - Rosemeire Fiaccone
- Mathematics and Statistics, Universidade Federal da Bahia, Salvador, Brazil
| | - Estela M L Aquino
- Public Health Institute, Federal University of Bahia, Salvador, Brazil
| | - Liam Smeeth
- Epidemiology and Population Health, London School of Hygiene and Tropical Medicine Faculty of Public Health and Policy, London, UK
| | - Peter Craig
- Public Health Sciences Unit, University of Glasgow MRC/CSO Social and Public Health Sciences Unit, Glasgow, UK
| | - Sanni Ali
- Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Alastair H Leyland
- Public Health Sciences Unit, University of Glasgow MRC/CSO Social and Public Health Sciences Unit, Glasgow, UK
|
16
|
Teixeira CSS, Pescarini JM, Alves FJO, Nery JS, Sanchez MN, Teles C, Ichihara MYT, Ramond A, Smeeth L, Fernandes Penna ML, Rodrigues LC, Brickley EB, Penna GO, Barreto ML, Silva RDCR. Incidence of and Factors Associated With Leprosy Among Household Contacts of Patients With Leprosy in Brazil. JAMA Dermatol 2021; 156:640-648. [PMID: 32293649 PMCID: PMC7160739 DOI: 10.1001/jamadermatol.2020.0653] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/14/2022]
Abstract
Importance Despite progress toward reducing global incidence, leprosy control remains a challenge in low- and middle-income countries. Objective To estimate new case detection rates of leprosy among household contacts of patients with previously diagnosed leprosy and to investigate its associated risk factors. Design, Setting, and Participants This population-based cohort study included families registered in the 100 Million Brazilian Cohort linked with nationwide registries of leprosy; data were collected from January 1, 2007, through December 31, 2014. Household contacts of patients with a previous diagnosis of leprosy from each household unit were followed up from the time of detection of the primary case to the time of detection of a subsequent case or until December 31, 2014. Data analysis was performed from May to December 2018. Exposures Clinical characteristics of the primary case and sociodemographic factors of the household contact. Main Outcomes and Measures Incidence of leprosy, estimated as the new case detection rate of leprosy per 100 000 household contacts at risk (person-years at risk). The association between occurrence of a subsequent leprosy case and the exposure risk factors was assessed using multilevel mixed-effects logistic regressions allowing for state- and household-specific random effects. Results Among 42 725 household contacts (22 449 [52.5%] female; mean [SD] age, 22.4 [18.5] years) of 17 876 patients detected with leprosy, the new case detection rate of leprosy was 636.3 (95% CI, 594.4-681.1) per 100 000 person-years at risk overall and 521.9 (95% CI, 466.3-584.1) per 100 000 person-years at risk among children younger than 15 years. Household contacts of patients with multibacillary leprosy had higher odds of developing leprosy (adjusted odds ratio [OR], 1.48; 95% CI, 1.17-1.88), and the odds increased among contacts aged 50 years or older (adjusted OR, 3.11; 95% CI, 2.03-4.76). 
Leprosy detection was negatively associated with illiterate or preschool educational level (adjusted OR, 0.59; 95% CI, 0.38-0.92). For children, the odds were increased among boys (adjusted OR, 1.70; 95% CI, 1.20-2.42). Conclusions and Relevance The findings in this Brazilian population-based cohort study suggest that the household contacts of patients with leprosy may have increased risk of leprosy, especially in households with existing multibacillary cases and older contacts. Public health interventions, such as contact screening, that specifically target this population appear to be needed.
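The outcome measure in this abstract, a new case detection rate per 100 000 person-years at risk, can be illustrated with a short Python sketch. The function name is hypothetical, and the log-scale normal approximation used for the confidence interval is an assumption for illustration only; the abstract does not state how its intervals were computed.

```python
import math


def rate_per_100k(cases, person_years, z=1.96):
    """Incidence rate per 100,000 person-years with an approximate CI.

    Assumption: the interval uses the log-scale normal approximation
    rate * exp(+/- z / sqrt(cases)), a common choice for Poisson counts.
    """
    if cases <= 0 or person_years <= 0:
        raise ValueError("cases and person_years must be positive")
    rate = cases / person_years * 100_000
    factor = math.exp(z / math.sqrt(cases))
    return rate, rate / factor, rate * factor
```

The function returns the point estimate followed by the lower and upper bounds, so a call such as `rate_per_100k(50, 10_000)` describes 50 cases observed over 10 000 person-years.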
Affiliation(s)
- Camila Silveira Silva Teixeira
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil.,Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
| | - Júlia Moreira Pescarini
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Flávia Jôse Oliveira Alves
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil.,Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
| | - Joilda Silva Nery
- Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
| | - Mauro Niskier Sanchez
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil.,Núcleo de Medicina Tropical, Universidade de Brasília, Brasília, Brazil
| | - Carlos Teles
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Anna Ramond
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
| | - Liam Smeeth
- Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom.,Health Data Research, London, United Kingdom
| | - Laura Cunha Rodrigues
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
| | - Elizabeth B Brickley
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
| | - Gerson Oliveira Penna
- Núcleo de Medicina Tropical, Universidade de Brasília, Brasília, Brazil.,Escola Fiocruz do Governo, Fiocruz Brasília, Brasília, Brazil
| | - Maurício Lima Barreto
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil.,Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
| | - Rita de Cássia Ribeiro Silva
- Centro de Integração de Dados e Conhecimentos para Saúde, Fundação Oswaldo Cruz, Salvador, Brazil.,Escola de Nutrição, Universidade Federal da Bahia, Salvador, Brazil
|
17
|
Pescarini JM, Williamson E, Ichihara MY, Fiaccone RL, Forastiere L, Ramond A, Nery JS, Penna MLF, Strina A, Reis S, Smeeth L, Rodrigues LC, Brickley EB, Penna GO, Barreto ML. Conditional Cash Transfer Program and Leprosy Incidence: Analysis of 12.9 Million Families From the 100 Million Brazilian Cohort. Am J Epidemiol 2020; 189:1547-1558. [PMID: 32639534 PMCID: PMC7705605 DOI: 10.1093/aje/kwaa127] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/18/2019] [Revised: 06/23/2020] [Accepted: 06/21/2020] [Indexed: 01/19/2023] Open
Abstract
Leprosy is a neglected tropical disease predominately affecting poor and marginalized populations. To test the hypothesis that poverty-alleviating policies might be associated with reduced leprosy incidence, we evaluated the association between the Brazilian Bolsa Familia (BFP) conditional cash transfer program and new leprosy case detection using linked records from 12,949,730 families in the 100 Million Brazilian Cohort (2007–2014). After propensity score matching BFP beneficiary to nonbeneficiary families, we used Mantel-Haenszel tests and Poisson regressions to estimate incidence rate ratios for new leprosy case detection and secondary endpoints related to operational classification and leprosy-associated disabilities at diagnosis. Overall, cumulative leprosy incidence was 17.4/100,000 person-years at risk (95% confidence interval (CI): 17.1, 17.7) and markedly higher in “priority” (high-burden) versus “nonpriority” (low-burden) municipalities (22.8/100,000 person-years at risk, 95% CI: 22.2, 23.3, compared with 14.3/100,000 person-years at risk, 95% CI: 14.0, 14.7). After matching, BFP participation was not associated with leprosy incidence overall (Poisson incidence rate ratio (IRR) = 0.97, 95% CI: 0.90, 1.04) but was associated with lower leprosy incidence when restricted to families living in high-burden municipalities (Poisson IRR = 0.86, 95% CI: 0.77, 0.96). In high-burden municipalities, the association was particularly pronounced for paucibacillary cases (Poisson IRR = 0.82, 95% CI: 0.68, 0.98) and cases with leprosy-associated disabilities (Poisson IRR = 0.79, 95% CI: 0.65, 0.97). These findings provide policy-relevant evidence that social policies might contribute to ongoing leprosy control efforts in high-burden communities.
Affiliation(s)
- Julia M Pescarini
- Correspondence to Dr. Julia M. Pescarini, Centro de Integração de Dados e Conhecimentos para Saúde (Cidacs), Fundação Oswaldo Cruz, R. Mundo, 121 – Trobogy, CEP 41301-110, Salvador, Brazil
|
18
|
Barbosa GCG, Ali MS, Araujo B, Reis S, Sena S, Ichihara MYT, Pescarini J, Fiaccone RL, Amorim LD, Pita R, Barreto ME, Smeeth L, Barreto ML. CIDACS-RL: a novel indexing search and scoring-based record linkage system for huge datasets with high accuracy and scalability. BMC Med Inform Decis Mak 2020; 20:289. [PMID: 33167998 PMCID: PMC7654019 DOI: 10.1186/s12911-020-01285-w] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/21/2019] [Accepted: 10/11/2020] [Indexed: 12/13/2022] Open
Abstract
Background Record linkage is the process of identifying and combining records about the same individual from two or more different datasets. While there are many open source and commercial data linkage tools, the volume and complexity of currently available datasets for linkage pose a huge challenge; hence, an efficient linkage tool with reasonable accuracy and scalability is required. Methods We developed CIDACS-RL (Centre for Data and Knowledge Integration for Health – Record Linkage), a novel iterative deterministic record linkage algorithm based on a combination of indexing search and scoring algorithms (provided by Apache Lucene). We described how the algorithm works and compared its performance with four open source linkage tools (AtyImo, Febrl, FRIL and RecLink) in terms of sensitivity and positive predictive value using a gold standard dataset. We also evaluated its accuracy and scalability using a case study, and its scalability and execution time using a simulated cohort in serial (single core) and multi-core (eight core) computation settings. Results Overall, the CIDACS-RL algorithm had superior performance: positive predictive value (99.93% versus AtyImo 99.30%, RecLink 99.5%, Febrl 98.86%, and FRIL 96.17%) and sensitivity (99.87% versus AtyImo 98.91%, RecLink 73.75%, Febrl 90.58%, and FRIL 74.66%). In the case study, using a ROC curve to choose the most appropriate cut-off value (0.896), the obtained metrics were: sensitivity = 92.5% (95% CI 92.07–92.99), specificity = 93.5% (95% CI 93.08–93.8) and area under the curve (AUC) = 97% (95% CI 96.97–97.35). The multi-core computation was about four times faster (150 seconds) than the serial setting (550 seconds) when using a dataset of 20 million records. Conclusion The CIDACS-RL algorithm is an innovative linkage tool for huge datasets, with higher accuracy, improved scalability, and substantially shorter execution time compared to other existing linkage tools.
In addition, CIDACS-RL can be deployed on standard computers without the need for high-speed processors and distributed infrastructures.
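The two accuracy measures reported in this abstract, sensitivity and positive predictive value, can be computed from a set of predicted links and a gold standard as in the Python sketch below. The function name and the pair representation are hypothetical choices for illustration; this is not code from CIDACS-RL or from any of the compared tools.

```python
def linkage_accuracy(predicted_links, true_links):
    """Return (sensitivity, positive predictive value) for a set of links.

    Each link is a hashable pair such as (id_in_dataset_a, id_in_dataset_b).
    Assumption: true_links is a complete gold standard of correct matches.
    """
    predicted = set(predicted_links)
    truth = set(true_links)
    true_positives = len(predicted & truth)   # correct links found
    false_positives = len(predicted - truth)  # false matches
    false_negatives = len(truth - predicted)  # missed matches
    found_or_missed = true_positives + false_negatives
    all_predicted = true_positives + false_positives
    sensitivity = true_positives / found_or_missed if found_or_missed else float("nan")
    ppv = true_positives / all_predicted if all_predicted else float("nan")
    return sensitivity, ppv
```

Sensitivity falls when true matches are missed, while positive predictive value falls when false matches are accepted, which is why the abstract reports both for each tool.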
Affiliation(s)
- George C G Barbosa
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.
| | - M Sanni Ali
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.,Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK.,NDORMS, Center for Statistics in Medicine, University of Oxford, Oxford, UK
| | - Bruno Araujo
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Sandra Reis
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Samila Sena
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Maria Y T Ichihara
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Julia Pescarini
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Rosemeire L Fiaccone
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.,Department of Statistics, Federal University of Bahia (UFBA), Salvador, Brazil
| | - Leila D Amorim
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.,Department of Statistics, Federal University of Bahia (UFBA), Salvador, Brazil
| | - Robespierre Pita
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil
| | - Marcos E Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.,Computer Science Department, Federal University of Bahia (UFBA), Salvador, Brazil.,Department of Statistics, London School of Economics and Political Science (LSE), London, UK
| | - Liam Smeeth
- Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
| | - Mauricio L Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Fiocruz Bahia, Parque Tecnológico da Bahia, Edf. Tecnocentro, sala 315, Rua Mundo, no 121, Salvador, 41301-110, Brazil.,Institute of Public Health, Federal University of Bahia (UFBA), Salvador, Brazil
|
19
|
Abstract
Background: Linkage of administrative data sources provides an efficient means of collecting detailed data on how individuals interact with cross-sectoral services, society, and the environment. These data can be used to supplement conventional cohort studies, or to create population-level electronic cohorts generated solely from administrative data. However, errors occurring during linkage (false matches/missed matches) can lead to bias in results from linked data. Aim: This paper provides guidance on evaluating linkage quality in cohort studies. Methods: We provide an overview of methods for linkage, describe mechanisms by which linkage error can introduce bias, and draw on real-world examples to demonstrate methods for evaluating linkage quality. Results: Methods for evaluating linkage quality described in this paper provide guidance on (i) estimating linkage error rates, (ii) understanding the mechanisms by which linkage error might bias results, and (iii) information that should be shared between data providers, linkers and users, so that approaches to handling linkage error in analysis can be implemented. Conclusion: Linked administrative data can enhance conventional cohorts and offers the ability to answer questions that require large sample sizes or hard-to-reach populations. Care needs to be taken to evaluate linkage quality in order to provide robust results.
Affiliation(s)
- Katie Harron
- Department of Population, Practice and Policy, UCL Great Ormond Street Institute of Child Health, London, UK
| | - James C Doidge
- Intensive Care National Audit and Research Centre (ICNARC), London, UK
| | - Harvey Goldstein
- Department of Population, Practice and Policy, UCL Great Ormond Street Institute of Child Health, London, UK.,School of Education, University of Bristol, Bristol, UK
|
20
|
Pescarini JM, Williamson E, Nery JS, Ramond A, Ichihara MY, Fiaccone RL, Penna MLF, Smeeth L, Rodrigues LC, Penna GO, Brickley EB, Barreto ML. Effect of a conditional cash transfer programme on leprosy treatment adherence and cure in patients from the nationwide 100 Million Brazilian Cohort: a quasi-experimental study. THE LANCET. INFECTIOUS DISEASES 2020; 20:618-627. [PMID: 32066527 PMCID: PMC7191267 DOI: 10.1016/s1473-3099(19)30624-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/24/2019] [Revised: 09/04/2019] [Accepted: 10/25/2019] [Indexed: 01/01/2023]
Abstract
BACKGROUND Indirect financial costs and barriers to health-care access might contribute to leprosy treatment non-adherence. We estimated the association of the Brazilian conditional cash transfer programme, the Programa Bolsa Família (PBF), on leprosy treatment adherence and cure in patients in Brazil. METHODS In this quasi-experimental study, we linked baseline demographic and socioeconomic information for individuals who entered the 100 Million Brazilian Cohort between Jan 1, 2007, and Dec 31, 2014, with the PBF payroll database and the Information System for Notifiable Diseases, which includes nationwide leprosy registries. Individuals were eligible for inclusion if they had a household member older than 15 years and had not received PBF aid or been diagnosed with leprosy before entering the 100 Million Brazilian Cohort; they were excluded if they were partial receivers of PBF benefits, had missing data, or had a monthly per-capita income greater than BRL200 (US$50). Individuals who were PBF beneficiaries before leprosy diagnosis were matched to those who were not beneficiaries through propensity-score matching (1:1) with replacement on the basis of baseline covariates, including sex, age, race or ethnicity, education, work, income, place of residence, and household characteristics. We used logistic regression to assess the average treatment effect on the treated of receipt of PBF benefits on leprosy treatment adherence (six or more multidrug therapy doses for paucibacillary cases or 12 or more doses for multibacillary cases) and cure in individuals of all ages. We stratified our analysis according to operational disease classification (paucibacillary or multibacillary). We also did a subgroup analysis of paediatric leprosy restricted to children aged up to 15 years. FINDINGS We included 11 456 new leprosy cases, of whom 8750 (76·3%) had received PBF before diagnosis and 2706 (23·6%) had not. 
Overall, 9508 (83·0%) patients adhered to treatment and 10 077 (88·0%) were cured. After propensity score matching, receiving PBF before diagnosis was associated with adherence to treatment (OR 1·22, 95% CI 1·01-1·48) and cure (1·26, 1·01-1·58). PBF receipt did not significantly improve treatment adherence (1·37, 0·98-1·91) or cure (1·12, 0·75-1·67) in patients with paucibacillary leprosy. For patients with multibacillary disease, PBF beneficiaries had better treatment adherence (1·37, 1·08-1·74) and cure (1·43, 1·09-1·90) than non-beneficiaries. In the propensity score-matched analysis in 2654 children younger than 15 years with leprosy, PBF exposure was not associated with leprosy treatment adherence (1·55, 0·89-2·68) or cure (1·57, 0·83-2·97). INTERPRETATION Our results suggest that being a beneficiary of the PBF, which facilitates cash transfers and improved access to health care, is associated with greater leprosy multidrug therapy adherence and cure in multibacillary cases. These results are especially relevant for patients with multibacillary disease, who are treated for a longer period and have lower cure rates than those with paucibacillary disease. FUNDING CONFAP/ESRC/MRC/BBSRC/CNPq/FAPDF-Doenças Negligenciadas, the UK Medical Research Council, the Wellcome Trust, and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brazil (CAPES).
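The matching step described in the methods (1:1 propensity-score matching with replacement) can be illustrated with a small sketch. The abstract does not say which matching rule was used, so nearest-neighbour matching on the score is an assumption here, as are the function name and the toy scores in the usage note.

```python
def match_with_replacement(treated: dict, controls: dict) -> dict:
    """Pair each treated unit with the control whose propensity score is closest.

    Both arguments map a unit identifier to a propensity score in [0, 1].
    Matching is 1:1 per treated unit and "with replacement": a control stays
    in the pool after being matched, so it can be reused for other treated units.
    Nearest-neighbour matching is an assumption; the source does not name the rule.
    """
    if not controls:
        raise ValueError("at least one control unit is required")
    return {
        treated_id: min(controls, key=lambda cid: abs(controls[cid] - score))
        for treated_id, score in treated.items()
    }
```

With toy scores `{"t1": 0.30, "t2": 0.32, "t3": 0.80}` for beneficiaries and `{"c1": 0.31, "c2": 0.75}` for non-beneficiaries, both `t1` and `t2` are paired with `c1`, which is what "with replacement" permits, and `t3` is paired with `c2`.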
Affiliation(s)
- Julia M Pescarini
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (FIOCRUZ), Salvador, Brazil.
| | - Elizabeth Williamson
- Department of Medical Statistics, London School of Hygiene & Tropical Medicine, London, UK; Health Data Research, London, UK
| | - Joilda S Nery
- Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
| | - Anna Ramond
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Maria Yury Ichihara
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (FIOCRUZ), Salvador, Brazil
| | - Rosemeire L Fiaccone
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (FIOCRUZ), Salvador, Brazil; Instituto de Matemática e Estatística, Universidade Federal da Bahia, Salvador, Brazil
| | - Maria Lucia F Penna
- Universidade Federal Fluminense, Instituto de Saúde da Comunidade, Niterói, Brazil
| | - Liam Smeeth
- Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK; Health Data Research, London, UK
| | - Laura C Rodrigues
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Gerson O Penna
- Núcleo de Medicina Tropical, Universidade de Brasília, Escola FIOCRUZ de Governo Fundação Oswaldo Cruz Brasília, Brazil
| | - Elizabeth B Brickley
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
| | - Mauricio L Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (FIOCRUZ), Salvador, Brazil; Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Brazil
|
21
|
Abstract
Twins Research Australia (TRA) is a community of twins and researchers working on health research to benefit everyone, including twins. TRA leads multidisciplinary research through the application of twin and family study designs, with the aim of sustaining long-term twin research that, both now and in the future, gives back to the community. This article summarizes TRA’s recent achievements and future directions, including new methodologies addressing causation, linkage to health, economic and educational administrative datasets and to geospatial data to provide insight into health and disease. We also explain how TRA’s knowledge translation and exchange activities are key to communicating the impact of twin studies to twins and the wider community. Building researcher capability, providing registry resources and partnering with all key stakeholders, particularly the participants, are important for how TRA is advancing twin research to improve health outcomes for society. TRA provides researchers with open access to its vibrant volunteer membership of twins, higher order multiples (multiples) and families who are willing to consider participation in research. Established four decades ago, this resource facilitates and supports research across multiple stages and a breadth of health domains.
|
22
|
Barreto ML, Ichihara MY, Almeida BA, Barreto ME, Cabral L, Fiaccone RL, Carreiro RP, Teles CAS, Pitta R, Penna GO, Barral-Netto M, Ali MS, Barbosa G, Denaxas S, Rodrigues LC, Smeeth L. The Centre for Data and Knowledge Integration for Health (CIDACS): Linking Health and Social Data in Brazil. Int J Popul Data Sci 2019; 4:1140. [PMID: 34095542 PMCID: PMC8142622 DOI: 10.23889/ijpds.v4i2.1140] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 02/03/2023] Open
Abstract
The Centre for Data and Knowledge Integration for Health (CIDACS) was created in 2016 in Salvador, Bahia-Brazil with the objective of integrating data and knowledge aiming to answer scientific questions related to the health of the Brazilian population. This article details our experiences in the establishment and operations of CIDACS, as well as efforts made to obtain high-quality linked data while adhering to security, ethical use and privacy issues. Every effort has been made to conduct operations while implementing appropriate structures, procedures, processes and controls over the original and integrated databases in order to provide adequate datasets to answer relevant research questions. Looking forward, CIDACS is expected to be an important resource for researchers and policymakers interested in enhancing the evidence base pertaining to different aspects of health, in particular when investigating, from a nation-wide perspective, the role of social determinants of health and the effects of social and environmental policies on different health outcomes.
Affiliation(s)
- ML Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil.
| | - MY Ichihara
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Institute of Collective Health, Federal University of Bahia (UFBA), Salvador, Brazil.
| | - BA Almeida
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - ME Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Computer Science Department, Federal University of Bahia (UFBA), Salvador, Brazil.
| | - L Cabral
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - RL Fiaccone
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Statistics Department, Federal University of Bahia (UFBA), Brazil.
| | - RP Carreiro
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - CAS Teles
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - R Pitta
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - GO Penna
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Tropical Medicine Centre, University of Brasília (UnB), Brazil.
- Escola Fiocruz de Governo, FIOCRUZ Brasília, Brazil.
| | - M Barral-Netto
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - MS Ali
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Center for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK.
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, United Kingdom.
| | - G Barbosa
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
| | - S Denaxas
- Institute of Health Informatics, University College London, United Kingdom.
| | - LC Rodrigues
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, United Kingdom.
| | - L Smeeth
- Centre for Data and Knowledge Integration for Health (CIDACS), Gonçalo Moniz Institute, Oswaldo Cruz Foundation (FIOCRUZ), Salvador, Brazil.
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, United Kingdom.
|
23
|
Ali MS, Ichihara MY, Lopes LC, Barbosa GC, Pita R, Carreiro RP, dos Santos DB, Ramos D, Bispo N, Raynal F, Canuto V, de Araujo Almeida B, Fiaccone RL, Barreto ME, Smeeth L, Barreto ML. Administrative Data Linkage in Brazil: Potentials for Health Technology Assessment. Front Pharmacol 2019; 10:984. [PMID: 31607900 PMCID: PMC6768004 DOI: 10.3389/fphar.2019.00984] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 03/01/2019] [Accepted: 07/31/2019] [Indexed: 12/17/2022] Open
Abstract
Health technology assessment (HTA) is the systematic evaluation of the properties and impacts of health technologies and interventions. In this article, we presented a discussion of HTA and its evolution in Brazil, as well as a description of secondary data sources available in Brazil with potential applications to generate evidence for HTA and policy decisions. Furthermore, we highlighted record linkage, ongoing record linkage initiatives in Brazil, and the main linkage tools developed and/or used in Brazilian data. Finally, we discussed the challenges and opportunities of using secondary data for research in the Brazilian context. In conclusion, we emphasized the availability of high quality data and an open, modern attitude toward the use of data for research and policy. This is supported by a rigorous but enabling legal framework that will allow the conduct of large-scale observational studies to evaluate clinical, economical, and social impacts of health technologies and social policies.
Affiliation(s)
- M Sanni Ali
- Faculty of Epidemiology and Population Health, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), Center for Statistics in Medicine (CSM), University of Oxford, Oxford, United Kingdom
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Maria Yury Ichihara
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
- Institute of Public Health, Federal University of Bahia (UFBA), Salvador, Brazil
| | | | - George C.G. Barbosa
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Robespierre Pita
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Roberto Perez Carreiro
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | | | - Dandara Ramos
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Nivea Bispo
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Fabiana Raynal
- Department of Management and Incorporation of Health Technology, Ministry of Health (DGITS/MS), Brasília, Brazil
| | - Vania Canuto
- Department of Management and Incorporation of Health Technology, Ministry of Health (DGITS/MS), Brasília, Brazil
| | - Bethania de Araujo Almeida
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Rosemeire L. Fiaccone
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
- Institute of Public Health, Federal University of Bahia (UFBA), Salvador, Brazil
- Department of Statistics, Federal University of Bahia (UFBA), Salvador, Brazil
| | - Marcos E. Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
- Department of Computing, Federal University of Bahia (UFBA), Salvador, Brazil
- Institute of Health Informatics, University College London, London, United Kingdom
| | - Liam Smeeth
- Faculty of Epidemiology and Population Health, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
| | - Mauricio L. Barreto
- Centre for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Muniz, Fundação Osvaldo Cruz, Salvador, Brazil
- Institute of Public Health, Federal University of Bahia (UFBA), Salvador, Brazil
|
24
|
Statistical supervised meta-ensemble algorithm for medical record linkage. J Biomed Inform 2019; 95:103220. [PMID: 31158554 DOI: 10.1016/j.jbi.2019.103220] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/14/2018] [Revised: 04/23/2019] [Accepted: 05/28/2019] [Indexed: 11/21/2022]
Abstract
Identifying unique patients across multiple care facilities or services is a major challenge in providing continuous care and undertaking health research. Identifying and linking patients without compromising privacy and security is an emerging issue in the big data era. The large quantity and complexity of the patient data emphasize the need for effective linkage methods that are both scalable and accurate. In this study, we aim to develop and evaluate an ensemble classification method using the three most typically used supervised learning methods, namely support vector machines, logistic regression and standard feed-forward neural networks, to link records that belong to the same patient across multiple service locations. Our ensemble method is a combination of bagging and stacking. Each base learner's critical hyperparameters were selected through grid search. Two synthetic datasets were used in this study, namely FEBRL and ePBRN. The ePBRN linkage dataset was based on linkage errors noticed in the Australian primary care setting. The overall linkage performance was determined by assessing the blocking performance and classification performance. Our ensemble method outperformed the base learners in all evaluation metrics on one dataset. More specifically, the precision, which in the case of the base learners is the average of the individual precision scores, increased from 90.70% to 94.85% in FEBRL, and from 62.17% to 99.28% in ePBRN. Similarly, the F-score increased from 94.92% to 98.18% in FEBRL, and from 72.99% to 91.72% in ePBRN. Our experiments suggest that we can significantly improve the linkage performance of individual algorithms by employing ensemble strategies.
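The abstract reports precision and F-score but not recall. As a reading aid, the sketch below shows how the two quantities relate, assuming the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state which F-score variant was used, and the function names are hypothetical.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from(precision: float, f: float) -> float:
    """Recall implied by a given precision and F1, by inverting the F1 formula.

    From f = 2pr / (p + r) it follows that r = f * p / (2p - f).
    """
    denominator = 2 * precision - f
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision and F-score")
    return f * precision / denominator
```

Under the F1 assumption, calling `recall_from` with a reported precision and F-score (expressed as fractions, for example 0.9928 and 0.9172) gives the recall those two figures imply. It shows why a large gain in precision can coexist with a smaller gain in F-score.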
|
25
|
Barreto ML, Rodrigues LC. Linkage of Administrative Datasets: Enhancing Longitudinal Epidemiological Studies in the Era of “Big Data”. CURR EPIDEMIOL REP 2018. [DOI: 10.1007/s40471-018-0177-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
|