1
Liu P, Wang Z, Liu N, Peres MA. A scoping review of the clinical application of machine learning in data-driven population segmentation analysis. J Am Med Inform Assoc 2023; 30:1573-1582. [PMID: 37369006 PMCID: PMC10436153 DOI: 10.1093/jamia/ocad111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 06/08/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023]
Abstract
OBJECTIVE Data-driven population segmentation is commonly used in clinical settings to separate a heterogeneous population into multiple relatively homogeneous groups with similar healthcare features. In recent years, machine learning (ML) based segmentation algorithms have garnered interest for their potential to speed up and improve algorithm development across many phenotypes and healthcare situations. This study evaluates ML-based segmentation with respect to (1) the populations to which it was applied, (2) the segmentation details, and (3) the outcome evaluations. MATERIALS AND METHODS MEDLINE, Embase, Web of Science, and Scopus were searched following the PRISMA-ScR criteria. Peer-reviewed studies in the English language that used data-driven population segmentation analysis on structured data from January 2000 to October 2022 were included. RESULTS We identified 6077 articles and included 79 in the final analysis. Data-driven population segmentation analysis was employed in various clinical settings. K-means clustering was the most prevalent unsupervised ML paradigm. The most common settings were healthcare institutions. The most commonly targeted population was the general population. DISCUSSION Although all the studies performed internal validation, only 11 papers (13.9%) performed external validation, and 23 papers (29.1%) compared methods. The existing papers gave little attention to validating the robustness of ML modeling. CONCLUSION Existing ML applications in population segmentation need further evaluation of their ability to deliver tailored, efficient, integrated healthcare solutions compared with traditional segmentation analysis. Future ML applications in the field should emphasize method comparison and external validation, and should investigate approaches to evaluating individual consistency across different methods.
Affiliation(s)
- Pinyan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Ziwen Wang
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Nan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Institute of Data Science, National University of Singapore, Singapore, Singapore
- Marco Aurélio Peres
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - National Dental Research Institute Singapore, National Dental Centre Singapore, Singapore, Singapore
2
Ho SH, Lim JT, Ong J, Hapuarachchi HC, Sim S, Ng LC. Singapore's 5 decades of dengue prevention and control-Implications for global dengue control. PLoS Negl Trop Dis 2023; 17:e0011400. [PMID: 37347767 DOI: 10.1371/journal.pntd.0011400] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 06/24/2023]
Abstract
This paper summarises the lessons learnt in dengue epidemiology, risk factors, and prevention in Singapore over the last half a century, during which Singapore evolved from a city of 1.9 million people to a highly urban globalised city-state with a population of 5.6 million. Set in a tropical climate, urbanisation among green foliage has created ideal conditions for the proliferation of Aedes aegypti and Aedes albopictus, the mosquito vectors that transmit dengue. A vector control programme, largely for malaria, was initiated as early as 1921, but it was only in 1966 that the Vector Control Unit (VCU) was established to additionally tackle dengue haemorrhagic fever (DHF) that was first documented in the 1960s. Centred on source reduction and public education, and based on research into the bionomics and ecology of the vectors, the programme successfully reduced the Aedes House Index (HI) from 48% in 1966 to <5% in the 1970s. Further enhancement of the programme, including through legislation, suppressed the Aedes HI to around 1% from the 1990s. The current programme is characterised by 4 key features: (i) proactive inter-epidemic surveillance and control that is stepped up during outbreaks; (ii) risk-based prevention and intervention strategies based on advanced data analytics; (iii) coordinated inter-sectoral cooperation between the public, private, and people sectors; and (iv) evidence-based adoption of new tools and strategies. Dengue seroprevalence and force of infection (FOI) among residents have substantially and continuously declined over the 5 decades. This is consistent with the observation that dengue incidence has been delayed to adulthood, with severity highest among the elderly. Paradoxically, the number of reported dengue cases and outbreaks has increased since the 1990s with record-breaking epidemics. 
We propose that Singapore's increased vulnerability to outbreaks is due to low levels of immunity in the population, constant introduction of new viral variants, expanding urban centres, and increasing human density. The growing magnitude of reported outbreaks could also be attributed to improved diagnostics and surveillance, which at least partially explains the discord between the rising trend in cases and the continuous reduction in dengue seroprevalence. Changing global and local landscapes, including climate change, increasing urbanisation, and global physical connectivity, are expected to make dengue control even more challenging. The adoption of new vector surveillance and control tools, such as the Gravitrap and Wolbachia technology, is important to impede the growing threat of dengue and other Aedes-borne diseases.
Affiliation(s)
- Soon Hoe Ho
  - Environmental Health Institute, National Environment Agency, Singapore, Singapore
- Jue Tao Lim
  - Environmental Health Institute, National Environment Agency, Singapore, Singapore
  - Lee Kong Chian School of Medicine, Nanyang Technological University Novena Campus, Singapore, Singapore
- Janet Ong
  - Environmental Health Institute, National Environment Agency, Singapore, Singapore
- Shuzhen Sim
  - Environmental Health Institute, National Environment Agency, Singapore, Singapore
- Lee Ching Ng
  - Environmental Health Institute, National Environment Agency, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
3
Baak-Baak CM, Cigarroa-Toledo N, Pinto-Castillo JF, Cetina-Trejo RC, Torres-Chable O, Blitvich BJ, Garcia-Rejon JE. Cluster Analysis of Dengue Morbidity and Mortality in Mexico from 2007 to 2020: Implications for the Probable Case Definition. Am J Trop Med Hyg 2022; 106:tpmd210409. [PMID: 35292593 PMCID: PMC9128710 DOI: 10.4269/ajtmh.21-0409] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2021] [Accepted: 01/20/2022] [Indexed: 11/07/2022]
Abstract
Dengue cases and deaths occur frequently in Mexico, although the trend is not uniform across the country. We performed a spatiotemporal analysis of dengue cases and deaths in Mexico from 2007 to 2020, and clustered states according to whether there was a low, moderate, or high risk of dengue. A total of 501,600 confirmed dengue cases were registered from 2007 to 2020, with 378,122 cases classified as dengue fever (DF) and 123,478 cases classified as dengue hemorrhagic fever (DHF). For each confirmed case, there were 4.68 probable cases. There were 1,230 dengue deaths, with the highest numbers reported in 2009, 2012, 2013, and 2019. The number of deaths was significantly correlated (P ≤ 0.01) with DF (r = 0.82), DHF (r = 0.94), and probable dengue cases (r = 0.84). States were clustered using a machine learning technique according to selected indices associated with dengue. Cluster 1 (low risk) primarily contained states in the northwest, northcentral, and east. Cluster 2 (moderate risk) contained states in the northeast. Cluster 3 (high risk) mostly contained coastal states in the southeast, southwest, and west. The generation of the clusters was supported by the Kruskal-Wallis test. A significant difference was found in the incidence, mortality rates, and case-fatality rates of dengue among the clusters (P ≤ 0.01). Notably, cluster 3 contributed 71.4% of the confirmed cases and 89.2% of the deaths. Public health and vector control strategies designed to mitigate the burden of dengue in Mexico should consider the states in cluster 3 as high-priority areas.
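The kind of check this abstract describes, a Kruskal-Wallis test across three clusters, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the incidence values are invented placeholders rather than the study's data, the function name `kruskal_wallis_h` is hypothetical, and the p-value uses the chi-square approximation, which for exactly three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis_h(*groups):
    """Return the Kruskal-Wallis H statistic, corrected for ties."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = {}   # value -> average rank (tied values share one rank)
    tie_term = 0   # sum of (t^3 - t) over runs of t tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n**3 - n))


# Hypothetical incidence values per state for three risk clusters.
low = [1.0, 2.0, 3.0]
moderate = [4.0, 5.0, 6.0]
high = [7.0, 8.0, 9.0]

h = kruskal_wallis_h(low, moderate, high)
# Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
p = math.exp(-h / 2)
print(f"H = {h:.2f}, approximate p = {p:.4f}")
```

With so few observations per group the chi-square approximation is crude; an exact or permutation-based p-value would be the safer choice for small samples.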
Affiliation(s)
- Carlos M. Baak-Baak
  - Laboratorio de Arbovirología, Centro de Investigaciones Regionales “Dr. Hideyo Noguchi,” Universidad Autónoma de Yucatán, Mérida, Yucatán, México
- Nohemi Cigarroa-Toledo
  - Laboratorio de Biología Celular, Centro de Investigaciones Regionales “Dr. Hideyo Noguchi,” Universidad Autónoma de Yucatán, Mérida, Yucatán, México
- Jose F. Pinto-Castillo
  - Laboratorio de Geografía Ambiental, Instituto de Investigación en Gestión de Riesgos y Cambio Climático, Universidad de Ciencias y Artes de Chiapas, México
- Rosa C. Cetina-Trejo
  - Laboratorio de Arbovirología, Centro de Investigaciones Regionales “Dr. Hideyo Noguchi,” Universidad Autónoma de Yucatán, Mérida, Yucatán, México
- Oswaldo Torres-Chable
  - Laboratorio de Enfermedades Tropicales y Transmitidas por Vector, Universidad Juárez Autónoma de Tabasco, Villahermosa, Tabasco, México
- Bradley J. Blitvich
  - Department of Veterinary Microbiology and Preventive Medicine, College of Veterinary Medicine, Iowa State University, Ames, Iowa
- Julian E. Garcia-Rejon
  - Laboratorio de Arbovirología, Centro de Investigaciones Regionales “Dr. Hideyo Noguchi,” Universidad Autónoma de Yucatán, Mérida, Yucatán, México
4
Faridah L, Mindra IGN, Putra RE, Fauziah N, Agustian D, Natalia YA, Watanabe K. Spatial and temporal analysis of hospitalized dengue patients in Bandung: demographics and risk. Trop Med Health 2021; 49:44. [PMID: 34039439 PMCID: PMC8152360 DOI: 10.1186/s41182-021-00329-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/10/2021] [Accepted: 05/03/2021] [Indexed: 01/02/2023]
Abstract
Background Bandung, the fourth largest city in Indonesia and capital of West Java province, has been considered a major endemic area of dengue, and studies show that the incidence in this city could increase and spread rapidly. At the same time, estimation of incidence could be inaccurate due to a lack of reliable surveillance systems. To provide strategic information for the dengue control program in the face of limited capacity, this study used spatial pattern analysis of a possible outbreak of dengue cases, through the Geographic Information System (GIS). To further enhance the information needed for effective policymaking, we also analyzed the demographic pattern of dengue cases. Methods Monthly reports of dengue cases from January 2014 to December 2016 from 16 hospitals in Bandung were collected as the database, which consisted of address, sex, age, and a code to anonymize each patient. The addresses were then geocoded and used to estimate the relative risk of a cluster of dengue cases developing in a particular area. We used the kernel density estimation method to analyze the dynamics of change of dengue cases. Results The model showed that the spatial cluster of the relative risk of dengue incidence was relatively unchanged for 3 years. Dengue high-risk areas predominated in the southern and southeastern parts of Bandung, while low-risk areas were found mostly in its western and northeastern regions. The kernel density estimation showed strong cluster groups of dengue cases in the city. Conclusions This study demonstrated a strong pattern of reported cases related to specific demographic groups (males and children). Furthermore, spatial analysis using GIS also visualized the dynamic development of the aggregation of disease incidence (hotspots) for dengue cases in Bandung. These data may provide strategic information for the planning and design of dengue control programs.
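Kernel density estimation over geocoded case locations, as mentioned in the Methods above, can be illustrated with a minimal plain-Python sketch. Everything specific here is an assumption made for illustration: the coordinates are invented points on a hypothetical local grid (in km), the Gaussian kernel and the 0.5 km bandwidth are arbitrary choices, and `gaussian_kde_2d` is a hypothetical helper name. The abstract does not state which kernel, bandwidth, or software the authors used.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Build a 2-D Gaussian kernel density estimator from (x, y) points."""
    n = len(points)
    # Normalising constant so the estimated density integrates to 1.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth**2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            squared_distance = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-squared_distance / (2.0 * bandwidth**2))
        return norm * total

    return density


# Hypothetical geocoded cases: three close together, one far away.
cases = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (5.0, 5.0)]
density = gaussian_kde_2d(cases, bandwidth=0.5)

hotspot = density(0.0, 0.1)      # inside the tight group of cases
background = density(2.5, 2.5)   # midway between the group and the outlier
print(f"hotspot density = {hotspot:.4f}, background density = {background:.6f}")
```

Evaluating `density` on a regular grid and mapping the values would give a hotspot surface of the general kind the abstract refers to; the bandwidth controls how smooth that surface is.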
Affiliation(s)
- Lia Faridah
  - Parasitology Division, Department of Biomedical Science, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
  - Foreign Visiting Researcher at Department of Civil and Environmental Engineering, Ehime University, Matsuyama, Japan
- Ramadhani Eka Putra
  - School of Life Science and Technology, Institut Teknologi Bandung, Jl. Ganeca 10, Bandung, West Java, 40132, Indonesia
- Nisa Fauziah
  - Parasitology Division, Department of Biomedical Science, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
- Dwi Agustian
  - Department of Public Health, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
- Yessika Adelwin Natalia
  - Department of Public Health, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
- Kozo Watanabe
  - Department of Civil and Environmental Engineering, Ehime University, Matsuyama, Japan
5
Assessing the suitability of mitochondrial and nuclear DNA genetic markers for molecular systematics and species identification of helminths. Parasit Vectors 2021; 14:233. [PMID: 33933158 PMCID: PMC8088577 DOI: 10.1186/s13071-021-04737-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/13/2021] [Accepted: 04/21/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Genetic markers are employed widely in molecular studies, and their utility depends on the degree of sequence variation, which dictates the type of application for which they are suited. Consequently, the suitability of a genetic marker for any specific application is complicated by its properties and usage across studies. To provide a yardstick for future users, in this study we assess the suitability of genetic markers for molecular systematics and species identification in helminths and provide an estimate of the cut-off genetic distances per taxonomic level. METHODS We assessed four classes of genetic markers, namely nuclear ribosomal internal transcribed spacers, nuclear rRNA, mitochondrial rRNA and mitochondrial protein-coding genes, based on certain properties that are important for species identification and molecular systematics. For molecular identification, these properties are: inter-species sequence variation; length of reference sequences; ease of sequence alignment; and ease of designing universal primers. For molecular systematics, the properties are: average genetic distance from order/suborder to species level; the number of monophyletic clades at the order/suborder level; length of reference sequences; ease of sequence alignment; ease of designing universal primers; and absence of nucleotide substitution saturation. Estimation of the cut-off genetic distances was performed using the 'K-means' clustering algorithm. RESULTS The nuclear rRNA genes exhibited the lowest sequence variation, whereas the mitochondrial genes exhibited relatively higher variation across the three groups of helminths. Also, the nuclear and mitochondrial rRNA genes were the best possible genetic markers for helminth molecular systematics, whereas the mitochondrial protein-coding and rRNA genes were suitable for molecular identification.
We also revealed that a general gauge of genetic distances might not be adequate, using evidence from the wide range of genetic distances among nematodes. CONCLUSION This study assessed the suitability of DNA genetic markers for application in molecular systematics and molecular identification of helminths. We provide a novel way of analyzing genetic distances to generate suitable cut-off values for each taxonomic level using the 'K-means' clustering algorithm. The estimated cut-off genetic distance values, together with the summary of the utility and limitations of each class of genetic markers, are useful information that can benefit researchers conducting molecular studies on helminths.
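One way to picture how K-means clustering could turn a set of pairwise genetic distances into cut-off values is the sketch below, written in plain Python. It is an illustration under loud assumptions, not the authors' procedure: the distances are invented, the choice of three clusters is arbitrary, the function names `kmeans_1d` and `cutoffs` are hypothetical, and taking the midpoint between neighbouring centroids as the cut-off is an assumed rule (it is where nearest-centroid assignment switches in one dimension).

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalar values; returns centroids in ascending order."""
    data = sorted(values)
    # Deterministic start: pick k points spread evenly through the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Recompute each centroid as its cluster mean; keep it if the cluster is empty.
        updated = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def cutoffs(centroids):
    """Midpoints between neighbouring centroids (assumed cut-off rule)."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


# Hypothetical pairwise genetic distances at three taxonomic levels.
distances = [0.01, 0.02, 0.03, 0.10, 0.12, 0.14, 0.30, 0.33, 0.36]
centres = kmeans_1d(distances, k=3)
cuts = cutoffs(centres)
print("centroids:", [round(c, 3) for c in centres])
print("cut-offs:", [round(c, 3) for c in cuts])
```

As the abstract itself cautions, a single general set of cut-offs may not suit every group, so in practice such a calculation would likely be repeated per marker and per helminth group.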
6
Timothy JWS, Beale MA, Rogers E, Zaizay Z, Halliday KE, Mulbah T, Giddings RK, Walker SL, Thomson NR, Kollie KK, Pullan RL, Marks M. Epidemiologic and Genomic Reidentification of Yaws, Liberia. Emerg Infect Dis 2021; 27:1123-1132. [PMID: 33754988 PMCID: PMC8007311 DOI: 10.3201/eid2704.204442] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Abstract
We confirmed endemicity and autochthonous transmission of yaws in Liberia after a population-based, community-led burden estimation (56,825 participants). Serologically confirmed yaws was rare and focal at population level (24 cases; 2.6 [95% CI 1.4-3.9] cases/10,000 population) with similar clinical epidemiology to other endemic countries in West Africa. Unsupervised classification of spatially referenced case finding data indicated that yaws was more likely to occur in hard-to-reach communities; healthcare-seeking was low among communities, and clinical awareness of yaws was low among healthcare workers. We recovered whole bacterial genomes from 12 cases and describe a monophyletic clade of Treponema pallidum subspecies pertenue, phylogenetically distinct from known TPE lineages, including those affecting neighboring nonhuman primate populations (Taï Forest, Côte d'Ivoire). Yaws is endemic in Liberia but exhibits low focal population prevalence with evidence of a historical genetic bottleneck and subsequent local expansion. Reporting gaps appear attributable to challenging epidemiology and low disease awareness.