1
|
Yang Q, Luo L, Lin Z, Wen W, Zeng W, Deng H. A machine learning-based predictive model of causality in orthopaedic medical malpractice cases in China. PLoS One 2024; 19:e0300662. [PMID: 38630758 PMCID: PMC11023448 DOI: 10.1371/journal.pone.0300662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 02/27/2024] [Indexed: 04/19/2024] Open
Abstract
PURPOSE To explore the feasibility and validity of machine learning models in determining causality in medical malpractice cases and to try to increase the scientificity and reliability of identification opinions. METHODS We collected 13,245 written judgments from PKULAW.COM, a public database. 963 cases were included after the initial screening. 21 medical and ten patient factors were selected as characteristic variables by summarising previous literature and cases. Random Forest, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) were used to establish prediction models of causality for the two data sets, respectively. Finally, the optimal model is obtained by hyperparameter tuning of the six models. RESULTS We built three real data set models and three virtual data set models by three algorithms, and their confusion matrices differed. XGBoost performed best in the real data set, with a model accuracy of 66%. In the virtual data set, the performance of XGBoost and LightGBM was basically the same, and the model accuracy rate was 80%. The overall accuracy of external verification was 72.7%. CONCLUSIONS The optimal model of this study is expected to predict the causality accurately.
Affiliation(s)
- Qingxin Yang
- School of Forensic Medicine, Kunming Medical University, Kunming, China
| | - Li Luo
- School of Forensic Medicine, Kunming Medical University, Kunming, China
| | - Zhangpeng Lin
- School of Forensic Medicine, Kunming Medical University, Kunming, China
| | - Wei Wen
- School of Forensic Medicine, Kunming Medical University, Kunming, China
| | - Wenbo Zeng
- West China Hospital of Sichuan University, Chengdu, China
| | - Hong Deng
- School of Forensic Medicine, Kunming Medical University, Kunming, China
| |
|
2
|
Cosme LV, Corley M, Johnson T, Severson DW, Yan G, Wang X, Beebe N, Maynard A, Bonizzoni M, Khorramnejad A, Martins AJ, Lima JBP, Munstermann LE, Surendran SN, Chen CH, Maringer K, Wahid I, Mukherjee S, Xu J, Fontaine MC, Estallo EL, Stein M, Livdahl T, Scaraffia PY, Carter BH, Mogi M, Tuno N, Mains JW, Medley KA, Bowles DE, Gill RJ, Eritja R, González-Obando R, Trang HTT, Boyer S, Abunyewa AM, Hackett K, Wu T, Nguyễn J, Shen J, Zhao H, Crawford JE, Armbruster P, Caccone A. A genotyping array for the globally invasive vector mosquito, Aedes albopictus. Parasit Vectors 2024; 17:106. [PMID: 38439081 PMCID: PMC10910840 DOI: 10.1186/s13071-024-06158-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 01/24/2024] [Indexed: 03/06/2024] Open
Abstract
BACKGROUND Although whole-genome sequencing (WGS) is the preferred genotyping method for most genomic analyses, limitations are often experienced when studying genomes characterized by a high percentage of repetitive elements, high linkage, and recombination deserts. The Asian tiger mosquito (Aedes albopictus), for example, has a genome comprising up to 72% repetitive elements, and therefore we set out to develop a single-nucleotide polymorphism (SNP) chip to be more cost-effective. Aedes albopictus is an invasive species originating from Southeast Asia that has recently spread around the world and is a vector for many human diseases. Developing an accessible genotyping platform is essential in advancing biological control methods and understanding the population dynamics of this pest species, with significant implications for public health. METHODS We designed a SNP chip for Ae. albopictus (Aealbo chip) based on approximately 2.7 million SNPs identified using WGS data from 819 worldwide samples. We validated the chip using laboratory single-pair crosses, comparing technical replicates, and comparing genotypes of samples genotyped by WGS and the SNP chip. We then used the chip for a population genomic analysis of 237 samples from 28 sites in the native range to evaluate its usefulness in describing patterns of genomic variation and tracing the origins of invasions. RESULTS Probes on the Aealbo chip targeted 175,396 SNPs in coding and non-coding regions across all three chromosomes, with a density of 102 SNPs per 1 Mb window, and at least one SNP in each of the 17,461 protein-coding genes. Overall, 70% of the probes captured the genetic variation. Segregation analysis found that 98% of the SNPs followed expectations of single-copy Mendelian genes. 
Comparisons with WGS indicated that sites with genotype disagreements were mostly heterozygotes at loci with WGS read depth < 20, while there was near complete agreement with WGS read depths > 20, indicating that the chip more accurately detects heterozygotes than low-coverage WGS. Sample sizes did not affect the accuracy of the SNP chip genotype calls. Ancestry analyses identified four to five genetic clusters in the native range with various levels of admixture. CONCLUSIONS The Aealbo chip is highly accurate, is concordant with genotypes from WGS with high sequence coverage, and may be more accurate than low-coverage WGS.
Affiliation(s)
- Luciano Veiga Cosme
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Margaret Corley
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Thomas Johnson
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Dave W Severson
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
| | - Guiyun Yan
- Department of Population Health and Disease Prevention, University of California, Irvine, CA, USA
| | - Xiaoming Wang
- Department of Population Health and Disease Prevention, University of California, Irvine, CA, USA
| | - Nigel Beebe
- School of the Environment, University of Queensland Australia, St Lucia, Australia
| | - Andrew Maynard
- School of the Environment, University of Queensland Australia, St Lucia, Australia
| | - Mariangela Bonizzoni
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, Pavia, Italy
| | - Ayda Khorramnejad
- Department of Biology and Biotechnology "Lazzaro Spallanzani", University of Pavia, Pavia, Italy
| | - Ademir Jesus Martins
- Laboratório de Fisiologia e Controle de Artrópodes Vetores, Instituto Oswaldo Cruz, FIOCRUZ, Rio de Janeiro, RJ, Brazil
| | - José Bento Pereira Lima
- Laboratório de Fisiologia e Controle de Artrópodes Vetores, Instituto Oswaldo Cruz, FIOCRUZ, Rio de Janeiro, RJ, Brazil
| | - Leonard E Munstermann
- Yale School of Public Health and Yale Peabody Museum, Yale University, New Haven, CT, USA
| | | | - Chun-Hong Chen
- National Health Research Institutes, National Mosquito-Borne Disease Control Research Center & National Institute of Infectious Diseases and Vaccinology, Miaoli, Taiwan
| | | | - Isra Wahid
- Center for Zoonotic and Emerging Diseases, Hasanuddin University Medical Research Centre (HUMRC), Makassar, Indonesia
| | - Shomen Mukherjee
- Mitrani Department of Desert Ecology, Jacob Blaustein Institutes of Desert Research, Ben-Gurion University of the Negev, Midreshet Ben-Gurion, Israel
- Biological and Life Sciences Division, School of Arts and Sciences, Ahmedabad University, Ahmedabad, Gujarat, India
| | - Jiannon Xu
- Department of Biology, New Mexico State University, Las Cruces, NM, USA
| | - Michael C Fontaine
- MIVEGEC, Université de Montpellier, CNRS, IRD, Montpellier, France
- University of Groningen, Groningen Institute for Evolutionary Life Sciences, Groningen, The Netherlands
| | - Elizabet L Estallo
- Facultad de Ciencias Exactas, Físicas y Naturales, Centro de Investigaciones Entomológicas de Córdoba, Universidad Nacional de Córdoba, Córdoba, Argentina
- Instituto de Investigaciones Biológicas y Tecnológicas, Consejo Nacional de Investigaciones Científicas y Técnicas, Universidad Nacional de Córdoba, Córdoba, Argentina
| | - Marina Stein
- Instituto de Medicina Regional, Universidad Nacional del Nordeste, CONICET CCT Nordeste, Resistencia, Argentina
| | | | - Patricia Y Scaraffia
- School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
| | - Brendan H Carter
- School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
| | - Motoyoshi Mogi
- Division of Parasitology, Faculty of Medicine, Saga University, Nabeshima, Saga, Japan
| | - Nobuko Tuno
- Laboratory of Ecology, Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa, Japan
| | | | - Kim A Medley
- Tyson Research Center, Washington University in St. Louis, St. Louis, USA
| | | | - Richard J Gill
- Department of Life Sciences, Georgina Mace Centre for the Living Planet, Imperial College London, Berkshire, UK
| | - Roger Eritja
- Centre d'Estudis Avançats de Blanes, Consejo Superior de Investigaciones Científicas, Blanes, Spain
| | | | - Huynh T T Trang
- Department of Medical Entomology and Zoonotics, Pasteur Institute in Ho Chi Minh City, Ho Chi Minh City, Vietnam
| | - Sébastien Boyer
- Medical Entomology Unit, Institut Pasteur du Cambodge, Phnom Penh, Cambodia
| | - Ann-Marie Abunyewa
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Kayleigh Hackett
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Tina Wu
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Justin Nguyễn
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| | - Jiangnan Shen
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06510, USA
| | - Hongyu Zhao
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06510, USA
- Department of Genetics, Yale University School of Medicine, New Haven, CT, 06510, USA
| | | | - Peter Armbruster
- Department of Biology, Georgetown University, Washington, DC, USA
| | - Adalgisa Caccone
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520-8105, USA
| |
|
3
|
Huang X, Rymbekova A, Dolgova O, Lao O, Kuhlwilm M. Harnessing deep learning for population genetic inference. Nat Rev Genet 2024; 25:61-78. [PMID: 37666948 DOI: 10.1038/s41576-023-00636-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 07/11/2023] [Indexed: 09/06/2023]
Abstract
In population genetics, the emergence of large-scale genomic data for various species and populations has provided new opportunities to understand the evolutionary forces that drive genetic diversity using statistical inference. However, the era of population genomics presents new challenges in analysing the massive amounts of genomes and variants. Deep learning has demonstrated state-of-the-art performance for numerous applications involving large-scale data. Recently, deep learning approaches have gained popularity in population genetics; facilitated by the advent of massive genomic data sets, powerful computational hardware and complex deep learning architectures, they have been used to identify population structure, infer demographic history and investigate natural selection. Here, we introduce common deep learning architectures and provide comprehensive guidelines for implementing deep learning models for population genetic inference. We also discuss current challenges and future directions for applying deep learning in population genetics, focusing on efficiency, robustness and interpretability.
Affiliation(s)
- Xin Huang
- Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
| | - Aigerim Rymbekova
- Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
| | - Olga Dolgova
- Integrative Genomics Laboratory, CIC bioGUNE - Centro de Investigación Cooperativa en Biociencias, Derio, Biscaya, Spain
| | - Oscar Lao
- Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
| | - Martin Kuhlwilm
- Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
| |
|
4
|
Bonet D, Levin M, Montserrat DM, Ioannidis AG. Machine Learning Strategies for Improved Phenotype Prediction in Underrepresented Populations. bioRxiv 2023:2023.10.12.561949. [PMID: 37904983 PMCID: PMC10614800 DOI: 10.1101/2023.10.12.561949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Precision medicine models often perform better for populations of European ancestry due to the over-representation of this group in the genomic datasets and large-scale biobanks from which the models are constructed. As a result, prediction models may misrepresent or provide less accurate treatment recommendations for underrepresented populations, contributing to health disparities. This study introduces an adaptable machine learning toolkit that integrates multiple existing methodologies and novel techniques to enhance the prediction accuracy for underrepresented populations in genomic datasets. By leveraging machine learning techniques, including gradient boosting and automated methods, coupled with novel population-conditional re-sampling techniques, our method significantly improves the phenotypic prediction from single nucleotide polymorphism (SNP) data for diverse populations. We evaluate our approach using the UK Biobank, which is composed primarily of British individuals with European ancestry, and a minority representation of groups with Asian and African ancestry. Performance metrics demonstrate substantial improvements in phenotype prediction for underrepresented groups, achieving prediction accuracy comparable to that of the majority group. This approach represents a significant step towards improving prediction accuracy amidst current dataset diversity challenges. By integrating a tailored pipeline, our approach fosters more equitable validity and utility of statistical genetics methods, paving the way for more inclusive models and outcomes.
Affiliation(s)
- David Bonet
- Stanford University, Stanford, CA, US
- Universitat Politècnica de Catalunya, Barcelona, Spain
| | - May Levin
- Stanford University, Stanford, CA, US
| | | | - Alexander G Ioannidis
- Stanford University, Stanford, CA, US
- University of California Santa Cruz, Santa Cruz, CA, US
| |
|