1
Asatryan B, Murray B, Tadros R, Rieder M, Shah RA, Sharaf Dabbagh G, Landstrom AP, Dobner S, Munroe PB, Haggerty CM, Medeiros-Domingo A, Owens AT, Kullo IJ, Semsarian C, Reichlin T, Barth AS, Roden DM, James CA, Ware JS, Chahal CAA. Promise and Peril of a Genotype-First Approach to Mendelian Cardiovascular Disease. J Am Heart Assoc 2024; 13:e033557. [PMID: 39424414 DOI: 10.1161/jaha.123.033557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2024]
Abstract
Precision medicine, which among other aspects includes an individual's genomic data in diagnosis and management, has become the standard-of-care for Mendelian cardiovascular disease (CVD). However, early identification and management of asymptomatic patients with potentially lethal and manageable Mendelian CVD through screening, which is the promise of precision health, remains an unsolved challenge. The reduced costs of genomic sequencing have enabled the creation of biobanks containing in-depth genetic and health information, which have facilitated the understanding of genetic variation, penetrance, and expressivity, moving us closer to the genotype-first screening of asymptomatic individuals for Mendelian CVD. This approach could transform health care by diagnostic refinement and facilitating prevention or therapeutic interventions. Yet, potential benefits must be weighed against the potential risks, which include evolving variant pathogenicity assertion or identification of variants with low disease penetrance; costly, stressful, and inappropriate diagnostic evaluations; negative psychological impact; disqualification for employment or of competitive sports; and denial of insurance. Furthermore, the natural history of Mendelian CVD is often unpredictable, making identification of those who will benefit from preventive measures a priority. Currently, there is insufficient evidence that population-based genetic screening for Mendelian CVD can reduce adverse outcomes at a reasonable cost to an extent that outweighs the harms of true-positive and false-positive results. Besides technical, clinical, and financial burdens, ethical and legal aspects pose unprecedented challenges. This review highlights key developments in the field of genotype-first approaches to Mendelian CVD and summarizes challenges with potential solutions that can pave the way for implementing this approach for clinical care.
Affiliation(s)
- Babken Asatryan
  - Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Cardiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Brittney Murray
  - Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Rafik Tadros
  - Cardiovascular Genetics Centre, Montréal Heart Institute, Montréal, Québec, Canada
- Marina Rieder
  - Department of Cardiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Ravi A Shah
  - Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, London, United Kingdom
- Ghaith Sharaf Dabbagh
  - Center for Inherited Cardiovascular Diseases, WellSpan Health, Lancaster, PA, USA
  - Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, USA
- Andrew P Landstrom
  - Division of Cardiology, Department of Pediatrics, and Department of Cell Biology, Duke University School of Medicine, Durham, NC, USA
- Stephan Dobner
  - Department of Cardiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Patricia B Munroe
  - NIHR Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Christopher M Haggerty
  - Department of Translational Data Science and Informatics, Heart Institute, Geisinger, Danville, PA, USA
- Anjali T Owens
  - Center for Inherited Cardiovascular Disease, Cardiovascular Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Iftikhar J Kullo
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
- Christopher Semsarian
  - Agnes Ginges Centre for Molecular Cardiology at Centenary Institute, The University of Sydney, Sydney, New South Wales, Australia
  - Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
  - Department of Cardiology, Royal Prince Alfred Hospital, Sydney, New South Wales, Australia
- Tobias Reichlin
  - Department of Cardiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Andreas S Barth
  - Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Dan M Roden
  - Department of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Cynthia A James
  - Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- James S Ware
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - National Heart and Lung Institute & MRC London Institute of Medical Sciences, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
  - Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, United Kingdom
- C Anwar A Chahal
  - Center for Inherited Cardiovascular Diseases, WellSpan Health, Lancaster, PA, USA
  - NIHR Barts Biomedical Research Centre, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
  - Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, West Smithfield, United Kingdom
2
Ioannou KI, Constantinidou A, Chatzittofis A. Genetic testing in psychiatry, the perceptions of healthcare workers and patients: a mini review. Front Public Health 2024; 12:1466585. [PMID: 39450380 PMCID: PMC11499203 DOI: 10.3389/fpubh.2024.1466585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/25/2024] [Indexed: 10/26/2024] Open
Abstract
Background: Genetic testing in psychiatry has gained attention, raising questions about its application and impact. Understanding stakeholders' perspectives, including healthcare providers and patients, is vital for informed policy development. The aim of this systematic review was to focus on the perceptions and concerns of patients and healthcare workers in psychiatry regarding the use of genetic testing. Methods: We conducted a systematic review following PRISMA guidelines, for the period 1/2/2014 to 1/1/2024, via PubMed and Embase databases, identifying 50 articles in total. After excluding duplicates (n = 12), 38 articles went through screening. After careful full-text article assessment for eligibility and applying the inclusion and exclusion criteria, only fifteen (n = 15) of the articles were included. Results: Among 15 selected studies involving 3,156 participants (2,347 healthcare professionals; 809 patients), thematic analysis identified four primary themes: Organizational-implementation concerns, Ethical Considerations, Concerns on changes in clinical praxis, and Legal implications. Despite these concerns, seven out of eleven studies indicated that healthcare workers viewed genetic testing in psychiatry positively. Patients' perspectives varied, with two of the four studies reflecting positive attitudes. No pervasive negative sentiment was observed. Conclusion: Our review highlights the multidimensional perspectives of healthcare professionals and patients surrounding the application of genetic testing in psychiatry. These considerations need to be addressed to facilitate the implementation of genetic testing in clinical praxis in psychiatry. Further research is needed for validation of the results and to guide policies and clinicians in the integration of genetic testing into mental healthcare practice.
Affiliation(s)
- Andreas Chatzittofis
  - Medical School, University of Cyprus, Nicosia, Cyprus
  - Department of Clinical Sciences and Psychiatry, Umeå University, Umeå, Sweden
3
Scarpa F, Casu M. Genomics and Bioinformatics in One Health: Transdisciplinary Approaches for Health Promotion and Disease Prevention. International Journal of Environmental Research and Public Health 2024; 21:1337. [PMID: 39457310 PMCID: PMC11507412 DOI: 10.3390/ijerph21101337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 10/02/2024] [Accepted: 10/07/2024] [Indexed: 10/28/2024]
Abstract
The One Health concept underscores the interconnectedness of human, animal, and environmental health, necessitating an integrated, transdisciplinary approach to tackle contemporary health challenges. This perspective paper explores the pivotal role of genomics and bioinformatics in advancing One Health initiatives. By leveraging genomic technologies and bioinformatics tools, researchers can decode complex biological data, enabling comprehensive insights into pathogen evolution, transmission dynamics, and host-pathogen interactions across species and environments (or ecosystems). These insights are crucial for predicting and mitigating zoonotic disease outbreaks, understanding antimicrobial resistance patterns, and developing targeted interventions for health promotion and disease prevention. Furthermore, integrating genomic data with environmental and epidemiological information enhances the precision of public health responses. Here we discuss case studies demonstrating successful applications of genomics and bioinformatics in One Health contexts, such as including data integration, standardization, and ethical considerations in genomic research. By fostering collaboration among geneticists, bioinformaticians, epidemiologists, zoologists, and data scientists, the One Health approach can harness the full potential of genomics and bioinformatics to safeguard global health. This perspective underscores the necessity of continued investment in interdisciplinary education, research infrastructure, and policy frameworks to effectively employ these technologies in the service of a healthier planet.
Affiliation(s)
- Fabio Scarpa
  - Department of Biomedical Sciences, University of Sassari, 07100 Sassari, Italy
- Marco Casu
  - Department of Veterinary Medicine, University of Sassari, 07100 Sassari, Italy
4
Alkhatib R, Gaede KI. Data Management in Biobanking: Strategies, Challenges, and Future Directions. BioTech 2024; 13:34. [PMID: 39311336 PMCID: PMC11417763 DOI: 10.3390/biotech13030034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 08/23/2024] [Accepted: 08/31/2024] [Indexed: 09/26/2024] Open
Abstract
Biobanking plays a pivotal role in biomedical research by providing standardized processing, precise storing, and management of biological sample collections along with the associated data. Effective data management is a prerequisite to ensure the integrity, quality, and accessibility of these resources. This review provides a current landscape of data management in biobanking, discussing key challenges, existing strategies, and potential future directions. We explore multiple aspects of data management, including data collection, storage, curation, sharing, and ethical considerations. By examining the evolving technologies and methodologies in biobanking, we aim to provide insights into addressing the complexities and maximizing the utility of biobank data for research and clinical applications.
Affiliation(s)
- Ramez Alkhatib
  - Biomaterial Bank Nord, Research Center Borstel Leibniz Lung Center, Parkallee 35, 23845 Borstel, Germany
  - German Centre for Lung Research (DZL), Airway Research Centre North (ARCN), 22927 Großhansdorf, Germany
- Karoline I. Gaede
  - Biomaterial Bank Nord, Research Center Borstel Leibniz Lung Center, Parkallee 35, 23845 Borstel, Germany
  - German Centre for Lung Research (DZL), Airway Research Centre North (ARCN), 22927 Großhansdorf, Germany
  - PopGen 2.0 Biobanking Network (P2N), University Hospital Schleswig-Holstein, Campus Kiel, Kiel University, 24105 Kiel, Germany
5
de Groot NF. A contextual integrity approach to genomic information: what bioethics can learn from big data ethics. Medicine, Health Care, and Philosophy 2024; 27:367-379. [PMID: 38865053 PMCID: PMC11310229 DOI: 10.1007/s11019-024-10211-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/27/2024] [Indexed: 06/13/2024]
Abstract
Genomic data is generated, processed and analysed at an increasingly rapid pace. This data is not limited to the medical context, but plays an important role in other contexts in society, such as commercial DNA testing, the forensic setting, archaeological research, and genetic surveillance. Genomic information also crosses the borders of these domains, e.g. forensic use of medical genetic information, insurance use of medical genomic information, or research use of commercial genomic data. This paper (1) argues that an informed consent approach for genomic information has limitations in many societal contexts, and (2) seeks to broaden the bioethical debate on genomic information by suggesting an approach that is applicable across multiple societal contexts. I argue that the contextual integrity framework, a theory rooted in information technology and big data ethics, is an effective tool to explore ethical challenges that arise from genomic information within a variety of different contexts. Rather than focusing on individual control over information, the contextual integrity approach holds that information should be shared and protected according to the norms that govern certain distinct social contexts. Several advantages of this contextual integrity approach will be discussed. The paper concludes that the contextual integrity framework helps to articulate and address a broad spectrum of ethical, social, and political factors in a variety of different societal contexts, while giving consideration to the interests of individuals, groups, and society at large.
Affiliation(s)
- Nina F de Groot
  - Department of Philosophy, Faculty of Humanities, VU University Amsterdam, Amsterdam, the Netherlands
6
Federico CA, Trotsyuk AA. Biomedical Data Science, Artificial Intelligence, and Ethics: Navigating Challenges in the Face of Explosive Growth. Annu Rev Biomed Data Sci 2024; 7:1-14. [PMID: 38598860 DOI: 10.1146/annurev-biodatasci-102623-104553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Advances in biomedical data science and artificial intelligence (AI) are profoundly changing the landscape of healthcare. This article reviews the ethical issues that arise with the development of AI technologies, including threats to privacy, data security, consent, and justice, as they relate to donors of tissue and data. It also considers broader societal obligations, including the importance of assessing the unintended consequences of AI research in biomedicine. In addition, this article highlights the challenge of rapid AI development against the backdrop of disparate regulatory frameworks, calling for a global approach to address concerns around data misuse, unintended surveillance, and the equitable distribution of AI's benefits and burdens. Finally, a number of potential solutions to these ethical quandaries are offered. Namely, the merits of advocating for a collaborative, informed, and flexible regulatory approach that balances innovation with individual rights and public welfare, fostering a trustworthy AI-driven healthcare ecosystem, are discussed.
Affiliation(s)
- Carole A Federico
  - Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
- Artem A Trotsyuk
  - Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
7
Cho H, Froelicher D, Dokmai N, Nandi A, Sadhuka S, Hong MM, Berger B. Privacy-Enhancing Technologies in Biomedical Data Science. Annu Rev Biomed Data Sci 2024; 7:317-343. [PMID: 39178425 PMCID: PMC11346580 DOI: 10.1146/annurev-biodatasci-120423-120107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
The rapidly growing scale and variety of biomedical data repositories raise important privacy concerns. Conventional frameworks for collecting and sharing human subject data offer limited privacy protection, often necessitating the creation of data silos. Privacy-enhancing technologies (PETs) promise to safeguard these data and broaden their usage by providing means to share and analyze sensitive data while protecting privacy. Here, we review prominent PETs and illustrate their role in advancing biomedicine. We describe key use cases of PETs and their latest technical advances and highlight recent applications of PETs in a range of biomedical domains. We conclude by discussing outstanding challenges and social considerations that need to be addressed to facilitate a broader adoption of PETs in biomedical data science.
Affiliation(s)
- Hyunghoon Cho
  - Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- David Froelicher
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Natnatee Dokmai
  - Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- Anupama Nandi
  - Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- Shuvom Sadhuka
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Matthew M Hong
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
8
Li W, Chen H, Jiang X, Harmanci A. FedGMMAT: Federated generalized linear mixed model association tests. PLoS Comput Biol 2024; 20:e1012142. [PMID: 39047024 PMCID: PMC11299833 DOI: 10.1371/journal.pcbi.1012142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 08/05/2024] [Accepted: 05/07/2024] [Indexed: 07/27/2024] Open
Abstract
Increasing genetic and phenotypic data size is critical for understanding the genetic determinants of diseases. Evidently, establishing practical means for collaboration and data sharing among institutions is a fundamental methodological barrier for performing high-powered studies. As the sample sizes become more heterogeneous, complex statistical approaches, such as generalized linear mixed effects models, must be used to correct for the confounders that may bias results. On another front, due to the privacy concerns around Protected Health Information (PHI), the sharing of genetic information is restricted by regulations such as the Health Insurance Portability and Accountability Act (HIPAA). This limits data sharing among institutions and hampers efforts around executing high-powered collaborative studies. Federated approaches are promising to alleviate the issues around privacy and performance, since sensitive data never leaves the local sites. Motivated by these considerations, we developed FedGMMAT, a federated genetic association testing tool that utilizes a federated statistical testing approach for efficient association tests that can correct for confounding fixed and additive polygenic random effects among different collaborating sites. Genetic data is never shared among collaborating sites, and the intermediate statistics are protected by encryption. Using simulated and real datasets, we demonstrate that FedGMMAT can achieve virtually the same results as pooled analysis under a privacy-preserving framework with practical resource requirements.
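The federated idea in this abstract (individual-level data stays at each site and only intermediate statistics are exchanged) can be illustrated with a much simpler model than the one the paper uses. The sketch below is not FedGMMAT: it assumes an ordinary least-squares regression of a phenotype on a single genotype dosage, omits the mixed-model random effects and the encryption the abstract mentions, and uses hypothetical names (`local_summary`, `federated_slope`, `pooled_slope`) and made-up toy data. It only shows that summing per-site aggregates reproduces the pooled estimate.

```python
from typing import List, Tuple

# Hypothetical toy representation: each site holds (genotype dosage, phenotype) pairs.
Site = List[Tuple[float, float]]
Summary = Tuple[int, float, float, float, float]


def local_summary(site: Site) -> Summary:
    """Runs inside a site; only these five aggregates would leave it."""
    n = len(site)
    sx = sum(x for x, _ in site)
    sy = sum(y for _, y in site)
    sxx = sum(x * x for x, _ in site)
    sxy = sum(x * y for x, y in site)
    return n, sx, sy, sxx, sxy


def federated_slope(summaries: List[Summary]) -> float:
    """Coordinator step: add up the site aggregates and solve for the OLS slope."""
    n = sum(s[0] for s in summaries)
    sx = sum(s[1] for s in summaries)
    sy = sum(s[2] for s in summaries)
    sxx = sum(s[3] for s in summaries)
    sxy = sum(s[4] for s in summaries)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)


def pooled_slope(sites: List[Site]) -> float:
    """Reference: OLS slope computed on the concatenated individual-level data."""
    rows = [row for site in sites for row in site]
    mean_x = sum(x for x, _ in rows) / len(rows)
    mean_y = sum(y for _, y in rows) / len(rows)
    num = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    den = sum((x - mean_x) ** 2 for x, _ in rows)
    return num / den


# Made-up data for two sites.
site_a: Site = [(0.0, 1.0), (1.0, 1.9), (2.0, 3.2)]
site_b: Site = [(0.0, 0.8), (1.0, 2.1), (2.0, 2.9), (1.0, 2.0)]

fed = federated_slope([local_summary(site_a), local_summary(site_b)])
pooled = pooled_slope([site_a, site_b])
print(abs(fed - pooled) < 1e-9)
```

The two estimates agree up to floating-point rounding because the least-squares slope depends on the data only through sums, which are additive across sites. Models with random effects, as described in the abstract, need iterative exchange of more elaborate statistics, which this sketch does not attempt.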
Affiliation(s)
- Wentao Li
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Han Chen
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
  - School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Xiaoqian Jiang
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Arif Harmanci
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
9
Ho CH. Secondary Use of Health Data for Medical AI: A Cross-Regional Examination of Taiwan and the EU. Asian Bioeth Rev 2024; 16:407-422. [PMID: 39022371 PMCID: PMC11250748 DOI: 10.1007/s41649-024-00279-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 01/08/2024] [Accepted: 01/10/2024] [Indexed: 07/20/2024] Open
Abstract
This paper conducts a comparative analysis of data governance mechanisms concerning the secondary use of health data in Taiwan and the European Union (EU). Both regions have adopted distinctive approaches and regulations for utilizing health data beyond primary care, encompassing areas such as medical research and healthcare system enhancement. Through an examination of these models, this study seeks to elucidate the strategies, frameworks, and legal structures employed by Taiwan and the EU to strike a delicate balance between the imperative of data-driven healthcare innovation and the safeguarding of individual privacy rights. This paper examines and compares several key aspects of the secondary use of health data in Taiwan and the EU. These aspects include data governance frameworks, legal and regulatory frameworks, data access and sharing mechanisms, and privacy and security considerations. This comparative exploration offers invaluable insights into the evolving global landscape of health data governance. It provides a deeper understanding of the strategies implemented by these regions to harness the potential of health data while upholding the ethical and legal considerations surrounding its secondary use. The findings aim to inform best practices for responsible and effective health data utilization, particularly in the context of medical AI applications.
Affiliation(s)
- Chih-hsing Ho
  - Institute of European and American Studies, Academia Sinica, Taipei, Taiwan
10
Ashenden AJ, Chowdhury A, Anastasi LT, Lam K, Rozek T, Ranieri E, Siu CWK, King J, Mas E, Kassahn KS. The Multi-Omic Approach to Newborn Screening: Opportunities and Challenges. Int J Neonatal Screen 2024; 10:42. [PMID: 39051398 PMCID: PMC11270328 DOI: 10.3390/ijns10030042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 06/13/2024] [Accepted: 06/13/2024] [Indexed: 07/27/2024] Open
Abstract
Newborn screening programs have seen significant evolution since their initial implementation more than 60 years ago, with the primary goal of detecting treatable conditions within the earliest possible timeframe to ensure the optimal treatment and outcomes for the newborn. New technologies have driven the expansion of screening programs to cover additional conditions. In the current era, the breadth of screened conditions could be further expanded by integrating omic technologies such as untargeted metabolomics and genomics. Genomic screening could offer opportunities for lifelong care beyond the newborn period. For genomic newborn screening to be effective and ready for routine adoption, it must overcome barriers such as implementation cost, public acceptability, and scalability. Metabolomics approaches, on the other hand, can offer insight into disease phenotypes and could be used to identify known and novel biomarkers of disease. Given recent advances in metabolomic technologies, alongside advances in genomics including whole-genome sequencing, the combination of complementary multi-omic approaches may provide an exciting opportunity to leverage the best of both approaches and overcome their respective limitations. These techniques are described, along with the current outlook on multi-omic-based NBS research.
Affiliation(s)
- Alex J. Ashenden
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
- Ayesha Chowdhury
  - Department of Molecular Pathology, SA Pathology, Adelaide, SA 5000, Australia
- Lucy T. Anastasi
  - Department of Molecular Pathology, SA Pathology, Adelaide, SA 5000, Australia
- Khoa Lam
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5000, Australia
- Tomas Rozek
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
- Enzo Ranieri
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
- Carol Wai-Kwan Siu
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5000, Australia
- Jovanka King
  - Immunology Directorate, SA Pathology, Adelaide, SA 5000, Australia
  - Department of Allergy and Clinical Immunology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
  - Discipline of Paediatrics, Women’s and Children’s Hospital, The University of Adelaide, Adelaide, SA 5006, Australia
- Emilie Mas
  - Department of Biochemical Genetics, SA Pathology, Women’s and Children’s Hospital, Adelaide, SA 5006, Australia
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5000, Australia
- Karin S. Kassahn
  - Department of Molecular Pathology, SA Pathology, Adelaide, SA 5000, Australia
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5000, Australia
11
Brauneck A, Schmalhorst L, Weiss S, Baumbach L, Völker U, Ellinghaus D, Baumbach J, Buchholtz G. Legal aspects of privacy-enhancing technologies in genome-wide association studies and their impact on performance and feasibility. Genome Biol 2024; 25:154. [PMID: 38872191 PMCID: PMC11170858 DOI: 10.1186/s13059-024-03296-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 06/03/2024] [Indexed: 06/15/2024] Open
Abstract
Genomic data holds huge potential for medical progress but requires strict safety measures due to its sensitive nature to comply with data protection laws. This conflict is especially pronounced in genome-wide association studies (GWAS) which rely on vast amounts of genomic data to improve medical diagnoses. To ensure both their benefits and sufficient data security, we propose a federated approach in combination with privacy-enhancing technologies utilising the findings from a systematic review on federated learning and legal regulations in general and applying these to GWAS.
Affiliation(s)
- Alissa Brauneck
  - Hamburg University Faculty of Law, University of Hamburg, Hamburg, Germany
- Louisa Schmalhorst
  - Hamburg University Faculty of Law, University of Hamburg, Hamburg, Germany
- Stefan Weiss
  - Interfaculty Institute of Genetics and Functional Genomics, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Linda Baumbach
  - Department of Health Economics and Health Services Research, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Uwe Völker
  - Interfaculty Institute of Genetics and Functional Genomics, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- David Ellinghaus
  - Institute of Clinical Molecular Biology (IKMB), Kiel University and University Medical Center Schleswig-Holstein, Kiel, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Gabriele Buchholtz
  - Hamburg University Faculty of Law, University of Hamburg, Hamburg, Germany
12
Silva L, Pacheco T, Araújo E, Duarte RJ, Ribeiro-Vaz I, Ferreira-da-Silva R. Unveiling the future: precision pharmacovigilance in the era of personalized medicine. Int J Clin Pharm 2024; 46:755-760. [PMID: 38416349 PMCID: PMC11133017 DOI: 10.1007/s11096-024-01709-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 01/30/2024] [Indexed: 02/29/2024]
Abstract
In the era of personalized medicine, pharmacovigilance faces new challenges and opportunities, demanding a shift from traditional approaches. This article delves into the evolving landscape of drug safety monitoring in the context of personalized treatments. We aim to provide a succinct reflection on the intersection of tailored therapeutic strategies and vigilant pharmacovigilance practices. We discuss the integration of pharmacogenetics in enhancing drug safety, illustrating how genetic profiling aids in predicting drug responses and adverse reactions. Emphasizing the importance of phase IV-post-marketing surveillance, we explore the limitations of pre-marketing trials and the necessity for a comprehensive approach to drug safety. The article discusses the pivotal role of pharmacogenetics in pre-exposure risk management and the redefinition of pharmacoepidemiological methods for post-exposure surveillance. We highlight the significance of integrating patient-specific genetic profiles in creating personalized medication leaflets and the use of advanced computational methods in data analysis. Additionally, we examine the ethical, privacy, and data security challenges inherent in precision medicine, emphasizing their implications for patient consent and data management.
Affiliation(s)
- Lurdes Silva
- Faculty of Pharmacy of the University of Porto, Porto, Portugal
- Teresa Pacheco
- Faculty of Pharmacy of the University of Porto, Porto, Portugal
- Emília Araújo
- Palliative Care Service, Portuguese Oncology Institute of Porto (IPO Porto), Porto, Portugal
- Center for Health Technology and Services Research, Associate Laboratory RISE - Health Research Network (CINTESIS@RISE), Porto, Portugal
- Inês Ribeiro-Vaz
- Center for Health Technology and Services Research, Associate Laboratory RISE - Health Research Network (CINTESIS@RISE), Porto, Portugal
- Porto Pharmacovigilance Centre, Faculty of Medicine of the University of Porto, Porto, Portugal
- Department of Community Medicine, Health Information and Decision, Faculty of Medicine of the University of Porto, Porto, Portugal
- Renato Ferreira-da-Silva
- Center for Health Technology and Services Research, Associate Laboratory RISE - Health Research Network (CINTESIS@RISE), Porto, Portugal
- Porto Pharmacovigilance Centre, Faculty of Medicine of the University of Porto, Porto, Portugal
- Department of Community Medicine, Health Information and Decision, Faculty of Medicine of the University of Porto, Porto, Portugal
13
Thomas M, Mackes N, Preuss-Dodhy A, Wieland T, Bundschus M. Assessing Privacy Vulnerabilities in Genetic Data Sets: Scoping Review. JMIR Bioinformatics and Biotechnology 2024; 5:e54332. [PMID: 38935957 PMCID: PMC11165293 DOI: 10.2196/54332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 03/26/2024] [Accepted: 03/29/2024] [Indexed: 06/29/2024]
Abstract
BACKGROUND Genetic data are widely considered inherently identifiable. However, genetic data sets come in many shapes and sizes, and the feasibility of privacy attacks depends on their specific content. Assessing the reidentification risk of genetic data is complex, yet there is a lack of guidelines or recommendations that support data processors in performing such an evaluation. OBJECTIVE This study aims to gain a comprehensive understanding of the privacy vulnerabilities of genetic data and create a summary that can guide data processors in assessing the privacy risk of genetic data sets. METHODS We conducted a 2-step search, in which we first identified 21 reviews published between 2017 and 2023 on the topic of genomic privacy and then analyzed all references cited in the reviews (n=1645) to identify 42 unique original research studies that demonstrate a privacy attack on genetic data. We then evaluated the type and components of genetic data exploited for these attacks as well as the effort and resources needed for their implementation and their probability of success. RESULTS From our literature review, we derived 9 nonmutually exclusive features of genetic data that are both inherent to any genetic data set and informative about privacy risk: biological modality, experimental assay, data format or level of processing, germline versus somatic variation content, content of single nucleotide polymorphisms, short tandem repeats, aggregated sample measures, structural variants, and rare single nucleotide variants. CONCLUSIONS On the basis of our literature review, the evaluation of these 9 features covers the great majority of privacy-critical aspects of genetic data and thus provides a foundation and guidance for assessing genetic data risk.
14
Zhou J, Huang C, Gao X. Patient privacy in AI-driven omics methods. Trends Genet 2024; 40:383-386. [PMID: 38637270 DOI: 10.1016/j.tig.2024.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 03/18/2024] [Accepted: 03/19/2024] [Indexed: 04/20/2024]
Abstract
Artificial intelligence (AI) in omics analysis raises privacy threats to patients. Here, we briefly discuss risk factors to patient privacy in data sharing, model training, and release, as well as methods to safeguard and evaluate patient privacy in AI-driven omics methods.
Affiliation(s)
- Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia; Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Chao Huang
- Ningbo Institute of Information Technology Application, Chinese Academy of Sciences (CAS), Ningbo, China
- Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia; Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
15
Baek J, Lawson J, Rahimzadeh V. Investigating the Roles and Responsibilities of Institutional Signing Officials After Data Sharing Policy Reform for Federally Funded Research in the United States: National Survey. JMIR Form Res 2024; 8:e49822. [PMID: 38506894 PMCID: PMC10993121 DOI: 10.2196/49822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 01/04/2024] [Accepted: 01/07/2024] [Indexed: 03/21/2024] Open
Abstract
BACKGROUND New federal policies along with rapid growth in data generation, storage, and analysis tools are together driving scientific data sharing in the United States. At the same time, triangulating human research data from diverse sources can also create situations where data are used for future research in ways that individuals and communities may consider objectionable. Institutional gatekeepers, namely, signing officials (SOs), are therefore at the helm of compliant management and sharing of human data for research. Of those with data governance responsibilities, SOs most often serve as signatories for investigators who deposit, access, and share research data between institutions. Although SOs play important leadership roles in compliant data sharing, we know surprisingly little about their scope of work, roles, and oversight responsibilities. OBJECTIVE The purpose of this study was to describe existing institutional policies and practices of US SOs who manage human genomic data access, as well as how these may change in the wake of new Data Management and Sharing requirements for National Institutes of Health-funded research in the United States. METHODS We administered an anonymous survey to institutional SOs recruited from biomedical research institutions across the United States. Survey items probed where data generated from extramurally funded research are deposited, how researchers outside the institution access these data, and what happens to these data after extramural funding ends. RESULTS In total, 56 institutional SOs participated in the survey. We found that SOs frequently approve duplicate data deposits and impose stricter access controls when data use limitations are unclear or unspecified. In addition, 21% (n=12) of SOs knew where data from federally funded projects are deposited after project funding sunsets. As a consequence, most investigators deposit their scientific data into "a National Institutes of Health-funded repository" to meet the Data Management and Sharing requirements but also within the "institution's own repository" or a third-party repository. CONCLUSIONS Our findings inform 5 policy recommendations and best practices for US SOs to improve coordination and develop comprehensive and consistent data governance policies that balance the need for scientific progress with effective human data protections.
Affiliation(s)
- Vasiliki Rahimzadeh
- Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX, United States
16
Huang D, Ye X, Sakurai T. Multi-party collaborative drug discovery via federated learning. Comput Biol Med 2024; 171:108181. [PMID: 38428094 DOI: 10.1016/j.compbiomed.2024.108181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/28/2024] [Accepted: 02/18/2024] [Indexed: 03/03/2024]
Abstract
In the field of drug discovery and pharmacology research, precise and rapid prediction of drug-target binding affinity (DTA) and drug-drug interaction (DDI) are essential for drug efficacy and safety. However, pharmacological data are often distributed across different institutions. Moreover, due to concerns regarding data privacy and intellectual property, the sharing of pharmacological data is often restricted. It is difficult for institutions to achieve the desired performance by solely utilizing their data. This urgent challenge calls for a solution that not only enhances collaboration between multiple institutions to improve prediction accuracy but also safeguards data privacy. In this study, we propose a novel federated learning (FL) framework to advance the prediction of DTA and DDI, namely FL-DTA and FL-DDI. The proposed framework enables multiple institutions to collaboratively train a predictive model without the need to share their local data. Moreover, to ensure data privacy, we employ secure multi-party computation (MPC) during the federated learning model aggregation phase. We evaluated the proposed method on two DTA and one DDI benchmark datasets and compared them with centralized learning and local learning. The experimental results indicate that the proposed method performs closely to centralized learning, and significantly outperforms local learning. Moreover, the proposed framework ensures data security while promoting collaboration among institutions, thereby accelerating the drug discovery process.
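The abstract above mentions secure multi-party computation during model aggregation but does not say which protocol is used. As a rough illustration only, the sketch below uses additive secret sharing, one common building block for secure aggregation; it is not the FL-DTA/FL-DDI implementation. All names (`PRIME`, `SCALE`, `make_shares`, `secure_mean`) and the fixed-point encoding are assumptions made for this example.

```python
import random

# Illustrative assumptions, not taken from the cited paper:
PRIME = 2**61 - 1   # modulus of the finite field the shares live in
SCALE = 10**6       # fixed-point scale used to map floats to integers


def encode(x):
    """Map a float to a field element (negative values wrap around)."""
    return int(round(x * SCALE)) % PRIME


def decode(v):
    """Map a field element back to a float, undoing the wrap-around."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def make_shares(value, n_parties, rng):
    """Split one field element into n additive shares that sum to it mod PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_mean(client_updates, rng):
    """Average client update vectors without any party seeing a raw update.

    Each client splits every coordinate into shares, one per party. Each
    party only ever adds up the shares it received; combining the partial
    sums reveals the total, not the individual contributions.
    """
    n = len(client_updates)
    dim = len(client_updates[0])
    partial = [[0] * dim for _ in range(n)]  # partial[j][k]: party j, coordinate k
    for update in client_updates:
        for k, x in enumerate(update):
            for j, share in enumerate(make_shares(encode(x), n, rng)):
                partial[j][k] = (partial[j][k] + share) % PRIME
    totals = [sum(partial[j][k] for j in range(n)) % PRIME for k in range(dim)]
    return [decode(t) / n for t in totals]


if __name__ == "__main__":
    rng = random.Random(0)
    updates = [[0.5, -1.0], [1.5, 2.0], [-0.5, 0.25]]
    print(secure_mean(updates, rng))
```

A real deployment would use a cryptographically secure random source and handle client dropouts and collusion; `random.Random` is used here only to keep the example reproducible.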
Affiliation(s)
- Dong Huang
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
17
Kolobkov D, Mishra Sharma S, Medvedev A, Lebedev M, Kosaretskiy E, Vakhitov R. Efficacy of federated learning on genomic data: a study on the UK Biobank and the 1000 Genomes Project. Front Big Data 2024; 7:1266031. [PMID: 38487517 PMCID: PMC10937521 DOI: 10.3389/fdata.2024.1266031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 01/31/2024] [Indexed: 03/17/2024] Open
Abstract
Combining training data from multiple sources increases sample size and reduces confounding, leading to more accurate and less biased machine learning models. In healthcare, however, direct pooling of data is often not allowed by data custodians who are accountable for minimizing the exposure of sensitive information. Federated learning offers a promising solution to this problem by training a model in a decentralized manner thus reducing the risks of data leakage. Although there is increasing utilization of federated learning on clinical data, its efficacy on individual-level genomic data has not been studied. This study lays the groundwork for the adoption of federated learning for genomic data by investigating its applicability in two scenarios: phenotype prediction on the UK Biobank data and ancestry prediction on the 1000 Genomes Project data. We show that federated models trained on data split into independent nodes achieve performance close to centralized models, even in the presence of significant inter-node heterogeneity. Additionally, we investigate how federated model accuracy is affected by communication frequency and suggest approaches to reduce computational complexity or communication costs.
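To make the idea of training "in a decentralized manner" concrete, here is a minimal sketch of federated averaging (FedAvg) on a toy one-dimensional linear regression. It is not the models or data used in the study: the function names (`local_sgd`, `fed_avg`), the hyperparameters, and the two synthetic nodes with deliberately different input ranges are all assumptions for illustration. The `local_epochs` parameter stands in for communication frequency: more local epochs per round means fewer exchanges with the server.

```python
def local_sgd(weights, data, lr, epochs):
    """Run plain SGD on one node's private data, starting from the global weights."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y   # prediction error for this sample
            w -= lr * err * x       # gradient step for the slope
            b -= lr * err           # gradient step for the intercept
    return (w, b)


def fed_avg(node_datasets, rounds=500, lr=0.05, local_epochs=1):
    """Federated averaging: only model weights leave the nodes, never the data."""
    global_weights = (0.0, 0.0)
    total = sum(len(d) for d in node_datasets)
    for _ in range(rounds):
        # Each node trains locally from the current global model.
        updates = [local_sgd(global_weights, d, lr, local_epochs)
                   for d in node_datasets]
        # The server averages the returned weights, weighted by node sample count.
        w = sum(len(d) * u[0] for d, u in zip(node_datasets, updates)) / total
        b = sum(len(d) * u[1] for d, u in zip(node_datasets, updates)) / total
        global_weights = (w, b)
    return global_weights


if __name__ == "__main__":
    # Two nodes observing the same relation y = 2x + 1 over different x ranges,
    # a crude stand-in for inter-node heterogeneity.
    node_a = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0)]
    node_b = [(x, 2 * x + 1) for x in (1.5, 2.0, 2.5, 3.0)]
    slope, intercept = fed_avg([node_a, node_b])
    print(slope, intercept)  # should approach slope 2 and intercept 1
```

Because the toy data are noise-free and consistent across nodes, the federated model can recover the same line a centralized fit would; with real, noisy, heterogeneous data the gap between the two depends on the setting.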
Affiliation(s)
- Dmitry Kolobkov
- GENXT, Hinxton, United Kingdom
- Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Moscow, Russia
- Satyarth Mishra Sharma
- GENXT, Hinxton, United Kingdom
- Center for Artificial Intelligence Technology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Aleksandr Medvedev
- GENXT, Hinxton, United Kingdom
- Center for Artificial Intelligence Technology, Skolkovo Institute of Science and Technology, Moscow, Russia
18
Tuazon OM, Wickenheiser RA, Ansell R, Guerrini CJ, Zwenne GJ, Custers B. Law enforcement use of genetic genealogy databases in criminal investigations: Nomenclature, definition and scope. Forensic Sci Int Synerg 2024; 8:100460. [PMID: 38380276 PMCID: PMC10876674 DOI: 10.1016/j.fsisyn.2024.100460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 02/07/2024] [Accepted: 02/07/2024] [Indexed: 02/22/2024]
Abstract
Although law enforcement use of commercial genetic genealogy databases has gained prominence since the arrest of the Golden State Killer in 2018, and it has been used in hundreds of cases in the United States and more recently in Europe and Australia, it does not have a standard nomenclature and scope. We analyzed the more common terms currently being used and propose a common nomenclature: investigative forensic genetic genealogy (iFGG). We define iFGG as the use by law enforcement of genetic genealogy combined with traditional genealogy to generate suspect investigational leads from forensic samples in criminal investigations. We describe iFGG as a proper subset of forensic genetic genealogy, that is, FGG as applied by law enforcement to criminal investigations; hence, investigative FGG or iFGG. We delineate its steps, compare and contrast it with other investigative techniques involving genetic evidence, and contextualize its use within criminal investigations. This characterization is a critical input to future studies regarding the legal status of iFGG and its implications on the right to genetic privacy.
Affiliation(s)
- Oliver M. Tuazon
- Center for Law and Digital Technologies (eLaw), Institute for the Interdisciplinary Study of the Law, Leiden Law School, Leiden University, Kamerlingh Onnes Building, Steenschuur 25, 2311 ES, Leiden, the Netherlands
- Ray A. Wickenheiser
- New York State Police Crime Laboratory System, Forensic Investigation Center, 1220 Washington Avenue, Building #30, Albany, NY, 12226-3000, USA
- Ricky Ansell
- Swedish Police Authority, National Forensic Centre, SE-581 94, Linköping, Sweden
- Department of Physics, Chemistry and Biology, Linköping University, Sweden
- Christi J. Guerrini
- Baylor College of Medicine, Center for Medical Ethics and Health Policy, Houston, TX, 77030, USA
- Gerrit-Jan Zwenne
- Center for Law and Digital Technologies (eLaw), Institute for the Interdisciplinary Study of the Law, Leiden Law School, Leiden University, Kamerlingh Onnes Building, Steenschuur 25, 2311 ES, Leiden, the Netherlands
- Bart Custers
- Center for Law and Digital Technologies (eLaw), Institute for the Interdisciplinary Study of the Law, Leiden Law School, Leiden University, Kamerlingh Onnes Building, Steenschuur 25, 2311 ES, Leiden, the Netherlands
19
Brancato V, Esposito G, Coppola L, Cavaliere C, Mirabelli P, Scapicchio C, Borgheresi R, Neri E, Salvatore M, Aiello M. Standardizing digital biobanks: integrating imaging, genomic, and clinical data for precision medicine. J Transl Med 2024; 22:136. [PMID: 38317237 PMCID: PMC10845786 DOI: 10.1186/s12967-024-04891-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 01/14/2024] [Indexed: 02/07/2024] Open
Abstract
Advancements in data acquisition and computational methods are generating a large amount of heterogeneous biomedical data from diagnostic domains such as clinical imaging, pathology, and next-generation sequencing (NGS), which help characterize individual differences in patients. However, this information needs to be available and suitable to promote and support scientific research and technological development, supporting the effective adoption of the precision medicine approach in clinical practice. Digital biobanks can catalyze this process, facilitating the sharing of curated and standardized imaging data, clinical, pathological and molecular data, crucial to enable the development of a comprehensive and personalized data-driven diagnostic approach in disease management and fostering the development of computational predictive models. This work aims to frame this perspective, first by evaluating the state of standardization of individual diagnostic domains and then by identifying challenges and proposing a possible solution towards an integrative approach that can guarantee the suitability of information that can be shared through a digital biobank. Our analysis of the state of the art shows the presence and use of reference standards in biobanks and, generally, digital repositories for each specific domain. Despite this, standardization to guarantee the integration and reproducibility of the numerical descriptors generated by each domain, e.g. radiomic, pathomic and -omic features, is still an open challenge. Based on specific use cases and scenarios, an integration model, based on the JSON format, is proposed that can help address this problem. Ultimately, this work shows how, with specific standardization and promotion efforts, the digital biobank model can become an enabling technology for the comprehensive study of diseases and the effective development of data-driven technologies at the service of precision medicine.
Affiliation(s)
- Giuseppina Esposito
- Bio Check Up S.R.L, 80121, Naples, Italy
- Department of Advanced Biomedical Sciences, University of Naples Federico II, 80131, Naples, Italy
- Peppino Mirabelli
- UOS Laboratori di Ricerca e Biobanca, AORN Santobono-Pausilipon, Via Teresa Ravaschieri, 8, 80122, Naples, Italy
- Camilla Scapicchio
- Academic Radiology, Department of Translational Research, University of Pisa, via Roma, 67, 56126, Pisa, Italy
- Rita Borgheresi
- Academic Radiology, Department of Translational Research, University of Pisa, via Roma, 67, 56126, Pisa, Italy
- Emanuele Neri
- Academic Radiology, Department of Translational Research, University of Pisa, via Roma, 67, 56126, Pisa, Italy
20
Frison E, Breban M, Costantino F. How to translate genetic findings into clinical applications in spondyloarthritis? Front Immunol 2024; 15:1301735. [PMID: 38327520 PMCID: PMC10847566 DOI: 10.3389/fimmu.2024.1301735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 01/08/2024] [Indexed: 02/09/2024] Open
Abstract
Spondyloarthritis (SpA) is characterized by a strong genetic predisposition evidenced by the identification of up to 50 susceptibility loci, in addition to HLA-B27, the major genetic factor associated with the disease. These loci have not only deepened our understanding of disease pathogenesis but also offer the potential to improve disease management. Diagnostic delay is a major issue in SpA. HLA-B27 testing is widely used as a diagnostic biomarker in SpA but its predictive value is limited. Several attempts have been made to develop more sophisticated polygenic risk scores (PRS). However, these scores currently offer very little improvement as compared to HLA-B27 and are still difficult to implement in clinical routine. Genetics might also help to predict disease outcome including treatment response. Several genetic variants have been reported to be associated with radiographic damage or with poor response to TNF blockers, unfortunately with a lack of coherence across studies. Large-scale studies should be conducted to obtain more robust findings. Genetic and genomic evidence in complex diseases can be further used to support the identification of new drug targets and to repurpose existing drugs. Although not fully driven by genetics, development of IL-17 blockers has been facilitated by the discovery of the association between IL23R variants and SpA. Development of recent approaches combining GWAS findings with functional genomics will help to prioritize new drug targets in the future. Although very promising, translational genetics in SpA remains challenging and will require a multidisciplinary approach that integrates genetics, genomics, immunology, and clinical research.
Affiliation(s)
- Eva Frison
- UMR1173, INSERM, UFR Simone Veil, Versailles-Saint-Quentin-Paris-Saclay University, Saint-Quentin-en-Yvelines, France
- Labex Inflamex, Paris Diderot Sorbonne Paris-Cité University, Paris, France
- Maxime Breban
- UMR1173, INSERM, UFR Simone Veil, Versailles-Saint-Quentin-Paris-Saclay University, Saint-Quentin-en-Yvelines, France
- Labex Inflamex, Paris Diderot Sorbonne Paris-Cité University, Paris, France
- Rheumatology Division, Ambroise Paré Hospital, Assistance Publique des Hôpitaux de Paris (AP-HP), Boulogne-Billancourt, France
- Félicie Costantino
- UMR1173, INSERM, UFR Simone Veil, Versailles-Saint-Quentin-Paris-Saclay University, Saint-Quentin-en-Yvelines, France
- Labex Inflamex, Paris Diderot Sorbonne Paris-Cité University, Paris, France
- Rheumatology Division, Ambroise Paré Hospital, Assistance Publique des Hôpitaux de Paris (AP-HP), Boulogne-Billancourt, France
21
Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res 2024; 52:D1143-D1154. [PMID: 38183205 PMCID: PMC10767851 DOI: 10.1093/nar/gkad989] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 01/07/2024] Open
Abstract
Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Thorben Maass
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
- Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Sebastian Röner
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
22
Oliva A, Kaphle A, Reguant R, Sng LMF, Twine NA, Malakar Y, Wickramarachchi A, Keller M, Ranbaduge T, Chan EKF, Breen J, Buckberry S, Guennewig B, Haas M, Brown A, Cowley MJ, Thorne N, Jain Y, Bauer DC. Future-proofing genomic data and consent management: a comprehensive review of technology innovations. Gigascience 2024; 13:giae021. [PMID: 38837943 PMCID: PMC11152178 DOI: 10.1093/gigascience/giae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 01/15/2024] [Accepted: 04/09/2024] [Indexed: 06/07/2024] Open
Abstract
Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases. Similarly, clinicians need efficient access to a patient's genome as well as population-representative historical records for evidence-based decisions. Both researchers and clinicians hence rely on participants to consent to the use of their genomic data, which in turn requires trust in the professional and ethical handling of this information. Here, we review existing and emerging solutions for secure and effective genomic information management, including storage, encryption, consent, and authorization that are needed to build participant trust. We discuss recent innovations in cloud computing, quantum-computing-proof encryption, and self-sovereign identity. These innovations can augment key developments from within the genomics community, notably GA4GH Passports and the Crypt4GH file container standard. We also explore how decentralized storage as well as the digital consenting process can offer culturally acceptable processes to encourage data contributions from ethnic minorities. We conclude that the individual and their right for self-determination needs to be put at the center of any genomics framework, because only on an individual level can the received benefits be accurately balanced against the risk of exposing private information.
Collapse
Affiliation(s)
- Adrien Oliva
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Anubhav Kaphle
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Roc Reguant
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Letitia M F Sng
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Natalie A Twine
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Yuwan Malakar
- Responsible Innovation Future Science Platform, Commonwealth Scientific and Industrial Research Organisation, Brisbane, 41 Boggo Rd, Dutton Park QLD 4102, Australia
| | - Anuradha Wickramarachchi
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
| | - Marcel Keller
- Data61, Commonwealth Scientific and Industrial Research Organisation, Level 5/13 Garden St, Eveleigh NSW 2015, Australia
| | - Thilina Ranbaduge
- Data61, Commonwealth Scientific and Industrial Research Organisation, Building 101, Clunies Ross St, Black Mountain, Canberra, ACT 2601, Australia
| | - Eva K F Chan
- NSW Health Pathology, Sydney, 1 Reserve Road, St Leonards NSW 2065, Australia
| | - James Breen
- Telethon Kids Institute, Perth, WA 6009, Australia
- National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
| | - Sam Buckberry
- Telethon Kids Institute, Perth, WA 6009, Australia
- National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
| | - Boris Guennewig
- Sydney Medical School, Brain and Mind Centre, The University of Sydney, Sydney, 94 Mallett St, Camperdown NSW 2050, Australia
| | - Matilda Haas
- Australian Genomics, Parkville, VIC 3052, Australia
- Murdoch Children’s Research Institute, Parkville, Victoria 3052, Australia
| | - Alex Brown
- Telethon Kids Institute, Perth, WA 6009, Australia
- National Centre for Indigenous Genomics, The John Curtin School of Medical Research, Australian National University, Canberra, ACT 2601, Australia
| | - Mark J Cowley
- Children’s Cancer Institute, Lowy Cancer Research Centre, Level 4, Lowy Cancer Research Centre Corner Botany & High Streets UNSW Kensington Campus UNSW Sydney, Kensington NSW 2052, Australia
- School of Clinical Medicine, UNSW Medicine & Health, Wallace Wurth Building (C27), Cnr High St & Botany St, UNSW Sydney, Kensington NSW 2052, Australia
| | - Natalie Thorne
- University of Melbourne, Melbourne, Parkville VIC 3052, Australia
- Melbourne Genomics Health Alliance, Melbourne 1G, Walter and Eliza Hall Institute/1G Royal Parade, Parkville VIC 3052, Australia
- Walter and Eliza Hall Institute, Melbourne, 1G, Walter and Eliza Hall Institute/1G Royal Parade, Parkville VIC 3052, Australia
| | - Yatish Jain
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Level 3/160 Hawkesbury Rd, Westmead NSW 2145, Australia
- Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Applied BioSciences 205B Culloden Rd Macquarie University, NSW 2109, Australia
| | - Denis C Bauer
- Applied BioSciences, Faculty of Science and Engineering, Macquarie University, Applied BioSciences 205B Culloden Rd Macquarie University, NSW 2109, Australia
- Department of Biomedical Sciences, MQ Health General Practice - Macquarie University, Suite 305, Level 3/2 Technology Pl, Macquarie Park NSW 2109, Australia
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Gate 13, Kintore Avenue University of Adelaide, Adelaide SA 5000, Australia
23
Roberts W, McKee S, Miranda R, Barnett N. Navigating ethical challenges in psychological research involving digital remote technologies and people who use alcohol or drugs. American Psychologist 2024; 79:24-38. [PMID: 38236213 PMCID: PMC10798215 DOI: 10.1037/amp0001193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Digital and remote technologies (DRT) are increasingly being used in scientific investigations to objectively measure human behavior during day-to-day activities. Using these devices, psychologists and other behavioral scientists can investigate health risk behaviors, such as drug and alcohol use, by closely examining the causes and consequences of monitored behaviors as they occur naturalistically. There are, however, complex ethical issues that emerge when using DRT methodologies in research with people who use substances. These issues must be identified and addressed so DRT devices can be incorporated into psychological research with this population in a manner that comports with the ethical standards of the American Psychological Association. In this article, we discuss the ethical ramifications of using DRT in behavioral studies with people who use substances. Drawing on allied fields with similar ethical issues, we make recommendations to researchers who wish to incorporate DRT into their own research. Major topics include (a) threats to and methods for protecting participant and nonparticipant privacy, (b) shortcomings of traditional informed consent in DRT research, (c) researcher liabilities introduced by real-time continuous data collection, (d) threats to distributive justice arising from computational tools often used to manage and analyze DRT data, and (e) ethical implications of the "digital divide." We conclude with a more optimistic discussion of how DRT may provide safer alternatives to gold standard paradigms in substance use research, allowing researchers to test hypotheses that were previously prohibited on ethical grounds. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
Affiliation(s)
- Walter Roberts
- Department of Psychiatry, Yale University School of Medicine
| | - Sherry McKee
- Department of Psychiatry, Yale University School of Medicine
| | - Robert Miranda
- Department of Psychiatry and Human Behavior, Warren Alpert Medical School, Brown University
- Department of Behavioral and Social Sciences, Center for Alcohol and Addiction Studies, Brown University School of Public Health
| | - Nancy Barnett
- Department of Psychiatry and Human Behavior, Warren Alpert Medical School, Brown University
24
Chen Z, Lemey P, Yu H. Approaches and challenges to inferring the geographical source of infectious disease outbreaks using genomic data. The Lancet Microbe 2024; 5:e81-e92. [PMID: 38042165 DOI: 10.1016/s2666-5247(23)00296-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/03/2023] [Accepted: 09/13/2023] [Indexed: 12/04/2023]
Abstract
Genomic data hold increasing potential in the elucidation of transmission dynamics and geographical sources of infectious disease outbreaks. Phylogeographic methods that use epidemiological and genomic data obtained from surveillance enable us to infer the history of spatial transmission that is naturally embedded in the topology of phylogenetic trees as a record of the dispersal of infectious agents between geographical locations. In this Review, we provide an overview of phylogeographic approaches widely used for reconstructing the geographical sources of outbreaks of interest. These approaches can be classified into ancestral trait or state reconstruction and structured population models, with structured population models including popular structured coalescent and birth-death models. We also describe the major challenges associated with sequencing technologies, surveillance strategies, data sharing, and analysis frameworks that became apparent during the generation of large-scale genomic data in recent years, extending beyond inference approaches. Finally, we highlight the role of genomic data in geographical source inference and clarify how this enhances understanding and molecular investigations of outbreak sources.
Affiliation(s)
- Zhiyuan Chen
- School of Public Health, Fudan University, Key Laboratory of Public Health Safety, Ministry of Education, Shanghai, China
| | - Philippe Lemey
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Laboratory of Clinical and Evolutionary Virology, KU Leuven, Leuven, Belgium
| | - Hongjie Yu
- School of Public Health, Fudan University, Key Laboratory of Public Health Safety, Ministry of Education, Shanghai, China.
25
Zhang QX, Liu T, Guo X, Zhen J, Yang MY, Khederzadeh S, Zhou F, Han X, Zheng Q, Jia P, Ding X, He M, Zou X, Liao JK, Zhang H, He J, Zhu X, Lu D, Chen H, Zeng C, Liu F, Zheng HF, Liu S, Xu HM, Chen GB. Searching across-cohort relatives in 54,092 GWAS samples via encrypted genotype regression. PLoS Genet 2024; 20:e1011037. [PMID: 38206971 PMCID: PMC10783776 DOI: 10.1371/journal.pgen.1011037] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/08/2023] [Accepted: 12/13/2023] [Indexed: 01/13/2024] Open
Abstract
Sharing individual-level data in genomic studies has many merits compared with sharing summary statistics, including stricter quality control, common statistical analyses, relative identification, and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data and is therefore not subject to these constraints. The encryption properties of encG-reg are based on random matrix theory: the original genotype matrix is masked without sacrificing the precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false-positive and false-negative rates equivalent to those obtained by sharing the original individual-level data, and it is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers and then encrypted the genotype data. The relatives identified by encG-reg were as accurate as those estimated by KING, a widely used software package that requires the original genotype data. In a more complex application, we launched a carefully designed multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives across the cohorts, even with different ethnic backgrounds and genotyping qualities. Our study demonstrates that encrypted genomic data can be shared without loss of information or data-sharing barriers.
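The masking idea in this abstract can be illustrated with a generic random-projection sketch. This is not the authors' encG-reg implementation: the function names, the simulated data, the dimensions `m` and `k`, and the use of a plain inner-product score instead of a regression are all assumptions made for illustration. Each cohort multiplies its standardized genotype rows by a shared random matrix `S`, and relatedness is then scored on the masked rows only.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

def simulate_genotypes(freqs):
    """Draw 0/1/2 allele counts for one individual, one per marker."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]

def standardize(genotypes, freqs):
    """Scale allele counts to mean 0, variance 1 using allele frequencies."""
    return [(g - 2 * p) / (2 * p * (1 - p)) ** 0.5
            for g, p in zip(genotypes, freqs)]

def mask(row, S):
    """Project one standardized genotype row (length m) through S (m x k)."""
    k = len(S[0])
    return [sum(x * s_row[j] for x, s_row in zip(row, S)) for j in range(k)]

def relatedness(a, b, n_markers):
    """Inner-product relatedness score, scaled by the number of markers."""
    return sum(x * y for x, y in zip(a, b)) / n_markers

m, k = 400, 300  # hypothetical marker count and masking dimension
freqs = [rng.uniform(0.1, 0.5) for _ in range(m)]
# Shared masking matrix with N(0, 1/k) entries, so that E[S S^T] = I_m.
S = [[rng.gauss(0, (1 / k) ** 0.5) for _ in range(k)] for _ in range(m)]

person_a = standardize(simulate_genotypes(freqs), freqs)
person_b = list(person_a)  # the same sample held by a second cohort
person_c = standardize(simulate_genotypes(freqs), freqs)  # unrelated

enc_a, enc_b, enc_c = (mask(p, S) for p in (person_a, person_b, person_c))

plain_ab = relatedness(person_a, person_b, m)  # needs the raw genotypes
enc_ab = relatedness(enc_a, enc_b, m)          # uses masked rows only
enc_ac = relatedness(enc_a, enc_c, m)
```

In this toy setting the masked score for the duplicated sample stays near the unmasked one, while the unrelated pair scores near zero. A larger `k` reduces the noise added by masking, which is the kind of dimension-versus-precision trade-off the abstract refers to.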
Affiliation(s)
- Qi-Xin Zhang
- Institute of Bioinformatics, Zhejiang University, Hangzhou, Zhejiang, China
- Center for Reproductive Medicine, Department of Genetic and Genomic Medicine, and Clinical Research Institute, Zhejiang Provincial People’s Hospital, People’s Hospital of Hangzhou Medical College, Hangzhou, Zhejiang, China
| | - Tianzi Liu
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Xinxin Guo
- School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, Guangdong, China
| | - Jianxin Zhen
- Central Laboratory, Shenzhen Baoan Women’s and Children’s Hospital, Shenzhen, Guangdong, China
| | - Meng-yuan Yang
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
| | - Saber Khederzadeh
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
| | - Fang Zhou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
| | - Xiaotong Han
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Qiwen Zheng
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Peilin Jia
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Xiaohu Ding
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Mingguang He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, Melbourne, Victoria, Australia
- Ophthalmology, Department of Surgery, University of Melbourne, Melbourne, Victoria, Australia
| | - Xin Zou
- State Key Laboratory of CAD & GC, Zhejiang University, Hangzhou, Zhejiang, China
| | - Jia-Kai Liao
- School of Mathematics and Statistics and Research Institute of Mathematical Sciences (RIMS), Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, China
- Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo, Zhejiang, China
| | - Hongxin Zhang
- State Key Laboratory of CAD & GC, Zhejiang University, Hangzhou, Zhejiang, China
| | - Ji He
- Department of Neurology, Peking University Third Hospital, Beijing, China
| | - Xiaofeng Zhu
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Daru Lu
- State Key Laboratory of Genetic Engineering and MOE Engineering Research Center of Gene Technology, School of Life Sciences and Zhongshan Hospital, Fudan University, Shanghai, China
- NHC Key Laboratory of Birth Defects and Reproductive Health, Chongqing Population and Family Planning Science and Technology Research Institute, Chongqing, China
| | - Hongyan Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
| | - Changqing Zeng
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Henan Academy of Sciences, Zhengzhou, Henan, China
| | - Fan Liu
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Department of Forensic Sciences, College of Criminal Justice, Naif Arab University of Security Sciences, Riyadh, Kingdom of Saudi Arabia
| | - Hou-Feng Zheng
- Diseases & Population (DaP) Geninfo Lab, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
| | - Siyang Liu
- School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, Guangdong, China
| | - Hai-Ming Xu
- Institute of Bioinformatics, Zhejiang University, Hangzhou, Zhejiang, China
| | - Guo-Bo Chen
- Center for Reproductive Medicine, Department of Genetic and Genomic Medicine, and Clinical Research Institute, Zhejiang Provincial People’s Hospital, People’s Hospital of Hangzhou Medical College, Hangzhou, Zhejiang, China
- Key Laboratory of Endocrine Gland Diseases of Zhejiang Province, Hangzhou, Zhejiang, China
26
Yue T, Wang Y, Zhang L, Gu C, Xue H, Wang W, Lyu Q, Dun Y. Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models. Int J Mol Sci 2023; 24:15858. [PMID: 37958843 PMCID: PMC10649223 DOI: 10.3390/ijms242115858] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/30/2023] [Revised: 10/24/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
The data explosion driven by advancements in genomic research, such as high-throughput sequencing techniques, is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in various fields such as vision, speech, and text processing. Yet genomics poses unique challenges for deep learning, since we expect deep learning to provide a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on the insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep learning-based architecture, and we remark on practical considerations of developing deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research and point out current challenges and potential research directions for future genomics applications. We believe the collaborative use of ever-growing diverse data and the fast iteration of deep learning models will continue to contribute to the future of genomics.
Affiliation(s)
- Tianwei Yue
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Yuanxin Wang
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Longxiang Zhang
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Chunming Gu
- Department of Biomedical Engineering, School of Medicine, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Haoru Xue
- The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Wenping Wang
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Qi Lyu
- Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yujie Dun
- School of Information and Communications Engineering, Xi’an Jiaotong University, Xi’an 710049, China
27
Ayday E, Vaidya J, Jiang X, Telenti A. Ensuring Trust in Genomics Research. IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA) 2023; 2023:1-12. [PMID: 38562180 PMCID: PMC10981793 DOI: 10.1109/tps-isa58951.2023.00011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Reproducibility, transparency, representation, and privacy underpin trust in genomics research in general and genome-wide association studies (GWAS) in particular. Concerns about these issues can be mitigated by technologies that address privacy protection, quality control, and verifiability of GWAS. However, many of the existing technological solutions have been developed in isolation and may address one aspect of reproducibility, transparency, representation, and privacy of GWAS while unknowingly impacting other aspects. As a consequence, the current patchwork of technological tools addresses issues with GWAS only partially and in an overlapping manner, sometimes even creating more problems. This paper reviews progress in the field that creates technological solutions to augment the acceptance and security of population genetic analyses. The text identifies areas that are falling behind in technical implementation or where there is insufficient research. We make the case that a full understanding of the different GWAS settings, technological tools, and new research directions can holistically address the requirements for the acceptance of GWAS.
Affiliation(s)
- Erman Ayday
- Department of Computer and Data Sciences Case Western Reserve University Cleveland, OH
| | - Jaideep Vaidya
- Management Science and Information Systems Department Rutgers University Newark, NJ
| | - Xiaoqian Jiang
- Department of Data Science and Artificial Intelligence University of Texas - Health Houston, TX
| | - Amalio Telenti
- Dept. of Integrative Structural and Computational Biology Scripps Institute La Jolla, CA
28
Smit JAR, van der Graaf R, Mostert M, Vaartjes I, Zuidgeest M, Grobbee DE, van Delden JJM. Overcoming ethical and legal obstacles to data linkage in health research: stakeholder perspectives. Int J Popul Data Sci 2023; 8:2151. [PMID: 38414541 PMCID: PMC10898216 DOI: 10.23889/ijpds.v8i1.2151] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/29/2024] Open
Abstract
Introduction: Data linkage for health research purposes enables the answering of countless new research questions and is said to be cost-effective and less intrusive than other means of data collection. Nevertheless, health researchers are currently dealing with a complicated, fragmented, and inconsistent regulatory landscape with regard to the processing of data, and progress in health research is hindered. Aim: We designed a qualitative study to assess what different stakeholders perceive as ethical and legal obstacles to data linkage for health research purposes, and how these obstacles could be overcome. Methods: Two focus groups and eighteen semi-structured in-depth interviews were held to collect opinions and insights of various stakeholders. An inductive thematic analysis approach was used to identify overarching themes. Results: This study showed that the ambiguity regarding the 'correct' interpretation of the law, the fragmentation of policies governing the processing of personal health data, and the demanding nature of legal requirements are experienced by the participating stakeholders as impediments to data linkage for research purposes. To remove or reduce these obstacles, authoritative interpretations of the laws and regulations governing data linkage should be issued. The participants furthermore encouraged the harmonisation of data linkage policies, the promotion of trust and transparency, and the enhancement of technical and organisational measures. Lastly, there is a demand for legislative and regulatory modifications amongst the participants. Conclusions: To overcome the obstacles in data linkage for scientific research purposes, perhaps we should shift the focus from adapting the current laws and regulations governing data linkage, or even designing completely new laws, towards creating a more thorough understanding of the law and making better use of the flexibilities within the existing legislation. Important steps in achieving this shift could be clarification of the legal provisions governing data linkage by issuing authoritative interpretations, as well as the strengthening of ethical-legal oversight bodies.
Affiliation(s)
- Julie-Anne R Smit
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Rieke van der Graaf
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Menno Mostert
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Ilonca Vaartjes
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Mira Zuidgeest
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Diederik E Grobbee
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Johannes J M van Delden
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
29
Rothe H, Lauer KB, Talbot-Cooper C, Sivizaca Conde DJ. Digital entrepreneurship from cellular data: How omics afford the emergence of a new wave of digital ventures in health. Electronic Markets 2023; 33:48. [PMID: 37724180 PMCID: PMC10505108 DOI: 10.1007/s12525-023-00669-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 08/25/2023] [Indexed: 09/20/2023]
Abstract
Data has become an indispensable input, throughput, and output for the healthcare industry. In recent years, omics technologies such as genomics and proteomics have generated vast amounts of new data at the cellular level including molecular, structural, and functional levels. Cellular data holds the potential to innovate therapeutics, vaccines, diagnostics, consumer products, or even ancestry services. However, data at the cellular level is generated with rapidly evolving omics technologies. These technologies use scientific knowledge from resource-rich environments. This raises the question of how new ventures can use cellular-level data from omics technologies to create new products and scale their business. We report on a series of interviews and a focus group discussion with entrepreneurs, investors, and data providers. By conceptualizing omics technologies as external enablers, we show how characteristics of cellular-level data negatively affect the combination mechanisms that drive venture creation and growth. We illustrate how data characteristics set boundary conditions for innovation and entrepreneurship and highlight how ventures seek to mitigate their impact. Supplementary Information The online version contains supplementary material available at 10.1007/s12525-023-00669-w.
Affiliation(s)
- Hannes Rothe
- University of Duisburg Essen, Institute for Computer Science and Business Information Systems, Essen, Germany
30
Li W, Kim M, Zhang K, Chen H, Jiang X, Harmanci A. COLLAGENE enables privacy-aware federated and collaborative genomic data analysis. Genome Biol 2023; 24:204. [PMID: 37697426 PMCID: PMC10496350 DOI: 10.1186/s13059-023-03039-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 08/16/2023] [Indexed: 09/13/2023] Open
Abstract
Growing regulatory requirements set barriers around genetic data sharing and collaborations. Moreover, existing privacy-aware paradigms are challenging to deploy in collaborative settings. We present COLLAGENE, a tool base for building secure collaborative genomic data analysis methods. COLLAGENE protects data using shared-key homomorphic encryption and combines encryption with multiparty strategies for efficient privacy-aware collaborative method development. COLLAGENE provides ready-to-run tools for encryption/decryption, matrix processing, and network transfers, which can be immediately integrated into existing pipelines. We demonstrate the usage of COLLAGENE by building a practical federated GWAS protocol for binary phenotypes and a secure meta-analysis protocol. COLLAGENE is available at https://zenodo.org/record/8125935.
Affiliation(s)
- Wentao Li
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Miran Kim
- Department of Mathematics, Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
- Research Institute for Convergence of Basic Science, Hanyang University, Seoul, 04763, Republic of Korea
- Bio-BigData Center, Hanyang Institute of Bioscience and Biotechnology, Hanyang University, Seoul, 04763, Republic of Korea
| | - Kai Zhang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Center for Precision Health, D. Bradley McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Xiaoqian Jiang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Arif Harmanci
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), D. Bradley McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA.
- Center for Precision Health, D. Bradley McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
31
Li W, Chen H, Jiang X, Harmanci A. Federated generalized linear mixed models for collaborative genome-wide association studies. iScience 2023; 26:107227. [PMID: 37529100 PMCID: PMC10387571 DOI: 10.1016/j.isci.2023.107227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 01/28/2023] [Accepted: 06/23/2023] [Indexed: 08/03/2023] Open
Abstract
Federated association testing is a powerful approach to conduct large-scale association studies where sites share intermediate statistics through a central server. There are, however, several standing challenges. Confounding factors like population stratification should be carefully modeled across sites. In addition, it is crucial to consider disease etiology using flexible models to prevent biases. Privacy protections for participants pose another significant challenge. Here, we propose distributed Mixed Effects Genome-wide Association study (dMEGA), a method that enables federated generalized linear mixed model-based association testing across multiple sites without explicitly sharing genotype and phenotype data. dMEGA employs a reference projection to correct for population-stratification and utilizes efficient local-gradient updates among sites, incorporating both fixed and random effects. The accuracy and efficiency of dMEGA are demonstrated through simulated and real datasets. dMEGA is publicly available at https://github.com/Li-Wentao/dMEGA.
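The general pattern described here, in which sites send intermediate statistics to a central server instead of raw records, can be sketched as follows. This is a deliberately simplified, fixed-effects-only logistic model and not the dMEGA method, which additionally handles random effects and a reference projection; the function names, toy data, step count, and learning rate are hypothetical.

```python
import math

def local_gradient(beta, rows):
    """Gradient of the logistic log-likelihood on one site's private rows.

    Each row is (genotype, phenotype); the model is
    logit(p) = beta[0] + beta[1] * genotype.
    """
    g0 = g1 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        g0 += y - p
        g1 += (y - p) * x
    return g0, g1

def federated_fit(sites, steps=500, lr=0.5):
    """Gradient ascent where the server only ever sees per-site gradients."""
    n_total = sum(len(rows) for rows in sites)
    beta = [0.0, 0.0]
    for _ in range(steps):
        # Each site computes its gradient locally and shares only that.
        grads = [local_gradient(beta, rows) for rows in sites]
        g0 = sum(g[0] for g in grads)
        g1 = sum(g[1] for g in grads)
        beta = [beta[0] + lr * g0 / n_total, beta[1] + lr * g1 / n_total]
    return beta

# Toy (genotype, case/control) records held by two hypothetical sites.
site_a = [(0, 0), (0, 0), (1, 0), (1, 1), (2, 1)]
site_b = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 0), (0, 1)]

federated = federated_fit([site_a, site_b])
pooled = federated_fit([site_a + site_b])  # what pooling raw data would give
```

Because the log-likelihood gradient is a sum over individuals, the federated estimate matches the pooled one up to floating-point rounding. Note that shared gradients can still leak information, so sharing them does not by itself address the privacy challenge the abstract mentions.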
Affiliation(s)
- Wentao Li
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX 77030, USA
| | - Han Chen
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX 77030, USA
- School of Public Health, University of Texas Health Science Center, Houston, TX 77030, USA
| | - Xiaoqian Jiang
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX 77030, USA
| | - Arif Harmanci
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX 77030, USA
32
Aneja S, Avesta A, Xu H, Machado LO. Clinical Informatics Approaches to Facilitate Cancer Data Sharing. Yearb Med Inform 2023; 32:104-110. [PMID: 37414028 PMCID: PMC10751108 DOI: 10.1055/s-0043-1768721] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/08/2023] Open
Abstract
OBJECTIVES: Despite growing enthusiasm surrounding the utility of clinical informatics to improve cancer outcomes, data availability remains a persistent bottleneck to progress. Difficulty combining data with protected health information often limits our ability to aggregate larger, more representative datasets for analysis. With the rise of machine learning techniques that require increasing amounts of clinical data, these barriers have been magnified. Here, we review recent efforts within clinical informatics to address issues related to safely sharing cancer data. METHODS: We carried out a narrative review of clinical informatics studies related to sharing protected health data within cancer studies published from 2018 to 2022, with a focus on domains such as decentralized analytics, homomorphic encryption, and common data models. RESULTS: Clinical informatics studies that investigated cancer data sharing were identified. A particular focus of the search yielded studies on decentralized analytics, homomorphic encryption, and common data models. Decentralized analytics has been prototyped across genomic, imaging, and clinical data, with the most advances in diagnostic image analysis. Homomorphic encryption was most often employed on genomic data and less on imaging and clinical data. Common data models primarily involve clinical data from the electronic health record. Although all methods have robust research, there are limited studies showing wide-scale implementation. CONCLUSIONS: Decentralized analytics, homomorphic encryption, and common data models represent promising solutions to improve cancer data sharing. Promising results thus far have been limited to smaller settings. Future studies should be focused on evaluating the scalability and efficacy of these methods across clinical settings of varying resources and expertise.
Affiliation(s)
- Sanjay Aneja
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, USA
- Center for Outcomes Research and Evaluation at Yale, New Haven, CT, USA
- Department of Bioinformatics and Data Science, Yale School of Medicine, New Haven, CT, USA
| | - Arman Avesta
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, USA
- Center for Outcomes Research and Evaluation at Yale, New Haven, CT, USA
| | - Hua Xu
- Department of Bioinformatics and Data Science, Yale School of Medicine, New Haven, CT, USA
| | - Lucila Ohno Machado
- Department of Bioinformatics and Data Science, Yale School of Medicine, New Haven, CT, USA
33
Singh PP, Benayoun BA. Considerations for reproducible omics in aging research. Nature Aging 2023; 3:921-930. [PMID: 37386258 PMCID: PMC10527412 DOI: 10.1038/s43587-023-00448-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 06/01/2023] [Indexed: 07/01/2023]
Abstract
Technical advancements over the past two decades have enabled the measurement of the panoply of molecules of cells and tissues including transcriptomes, epigenomes, metabolomes and proteomes at unprecedented resolution. Unbiased profiling of these molecular landscapes in the context of aging can reveal important details about mechanisms underlying age-related functional decline and age-related diseases. However, the high-throughput nature of these experiments creates unique analytical and design demands for robustness and reproducibility. In addition, 'omic' experiments are generally onerous, making it crucial to effectively design them to eliminate as many spurious sources of variation as possible as well as account for any biological or technical parameter that may influence such measures. In this Perspective, we provide general guidelines on best practices in the design and analysis of omic experiments in aging research from experimental design to data analysis and considerations for long-term reproducibility and validation of such studies.
Affiliation(s)
- Param Priya Singh
  - Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA.
  - Bakar Aging Research Institute, University of California, San Francisco, San Francisco, CA, USA.
- Bérénice A Benayoun
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA.
  - Molecular and Computational Biology Department, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, USA.
  - Biochemistry and Molecular Medicine Department, USC Keck School of Medicine, Los Angeles, CA, USA.
  - Epigenetics and Gene Regulation, USC Norris Comprehensive Cancer Center, Los Angeles, CA, USA.
  - USC Stem Cell Initiative, Los Angeles, CA, USA.
|
34
|
Sadhuka S, Fridman D, Berger B, Cho H. Assessing transcriptomic reidentification risks using discriminative sequence models. Genome Res 2023; 33:1101-1112. [PMID: 37541758 PMCID: PMC10538488 DOI: 10.1101/gr.277699.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 04/19/2023] [Indexed: 08/06/2023]
Abstract
Gene expression data provide molecular insights into the functional impact of genetic variation, for example, through expression quantitative trait loci (eQTLs). With an improving understanding of the association between genotypes and gene expression comes a greater concern that gene expression profiles could be matched to genotype profiles of the same individuals in another data set, known as a linking attack. Prior works demonstrating such a risk could analyze only the fraction of eQTLs that is independent, owing to restrictive model assumptions, leaving the full extent of this risk incompletely understood. To address this challenge, we introduce the discriminative sequence model (DSM), a novel probabilistic framework for predicting a sequence of genotypes based on gene expression data. By modeling the joint distribution over all known eQTLs in a genomic region, DSM improves the power of linking attacks with necessary calibration for linkage disequilibrium and redundant predictive signals. We show greater linking accuracy of DSM compared with existing approaches across a range of attack scenarios and data sets including up to 22,288 individuals, suggesting that DSM helps uncover a substantial additional risk overlooked by previous studies. Our work provides a unified framework for assessing the privacy risks of sharing diverse omics data sets beyond transcriptomics.
Affiliation(s)
- Shuvom Sadhuka
  - Computer Science and AI Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Daniel Fridman
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Bonnie Berger
  - Computer Science and AI Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Hyunghoon Cho
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
|
35
|
Wang X, Dervishi L, Li W, Jiang X, Ayday E, Vaidya J. Efficient Federated Kinship Relationship Identification. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2023; 2023:534-543. [PMID: 37351796 PMCID: PMC10283133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023]
Abstract
Kinship relationship estimation plays a significant role in today's genome studies. Since genetic data are mostly stored and protected in different silos, retrieving the desirable kinship relationships across federated data warehouses is a non-trivial problem. The ability to identify and connect related individuals is important for both research and clinical applications. In this work, we propose a new privacy-preserving kinship relationship estimation framework: Incremental Update Kinship Identification (INK). The proposed framework includes three key components that allow us to control the balance between privacy and accuracy (of kinship estimation): an incremental process coupled with the use of auxiliary information and informative scores. Our empirical evaluation shows that INK can achieve higher kinship identification correctness while exposing fewer genetic markers.
Affiliation(s)
- Erman Ayday
  - Case Western Reserve University, Cleveland, OH
|
36
|
Abondio P, Cilli E, Luiselli D. Human Pangenomics: Promises and Challenges of a Distributed Genomic Reference. Life (Basel) 2023; 13:1360. [PMID: 37374141 DOI: 10.3390/life13061360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/02/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
A pangenome is a collection of the common and unique genomes that are present in a given species. It combines the genetic information of all the genomes sampled, resulting in a large and diverse range of genetic material. Pangenomic analysis offers several advantages compared to traditional genomic research. For example, a pangenome is not bound by the physical constraints of a single genome, so it can capture more genetic variability. Thanks to the introduction of the concept of pangenome, it is possible to use exceedingly detailed sequence data to study the evolutionary history of two different species, or how populations within a species differ genetically. In the wake of the Human Pangenome Project, this review aims at discussing the advantages of the pangenome around human genetic variation, which are then framed around how pangenomic data can inform population genetics, phylogenetics, and public health policy by providing insights into the genetic basis of diseases or determining personalized treatments, targeting the specific genetic profile of an individual. Moreover, technical limitations, ethical concerns, and legal considerations are discussed.
Affiliation(s)
- Paolo Abondio
  - Laboratory of Ancient DNA, Department of Cultural Heritage, University of Bologna, Via degli Ariani 1, 48121 Ravenna, Italy
- Elisabetta Cilli
  - Laboratory of Ancient DNA, Department of Cultural Heritage, University of Bologna, Via degli Ariani 1, 48121 Ravenna, Italy
- Donata Luiselli
  - Laboratory of Ancient DNA, Department of Cultural Heritage, University of Bologna, Via degli Ariani 1, 48121 Ravenna, Italy
|
37
|
Gooden A, Thaldar D. Toward an open access genomics database of South Africans: ethical considerations. Front Genet 2023; 14:1166029. [PMID: 37260770 PMCID: PMC10228717 DOI: 10.3389/fgene.2023.1166029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 05/03/2023] [Indexed: 06/02/2023] Open
Abstract
Genomics research holds the potential to improve healthcare. Yet, a very low percentage of the genomic data used in genomics research internationally relates to persons of African origin. Establishing a large-scale, open access genomics database of South Africans may contribute to solving this problem. However, this raises various ethics concerns, including privacy expectations and informed consent. The concept of open consent offers a potential solution to these concerns by (a) being explicit about the research participant's data being in the public domain and the associated privacy risks, and (b) setting a higher-than-usual benchmark for informed consent by making use of the objective assessment of prospective research participants' understanding. Furthermore, in the South African context, where local culture is infused with Ubuntu and its relational view of personhood, community engagement is vital for establishing and maintaining an open access genomics database of South Africans. The South African National Health Research Ethics Council is called upon to provide guidelines for genomics researchers, based on open consent and community engagement, on how to plan and implement open access genomics projects.
Affiliation(s)
- Amy Gooden
  - School of Law, University of KwaZulu-Natal, Durban, South Africa
- Donrich Thaldar
  - School of Law, University of KwaZulu-Natal, Durban, South Africa
  - Petrie-Flom Center for Health Law Policy, Biotechnology and Bioethics, Harvard Law School, Cambridge, MA, United States
|
38
|
Bobak CA, Zhao Y, Levy JJ, O’Malley AJ. GRANDPA: GeneRAtive network sampling using degree and property augmentation applied to the analysis of partially confidential healthcare networks. APPLIED NETWORK SCIENCE 2023; 8:23. [PMID: 37188323 PMCID: PMC10173245 DOI: 10.1007/s41109-023-00548-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Accepted: 04/24/2023] [Indexed: 05/17/2023]
Abstract
Protecting medical privacy can create obstacles in the analysis and distribution of healthcare graphs and the statistical inferences accompanying them. We propose a graph simulation model that generates networks using degree and property augmentation, and provide a flexible R package that allows users to create graphs that preserve vertex attribute relationships and approximately retain topological properties observed in the original graph (e.g., community structure). We illustrate our proposed algorithm using a case study based on Zachary's karate network and a patient-sharing graph generated from Medicare claims data in 2019. In both cases, we find that community structure is preserved, and the normalized root mean square error between cumulative distributions of the degrees across the generated and the original graphs is low (0.0508 and 0.0514, respectively).
Affiliation(s)
- Carly A. Bobak
  - Department of Biomedical Data Science, Dartmouth College, Hanover, NH USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Dartmouth College, Hanover, NH USA
  - Research Computing, Dartmouth College, Hanover, NH USA
- Yifan Zhao
  - Department of Biomedical Data Science, Dartmouth College, Hanover, NH USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Dartmouth College, Hanover, NH USA
- Joshua J. Levy
  - Department of Pathology and Laboratory Medicine, Dartmouth College, Hanover, NH USA
  - Department of Dermatology, Dartmouth College, Hanover, NH USA
  - Department of Epidemiology, Dartmouth College, Hanover, NH USA
- A. James O’Malley
  - Department of Biomedical Data Science, Dartmouth College, Hanover, NH USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Dartmouth College, Hanover, NH USA
|
39
|
Wagner JK, Yu JH, Fullwiley D, Moore C, Wilson JF, Bamshad MJ, Royal CD. Guidelines for genetic ancestry inference created through roundtable discussions. HGG ADVANCES 2023; 4:100178. [PMID: 36798092 PMCID: PMC9926022 DOI: 10.1016/j.xhgg.2023.100178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 01/03/2023] [Indexed: 01/15/2023] Open
Abstract
The use of genetic and genomic technology to infer ancestry is commonplace in a variety of contexts, particularly in biomedical research and for direct-to-consumer genetic testing. In 2013 and 2015, two roundtables engaged a diverse group of stakeholders toward the development of guidelines for inferring genetic ancestry in academia and industry. This report shares the stakeholder groups' work and provides an analysis of, commentary on, and views from the groundbreaking and sustained dialogue. We describe the engagement processes and the stakeholder groups' resulting statements and proposed guidelines. The guidelines focus on five key areas: application of genetic ancestry inference, assumptions and confidence/laboratory and statistical methods, terminology and population identifiers, impact on individuals and groups, and communication or translation of genetic ancestry inferences. We delineate the terms and limitations of the guidelines and discuss their critical role in advancing the development and implementation of best practices for inferring genetic ancestry and reporting the results. These efforts should inform both governmental regulation and self-regulation.
Affiliation(s)
- Jennifer K. Wagner
  - School of Engineering Design and Innovation, Pennsylvania State University, University Park, PA 16802, USA
  - Institute for Computational and Data Science, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16802, USA
  - Rock Ethics Institute, Pennsylvania State University, University Park, PA 16802, USA
  - Penn State Law, University Park, PA 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Joon-Ho Yu
  - Department of Pediatrics and Institute for Public Health Genetics, University of Washington, Seattle, WA 98195, USA
  - Treuman Katz Center for Pediatric Bioethics, Seattle Children’s Hospital and Research Institute, Seattle, WA 98101, USA
- Duana Fullwiley
  - Department of Anthropology, Stanford University, Stanford, CA 94305, USA
- James F. Wilson
  - Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh EH8 9AG, Scotland
- Michael J. Bamshad
  - Department of Pediatrics and Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Division of Genetic Medicine, Seattle Children’s Hospital, Seattle, WA 98101, USA
- Charmaine D. Royal
  - Departments of African and African American Studies, Biology, Global Health, and Family Medicine and Community Health, Duke University, Durham, NC 27708, USA
- Genetic Ancestry Inference Roundtable Participants
  - School of Engineering Design and Innovation, Pennsylvania State University, University Park, PA 16802, USA
  - Institute for Computational and Data Science, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16802, USA
  - Rock Ethics Institute, Pennsylvania State University, University Park, PA 16802, USA
  - Penn State Law, University Park, PA 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Pediatrics and Institute for Public Health Genetics, University of Washington, Seattle, WA 98195, USA
  - Treuman Katz Center for Pediatric Bioethics, Seattle Children’s Hospital and Research Institute, Seattle, WA 98101, USA
  - Department of Anthropology, Stanford University, Stanford, CA 94305, USA
  - The DNA Detectives, Dana Point, CA, USA
  - Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh EH8 9AG, Scotland
  - Department of Pediatrics and Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Division of Genetic Medicine, Seattle Children’s Hospital, Seattle, WA 98101, USA
  - Departments of African and African American Studies, Biology, Global Health, and Family Medicine and Community Health, Duke University, Durham, NC 27708, USA
|
40
|
Akyüz K, Goisauf M, Chassang G, Kozera Ł, Mežinska S, Tzortzatou-Nanopoulou O, Mayrhofer MT. Post-identifiability in changing sociotechnological genomic data environments. BIOSOCIETIES 2023:1-28. [PMID: 37359141 PMCID: PMC10042674 DOI: 10.1057/s41292-023-00299-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 01/13/2023] [Indexed: 03/30/2023]
Abstract
Data practices in biomedical research often rely on standards that build on normative assumptions regarding privacy and involve 'ethics work.' In an increasingly datafied research environment, identifiability gains a new temporal and spatial dimension, especially in regard to genomic data. In this paper, we analyze how genomic identifiability is considered as a specific data issue in a recent controversial case: publication of the genome sequence of the HeLa cell line. Considering developments in the sociotechnological and data environment, such as big data, biomedical, recreational, and research uses of genomics, our analysis highlights what it means to be (re-)identifiable in the postgenomic era. By showing how the risk of genomic identifiability is not a specificity of the HeLa controversy, but rather a systematic data issue, we argue that a new conceptualization is needed. With the notion of post-identifiability as a sociotechnological situation, we show how past assumptions and ideas about future possibilities come together in the case of genomic identifiability. We conclude by discussing how kinship, temporality, and openness are subject to renewed negotiations along with the changing understandings and expectations of identifiability and status of genomic data.
Affiliation(s)
- Kaya Akyüz
  - Department of Science and Technology Studies, University of Vienna, Universitätsstraße 7/Stiege II/6, Stock (NIG), 1010 Vienna, Austria
  - BBMRI-ERIC, Graz, Austria
- Melanie Goisauf
  - Department of Science and Technology Studies, University of Vienna, Universitätsstraße 7/Stiege II/6, Stock (NIG), 1010 Vienna, Austria
  - BBMRI-ERIC, Graz, Austria
- Gauthier Chassang
  - CERPOP, Université de Toulouse, Inserm, Université Paul Sabatier, Toulouse, France
  - Plateforme GenoToul Societal “Ethique et Biosciences”, Toulouse, France
- Signe Mežinska
  - Institute of Clinical and Preventive Medicine, University of Latvia, Riga, Latvia
  - BBMRI.LV, Riga, Latvia
|
41
|
Yang W, Guo Z, Shi W, Qin L, Xia X, Hao B, Liao S. Security and Sharing of NIPT Data Are the Basis of Ethical Decision-Making Related to Non-Medical Traits. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2023; 23:29-31. [PMID: 36919545 DOI: 10.1080/15265161.2023.2169392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Affiliation(s)
- Wenke Yang
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Zhenglong Guo
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Weili Shi
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Litao Qin
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Xiaoliang Xia
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Bingtao Hao
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
- Shixiu Liao
  - Henan Provincial People's Hospital, People's Hospital of Henan University, People's Hospital of Zhengzhou University; and National Health Commission Key Laboratory of Birth Defects Prevention, Henan Provincial Key Laboratory of Genetic Diseases and Functional Genomics
|
42
|
Ohno-Machado L, Jiang X, Kuo TT, Tao S, Chen L, Ram PM, Zhang GQ, Xu H. A hierarchical strategy to minimize privacy risk when linking "De-identified" data in biomedical research consortia. J Biomed Inform 2023; 139:104322. [PMID: 36806328 DOI: 10.1016/j.jbi.2023.104322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Revised: 02/10/2023] [Accepted: 02/11/2023] [Indexed: 02/18/2023]
Abstract
Linking data across studies offers an opportunity to enrich data sets and provide a stronger basis for data-driven models for biomedical discovery and/or prognostication. Several techniques to link records have been proposed, and some have been implemented across data repositories holding molecular and clinical data. Not all these techniques guarantee appropriate privacy protection; there are trade-offs between (a) simple strategies that can be associated with data that will be linked and shared with any party and (b) more complex strategies that preserve the privacy of individuals across parties. We propose an intermediary, practical strategy to support linkage in studies that share de-identified data with Data Coordinating Centers. This technology can be extended to link data across multiple data hubs to support privacy-preserving record linkage, considering data coordination centers and their awardees, which can be extended to a hierarchy of entities (e.g., awardees, data coordination centers, data hubs, etc.).
Affiliation(s)
- Lucila Ohno-Machado
  - UCSD Health Department of Biomedical Informatics, University of California San Diego Health, La Jolla, CA, USA; Biomedical Informatics & Data Science, Yale School of Medicine, New Haven, CT.
- Xiaoqian Jiang
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Tsung-Ting Kuo
  - UCSD Health Department of Biomedical Informatics, University of California San Diego Health, La Jolla, CA, USA
- Shiqiang Tao
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Luyao Chen
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Pritham M Ram
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Guo-Qiang Zhang
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Hua Xu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA; Biomedical Informatics & Data Science, Yale School of Medicine, New Haven, CT
|
43
|
Galende-Domínguez I, Rivero-Lezcano OM. Ethical considerations about the collection of biological samples for genetic analysis in clinical trials. RESEARCH ETHICS 2023. [DOI: 10.1177/17470161231152077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
Progress in precision medicine is being achieved through the design of clinical trials that use genetic biomarkers to guide stratification of patients and assignation to treatment or control groups. Genetic analysis of biomarkers is, therefore, essential to complete their objectives, and this involves the study of biological samples from donor patients that have been recruited according to criteria previously established in the design of the clinical trial. Nevertheless, it is becoming very common that, in the solicitation of biological samples, purposes that are beyond the objectives of the stated therapeutic trial research are introduced, like the development of ill-explained exploratory studies or the use in unspecified future research. In the digital era, the sequencing of patients’ DNA needs to be considered as a serious security matter, not only for the patients, but also for their relatives. Genetic information may be easily stored, even forever, in digital files. This engenders a permanent risk of being stolen or misused in many ways. Furthermore, re-identification of sample donors is technically feasible through their genetic data. For these reasons, genetic analysis of samples collected in clinical trials should be restricted to the accomplishment of their main objectives or well justified goals.
Affiliation(s)
- Inés Galende-Domínguez
  - Unidad de Apoyo SG Formación y ADS, Spain
  - D.G. de Investigación, Docencia y Documentación
  - Consejería de Sanidad, Comunidad de Madrid. Spain
- Octavio M Rivero-Lezcano
  - Complejo Asistencial Universitario de León-Unidad de Investigación, Spain
  - Gerencia Regional de Salud de Castilla y León (SACYL), Spain
  - Institute of Biomedical Research of Salamanca (IBSAL), Spain
  - Institute of Biomedicine. University of León, Spain
|
44
|
Reales G, Wallace C. Sharing GWAS summary statistics results in more citations. Commun Biol 2023; 6:116. [PMID: 36709395 PMCID: PMC9884206 DOI: 10.1038/s42003-023-04497-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/23/2022] [Accepted: 01/17/2023] [Indexed: 01/29/2023] Open
Abstract
A review of citation rates from genomic studies in the GWAS Catalog suggests that sharing summary statistics results, on average, in ~81.8% more citations, highlighting a benefit of publicly sharing GWAS summary statistics.
Affiliation(s)
- Guillermo Reales
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease (CITIID), University of Cambridge, Cambridge, UK.
  - Department of Medicine, University of Cambridge, Cambridge, UK.
- Chris Wallace
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease (CITIID), University of Cambridge, Cambridge, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
|
45
|
Oza VH, Whitlock JH, Wilk EJ, Uno-Antonison A, Wilk B, Gajapathy M, Howton TC, Trull A, Ianov L, Worthey EA, Lasseigne BN. Ten simple rules for using public biological data for your research. PLoS Comput Biol 2023; 19:e1010749. [PMID: 36602970 DOI: 10.1371/journal.pcbi.1010749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023] Open
Abstract
With an increasing amount of biological data available publicly, there is a need for a guide on how to successfully download and use this data. The 10 simple rules for using public biological data are: (1) use public data purposefully in your research; (2) evaluate data for your use case; (3) check data reuse requirements and embargoes; (4) be aware of ethics for data reuse; (5) plan for data storage and compute requirements; (6) know what you are downloading; (7) download programmatically and verify integrity; (8) properly cite data; (9) make reprocessed data and models Findable, Accessible, Interoperable, and Reusable (FAIR) and share; and (10) make pipelines and code FAIR and share. These rules are intended as a guide for researchers wanting to make use of available data and to increase data reuse and reproducibility.
Collapse
Affiliation(s)
- Vishal H Oza
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Jordan H Whitlock
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Elizabeth J Wilk
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Angelina Uno-Antonison
- Center for Computational Genomics and Data Sciences, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pediatrics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pathology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Brandon Wilk
- Center for Computational Genomics and Data Sciences, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pediatrics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pathology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Manavalan Gajapathy
- Center for Computational Genomics and Data Sciences, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pediatrics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pathology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Timothy C Howton
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Austyn Trull
- Center for Computational Genomics and Data Sciences, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pediatrics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pathology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Lara Ianov
- Civitan International Research Center, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Elizabeth A Worthey
- Center for Computational Genomics and Data Sciences, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pediatrics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Department of Pathology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Brittany N Lasseigne
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
|
46
|
Indonesian Scientists’ Behavior Relative to Research Data Governance in Preventing WMD-Applicable Technology Transfer. PUBLICATIONS 2022. [DOI: 10.3390/publications10040050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022] Open
Abstract
Research data governance is critical for preventing the transfer of technologies related to weapons of mass destruction (WMD). While research data governance is common in developed countries, research organizations in developing countries such as Indonesia still often consider it less necessary. This study investigated the research data governance behavior of Indonesian scientists. The theory of planned behavior (TPB) and protection motivation theory (PMT) were used to explain the relationships between the factors influencing scientists’ behavior. Both theories have been widely used in the information security domain, and that approach was adopted to build the research model of this study. The data obtained were analyzed using partial least-squares structural equation modeling (PLS-SEM) to answer the main research question: “What factors determine the likelihood of Indonesian scientists practicing research data governance to prevent WMD-applicable technology transfer?” By learning what motivates scientists to adopt research data governance practices, organizations can design strategies directed explicitly at stimulating positive responses. The results of this study can also be applied in other developing countries whose situations are similar to Indonesia's.
|
47
|
Al Aziz MM, Thulasiraman P, Mohammed N. Parallel and private generalized suffix tree construction and query on genomic data. BMC Genom Data 2022; 23:45. [PMID: 35715724 PMCID: PMC9206251 DOI: 10.1186/s12863-022-01053-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Accepted: 04/25/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Technological advances and the digitization of healthcare data have provided the scientific community with a large quantity of genomic data. Such datasets have facilitated a deeper understanding of several diseases and of our health in general. However, these genome datasets require a large storage volume and present technical challenges in retrieving meaningful information. Furthermore, the privacy aspects of genomic data limit access and often hinder timely scientific discovery. Methods: In this paper, we utilize the Generalized Suffix Tree (GST), whose construction and applications have been fairly well studied in related areas. The main contribution of this article is a privacy-preserving string query execution framework using GSTs and an additional tree-based hashing mechanism. We start by introducing an efficient parallel GST construction that scales to large genomic datasets. The secure indexing scheme allows the genomic data in a GST to be outsourced to an untrusted cloud server under encryption. Additionally, the proposed methods can perform several string search operations (i.e., exact and set-maximal matches) securely and efficiently using the outlined framework. Results: Experimental results on different datasets and parameters in a real cloud environment demonstrate the scalability of these methods, which also outperform the state-of-the-art method based on the Burrows-Wheeler Transform (BWT). The proposed method takes only around 36.7 s to execute a set-maximal match, whereas the BWT-based method takes around 160.85 s, a roughly 4× speedup. Supplementary Information: The online version contains supplementary material available at 10.1186/s12863-022-01053-x.
|
48
|
Wang S, Kim M, Li W, Jiang X, Chen H, Harmanci A. Privacy-aware estimation of relatedness in admixed populations. Brief Bioinform 2022; 23:bbac473. [PMID: 36384083 PMCID: PMC10144692 DOI: 10.1093/bib/bbac473] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Revised: 09/07/2022] [Accepted: 10/02/2022] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Estimation of genetic relatedness, or kinship, is used occasionally for recreational purposes and in forensic applications. While numerous methods have been developed to estimate kinship, they suffer from high computational requirements and often make an untenable assumption of homogeneous population ancestry of the samples. Moreover, genetic privacy is generally overlooked in the usage of kinship estimation methods. There can be ethical concerns about finding unknown familial relationships in third-party databases. Similar ethical concerns may arise when estimating and reporting sensitive population-level statistics such as inbreeding coefficients, because of the risk of marginalization and stigmatization. RESULTS: Here, we present SIGFRIED, which makes use of existing reference panels with a projection-based approach that simplifies kinship estimation in admixed populations. We use simulated and real datasets to demonstrate the accuracy and efficiency of kinship estimation. We present a secure federated kinship estimation framework and implement a secure kinship estimator using homomorphic encryption-based primitives for computing relatedness between samples at two different sites while genotype data are kept confidential. Source code and documentation for our methods can be found at https://doi.org/10.5281/zenodo.7053352. CONCLUSIONS: Analysis of relatedness is fundamentally important for identifying relatives, in association studies, and for population-level estimates of inbreeding. As awareness of individual and group genomic privacy grows, privacy-preserving methods for the estimation of relatedness are needed. The presented methods alleviate the ethical and privacy concerns in the analysis of relatedness in admixed, historically isolated and underrepresented populations.
SHORT ABSTRACT: Genetic relatedness is a central quantity used for finding relatives in databases, correcting biases in genome-wide association studies and estimating population-level statistics. Methods for estimating genetic relatedness have high computational requirements, and occasionally do not consider individuals from admixed ancestries. Furthermore, the ethical concerns around using genetic data and calculating relatedness are not considered. We present a projection-based approach that can efficiently and accurately estimate kinship. We implement our method using encryption-based techniques that provide provable security guarantees to protect genetic data while kinship statistics are computed among multiple sites.
Affiliation(s)
- Su Wang
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Miran Kim
- Department of Mathematics, Hanyang University, Seoul 04763, Republic of Korea
| | - Wentao Li
- Center for Secure Artificial intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Xiaoqian Jiang
- Center for Secure Artificial intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Han Chen
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Arif Harmanci
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
|
49
|
Boscarino N, Cartwright RA, Fox K, Tsosie KS. Federated learning and Indigenous genomic data sovereignty. NAT MACH INTELL 2022; 4:909-911. [PMID: 36504698 PMCID: PMC9731328 DOI: 10.1038/s42256-022-00551-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Indigenous peoples are under-represented in genomic datasets, which can lead to limited accuracy and utility of machine learning models in precision health. While open data sharing undermines rights of Indigenous communities to govern data decisions, federated learning may facilitate secure and community-consented data sharing.
Affiliation(s)
| | - Reed A. Cartwright
- The Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Keolu Fox
- Department of Anthropology and Global Health, University of California, San Diego, La Jolla, CA, USA
|
50
|
Jaworski BK, Webb Hooper M, Aklin WM, Jean-Francois B, Elwood WN, Belis D, Riley WT, Hunter CM. Advancing digital health Equity: Directions for behavioral and social science research. Transl Behav Med 2022; 13:132-139. [PMID: 36318232 DOI: 10.1093/tbm/ibac088] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/07/2022] Open
Abstract
The field of digital health is evolving rapidly and encompasses a wide range of complex and changing technologies used to support individual and population health. The COVID-19 pandemic has augmented digital health expansion and significantly changed how digital health technologies are used. To ensure that these technologies do not create or exacerbate existing health disparities, a multi-pronged and comprehensive research approach is needed. In this commentary, we outline five recommendations for behavioral and social science researchers that are critical to promoting digital health equity. These recommendations include: (i) centering equity in research teams and theoretical approaches, (ii) focusing on issues of digital health literacy and engagement, (iii) using methods that elevate perspectives and needs of underserved populations, (iv) ensuring ethical approaches for collecting and using digital health data, and (v) developing strategies for integrating digital health tools within and across systems and settings. Taken together, these recommendations can help advance the science of digital health equity and justice.
Affiliation(s)
- Beth K Jaworski
- Office of Behavioral and Social Sciences Research, National Institutes of Health, Bethesda, MD, USA
| | - Monica Webb Hooper
- National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
| | - Will M Aklin
- National Institute on Drug Abuse, National Institutes of Health, Bethesda, MD, USA
| | - Beda Jean-Francois
- National Center for Complementary and Integrative Health, National Institutes of Health, Bethesda, MD, USA
| | - William N Elwood
- Office of Behavioral and Social Sciences Research, National Institutes of Health, Bethesda, MD, USA
| | - Deshirée Belis
- Office of Behavioral and Social Sciences Research, National Institutes of Health, Bethesda, MD, USA
| | - William T Riley
- Office of Behavioral and Social Sciences Research, National Institutes of Health, Bethesda, MD, USA
| | - Christine M Hunter
- Office of Behavioral and Social Sciences Research, National Institutes of Health, Bethesda, MD, USA
|