1
Wang S, Kim M, Li W, Jiang X, Chen H, Harmanci A. Privacy-aware estimation of relatedness in admixed populations. Brief Bioinform 2022; 23:bbac473. [PMID: 36384083 PMCID: PMC10144692 DOI: 10.1093/bib/bbac473] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Revised: 09/07/2022] [Accepted: 10/02/2022] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Estimation of genetic relatedness, or kinship, is used occasionally for recreational purposes and in forensic applications. While numerous methods have been developed to estimate kinship, they suffer from high computational requirements and often make an untenable assumption of homogeneous population ancestry of the samples. Moreover, genetic privacy is generally overlooked in the usage of kinship estimation methods. There can be ethical concerns about finding unknown familial relationships in third-party databases. Similar ethical concerns may arise when estimating and reporting sensitive population-level statistics such as inbreeding coefficients, because of the risk of marginalization and stigmatization.
RESULTS: Here, we present SIGFRIED, which uses existing reference panels with a projection-based approach that simplifies kinship estimation in admixed populations. We use simulated and real datasets to demonstrate the accuracy and efficiency of kinship estimation. We present a secure federated kinship estimation framework and implement a secure kinship estimator using homomorphic encryption-based primitives for computing relatedness between samples at two different sites while genotype data are kept confidential. Source code and documentation for our methods can be found at https://doi.org/10.5281/zenodo.7053352.
CONCLUSIONS: Analysis of relatedness is fundamentally important for identifying relatives, in association studies, and for population-level estimates of inbreeding. As awareness of individual and group genomic privacy grows, privacy-preserving methods for the estimation of relatedness are needed. The presented methods alleviate the ethical and privacy concerns in the analysis of relatedness in admixed, historically isolated and underrepresented populations.
SHORT ABSTRACT: Genetic relatedness is a central quantity used for finding relatives in databases, correcting biases in genome-wide association studies and estimating population-level statistics. Methods for estimating genetic relatedness have high computational requirements, and occasionally do not consider individuals from admixed ancestries. Furthermore, the ethical concerns around using genetic data and calculating relatedness are not considered. We present a projection-based approach that can efficiently and accurately estimate kinship. We implement our method using encryption-based techniques that provide provable security guarantees to protect genetic data while kinship statistics are computed among multiple sites.
Affiliation(s)
- Su Wang
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Miran Kim
  - Department of Mathematics, Hanyang University, Seoul 04763, Republic of Korea
- Wentao Li
  - Center for Secure Artificial intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Xiaoqian Jiang
  - Center for Secure Artificial intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
- Han Chen
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Arif Harmanci
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
2
Gross MS, Hood AJ, Rubin JC, Miller RC. Respect, justice and learning are limited when patients are deidentified data subjects. Learn Health Syst 2022; 6:e10303. [PMID: 35860318 PMCID: PMC9284924 DOI: 10.1002/lrh2.10303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2021] [Revised: 12/01/2021] [Accepted: 01/04/2022] [Indexed: 11/14/2022] Open
Abstract
Introduction: Critical for advancing a Learning Health System (LHS) in the U.S., a regulatory safe harbor for deidentified data reduces barriers to learning from care at scale while minimizing privacy risks. We examine deidentified data policy as a mechanism for synthesizing the ethical obligations underlying clinical care and human subjects research for an LHS which conceptually and practically integrates care and research, blurring the roles of patient and subject.
Methods: First, we discuss respect for persons vis‐a‐vis the systemic secondary use of data and tissue collected in the fiduciary context of clinical care. We argue that, without traditional informed consent or duty to benefit the individual, deidentification may allow secondary use to supersede the primary purpose of care. Next, we consider the effectiveness of deidentification for minimizing harms via privacy protection and maximizing benefits via promoting learning and translational care. We find that deidentification is unable to fully protect privacy given the vastness of health data and current technology, yet it imposes limitations to learning and barriers for efficient translation. After that, we evaluate the impact of deidentification on distributive justice within an LHS ethical framework in which patients are obligated to contribute to learning and the system has a duty to translate knowledge into better care. Such a system may permit exacerbation of health disparities as it accelerates learning without mechanisms to ensure that individuals' contributions and benefits are fair and balanced.
Results: We find that, despite its established advantages, system‐wide use of deidentification may be suboptimal for signaling respect, protecting privacy or promoting learning, and satisfying requirements of justice for patients and subjects.
Conclusions: Finally, we highlight ethical, socioeconomic, technological and legal challenges and next steps, including a critical appreciation for novel approaches to realize an LHS that maximizes efficient, effective learning and just translation without the compromises of deidentification.
Affiliation(s)
- Marielle S. Gross
  - University of Pittsburgh Department of Obstetrics, Gynecology and Reproductive Sciences, University of Pittsburgh Center for Bioethics and Health Law, Johns Hopkins Berman Institute of Bioethics, Pittsburgh, Pennsylvania, USA
  - Johns Hopkins Berman Institute of Bioethics, Baltimore, Maryland, USA
- Amelia J. Hood
  - Johns Hopkins Berman Institute of Bioethics, Baltimore, Maryland, USA
- Joshua C. Rubin
  - Learning Health Systems Initiative, University of Michigan Medical School, Ann Arbor, Michigan, USA
3
Crossfield SSR, Zucker K, Baxter P, Wright P, Fistein J, Markham AF, Birkin M, Glaser AW, Hall G. A data flow process for confidential data and its application in a health research project. PLoS One 2022; 17:e0262609. [PMID: 35061834 PMCID: PMC8782367 DOI: 10.1371/journal.pone.0262609] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2021] [Accepted: 12/29/2021] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to define such an approach with clear steps for dataset creation, and to describe its utilisation in a case study linking healthcare data.
METHODS: We developed a data flow protocol that generates pseudonymous datasets that can be reversibly linked, or irreversibly linked to form an anonymous research dataset. It was designed and implemented by the Comprehensive Patient Records (CPR) study in Leeds, UK.
RESULTS: We defined a clear approach that received ethico-legal approval for use in creating an anonymous research dataset. Our approach used individual-level linkage through a mechanism that is not computer-intensive and was rendered irreversible to both data providers and processors. We successfully applied it in the CPR study to hospital and general practice and community electronic health record data from two providers, along with patient reported outcomes, for 365,193 patients. The resultant anonymous research dataset is available via DATA-CAN, the Health Data Research Hub for Cancer in the UK.
CONCLUSIONS: Through ethical, legal and academic review, we believe that we contribute a defined approach that represents a framework that exceeds current minimum standards for effective pseudonymisation and anonymisation. This paper describes our methods and provides supporting information to facilitate the use of this approach in research.
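For readers unfamiliar with pseudonymous linkage, one widely used building block is a keyed hash of a patient identifier. The sketch below is illustrative only and is not the CPR study's protocol: the function names, the `patient_id` field, the sample records and the key handling are all assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a keyed-hash pseudonym for an identifier (hypothetical helper)."""
    # Normalise so trivial formatting differences do not break the match.
    normalised = identifier.strip().upper().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


def link(records_a, records_b, key):
    """Join two record lists on the pseudonym of their 'patient_id' field."""
    index = {pseudonymise(r["patient_id"], key): r for r in records_a}
    linked = []
    for record in records_b:
        pid = pseudonymise(record["patient_id"], key)
        if pid in index:
            # Copy every field except the raw identifier into the output row.
            merged = {k: v for k, v in index[pid].items() if k != "patient_id"}
            merged.update({k: v for k, v in record.items() if k != "patient_id"})
            merged["pseudonym"] = pid
            linked.append(merged)
    return linked


# Made-up example data; the identifier and field values are not real.
key = secrets.token_bytes(32)
hospital = [{"patient_id": "943 476 5919", "diagnosis": "C50"}]
general_practice = [{"patient_id": "943 476 5919 ", "smoker": False}]
linked = link(hospital, general_practice, key)
```

In this sketch, anyone holding both the key and the identifiers can recompute the pseudonyms, so the linkage can be repeated or traced back. Once the key is discarded, the pseudonyms can no longer be re-derived from identifiers, which is one simple way to think about the difference between reversible and irreversible linkage.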
Affiliation(s)
- Kieran Zucker
  - Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, United Kingdom
- Paul Baxter
  - Leeds Institute of Cardiovascular and Metabolic Medicine, University of Leeds, Leeds, United Kingdom
- Penny Wright
  - Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, United Kingdom
- Jon Fistein
  - Leeds Institute for Data Analytics, University of Leeds, Leeds, United Kingdom
- Alex F. Markham
  - Leeds Institute for Data Analytics, University of Leeds, Leeds, United Kingdom
  - Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, United Kingdom
- Mark Birkin
  - Leeds Institute for Data Analytics, University of Leeds, Leeds, United Kingdom
- Adam W. Glaser
  - Leeds Institute for Data Analytics, University of Leeds, Leeds, United Kingdom
  - Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, United Kingdom
- Geoff Hall
  - Leeds Institute for Data Analytics, University of Leeds, Leeds, United Kingdom
  - Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, United Kingdom
4
Soares N, Singhal S, Kloosterman C, Bailey T. An Interdisciplinary Approach to Reducing Errors in Extracted Electronic Health Record Data for Research. Perspectives in Health Information Management 2021; 18:1f. [PMID: 34035787 PMCID: PMC8120677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Erroneous electronic health record (EHR) data capture is a barrier to preserving data integrity. We assessed the impact of an interdisciplinary process in minimizing EHR data loss from prescription orders. We implemented a three-step approach to reduce data loss due to missing medication doses: Step 1: a data analyst updated the request code to optimize data capture; Step 2: a pharmacist and physician identified variations in EHR prescription workflows; and Step 3: the clinician team determined daily doses for patients with multiple prescriptions in the same encounter. The initial report contained 1421 prescriptions, with 377 (26.5 percent) missing dosages. Missing dosages fell to 361 (26.3 percent) prescriptions following Step 1 and to 23 (1.7 percent) records after Step 2. After Step 3, 1210 prescriptions remained, including 16 (1.3 percent) prescriptions missing doses. Prescription data are susceptible to missing values due to multiple data capture workflows. Our approach minimized data loss, improving the validity of the data in retrospective research.
5
Wilhite JA, Altshuler L, Zabar S, Gillespie C, Kalet A. Development and maintenance of a medical education research registry. BMC Medical Education 2020; 20:199. [PMID: 32560652 PMCID: PMC7305610 DOI: 10.1186/s12909-020-02113-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/20/2019] [Accepted: 06/15/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND: Medical education research suffers from several methodological limitations, including too many single-institution studies with small samples, limited access to quality data, and insufficient institutional support. Increasing calls for medical education outcome data and quality improvement research have highlighted a critical need for uniformly clean and easily accessible data. Research registries may fill this gap. In 2006, the Research on Medical Education Outcomes (ROMEO) unit of the Program for Medical Education Innovations and Research (PrMEIR) at New York University's (NYU) Robert I. Grossman School of Medicine established the Database for Research on Academic Medicine (DREAM). DREAM is a database of routinely collected, de-identified undergraduate (UME, medical school leading up to the Medical Doctor degree) and graduate medical education (GME, residency, also known as postgraduate education, leading to eligibility for specialty board certification) outcomes data available, through application, to researchers. Learners are added to our database through annual consent sessions conducted at the start of educational training. Based on experience, we describe our methods in creating and maintaining DREAM to serve as a guide for institutions looking to build a new medical education registry or scale up an existing one.
RESULTS: At present, our UME and GME registries have consent rates of 90% (n = 1438/1598) and 76% (n = 1988/2627), respectively, with a combined rate of 81% (n = 3426/4225). Of these learners, 7% (n = 250/3426) completed both medical school and residency at our institution. DREAM has yielded a total of 61 individual studies conducted by medical education researchers and a total of 45 academic journal publications.
CONCLUSION: We have built a community of practice through the building of DREAM and hope that, by persisting in this work, the full potential of this tool and the community will be realized. While researchers with access to the registry have focused primarily on curricular/program evaluation, learner competency assessment, and measure validation, we hope to expand the output of the registry to include patient outcomes by linking learner educational and clinical performance across the UME-GME continuum and into independent practice. Future publications will reflect our efforts in reaching this goal and will highlight the long-term impact of our collaborative work.
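The consent rates reported in this abstract can be recomputed from the counts it gives. The short check below is a sketch written for this listing, not part of the cited paper; the `rate` helper and variable names are illustrative.

```python
def rate(numerator: int, denominator: int) -> int:
    """Return a proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Counts as reported in the abstract: (consented, approached).
ume = (1438, 1598)
gme = (1988, 2627)
combined = (ume[0] + gme[0], ume[1] + gme[1])

print(rate(*ume))        # 90
print(rate(*gme))        # 76
print(combined)          # (3426, 4225)
print(rate(*combined))   # 81
print(rate(250, 3426))   # 7
```

The combined figure is a pooled rate (total consented over total approached), not the average of the two registry rates, which is why it sits closer to the larger GME cohort's 76%.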
Affiliation(s)
- Jeffrey A Wilhite
  - Department of Medicine, Division of General Internal Medicine and Clinical Innovation, NYU Robert I. Grossman School of Medicine, 462 1st Avenue, New York, NY, 10016, USA
- Lisa Altshuler
  - Department of Medicine, Division of General Internal Medicine and Clinical Innovation, NYU Robert I. Grossman School of Medicine, 462 1st Avenue, New York, NY, 10016, USA
- Sondra Zabar
  - Department of Medicine, Division of General Internal Medicine and Clinical Innovation, NYU Robert I. Grossman School of Medicine, 462 1st Avenue, New York, NY, 10016, USA
- Colleen Gillespie
  - Department of Medicine, Division of General Internal Medicine and Clinical Innovation, NYU Robert I. Grossman School of Medicine, 462 1st Avenue, New York, NY, 10016, USA
  - Institute for Innovations in Medical Education, Division of Education Quality, 550 First Avenue, Medical Science Building, Suite G107, New York, NY, 10016, USA
- Adina Kalet
  - Department of Medicine, Division of General Internal Medicine and Clinical Innovation, NYU Robert I. Grossman School of Medicine, 462 1st Avenue, New York, NY, 10016, USA
  - Robert D. and Patricia E. Kern Institute for the Transformation of Medical Education, Medical College of Wisconsin, 8701 W. Watertown Plank Road, Wauwatosa, WI, 53226, USA
6
DASH, the data and specimen hub of the National Institute of Child Health and Human Development. Sci Data 2018; 5:180046. [PMID: 29557977 PMCID: PMC5859878 DOI: 10.1038/sdata.2018.46] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/11/2017] [Accepted: 02/15/2018] [Indexed: 11/09/2022] Open
Abstract
The benefits of data sharing are well-established and an increasing number of policies require that data be shared upon publication of the main study findings. As data sharing becomes the new norm, there is a heightened need for additional resources to drive efficient data reuse. This article describes the development and implementation of the Data and Specimen Hub (DASH) by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) to promote data sharing from NICHD-funded studies and enable researchers to comply with NIH data sharing policies. DASH’s flexible architecture is designed to archive diverse data types and formats from NICHD’s broad scientific portfolio in a manner that promotes FAIR data sharing principles. Performance of DASH over two years since launch is promising: the number of available studies and data requests are growing; three manuscripts have been published from data reanalysis, all within two years of access. Critical success factors included NICHD leadership commitment, stakeholder engagement and close coordination between the governance body and technical team.
7
Chen F, Jiang X, Wang S, Schilling LM, Meeker D, Ong T, Matheny ME, Doctor JN, Ohno-Machado L, Vaidya J. Perfectly Secure and Efficient Two-Party Electronic-Health-Record Linkage. IEEE Internet Computing 2018; 22:32-41. [PMID: 29867290 PMCID: PMC5983039 DOI: 10.1109/mic.2018.112102542] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/03/2023]
Abstract
Patient health data are often spread across various sources. However, precision medicine and personalized care require access to complete medical records. The first step towards this is to enable the linkage of health records spread across different sites. Existing record linkage solutions assume that data is centralized, with no privacy or security concerns restricting sharing. However, that is often untrue. Therefore, we design and implement a portable method for privacy-preserving record linkage based on garbled circuits to accurately and securely match records. We also develop a novel approximate matching mechanism that significantly improves efficiency.
Affiliation(s)
- Feng Chen
  - Health System Department of Biomedical Informatics, UC San Diego, La Jolla, CA, 92093
- Xiaoqian Jiang
  - Health System Department of Biomedical Informatics, UC San Diego, La Jolla, CA, 92093
- Shuang Wang
  - Health System Department of Biomedical Informatics, UC San Diego, La Jolla, CA, 92093
- Lisa M. Schilling
  - Department of Medicine, University of Colorado, Anschutz Medical Campus, CO, 80045
- Daniella Meeker
  - Keck School of Medicine, University of Southern California, Los Angeles, CA 90089
- Toan Ong
  - Department of Medicine, University of Colorado, Anschutz Medical Campus, CO, 80045
- Michael E. Matheny
  - Geriatric Research Education and Clinical Care Service, Tennessee Valley Healthcare System VA, Nashville, TN 37212
  - Departments of Biomedical Informatics, Medicine, and Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37235
- Jason N. Doctor
  - Schaeffer Center for Health Policy & Economics, University of Southern California, Los Angeles, CA 90033
- Lucila Ohno-Machado
  - Health System Department of Biomedical Informatics, UC San Diego, La Jolla, CA, 92093
- Jaideep Vaidya
  - Management Science & Information Systems Department, Rutgers University, Newark, NJ 07102
8
Li G, Bankhead P, Dunne PD, O’Reilly PG, James JA, Salto-Tellez M, Hamilton PW, McArt DG. Embracing an integromic approach to tissue biomarker research in cancer: Perspectives and lessons learned. Brief Bioinform 2017; 18:634-646. [PMID: 27255914 PMCID: PMC5862317 DOI: 10.1093/bib/bbw044] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/26/2016] [Revised: 04/08/2016] [Indexed: 02/07/2023] Open
Abstract
Modern approaches to biomedical research and diagnostics targeted towards precision medicine are generating 'big data' across a range of high-throughput experimental and analytical platforms. Integrative analysis of this rich clinical, pathological, molecular and imaging data represents one of the greatest bottlenecks in biomarker discovery research in cancer and other diseases. Following on from the publication of our successful framework for multimodal data amalgamation and integrative analysis, Pathology Integromics in Cancer (PICan), this article will explore the essential elements of assembling an integromics framework from a more detailed perspective. PICan, built around a relational database storing curated multimodal data, is the research tool sitting at the heart of our interdisciplinary efforts to streamline biomarker discovery and validation. While recognizing that every institution has a unique set of priorities and challenges, we will use our experiences with PICan as a case study and starting point, rationalizing the design choices we made within the context of our local infrastructure and specific needs, but also highlighting alternative approaches that may better suit other programmes of research and discovery. Along the way, we stress that integromics is not just a set of tools, but rather a cohesive paradigm for how modern bioinformatics can be enhanced. Successful implementation of an integromics framework is a collaborative team effort that is built with an eye to the future and greatly accelerates the processes of biomarker discovery, validation and translation into clinical practice.
Affiliation(s)
- Gerald Li
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Peter Bankhead
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Philip D Dunne
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Paul G O’Reilly
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Jacqueline A James
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Manuel Salto-Tellez
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Peter W Hamilton
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
- Darragh G McArt
  - Centre for Cancer Research and Cell Biology (CCRCB), Queen’s University Belfast, Belfast, United Kingdom
9
Andry C, Duffy E, Moskaluk CA, McCall S, Roehrl MHA, Remick D. Biobanking-Budgets and the Role of Pathology Biobanks in Precision Medicine. Acad Pathol 2017; 4:2374289517702924. [PMID: 28725790 PMCID: PMC5497908 DOI: 10.1177/2374289517702924] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/02/2016] [Revised: 02/28/2017] [Accepted: 03/04/2017] [Indexed: 12/29/2022] Open
Abstract
Biobanks have become an important component of the routine practice of pathology. At the 2016 meeting of the Association of Pathology Chairs, a series of presentations covered several important aspects of biobanking. Fiscal considerations are an often overlooked aspect of biobanking. A biobank budget must address the costs of consenting, procuring, processing, and preserving high-quality biospecimens. Multiple revenue streams will frequently be necessary to create a sustainable biobank; partnering with other key stakeholders has been shown to be successful at academic institutions, which may serve as a model. Biobanking needs to be a deeply science-driven and innovative process so that specimens help transform patient-centered clinical and basic research (ie, fulfill the promise of precision medicine). Pathology’s role must be at the center of the biobanking process. This ensures that optimal research samples are collected while guaranteeing that clinical diagnostics are never impaired. Biobanks will continue to grow as important components in the mission of pathology, especially in the era of precision medicine.
Affiliation(s)
- Chris Andry
  - Boston Medical Center and Boston University School of Medicine, Boston, MA, USA
- Elizabeth Duffy
  - Boston Medical Center and Boston University School of Medicine, Boston, MA, USA
- Shannon McCall
  - Department of Pathology, Duke University School of Medicine, Durham, NC, USA
- Daniel Remick
  - Boston Medical Center and Boston University School of Medicine, Boston, MA, USA
10
Dankar FK, Badji R. A risk-based framework for biomedical data sharing. J Biomed Inform 2017; 66:231-240. [PMID: 28126604 DOI: 10.1016/j.jbi.2017.01.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/01/2016] [Revised: 12/15/2016] [Accepted: 01/19/2017] [Indexed: 01/21/2023]
Abstract
The problem of biomedical data sharing is a form of gambling; on one hand it incurs the risk of privacy violations and on the other it stands to profit from knowledge discovery. In general, the risk of granting data access to a user depends heavily upon the data requested, the purpose for the access, the user requesting the data (user motives) and the security of the user's environment. While traditional manual biomedical data sharing processes (based on institutional review boards) are lengthy and demanding, the automated ones (known as honest broker systems) disregard the individualities of different requests and offer "one-size-fits-all" solutions to all data requestors. In this manuscript, we propose a conceptual risk-aware data sharing system; the system brings the concept of risk, from all contextual information surrounding a data request, into the data disclosure decision module. The decision module, in turn, imposes mitigation measures to counter the calculated risk.
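To make the idea of a risk-aware disclosure decision concrete, the toy sketch below scores a request on the four contextual factors named in the abstract and maps the score to a mitigation tier. It is not the authors' model: the weights, thresholds, 0 to 1 scales and mitigation labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRequest:
    """Contextual risk factors, each scored from 0 (low) to 1 (high)."""
    data_sensitivity: float
    purpose_risk: float
    user_risk: float
    environment_risk: float


# Hypothetical weights; a real system would need to calibrate these.
WEIGHTS = {
    "data_sensitivity": 0.4,
    "purpose_risk": 0.2,
    "user_risk": 0.2,
    "environment_risk": 0.2,
}


def risk_score(request: DataRequest) -> float:
    """Weighted sum of the contextual factors, in the range 0 to 1."""
    return sum(weight * getattr(request, name) for name, weight in WEIGHTS.items())


def mitigation(score: float) -> str:
    """Map a risk score to an illustrative mitigation tier."""
    if score < 0.3:
        return "release de-identified extract"
    if score < 0.6:
        return "release generalised extract under a data use agreement"
    return "remote analysis only, no data download"


low = DataRequest(0.2, 0.1, 0.1, 0.2)
high = DataRequest(0.9, 0.6, 0.7, 0.8)
decisions = [mitigation(risk_score(r)) for r in (low, high)]
```

The point of the sketch is the shape of the pipeline (context in, score out, mitigation chosen per request) rather than the particular numbers, which contrasts with a "one-size-fits-all" release policy.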
Affiliation(s)
- Fida K Dankar
  - College of IT, UAEU, P.O. Box 15551, Al Ain, United Arab Emirates
- Radja Badji
  - Sidra Medical and Research Center, P.O. Box 26999, Doha, Qatar
11
Iafrate RP, Lipori GP, Harle CA, Nelson DR, Barnash TJ, Leebove PT, Adams KA, Montgomery D. Consent2Share: an integrated broad consenting process for re-contacting potential study subjects. J Clin Transl Res 2016; 2:113-122. [PMID: 30873469 PMCID: PMC6410634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2016] [Revised: 09/09/2016] [Accepted: 10/07/2016] [Indexed: 11/16/2022] Open
Abstract
Background and Aim: Enrolling sufficient subjects in research studies is an ongoing barrier to conducting clinical research. Privacy rules add to the complexity of identifying qualified study subjects. The process described here asks patients attending clinically scheduled appointments to consent to having researchers review their electronic health records (EHR) and, if they meet study criteria for future research, to being contacted by those researchers and asked whether they wish to be involved in a research project. Methods: An interdisciplinary group representing the Institutional Review Board (IRB), Information Technology (IT), Hospital, University and Research developed an initial paper-based, then electronic, method to consent all patients attending a medical subspecialty clinic. All consent data are integrated into the EHR to facilitate linking to clinical information. Results: Although the paper consenting method resulted in a "yes" rate of consent of over 80%, it was complicated by significant procedural challenges which prevented scalability. Revising the process has resulted in nearly 28,000 patients consenting in a 3-year period and in 20 IRB-approved protocols using subjects who agreed to Consent2Share. Conclusions: A multi-disciplinary effort is essential to develop a successful electronic, integrated process that helps investigators and patients with study subject accrual. Relevance for patients: Consent2Share more efficiently assists researchers in identifying and contacting potential study subjects who meet entrance criteria. The process provides a model to comply with the Notice of Proposed Rulemaking (NPRM), under which institutions will be strongly encouraged to develop broad research consent procedures.
Affiliation(s)
- R Peter Iafrate
  - Institutional Review Board, University of Florida, Gainesville, Florida, United States
- Gloria P Lipori
  - Operational Planning & Analysis, University of Florida Health and University of Florida Health Sciences Center, Gainesville, Florida, United States
- Christopher A Harle
  - Department of Health Policy and Management, Indiana University, Indianapolis, Indiana, United States
- David R Nelson
  - Department of Medicine, University of Florida, Gainesville, Florida, United States
- Timothy J Barnash
  - Practice Management Applications, UF Health Physicians, University of Florida, Gainesville, Florida, United States
- Patricia T Leebove
  - Medical Specialties and Transplant Clinic, UF Health Physicians, University of Florida, Gainesville, Florida, United States
- Kathleen A Adams
  - Medical Specialties and Transplant Clinic, UF Health Physicians, University of Florida, Gainesville, Florida, United States
- Debbi Montgomery
  - Information Technology, UF Health, Gainesville, Florida, United States
12
Dayyani F, Zurita AJ, Nogueras-González GM, Slack R, Millikan RE, Araujo JC, Gallick GE, Logothetis CJ, Corn PG. The combination of serum insulin, osteopontin, and hepatocyte growth factor predicts time to castration-resistant progression in androgen dependent metastatic prostate cancer- an exploratory study. BMC Cancer 2016; 16:721. [PMID: 27599544 PMCID: PMC5013640 DOI: 10.1186/s12885-016-2723-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/25/2015] [Accepted: 08/10/2016] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND We hypothesized that pretreatment serum levels of insulin and other serum markers would predict progression-free survival (PFS), defined as time to castration-resistant progression or death, in metastatic androgen-dependent prostate cancer (mADPC). METHODS Serum samples from treatment-naïve men participating in a randomized phase 3 trial of ADT +/- chemotherapy were retrospectively analyzed using multiplex assays for insulin and multiple other soluble factors. Cox proportional hazards regression models were used to identify associations between individual factor levels and PFS. RESULTS Sixty-six patients were evaluable (median age = 72 years; median prostate-specific antigen [PSA] = 31.5 ng/mL; Caucasian = 86 %; Gleason score ≥8 = 77 %). In the univariable analysis, higher insulin (HR = 0.81 [0.67, 0.98]; p = 0.03) and C-peptide (HR = 0.62 [0.39, 1.00]; p = 0.05) levels were associated with a longer PFS, while higher Hepatocyte Growth Factor (HGF; HR = 1.63 [1.06, 2.51]; p = 0.03) and Osteopontin (OPN; HR = 1.56 [1.13, 2.15]; p = 0.01) levels were associated with a shorter PFS. In multivariable analysis, insulin below 2.1 (ln scale; HR = 2.55 [1.24, 5.23]; p = 0.011) and HGF above 8.9 (ln scale; HR = 2.67 [1.08, 3.70]; p = 0.027) were associated with shorter PFS after adjustment for OPN, C-peptide, trial therapy and metastatic volume. Four distinct risk groups were identified by counting the number of risk factors (RF) including low insulin, high HGF, high OPN levels, and low C-peptide levels (0, 1, 2, and 3). Median PFS was 9.8, 2.0, 1.6, and 0.7 years for each, respectively (p < 0.001). CONCLUSION Pretreatment serum insulin, HGF, OPN, and C-peptide levels can predict PFS in men with mADPC treated with ADT. Risk groups based on these factors are better predictors of PFS than each marker alone.
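The risk-factor counting described in this abstract can be sketched as follows. This is only an illustration, not the authors' code: the insulin and HGF cutoffs are the ln-scale values quoted above, while the OPN and C-peptide cutoffs are not given in the abstract and are therefore passed in as pre-computed flags. Capping the count at 3 to obtain four groups is an assumption, and all function names are hypothetical.

```python
# Cutoffs on the natural-log scale, as quoted in the abstract.
INSULIN_LN_CUTOFF = 2.1
HGF_LN_CUTOFF = 8.9


def count_risk_factors(ln_insulin, ln_hgf, high_opn, low_c_peptide):
    """Count the risk factors named in the abstract.

    high_opn and low_c_peptide are booleans supplied by the caller,
    because the abstract does not state cutoffs for those markers.
    """
    flags = [
        ln_insulin < INSULIN_LN_CUTOFF,  # low insulin
        ln_hgf > HGF_LN_CUTOFF,          # high HGF
        bool(high_opn),                  # high OPN
        bool(low_c_peptide),             # low C-peptide
    ]
    return sum(flags)


def risk_group(n_risk_factors):
    """Map a count to one of four groups (0, 1, 2, 3).

    Assumption: counts above 3 are folded into group 3.
    """
    return min(n_risk_factors, 3)
```

For example, a patient with ln insulin 1.5 and ln HGF 9.2 but neither of the other two flags would have two risk factors and fall in group 2.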
Affiliation(s)
- Farshid Dayyani
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | - Amado J Zurita
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | | | - Rebecca Slack
- Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
| | - Randall E Millikan
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | - John C Araujo
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | - Gary E Gallick
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | - Christopher J Logothetis
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA
| | - Paul G Corn
- Department of Genitourinary Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Dan L. Duncan Building (CPB7.3476), 1515 Holcombe Blvd., Unit 1374, Houston, TX, 77030, USA.
| |
|
13
|
Felmeister AS, Masino AJ, Rivera TJ, Resnick AC, Pennington JW. The biorepository portal toolkit: an honest brokered, modular service oriented software tool set for biospecimen-driven translational research. BMC Genomics 2016; 17 Suppl 4:434. [PMID: 27535360 PMCID: PMC5001241 DOI: 10.1186/s12864-016-2797-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND High throughput molecular sequencing and increased biospecimen variety have introduced significant informatics challenges for research biorepository infrastructures. We applied a modular system integration approach to develop an operational biorepository management system. This method enables aggregation of the clinical, specimen and genomic data collected for biorepository resources. METHODS We introduce an electronic Honest Broker (eHB) and Biorepository Portal (BRP) open source project that, in tandem, allow for data integration while protecting patient privacy. This modular approach allows data and specimens to be associated with a biorepository subject at any time point asynchronously. This lowers the bar to develop new research projects based on scientific merit without institutional review for a proposal. RESULTS By facilitating the automated de-identification of specimens and associated clinical and genomic data, we create a future-proofed specimen set that can withstand new workflows and be connected to new associated information over time, thus facilitating collaborative advanced genomic and tissue research. CONCLUSIONS As of January 2016, there are 23 unique protocols/patient cohorts being managed in the Biorepository Portal (BRP). There are over 4000 unique subject records in the electronic honest broker (eHB), over 30,000 specimens accessioned and 8 institutions participating in various biobanking activities using this tool kit. We specifically set out to build rich annotation of biospecimens with longitudinal clinical data; BRP/REDCap integration for multi-institutional repositories; EMR integration; further annotation of specimens with genomic data specific to a domain; and application hooks for experiments at the specimen level integrated with analytic software, all while protecting privacy per the Office for Civil Rights (OCR) and HIPAA.
Affiliation(s)
- Alex S Felmeister
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA, USA.
- College of Computing and Informatics, Drexel University, 3141 Chestnut Street, Philadelphia, PA, USA.
| | - Aaron J Masino
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA, USA
| | - Tyler J Rivera
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA, USA
| | - Adam C Resnick
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA, USA
- Department of Neurosurgery, Perelman School of Medicine at the University of Pennsylvania, 3400 Civic Center Boulevard, Building 421, Philadelphia, PA, USA
| | - Jeffrey W Pennington
- Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA, USA
| |
|
14
|
Drake BF, Brown K, McGowan LD, Haslag-Minoff J, Kaphingst K. Secondary consent to biospecimen use in a prostate cancer biorepository. BMC Res Notes 2016; 9:346. [PMID: 27431491 PMCID: PMC4949745 DOI: 10.1186/s13104-016-2159-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/25/2016] [Accepted: 07/13/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Biorepository research has substantial societal benefits. This is one of the few studies to focus on male willingness to allow future research use of biospecimens. METHODS This study analyzed the future research consent questions from a prostate cancer biorepository study (N = 1931). The consent form asked two questions regarding use of samples in future studies (1) without and (2) with protected health information (PHI). Yes to both questions of use of samples was categorized as Yes-Always; Yes to without and No to with PHI was categorized as Yes-Conditional; No to without PHI was categorized as Never. We analyzed this outcome to determine significant predictors for consent to Yes-Always vs. Yes-Conditional. RESULTS 99.33 % consented to future use of samples; 88.19 % consented to future use without PHI, and among those men 10.2 % consented to future use with PHI. Comparing Yes Always and Yes Conditional responses, bivariate analyses showed that race, family history, stage of cancer, and grade of cancer (Gleason), were significant at the α = 0.05 level. Using stepwise multivariable logistic regression, we found that African-American men were significantly more likely to respond Yes Always when compared to White men (p < 0.001). Those with a family history of prostate cancer were significantly more likely to respond Yes Always (p = 0.002). CONCLUSIONS There is general willingness to consent to future use of specimens without PHI among men.
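The three-way consent outcome defined in the methods of this abstract can be written as a small function. This is a minimal sketch of the categorization rule as stated there; the function and parameter names are hypothetical.

```python
def categorize_consent(without_phi: bool, with_phi: bool) -> str:
    """Classify answers to the two future-use consent questions.

    without_phi: answer to future use of samples WITHOUT PHI.
    with_phi:    answer to future use of samples WITH PHI.
    """
    if not without_phi:
        # "No" to use without PHI is categorized as Never.
        return "Never"
    # "Yes" to both is Yes-Always; "Yes" without PHI only is Yes-Conditional.
    return "Yes-Always" if with_phi else "Yes-Conditional"
```

Note that the rule as stated classifies any "No" to use without PHI as Never, regardless of the answer to the second question.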
Affiliation(s)
- Bettina F Drake
- Division of Public Health Sciences, Washington University School of Medicine, 600 S. Taylor Ave, Campus Box 8100, St. Louis, MO, 63110, USA.,Alvin J. Siteman Cancer Center, St. Louis, MO, USA
| | - Katherine Brown
- Division of Public Health Sciences, Washington University School of Medicine, 600 S. Taylor Ave, Campus Box 8100, St. Louis, MO, 63110, USA
| | | | | | - Kimberly Kaphingst
- Department of Communication, University of Utah, Salt Lake City, UT, USA.,Huntsman Cancer Institute, Salt Lake City, UT, USA
| |
|
15
|
Choi HJ, Lee MJ, Choi CM, Lee J, Shin SY, Lyu Y, Park YR, Yoo S. Establishing the role of honest broker: bridging the gap between protecting personal health data and clinical research efficiency. PeerJ 2015; 3:e1506. [PMID: 26713253 PMCID: PMC4690386 DOI: 10.7717/peerj.1506] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 07/10/2015] [Accepted: 11/24/2015] [Indexed: 11/20/2022] Open
Abstract
Background. The objective of this study is to propose four conditions for the roles of honest brokers through a review of literature published by ten institutions that are successfully utilizing honest brokers. Furthermore, the study aims to examine whether the Asan Medical Center's (AMC) honest brokers satisfy the four conditions, and to examine the need to enhance their roles. Methods. We analyzed the roles, tasks, and types of honest brokers at 10 organizations by reviewing the literature. We also established a Task Force (TF) in our institution for setting the roles and processes of the honest broker system and the honest brokers. The findings of the literature search were compared with the existing systems at AMC, which introduced the honest broker system for the first time in Korea. Results. Only one organization employed an honest broker for validating anonymized clinical data and monitoring the anonymity verifications of the honest broker system. Six organizations complied with HIPAA privacy regulations, while four organizations did not disclose compliance. By comparing functions with those of the AMC, the following four main characteristics of honest brokers were determined: (1) de-identification of clinical data; (2) independence; (3) checking that the data are used only for purposes approved by the IRB; and (4) provision of de-identified data to researchers. These roles were then compared with those of honest brokers at the AMC. Discussion. First, guidelines that regulate the definitions, purposes, roles, and requirements for honest brokers are needed, since no such regulations currently exist. Second, Korean clinical research institutions and national regulatory departments need to reach a consensus on a Korean version of Limited Data Sets (LDS), since there are no lists that describe the use of personal identification information. Lastly, satisfaction surveys on honest brokers by researchers are necessary to improve the quality of honest brokers.
Affiliation(s)
- Hyo Joung Choi
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
| | - Min Joung Lee
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
| | - Chang-Min Choi
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea.,Department of Pulmonology and Critical Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.,Department of Oncology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - JaeHo Lee
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea.,Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea.,Department of Emergency Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - Soo-Yong Shin
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea.,Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
| | - Yungman Lyu
- Office of Clinical Research Information, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
| | - Yu Rang Park
- Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea.,Clinical Research Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
| | - Soyoung Yoo
- Human Research Protection Center, Asan Institute of Life Sciences, Asan Medical Center, Seoul, Korea
| |
|
16
|
Jacobson RS, Becich MJ, Bollag RJ, Chavan G, Corrigan J, Dhir R, Feldman MD, Gaudioso C, Legowski E, Maihle NJ, Mitchell K, Murphy M, Sakthivel M, Tseytlin E, Weaver J. A Federated Network for Translational Cancer Research Using Clinical Data and Biospecimens. Cancer Res 2015; 75:5194-201. [PMID: 26670560 PMCID: PMC4683415 DOI: 10.1158/0008-5472.can-15-1973] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/16/2022]
Abstract
Advances in cancer research and personalized medicine will require significant new bridging infrastructures, including more robust biorepositories that link human tissue to clinical phenotypes and outcomes. In order to meet that challenge, four cancer centers formed the Text Information Extraction System (TIES) Cancer Research Network, a federated network that facilitates data and biospecimen sharing among member institutions. Member sites can access pathology data that are de-identified and processed with the TIES natural language processing system, which creates a repository of rich phenotype data linked to clinical biospecimens. TIES incorporates multiple security and privacy best practices that, combined with legal agreements, network policies, and procedures, enable regulatory compliance. The TIES Cancer Research Network now provides integrated access to investigators at all member institutions, where multiple investigator-driven pilot projects are underway. Examples of federated search across the network illustrate the potential impact on translational research, particularly for studies involving rare cancers, rare phenotypes, and specific biologic behaviors. The network satisfies several key desiderata including local control of data and credentialing, inclusion of rich phenotype information, and applicability to diverse research objectives. The TIES Cancer Research Network presents a model for a national data and biospecimen network.
Affiliation(s)
| | - Michael J Becich
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | - Roni J Bollag
- Georgia Regents University Cancer Center, Augusta, Georgia
| | - Girish Chavan
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | - Julia Corrigan
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | - Rajiv Dhir
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | - Michael D Feldman
- Abramson Cancer Center, University of Pennsylvania, Philadelphia, Pennsylvania
| | | | | | - Nita J Maihle
- Georgia Regents University Cancer Center, Augusta, Georgia
| | - Kevin Mitchell
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | | | | | - Eugene Tseytlin
- University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
| | - JoEllen Weaver
- Abramson Cancer Center, University of Pennsylvania, Philadelphia, Pennsylvania
| |
|
17
|
Mosa ASM, Yoo I, Apathy NC, Ko KJ, Parker JC. Secondary Use of Clinical Data to Enable Data-Driven Translational Science with Trustworthy Access Management. MISSOURI MEDICINE 2015; 112:443-448. [PMID: 26821445 PMCID: PMC6168106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
University of Missouri (MU) Health Care produces a large amount of digitized clinical data that can be used in clinical and translational research for cohort identification, retrospective data analysis, feasibility studies, and hypothesis generation. In this article, the implementation of an integrated clinical research data repository is discussed. We developed a trustworthy access-management protocol for providing access to both clinically relevant data and protected health information. As of September 2014, the database contained approximately 400,000 patients and 82 million observations and was growing daily. The system will facilitate the secondary use of electronic health record (EHR) data at MU to promote data-driven clinical and translational research, in turn enabling better healthcare through research.
|
18
|
Noor AM, Holmberg L, Gillett C, Grigoriadis A. Big Data: the challenge for small research groups in the era of cancer genomics. Br J Cancer 2015; 113:1405-12. [PMID: 26492224 PMCID: PMC4815885 DOI: 10.1038/bjc.2015.341] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 04/02/2015] [Revised: 08/04/2015] [Accepted: 08/09/2015] [Indexed: 01/06/2023] Open
Abstract
In the past decade, cancer research has seen an increasing trend towards high-throughput techniques and translational approaches. The increasing availability of assays that utilise smaller quantities of source material and produce higher volumes of data output has created a need for data storage solutions beyond those previously used. Multifactorial data, both large in sample size and heterogeneous in context, need to be integrated in a standardised, cost-effective and secure manner. This requires technical solutions and administrative support not normally financially accounted for in small- to moderate-sized research groups. In this review, we highlight the Big Data challenges faced by translational research groups in the precision medicine era; an era in which the genomes of over 75 000 patients will be sequenced by the National Health Service over the next 3 years to advance healthcare. In particular, we have looked at three main themes of data management in relation to cancer research, namely (1) cancer ontology management, (2) IT infrastructures that have been developed to support data management and (3) the unique ethical challenges introduced by utilising Big Data in research.
Affiliation(s)
- Aisyah Mohd Noor
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
| | - Lars Holmberg
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK.,Department of Surgical Sciences, Uppsala University, Uppsala 751 85, Sweden
| | - Cheryl Gillett
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK.,Faculty of Life Sciences and Medicine, King's Health Partners Cancer Biobank, King's College London, Research Oncology, Guy's Hospital, London SE1 9RT, UK
| | - Anita Grigoriadis
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK.,Breast Cancer Now Research Unit, Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
| |
|
19
|
Abstract
This article raises the concern that biobanks are failing to realize the expected research and health service outcomes. Rather than biobanking, we have been engaging in ‘biohoarding’, where building a quantifiable collection of tissue samples is the primary basis of the bio-resource. The root cause of ‘biohoarding’ is an ideological and motivational confusion as to the purpose for collecting the tissue in the first place. We have lost sight of the knowledge gain that biobanks should generate. The obligation to prevent ‘biohoarding’ lies not with researchers, funders or managers but with policy makers.
Affiliation(s)
- Daniel Catchpoole
- Associate Professor, The Tumour Bank, The Children’s Cancer Research Unit, The Kids Research Institute, Westmead, NSW, Australia
| |
|
20
|
Starren JB, Winter AQ, Lloyd-Jones DM. Enabling a Learning Health System through a Unified Enterprise Data Warehouse: The Experience of the Northwestern University Clinical and Translational Sciences (NUCATS) Institute. Clin Transl Sci 2015; 8:269-71. [PMID: 26032246 DOI: 10.1111/cts.12294] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 12/01/2022] Open
Affiliation(s)
- Justin B Starren
- Northwestern University Clinical and Translational Sciences Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.,Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Andrew Q Winter
- Northwestern University Clinical and Translational Sciences Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| | - Donald M Lloyd-Jones
- Northwestern University Clinical and Translational Sciences Institute, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.,Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
| |
|
21
|
McCusker ME, Cress RD, Allen M, Fernandez-Ami A, Gandour-Edwards R. Feasibility of linking population-based cancer registries and cancer center biorepositories. Biopreserv Biobank 2015; 10:416-20. [PMID: 24845042 DOI: 10.1089/bio.2012.0014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/27/2023] Open
Abstract
PURPOSE Biospecimen-based research offers tremendous promise as a way to increase understanding of the molecular epidemiology of cancers. Population-based cancer registries can augment this research by providing more clinical detail and long-term follow-up information than is typically available from biospecimen annotations. In order to demonstrate the feasibility of this concept, we performed a pilot linkage between the California Cancer Registry (CCR) and the University of California, Davis Cancer Center Biorepository (UCD CCB) databases to determine if we could identify patients with records in both databases. METHODS We performed a probabilistic data linkage between 2180 UCD CCB biospecimen records collected during the years 2005-2009 and all CCR records for cancers diagnosed from 1988-2009 based on standard data linkage procedures. RESULTS The 1040 UCD records with a unique medical record number, tissue site, and pathology date were linked to 3.3 million CCR records. Of these, 844 (81.2%) were identified in both databases. Overall, record matches were highest (100%) for cancers of the cervix and testis/other male genital system organs. For the most common cancers, matches were highest for cancers of the lung and respiratory system (93%), breast (91.7%), and colon and rectum (89.5%), and lower for prostate (72.9%). CONCLUSIONS This pilot linkage demonstrated that information on existing biospecimens from a cancer center biorepository can be linked successfully to cancer registry data. Linkages between existing biorepositories and cancer registries can foster productive collaborations and provide a foundation for virtual biorepository networks to support population-based biospecimen research.
Affiliation(s)
- Margaret E McCusker
- Cancer Surveillance and Research Branch, California Department of Public Health, Sacramento, California, at the time of writing this article
| | | | | | | | | |
|
22
|
Mosa ASM, Yoo I, Parker JC. Online electronic data capture and research data repository system for clinical and translational research. MISSOURI MEDICINE 2015; 112:46-52. [PMID: 25812275 PMCID: PMC6170092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Data is at the core of any clinical and translational research (CTR). In many studies, the electronic data capture (EDC) method has been found to be more efficient than standard paper-based data collection methods in many aspects, including accuracy, integrity, timeliness, and cost-effectiveness. The objective of this article is to present a secure, web-based EDC system for CTR that has been implemented by the Institute for Clinical and Translational Science (iCATS) at the University of Missouri School of Medicine.
|
23
|
A Theoretical Multi-level Privacy Protection Framework for Biomedical Data Warehouses. Procedia Computer Science 2015. [DOI: 10.1016/j.procs.2015.08.386] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]
|
24
|
Grinspan ZM, Abramson EL, Banerjee S, Kern LM, Kaushal R, Shapiro JS. People with epilepsy who use multiple hospitals; prevalence and associated factors assessed via a health information exchange. Epilepsia 2014; 55:734-745. [PMID: 24598038 PMCID: PMC4037914 DOI: 10.1111/epi.12552] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Accepted: 01/03/2014] [Indexed: 02/03/2023]
Abstract
OBJECTIVE Hospital crossover occurs when people seek care at multiple hospitals, creating information gaps for physicians at the time of care. Health information exchange (HIE) is technology that fills these gaps, by allowing otherwise unaffiliated physicians to share electronic medical information. However, the potential value of HIE is understudied, particularly for chronic neurologic conditions like epilepsy. We describe the prevalence and associated factors of hospital crossover among people with epilepsy, in order to understand the epidemiology of who may benefit from HIE. METHODS We used a cross-sectional study design to examine the bivariate and multivariable association of demographics, comorbidity, and health service utilization variables with hospital crossover, among people with epilepsy. We identified 8,074 people with epilepsy from the International Classification of Diseases, Ninth Revision (ICD-9) codes, obtained from an HIE that linked seven hospitals in Manhattan, New York. We defined hospital crossover as care from more than one hospital in any setting (inpatient, outpatient, emergency, or radiology) over 2 years. RESULTS Of 8,074 people with epilepsy, 1,770 (22%) engaged in hospital crossover over 2 years. Crossover was associated with younger age (children compared with adults, adjusted odds ratio [OR] 1.4, 95% confidence interval [CI] 1.2-1.7), living near the hospitals (Manhattan vs. other boroughs of New York City, adjusted OR 1.6, 95% CI 1.4-1.8), more visits in the emergency, radiology, inpatient, and outpatient settings (p < 0.001 for each), and more head computerized tomography (CT) scans (p < 0.01). The diagnosis of "encephalopathy" was consistently associated with crossover in bivariate and multivariable analyses (adjusted OR 2.66, 95% CI 2.14-3.29), whereas the relationship between other comorbidities and crossover was less clear. 
SIGNIFICANCE Hospital crossover is common among people with epilepsy, particularly among children, frequent users of medical services, and people living near the study hospitals. HIE should focus on these populations. Further research should investigate why hospital crossover occurs, how it affects care, and how HIE can most effectively mitigate the resultant fragmentation of medical records.
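The crossover definition used in this study (care from more than one hospital, in any setting, over 2 years) can be expressed as a short check over visit records. This is a hedged sketch only: the `(hospital_id, visit_date)` tuple layout, the function name, and the 730-day window length are assumptions for illustration, not details from the paper.

```python
from datetime import date, timedelta


def has_hospital_crossover(visits, window_start, window_days=730):
    """Return True if a patient was seen at more than one hospital in the window.

    visits: iterable of (hospital_id, visit_date) tuples for one patient,
            covering any setting (inpatient, outpatient, emergency, radiology).
    window_start: first day of the observation window.
    window_days: window length in days (730 approximates 2 years; an assumption).
    """
    window_end = window_start + timedelta(days=window_days)
    hospitals = {
        hospital_id
        for hospital_id, visit_date in visits
        if window_start <= visit_date < window_end
    }
    return len(hospitals) > 1
```

Visits outside the window are ignored, so a patient who changes hospital only after the 2-year period is not counted as crossing over.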
Affiliation(s)
- Zachary M Grinspan
- Department of Pediatrics, Weill Cornell Medical College, New York, NY
- Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY
- New York Presbyterian Hospital, New York, NY
| | - Erika L Abramson
- Department of Pediatrics, Weill Cornell Medical College, New York, NY
- Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY
- New York Presbyterian Hospital, New York, NY
- Department of Public Health, Weill Cornell Medical College, New York, NY
- Health Information Technology Evaluation Collaborative, New York, NY
| | - Samprit Banerjee
- Department of Public Health, Weill Cornell Medical College, New York, NY
- Department of Statistical Science, Cornell University, Ithaca, NY
| | - Lisa M Kern
- Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY
- New York Presbyterian Hospital, New York, NY
- Department of Public Health, Weill Cornell Medical College, New York, NY
- Health Information Technology Evaluation Collaborative, New York, NY
- Department of Medicine, Weill Cornell Medical College, New York, NY
| | - Rainu Kaushal
- Department of Pediatrics, Weill Cornell Medical College, New York, NY
- Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY
- New York Presbyterian Hospital, New York, NY
- Department of Public Health, Weill Cornell Medical College, New York, NY
- Health Information Technology Evaluation Collaborative, New York, NY
- Department of Medicine, Weill Cornell Medical College, New York, NY
| | - Jason S Shapiro
- Department of Emergency Medicine, Mount Sinai School of Medicine, New York, NY
| |
|
25
|
Shin SY, Kim WS, Lee JH. Characteristics desired in clinical data warehouse for biomedical research. Healthc Inform Res 2014; 20:109-16. [PMID: 24872909 PMCID: PMC4030054 DOI: 10.4258/hir.2014.20.2.109] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 01/13/2014] [Revised: 03/22/2014] [Accepted: 04/06/2014] [Indexed: 11/25/2022] Open
Abstract
Objectives Due to the unique characteristics of clinical data, clinical data warehouses (CDWs) have not been successful so far; in particular, their use for biomedical research has been relatively unsuccessful. The characteristics necessary for the successful implementation and operation of a CDW for biomedical research have not yet been clearly defined. Methods Three examples of CDWs were reviewed: a multipurpose CDW in a hospital, a CDW for independent multi-institutional research, and a CDW for research use in an institution. After reviewing the three CDW examples, we propose some key characteristics needed in a CDW for biomedical research. Results A CDW for research should include an honest broker system and an Institutional Review Board approval interface to comply with governmental regulations. It should also include a simple query interface, an anonymized data review tool, and a data extraction tool. Also, it should be a biomedical research platform for data repository use as well as data analysis. Conclusions The proposed characteristics desired in a CDW may have limited transfer value to organizations in other countries. However, these analysis results are still valid in Korea, and we have developed a clinical research data warehouse based on these desiderata.
Affiliation(s)
- Soo-Yong Shin
- Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
| | - Woo Sung Kim
- Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea. ; Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - Jae-Ho Lee
- Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea. ; Department of Emergency Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea. ; Division of General Internal Medicine, Brigham and Women's Hospital, Boston, MA, USA
|
26
|
Wade TD, Zelarney PT, Hum RC, McGee S, Batson DH. Using patient lists to add value to integrated data repositories. J Biomed Inform 2014; 52:72-7. [PMID: 24534444 DOI: 10.1016/j.jbi.2014.02.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/17/2013] [Revised: 12/20/2013] [Accepted: 02/04/2014] [Indexed: 01/16/2023]
Abstract
Patient lists are project-specific sets of patients that can be queried in integrated data repositories (IDRs). By allowing a set of patients to be an addition to the qualifying conditions of a query, returned results will refer to, and only to, that set of patients. We report a variety of use cases for such lists, including: restricting retrospective chart review to a defined set of patients; following a set of patients for practice management purposes; distributing "honest-brokered" (deidentified) data; adding phenotypes to biosamples; and enhancing the content of study or registry data. Among the capabilities needed to implement patient lists in an IDR are: capture of patient identifiers from a query and feedback of these into the IDR; the existence of a permanent internal identifier in the IDR that is mappable to external identifiers; the ability to add queryable attributes to the IDR; the ability to merge data from multiple queries; and suitable control over user access and de-identification of results. We implemented patient lists in a custom IDR of our own design. We reviewed capabilities of other published IDRs for focusing on sets of patients. The widely used i2b2 IDR platform has various ways to address patient sets, and it could be modified to add the low-overhead version of patient lists that we describe.
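The patient-list idea in this abstract (capture identifiers from one query, feed them back into the repository, then use the stored set as an extra qualifying condition) can be illustrated with a minimal sketch. It is not the authors' implementation: the SQLite tables `observation` and `patient_list`, the helper names, and the demo data are all hypothetical.

```python
import sqlite3


def build_demo_repository():
    """Create a tiny in-memory repository (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE observation (patient_id INTEGER, code TEXT, value REAL);
        CREATE TABLE patient_list (list_name TEXT, patient_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [(1, "FEV1", 2.1), (2, "FEV1", 3.4), (3, "FEV1", 1.8), (3, "BMI", 27.0)],
    )
    return conn


def save_patient_list(conn, name, query, params=()):
    """Capture patient ids returned by a query and store them as a named list."""
    ids = [row[0] for row in conn.execute(query, params)]
    conn.executemany(
        "INSERT INTO patient_list VALUES (?, ?)", [(name, pid) for pid in ids]
    )
    return ids


def query_with_list(conn, name, code):
    """Run a query whose results refer only to patients on the named list."""
    return conn.execute(
        "SELECT o.patient_id, o.value FROM observation o "
        "JOIN patient_list p ON p.patient_id = o.patient_id "
        "WHERE p.list_name = ? AND o.code = ? ORDER BY o.patient_id",
        (name, code),
    ).fetchall()
```

A later query can then join against the stored list, for example `query_with_list(conn, "low_fev1", "BMI")`, without repeating the original selection criteria.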
Affiliation(s)
- Ted D Wade
- Division of Biostatistics and Bioinformatics, National Jewish Health, Denver, CO 80206, USA.
| | - Pearlanne T Zelarney
- Division of Biostatistics and Bioinformatics, National Jewish Health, Denver, CO 80206, USA
| | - Richard C Hum
- Division of Biostatistics and Bioinformatics, National Jewish Health, Denver, CO 80206, USA
| | - Sylvia McGee
- Division of Biostatistics and Bioinformatics, National Jewish Health, Denver, CO 80206, USA
| | - Deborah H Batson
- Department of Research Informatics, Children's Hospital Colorado Research Institute, Aurora, CO 80045, USA
|
27
|
Gallagher SA, Smith AB, Matthews JE, Potter CW, Woods ME, Raynor M, Wallen EM, Rathmell WK, Whang YE, Kim WY, Godley PA, Chen RC, Wang A, You C, Barocas DA, Pruthi RS, Nielsen ME, Milowsky MI. Roadmap for the development of the University of North Carolina at Chapel Hill Genitourinary OncoLogy Database--UNC GOLD. Urol Oncol 2014; 32:32.e1-9. [PMID: 23434424 PMCID: PMC4058502 DOI: 10.1016/j.urolonc.2012.11.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/22/2012] [Revised: 11/15/2012] [Accepted: 11/27/2012] [Indexed: 11/25/2022]
Abstract
BACKGROUND The management of genitourinary malignancies requires a multidisciplinary care team composed of urologists, medical oncologists, and radiation oncologists. A genitourinary (GU) oncology clinical database is an invaluable resource for patient care and research. Although electronic medical records provide a single web-based record used for clinical care, billing, and scheduling, information is typically stored in a discipline-specific manner and data extraction is often not applicable to a research setting. A GU oncology database may be used for the development of multidisciplinary treatment plans, analysis of disease-specific practice patterns, and identification of patients for research studies. Despite the potential utility, there are many important considerations that must be addressed when developing and implementing a discipline-specific database. METHODS AND MATERIALS The creation of the GU oncology database including prostate, bladder, and kidney cancers with the identification of necessary variables was facilitated by meetings of stakeholders in medical oncology, urology, and radiation oncology at the University of North Carolina (UNC) at Chapel Hill with a template data dictionary provided by the Department of Urologic Surgery at Vanderbilt University Medical Center. Utilizing Research Electronic Data Capture (REDCap, version 4.14.5), the UNC Genitourinary OncoLogy Database (UNC GOLD) was designed and implemented. RESULTS The process of designing and implementing a discipline-specific clinical database requires many important considerations. The primary consideration is determining the relationship between the database and the Institutional Review Board (IRB) given the potential applications for both clinical and research uses. 
Several other necessary steps include ensuring information technology security and federal regulation compliance; determination of a core complete dataset; creation of standard operating procedures; standardizing entry of free text fields; use of data exports, queries, and de-identification strategies; inclusion of individual investigators' data; and strategies for prioritizing specific projects and data entry. CONCLUSIONS A discipline-specific database requires a buy-in from all stakeholders, meticulous development, and data entry resources to generate a unique platform for housing information that may be used for clinical care and research with IRB approval. The steps and issues identified in the development of UNC GOLD provide a process map for others interested in developing a GU oncology database.
Affiliation(s)
- Sarah A Gallagher
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC
| | - Angela B Smith
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - Jonathan E Matthews
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - Clarence W Potter
- North Carolina Clinical and Translational Sciences Institute (NC TraCS), Chapel Hill, NC
| | - Michael E Woods
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - Mathew Raynor
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - Eric M Wallen
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - W Kimryn Rathmell
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC
| | - Young E Whang
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC
| | - William Y Kim
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC
| | - Paul A Godley
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC
| | - Ronald C Chen
- Department of Radiation Oncology, University of North Carolina, Chapel Hill, NC
| | - Andrew Wang
- Department of Radiation Oncology, University of North Carolina, Chapel Hill, NC
| | - Chaochen You
- Department of Urologic Surgery, Vanderbilt University Medical Center, Nashville, TN
| | - Daniel A Barocas
- Department of Urologic Surgery, Vanderbilt University Medical Center, Nashville, TN
| | - Raj S Pruthi
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC
| | - Matthew E Nielsen
- Department of Surgery, Division of Urologic Surgery, University of North Carolina, Chapel Hill, NC.
| | - Matthew I Milowsky
- Department of Medicine, Division of Hematology and Medical Oncology, University of North Carolina, Chapel Hill, NC.
|
28
|
Liu A. Developing an institutional cancer biorepository for personalized medicine. Clin Biochem 2013; 47:293-9. [PMID: 24373923 DOI: 10.1016/j.clinbiochem.2013.12.015] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 09/23/2013] [Revised: 12/06/2013] [Accepted: 12/13/2013] [Indexed: 11/27/2022]
Abstract
High quality human biospecimens, such as tissue, blood, cell derivatives, and associated patient clinical information, are key elements of a scientific infrastructure that supports discovery and identification of molecular biomarkers and diagnostic agents. The goal of most biorepositories is to collect, process, store, and distribute human biospecimen for use in basic, translational and clinical research. A biorepository serving as the central hub provides investigators with an invaluable resource with appropriately examined and characterized biospecimens with associated patient clinical information. Expertise in standardization, quality control, and information technology, and awareness of cutting edge research developments are generally required for biorepository development and management. The availability of low cost whole genome profiles of individual tumors has opened up new possibilities for personalized medicine to deliver the most appropriate treatments to individual patients with minimal toxicity. A biorepository in support of personalized medicine thus requires the highest standards of operation and adequate funding, training and certification. This review provides an overview of the development of an institutional cancer biorepository for clinical research and personalized medicine advancement.
Affiliation(s)
- Angen Liu
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
|
29
|
Prognostic value and targeted inhibition of survivin expression in esophageal adenocarcinoma and cancer-adjacent squamous epithelium. PLoS One 2013; 8:e78343. [PMID: 24223792 PMCID: PMC3817247 DOI: 10.1371/journal.pone.0078343] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 07/29/2013] [Accepted: 09/13/2013] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Survivin is an inhibitor of apoptosis and its overexpression is associated with poor prognosis in several malignancies. While several studies have analyzed survivin expression in esophageal squamous cell carcinoma, few have focused on esophageal adenocarcinoma (EAC) and/or cancer-adjacent squamous epithelium (CASE). The purpose of this study was 1) to determine the degree of survivin upregulation in samples of EAC and CASE, 2) to evaluate whether survivin expression in EAC and CASE correlates with recurrence and/or death, and 3) to examine the effect of survivin inhibition on apoptosis in EAC cells. METHODS Fresh frozen samples of EAC and CASE from the same patient were used for qRT-PCR and Western blot analysis, and formalin-fixed, paraffin-embedded tissue was used for immunohistochemistry. EAC cell lines, OE19 and OE33, were transfected with small interfering RNAs (siRNAs) to knock down survivin expression. Knockdown was confirmed by qRT-PCR for survivin expression and Western blot analysis of cleaved PARP, cleaved caspase 3 and survivin. Survivin expression data were correlated with clinical outcome. RESULTS Survivin expression was significantly higher in EAC tumor samples compared to the CASE from the same patient. Patients with high expression of survivin in EAC tumor had an increased risk of death. Survivin expression was also noted in CASE and correlated with increased risk of distant recurrence. Cell line evaluation demonstrated that inhibition of survivin resulted in an increase in apoptosis. CONCLUSION Higher expression of survivin in tumor tissue was associated with increased risk of death, while survivin expression in CASE was a superior predictor of recurrence. Inhibition of survivin in EAC cell lines further showed increased apoptosis, supporting the potential benefits of therapeutic strategies targeted to this marker.
|
30
|
Amin W, Parwani AV, Melamed J, Flores R, Pennathur A, Valdivieso F, Whelan NB, Landreneau R, Luketich J, Feldman M, Pass HI, Becich MJ. National Mesothelioma Virtual Bank: A Platform for Collaborative Research and Mesothelioma Biobanking Resource to Support Translational Research. LUNG CANCER INTERNATIONAL 2013; 2013:765748. [PMID: 26316942 PMCID: PMC4437393 DOI: 10.1155/2013/765748] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/09/2013] [Revised: 08/12/2013] [Accepted: 08/12/2013] [Indexed: 11/23/2022]
Abstract
The National Mesothelioma Virtual Bank (NMVB), developed six years ago, gathers clinically annotated human mesothelioma specimens for basic and clinical science research. During this period, this resource has greatly increased its collection of specimens by expanding the number of contributing academic health centers including New York University, University of Pennsylvania, University of Pittsburgh Medical Center, and Mount Sinai School of Medicine. Marketing efforts at both national and international annual conferences increase awareness and availability of the mesothelioma specimens at no cost to approved investigators, who query the web-based NMVB database for cumulative and appropriate patient clinicopathological information on the specimens. The data disclosure and specimen distribution protocols are tightly regulated to maintain compliance with participating institutions' IRB and regulatory committee reviews. The NMVB currently has over 1120 annotated cases available for researchers, including paraffin embedded tissues, fresh frozen tissue, tissue microarrays (TMA), blood samples, and genomic DNA. In addition, the resource offers expertise and assistance for collaborative research. Furthermore, in the last six years, the resource has provided hundreds of specimens to the research community. Investigators can request specimens and/or data by submitting a Letter of Intent (LOI) that is evaluated by the NMVB research evaluation panel (REP).
Affiliation(s)
- Waqas Amin
- Department of Biomedical Informatics, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - Anil V. Parwani
- Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - Jonathan Melamed
- Department of Pathology, New York University Medical Center, New York, NY, USA
| | - Raja Flores
- Department of Cardiothoracic Surgery, Mount Sinai School of Medicine, New York, NY, USA
| | - Arjun Pennathur
- Department of Cardiothoracic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | | | - Nancy B. Whelan
- Department of Biomedical Informatics, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - Rodney Landreneau
- Department of Cardiothoracic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - James Luketich
- Department of Cardiothoracic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - Michael Feldman
- Department of Pathology, University of Pennsylvania, Philadelphia, PA, USA
| | - Harvey I. Pass
- Department of Cardiothoracic Surgery, New York University Medical Center, New York, NY, USA
| | - Michael J. Becich
- Department of Biomedical Informatics, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
|
31
|
Haak LL, Baker D, Ginther DK, Gordon GJ, Probus MA, Kannankutty N, Weinberg BA. Information science. Standards and infrastructure for innovation data exchange. Science 2012; 338:196-7. [PMID: 23066063 DOI: 10.1126/science.1221840] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/02/2022]
|
32
|
Vaught J, Lockhart NC. The evolution of biobanking best practices. Clin Chim Acta 2012; 413:1569-75. [PMID: 22579478 PMCID: PMC3409343 DOI: 10.1016/j.cca.2012.04.030] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.3] [Received: 03/17/2012] [Revised: 04/28/2012] [Accepted: 04/29/2012] [Indexed: 12/18/2022]
Abstract
Biobanks and biospecimens are critical components for many areas of clinical and basic research. The quality of biospecimens and associated data must be consistent and collected according to standardized methods in order to prevent spurious analytical results that can lead to artifacts being interpreted as valid findings. A number of international institutions have taken the initiative to develop and publish best practices, which include technical recommendations for handling biospecimens as well as recommendations for ethical and regulatory practices in biobanking. These sources of guidance have been useful in raising the overall consistency and quality of research involving biospecimens. However, the lack of international harmonization, uneven adoption, and insufficient oversight of best practices are preventing further improvements in biospecimen quality and coordination among collaborators and biobanking networks. In contrast to the more straightforward technical and management issues, ethical and regulatory practices often involve issues that are more controversial and difficult to standardize.
Affiliation(s)
- Jim Vaught
- Office of Biorepositories and Biospecimen Research, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, 11400 Rockville Pike, Bethesda, MD 20892, USA.
|
33
|
Erdal BS, Liu J, Ding J, Chen J, Marsh CB, Kamal J, Clymer BD. A database de-identification framework to enable direct queries on medical data for secondary use. Methods Inf Med 2012; 51:229-41. [PMID: 22311158 DOI: 10.3414/me11-01-0048] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 05/31/2011] [Accepted: 11/08/2011] [Indexed: 11/09/2022]
Abstract
OBJECTIVE To qualify the use of patient clinical records as non-human-subject data for research purposes, electronic medical record data must be de-identified so that there is minimal risk of protected health information exposure. This study demonstrated a robust framework for structured data de-identification that can be applied to any relational data source that needs to be de-identified. METHODS Using a real-world clinical data warehouse, a pilot implementation covering limited subject areas was used to demonstrate and evaluate this new de-identification process. Query results and performance were compared between the source and target systems to validate data accuracy and usability. RESULTS The combination of hashing, pseudonyms, and a session-dependent randomizer provides a rigorous de-identification framework to guard against 1) source identifier exposure; 2) an internal data analyst manually linking to source identifiers; and 3) identifier cross-linking among different researchers or multiple query sessions by the same researcher. In addition, a query rejection option is provided to refuse queries resulting in fewer than preset numbers of subjects and total records, to prevent users from accidental subject identification due to a low volume of data. This framework does not prevent subject re-identification based on prior knowledge and sequence of events. Also, it does not deal with medical free-text de-identification, although text de-identification using natural language processing can be included due to its modular design. CONCLUSION We demonstrated a framework resulting in HIPAA-compliant databases that can be directly queried by researchers. This technique can be augmented to facilitate inter-institutional research data sharing through existing middleware such as caGrid.
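The layering this abstract describes (hashing, pseudonyms, a session-dependent randomizer, and a query rejection option) can be sketched roughly as follows. This is an illustration under stated assumptions, not the published framework: the use of HMAC-SHA256, the function names, and the default thresholds are all hypothetical, since the abstract gives no such details.

```python
import hashlib
import hmac


def _keyed_hash(key: bytes, text: str) -> str:
    """Keyed hash (HMAC-SHA256, truncated); an assumed choice of primitive."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def pseudonym(source_id: str, warehouse_key: bytes) -> str:
    """Stable pseudonym that hides the source identifier."""
    return _keyed_hash(warehouse_key, source_id)


def session_identifier(pseudo: str, session_key: bytes) -> str:
    """Re-key the pseudonym per session so results cannot be cross-linked."""
    return _keyed_hash(session_key, pseudo)


def run_query(rows, warehouse_key, session_key, min_subjects=5, min_records=10):
    """De-identify (source_id, value) rows, or reject a low-volume result.

    The thresholds are hypothetical placeholders for the 'preset numbers'
    mentioned in the abstract.
    """
    subjects = {source_id for source_id, _ in rows}
    if len(subjects) < min_subjects or len(rows) < min_records:
        raise ValueError("query rejected: result set too small")
    return [
        (session_identifier(pseudonym(source_id, warehouse_key), session_key), value)
        for source_id, value in rows
    ]
```

Within one session the same patient always maps to the same identifier, so rows can still be grouped per subject; a different session key yields unrelated identifiers for the same patients.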
Affiliation(s)
- B S Erdal
- Information Warehouse, The Ohio State University Medical Center, Columbus, Ohio, USA
|
34
|
Sak J, Pawlikowski J, Goniewicz M, Witt M. Population biobanking in selected European countries and proposed model for a Polish national DNA bank. J Appl Genet 2012; 53:159-65. [PMID: 22281780 PMCID: PMC3334487 DOI: 10.1007/s13353-012-0082-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 11/28/2011] [Revised: 12/28/2011] [Accepted: 01/02/2012] [Indexed: 01/01/2023]
Abstract
Population biobanks offer new opportunities for public health, are fundamental to the development of its new branch called Public Health Genomics, and are important for translational research. This article presents organizational models of population biobanks in selected European countries, based on a review of the literature and websites of European population biobanks (UK, Spain, Estonia). Some countries establish national genomic biobanks (DNA banks) in order to conduct research on new methods of prevention, diagnosis and treatment of genetic and lifestyle diseases and on pharmacogenetics. Individual countries have developed different organizational models of these institutions and specific legal regulations regarding various ways of obtaining genetic data from the inhabitants, donors’ rights, and organizational and legal aspects. Population biobanks in European countries were funded in different ways. In light of these solutions, the authors discuss prospects of establishing a Polish national genomic biobank for research purposes. They propose the creation of such an institution based on the existing network of blood-donation centres and clinical biobanks in Poland.
Affiliation(s)
- Jarosław Sak
- Department of Ethics and Human Philosophy, Medical University of Lublin, ul. Szkolna 18, 20-124 Lublin, Poland.
|
35
|
Harkema H, Chapman WW, Saul M, Dellon ES, Schoen RE, Mehrotra A. Developing a natural language processing application for measuring the quality of colonoscopy procedures. J Am Med Inform Assoc 2011; 18 Suppl 1:i150-6. [PMID: 21946240 PMCID: PMC3241178 DOI: 10.1136/amiajnl-2011-000431] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 06/14/2011] [Accepted: 08/18/2011] [Indexed: 12/24/2022] Open
Abstract
OBJECTIVE The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the costs of manual review of free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality. MATERIALS AND METHODS Using a set of quality measures published by physician specialty societies, we implemented an NLP engine that extracts 21 variables for 19 quality measures from free-text colonoscopy and pathology reports. We evaluated the performance of the NLP engine on a test set of 453 colonoscopy reports and 226 pathology reports, considering accuracy in extracting the values of the target variables from text, and the reliability of the outcomes of the quality measures as computed from the NLP-extracted information. RESULTS The average accuracy of the NLP engine over all variables was 0.89 (range: 0.62-1.0) and the average F measure over all variables was 0.74 (range: 0.49-0.89). The average agreement score, measured as Cohen's κ, between the manually established and NLP-derived outcomes of the quality measures was 0.62 (range: 0.09-0.86). DISCUSSION For nine of the 19 colonoscopy quality measures, the agreement score was 0.70 or above, which we consider a sufficient score for the NLP-derived outcomes of these measures to be practically useful for quality measurement. CONCLUSION The use of NLP for information extraction from free-text colonoscopy and pathology reports creates opportunities for large scale, routine quality measurement, which can support quality improvement in colonoscopy care.
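The three evaluation statistics named in this abstract (accuracy, F measure, and Cohen's κ) follow standard definitions, which the sketch below computes for a pair of label sequences. The function names and the toy labels in the usage note are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items on which the two label sequences agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f_measure(gold, pred, positive=1):
    """Harmonic mean of precision and recall for the positive label."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

For the toy sequences `[1, 1, 1, 0, 0, 0, 1, 0]` (manual) and `[1, 1, 0, 0, 0, 1, 1, 0]` (automated), accuracy and F are both 0.75 while κ is 0.5, which shows how κ discounts the agreement expected by chance.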
Affiliation(s)
- Henk Harkema
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
|
36
|
Wade TD, Hum RC, Murphy JR. A Dimensional Bus model for integrating clinical and research data. J Am Med Inform Assoc 2011; 18 Suppl 1:i96-102. [PMID: 21856687 PMCID: PMC3241170 DOI: 10.1136/amiajnl-2011-000339] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/27/2011] [Accepted: 07/11/2011] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVES Many clinical research data integration platforms rely on the Entity-Attribute-Value model because of its flexibility, even though it presents problems in query formulation and execution time. The authors sought more balance in these traits. MATERIALS AND METHODS Borrowing concepts from Entity-Attribute-Value and from enterprise data warehousing, the authors designed an alternative called the Dimensional Bus model and used it to integrate electronic medical record, sponsored study, and biorepository data. Each type of observational collection has its own table, and the structure of these tables varies to suit the source data. The observational tables are linked to the Bus, which holds provenance information and links to various classificatory dimensions that amplify the meaning of the data or facilitate its query and exposure management. RESULTS The authors implemented a Bus-based clinical research data repository with a query system that flexibly manages data access and confidentiality, facilitates catalog search, and readily formulates and compiles complex queries. CONCLUSION The design provides a workable way to manage and query mixed schemas in a data warehouse.
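One way to picture the layout described in this abstract (a per-type observational table linked to a shared Bus that carries provenance and classificatory dimensions) is the rough SQLite sketch below. The table names, columns, and the `exposure` dimension are hypothetical; the abstract does not give the actual schema.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE bus (
    bus_id      INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL,
    source      TEXT NOT NULL,   -- provenance of the observation
    collection  TEXT NOT NULL,   -- observational table holding the detail row
    exposure    TEXT NOT NULL    -- classificatory dimension for access control
);
CREATE TABLE lab_result (bus_id INTEGER REFERENCES bus(bus_id), test TEXT, value REAL);
CREATE TABLE biosample  (bus_id INTEGER REFERENCES bus(bus_id), tissue TEXT, aliquots INTEGER);
"""

# Each observational table keeps its own structure, suited to its source data.
DETAIL_COLUMNS = {"lab_result": ("test", "value"), "biosample": ("tissue", "aliquots")}


def open_repository():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_observation(conn, collection, patient_id, source, exposure, detail):
    """Register the observation on the bus, then store its type-specific row."""
    first, second = DETAIL_COLUMNS[collection]  # unknown tables raise KeyError
    cur = conn.execute(
        "INSERT INTO bus (patient_id, source, collection, exposure) VALUES (?, ?, ?, ?)",
        (patient_id, source, collection, exposure),
    )
    bus_id = cur.lastrowid
    conn.execute(
        f"INSERT INTO {collection} (bus_id, {first}, {second}) VALUES (?, ?, ?)",
        (bus_id, *detail),
    )
    return bus_id


def catalog(conn, patient_id, allowed_exposure):
    """List a patient's holdings from the bus alone, filtered by exposure class."""
    marks = ",".join("?" * len(allowed_exposure))
    return conn.execute(
        f"SELECT collection, source FROM bus "
        f"WHERE patient_id = ? AND exposure IN ({marks}) ORDER BY bus_id",
        (patient_id, *allowed_exposure),
    ).fetchall()
```

In this sketch a catalog search touches only the bus table, and the differently shaped detail tables are joined in only when a query needs their values.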
Affiliation(s)
- Ted D Wade
- Division of Biostatistics and Bioinformatics, National Jewish Health, Denver, Colorado 80206-2761, USA.
|
37
|
Mayol-Heath DN, Woo P, Galbraith K. Biorepository regulatory considerations: a detailed topic follow-up to the blueprint for the development of a community-based hospital biorepository. Biopreserv Biobank 2011; 9:321-6. [PMID: 24836627 DOI: 10.1089/bio.2011.0019] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022] Open
Abstract
Human specimens and associated clinical data are crucial resources for uncovering new biomarkers as well as identifying factors in the pathogenesis of disease. As tissue banking becomes more widespread in response to increased researcher demand, the need for specific guidance documents and standards to be developed and/or updated becomes more urgent. This article discusses 4 aspects of current regulation and guidance. First is the application of the Common Rule and the Privacy Rule to the Institutional Review Board review process. Second is the honest broker concept, which supports the further protection of donor confidentiality and allows for future updates to data. Third is the regulatory approval process. Finally, inconsistencies between regulatory bodies are identified and possible ways to reconcile those inconsistencies are suggested.
|
38
|
Flores I, Muñoz-Antonia T, Matta J, García M, Fenstermacher D, Gutierrez S, Seijo E, Torres-Ruiz J, Pledger WJ, Coppola D. The Establishment of the First Cancer Tissue Biobank at a Hispanic-Serving Institution: A National Cancer Institute-Funded Initiative between Moffitt Cancer Center in Florida and the Ponce School of Medicine and Health Sciences in Puerto Rico. Biopreserv Biobank 2011; 9:363-71. [PMID: 24836632 DOI: 10.1089/bio.2011.0028] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023] Open
Abstract
Population-based studies are important to address emerging issues in health disparities among populations. The Partnership between the Moffitt Cancer Center (MCC) in Florida and the Ponce School of Medicine and Health Sciences (PSMHS) in Puerto Rico (the PSMHS-MCC Partnership) was developed to facilitate high-quality research, training, and community outreach focusing on the Puerto Rican population in the island and in the mainland, with funding from the National Cancer Institute. We report here the establishment of a Tissue Biobank at PSMHS, modeled after the MCC tissue biorepository, to support translational research projects on this minority population. This facility, the Puerto Rico Tissue Biobank, was jointly developed by a team of basic and clinical scientists from both institutions in close collaboration with the administrators and clinical faculty of the tissue accrual sites. The efforts required and challenges that needed to be overcome to establish the first functional, centralized cancer-related biobank in Puerto Rico, and to ensure that it continuously evolves to address new needs of this underserved Hispanic population, are described. As a result of the collaborative efforts between PSMHS and MCC, a tissue procurement algorithm was successfully established to acquire, process, store, and conduct pathological analyses of cancer-related biospecimens and their associated clinical-pathological data from Puerto Rican patients with cancer recruited at a tertiary hospital setting. All protocols in place are in accordance with standard operational procedures that ensure high quality of biological materials and patient confidentiality. The processes described here provide a model that can be applied to achieve the establishment of a functional biobank in similar settings.
Affiliation(s)
- Idhaliz Flores
- 1 Department of Microbiology, Ponce School of Medicine and Health Sciences , Ponce, Puerto Rico
|
39
|
Safdar NM, Siegel E, Erickson BJ, Nagy P. Enabling comparative effectiveness research with informatics: show me the data! Acad Radiol 2011; 18:1072-6. [PMID: 21680206 DOI: 10.1016/j.acra.2011.04.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/04/2010] [Revised: 02/22/2011] [Accepted: 04/09/2011] [Indexed: 10/18/2022]
Abstract
RATIONALE Both outcomes researchers and informaticians are concerned with information and data. As such, some of the central challenges to conducting successful comparative effectiveness research can be addressed with informatics solutions. METHODS Specific informatics solutions that address how data in comparative effectiveness research are enriched, stored, shared, and analyzed are reviewed. RESULTS Imaging data can be made more quantitative, uniform, and structured for researchers through the use of lexicons and structured reporting. Secure and scalable storage of research data is enabled through data warehouses and cloud services. There are a number of national efforts to help researchers share research data and analysis tools. CONCLUSION There is a diverse arsenal of informatics tools designed to meet the needs of comparative effectiveness researchers.
Collapse
|
40
|
Morente MM, Cereceda L, Luna-Crespo F, Artiga MJ. Managing a Biobank Network. Biopreserv Biobank 2011; 9:187-90. [DOI: 10.1089/bio.2011.0005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Affiliation(s)
- Manuel M. Morente
- Tumour Bank Unit, Spanish National Tumour Bank Network, Molecular Pathology Programme, Spanish National Cancer Centre (CNIO), Madrid, Spain
- Spanish National Biobank Network Coordination Office, Instituto de Salud Carlos III (Spanish National Institute of Health Carlos III), Madrid, Spain
- Laura Cereceda
- Tumour Bank Unit, Spanish National Tumour Bank Network, Molecular Pathology Programme, Spanish National Cancer Centre (CNIO), Madrid, Spain
- Francisco Luna-Crespo
- Spanish National Biobank Network Coordination Office, Instituto de Salud Carlos III (Spanish National Institute of Health Carlos III), Madrid, Spain
- Maria J. Artiga
- Tumour Bank Unit, Spanish National Tumour Bank Network, Molecular Pathology Programme, Spanish National Cancer Centre (CNIO), Madrid, Spain
|
41
|
Amin W, Singh H, Dzubinski LA, Schoen RE, Parwani AV. Design and utilization of the colorectal and pancreatic neoplasm virtual biorepository: An early detection research network initiative. J Pathol Inform 2010; 1:22. [PMID: 21031013 PMCID: PMC2956178 DOI: 10.4103/2153-3539.70831] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 08/23/2010] [Accepted: 08/24/2010] [Indexed: 11/22/2022] Open
Abstract
Background: The Early Detection Research Network (EDRN) colorectal and pancreatic neoplasm virtual biorepository is a bioinformatics-driven system that provides high-quality clinicopathology-rich information for clinical biospecimens. This NCI-sponsored EDRN resource supports translational cancer research. The information model of this biorepository is based on three components: (a) development of common data elements (CDEs), (b) a robust data entry tool and (c) comprehensive data query tools. Methods: The aim of the EDRN initiative is to develop and sustain a virtual biorepository for support of translational research. High-quality biospecimens were accrued and annotated with pertinent clinical, epidemiologic, molecular and genomic information. A user-friendly annotation tool and query tool were developed for this purpose. The annotation tool has the following components. The CDEs are developed from the College of American Pathologists (CAP) Cancer Checklists and North American Association of Central Cancer Registries (NAACCR) standards. The CDEs provide semantic and syntactic interoperability of the data sets by describing them in the form of metadata or data descriptors. The data entry tool is a portable and flexible Oracle-based data entry application, which is an easily mastered, web-based tool. The data query tool enables investigators to search deidentified information within the warehouse through a “point and click” interface, so that only the selected data elements are copied into a data mart using a dimensional-modeled structure from the warehouse’s relational structure. Results: The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository database contains multimodal datasets that are available to investigators via a web-based query tool. At present, the database holds 2,405 cases and 2,068 tumor accessions. The data disclosure is strictly regulated by the user’s authorization.
The high-quality and well-characterized biospecimens have been used in different translational science research projects as well as to further various epidemiologic and genomics studies. Conclusions: The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository with a tangible translational biomedical informatics infrastructure facilitates translational research. The data query tool acts as a central source and provides a mechanism for researchers to efficiently query clinically annotated datasets and biospecimens that are pertinent to their research areas. The tool ensures patient health information protection by disclosing only deidentified data in accordance with Institutional Review Board and Health Insurance Portability and Accountability Act protocols.
Affiliation(s)
- Waqas Amin
- Department of Biomedical Informatics, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
|
42
|
Amin W, Singh H, Pople AK, Winters S, Dhir R, Parwani AV, Becich MJ. A decade of experience in the development and implementation of tissue banking informatics tools for intra and inter-institutional translational research. J Pathol Inform 2010; 1:S2153-3539(22)00104-3. [PMID: 20922029 PMCID: PMC2941965 DOI: 10.4103/2153-3539.68314] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/05/2010] [Accepted: 06/18/2010] [Indexed: 11/15/2022] Open
Abstract
Context: Tissue banking informatics deals with standardized annotation, collection and storage of biospecimens that can further be shared by researchers. Over the last decade, the Department of Biomedical Informatics (DBMI) at the University of Pittsburgh has developed various tissue banking informatics tools to expedite translational medicine research. In this review, we describe the technical approach and capabilities of these models. Design: Clinical annotation of biospecimens requires data retrieval from various clinical information systems and the de-identification of the data by an honest broker. Based upon these requirements, DBMI, with its collaborators, has developed both Oracle-based organ-specific data marts and a more generic, model-driven architecture for biorepositories. The organ-specific models are developed utilizing Oracle 9.2.0.1 server tools and software applications and the model-driven architecture is implemented in a J2EE framework. Result: The organ-specific biorepositories implemented by DBMI include the Cooperative Prostate Cancer Tissue Resource (http://www.cpctr.info/), Pennsylvania Cancer Alliance Bioinformatics Consortium (http://pcabc.upmc.edu/main.cfm), EDRN Colorectal and Pancreatic Neoplasm Database (http://edrn.nci.nih.gov/) and Specialized Programs of Research Excellence (SPORE) Head and Neck Neoplasm Database (http://spores.nci.nih.gov/current/hn/index.htm). The model-based architecture is represented by the National Mesothelioma Virtual Bank (http://mesotissue.org/). These biorepositories provide thousands of well-annotated biospecimens that researchers can search through query interfaces available via the Internet. Conclusion: These systems, developed and supported by our institute, serve to form a common platform for cancer research to accelerate progress in clinical and translational research. 
In addition, they provide a tangible infrastructure and resource for exposing research resources and biospecimen services in collaboration with the clinical anatomic pathology laboratory information system (APLIS) and the cancer registry information systems.
Affiliation(s)
- Waqas Amin
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, USA
|
43
|
Amin W, Kang HP, Egloff AM, Singh H, Trent K, Ridge-Hetrick J, Seethala RR, Grandis J, Parwani AV. An informatics supported web-based data annotation and query tool to expedite translational research for head and neck malignancies. BMC Cancer 2009; 9:396. [PMID: 19912644 PMCID: PMC2780457 DOI: 10.1186/1471-2407-9-396] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/26/2009] [Accepted: 11/13/2009] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND The Specialized Program of Research Excellence (SPORE) in Head and Neck Cancer neoplasm virtual biorepository is a bioinformatics-supported system to incorporate data from various clinical, pathological, and molecular systems into a single architecture based on a set of common data elements (CDEs) that provides semantic and syntactic interoperability of data sets. RESULTS The various components of this annotation tool include the Development of Common Data Elements (CDEs) that are derived from College of American Pathologists (CAP) Checklist and North American Association of Central Cancer Registries (NAACCR) standards. The Data Entry Tool is a portable and flexible Oracle-based data entry device, which is an easily mastered web-based tool. The Data Query Tool helps investigators and researchers to search de-identified information within the warehouse/resource through a "point and click" interface, so that only the selected data elements are copied into a data mart using a multidimensional model from the warehouse's relational structure. The SPORE Head and Neck Neoplasm Database contains multimodal datasets that are accessible to investigators via an easy-to-use query tool. The database currently holds 6553 cases and 10607 tumor accessions. Among these, there are 965 metastatic, 4227 primary, 1369 recurrent, and 483 new primary cases. The data disclosure is strictly regulated by the user's authorization. CONCLUSION The SPORE Head and Neck Neoplasm Virtual Biorepository is a robust translational biomedical informatics tool that can facilitate basic science, clinical, and translational research. The Data Query Tool acts as a central source providing a mechanism for researchers to efficiently find clinically annotated datasets and biospecimens that are relevant to their research areas. The tool protects patient privacy by revealing only de-identified data in accordance with regulations and approvals of the IRB and scientific review committee.
Affiliation(s)
- Waqas Amin
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
|
44
|
Manion FJ, Robbins RJ, Weems WA, Crowley RS. Security and privacy requirements for a multi-institutional cancer research data grid: an interview-based study. BMC Med Inform Decis Mak 2009; 9:31. [PMID: 19527521 PMCID: PMC2709611 DOI: 10.1186/1472-6947-9-31] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 10/23/2008] [Accepted: 06/15/2009] [Indexed: 11/26/2022] Open
Abstract
Background Data protection is important for all information systems that deal with human-subjects data. Grid-based systems – such as the cancer Biomedical Informatics Grid (caBIG) – seek to develop new mechanisms to facilitate real-time federation of cancer-relevant data sources, including sources protected under a variety of regulatory laws, such as HIPAA and 21CFR11. These systems embody new models for data sharing, and hence pose new challenges to the regulatory community, and to those who would develop or adopt them. These challenges must be understood by both systems developers and system adopters. In this paper, we describe our work collecting policy statements, expectations, and requirements from regulatory decision makers at academic cancer centers in the United States. We use these statements to examine fundamental assumptions regarding data sharing using data federations and grid computing. Methods An interview-based study of key stakeholders from a sample of US cancer centers. Interviews were structured, and used an instrument that was developed for the purpose of this study. The instrument included a set of problem scenarios – difficult policy situations that were derived during a full-day discussion of potentially problematic issues by a set of project participants with diverse expertise. Each problem scenario included a set of open-ended questions that were designed to elucidate stakeholder opinions and concerns. Interviews were transcribed verbatim and used for both qualitative and quantitative analysis. For quantitative analysis, data were aggregated at the individual or institutional unit of analysis, depending on the specific interview question. Results Thirty-one (31) individuals at six cancer centers were contacted to participate. Twenty-four out of thirty-one (24/31) individuals responded to our request, yielding a total response rate of 77%. 
Respondents included IRB directors and policy-makers, privacy and security officers, directors of offices of research, information security officers and university legal counsel. Nineteen total interviews were conducted over a period of 16 weeks. Respondents provided answers for all four scenarios (a total of 87 questions). Results were grouped by broad themes, including among others: governance, legal and financial issues, partnership agreements, de-identification, institutional technical infrastructure for security and privacy protection, training, risk management, auditing, IRB issues, and patient/subject consent. Conclusion The findings suggest that with additional work, large scale federated sharing of data within a regulated environment is possible. A key challenge is developing suitable models for authentication and authorization practices within a federated environment. Authentication – the recognition and validation of a person's identity – is in fact a global property of such systems, while authorization – the permission to access data or resources – mimics data sharing agreements in being best served at a local level. Nine specific recommendations result from the work and are discussed in detail. 
These include: (1) the necessity to construct separate legal or corporate entities for governance of federated sharing initiatives on this scale; (2) consensus on the treatment of foreign and commercial partnerships; (3) the development of risk models and risk management processes; (4) development of technical infrastructure to support the credentialing process associated with research including human subjects; (5) exploring the feasibility of developing large-scale, federated honest broker approaches; (6) the development of suitable, federated identity provisioning processes to support federated authentication and authorization; (7) community development of requisite HIPAA and research ethics training modules by federation members; (8) the recognition of the need for central auditing requirements and authority; and (9) use of two-protocol data exchange models where possible in the federation.
Affiliation(s)
- Frank J Manion
- Information Science and Technology, Fox Chase Cancer Center, Philadelphia, PA, USA.
|
45
|
Hewitt R, Watson PH, Dhir R, Aamodt R, Thomas G, Mercola D, Grizzle WE, Morente MM. Timing of consent for the research use of surgically removed tissue. Cancer 2008; 115:4-9. [DOI: 10.1002/cncr.23999] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 11/05/2022]
|
46
|
Mohanty SK, Mistry AT, Amin W, Parwani AV, Pople AK, Schmandt L, Winters SB, Milliken E, Kim P, Whelan NB, Farhat G, Melamed J, Taioli E, Dhir R, Pass HI, Becich MJ. The development and deployment of Common Data Elements for tissue banks for translational research in cancer - an emerging standard based approach for the Mesothelioma Virtual Tissue Bank. BMC Cancer 2008; 8:91. [PMID: 18397527 PMCID: PMC2329649 DOI: 10.1186/1471-2407-8-91] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 09/22/2007] [Accepted: 04/08/2008] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Recent advances in genomics, proteomics, and the increasing demands for biomarker validation studies have catalyzed changes in the landscape of cancer research, fueling the development of tissue banks for translational research. A result of this transformation is the need for sufficient quantities of clinically annotated and well-characterized biospecimens to support the growing needs of the cancer research community. Clinical annotation allows samples to be better matched to the research question at hand and ensures that experimental results are better understood and can be verified. To facilitate and standardize such annotation in bio-repositories, we have combined three accepted and complementary sets of data standards: the College of American Pathologists (CAP) Cancer Checklists, the protocols recommended by the Association of Directors of Anatomic and Surgical Pathology (ADASP) for pathology data, and the North American Association of Central Cancer Registries (NAACCR) elements for epidemiology, therapy and follow-up data. Combining these approaches creates a set of International Standards Organization (ISO) - compliant Common Data Elements (CDEs) for the mesothelioma tissue banking initiative supported by the National Institute for Occupational Safety and Health (NIOSH) of the Centers for Disease Control and Prevention (CDC). METHODS The purpose of the project is to develop a core set of data elements for annotating mesothelioma specimens, following standards established by the CAP checklist, ADASP cancer protocols, and the NAACCR elements. We have associated these elements with modeling architecture to enhance both syntactic and semantic interoperability. The system has a Java-based multi-tiered architecture based on Unified Modeling Language (UML). RESULTS Common Data Elements were developed using controlled vocabulary, ontology and semantic modeling methodology. 
The CDEs for each case are of different types: demographic, epidemiologic data, clinical history, pathology data including block level annotation, and follow-up data including treatment, recurrence and vital status. The end result of such an effort would eventually provide an increased sample set to researchers and make the system interoperable between institutions. CONCLUSION The CAP, ADASP and the NAACCR elements represent widely established data elements that are utilized in many cancer centers. Herein, we have shown these representations can be combined and formalized to create a core set of annotations for banked mesothelioma specimens. Because these data elements are collected as part of the normal workflow of a medical center, data sets developed on the basis of these elements can be easily implemented and maintained.
Affiliation(s)
- Sambit K Mohanty
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Amita T Mistry
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Waqas Amin
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Anil V Parwani
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Andrew K Pople
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Linda Schmandt
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Sharon B Winters
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Paula Kim
- Translating Research Across Communities, USA
- Nancy B Whelan
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Ghada Farhat
- Department of Epidemiology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Jonathan Melamed
- Department of Pathology, New York University School of Medicine, New York, NY, USA
- Emanuela Taioli
- Department of Epidemiology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Rajiv Dhir
- Translating Research Across Communities, USA
- Harvey I Pass
- Department of Cardiothoracic Surgery, Division of Thoracic Surgery and Thoracic Oncology, New York University School of Medicine, New York, NY, USA
- Michael J Becich
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
|
47
|
Patel AA, Gilbertson JR, Showe LC, London JW, Ross E, Ochs MF, Carver J, Lazarus A, Parwani AV, Dhir R, Beck JR, Liebman M, Garcia FU, Prichard J, Wilkerson M, Herberman RB, Becich MJ. A novel cross-disciplinary multi-institute approach to translational cancer research: lessons learned from Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC). Cancer Inform 2007; 3:255-74. [PMID: 19455246 PMCID: PMC2675833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2022] Open
Abstract
BACKGROUND The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC, http://www.pcabc.upmc.edu) is one of the first major project-based initiatives stemming from the Pennsylvania Cancer Alliance that was funded for four years by the Department of Health of the Commonwealth of Pennsylvania. The objective was to initiate a prototype biorepository and bioinformatics infrastructure with a robust data warehouse by developing (1) a statewide data model for bioinformatics and a repository of serum and tissue samples; (2) a data model for biomarker data storage; and (3) a public access website for disseminating research results and bioinformatics tools. The members of the Consortium cooperate closely, exploring the opportunity for sharing clinical, genomic and other bioinformatics data on patient samples in oncology, for the purpose of developing collaborative research programs across cancer research institutions in Pennsylvania. The Consortium's intention was to establish a virtual repository of many clinical specimens residing in various centers across the state, in order to make them available for research. One of our primary goals was to facilitate the identification of cancer-specific biomarkers and encourage collaborative research efforts among the participating centers. METHODS The PCABC has developed unique partnerships so that every region of the state can effectively contribute and participate. It includes over 80 individuals from 14 organizations, and plans to expand to partners outside the State. This has created a network of researchers, clinicians, bioinformaticians, cancer registrars, program directors, and executives from academic and community health systems, as well as external corporate partners - all working together to accomplish a common mission. 
The various sub-committees have developed a common IRB protocol template, common data elements for standardizing data collections for three organ sites, intellectual property/tech transfer agreements, and material transfer agreements that have been approved by each of the member institutions. This was the foundational work that has led to the development of a centralized data warehouse that has met each of the institutions' IRB/HIPAA standards. RESULTS Currently, this "virtual biorepository" has over 58,000 annotated samples from 11,467 cancer patients available for research purposes. The clinical annotation of tissue samples is done either manually over the internet or in semi-automated batch mode through mapping of local data elements to PCABC common data elements. The database currently holds information on 7188 cases (associated with 9278 specimens and 46,666 annotated blocks and blood samples) of prostate cancer, 2736 cases (associated with 3796 specimens and 9336 annotated blocks and blood samples) of breast cancer and 1543 cases (including 1334 specimens and 2671 annotated blocks and blood samples) of melanoma. These numbers continue to grow, and plans to integrate new tumor sites are in progress. Furthermore, the group has also developed a central web-based tool that allows investigators to share their translational (genomics/proteomics) experiment data on research evaluating potential biomarkers via a central location on the Consortium's web site. CONCLUSIONS The technological achievements and the statewide informatics infrastructure that have been established by the Consortium will enable robust and efficient studies of biomarkers and their relevance to the clinical course of cancer. 
Studies resulting from the creation of the Consortium may allow for better classification of cancer types, more accurate assessment of disease prognosis, a better ability to identify the most appropriate individuals for clinical trial participation, and better surrogate markers of disease progression and/or response to therapy.
Affiliation(s)
- Ashokkumar A. Patel
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
- John R. Gilbertson
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
- Joseph Carver
- Abramson Cancer Center of the University of Pennsylvania
- Andrea Lazarus
- Pennsylvania State Cancer Institute at Milton S. Hershey Medical Center
- Anil V. Parwani
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
- Rajiv Dhir
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
- Ronald B. Herberman
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
- Michael J. Becich
- Center for Pathology Informatics, Benedum Oncology Informatics Center, University of Pittsburgh Cancer Institute
|