1
Baum L, Johns M, Müller A, Abu Attieh H, Prasser F. HERALD: A domain-specific query language for longitudinal health data analytics. Int J Med Inform 2024;192:105646. [PMID: 39393126] [DOI: 10.1016/j.ijmedinf.2024.105646]
Abstract
BACKGROUND: Large-scale health data has significant potential for research and innovation, especially longitudinal data, which offers insights into prevention, disease progression, and treatment effects. Yet analyzing this type of data is complex, as data points are documented repeatedly along the timeline. As a consequence, extracting cross-sectional tabular data suitable for statistical analysis and machine learning can be challenging for medical researchers and data scientists alike, and existing tools lack a balance between ease of use and comprehensiveness.
OBJECTIVE: This paper introduces HERALD, a novel domain-specific query language designed to support the transformation of longitudinal health data into cross-sectional tables. We describe the basic concepts, the query syntax, a graphical user interface for constructing and executing HERALD queries, and an integration into Informatics for Integrating Biology and the Bedside (i2b2).
METHODS: The syntax of HERALD mimics natural language and supports different query types for selection, aggregation, analysis of relationships, and searching for data points based on filter expressions and temporal constraints. Using a hierarchical concept model, queries are executed individually on each patient's data while constructing tabular output. HERALD is closed, meaning that queries both process and generate data points; queries can refer to data points produced by previous queries, providing a simple but powerful nesting mechanism.
RESULTS: The open-source implementation consists of a HERALD query parser, an execution engine, and a web-based user interface for query construction and statistical analysis. It can be deployed as a standalone component or integrated as a plugin into self-service data analytics environments such as i2b2. HERALD can be a valuable tool for data scientists and machine learning experts, as it simplifies the transformation of longitudinal health data into tables and data matrices.
CONCLUSION: The construction of cross-sectional tables from longitudinal data can be supported through dedicated query languages that strike a reasonable balance between language complexity and transformation capabilities.
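The core idea described in the abstract, aggregating each patient's longitudinal records into a single cross-sectional row, can be sketched in a few lines of Python. This is a toy illustration of the concept only; the function names and the "last of"/"max of" query spellings are hypothetical and do not reflect HERALD's actual syntax or execution engine:

```python
from collections import defaultdict

# Toy longitudinal records: one row per (patient, concept, time, value).
records = [
    ("p1", "glucose", 1, 90), ("p1", "glucose", 5, 140),
    ("p2", "glucose", 2, 100),
    ("p1", "weight", 1, 80),
]

def to_cross_sectional(records, aggregations):
    """Aggregate each patient's time series into one row per patient,
    mimicking a 'LAST OF glucose' / 'MAX OF glucose' style of query."""
    per_patient = defaultdict(lambda: defaultdict(list))
    for patient, concept, time, value in records:
        per_patient[patient][concept].append((time, value))
    table = {}
    for patient, concepts in per_patient.items():
        row = {}
        for name, (concept, agg) in aggregations.items():
            points = sorted(concepts.get(concept, []))  # order along the timeline
            if points:
                row[name] = agg([v for _, v in points])
        table[patient] = row
    return table

queries = {
    "last_glucose": ("glucose", lambda vs: vs[-1]),  # most recent value
    "max_glucose": ("glucose", max),
}
print(to_cross_sectional(records, queries))
# {'p1': {'last_glucose': 140, 'max_glucose': 140}, 'p2': {'last_glucose': 100, 'max_glucose': 100}}
```

Each output row is itself a set of named data points, which is the property the abstract calls "closed": the result of one query could feed a subsequent one.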
Affiliation(s)
- Lena Baum
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany
- Marco Johns
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany
- Armin Müller
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany
- Hammam Abu Attieh
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany
- Fabian Prasser
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany
2
Alkhatib R, Gaede KI. Data Management in Biobanking: Strategies, Challenges, and Future Directions. BioTech 2024;13:34. [PMID: 39311336] [PMCID: PMC11417763] [DOI: 10.3390/biotech13030034]
Abstract
Biobanking plays a pivotal role in biomedical research by providing standardized processing, precise storage, and management of biological sample collections along with their associated data. Effective data management is a prerequisite for ensuring the integrity, quality, and accessibility of these resources. This review surveys the current landscape of data management in biobanking, discussing key challenges, existing strategies, and potential future directions. We explore multiple aspects of data management, including data collection, storage, curation, sharing, and ethical considerations. By examining evolving technologies and methodologies in biobanking, we aim to provide insights into addressing the complexities and maximizing the utility of biobank data for research and clinical applications.
Affiliation(s)
- Ramez Alkhatib
- Biomaterial Bank Nord, Research Center Borstel Leibniz Lung Center, Parkallee 35, 23845 Borstel, Germany
- German Centre for Lung Research (DZL), Airway Research Centre North (ARCN), 22927 Großhansdorf, Germany
- Karoline I. Gaede
- Biomaterial Bank Nord, Research Center Borstel Leibniz Lung Center, Parkallee 35, 23845 Borstel, Germany
- German Centre for Lung Research (DZL), Airway Research Centre North (ARCN), 22927 Großhansdorf, Germany
- PopGen 2.0 Biobanking Network (P2N), University Hospital Schleswig-Holstein, Campus Kiel, Kiel University, 24105 Kiel, Germany
3
Hallaj S, Chuter BG, Lieu AC, Singh P, Kalpathy-Cramer J, Xu BY, Christopher M, Zangwill LM, Weinreb RN, Baxter SL. Federated Learning in Glaucoma: A Comprehensive Review and Future Perspectives. Ophthalmol Glaucoma 2024:S2589-4196(24)00143-1. [PMID: 39214457] [DOI: 10.1016/j.ogla.2024.08.004]
Abstract
CLINICAL RELEVANCE: Glaucoma is a complex eye condition with varied morphological and clinical presentations, making diagnosis and management challenging. The lack of a consensus definition for glaucoma or glaucomatous optic neuropathy further complicates the development of universal diagnostic tools. Developing robust artificial intelligence (AI) models for glaucoma screening is essential for early detection and treatment but faces significant obstacles: effective deep learning algorithms require large, well-curated datasets from diverse patient populations and imaging protocols, yet creating centralized data repositories is hindered by concerns over data sharing, patient privacy, regulatory compliance, and intellectual property. Federated learning (FL) offers a potential solution by enabling data to remain locally hosted while facilitating distributed model training across multiple sites.
METHODS: A comprehensive literature review was conducted on the application of FL to training AI models for glaucoma screening. Publications from 1950 to 2024 were searched in databases such as PubMed and IEEE Xplore using keywords including "glaucoma," "federated learning," "artificial intelligence," "deep learning," "machine learning," "distributed learning," "privacy-preserving," "data sharing," "medical imaging," and "ophthalmology." Articles were included if they discussed the use of FL in glaucoma-related AI tasks or addressed data sharing and privacy challenges in ophthalmic AI development.
RESULTS: FL enables collaborative model development without centralizing sensitive patient data, addressing privacy and regulatory concerns. Studies show that FL can improve model performance and generalizability by leveraging diverse datasets while maintaining data security. FL models have achieved accuracy comparable or superior to models trained on centralized data, demonstrating effectiveness in real-world clinical settings.
CONCLUSIONS: Federated learning presents a promising strategy for overcoming current obstacles in developing AI models for glaucoma screening. By balancing the need for extensive, diverse training data with the imperative to protect patient privacy and comply with regulations, FL facilitates collaborative model training without compromising data security. This approach offers a pathway toward more accurate and generalizable AI solutions for glaucoma detection and management.
FINANCIAL DISCLOSURE(S): Proprietary or commercial disclosure may be found after the references in the Footnotes and Disclosures at the end of this article.
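The aggregation step at the heart of many FL schemes can be illustrated with a minimal FedAvg-style sketch: each site trains locally and shares only model weights, which the coordinator averages, weighted by local sample counts. This is a generic illustration with toy numbers, not the implementation of any specific system the review covers:

```python
# FedAvg-style aggregation: only weights and cohort sizes leave each site;
# the raw patient data never does.
def federated_average(site_weights, site_sizes):
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]

# Three hypothetical hospitals with different cohort sizes contribute updates.
local = [[0.2, 1.0], [0.4, 0.8], [0.3, 0.9]]
sizes = [100, 300, 600]
print(federated_average(local, sizes))  # ≈ [0.32, 0.88]
```

In a real deployment this loop runs for many rounds, and the averaged model is redistributed to the sites for the next round of local training.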
Affiliation(s)
- Shahin Hallaj
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California; Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Benton G Chuter
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California; Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Alexander C Lieu
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California; Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
- Praveer Singh
- Division of Artificial Medical Intelligence, Department of Ophthalmology, University of Colorado School of Medicine, Aurora, Colorado
- Jayashree Kalpathy-Cramer
- Division of Artificial Medical Intelligence, Department of Ophthalmology, University of Colorado School of Medicine, Aurora, Colorado
- Benjamin Y Xu
- Roski Eye Institute, Keck School of Medicine, University of Southern California, Los Angeles, California
- Mark Christopher
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California
- Linda M Zangwill
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California
- Robert N Weinreb
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California
- Sally L Baxter
- Division of Ophthalmology Informatics and Data Science, Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, University of California, San Diego, La Jolla, California; Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
4
Giacobbe DR, Marelli C, Guastavino S, Mora S, Rosso N, Signori A, Campi C, Giacomini M, Bassetti M. Explainable and Interpretable Machine Learning for Antimicrobial Stewardship: Opportunities and Challenges. Clin Ther 2024;46:474-480. [PMID: 38519371] [DOI: 10.1016/j.clinthera.2024.02.010]
Abstract
There is growing interest in exploiting the advances in artificial intelligence and machine learning (ML) for improving and monitoring antimicrobial prescriptions in line with antimicrobial stewardship principles. Against this background, the concepts of interpretability and explainability are becoming increasingly essential to understanding how ML algorithms could predict antimicrobial resistance or recommend specific therapeutic agents, to avoid unintended biases related to the "black box" nature of complex models. In this commentary, we review and discuss some relevant topics on the use of ML algorithms for antimicrobial stewardship interventions, highlighting opportunities and challenges, with particular attention paid to interpretability and explainability of employed models. As in other fields of medicine, the exponential growth of artificial intelligence and ML indicates the potential for improving the efficacy of antimicrobial stewardship interventions, at least in part by reducing time-consuming tasks for overwhelmed health care personnel. Improving our knowledge about how complex ML models work could help to achieve crucial advances in promoting the appropriate use of antimicrobials, as well as in preventing antimicrobial resistance selection and dissemination.
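One widely used model-agnostic way to probe what a "black box" model relies on is permutation importance: shuffle one feature and measure how much accuracy drops. The sketch below is a stdlib-only illustration with a hypothetical toy "resistance" classifier; it is one generic interpretability technique, not the specific methods this commentary reviews:

```python
import random

def permutation_importance(predict, X, y, feature, trials=50, seed=0):
    """Mean accuracy drop when one feature column is shuffled.
    A large drop means the model depends on that feature."""
    rng = random.Random(seed)
    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)
    base = accuracy(X)
    drops = []
    for _ in range(trials):
        col = [row[feature] for row in X]
        rng.shuffle(col)  # break the feature's link to the outcome
        shuffled = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, col)]
        drops.append(base - accuracy(shuffled))
    return sum(drops) / trials

# Hypothetical classifier that only looks at feature 0 (e.g., prior exposure).
predict = lambda row: row[0] > 0.5
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [True, False, True, False]
print(permutation_importance(predict, X, y, feature=0))  # clearly positive
print(permutation_importance(predict, X, y, feature=1))  # 0.0: feature is ignored
```

The zero importance for the ignored feature is exactly the kind of sanity check such techniques offer when auditing stewardship models for unintended biases.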
Affiliation(s)
- Daniele Roberto Giacobbe
- Department of Health Sciences, University of Genoa, Genoa, Italy; UO Clinica Malattie Infettive, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
- Cristina Marelli
- UO Clinica Malattie Infettive, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
- Sara Mora
- UO Information and Communication Technologies, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
- Nicola Rosso
- UO Information and Communication Technologies, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
- Alessio Signori
- Section of Biostatistics, Department of Health Sciences, University of Genoa, Genoa, Italy
- Cristina Campi
- Department of Mathematics, University of Genoa, Genoa, Italy; Life Science Computational Laboratory, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
- Mauro Giacomini
- Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy
- Matteo Bassetti
- Department of Health Sciences, University of Genoa, Genoa, Italy; UO Clinica Malattie Infettive, Istituto di Ricovero e Cura a Carattere Scientifico Ospedale Policlinico San Martino, Genoa, Italy
5
Wündisch E, Hufnagl P, Brunecker P, Meier Zu Ummeln S, Träger S, Kopp M, Prasser F, Weber J. Development of a Trusted Third Party at a Large University Hospital: Design and Implementation Study. JMIR Med Inform 2024;12:e53075. [PMID: 38632712] [DOI: 10.2196/53075]
Abstract
Background: Pseudonymization has become a best practice for securely managing the identities of patients and study participants in medical research projects and data sharing initiatives. It allows many research processes to operate without directly identifying data while still supporting advanced processing activities, such as data linkage. Often, pseudonymization and related functionalities are bundled in dedicated technical and organizational units known as trusted third parties (TTPs). However, pseudonymization can significantly increase the complexity of data management and research workflows, necessitating adequate tool support. Common tasks of TTPs include supporting the secure registration and pseudonymization of patient and sample identities as well as managing consent.
Objective: Despite the challenges involved, little has been published about successful architectures and functional tools for implementing TTPs at large university hospitals. This paper aims to fill this gap by describing the software architecture and tool set developed and deployed as part of a TTP established at Charité - Universitätsmedizin Berlin.
Methods: The infrastructure for the TTP was designed to provide a modular structure while keeping maintenance requirements low. Basic functionalities were realized with the free MOSAIC tools. However, supporting common study processes requires implementing workflows that span different basic services, such as patient registration, followed by pseudonym generation and concluded by consent collection. To achieve this, an integration layer was developed to provide a unified Representational State Transfer (REST) application programming interface (API) as a basis for more complex workflows. Based on this API, a unified graphical user interface was also implemented, providing an integrated view of the information objects and workflows supported by the TTP. The API was implemented using Java and Spring Boot, and the graphical user interface in PHP and Laravel. Both services use a shared Keycloak instance as a unified management system for roles and rights.
Results: By the end of 2022, the TTP had supported more than 10 research projects since its launch in December 2019. Within these projects, more than 3000 identities were stored, more than 30,000 pseudonyms were generated, and more than 1500 consent forms were submitted. In total, more than 150 people regularly work with the software platform. Implementing the integration layer and the unified user interface, together with comprehensive roles and rights management, significantly reduced the effort of operating the TTP, as personnel of the supported research projects can use many functionalities independently.
Conclusions: With the architecture and components described, we created a user-friendly and compliant environment for supporting research projects. We believe that these insights into the design and implementation of our TTP can help other institutions set up corresponding structures efficiently and effectively.
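The essential property of pseudonymization — stable within a project, unlinkable across projects — can be sketched with a generic keyed hash. This is an illustration of the principle only: the TTP described here builds on the MOSAIC tools, and the key and function names below are hypothetical, not its API:

```python
import hmac
import hashlib
import secrets

# The TTP alone holds the secret key and the identity↔pseudonym mapping;
# researchers only ever see the derived pseudonyms.
SECRET_KEY = secrets.token_bytes(32)  # hypothetical project-wide secret

def pseudonymize(identity: str, project: str) -> str:
    """Derive a stable, project-specific pseudonym: the same patient gets
    the same pseudonym within a project, but pseudonyms from different
    projects cannot be linked without the key."""
    msg = f"{project}:{identity}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()[:16]

p1 = pseudonymize("patient-0042", "study-A")
assert p1 == pseudonymize("patient-0042", "study-A")  # deterministic per project
assert p1 != pseudonymize("patient-0042", "study-B")  # unlinkable across projects
```

Real deployments add a stored mapping table (to support re-identification under governance) and consent management on top of this basic derivation step.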
Affiliation(s)
- Eric Wündisch
- Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Peter Hufnagl
- Digital Pathology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Peter Brunecker
- Core Unit Research IT, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sophie Meier Zu Ummeln
- Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Sarah Träger
- Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Marcus Kopp
- Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Fabian Prasser
- Medical Informatics Group, Center of Health Data Science, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Joachim Weber
- Core Unit THS, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany
- German Centre for Cardiovascular Research (DZHK), Berlin, Germany
6
Giacobbe DR, Zhang Y, de la Fuente J. Explainable artificial intelligence and machine learning: novel approaches to face infectious diseases challenges. Ann Med 2023;55:2286336. [PMID: 38010090] [PMCID: PMC10836268] [DOI: 10.1080/07853890.2023.2286336]
Abstract
Artificial intelligence (AI) and machine learning (ML) are revolutionizing human activities in various fields, with medicine and infectious diseases not being exempt from their rapid and exponential growth. Furthermore, the field of explainable AI and ML has gained particular relevance and is attracting increasing interest. Infectious diseases have already started to benefit from explainable AI/ML models. For example, they have been employed or proposed to better understand complex models aimed at improving the diagnosis and management of coronavirus disease 2019, in the field of antimicrobial resistance prediction, and in quantum vaccine algorithms. Although some issues concerning the dichotomy between explainability and interpretability still require careful attention, an in-depth understanding of how complex AI/ML models arrive at their predictions or recommendations is becoming increasingly essential to properly face the growing challenges of infectious diseases in the present century.
Affiliation(s)
- Daniele Roberto Giacobbe
- Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy
- Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Italy
- Yudong Zhang
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK
- School of Computer Science and Engineering, Southeast University, Nanjing, China
- José de la Fuente
- SaBio (Health and Biotechnology), Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, Ciudad Real, Spain
- Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK, USA
7
Zhang T, Hei R, Huang Y, Shao J, Zhang M, Feng K, Qian W, Li S, Jin F, Chen Y. Construction and experimental validation of a necroptosis-related lncRNA signature as a prognostic model and immune-landscape predictor for lung adenocarcinoma. Am J Cancer Res 2023;13:4418-4433. [PMID: 37818057] [PMCID: PMC10560937]
Abstract
Necroptosis is a newly characterized form of cell death. Although much has been learned about long non-coding RNAs since the discovery that they can affect the proliferation of lung adenocarcinoma, the role of necroptosis-related long non-coding RNAs (NRlncRNAs) in lung adenocarcinoma (LUAD) remains enigmatic. This study aims to explore novel biomarkers and therapeutic targets for LUAD. LUAD data were downloaded from The Cancer Genome Atlas, and necroptosis-related genes were retrieved from the published literature. Co-expression analysis, univariate Cox analysis, and least absolute shrinkage and selection operator regression were used to identify necroptosis-related prognostic long non-coding RNAs. A comprehensive evaluation of tumor immunity for necroptosis-related features was performed, and we identified a 9-NRlncRNA signature. Kaplan-Meier and Cox regression analyses confirmed that the signature was an independent predictor of LUAD outcome in both the test and training sets (all P < 0.05). The areas under the time-dependent receiver operating characteristic (ROC) curves (AUCs) for 1-, 2-, and 3-year overall survival were 0.754, 0.746, and 0.720, respectively. GSEA showed that the 9 NRlncRNAs were associated with multiple malignancy-associated and immunoregulatory pathways. Based on this model, we found that immune status and the level of response to chemotherapy and targeted therapy differed significantly between the low-risk and high-risk groups. qRT-PCR assays revealed that the 9 NRlncRNAs are involved in the regulation of tumor cell proliferation and may affect the expression of the immune checkpoints programmed cell death 1 (PD1) and CD28. Our results indicate that this novel 9-NRlncRNA signature (AL031600.2, LINC01281, AP001178.1, AL157823.2, LINC01290, MED4-AS1, AC026355.2, AL606489.1, FAM83A-AS1) can predict the prognosis of LUAD and is associated with the immune response, providing new insights into the pathogenesis of LUAD and the development of therapies.
Affiliation(s)
- Tongtong Zhang
- Department of Pulmonary Critical Care Medicine, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Ruoxuan Hei
- Department of Clinical Diagnose, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Yue Huang
- Department of Pulmonary Critical Care Medicine, The 1st Affiliated Hospital of Shenzhen University, Shenzhen 518035, Guangdong, PR China
- Jingjin Shao
- Department of Pulmonary Critical Care Medicine, The 1st Affiliated Hospital of Shenzhen University, Shenzhen 518035, Guangdong, PR China
- Min Zhang
- Department of Pulmonary Critical Care Medicine, The 1st Affiliated Hospital of Shenzhen University, Shenzhen 518035, Guangdong, PR China
- Kai Feng
- Department of Pulmonary Critical Care Medicine, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Weishen Qian
- Department of Pulmonary Critical Care Medicine, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Simin Li
- Department of Clinical Diagnose, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Faguang Jin
- Department of Pulmonary Critical Care Medicine, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
- Yanwei Chen
- Department of Pulmonary Critical Care Medicine, The 1st Affiliated Hospital of Shenzhen University, Shenzhen 518035, Guangdong, PR China
- Department of Pulmonary Critical Care Medicine, The Second Affiliated Hospital of The Air Force Military Medical University, Xinsi Road 569, Xi’an 710038, Shaanxi, PR China
8
Pieters M, Kruger IM, Kruger HS, Breet Y, Moss SJ, van Oort A, Bester P, Ricci C. Strategies of Modelling Incident Outcomes Using Cox Regression to Estimate the Population Attributable Risk. Int J Environ Res Public Health 2023;20:6417. [PMID: 37510649] [PMCID: PMC10379285] [DOI: 10.3390/ijerph20146417]
Abstract
When the Cox model is applied, recommendations about the choice of the time metric and the model's structure are often disregarded, along with the proportional hazards assumption. Moreover, most published studies fail to frame the real impact of a risk factor in the target population. Our aim was to show how modelling strategies affect the assumptions of the Cox model, and how these strategies affect the population attributable risk (PAR). Our work is based on data collected in the North-West Province, one of the two PURE study centres in South Africa. The Cox model was used to estimate the hazard ratio (HR) of all-cause mortality in relation to smoking, alcohol use, physical inactivity, and hypertension. Firstly, we used a Cox model with time to event as the underlying time variable. Secondly, we used a Cox model with age to event as the underlying time variable. Finally, the second model was implemented with age classes and sex as strata variables. Mutually adjusted models were also investigated. A statistical test of the multiplicative interaction term between the exposures and the log-transformed time-to-event metric was used to assess the proportional hazards assumption. Model fit was assessed by means of the Akaike Information Criterion (AIC). Models with age as the underlying time variable and with age and sex as strata variables showed better validity of the proportional hazards assumption and better fit. The PAR for a specific modifiable risk factor can be estimated more accurately in mutually adjusted models, allowing better public health decisions. This is not necessarily true when correlated modifiable risk factors are considered.
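Once an HR has been estimated, the PAR is commonly obtained from Levin's formula, with the HR standing in for the relative risk. A minimal sketch, using illustrative numbers (30% exposure prevalence, HR = 2.0) rather than actual PURE study estimates:

```python
def population_attributable_risk(prevalence: float, hazard_ratio: float) -> float:
    """Levin's formula, PAR = p(HR - 1) / (1 + p(HR - 1)),
    treating the hazard ratio as an approximation of the relative risk."""
    excess = prevalence * (hazard_ratio - 1)
    return excess / (1 + excess)

# Hypothetical example: 30% of the population exposed, HR of 2.0.
par = population_attributable_risk(0.30, 2.0)
print(round(par, 3))  # 0.231: about 23% of events attributable to the exposure
```

Because the formula is driven by the HR, any bias introduced by a poorly specified time metric or a violated proportional hazards assumption propagates directly into the PAR, which is the paper's central point.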
Affiliation(s)
- Marlien Pieters
- Centre of Excellence for Nutrition, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- SAMRC Extramural Unit for Hypertension and Cardiovascular Disease, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Iolanthe M Kruger
- Africa Unit for Transdisciplinary Health Research, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Herculina S Kruger
- Centre of Excellence for Nutrition, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- SAMRC Extramural Unit for Hypertension and Cardiovascular Disease, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Yolandi Breet
- Africa Unit for Transdisciplinary Health Research, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Centre of Excellence for Hypertension in Africa Research Team, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Sarah J Moss
- Physical Activity, Sport and Recreation Research Focus Area, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Andries van Oort
- Physical Activity, Sport and Recreation Research Focus Area, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Petra Bester
- Africa Unit for Transdisciplinary Health Research, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
- Cristian Ricci
- Africa Unit for Transdisciplinary Health Research, Faculty of Health Sciences, North-West University, Potchefstroom 2520, South Africa
9
Gaudio HA, Padmanabhan V, Landis WP, Silva LEV, Slovis J, Starr J, Weeks MK, Widmann NJ, Forti RM, Laurent GH, Ranieri NR, Mi F, Degani RE, Hallowell T, Delso N, Calkins H, Dobrzynski C, Haddad S, Kao SH, Hwang M, Shi L, Baker WB, Tsui F, Morgan RW, Kilbaugh TJ, Ko TS. A Template for Translational Bioinformatics: Facilitating Multimodal Data Analyses in Preclinical Models of Neurological Injury. bioRxiv 2023:2023.07.17.547582. [PMID: 37503137] [PMCID: PMC10370067] [DOI: 10.1101/2023.07.17.547582]
Abstract
Background: Pediatric neurological injury and disease constitute a critical public health issue, due to increasing rates of survival from primary injuries (e.g., cardiac arrest, traumatic brain injury) and a lack of monitoring technologies and therapeutics for treating secondary neurological injury. Translational, preclinical research facilitates the development of solutions to this growing problem but is hindered by the lack of data frameworks and standards for the management, processing, and analysis of multimodal data sets.
Methods: Here, we present a generalizable data framework, implemented for large-animal research at the Children's Hospital of Philadelphia, that addresses this technological gap. The framework culminates in an interactive dashboard for exploratory analysis and filtered data set download.
Results: Compared with existing clinical and preclinical data management solutions, the presented framework accommodates heterogeneous data types (single measures, repeated measures, time series, and imaging), integrates data sets across various experimental models, and facilitates dynamic visualization of integrated data sets. We present a use case of this framework for developing a predictive model for intra-arrest prediction of cardiopulmonary resuscitation outcome.
Conclusions: The described preclinical data framework may serve as a template to aid data management efforts in other translational research labs that generate heterogeneous data sets and require a dynamic platform that can evolve alongside their research.
10
Validating functional redundancy with mixed generative adversarial networks. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
11
Wang Y, He S, Wang Y. AI-Assisted Dynamic Modelling for Data Management in a Distributed System. INTERNATIONAL JOURNAL OF INFORMATION SYSTEMS AND SUPPLY CHAIN MANAGEMENT 2022. [DOI: 10.4018/ijisscm.313623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
There are many interdependent computers in distributed networks. In such schemes, total ownership costs comprise facilities such as computers and controls; hardware purchases; and running expenses such as wages and electricity charges. Power consumption accounts for a large part of operating expenses, and this high share is directly linked to inadequate energy planning. An AI-assisted dynamic modelling for data management (AI-DM) framework is proposed. This research suggests a multi-objective method for planning multi-criteria software solutions for distributed systems, using the fuzzy TOPSIS tool as a comprehensive guide to multi-criteria management. The execution results demonstrate that this strategy can trade off requirements by weight.
Affiliation(s)
- Yingjun Wang
- Guangdong Provincial Communication Group Co., Ltd., China
- Yiran Wang
- Zhongwei Road and Bridge Equipment Jiangsu Co., Ltd., China
12
Kimble M, Allers S, Campbell K, Chen C, Jackson LM, King BL, Silverbrand S, York G, Beard K. medna-metadata: an open-source data management system for tracking environmental DNA samples and metadata. Bioinformatics 2022; 38:4589-4597. [PMID: 35960154 PMCID: PMC9524998 DOI: 10.1093/bioinformatics/btac556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2022] [Revised: 07/23/2022] [Accepted: 08/09/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Environmental DNA (eDNA), as a rapidly expanding research field, stands to benefit from shared resources including sampling protocols, study designs, discovered sequences, and taxonomic assignments to sequences. High-quality community-shareable eDNA resources rely heavily on comprehensive metadata documentation that captures the complex workflows covering field sampling, molecular biology lab work, and bioinformatic analyses. Few sources document database development for comprehensive eDNA metadata and these workflows, and no open-source software exists for this purpose. RESULTS We present medna-metadata, an open-source, modular system that aligns with the Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles supporting scholarly data reuse, and that implements a standardized metadata collection structure encapsulating critical aspects of field data collection, wet lab processing, and bioinformatic analysis. Medna-metadata is showcased with metabarcoding data from the Gulf of Maine (Polinski et al., 2019). AVAILABILITY AND IMPLEMENTATION The source code of the medna-metadata web application is hosted on GitHub (https://github.com/Maine-eDNA/medna-metadata). Medna-metadata is a docker-compose installable package. Documentation can be found at https://medna-metadata.readthedocs.io/en/latest/?badge=latest. The application is implemented in Python, PostgreSQL and PostGIS, RabbitMQ, and NGINX, with all major browsers supported. A demo can be found at https://demo.metadata.maine-edna.org/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- M Kimble
- School of Computing and Information Science, University of Maine, Orono, ME 04469, USA
- S Allers
- Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME 04469, USA
- K Campbell
- School of Computing and Information Science, University of Maine, Orono, ME 04469, USA
- C Chen
- School of Computing and Information Science, University of Maine, Orono, ME 04469, USA
- L M Jackson
- Advanced Research Computing, Security and Information Management, University of Maine, Orono, ME 04469, USA
- Maine EPSCoR, University of Maine, Orono, ME 04469, USA
- B L King
- Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME 04469, USA
- S Silverbrand
- School of Marine Sciences, University of Maine, Orono, ME 04469, USA
- G York
- Environmental DNA Laboratory, Coordinated Operating Research Entities, University of Maine, Orono, ME 04469, USA
- K Beard
- School of Computing and Information Science, University of Maine, Orono, ME 04469, USA
13
Gurugubelli VS, Fang H, Shikany JM, Balkus SV, Rumbut J, Ngo H, Wang H, Allison JJ, Steffen LM. A review of harmonization methods for studying dietary patterns. SMART HEALTH (AMSTERDAM, NETHERLANDS) 2022; 23:100263. [PMID: 35252528 PMCID: PMC8896407 DOI: 10.1016/j.smhl.2021.100263] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/11/2023]
Abstract
Data harmonization is the process by which each variable from different research studies is standardized to similar units, resulting in comparable datasets. These data may be integrated for more powerful and accurate examination and prediction of outcomes for use in intelligent and smart electronic health software programs and systems. Prospective harmonization is performed when researchers create guidelines for gathering and managing the data before data collection begins. In contrast, retrospective harmonization is performed by pooling previously collected data from various studies, using expert domain knowledge to identify and translate variables. In nutritional epidemiology, dietary data harmonization is often necessary to construct the nutrient and food databases needed to answer complex research questions and develop effective public health policy. In this paper, we review methods for effective data harmonization, including developing a harmonization plan, identifying which common standards already exist for harmonization, and defining the variables needed to harmonize datasets. Currently, several large-scale studies maintain harmonized nutrient databases, especially in Europe, and steps have been proposed to inform the retrospective harmonization process. As an example, data harmonization methods are applied to several U.S. longitudinal diet datasets. Based on our review, considerations for future dietary data harmonization include user agreements for sharing private data among participating studies, defining variables and data dictionaries that accurately map variables among studies, and the use of secure data storage servers to maintain privacy. These considerations establish the necessary components of harmonized data for smart health applications, which can promote healthier eating and provide greater insights into the effect of dietary patterns on health.
Affiliation(s)
- Hua Fang
- University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, 02747, Massachusetts, USA
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, 55 N Lake Ave, Worcester, 01655, Massachusetts, USA
- Corresponding author. Tel.: +0-508-910-6411
- James M Shikany
- Division of Preventive Medicine, University of Alabama at Birmingham, 1720 University Blvd, Birmingham, 35294, Alabama, USA
- Salvador V Balkus
- University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, 02747, Massachusetts, USA
- Joshua Rumbut
- University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, 02747, Massachusetts, USA
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, 55 N Lake Ave, Worcester, 01655, Massachusetts, USA
- Hieu Ngo
- University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, 02747, Massachusetts, USA
- Honggang Wang
- University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, 02747, Massachusetts, USA
- Jeroan J Allison
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, 55 N Lake Ave, Worcester, 01655, Massachusetts, USA
- Lyn M. Steffen
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, 55455, Minnesota, USA
14
Valenzuela W, Balsiger F, Wiest R, Scheidegger O. Medical-Blocks: A Platform for Exploration, Management, Analysis, and Sharing of Data in Biomedical Research. JMIR Form Res 2022; 6:e32287. [PMID: 35232718 PMCID: PMC9039815 DOI: 10.2196/32287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2021] [Revised: 02/04/2022] [Accepted: 02/28/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Biomedical research requires healthcare institutions to provide sensitive clinical data to leverage data science and artificial intelligence technologies. However, providing healthcare data to researchers in a simple and secure manner proves challenging for healthcare institutions. OBJECTIVE We describe and introduce Medical-Blocks, a platform for data exploration, data management, data analysis, and data sharing in biomedical research. METHODS The specification requirements for Medical-Blocks included: i) connection to data sources of healthcare institutions with an interface for data exploration, ii) management of data in an internal file storage system, iii) data analysis through visualization and classification of data, and iv) data sharing via a file hosting service for collaboration. Medical-Blocks should be simple to use via a web-based user interface and extensible with new functionalities through a modular design based on microservices ("blocks"). The scalability of the platform should be ensured by containerization. Security and legal regulations were considered during the development. RESULTS Medical-Blocks is a web application that runs in the cloud or as a local instance at a healthcare institution. Local instances of Medical-Blocks access data sources such as electronic health records and picture archiving and communication systems (PACS) at healthcare institutions. Researchers and clinicians can explore, manage, and analyze the available data through Medical-Blocks. The data analysis involves classification of data for metadata extraction and the formation of cohorts. In collaborations, metadata (e.g., number of patients per cohort) and/or the data itself can be shared through Medical-Blocks locally or via a cloud instance to other researchers and clinicians. CONCLUSIONS Medical-Blocks facilitates biomedical research by providing a centralized platform to interact with medical data in collaborative research projects. Access to and management of medical data is simplified. Data can be swiftly analyzed to form cohorts for research and shared among researchers. The modularity of Medical-Blocks makes the platform feasible for biomedical research where heterogeneous medical data are needed.
Affiliation(s)
- Waldo Valenzuela
- Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 18, Bern, CH
- Fabian Balsiger
- Support Center for Advanced Neuroimaging (SCAN), Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Bern, CH
- Roland Wiest
- Support Center for Advanced Neuroimaging (SCAN), Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Bern, CH
- Olivier Scheidegger
- Support Center for Advanced Neuroimaging (SCAN), Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Bern, CH
- Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern, CH
15
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. CURRENT RESEARCH IN BIOTECHNOLOGY 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022] Open
16
Abstract
With increasing digitization of healthcare, real-world data (RWD) are available in greater quantity and scope than ever before. Since the 2016 United States 21st Century Cures Act, innovations in the RWD life cycle have taken tremendous strides forward, largely driven by demand for regulatory-grade real-world evidence from the biopharmaceutical sector. However, use cases for RWD continue to grow in number, moving beyond drug development, to population health and direct clinical applications pertinent to payors, providers, and health systems. Effective RWD utilization requires disparate data sources to be turned into high-quality datasets. To harness the potential of RWD for emerging use cases, providers and organizations must accelerate life cycle improvements that support this process. We build on examples obtained from the academic literature and author experience of data curation practices across a diverse range of sectors to describe a standardized RWD life cycle containing key steps in production of useful data for analysis and insights. We delineate best practices that will add value to current data pipelines. Seven themes are highlighted that ensure sustainability and scalability for RWD life cycles: data standards adherence, tailored quality assurance, data entry incentivization, deploying natural language processing, data platform solutions, RWD governance, and ensuring equity and representation in data.
17
Ahmed Z. Precision medicine with multi-omics strategies, deep phenotyping, and predictive analysis. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2022; 190:101-125. [DOI: 10.1016/bs.pmbts.2022.02.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
18
Data protection, data management, and data sharing: Stakeholder perspectives on the protection of personal health information in South Africa. PLoS One 2021; 16:e0260341. [PMID: 34928950 PMCID: PMC8687565 DOI: 10.1371/journal.pone.0260341] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2021] [Accepted: 11/08/2021] [Indexed: 11/19/2022] Open
Abstract
The Protection of Personal Information Act (POPIA) 2013 came into force in South Africa on 1 July 2020. It seeks to strengthen the processing of personal information, including health information. While POPIA is to be welcomed, there are concerns about the impact it will have on the processing of health information. To ensure that the National Health Laboratory Service (NHLS) is compliant with these new strict processing requirements, and that compliance does not negatively impact its current screening, treatment, surveillance and research mandate, it was decided to consider the development of an NHLS POPIA Code of Conduct for Personal Health. As part of the process of developing such a Code, and to better understand the challenges faced in the processing of personal health information in South Africa, 19 semi-structured interviews with stakeholders were conducted between June and September 2020. Overall, respondents welcomed the introduction of POPIA. However, they felt that there are tensions between the strengthening of data protection and the use of personal information for individual patient care, treatment programmes, and research. Respondents reported a need to rethink the management of personal health information in South Africa and identified five issues needing to be addressed at a national and an institutional level: an understanding of the importance of personal information; an understanding of POPIA and data protection; improving data quality; improving transparency in data use; and improving accountability in data use. The application of POPIA to the processing of personal health information is challenging, complex, and likely costly. However, personal health information must be appropriately managed to ensure that the privacy of the data subject is protected, but equally that it is used as a resource in the individual's and the wider public interest.
19
The Role of Big Data in Aging and Older People’s Health Research: A Systematic Review and Ecological Framework. SUSTAINABILITY 2021. [DOI: 10.3390/su132111587] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Abstract
Big data has been prominent in studying aging and older people’s health. It has promoted modeling and analyses in biological and geriatric research (like cellular senescence), developed health management platforms, and supported decision-making in public healthcare and social security. However, current studies are still limited within a single subject, rather than flourished as interdisciplinary research in the context of big data. The research perspectives have not changed, nor has big data brought itself out of the role as a modeling tool. When embedding big data as a data product, analysis tool, and resolution service into different spatial, temporal, and organizational scales of aging processes, it would present as a connection, integration, and interaction simultaneously in conducting interdisciplinary research. Therefore, this paper attempts to propose an ecological framework for big data based on aging and older people’s health research. Following the scoping process of PRISMA, 35 studies were reviewed to validate our ecological framework. Although restricted by issues like digital divides and privacy security, we encourage researchers to capture various elements and their interactions in the human-environment system from a macro and dynamic perspective rather than simply pursuing accuracy.
20
Kouper I, Tucker KL, Tharp K, van Booven ME, Clark A. Active Curation of Large Longitudinal Surveys: A Case Study. JOURNAL OF ESCIENCE LIBRARIANSHIP 2021. [DOI: 10.7191/jeslib.2021.1210] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022] Open
Abstract
In this paper we take an in-depth look at the curation of a large longitudinal survey and activities and procedures involved in moving the data from its generation to the state that is needed to conduct scientific analysis. Using a case study approach, we describe how large surveys generate a range of data assets that require many decisions well before the data is considered for analysis and publication. We use the notion of active curation to describe activities and decisions about the data objects that are “live,” i.e., when they are still being collected and processed for the later stages of the data lifecycle. Our efforts illustrate a gap in the existing discussions on curation. On one hand, there is an acknowledged need for active or upstream curation as an engagement of curators close to the point of data creation. On the other hand, the recommendations on how to do that are scattered across multiple domain-oriented data efforts.
In describing the complexities of active curation of survey data and providing general recommendations we aim to draw attention to the practices of active curation, stimulate the development of interoperable tools, standards, and techniques needed at the initial stages of research projects, and encourage collaborations between libraries and other academic units.
21
Pavel A, del Giudice G, Federico A, Di Lieto A, Kinaret PAS, Serra A, Greco D. Integrated network analysis reveals new genes suggesting COVID-19 chronic effects and treatment. Brief Bioinform 2021; 22:1430-1441. [PMID: 33569598 PMCID: PMC7929418 DOI: 10.1093/bib/bbaa417] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2020] [Revised: 11/13/2020] [Accepted: 12/19/2020] [Indexed: 01/08/2023] Open
Abstract
The COVID-19 disease led to an unprecedented health emergency, still ongoing worldwide. Given the lack of a vaccine or a clear therapeutic strategy to counteract the infection as well as its secondary effects, there is currently a pressing need to generate new insights into the SARS-CoV-2 induced host response. Biomedical data can help to investigate new aspects of the COVID-19 pathogenesis, but source heterogeneity represents a major drawback and limitation. In this work, we applied data integration methods to develop a Unified Knowledge Space (UKS) and used it to identify a new set of genes associated with SARS-CoV-2 host response, both in vitro and in vivo. Functional analysis of these genes reveals possible long-term systemic effects of the infection, such as vascular remodelling and fibrosis. Finally, we identified a set of potentially relevant drugs targeting proteins involved in multiple steps of the host response to the virus.
Affiliation(s)
- Alisa Pavel
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Giusy del Giudice
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Antonio Federico
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Antonio Di Lieto
- Department of Forensic Psychiatry, Aarhus University, Aarhus, Denmark
- Pia A S Kinaret
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
22
Defining the big social data paradigm through a systematic literature review approach. JOURNAL OF KNOWLEDGE MANAGEMENT 2021. [DOI: 10.1108/jkm-10-2020-0801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
Purpose
This study aims to investigate the Big Social Data (BSD) paradigm, which still lacks a clear and shared definition, and causes a lack of clarity and understanding about its beneficial opportunities for practitioners. In the knowledge management (KM) domain, a clear characterization of the BSD paradigm can lead to more effective and efficient KM strategies, processes and systems that leverage a huge amount of structured and unstructured data sources.
Design/methodology/approach
The study adopts a systematic literature review (SLR) methodology based on a mixed analysis approach (unsupervised machine learning and human-based) applied to 199 research articles on BSD topics extracted from Scopus and Web of Science. In particular, machine learning processing has been implemented by using topic extraction and hierarchical clustering techniques.
Findings
The paper provides a threefold contribution: a conceptualization and consensual definition of the BSD paradigm through the identification of four key conceptual pillars (i.e. sources, properties, technology and value exploitation); a characterization of the taxonomy of BSD data types that extends previous works on this topic; and a research agenda for future research studies on BSD and its applications from a KM perspective.
Research limitations/implications
The main limits of the research lie in the list of articles considered for the literature review, which could be enlarged by considering further sources (in addition to Scopus and Web of Science), further languages (in addition to English) and/or further years (the review considers papers published until 2018). Research implications concern the development of a research agenda organized along five thematic issues, which can feed future research to deepen the paradigm of BSD and explore linkages with the KM field.
Practical implications
Practical implications concern the usage of the proposed definition of BSD to purposefully design applications and services based on BSD in knowledge-intensive domains to generate value for citizens, individuals, companies and territories.
Originality/value
The original contribution concerns the definition of the big social data paradigm built through an SLR that combines machine learning processing and human-based processing. Moreover, the research agenda deriving from the study contributes to investigating the BSD paradigm in the wider domain of KM.
23
24
Khan IH, Javaid M. Big Data Applications in Medical Field: A Literature Review. JOURNAL OF INDUSTRIAL INTEGRATION AND MANAGEMENT 2020. [DOI: 10.1142/s242486222030001x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Digital imaging and medical reporting have acquired an essential role in healthcare, but the main challenge is the storage of a high volume of patient data. Although newer technologies have already been introduced in the medical sciences to reduce record sizes, Big Data provides advancements by storing a large amount of data to improve the efficiency and quality of patient treatment with better care. It provides intelligent automation capabilities that reduce errors compared with manual inputs. Large numbers of research papers on big data in the medical field are studied and analyzed for their impacts, benefits, and applications. Big data has great potential to support the digitalization of all medical and clinical records and then save the entire data regarding the medical history of an individual or a group. This paper discusses big data usage for various industries and sectors. Finally, 12 significant applications of big data in the medical field are identified and studied with a brief description. This technology can be gainfully used to extract useful information from the available data by analyzing and managing it through a combination of hardware and software. With technological advancement, big data provides health-related information for millions of patients, covering life issues such as lab test reporting, clinical narratives, demographics, prescriptions, medical diagnoses, and related documentation. Thus, Big Data is essential in developing better yet efficient healthcare analysis and storage services. The demand for big data applications is increasing due to its capability of handling and analyzing massive data. Not only in the future but even now, Big Data is proving itself as an axiom of storing, developing, analyzing, and providing overall health information to physicians.
Affiliation(s)
- Ibrahim Haleem Khan
- School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi, India
- Mohd Javaid
- Department of Mechanical Engineering, Jamia Millia Islamia, New Delhi, India
25
Jahangiri L, Akiva G, Lakhia S, Turkyilmaz I. Understanding the complexities of digital dentistry integration in high-volume dental institutions. Br Dent J 2020; 229:166-168. [PMID: 32811935 DOI: 10.1038/s41415-020-1928-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The purpose of this article is to detail the primary challenges faced by large dental institutions as they incorporate digital dentistry into their mainstream workflow. Integration of digital technology is easier in private practices, which have smaller patient volumes and require fewer trained staff. Additionally, in private practices, scanning, designing and milling frequently occur in a single location, which does not require external digital data transfer. However, large dental institutions must overcome several barriers uniquely generated by their large-scale operation. Numerous individuals must be comprehensively and efficiently trained to operate the advanced technologies. The digital software must seamlessly integrate with existing software, and an internal infrastructure capable of handling massive data inputs must be established. High-volume production in large dental institutions requires the involvement of external laboratories to meet demand. This outsourcing presents a new challenge of safe digital data transfer in accordance with patient privacy and protection regulations set forth by governing agencies. It is vital for large dental institutions to recognise the unique challenges thrust upon them as they attempt to incorporate a digital workflow. With proper forethought and planning, an appropriate infrastructure may be established, allowing for a smooth and safe transition to the digital era.
Affiliation(s)
- Leila Jahangiri
- Clinical Professor and Chair, New York University College of Dentistry, Department of Prosthodontics, New York, USA
- Guy Akiva
- Director, Information Technology Infrastructure and Systems Support, New York University College of Dentistry, New York, USA
- Samantha Lakhia
- Third-year dental student, New York University College of Dentistry, New York, USA
- Ilser Turkyilmaz
- Clinical Associate Professor, New York University College of Dentistry, Department of Prosthodontics, New York, USA
26
Borda A, Gray K, Fu Y. Research data management in health and biomedical citizen science: practices and prospects. JAMIA Open 2020; 3:113-125. [PMID: 32607493 PMCID: PMC7309241 DOI: 10.1093/jamiaopen/ooz052] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2019] [Revised: 07/09/2019] [Accepted: 09/30/2019] [Indexed: 12/25/2022] Open
Abstract
Background Public engagement in health and biomedical research is being influenced by the paradigm of citizen science. However, conventional health and biomedical research relies on sophisticated research data management tools and methods. Considering these, what contribution can citizen science make in this field of research? How can it follow research protocols and produce reliable results? Objective The aim of this article is to analyze research data management practices in existing biomedical citizen science studies, so as to provide insights for members of the public and of the research community considering this approach to research. Methods A scoping review was conducted on this topic to determine the data management characteristics of health and biomedical citizen science research. From this review and related web searching, we chose five online platforms and a specific research project associated with each to understand their research data management approaches and enablers. Results Health and biomedical citizen science platforms and projects are diverse in terms of types of work with data and data management activities that in themselves may have scientific merit. However, consistent approaches in the use of research data management models or practices seem lacking, or at least are not prevalent in the review. Conclusions There is potential for important data collection and analysis activities to be opaque or irreproducible in health and biomedical citizen science initiatives without the implementation of a research data management model that is transparent and accessible to team members and to external audiences. This situation might be improved with participatory development of standards that can be applied to diverse projects and platforms, across the research data life cycle.
Affiliation(s)
- Ann Borda
- Health and Biomedical Informatics Centre, Melbourne Medical School, The University of Melbourne, Melbourne, Australia
- Kathleen Gray
- Health and Biomedical Informatics Centre, Melbourne Medical School, The University of Melbourne, Melbourne, Australia
- Yuqing Fu
- Health and Biomedical Informatics Centre, Melbourne Medical School, The University of Melbourne, Melbourne, Australia
27
Egli A, Schrenzel J, Greub G. Digital microbiology. Clin Microbiol Infect 2020; 26:1324-1331. [PMID: 32603804] [PMCID: PMC7320868] [DOI: 10.1016/j.cmi.2020.06.023]
Abstract
BACKGROUND Digitalization and artificial intelligence have an important impact on the way microbiology laboratories will work in the near future. Opportunities and challenges lie ahead in digitalizing microbiological workflows. Making efficient use of big data, machine learning, and artificial intelligence in clinical microbiology requires a profound understanding of data handling aspects. OBJECTIVE This review article summarizes the most important concepts of digital microbiology. The article gives microbiologists, clinicians and data scientists a viewpoint and practical examples along the diagnostic process. SOURCES We used peer-reviewed literature identified by a PubMed search for digitalization, machine learning, artificial intelligence and microbiology. CONTENT We describe the opportunities and challenges of digitalization in microbiological diagnostic processes with various examples. In this context, we also cover key aspects of data structure and interoperability, as well as legal aspects. Finally, we outline the way forward for applications in a modern microbiology laboratory. IMPLICATIONS We predict that digitalization and the use of machine learning will have a profound impact on the daily routine of laboratory staff. The most important steps along the analytical process should be identified where digital technologies can be applied to provide a benefit. The education of all staff involved should be adapted to prepare for the advances in digital microbiology.
Affiliation(s)
- A Egli
- Clinical Bacteriology and Mycology, University Hospital Basel, Basel, Switzerland; Applied Microbiology Research, Department of Biomedicine, University of Basel, Basel, Switzerland
- J Schrenzel
- Laboratory of Bacteriology, University Hospitals of Geneva, Geneva, Switzerland
- G Greub
- Institute of Medical Microbiology, University Hospital Lausanne, Lausanne, Switzerland
28
Mallappallil M, Sabu J, Gruessner A, Salifu M. A review of big data and medical research. SAGE Open Med 2020; 8:2050312120934839. [PMID: 32637104] [PMCID: PMC7323266] [DOI: 10.1177/2050312120934839]
Abstract
Universally, the volume of data has increased since the 1980s, with the collection rate doubling every 40 months. "Big data" is a term that was introduced in the 1990s to describe data sets too large to be used with common software. Medicine is a major field predicted to increase its use of big data by 2025. Big data in medicine may be used by commercial, academic, government, and public sectors. It includes biologic, biometric, and electronic health data. Examples of biologic data include biobanks; biometric data may include individual wellness data from devices; electronic health data include the medical record; and other data include demographics and images. Big data has also contributed to changes in research methodology. Changes in the clinical research paradigm have been fueled by large-scale biological data harvesting (biobanks), which is developed, analyzed, and managed by cheaper computing technology (big data), supported by greater flexibility in study design (real-world data) and by the relationships between industry, government regulators, and academics. Cultural changes, along with easy access to information via the Internet, facilitate participation by more people. Current needs demand quick answers, which may be supplied by big data, biobanks, and more flexible study designs. Big data can reveal health patterns and promises to provide solutions that have previously been out of society's grasp; however, the murkiness of international laws, questions of data ownership, public ignorance, and privacy and security concerns are slowing the progress that could otherwise be achieved by the use of big data. The goal of this descriptive review is to create awareness of the ramifications of big data and to show readers that this trend is positive and will likely lead to better clinical solutions, but caution must be exercised to reduce harm.
Affiliation(s)
- Jacob Sabu
- State University of New York at Downstate, Brooklyn, NY, USA
- Moro Salifu
- State University of New York at Downstate, Brooklyn, NY, USA
29
Ercole A, Brinck V, George P, Hicks R, Huijben J, Jarrett M, Vassar M, Wilson L. Guidelines for Data Acquisition, Quality and Curation for Observational Research Designs (DAQCORD). J Clin Transl Sci 2020; 4:354-359. [PMID: 33244417] [PMCID: PMC7681114] [DOI: 10.1017/cts.2020.24]
Abstract
BACKGROUND High-quality data are critical to the entire scientific enterprise, yet the complexity and effort involved in data curation are vastly under-appreciated. This is especially true for large observational, clinical studies because of the amount of multimodal data that is captured and the opportunity for addressing numerous research questions through analysis, either alone or in combination with other data sets. However, a lack of detail concerning data curation methods can leave unresolved questions about the robustness of the data, its utility for addressing specific research questions or hypotheses, and how to interpret the results. We aimed to develop a framework for the design, documentation and reporting of data curation methods in order to advance the scientific rigour, reproducibility and analysis of the data. METHODS Forty-six experts participated in a modified Delphi process to reach consensus on indicators of data curation that could be used in the design and reporting of studies. RESULTS We identified 46 indicators that are applicable to the design, training/testing, run time and post-collection phases of studies. CONCLUSION The Data Acquisition, Quality and Curation for Observational Research Designs (DAQCORD) Guidelines are the first comprehensive set of data quality indicators for large observational studies. They were developed around the needs of neuroscience projects, but we believe they are relevant and generalisable, in whole or in part, to other fields of health research, and also to smaller observational studies and preclinical research. The DAQCORD Guidelines provide a framework for achieving high-quality data, a cornerstone of health research.
Affiliation(s)
- Ari Ercole
- Department of Medicine, Division of Anaesthesia, University of Cambridge, Cambridge, UK
- Pradeep George
- International Neuroinformatics Coordinating Facility, Karolinska Institutet, Stockholm, Sweden
- Jilske Huijben
- Department of Public Health, Center for Medical Decision Sciences, Erasmus MC, Rotterdam, The Netherlands
- Mary Vassar
- Department of Neurological Surgery, University of California, San Francisco, CA, USA
- Lindsay Wilson
- Division of Psychology, University of Stirling, Stirling, UK
30
Ahmed Z, Mohamed K, Zeeshan S, Dong X. Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine. Database (Oxford) 2020; 2020:baaa010. [PMID: 32185396] [PMCID: PMC7078068] [DOI: 10.1093/database/baaa010]
Abstract
Precision medicine is one of the recent and powerful developments in medical care, which has the potential to improve the traditional symptom-driven practice of medicine, allowing earlier interventions using advanced diagnostics and tailoring better, more economical personalized treatments. Identifying the best pathway to personalized and population medicine involves the ability to analyze comprehensive patient information together with broader aspects to monitor and distinguish between sick and relatively healthy people, which will lead to a better understanding of biological indicators that can signal shifts in health. While the complexities of disease at the individual level have made it difficult to utilize healthcare information in clinical decision-making, some of the existing constraints have been greatly minimized by technological advancements. To implement effective precision medicine with enhanced ability to positively impact patient outcomes and provide real-time decision support, it is important to harness the power of electronic health records by integrating disparate data sources and discovering patient-specific patterns of disease progression. Useful analytic tools, technologies, databases, and approaches are required to augment networking and interoperability of clinical, laboratory and public health systems, and to address, with an effective balance, ethical and social issues related to the privacy and protection of healthcare data. Developing multifunctional machine learning platforms for clinical data extraction, aggregation, management and analysis can support clinicians by efficiently stratifying subjects to understand specific scenarios and optimize decision-making. Implementation of artificial intelligence in healthcare is a compelling vision that has the potential to lead to significant improvements toward the goals of providing real-time, better personalized and population medicine at lower costs. In this study, we focused on analyzing and discussing various published artificial intelligence and machine learning solutions, approaches and perspectives, aiming to advance academic solutions and pave the way for a new data-centric era of discovery in healthcare.
Affiliation(s)
- Zeeshan Ahmed
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, 112 Paterson Street, New Brunswick, NJ, USA
- Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson Street, New Brunswick, NJ, USA
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, 263 Farmington Ave., Farmington, CT, USA
- Institute for Systems Genomics, University of Connecticut, 67 North Eagleville Road, Storrs, CT, USA
- Khalid Mohamed
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, 263 Farmington Ave., Farmington, CT, USA
- Saman Zeeshan
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
- XinQi Dong
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, 112 Paterson Street, New Brunswick, NJ, USA
- Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson Street, New Brunswick, NJ, USA
31
Li F, Wang Y, Li C, Marquez-Lago TT, Leier A, Rawlings ND, Haffari G, Revote J, Akutsu T, Chou KC, Purcell AW, Pike RN, Webb GI, Ian Smith A, Lithgow T, Daly RJ, Whisstock JC, Song J. Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods. Brief Bioinform 2019; 20:2150-2166. [PMID: 30184176] [PMCID: PMC6954447] [DOI: 10.1093/bib/bby077]
Abstract
The roles of proteolytic cleavage have been intensively investigated and discussed during the past two decades. This irreversible chemical process has been frequently reported to influence a number of crucial biological processes, such as the cell cycle, protein regulation and inflammation. A number of advanced studies have been published aiming to decipher the mechanisms of proteolytic cleavage. Given its significance and the large number of functionally enriched substrates targeted by specific proteases, many computational approaches have been established for accurate prediction of protease-specific substrates and their cleavage sites. Consequently, there is an urgent need to systematically assess the state-of-the-art computational approaches for protease-specific cleavage site prediction to further advance the existing methodologies and to improve the prediction performance. With this goal in mind, in this article, we carefully evaluated a total of 19 computational methods (including 8 scoring function-based methods and 11 machine learning-based methods) in terms of their underlying algorithm, calculated features, performance evaluation and software usability. Then, extensive independent tests were performed to assess the robustness and scalability of the reviewed methods using our carefully prepared independent test data sets with 3641 cleavage sites (specific to 10 proteases). The comparative experimental results demonstrate that PROSPERous is the most accurate generic method for predicting eight protease-specific cleavage sites, while GPS-CCD and LabCaS outperformed other predictors for calpain-specific cleavage sites. Based on our review, we then outlined some potential ways to improve the prediction performance and ease the computational burden by applying ensemble learning, deep learning, positive unlabeled learning and parallel and distributed computing techniques. We anticipate that our study will serve as a practical and useful guide for interested readers to further advance next-generation bioinformatics tools for protease-specific cleavage site prediction.
Affiliation(s)
- Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Yanan Wang
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
- Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich 8093, Switzerland
- Tatiana T Marquez-Lago
- Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- André Leier
- Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- Neil D Rawlings
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Gholamreza Haffari
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Jerico Revote
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, USA
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Anthony W Purcell
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Robert N Pike
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- A Ian Smith
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Trevor Lithgow
- Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
- Roger J Daly
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- James C Whisstock
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
32
McKenzie KA, Hunt SL, Hulshof G, Mudaranthakam DP, Meyer K, Vidoni ED, Burns JM, Mahnken JD. A semi-automated pipeline for fulfillment of resource requests from a longitudinal Alzheimer's disease registry. JAMIA Open 2019; 2:516-520. [PMID: 32025648] [PMCID: PMC6993996] [DOI: 10.1093/jamiaopen/ooz032]
Abstract
Objective Managing registries with continual data collection poses challenges, such as following reproducible research protocols and guaranteeing data accessibility. The University of Kansas (KU) Alzheimer’s Disease Center (ADC) maintains one such registry: Curated Clinical Cohort Phenotypes and Observations (C3PO). We created an automated and reproducible process by which investigators have access to C3PO data. Materials and Methods Data were input into Research Electronic Data Capture (REDCap). Monthly, data belonging to the Uniform Data Set (UDS), that is, data also collected at other ADCs, were uploaded to the National Alzheimer’s Coordinating Center (NACC). Quarterly, NACC cleaned, curated, and returned the UDS to the KU Data Management and Statistics (DMS) Core, where it was stored in C3PO along with other quarterly curated site-specific data. Investigators seeking to use C3PO submitted a research proposal and requested variables via the publicly accessible and searchable data dictionary. The DMS Core used this variable list and an automated SAS program to create a subset of C3PO. Results C3PO contained 1913 variables stored in 15 datasets. From 2017 to 2018, 38 data requests were completed for several KU departments and other research institutions. Completing data requests became more efficient; C3PO subsets were produced in under 10 seconds. Discussion The data management strategy outlined above facilitated reproducible research practices, which are fundamental to the future of research because they allow replication and verification to occur. Conclusion We created a transparent, automated, and efficient process for extracting subsets of data from a registry whose data changed daily.
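The subsetting step this abstract describes, taking an investigator's requested variable list and pulling only those columns out of a multi-dataset registry, can be sketched briefly. The paper's pipeline used an automated SAS program; the pandas version below is only an illustrative analogue, and the dataset names, column names, and `subject_id` key are invented for the example, not taken from C3PO.

```python
import pandas as pd

# A toy registry stored as multiple datasets (the real C3PO holds 15).
registry = {
    "demographics": pd.DataFrame({
        "subject_id": [1, 2, 3],
        "age": [72, 68, 75],
        "sex": ["F", "M", "F"],
    }),
    "cognition": pd.DataFrame({
        "subject_id": [1, 2, 3],
        "mmse_score": [28, 24, 30],
        "cdr_global": [0.0, 0.5, 0.0],
    }),
}

def fulfill_request(registry, requested_vars):
    """Merge only the requested variables into one analysis-ready table."""
    out = None
    for name, df in registry.items():
        cols = [c for c in df.columns if c in requested_vars]
        if not cols:
            continue  # this dataset holds none of the requested variables
        part = df[["subject_id"] + cols]
        out = part if out is None else out.merge(part, on="subject_id")
    return out

subset = fulfill_request(registry, {"age", "mmse_score"})
```

Because the fulfillment logic is a single generic function driven by the request list, every data request is handled identically and reproducibly, which is the property the abstract emphasizes.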
Affiliation(s)
- Katelyn A McKenzie
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA
- Suzanne L Hunt
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA
- Genevieve Hulshof
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA
- Dinesh Pal Mudaranthakam
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA
- Kayla Meyer
- University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA
- Eric D Vidoni
- University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA; Department of Neurology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Jeffrey M Burns
- University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA; Department of Neurology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Jonathan D Mahnken
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas, USA; University of Kansas Alzheimer's Disease Center, Fairway, Kansas, USA
33
Cai W, Lesnik KL, Wade MJ, Heidrich ES, Wang Y, Liu H. Incorporating microbial community data with machine learning techniques to predict feed substrates in microbial fuel cells. Biosens Bioelectron 2019; 133:64-71. [PMID: 30909014] [DOI: 10.1016/j.bios.2019.03.021]
Abstract
The complicated interactions that occur in mixed-species biotechnologies, including biosensors, hinder chemical detection specificity. This lack of specificity limits the applications in which biosensors may be deployed, such as those where an unknown feed substrate must be determined. The application of genomic data and well-developed data mining technologies can overcome these limitations and advance engineering development. In the present study, 69 samples with three different substrate types (acetate, carbohydrates and wastewater) collected from various laboratory environments were evaluated to determine the ability to identify feed substrates from the resultant microbial communities. Six machine learning algorithms with four different input variables were trained and evaluated on their ability to predict feed substrate from genomic datasets. The highest accuracies of 93 ± 6% and 92 ± 5% were obtained using NNET trained on datasets classified at the phylum and family taxonomic levels, respectively. These accuracies corresponded to kappa values of 0.87 ± 0.10 and 0.86 ± 0.09, respectively. Four of the six algorithms maintained accuracies above 80% and kappa values higher than 0.66. The sequencing method (Roche 454 or Illumina) did not affect the accuracy of any algorithm except SVM at the phylum level. All algorithms trained on NMDS-compressed datasets obtained accuracies over 80%, while models trained on PCoA-compressed datasets presented a 10-30% reduction in accuracy. These results suggest that incorporating microbial community data with machine learning algorithms can be used for the prediction of feed substrate and for the potential improvement of microbial fuel cell (MFC)-based biosensor signal specificity, providing a new use of machine learning techniques that has substantial practical applications in biotechnological fields.
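The workflow this abstract evaluates, training a classifier on taxon abundance profiles and scoring it with accuracy and Cohen's kappa, can be sketched in a few lines. The study trained six algorithms (including a neural network, "NNET") on 69 real samples; the sketch below uses a scikit-learn MLP on synthetic relative-abundance data, so the taxa counts, class signatures, and resulting scores are illustrative only, not the paper's results.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)

# Synthetic relative abundances for 30 phylum-level taxa per sample,
# with each substrate class given its own community "signature".
n_per_class, n_taxa = 40, 30
classes = ["acetate", "carbohydrates", "wastewater"]
X, y = [], []
for i, label in enumerate(classes):
    center = rng.random(n_taxa) + np.eye(len(classes), n_taxa)[i] * 3
    samples = rng.random((n_per_class, n_taxa)) + center
    X.append(samples / samples.sum(axis=1, keepdims=True))  # normalize rows
    y += [label] * n_per_class
X = np.vstack(X)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                    max_iter=2000, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)
kappa = cohen_kappa_score(y_te, pred)  # chance-corrected agreement
```

Reporting kappa alongside accuracy, as the study does, guards against the inflation that raw accuracy can show on imbalanced class distributions.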
Affiliation(s)
- Wenfang Cai
- Department of Environmental Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China; Department of Biological and Ecological Engineering, Oregon State University, Corvallis, OR 97331, USA
- Keaton Larson Lesnik
- Department of Biological and Ecological Engineering, Oregon State University, Corvallis, OR 97331, USA
- Matthew J Wade
- School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK; Department of Mathematics & Statistics, McMaster University, Hamilton, Canada L8S 4K1
- Yunhai Wang
- Department of Environmental Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Hong Liu
- Department of Biological and Ecological Engineering, Oregon State University, Corvallis, OR 97331, USA
34
Read KB. Adapting data management education to support clinical research projects in an academic medical center. J Med Libr Assoc 2019; 107:89-97. [PMID: 30598653] [PMCID: PMC6300223] [DOI: 10.5195/jmla.2019.580]
Abstract
Background Librarians and researchers alike have long identified research data management (RDM) training as a need in biomedical research. Despite the wealth of libraries offering RDM education to their communities, clinical research is an area that has not been targeted. Clinical RDM (CRDM) is seen by its community as an essential part of the research process for which established guidelines exist, yet educational initiatives in this area remain unknown. Case Presentation Leveraging my academic library's experience supporting CRDM through informationist grants and REDCap training in our medical center, I developed a 1.5-hour CRDM workshop. This workshop was designed to use established CRDM guidelines in clinical research and to address common questions asked by our community through the library's existing data support program. The workshop was offered to the entire medical center 4 times between November 2017 and July 2018. This case study describes the development, implementation, and evaluation of this workshop. Conclusions The 4 workshops were well attended and well received by the medical center community, with 99% stating that they would recommend the class to others and 98% stating that they would use what they learned in their work. Attendees also articulated how they would implement the main competencies they learned from the workshop into their work. For the library, the effort to support CRDM has led to the coordination of a larger institutional collaborative training series to educate researchers on best practices with data, as well as the formation of institution-wide policy groups to address researcher challenges with CRDM, data transfer, and data sharing.
Affiliation(s)
- Kevin B Read
- Data Services Librarian and Data Discovery Lead, NYU Health Sciences Library, New York University School of Medicine, 577 First Avenue, New York, NY 10016
35
Ahmed Z, Kim M, Liang BT. MAV-clic: management, analysis, and visualization of clinical data. JAMIA Open 2018; 2:23-28. [PMID: 31984341] [PMCID: PMC6951942] [DOI: 10.1093/jamiaopen/ooy052]
Abstract
Objectives Develop a multifunctional analytics platform for efficient management and analysis of healthcare data. Materials and Methods Management, Analysis, and Visualization of Clinical Data (MAV-clic) is a Health Insurance Portability and Accountability Act of 1996 (HIPAA)-compliant framework based on the Butterfly Model. MAV-clic extracts, cleanses, and encrypts data then restructures and aggregates data in a deidentified format. A graphical user interface allows query, analysis, and visualization of clinical data. Results MAV-clic manages healthcare data for over 800 000 subjects at UConn Health. Three analytic capabilities of MAV-clic include: creating cohorts based on specific criteria; performing measurement analysis of subjects with a specific diagnosis and medication; and calculating measure outcomes of subjects over time. Discussion MAV-clic supports clinicians and healthcare analysts by efficiently stratifying subjects to understand specific scenarios and optimize decision making. Conclusion MAV-clic is founded on the scientific premise that to improve the quality and transition of healthcare, integrative platforms are necessary to analyze heterogeneous clinical, epidemiological, metabolomics, proteomics, and genomics data for precision medicine.
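One step in the pipeline this abstract outlines, restructuring records into a deidentified format before they are queried, can be illustrated with a common technique: replacing direct identifiers with a salted one-way hash so the same patient maps to a stable pseudonym. This is only a generic sketch of that idea; the field names, the salt handling, and the hashing choice are assumptions for illustration, not MAV-clic's actual implementation.

```python
import hashlib

SALT = b"site-secret-salt"  # in practice, a securely stored secret, never hard-coded

def deidentify(record):
    """Replace the direct identifier (here 'mrn') with a salted one-way hash."""
    pseudo_id = hashlib.sha256(SALT + record["mrn"].encode()).hexdigest()[:16]
    out = {k: v for k, v in record.items() if k != "mrn"}
    out["pseudo_id"] = pseudo_id
    return out

rec = {"mrn": "000123", "diagnosis": "I10", "year": 2018}
clean = deidentify(rec)
```

Because the hash is deterministic for a given salt, deidentified records from different extracts can still be aggregated per subject, which is what makes cohort queries over deidentified data possible.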
Affiliation(s)
- Zeeshan Ahmed
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, School of Medicine, University of Connecticut Health Center, Farmington, Connecticut, USA
- Minjung Kim
- The Pat and Jim Calhoun Cardiology Center, School of Medicine, University of Connecticut Health Center, Farmington, Connecticut, USA
- Bruce T Liang
- Ray Neag Distinguished Professor of Cardiovascular Biology and Medicine, Director Pat and Jim Calhoun Cardiology Center, Dean UConn School of Medicine, University of Connecticut Health Center, Farmington, Connecticut, USA