376
|
Saito MA, Saunders JK, Chagnon M, Gaylord DA, Shepherd A, Held NA, Dupont C, Symmonds N, York A, Charron M, Kinkade DB. Development of an Ocean Protein Portal for Interactive Discovery and Education. J Proteome Res 2021; 20:326-336. [PMID: 32897077 PMCID: PMC8036901 DOI: 10.1021/acs.jproteome.0c00382] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/30/2023]
Abstract
Proteins are critical in catalyzing chemical reactions, forming key cellular structures, and regulating cellular processes. Investigation of marine microbial proteins by metaproteomics methods enables the discovery of numerous aspects of microbial biogeochemical processes. However, these datasets present big data challenges as they often involve many samples collected across broad geospatial and temporal scales, resulting in thousands of protein identifications, abundances, and corresponding annotation information. The Ocean Protein Portal (OPP) was created to enable data sharing and discovery among multiple scientific domains and to serve both research and education functions. The portal focuses on three use case questions: "Where is my protein of interest?", "Who makes it?", and "How much is there?" It provides profile and section visualizations, real-time taxonomic analysis, and links to metadata, sequence analysis, and other external resources to enable connections to be made between biogeochemical and proteomics datasets.
|
377
|
El Emam K, Mosquera L, Jonker E, Sood H. Evaluating the utility of synthetic COVID-19 case data. JAMIA Open 2021; 4:ooab012. [PMID: 33709065 PMCID: PMC7936723 DOI: 10.1093/jamiaopen/ooab012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/13/2020] [Revised: 02/01/2021] [Accepted: 02/10/2021] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Concerns about patient privacy have limited access to COVID-19 datasets. Data synthesis is one approach for making such data broadly available to the research community in a privacy protective manner. OBJECTIVES Evaluate the utility of synthetic data by comparing analysis results between real and synthetic data. METHODS A gradient boosted classification tree was built to predict death using Ontario's 90 514 COVID-19 case records linked with community comorbidity, demographic, and socioeconomic characteristics. Model accuracy and relationships were evaluated, as well as privacy risks. The same model was developed on a synthesized dataset and compared to one from the original data. RESULTS The AUROC and AUPRC for the real data model were 0.945 [95% confidence interval (CI), 0.941-0.948] and 0.34 (95% CI, 0.313-0.368), respectively. The synthetic data model had AUROC and AUPRC of 0.94 (95% CI, 0.936-0.944) and 0.313 (95% CI, 0.286-0.342) with confidence interval overlap of 45.05% and 52.02% when compared with the real data. The most important predictors of death for the real and synthetic models were in descending order: age, days since January 1, 2020, type of exposure, and gender. The functional relationships were similar between the two data sets. Attribute disclosure risks were 0.0585, and membership disclosure risk was low. CONCLUSIONS This synthetic dataset could be used as a proxy for the real dataset.
|
378
|
Abstract
Many research agencies are now requiring that data collected as part of funded projects be shared. However, the practice of data sharing in education sciences has lagged behind these funder requirements. We assert that this is likely because researchers generally have not been made aware of these requirements and the benefits of data sharing. Furthermore, data sharing is usually not a part of formal training, so many researchers may be unaware of how to properly share their data. Finally, the research culture in education science is often filled with concerns regarding the sharing of data. In this article, we address each of these areas, discussing the wide range of benefits of data sharing and the many ways data can be shared, providing a step-by-step guide to start sharing data, and responding to common concerns.
|
379
|
Amit AML, Pepito VCF, Gutierrez B, Rawson T. Data Sharing in Southeast Asia During the First Wave of the COVID-19 Pandemic. Front Public Health 2021; 9:662842. [PMID: 34222173 PMCID: PMC8242246 DOI: 10.3389/fpubh.2021.662842] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Accepted: 05/11/2021] [Indexed: 11/13/2022] Open
Abstract
Background: When a new pathogen emerges, consistent case reporting is critical for public health surveillance. Tracking cases geographically and over time is key for understanding the spread of an infectious disease and effectively designing interventions to contain and mitigate an epidemic. In this paper we describe the reporting systems on COVID-19 in Southeast Asia during the first wave in 2020, and highlight the impact of specific reporting methods. Methods: We reviewed key epidemiological variables from various sources including a regionally comprehensive dataset, national trackers, dashboards, and case bulletins for 11 countries during the first wave of the epidemic in Southeast Asia. We recorded timelines of shifts in epidemiological reporting systems and described the differences in how epidemiological data are reported across countries and timepoints. Results: Our findings suggest that countries in Southeast Asia generally reported precise and detailed epidemiological data during the first wave of the pandemic. Changes in reporting rarely occurred for demographic data, while reporting shifts for geographic and temporal data were frequent. Most countries provided COVID-19 individual-level data daily using HTML and PDF, necessitating scraping and extraction before data could be used in analyses. Conclusion: Our study highlights the importance of more nuanced analyses of COVID-19 epidemiological data within and across countries because of the frequent shifts in reporting. As governments continue to respond to impacts on health and the economy, data sharing also needs to be prioritised given its foundational role in policymaking, and in the implementation and evaluation of interventions.
|
380
|
van Lin N, Paliouras G, Vroom E, ’t Hoen PA, Roos M. How Patient Organizations Can Drive FAIR Data Efforts to Facilitate Research and Health Care: A Report of the Virtual Second International Meeting on Duchenne Data Sharing, March 3, 2021. J Neuromuscul Dis 2021; 8:1097-1108. [PMID: 34334415 PMCID: PMC8673524 DOI: 10.3233/jnd-210721] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022]
Abstract
BACKGROUND For patients with rare diseases such as Duchenne and Becker muscular dystrophy (DMD/BMD), access to their health data is key to being able to advocate for themselves and be in control of their care. Since 2018, the DMD/BMD patient community has been committed to making DMD/BMD-related data FAIR, i.e., Findable, Accessible, Interoperable, and Reusable. On March 3, 2021, the second international meeting on FAIR data sharing for DMD/BMD was held virtually. OBJECTIVE The aim of this meeting report is to summarize the presentations and discussions of the meeting. METHODS During this meeting, the progress of FAIRification efforts since the first international meeting in 2019, new developments, stakeholder perspectives, and experiences from implementing FAIR data principles in practice were presented and discussed. RESULTS Over 120 attendees representing various stakeholder groups (i.e., patient organizations, clinicians, clinical and academic researchers, pharmaceutical companies, regulators, and EU organizations) from 22 countries participated in the meeting. This meeting report summarizes the presentations and discussions from the meeting, provides an overview of the key lessons learned since the first meeting, and outlines the next steps. CONCLUSIONS Patient organizations are key drivers of the FAIRification process in practice, and dialogue with stakeholders is critical to success.
|
381
|
Gao F, Tao L, Huang Y, Shu Z. Management and Data Sharing of COVID-19 Pandemic Information. Biopreserv Biobank 2020; 18:570-580. [PMID: 33320734 DOI: 10.1089/bio.2020.0134] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/16/2022] Open
Abstract
The coronavirus disease 2019 (COVID-19) is an ongoing global pandemic caused by severe acute respiratory syndrome coronavirus 2. During the past 10 months, COVID-19 has killed over 1 million people worldwide. Under this global crisis, data sharing and management of the COVID-19 information are urgently needed and critical for researchers, epidemiologists, physicians, bioengineers, funding agencies, and governments to work together in developing new vaccines, drugs, methods, therapeutics, and strategies for the prevention and treatment of this deadly and rapidly spreading disease. The COVID-19 pandemic information includes databases of COVID-19 patient biospecimen resources in hospitals or biorepositories; electronic patient health records; ongoing clinical trials and research results on this disease; policies, guidelines, and regulations related to COVID-19; and COVID-19 outbreak tracking records, among others. This article presents a study of the current management and data-sharing approaches, tools, software, network, and internet systems developed in the United States. The study reveals that the existing data-sharing and management systems face many challenges associated with data decentralization, inconsistencies, security and legal issues, limited financial support, international communications, standardization, and globalization. To overcome these problems, several integrated platform models for national and international data sharing and management are developed and proposed in this article to meet the unprecedented demand for COVID-19 pandemic information sharing and research worldwide.
|
382
|
Evin A, Bonhomme V, Claude J. Optimizing digitalization effort in morphometrics. Biol Methods Protoc 2020; 5:bpaa023. [PMID: 33324759 PMCID: PMC7723759 DOI: 10.1093/biomethods/bpaa023] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/18/2020] [Revised: 11/05/2020] [Accepted: 11/13/2020] [Indexed: 12/23/2022] Open
Abstract
Quantifying phenotypes is a common practice for addressing questions regarding morphological variation. The time dedicated to data acquisition can vary greatly depending on methods and on the required quantity of information. Optimizing digitization effort can be done by pooling datasets among users, by automating data collection, or by reducing the number of measurements. Pooling datasets among users is not without risk, since potential errors arising from multiple operators in data acquisition can prevent combining morphometric datasets. We present an analytical workflow to estimate within- and among-operator biases and to assess whether morphometric datasets can be pooled. We show that pooling and sharing data requires careful examination of the errors occurring during data acquisition, that the choice of morphometric approach influences the amount of error, and that in some cases pooling data should be avoided. The demonstration is based on a worked example (Sus scrofa teeth) using a combination of 18 morphometric approaches and datasets for which we identified and quantified several potential sources of errors in the workflow. We show that it is possible to estimate the analytical power of a study using a small subset of data to select the best morphometric protocol and to optimize the number of variables necessary for analysis. In particular, we focus on semi-landmarks, which often produce an inflation of variables in contrast to the number of available observations used in statistical testing. We show how the workflow can be used for optimizing digitization efforts and provide recommendations for best practices in error management.
|
383
|
Birkenbihl C, Salimi Y, Domingo‐Fernández D, Lovestone S, Fröhlich H, Hofmann‐Apitius M. Evaluating the Alzheimer's disease data landscape. ALZHEIMER'S & DEMENTIA (NEW YORK, N. Y.) 2020; 6:e12102. [PMID: 33344750 PMCID: PMC7744022 DOI: 10.1002/trc2.12102] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/10/2020] [Accepted: 09/21/2020] [Indexed: 01/08/2023]
Abstract
INTRODUCTION Numerous studies have collected Alzheimer's disease (AD) cohort data sets. To achieve reproducible, robust results in data-driven approaches, an evaluation of the present data landscape is vital. METHODS Previous efforts relied exclusively on metadata and literature. Here, we evaluate the data landscape by directly investigating nine patient-level data sets generated in major clinical cohort studies. RESULTS The investigated cohorts differ in key characteristics, such as demographics and distributions of AD biomarkers. Analyzing the ethnoracial diversity revealed a strong bias toward White/Caucasian individuals. We described and compared the measured data modalities. Finally, the available longitudinal data for important AD biomarkers was evaluated. All results are explorable through our web application ADataViewer (https://adata.scai.fraunhofer.de). DISCUSSION Our evaluation exposed critical limitations in the AD data landscape that impede comparative approaches across multiple data sets. Comparison of our results to those gained by metadata-based approaches highlights that thorough investigation of real patient-level data is imperative to assess a data landscape.
|
384
|
Schwalbe N, Wahl B, Song J, Lehtimaki S. Data Sharing and Global Public Health: Defining What We Mean by Data. Front Digit Health 2020; 2:612339. [PMID: 34713073 PMCID: PMC8521885 DOI: 10.3389/fdgth.2020.612339] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/30/2020] [Accepted: 11/11/2020] [Indexed: 11/13/2022] Open
|
385
|
Waind E. Trust, security and public interest: striking the balance A narrative review of previous literature on public attitudes towards the sharing, linking and use of administrative data for research. Int J Popul Data Sci 2020; 5:1368. [PMID: 34036179 PMCID: PMC8127133 DOI: 10.23889/ijpds.v5i3.1368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022] Open
Abstract
This narrative literature review explores previous findings in relation to the UK public’s attitudes towards the sharing, linking and use of public sector administrative data for research. A total of 16 papers are included in the review, for which data was collected between the years 2006-2018. The review finds, on the basis of previous literature on the topic, that the public is broadly supportive of administrative data research if three core conditions are met: public interest, privacy and security, and trust and transparency. None of these conditions is sufficient in isolation; the literature shows public support is underpinned by fulfillment of all three. However, it also shows that in certain cases where the standard of one condition is very high – particularly public interest – this could mean the standard of another may, if necessary, be lower. An appropriate balance must be struck, and the proposed benefits of sharing and using data for research must outweigh the potential risks. Broad, conditional support for the use of administrative data in research has not only been found consistently, but has also been held over time. Most studies identified by this review have focused on exploring the views of the general public towards the acceptability of administrative data use in broad terms. However, with the exception of that related to healthcare data, the review identified little work focused on gaining input from relevant demographics and communities in relation to specific data types or areas of research. In addition to fulfilling the core conditions of public support identified by broader work, initiatives making use of administrative data should aim to seek the views of relevant sub-sectors of the public in the development of research in relation to specific issues.
|
386
|
Al-Ebbini L, Khabour OF, Alzoubi KH, Alkaraki AK. Biomedical Data Sharing Among Researchers: A Study from Jordan. J Multidiscip Healthc 2020; 13:1669-1676. [PMID: 33262602 PMCID: PMC7695599 DOI: 10.2147/jmdh.s284294] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/28/2020] [Accepted: 10/22/2020] [Indexed: 12/02/2022] Open
Abstract
Background Data sharing is an encouraged practice to support research in all fields. For that purpose, it is important to examine the perceptions and concerns of researchers about biomedical data sharing, which were investigated in the current study. Methods This is a cross-sectional survey study that was distributed among biomedical researchers in Jordan, as an example of a developing country. The study survey consisted of questions about demographics and about respondents’ attitudes toward sharing of biomedical data. Results Among study participants, 46.9% (n=82) were positive regarding making their research data available to the public, whereas 53.1% refused the idea. The reasons for refusing to publicly share their data included “lack of regulations” (33.5%), “access to research data should be limited to the research team” (29.5%), “no place to deposit the data” (6.5%), and “lack of funding for data deposition” (6.0%). Agreement with the idea of making data available was associated with academic rank (P=0.003). Moreover, gender (P-value=0.043) and number of publications (P-value=0.005) were associated with a time frame for data sharing (ie, agreeing to share data before vs after publication). Conclusion About half of the respondents reported a positive attitude toward biomedical data sharing. Proper regulations and facilitation of data deposition can enhance data sharing in Jordan.
|
387
|
Abstract
INTRODUCTION In light of the viral outbreak of SARS-CoV-2 that monopolized the focus of the scientific community and general public alike for the past 6 months, one of the greatest contributors in the battle against this pandemic was the international sharing of information. Whether regarding the viral genome, incubation periods, method of transmission, symptoms, dangerous behaviors, or age groups at risk, all information was valuable and all data were shared as soon as possible. AREAS COVERED Considering that the most severely impacted group of patients are already suffering from other conditions, assessing the impact that metabolic associated fatty liver disease (MAFLD), obesity, and diabetes have on patients by sharing information between different healthcare facilities is of vital importance. However, the value behind open information sharing would remain significant even without a viral outbreak, and should there be a more efficient infrastructure in place, the global exchange of data can become more practical and less arduous. EXPERT OPINION Since the sharing of data by individual researchers is often motivated by personal benefits, this observed international collaboration is conditional at best, and the widespread misinformation during this pandemic could be an indication of a certain lack of consensus within the scientific community itself.
|
388
|
Udesky JO, Boronow KE, Brown P, Perovich LJ, Brody JG. Perceived Risks, Benefits, and Interest in Participating in Environmental Health Studies That Share Personal Exposure Data: A U.S. Survey of Prospective Participants. J Empir Res Hum Res Ethics 2020; 15:425-442. [PMID: 32065041 PMCID: PMC7429332 DOI: 10.1177/1556264620903595] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/30/2023]
Abstract
Little is known about the willingness of prospective study participants to share environmental health data. To fill this gap, we conducted a hypothetical vignette survey among 1,575 women who have volunteered to be contacted about breast cancer studies. Eighty-three percent were interested in participating in the environmental studies, with little difference whether data were restricted to the research team, shared with approved researchers, or publicly accessible. However, participants somewhat preferred controlled access for children's data. Respondents were more interested in studies with environmental rather than biological samples and more interested when researchers would return personal results, a practice of increasing importance. They were more reluctant to share location or to participate if studies involved electronic medical records. Many expressed concerns about privacy, particularly security breaches, but reidentification risks were mentioned infrequently, indicating that this topic should be discussed during informed consent.
|
389
|
Kling SM, Harris HA, Marini M, Cook A, Hess LB, Lutcher S, Mowery J, Bell S, Hassink S, Hayward SB, Johnson G, Franceschelli Hosterman J, Paul IM, Seiler C, Sword S, Savage JS, Bailey-Davis L. Advanced Health Information Technologies to Engage Parents, Clinicians, and Community Nutritionists in Coordinating Responsive Parenting Care: Descriptive Case Series of the Women, Infants, and Children Enhancements to Early Healthy Lifestyles for Baby (WEE Baby) Care Randomized Controlled Trial. JMIR Pediatr Parent 2020; 3:e22121. [PMID: 33231559 PMCID: PMC7723742 DOI: 10.2196/22121] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/03/2020] [Revised: 10/08/2020] [Accepted: 10/25/2020] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Socioeconomically disadvantaged newborns receive care from primary care providers (PCPs) and Women, Infants, and Children (WIC) nutritionists. However, care is not coordinated between these settings, which can result in conflicting messages. Stakeholders support an integrated approach that coordinates services between settings with care tailored to patient-centered needs. OBJECTIVE This analysis describes the usability of advanced health information technologies aiming to engage parents in self-reporting parenting practices, integrate data into electronic health records to inform and facilitate documentation of provided responsive parenting (RP) care, and share data between settings to create opportunities to coordinate care between PCPs and WIC nutritionists. METHODS Parents and newborns (dyads) who were eligible for WIC care and received pediatric care in a single health system were recruited and randomized to a RP intervention or control group. For the 6-month intervention, electronic systems were created to facilitate documentation, data sharing, and coordination of provided RP care. Prior to PCP visits, parents were prompted to respond to the Early Healthy Lifestyles (EHL) self-assessment tool to capture current RP practices. Responses were integrated into the electronic health record and shared with WIC. Documentation of RP care and an 80-character, free-text comment were shared between WIC and PCPs. A care coordination opportunity existed when the dyad attended a WIC visit and these data were available from the PCP, and vice versa. Care coordination was demonstrated when WIC or PCPs interacted with data and documented RP care provided at the visit. RESULTS Dyads (N=131) attended 459 PCP (3.5, SD 1.0 per dyad) and 296 WIC (2.3, SD 1.0 per dyad) visits. 
Parents completed the EHL tool prior to 53.2% (244/459) of PCP visits (1.9, SD 1.2 per dyad), PCPs documented provided RP care at 35.3% (162/459) of visits, and data were shared with WIC following 100% (459/459) of PCP visits. A WIC visit followed a PCP visit 50.3% (231/459) of the time; thus, there were 1.8 (SD 0.8 per dyad) PCP to WIC care coordination opportunities. WIC coordinated care by documenting RP care at 66.7% (154/231) of opportunities (1.2, SD 0.9 per dyad). WIC visits were followed by a PCP visit 58.9% (116/197) of the time; thus, there were 0.9 (SD 0.8 per dyad) WIC to PCP care coordination opportunities. PCPs coordinated care by documenting RP care at 44.0% (51/116) of opportunities (0.4, SD 0.6 per dyad). CONCLUSIONS Results support the usability of advanced health information technology strategies to collect patient-reported data and share these data between multiple providers. Although PCPs and WIC shared data, WIC nutritionists were more likely to use data and document RP care to coordinate care than PCPs. Variability in timing, sequence, and frequency of visits underscores the need for flexibility in pragmatic studies. TRIAL REGISTRATION ClinicalTrials.gov NCT03482908; https://clinicaltrials.gov/ct2/show/NCT03482908. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.1186/s12887-018-1263-z.
|
390
|
Chen FZ, You LJ, Yang F, Wang LN, Guo XQ, Gao F, Hua C, Tan C, Fang L, Shan RQ, Zeng WJ, Wang B, Wang R, Xu X, Wei XF. CNGBdb: China National GeneBank DataBase. YI CHUAN = HEREDITAS 2020; 42:799-809. [PMID: 32952115 DOI: 10.16288/j.yczz.20-080] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Indexed: 10/31/2022]
Abstract
China National GeneBank DataBase (CNGBdb) is a data platform that aims to systematically archive and share multi-omics data in life science. As the service portal of the Bio-informatics Data Center within the core structure of China National GeneBank (CNGB), namely "Three Banks and Two Platforms", CNGBdb has the advantages of rich sample resources, data resources, and cooperation projects, together with powerful data computation and analysis capabilities. With the advent of high throughput sequencing technologies, research in life science has entered the big data era, which calls for closer international cooperation and data sharing. With the development of China's economy and the increase of investment in life science research, we need to establish a national public platform for data archiving and sharing in life science to promote the systematic management, application, and industrial utilization of these data. Currently, CNGBdb provides genomic data archiving, information search engines, data management, and data analysis services. The data schema of CNGBdb covers projects, samples, experiments, runs, assemblies, variations, and sequences. As of May 22, 2020, CNGBdb had archived 2176 research projects and more than 2221 TB of sequencing data submitted by researchers globally. In the future, CNGBdb will continue to be dedicated to promoting data sharing in life science research and improving its service capability. The CNGBdb website is: https://db.cngb.org/.
|
391
|
Hippen AA, Greene CS. Expanding and Remixing the Metadata Landscape. Trends Cancer 2020; 7:276-278. [PMID: 33229213 PMCID: PMC8324015 DOI: 10.1016/j.trecan.2020.10.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/03/2020] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 12/12/2022]
Abstract
Genomic data sharing accelerates research. Data are most valuable when they are accompanied by detailed metadata. To date, metadata are often human-annotated descriptions of samples and their handling. We discuss how machine learning-derived elements complement such descriptions to enhance the research ecosystem around genomic data.
|
392
|
Hamilton DG, Fraser H, Hoekstra R, Fidler F. Journal policies and editors' opinions on peer review. eLife 2020; 9:e62529. [PMID: 33211009 PMCID: PMC7717900 DOI: 10.7554/elife.62529] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/27/2020] [Accepted: 11/18/2020] [Indexed: 12/23/2022] Open
Abstract
Peer review practices differ substantially between journals and disciplines. This study presents the results of a survey of 322 editors of journals in ecology, economics, medicine, physics and psychology. We found that 49% of the journals surveyed checked all manuscripts for plagiarism, that 61% allowed authors to recommend both for and against specific reviewers, and that less than 6% used a form of open peer review. Most journals did not have an official policy on altering reports from reviewers, but 91% of editors identified at least one situation in which it was appropriate for an editor to alter a report. Editors were also asked for their views on five issues related to publication ethics. A majority expressed support for co-reviewing, reviewers requesting access to data, reviewers recommending citations to their work, editors publishing in their own journals, and replication studies. Our results provide a window into what is largely an opaque aspect of the scientific process. We hope the findings will inform the debate about the role and transparency of peer review in scholarly publishing.
|
393
|
Velasco I, Toharia P, Benavides-Piccione R, Fernaud-Espinosa I, Brito JP, Mata S, DeFelipe J, Pastor L, Bayona S. Neuronize v2: Bridging the Gap Between Existing Proprietary Tools to Optimize Neuroscientific Workflows. Front Neuroanat 2020; 14:585793. [PMID: 33192345 PMCID: PMC7646287 DOI: 10.3389/fnana.2020.585793] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/21/2020] [Accepted: 09/07/2020] [Indexed: 12/15/2022] Open
Abstract
Knowledge about neuron morphology is key to understanding brain structure and function. There are a variety of software tools that are used to segment and trace the neuron morphology. However, these tools usually utilize proprietary formats. This causes interoperability problems since the information extracted with one tool cannot be used in other tools. This article aims to improve neuronal reconstruction workflows by facilitating the interoperability between two of the most commonly used software tools—Neurolucida (NL) and Imaris (Filament Tracer). The new functionality has been included in an existing tool—Neuronize—giving rise to its second version. Neuronize v2 makes it possible to automatically use the data extracted with Imaris Filament Tracer to generate a tracing with dendritic spine information that can be read directly by NL. It also includes some other new features, such as the ability to unify and/or correct inaccurately-formed meshes (i.e., dendritic spines) and to calculate new metrics. This tool greatly facilitates the process of neuronal reconstruction, bridging the gap between existing proprietary tools to optimize neuroscientific workflows.
|
394
|
Savarese M, Johari M, Johnson K, Arumilli M, Torella A, Töpf A, Rubegni A, Kuhn M, Giugliano T, Gläser D, Fattori F, Thompson R, Penttilä S, Lehtinen S, Gibertini S, Ruggieri A, Mora M, Maver A, Peterlin B, Mankodi A, Lochmüller H, Santorelli FM, Schoser B, Fajkusová L, Straub V, Nigro V, Hackman P, Udd B. Improved Criteria for the Classification of Titin Variants in Inherited Skeletal Myopathies. J Neuromuscul Dis 2020; 7:153-166. [PMID: 32039858 DOI: 10.3233/jnd-190423] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022]
Abstract
BACKGROUND Extensive genetic screening results in the identification of thousands of rare variants that are difficult to interpret. Because of the sheer size of the titin gene (TTN), rare variants in it are detected frequently in any individual. Unambiguous interpretation of molecular findings is almost impossible in many patients with myopathies or cardiomyopathies. OBJECTIVE To refine the current classification framework for TTN-associated skeletal muscle disorders and standardize the interpretation of TTN variants. METHODS We used the guidelines issued by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) to re-analyze TTN genetic findings from our patient cohort. RESULTS We identified in the classification guidelines three rules that are not applicable to titin-related skeletal muscle disorders, six rules that require disease-/gene-specific adjustments, and four rules that require quantitative thresholds for proper use. In three cases, the rule strength needs to be modified. CONCLUSIONS We suggest that adjustments be made to the guidelines. We provide frequency thresholds to facilitate filtering of candidate causative variants and guidance for the use and interpretation of functional data and co-segregation evidence. We expect that the variant classification framework for TTN-related skeletal muscle disorders will be further improved along with a better understanding of these diseases.
|
395
|
El Emam K, Mosquera L, Bass J. Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation. J Med Internet Res 2020; 22:e23139. [PMID: 33196453 PMCID: PMC7704280 DOI: 10.2196/23139] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/02/2020] [Revised: 09/02/2020] [Accepted: 10/10/2020] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND There has been growing interest in data synthesis for enabling the sharing of data for secondary analysis; however, there is a need for a comprehensive privacy risk model for fully synthetic data: if the generative models have been overfit, then it is possible to identify individuals from synthetic data and learn something new about them. OBJECTIVE The purpose of this study is to develop and apply a methodology for evaluating the identity disclosure risks of fully synthetic data. METHODS A full risk model is presented, which evaluates both identity disclosure and the ability of an adversary to learn something new if there is a match between a synthetic record and a real person. We term this "meaningful identity disclosure risk." The model is applied to samples from the Washington State Hospital discharge database (2007) and the Canadian COVID-19 cases database. Both of these datasets were synthesized using a sequential decision tree process commonly used to synthesize health and social science data. RESULTS The meaningful identity disclosure risk for both of these synthesized samples was below the commonly used 0.09 risk threshold (0.0198 and 0.0086, respectively), and 4 times and 5 times lower than the risk values for the original datasets, respectively. CONCLUSIONS We have presented a comprehensive identity disclosure risk model for fully synthetic data. The results for this synthesis method on 2 datasets demonstrate that synthesis can reduce meaningful identity disclosure risks considerably. The risk model can be applied in the future to evaluate the privacy of fully synthetic data.
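The figures reported in the RESULTS can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' risk model: it compares the two reported synthetic-data risks with the 0.09 threshold and back-calculates rough original-data risks from the "4 times" and "5 times" reductions. Because those factors are rounded in the abstract, the back-calculated values are approximations, and the names `THRESHOLD`, `reported`, and `implied_original_risk` are hypothetical.

```python
# Illustrative arithmetic on the figures quoted in the abstract.
# This is NOT the paper's risk model; names here are hypothetical.

THRESHOLD = 0.09  # commonly used risk threshold cited in the abstract

# dataset name -> (reported synthetic risk, reported reduction factor)
reported = {
    "Washington State Hospital discharge (2007)": (0.0198, 4),
    "Canadian COVID-19 cases": (0.0086, 5),
}


def implied_original_risk(synthetic_risk: float, factor: float) -> float:
    """Rough original-data risk implied by a rounded reduction factor."""
    return synthetic_risk * factor


for name, (risk, factor) in reported.items():
    original = implied_original_risk(risk, factor)
    print(
        f"{name}: synthetic risk {risk:.4f} "
        f"(below threshold: {risk < THRESHOLD}), "
        f"implied original risk about {original:.4f}"
    )
```

The back-calculation is only as precise as the rounded factors allow, so the implied original risks should be read as order-of-magnitude context rather than as values reported by the study.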
|
396
|
Sullivan JA, Dumont JR, Memar S, Skirzewski M, Wan J, Mofrad MH, Ansari HZ, Li Y, Muller L, Prado VF, Prado MAM, Saksida LM, Bussey TJ. New frontiers in translational research: Touchscreens, open science, and the mouse translational research accelerator platform. Genes Brain Behav 2020; 20:e12705. [PMID: 33009724 DOI: 10.1111/gbb.12705] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/20/2020] [Revised: 09/03/2020] [Accepted: 09/29/2020] [Indexed: 12/18/2022]
Abstract
Many neurodegenerative and neuropsychiatric diseases and other brain disorders are accompanied by impairments in high-level cognitive functions including memory, attention, motivation, and decision-making. Despite several decades of extensive research, neuroscience is little closer to discovering new treatments. Key impediments include the absence of validated and robust cognitive assessment tools for facilitating translation from animal models to humans. In this review, we describe a state-of-the-art platform poised to overcome these impediments and improve the success of translational research, the Mouse Translational Research Accelerator Platform (MouseTRAP), which is centered on the touchscreen cognitive testing system for rodents. It integrates touchscreen-based tests of high-level cognitive assessment with state-of-the-art neurotechnology to record and manipulate molecular and circuit level activity in vivo in animal models during human-relevant cognitive performance. The platform is also integrated with two Open Science platforms designed to facilitate knowledge and data-sharing practices within the rodent touchscreen community, touchscreencognition.org and mousebytes.ca. Touchscreencognition.org includes the Wall, showcasing touchscreen news and publications, the Forum, for community discussion, and Training, which includes courses, videos, SOPs, and symposia. To get started, interested researchers simply create user accounts. We describe the origins of the touchscreen testing system, the novel lines of research it has facilitated, and its increasingly widespread use in translational research, which is attributable in part to knowledge-sharing efforts over the past decade. We then identify the unique features of MouseTRAP that stand to potentially revolutionize translational research, and describe new initiatives to partner with similar platforms such as McGill's M3 platform (m3platform.org).
|
397
|
Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-Identification of Medical Imaging: Part 1, General Principles. Can Assoc Radiol J 2020; 72:13-24. [PMID: 33138621 DOI: 10.1177/0846537120967349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
The application of big data, radiomics, machine learning, and artificial intelligence (AI) algorithms in radiology requires access to large data sets containing personal health information. Because machine learning projects often require collaboration between different sites or data transfer to a third party, precautions are required to safeguard patient privacy. Safety measures are required to prevent inadvertent access to and transfer of identifiable information. The Canadian Association of Radiologists (CAR) is the national voice of radiology committed to promoting the highest standards in patient-centered imaging, lifelong learning, and research. The CAR has created an AI Ethical and Legal standing committee with the mandate to guide the medical imaging community in terms of best practices in data management, access to health care data, de-identification, and accountability practices. Part 1 of this article will inform CAR members on principles of de-identification, pseudonymization, encryption, direct and indirect identifiers, k-anonymization, risks of reidentification, implementations, data set release models, and validation of AI algorithms, with a view to developing appropriate standards to safeguard patient information effectively.
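To make the k-anonymization idea mentioned in this abstract concrete, here is a minimal sketch, not taken from the white paper: a data set is k-anonymous with respect to a chosen set of indirect (quasi-) identifiers if every combination of their values is shared by at least k records. The function name `k_anonymity`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given quasi-identifiers (0 for an empty data set)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical imaging metadata after direct identifiers were removed.
records = [
    {"age_band": "40-49", "sex": "F", "modality": "CT"},
    {"age_band": "40-49", "sex": "F", "modality": "MR"},
    {"age_band": "50-59", "sex": "M", "modality": "CT"},
    {"age_band": "50-59", "sex": "M", "modality": "CT"},
    {"age_band": "60-69", "sex": "F", "modality": "US"},
]

k = k_anonymity(records, ["age_band", "sex"])
print(f"k = {k}")  # the single 60-69/F record makes this data set only 1-anonymous
```

A record that sits alone in its group is the one most exposed to reidentification; in practice such records are generalized further (for example, with wider age bands) or suppressed until the desired k is reached.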
|
398
|
Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-identification of Medical Imaging: Part 2, Practical Considerations. Can Assoc Radiol J 2020; 72:25-34. [PMID: 33140663 DOI: 10.1177/0846537120967345] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022] Open
Abstract
The application of big data, radiomics, machine learning, and artificial intelligence (AI) algorithms in radiology requires access to large data sets containing personal health information. Because machine learning projects often require collaboration between different sites or data transfer to a third party, precautions are required to safeguard patient privacy. Safety measures are required to prevent inadvertent access to and transfer of identifiable information. The Canadian Association of Radiologists (CAR) is the national voice of radiology committed to promoting the highest standards in patient-centered imaging, lifelong learning, and research. The CAR has created an AI Ethical and Legal standing committee with the mandate to guide the medical imaging community in terms of best practices in data management, access to health care data, de-identification, and accountability practices. Part 2 of this article will inform CAR members on the practical aspects of medical imaging de-identification, strengths and limitations of de-identification approaches, list of de-identification software and tools available, and perspectives on future directions.
|
399
|
Kurihara C, Baroutsou V, Becker S, Brun J, Franke-Bray B, Carlesi R, Chan A, Collia LF, Kleist P, Laranjeira LF, Matsuyama K, Naseem S, Schenk J, Silva H, Kerpel-Fronius S. Linking the Declarations of Helsinki and of Taipei: Critical Challenges of Future-Oriented Research Ethics. Front Pharmacol 2020; 11:579714. [PMID: 33324212 PMCID: PMC7723451 DOI: 10.3389/fphar.2020.579714] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/03/2020] [Accepted: 09/29/2020] [Indexed: 11/17/2022] Open
Abstract
Expansion of data-driven research in the 21st century has posed challenges in the evolution of the internationally agreed framework of research ethics. The World Medical Association (WMA)'s Declaration of Helsinki (DoH) has provided ethical principles for medical research involving humans since 1964, with the last update in 2013. To complement the DoH, the WMA issued the Declaration of Taipei (DoT) in 2016 to provide additional principles for health databases and biobanks. However, the ethical principles for secondary use of data or material obtained in research remain unclear. With such a perspective, the Working Group on Ethics (WGE) of the International Federation of Associations of Pharmaceutical Physicians and Pharmaceutical Medicine (IFAPP) suggests a closer scientific linkage of the DoH to the DoT, focusing specifically on areas that will facilitate data-driven research, and further strengthening of the protection of research participants.
|
400
|
Sharma A, Czerwinska KP, Brenna L, Johansen D, Johansen HD. Privacy Perceptions and Concerns in Image-Based Dietary Assessment Systems: Questionnaire-Based Study. JMIR Hum Factors 2020; 7:e19085. [PMID: 33055060 PMCID: PMC7596657 DOI: 10.2196/19085] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/03/2020] [Revised: 07/06/2020] [Accepted: 09/03/2020] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Complying with individual privacy perceptions is essential when processing personal information for research. Our specific research area is performance development of elite athletes, wherein nutritional aspects are important. Before adopting new automated tools that capture such data, it is crucial to understand and address the privacy concerns of the research subjects that are to be studied. Privacy as contextual integrity emphasizes understanding contextual sensitivity in an information flow. In this study, we explore privacy perceptions in image-based dietary assessments. This research field lacks empirical evidence on what will be considered as privacy violations when exploring trends in long-running studies. Prior studies have only classified images as either private or public depending on their basic content. An assessment and analysis are thus needed to prevent unwanted consequences of privacy breaches and other issues perceived as sensitive when designing systems for dietary assessment by using food images. OBJECTIVE The aim of this study was to investigate common perceptions of computer systems using food images for dietary assessment. The study delves into perceived risks and data-sharing behaviors. METHODS We investigated the privacy perceptions of 105 individuals by using a web-based survey. We analyzed these perceptions along with perceived risks in sharing dietary information with third parties. RESULTS We found that understanding the motive behind the use of data increases the chance that the data will be shared with a social group. CONCLUSIONS In this study, we highlight various privacy concerns that can be addressed during the design phase. A system design that is compliant with general data protection regulations will increase participants' and stakeholders' trust in an image-based dietary assessment system. Innovative solutions are needed to reduce the intrusiveness of a continuous assessment. Individuals show varying behaviors for sharing metadata, as knowing what the data is being used for increases the chance of it being shared.
|