1. Rueden CT, Schindelin J, Hiner MC, DeZonia BE, Walter AE, Arena ET, Eliceiri KW. ImageJ2: ImageJ for the next generation of scientific image data. BMC Bioinformatics 2017; 18:529. [PMID: 29187165 PMCID: PMC5708080 DOI: 10.1186/s12859-017-1934-z] [Citation(s) in RCA: 3290] [Impact Index Per Article: 411.3]
Abstract
BACKGROUND ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plug-in architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms, to ensure the software's ability to handle the requirements of modern science. RESULTS We rewrote the entire ImageJ codebase, engineering a redesigned plugin mechanism intended to facilitate extensibility at every level, with the goal of creating a more powerful tool that continues to serve the existing community while addressing a wider range of scientific requirements. This next-generation ImageJ, called "ImageJ2" in places where the distinction matters, provides a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats, to scripting languages, to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained such that this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to these new methods at their own pace. CONCLUSIONS Scientific imaging benefits from open-source programs that advance new method development and deployment to a diverse audience. ImageJ has continuously evolved with this idea in mind; however, new and emerging scientific requirements have posed corresponding challenges for ImageJ's development. The described improvements provide a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs. Future efforts will focus on implementing new algorithms in this framework and expanding collaborations with other popular scientific software suites.
Publication type: research-article
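A minimal sketch of scripting ImageJ2 from Python through the pyimagej bridge, which builds on the interoperability and backwards-compatibility points in the abstract above. The ImageJ2 version string, headless mode, and image path are placeholder assumptions, not values from the paper.

```python
# Minimal sketch (assumes pyimagej is installed with a JDK/Maven available;
# the version string and image path are placeholders).
import imagej

ij = imagej.init("2.14.0", mode="headless")   # download/initialize ImageJ2
print("ImageJ2 version:", ij.getVersion())

# The legacy layer keeps classic ImageJ macros working, per the abstract's
# backwards-compatibility point: open an image and blur it.
macro = """
open("/path/to/image.tif");
run("Gaussian Blur...", "sigma=2");
"""
ij.py.run_macro(macro)
```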
2. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R, Glanville J, Grimshaw JM, Hróbjartsson A, Lalu MM, Li T, Loder EW, Mayo-Wilson E, McDonald S, McGuinness LA, Stewart LA, Thomas J, Tricco AC, Welch VA, Whiting P, Moher D. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. J Clin Epidemiol 2021; 134:178-189. [PMID: 33789819 DOI: 10.1016/j.jclinepi.2021.03.001] [Citation(s) in RCA: 1247] [Impact Index Per Article: 311.8]
Abstract
The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement, published in 2009, was designed to help systematic reviewers transparently report why the review was done, what the authors did, and what they found. Over the past decade, advances in systematic review methodology and terminology have necessitated an update to the guideline. The PRISMA 2020 statement replaces the 2009 statement and includes new reporting guidance that reflects advances in methods to identify, select, appraise, and synthesise studies. The structure and presentation of the items have been modified to facilitate implementation. In this article, we present the PRISMA 2020 27-item checklist, an expanded checklist that details reporting recommendations for each item, the PRISMA 2020 abstract checklist, and the revised flow diagrams for original and updated reviews.
Publication type: Journal Article
3. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Moher D. Updating guidance for reporting systematic reviews: development of the PRISMA 2020 statement. J Clin Epidemiol 2021; 134:103-112. [PMID: 33577987 DOI: 10.1016/j.jclinepi.2021.02.003] [Citation(s) in RCA: 1130] [Impact Index Per Article: 282.5]
Abstract
OBJECTIVES To describe the processes used to update the PRISMA 2009 statement for reporting systematic reviews, present results of a survey conducted to inform the update, summarize decisions made at the PRISMA update meeting, and describe and justify changes made to the guideline. METHODS We reviewed 60 documents with reporting guidance for systematic reviews to generate suggested modifications to the PRISMA 2009 statement. We invited 220 systematic review methodologists and journal editors to complete a survey about the suggested modifications. The results of these projects were discussed at a 21-member in-person meeting. Following the meeting, we drafted the PRISMA 2020 statement and refined it based on feedback from co-authors and a convenience sample of 15 systematic reviewers. RESULTS The review of 60 documents revealed that all topics addressed by the PRISMA 2009 statement could be modified. Of the 110 survey respondents, more than 66% recommended keeping six of the original checklist items as they were and modifying 15 of them using wording suggested by us. Attendees at the in-person meeting supported the revised wording for several items but suggested rewording for most to enhance clarity, and further refinements were made over six drafts of the guideline. CONCLUSIONS The PRISMA 2020 statement consists of updated reporting guidance for systematic reviews. We hope that providing this detailed description of the development process will enhance the acceptance and uptake of the guideline and assist those developing and updating future reporting guidelines.
Publication type: Journal Article
4. Rethlefsen ML, Kirtley S, Waffenschmidt S, Ayala AP, Moher D, Page MJ, Koffel JB. PRISMA-S: an extension to the PRISMA Statement for Reporting Literature Searches in Systematic Reviews. Syst Rev 2021; 10:39. [PMID: 33499930 PMCID: PMC7839230 DOI: 10.1186/s13643-020-01542-z] [Citation(s) in RCA: 1109] [Impact Index Per Article: 277.3]
Abstract
BACKGROUND Literature searches underlie the foundations of systematic reviews and related review types. Yet, the literature searching component of systematic reviews and related review types is often poorly reported. Guidance for literature search reporting has been diverse, and, in many cases, does not offer enough detail to authors who need more specific information about reporting search methods and information sources in a clear, reproducible way. This document presents the PRISMA-S (Preferred Reporting Items for Systematic reviews and Meta-Analyses literature search extension) checklist, and explanation and elaboration. METHODS The checklist was developed using a 3-stage Delphi survey process, followed by a consensus conference and public review process. RESULTS The final checklist includes 16 reporting items, each of which is detailed with exemplar reporting and rationale. CONCLUSIONS The intent of PRISMA-S is to complement the PRISMA Statement and its extensions by providing a checklist that could be used by interdisciplinary authors, editors, and peer reviewers to verify that each component of a search is completely reported and therefore reproducible.
Publication type: Research Support, N.I.H., Extramural
5. Otasek D, Morris JH, Bouças J, Pico AR, Demchak B. Cytoscape Automation: empowering workflow-based network analysis. Genome Biol 2019; 20:185. [PMID: 31477170 PMCID: PMC6717989 DOI: 10.1186/s13059-019-1758-4] [Citation(s) in RCA: 989] [Impact Index Per Article: 164.8]
Abstract
Cytoscape is one of the most successful network biology analysis and visualization tools, but because of its interactive nature, its role in creating reproducible, scalable, and novel workflows has been limited. We describe Cytoscape Automation (CA), which marries Cytoscape to highly productive workflow systems, for example, Python/R in Jupyter/RStudio. We expose over 270 Cytoscape core functions and 34 Cytoscape apps as REST-callable functions with standardized JSON interfaces backed by Swagger documentation. Independent projects to create and publish Python/R native CA interface libraries have reached an advanced stage, and a number of automation workflows are already published.
Publication type: Research Support, N.I.H., Extramural
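A minimal sketch of the REST-callable interface described above, using plain HTTP requests against a locally running Cytoscape with CyREST on its default port (1234). The tiny Cytoscape.js-style payload is illustrative only.

```python
# Requires a running Cytoscape desktop with the CyREST API enabled.
import requests

BASE = "http://localhost:1234/v1"

# Confirm the app is reachable and report API/application versions.
print(requests.get(f"{BASE}/version").json())

# Create a small network from Cytoscape.js-style JSON (illustrative payload).
network = {
    "data": {"name": "demo network"},
    "elements": {
        "nodes": [{"data": {"id": "A"}}, {"data": {"id": "B"}}],
        "edges": [{"data": {"source": "A", "target": "B"}}],
    },
}
resp = requests.post(f"{BASE}/networks", json=network)
print(resp.json())  # CyREST returns the SUID of the newly created network
```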
6. Zuo XN, Xing XX. Test-retest reliabilities of resting-state FMRI measurements in human brain functional connectomics: a systems neuroscience perspective. Neurosci Biobehav Rev 2014; 45:100-18. [PMID: 24875392 DOI: 10.1016/j.neubiorev.2014.05.009] [Citation(s) in RCA: 505] [Impact Index Per Article: 45.9]
Abstract
Resting-state functional magnetic resonance imaging (RFMRI) enables researchers to monitor fluctuations in the spontaneous brain activities of thousands of regions in the human brain simultaneously, representing a popular tool for macro-scale functional connectomics to characterize normal brain function, mind-brain associations, and various disorders. However, the test-retest reliability of RFMRI remains largely unknown. We review previously published papers on the test-retest reliability of voxel-wise metrics and conduct a meta-summary reliability analysis of seven common brain networks. This analysis revealed that the heteromodal associative (default, control, and attention) networks were the most reliable of the seven networks. Regarding the examined metrics, independent component analysis with dual regression, local functional homogeneity, and functional homotopic connectivity were the three most reliable RFMRI metrics. These observations can guide the use of reliable metrics and the further improvement of test-retest reliability for other metrics in functional connectomics. We discuss the main issues with low reliability related to sub-optimal design and the choice of data processing options. Future research should use large-sample test-retest data to rectify both the within-subject and between-subject variability of RFMRI measurements and accelerate the application of functional connectomics.
Publication type: Review
7. Debray TPA, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KGM. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol 2014; 68:279-89. [PMID: 25179855 DOI: 10.1016/j.jclinepi.2014.06.018] [Citation(s) in RCA: 393] [Impact Index Per Article: 35.7]
Abstract
OBJECTIVES It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. STUDY DESIGN AND SETTING We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. RESULTS We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. CONCLUSION The proposed framework enhances the interpretation of findings at external validation of prediction models.
Publication type: Research Support, Non-U.S. Gov't
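The entry above concerns interpreting external validation performance; the sketch below shows two standard validation summaries (c-statistic and calibration intercept/slope) on placeholder data. It is a generic illustration, not the authors' relatedness framework.

```python
# Generic external-validation summaries for a logistic prediction model,
# computed on placeholder data: discrimination (c-statistic) plus calibration
# intercept and slope in the validation sample.
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.90, size=500)   # predicted probabilities from the model
y = rng.binomial(1, p)                  # observed outcomes in the validation sample
lp = np.log(p / (1 - p))                # linear predictor (logit of p)

c_statistic = roc_auc_score(y, p)

# Calibration-in-the-large: intercept with the linear predictor as offset;
# calibration slope: coefficient of the linear predictor.
intercept_fit = sm.GLM(y, np.ones((len(lp), 1)), family=sm.families.Binomial(),
                       offset=lp).fit()
slope_fit = sm.GLM(y, sm.add_constant(lp), family=sm.families.Binomial()).fit()

print(f"c-statistic {c_statistic:.3f}, "
      f"calibration intercept {intercept_fit.params[0]:.3f}, "
      f"calibration slope {slope_fit.params[1]:.3f}")
```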
8. Hugenholtz F, de Vos WM. Mouse models for human intestinal microbiota research: a critical evaluation. Cell Mol Life Sci 2018; 75:149-160. [PMID: 29124307 PMCID: PMC5752736 DOI: 10.1007/s00018-017-2693-8] [Citation(s) in RCA: 372] [Impact Index Per Article: 53.1]
Abstract
Since the early days of intestinal microbiota research, mouse models have been used frequently to study the interaction of microbes with their host. However, to translate the knowledge gained from mouse studies to the human situation, the major spatio-temporal similarities and differences between intestinal microbiota in mice and humans need to be considered. This is done here with specific attention to the comparative physiology of the intestinal tract, the effect of dietary patterns, and differences in genetics. Detailed phylogenetic and metagenomic analysis showed that while many common genera are found in the human and murine intestine, these differ strongly in abundance, and in total only 4% of the bacterial genes are found to share considerable identity. Moreover, a large variety of murine strains is available, yet most microbiota research is performed in wild-type, inbred strains and their transgenic derivatives. It has become increasingly clear that the providers, rearing facilities, and genetic background of these mice have a significant impact on the microbial composition, and this is illustrated with recent experimental data. This may affect the reproducibility of mouse microbiota studies and their conclusions. Hence, future studies should take these factors into account to truly show the effect of diet, genotype, or environmental factors on the microbial composition.
Publication type: Review
9. Bittner T, Zetterberg H, Teunissen CE, Ostlund RE, Militello M, Andreasson U, Hubeek I, Gibson D, Chu DC, Eichenlaub U, Heiss P, Kobold U, Leinenbach A, Madin K, Manuilova E, Rabe C, Blennow K. Technical performance of a novel, fully automated electrochemiluminescence immunoassay for the quantitation of β-amyloid (1-42) in human cerebrospinal fluid. Alzheimers Dement 2015; 12:517-26. [PMID: 26555316 DOI: 10.1016/j.jalz.2015.09.009] [Citation(s) in RCA: 268] [Impact Index Per Article: 26.8]
Abstract
INTRODUCTION Available assays for quantitation of the Alzheimer's disease (AD) biomarker amyloid-beta 1-42 (Aβ [1-42]) in cerebrospinal fluid demonstrate significant variability and lack of standardization to reference measurement procedures (RMPs). We report analytical performance data for the novel Elecsys β-amyloid (1-42) assay (Roche Diagnostics). METHODS Lot-to-lot comparability was tested using method comparison. Performance parameters were measured according to Clinical & Laboratory Standards Institute (CLSI) guidelines. The assay was standardized to a Joint Committee for Traceability in Laboratory Medicine (JCTLM) approved RMP. RESULTS Limit of quantitation was <11.28 pg/mL, and the assay was linear throughout the measuring range (200-1700 pg/mL). Excellent lot-to-lot comparability was observed (correlation coefficients [Pearson's r] >0.995; bias in medical decision area <2%). Repeatability coefficients of variation (CVs) were 1.0%-1.6%, intermediate CVs were 1.9%-4.0%, and intermodule CVs were 1.1%-3.9%. Estimated total reproducibility was 2.0%-5.1%. Correlation with the RMP was good (Pearson's r, 0.93). DISCUSSION The Elecsys β-amyloid (1-42) assay has high analytical performance that may improve biomarker-based AD diagnosis.
Publication type: Research Support, Non-U.S. Gov't
10. Bosco E, Hsueh L, McConeghy KW, Gravenstein S, Saade E. Major adverse cardiovascular event definitions used in observational analysis of administrative databases: a systematic review. BMC Med Res Methodol 2021; 21:241. [PMID: 34742250 PMCID: PMC8571870 DOI: 10.1186/s12874-021-01440-5] [Citation(s) in RCA: 250] [Impact Index Per Article: 62.5]
Abstract
Background Major adverse cardiovascular events (MACE) are increasingly used as composite outcomes in randomized controlled trials (RCTs) and observational studies. However, it is unclear how observational studies most commonly define MACE in the literature when using administrative data. Methods We identified peer-reviewed articles published in MEDLINE and EMBASE between January 1, 2010 and October 9, 2020. Studies utilizing administrative data to assess the MACE composite outcome using International Classification of Diseases 9th or 10th Revision diagnosis codes were included. Reviews, abstracts, and studies not providing outcome code definitions were excluded. Data extracted included data source, timeframe, MACE components, code definitions, code positions, and outcome validation. Results A total of 920 articles were screened, 412 were retained for full-text review, and 58 were included. Only 8.6% (n = 5/58) matched the traditional three-point MACE RCT definition of acute myocardial infarction (AMI), stroke, or cardiovascular death. None matched the four-point (+unstable angina) or five-point MACE (+unstable angina and heart failure) definitions. The most common MACE components were: AMI and stroke, 15.5% (n = 9/58); AMI, stroke, and all-cause death, 13.8% (n = 8/58); and AMI, stroke, and cardiovascular death, 8.6% (n = 5/58). Further, 67% (n = 39/58) did not validate outcomes or cite validation studies. Additionally, 70.7% (n = 41/58) did not report the code positions of endpoints, 20.7% (n = 12/58) used the primary position, and 8.6% (n = 5/58) used any position. Conclusions The components of MACE endpoints and the diagnostic codes used varied widely across observational studies. Variability in the MACE definitions used and the information reported prohibits the comparison, replication, and aggregation of findings. Studies should transparently report the administrative codes used and the code positions, and should use validated outcome definitions when possible.
Publication type: Systematic Review
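The review above highlights how variable and under-reported MACE code definitions are; the sketch below shows the mechanics of a three-point MACE flag with an explicit choice of diagnosis position. The ICD-10 prefixes are illustrative assumptions, not a validated definition.

```python
# Illustrative only: the ICD-10 prefixes below are examples, NOT a validated
# outcome definition; real analyses should cite validated code sets and state
# which diagnosis positions were searched.
AMI_PREFIXES = ("I21",)                         # acute myocardial infarction
STROKE_PREFIXES = ("I60", "I61", "I63", "I64")  # hemorrhagic and ischemic stroke

def has_code(diagnoses, prefixes, primary_only=False):
    """diagnoses: ICD-10 codes ordered by position (index 0 = primary)."""
    candidates = diagnoses[:1] if primary_only else diagnoses
    return any(code.startswith(prefixes) for code in candidates)

def three_point_mace(diagnoses, cardiovascular_death, primary_only=True):
    """Classic three-point MACE: AMI, stroke, or cardiovascular death."""
    return (cardiovascular_death
            or has_code(diagnoses, AMI_PREFIXES, primary_only)
            or has_code(diagnoses, STROKE_PREFIXES, primary_only))

# Example claim with primary diagnosis I21.4 (NSTEMI) and a secondary code.
print(three_point_mace(["I21.4", "E11.9"], cardiovascular_death=False))  # True
```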
11. Park JE, Park SY, Kim HJ, Kim HS. Reproducibility and Generalizability in Radiomics Modeling: Possible Strategies in Radiologic and Statistical Perspectives. Korean J Radiol 2019; 20:1124-1137. [PMID: 31270976 PMCID: PMC6609433 DOI: 10.3348/kjr.2018.0070] [Citation(s) in RCA: 234] [Impact Index Per Article: 39.0]
Abstract
Radiomics, which involves the use of high-dimensional quantitative imaging features for predictive purposes, is a powerful tool for developing and testing medical hypotheses. Radiologic and statistical challenges in radiomics include those related to the reproducibility of imaging data, control of overfitting due to high dimensionality, and the generalizability of modeling. The aims of this review article are to clarify the distinctions between radiomics features and other omics and imaging data, to describe the challenges and potential strategies in reproducibility and feature selection, and to reveal the epidemiological background of modeling, thereby facilitating and promoting more reproducible and generalizable radiomics research.
Publication type: Review
12. Bradbury KE, Young HJ, Guo W, Key TJ. Dietary assessment in UK Biobank: an evaluation of the performance of the touchscreen dietary questionnaire. J Nutr Sci 2018; 7:e6. [PMID: 29430297 PMCID: PMC5799609 DOI: 10.1017/jns.2017.66] [Citation(s) in RCA: 220] [Impact Index Per Article: 31.4]
Abstract
UK Biobank is an open access prospective cohort of 500 000 men and women. Information on the frequency of consumption of main foods was collected at recruitment with a touchscreen questionnaire; prior to examining the associations between diet and disease, it is essential to evaluate the performance of the dietary touchscreen questionnaire. The objectives of the present paper are to: describe the repeatability of the touchscreen questionnaire in participants (n 20 348) who repeated the assessment centre visit approximately 4 years after recruitment, and compare the dietary touchscreen variables with mean intakes from participants (n 140 080) who completed at least one of the four web-based 24-h dietary assessments post-recruitment. For fish and meat items, 90 % or more of participants reported the same or adjacent category of intake at the repeat assessment visit; for vegetables and fruit, and for a derived partial fibre score (in fifths), 70 % or more of participants were classified into the same or adjacent category of intake (κweighted > 0·50 for all). Participants were also categorised based on their responses to the dietary touchscreen questionnaire at recruitment, and within each category the group mean intake of the same food group or nutrient from participants who had completed at least one web-based 24-h dietary assessment was calculated. The comparison showed that the dietary touchscreen variables, available on the full cohort, reliably rank participants according to intakes of the main food groups.
Publication type: research-article
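A small sketch of the weighted-kappa repeatability summary reported above, computed here with scikit-learn on invented intake-category data.

```python
# Weighted kappa for a categorical dietary question answered twice, in the
# spirit of the paper's kappa_weighted summaries. Categories and data are invented.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
baseline = rng.integers(0, 5, size=1000)     # intake category at recruitment (0-4)
shift = rng.integers(-1, 2, size=1000)       # mostly same or adjacent category
repeat = np.clip(baseline + shift, 0, 4)     # category at the repeat visit

kappa_w = cohen_kappa_score(baseline, repeat, weights="linear")
same_or_adjacent = np.mean(np.abs(baseline - repeat) <= 1)
print(f"weighted kappa = {kappa_w:.2f}; same or adjacent category = {same_or_adjacent:.0%}")
```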
13. Volpato V, Webber C. Addressing variability in iPSC-derived models of human disease: guidelines to promote reproducibility. Dis Model Mech 2020; 13:dmm042317. [PMID: 31953356 PMCID: PMC6994963 DOI: 10.1242/dmm.042317] [Citation(s) in RCA: 209] [Impact Index Per Article: 41.8]
Abstract
Induced pluripotent stem cell (iPSC) technologies have provided in vitro models of inaccessible human cell types, yielding new insights into disease mechanisms especially for neurological disorders. However, without due consideration, the thousands of new human iPSC lines generated in the past decade will inevitably affect the reproducibility of iPSC-based experiments. Differences between donor individuals, genetic stability and experimental variability contribute to iPSC model variation by impacting differentiation potency, cellular heterogeneity, morphology, and transcript and protein abundance. Such effects will confound reproducible disease modelling in the absence of appropriate strategies. In this Review, we explore the causes and effects of iPSC heterogeneity, and propose approaches to detect and account for experimental variation between studies, or even exploit it for deeper biological insight.
Publication type: Review
14. Zwanenburg A. Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis. Eur J Nucl Med Mol Imaging 2019; 46:2638-2655. [PMID: 31240330 DOI: 10.1007/s00259-019-04391-8] [Citation(s) in RCA: 196] [Impact Index Per Article: 32.7]
Abstract
Radiomics in nuclear medicine is rapidly expanding. Reproducibility of radiomics studies in multicentre settings is an important criterion for clinical translation. We therefore performed a meta-analysis to investigate reproducibility of radiomics biomarkers in PET imaging and to obtain quantitative information regarding their sensitivity to variations in various imaging and radiomics-related factors as well as their inherent sensitivity. Additionally, we identify and describe data analysis pitfalls that affect the reproducibility and generalizability of radiomics studies. After a systematic literature search, 42 studies were included in the qualitative synthesis, and data from 21 were used for the quantitative meta-analysis. Data concerning measurement agreement and reliability were collected for 21 of 38 different factors associated with image acquisition, reconstruction, segmentation and radiomics-specific processing steps. Variations in voxel size, segmentation and several reconstruction parameters strongly affected reproducibility, but the level of evidence remained weak. Based on the meta-analysis, we also assessed inherent sensitivity to variations of 110 PET image biomarkers. SUVmean and SUVmax were found to be reliable, whereas image biomarkers based on the neighbourhood grey tone difference matrix and most biomarkers based on the size zone matrix were found to be highly sensitive to variations, and should be used with care in multicentre settings. Lastly, we identify 11 data analysis pitfalls. These pitfalls concern model validation and information leakage during model development, but also relate to reporting and the software used for data analysis. Avoiding such pitfalls is essential for minimizing bias in the results and to enable reproduction and validation of radiomics studies.
Publication type: Review
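One of the data analysis pitfalls discussed above is information leakage during model validation; the sketch below shows the usual remedy of keeping feature selection inside each cross-validation fold. The data are synthetic and the estimator choices are assumptions.

```python
# Leakage-safe validation of a radiomics-style model: feature selection lives
# inside a Pipeline, so it is re-fit within every cross-validation fold rather
# than on the full dataset.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 500))      # many features, few samples (radiomics-like)
y = rng.integers(0, 2, size=80)     # labels unrelated to the features

model = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),   # re-fit inside every CV fold
    ("clf", LogisticRegression(max_iter=1000)),
])

# Because selection never sees the held-out fold, the AUC stays near chance (~0.5);
# selecting features on the full dataset first would inflate it.
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("cross-validated AUC:", scores.mean().round(2))
```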
15. Clayton JA. Applying the new SABV (sex as a biological variable) policy to research and clinical care. Physiol Behav 2017; 187:2-5. [PMID: 28823546 DOI: 10.1016/j.physbeh.2017.08.012] [Citation(s) in RCA: 178] [Impact Index Per Article: 22.3]
Abstract
Sex as a biological variable (SABV) is a key part of the new National Institutes of Health (NIH) initiative to enhance reproducibility through rigor and transparency. The SABV policy requires researchers to factor sex into the design, analysis, and reporting of vertebrate animal and human studies. The policy was implemented as it has become increasingly clear that male/female differences extend well beyond reproductive and hormonal issues. Implementation of the policy is also meant to address inattention to sex influences in biomedical research. Sex affects: cell physiology, metabolism, and many other biological functions; symptoms and manifestations of disease; and responses to treatment. For example, sex has profound influences in neuroscience, from circuitry to physiology to pain perception. Extending beyond the robust efforts of NIH to ensure that women are included in clinical trials, the SABV policy also includes rigorous preclinical experimental designs that inform clinical research. Additionally, the NIH has engaged journal editors and publishers to facilitate reproducibility by addressing rigor and promoting transparency through scientifically appropriate sex-specific study results reporting. The Sex And Gender Equity in Research (SAGER) guidelines were developed to assist researchers and journal editors in reporting sex and gender information in publications [1].
Publication type: Journal Article
16. Collins CE, Boggess MM, Watson JF, Guest M, Duncanson K, Pezdirc K, Rollo M, Hutchesson MJ, Burrows TL. Reproducibility and comparative validity of a food frequency questionnaire for Australian adults. Clin Nutr 2013; 33:906-14. [PMID: 24144913 DOI: 10.1016/j.clnu.2013.09.015] [Citation(s) in RCA: 172] [Impact Index Per Article: 14.3]
Abstract
BACKGROUND Food frequency questionnaires (FFQ) are used in epidemiological studies to investigate the relationship between diet and disease. There is a need for a valid and reliable adult FFQ with a contemporary food list in Australia. AIMS To evaluate the reproducibility and comparative validity of the Australian Eating Survey (AES) FFQ in adults compared with weighed food records (WFRs). METHODS Two rounds of AES and three-day WFRs were conducted in 97 adults (31 males; median age and BMI 44.9 years and 26.2 kg/m² for males, 41.3 years and 24.0 kg/m² for females). Reproducibility was assessed over six months using Wilcoxon signed-rank tests, and comparative validity was assessed by intraclass correlation coefficients (ICC) estimated by fitting a mixed effects model for each nutrient, accounting for age, sex and BMI, to allow estimation of between- and within-person variance. RESULTS Reproducibility was found to be good for both WFR and FFQ, since there were no significant differences between the round 1 and round 2 administrations. For comparative validity, FFQ ICCs were at least as large as those for WFR. The ICC of the WFR-FFQ difference for total energy intake was 0.6 (95% CI 0.43, 0.77), and the median ICC across all nutrients was 0.47, with all ICCs between 0.15 (%E from saturated fat) and 0.7 (g/day sugars). CONCLUSIONS Compared with WFR, the AES FFQ is suitable for reliably estimating the dietary intakes of Australian adults across a wide range of nutrients.
Publication type: Validation Study
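A sketch of the ICC estimation strategy described in the methods above: a mixed-effects model with age, sex and BMI as fixed effects and a random intercept per participant, with the ICC formed from the between- and within-person variance components. Column names and data are placeholders.

```python
# ICC for a repeated nutrient intake from a mixed-effects model, in the spirit
# of the analysis described above. Data are simulated placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for i in range(97):                                # 97 participants, 2 visits each
    age, sex, bmi = 40 + rng.normal(0, 10), rng.integers(0, 2), 25 + rng.normal(0, 3)
    person_effect = rng.normal(0, 1.0)             # stable between-person component
    for visit in (1, 2):
        rows.append({"id": i, "age": age, "sex": sex, "bmi": bmi,
                     "energy": 8.5 + person_effect + rng.normal(0, 0.8)})
df = pd.DataFrame(rows)

fit = smf.mixedlm("energy ~ age + sex + bmi", df, groups=df["id"]).fit()
var_between = fit.cov_re.iloc[0, 0]   # random-intercept (between-person) variance
var_within = fit.scale                # residual (within-person) variance
print("ICC:", round(var_between / (var_between + var_within), 2))
```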
17. OECD validation study to assess intra- and inter-laboratory reproducibility of the zebrafish embryo toxicity test for acute aquatic toxicity testing. Regul Toxicol Pharmacol 2014; 69:496-511. [PMID: 24874798 DOI: 10.1016/j.yrtph.2014.05.018] [Citation(s) in RCA: 171] [Impact Index Per Article: 15.5]
Abstract
The OECD validation study of the zebrafish embryo acute toxicity test (ZFET) for acute aquatic toxicity testing evaluated the ZFET reproducibility by testing 20 chemicals at 5 different concentrations in 3 independent runs in at least 3 laboratories. Stock solutions and test concentrations were analytically confirmed for 11 chemicals. Newly fertilised zebrafish eggs (20/concentration and control) were exposed for 96h to chemicals. Four apical endpoints were recorded daily as indicators of acute lethality: coagulation of the embryo, lack of somite formation, non-detachment of the tail bud from the yolk sac and lack of heartbeat. Results (LC50 values for 48/96h exposure) show that the ZFET is a robust method with a good intra- and inter-laboratory reproducibility (CV<30%) for most chemicals and laboratories. The reproducibility was lower (CV>30%) for some very toxic or volatile chemicals, and chemicals tested close to their limit of solubility. The ZFET is now available as OECD Test Guideline 236. Considering the high predictive capacity of the ZFET demonstrated by Belanger et al. (2013) in their retrospective analysis of acute fish toxicity and fish embryo acute toxicity data, the ZFET is ready to be considered for acute fish toxicity for regulatory purposes.
Publication type: Research Support, Non-U.S. Gov't
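A sketch of deriving an LC50 from ZFET-style mortality counts by fitting a log-logistic concentration-response curve; the concentrations, counts, and model form are illustrative assumptions rather than the OECD analysis procedure.

```python
# Fit a two-parameter log-logistic curve to mortality fractions and read off
# the LC50. All numbers are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0])   # mg/L, five test concentrations
dead = np.array([1, 3, 9, 16, 20])            # dead embryos out of 20 per concentration
frac = dead / 20.0

def log_logistic(c, lc50, slope):
    """Two-parameter log-logistic concentration-response curve."""
    return 1.0 / (1.0 + np.exp(-slope * (np.log10(c) - np.log10(lc50))))

(lc50, slope), _ = curve_fit(log_logistic, conc, frac, p0=[1.0, 2.0])
print(f"estimated LC50 ~ {lc50:.2f} mg/L (slope {slope:.1f})")
```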
18. Cock PJA, Chilton JM, Grüning B, Johnson JE, Soranzo N. NCBI BLAST+ integrated into Galaxy. Gigascience 2015; 4:39. [PMID: 26336600 PMCID: PMC4557756 DOI: 10.1186/s13742-015-0080-7] [Citation(s) in RCA: 165] [Impact Index Per Article: 16.5]
Abstract
Background The NCBI BLAST suite has become ubiquitous in modern molecular biology and is used for small tasks such as checking capillary sequencing results of single PCR products, genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. Findings The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy. Appropriate datatypes were defined as needed. The integration of the BLAST+ tool suite into Galaxy has the goal of making common BLAST tasks easy and advanced tasks possible. Conclusions This project is an informal international collaborative effort, and is deployed and used on Galaxy servers worldwide. Several examples of applications are described here.
Publication type: Research Support, Non-U.S. Gov't
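The Galaxy wrappers described above expose the standard BLAST+ command-line tools; the sketch below shows the equivalent direct blastn call. It assumes BLAST+ is installed and that a database has been built with makeblastdb; file and database names are placeholders.

```python
# Run a blastn search and write tabular output, the kind of task the Galaxy
# BLAST+ wrappers automate.
import subprocess

cmd = [
    "blastn",
    "-query", "queries.fasta",    # input nucleotide sequences
    "-db", "reference_db",        # database built beforehand with makeblastdb
    "-outfmt", "6",               # 12-column tabular output
    "-evalue", "1e-5",
    "-out", "hits.tsv",
]
subprocess.run(cmd, check=True)
# Each line of hits.tsv lists query id, subject id, % identity, alignment length, ...
```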
19. Lafage R, Ferrero E, Henry JK, Challier V, Diebo B, Liabaud B, Lafage V, Schwab F. Validation of a new computer-assisted tool to measure spino-pelvic parameters. Spine J 2015; 15:2493-502. [PMID: 26343243 DOI: 10.1016/j.spinee.2015.08.067] [Citation(s) in RCA: 160] [Impact Index Per Article: 16.0]
Abstract
BACKGROUND CONTEXT Evaluation of sagittal alignment is essential in the operative treatment of spine pathology, particularly adult spinal deformity (ASD). However, software applications for detailed spino-pelvic analysis are usually complex and not applicable to routine clinical use. PURPOSE This study aimed to validate new clinician-friendly software (Surgimap) in the setting of ASD. STUDY DESIGN/SETTING Accuracy and inter- and intra-rater reliability of spine measurement software were tested. Five users (two experienced spine surgeons, three novice spine research fellows) independently performed each part of the study in two rounds with 1 week between measurements. PATIENT SAMPLE Fifty ASD patients drawn from a prospective database were used as the study sample. OUTCOME MEASURES Spinal, pelvic, and cervical measurement parameters (including pelvic tilt [PT], pelvic incidence [PI], lumbar-pelvic mismatch [PI-LL], lumbar lordosis [LL], thoracic kyphosis [TK], T1 spino-pelvic inclination [T1SPI], sagittal vertical axis [SVA], and cervical lordosis [CL]) were the outcome measures. METHODS For the accuracy evaluation, 30 ASD patient radiographs were pre-marked for anatomic landmarks. Each radiograph was measured twice with the new software (Surgimap); measurements were compared to those from previously validated software. For the reliability and reproducibility evaluation, users measured 50 unmarked ASD radiographs in two rounds. Intra-class correlation (ICC) and International Organization for Standardization (ISO) reproducibility values were calculated. Measurement time was recorded. RESULTS Surgimap demonstrated excellent accuracy as assessed by the mean absolute difference from validated measurements: PT: 0.12°, PI: 0.35°, LL: 0.58°, PI-LL: 0.46°, TK: 5.25°, T1SPI: 0.53°, and SVA: 2.04 mm. The inter- and intra-observer reliability analysis revealed good to excellent agreement for all parameters. The mean difference between rounds was <0.4° for PT, PI, LL, PI-LL, and T1SPI, and <0.3 mm for SVA. For PT, PI, LL, PI-LL, TK, T1SPI, and SVA, the intra-observer ICC values were all >0.93 and the inter-observer ICC values were all >0.87. Parameters based on point landmarks rather than end plate orientation had better reliability (ICC ≥0.95 vs. ICC ≥0.84). The average time needed to perform a full spino-pelvic analysis with Surgimap was 75 ± 25 seconds. CONCLUSIONS Using this new software tool, a simple method for full spine analysis can be performed quickly, accurately, and reliably. The proposed list of parameters offers quantitative values of the spine and pelvis, setting the stage for proper preoperative planning. The new software tool provides an important bridge between clinical and research needs.
Publication type: Validation Study
20. Amrhein V, Korner-Nievergelt F, Roth T. The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ 2017; 5:e3544. [PMID: 28698825 PMCID: PMC5502092 DOI: 10.7717/peerj.3544] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9]
Abstract
The widespread use of 'statistical significance' as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process (according to the American Statistical Association). We review why degrading p-values into 'significant' and 'nonsignificant' contributes to making studies irreproducible, or to making them seem irreproducible. A major problem is that we tend to take small p-values at face value, but mistrust results with larger p-values. In either case, p-values tell little about reliability of research, because they are hardly replicable even if an alternative hypothesis is true. Also significance (p ≤ 0.05) is hardly replicable: at a good statistical power of 80%, two studies will be 'conflicting', meaning that one is significant and the other is not, in one third of the cases if there is a true effect. A replication can therefore not be interpreted as having failed only because it is nonsignificant. Many apparent replication failures may thus reflect faulty judgment based on significance thresholds rather than a crisis of unreplicable research. Reliable conclusions on replicability and practical importance of a finding can only be drawn using cumulative evidence from multiple independent studies. However, applying significance thresholds makes cumulative knowledge unreliable. One reason is that with anything but ideal statistical power, significant effect sizes will be biased upwards. Interpreting inflated significant results while ignoring nonsignificant results will thus lead to wrong conclusions. But current incentives to hunt for significance lead to selective reporting and to publication bias against nonsignificant findings. Data dredging, p-hacking, and publication bias should be addressed by removing fixed significance thresholds. Consistent with the recommendations of the late Ronald Fisher, p-values should be interpreted as graded measures of the strength of evidence against the null hypothesis. Also larger p-values offer some evidence against the null hypothesis, and they cannot be interpreted as supporting the null hypothesis, falsely concluding that 'there is no effect'. Information on possible true effect sizes that are compatible with the data must be obtained from the point estimate, e.g., from a sample average, and from the interval estimate, such as a confidence interval. We review how confusion about interpretation of larger p-values can be traced back to historical disputes among the founders of modern statistics. We further discuss potential arguments against removing significance thresholds, for example that decision rules should rather be more stringent, that sample sizes could decrease, or that p-values should better be completely abandoned. We conclude that whatever method of statistical inference we use, dichotomous threshold thinking must give way to non-automated informed judgment.
Publication type: research-article
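The abstract's claim that two adequately powered studies of a true effect will "conflict" about one third of the time can be checked with a short simulation (analytically, 2 × 0.8 × 0.2 = 0.32). The sample size and effect size below are chosen to give roughly 80% power and are otherwise arbitrary.

```python
# Simulate pairs of independent two-sample t-tests with a true effect at ~80%
# power and count how often exactly one of the pair reaches p <= 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, effect = 64, 0.5        # roughly 80% power for a two-sample t-test
n_pairs, conflicts = 20_000, 0

def study_is_significant():
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(effect, 1.0, n_per_group)
    return stats.ttest_ind(a, b).pvalue <= 0.05

for _ in range(n_pairs):
    conflicts += study_is_significant() != study_is_significant()

print("proportion of conflicting study pairs:", conflicts / n_pairs)  # ~0.32
```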
21. Taoka T, Ito R, Nakamichi R, Kamagata K, Sakai M, Kawai H, Nakane T, Abe T, Ichikawa K, Kikuta J, Aoki S, Naganawa S. Reproducibility of diffusion tensor image analysis along the perivascular space (DTI-ALPS) for evaluating interstitial fluid diffusivity and glymphatic function: CHanges in Alps index on Multiple conditiON acquIsition eXperiment (CHAMONIX) study. Jpn J Radiol 2022; 40:147-158. [PMID: 34390452 PMCID: PMC8803717 DOI: 10.1007/s11604-021-01187-5] [Citation(s) in RCA: 144] [Impact Index Per Article: 48.0]
Abstract
PURPOSE The diffusion tensor image analysis along the perivascular space (DTI-ALPS) method was developed to evaluate the brain's glymphatic function or interstitial fluid dynamics. This study aimed to evaluate the reproducibility of the DTI-ALPS method and the effect of modifications in the imaging method and data evaluation. MATERIALS AND METHODS Seven healthy volunteers were enrolled in this study. Image acquisition was performed for this test-retest study using a fixed imaging sequence and modified imaging methods, which included changes in the placement of the region of interest (ROI), imaging plane, head position, averaging, number of motion-probing gradients, echo time (TE), and scanner. The ALPS-index values were evaluated across the changes in the conditions listed above. RESULTS The test-retest study with a fixed imaging sequence showed very high reproducibility (intraclass correlation coefficient = 0.828) for the ALPS-index value. Bilateral ROI placement showed higher reproducibility. The number of averages and the use of a different scanner did not influence the ALPS-index values. However, modification of the imaging plane and head position impaired reproducibility, and the number of motion-probing gradients affected the ALPS-index value. The ALPS-index values from 12-axis DTI and 3-axis diffusion-weighted imaging (DWI) showed good correlation (r = 0.86). Also, a shorter TE resulted in a larger ALPS-index value. CONCLUSION The ALPS index was robust under the fixed imaging method, even when different scanners were used. The ALPS index was influenced by the imaging plane, the number of motion-probing gradient axes, and the TE of the imaging sequence. These factors should be kept uniform when planning ALPS method studies. The possibility of developing a 3-axis DWI-ALPS method using three axes of the motion-probing gradient was also suggested.
Publication type: research-article
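A sketch of the ALPS-index calculation evaluated above, using the published definition (mean of the x-axis diffusivities in projection and association fiber regions divided by the mean of the y-axis projection and z-axis association diffusivities). The diffusivity values are placeholders, not data from the study.

```python
# ALPS index from ROI-mean diffusivities; inputs are in consistent units
# (e.g. 10^-3 mm^2/s) and the values below are placeholders.
def alps_index(dx_proj, dx_assoc, dy_proj, dz_assoc):
    """mean(Dx projection, Dx association) / mean(Dy projection, Dz association)."""
    return ((dx_proj + dx_assoc) / 2.0) / ((dy_proj + dz_assoc) / 2.0)

left = alps_index(dx_proj=1.10, dx_assoc=0.95, dy_proj=0.70, dz_assoc=0.72)
right = alps_index(dx_proj=1.05, dx_assoc=0.98, dy_proj=0.72, dz_assoc=0.70)

# The abstract reports higher reproducibility for bilateral ROI placement,
# i.e. combining the left- and right-hemisphere indices.
print("left/right/bilateral ALPS index:",
      round(left, 2), round(right, 2), round((left + right) / 2.0, 2))
```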
22. Bikson M, Brunoni AR, Charvet LE, Clark VP, Cohen LG, Deng ZD, Dmochowski J, Edwards DJ, Frohlich F, Kappenman ES, Lim KO, Loo C, Mantovani A, McMullen DP, Parra LC, Pearson M, Richardson JD, Rumsey JM, Sehatpour P, Sommers D, Unal G, Wassermann EM, Woods AJ, Lisanby SH. Rigor and reproducibility in research with transcranial electrical stimulation: An NIMH-sponsored workshop. Brain Stimul 2018; 11:465-480. [PMID: 29398575 PMCID: PMC5997279 DOI: 10.1016/j.brs.2017.12.008] [Citation(s) in RCA: 137] [Impact Index Per Article: 19.6]
Abstract
BACKGROUND Neuropsychiatric disorders are a leading source of disability and require novel treatments that target mechanisms of disease. As such disorders are thought to result from aberrant neuronal circuit activity, neuromodulation approaches are of increasing interest given their potential for manipulating circuits directly. Low intensity transcranial electrical stimulation (tES) with direct currents (transcranial direct current stimulation, tDCS) or alternating currents (transcranial alternating current stimulation, tACS) represent novel, safe, well-tolerated, and relatively inexpensive putative treatment modalities. OBJECTIVE This report seeks to promote the science, technology and effective clinical applications of these modalities, identify research challenges, and suggest approaches for addressing these needs in order to achieve rigorous, reproducible findings that can advance clinical treatment. METHODS The National Institute of Mental Health (NIMH) convened a workshop in September 2016 that brought together experts in basic and human neuroscience, electrical stimulation biophysics and devices, and clinical trial methods to examine the physiological mechanisms underlying tDCS/tACS, technologies and technical strategies for optimizing stimulation protocols, and the state of the science with respect to therapeutic applications and trial designs. RESULTS Advances in understanding mechanisms, methodological and technological improvements (e.g., electronics, computational models to facilitate proper dosing), and improved clinical trial designs are poised to advance rigorous, reproducible therapeutic applications of these techniques. A number of challenges were identified, and meeting participants made recommendations to address them. CONCLUSIONS These recommendations align with requirements in NIMH funding opportunity announcements to, among other needs, define dosimetry, demonstrate dose/response relationships, implement rigorous blinded trial designs, employ computational modeling, and demonstrate target engagement when testing stimulation-based interventions for the treatment of mental disorders.
Publication type: Research Support, N.I.H., Extramural
23. Kafkafi N, Agassi J, Chesler EJ, Crabbe JC, Crusio WE, Eilam D, Gerlai R, Golani I, Gomez-Marin A, Heller R, Iraqi F, Jaljuli I, Karp NA, Morgan H, Nicholson G, Pfaff DW, Richter SH, Stark PB, Stiedl O, Stodden V, Tarantino LM, Tucci V, Valdar W, Williams RW, Würbel H, Benjamini Y. Reproducibility and replicability of rodent phenotyping in preclinical studies. Neurosci Biobehav Rev 2018; 87:218-232. [PMID: 29357292 PMCID: PMC6071910 DOI: 10.1016/j.neubiorev.2018.01.003] [Citation(s) in RCA: 136] [Impact Index Per Article: 19.4]
Abstract
The scientific community is increasingly concerned with the proportion of published "discoveries" that are not replicated in subsequent studies. The field of rodent behavioral phenotyping was one of the first to raise this concern, and to relate it to other methodological issues: the complex interaction between genotype and environment; the definitions of behavioral constructs; and the use of laboratory mice and rats as model species for investigating human health and disease mechanisms. In January 2015, researchers from various disciplines gathered at Tel Aviv University to discuss these issues. The general consensus was that the issue is prevalent and of concern, and should be addressed at the statistical, methodological and policy levels, but is not so severe as to call into question the validity and the usefulness of model organisms as a whole. Well-organized community efforts, coupled with improved data and metadata sharing, have a key role in identifying specific problems and promoting effective solutions. Replicability is closely related to validity, may affect generalizability and translation of findings, and has important ethical implications.
Publication type: Review
24. McDougal RA, Morse TM, Carnevale T, Marenco L, Wang R, Migliore M, Miller PL, Shepherd GM, Hines ML. Twenty years of ModelDB and beyond: building essential modeling tools for the future of neuroscience. J Comput Neurosci 2017; 42:1-10. [PMID: 27629590 PMCID: PMC5279891 DOI: 10.1007/s10827-016-0623-7] [Citation(s) in RCA: 135] [Impact Index Per Article: 16.9]
Abstract
Neuron modeling may be said to have originated with the Hodgkin and Huxley action potential model in 1952 and Rall's models of integrative activity of dendrites in 1964. Over the ensuing decades, these approaches have led to a massive development of increasingly accurate and complex data-based models of neurons and neuronal circuits. ModelDB was founded in 1996 to support this new field and enhance the scientific credibility and utility of computational neuroscience models by providing a convenient venue for sharing them. It has grown to include over 1100 published models covering more than 130 research topics. It is actively curated and developed to help researchers discover and understand models of interest. ModelDB also provides mechanisms to assist running models both locally and remotely, and has a graphical tool that enables users to explore the anatomical and biophysical properties that are represented in a model. Each of its capabilities is undergoing continued refinement and improvement in response to user experience. Large research groups (Allen Brain Institute, EU Human Brain Project, etc.) are emerging that collect data across multiple scales and integrate that data into many complex models, presenting new challenges of scale. We end by predicting a future for neuroscience increasingly fueled by new technology and high performance computation, and increasingly in need of comprehensive user-friendly databases such as ModelDB to provide the means to integrate the data for deeper insights into brain function in health and disease.
Publication type: Review
25. Consistency of EEG source localization and connectivity estimates. Neuroimage 2017; 152:590-601. [PMID: 28300640 DOI: 10.1016/j.neuroimage.2017.02.076] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8]
Abstract
As the EEG inverse problem does not have a unique solution, the sources reconstructed from EEG and their connectivity properties depend on forward and inverse modeling parameters such as the choice of an anatomical template and electrical model, prior assumptions on the sources, and further implementational details. In order to use source connectivity analysis as a reliable research tool, there is a need for stability across a wider range of standard estimation routines. Using resting state EEG recordings of N=65 participants acquired within two studies, we present the first comprehensive assessment of the consistency of EEG source localization and functional/effective connectivity metrics across two anatomical templates (ICBM152 and Colin27), three electrical models (BEM, FEM and spherical harmonics expansions), three inverse methods (WMNE, eLORETA and LCMV), and three software implementations (Brainstorm, Fieldtrip and our own toolbox). Source localizations were found to be more stable across reconstruction pipelines than subsequent estimations of functional connectivity, while effective connectivity estimates were the least consistent. All results were relatively unaffected by the choice of the electrical head model, while the choice of the inverse method and source imaging package induced considerable variability. In particular, a relatively strong difference was found between LCMV beamformer solutions on the one hand and eLORETA/WMNE distributed inverse solutions on the other. We also observed a gradual decrease of consistency when results are compared between studies, within individual participants, and between individual participants. In order to provide reliable findings in the face of the observed variability, additional simulations involving interacting brain sources are required. Meanwhile, we encourage verification of the obtained results using more than one source imaging procedure.
Publication type: Research Support, Non-U.S. Gov't
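A sketch of swapping between two of the inverse methods compared above (eLORETA and an LCMV beamformer), written with MNE-Python, which is not one of the packages evaluated in the study; the file paths stand in for the user's own evoked data, forward model, and covariance matrices.

```python
# Compare a distributed inverse solution (eLORETA) with an LCMV beamformer on
# the same recording. File names are placeholders for the user's own data.
import mne
from mne.beamformer import apply_lcmv, make_lcmv
from mne.minimum_norm import apply_inverse, make_inverse_operator

evoked = mne.read_evokeds("subject-ave.fif", condition=0)
fwd = mne.read_forward_solution("subject-fwd.fif")
noise_cov = mne.read_cov("subject-noise-cov.fif")
data_cov = mne.read_cov("subject-data-cov.fif")

# Distributed inverse solution (eLORETA).
inv = make_inverse_operator(evoked.info, fwd, noise_cov)
stc_eloreta = apply_inverse(evoked, inv, lambda2=1.0 / 9.0, method="eLORETA")

# LCMV beamformer on the same data.
filters = make_lcmv(evoked.info, fwd, data_cov, reg=0.05, noise_cov=noise_cov)
stc_lcmv = apply_lcmv(evoked, filters)

# The study found the largest discrepancies between beamformer and distributed
# solutions; comparing these two source estimates illustrates that contrast.
print(stc_eloreta.data.shape, stc_lcmv.data.shape)
```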