251
|
Morvan M, Jacomo AL, Souque C, Wade MJ, Hoffmann T, Pouwels K, Lilley C, Singer AC, Porter J, Evens NP, Walker DI, Bunce JT, Engeli A, Grimsley J, O'Reilly KM, Danon L. An analysis of 45 large-scale wastewater sites in England to estimate SARS-CoV-2 community prevalence. Nat Commun 2022; 13:4313. [PMID: 35879277 PMCID: PMC9312315 DOI: 10.1038/s41467-022-31753-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 08/01/2021] [Accepted: 06/28/2022] [Indexed: 12/23/2022] Open
Abstract
Accurate surveillance of the COVID-19 pandemic can be weakened by under-reporting of cases, particularly due to asymptomatic or pre-symptomatic infections, resulting in bias. Quantification of SARS-CoV-2 RNA in wastewater can be used to infer infection prevalence, but uncertainty in sensitivity and considerable variability have meant that accurate measurement remains elusive. Here, we use data from 45 sewage sites in England, covering 31% of the population, and estimate SARS-CoV-2 prevalence to within 1.1% of estimates from representative prevalence surveys (with 95% confidence). Using machine learning and phenomenological models, we show that differences between sampled sites, particularly the wastewater flow rate, influence prevalence estimation and require careful interpretation. We find that SARS-CoV-2 signals in wastewater appear 4-5 days earlier than in clinical testing data but are coincident with prevalence surveys, suggesting that wastewater surveillance can be a leading indicator for symptomatic viral infections. Surveillance for viruses in wastewater complements and strengthens clinical surveillance, with significant implications for public health.
|
252
|
Huang H, Wang Y, Rudin C, Browne EP. Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization. Commun Biol 2022; 5:719. [PMID: 35853932 PMCID: PMC9296444 DOI: 10.1038/s42003-022-03628-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/03/2021] [Accepted: 06/23/2022] [Indexed: 12/11/2022] Open
Abstract
Dimension reduction (DR) algorithms project data from high dimensions to lower dimensions to enable visualization of interesting high-dimensional structure. DR algorithms are widely used for analysis of single-cell transcriptomic data. Despite widespread use of DR algorithms such as t-SNE and UMAP, these algorithms have characteristics that lead to lack of trust: they do not preserve important aspects of high-dimensional structure and are sensitive to arbitrary user choices. Given the importance of gaining insights from DR, DR methods should be evaluated carefully before trusting their results. In this paper, we introduce and perform a systematic evaluation of popular DR methods, including t-SNE, art-SNE, UMAP, PaCMAP, TriMap and ForceAtlas2. Our evaluation considers five components: preservation of local structure, preservation of global structure, sensitivity to parameter choices, sensitivity to preprocessing choices, and computational efficiency. This evaluation can help us to choose DR tools that align with the scientific goals of the user.
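The local-structure component of such an evaluation can be approximated with a neighbourhood-preservation score. The sketch below is an illustrative metric only, using PCA as a stand-in embedder rather than the paper's benchmark suite:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def knn_preservation(X_high, X_low, k=10):
    """Mean fraction of each point's k nearest neighbours in the original
    space that survive in the low-dimensional embedding."""
    idx_high = NearestNeighbors(n_neighbors=k).fit(X_high).kneighbors(
        return_distance=False)  # querying the training set excludes self
    idx_low = NearestNeighbors(n_neighbors=k).fit(X_low).kneighbors(
        return_distance=False)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(idx_high, idx_low)]
    return float(np.mean(overlap))

X = load_iris().data
X_2d = PCA(n_components=2).fit_transform(X)  # stand-in for t-SNE/UMAP/PaCMAP
local_score = knn_preservation(X, X_2d, k=10)
```

The same score computed for several embedders and several `k` values gives a simple, comparable view of local-structure preservation.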
|
253
|
Neishabouri A, Nguyen J, Samuelsson J, Guthrie T, Biggs M, Wyatt J, Cross D, Karas M, Migueles JH, Khan S, Guo CC. Quantification of acceleration as activity counts in ActiGraph wearable. Sci Rep 2022; 12:11958. [PMID: 35831446 PMCID: PMC9279376 DOI: 10.1038/s41598-022-16003-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 03/08/2022] [Accepted: 07/04/2022] [Indexed: 11/09/2022] Open
Abstract
Digital clinical measures based on data collected by wearable devices have seen rapid growth in both clinical trials and healthcare. The most widely used wearable-based measures are epoch-based physical activity counts computed from accelerometer data. Even though activity counts have been the backbone of thousands of clinical and epidemiological studies, there are large variations in the algorithms that compute counts and in their associated parameters, many of which have been kept proprietary by device providers. This lack of transparency has hindered comparability between studies using different devices and limited their broader clinical applicability. ActiGraph devices have been the most widely used wearable accelerometer devices for over two decades. Recognizing the importance of data transparency, interpretability and interoperability to both research and clinical use, we here describe the detailed counts algorithms of five generations of ActiGraph devices, going back to the first AM7164 model, and publish the current counts algorithm in ActiGraph's ActiLife and CentrePoint software as a standalone Python package for research use. We believe that this material will provide a useful resource for the research community, accelerate digital health science and facilitate clinical applications of wearable accelerometry.
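A minimal sketch of an epoch-based counts pipeline is shown below; the filter design, deadband and scale factor are illustrative placeholders, not ActiGraph's published parameters:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def epoch_counts(acc, fs, epoch_s=10, band=(0.25, 2.5), deadband=0.05, scale=128):
    """Simplified activity-count pipeline: band-pass filter, rectify,
    apply a deadband threshold, then sum within fixed-length epochs."""
    b, a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    rectified = np.abs(filtfilt(b, a, acc))
    rectified[rectified < deadband] = 0.0          # suppress small wiggles
    n_epoch = int(epoch_s * fs)                    # samples per epoch
    n = len(rectified) // n_epoch
    per_epoch = rectified[: n * n_epoch].reshape(n, n_epoch).sum(axis=1)
    return (per_epoch * scale).astype(int)

fs = 30.0                                          # Hz
t = np.arange(0, 60, 1 / fs)                       # one minute of data
acc = 0.5 * np.sin(2 * np.pi * 1.0 * t)            # 1 Hz "walking" signal, g units
counts = epoch_counts(acc, fs)                     # six 10-s epochs
```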
|
254
|
Said S, Pazoki R, Karhunen V, Võsa U, Ligthart S, Bodinier B, Koskeridis F, Welsh P, Alizadeh BZ, Chasman DI, Sattar N, Chadeau-Hyam M, Evangelou E, Jarvelin MR, Elliott P, Tzoulaki I, Dehghan A. Author Correction: Genetic analysis of over half a million people characterises C-reactive protein loci. Nat Commun 2022; 13:3865. [PMID: 35790731 PMCID: PMC9256682 DOI: 10.1038/s41467-022-31706-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|
255
|
Huang Y, Qian X, Wang X, Wang T, Lounder SJ, Ravindran T, Demitrack Z, McCutcheon J, Asatekin A, Li B. Electrospraying Zwitterionic Copolymers as an Effective Biofouling Control for Accurate and Continuous Monitoring of Wastewater Dynamics in a Real-Time and Long-Term Manner. Environ Sci Technol 2022; 56:8176-8186. [PMID: 35576931 DOI: 10.1021/acs.est.2c01501] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Long-term continuous monitoring (LTCM) of water quality can provide high-fidelity datasets essential for executing swift control and enhancing system efficiency. One roadblock for LTCM using solid-state ion-selective electrode (S-ISE) sensors is biofouling on the sensor surface, which perturbs analyte mass transfer and degrades sensor reading accuracy. This study improved the anti-biofouling properties of S-ISE sensors by precisely coating a self-assembled channel-type zwitterionic copolymer, poly(trifluoroethyl methacrylate-random-sulfobetaine methacrylate) (PTFEMA-r-SBMA), onto the sensor surface using electrospray. The PTFEMA-r-SBMA membrane exhibits exceptional permeability and selectivity to primary ions in aqueous solutions. NH4+ S-ISE sensors with this anti-fouling zwitterionic layer were tested in real wastewater for 55 consecutive days, exhibiting sensitivity close to the theoretical value (59.18 mV/dec) and long-term stability (error <4 mg/L). Furthermore, a denoising data processing algorithm (DDPA) was developed to further improve sensor accuracy, reducing the S-ISE sensor error to only 1.2 mg/L after 50 days of real wastewater analysis. Based on dynamic energy cost function and carbon footprint models, LTCM is expected to reduce NH4+ discharge by 44.9%, energy consumption by 12.8%, and greenhouse gas emissions by 26.7% under normal operating conditions. This study unveils an innovative LTCM methodology by integrating advanced materials (anti-fouling layer coating) with sensor data processing (DDPA).
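The DDPA itself is not specified in the abstract; as an assumption-laden stand-in, a sliding-median filter illustrates how post-processing can shrink sensor error in the presence of fouling-like spikes:

```python
import numpy as np

def denoise_median(signal, window=7):
    """Sliding-median denoiser (illustrative stand-in for the paper's DDPA)."""
    pad = window // 2
    padded = np.pad(signal, pad, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(signal))])

rng = np.random.default_rng(0)
true_nh4 = np.full(200, 25.0)                    # steady 25 mg/L NH4+ level
noisy = true_nh4 + rng.normal(0, 2.0, 200)       # baseline sensor noise
noisy[::20] += 15.0                              # fouling-like spikes
clean = denoise_median(noisy, window=7)

err_before = float(np.mean(np.abs(noisy - true_nh4)))
err_after = float(np.mean(np.abs(clean - true_nh4)))
```

A median window rejects isolated spikes that a moving average would smear across neighbouring readings.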
|
256
|
Sun H, Poudel S, Vanderwall D, Lee DG, Li Y, Peng J. 29-Plex tandem mass tag mass spectrometry enabling accurate quantification by interference correction. Proteomics 2022; 22:e2100243. [PMID: 35723178 DOI: 10.1002/pmic.202100243] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/31/2022] [Revised: 06/14/2022] [Accepted: 06/15/2022] [Indexed: 12/14/2022]
Abstract
Tandem mass tag (TMT) mass spectrometry is a mainstream isobaric chemical labeling strategy for profiling proteomes. Here we present a 29-plex TMT method to combine the 11-plex and 18-plex labeling strategies. The 29-plex method was examined with a pooled sample composed of 1×, 3×, and 10× Escherichia coli peptides with 100× human background peptides, which generated two E. coli datasets (TMT11 and TMT18), displaying the distorted ratios of 1.0:1.7:4.2 and 1.0:1.8:4.9, respectively. This ratio compression from the expected 1:3:10 ratios was caused by co-isolated TMT-labeled ions (i.e., noise). Interestingly, the mixture of two TMT sets produced MS/MS spectra with unique features for the noise detection: (i) in TMT11-labeled spectra, TMT18-specific reporter ions (e.g., 135N) were shown as the noise; (ii) in TMT18-labeled spectra, the TMT11/TMT18-shared reporter ions (e.g., 131C) typically exhibited higher intensities than TMT18-specific reporter ions, due to contaminated TMT11-labeled ions in these shared channels. We further estimated the noise levels contributed by both TMT11- and TMT18-labeled peptides, and corrected reporter ion intensities in every spectrum. Finally, the anticipated 1:3:10 ratios were largely restored. This strategy was also validated using another 29-plex sample with 1:5 ratios. Thus the 29-plex method expands the TMT throughput and enhances the quantitative accuracy.
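The correction idea can be sketched as follows: reporter channels that should be empty in a given spectrum reveal the co-isolation noise level, which is then subtracted from the shared channels. The numbers below are toy values, not the paper's estimator:

```python
import numpy as np

def correct_reporters(shared, empty_channels, n_empty):
    """Estimate co-isolation noise from channels that should be empty and
    subtract the per-channel average from the shared reporter intensities."""
    noise_per_channel = np.sum(empty_channels) / n_empty
    corrected = np.asarray(shared, dtype=float) - noise_per_channel
    return np.clip(corrected, 0.0, None)  # intensities cannot go negative

# Hypothetical TMT11-labelled spectrum with true 1:3:10 ratios plus flat noise
true_signal = np.array([100.0, 300.0, 1000.0])
observed = true_signal + 80.0                       # ratio-compressed readout
empty_tmt18_channels = np.array([78.0, 82.0, 80.0])  # noise-only channels
corrected = correct_reporters(observed, empty_tmt18_channels, 3)
```

After subtraction the expected 1:3:10 ratios are restored, mirroring the decompression the abstract describes.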
|
257
|
Errekagorri I, Castellano J, Los Arcos A, Rico-González M, Pino-Ortega J. Different Sampling Frequencies to Calculate Collective Tactical Variables during Competition: A Case of an Official Female's Soccer Match. Sensors (Basel) 2022; 22:4508. [PMID: 35746288 PMCID: PMC9230581 DOI: 10.3390/s22124508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 06/07/2022] [Accepted: 06/13/2022] [Indexed: 02/05/2023]
Abstract
The objective of the study was to assess the impact of the sampling frequency on the outcomes of collective tactical variables during an official women’s soccer match. To do this, the first half (lasting 46 min) of an official league match of a semi-professional soccer team belonging to the Women’s Second Division of Spain (Reto Iberdrola) was analysed. The collective variables recorded were classified into three main groups: point-related variable (i.e., change in geometrical centre position (cGCp)), distance-related variables (i.e., width, length, height, distance from the goalkeeper to the near defender and mean distance between players), and area-related variables (i.e., surface area). Each variable was measured using eight different sampling frequencies: data every 100 (10 Hz), 200 (5 Hz), 250 (4 Hz), 400 (2.5 Hz), 500 (2 Hz), 1000 (1 Hz), 2000 (0.5 Hz), and 4000 ms (0.25 Hz). With the exception of cGCp, the outcomes of the collective tactical variables did not vary depending on the sampling frequency used (p > 0.05; Effect Size < 0.001). The results suggest that a sampling frequency of 0.5 Hz would be sufficient to measure the collective tactical variables that assess distance and area during an official soccer match.
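Downsampling positional data before computing an area-related variable can be sketched like this (random positions stand in for real tracking data; `ConvexHull.volume` returns the area for 2-D input):

```python
import numpy as np
from scipy.spatial import ConvexHull

def mean_surface_area(positions, step=1):
    """Mean convex-hull area of player positions, downsampled by `step`
    (e.g. step=20 turns 10 Hz data into 0.5 Hz)."""
    frames = positions[::step]
    # For 2-D point sets, ConvexHull.volume is the enclosed area
    return float(np.mean([ConvexHull(f).volume for f in frames]))

rng = np.random.default_rng(1)
# 600 frames (60 s at 10 Hz) of 10 outfield players on a 105 x 68 m pitch
positions = rng.uniform([0, 0], [105, 68], size=(600, 10, 2))

area_10hz = mean_surface_area(positions, step=1)
area_05hz = mean_surface_area(positions, step=20)
```

Comparing the two means mirrors the paper's finding that area-related variables change little when the sampling frequency drops.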
|
258
|
Walzer M, García-Seisdedos D, Prakash A, Brack P, Crowther P, Graham RL, George N, Mohammed S, Moreno P, Papatheodorou I, Hubbard SJ, Vizcaíno JA. Implementing the reuse of public DIA proteomics datasets: from the PRIDE database to Expression Atlas. Sci Data 2022; 9:335. [PMID: 35701420 PMCID: PMC9197839 DOI: 10.1038/s41597-022-01380-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/15/2021] [Accepted: 05/12/2022] [Indexed: 11/14/2022] Open
Abstract
The number of mass spectrometry (MS)-based proteomics datasets in the public domain keeps increasing, particularly those generated by Data Independent Acquisition (DIA) approaches such as SWATH-MS. Unlike Data Dependent Acquisition datasets, the re-use of DIA datasets has been rather limited to date, despite its high potential, due to the technical challenges involved. We introduce a (re-)analysis pipeline for public SWATH-MS datasets which includes a combination of metadata annotation protocols, automated workflows for MS data analysis, statistical analysis, and the integration of the results into the Expression Atlas resource. Automation is orchestrated with Nextflow, using containerised open analysis software tools, rendering the pipeline readily available and reproducible. To demonstrate its utility, we reanalysed 10 public DIA datasets from the PRIDE database, comprising 1,278 SWATH-MS runs. The robustness of the analysis was evaluated, and the results compared to those obtained in the original publications. The final expression values were integrated into Expression Atlas, making SWATH-MS experiments more widely available and combining them with expression data originating from other proteomics and transcriptomics datasets.
|
259
|
Enhancing the REMBRANDT MRI collection with expert segmentation labels and quantitative radiomic features. Sci Data 2022; 9:338. [PMID: 35701399 PMCID: PMC9198015 DOI: 10.1038/s41597-022-01415-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 05/24/2022] [Indexed: 01/26/2023] Open
Abstract
Malignancy of the brain and CNS is unfortunately a common diagnosis. A large subset of these lesions are high-grade tumors which portend poor prognoses and low survival rates, and are estimated to be the tenth leading cause of death worldwide. The complex nature of the brain tissue environment in which these lesions arise offers a rich opportunity for translational research. Magnetic Resonance Imaging (MRI) can provide a comprehensive view of the abnormal regions in the brain; therefore, its applications in translational brain cancer research are considered essential for the diagnosis and monitoring of disease. Recent years have seen rapid growth in the field of radiogenomics, especially in cancer, and scientists have been able to successfully integrate the quantitative data extracted from medical images (also known as radiomics) with genomics to answer new and clinically relevant questions. In this paper, we took raw MRI scans from the public-domain REMBRANDT data collection and performed volumetric segmentation to identify subregions of the brain. Radiomic features were then extracted to represent the MRIs in a quantitative yet summarized format. The resulting dataset enables further biomedical and integrative data analysis, and is being made public via the NeuroImaging Tools & Resources Collaboratory (NITRC) repository ( https://www.nitrc.org/projects/rembrandt_brain/ ).
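As a hedged sketch of the kind of first-order radiomic features such a pipeline extracts from a segmented subregion (synthetic volume here; real pipelines compute many more feature classes):

```python
import numpy as np

def first_order_features(image, mask, bins=32):
    """First-order radiomic features over a segmented subregion."""
    roi = image[mask > 0].astype(float)
    hist, _ = np.histogram(roi, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins before the entropy sum
    return {
        "mean": float(roi.mean()),
        "variance": float(roi.var()),
        "skewness": float(((roi - roi.mean()) ** 3).mean() / roi.std() ** 3),
        "entropy": float(-(p * np.log2(p)).sum()),
    }

rng = np.random.default_rng(0)
vol = rng.normal(100.0, 10.0, (32, 32, 32))      # synthetic "MRI" volume
zz, yy, xx = np.mgrid[:32, :32, :32]
mask = (zz - 16) ** 2 + (yy - 16) ** 2 + (xx - 16) ** 2 < 8 ** 2
vol[mask] += 50.0                                 # hyperintense "lesion"
feats = first_order_features(vol, mask)
```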
|
260
|
Ko PC, Lin PC, Do HT, Huang YF. P2P Lending Default Prediction Based on AI and Statistical Models. Entropy 2022; 24:e24060801. [PMID: 35741522 PMCID: PMC9222552 DOI: 10.3390/e24060801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/27/2022] [Revised: 06/01/2022] [Accepted: 06/06/2022] [Indexed: 02/01/2023]
Abstract
Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not tightly governed by relevant laws yet, as their development speed has far exceeded that of regulations. Therefore, P2P lending operations are still subject to risks. This paper proposes prediction models to mitigate the risks of default and asymmetric information on P2P lending platforms. Specifically, we designed sophisticated procedures to pre-process mass data extracted from Lending Club in 2018 Q3–2019 Q2. After that, three statistical models, namely, Logistic Regression, Bayesian Classifier, and Linear Discriminant Analysis (LDA), and five AI models, namely, Decision Tree, Random Forest, LightGBM, Artificial Neural Network (ANN), and Convolutional Neural Network (CNN), were utilized for data analysis. The loan statuses of Lending Club’s customers were rationally classified. To evaluate the models, we adopted the confusion matrix series of metrics, AUC-ROC curve, Kolmogorov–Smirnov chart (KS), and Student’s t-test. Empirical studies show that LightGBM produces the best performance and is 2.91% more accurate than the other models, resulting in a revenue improvement of nearly USD 24 million for Lending Club. Student’s t-test proves that the differences between models are statistically significant.
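The evaluation metrics named above can be reproduced in outline. The sketch below uses scikit-learn's gradient boosting on synthetic data as a stand-in for LightGBM on Lending Club records:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for pre-processed loan records (not Lending Club data)
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.85],
                           random_state=0)        # roughly 15% defaults
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]          # predicted default probability

auc = roc_auc_score(y_te, scores)
fpr, tpr, _ = roc_curve(y_te, scores)
ks = float(np.max(tpr - fpr))                     # Kolmogorov-Smirnov statistic
```

The KS statistic is the maximum vertical gap between the cumulative score distributions of defaulters and non-defaulters, the same chart the authors report.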
|
261
|
D'Ascenzo L, Popova AM, Abernathy S, Sheng K, Limbach PA, Williamson JR. Pytheas: a software package for the automated analysis of RNA sequences and modifications via tandem mass spectrometry. Nat Commun 2022; 13:2424. [PMID: 35505047 PMCID: PMC9065004 DOI: 10.1038/s41467-022-30057-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/02/2021] [Accepted: 04/12/2022] [Indexed: 12/23/2022] Open
Abstract
Mass spectrometry is an important method for analysis of modified nucleosides ubiquitously present in cellular RNAs, in particular for ribosomal and transfer RNAs that play crucial roles in mRNA translation and decoding. Furthermore, modifications have effect on the lifetimes of nucleic acids in plasma and cells and are consequently incorporated into RNA therapeutics. To provide an analytical tool for sequence characterization of modified RNAs, we developed Pytheas, an open-source software package for automated analysis of tandem MS data for RNA. The main features of Pytheas are flexible handling of isotope labeling and RNA modifications, with false discovery rate statistical validation based on sequence decoys. We demonstrate bottom-up mass spectrometry characterization of diverse RNA sequences, with broad applications in the biology of stable RNAs, and quality control of RNA therapeutics and mRNA vaccines.
|
262
|
Huang Y, Wang X, Xiang W, Wang T, Otis C, Sarge L, Lei Y, Li B. Forward-Looking Roadmaps for Long-Term Continuous Water Quality Monitoring: Bottlenecks, Innovations, and Prospects in a Critical Review. Environ Sci Technol 2022; 56:5334-5354. [PMID: 35442035 PMCID: PMC9063115 DOI: 10.1021/acs.est.1c07857] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/17/2021] [Revised: 04/05/2022] [Accepted: 04/06/2022] [Indexed: 05/29/2023]
Abstract
Long-term continuous monitoring (LTCM) of water quality can bring far-reaching influences on water ecosystems by providing spatiotemporal data sets of diverse parameters and enabling operation of water and wastewater treatment processes in an energy-saving and cost-effective manner. However, current water monitoring technologies are deficient for long-term accuracy in data collection and processing capability. Inadequate LTCM data impedes water quality assessment and hinders the stakeholders and decision makers from foreseeing emerging problems and executing efficient control methodologies. To tackle this challenge, this review provides a forward-looking roadmap highlighting vital innovations toward LTCM, and elaborates on the impacts of LTCM through a three-hierarchy perspective: data, parameters, and systems. First, we demonstrate the critical needs and challenges of LTCM in natural resource water, drinking water, and wastewater systems, and differentiate LTCM from existing short-term and discrete monitoring techniques. We then elucidate three steps to achieve LTCM in water systems, consisting of data acquisition (water sensors), data processing (machine learning algorithms), and data application (with modeling and process control as two examples). Finally, we explore future opportunities of LTCM in four key domains, water, energy, sensing, and data, and underscore strategies to transfer scientific discoveries to general end-users.
|
263
|
Kabbara A, Robert G, Khalil M, Verin M, Benquet P, Hassan M. An electroencephalography connectome predictive model of major depressive disorder severity. Sci Rep 2022; 12:6816. [PMID: 35473962 PMCID: PMC9042869 DOI: 10.1038/s41598-022-10949-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/18/2021] [Accepted: 04/05/2022] [Indexed: 11/21/2022] Open
Abstract
Emerging evidence shows that major depressive disorder (MDD) is associated with disruptions of brain structural and functional networks, rather than impairment of isolated brain regions. Thus, connectome-based models capable of predicting depression severity at the individual level can be clinically useful. Here, we applied a machine-learning approach to predict the severity of depression using resting-state networks derived from source-reconstructed electroencephalography (EEG) signals. Using regression models and three independent EEG datasets (N = 328), we tested whether resting-state functional connectivity could predict individual depression scores. On the first dataset, results showed that individual scores could be reasonably predicted (r = 0.6, p = 4 × 10⁻¹⁸) using intrinsic functional connectivity in the EEG alpha band (8-13 Hz). In particular, the brain regions which contributed most to the predictive network belong to the default mode network. We further tested the predictive potential of the established model by conducting two external validations (N1 = 53, N2 = 154). Results showed statistically significant correlations between the predicted and measured depression scale scores (r1 = 0.52, r2 = 0.44, p < 0.001). These findings lay the foundation for developing generalizable and scientifically interpretable EEG network-based markers that can ultimately support clinicians in a biologically based characterization of MDD.
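The connectome-predictive workflow (regressing a clinical score on connectivity features under cross-validation, then correlating predicted with measured scores) can be sketched on synthetic data:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_subj, n_edges = 150, 300          # subjects x connectivity edges (toy sizes)
conn = rng.normal(size=(n_subj, n_edges))          # alpha-band connectivity
w = np.zeros(n_edges)
w[:10] = 1.0                                       # a few predictive edges
score_true = conn @ w + rng.normal(0, 2.0, n_subj)  # depression scale scores

# Out-of-fold predictions, so the correlation is not optimistically biased
pred = cross_val_predict(Ridge(alpha=10.0), conn, score_true, cv=5)
r, p = pearsonr(score_true, pred)
```

External validation, as in the paper, would apply the fitted model unchanged to a dataset never seen during training.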
|
264
|
Said S, Pazoki R, Karhunen V, Võsa U, Ligthart S, Bodinier B, Koskeridis F, Welsh P, Alizadeh BZ, Chasman DI, Sattar N, Chadeau-Hyam M, Evangelou E, Jarvelin MR, Elliott P, Tzoulaki I, Dehghan A. Genetic analysis of over half a million people characterises C-reactive protein loci. Nat Commun 2022; 13:2198. [PMID: 35459240 PMCID: PMC9033829 DOI: 10.1038/s41467-022-29650-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 04/22/2021] [Accepted: 03/25/2022] [Indexed: 01/08/2023] Open
Abstract
Chronic low-grade inflammation is linked to a multitude of chronic diseases. We report the largest genome-wide association study (GWAS) on C-reactive protein (CRP), a marker of systemic inflammation, in UK Biobank participants (N = 427,367, European descent) and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium (total N = 575,531, European descent). We identify 266 independent loci, of which 211 have not previously been reported. Gene-set analysis highlighted 42 gene sets associated with CRP levels (p ≤ 3.2 × 10⁻⁶), and tissue expression analysis indicated a strong association of CRP-related genes with liver and whole-blood gene expression. A phenome-wide association study identified 27 clinical outcomes associated with genetically determined CRP, and subsequent Mendelian randomisation analyses supported a causal association with schizophrenia, chronic airway obstruction and prostate cancer. Our findings identify genetic loci and functional properties of chronic low-grade inflammation and provide evidence for causal associations with a range of diseases.
|
265
|
Bogdanovic B, Eftimov T, Simjanoska M. In-depth insights into Alzheimer's disease by using explainable machine learning approach. Sci Rep 2022; 12:6508. [PMID: 35444165 PMCID: PMC9021280 DOI: 10.1038/s41598-022-10202-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/30/2021] [Accepted: 04/04/2022] [Indexed: 11/09/2022] Open
Abstract
Alzheimer's disease is still a field of research with many open questions. The complexity of the disease prevents early diagnosis before visible symptoms affecting the individual's cognitive capabilities occur. This research presents an in-depth analysis of a large data set encompassing medical, cognitive and lifestyle measurements from more than 12,000 individuals. Several hypotheses were established and their validity examined against the obtained results. The importance of appropriate experimental design is strongly stressed in the research. Thus, a sequence of methods for handling missing data, redundancy, data imbalance and correlation analysis was applied for appropriate preprocessing of the data set, and an XGBoost model was then trained and evaluated, with special attention to hyperparameter tuning. The model was explained using the Shapley values produced by the SHAP method. XGBoost produced an F1-score of 0.84 and as such is highly competitive among those published in the literature. This achievement, however, was not the main contribution of this paper. The goal of this research was to perform global and local interpretation of the intelligent model and derive valuable conclusions about the established hypotheses. These methods led to a single scheme which presents either the positive or negative influence of the values of each of the features whose importance was confirmed by means of Shapley values. This scheme might be considered an additional source of knowledge for physicians and other experts concerned with the exact diagnosis of early-stage Alzheimer's disease. The conclusions derived from the intelligent model's data-driven interpretability confronted all the established hypotheses. This research clearly shows the importance of an explainable machine learning approach that opens the black box and unveils the relationships among the features and the diagnoses.
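SHAP approximates Shapley values efficiently for tree ensembles; for a toy model, the exact definition (the average marginal contribution of a feature over all coalitions) can be computed directly. The model below is illustrative, not the paper's:

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance: average marginal
    contribution of each feature, with absent features set to a baseline."""
    n = len(x)

    def f(subset):
        z = list(baseline)
        for i in subset:
            z[i] = x[i]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (f(S + (i,)) - f(S))
        phi.append(total)
    return phi

# Toy risk model: risk = 2 * feature_0 + 3 * feature_1 (hypothetical)
predict = lambda z: 2 * z[0] + 3 * z[1]
phi = shapley_values(predict, x=[1.0, 1.0], baseline=[0.0, 0.0])
```

For a linear model the attributions reduce to coefficient times deviation from baseline, and they sum to the prediction gap (the efficiency property SHAP relies on).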
|
266
|
Ping Z, Chen S, Zhou G, Huang X, Zhu SJ, Zhang H, Lee HH, Lan Z, Cui J, Chen T, Zhang W, Yang H, Xu X, Church GM, Shen Y. Towards practical and robust DNA-based data archiving using the yin-yang codec system. Nat Comput Sci 2022; 2:234-242. [PMID: 38177542 PMCID: PMC10766522 DOI: 10.1038/s43588-022-00231-2] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/18/2021] [Accepted: 03/18/2022] [Indexed: 01/06/2024]
Abstract
DNA is a promising data storage medium due to its remarkable durability and space-efficient storage. Early bit-to-base transcoding schemes have primarily pursued information density, at the expense of introducing biocompatibility challenges or decoding failure. Here we propose a robust transcoding algorithm named the yin-yang codec, using two rules to encode two binary bits into one nucleotide, to generate DNA sequences that are highly compatible with synthesis and sequencing technologies. We encoded two representative file formats and stored them in vitro as 200 nt oligo pools and in vivo as a ~54 kbp DNA fragment in yeast cells. Sequencing results show that the yin-yang codec exhibits high robustness and reliability for a wide variety of data types, with an average recovery rate of 99.9% above 10⁴ molecule copies and an achieved recovery rate of 87.53% at ≤10² copies. Additionally, the in vivo storage demonstration achieved an experimentally measured physical density close to the theoretical maximum.
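The published rule tables are more elaborate, but the two-rule idea (one bit chooses the base class, the second bit plus the previous base chooses within the class) can be sketched as follows. This simplified codec is an assumption for illustration, not the actual yin-yang rules:

```python
# Illustrative two-rule codec: rule 1, the "yang" bit, selects purine {A,G}
# vs pyrimidine {C,T}; rule 2, the "yin" bit, selects within the pair,
# conditioned on the previous base so bit runs do not force homopolymers.
PURINE, PYRIMIDINE = ("A", "G"), ("C", "T")

def encode(yang_bits, yin_bits):
    seq, prev = [], "A"
    for b1, b2 in zip(yang_bits, yin_bits):
        pair = PURINE if b1 == "0" else PYRIMIDINE
        flip = int(b2) ^ (prev in ("G", "T"))   # condition on previous base
        nt = pair[flip]
        seq.append(nt)
        prev = nt
    return "".join(seq)

def decode(seq):
    yang, yin, prev = [], [], "A"
    for nt in seq:
        yang.append("0" if nt in PURINE else "1")
        yin.append(str(int((nt in ("G", "T")) ^ (prev in ("G", "T")))))
        prev = nt
    return "".join(yang), "".join(yin)

dna = encode("0101", "0011")   # two 4-bit streams -> four nucleotides
```

Two bits per nucleotide is the same information density the abstract cites; the conditioning step hints at how the real codec steers sequences toward synthesis-friendly compositions.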
|
267
|
Nordin ND, Abdullah F, Zan MSD, A Bakar AA, Krivosheev AI, Barkov FL, Konstantinov YA. Improving Prediction Accuracy and Extraction Precision of Frequency Shift from Low-SNR Brillouin Gain Spectra in Distributed Structural Health Monitoring. Sensors (Basel) 2022; 22:s22072677. [PMID: 35408291 PMCID: PMC9003443 DOI: 10.3390/s22072677] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2022] [Revised: 03/25/2022] [Accepted: 03/25/2022] [Indexed: 02/04/2023]
Abstract
In this paper, we studied the possibility of increasing the Brillouin frequency shift (BFS) detection accuracy of distributed fibre-optic sensors through the separate and joint use of different algorithms for finding the spectral maximum: Lorentzian curve fitting (LCF, including the Levenberg–Marquardt (LM) method), the backward correlation technique (BWC) and a machine learning algorithm, the generalized linear model (GLM). The study was carried out on real spectra subjected to the subsequent addition of extreme digital noise. The precision and accuracy of the LM and BWC methods were studied by varying the signal-to-noise ratio (SNR) and by incorporating the GLM method into the processing steps. We found that applying the methods in sequence improves the accuracy of the derived sensor temperature by tenths of a degree to several degrees Celsius (or, on the BFS scale, by MHz), an effect manifested at signal-to-noise ratios between 0 and 20 dB. Double processing (BWC + GLM) is more effective for positive SNR values (in dB), giving a gain in BFS measurement precision of about 0.4 °C (428 kHz or 9.3 με); for BWC + GLM, the difference in precision between single and double processing for SNRs below 2.6 dB is about 1.5 °C (1.6 MHz or 35 με). In this case, double processing is more effective at all SNRs. The technique's potential applications in structural health monitoring (SHM) of concrete structures and in other areas of metrology and sensing are also discussed.
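Lorentzian curve fitting of a noisy Brillouin gain spectrum can be sketched with a standard nonlinear least-squares call (scipy's `curve_fit` defaults to a Levenberg–Marquardt-style solver for unconstrained problems); the spectrum below is simulated, not the paper's data:

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(f, f_b, gamma, a):
    """Brillouin gain profile: peak at the BFS f_b with linewidth gamma."""
    return a * (gamma / 2) ** 2 / ((f - f_b) ** 2 + (gamma / 2) ** 2)

rng = np.random.default_rng(0)
f = np.linspace(10.6, 11.0, 400)               # GHz scan around the BFS
true_bfs = 10.82
signal = lorentzian(f, true_bfs, 0.03, 1.0)
noisy = signal + rng.normal(0, 0.15, f.size)   # low-SNR spectrum

p0 = [f[np.argmax(noisy)], 0.05, 1.0]          # peak-based initial guess
popt, _ = curve_fit(lorentzian, f, noisy, p0=p0)
bfs_est = float(popt[0])
```

A correlation-based pre-estimate of the peak position (as in BWC) makes a good initial guess, which is one reason chaining the methods helps at low SNR.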
|
268
|
Zhang X, Jenkins GJ, Hakim CH, Duan D, Yao G. Four-limb wireless IMU sensor system for automatic gait detection in canines. Sci Rep 2022; 12:4788. [PMID: 35314731 PMCID: PMC8938443 DOI: 10.1038/s41598-022-08676-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/30/2021] [Accepted: 03/10/2022] [Indexed: 12/24/2022] Open
Abstract
This study aims to develop a 4-limb canine gait analysis system using wireless inertial measurement units (IMUs). 3D-printed sensor holders were designed to ensure quick and consistent sensor mounting. Signal analysis algorithms were developed to automatically determine the timing of swing start and end in a stride. To evaluate the accuracy of the new system, a synchronized study was conducted in which stride parameters in four dogs were measured simultaneously using the 4-limb IMU system and a pressure-sensor-based walkway gait system. The results showed that stride parameters measured by the two systems were highly correlated. Bland-Altman analyses revealed a nominal mean measurement bias between the two systems in both forelimbs and hindlimbs. Overall, the disagreement between the two systems was less than 10% of the mean value in over 92% of the data points acquired from forelimbs. The same performance was observed in hindlimbs except for one parameter, due to small mean values. We demonstrated that this 4-limb system could successfully visualize overall gait types and identify rapid gait changes in dogs. This method provides an effective, low-cost tool for gait studies in veterinary applications and in translational studies using dog models of neuromuscular diseases.
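Automatic swing detection from IMU data can be sketched as a threshold on angular-rate magnitude (a simplified stand-in for the paper's algorithms; the signal below is synthetic):

```python
import numpy as np

def detect_swings(gyro_mag, threshold):
    """Return (start, end) sample indices of swing phases, defined as runs
    where angular-rate magnitude exceeds the threshold. Assumes the signal
    starts and ends below the threshold (stance at both ends)."""
    active = (np.asarray(gyro_mag) > threshold).astype(int)
    d = np.diff(active)
    starts = np.flatnonzero(d == 1) + 1    # rising edges: swing start
    ends = np.flatnonzero(d == -1) + 1     # falling edges: swing end
    return list(zip(starts, ends))

fs = 100                                   # Hz
phase = np.arange(4 * fs) % fs             # four 1-s strides
gyro_mag = np.where((phase >= 30) & (phase < 70), 150.0, 10.0)  # deg/s
swings = detect_swings(gyro_mag, threshold=50.0)
durations_s = [(e - s) / fs for s, e in swings]
```

Per-limb swing timings from all four sensors can then be combined to derive stride time, stance/swing ratio and gait type.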
Collapse
|
269
|
Wagner AS, Waite LK, Wierzba M, Hoffstaedter F, Waite AQ, Poldrack B, Eickhoff SB, Hanke M. FAIRly big: A framework for computationally reproducible processing of large-scale data. Sci Data 2022; 9:80. [PMID: 35277501 PMCID: PMC8917149 DOI: 10.1038/s41597-022-01163-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2021] [Accepted: 02/11/2022] [Indexed: 11/30/2022] Open
Abstract
Large-scale datasets present unique opportunities to perform scientific investigations with unprecedented breadth. However, they also pose considerable challenges for the findability, accessibility, interoperability, and reusability (FAIR) of research outcomes due to infrastructure limitations, data usage constraints, or software license restrictions. Here we introduce a DataLad-based, domain-agnostic framework suitable for reproducible data processing in compliance with open science mandates. The framework attempts to minimize platform idiosyncrasies and performance-related complexities. It affords the capture of machine-actionable computational provenance records that can be used to retrace and verify the origins of research outcomes, as well as be re-executed independent of the original computing infrastructure. We demonstrate the framework's performance using two showcases: one highlighting data sharing and transparency (using the studyforrest.org dataset) and another highlighting scalability (using the largest public brain imaging dataset available: the UK Biobank dataset).
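The idea of a machine-actionable provenance record can be illustrated with a toy analogue (this is not DataLad's actual format or API): run a command and store the command line together with checksums of its inputs and outputs, so the step can later be verified or re-executed elsewhere.

```python
import hashlib
import json
import subprocess
from pathlib import Path

def sha256(path):
    """Content checksum used to pin a file's state in the record."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def run_with_provenance(cmd, inputs, outputs, record_file="record.json"):
    """Run a shell command and keep a machine-actionable record of it:
    the command line plus checksums of its inputs and outputs."""
    record = {"cmd": cmd, "inputs": {p: sha256(p) for p in inputs}}
    subprocess.run(cmd, shell=True, check=True)
    record["outputs"] = {p: sha256(p) for p in outputs}
    Path(record_file).write_text(json.dumps(record, indent=2))
    return record

# toy example: POSIX `cp` stands in for a real processing step
Path("in.txt").write_text("raw data\n")
rec = run_with_provenance("cp in.txt out.txt", ["in.txt"], ["out.txt"])
```

DataLad's `datalad run` captures this kind of record inside version control, which is what makes the framework's results re-executable independent of the original infrastructure.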
Collapse
|
270
|
Raredon MSB, Yang J, Garritano J, Wang M, Kushnir D, Schupp JC, Adams TS, Greaney AM, Leiby KL, Kaminski N, Kluger Y, Levchenko A, Niklason LE. Computation and visualization of cell-cell signaling topologies in single-cell systems data using Connectome. Sci Rep 2022; 12:4187. [PMID: 35264704 PMCID: PMC8906120 DOI: 10.1038/s41598-022-07959-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2021] [Accepted: 02/28/2022] [Indexed: 12/11/2022] Open
Abstract
Single-cell RNA-sequencing data has revolutionized our ability to understand the patterns of cell-cell and ligand-receptor connectivity that influence the function of tissues and organs. However, the quantification and visualization of these patterns in a way that informs tissue biology are major computational and epistemological challenges. Here, we present Connectome, a software package for R which facilitates rapid calculation and interactive exploration of cell-cell signaling network topologies contained in single-cell RNA-sequencing data. Connectome can be used with any reference set of known ligand-receptor mechanisms. It has built-in functionality to facilitate differential and comparative connectomics, in which signaling networks are compared between tissue systems. Connectome focuses on computational and graphical tools designed to analyze and explore cell-cell connectivity patterns across disparate single-cell datasets and reveal biologic insight. We present approaches to quantify focused network topologies and discuss some of the biologic theory leading to their design.
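The basic quantity behind such a cell-cell signaling network can be sketched generically (Connectome itself is an R package; the cell types, genes, and expression values below are invented): score each directed edge for one ligand-receptor pair as ligand expression in the sender times receptor expression in the receiver.

```python
import numpy as np

# toy mean-expression table: one row of values per gene, one column per cell type
cell_types = ["Fibroblast", "Endothelial", "Macrophage"]
genes = {"VEGFA": [2.0, 0.1, 0.5],   # ligand
         "KDR":   [0.0, 3.0, 0.1]}   # its receptor

def edge_weights(ligand, receptor):
    """Score every directed sender->receiver edge for one ligand-receptor
    pair as sender ligand expression times receiver receptor expression
    (one simple convention among several possible scalings)."""
    lig = np.array(genes[ligand])
    rec = np.array(genes[receptor])
    return {(s, r): lig[i] * rec[j]
            for i, s in enumerate(cell_types)
            for j, r in enumerate(cell_types)}

w = edge_weights("VEGFA", "KDR")
```

Collecting these weights over a full reference set of ligand-receptor mechanisms yields the network topology that a tool like Connectome then filters, compares, and visualizes.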
Collapse
|
271
|
Charnaud S, Munro JE, Semenec L, Mazhari R, Brewster J, Bourke C, Ruybal-Pesántez S, James R, Lautu-Gumal D, Karunajeewa H, Mueller I, Bahlo M. PacBio long-read amplicon sequencing enables scalable high-resolution population allele typing of the complex CYP2D6 locus. Commun Biol 2022; 5:168. [PMID: 35217695 PMCID: PMC8881578 DOI: 10.1038/s42003-022-03102-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2021] [Accepted: 02/01/2022] [Indexed: 01/31/2023] Open
Abstract
The CYP2D6 enzyme is estimated to metabolize 25% of commonly used pharmaceuticals and is of intense pharmacogenetic interest due to the polymorphic nature of the CYP2D6 gene. Accurate allele typing of CYP2D6 has proved challenging due to frequent copy number variants (CNVs) and paralogous pseudogenes. SNP-arrays, qPCR and short-read sequencing have been employed to interrogate CYP2D6; however, these technologies are unable to capture longer-range information. Long-read sequencing using the PacBio Single Molecule Real Time (SMRT) sequencing platform has yielded promising results for CYP2D6 allele typing. However, previous studies have been limited in scale and have employed nascent data processing pipelines. We present a robust data processing pipeline, "PLASTER", for accurate allele typing of SMRT-sequenced amplicons. We demonstrate the pipeline by typing CYP2D6 alleles in a large cohort of 377 Solomon Islanders. This pharmacogenetic method will improve drug safety and efficacy through screening prior to drug administration.
Collapse
|
272
|
Fleischer CE. A data processing approach with built-in spatial resolution reduction methods to construct energy system models. OPEN RESEARCH EUROPE 2022; 1:36. [PMID: 37645144 PMCID: PMC10446009 DOI: 10.12688/openreseurope.13420.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 02/02/2022] [Indexed: 08/31/2023]
Abstract
Introduction: Data processing is a crucial step in energy system modelling, preparing input data from various sources into the format needed to formulate a model. Multiple open-source web-hosted databases offer pre-processed input data within the European context. However, the number of documented open-source data processing workflows that allow for the construction of energy system models with specified spatial resolution reduction methods is still limited. Methods: The first step of the data processing method builds a dataset using web-hosted pre-processed data and open-source software. The second step aggregates the dataset using a specified spatial aggregation method. The spatially aggregated dataset is used as input data to construct sector-coupled energy system models. Results: To demonstrate the data processing approach, three power and heat optimisation models of Germany were constructed using it. Significant variation in the generation, transmission and storage capacity of electricity was observed between the optimisation results of the energy system models. Conclusions: This paper presents a novel data processing approach to construct sector-coupled energy system models with integrated spatial aggregation methods.
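The spatial aggregation step can be illustrated with a minimal, hypothetical sketch: given a node-to-region mapping, sum nodal demand within each region, sum the capacity of transmission lines crossing a region border, and drop intra-region lines. The node names, numbers, and aggregation convention are illustrative, not the paper's.

```python
from collections import defaultdict

# toy node-level data: node-to-region mapping, per-node demand (GWh),
# and transmission lines as (node_a, node_b, capacity_GW)
node_region = {"N1": "North", "N2": "North", "N3": "South"}
demand = {"N1": 10.0, "N2": 5.0, "N3": 7.0}
lines = [("N1", "N3", 2.0), ("N2", "N3", 1.5), ("N1", "N2", 4.0)]

def aggregate(node_region, demand, lines):
    """Collapse nodes into regions: sum demand within a region, sum the
    capacity of cross-border lines, and drop intra-region lines."""
    agg_demand = defaultdict(float)
    for node, d in demand.items():
        agg_demand[node_region[node]] += d
    agg_lines = defaultdict(float)
    for a, b, cap in lines:
        ra, rb = node_region[a], node_region[b]
        if ra != rb:                           # keep only border-crossing lines
            agg_lines[tuple(sorted((ra, rb)))] += cap
    return dict(agg_demand), dict(agg_lines)

regions, corridors = aggregate(node_region, demand, lines)
```

Choices like this one (e.g. dropping intra-region lines versus converting them into a regional transfer limit) are exactly why different aggregation methods yield the capacity differences reported in the Results.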
Collapse
|
273
|
Li Y, Zaheri S, Nguyen K, Liu L, Hassanipour F, Pace BS, Bleris L. Machine learning-based approaches for identifying human blood cells harboring CRISPR-mediated fetal chromatin domain ablations. Sci Rep 2022; 12:1481. [PMID: 35087158 PMCID: PMC8795181 DOI: 10.1038/s41598-022-05575-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Accepted: 12/17/2021] [Indexed: 11/08/2022] Open
Abstract
Two common hemoglobinopathies, sickle cell disease (SCD) and β-thalassemia, arise from genetic mutations within the β-globin gene. In this work, we identified a 500-bp motif (Fetal Chromatin Domain, FCD) upstream of the human γ-globin locus and showed that the removal of this motif using CRISPR technology reactivates the expression of γ-globin. Next, we present two different cell morphology-based machine learning approaches that can be used to identify human blood cells (KU-812) that harbor CRISPR-mediated FCD genetic modifications. Three candidate models from the first approach, which uses a multilayer perceptron algorithm (MLP 20-26, MLP 26-18, and MLP 30-26) and flow cytometry-derived cellular data, yielded 0.83 precision, 0.80 recall, 0.82 accuracy, and 0.90 area under the ROC (receiver operating characteristic) curve when predicting the edited cells. In comparison, the candidate model from the second approach, which uses deep learning (T2D5) and DIC microscopy-derived imaging data, performed with lower accuracy (0.80) and ROC AUC (0.87). We envision that equivalent machine learning-based models can complement currently available genotyping protocols for specific genetic modifications that result in morphological changes in human cells.
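A minimal stand-in for the first approach, assuming scikit-learn and synthetic two-class "flow-cytometry-like" features (not the study's data): train an MLP whose hidden-layer sizes echo the "MLP 20-26" naming and score it on held-out cells.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# synthetic stand-in for flow-cytometry features: edited cells are shifted
# in feature space relative to unedited ones (4 invented channels)
rng = np.random.default_rng(0)
unedited = rng.normal(0.0, 1.0, (500, 4))
edited = rng.normal(1.5, 1.0, (500, 4))
X = np.vstack([unedited, edited])
y = np.array([0] * 500 + [1] * 500)          # 1 = CRISPR-edited

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# two hidden layers of 20 and 26 units, echoing the "MLP 20-26" naming
clf = MLPClassifier(hidden_layer_sizes=(20, 26), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

In the real setting the features would come from gated flow-cytometry measurements, and precision/recall/ROC AUC would be reported alongside accuracy as in the abstract.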
Collapse
|
274
|
Barr KB, Chiang N, Bertozzi AL, Gilles J, Osher SJ, Weiss PS. Extraction of Hidden Science from Nanoscale Images. THE JOURNAL OF PHYSICAL CHEMISTRY. C, NANOMATERIALS AND INTERFACES 2022; 126:3-13. [PMID: 35633819 PMCID: PMC9135097 DOI: 10.1021/acs.jpcc.1c08712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Scanning probe microscopies and spectroscopies enable investigation of surfaces and even buried interfaces down to the scale of chemical-bonding interactions, and this capability has been enhanced with the support of computational algorithms for data acquisition and image processing to explore physical, chemical, and biological phenomena. Here, we describe how scanning probe techniques have been enhanced by some of these recent algorithmic improvements. One improvement to the data acquisition algorithm is to advance beyond a simple rastering framework by using spirals at constant angular velocity and then switching to constant linear velocity, which limits the piezo creep and hysteresis issues seen in traditional acquisition methods. One can also use image-processing techniques to model the distortions that appear from tip motion effects and to correct these images. Another image-processing algorithm we discuss enables researchers to segment images by domains and subdomains, thereby highlighting reactive and interesting disordered sites at domain boundaries. Lastly, we discuss algorithms used to examine the dipole direction of individual molecules and surface domains, hydrogen-bonding interactions, and molecular tilt. The computational algorithms used for scanning probe techniques are still improving rapidly and are incorporating machine learning at the next level of iteration. That said, the algorithms are not yet able to perform live adjustments during data recording that could enhance the microscopy and spectroscopic imaging methods significantly.
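The hybrid spiral acquisition can be sketched with a toy trajectory generator (made-up parameters, not the published algorithm): drive an Archimedean spiral at constant angular velocity until the tip speed reaches a cap, then switch to constant linear velocity so the tip speed stays fixed.

```python
import numpy as np

def spiral_scan(b=1.0, omega=2.0, v_max=5.0, dt=1e-3, n=5000):
    """Archimedean spiral r = b*theta, driven at constant angular velocity
    until the tip speed reaches v_max, then at constant linear velocity.
    Returns arrays of x, y tip positions."""
    theta = 0.0
    xs, ys = [], []
    for _ in range(n):
        speed = b * omega * np.sqrt(1 + theta**2)  # tip speed in CAV mode
        if speed < v_max:
            theta += omega * dt                    # constant angular velocity
        else:
            # constant linear velocity: scale dtheta so |velocity| = v_max
            theta += v_max / (b * np.sqrt(1 + theta**2)) * dt
        r = b * theta
        xs.append(r * np.cos(theta))
        ys.append(r * np.sin(theta))
    return np.array(xs), np.array(ys)

x, y = spiral_scan()
```

Capping the tip speed in the outer turns is what keeps the piezo drive smooth; the inner turns stay at constant angular velocity, where the speed is naturally low.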
Collapse
|
275
|
Cresswell K, Domínguez Hernández A, Williams R, Sheikh A. Key Challenges and Opportunities for Cloud Technology in Health Care: Semistructured Interview Study. JMIR Hum Factors 2022; 9:e31246. [PMID: 34989688 PMCID: PMC8778568 DOI: 10.2196/31246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2021] [Revised: 09/14/2021] [Accepted: 10/02/2021] [Indexed: 01/27/2023] Open
Abstract
Background The use of cloud computing (involving storage and processing of data on the internet) in health care has increasingly been highlighted as having great potential in facilitating data-driven innovations. Although some provider organizations are reaping the benefits of using cloud providers to store and process their data, others are lagging behind. Objective We aim to explore the existing challenges and barriers to the use of cloud computing in health care settings and investigate how perceived risks can be addressed. Methods We conducted a qualitative case study of cloud computing in health care settings, interviewing a range of individuals with perspectives on supply, implementation, adoption, and integration of cloud technology. Data were collected through a series of in-depth semistructured interviews exploring current applications, implementation approaches, challenges encountered, and visions for the future. The interviews were transcribed and thematically analyzed using NVivo 12 (QSR International). We coded the data based on a sociotechnical coding framework developed in related work. Results We interviewed 23 individuals between September 2020 and November 2020, including professionals working across major cloud providers, health care provider organizations, innovators, small and medium-sized software vendors, and academic institutions. The participants were united by a common vision of a cloud-enabled ecosystem of applications and by drivers surrounding data-driven innovation. The identified barriers to progress included the cost of data migration and skill gaps to implement cloud technologies within provider organizations, the cultural shift required to move to externally hosted services, a lack of user pull as many benefits were not visible to those providing frontline care, and a lack of interoperability standards and central regulations. 
Conclusions Implementations need to be viewed as a digitally enabled transformation of services, driven by skill development, organizational change management, and user engagement, to facilitate the implementation and exploitation of cloud-based infrastructures and to maximize returns on investment.
Collapse
|