1
Rautiainen M, Nurk S, Walenz BP, Logsdon GA, Porubsky D, Rhie A, Eichler EE, Phillippy AM, Koren S. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat Biotechnol 2023; 41:1474-1482. [PMID: 36797493 PMCID: PMC10427740 DOI: 10.1038/s41587-023-01662-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 41.0] [Received: 06/24/2022] [Accepted: 01/03/2023] [Indexed: 02/18/2023]
Abstract
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
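As a rough illustration of the graph structure the abstract refers to, the Python sketch below builds a plain de Bruijn graph from a few toy reads. This is a simplification and not Verkko's multiplex de Bruijn graph: the function name `de_bruijn_graph`, the toy reads, and the single fixed `k` are assumptions made for this example only.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a plain de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Hypothetical helper for illustration; real assemblers also handle
    reverse complements, sequencing errors, and k-mer coverage.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-mer prefix to its (k-1)-mer suffix.
            graph[kmer[:-1]].add(kmer[1:])
    # Sort successors so the output is deterministic.
    return {node: sorted(successors) for node, successors in graph.items()}


if __name__ == "__main__":
    toy_reads = ["ACGT", "CGTA"]  # made-up reads that overlap by three bases
    for node, successors in de_bruijn_graph(toy_reads, k=3).items():
        print(node, "->", successors)
```

Overlapping reads share k-mers, so they collapse onto the same path in the graph; repeats longer than `k` create branches, which is the kind of ambiguity the abstract says ultra-long reads and haplotype-specific markers help resolve.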
Affiliation(s)
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Oxford Nanopore Technologies, Oxford, UK
- Brian P Walenz
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
2
Abstract
Why, when, and how do stereotypes change? This paper develops a computational account based on the principles of structure learning: stereotypes are governed by probabilistic beliefs about the assignment of individuals to groups. Two aspects of this account are particularly important. First, groups are flexibly constructed based on the distribution of traits across individuals; groups are not fixed, nor are they assumed to map on to categories we have to provide to the model. This allows the model to explain the phenomena of group discovery and subtyping, whereby deviant individuals are segregated from a group, thus protecting the group's stereotype. Second, groups are hierarchically structured, such that groups can be nested. This allows the model to explain the phenomenon of subgrouping, whereby a collection of deviant individuals is organized into a refinement of the superordinate group. The structure learning account also sheds light on several factors that determine stereotype change, including perceived group variability, individual typicality, cognitive load, and sample size.
Affiliation(s)
- Samuel J Gershman
  - Department of Psychology, Harvard University, Cambridge, MA, USA
  - Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
- Mina Cikara
  - Department of Psychology, Harvard University, Cambridge, MA, USA
3
Segal MR. Assessing chromatin relocalization in 3D using the patient rule induction method. Biostatistics 2023; 24:618-634. [PMID: 34494087 PMCID: PMC10449022 DOI: 10.1093/biostatistics/kxab033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2020] [Revised: 05/10/2021] [Accepted: 08/07/2021] [Indexed: 11/12/2022]
Abstract
Three-dimensional (3D) genome architecture is critical for numerous cellular processes, including transcription, while certain conformation-driven structural alterations are frequently oncogenic. Inferring 3D chromatin configurations has been advanced by the emergence of chromatin conformation capture assays, notably Hi-C, and attendant 3D reconstruction algorithms. These have enhanced understanding of chromatin spatial organization and afforded numerous downstream biological insights. Until recently, comparisons of 3D reconstructions between conditions and/or cell types were limited to prescribed structural features. However, multiMDS, a pioneering approach developed by Rieber and Mahony (2019) that performs joint reconstruction and alignment, enables quantification of all locus-specific differences between paired Hi-C data sets. By subsequently mapping these differences to the linear (1D) genome, the identification of relocalization regions is facilitated through the use of peak calling in conjunction with continuous wavelet transformation. Here, we seek to refine this approach by performing the search for significant relocalization regions in terms of the 3D structures themselves, thereby retaining the benefits of 3D reconstruction and avoiding limitations associated with the 1D perspective. The search for (extreme) relocalization regions is conducted using the patient rule induction method (PRIM). Considerations surrounding orienting structures with respect to compartmental and principal component axes are discussed, as are approaches to inference and reconstruction accuracy assessment. The illustration makes recourse to comparisons between four different cell types.
Affiliation(s)
- Mark R Segal
  - Department of Epidemiology and Biostatistics, University of California, 550 16th Street, San Francisco, CA 94143-0560, USA
4
Mirmohammadi SL, Ezazi F, Safdari J, Mallah MH. Comparison of the performance of optimal square, symmetric and asymmetric tapered cascades for production of enriched uranium for power reactors. ANN NUCL ENERGY 2023. [DOI: 10.1016/j.anucene.2023.109761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2023]
5
Shanmugam RK, Dhingra T. Outcome-based contracts – Linking technology, ownership and reputations. International Journal of Information Management 2023. [DOI: 10.1016/j.ijinfomgt.2023.102624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/29/2023]
6
Xie L, Wang D, Ma F. Analysis of individual characteristics influencing user polarization in COVID-19 vaccine hesitancy. Comput Human Behav 2023; 143:107649. [PMID: 36683861 PMCID: PMC9844095 DOI: 10.1016/j.chb.2022.107649] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/28/2022] [Revised: 12/25/2022] [Accepted: 12/31/2022] [Indexed: 01/18/2023]
Abstract
During the COVID-19 pandemic, vaccine hesitancy proved to be a major obstacle in efforts to control and mitigate the negative consequences of COVID-19. This study centered on the degree of polarization on social media about vaccine use and contributing factors to vaccine hesitancy among social media users. Examining the discussion about COVID-19 vaccine on the Weibo platform, a relatively comprehensive system of user features was constructed based on psychological theories and models such as the curiosity-drive theory and the big five model of personality. Then machine learning methods were used to explore the paramount impacting factors that led users into polarization. Findings revealed that factors reflecting the activity and effectiveness of social media use promoted user polarization. In contrast, features reflecting users' information processing ability and personal qualities had a negative impact on polarization. This study hopes to help healthcare organizations and governments understand and curb social media polarization around vaccine development in the face of future surges of pandemics.
Affiliation(s)
- Lei Xie
  - School of Information Management, Wuhan University, Wuhan, 430072, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, 430072, China
  - Big Data Institute, Wuhan University, Wuhan, 430072, China
- Dandan Wang
  - School of Information Management, Wuhan University, Wuhan, 430072, China
  - School of Data Science, City University of Hong Kong, Hong Kong, 999077, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, 430072, China
  - Big Data Institute, Wuhan University, Wuhan, 430072, China
- Feicheng Ma
  - School of Information Management, Wuhan University, Wuhan, 430072, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, 430072, China
  - Big Data Institute, Wuhan University, Wuhan, 430072, China
  - Corresponding author. School of Information Management, Wuhan University, Wuhan, China
7
Santus L, Garriga E, Deorowicz S, Gudyś A, Notredame C. Towards the accurate alignment of over a million protein sequences: Current state of the art. Curr Opin Struct Biol 2023; 80:102577. [PMID: 37012200 DOI: 10.1016/j.sbi.2023.102577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 02/21/2023] [Accepted: 02/27/2023] [Indexed: 04/04/2023]
Abstract
Large-scale genomics requires highly scalable and accurate multiple sequence alignment methods. Results collected over this last decade suggest accuracy loss when scaling up over a few thousand sequences. This issue has been actively addressed with a number of innovative algorithmic solutions that combine low-level hardware optimization with novel higher-level heuristics. This review provides an extensive critical overview of these recent methods. Using established reference datasets we conclude that albeit significant progress has been achieved, a unified framework able to consistently and efficiently produce high-accuracy large-scale multiple alignments is still lacking.
8
Banerjee S, Mukherjee S, Bandyopadhyay S, Pakray P. An extract-then-abstract based method to generate disaster-news headlines using a DNN extractor followed by a transformer abstractor. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2023.103291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
9
Li X, Li H, Gao J, Wang R. Privacy preserving via multi-key homomorphic encryption in cloud computing. Journal of Information Security and Applications 2023. [DOI: 10.1016/j.jisa.2023.103463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
10
Link S, Koehler H, Gandhi A, Hartmann S, Thalheim B. Cardinality constraints and functional dependencies in SQL: Taming data redundancy in logical database design. INFORM SYST 2023. [DOI: 10.1016/j.is.2023.102208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
11
Bing X, Bunea F, Wegkamp M. Detecting approximate replicate components of a high-dimensional random vector with latent structure. BERNOULLI 2023. [DOI: 10.3150/22-bej1502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Affiliation(s)
- Xin Bing
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Florentina Bunea
  - Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
- Marten Wegkamp
  - Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
12
Dowdle LT, Vizioli L, Moeller S, Akçakaya M, Olman C, Ghose G, Yacoub E, Uğurbil K. Evaluating increases in sensitivity from NORDIC for diverse fMRI acquisition strategies. Neuroimage 2023; 270:119949. [PMID: 36804422 DOI: 10.1016/j.neuroimage.2023.119949] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/17/2022] [Revised: 01/27/2023] [Accepted: 02/15/2023] [Indexed: 02/19/2023]
Abstract
As the neuroimaging field moves towards detecting smaller effects at higher spatial resolutions, and faster sampling rates, there is increased attention given to the deleterious contribution of unstructured, thermal noise. Here, we critically evaluate the performance of a recently developed reconstruction method, termed NORDIC, for suppressing thermal noise using datasets acquired with various field strengths, voxel sizes, sampling rates, and task designs. Following minimal preprocessing, statistical activation (t-values) of NORDIC processed data was compared to the results obtained with alternative denoising methods. Additionally, we examined the consistency of the estimates of task responses at the single-voxel, single run level, using a finite impulse response (FIR) model. To examine the potential impact on effective image resolution, the overall smoothness of the data processed with different methods was estimated. Finally, to determine if NORDIC alters or removes temporal information important for modeling responses, we employed an exhaustive leave-p-out cross validation approach, using FIR task responses to predict held out timeseries, quantified using R². After NORDIC, the t-values are increased, an improvement comparable to what could be achieved by 1.5 voxels smoothing, and task events are clearly visible and have less cross-run error. These advantages are achieved with smoothness estimates increasing by less than 4%, while 1.5 voxel smoothing is associated with increases of over 140%. Cross-validated R² values based on the FIR models show that NORDIC is not measurably distorting the temporal structure of the data under this approach and is the best predictor of non-denoised time courses. The results demonstrate that analyzing 1 run of data after NORDIC produces results equivalent to using 2 to 3 original runs and that NORDIC performs equally well across a diverse array of functional imaging protocols.
Significance Statement: For functional neuroimaging, the increasing availability of higher field strengths and ever higher spatiotemporal resolutions has led to concomitant increase in concerns about the deleterious effects of thermal noise. Historically this noise source was suppressed using methods that reduce spatial precision such as image blurring or averaging over a large number of trials or sessions, which necessitates large data collection efforts. Here, we critically evaluate the performance of a recently developed reconstruction method, termed NORDIC, which suppresses thermal noise. Across datasets varying in field strength, voxel sizes, sampling rates, and task designs, NORDIC produces substantial gains in data quality. Both conventional t-statistics derived from general linear models and coefficients of determination for predicting unseen data are improved. These gains match or even exceed those associated with 1 voxel Full Width Half Max image smoothing, however, even such small amounts of smoothing are associated with a 52% reduction in estimates of spatial precision, whereas the measurable difference in spatial precision is less than 4% following NORDIC.
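The exhaustive leave-p-out cross-validation mentioned in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `r_squared` and `leave_p_out_r2` are hypothetical, the toy time series are made up, and the prediction for held-out runs is simply the pointwise mean of the training runs, used here as a stand-in for the FIR-model estimates described in the abstract.

```python
from itertools import combinations


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def leave_p_out_r2(runs, p):
    """Average R^2 over every way of holding out p runs.

    Assumption: the prediction is the pointwise mean of the training runs,
    a placeholder for a fitted response model.
    """
    scores = []
    n = len(runs)
    for held_out in combinations(range(n), p):
        train = [runs[i] for i in range(n) if i not in held_out]
        prediction = [sum(values) / len(train) for values in zip(*train)]
        for i in held_out:
            scores.append(r_squared(runs[i], prediction))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    toy_runs = [  # made-up single-voxel time series, one list per run
        [0.0, 1.0, 0.2, 0.0],
        [0.1, 0.9, 0.3, 0.0],
        [0.0, 1.1, 0.1, 0.1],
    ]
    print(leave_p_out_r2(toy_runs, p=1))
```

Because every combination of held-out runs is evaluated, the number of splits grows combinatorially with the number of runs, which is why exhaustive leave-p-out is usually practical only for a small number of runs.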
Affiliation(s)
- Logan T Dowdle
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
  - Department of Neuroscience, University of Minnesota, Minneapolis, MN, United States
- Luca Vizioli
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
- Steen Moeller
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
- Mehmet Akçakaya
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
  - Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, United States
- Cheryl Olman
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
  - Department of Psychology, University of Minnesota, Minneapolis, MN, United States
- Geoffrey Ghose
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
  - Department of Neuroscience, University of Minnesota, Minneapolis, MN, United States
  - Department of Psychology, University of Minnesota, Minneapolis, MN, United States
- Essa Yacoub
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
- Kâmil Uğurbil
  - Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, 2021 6th Street SE, MN 55455, United States
13
Zhang H, Shi X. An Improved Quantum-Behaved Particle Swarm Optimization Algorithm Combined with Reinforcement Learning for AUV Path Planning. Journal of Robotics 2023. [DOI: 10.1155/2023/8821906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
To solve the problem of fast path planning and effective obstacle avoidance for autonomous underwater vehicles (AUVs) in a two-dimensional underwater environment, a path planning algorithm based on a deep Q-network and quantum-behaved particle swarm optimization (DQN-QPSO) is proposed. Five actions are defined first: normal, exploration, particle explode, random mutation, and fine-tuning operation. The DQN then selects among these five actions, and the position information of the particles is dynamically updated in each iteration according to the selected action. Finally, to account for the complexity of the underwater environment, the fitness function jointly considers route length, deflection angle, and the influence of ocean currents, so that the algorithm can find the path with the lowest energy consumption. Experimental results show that the DQN-QPSO algorithm is effective and that it outperforms traditional methods.
Affiliation(s)
- HanBin Zhang
  - National Deep Sea Center, Qingdao 266237, China
  - School of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266100, China
14
Martínez-López Y, Castillo-Garit JA, Casanola-Martin GM, Rasulev B, Rodríguez-Gonzalez AY, Martínez-Santiago O, Barigye SJ. Exploring proteasome inhibition using atomic weighted vector indices and machine learning approaches. Mol Divers 2023:10.1007/s11030-023-10638-2. [PMID: 37017875 DOI: 10.1007/s11030-023-10638-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Accepted: 03/17/2023] [Indexed: 04/06/2023]
Abstract
The ubiquitin-proteasome system (UPS) is a highly regulated mechanism of intracellular protein degradation and turnover. The UPS is involved in different biological activities, such as the regulation of gene transcription and cell cycle. Several researchers have applied cheminformatics and artificial intelligence methods to study the inhibition of proteasomes, including the prediction of UPP inhibitors. Following this idea, we applied a new tool for obtaining molecular descriptors (MDs) for modeling proteasome inhibition in terms of EC50 (µmol/L), in which a set of new MDs called atomic weighted vectors (AWV) and several prediction algorithms were used in cheminformatics studies. In the manuscript, a set of descriptors based on AWV are presented as datasets for training different machine learning techniques, such as linear regression, multiple linear regression (MLR), random forest (RF), K-nearest neighbors (IBK), multi-layer perceptron, best-first search, and genetic algorithm. The results suggest that these atomic descriptors allow adequate modeling of proteasome inhibitors despite artificial intelligence techniques, as a variant to build efficient models for the prediction of inhibitory activity.
Affiliation(s)
- Yoan Martínez-López
  - Department of Computer Sciences, Faculty of Informatics, Camagüey University, 74650, Camagüey City, Cuba
- Gerardo M Casanola-Martin
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND, 58102, USA
- Bakhtiyor Rasulev
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND, 58102, USA
- Ansel Y Rodríguez-Gonzalez
  - Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE-UT3), Unidad de Transferencia Tecnológica de Tepic, Tepic, México
- Oscar Martínez-Santiago
  - Alfa Vitamins Laboratories, Miami, FL, 33166, USA
  - Laboratorio de Bioinformática y Química Computacional, Universidad Católica del Maule, Talca, Chile
- Stephen J Barigye
  - Departamento de Química Física Aplicada, Facultad de Ciencias, Universidad Autónoma de Madrid (UAM), 28049, Madrid, Spain
15
Wan J, Zhang K, Guo Z, Miao D. A new clustering algorithm based on connectivity. APPL INTELL 2023. [DOI: 10.1007/s10489-023-04543-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
16
Wang H, Zhao L, He Y, Han Z, Li P. Fuzzy pushdown automata based on complete residuated lattices: variants and computing powers. Soft comput 2023. [DOI: 10.1007/s00500-023-08062-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
17
Medková J, Hynek J. HAkAu: hybrid algorithm for effective k-automorphism anonymization of social networks. Soc Netw Anal Min 2023. [DOI: 10.1007/s13278-023-01064-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
Abstract
Online social network datasets contain a large amount of various information about their users. Preserving users’ privacy while publishing or sharing datasets with third parties has become a challenging problem. The k-automorphism is the anonymization method that protects the social network dataset against any passive structural attack. It provides a higher level of protection than other k-anonymity methods, including k-degree or k-neighborhood techniques. In this paper, we propose a hybrid algorithm that effectively modifies the social network to the k-automorphism one. The proposed algorithm is based on the structure of the previously published k-automorphism KM algorithm. However, it solves the NP-hard subtask of finding isomorphic graph extensions with a genetic algorithm and employs the GraMi algorithm for finding frequent subgraphs. In the design of the genetic algorithm, we introduce the novel chromosome representation in which the length of the chromosome is independent of the size of the input network, and each individual in each generation leads to the k-automorphism solution. Moreover, we present a heuristic method for selecting the set of vertex disjoint subgraphs. To test the algorithm, we run experiments on a set of real social networks and use the SecGraph tool to evaluate our results in terms of protection against deanonymization attacks and preserving data utility. It makes our experimental results comparable with any future research.
18
Shutta KH, Balzer LB, Scholtens DM, Balasubramanian R. SpiderLearner: An ensemble approach to Gaussian graphical model estimation. Stat Med 2023; 42:2116-2133. [PMID: 37004994 DOI: 10.1002/sim.9714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Revised: 12/10/2022] [Accepted: 03/07/2023] [Indexed: 04/04/2023]
Abstract
Gaussian graphical models (GGMs) are a popular form of network model in which nodes represent features in multivariate normal data and edges reflect conditional dependencies between these features. GGM estimation is an active area of research. Currently available tools for GGM estimation require investigators to make several choices regarding algorithms, scoring criteria, and tuning parameters. An estimated GGM may be highly sensitive to these choices, and the accuracy of each method can vary based on structural characteristics of the network such as topology, degree distribution, and density. Because these characteristics are a priori unknown, it is not straightforward to establish universal guidelines for choosing a GGM estimation method. We address this problem by introducing SpiderLearner, an ensemble method that constructs a consensus network from multiple estimated GGMs. Given a set of candidate methods, SpiderLearner estimates the optimal convex combination of results from each method using a likelihood-based loss function. K-fold cross-validation is applied in this process, reducing the risk of overfitting. In simulations, SpiderLearner performs better than or comparably to the best candidate methods according to a variety of metrics, including relative Frobenius norm and out-of-sample likelihood. We apply SpiderLearner to publicly available ovarian cancer gene expression data including 2013 participants from 13 diverse studies, demonstrating our tool's potential to identify biomarkers of complex disease. SpiderLearner is implemented as flexible, extensible, open-source code in the R package ensembleGGM at https://github.com/katehoffshutta/ensembleGGM.
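To make the idea of a convex combination scored by a likelihood-based loss concrete, here is a minimal Python sketch. It does not use the ensembleGGM package and does not reproduce its method: the function names, the restriction to two candidate 2x2 precision matrices, the grid search over the weight, and the toy matrices are all assumptions for illustration. The loss is the standard Gaussian negative log-likelihood up to constants, -log det(Theta) + tr(S Theta), where S is a held-out sample covariance; the paper's exact loss may differ.

```python
import math


def combine(precisions, weights):
    """Convex combination of 2x2 precision matrices (weights sum to 1)."""
    return [
        [sum(w * p[i][j] for w, p in zip(weights, precisions)) for j in range(2)]
        for i in range(2)
    ]


def neg_log_lik(theta, s):
    """Gaussian negative log-likelihood up to constants, 2x2 case only."""
    det = theta[0][0] * theta[1][1] - theta[0][1] * theta[1][0]
    trace = sum(s[i][j] * theta[j][i] for i in range(2) for j in range(2))
    return -math.log(det) + trace


def best_weight(theta_a, theta_b, s_heldout, steps=100):
    """Grid-search the weight on theta_a that minimizes held-out loss."""
    candidates = []
    for step in range(steps + 1):
        w = step / steps
        theta = combine([theta_a, theta_b], [w, 1.0 - w])
        candidates.append((neg_log_lik(theta, s_heldout), w))
    loss, w = min(candidates)
    return w, loss


if __name__ == "__main__":
    # Made-up candidate estimates and held-out covariance.
    theta_a = [[1.0, 0.0], [0.0, 1.0]]
    theta_b = [[2.0, 0.0], [0.0, 2.0]]
    s_heldout = [[1.0, 0.0], [0.0, 1.0]]
    print(best_weight(theta_a, theta_b, s_heldout))
```

In a K-fold setting, each candidate would be fitted on the training folds and the weights chosen to minimize the loss on the held-out fold; a proper constrained optimizer would replace the grid search once there are more than two candidates.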
Affiliation(s)
- Katherine H Shutta
  - Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, Massachusetts, USA
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Laura B Balzer
  - Division of Biostatistics, University of California-Berkeley, Berkeley, California, USA
- Denise M Scholtens
  - Division of Biostatistics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Raji Balasubramanian
  - Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, Massachusetts, USA
19
Forati A, Ghose R, Mohebbi F, Mantsch JR. The journey to overdose: Using spatial social network analysis as a novel framework to study geographic discordance in overdose deaths. Drug Alcohol Depend 2023; 245:109827. [PMID: 36868092 DOI: 10.1016/j.drugalcdep.2023.109827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 02/17/2023] [Accepted: 02/21/2023] [Indexed: 03/05/2023]
Abstract
INTRODUCTION: Drug overdose deaths are often geographically discordant (the community in which the overdose death occurs is different from the community of residence). Thus, in many cases there is a journey to overdose. METHODS: We applied geospatial analysis to examine characteristics that define journeys to overdoses using Milwaukee, Wisconsin, a diverse and segregated metropolitan area in which 26.72% of overdose deaths are geographically discordant, as a case study. First, we deployed spatial social network analysis to identify hubs (census tracts that are focal points of geographically discordant overdoses) and authorities (the communities of residence from which journeys to overdose commonly begin) for overdose deaths and characterized them according to key demographics. Second, we used temporal trend analysis to identify communities that were consistent, sporadic, and emergent hotspots for overdose deaths. Third, we identified characteristics that differentiated discordant versus non-discordant overdose deaths. RESULTS: Authority communities had lower housing stability and were younger, more impoverished, and less educated relative to hubs and county-wide numbers. White communities were more likely to be hubs, while Hispanic communities were more likely to be authorities. Geographically discordant deaths more commonly involved fentanyl, cocaine, and amphetamines and were more likely to be accidental. Non-discordant deaths more commonly involved opioids other than fentanyl or heroin and were more likely to be the result of suicide. CONCLUSION: This study is the first to examine the journey to overdose and demonstrates that such analysis can be applied in metropolitan areas to better understand and guide community responses.
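Hub and authority scores on a directed network are commonly computed with the HITS algorithm; the abstract does not say which procedure the authors used or how their edges are oriented, so the Python sketch below is a generic illustration only. The function name `hits`, the toy edge list, and the node labels are assumptions, and no claim is made that this matches the study's network construction.

```python
def hits(edges, iterations=50):
    """Generic HITS power iteration on a directed edge list.

    A node's authority score sums the hub scores of nodes pointing to it;
    a node's hub score sums the authority scores of nodes it points to.
    Scores are L2-normalized after every update.
    """
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {node: v / norm for node, v in auth.items()}

        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        hub = {node: v / norm for node, v in new_hub.items()}
    return hub, auth


if __name__ == "__main__":
    # Made-up directed links between hypothetical census tracts.
    toy_edges = [("a", "x"), ("b", "x"), ("a", "y")]
    hub_scores, auth_scores = hits(toy_edges)
    print(sorted(hub_scores.items()))
    print(sorted(auth_scores.items()))
```

In this toy graph, node `a` links to both `x` and `y` and so receives the highest hub score, while `x` is linked from both `a` and `b` and receives the highest authority score.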
Affiliation(s)
- Amir Forati
- Department of Geography, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, USA
- Rina Ghose
- Department of Geography, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, USA
- Fahimeh Mohebbi
- Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, USA
- John R Mantsch
- Department of Pharmacology & Toxicology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
|
20
|
Mousavirad SJ, Alexandre LA. Energy-aware JPEG image compression: A multi-objective approach. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
|
21
|
Liu P, Xu Y, Liu J, Chen S, Cao F, Wu G. Fully reusing clause deduction algorithm based on standard contradiction separation rule. Inf Sci (N Y) 2023; 622:337-356. [DOI: 10.1016/j.ins.2022.11.128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
22
|
Chen Y, You J, He J, Lin Y, Peng Y, Wu C, Zhu Y. SP-GNN: Learning structure and position information from graphs. Neural Netw 2023; 161:505-514. [PMID: 36805265 DOI: 10.1016/j.neunet.2023.01.051] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/04/2022] [Revised: 11/30/2022] [Accepted: 01/31/2023] [Indexed: 02/07/2023]
Abstract
Graph neural network (GNN) is a powerful model for learning from graph data. However, existing GNNs may have limited expressive power, especially in terms of capturing adequate structural and positional information of input graphs. Structure properties and node position information are unique to graph-structured data, but few GNNs are capable of capturing them. This paper proposes Structure- and Position-aware Graph Neural Networks (SP-GNN), a new class of GNNs offering generic and expressive power of graph data. SP-GNN enhances the expressive power of GNN architectures by incorporating a near-isometric proximity-aware position encoder and a scalable structure encoder. Further, given a GNN learning task, SP-GNN can be used to analyze positional and structural awareness of GNN tasks using the corresponding embeddings computed by the encoders. The awareness scores can guide fusion strategies of the extracted positional and structural information with raw features for better performance of GNNs on downstream tasks. We conduct extensive experiments using SP-GNN on various graph datasets and observe significant improvement in classification over existing GNN models.
Affiliation(s)
- Yangrui Chen
- Department of Computer Science, University of Hong Kong, 999077, Hong Kong, China
- Jiaxuan You
- Department of Computer Science, Stanford University, Stanford, 94305, USA
- Jun He
- ByteDance Inc., Beijing, 100086, China
- Yuan Lin
- ByteDance Inc., Beijing, 100086, China
- Chuan Wu
- Department of Computer Science, University of Hong Kong, 999077, Hong Kong, China
- Yibo Zhu
- ByteDance Inc., Beijing, 100086, China
|
23
|
Zhang G, Wong HC, Zhu J, An T, Wang C. Jigsaw training-based background reverse attention transformer network for guidewire segmentation. Int J Comput Assist Radiol Surg 2023; 18:653-661. [PMID: 36469214 DOI: 10.1007/s11548-022-02803-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2022] [Accepted: 11/21/2022] [Indexed: 12/12/2022]
Abstract
PURPOSE Guidewire segmentation plays a crucial role in percutaneous coronary intervention. However, it is a challenging task due to the low signal-to-noise ratio of X-ray sequences and the great imbalance between the number of foreground and background pixels. Besides, most existing guidewire segmentation methods are designed for single guidewire segmentation. This paper aims to solve the task of single and dual guidewire segmentation in X-ray fluoroscopy sequences. METHODS A jigsaw training-based background reverse attention (BRA) transformer network is proposed. A jigsaw training strategy is used to train the guidewire segmentation network. A BRA module is also designed to reduce the influence of background information. First, robust principal component is conducted to generate background maps for guidewire sequences. Then, BRA is computed on the basis of the background features. RESULTS The experimental results on the dataset collected from three hospitals show that the proposed method can achieve single and dual guidewire segmentation in X-ray fluoroscopy sequences. Higher F1 score and precision than state-of-the-art guidewire segmentation methods can be obtained in most cases. CONCLUSION The jigsaw training strategy helps reduce the need for dual guidewire data and improve the performance of the network. Our BRA module helps reduce the influence of background information and distinguish the guidewire. The proposed methods can obtain higher performance than state-of-the-art guidewire segmentation methods.
Affiliation(s)
- Guifang Zhang
- Hanglok-Tech Co., Ltd., Hengqin, China
- School of Information Management, Jiangxi University of Finance and Economics, Nanchang, China
- School of Computer Science and Engineering, Macau University of Science and Technology, Macao, China
- Hon-Cheng Wong
- School of Computer Science and Engineering, Macau University of Science and Technology, Macao, China
- Jianjun Zhu
- Hanglok-Tech Co., Ltd., Hengqin, China
- Zhongda Hospital Southeast University, Nanjing, China
- Tao An
- Zhuhai People's Hospital Medical Group, Zhuhai, China
|
24
|
Eo M, Kang S, Rhee W. An effective low-rank compression with a joint rank selection followed by a compression-friendly training. Neural Netw 2023; 161:165-177. [PMID: 36745941 DOI: 10.1016/j.neunet.2023.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 12/02/2022] [Accepted: 01/19/2023] [Indexed: 01/25/2023]
Abstract
Low-rank compression is a popular neural network compression technique with two main known challenges. The first is determining the optimal rank of every layer, and the second is training the neural network into a compression-friendly form. To overcome both challenges, we propose BSR (Beam-search and Stable Rank), a low-rank compression algorithm that combines an efficient rank-selection method with a unique compression-friendly training method. For rank selection, BSR employs a modified beam search that jointly optimizes the rank allocations over all layers, in contrast to previously used heuristic methods. For compression-friendly training, BSR adopts a regularization loss derived from a modified stable rank, which can control the rank while causing almost no loss in performance. Experimental results confirm that BSR is effective and superior to existing low-rank compression methods. For CIFAR10 on ResNet56, BSR not only achieves compression but also improves on the baseline model's performance for compression ratios of up to 0.82. For CIFAR100 on ResNet56 and ImageNet on AlexNet, BSR outperforms the previous SOTA method, LC, by 4.7% and 6.7% on average, respectively. BSR is also effective for EfficientNet-B0 and MobileNetV2, which are known for their efficient design in terms of parameters and computational cost. We also show that BSR performs competitively with recent pruning compression algorithms. As with pruning, BSR can easily be combined with quantization for additional compression.
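For reference, the standard (unmodified) stable rank of a weight matrix $W$ with singular values $\sigma_1 \ge \sigma_2 \ge \dots$ is given below. The abstract does not say how BSR modifies this quantity, so the block is only the textbook definition, with the symbols $W$ and $\sigma_i$ introduced here for illustration.

```latex
\operatorname{srank}(W)
  = \frac{\lVert W \rVert_F^2}{\lVert W \rVert_2^2}
  = \frac{\sum_i \sigma_i^2}{\sigma_1^2},
\qquad
1 \le \operatorname{srank}(W) \le \operatorname{rank}(W).
```

Because the stable rank depends continuously on the singular values, unlike the rank itself, it is a natural candidate for a differentiable regularization term.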
Affiliation(s)
- Moonjung Eo
- Department of Intelligence and Information, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, South Korea
- Suhyun Kang
- Department of Intelligence and Information, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, South Korea
- Wonjong Rhee
- Department of Intelligence and Information, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, South Korea; Interdisciplinary Program in Artificial Intelligence, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, South Korea; AI Institute, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, South Korea
|
25
|
Liu Y, Meitei OR, Chin ZE, Dutt A, Tao M, Chuang IL, Van Voorhis T. Bootstrap Embedding on a Quantum Computer. J Chem Theory Comput 2023; 19:2230-2247. [PMID: 37001026 DOI: 10.1021/acs.jctc.3c00012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
Abstract
We extend molecular bootstrap embedding to make it appropriate for implementation on a quantum computer. This enables solution of the electronic structure problem of a large molecule as an optimization problem for a composite Lagrangian governing fragments of the total system, in such a way that fragment solutions can harness the capabilities of quantum computers. By employing state-of-the-art quantum subroutines including the quantum SWAP test and quantum amplitude amplification, we show how a quadratic speedup can be obtained over the classical algorithm, in principle. Utilization of quantum computation also allows the algorithm to match, at little additional computational cost, full density matrices at fragment boundaries, instead of being limited to 1-RDMs. Current quantum computers are small, but quantum bootstrap embedding provides a potentially generalizable strategy for harnessing such small machines through quantum fragment matching.
Affiliation(s)
- Yuan Liu
- Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Oinam R. Meitei
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Zachary E. Chin
- Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Arkopal Dutt
- Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Max Tao
- Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Isaac L. Chuang
- Department of Physics, Co-Design Center for Quantum Advantage, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Troy Van Voorhis
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
26
|
Chakraborty S, Xu J. Biconvex Clustering. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2197474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
|
27
|
Dai C, Zhou D, Gao B, Wang K. A new method for the joint estimation of instantaneous reproductive number and serial interval during epidemics. PLoS Comput Biol 2023; 19:e1011021. [PMID: 37000844 PMCID: PMC10096265 DOI: 10.1371/journal.pcbi.1011021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 04/12/2023] [Accepted: 03/09/2023] [Indexed: 04/03/2023] Open
Abstract
Although some methods for estimating the instantaneous reproductive number during epidemics have been developed, the existing frameworks usually require information on the distribution of the serial interval and/or additional contact tracing data. However, in the case of outbreaks of emerging infectious diseases with an unknown natural history or undetermined characteristics, the serial interval and/or contact tracing data are often not available, resulting in inaccurate estimates for this quantity. In the present study, a new framework was specifically designed for joint estimates of the instantaneous reproductive number and serial interval. Concretely, a likelihood function for the two quantities was first introduced. Then, the instantaneous reproductive number and the serial interval were modeled parametrically as a function of time using the interpolation method and a known traditional distribution, respectively. Using the Bayesian information criterion and the Markov Chain Monte Carlo method, we ultimately obtained their estimates and distribution. The simulation study revealed that our estimates of the two quantities were consistent with the ground truth. Seven data sets of historical epidemics were considered and further verified the robust performance of our method. Therefore, to some extent, even if we know only the daily incidence, our method can accurately estimate the instantaneous reproductive number and serial interval to provide crucial information for policymakers to design appropriate prevention and control interventions during epidemics.
|
28
|
Sutcliffe G, Desharnais M. The 11th IJCAR automated theorem proving system competition – CASC-J11. AI COMMUN 2023. [DOI: 10.3233/aic-220244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
Abstract
The CADE ATP System Competition (CASC) is the annual evaluation of fully automatic, classical logic, Automated Theorem Proving (ATP) systems. CASC-J11 was the twenty-seventh competition in the CASC series. Twenty-four ATP systems competed in the various competition divisions. This paper presents an outline of the competition design and a commentated summary of the results.
|
29
|
Janson S. Asymptotic normality for $m$-dependent and constrained $U$-statistics, with applications to pattern matching in random strings and permutations. ADV APPL PROBAB 2023. [DOI: 10.1017/apr.2022.51] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
Abstract
We study (asymmetric) $U$-statistics based on a stationary sequence of $m$-dependent variables; moreover, we consider constrained $U$-statistics, where the defining multiple sum only includes terms satisfying some restrictions on the gaps between indices. Results include a law of large numbers and a central limit theorem, together with results on rate of convergence, moment convergence, functional convergence, and a renewal theory version.
Special attention is paid to degenerate cases where, after the standard normalization, the asymptotic variance vanishes; in these cases non-normal limits occur after a different normalization.
The results are motivated by applications to pattern matching in random strings and permutations. We obtain both new results and new proofs of old results.
|
30
|
Ahn J. Efficient Sender-Based Message Logging Tolerating Simultaneous Failures with Always No Rollback Property. Symmetry (Basel) 2023. [DOI: 10.3390/sym15040816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023] Open
Abstract
Most existing sender-based message logging protocols cannot handle simultaneous failures because, if both the sender and the receiver(s) of a message fail together, the receiver(s) cannot obtain the recovery information of the message. This situation arises from their asymmetric logging behavior. This paper presents a novel sender-based message logging protocol for broadcast-network-based distributed systems that overcomes this critical constraint of the previous protocols through the following three features. First, when more than one process crashes at the same time, the protocol ensures the always-no-rollback property by symmetrically replicating the recovery information at each process or group member connected to the network. Second, the first feature persists even if communication in the system is a combination of point-to-point and group communication. Third, the communication overhead resulting from the replication is greatly reduced by making full use of the capability of the standard broadcast network in both communication modes. Experimental results verify that, regardless of the communication pattern applied, the protocol reduces the total application execution time by about 4.23% to 9.96% compared with the latest protocol that enables traditional ones to cope with simultaneous failures.
|
31
|
Ryšavý P, Železný F. Reference-free phylogeny from sequencing data. BioData Min 2023; 16:13. [PMID: 36973746 PMCID: PMC10045052 DOI: 10.1186/s13040-023-00329-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 03/09/2023] [Indexed: 03/29/2023] Open
Abstract
Motivation
Clustering of genetic sequences is one of the key parts of bioinformatics analyses. Resulting phylogenetic trees are beneficial for solving many research questions, including tracing the history of species, studying migration in the past, or tracing a source of a virus outbreak. At the same time, biologists provide more data in the raw form of reads or only on contig-level assembly. Therefore, tools that are able to process those data without supervision need to be developed.
Results
In this paper, we present a tool for reference-free phylogeny capable of handling data where no mature-level assembly is available. The tool allows distance calculation for raw reads, contigs, and the combination of the latter. The tool provides an estimation of the Levenshtein distance between the sequences, which in turn estimates the number of mutations between the organisms. Compared to the previous research, the novelty of the method lies in a newly proposed combination of the read and contig measures, a new method for read-contig mapping, and an efficient embedding of contigs.
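For orientation, the Levenshtein distance that the tool estimates can be computed exactly for short sequences with the classic dynamic program. The sketch below is only that textbook definition applied to toy strings; it is not the tool's read- or contig-based estimator, and the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Exact edit distance with unit-cost insertions, deletions and substitutions."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]
```

The exact computation takes time proportional to the product of the two sequence lengths, which is one reason an estimate is attractive when only raw reads or contigs are available.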
|
32
|
Reznikova Z. Information Theory Opens New Dimensions in Experimental Studies of Animal Behaviour and Communication. Animals (Basel) 2023; 13:ani13071174. [PMID: 37048430 PMCID: PMC10093743 DOI: 10.3390/ani13071174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 03/15/2023] [Accepted: 03/24/2023] [Indexed: 03/29/2023] Open
Abstract
Over the last 40–50 years, ethology has become increasingly quantitative and computational. However, when analysing animal behavioural sequences, researchers often need help finding an adequate model to assess certain characteristics of these sequences while using a relatively small number of parameters. In this review, I demonstrate that the information theory approaches based on Shannon entropy and Kolmogorov complexity can furnish effective tools to analyse and compare animal natural behaviours. In addition to a comparative analysis of stereotypic behavioural sequences, information theory can provide ideas for particular experiments on sophisticated animal communications. In particular, it has made it possible to discover the existence of a developed symbolic “language” in leader-scouting ant species based on the ability of these ants to transfer abstract information about remote events.
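As a small illustration of the Shannon-entropy side of this approach, the sketch below computes the empirical entropy of a behavioural sequence encoded as a string of symbols. The encoding (one letter per behavioural act) and the function name `shannon_entropy` are assumptions made for the example, not something the review specifies.

```python
from collections import Counter
from math import log2

def shannon_entropy(sequence) -> float:
    """Empirical Shannon entropy, in bits per symbol, of a sequence of hashable symbols."""
    counts = Counter(sequence)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1 / p) over the observed symbol frequencies p = c / total.
    return sum((c / total) * log2(total / c) for c in counts.values())

# Hypothetical encoded sequences: a fully stereotyped one and a more varied one.
stereotyped = "AAAAAAAA"
varied = "ABCDABCD"
```

A fully stereotyped sequence has zero entropy, while a sequence that uses four acts equally often carries two bits per symbol; comparing such values is one way to contrast behavioural repertoires with a single parameter.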
|
33
|
Turgut OE, Turgut MS, Kırtepe E. A systematic review of the emerging metaheuristic algorithms on solving complex optimization problems. Neural Comput Appl 2023. [DOI: 10.1007/s00521-023-08481-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
34
|
Abstract
We introduce Langevin sampling algorithms to field-theoretic simulations (FTSs) of polymers that, for the same accuracy, are ∼10× more efficient than a previously used Brownian dynamics algorithm that used predictor corrector for such simulations, over 10× more efficient than the smart Monte Carlo (SMC) algorithm, and typically over 1000× more efficient than a simple Monte Carlo (MC) algorithm. These algorithms are known as the Leimkuhler-Matthews (the BAOAB-limited) method and the BAOAB method. Furthermore, the FTS allows for an improved MC algorithm based on the Ornstein-Uhlenbeck process (OU MC), which is 2× more efficient than SMC. The system-size dependence of the efficiency for the sampling algorithms is presented, and it is shown that the aforementioned MC algorithms do not scale well with system sizes. Hence, for larger sizes, the efficiency difference between the Langevin and MC algorithms is even greater, although, for SMC and OU MC, the scaling is less unfavorable than for the simple MC.
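To make the BAOAB splitting concrete, here is a minimal sketch of one BAOAB step for a unit-mass particle in one dimension, applied to a toy harmonic potential. This is the generic particle form of the integrator, not the field-theoretic version used in the paper; the names `baoab_step` and `sample_positions` and the parameters `h`, `gamma` and `kT` are assumptions made for the example.

```python
import math
import random

def baoab_step(x, p, force, h, gamma, kT, rng):
    """One BAOAB step: B (half kick), A (half drift), O (exact
    Ornstein-Uhlenbeck momentum update), A (half drift), B (half kick)."""
    p += 0.5 * h * force(x)                       # B
    x += 0.5 * h * p                              # A
    c = math.exp(-gamma * h)                      # O: friction plus thermal noise
    p = c * p + math.sqrt(kT * (1.0 - c * c)) * rng.gauss(0.0, 1.0)
    x += 0.5 * h * p                              # A
    p += 0.5 * h * force(x)                       # B
    return x, p

def sample_positions(n_steps, h=0.2, gamma=1.0, kT=1.0, seed=0):
    """Run BAOAB on the toy potential U(x) = x^2 / 2 and return the visited positions."""
    rng = random.Random(seed)
    force = lambda x: -x
    x, p = 0.0, 0.0
    xs = []
    for _ in range(n_steps):
        x, p = baoab_step(x, p, force, h, gamma, kT, rng)
        xs.append(x)
    return xs
```

For this potential the sample variance of the positions should come out close to `kT`, which gives a quick sanity check on the step size.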
Affiliation(s)
- Bart Vorselaars
- School of Mathematics and Physics, University of Lincoln, Brayford Pool, Lincoln LN6 7TS, United Kingdom
|
35
|
Maleki A, Abbaspour J, Jowkar A, Sotudeh H. Role of citation and non-citation metrics in predicting the educational impact of textbooks. LHT 2023. [DOI: 10.1108/lht-06-2022-0297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
PURPOSE The main objective of the present study is to determine the role of citation-based metrics (PageRank and HITS’ authority and hub scores) and non-citation metrics (Goodreads readers, reviews and ratings, textbook edition counts) in predicting educational ranks of textbooks. DESIGN/METHODOLOGY/APPROACH The rankings of 1869 academic textbooks of various disciplines indexed in Scopus were extracted from the Open Syllabus Project (OSP) and compared with normalized counts of Scopus citations, scores of PageRank, authority and hub (HITS) in Scopus book-to-book citation network, Goodreads ratings and reviews, review sentiment scores and WorldCat book editions. FINDINGS Prediction of the educational rank of scholarly syllabus books ranged from 32% in technology to 68% in philosophy, psychology and religion. WorldCat editions in social sciences, medicine and technology, Goodreads ratings in humanities, and book-citation-network authority scores in law and political science accounted for the strongest predictions of the educational score. Thus, each indicator of editions, Goodreads ratings, and book citation authority score alone can be used to show the rank of the academic textbooks, and if used in combination, they will help explain the educational uptake of books even better. ORIGINALITY/VALUE This is the first study examining the role of citation indicators, Goodreads readers, reviews and ratings in predicting the OSP rank of academic books.
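The HITS hub and authority scores mentioned above can be sketched with a short power iteration. The example assumes a directed edge (a, b) means "book a cites book b"; the three-edge network and the function name `hits` are hypothetical and serve only to show how the two scores reinforce each other.

```python
def hits(edges, n_iter=50):
    """Return (hub, authority) score dicts for a directed graph given as (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(n_iter):
        # Authority score: sum of the hub scores of the nodes pointing to this node.
        auth = {n: 0.0 for n in nodes}
        for s, t in edges:
            auth[t] += hub[s]
        # Hub score: sum of the authority scores of the nodes this node points to.
        hub = {n: 0.0 for n in nodes}
        for s, t in edges:
            hub[s] += auth[t]
        # Normalise to unit Euclidean length so the scores stay bounded.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {n: v / a_norm for n, v in auth.items()}
        hub = {n: v / h_norm for n, v in hub.items()}
    return hub, auth

# Hypothetical toy citation network.
edges = [("A", "C"), ("B", "C"), ("A", "D")]
hub, auth = hits(edges)
```

In this toy network, book C is cited by both A and B and so receives the highest authority score, while A cites the most authoritative books and receives the highest hub score.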
|
36
|
Li L, Ren Y, Ma J. Flexible Hyperspectral Anomaly Detection Using Weighted Nuclear Norm. JACIII 2023. [DOI: 10.20965/jaciii.2023.p0243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2023]
Abstract
It has been demonstrated that nuclear-norm-based low-rank representation is capable of modeling cluttered backgrounds in hyperspectral images (HSIs) for robust anomaly detection. However, minimizing the nuclear norm regularizes each singular value equally during rank reduction, which restricts the capacity and flexibility of modeling the major structures of the background. To address this problem, we propose detection of anomaly pixels in HSIs using the weighted nuclear norm, which can preserve the major singular values during rank reduction. We present a down-up sampling scheme to remove plausible anomaly pixels from the image as much as possible and learn a robust principal component analysis (PCA) background dictionary. From a dictionary, we develop a weighted nuclear-norm minimization model to represent the background with a low-rank coefficients matrix that can be effectively optimized using the standard alternating direction method of multipliers (ADMM). Due to the flexible modeling capacity using the weighted nuclear norm, anomaly pixels can be distinguished from the background with the reconstruction error. The experimental results on two real HSIs datasets demonstrate the effectiveness of the proposed method for anomaly detection.
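The difference between the two regularizers can be stated compactly. With $\sigma_i(X)$ denoting the singular values of a coefficient matrix $X$ in decreasing order, the standard and weighted nuclear norms are as follows; the symbols $X$ and $w_i$ are introduced here for illustration, and the abstract does not state how the weights are chosen.

```latex
\lVert X \rVert_{*} = \sum_i \sigma_i(X),
\qquad
\lVert X \rVert_{w,*} = \sum_i w_i \, \sigma_i(X), \quad w_i \ge 0.
```

Setting every $w_i = 1$ recovers the standard nuclear norm, which penalizes all singular values equally. Assigning smaller weights to the larger singular values penalizes them less, which is one way the major structures of the background could be preserved during rank reduction.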
Affiliation(s)
- Lei Li
- Henan Province Engineering Technology Research Center of IIOT, No.1666 Dushi Road, Wancheng District, Nanyang, Henan 473000, China
- School of Electronic Information Engineering, Henan Polytechnic Institute, No.1666 Dushi Road, Wancheng District, Nanyang, Henan 473000, China
- Yuemei Ren
- Henan Province Engineering Technology Research Center of IIOT, No.1666 Dushi Road, Wancheng District, Nanyang, Henan 473000, China
- School of Electronic Information Engineering, Henan Polytechnic Institute, No.1666 Dushi Road, Wancheng District, Nanyang, Henan 473000, China
- Jinming Ma
- Artificial Intelligence School, Beijing University of Posts and Telecommunications, No.10 Xitucheng Road, Beijing 100876, China
|
37
|
Vidal A, Wu Fung S, Tenorio L, Osher S, Nurbekyan L. Taming hyperparameter tuning in continuous normalizing flows using the JKO scheme. Sci Rep 2023; 13:4501. [PMID: 36934141 PMCID: PMC10024737 DOI: 10.1038/s41598-023-31521-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 03/13/2023] [Indexed: 03/20/2023] Open
Abstract
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, [Formula: see text], that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm to solve OT-based CNF without the need of tuning [Formula: see text]. This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning [Formula: see text], we repeatedly solve the optimization problem for a fixed [Formula: see text] effectively performing a JKO update with a time-step [Formula: see text]. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large [Formula: see text].
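The change of variables formula referred to above can be written out as follows. The notation is assumed for illustration: $f$ is the flow mapping a data point $x$ to the normal space, $v$ is the velocity field of the neural ODE, and $T$ is the final time.

```latex
\log p_X(x) = \log \mathcal{N}\bigl(f(x);\, 0, I\bigr) + \log \bigl| \det \nabla f(x) \bigr|,
\qquad
\log \bigl| \det \nabla f(x) \bigr| = \int_0^T \operatorname{tr}\bigl( \nabla_z v(z(t), t) \bigr)\, dt,
```

where $z(t)$ solves $\dot z = v(z, t)$ with $z(0) = x$ and $f(x) = z(T)$. The second identity is what makes the continuous formulation tractable: the log-determinant is accumulated along the trajectory as a trace integral instead of being computed from a full Jacobian.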
Affiliation(s)
- Alexander Vidal
- Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, USA
- Samy Wu Fung
- Department of Applied Mathematics and Statistics, Department of Computer Science, Colorado School of Mines, Golden, USA
- Luis Tenorio
- Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, USA
- Stanley Osher
- Department of Mathematics, University of California, Los Angeles, USA
- Levon Nurbekyan
- Department of Mathematics, University of California, Los Angeles, USA
|
38
|
Fassò A, Rodeschini J, Moro AF, Shaboviq Q, Maranzano P, Cameletti M, Finazzi F, Golini N, Ignaccolo R, Otto P. Agrimonia: a dataset on livestock, meteorology and air quality in the Lombardy region, Italy. Sci Data 2023; 10:143. [PMID: 36934159 PMCID: PMC10024000 DOI: 10.1038/s41597-023-02034-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 02/20/2023] [Indexed: 03/20/2023] Open
Abstract
The air in the Lombardy region, Italy, is one of the most polluted in Europe because of limited air circulation and high emission levels. There is a large scientific consensus that the agricultural sector has a significant impact on air quality. To support studies quantifying the role of the agricultural and livestock sectors on the Lombardy air quality, this paper presents a harmonised dataset containing daily values of air quality, weather, emissions, livestock, and land and soil use in the years 2016-2021, for the Lombardy region. The daily scale is obtained by averaging hourly data and interpolating other variables. In fact, the pollutant data come from the European Environmental Agency and the Lombardy Regional Environment Protection Agency, weather and emissions data from the European Copernicus programme, livestock data from the Italian zootechnical registry, and land and soil use data from the CORINE Land Cover project. The resulting dataset is designed to be used as is by those using air quality data for research.
Collapse
Affiliation(s)
- Alessandro Fassò
- University of Bergamo, Dept. of Economics, Via dei Caniana 2, 24127, Bergamo, Italy.
| | - Jacopo Rodeschini
- University of Bergamo, Dept. of Economics, Via dei Caniana 2, 24127, Bergamo, Italy
| | - Alessandro Fusta Moro
- University of Torino, Dept. of Economics and Statistics, Lungo Dora Siena 100A, 10153, Torino, Italy
| | - Qendrim Shaboviq
- Leibniz University Hannover, Institute of Cartography and Geoinformatics, Appelstrasse 9a, 30167, Hannover, Germany
| | - Paolo Maranzano
- University of Milano-Bicocca, Dept. of Economics, Management and Statistics, Piazza dell'Ateneo Nuovo 1, 20126, Milano, Italy
- Fondazione Eni Enrico Mattei (FEEM), Corso Magenta 63, 20123, Milano, Italy
| | - Michela Cameletti
- University of Bergamo, Dept. of Economics, Via dei Caniana 2, 24127, Bergamo, Italy
| | - Francesco Finazzi
- University of Bergamo, Dept. of Economics, Via dei Caniana 2, 24127, Bergamo, Italy
| | - Natalia Golini
- University of Torino, Dept. of Economics and Statistics, Lungo Dora Siena 100A, 10153, Torino, Italy
| | - Rosaria Ignaccolo
- University of Torino, Dept. of Economics and Statistics, Lungo Dora Siena 100A, 10153, Torino, Italy
| | - Philipp Otto
- Leibniz University Hannover, Institute of Cartography and Geoinformatics, Appelstrasse 9a, 30167, Hannover, Germany
|
39
|
Zhang Y, Du K, Huang T. Heuristic Tree-Partition-Based Parallel Method for Biophysically Detailed Neuron Simulation. Neural Comput 2023; 35:627-644. [PMID: 36746142 DOI: 10.1162/neco_a_01565] [Received: 05/24/2022] [Accepted: 10/20/2022] [Indexed: 02/08/2023]
Abstract
Biophysically detailed neuron simulation is a powerful tool to explore the mechanisms behind biological experiments and bridge the gap between various scales in neuroscience research. However, the extremely high computational complexity of detailed neuron simulation restricts the modeling and exploration of detailed network models. The bottleneck is solving the system of linear equations. To accelerate detailed simulation, we propose a heuristic tree-partition-based parallel method (HTP) to parallelize the computation of the Hines algorithm, the kernel for solving linear equations, and leverage the strong parallel capability of the graphics processing unit (GPU) to achieve further speedup. We formulate the problem of how to obtain a fine parallel process as a tree-partition problem. Next, we present a heuristic partition algorithm that yields an effective partition for efficiently parallelizing the equation-solving process in detailed simulation. With further optimization on GPU, our HTP method achieves a 2.2- to 8.5-fold speedup compared to the state-of-the-art GPU method and a 36- to 660-fold speedup compared to the typical Hines algorithm.
Affiliation(s)
- Yichen Zhang
- School of Computer Science, Peking University, Beijing 100871, China
| | - Kai Du
- School of Computer Science and Institute for Artificial Intelligence, Peking University, Beijing 100871, China
| | - Tiejun Huang
- School of Computer Science and Institute for Artificial Intelligence, Peking University, Beijing 100871, China
|
40
|
Gerez S GA, Di Remigio Eikås R, Jensen SR, Bjørgve M, Frediani L. Cavity-Free Continuum Solvation: Implementation and Parametrization in a Multiwavelet Framework. J Chem Theory Comput 2023; 19:1986-1997. [PMID: 36933225 PMCID: PMC10100532 DOI: 10.1021/acs.jctc.2c01098] [Indexed: 03/19/2023]
Abstract
We present a multiwavelet-based implementation of a quantum/classical polarizable continuum model. The solvent model uses a diffuse solute-solvent boundary and a position-dependent permittivity, lifting the sharp-boundary assumption underlying many existing continuum solvation models. We are able to include both surface and volume polarization effects in the quantum/classical coupling, with guaranteed precision, due to the adaptive refinement strategies of our multiwavelet implementation. The model can account for complex solvent environments and does not need a posteriori corrections for volume polarization effects. We validate our results against a sharp-boundary continuum model and find a very good correlation of the polarization energies computed for the Minnesota solvation database.
Affiliation(s)
- Gabriel A Gerez S
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
| | | | - Stig Rune Jensen
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
| | - Magnar Bjørgve
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
| | - Luca Frediani
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
|
41
|
Asghari A, Azgomi H, darvishmofarahi Z. Multi-objective edge server placement using the whale optimization algorithm and game theory. Soft comput 2023. [DOI: 10.1007/s00500-023-07995-3] [Indexed: 03/19/2023]
|
42
|
Zhang K, Liu Y, Zhang J, Zhang G, Jin J, Li Y, Tang F. TDCA: improved optimization algorithm with degree distribution and communication traffic for the deployment of software components based on AUTOSAR architecture. Soft comput 2023. [DOI: 10.1007/s00500-023-07989-1] [Indexed: 03/19/2023]
|
43
|
Garcez AD, Lamb LC. Neurosymbolic AI: the 3rd wave. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10448-w] [Indexed: 03/17/2023]
|
44
|
Wong R, Chang WL, Chung WY, Vasilakos AV. Biomolecular and quantum algorithms for the dominating set problem in arbitrary networks. Sci Rep 2023; 13:4205. [PMID: 36918570 PMCID: PMC10015031 DOI: 10.1038/s41598-023-30600-4] [Received: 10/30/2022] [Accepted: 02/27/2023] [Indexed: 03/16/2023] Open
Abstract
A dominating set of a graph G = (V, E) is a subset U of its vertices V, such that any vertex of G is either in U, or has a neighbor in U. The dominating-set problem is to find a minimum dominating set in G. Dominating sets are of critical importance for various types of networks/graphs, and therefore find potential applications in many fields. In the area of communication in particular, dominating sets are prominently used in the efficient organization of large-scale wireless ad hoc and sensor networks. However, the dominating set problem is also a hard optimization problem and is thus not currently efficiently solvable on classical computers. Here, we propose a biomolecular and a quantum algorithm for this problem, where the quantum algorithm provides a quadratic speedup over any classical algorithm. We show that the dominating set problem can be solved in [Formula: see text] queries by our proposed quantum algorithm, where n is the number of vertices in G. We also demonstrate that our quantum algorithm is the best known procedure to date for this problem. We confirm the correctness of our algorithm by executing it on IBM Quantum's qasm simulator and the Brooklyn superconducting quantum device. Lastly, we show that molecular solutions obtained from solving the dominating set problem are represented in terms of a unit vector in a finite-dimensional Hilbert space.
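The definition in this abstract can be made concrete with a small classical sketch. The code below is not the biomolecular or quantum algorithm of the paper; it is an assumed brute-force baseline that checks the dominating-set property and searches subsets by increasing size. The function names and the example path graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_dominating_set(adjacency, candidate):
    """Return True if every vertex is in `candidate` or has a neighbor in it.

    `adjacency` maps each vertex to the set of its neighbors.
    """
    candidate = set(candidate)
    return all(
        vertex in candidate or bool(candidate & adjacency[vertex])
        for vertex in adjacency
    )


def minimum_dominating_set(adjacency):
    """Find a smallest dominating set by exhaustive search.

    Subsets are tried in order of increasing size, so the first hit is
    minimum. The running time is exponential in the number of vertices.
    """
    vertices = sorted(adjacency)
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_dominating_set(adjacency, subset):
                return set(subset)
    return set(vertices)


# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
smallest = minimum_dominating_set(path)
print(sorted(smallest), is_dominating_set(path, smallest))
```

The exhaustive search illustrates why the problem is considered hard classically: the number of candidate subsets grows as 2^n with the number of vertices n.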
Affiliation(s)
- Renata Wong
- Physics Division, National Center for Theoretical Sciences, National Taiwan University, Taipei, 10617, Taiwan.
| | - Weng-Long Chang
- Department of Computer Science and Information Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 807618, Taiwan.
| | - Wen-Yu Chung
- Department of Computer Science and Information Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 807618, Taiwan.
|
45
|
Hammer M, Bauer G, Stierle R, Gross J, Wilhelmsen Ø. Classical density functional theory for interfacial properties of hydrogen, helium, deuterium, neon, and their mixtures. J Chem Phys 2023; 158:104107. [PMID: 36922124 DOI: 10.1063/5.0137226] [Indexed: 02/11/2023] Open
Abstract
We present a classical density functional theory (DFT) for fluid mixtures that is based on a third-order thermodynamic perturbation theory of Feynman-Hibbs-corrected Mie potentials. The DFT is developed to study the interfacial properties of hydrogen, helium, neon, deuterium, and their mixtures, i.e., fluids that are strongly influenced by quantum effects at low temperatures. White Bear fundamental measure theory is used for the hard-sphere contribution of the Helmholtz energy functional, and a weighted density approximation is used for the dispersion contribution. For mixtures, a contribution is included to account for non-additivity in the Lorentz-Berthelot combination rule. Predictions of the radial distribution function from DFT are in excellent agreement with results from molecular simulations, both for pure components and mixtures. Above the normal boiling point and 5% below the critical temperature, the DFT yields surface tensions of neon, hydrogen, and deuterium with average deviations from experiments of 7.5%, 4.4%, and 1.8%, respectively. The surface tensions of hydrogen/deuterium, para-hydrogen/helium, deuterium/helium, and hydrogen/neon mixtures are reproduced with mean absolute errors of 5.4%, 8.1%, 1.3%, and 7.5%, respectively. The surface tensions are predicted with excellent accuracy at temperatures above 20 K. The poor accuracy below 20 K is due to the inability of Feynman-Hibbs-corrected Mie potentials to represent the real fluid behavior at these conditions, motivating the development of new intermolecular potentials. This DFT can be leveraged in the future to study confined fluids and assess the performance of porous materials for hydrogen storage and transport.
Affiliation(s)
- Morten Hammer
- Porelab, Department of Chemistry, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
| | - Gernot Bauer
- Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Pfaffenwaldring 9, D-70569 Stuttgart, Germany
| | - Rolf Stierle
- Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Pfaffenwaldring 9, D-70569 Stuttgart, Germany
| | - Joachim Gross
- Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Pfaffenwaldring 9, D-70569 Stuttgart, Germany
| | - Øivind Wilhelmsen
- Porelab, Department of Chemistry, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
|
46
|
Rappoport D, Jinich A. Enzyme Substrate Prediction from Three-Dimensional Feature Representations Using Space-Filling Curves. J Chem Inf Model 2023; 63:1637-1648. [PMID: 36802628 DOI: 10.1021/acs.jcim.3c00005] [Indexed: 02/22/2023]
Abstract
Compact and interpretable structural feature representations are required for accurately predicting properties and function of proteins. In this work, we construct and evaluate three-dimensional feature representations of protein structures based on space-filling curves (SFCs). We focus on the problem of enzyme substrate prediction, using two ubiquitous enzyme families as case studies: the short-chain dehydrogenase/reductases (SDRs) and the S-adenosylmethionine-dependent methyltransferases (SAM-MTases). Space-filling curves such as the Hilbert curve and the Morton curve generate a reversible mapping from discretized three-dimensional to one-dimensional representations and thus help to encode three-dimensional molecular structures in a system-independent way and with only a few adjustable parameters. Using three-dimensional structures of SDRs and SAM-MTases generated using AlphaFold2, we assess the performance of the SFC-based feature representations in predictions on a new benchmark database of enzyme classification tasks including their cofactor and substrate selectivity. Gradient-boosted tree classifiers yield binary prediction accuracy of 0.77-0.91 and area under curve (AUC) characteristics of 0.83-0.92 for the classification tasks. We investigate the effects of amino acid encoding, spatial orientation, and (the few) parameters of SFC-based encodings on the accuracy of the predictions. Our results suggest that geometry-based approaches such as SFCs are promising for generating protein structural representations and are complementary to the existing protein feature representations such as evolutionary scale modeling (ESM) sequence embeddings.
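The reversible mapping from discretized three-dimensional coordinates to a one-dimensional index mentioned in this abstract can be sketched for the Morton (Z-order) curve. This is a minimal illustration of bit interleaving only; the bit ordering (x in the least significant position), the `bits` parameter, and the function names are assumptions made here and are not taken from the paper.

```python
def morton_encode(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton index.

    Bit i of x goes to position 3*i, bit i of y to 3*i + 1, and
    bit i of z to 3*i + 2 (an assumed convention).
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_decode(code, bits=10):
    """Invert morton_encode, recovering the (x, y, z) grid coordinates."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


# Round trip on a hypothetical grid cell; coordinates must fit in `bits` bits.
cell = (5, 9, 300)
index = morton_encode(*cell)
print(index, morton_decode(index) == cell)
```

Sorting grid cells by such an index yields a one-dimensional ordering in which cells with nearby indices tend to be close in space, which is the property that makes space-filling curves usable as a fixed-length structural encoding.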
Affiliation(s)
- Dmitrij Rappoport
- Department of Chemistry, University of California, Irvine, 1102 Natural Sciences 2, Irvine, California 92697, United States
| | - Adrian Jinich
- Weill Cornell Medicine, 1300 York Avenue, Box 65, New York, New York 10065, United States
|
47
|
Das S, Anand DV, Chung MK. Topological data analysis of human brain networks through order statistics. PLoS One 2023; 18:e0276419. [PMID: 36913351 PMCID: PMC10010566 DOI: 10.1371/journal.pone.0276419] [Received: 04/06/2022] [Accepted: 09/21/2022] [Indexed: 03/14/2023] Open
Abstract
Understanding the common topological characteristics of the human brain network across a population is central to understanding brain functions. The abstraction of the human connectome as a graph has been pivotal in gaining insights into the topological properties of the brain network. Developing group-level statistical inference procedures for brain graphs that account for heterogeneity and randomness remains a difficult task. In this study, we develop a robust statistical framework based on persistent homology using order statistics for analyzing brain networks. The use of order statistics greatly simplifies the computation of the persistent barcodes. We validate the proposed methods using comprehensive simulation studies and subsequently apply them to resting-state functional magnetic resonance images. We found a statistically significant topological difference between the male and female brain networks.
Affiliation(s)
- Soumya Das
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States of America
| | - D. Vijay Anand
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States of America
| | - Moo K. Chung
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States of America
|
48
|
Guerrero-Contreras G, Balderas-Díaz S, Garrido JL, Rodríguez-Fórtiz MJ, O’Hare GMP. Proposal and comparative analysis of a voting-based election algorithm for managing service replication in MANETs. APPL INTELL 2023. [DOI: 10.1007/s10489-023-04506-7] [Indexed: 03/28/2023]
|
49
|
Chen Z, Zhang B, Gong F, Wan L, Ma L. RobustTree: An adaptive, robust PCA algorithm for embedded tree structure recovery from single-cell sequencing data. Front Genet 2023; 14:1110899. [PMID: 36968591 PMCID: PMC10030613 DOI: 10.3389/fgene.2023.1110899] [Received: 11/29/2022] [Accepted: 02/13/2023] [Indexed: 03/11/2023] Open
Abstract
Robust Principal Component Analysis (RPCA) offers a powerful tool for recovering a low-rank matrix from highly corrupted data, with growing applications in computational biology. Biological processes commonly form intrinsic hierarchical structures, such as tree structures of cell development trajectories and tumor evolutionary history. The rapid development of single-cell sequencing (SCS) technology calls for the recovery of embedded tree structures from noisy and heterogeneous SCS data. In this study, we propose RobustTree, a unified framework to reconstruct the inherent topological structure underlying high-dimensional data with noise. By extending RPCA to handle tree structure optimization, RobustTree leverages data denoising, clustering, and tree structure reconstruction. It solves the tree optimization problem with an adaptive parameter selection scheme that we propose. In addition to recovering real datasets, RobustTree can reconstruct continuous topological structure and discrete-state topological structure of underlying SCS data. We apply RobustTree to multiple synthetic and real datasets and demonstrate its high accuracy and robustness when analyzing high-noise SCS data with embedded complex structures. The code is available at https://github.com/ucasdp/RobustTree.
Affiliation(s)
- Ziwei Chen
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, United States
| | - Bingwei Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Fuzhou Gong
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Lin Wan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- *Correspondence: Lin Wan; Liang Ma
| | - Liang Ma
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- *Correspondence: Lin Wan; Liang Ma
|
50
|
Wang G, Liu T, Zou M, Karsili TNV, Lester MI. UV photodissociation dynamics of the acetone oxide Criegee intermediate: experiment and theory. Phys Chem Chem Phys 2023; 25:7453-7465. [PMID: 36848133 DOI: 10.1039/d3cp00207a] [Indexed: 02/22/2023]
Abstract
The photodissociation dynamics of the dimethyl-substituted acetone oxide Criegee intermediate [(CH3)2COO] is characterized following electronic excitation to the bright 1ππ* state, which leads to O (1D) + acetone [(CH3)2CO, S0] products. The UV action spectrum of (CH3)2COO recorded with O (1D) detection under jet-cooled conditions is broad, unstructured, and essentially unchanged from the corresponding electronic absorption spectrum obtained using a UV-induced depletion method. This indicates that UV excitation of (CH3)2COO leads predominantly to the O (1D) product channel. A higher energy O (3P) + (CH3)2CO (T1) product channel is not observed, although it is energetically accessible. In addition, complementary MS-CASPT2 trajectory surface-hopping (TSH) simulations indicate minimal population leading to the O (3P) channel and non-unity overall probability for dissociation (within 100 fs). Velocity map imaging of the O (1D) products is utilized to reveal the total kinetic energy release (TKER) distribution upon photodissociation of (CH3)2COO at various UV excitation energies. Simulation of the TKER distributions is performed using a hybrid model that combines an impulsive model with a statistical component, the latter reflecting the longer-lived (>100 fs) trajectories identified in the TSH calculations. The impulsive model accounts for vibrational activation of (CH3)2CO arising from geometrical changes between the Criegee intermediate and the carbonyl product, indicating the importance of CO stretch, CCO bend, and CC stretch along with activation of hindered rotation and rock of the methyl groups in the (CH3)2CO product. Detailed comparison is also made with the TKER distribution arising from photodissociation dynamics of CH2OO upon UV excitation.
Affiliation(s)
- Guanghan Wang
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA.
| | - Tianlin Liu
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA.
| | - Meijun Zou
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA.
| | - Tolga N V Karsili
- Department of Chemistry, University of Louisiana at Lafayette, Lafayette, LA 70504, USA.
| | - Marsha I Lester
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA.
|