1
|
Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Comput Struct Biotechnol J 2024; 23:1666-1679. [PMID: 38680871 PMCID: PMC11046066 DOI: 10.1016/j.csbj.2024.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 05/01/2024] Open
Abstract
Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, mono-modal learning is inherently limited as it relies solely on a single modality of molecular representation, which restricts a comprehensive understanding of drug molecules. To overcome the limitations, we propose a multimodal fused deep learning (MMFDL) model to leverage information from different molecular representations. Specifically, we construct a triple-modal learning model by employing Transformer-Encoder, Bidirectional Gated Recurrent Unit (BiGRU), and graph convolutional network (GCN) to process three modalities of information from chemical language and molecular graph: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs, respectively. We evaluate the proposed triple-modal model using five fusion approaches on six molecule datasets, including Delaney, Llinas2020, Lipophilicity, SAMPL, BACE, and pKa from DataWarrior. The results show that the MMFDL model achieves the highest Pearson coefficients, and stable distribution of Pearson coefficients in the random splitting test, outperforming mono-modal models in accuracy and reliability. Furthermore, we validate the generalization ability of our model in the prediction of binding constants for protein-ligand complex molecules, and assess the resilience capability against noise. Through analysis of feature distributions in chemical space and the assigned contribution of each modal model, we demonstrate that the MMFDL model shows the ability to acquire complementary information by using proper models and suitable fusion approaches. By leveraging diverse sources of bioinformatics information, multimodal deep learning models hold the potential for successful drug discovery.
|
2
|
MolPROP: Molecular Property prediction with multimodal language and graph fusion. J Cheminform 2024; 16:56. [PMID: 38778388 PMCID: PMC11112823 DOI: 10.1186/s13321-024-00846-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 04/27/2024] [Indexed: 05/25/2024] Open
Abstract
Pretrained deep learning models self-supervised on large datasets of language, image, and graph representations are often fine-tuned on downstream tasks and have demonstrated remarkable adaptability in a variety of applications including chatbots, autonomous driving, and protein folding. Additional research aims to improve performance on downstream tasks by fusing high dimensional data representations across multiple modalities. In this work, we explore a novel fusion of a pretrained language model, ChemBERTa-2, with graph neural networks for the task of molecular property prediction. We benchmark the MolPROP suite of models on seven scaffold split MoleculeNet datasets and compare with state-of-the-art architectures. We find that (1) multimodal property prediction for small molecules can match or significantly outperform modern architectures on hydration free energy (FreeSolv), experimental water solubility (ESOL), lipophilicity (Lipo), and clinical toxicity tasks (ClinTox), (2) the MolPROP multimodal fusion is predominantly beneficial on regression tasks, (3) the ChemBERTa-2 masked language model pretraining task (MLM) outperformed multitask regression pretraining task (MTR) when fused with graph neural networks for multimodal property prediction, and (4) despite improvements from multimodal fusion on regression tasks MolPROP significantly underperforms on some classification tasks. MolPROP has been made available at https://github.com/merck/MolPROP . SCIENTIFIC CONTRIBUTION: This work explores a novel multimodal fusion of learned language and graph representations of small molecules for the supervised task of molecular property prediction. The MolPROP suite of models demonstrates that language and graph fusion can significantly outperform modern architectures on several regression prediction tasks and also provides the opportunity to explore alternative fusion strategies on classification tasks for multimodal molecular property prediction.
|
3
|
Locating-dominating number of certain infinite families of convex polytopes with applications. Heliyon 2024; 10:e29304. [PMID: 38628707 PMCID: PMC11019228 DOI: 10.1016/j.heliyon.2024.e29304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 03/27/2024] [Accepted: 04/04/2024] [Indexed: 04/19/2024] Open
Abstract
The convex hull of finitely many points in the Euclidean space $\mathbb{R}^d$ is known as a convex polytope. Graphically, they are planar graphs, i.e. embeddable in $\mathbb{R}^2$. Minimum dominating sets possess diverse applications in computer science and engineering. Locating-dominating sets are a natural extension of dominating sets. Studying minimum locating-dominating sets of convex polytopes reveals interesting distance-dominating related topological properties of these geometrical planar graphs. In this paper, the exact value of the locating-dominating number is shown for one infinite family of convex polytopes. Moreover, tight upper bounds on $\gamma_{l-d}$ are shown for two more infinite families. Tightness of the upper bounds is shown by employing an updated integer linear programming (ILP) model for the locating-dominating number $\gamma_{l-d}$ of a fixed graph. Results are explained with the help of some examples. The second part of the paper solves an open problem in Khan (2023) [28], which asks for a domination-related parameter that delivers a correlation coefficient of ρ > 0.9967 with the total π-electronic energy of lower benzenoid hydrocarbons. We show that the locating-dominating number $\gamma_{l-d}$ delivers such a strong prediction potential. The paper concludes by putting forward some open problems in this area.
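To make the central definition concrete: a set D of vertices is locating-dominating if every vertex outside D has at least one neighbour in D and no two vertices outside D share the same set of neighbours in D. The following is a minimal brute-force sketch of that definition in Python, not the ILP model used in the paper. The function names and the `cycle_graph` helper are hypothetical, and exhaustive search is only practical for very small graphs.

```python
from itertools import combinations


def is_locating_dominating(adj, candidate):
    """Check the locating-dominating property for `candidate`.

    adj maps each vertex to the set of its neighbours (undirected graph).
    """
    chosen = set(candidate)
    seen_signatures = set()
    for v in adj:
        if v in chosen:
            continue
        signature = frozenset(adj[v] & chosen)
        # Domination: v needs a neighbour in the set.
        # Location: its neighbourhood inside the set must be unique.
        if not signature or signature in seen_signatures:
            return False
        seen_signatures.add(signature)
    return True


def locating_dominating_number(adj):
    """Return (size, set) of a smallest locating-dominating set by exhaustive search."""
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_locating_dominating(adj, candidate):
                return size, set(candidate)
    return len(vertices), set(vertices)


def cycle_graph(n):
    """Hypothetical helper: adjacency sets of the cycle C_n, n >= 3."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


# Usage: the 5-cycle has locating-dominating number 2, e.g. the set {0, 2}.
size, witness = locating_dominating_number(cycle_graph(5))
```

The search grows exponentially with the number of vertices, which is one reason an ILP formulation such as the one mentioned in the abstract is preferable for larger instances.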
|
4
|
HSG-MGAF Net: Heterogeneous sub graph-guided multiscale graph attention fusion network for interpretable prediction of whole-slide image. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 247:108099. [PMID: 38442623 DOI: 10.1016/j.cmpb.2024.108099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 02/12/2024] [Accepted: 02/22/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND AND OBJECTIVE Pathological whole slide image (WSI) prediction and region of interest (ROI) localization are important issues in computer-aided diagnosis and postoperative analysis in clinical applications. Existing computer-aided methods for predicting WSI are mainly based on multiple instance learning (MIL) and its variants. However, most of these methods assume that instances are independent and identically distributed and operate at a single scale, and therefore do not fully exploit the hierarchical multiscale heterogeneous information contained in WSI. METHODS Heterogeneous Subgraph-Guided Multiscale Graph Attention Fusion Network (HSG-MGAF Net) is proposed to build the topology of critical image patches at two scales for adaptive WSI prediction and lesion localization. The HSG-MGAF Net simulates the hierarchical heterogeneous information of WSI through graph and hypergraph at two scales, respectively. This framework not only fully exploits the low-order and potential high-order correlations of image patches at each scale, but also leverages the heterogeneous information of the two scales for adaptive WSI prediction. RESULTS We validate the superiority of the proposed method on the CAMELYON16 and TCGA-NSCLC datasets, and the results show that HSG-MGAF Net outperforms the state-of-the-art method on both datasets. The average ACC, AUC and F1 score of HSG-MGAF Net reach 92.7%/0.951/0.892 and 92.2%/0.957/0.919, respectively. The obtained heatmaps also localize the positive regions more accurately and are highly consistent with the pixel-level labels. CONCLUSIONS The results demonstrate that HSG-MGAF Net outperforms existing weakly supervised learning methods by introducing critical heterogeneous information between the two scales. This approach paves the way for further research on lightweight heterogeneous graph-based WSI prediction and ROI localization.
|
5
|
Classification and Identification of Non-canonical Base Pairs and Structural Motifs. Methods Mol Biol 2024; 2726:143-168. [PMID: 38780731 DOI: 10.1007/978-1-0716-3519-3_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The 3D structures of many ribonucleic acid (RNA) loops are characterized by highly organized networks of non-canonical interactions. Multiple computational methods have been developed to annotate structures with those interactions or automatically identify recurrent interaction networks. By contrast, the reverse problem, which aims to retrieve the geometry of a loop from its sequence or ensemble of interactions, remains much less explored. In this chapter, we will describe how to retrieve and build families of conserved structural motifs using their underlying network of non-canonical interactions. Then, we will show how to assign sequence alignments to those families and use the software BayesPairing to build statistical models of structural motifs with their associated sequence alignments. From this model, we will apply BayesPairing to identify in new sequences regions where those loop geometries can occur.
|
6
|
Graph- and transformer-guided boundary aware network for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 242:107849. [PMID: 37837887 DOI: 10.1016/j.cmpb.2023.107849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 09/29/2023] [Accepted: 10/06/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Despite the considerable progress achieved by U-Net-based models, medical image segmentation remains a challenging task due to complex backgrounds, irrelevant noises, and ambiguous boundaries. In this study, we present a novel approach called U-shaped Graph- and Transformer-guided Boundary Aware Network (GTBA-Net) to tackle these challenges. METHODS GTBA-Net uses the pre-trained ResNet34 as its basic structure, and involves Global Feature Aggregation (GFA) modules for target localization, Graph-based Dynamic Feature Fusion (GDFF) modules for effective noise suppression, and Uncertainty-based Boundary Refinement (UBR) modules for accurate delineation of ambiguous boundaries. The GFA modules employ an efficient self-attention mechanism to facilitate coarse target localization amidst complex backgrounds, without introducing additional computational complexity. The GDFF modules leverage graph attention mechanism to aggregate information hidden among high- and low-level features, effectively suppressing target-irrelevant noises while preserving valuable spatial details. The UBR modules introduce an uncertainty quantification strategy and auxiliary loss to guide the model's focus towards target regions and uncertain "ridges", gradually mitigating boundary uncertainty and ultimately achieving accurate boundary delineation. RESULTS Comparative experiments on five datasets encompassing diverse modalities (including X-ray, CT, endoscopic procedures, and ultrasound) demonstrate that the proposed GTBA-Net outperforms existing methods in various challenging scenarios. Subsequent ablation studies further demonstrate the efficacy of the GFA, GDFF, and UBR modules in target localization, noise suppression, and ambiguous boundary delineation, respectively. 
CONCLUSIONS GTBA-Net exhibits substantial potential for extensive application in the field of medical image segmentation, particularly in scenarios involving complex backgrounds, target-irrelevant noises, or ambiguous boundaries.
|
7
|
Graph neural network-based breast cancer diagnosis using ultrasound images with optimized graph construction integrating the medically significant features. J Cancer Res Clin Oncol 2023; 149:18039-18064. [PMID: 37982829 PMCID: PMC10725367 DOI: 10.1007/s00432-023-05464-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 10/06/2023] [Indexed: 11/21/2023]
Abstract
PURPOSE An automated computerized approach can aid radiologists in the early diagnosis of breast cancer. In this study, a novel method is proposed for classifying breast tumors into benign and malignant, based on the ultrasound images through a Graph Neural Network (GNN) model utilizing clinically significant features. METHOD Ten informative features are extracted from the region of interest (ROI), based on the radiologists' diagnosis markers. The significance of the features is evaluated using density plot and T test statistical analysis method. A feature table is generated where each row represents individual image, considered as node, and the edges between the nodes are denoted by calculating the Spearman correlation coefficient. A graph dataset is generated and fed into the GNN model. The model is configured through ablation study and Bayesian optimization. The optimized model is then evaluated with different correlation thresholds for getting the highest performance with a shallow graph. The performance consistency is validated with k-fold cross validation. The impact of utilizing ROIs and handcrafted features for breast tumor classification is evaluated by comparing the model's performance with Histogram of Oriented Gradients (HOG) descriptor features from the entire ultrasound image. Lastly, a clustering-based analysis is performed to generate a new filtered graph, considering weak and strong relationships of the nodes, based on the similarities. RESULTS The results indicate that with a threshold value of 0.95, the GNN model achieves the highest test accuracy of 99.48%, precision and recall of 100%, and F1 score of 99.28%, reducing the number of edges by 85.5%. The GNN model's performance is 86.91%, considering no threshold value for the graph generated from HOG descriptor features. Different threshold values for the Spearman's correlation score are experimented with and the performance is compared. 
No significant differences are observed between the previous graph and the filtered graph. CONCLUSION The proposed approach might aid radiologists in effectively diagnosing breast cancer and learning its tumor patterns.
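The graph construction step described in the method (one node per image, edges derived from the Spearman correlation between feature rows, then pruned with a threshold such as 0.95) can be sketched as follows. This is an illustrative sketch only, written with the Python standard library. The function names are hypothetical, and keeping an edge when the coefficient is greater than or equal to the threshold is an assumption, since the abstract does not state the exact rule.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature row: correlation undefined, treated as 0
    return cov / sqrt(var_x * var_y)


def build_graph(feature_rows, threshold):
    """Return weighted edges (i, j, rho) between rows with rho >= threshold (assumed rule)."""
    edges = []
    for i, j in combinations(range(len(feature_rows)), 2):
        rho = spearman(feature_rows[i], feature_rows[j])
        if rho >= threshold:
            edges.append((i, j, rho))
    return edges


# Usage with made-up feature rows: only rows 0 and 1 are rank-correlated.
rows = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1]]
edges = build_graph(rows, threshold=0.95)
```

Raising the threshold removes weakly correlated pairs, which is consistent with the reported reduction in the number of edges at a threshold of 0.95.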
|
8
|
A new technique for influence maximization on social networks using a moth-flame optimization algorithm. Heliyon 2023; 9:e22191. [PMID: 38058635 PMCID: PMC10695977 DOI: 10.1016/j.heliyon.2023.e22191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 10/26/2023] [Accepted: 11/06/2023] [Indexed: 12/08/2023] Open
Abstract
In our modern digital era, social networks have seamlessly integrated into the fabric of our daily lives. These digital platforms serve as vital channels for communication, exchanging information, and cultivating valuable connections. The propagation of information within these social networks has emerged as a central focus for numerous sectors, including politics, marketing, research, education, and finance. Diverse models have been employed to depict the dynamics of information dissemination across these networks. Nevertheless, the notion of influence holds profound significance for both businesses and individuals. Influence maximization, particularly within the context of social networks, has garnered considerable attention owing to its potential to reach and impact a broad audience. This intricate challenge is commonly referred to as the "influence maximization problem," a problem well-known for its NP-hard complexity. This paper proposes a cutting-edge technique that leverages the Moth-Flame Optimization Algorithm to enhance influence maximization. Influence maximization is an important issue in network analysis, which widely occurs in social networks. Influence can be seen as a cascading effect, where the actions of a few trigger a chain reaction, ultimately reaching a large portion of the network. Identifying these "influencers" is crucial for efficient resource allocation and information dissemination. One of the important issues in finding the maximum influence is choosing the best vertex among all the vertices in the graph. This research presents a new method to find the maximum influence in social networks based on the Moth-Flame Algorithm (MFA). The proposed method aims to find the maximum influence in the social network graph that has a good fitness degree. The algorithm can identify potential influencers. 
Our simulations across multiple networks have unequivocally showcased the superiority of this algorithm as the preeminent and scalable solution to the influence maximization problem. The experimental outcomes clearly delineate that the employment of the MFA (Maximal First Activation) approach effectively diminishes the execution time required to approximate the maximum influence. The proposed technique improved accuracy and execution time by 3.140% and 12.2%, respectively, compared to other methods.
|
9
|
Graph Neural Network for representation learning of lung cancer. BMC Cancer 2023; 23:1037. [PMID: 37884929 PMCID: PMC10601264 DOI: 10.1186/s12885-023-11516-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023] Open
Abstract
The emergence of image-based systems to improve diagnostic pathology precision, involving the intent to label sets or bags of instances, greatly hinges on Multiple Instance Learning (MIL) for Whole Slide Images (WSIs). Contemporary works have shown excellent performance for neural networks in MIL settings. Here, we examine a graph-based model to facilitate end-to-end learning and sample suitable patches using a tile-based approach. We propose MIL-GNN, which employs a graph-based Variational Auto-encoder with a Gaussian mixture model to discover relations between sample patches in order to aggregate patch details into an individual vector representation. Using the classical MIL dataset MUSK and distinguishing two lung cancer sub-types, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), we exhibit the efficacy of our technique. We achieved a 97.42% accuracy on the MUSK dataset and a 94.3% AUC on the classification of lung cancer sub-types utilizing features.
|
10
|
Decision tree classifier based on topological characteristics of sub graph for the mining of protein complexes from large scale PPI networks. Comput Biol Chem 2023; 106:107935. [PMID: 37536230 DOI: 10.1016/j.compbiolchem.2023.107935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 06/11/2023] [Accepted: 07/23/2023] [Indexed: 08/05/2023]
Abstract
The growing accessibility of large-scale protein interaction data demands extensive research to understand cell organization and its functioning at the network level. Bioinformatics and data mining researchers have extensively studied network clustering to examine the structural and operational features of protein-protein interaction (PPI) networks. Clustering PPI networks has proven useful in numerous studies over the past two decades for identifying functional modules, understanding the roles of previously unknown proteins, and other purposes. Protein complexes represent one of the essential cellular components for creating biological activities. Inferring protein complexes has been made more accessible by experimental approaches. We offer a novel method that integrates the classification model with local topological data, making it more reliable and efficient. This article describes a decision tree classifier based on topological characteristics of the subgraph for mining protein complexes. The proposed graph-based algorithm is an effective and efficient way to identify protein complexes from large-scale PPI networks. The performance of the proposed algorithm is observed in protein-protein interaction networks of yeast and human in the Database of Interacting Proteins (DIP) and the Biological General Repository for Interaction Datasets (BioGRID) using widely accepted benchmark protein complexes from the comprehensive resource of mammalian protein complexes (CORUM) and the comprehensive catalogue of yeast protein complexes (CYC2008). The outcomes demonstrate that our method can outperform the best-performing supervised, semi-supervised, and unsupervised approaches to detecting protein complexes.
|
11
|
Dataset complementary prism networks. Data Brief 2023; 50:109539. [PMID: 37732294 PMCID: PMC10507128 DOI: 10.1016/j.dib.2023.109539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 08/14/2023] [Accepted: 08/29/2023] [Indexed: 09/22/2023] Open
Abstract
The complementary prism of $G$, denoted by $G\overline{G}$, is the graph obtained from the disjoint union of $G$ and $\overline{G}$ by adding edges between the corresponding vertices of $G$ and $\overline{G}$. To date, the progress of experimental research on complementary prisms has been limited by the lack of publicly available instances that could be used to run extensive experiments and to compare the performance of different topological index solutions and their bounds. For this reason, we decided to make publicly available 435 randomly generated instances of type $G\overline{G}$, with increasing network size (from 12 to 1948 nodes). The dataset presents instances of Complementary Prism Networks suitable for measuring the Wiener Index and Generalized Wiener Index, together with the values of these indices for these instances. In addition, the values of some lower and upper bounds proposed in the literature for these indices are presented, along with their error with respect to the value of the index.
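As an illustration of the two objects named above, the following minimal Python sketch builds the complementary prism of a small graph from the definition and computes its Wiener index (the sum of shortest-path distances over all unordered vertex pairs) with breadth-first search. The function names and the vertex labelling `(v, 0)` / `(v, 1)` are hypothetical choices for this sketch, and it is not the code used to generate the dataset.

```python
from collections import deque


def complementary_prism(adj):
    """Build the complementary prism of the graph given as {vertex: set(neighbours)}.

    (v, 0) is the copy of v in G, (v, 1) its copy in the complement of G.
    """
    vertices = set(adj)
    prism = {}
    for v in vertices:
        # Edges of G plus the matching edge to the corresponding complement vertex.
        prism[(v, 0)] = {(u, 0) for u in adj[v]} | {(v, 1)}
        # Edges of the complement plus the matching edge back to G.
        prism[(v, 1)] = {(u, 1) for u in vertices - adj[v] - {v}} | {(v, 0)}
    return prism


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered pairs of a connected graph."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        if len(dist) != len(adj):
            raise ValueError("graph is disconnected")
        total += sum(dist.values())
    return total // 2  # every pair was counted from both endpoints


# Usage: the complementary prism of the 5-cycle is the Petersen graph.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
petersen = complementary_prism(c5)
w = wiener_index(petersen)  # 15 pairs at distance 1 and 30 at distance 2, so 75
```

Running one BFS per vertex costs O(n(n + m)) time overall, which should remain manageable for instances of the sizes quoted in the abstract.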
|
12
|
A robust approach to 3D neuron shape representation for quantification and classification. BMC Bioinformatics 2023; 24:366. [PMID: 37770830 PMCID: PMC10537603 DOI: 10.1186/s12859-023-05482-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 09/13/2023] [Indexed: 09/30/2023] Open
Abstract
We consider the problem of finding an accurate representation of neuron shapes, extracting sub-cellular features, and classifying neurons based on neuron shapes. In neuroscience research, the skeleton representation is often used as a compact and abstract representation of neuron shapes. However, existing methods are limited to getting and analyzing "curve" skeletons which can only be applied for tubular shapes. This paper presents a 3D neuron morphology analysis method for more general and complex neuron shapes. First, we introduce the concept of skeleton mesh to represent general neuron shapes and propose a novel method for computing mesh representations from 3D surface point clouds. A skeleton graph is then obtained from skeleton mesh and is used to extract sub-cellular features. Finally, an unsupervised learning method is used to embed the skeleton graph for neuron classification. Extensive experiment results are provided and demonstrate the robustness of our method to analyze neuron morphology.
|
13
|
M6ATMR: identifying N6-methyladenosine sites through RNA sequence similarity matrix reconstruction guided by Transformer. PeerJ 2023; 11:e15899. [PMID: 37719113 PMCID: PMC10501384 DOI: 10.7717/peerj.15899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 07/24/2023] [Indexed: 09/19/2023] Open
Abstract
Numerous studies have focused on the classification of N6-methyladenosine (m6A) modification sites in RNA sequences, treating it as a multi-feature extraction task. In these studies, the incorporation of physicochemical properties of nucleotides has been applied to enhance recognition efficacy. However, the introduction of excessive supplementary information may introduce noise to the RNA sequence features, and the utilization of sequence similarity information remains underexplored. In this research, we present a novel method for RNA m6A modification site recognition called M6ATMR. Our approach relies solely on sequence information, leveraging Transformer to guide the reconstruction of the sequence similarity matrix, thereby enhancing feature representation. Initially, M6ATMR encodes RNA sequences using 3-mers to generate the sequence similarity matrix. Meanwhile, Transformer is applied to extract sequence structure graphs for each RNA sequence. Subsequently, to capture low-dimensional representations of similarity matrices and structure graphs, we introduce a graph self-correlation convolution block. These representations are then fused and reconstructed through the local-global fusion block. Notably, we adopt iteratively updated sequence structure graphs to continuously optimize the similarity matrix, thereby constraining the end-to-end feature extraction process. Finally, we employ the random forest (RF) algorithm for identifying m6A modification sites based on the reconstructed features. Experimental results demonstrate that M6ATMR achieves promising performance by solely utilizing RNA sequences for m6A modification site identification. Our proposed method can be considered an effective complement to existing RNA m6A modification site recognition approaches.
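The first step mentioned above, encoding RNA sequences as 3-mers and deriving a sequence similarity matrix, can be illustrated with a small sketch. The abstract does not say how similarity is computed, so the cosine similarity between 3-mer count vectors used here is purely an assumption for illustration, and all function names are hypothetical rather than taken from M6ATMR.

```python
from collections import Counter
from math import sqrt


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers (3-mers by default) in an RNA sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed similarity measure)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # sequence shorter than k: no k-mers to compare
    return dot / (norm_a * norm_b)


def similarity_matrix(sequences, k=3):
    """Pairwise similarity matrix over the k-mer profiles of the sequences."""
    profiles = [kmer_counts(seq, k) for seq in sequences]
    return [[cosine_similarity(p, q) for q in profiles] for p in profiles]


# Usage with made-up sequences: identical sequences score about 1, disjoint 3-mer sets score 0.
matrix = similarity_matrix(["AUGAUG", "AUGAUG", "CCCCCC"])
```

Any other profile-based measure could be substituted for `cosine_similarity` without changing the rest of the sketch.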
|
14
|
The use of networks in spatial and temporal computational models for outbreak spread in epidemiology: A systematic review. J Biomed Inform 2023; 143:104422. [PMID: 37315830 DOI: 10.1016/j.jbi.2023.104422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 06/05/2023] [Accepted: 06/09/2023] [Indexed: 06/16/2023]
Abstract
OBJECTIVES To examine recent literature in order to present a comprehensive overview of the current trends as regards the computational models used to represent the propagation of an infectious outbreak in a population, paying particular attention to those that represent network-based transmission. METHODS A systematic review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Papers published in English between 2010 and September 2021 were sought in the ACM Digital Library, IEEE Xplore, PubMed and Scopus databases. RESULTS Upon considering their titles and abstracts, 832 papers were obtained, of which 192 were selected for a full content-body check. Of these, 112 studies were eventually deemed suitable for quantitative and qualitative analysis. Emphasis was placed on the spatial and temporal scales studied, the use of networks or graphs, and the granularity of the data used to evaluate the models. The models principally used to represent the spreading of outbreaks have been stochastic (55.36%), while the type of networks most frequently used are relationship networks (32.14%). The most common spatial dimension used is a region (19.64%) and the most used unit of time is a day (28.57%). Synthetic data as opposed to an external source were used in 51.79% of the papers. With regard to the granularity of the data sources, aggregated data such as censuses or transportation surveys are the most common. CONCLUSION We identified a growing interest in the use of networks to represent disease transmission. We detected that research is focused on only certain combinations of the computational model, type of network (in both the expressive and the structural sense) and spatial scale, while the search for other interesting combinations has been left for the future.
|
15
|
All lockdowns are not equal: Reducing epidemic impact through evolutionary computation. Biosystems 2023:104935. [PMID: 37269899 DOI: 10.1016/j.biosystems.2023.104935] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/14/2023] [Revised: 05/18/2023] [Accepted: 05/29/2023] [Indexed: 06/05/2023]
Abstract
The impact of different lockdown strategies upon the total number of infections in an epidemic is evaluated for two models of infection: one in which the disease confers permanent immunity, and one in which it does not. The strategies are based upon the proportion of the population infected at a time in order to trigger lockdown, combined with the proportion of interactions removed during lockdown. The population, its interactions, and the relative strengths of those interactions are stored in a weighted contact network, from which edges are removed during lockdown. These edges are selected using an evolutionary algorithm (EA) designed to minimize total infections. Using the EA to select edges significantly reduces total infections in comparison to random selection. In fact, the EA results for the least strict conditions were similar to or better than the random results for the most strict conditions, showing that a judicious choice of restrictions during lockdown has the greatest effect on reducing infections. Further, when using the most strict rules, a smaller proportion of interactions can be removed to obtain similar or better results in comparison to removing a higher proportion of interactions for less strict rules.
|
16
|
Prediction of drug-induced hepatotoxicity based on histopathological whole slide images. Methods 2023; 212:31-38. [PMID: 36706825 DOI: 10.1016/j.ymeth.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 12/30/2022] [Accepted: 01/19/2023] [Indexed: 01/26/2023] Open
Abstract
The liver is an important metabolic organ in the human body and is sensitive to toxic chemicals and drugs. Adverse reactions caused by drug hepatotoxicity damage the liver, and hepatotoxicity is the leading cause of removal of approved drugs from the market. Therefore, it is of great significance to identify liver toxicity as early as possible in the drug development process. In this study, we developed a predictive model for drug hepatotoxicity based on histopathological whole slide images (WSIs), which are a by-product of drug experiments and have received little attention. To better represent the WSIs, we constructed a graph representation for each WSI by dividing it into small patches, taking sampled patches as nodes and calculating the correlation coefficients between node features as the edges of the graph structure. Then a WSI-level graph convolutional network (GCN) was built to effectively extract the node information of the graph and predict the toxicity. In addition, we introduced a gated attention global context vector (gaGCV) that incorporates the global context so that node features contain more comprehensive information. The results, validated on rat liver in vivo data from the Open TG-GATEs database, show that the use of WSIs for the prediction of toxicity is feasible and effective.
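The graph construction described in this abstract (patches as nodes, correlation coefficients between node features as edges) can be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: the function name `build_patch_graph`, the use of Pearson correlation via NumPy, and the `threshold` parameter for sparsifying edges are all assumptions made for illustration.

```python
import numpy as np


def build_patch_graph(features, threshold=0.5):
    """Build a weighted adjacency matrix from patch feature vectors.

    features: array of shape (n_patches, n_features), one row per sampled patch.
    threshold: hypothetical cut-off; correlations below it are dropped.
    Returns an (n_patches, n_patches) matrix whose entry (i, j) is the
    Pearson correlation between patches i and j, or 0 if no edge is kept.
    """
    # np.corrcoef treats each row as one variable, so this yields
    # a patch-by-patch correlation matrix.
    corr = np.corrcoef(features)
    # Keep only sufficiently strong positive correlations as edges.
    adj = np.where(corr >= threshold, corr, 0.0)
    # Remove self-loops.
    np.fill_diagonal(adj, 0.0)
    return adj
```

The resulting matrix could then be passed, together with the node features, to any graph convolutional network; how the original work samples patches and extracts features is not specified here.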
|
17
|
The prediction of molecular toxicity based on BiGRU and GraphSAGE. Comput Biol Med 2023; 153:106524. [PMID: 36623439 DOI: 10.1016/j.compbiomed.2022.106524] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/21/2022] [Revised: 12/10/2022] [Accepted: 12/31/2022] [Indexed: 01/04/2023]
Abstract
The prediction of molecular toxicity plays a crucial role in drug discovery, since it allows candidate drug molecules to be screened swiftly. The conventional approach to predicting toxicity relies on in vivo or in vitro laboratory experiments, which can waste significant time and money and may even raise ethical issues. Therefore, using computational approaches to predict molecular toxicity has become a common strategy in modern drug discovery. In this article, we propose a novel model named MTBG, which makes use of both SMILES (simplified molecular input line entry system) strings and graph structures of molecules to extract molecular features for drug toxicity prediction. To verify the performance of the MTBG model, we use the Tox21 dataset and several widely used baseline models. Experimental results demonstrate that our model performs better than these baseline models.
|
18
|
Graph-based pan-genomes: increased opportunities in plant genomics. JOURNAL OF EXPERIMENTAL BOTANY 2023; 74:24-39. [PMID: 36255144 DOI: 10.1093/jxb/erac412] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/27/2022] [Accepted: 10/17/2022] [Indexed: 06/16/2023]
Abstract
Due to the development of sequencing technology and the great reduction in sequencing costs, an increasing number of plant genomes have been assembled, and numerous genomes have revealed large amounts of variations. However, a single reference genome does not allow the exploration of species diversity, and therefore the concept of pan-genome was developed. A pan-genome is a collection of all sequences available for a species, including a large number of consensus sequences, large structural variations, and small variations including single nucleotide polymorphisms and insertions/deletions. A simple linear pan-genome does not allow these structural variations to be intuitively characterized, so graph-based pan-genomes have been developed. These pan-genomes store sequence and structural variation information in the form of nodes and paths to store and display species variation information in a more intuitive manner. The key role of graph-based pan-genomes is to expand the coordinate system of the linear reference genome to accommodate more regions of genetic diversity. Here, we review the origin and development of graph-based pan-genomes, explore their application in plant research, and further highlight the application of graph-based pan-genomes for future plant breeding.
|
19
|
A new infinite family of star normal quotient graphs of twisted wreath type. JOURNAL OF ALGEBRAIC COMBINATORICS 2023; 57:739-751. [PMID: 37092124 PMCID: PMC10115705 DOI: 10.1007/s10801-022-01189-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Accepted: 11/05/2022] [Indexed: 05/03/2023]
Abstract
We construct the first infinite families of locally 2-arc transitive graphs with the property that the automorphism group has two orbits on vertices and is quasiprimitive on exactly one orbit, of twisted wreath type. This work contributes to Giudici, Li and Praeger's program for the classification of locally 2-arc transitive graphs by showing that the star normal quotient twisted wreath category also contains infinitely many graphs.
|
20
|
Some fixed points of multivalued maps in multiplicative metric spaces. Heliyon 2022; 8:e12453. [PMID: 36578388 PMCID: PMC9791866 DOI: 10.1016/j.heliyon.2022.e12453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Revised: 10/27/2022] [Accepted: 12/12/2022] [Indexed: 12/23/2022] Open
Abstract
In this manuscript, new fixed point theorems for multivalued maps, particularly α - ϕ contractions, in complete multiplicative metric spaces are proved. We further prove a fixed point theorem for multivalued mappings on a graph-equipped multiplicative metric space. Our results extend the result of Dosenovic et al. [4], and generalize the result of Ali et al. [2]. We included an example to validate our result.
|
21
|
Topological analysis of brain dynamics in autism based on graph and persistent homology. Comput Biol Med 2022; 150:106202. [PMID: 37859293 DOI: 10.1016/j.compbiomed.2022.106202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Revised: 10/02/2022] [Accepted: 10/09/2022] [Indexed: 11/22/2022]
Abstract
Autism spectrum disorder (ASD) is a heterogeneous disorder with a rapidly growing prevalence. In recent years, the dynamic functional connectivity (DFC) technique has been used to reveal the transient connectivity behavior of ASDs' brains by clustering connectivity matrices into different states. However, the states of DFC have not yet been studied from a topological point of view. In this paper, such a study was performed using global graph metrics and persistent homology (PH) on resting-state functional magnetic resonance imaging (fMRI) data. PH is a recently developed tool in topological data analysis that deals with persistent structures of data. The structural connectivity (SC) and static FC (SFC) were also studied to determine which of SC, SFC, and DFC provides more discriminative topological features when comparing ASDs with typical controls (TCs). Significant discriminative features were found only in states of DFC. Moreover, the best classification performance was offered by persistent homology-based metrics, in two out of four states. In these two states, some networks of ASDs were more segregated and isolated than those of TCs (showing the disruption of network integration in ASDs). The results of this study demonstrated that topological analysis of DFC states could offer discriminative features that were not discriminative in SFC and SC. Also, PH metrics can provide a promising perspective for studying ASD and finding candidate biomarkers.
|
22
|
A new measure for the attitude to mobility of Italian students and graduates: a topological data analysis approach. STAT METHOD APPL-GER 2022; 32:1-35. [PMID: 36311813 PMCID: PMC9589705 DOI: 10.1007/s10260-022-00666-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/12/2022] [Indexed: 11/05/2022]
Abstract
Students' and graduates' mobility is an interesting topic of discussion, especially for the Italian education system and universities. The main reasons for migration, and for the so-called brain drain, can be found in the socio-economic context and in the famous North-South divide. Measuring mobility and understanding its dynamics over time and space are not trivial tasks. Most of the studies in the related literature focus on the determinants of this phenomenon. In this paper, instead, we combine tools from graph theory and Topological Data Analysis to propose a new measure for the attitude to mobility. Each mobility trajectory is represented by a graph, and the importance of the features constituting the graph is evaluated over time using persistence diagrams. The attitude to mobility of the students is then ranked by computing the distance between the individual persistence diagram and the theoretical persistence diagram of the stayer student. The new approach is used to evaluate the mobility of the students who enrolled in an Italian university in 2008. The relation between attitude to mobility and the main socio-demographic variables is investigated.
|
23
|
Circular inference predicts nonuniform overactivation and dysconnectivity in brain-wide connectomes. Schizophr Res 2022; 245:59-67. [PMID: 33618940 DOI: 10.1016/j.schres.2020.12.045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/24/2020] [Revised: 12/22/2020] [Accepted: 12/26/2020] [Indexed: 12/17/2022]
Abstract
Schizophrenia is a severe mental disorder whose neural basis remains difficult to ascertain. Among the available pathophysiological theories, recent work has pointed towards subtle perturbations in the excitation-inhibition (E/I) balance within different neural circuits. Computational approaches have suggested interesting mechanisms that can account for both E/I imbalances and psychotic symptoms. Based on hierarchical neural networks propagating information through a message-passing algorithm, it was hypothesized that changes in the E/I ratio could cause a "circular belief propagation" in which bottom-up and top-down information reverberate. This circular inference (CI) was proposed to account for the clinical features of schizophrenia. Under this assumption, this paper examined the impact of CI on network dynamics in light of brain imaging findings related to psychosis. Using brain-inspired graphical models, we show that CI causes overconfidence and overactivation most specifically at the level of connector hubs (e.g., nodes with many connections allowing integration across networks). By also measuring functional connectivity in these graphs, we provide evidence that CI is able to predict specific changes in modularity known to be associated with schizophrenia. Altogether, these findings suggest that the CI framework may facilitate behavioral and neural research on the multifaceted nature of psychosis.
|
24
|
Analyzing SARS-CoV-2 Sequence Patterns by Semantic Trajectories. Stud Health Technol Inform 2022; 295:197-200. [PMID: 35773842 DOI: 10.3233/shti220696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Since the beginning of the pandemic caused by the emergence of SARS-CoV-2, several variants have been observed all over the world. One of the most recent, Omicron, caused a large spread of the virus within a few days, and several countries reached a record number of infections. Indeed, mutations in the Spike region of the virus played an important role in altering its behavior. Therefore, it is important to understand the evolution of the virus by extracting and analyzing the structure of each variant. In this work, we show how sequence patterns can be analyzed and extracted by means of semantic trajectory modeling. To do so, we designed a graph-based model in which the genome organization is handled using nodes and edges to represent, respectively, the nucleotides and the sequence connections (points of interest and routes for trajectories). The modeling choices and the pattern extraction from the graph made it possible to retrieve a region where a mutation occurred in Omicron (NCBI version: OM011974.1).
|
25
|
Abstract
Graphs are widespread in many real-life practical applications. A fundamental and popular research topic on graphs is investigating the relationship between two given vertices. The relationship between nodes in a graph can be measured by the shortest distance. The number of paths is also a popular metric for assessing the relationship between nodes. In many location-based services, users make decisions on the basis of both metrics. To address this need, we propose a new hybrid metric for road networks, which are special graphs, based on the number of paths subject to a distance constraint. Based on this metric, a most-relevant-node query on road networks is defined. To answer this query, we first propose a Shortest-Distance Constrained DFS, which uses the shortest distance to prune unqualified nodes. To further improve query efficiency, we present a Batch Query DFS algorithm, which needs only one DFS search. Our experiments on four real-life road networks demonstrate the performance of the proposed algorithms.
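The idea of counting paths under a distance constraint while pruning with shortest distances can be sketched as below. This is an illustrative reading of the abstract, not the authors' Shortest-Distance Constrained DFS: the function names, the adjacency-dictionary format, the restriction to simple paths, and the assumption of an undirected graph (so that distances from the target equal distances to it) are all hypothetical choices.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source in a graph given as {u: {v: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def count_bounded_paths(graph, source, target, max_dist):
    """Count simple paths from source to target with total length <= max_dist."""
    inf = float("inf")
    # Undirected graph assumed: distance from target equals distance to target.
    to_target = dijkstra(graph, target)
    if to_target.get(source, inf) > max_dist:
        return 0
    visited = {source}

    def dfs(u, travelled):
        if u == target:
            return 1
        total = 0
        for v, w in graph[u].items():
            if v in visited:
                continue
            # Prune: even the shortest continuation would exceed the bound.
            if travelled + w + to_target.get(v, inf) > max_dist:
                continue
            visited.add(v)
            total += dfs(v, travelled + w)
            visited.discard(v)
        return total

    return dfs(source, 0.0)
```

Because the lower bound `to_target[v]` is exact, the pruning never discards a qualifying path; it only avoids exploring branches that cannot reach the target within the bound.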
|
26
|
A GIS-aided cellular automata system for monitoring and estimating graph-based spread of epidemics. NATURAL COMPUTING 2022; 21:463-480. [PMID: 35757183 PMCID: PMC9214692 DOI: 10.1007/s11047-022-09891-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/26/2022] [Indexed: 06/15/2023]
Abstract
In this study, we introduce an application of a Cellular Automata (CA)-based system for monitoring and estimating the spread of epidemics in the real world, considering the example of a Greek city. The proposed system combines a cellular structure with a graph representation to model the connections among the parts of the area more realistically. The original design of the model is based on a classical SIR (Susceptible-Infected-Recovered) mathematical model. To improve the application's effectiveness, we have enriched the model with parameters that make it self-adjusting and better able to approximate real situations. Thus, disease-related parameters have been introduced, while human interventions such as vaccination have been represented algorithmically. The model incorporates actual geographical data (GIS, geographic information system) to improve its response. We have developed a methodology that allows any area with a given population distribution and geographical data to be represented as a graph associated with the corresponding CA model for epidemic simulation. To validate the proposed model and data display methodology, the city of Eleftheroupoli (Kavala), in Eastern Macedonia-Thrace, Greece, was selected as a testing platform. Tests have been performed at both macroscopic and microscopic levels, and the results confirmed the successful operation of the system and verified the correctness of the proposed methodology.
|
27
|
Graph-based abstractive biomedical text summarization. J Biomed Inform 2022; 132:104099. [PMID: 35700914 DOI: 10.1016/j.jbi.2022.104099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/03/2021] [Revised: 05/23/2022] [Accepted: 05/27/2022] [Indexed: 11/18/2022]
Abstract
Summarization is the process of compressing a text to obtain its important informative parts. In recent years, various methods have been presented to extract important parts of textual documents and present them in a summarized form. The first challenge of these methods is to detect the concepts that best convey the main topic of the text and extract sentences that better describe these essential concepts. The second challenge is the correct interpretation of the essential concepts to generate new paraphrased sentences that are not exactly the same as the sentences in the main text. The first challenge has been addressed by many researchers. However, the second one is still in progress. In this study, we focus on the abstractive summarization of biomedical documents. For the first challenge, a new method is presented based on graph generation and frequent itemset mining for generating extractive summaries by considering the concepts within the biomedical documents. Then, to address the second challenge, a transfer learning-based method is used to generate abstractive summaries from the extractive summaries. The efficiency of the proposed solution has been evaluated by conducting several experiments over BioMed Central and NLM's PubMed datasets. The obtained results show that the proposed approach admits a better interpretation of the main concepts and sentences of biomedical documents for abstractive summarization, obtaining an overall ROUGE of 59.60%, which, on average, is 17% better than state-of-the-art summarization techniques. The source code, datasets, and results are available on GitHub.
|
28
|
Exploiting document graphs for inter sentence relation extraction. J Biomed Semantics 2022; 13:15. [PMID: 35659292 PMCID: PMC9166375 DOI: 10.1186/s13326-022-00267-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 04/12/2022] [Indexed: 11/13/2022] Open
Abstract
Background: Most previous relation extraction (RE) studies have focused on intra sentence relations and have ignored relations that span sentences, i.e. inter sentence relations. Such relations connect entities at the document level rather than as relational facts in a single sentence. Extracting facts that are expressed across sentences leads to some challenges and requires different approaches than those usually applied in recent intra sentence relation extraction. Despite recent results, there are still limitations to be overcome. Results: We present a novel representation for a sequence of consecutive sentences, namely document subgraph, to extract inter sentence relations. Experiments on the BioCreative V Chemical-Disease Relation corpus demonstrate the advantages and robustness of our novel system to extract both intra- and inter sentence relations in biomedical literature abstracts. The experimental results are comparable to state-of-the-art approaches and show the potential by demonstrating the effectiveness of graphs, deep learning-based model, and other processing techniques. Experiments were also carried out to verify the rationality and impact of various additional information and model components. Conclusions: Our proposed graph-based representation helps to extract ∼50% of inter sentence relations and boosts the model performance on both precision and recall compared to the baseline model. Supplementary Information: The online version contains supplementary material available at (10.1186/s13326-022-00267-3).
|
29
|
Plant Systems Biology: Lessons from Teaching. Methods Mol Biol 2022; 2395:1-12. [PMID: 34822146 DOI: 10.1007/978-1-0716-1816-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Systems biology is the study of biological interactions. These interactions exist between biological entities at every scale, from genes to populations, and create incredibly complex networks of feedbacks responsible for emerging behaviors. To study these behaviors, biologists can use models based on mathematical and computational formalisms grounded in a vast existing corpus of theoretical work. This chapter gives an overview of this process of plant systems biology study from the point of view of a teaching course, and introduces the methods and studies presented in this second edition of the "Plant Systems Biology" book series.
|
30
|
Construction of Practical Haplotype Graph (PHG) with the Whole-Genome Sequence Data. Methods Mol Biol 2022; 2443:273-284. [PMID: 35037212 DOI: 10.1007/978-1-0716-2067-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
With emerging sequencing technologies and falling costs, sequence data generation has accelerated from a single individual to multiple (thousands of) individuals of a species. The terabytes of sequence data generated from thousands of individuals consist largely of redundant sequence, the extent of which depends on the level of sequence similarity within the population. Managing such large datasets and creating a unique sequence catalogue from such a large population make it challenging to analyze, store, and retrieve the information. In this chapter, we discuss the practical haplotype graph (PHG), which addresses these challenges and can also retrieve required information such as variants and sequences more efficiently, enabling researchers to manage and assess large genomic datasets.
|
31
|
Approximating Quasi-Stationary Behaviour in Network-Based SIS Dynamics. Bull Math Biol 2021; 84:4. [PMID: 34800180 DOI: 10.1007/s11538-021-00964-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/27/2021] [Accepted: 10/25/2021] [Indexed: 10/19/2022]
Abstract
Deterministic approximations to stochastic Susceptible-Infectious-Susceptible models typically predict a stable endemic steady-state when above threshold. This can be hard to relate to the underlying stochastic dynamics, which has no endemic steady-state but can exhibit approximately stable behaviour. Here, we relate the approximate models to the stochastic dynamics via the definition of the quasi-stationary distribution (QSD), which captures this approximately stable behaviour. We develop a system of ordinary differential equations that approximate the number of infected individuals in the QSD for arbitrary contact networks and parameter values. When the epidemic level is high, these QSD approximations coincide with the existing approximation methods. However, as we approach the epidemic threshold, the models deviate, with these models following the QSD and the existing methods approaching the all susceptible state. Through consistently approximating the QSD, the proposed methods provide a more robust link to the stochastic models.
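As a point of reference for the opening claim, the simplest deterministic SIS approximation (the well-mixed mean-field model, which is an assumption here and not the network-based system developed in the paper) already shows the stable endemic steady state above threshold. The symbols below are illustrative: I is the number of infected individuals, N the population size, β the transmission rate and γ the recovery rate.

```latex
\frac{dI}{dt} = \beta\, I \left(1 - \frac{I}{N}\right) - \gamma I,
\qquad
I^{*} = N\left(1 - \frac{\gamma}{\beta}\right) \quad \text{for } \beta > \gamma .
```

In the corresponding stochastic model the state I = 0 is absorbing, so the only true steady state is the disease-free one; this mismatch is what the quasi-stationary distribution is meant to bridge.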
|
32
|
Uncertainty-guided graph attention network for parapneumonic effusion diagnosis. Med Image Anal 2021; 75:102217. [PMID: 34775280 DOI: 10.1016/j.media.2021.102217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/06/2021] [Revised: 08/12/2021] [Accepted: 08/23/2021] [Indexed: 01/08/2023]
Abstract
Parapneumonic effusion (PPE) is a common condition that causes death in patients hospitalized with pneumonia. Rapid distinction of complicated PPE (CPPE) from uncomplicated PPE (UPPE) in Computed Tomography (CT) scans is of great importance for the management and medical treatment of PPE. However, UPPE and CPPE display similar appearances in CT scans, and it is challenging to distinguish CPPE from UPPE via a single 2D CT image, whether attempted by a human expert, or by any of the existing disease classification approaches. 3D convolutional neural networks (CNNs) can utilize the entire 3D volume for classification: however, they typically suffer from the intrinsic defect of over-fitting. Therefore, it is important to develop a method that not only overcomes the heavy memory and computational requirements of 3D CNNs, but also leverages the 3D information. In this paper, we propose an uncertainty-guided graph attention network (UG-GAT) that can automatically extract and integrate information from all CT slices in a 3D volume for classification into UPPE, CPPE, and normal control cases. Specifically, we frame the distinction of different cases as a graph classification problem. Each individual is represented as a directed graph with a topological structure, where vertices represent the image features of slices, and edges encode the spatial relationship between them. To estimate the contribution of each slice, we first extract the slice representations with uncertainty, using a Bayesian CNN: we then make use of the uncertainty information to weight each slice during the graph prediction phase in order to enable more reliable decision-making. We construct a dataset consisting of 302 chest CT volumetric data from different subjects (99 UPPE, 99 CPPE and 104 normal control cases) in this study, and to the best of our knowledge, this is the first attempt to classify UPPE, CPPE and normal cases using a deep learning method. 
Extensive experiments show that our approach is lightweight in demands, and outperforms accepted state-of-the-art methods by a large margin. Code is available at https://github.com/iMED-Lab/UG-GAT.
|
33
|
Nonsemantic word graphs of texts spanning ∼ 4500 years, including pre-literate Amerindian oral narratives. Data Brief 2021; 38:107296. [PMID: 34458523 PMCID: PMC8379624 DOI: 10.1016/j.dib.2021.107296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 05/11/2021] [Accepted: 08/12/2021] [Indexed: 11/09/2022] Open
Abstract
Non-semantic word graphs obtained from oral reports are useful to describe cognitive decline in psychiatric conditions such as Schizophrenia, as well as education-related gains in discourse structure during typical development. Here we provide non-semantic word graph attributes of texts spanning approximately 4500 years of history, and pre-literate Amerindian oral narratives. The dataset assessed comprises 707 literary texts representative of 9 different Afro-Eurasian traditions (Syro-Mesopotamian, Egyptian, Hinduist, Persian, Judeo-Christian, Greek-Roman, Medieval, Modern and Contemporary), and Amerindian narratives (N = 39) obtained from a single ethnic group from South America (Kalapalo, N = 18), or from a mixed ethnic group from South, Central and North America (non-Kalapalo, N = 21). The present article provides detailed information about each text or narrative, including measurements of four graph attributes of interest: number of nodes (lexical diversity), repeated edges (short-range recurrence), largest strongly connected component (long-range recurrence), and average shortest path (graph length).
|
34
|
Dynamics of epidemic spreading on connected graphs. J Math Biol 2021; 82:52. [PMID: 33864137 PMCID: PMC8051836 DOI: 10.1007/s00285-021-01602-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/23/2020] [Revised: 03/03/2021] [Accepted: 03/26/2021] [Indexed: 01/21/2023]
Abstract
We propose a new model that describes the dynamics of epidemic spreading on connected graphs. Our model consists in a PDE-ODE system where at each vertex of the graph we have a standard SIR model and connections between vertices are given by heat equations on the edges supplemented with Robin like boundary conditions at the vertices modeling exchanges between incident edges and the associated vertex. We describe the main properties of the system, and also derive the final total population of infected individuals. We present a semi-implicit in time numerical scheme based on finite differences in space which preserves the main properties of the continuous model such as the uniqueness and positivity of solutions and the conservation of the total population. We also illustrate our results with a collection of numerical simulations for a selection of connected graphs.
|
35
|
Integrated and segregated frequency architecture of the human brain network. Brain Struct Funct 2021; 226:335-350. [PMID: 33389041 DOI: 10.1007/s00429-020-02174-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/27/2020] [Accepted: 11/09/2020] [Indexed: 12/11/2022]
Abstract
The frequency of brain activity modulates the relationship between the brain and human behavior. Insufficient understanding of frequency-specific features may thus lead to inconsistent explanations of human behavior. However, to date, the frequency-specific features of the human brain functional network at the whole-brain level remain poorly understood. Here, we used resting-state fMRI data and graph-theory analyses to investigate the frequency-specific characteristics of fMRI signals in 12 frequency bands (frequency range 0.01-0.7 Hz) in 75 healthy participants. We found that brain regions with higher level and more complex functions had a more variable functional connectivity pattern but engaged less in higher frequency ranges. Moreover, brain regions that engaged in fewer frequency bands played more integrated roles (i.e., higher network participation coefficient and lower within-module degree) in the functional network, whereas regions that engaged in broader frequency ranges exhibited more segregated functions (i.e., lower network participation coefficient and higher within-module degree). Finally, behavioral analyses revealed that regional frequency variability was associated with a spectrum of behavioral functions from sensorimotor functions to complex cognitive and social functions. Taken together, our results showed that segregated functions are executed in wide frequency ranges, whereas integrated functions are executed mainly in lower frequency ranges. These frequency-specific features of brain networks provided crucial insights into the frequency mechanism of fMRI signals, suggesting that signals in higher frequency ranges should be considered for their relation to cognitive functions.
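For readers unfamiliar with the two node-role measures named above, their commonly used graph-theoretic definitions are given below. It is an assumption that the study uses exactly these forms; the notation is illustrative. Here k_i is the degree of node i, k_{is} the number of its links to nodes in module s, N_M the number of modules, s_i the module containing node i, and \bar{k}_{s_i} and \sigma_{s_i} the mean and standard deviation of within-module degree over the nodes of s_i.

```latex
P_i = 1 - \sum_{s=1}^{N_M} \left( \frac{k_{is}}{k_i} \right)^{2},
\qquad
z_i = \frac{k_{i s_i} - \bar{k}_{s_i}}{\sigma_{s_i}} .
```

A participation coefficient P_i close to 1 means the node's links are spread evenly across modules (an integrative role), whereas a high within-module degree z_i means the node is strongly connected inside its own module (a segregated role).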
|
36
|
Graphs in the COVID-19 news: a mathematics audit of newspapers in Korea. EDUCATIONAL STUDIES IN MATHEMATICS 2021; 108:183-200. [PMID: 34934226 PMCID: PMC7930526 DOI: 10.1007/s10649-021-10029-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 01/12/2021] [Indexed: 05/06/2023]
Abstract
Visual displays in the news media become critical during escalating events such as the COVID-19 pandemic, as they facilitate the communication of complex information to the public. This article investigates the use of graphs in Korea's news media during the COVID-19 outbreak. We selected 12 dates that represent turning points in the outbreak of the disease and collected news stories including graphs from seven Korean daily newspapers issued on those dates. First, we analyzed the usage of graphs in COVID-19 news stories. Quantitative analysis of the types and frequency of graphs used in COVID-19 news stories and qualitative analysis of the content of news stories containing graphs were conducted. Second, we identified cases in which readers may be biased by the mathematical misuse of graphs in the news stories covering COVID-19. The implications of these findings for future teaching and learning of graph literacy in school mathematics courses are discussed.
|
37
|
CGNet: A graph-knowledge embedded convolutional neural network for detection of pneumonia. Inf Process Manag 2021; 58:102411. [PMID: 33100482 PMCID: PMC7569413 DOI: 10.1016/j.ipm.2020.102411] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/20/2020] [Revised: 09/26/2020] [Accepted: 10/10/2020] [Indexed: 02/06/2023]
Abstract
Pneumonia is a global disease that causes high child mortality. The situation has been worsened by the outbreak of the new coronavirus disease COVID-19, which has killed more than 983,907 people so far. People infected by the virus show symptoms such as fever and coughing, as well as pneumonia as the infection progresses. There is broad consensus that timely detection benefits possible treatments and therefore helps contain the spread of COVID-19. X-ray, an expedient imaging technique, has been widely used for the detection of pneumonia caused by COVID-19 and other viruses. To facilitate the diagnosis of pneumonia, we developed a deep learning framework for a binary classification task that classifies chest X-ray images into normal and pneumonia based on our proposed CGNet. CGNet has three components: feature extraction, graph-based feature reconstruction and classification. We first use transfer learning to train state-of-the-art convolutional neural networks (CNNs) for binary classification, and the trained CNNs are used to produce features for the following two components. Then, in graph-based feature reconstruction, we combine features through the graph to reconstruct them. Finally, a shallow neural network named GNet, a one-layer graph neural network that takes the combined features as input, classifies chest X-ray images into normal and pneumonia. Our model achieved the best accuracy at 0.9872, sensitivity at 1 and specificity at 0.9795 on a public pneumonia dataset that includes 5,856 chest X-ray images. To evaluate the performance of our proposed method on detection of pneumonia caused by COVID-19, we also tested it on a public COVID-19 CT dataset, where it achieved the highest performance, with an accuracy of 0.99, a specificity of 1 and a sensitivity of 0.98.
|
38
|
Abstract
Protein-protein interactions (PPIs) are central to cellular functions. Experimental methods for predicting PPIs are well developed but are costly in time and resources and suffer from high false-positive error rates at scale. Computational prediction of PPIs is highly desirable for a mechanistic understanding of cellular processes and offers the potential to identify highly selective drug targets. This chapter outlines the details of developing a deep learning approach to predicting which residues in a protein are involved in forming a PPI (a task known as PPI site prediction). It highlights the key decisions to be made in defining a supervised machine learning project in this domain. It also discusses alternative training regimes for deep learning models that address shortcomings in existing approaches and provide starting points for further research. This chapter is written to serve as a companion to developing deep learning approaches to protein-protein interaction site prediction, and as an introduction to developing geometric deep learning projects operating on protein structure graphs.
|
39
|
Abstract
The accuracy of graph based learning techniques relies on the underlying topological structure and affinity between data points, which are assumed to lie on a smooth Riemannian manifold. However, the assumption of local linearity in a neighborhood does not always hold true. Hence, the Euclidean distance based affinity that determines the graph edges may fail to represent the true connectivity strength between data points. Moreover, the affinity between data points is influenced by the distribution of the data around them and must be considered in the affinity measure. In this paper, we propose two techniques, $CCGA_L$ and $CCGA_N$, that use cross-covariance based graph affinity (CCGA) to represent the relation between data points in a local region. $CCGA_L$ also explores the additional connectivity between data points which share a common local neighborhood. $CCGA_N$ considers the influence of respective neighborhoods of the two immediately connected data points, which further enhance the affinity measure. Experimental results of manifold learning on synthetic datasets show that CCGA is able to represent the affinity measure between data points more accurately. This results in better low dimensional representation. Manifold regularization experiments on standard image dataset further indicate that the proposed CCGA based affinity is able to accurately identify and include the influence of the data points and its common neighborhood that increase the classification accuracy. The proposed method outperforms the existing state-of-the-art manifold regularization methods by a significant margin.
|
40
|
A deep community based approach for large scale content based X-ray image retrieval. Med Image Anal 2020; 68:101847. [PMID: 33249389 DOI: 10.1016/j.media.2020.101847] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/23/2019] [Revised: 07/31/2020] [Accepted: 08/19/2020] [Indexed: 02/01/2023]
Abstract
A computer assisted system for automatic retrieval of medical images with similar image contents can serve as an efficient management tool for handling and mining large scale data, and can also be used as a tool in clinical decision support systems. In this paper, we propose a deep community based automated medical image retrieval framework for extracting similar images from a large scale X-ray database. The framework integrates a deep learning-based image feature generation approach and a network community detection technique to extract similar images. When compared with the state-of-the-art medical image retrieval techniques, the proposed approach demonstrated improved performance. We evaluated the performance of the proposed method on two large scale chest X-ray datasets, where given a query image, the proposed approach was able to extract images with similar disease labels with a precision of 85%. To the best of our knowledge, this is the first deep community based image retrieval application on large scale chest X-ray database.
|
41
|
The History of Writing Reflects the Effects of Education on Discourse Structure: Implications for Literacy, Orality, Psychosis and the Axial Age. Trends Neurosci Educ 2020; 21:100142. [PMID: 33303107 DOI: 10.1016/j.tine.2020.100142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/30/2018] [Revised: 09/22/2020] [Accepted: 09/24/2020] [Indexed: 11/28/2022]
Abstract
BACKGROUND Graph analysis detects psychosis and literacy acquisition. Bronze Age literature has been proposed to contain childish or psychotic features, which would only have matured during the Axial Age (∼800-200 BC), a putative boundary for contemporary mentality. METHOD Graph analysis of literary texts spanning ∼4,500 years shows remarkable asymptotic changes over time. RESULTS While lexical diversity, long-range recurrence and graph length increase away from randomness, short-range recurrence declines towards random levels. Bronze Age texts are structurally similar to oral reports from literate typical children and literate psychotic adults, but distinct from poetry, and from narratives by preliterate preschoolers or Amerindians. Text structure reconstitutes the "arrow-of-time", converging to educated adult levels at the Axial Age onset. CONCLUSION The educational pathways of oral and literate traditions are structurally divergent, with a decreasing range of recurrence in the former, and an increasing range of recurrence in the latter. Education is seemingly the driving force underlying discourse maturation.
|
42
|
Tetramer protein complex interface residue pairs prediction with LSTM combined with graph representations. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020; 1868:140504. [PMID: 32717382 DOI: 10.1016/j.bbapap.2020.140504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2020] [Revised: 06/30/2020] [Accepted: 07/16/2020] [Indexed: 10/23/2022]
Abstract
MOTIVATION Protein-protein interactions are important for many biological processes. A theoretical understanding of the structural factors that determine interaction sites will help to explain the underlying mechanism of protein-protein interactions, so methods that use advanced mathematics to predict interaction sites correctly will be useful. Although some previous studies have addressed the interaction interface of protein monomers and the interface residues between chains of protein dimers, very few have addressed interface residue prediction for protein multimers, including trimers, tetramers and larger complexes with even more monomers. Many proteins function as multibody protein complexes, and the structural complexity of protein multimers makes interface residue prediction on them difficult. We therefore aimed to build a method for predicting protein tetramer interface residue pairs. RESULTS We developed a new deep network that combines an LSTM network with graph representations to predict the interface residue pairs of protein tetramers. Protein structure data differ from image or video data, which are well-arranged matrices (the Euclidean structure mentioned in many studies). Because non-Euclidean data do not preserve translation invariance, and because we wanted to extract spatial features from such data for deep learning, we developed an algorithm that predicts interface residue pairs from a topological graph, which relates vertices and edges as in graph theory, combined with a multilayer Long Short-Term Memory network. First, we selected training and test samples from the Protein Data Bank and extracted the physicochemical and geometric features of surface residues associated with interfacial properties. Subsequently, we transformed the protein multimer data into topological graphs and predicted protein interaction interface residue pairs using the model. In addition, several types of evaluation indicators verified its validity.
|
43
|
A method to visualize a complete sensitivity analysis for loss to follow-up in clinical trials. Contemp Clin Trials Commun 2020; 19:100586. [PMID: 32577583 PMCID: PMC7300145 DOI: 10.1016/j.conctc.2020.100586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2020] [Revised: 05/26/2020] [Accepted: 06/07/2020] [Indexed: 12/04/2022] Open
Abstract
Loss to follow-up occurs in randomized controlled trials. Missing data methods, including multiple imputation (MI), can be used but often rely upon untestable assumptions. Sensitivity analysis can quantify violations of these assumptions. Since an adequate sensitivity analysis requires evaluation of multiple scenarios, presenting this information in an easily interpretable manner is challenging. We propose to graphically represent a thorough sensitivity analysis displaying all possible outcomes for loss to follow-up in randomized controlled trial data relating a completely observed binary exposure to a binary outcome. We describe plausible results under different missingness mechanisms using data from the EAGeR Trial (n = 1228) on low-dose aspirin versus placebo on pregnancy and live birth, in which 140 participants had early withdrawal. For the effect of aspirin on live birth, sensitivity analysis risk ratios (RR) for all potential outcome scenarios ranged from 0.88 to 1.34, applicable to any possible missingness mechanism. MI produced RR = 1.10; 95% confidence interval: (0.98, 1.22). RRs from individual imputations ranged from 1.04 to 1.16, the range of results that could have been observed if data were missing at random. Under this mechanism, the conclusions about the efficacy of low-dose aspirin could have been sensitive to the missing outcome data. Rather than limiting sensitivity analysis for loss to follow-up to a few scenarios that can be presented tabularly, results of a complete sensitivity analysis can be presented in a single plot, which should be implemented in all studies with missing outcome data to convey certainty or uncertainty, confidence or caution. Loss to follow-up can cause selection bias in randomized controlled trials. Sensitivity analysis is often limited to a few scenarios presented tabularly. A complete sensitivity analysis is demonstrated for binary exposure and outcomes. 
All possible and probable outcomes can be graphically displayed in a single plot.
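The idea of a complete sensitivity analysis for a binary exposure and a binary outcome can be sketched as an enumeration: every possible number of events among the participants lost to follow-up in each arm gives one risk ratio. The sketch below is an illustration under stated assumptions, not the authors' implementation. The function name `risk_ratio_grid` and the per-arm counts are hypothetical (the abstract does not report per-arm counts for the EAGeR Trial), and risks are computed over everyone randomized.

```python
from itertools import product


def risk_ratio_grid(events_trt, observed_trt, missing_trt,
                    events_ctl, observed_ctl, missing_ctl):
    """Map (k_trt, k_ctl) -> risk ratio, where k_* is the assumed number of
    events among participants with missing outcomes in each arm."""
    n_trt = observed_trt + missing_trt  # all randomized to treatment
    n_ctl = observed_ctl + missing_ctl  # all randomized to control
    grid = {}
    for k_trt, k_ctl in product(range(missing_trt + 1), range(missing_ctl + 1)):
        risk_trt = (events_trt + k_trt) / n_trt
        risk_ctl = (events_ctl + k_ctl) / n_ctl
        if risk_ctl > 0:  # the risk ratio is undefined when control risk is zero
            grid[(k_trt, k_ctl)] = risk_trt / risk_ctl
    return grid


# Hypothetical counts: 90 observed and 10 missing outcomes per arm.
grid = risk_ratio_grid(50, 90, 10, 45, 90, 10)
print(len(grid), min(grid.values()), max(grid.values()))
```

Each key of `grid` is one scenario, so the dictionary can be drawn directly as a two-dimensional heat map with one axis per arm, which is the kind of single-plot display the abstract argues for.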
|
44
|
Temporal phenotyping for transitional disease progress: An application to epilepsy and Alzheimer's disease. J Biomed Inform 2020; 107:103462. [PMID: 32562896 DOI: 10.1016/j.jbi.2020.103462] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/26/2019] [Revised: 05/27/2020] [Accepted: 05/30/2020] [Indexed: 11/21/2022]
Abstract
Complicated multifactorial diseases can progress from one disease to others. For example, existing studies consider Alzheimer's disease (AD) a comorbidity of epilepsy, but also recognize that epilepsy occurs more frequently in patients with AD than in those without. It is important to understand how a disease progresses to more severe diseases. To this end, we develop a transitional phenotyping method based on both longitudinal and cross-sectional relationships between diseases and/or medications. For the cross-sectional approach, we used a skip-gram model to represent co-occurring diseases or medications. For the longitudinal approach, we represented each patient as transition probabilities between medical events and used supervised tensor factorization to decompose them into groups of medical events that develop together. We then combined both sources of information to derive high-risk transitional patterns. We applied our method to disease progress from epilepsy to AD. An epilepsy-AD cohort of 600,000 patients was extracted from Cerner Health Facts data. Our experimental results suggested a causal relationship between epilepsy and later onset of AD, and also identified five epilepsy subgroups with distinct phenotypic patterns leading to AD. While such findings are preliminary, the proposed method combining representation learning with tensor factorization seems to be an effective approach for risk factor analysis.
|
45
|
All graphs of order n ≥ 11 and diameter 2 with partition dimension n - 3. Heliyon 2020; 6:e03694. [PMID: 32322708 PMCID: PMC7160584 DOI: 10.1016/j.heliyon.2020.e03694] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/05/2019] [Revised: 02/04/2020] [Accepted: 03/25/2020] [Indexed: 11/24/2022] Open
Abstract
All graphs of order n with partition dimension 2, n−2, n−1, or n have been characterized. However, finding all graphs on n vertices with partition dimension other than these above numbers is still open. In this paper, we characterize all graphs of order n≥11 and diameter 2 with partition dimension n−3.
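For readers unfamiliar with the term, the partition dimension can be computed by brute force on very small graphs. The sketch below assumes the standard definition, which the abstract does not restate: a partition of the vertex set into classes S_1, ..., S_k is resolving when the distance vectors (d(v, S_1), ..., d(v, S_k)) are distinct for all vertices v, and the partition dimension is the smallest such k. The function names are hypothetical and the search is exponential, so it is only meant for graphs with a handful of vertices.

```python
from collections import deque
from itertools import product


def all_pairs_distances(adj):
    """Breadth-first search from every vertex of an unweighted graph."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def partition_dimension(adj):
    """Smallest k admitting a resolving k-partition.

    Assumes `adj` (vertex -> iterable of neighbors) describes a connected,
    undirected graph; a disconnected input raises KeyError.
    """
    nodes = list(adj)
    n = len(nodes)
    dist = all_pairs_distances(adj)
    for k in range(1, n + 1):
        # Try every assignment of vertices to k classes that uses all k labels.
        for labels in product(range(k), repeat=n):
            if len(set(labels)) < k:
                continue
            classes = [[v for v, c in zip(nodes, labels) if c == j]
                       for j in range(k)]
            # Representation of v: its distance to the nearest vertex of each class.
            reps = {tuple(min(dist[v][u] for u in cls) for cls in classes)
                    for v in nodes}
            if len(reps) == n:
                return k
    raise ValueError("graph must be non-empty")


path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
complete4 = {i: [j for j in range(4) if j != i] for i in range(4)}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

On these three examples the function returns 2 for the path, 4 for the complete graph and 3 for the cycle. The first two match the extreme cases mentioned in the abstract (partition dimension 2 and n).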
|
46
|
NUMERICAL INTEGRATION ON GRAPHS: WHERE TO SAMPLE AND HOW TO WEIGH. MATHEMATICS OF COMPUTATION 2020; 89:1933-1952. [PMID: 33927452 PMCID: PMC8081285 DOI: 10.1090/mcom/3515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Let $G = (V, E, w)$ be a finite, connected graph with weighted edges. We are interested in the problem of finding a subset $W \subset V$ of vertices and weights $a_w$ such that $\frac{1}{|V|} \sum_{v \in V} f(v) \sim \sum_{w \in W} a_w f(w)$ for functions $f : V \to \mathbb{R}$ that are 'smooth' with respect to the geometry of the graph; here $\sim$ indicates that we want the right-hand side to be as close to the left-hand side as possible. The main applications are problems where $f$ is known to vary smoothly over the underlying graph but is expensive to evaluate on even a single vertex. We prove an inequality showing that the integration problem can be rewritten as a geometric problem ('the optimal packing of heat balls'). We discuss how one would construct approximate solutions of the heat ball packing problem; numerical examples demonstrate the efficiency of the method.
|
47
|
Improving parallel executions by increasing task granularity in task-based runtime systems using acyclic DAG clustering. PeerJ Comput Sci 2020; 6:e247. [PMID: 33816899 PMCID: PMC7924455 DOI: 10.7717/peerj-cs.247] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2019] [Accepted: 11/20/2019] [Indexed: 11/25/2022]
Abstract
The task-based approach is a parallelization paradigm in which an algorithm is transformed into a directed acyclic graph of tasks: the vertices are computational elements extracted from the original algorithm and the edges are the dependencies between them. During execution, the management of the dependencies adds an overhead that can become significant when the computational cost of the tasks is low. One way to reduce the makespan is to aggregate the tasks to make them heavier, while having fewer of them, with the objective of mitigating the importance of the overhead. In this paper, we study an existing clustering/partitioning strategy to speed up the parallel execution of a task-based application. We provide two additional heuristics to this algorithm and perform an in-depth study on a large graph set. In addition, we propose a new model to estimate the execution duration and use it to choose the proper granularity. We show that this strategy allows speeding up a real numerical application by a factor of 7 on a multi-core system.
|
48
|
Techniques and Applications in Skin OCT Analysis. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2020; 1213:149-163. [PMID: 32030669 DOI: 10.1007/978-3-030-33128-3_10] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/11/2022]
Abstract
The skin is the largest organ of our body. Skin disease abnormalities which occur within the skin layers are difficult to examine visually and often require biopsies to make a confirmation on a suspected condition. Such invasive methods are not well-accepted by children and women due to the possibility of scarring. Optical coherence tomography (OCT) is a non-invasive technique enabling in vivo examination of sub-surface skin tissue without the need for excision of tissue. However, one of the challenges in OCT imaging is the interpretation and analysis of OCT images. In this review, we discuss the various methodologies in skin layer segmentation and how it could potentially improve the management of skin diseases. We also present a review of works which use advanced machine learning techniques to achieve layers segmentation and detection of skin diseases. Lastly, current challenges in analysis and applications are also discussed.
|
49
|
LEMON: a method to construct the local strains at horizontal gene transfer sites in gut metagenomics. BMC Bioinformatics 2019; 20:702. [PMID: 31881904 PMCID: PMC6933643 DOI: 10.1186/s12859-019-3301-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/20/2019] [Accepted: 12/02/2019] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND Horizontal Gene Transfer (HGT) refers to the transfer of genetic materials between organisms through mechanisms other than parent-offspring inheritance. HGTs may affect human health through a large number of microorganisms, especially the gut microbiomes which the human body harbors. The transferred segments may lead to complicated local genome structural variations. Details of the local genome structure can elucidate the effects of the HGTs. RESULTS In this work, we propose a graph-based method to reconstruct the local strains from the gut metagenomics data at the HGT sites. The method is implemented in a package named LEMON. Simulation results indicate that the method can identify transferred segments accurately on reference sequences of the microbiome, and that LEMON can recover local strains with complicated structural variation. Furthermore, the gene fusion points detected in real data near HGT breakpoints validate the accuracy of LEMON. Some strains reconstructed by LEMON have a replication time profile with lower standard error, which demonstrates that the HGT events recovered by LEMON are reliable. CONCLUSIONS With LEMON we can reconstruct the sequence structure of bacteria that harbor HGT events. This helps us to study gene flow among different microbial species.
|
50
|
Disruption of posterior brain functional connectivity and its relation to cognitive impairment in idiopathic REM sleep behavior disorder. NEUROIMAGE-CLINICAL 2019; 25:102138. [PMID: 31911344 PMCID: PMC6948254 DOI: 10.1016/j.nicl.2019.102138] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 08/09/2019] [Revised: 12/16/2019] [Accepted: 12/21/2019] [Indexed: 12/12/2022]
Abstract
There is a reduced brain posterior functional connectivity in IRBD patients. Reduced temporo-parietal connectivity correlates with mental processing slowness. Left superior parietal lobule has reduced centrality in IRBD patients.
Background Resting-state functional MRI has been proposed as a new biomarker of prodromal neurodegenerative disorders, but it has been poorly investigated in the idiopathic form of rapid-eye-movement sleep behavior disorder (IRBD), a clinical harbinger of subsequent synucleinopathy. In particular, a complex-network approach has not been tested to study brain functional connectivity in IRBD patients. Objectives The aim of the current work is to characterize resting-state functional connectivity in IRBD patients using a complex-network approach and to determine its possible relation to cognitive impairment. Method Twenty patients with IRBD and 27 matched healthy controls (HC) underwent resting-state functional MRI with a 3T scanner and a comprehensive neuropsychological battery. The functional connectome was studied using threshold-free network-based statistics. Global and local network parameters were calculated based on graph theory and compared between groups. Head motion, age and sex were introduced as covariates in all analyses. Results IRBD patients showed reduced cortico-cortical functional connectivity strength in comparison with HC in edges located in posterior regions (p < 0.05, FWE corrected). This regional pattern was also shown in an independent analysis comprising posterior areas, where decreased connectivity in 51 edges was found, whereas no significant results were detected when an anterior network was considered (p < 0.05, FWE corrected). In the posterior network, the left superior parietal lobule had reduced centrality in IRBD. Functional connectivity strength between the left inferior temporal lobe and the right superior parietal lobule positively correlated with mental processing speed in IRBD (r = .633; p = .003). No significant correlations were found in the HC group. Conclusion Our findings support the presence of disrupted posterior functional brain connectivity in IRBD patients, similar to that found in synucleinopathies. Moreover, connectivity reductions in IRBD were associated with lower performance in the mental processing speed domain.
|