1
Ekpenyong ME, Adegoke AA, Edoho ME, Inyang UG, Udo IJ, Ekaidem IS, Osang F, Uto NP, Geoffery JI. Collaborative Mining of Whole Genome Sequences for Intelligent HIV-1 Sub-Strain(s) Discovery. Curr HIV Res 2022; 20:163-183. [PMID: 35142269 DOI: 10.2174/1570162x20666220210142209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 11/30/2021] [Accepted: 12/20/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND Effective global antiretroviral vaccines and therapeutic strategies depend on the diversity, evolution, and epidemiology of their various strains as well as their transmission and pathogenesis. Most viral disease-causing particles are clustered into a taxonomy of subtypes to suggest pointers toward nucleotide-specific vaccines or therapeutic applications of clinical significance sufficient for sequence-specific diagnosis and homologous viral studies. These are very useful to formulate predictors to induce cross-resistance to some retroviral control drugs being used across study areas. OBJECTIVE This research proposed a collaborative framework of hybridized (Machine Learning and Natural Language Processing) techniques to discover hidden genome patterns and feature predictors for mining HIV-1 genome sequences. METHOD 630 human HIV-1 genome sequences longer than 8500 bp were retrieved from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov) for 21 countries across different continents, excluding Antarctica. These sequences were transformed and learned using a self-organizing map (SOM). To discriminate emerging/new sub-strain(s), the HIV-1 reference genome was included as part of the input isolates/samples during the training. After training the SOM, component planes defining pattern clusters of the input datasets were generated for cognitive knowledge mining and subsequent labelling of the datasets. Additional genome features, including dinucleotide transmission recurrences, codon recurrences, and mutation recurrences, were finally extracted from the raw genomes to construct output classification targets for supervised learning. RESULTS SOM training explains the inherent pattern diversity of HIV-1 genomes as well as inter- and intra-country transmissions in which mobility might play an active role, as corroborated by literature. Nine sub-strains were discovered after disassembling the SOM correlation hunting matrix space attributed to disparate clusters. Cognitive knowledge mining separated similar pattern clusters bounded by a certain degree of correlation range, discovered by the SOM. A Kruskal-Wallis rank-sum test and a Wilcoxon rank-sum test showed statistically significant variations in dinucleotide, codon, and mutation patterns. CONCLUSION Results of the discovered sub-strains and response cluster visualizations corroborate existing literature, with significant haplotype variations. The proposed framework would assist in the development of decision support systems for easy contact tracing, infectious disease surveillance, and studying the progressive evolution of the reference HIV-1 genome.
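The abstract above mentions dinucleotide and codon recurrences extracted from raw genomes. As a rough illustration only, the sketch below counts overlapping dinucleotides and in-frame codons in a nucleotide string. The function names, the reading-frame parameter, and the toy sequence are assumptions made for this example; the paper's actual feature definitions are not given in the abstract and may differ.

```python
from collections import Counter


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides (AA, AC, ...) in a nucleotide string."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def codon_counts(seq: str, frame: int = 0) -> Counter:
    """Count non-overlapping codons, starting at the given reading frame."""
    seq = seq.upper()
    # Stop at len(seq) - 2 so that only complete three-letter codons are counted.
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))


# Toy sequence (hypothetical, not taken from the study's data).
toy = "ATGGCCATG"
print(dinucleotide_counts(toy))
print(codon_counts(toy))
```

Counts of this kind could be normalised by sequence length before being compared across genomes of different sizes.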
Affiliation(s)
- Moses E Ekpenyong
- Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Centre for Research and Development, University of Uyo, Uyo, Nigeria
- Anthony A Adegoke
- Department of Microbiology, Faculty of Science, University of Uyo, Uyo, Nigeria
- Mercy E Edoho
- Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Udoinyang G Inyang
- Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Ifiok J Udo
- Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
- Itemobong S Ekaidem
- Department of Chemical Pathology, College of Health Sciences, University of Uyo, Uyo, Nigeria
- Francis Osang
- Department of Computer Science, Faculty of Science, National Open University, Abuja, Nigeria
- Nseobong P Uto
- School of Mathematics and Statistics, University of St Andrews, Scotland, United Kingdom
- Joseph I Geoffery
- Department of Computer Science, Faculty of Science, University of Uyo, Uyo, Nigeria
2
Decision Model for Predicting Social Vulnerability Using Artificial Intelligence. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2019. [DOI: 10.3390/ijgi8120575] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
Social vulnerability, from a socio-environmental point of view, focuses on the identification of disadvantaged or vulnerable groups and the conditions and dynamics of the environments in which they live. To understand this issue, it is important to identify the factors that explain the difficulty of facing situations with a social disadvantage. Due to its complexity and multidimensionality, it is not always easy to point out the social groups and urban areas affected. This research aimed to assess the connection between certain dimensions of social vulnerability and its urban and dwelling context as a fundamental framework in which it occurs using a decision model useful for the planning of social and urban actions. For this purpose, a holistic approximation was carried out on the census and demographic data commonly used in this type of study, proposing the construction of (i) a knowledge model based on Artificial Neural Networks (Self-Organizing Map), with which a demographic profile is identified and characterized whose indicators point to a presence of social vulnerability, and (ii) a predictive model of such a profile based on rules from dwelling variables constructed by conditional inference trees. These models, in combination with Geographic Information Systems, make a decision model feasible for the prediction of social vulnerability based on housing information.
3
Wear Scar Similarities between Retrieved and Simulator-Tested Polyethylene TKR Components: An Artificial Neural Network Approach. BIOMED RESEARCH INTERNATIONAL 2016; 2016:2071945. [PMID: 27597955 PMCID: PMC5002291 DOI: 10.1155/2016/2071945] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/20/2016] [Accepted: 06/22/2016] [Indexed: 11/30/2022]
Abstract
The aim of this study was to determine how well the wear scars of simulator-tested polyethylene (PE) inserts represent those of PE inserts retrieved from total knee replacement (TKR). By means of a nonparametric self-organizing feature map (SOFM), wear scar images of 21 postmortem- and 54 revision-retrieved components were compared with six simulator-tested components that were tested either in displacement or in load control according to ISO protocols. The SOFM network was then trained with the wear scar images of postmortem-retrieved components since those are considered well-functioning at the time of retrieval. Based on this training process, eleven clusters were established, suggesting considerable variability among wear scars despite an uncomplicated loading history inside their hosts. The remaining components (revision-retrieved and simulator-tested) were then assigned to these established clusters. Five out of six simulator components were clustered together, suggesting that the network was able to identify similarities in loading history. However, the simulator-tested components ended up in a cluster at the fringe of the map containing only 10.8% of retrieved components. This may suggest that current ISO testing protocols were not fully representative of this TKR population, and protocols that better resemble patients' gait after TKR and include activities other than walking may be warranted.
5
George S, Xia T, Rallo R, Zhao Y, Ji Z, Lin S, Wang X, Zhang H, France B, Schoenfeld D, Damoiseaux R, Liu R, Lin S, Bradley KA, Cohen Y, Nel AE. Use of a high-throughput screening approach coupled with in vivo zebrafish embryo screening to develop hazard ranking for engineered nanomaterials. ACS NANO 2011; 5:1805-17. [PMID: 21323332 PMCID: PMC3896549 DOI: 10.1021/nn102734s] [Citation(s) in RCA: 142] [Impact Index Per Article: 10.9] [Indexed: 05/02/2023]
Abstract
Because of concerns about the safety of a growing number of engineered nanomaterials (ENM), it is necessary to develop high-throughput screening and in silico data transformation tools that can speed up in vitro hazard ranking. Here, we report the use of a multiparametric, automated screening assay that incorporates sublethal and lethal cellular injury responses to perform high-throughput analysis of a batch of commercial metal/metal oxide nanoparticles (NP) with the inclusion of a quantum dot (QD1). The responses chosen for tracking cellular injury through automated epifluorescence microscopy included ROS production, intracellular calcium flux, mitochondrial depolarization, and plasma membrane permeability. The z-score transformed high-volume data set was used to construct heat maps for in vitro hazard ranking and to show the similarity patterns of NPs and response parameters through the use of self-organizing maps (SOM). Among the materials analyzed, QD1 and nano-ZnO showed the most prominent lethality, while Pt, Ag, SiO2, Al2O3, and Au triggered sublethal effects but without cytotoxicity. In order to compare the in vitro with the in vivo response outcomes in zebrafish embryos, NPs were used to assess their impact on mortality rate, hatching rate, cardiac rate, and morphological defects. While QDs, ZnO, and Ag induced morphological abnormalities or interfered in embryo hatching, Pt and Ag exerted inhibitory effects on cardiac rate. Ag toxicity in zebrafish differed from the in vitro results, which is congruent with this material's designation as extremely dangerous in the environment. Interestingly, while toxicity in the initially selected QD formulation was due to a solvent (toluene), supplementary testing of additional QD selections yielded in vitro hazard profiling that reflects the release of chalcogenides. In conclusion, the use of high-throughput screening, in silico data handling, and zebrafish testing may constitute a paradigm for rapid and integrated ENM toxicological screening.
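The z-score transformation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: each value is standardised against the mean and population standard deviation of its own list, and the readings are made up. The abstract does not say which reference population or deviation estimate the authors used, so this should not be read as their procedure.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no spread to scale by, so report zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical fluorescence readings for one response parameter.
readings = [1.0, 2.0, 3.0]
print(z_scores(readings))
```

Standardised values like these put different response parameters on a common scale, which is what makes them comparable in a heat map or as SOM inputs.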
Affiliation(s)
- Saji George
- Department of Medicine, Division of NanoMedicine; University of California, Los Angeles, CA, USA
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Tian Xia
- Department of Medicine, Division of NanoMedicine; University of California, Los Angeles, CA, USA
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Robert Rallo
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, USA
- Yan Zhao
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, USA
- Zhaoxia Ji
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Sijie Lin
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Xiang Wang
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Haiyuan Zhang
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Bryan France
- Molecular Shared Screening Resources, University of California, Los Angeles, CA, USA
- David Schoenfeld
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, USA
- Robert Damoiseaux
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Molecular Shared Screening Resources, University of California, Los Angeles, CA, USA
- Rong Liu
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, USA
- Shuo Lin
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, USA
- Kenneth A Bradley
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Department of Microbiology, Immunology and Mol Genetics, University of California, Los Angeles, CA, USA
- Yoram Cohen
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, USA
- André E Nel
- Department of Medicine, Division of NanoMedicine; University of California, Los Angeles, CA, USA
- Center for Environmental Implications of Nanotechnology, California NanoSystems Institute; University of California, Los Angeles, CA, USA
- Address correspondence to
6
Murtola T, Bunker A, Vattulainen I, Deserno M, Karttunen M. Multiscale modeling of emergent materials: biological and soft matter. Phys Chem Chem Phys 2009; 11:1869-92. [PMID: 19279999 DOI: 10.1039/b818051b] [Citation(s) in RCA: 183] [Impact Index Per Article: 12.2] [Indexed: 11/21/2022]
Abstract
In this review, we focus on four current related issues in multiscale modeling of soft and biological matter. First, we discuss how to use structural information from detailed models (or experiments) to construct coarse-grained ones in a hierarchical and systematic way. This is discussed in the context of the so-called Henderson theorem and the inverse Monte Carlo method of Lyubartsev and Laaksonen. In the second part, we take a different look at coarse graining by analyzing conformations of molecules. This is done by the application of self-organizing maps, i.e., a neural network type approach. Such an approach can be used to guide the selection of the relevant degrees of freedom. Then, we discuss technical issues related to the popular dissipative particle dynamics (DPD) method. Importantly, the potentials derived using the inverse Monte Carlo method can be used together with the DPD thermostat. In the final part we focus on solvent-free modeling which offers a different route to coarse graining by integrating out the degrees of freedom associated with solvent.
Affiliation(s)
- Teemu Murtola
- Department of Applied Physics and Helsinki Institute of Physics, Helsinki University of Technology, Finland
7
Murtola T, Kupiainen M, Falck E, Vattulainen I. Conformational analysis of lipid molecules by self-organizing maps. J Chem Phys 2007; 126:054707. [PMID: 17302498 DOI: 10.1063/1.2429066] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
The authors have studied the use of the self-organizing map (SOM) in the analysis of lipid conformations produced by atomic-scale molecular dynamics simulations. First, focusing on the methodological aspects, they have systematically studied how the SOM can be employed in the analysis of lipid conformations in a controlled and reliable fashion. For this purpose, they have used a previously reported 50 ns atomistic molecular dynamics simulation of a 1-palmitoyl-2-linoleoyl-sn-glycero-3-phosphatidylcholine (PLPC) lipid bilayer and analyzed separately the conformations of the headgroup and the glycerol regions, as well as the diunsaturated fatty acid chain. They have elucidated the effect of training parameters on the quality of the results, as well as the effect of the size of the SOM. It turns out that the main conformational states of each region in the molecule are easily distinguished together with a variety of other typical structural features. As a second topic, the authors applied the SOM to the PLPC data to demonstrate how it can be used in analysis that goes beyond the standard methods commonly used to study the structure and dynamics of lipid membranes. Overall, the results suggest that the SOM method provides a relatively simple and robust tool for quickly gaining a qualitative understanding of the most important features of the conformations of the system, without a priori knowledge. It seems plausible that the insight given by the SOM could be applied to a variety of biomolecular systems and the design of coarse-grained models for these systems.
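Since this abstract discusses how training parameters and map size affect SOM results, a small sketch of the basic SOM training loop may help make those parameters concrete. It is a generic, pure-Python, one-dimensional map with a Gaussian neighbourhood and linearly decaying learning rate; the parameter names (`n_nodes`, `epochs`, `lr0`, `sigma0`), their defaults, and the toy data are assumptions for illustration and are not the settings or software used in the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a one-dimensional SOM and return its list of weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 1e-3)   # shrinking neighbourhood width
        for x in rng.sample(data, len(data)):      # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Nodes near the winner on the map are pulled more strongly.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy two-dimensional data with two loose groups (hypothetical values).
data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
som = train_som(data)
print([best_matching_unit(som, x) for x in data])
```

In this sketch, `n_nodes` plays the role of the map size and `lr0`, `sigma0`, and `epochs` are the training parameters; varying them shows how the assignment of samples to nodes can change.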
Affiliation(s)
- Teemu Murtola
- Laboratory of Physics, Helsinki University of Technology, P.O. Box 1100, FI-02015 HUT, Finland
8
Yang ZR, Dry J, Thomson R, Charles Hodgman T. A bio-basis function neural network for protein peptide cleavage activity characterisation. Neural Netw 2006; 19:401-7. [PMID: 16478661 DOI: 10.1016/j.neunet.2005.07.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/12/2003] [Accepted: 07/28/2005] [Indexed: 11/20/2022]
Abstract
This paper presents a novel neural learning algorithm for analysing protein peptides, which comprise amino acids as non-numerical attributes. The algorithm is derived from radial basis function neural networks (RBFNNs) and is referred to as a bio-basis function neural network (BBFNN). The basic principle is to replace the radial basis function used by RBFNNs with a bio-basis function. Each basis in a BBFNN is supported by a peptide. The bases collectively form a feature space, in which each basis represents a feature dimension. A linear classifier is constructed in the feature space for characterising a protein peptide in terms of functional status. The theoretical basis of BBFNN is that peptides that perform the same function will have similar compositions of amino acids. Because of this, the similarity between peptides can have statistical significance for modelling, and the proposed bio-basis function can encode this information from data well. The application to two real cases shows that BBFNN outperformed multi-layer perceptrons and support vector machines.
Affiliation(s)
- Zheng Rong Yang
- Department of Computer Science, University of Exeter, Northcote House, The Queen's Drive, Exeter EX4 4QJ, UK.
10
Yang ZR. Prediction of caspase cleavage sites using Bayesian bio-basis function neural networks. Bioinformatics 2005; 21:1831-7. [PMID: 15671118 DOI: 10.1093/bioinformatics/bti281] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION Apoptosis has drawn the attention of researchers because of its importance in treating some diseases through finding a proper way to block or slow down the apoptosis process. Because caspase cleavage is the key to apoptosis, novel methods or algorithms are essential for studying the specificity of caspase cleavage activity, which in turn supports effective drug design. As bio-basis function neural networks have been shown to outperform some conventional neural learning algorithms, this study investigates their application to the prediction of caspase cleavage sites. RESULTS Thirteen protein sequences with experimentally determined caspase cleavage sites were downloaded from NCBI. Bayesian bio-basis function neural networks are investigated and compared with single-layer perceptrons, multilayer perceptrons, the original bio-basis function neural networks, and support vector machines. The impact of the sliding window size used to generate sub-sequences for modelling on prediction accuracy is studied. The results show that the Bayesian bio-basis function neural network with two Gaussian distributions for model parameters (weights) performed the best, with a highest prediction accuracy of 97.15 ± 1.13%. AVAILABILITY The package of Bayesian bio-basis function neural network can be obtained on request from the author.
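The sliding-window step mentioned in this abstract, which generates sub-sequences for modelling, can be illustrated with a short sketch. The function name, the window size, and the toy peptide string are assumptions for this example; the window sizes actually evaluated in the paper are not stated in the abstract.

```python
def sliding_windows(sequence: str, size: int) -> list:
    """Return every contiguous sub-sequence of the given length, in order."""
    if size <= 0:
        raise ValueError("window size must be positive")
    # A sequence shorter than the window yields an empty list.
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


# Toy amino-acid string (hypothetical, not one of the study's sequences).
print(sliding_windows("DEVDGIK", 4))
```

Each window would then be labelled as containing a cleavage site or not and passed to a classifier; a larger window gives more sequence context per example but fewer examples per protein.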
Affiliation(s)
- Zheng Rong Yang
- Department of Computer Science, Exeter University, Devonshire, UK.