1
Host genetic basis of COVID-19: from methodologies to genes. Eur J Hum Genet 2022; 30:899-907. [PMID: 35618891 PMCID: PMC9135575 DOI: 10.1038/s41431-022-01121-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/21/2021] [Revised: 04/04/2022] [Accepted: 05/09/2022] [Indexed: 01/03/2023]
Abstract
The COVID-19 pandemic caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is having a massive impact on public health, societies, and economies worldwide. Despite the ongoing vaccination program, treating COVID-19 remains a high priority; thus, a better understanding of the disease is urgently needed. Initially, susceptibility was associated with age, sex, and pre-existing comorbidities. However, as these conditions alone could not explain the highly variable clinical manifestations of SARS-CoV-2 infection, attention shifted toward identifying the genetic basis of COVID-19. Thanks to international collaborations like The COVID-19 Host Genetics Initiative, it became possible to elucidate numerous genetic markers that are not only likely to help explain the varied clinical outcomes of COVID-19 patients but can also guide the development of novel diagnostics and therapeutics. Within this framework, this review delineates GWAS and the burden test as the traditional methodologies employed so far to discover the human genetic basis of COVID-19, with particular attention to recently emerged predictive models such as the post-Mendelian model. A summary table with the main genome-wide significant genomic loci is provided. In addition, various common and rare variants identified in genes like TLR7, CFTR, ACE2, TMPRSS2, TLR3, and SELP are described in detail to illustrate their association with disease severity.
2
Abstract
A hesitant fuzzy set (HFS) and a cubic set (CS) are two independent approaches to dealing with hesitancy and vagueness simultaneously. An HFS assigns an essential hesitant grade to each object in the universe, whereas a CS deals with uncertain information in terms of fuzzy sets as well as interval-valued fuzzy sets. A cubic hesitant fuzzy set (CHFS) is a new computational intelligence approach that combines CS and HFS. The primary objective of this paper is to define the topological structure of CHFSs under P(R)-order as well as to develop a new topological data analysis technique. For these objectives, we propose the concept of “cubic hesitant fuzzy topology (CHF topology)”, which is based on CHFSs under both P-order and R-order. The idea of CHF points gives rise to the study of several properties of CHF topology, such as CHF closure, CHF exterior, CHF interior, CHF frontier, etc. We also define the notions of CHF subspace and CHF base in CHF topology and present related results. We propose two algorithms, one for an extended cubic hesitant fuzzy TOPSIS method and one for the CHF topology method. The symmetry of the optimal decision is analyzed by computations with both algorithms. A numerical analysis illustrates the methods on similar medical diagnoses. We also discuss a case study of heart failure diagnosis based on CHF information and the modified TOPSIS approach.
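For orientation, the classical (crisp) TOPSIS ranking that the abstract's extended method builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's cubic hesitant fuzzy algorithm: the function name `topsis`, its parameters, and the example numbers are hypothetical, and scores are plain real numbers rather than CHF values.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives with classical (crisp) TOPSIS.

    matrix  : (n_alternatives, n_criteria) scores
    weights : (n_criteria,) criterion weights
    benefit : (n_criteria,) booleans, True if larger is better
    Assumes no criterion column is all zeros and that the
    alternatives are not all identical (avoids division by zero).
    """
    m = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalise weights to sum to 1
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalise each criterion column, then apply the weights.
    v = m / np.linalg.norm(m, axis=0) * w

    # Ideal and anti-ideal value for every criterion.
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))

    # Euclidean distance of each alternative to both reference points.
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)

    # Relative closeness in [0, 1]; higher means closer to the ideal.
    return d_neg / (d_pos + d_neg)

# Hypothetical example: two alternatives, one benefit and one cost criterion.
scores = topsis([[9, 2], [5, 4]], weights=[0.5, 0.5], benefit=[True, False])
ranking = np.argsort(-scores)            # best alternative first
```

In this toy input the first alternative is better on both criteria, so it coincides with the ideal point and is ranked first.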
3
Reimann MW, Riihimäki H, Smith JP, Lazovskis J, Pokorny C, Levi R. Topology of synaptic connectivity constrains neuronal stimulus representation, predicting two complementary coding strategies. PLoS One 2022; 17:e0261702. [PMID: 35020728 PMCID: PMC8754339 DOI: 10.1371/journal.pone.0261702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/11/2021] [Accepted: 12/07/2021] [Indexed: 11/18/2022]
Abstract
In motor-related brain regions, movement intention has been successfully decoded from in vivo spike trains by isolating a lower-dimensional manifold to which the high-dimensional spiking activity is constrained. The mechanism enforcing this constraint remains unclear, although it has been hypothesized to be implemented by the connectivity of the sampled neurons. We test this idea and explore the interactions between local synaptic connectivity and its ability to encode information in a lower-dimensional manifold through simulations of a detailed microcircuit model with realistic sources of noise. We confirm that even in isolation such a model can encode the identity of different stimuli in a lower-dimensional space. We then demonstrate that the reliability of the encoding depends on the connectivity between the sampled neurons by specifically sampling populations whose connectivity maximizes certain topological metrics. Finally, we developed an alternative method for determining stimulus identity from the activity of neurons by combining their spike trains with their recurrent connectivity. We found that this method performs better for sampled groups of neurons that perform worse under the classical approach, predicting the possibility of two separate encoding strategies in a single microcircuit.
Affiliation(s)
- Michael W. Reimann
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, Switzerland
- Jason P. Smith
  - University of Aberdeen, Aberdeen, United Kingdom
  - Nottingham Trent University, Nottingham, United Kingdom
- Jānis Lazovskis
  - University of Aberdeen, Aberdeen, United Kingdom
  - University of Latvia, Rīga, Latvia
- Christoph Pokorny
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Geneva, Switzerland
- Ran Levi
  - University of Aberdeen, Aberdeen, United Kingdom
4
Carr E, Carrière M, Michel B, Chazal F, Iniesta R. Identifying homogeneous subgroups of patients and important features: a topological machine learning approach. BMC Bioinformatics 2021; 22:449. [PMID: 34544357 PMCID: PMC8451168 DOI: 10.1186/s12859-021-04360-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2021] [Accepted: 09/07/2021] [Indexed: 11/19/2022]
Abstract
Background: This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph. Results: We present a pipeline to identify and summarise clusters based on statistically significant topological features from a point cloud using Mapper. Conclusions: Key strengths of this pipeline include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types. Our pipeline can be downloaded under the GNU GPLv3 license at https://github.com/kcl-bhi/mapper-pipeline.
Affiliation(s)
- Ewan Carr
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Bertrand Michel
  - Ecole Centrale de Nantes, LMJL - UMR CNRS 6629, Nantes, France
- Frédéric Chazal
  - Inria Saclay, Ile-de-France, Alan Turing Building, Palaiseau, France
- Raquel Iniesta
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
5
Ohanuba F, Ismail M, Ali MM. Topological data analysis via unsupervised machine learning for recognizing atmospheric river patterns on flood detection. Scientific African 2021. [DOI: 10.1016/j.sciaf.2021.e00968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
6
Bussola N, Papa B, Melaiu O, Castellano A, Fruci D, Jurman G. Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology. Int J Mol Sci 2021; 22:8804. [PMID: 34445517 PMCID: PMC8396341 DOI: 10.3390/ijms22168804] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/05/2021] [Revised: 08/10/2021] [Accepted: 08/11/2021] [Indexed: 02/06/2023]
Abstract
We introduce here a novel machine learning (ML) framework to address the issue of the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on tissue digital slides stained with the CD3 T-cell marker. The training set consists of 3782 images extracted from an original collection of 54 whole slide images (WSIs), manually annotated for a total of 73,751 lymphocytes. Resampling strategies, data augmentation, and transfer learning approaches are adopted to warrant reproducibility and to reduce the risk of overfitting and selection bias. Topological data analysis (TDA) is then used to define activation maps from different layers of the neural network at different stages of the training process, described by persistence diagrams (PD) and Betti curves. TDA is further integrated with the uniform manifold approximation and projection (UMAP) dimensionality reduction and the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm to cluster the relevant subgroups and structures by their deep features across different levels of the neural network. Finally, the recent TwoNN approach is leveraged to study the variation of the intrinsic dimensionality of the U-Net model. As the main task, the proposed pipeline is employed to evaluate the density of lymphocytes over the whole tissue area of the WSIs. The model achieves good results, with a mean absolute error of 3.1 on the test set, showing significant agreement between densities estimated by our EUNet model and by trained pathologists, thus indicating the potential of a promising new strategy in the quantification of the immune content in NB specimens. Moreover, the UMAP algorithm unveiled interesting patterns compatible with pathological characteristics, also highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process. All the experiments were run on the Microsoft Azure cloud platform.
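As a rough illustration of one descriptor named in this abstract, a Betti curve can be read off a persistence diagram by counting, at each filtration value, the features that are already born and not yet dead. The sketch below is not the authors' pipeline; the function name `betti_curve`, its arguments, and the half-open convention birth <= t < death are assumptions made for this example.

```python
import numpy as np

def betti_curve(diagram, grid):
    """Count the persistence intervals alive at each filtration value.

    diagram : iterable of (birth, death) pairs for one homology dimension;
              death may be np.inf for features that never die
    grid    : 1-D iterable of filtration values to evaluate
    A feature is counted as alive at t when birth <= t < death.
    """
    d = np.asarray(diagram, dtype=float).reshape(-1, 2)
    t = np.asarray(grid, dtype=float)

    births = d[:, 0][:, None]            # shape (n_features, 1)
    deaths = d[:, 1][:, None]

    # Broadcast to (n_features, n_grid) and sum over the features.
    alive = (births <= t[None, :]) & (t[None, :] < deaths)
    return alive.sum(axis=0)

# Hypothetical diagram with one feature that never dies.
curve = betti_curve([(0, 2), (1, 3), (0, np.inf)], grid=[0, 1, 2, 3])
print(curve)  # [2 3 2 1]
```

Sampling the curve on a fixed grid gives a fixed-length vector, which is one common way to feed topological summaries into downstream clustering or classification.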
Affiliation(s)
- Nicole Bussola
  - Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy
  - CIBIO Department, University of Trento, 38123 Trento, Italy
- Bruno Papa
  - Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy
- Ombretta Melaiu
  - Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy
- Aurora Castellano
  - Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy
- Doriana Fruci
  - Department of Paediatric Haematology/Oncology and of Cell and Gene Therapy, Ospedale Pediatrico Bambino Gesù IRCCS, 00146 Rome, Italy
- Giuseppe Jurman
  - Data Science for Health, Fondazione Bruno Kessler, 38123 Trento, Italy
7
Chekroud AM, Bondar J, Delgadillo J, Doherty G, Wasil A, Fokkema M, Cohen Z, Belgrave D, DeRubeis R, Iniesta R, Dwyer D, Choi K. The promise of machine learning in predicting treatment outcomes in psychiatry. World Psychiatry 2021; 20:154-170. [PMID: 34002503 PMCID: PMC8129866 DOI: 10.1002/wps.20882] [Citation(s) in RCA: 150] [Impact Index Per Article: 50.0] [Indexed: 12/14/2022]
Abstract
For many years, psychiatrists have tried to understand factors involved in response to medications or psychotherapies, in order to personalize their treatment choices. There is now a broad and growing interest in the idea that we can develop models to personalize treatment decisions using new statistical approaches from the field of machine learning and applying them to larger volumes of data. In this pursuit, there has been a paradigm shift away from experimental studies to confirm or refute specific hypotheses towards a focus on the overall explanatory power of a predictive model when tested on new, unseen datasets. In this paper, we review key studies using machine learning to predict treatment outcomes in psychiatry, ranging from medications and psychotherapies to digital interventions and neurobiological treatments. Next, we focus on some new sources of data that are being used for the development of predictive models based on machine learning, such as electronic health records, smartphone and social media data, and on the potential utility of data from genetics, electrophysiology, neuroimaging and cognitive testing. Finally, we discuss how far the field has come towards implementing prediction tools in real-world clinical practice. Relatively few retrospective studies to date include appropriate external validation procedures, and there are even fewer prospective studies testing the clinical feasibility and effectiveness of predictive models. Applications of machine learning in psychiatry face some of the same ethical challenges posed by these techniques in other areas of medicine or computer science, which we discuss here. In short, machine learning is a nascent but important approach to improve the effectiveness of mental health care, and several prospective clinical studies suggest that it may be working already.
Affiliation(s)
- Adam M Chekroud
  - Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
  - Spring Health, New York City, NY, USA
- Jaime Delgadillo
  - Clinical Psychology Unit, Department of Psychology, University of Sheffield, Sheffield, UK
- Gavin Doherty
  - School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- Akash Wasil
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Marjolein Fokkema
  - Department of Methods and Statistics, Institute of Psychology, Leiden University, Leiden, The Netherlands
- Zachary Cohen
  - Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, USA
- Robert DeRubeis
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Raquel Iniesta
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neurosciences, King's College London, London, UK
- Dominic Dwyer
  - Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich, Germany
- Karmel Choi
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
8
Adiga R. Benchmarking Datasets from Malaria Cytotoxic T-cell Epitopes Using Machine Learning Approach. Avicenna J Med Biotechnol 2021; 13:87-91. [PMID: 34012524 PMCID: PMC8112139 DOI: 10.18502/ajmb.v13i2.5527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Background: Epitope prediction remains a major challenge in malaria due to the unique parasite biology, in addition to rapidly evolving parasite sequence variation in Plasmodium species. Although several models for epitope prediction exist, they are not useful in Plasmodium-specific epitope development. Hence, it was proposed to use machine learning based methods to develop a peptide sequence based epitope predictor specific for malaria. Methods: Model datasets were developed and performance was tested using various machine learning algorithms. Machine learning classifiers were trained on epitope data using sequence features, and amino acid physicochemical properties were compared to yield a valid prediction model. Results: The findings from the analysis reveal that the model developed using selected classifiers after preprocessing by Waikato Environment for Knowledge Analysis (WEKA) performed better than other methods. The datasets for benchmarks of performance are deposited in the repository https://github.com/githubramaadiga/epitope_dataset. Conclusion: The study is the first in-silico study on benchmarking Plasmodium cytotoxic T cell epitope datasets using a machine learning approach. The peptide based predictors have been used for the first time to classify cytotoxic T cell epitopes in malaria. Algorithms have been evaluated using real datasets from malaria to obtain the model.
Affiliation(s)
- Rama Adiga
  - Nitte (Deemed to be University), Nitte University Centre for Science Education & Research (NUCSER), Division of Bioinformatics and Computational Genomics, Deralakatte, Paneer Campus, Mangalore, India 575018
9
Dagliati A, Geifman N, Peek N, Holmes JH, Sacchi L, Bellazzi R, Sajjadi SE, Tucker A. Using topological data analysis and pseudo time series to infer temporal phenotypes from electronic health records. Artif Intell Med 2020; 108:101930. [PMID: 32972659 PMCID: PMC7536308 DOI: 10.1016/j.artmed.2020.101930] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/06/2020] [Revised: 05/21/2020] [Accepted: 07/11/2020] [Indexed: 11/17/2022]
Abstract
- Topological Data and Pseudo Time Series to discover Type 2 Diabetes temporal phenotypes.
- Temporal phenotypes inferred from state-space model based on hidden-states transitions.
- Study of states continuous transitions visually delivered in an easily explainable way.
- Mined phenotypes characterized by significant differences in disease deterioration.
Temporal phenotyping enables clinicians to better understand observable characteristics of a disease as it progresses. Modelling disease progression that captures interactions between phenotypes is inherently challenging. Temporal models that capture change in disease over time can identify the key features that characterize disease subtypes that underpin these trajectories. These models will enable clinicians to identify early warning signs of progression in specific sub-types and therefore to make informed decisions tailored to individual patients. In this paper, we explore two approaches to building temporal phenotypes based on the topology of data: topological data analysis and pseudo time-series. Using type 2 diabetes data, we show that the topological data analysis approach is able to identify disease trajectories and that pseudo time-series can infer a state space model characterized by transitions between hidden states that represent distinct temporal phenotypes. Both approaches highlight lipid profiles as key factors in distinguishing the phenotypes.
Affiliation(s)
- Arianna Dagliati
  - Centre for Health Informatics, University of Manchester, Manchester, United Kingdom; Manchester Molecular Pathology Innovation Centre, University of Manchester, United Kingdom; Department of Electrical, Computer & Biomedical Engineering University of Pavia, Italy
- Nophar Geifman
  - Centre for Health Informatics, University of Manchester, Manchester, United Kingdom
- Niels Peek
  - Centre for Health Informatics, University of Manchester, Manchester, United Kingdom; NIHR Manchester Biomedical Research Centre, University of Manchester, United Kingdom
- John H Holmes
  - Department of Biostatistics, Epidemiology, and Informatics, Penn Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, USA
- Lucia Sacchi
  - Department of Electrical, Computer & Biomedical Engineering University of Pavia, Italy
- Riccardo Bellazzi
  - Department of Electrical, Computer & Biomedical Engineering University of Pavia, Italy
- Allan Tucker
  - Department of Computer Science, Brunel University London, United Kingdom