151
Wenck S, Creydt M, Hansen J, Gärber F, Fischer M, Seifert S. Opening the Random Forest Black Box of the Metabolome by the Application of Surrogate Minimal Depth. Metabolites 2021; 12:metabo12010005. [PMID: 35050127 PMCID: PMC8781913 DOI: 10.3390/metabo12010005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/17/2021] [Revised: 12/16/2021] [Accepted: 12/18/2021] [Indexed: 11/16/2022] Open
Abstract
For the untargeted analysis of the metabolome of biological samples with liquid chromatography–mass spectrometry (LC-MS), high-dimensional data sets containing many different metabolites are obtained. Since the utilization of these complex data is challenging, different machine learning approaches have been developed. Those methods are usually applied as black box classification tools, and detailed information about class differences that result from the complex interplay of the metabolites is not obtained. Here, we demonstrate that this information is accessible by the application of random forest (RF) approaches and especially by surrogate minimal depth (SMD), which is applied to metabolomics data for the first time. We show this by the selection of important features and the evaluation of their mutual impact on the multi-level classification of white asparagus regarding provenance and biological identity. SMD enables the identification of multiple features from the same metabolites and reveals meaningful biological relations, proving its high potential for the comprehensive utilization of high-dimensional metabolomics data.

152
Qin X, Ma S, Wu M. Gene-gene interaction analysis incorporating network information via a structured Bayesian approach. Stat Med 2021; 40:6619-6633. [PMID: 34542187 PMCID: PMC8595614 DOI: 10.1002/sim.9202] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/16/2021] [Revised: 08/22/2021] [Accepted: 08/30/2021] [Indexed: 01/14/2023]
Abstract
Increasing evidence has shown that gene-gene interactions have important effects in biological processes of human diseases. Due to the high dimensionality of genetic measurements, interaction analysis usually suffers from a lack of sufficient information and has unsatisfactory results. Biological network information has been massively accumulated, allowing researchers to identify biomarkers while taking a system perspective, conducting network selection (of functionally related biomarkers), and accommodating network structures. In main-effect-only analysis, network information has been incorporated. However, effort has been limited in interaction analysis. Recently, link networks that describe the relationships between genetic interactions have been demonstrated as effective for revealing multiscale hierarchical organizations in networks and providing interesting findings beyond node networks. In this study, we develop a novel structured Bayesian interaction analysis approach to effectively incorporate network information. This study is among the first to identify gene-gene interactions with the assistance of network selection, while simultaneously accommodating the underlying network structures of both main effects and interactions. It innovatively respects multiple hierarchies among main effects, interactions, and networks. The Bayesian technique is adopted, which may be more informative for estimation and prediction than some other techniques. An efficient variational Bayesian expectation-maximization algorithm is developed to explore the posterior distribution. Extensive simulation studies demonstrate the practical superiority of the proposed approach. The analysis of TCGA data on melanoma and lung cancer leads to biologically sensible findings with satisfactory prediction accuracy and selection stability.
Affiliation(s)
- Xing Qin
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, CT, USA
- Mengyun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China

153
Functional random forests for curve response. Sci Rep 2021; 11:24159. [PMID: 34921167 PMCID: PMC8683425 DOI: 10.1038/s41598-021-02265-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 08/20/2021] [Indexed: 11/22/2022] Open
Abstract
The rapid advancement of functional data in various application fields has increased the demand for advanced statistical approaches that can incorporate complex structures and nonlinear associations. In this article, we propose a novel functional random forests (FunFor) approach to model a functional data response that is densely and regularly measured, as an extension of the landmark work of Breiman, who introduced traditional random forests for a univariate response. The FunFor approach is able to predict curve responses for new observations and selects important variables from a large set of scalar predictors. The FunFor approach inherits the efficiency of the traditional random forest approach in detecting complex relationships, including nonlinear and high-order interactions. Additionally, it is a non-parametric approach that imposes no parametric or distributional assumptions. Eight simulation settings and one real-data analysis consistently demonstrate the excellent performance of the FunFor approach in various scenarios. In particular, FunFor successfully ranks the true predictors as the most important variables, while achieving the most robust variable selections and the smallest prediction errors when compared with three other relevant approaches. Although motivated by a biological leaf shape data analysis, the proposed FunFor approach has great potential to be widely applied in various fields due to its minimal requirement on tuning parameters and its distribution-free and model-free nature. An R package named 'FunFor', implementing the FunFor approach, is available at GitHub.

154
Gardiner LJ, Krishna R. Bluster or Lustre: Can AI Improve Crops and Plant Health? Plants (Basel, Switzerland) 2021; 10:plants10122707. [PMID: 34961177 PMCID: PMC8707749 DOI: 10.3390/plants10122707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Revised: 11/24/2021] [Accepted: 12/06/2021] [Indexed: 06/14/2023]
Abstract
In a changing climate where future food security is a growing concern, researchers are exploring new methods and technologies in the effort to meet ambitious crop yield targets. The application of Artificial Intelligence (AI), including Machine Learning (ML) methods, in this area has been proposed as a potential mechanism to support this. This review explores current research in the area to convey the state of the art as to how AI/ML have been used to advance research, gain insights, and generally enable progress in this area. We address the question: Can AI improve crops and plant health? We further discriminate the bluster from the lustre by identifying the key challenges that AI has been shown to address, balanced with the potential issues with its usage, and the key requisites for its success. Overall, we hope to raise awareness and, as a result, promote usage of AI-related approaches where they can have appropriate impact to improve practices in agricultural and plant sciences.

155
Musolf AM, Holzinger ER, Malley JD, Bailey-Wilson JE. What makes a good prediction? Feature importance and beginning to open the black box of machine learning in genetics. Hum Genet 2021; 141:1515-1528. [PMID: 34862561 PMCID: PMC9360120 DOI: 10.1007/s00439-021-02402-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/13/2021] [Accepted: 11/08/2021] [Indexed: 01/26/2023]
Abstract
Genetic data have become increasingly complex within the past decade, leading researchers to pursue increasingly complex questions, such as those involving epistatic interactions and protein prediction. Traditional methods are ill-suited to answer these questions, but machine learning (ML) techniques offer an alternative solution. ML algorithms are commonly used in genetics to predict or classify subjects, but some methods evaluate which features (variables) are responsible for creating a good prediction; this is called feature importance. This is critical in genetics, as researchers are often interested in which features (e.g., SNP genotype or environmental exposure) are responsible for a good prediction. This allows for deeper analysis beyond simple prediction, including the determination of risk factors associated with a given phenotype. Feature importance further permits the researcher to peer inside the black box of many ML algorithms to see how they work and which features are critical in informing a good prediction. This review focuses on ML methods that provide feature importance metrics for the analysis of genetic data. Five major categories of ML algorithms (k-nearest neighbors, artificial neural networks, deep learning, support vector machines, and random forests) are described. The review ends with a discussion of how to choose the best machine for a data set. This review will be particularly useful for genetic researchers looking to use ML methods to answer questions beyond basic prediction and classification.
Affiliation(s)
- Anthony M Musolf
  - Statistical Genetics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive Suite 1200, Baltimore, MD, 21224, USA
- Emily R Holzinger
  - Target Sciences, Informatics and Predictive Sciences, Bristol Myers Squibb, Cambridge, MA, USA
- James D Malley
  - Statistical Genetics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive Suite 1200, Baltimore, MD, 21224, USA
- Joan E Bailey-Wilson
  - Statistical Genetics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive Suite 1200, Baltimore, MD, 21224, USA

156
Elastic Correlation Adjusted Regression (ECAR) scores for high dimensional variable importance measuring. Sci Rep 2021; 11:23354. [PMID: 34857823 PMCID: PMC8640025 DOI: 10.1038/s41598-021-02706-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 11/22/2021] [Indexed: 11/08/2022] Open
Abstract
Investigation of the genetic basis of traits or clinical outcomes heavily relies on identifying relevant variables in molecular data. However, characteristics such as high dimensionality and complex correlation structures of these data hinder the development of related methods, resulting in the inclusion of false positives and negatives. We developed a variable importance measure, termed the ECAR scores, that evaluates the importance of variables in the dataset. Based on this score, ranking and selection of variables can be achieved simultaneously. Unlike most current approaches, the ECAR scores aim to rank the influential variables as high as possible while maintaining the grouping property, instead of selecting the ones that are merely predictive. The ECAR scores' performance is tested and compared to other methods on simulated, semi-synthetic, and real datasets. Results showed that the ECAR scores improve on the CAR scores in terms of accuracy of variable selection and the predictive power of high-ranked variables. They also outperform other classic methods such as lasso and stability selection when there is a high degree of correlation among influential variables. As an application, we used the ECAR scores to analyze genes associated with forced expiratory volume in the first second in patients with lung cancer and reported six associated genes.

157
Arnoriaga-Rodríguez M, Mayneris-Perxachs J, Contreras-Rodríguez O, Burokas A, Ortega-Sanchez JA, Blasco G, Coll C, Biarnés C, Castells-Nobau A, Puig J, Garre-Olmo J, Ramos R, Pedraza S, Brugada R, Vilanova JC, Serena J, Barretina J, Gich J, Pérez-Brocal V, Moya A, Fernández-Real X, Ramio-Torrentà L, Pamplona R, Sol J, Jové M, Ricart W, Portero-Otin M, Maldonado R, Fernández-Real JM. Obesity-associated deficits in inhibitory control are phenocopied to mice through gut microbiota changes in one-carbon and aromatic amino acids metabolic pathways. Gut 2021; 70:2283-2296. [PMID: 33514598 PMCID: PMC8588299 DOI: 10.1136/gutjnl-2020-323371] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/14/2020] [Revised: 12/16/2020] [Accepted: 01/08/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Inhibitory control (IC) is critical to keep long-term goals in everyday life. Bidirectional relationships between IC deficits and obesity are behind unhealthy eating and physical exercise habits. METHODS: We studied gut microbiome composition and functionality, and plasma and faecal metabolomics in association with cognitive tests evaluating inhibitory control (Stroop test) and brain structure in a discovery cohort (n=156), both cross-sectionally and longitudinally, and in an independent replication cohort (n=970). Faecal microbiota transplantation (FMT) in mice evaluated the impact on reversal learning and medial prefrontal cortex (mPFC) transcriptomics. RESULTS: An interplay among IC, brain structure (in humans) and mPFC transcriptomics (in mice), plasma/faecal metabolomics and the gut metagenome was found. Obesity-dependent alterations in one-carbon metabolism, tryptophan and histidine pathways were associated with IC in the two independent cohorts. Bacterial functions linked to one-carbon metabolism (thyX, dut, exodeoxyribonuclease V) and anterior cingulate cortex volume were associated with IC, cross-sectionally and longitudinally. FMT from individuals with obesity led to alterations in reversal learning in mice. In an independent FMT experiment, human donors' bacterial functions related to IC deficits were associated with mPFC expression of one-carbon metabolism-related genes in recipient mice. CONCLUSION: These results highlight the importance of targeting obesity-related impulsive behaviour through the induction of gut microbiota shifts.
Affiliation(s)
- María Arnoriaga-Rodríguez
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - CIBER Pathophysiology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
- Jordi Mayneris-Perxachs
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - CIBER Pathophysiology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
- Oren Contreras-Rodríguez
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - Department of Psychiatry, Bellvitge University Hospital, Bellvitge Biomedical Research Institute (IDIBELL) and CIBERSAM, Barcelona, Spain
- Aurelijus Burokas
  - Laboratory of Neuropharmacology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Present address: Institute of Biochemistry, Life Sciences Center, Vilnius University, Saulėtekio av. 7, LT-10257 Vilnius, Lithuania
- Juan-Antonio Ortega-Sanchez
  - Laboratory of Neuropharmacology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Gerard Blasco
  - Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain
  - Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
- Claudia Coll
  - Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Dr. Josep Trueta University Hospital, Girona, Spain
- Carles Biarnés
  - Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
- Anna Castells-Nobau
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - CIBER Pathophysiology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
- Josep Puig
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain
  - Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
- Josep Garre-Olmo
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Research Group on Aging, Health and Disability, Girona Biomedical Research Institute, Health Assistance Institute, Girona, Spain
- Rafel Ramos
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Institut Universitari d'Investigació en Atenció Primària Jordi Gol (IDIAP Jordi Gol), Barcelona, Catalonia, Spain
- Salvador Pedraza
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - Department of Radiology, Dr. Josep Trueta University Hospital, Girona, Spain
- Ramon Brugada
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Cardiovascular Genetics Center, CIBER-CV, Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Girona, Spain
  - Biomedical Research Networking Center on Cardiovascular Diseases (CIBERCV), Madrid, Spain
  - Department of Cardiology, Dr. Josep Trueta University Hospital, Girona, Spain
- Joan C Vilanova
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - Department of Radiology, Dr. Josep Trueta University Hospital, Girona, Spain
- Joaquín Serena
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Department of Neurology, Dr. Josep Trueta University Hospital, Girona Biomedical Research Institute (IDIBGI), Girona, Spain
- Jordi Barretina
  - Girona Biomedical Research Institute (IdibGi), Dr. Josep Trueta University Hospital, Girona, Spain
- Jordi Gich
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Neurodegeneration and Neuroinflammation Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
- Vicente Pérez-Brocal
  - Joint Investigation Unit of FISABIO and I2Sysbio, University of València and CSIC, Valencia, Spain
  - Biomedical Research Networking Center for Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Andrés Moya
  - Joint Investigation Unit of FISABIO and I2Sysbio, University of València and CSIC, Valencia, Spain
  - Biomedical Research Networking Center for Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Xavier Fernández-Real
  - Institute of Mathematics, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Lluis Ramio-Torrentà
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
  - Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Dr. Josep Trueta University Hospital, Girona, Spain
  - Department of Neurology, Dr. Josep Trueta University Hospital, Girona Biomedical Research Institute (IDIBGI), Girona, Spain
  - Neurodegeneration and Neuroinflammation Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - Red Española de Esclerosis Múltiple (REEM), Madrid, Spain
- Reinald Pamplona
  - Metabolic Physiopathology Research Group, Experimental Medicine Department, Lleida University-Lleida Biochemical Research Institute (UdL-IRBLleida), Lleida, Spain
- Joaquim Sol
  - Metabolic Physiopathology Research Group, Experimental Medicine Department, Lleida University-Lleida Biochemical Research Institute (UdL-IRBLleida), Lleida, Spain
  - Institut Català de la Salut, Atenció Primària, Lleida, Spain
  - Research Support Unit Lleida, Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Lleida, Spain
- Mariona Jové
  - Metabolic Physiopathology Research Group, Experimental Medicine Department, Lleida University-Lleida Biochemical Research Institute (UdL-IRBLleida), Lleida, Spain
- Wifredo Ricart
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - CIBER Pathophysiology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain
- Manuel Portero-Otin
  - Metabolic Physiopathology Research Group, Experimental Medicine Department, Lleida University-Lleida Biochemical Research Institute (UdL-IRBLleida), Lleida, Spain
- Rafael Maldonado
  - Laboratory of Neuropharmacology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
- Jose Manuel Fernández-Real
  - Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain
  - Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
  - CIBER Pathophysiology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
  - Department of Medical Sciences, Faculty of Medicine, University of Girona, Girona, Spain

158
da Silva Júnior AC, da Silva MJ, Cruz CD, Sant’Anna IDC, Silva GN, Nascimento M, Azevedo CF. Prediction of the importance of auxiliary traits using computational intelligence and machine learning: A simulation study. PLoS One 2021; 16:e0257213. [PMID: 34843488 PMCID: PMC8629227 DOI: 10.1371/journal.pone.0257213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2021] [Accepted: 07/04/2021] [Indexed: 11/18/2022] Open
Abstract
The present study evaluated the importance of auxiliary traits of a principal trait based on phenotypic information and previously known genetic structure using computational intelligence and machine learning to develop predictive tools for plant breeding. Data of an F2 population represented by 500 individuals, obtained from a cross between contrasting homozygous parents, were simulated. Phenotypic traits were simulated based on previously established means and heritability estimates (30%, 50%, and 80%); traits were distributed in a genome with 10 linkage groups, considering two alleles per marker. Four different scenarios were considered. For the principal trait, heritability was 50%, and 40 control loci were distributed in five linkage groups. Another phenotypic control trait with the same complexity as the principal trait but without any genetic relationship with it and without pleiotropy or a factorial link between the control loci for both traits was simulated. These traits shared a large number of control loci with the principal trait, but could be distinguished by the differential action of the environment on them, as reflected in heritability estimates (30%, 50%, and 80%). The coefficient of determination was used to evaluate the proposed methodologies. Multiple regression, computational intelligence, and machine learning were used to predict the importance of the tested traits. Computational intelligence and machine learning were superior in extracting nonlinear information from model inputs and quantifying the relative contributions of phenotypic traits. The R² values ranged from 44.0% to 83.0% for computational intelligence and from 79.0% to 94.0% for machine learning. In conclusion, the relative contributions of auxiliary traits in different scenarios in plant breeding programs can be efficiently predicted using computational intelligence and machine learning.
Affiliation(s)
- Michele Jorge da Silva
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Cosme Damião Cruz
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Gabi Nunes Silva
  - Department of Mathematics and Statistics Scholar, R. Rio Amazonas, Ji-Paraná, RO, Brazil
- Moysés Nascimento
  - Department of Statistics, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil

159
Mayneris-Perxachs J, Moreno-Navarrete JM, Ballanti M, Monteleone G, Alessandro Paoluzi O, Mingrone G, Lefebvre P, Staels B, Federici M, Puig J, Garre J, Ramos R, Fernández-Real JM. Lipidomics and metabolomics signatures of SARS-CoV-2 mediators/receptors in peripheral leukocytes, jejunum and colon. Comput Struct Biotechnol J 2021; 19:6080-6089. [PMID: 34777716 PMCID: PMC8574068 DOI: 10.1016/j.csbj.2021.11.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/03/2021] [Revised: 11/05/2021] [Accepted: 11/06/2021] [Indexed: 12/14/2022] Open
Abstract
Cell surface receptor-mediated viral entry plays a critical role in SARS-CoV-2 infection. Well-established SARS-CoV-2 receptors such as ACE2 and TMPRSS2 are highly expressed in the gastrointestinal tract. In fact, there is evidence that SARS-CoV-2 infects epithelial cells from the digestive system. However, emerging research has identified novel mediators such as DPP9, TYK2, and CCR2, all playing a critical role in inflammation. We evaluated the expression of SARS-CoV-2 receptors in peripheral leukocytes (n = 469), jejunum (n = 30), and colon (n = 37) of three independent cohorts by real-time PCR, RNA-sequencing, and microarray transcriptomics. We also performed HPLC-MS/MS lipidomics and metabolomics analyses to identify signatures linked to SARS-CoV-2 receptors. We found markedly higher peripheral leukocyte ACE2 expression levels in women compared to men, whereas the intestinal expression of TMPRSS2 was positively associated with BMI. Consistent lipidomics signatures associated with the expression of these mediators were found in both tissues and peripheral leukocytes, involving n-3 long-chain PUFAs and arachidonic acid-derived eicosanoids, which play a key role in the regulation of inflammation and may interfere with viral entry and replication. Medium- and long-chain hydroxy acids, which have been shown to interfere with viral replication, were also linked to SARS-CoV-2 receptors. Gonadal steroids were also associated with the expression of some of these receptors, even after controlling for sex. The expression of SARS-CoV-2 receptors was associated with several metabolic and nutritional traits in different cell types. This information may be useful in the design of potential therapies targeted at coronavirus entry.
Collapse
Affiliation(s)
- Jordi Mayneris-Perxachs
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain.,Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain.,Centro de Investigación Biomédica en Red Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Madrid, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain
| | - José Maria Moreno-Navarrete
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain.,Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain.,Centro de Investigación Biomédica en Red Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Madrid, Spain.,Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain
| | - Marta Ballanti
- Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy.,Center for Atherosclerosis, Policlinico Tor Vergata, Rome, Italy
| | | | | | - Geltrude Mingrone
- Department of Internal Medicine, Catholic University, Rome, Italy.,Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy.,Diabetes and Nutritional Sciences, Hodgkin Building, Guy's Campus, King's College London, London, United Kingdom
| | - Philippe Lefebvre
- Univ. Lille, Inserm, CHU Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| | - Bart Staels
- Univ. Lille, Inserm, CHU Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| | - Massimo Federici
- Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy.,Center for Atherosclerosis, Policlinico Tor Vergata, Rome, Italy
| | - Josep Puig
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain.,Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain.,Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain.,Department of Radiology (IDI), Dr. Josep Trueta University Hospital, Girona, Spain
| | - Josep Garre
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain.,Research Group on Aging, Disability and Health, Girona Biomedical Research Institute (IdIBGi), Girona, Spain.,Serra-Húnter Professor. Department of Nursing, University of Girona, Girona Spain
| | - Rafael Ramos
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain.,Vascular Health Research Group of Girona (ISV-Girona). Jordi Gol Institute for Primary Care Research (Institut Universitari per a la RecercaenAtencióPrimària Jordi Gol I Gorina -IDIAPJGol), Catalonia, Spain
| | - José-Manuel Fernández-Real
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain.,Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain.,Centro de Investigación Biomédica en Red Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Madrid, Spain.,Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.,Girona Biomedical Research Institute (IDIBGI), Dr. Josep Trueta University Hospital, Catalonia, Spain
|
160
|
Hubbard RJ, Zadeh I, Jones AP, Robert B, Bryant NB, Clark VP, Pilly PK. Brain connectivity alterations during sleep by closed-loop transcranial neurostimulation predict metamemory sensitivity. Netw Neurosci 2021; 5:734-756. [PMID: 34746625 PMCID: PMC8567828 DOI: 10.1162/netn_a_00201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/06/2021] [Accepted: 05/15/2021] [Indexed: 12/23/2022] Open
Abstract
Metamemory involves the ability to correctly judge the accuracy of our memories. The retrieval of memories can be improved using transcranial electrical stimulation (tES) during sleep, but evidence for improvements to metamemory sensitivity is limited. Applying tES can enhance sleep-dependent memory consolidation, which along with metamemory requires the coordination of activity across distributed neural systems, suggesting that examining functional connectivity is important for understanding these processes. Nevertheless, little research has examined how functional connectivity modulations relate to overnight changes in metamemory sensitivity. Here, we developed a closed-loop short-duration tES method, time-locked to up-states of ongoing slow-wave oscillations, to cue specific memory replays in humans. We measured electroencephalographic (EEG) coherence changes following stimulation pulses, and characterized network alterations with graph theoretic metrics. Using machine learning techniques, we show that pulsed tES elicited network changes in multiple frequency bands, including increased connectivity in the theta band and increased efficiency in the spindle band. Additionally, stimulation-induced changes in beta-band path length were predictive of overnight changes in metamemory sensitivity. These findings add new insights into the growing literature investigating increases in memory performance through brain stimulation during sleep, and highlight the importance of examining functional connectivity to explain its effects.
Numerous studies have demonstrated a clear link between sleep and memory: memories are consolidated during sleep, leading to more stable and long-lasting representations. We have previously shown that tagging episodes with specific patterns of brain stimulation during encoding and replaying those patterns during sleep can enhance this consolidation process to improve confidence and decision-making of memories (metamemory). Here, we extend this work to examine network-level brain changes that occur following stimulation during sleep that predict metamemory improvements. Using graph theoretic and machine-learning methods, we found that stimulation-induced changes in beta-band path length predicted overnight improvements in metamemory. This novel finding sheds new light on the neural mechanisms of memory consolidation and suggests potential applications for improving metamemory.
Affiliation(s)
- Ryan J Hubbard
- Center for Human-Machine Collaboration, Information and Systems Sciences Laboratory, HRL Laboratories, LLC, Malibu, CA, USA
| | - Iman Zadeh
- Center for Human-Machine Collaboration, Information and Systems Sciences Laboratory, HRL Laboratories, LLC, Malibu, CA, USA
| | - Aaron P Jones
- Psychology Clinical Neuroscience Center, Department of Psychology, The University of New Mexico, Albuquerque, NM, USA
| | - Bradley Robert
- Psychology Clinical Neuroscience Center, Department of Psychology, The University of New Mexico, Albuquerque, NM, USA
| | - Natalie B Bryant
- Psychology Clinical Neuroscience Center, Department of Psychology, The University of New Mexico, Albuquerque, NM, USA
| | - Vincent P Clark
- Psychology Clinical Neuroscience Center, Department of Psychology, The University of New Mexico, Albuquerque, NM, USA
| | - Praveen K Pilly
- Center for Human-Machine Collaboration, Information and Systems Sciences Laboratory, HRL Laboratories, LLC, Malibu, CA, USA
|
161
|
Chiu CY, Lin G, Wang CJ, Hung SI, Chung WH. Metabolomics reveals microbial-derived metabolites associated with immunoglobulin E responses in filaggrin-related atopic dermatitis. Pediatr Allergy Immunol 2021; 32:1709-1717. [PMID: 34087019 DOI: 10.1111/pai.13570] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/06/2021] [Revised: 05/23/2021] [Accepted: 05/31/2021] [Indexed: 11/27/2022]
Abstract
BACKGROUND Filaggrin (FLG) gene mutation and immunoglobulin E (IgE)-mediated sensitization are the most important predictors of atopic dermatitis (AD). However, a metabolomics-based approach to address the metabolic impact of FLG mutations on allergic IgE responses for AD is still lacking. We therefore aimed to determine the relationships of metabolic profiles in AD with FLG mutations and allergic responses. METHODS Eighty-one children, comprising those with adolescent AD (n = 58) and healthy controls (n = 23), were prospectively enrolled. Mutations in the filaggrin gene were identified using whole-exome sequencing, and plasma metabolic profiles were determined using 1H-nuclear magnetic resonance (NMR) spectroscopy. Integrative analyses of their associations related to total serum IgE levels were performed, and further metabolic functional pathways for AD were also assessed. RESULTS Metabolites contributing to the separation between AD and controls were identified using supervised partial least squares discriminant analysis (Q2/R2 = 0.90, P permutation < 0.001). Nitrogen and amino acid metabolisms for energy production, and microbe-related methane and propanoate metabolisms were significantly associated with AD compared with healthy controls (FDR-adjusted p < .05). Five of fifteen metabolites related to FLG mutations were positively correlated with total serum IgE levels. Among them, dimethylamine and isopropanol were strongly associated with methane metabolism and propanoate metabolism, respectively, in AD with FLG mutations (FDR-adjusted p < .01). CONCLUSION A strong correlation of the microbial-derived metabolites dimethylamine and isopropanol with FLG mutations and IgE allergic reactions suggests an influence of host genetics on the microbiome that regulates susceptibility to allergic responses in the pathogenesis of AD.
Affiliation(s)
- Chih-Yung Chiu
- Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, Chang Gung University, Taoyuan, Taiwan.,Clinical Metabolomics Core Laboratory, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan
| | - Gigin Lin
- Clinical Metabolomics Core Laboratory, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan.,Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital at Linkou, Chang Gung University, Taoyuan, Taiwan
| | - Chia-Jung Wang
- Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, Chang Gung University, Taoyuan, Taiwan
| | - Shuen-Iu Hung
- Cancer Vaccine and Immune Cell Therapy Core Laboratory, Chang Gung Memorial Hospital, Chang Gung University, Taoyuan, Taiwan
| | - Wen-Hung Chung
- Department of Dermatology, Drug Hypersensitivity Clinical and Research Center, Chang Gung Memorial Hospital, Taipei, Taiwan
|
162
|
Abstract
Pediatric Index of Mortality 3 is a validated tool including 11 variables for the assessment of mortality risk in PICU patients. With the recent advances in explainable machine learning algorithms, we aimed to assess the feasibility of applying these machine learning models to simplify the Pediatric Index of Mortality 3 scoring system in order to decrease the time and labor required for data collection and entry for Pediatric Index of Mortality 3. DESIGN Single-center, retrospective cohort study. Data from the Virtual Pediatric Systems for patients admitted to Cleveland Clinic Children's PICU between January 2008 and December 2019 were obtained. Light Gradient Boosting Machine Regressor (a gradient boosting decision tree algorithm) was used for building the machine learning models. Variable importance was analyzed by SHapley Additive exPlanations. All of the 11 Pediatric Index of Mortality 3 variables were used as input variables in the machine learning models to predict Pediatric Index of Mortality 3 risk of mortality as the outcome variable. Mean absolute error, root mean squared error, and R-squared were calculated for each of the 11 machine learning models as model performance parameters. SETTING Quaternary children's hospital. PATIENTS PICU patients. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS Five thousand sixty-eight patients were analyzed. The machine learning models were able to maintain similar predictive error until the number of input variables decreased to four. The machine learning model with five input variables (mechanical ventilation in the first hour of PICU admission, very-high-risk diagnosis, surgical recovery from a noncardiac procedure, low-risk diagnosis, and base excess) produced the lowest mean root mean squared error of 1.49 (95% CI, 1.05-1.93) and the highest R-squared of 0.73 (95% CI, 0.6-0.86) with a mean absolute error of 0.43 (95% CI, 0.35-0.5) among all the 11 machine learning models.
CONCLUSIONS Explainable machine learning methods were feasible in simplifying the Pediatric Index of Mortality 3 scoring system with similar risk of mortality predictions compared to the original Pediatric Index of Mortality 3 model tested in a single-center dataset.
|
163
|
Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale. Minerals 2021. [DOI: 10.3390/min11111172] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/12/2023]
Abstract
Machine learning spatial modeling is used for mapping the distribution of deep-sea polymetallic nodules (PMN). However, the presence and influence of spatial autocorrelation (SAC) have not been extensively studied. SAC can provide information regarding the variable selection before modeling, and it results in erroneous validation performance when ignored. ML models are also problematic when applied in areas far away from the initial training locations, especially if the (new) area to be predicted covers another feature space. Here, we study the spatial distribution of PMN in a geomorphologically heterogeneous area of the Peru Basin, where SAC of PMN exists. The local Moran’s I analysis showed that there are areas with a significantly higher or lower number of PMN, associated with different backscatter values, aspect orientation, and seafloor geomorphological characteristics. A quantile regression forests (QRF) model is applied with three cross-validation (CV) techniques (random-, spatial-, and cluster-blocking). We used the recently proposed “Area of Applicability” method to quantify the geographical areas where feature space extrapolation occurs. The results show that QRF predicts well in morphologically similar areas, with spatial block cross-validation being the least biased method. Conversely, random-CV overestimates the prediction performance. Under new conditions, the model transferability is reduced even on local scales, highlighting the need for spatial model-based dissimilarity analysis and transferability assessment in new areas.
|
164
|
Della Pepa GM, Caccavella VM, Menna G, Ius T, Auricchio AM, Sabatino G, La Rocca G, Chiesa S, Gaudino S, Marchese E, Olivi A. Machine Learning-Based Prediction of Early Recurrence in Glioblastoma Patients: A Glance Towards Precision Medicine. Neurosurgery 2021; 89:873-883. [PMID: 34459917 DOI: 10.1093/neuros/nyab320] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/05/2021] [Accepted: 06/09/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Ability to thrive and time-to-recurrence following treatment are important parameters to assess in patients with glioblastoma multiforme (GBM), given its dismal prognosis. Though there is an ongoing debate whether it can be considered an appropriate surrogate endpoint for overall survival in clinical trials, progression-free survival (PFS) is routinely used for clinical decision-making. OBJECTIVE To investigate whether machine learning (ML)-based models can reliably stratify newly diagnosed GBM patients into prognostic subclasses on PFS basis, identifying those at higher risk for an early recurrence (≤6 mo). METHODS Data were extracted from a multicentric database, according to the following eligibility criteria: histopathologically verified GBM and follow-up >12 mo. A total of 474 patients met our inclusion criteria and were included in the analysis. Relevant demographic, clinical, molecular, and radiological variables were selected by a feature selection algorithm (Boruta) and used to build an ML-based model. RESULTS The random forest prediction model, evaluated on an 80:20 split ratio, achieved an AUC of 0.81 (95% CI: 0.77; 0.83), demonstrating high discriminative ability. Optimizing the predictive value derived from the linear and nonlinear combinations of the selected input features, our model outperformed multivariable logistic regression across all performance metrics. CONCLUSION A robust ML-based prediction model that identifies patients at high risk for early recurrence was successfully trained and internally validated. Considerable effort remains to integrate these predictions in a patient-centered care context.
Affiliation(s)
- Giuseppe Maria Della Pepa
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Valerio Maria Caccavella
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Grazia Menna
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Tamara Ius
- Neurosurgery Unit, Department of Neuroscience, Santa Maria della Misericordia, University Hospital, Udine, Italy
| | - Anna Maria Auricchio
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Giovanni Sabatino
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy.,Department of Neurosurgery, Mater Olbia Hospital, Olbia, Italy
| | - Giuseppe La Rocca
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy.,Department of Neurosurgery, Mater Olbia Hospital, Olbia, Italy
| | - Silvia Chiesa
- Radiotherapy Department, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Simona Gaudino
- Radiology and Neuroradiology Department, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Enrico Marchese
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
| | - Alessandro Olivi
- Institute of Neurosurgery, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Catholic University, Rome, Italy
|
165
|
Wang S, Sun Y, Mao N, Duan S, Li Q, Li R, Jiang T, Wang Z, Xie H, Gu Y. Incorporating the clinical and radiomics features of contrast-enhanced mammography to classify breast lesions: a retrospective study. Quant Imaging Med Surg 2021; 11:4418-4430. [PMID: 34603996 DOI: 10.21037/qims-21-103] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/26/2021] [Accepted: 05/11/2021] [Indexed: 12/21/2022]
Abstract
Background Contrast-enhanced mammography (CEM) is a promising breast imaging technique. A limited number of studies have focused on the radiomics analysis of CEM. We intended to explore whether a model constructed with both clinical and radiomics features of CEM can better classify benign and malignant breast lesions. Methods This retrospective, double-center study included women who underwent CEM between August 2017 and February 2020. The data from Center 1 were used as the training set and the data from Center 2 were used as the external testing set (training:testing = 2:1). Models were constructed with the clinical, radiomics, and clinical + radiomics features of CEM. The clinical features included patient age and clinical image features interpreted by the radiologists. The radiomics features were extracted from high-energy (HE), low-energy (LE), and dual-energy subtraction (DES) images of CEM. The Mann-Whitney U test, Pearson correlation and Boruta's approach were used to select the radiomics features. Random Forest (RF) and logistic regression were used to establish the models. For the testing set, the areas under the curve (AUCs) and 95% confidence intervals (CIs) were employed to evaluate the performance of the models. For the training set, the mean AUCs were obtained by performing internal validation for 100 iterations and then compared by the Kruskal-Wallis and Mann-Whitney U tests. Results A total of 226 women (mean age: 47.4±10.1 years) with 226 pathologically proven breast lesions (101 benign; 125 malignant) were included. For the external testing set, the AUCs were 0.964 (95% CI: 0.918-1.000) for the combined model, 0.947 (95% CI: 0.891-0.997) for the radiomics model, and 0.882 (95% CI: 0.803-0.962) for the clinical model. In the internal validation process, the combined model achieved a mean AUC of 0.934±0.030, which was significantly higher than those of the radiomics (mean AUC = 0.921±0.031; adjusted P<0.050) and clinical models (mean AUC = 0.907±0.036; adjusted P<0.050). Conclusions Incorporating both clinical and radiomics features of CEM may achieve better classification results for breast lesions.
Affiliation(s)
- Simin Wang
- Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Yuqi Sun
- Department of Biostatistics, School of Public Health, Fudan University, Shanghai, China
| | - Ning Mao
- Department of Radiology, Yantai Yuhuangding Hospital, Qingdao University, Qingdao, China
| | | | - Qin Li
- Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Ruimin Li
- Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Tingting Jiang
- Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Zhongyi Wang
- Department of Radiology, Yantai Yuhuangding Hospital, Qingdao University, Qingdao, China
| | - Haizhu Xie
- Department of Radiology, Yantai Yuhuangding Hospital, Qingdao University, Qingdao, China
| | - Yajia Gu
- Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
|
166
|
Alkhamis MA, Fountain‐Jones NM, Aguilar‐Vega C, Sánchez‐Vizcaíno JM. Environment, vector, or host? Using machine learning to untangle the mechanisms driving arbovirus outbreaks. Ecological Applications 2021; 31:e02407. [PMID: 34245639 PMCID: PMC9286057 DOI: 10.1002/eap.2407] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/09/2020] [Revised: 01/28/2021] [Accepted: 03/03/2021] [Indexed: 06/13/2023]
Abstract
Climatic, landscape, and host features are critical components in shaping outbreaks of vector-borne diseases. However, the relationship between the outbreaks of vector-borne pathogens and their environmental drivers is typically complicated, nonlinear, and may vary by taxonomic units below the species level (e.g., strain or serotype). Here, we aim to untangle how these complex forces shape the risk of outbreaks of Bluetongue virus (BTV), a vector-borne pathogen that is continuously emerging and re-emerging across Europe, with severe economic implications. We tested if the ecological predictors of BTV outbreak risk were serotype-specific by examining the most prevalent serotypes recorded in Europe (1, 4, and 8). We used a robust machine learning (ML) pipeline and 23 relevant environmental features to fit predictive models to 24,245 outbreaks reported in 25 European countries between 2000 and 2019. Our ML models demonstrated high predictive performance for all BTV serotypes (accuracies > 0.87) and revealed strong nonlinear relationships between BTV outbreak risk and environmental and host features. Serotype-specific analysis suggests, however, that each of the major serotypes (1, 4, and 8) had a unique outbreak risk profile. For example, temperature and midge abundance were the most important characteristics shaping serotype 1, whereas for serotype 4 goat density and temperature were more important. We were also able to identify strong interactive effects between environmental and host characteristics that were also serotype specific. Our ML pipeline was able to reveal more in-depth insights into the complex epidemiology of BTVs and can guide policymakers in intervention strategies to help reduce the economic implications and social cost of this important pathogen.
Affiliation(s)
- Moh A. Alkhamis
- Department of Epidemiology and Biostatistics, Faculty of Public Health, Health Sciences Centre, Kuwait University, Kuwait City 13110, Kuwait
| | - Nicholas M. Fountain‐Jones
- School of Natural Sciences, University of Tasmania, Hobart, Tasmania 7001, Australia
- Department of Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota, St. Paul, Minnesota 55108, USA
| | - Cecilia Aguilar‐Vega
- VISAVET Health Surveillance Centre and Animal Health Department, Veterinary School, Complutense University of Madrid, Madrid 28040, Spain
| | - José M. Sánchez‐Vizcaíno
- VISAVET Health Surveillance Centre and Animal Health Department, Veterinary School, Complutense University of Madrid, Madrid 28040, Spain
|
167
|
Danofloxacin Treatment Alters the Diversity and Resistome Profile of Gut Microbiota in Calves. Microorganisms 2021; 9:microorganisms9102023. [PMID: 34683343 PMCID: PMC8538188 DOI: 10.3390/microorganisms9102023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/08/2021] [Revised: 09/18/2021] [Accepted: 09/23/2021] [Indexed: 12/25/2022] Open
Abstract
Fluoroquinolones, such as danofloxacin, are used to control bovine respiratory disease complex in beef cattle; however, little is known about their effects on gut microbiota and resistome. The objectives were to evaluate the effect of subcutaneously administered danofloxacin on gut microbiota and resistome, and the composition of Campylobacter in calves. Twenty calves were injected with a single dose of danofloxacin, and ten calves were kept as a control. The effects of danofloxacin on microbiota and the resistome were assessed using 16S rRNA sequencing, quantitative real-time PCR, and metagenomic Hi-C ProxiMeta. Alpha and beta diversities were significantly different (p < 0.05) between pre- and post-treatment samples, and the compositions of several bacterial taxa shifted. The patterns of association between the compositions of Campylobacter and other genera were affected by danofloxacin. Antimicrobial resistance genes (ARGs) conferring resistance to five antibiotics were identified with their respective reservoirs. Following the treatment, some ARGs (e.g., ant9, tet40, tetW) increased in frequencies and host ranges, suggesting initiation of horizontal gene transfer, and new ARGs (aac6, ermF, tetL, tetX) were detected in the post-treatment samples. In conclusion, danofloxacin induced alterations of gut microbiota and selection and enrichment of resistance genes even against antibiotics that are unrelated to danofloxacin.
|
168
|
Wang S, Xie X, Li C, Jia J, Chen C. Integrative network analysis of N6 methylation-related genes reveal potential therapeutic targets for spinal cord injury. Mathematical Biosciences and Engineering 2021; 18:8174-8187. [PMID: 34814294 DOI: 10.3934/mbe.2021405] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
The diagnosis of the severity of spinal cord injury (SCI) and the identification of potential therapeutic targets are crucial for urgent clinical care and improved patient outcomes. Here, we analyzed the overall gene expression data in peripheral blood leukocytes during the acute injury phase collected from Gene Expression Omnibus (GEO) and identified six m6A regulators specifically expressed in SCI compared to normal samples. LncRNA-mRNA network analysis identified AKT2/3 and PIK3R1, which are related to m6A methylation, as potential therapeutic targets for SCI, and a classifier was constructed to identify patients with SCI to assist clinical diagnosis. Moreover, FTO (eraser) and RBMX (reader) were found to be significantly down-regulated in SCI, and the functional genes co-expressed with them were found to be involved in the signal transduction of multiple pathways related to nerve injury. Through the construction of the drug-target gene network, eight key genes were identified as drug targets, and fostamatinib was highlighted as a potential drug for the treatment of SCI. Taken together, our study characterized the pathogenesis and identified a potential therapeutic target of SCI, providing theoretical support for the development of precision medicine.
Affiliation(s)
- Shanzheng Wang
- Department of Orthopaedics, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing 210009, China
| | - Xinhui Xie
- Department of Orthopaedics, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing 210009, China
| | - Chao Li
- Department of Orthopaedics, Zhongda Hospital, Medical School of Southeast University, 87 Dingjiaqiao Road, Nanjing 210009, China
| | - Jun Jia
- Department of Orthopaedics, The 904th Hospital of Joint Logistic Support Force, PLA, 101 Xingyuan North Road, Wuxi 214000, China
| | - Changhong Chen
- Department of Orthopaedics, Jiangyin Hospital Affiliated to Nanjing University of Chinese Medicine, 130 Renmin Middle Road, Jiangyin 214400, China
|
169
|
Fan C, Huang S, Xiang C, An T, Song Y. Identification of key genes and immune infiltration modulated by CPAP in obstructive sleep apnea by integrated bioinformatics analysis. PLoS One 2021; 16:e0255708. [PMID: 34529670 PMCID: PMC8445487 DOI: 10.1371/journal.pone.0255708] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2021] [Accepted: 08/31/2021] [Indexed: 12/21/2022] Open
Abstract
Patients with obstructive sleep apnea (OSA) experience partial or complete upper airway collapses during sleep resulting in nocturnal hypoxia-normoxia cycling, and continuous positive airway pressure (CPAP) is the gold-standard treatment for OSA. Nevertheless, the exact mechanisms of action, especially the transcriptome effect of CPAP on OSA patients, remain elusive. The goal of this study was to evaluate the longitudinal alterations in peripheral blood mononuclear cell transcriptome profiles of OSA patients in order to identify the hub genes and immune response. GSE133601 was downloaded from Gene Expression Omnibus (GEO). We identified the black module via weighted gene co-expression network analysis (WGCNA); the genes in this module were significantly correlated with the clinical trait of CPAP treatment. Finally, eleven hub genes (TRAV10, SNORA36A, RPL10, OBP2B, IGLV1-40, H2BC8, ESAM, DNASE1L3, CD22, ANK3, ACP3) were traced and used to construct a random forest model to predict the therapeutic efficacy of CPAP in OSA, which performed well with an AUC of 0.92. We further studied immune cell infiltration in OSA patients with CIBERSORT, and monocytes were found to be related to the remission of OSA and partially correlated with the hub genes identified. In conclusion, these key genes and immune infiltration may be of great importance in the remission of OSA, and related research on these genes may provide a new therapeutic target for OSA in the future.
Affiliation(s)
- Cheng Fan
- Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Shiyuan Huang
- Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Chunhua Xiang
- Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Tianhui An
- Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Yi Song
- Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
|
170
|
Radiomics and Machine Learning Can Differentiate Transient Osteoporosis from Avascular Necrosis of the Hip. Diagnostics (Basel) 2021; 11:diagnostics11091686. [PMID: 34574027 PMCID: PMC8468167 DOI: 10.3390/diagnostics11091686] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/15/2021] [Revised: 09/12/2021] [Accepted: 09/14/2021] [Indexed: 02/07/2023] Open
Abstract
Differentiation between transient osteoporosis (TOH) and avascular necrosis (AVN) of the hip is a longstanding challenge in musculoskeletal radiology. The purpose of this study was to utilize MRI-based radiomics and machine learning (ML) for accurate differentiation between the two entities. A total of 109 hips with TOH and 104 hips with AVN were retrospectively included. Femoral heads and necks were segmented and radiomics features were extracted. Three ML classifiers (XGBoost, CatBoost and SVM) using 38 relevant radiomics features were trained on 70% and validated on 30% of the dataset. ML performance was compared to two musculoskeletal (MSK) radiologists, a general radiologist and two radiology residents. XGBoost achieved the best performance among ML models, with an area under the curve (AUC) of 93.7% (95% CI from 87.7% to 99.8%). MSK radiologists achieved an AUC of 90.6% (95% CI from 86.7% to 94.5%) and 88.3% (95% CI from 84% to 92.7%), respectively, similar to residents. The general radiologist achieved an AUC of 84.5% (95% CI from 80% to 89%), significantly lower than that of XGBoost (p = 0.017). In conclusion, radiomics-based ML achieved a performance similar to MSK radiologists and significantly higher compared to general radiologists in differentiating between TOH and AVN.
|
171
|
Wu L, Wen Y, Leng D, Zhang Q, Dai C, Wang Z, Liu Z, Yan B, Zhang Y, Wang J, He S, Bo X. Machine learning methods, databases and tools for drug combination prediction. Brief Bioinform 2021; 23:6363058. [PMID: 34477201 PMCID: PMC8769702 DOI: 10.1093/bib/bbab355] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 06/22/2021] [Revised: 08/09/2021] [Accepted: 08/10/2021] [Indexed: 02/07/2023]
Abstract
Combination therapy has shown clear efficacy against complex diseases and can greatly reduce the development of drug resistance. However, even with high-throughput screens, experimental methods are insufficient to explore novel drug combinations. To reduce the search space of drug combinations, there is an urgent need for more efficient computational methods to predict novel drug combinations. In recent decades, a growing number of machine learning (ML) algorithms have been applied to improve predictive performance. The objective of this study is to introduce and discuss recent applications of ML methods and the widely used databases in drug combination prediction. We first describe the concept of, and controversy over, synergism between drug combinations. Then, we survey various publicly available data resources and tools for prediction tasks. Next, ML methods applied in drug combination prediction, including classic ML and deep learning methods, are introduced. Finally, we summarize the challenges facing ML methods in prediction tasks and discuss future work.
Affiliation(s)
- Lianlian Wu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
| | - Yuqi Wen
- Beijing Institute of Radiation Medicine, Beijing, China
| | - Dongjin Leng
- Beijing Institute of Radiation Medicine, Beijing, China
| | | | - Chong Dai
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
| | - Zhongming Wang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
| | - Ziqi Liu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, AMMS, Beijing, China
| | - Bowei Yan
- Beijing Institute of Radiation Medicine, Beijing, China
| | - Yixin Zhang
- Beijing Institute of Radiation Medicine, Beijing, China
| | - Jing Wang
- School of Medicine, Tsinghua University, Beijing, China
| | - Song He
- Beijing Institute of Radiation Medicine, Beijing, China
| | - Xiaochen Bo
- Beijing Institute of Radiation Medicine, Beijing, China
|
172
|
Kandimalla R, Xu J, Link A, Matsuyama T, Yamamura K, Parker MI, Uetake H, Balaguer F, Borazanci E, Tsai S, Evans D, Meltzer SJ, Baba H, Brand R, Von Hoff D, Li W, Goel A. EpiPanGI Dx: A Cell-free DNA Methylation Fingerprint for the Early Detection of Gastrointestinal Cancers. Clin Cancer Res 2021; 27:6135-6144. [PMID: 34465601 DOI: 10.1158/1078-0432.ccr-21-1982] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/29/2021] [Revised: 07/24/2021] [Accepted: 08/26/2021] [Indexed: 11/16/2022]
Abstract
PURPOSE: DNA methylation alterations have emerged as front-runners in cell-free DNA (cfDNA) biomarker development. However, much effort to date has focused on single cancers. In this context, gastrointestinal (GI) cancers constitute the second leading cause of cancer-related deaths worldwide; yet there is no blood-based assay for the early detection and population screening of GI cancers. EXPERIMENTAL DESIGN: Herein, we performed a genome-wide DNA methylation analysis of multiple GI cancers to develop a pan-GI diagnostic assay. By analyzing DNA methylation data from 1,781 tumor and adjacent normal tissues, we first identified differentially methylated regions (DMR) between individual GI cancers and adjacent normal, as well as across GI cancers. We next prioritized a list of 67,832 tissue DMRs by incorporating all significant DMRs across various GI cancers to design a custom, targeted bisulfite sequencing platform. We subsequently validated these tissue-specific DMRs in 300 cfDNA specimens and applied machine learning algorithms to develop three distinct categories of DMR panels. RESULTS: We identified three distinct DMR panels: (i) cancer-specific biomarker panels with AUC values of 0.98 (colorectal cancer), 0.98 (hepatocellular carcinoma), 0.94 (esophageal squamous cell carcinoma), 0.90 (gastric cancer), 0.90 (esophageal adenocarcinoma), and 0.85 (pancreatic ductal adenocarcinoma); (ii) a pan-GI panel that detected all GI cancers with an AUC of 0.88; and (iii) a multi-cancer (tissue of origin) prediction panel, EpiPanGI Dx, with a prediction accuracy of 0.85-0.95 for most GI cancers. CONCLUSIONS: Using a novel biomarker discovery approach, we provide the first evidence for a cfDNA methylation assay that offers robust diagnostic accuracy for GI cancers.
Affiliation(s)
- Raju Kandimalla
- Center for Gastrointestinal Research, Center for Translational Genomics and Oncology, Baylor Scott & White Research Institute, Charles A Sammons Cancer Center, Baylor University Medical Center, Dallas, Texas
| | - Jianfeng Xu
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas; Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, California
| | - Alexander Link
- Department of Gastroenterology, Hepatology and Infectious Diseases, Otto-von-Guericke University Hospital, Magdeburg, Germany
| | - Takatoshi Matsuyama
- Department of Gastrointestinal Surgery, Tokyo Medical and Dental University Graduate School of Medicine, Tokyo, Japan
| | - Kensuke Yamamura
- Department of Gastroenterological Surgery, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
| | - M Iqbal Parker
- Division of Medical Biochemistry and Structural Biology, Institute for Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
| | - Hiroyuki Uetake
- Department of Specialized Surgery, Tokyo Medical and Dental University Graduate School of Medicine, Tokyo, Japan
| | - Francesc Balaguer
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Hospital Clínic, University of Barcelona, Barcelona, Spain
| | | | - Susan Tsai
- Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Douglas Evans
- Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Stephen J Meltzer
- Department of Medicine, Division of Gastroenterology, The Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Hideo Baba
- Department of Gastroenterological Surgery, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
| | - Randall Brand
- Department of Medicine, Division of Gastroenterology, Hepatology, and Nutrition, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Daniel Von Hoff
- HonorHealth Research Institute, Scottsdale, Arizona; Translational Genomics Research Institute, an Affiliate of City of Hope, Phoenix, Arizona
| | - Wei Li
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas; Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, California
| | - Ajay Goel
- Center for Gastrointestinal Research, Center for Translational Genomics and Oncology, Baylor Scott & White Research Institute, Charles A Sammons Cancer Center, Baylor University Medical Center, Dallas, Texas; Department of Molecular Diagnostics and Experimental Therapeutics, Beckman Research Institute of City of Hope, Monrovia, California; City of Hope Comprehensive Cancer Center, Duarte, California
|
173
|
Benfatto S, Serçin Ö, Dejure FR, Abdollahi A, Zenke FT, Mardin BR. Uncovering cancer vulnerabilities by machine learning prediction of synthetic lethality. Mol Cancer 2021; 20:111. [PMID: 34454516 PMCID: PMC8401190 DOI: 10.1186/s12943-021-01405-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/01/2021] [Accepted: 08/10/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Synthetic lethality describes a genetic interaction between two perturbations, leading to cell death, whereas neither event alone has a significant effect on cell viability. This concept can be exploited to specifically target tumor cells. CRISPR viability screens have been widely employed to identify cancer vulnerabilities. However, an approach to systematically infer genetic interactions from viability screens is missing. METHODS: Here we describe PAn-canceR Inferred Synthetic lethalities (PARIS), a machine learning approach to identify cancer vulnerabilities. PARIS predicts synthetic lethal (SL) interactions by combining CRISPR viability screens with genomics and transcriptomics data across hundreds of cancer cell lines profiled within the Cancer Dependency Map. RESULTS: Using PARIS, we predicted 15 high-confidence SL interactions within 549 DNA damage repair (DDR) genes. We show experimental validation of an SL interaction between the tumor suppressor CDKN2A, thymidine phosphorylase (TYMP) and the thymidylate synthase (TYMS), which may allow stratifying patients for treatment with TYMS inhibitors. Using genome-wide mapping of SL interactions for DDR genes, we unraveled a dependency between the aldehyde dehydrogenase ALDH2 and the BRCA-interacting protein BRIP1. Our results suggest BRIP1 as a potential therapeutic target in ~30% of all tumors, which express low levels of ALDH2. CONCLUSIONS: PARIS is an unbiased, scalable and easy-to-adapt platform to identify SL interactions that should aid in improving cancer therapy with increased availability of cancer genomics data.
Affiliation(s)
- Salvatore Benfatto
- BioMed X Institute (GmbH), Im Neuenheimer Feld 583, 69120, Heidelberg, Germany
| | - Özdemirhan Serçin
- BioMed X Institute (GmbH), Im Neuenheimer Feld 583, 69120, Heidelberg, Germany
| | - Francesca R Dejure
- BioMed X Institute (GmbH), Im Neuenheimer Feld 583, 69120, Heidelberg, Germany
| | - Amir Abdollahi
- Division of Molecular and Translational Radiation Oncology, National Centre for Tumour Diseases (NCT), Heidelberg University Hospital, 69120, Heidelberg, Germany
| | - Frank T Zenke
- Translational Innovation Platform Oncology & Immuno-Oncology, Merck KGaA, Frankfurter Str. 250, 64293, Darmstadt, Germany
| | - Balca R Mardin
- BioMed X Institute (GmbH), Im Neuenheimer Feld 583, 69120, Heidelberg, Germany.
|
174
|
Creason A, Haan D, Dang K, Chiotti KE, Inkman M, Lamb A, Yu T, Hu Y, Norman TC, Buchanan A, van Baren MJ, Spangler R, Rollins MR, Spellman PT, Rozanov D, Zhang J, Maher CA, Caloian C, Watson JD, Uhrig S, Haas BJ, Jain M, Akeson M, Ahsen ME, Stolovitzky G, Guinney J, Boutros PC, Stuart JM, Ellrott K. A community challenge to evaluate RNA-seq, fusion detection, and isoform quantification methods for cancer discovery. Cell Syst 2021; 12:827-838.e5. [PMID: 34146471 PMCID: PMC8376800 DOI: 10.1016/j.cels.2021.05.021] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/01/2019] [Revised: 09/15/2020] [Accepted: 05/25/2021] [Indexed: 02/03/2023]
Abstract
The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Allison Creason
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - David Haan
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | | | - Kami E Chiotti
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Matthew Inkman
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | | | | | - Yin Hu
- Sage Bionetworks, Seattle, WA, USA
| | | | - Alex Buchanan
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Marijke J van Baren
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ryan Spangler
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - M Rick Rollins
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Paul T Spellman
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Dmitri Rozanov
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Jin Zhang
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | - Christopher A Maher
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | - Cristian Caloian
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada
| | - John D Watson
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada
| | - Sebastian Uhrig
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ) and Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Brian J Haas
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Miten Jain
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mark Akeson
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mehmet Eren Ahsen
- Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, New York, NY 1498, USA
| | - Gustavo Stolovitzky
- Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, New York, NY 1498, USA; IBM T.J. Watson Research Center, 1101 Kitchawan Road, Route 134, Yorktown Heights, NY 10598, USA
| | | | - Paul C Boutros
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada; Departments of Medical Biophysics and Pharmacology & Toxicology, University of Toronto, Toronto, Canada; Departments of Human Genetics and Urology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Joshua M Stuart
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Kyle Ellrott
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA.
|
175
|
Ceulemans E, Ibrahim HMM, De Coninck B, Goossens A. Pathogen Effectors: Exploiting the Promiscuity of Plant Signaling Hubs. Trends in Plant Science 2021; 26:780-795. [PMID: 33674173 DOI: 10.1016/j.tplants.2021.01.005] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/06/2020] [Revised: 01/21/2021] [Accepted: 01/29/2021] [Indexed: 05/27/2023]
Abstract
Pathogens produce effectors to overcome plant immunity, thereby threatening crop yields and global food security. Large-scale interactomic studies have revealed that pathogens from different kingdoms of life target common plant proteins during infection, the so-called effector hubs. These hubs often play central roles in numerous plant processes through their ability to interact with multiple plant proteins. This ability arises partly from the presence of intrinsically disordered domains (IDDs) in their structure. Here, we highlight the role of the TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR (TCP) and JASMONATE-ZIM DOMAIN (JAZ) transcription regulator families as plant signaling and effector hubs. We consider different evolutionary hypotheses to rationalize the existence of diverse effectors sharing common targets and the possible role of IDDs in this interaction.
Affiliation(s)
- Evi Ceulemans
- Ghent University, Department of Plant Biotechnology and Bioinformatics, 9052 Ghent, Belgium; VIB, Center for Plant Systems Biology, 9052 Ghent, Belgium
| | - Heba M M Ibrahim
- Division of Crop Biotechnics, Department of Biosystems, Katholieke Universiteit (KU) Leuven, 3001 Leuven, Belgium
| | - Barbara De Coninck
- Division of Crop Biotechnics, Department of Biosystems, Katholieke Universiteit (KU) Leuven, 3001 Leuven, Belgium.
| | - Alain Goossens
- Ghent University, Department of Plant Biotechnology and Bioinformatics, 9052 Ghent, Belgium; VIB, Center for Plant Systems Biology, 9052 Ghent, Belgium.
|
176
|
Applying random forest in a health administrative data context: a conceptual guide. Health Services and Outcomes Research Methodology 2021. [DOI: 10.1007/s10742-021-00255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
177
|
Early Prediction of the Need for Orthognathic Surgery in Patients With Repaired Unilateral Cleft Lip and Palate Using Machine Learning and Longitudinal Lateral Cephalometric Analysis Data. J Craniofac Surg 2021; 32:616-620. [PMID: 33704994 DOI: 10.1097/scs.0000000000006943] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/25/2022]
Abstract
The purpose of this study was to determine the cephalometric predictors of the future need for orthognathic surgery in patients with repaired unilateral cleft lip and palate (UCLP) using machine learning. This study included 56 Korean patients with UCLP, who were treated by a single surgeon and a single orthodontist with the same treatment protocol. Lateral cephalograms were obtained before the commencement of orthodontic/orthopedic treatment (T0; mean age, 6.3 years) and at a minimum of 15 years of age (T1; mean age, 16.7 years). Thirty-eight cephalometric variables were measured. At the T1 stage, 3 cephalometric criteria for surgery (ANB ≤ -3°; Wits appraisal ≤ -5 mm; Harvold unit difference ≥ 34 mm) were used to classify the subjects into the surgery group (n = 10, 17.9%) and the non-surgery group (n = 46, 82.1%). An independent t-test was used for statistical analyses. The Boruta method and XGBoost algorithm were used to determine the cephalometric variables for the prediction model. At the T0 stage, 2 variables exhibited a significant intergroup difference (ANB and facial convexity angle [FCA], all P < 0.05). However, 18 cephalometric variables at the T1 stage and 14 variables in the amount of change (ΔT1-T0) exhibited significant intergroup differences (all P < 0.05). At the T0 stage, the ANB, PP-FH, combination factor, and FCA were selected as predictive parameters with a cross-validation accuracy of 87.4%. It was possible to predict the future need for surgery to correct sagittal skeletal discrepancy in UCLP patients at the age of 6 years.
|
178
|
Yang L, Qin Y, Jian C. Screening for Core Genes Related to Pathogenesis of Alzheimer's Disease. Front Cell Dev Biol 2021; 9:668738. [PMID: 33968940 PMCID: PMC8101499 DOI: 10.3389/fcell.2021.668738] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/17/2021] [Accepted: 04/01/2021] [Indexed: 12/18/2022]
Abstract
Alzheimer’s disease (AD), a nervous system disease, currently lacks effective therapies. RNA expression is a basic means of regulating life activities, and identifying expression characteristics in AD patients may aid the exploration of AD pathogenesis and treatment. This study developed a classifier that could accurately distinguish AD patients from healthy people, and then obtained 3 core genes that may be related to the pathogenesis of AD. To this end, RNA expression data from the middle temporal gyrus of AD patients were first downloaded from the GEO database; missing values were imputed with the k-Nearest Neighbor (KNN) algorithm, and the data were then normalized using the limma package. Afterwards, the 500 genes with the highest feature importance were obtained through Max-Relevance and Min-Redundancy (mRMR) analysis, and based on these genes, a series of AD classifiers were constructed using the Support Vector Machine (SVM), Random Forest (RF), and KNN algorithms. The KNN classifier composed of 14 genes, which had the highest Matthews correlation coefficient (MCC) value in incremental feature selection (IFS) analysis, was identified as the best AD classifier. According to the analysis, these 14 genes played a pivotal role in the determination of AD and may be core genes associated with the pathogenesis of AD. Finally, protein-protein interaction (PPI) network and Random Walk with Restart (RWR) analyses were applied to obtain genes associated with the core genes, and key pathways related to AD were further analyzed. Overall, this study contributed to a deeper understanding of AD pathogenesis and provided theoretical guidance for related research and experiments.
Affiliation(s)
- Longxiu Yang
- Department of Neurology, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
| | - Yuan Qin
- Department of Neurology, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
| | - Chongdong Jian
- Department of Neurology, The Affiliated Hospital of Youjiang Medical University for Nationalities, Baise, China
|
179
|
Yin J, Li X, Li F, Lu Y, Zeng S, Zhu F. Identification of the key target profiles underlying the drugs of narrow therapeutic index for treating cancer and cardiovascular disease. Comput Struct Biotechnol J 2021; 19:2318-2328. [PMID: 33995923 PMCID: PMC8105181 DOI: 10.1016/j.csbj.2021.04.035] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/16/2020] [Revised: 04/09/2021] [Accepted: 04/15/2021] [Indexed: 12/14/2022]
Abstract
An appropriate therapeutic index is crucial for drug discovery and development, since narrow therapeutic index (NTI) drugs may induce severe adverse drug reactions or treatment failure with only slight dosage variation. To date, several studies have explored the shared characteristics underlying the targets of NTI drugs, and these characteristics have been applied to identify potential drug targets. However, the association between the drug therapeutic index and the related disease has not been dissected, although it is important for revealing the NTI drug mechanism and optimizing drug design. Therefore, in this study, the two classes of disease with the largest number of NTI drugs (cancers and cardiovascular disorders) were selected, and the target properties of the corresponding NTI drugs were analyzed. By calculating the biological system profiles and human protein–protein interaction (PPI) network properties of drug targets and adopting an AI-based algorithm, differentiating features between the two diseases were discovered, revealing the distinct underlying mechanisms of NTI drugs in different diseases. Consequently, ten shared features and four unique features were identified for both diseases to distinguish NTI from NNTI drug targets. These computational discoveries, as well as the newly found features, suggest that clinical studies aiming to avoid a narrow therapeutic index in those diseases should consider the ability of a target to be a hub and the efficiency of target signaling in the human PPI network. These findings could thus provide novel guidance for drug discovery and clinical research and help to estimate the safety of drugs for cancer and cardiovascular disease.
Affiliation(s)
- Jiayi Yin
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Xiaoxu Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Fengcheng Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yinjing Lu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Su Zeng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China; Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
180
|
Zwep LB, Duisters KLW, Jansen M, Guo T, Meulman JJ, Upadhyay PJ, van Hasselt JGC. Identification of high-dimensional omics-derived predictors for tumor growth dynamics using machine learning and pharmacometric modeling. CPT Pharmacometrics Syst Pharmacol 2021; 10:350-361. [PMID: 33792207 PMCID: PMC8099445 DOI: 10.1002/psp4.12603] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2020] [Revised: 01/07/2021] [Accepted: 02/01/2021] [Indexed: 12/26/2022]
Abstract
Pharmacometric modeling can capture tumor growth inhibition (TGI) dynamics and variability. These approaches do not usually consider covariates in high-dimensional settings, whereas high-dimensional molecular profiling technologies ("omics") are being increasingly considered for prediction of anticancer drug treatment response. Machine learning (ML) approaches have been applied to identify high-dimensional omics predictors for treatment outcome. Here, we combined TGI modeling and ML approaches for two distinct aims: omics-based prediction of tumor growth profiles and identification of pathways associated with treatment response and resistance. We propose a two-step approach combining ML using least absolute shrinkage and selection operator (LASSO) regression with pharmacometric modeling. We demonstrate our workflow using a previously published dataset consisting of 4706 tumor growth profiles of patient-derived xenograft (PDX) models treated with a variety of mono- and combination regimens. Pharmacometric TGI models were fit to the tumor growth profiles. The TGI parameter values derived from the resulting empirical Bayes estimates were regressed using the LASSO on high-dimensional genomic copy number variation data, which contained over 20,000 variables. The predictive model decreased the median prediction error by 4% compared with a model without any genomic information. A total of 74 pathways were identified by LASSO as related to treatment response or resistance development, some of which were supported by the literature. In conclusion, we demonstrate how the combined use of ML and pharmacometric modeling can provide pharmacological understanding of the genomic factors driving variation in treatment response.
Affiliation(s)
- Laura B. Zwep
- Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
- Mathematical Institute, Leiden University, Leiden, The Netherlands
| | | | - Martijn Jansen
- Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - Tingjie Guo
- Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
- Department of Intensive Care Medicine, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | | | - Parth J. Upadhyay
- Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
|
181
|
Polano M, Fabbiani E, Adreuzzi E, Cintio FD, Bedon L, Gentilini D, Mongiat M, Ius T, Arcicasa M, Skrap M, Dal Bo M, Toffoli G. A New Epigenetic Model to Stratify Glioma Patients According to Their Immunosuppressive State. Cells 2021; 10:cells10030576. [PMID: 33807997 PMCID: PMC8001235 DOI: 10.3390/cells10030576] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/25/2021] [Revised: 02/27/2021] [Accepted: 02/28/2021] [Indexed: 01/02/2023]
Abstract
Gliomas are the most common primary neoplasm of the central nervous system. A promising frontier in the definition of glioma prognosis and treatment is represented by epigenetics. In this study, we developed a machine learning classification model based on epigenetic data (CpG probes) to separate patients according to their state of immunosuppression. We considered 573 cases of low-grade glioma (LGG) and glioblastoma (GBM) from The Cancer Genome Atlas (TCGA). First, from gene expression data, we derived a novel binary indicator to flag patients with a favorable immune state. Then, based on previous studies, we selected the genes related to the immune state of the tumor microenvironment. Next, we refined the selection with a data-driven procedure based on Boruta. Finally, we tuned, trained, and evaluated both random forest and neural network classifiers on the resulting dataset. We found that a multi-layer perceptron network fed with the 338 probes selected by combining expert choice and Boruta gave the best performance, achieving an out-of-sample accuracy of 82.8%, a Matthews correlation coefficient of 0.657, and an area under the ROC curve of 0.9. Based on the proposed model, we provided a method to stratify glioma patients according to their epigenomic state.
Affiliation(s)
- Maurizio Polano
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
| | - Emanuele Fabbiani
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy
| | - Eva Adreuzzi
- Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Division of Molecular Oncology, 33081 Aviano, Italy
| | - Federica Di Cintio
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
- Department of Life Sciences, University of Trieste, 34127 Trieste, Italy
| | - Luca Bedon
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
- Department of Chemical and Pharmaceutical Sciences, University of Trieste, Via L. Giorgieri 1, 34127 Trieste, Italy
| | - Davide Gentilini
- Bioinformatics and Statistical Genomics Unit, Istituto Auxologico Italiano IRCCS, 20095 Cusano Milanino, Italy
- Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy
| | - Maurizio Mongiat
- Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Division of Molecular Oncology, 33081 Aviano, Italy
| | - Tamara Ius
- Neurosurgery Unit, Department of Neuroscience, Santa Maria della Misericordia University Hospital, 33100 Udine, Italy
| | - Mauro Arcicasa
- Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Department of Radiotherapy, 33081 Aviano, Italy
| | - Miran Skrap
- Neurosurgery Unit, Department of Neuroscience, Santa Maria della Misericordia University Hospital, 33100 Udine, Italy
| | - Michele Dal Bo
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
| | - Giuseppe Toffoli
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
| |
|
182
|
Chan HC, Chattopadhyay A, Chuang EY, Lu TP. Development of a Gene-Based Prediction Model for Recurrence of Colorectal Cancer Using an Ensemble Learning Algorithm. Front Oncol 2021; 11:631056. [PMID: 33692961 PMCID: PMC7938710 DOI: 10.3389/fonc.2021.631056] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/19/2020] [Accepted: 01/05/2021] [Indexed: 01/21/2023]
Abstract
It is difficult to determine which patients with stage I and II colorectal cancer are at high risk of recurrence, qualifying them to undergo adjuvant chemotherapy. In this study, we aimed to determine a gene signature using gene expression data that could successfully identify high risk of recurrence among stage I and II colorectal cancer patients. First, a synthetic minority oversampling technique was used to address the problem of imbalanced data due to rare recurrence events. We then applied a sequential workflow of three methods (significance analysis of microarrays, logistic regression, and recursive feature elimination) to identify genes differentially expressed between patients with and without recurrence. To stabilize the prediction algorithm, we repeated the above processes on 10 subsets by bagging the training data set and then used support vector machine methods to construct the prediction models. The final predictions were determined by majority voting. The 10 models, using 51 differentially expressed genes, successfully predicted a high risk of recurrence within 3 years in the training data set, with a sensitivity of 91.18%. For the validation data sets, the sensitivity of the prediction with samples from two other countries was 80.00% and 91.67%. These prediction models can potentially function as a tool to decide if adjuvant chemotherapy should be administered after surgery for patients with stage I and II colorectal cancer.
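The bagging and majority-voting steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the per-model predictions are made up, and the support vector machines used in the paper are not reproduced here.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one bagged subset)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties are resolved in favour of the label seen first (an assumption).
    """
    voted = []
    for sample_predictions in zip(*predictions_per_model):
        label, _count = Counter(sample_predictions).most_common(1)[0]
        voted.append(label)
    return voted


rng = random.Random(0)
subset = bootstrap_indices(8, rng)  # indices a model would be trained on

# Hypothetical predictions from three models for three patients
# (1 = recurrence, 0 = no recurrence).
model_outputs = [[1, 0, 1], [1, 1, 0], [0, 1, 0]]
print(majority_vote(model_outputs))
```

With an even number of models, such as the 10 used in the study, ties are possible for binary labels, so a real implementation needs an explicit tie-breaking rule.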
Affiliation(s)
- Han-Ching Chan
- Department of Public Health, College of Public Health, National Taiwan University, Institute of Epidemiology and Preventive Medicine, Taipei, Taiwan
| | - Amrita Chattopadhyay
- Bioinformatics and Biostatistics Core, Center of Genomic and Precision Medicine, National Taiwan University, Taipei, Taiwan
| | - Eric Y Chuang
- Bioinformatics and Biostatistics Core, Center of Genomic and Precision Medicine, National Taiwan University, Taipei, Taiwan.,Department of Electrical Engineering, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
| | - Tzu-Pin Lu
- Department of Public Health, College of Public Health, National Taiwan University, Institute of Epidemiology and Preventive Medicine, Taipei, Taiwan.,Bioinformatics and Biostatistics Core, Center of Genomic and Precision Medicine, National Taiwan University, Taipei, Taiwan
| |
|
183
|
Land Use Classification of VHR Images for Mapping Small-Sized Abandoned Citrus Plots by Using Spectral and Textural Information. Remote Sensing 2021. [DOI: 10.3390/rs13040681] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/03/2023]
Abstract
Agricultural land abandonment is an increasing problem in Europe. The Comunitat Valenciana Region (Spain), one of the most important citrus producers in Europe, suffers from this problem. The region is characterized by small citrus plots and high spatial fragmentation, which makes it necessary to use Very High-Resolution images to detect abandoned plots. In this paper, spectral and Gray Level Co-Occurrence Matrix (GLCM)-based textural information derived from the Normalized Difference Vegetation Index (NDVI) is used to map abandoned citrus plots in Oliva municipality (eastern Spain). The proposed methodology is based on three general steps: (a) extraction of spectral and textural features from the image, (b) pixel-based classification of the image using the Random Forest algorithm, and (c) assignment of a single value per plot by majority voting. The best results were obtained when extracting the texture features with a 9 × 9 window size, and the Random Forest model showed convergence around 100 decision trees. Cross-validation of the model showed an overall accuracy of 87% for the pixel-based classification and 95% for the plot-based classification. All the variables used are statistically significant for the classification; however, the most important were contrast, dissimilarity, NIR band (720 nm), and blue band (620 nm). According to our results, 31% of the plots classified as citrus in Oliva by the current methodology are abandoned. This matters because it helps public administrations avoid overestimating crop yields. The model was applied successfully outside the main study area (Oliva municipality), with a slightly lower accuracy (92%). This research provides a new approach to map small agricultural plots, especially to detect land abandonment in woody evergreen crops that have been little studied until now.
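Contrast and dissimilarity, the two texture features named as most important above, are standard statistics of a gray level co-occurrence matrix. The sketch below computes both for a tiny quantized image. It assumes a non-symmetric matrix with a horizontal offset of one pixel and processes the whole image at once; the 9 × 9 moving window and the NDVI input used in the paper are not reproduced, and all function names are illustrative.

```python
from collections import Counter


def glcm_probabilities(image, dx=1, dy=0):
    """Normalized co-occurrence frequencies of gray-level pairs.

    image: list of rows of integer gray levels.
    (dx, dy): pixel offset between the two members of each pair.
    """
    counts = Counter()
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def contrast(p):
    """Sum of p(i, j) * (i - j)^2 over all gray-level pairs."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def dissimilarity(p):
    """Sum of p(i, j) * |i - j| over all gray-level pairs."""
    return sum(prob * abs(i - j) for (i, j), prob in p.items())


tile = [[0, 0, 1],
        [1, 2, 2]]  # hypothetical quantized values
p = glcm_probabilities(tile)
print(contrast(p), dissimilarity(p))
```

Both statistics are zero for a perfectly uniform tile and grow with local gray-level variation, which is plausibly why they help separate tended plots from abandoned ones.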
|
184
|
Vitense P, Kasbohm E, Klassen A, Gierschner P, Trefz P, Weber M, Miekisch W, Schubert JK, Möbius P, Reinhold P, Liebscher V, Köhler H. Detection of Mycobacterium avium ssp. paratuberculosis in Cultures From Fecal and Tissue Samples Using VOC Analysis and Machine Learning Tools. Front Vet Sci 2021; 8:620327. [PMID: 33614764 PMCID: PMC7887282 DOI: 10.3389/fvets.2021.620327] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/22/2020] [Accepted: 01/13/2021] [Indexed: 12/17/2022]
Abstract
Analysis of volatile organic compounds (VOCs) is a novel approach to accelerate bacterial culture diagnostics of Mycobacterium avium subsp. paratuberculosis (MAP). In the present study, cultures of fecal and tissue samples from MAP-infected and non-suspect dairy cattle and goats were explored to elucidate the effects of sample matrix and of animal species on VOC emissions during bacterial cultivation and to identify early markers for bacterial growth. The samples were processed following standard laboratory procedures, and culture tubes were incubated for different time periods. Headspace volume of the tubes was sampled by needle trap micro-extraction and analyzed by gas chromatography-mass spectrometry. Analysis of MAP-specific VOC emissions considered potential characteristic VOC patterns. To address variation of the patterns, a flexible and robust machine learning workflow was set up, based on random forest classifiers and comprising three steps: variable selection, parameter optimization, and classification. Only a few substances either originated from a certain matrix or could be assigned to one animal species. These additional emissions were not considered informative by the variable selection procedure. Classification accuracy of MAP-positive and negative cultures was 0.98 for bovine feces and 0.88 for caprine feces. Six compounds indicating MAP presence were selected in all four settings (cattle vs. goat, feces vs. tissue): 2-methyl-1-propanol, 2-methyl-1-butanol, 3-methyl-1-butanol, heptanal, isoprene, and 2-heptanone. Classification accuracies for MAP growth scores ranged from 0.82 for goat tissue to 0.89 for cattle feces. Misclassification occurred predominantly between related scores. Seventeen compounds indicating MAP growth were selected in all four settings, including the 6 compounds indicating MAP presence.
The concentration levels of 2,3,5-trimethylfuran, 2-pentylfuran, 1-propanol, and 1-hexanol were indicative of MAP cultures before visible growth was apparent. Thus, very accurate classification of the VOC samples was achieved and the potential of VOC analysis to detect bacterial growth before colonies become visible was confirmed. These results indicate that diagnosis of paratuberculosis can be optimized by monitoring VOC emissions of bacterial cultures. Further validation studies are needed to increase the robustness of indicative VOC patterns for early MAP growth as a prerequisite for the development of VOC-based diagnostic analysis systems.
Affiliation(s)
- Philipp Vitense
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Elisa Kasbohm
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Anne Klassen
- Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut, Jena, Germany
| | - Peter Gierschner
- Department of Anaesthesia and Intensive Care, University Medicine Rostock, Rostock, Germany
| | - Phillip Trefz
- Department of Anaesthesia and Intensive Care, University Medicine Rostock, Rostock, Germany
| | - Michael Weber
- Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut, Jena, Germany
| | - Wolfram Miekisch
- Department of Anaesthesia and Intensive Care, University Medicine Rostock, Rostock, Germany
| | - Jochen K Schubert
- Department of Anaesthesia and Intensive Care, University Medicine Rostock, Rostock, Germany
| | - Petra Möbius
- National Reference Laboratory for Paratuberculosis, Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut, Jena, Germany
| | - Petra Reinhold
- Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut, Jena, Germany
| | - Volkmar Liebscher
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Heike Köhler
- National Reference Laboratory for Paratuberculosis, Institute of Molecular Pathogenesis, Friedrich-Loeffler-Institut, Jena, Germany
| |
|
185
|
Seifert S, Gundlach S, Junge O, Szymczak S. Integrating biological knowledge and gene expression data using pathway-guided random forests: a benchmarking study. Bioinformatics 2021; 36:4301-4308. [PMID: 32399562 PMCID: PMC7520048 DOI: 10.1093/bioinformatics/btaa483] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/09/2019] [Revised: 03/13/2020] [Accepted: 05/05/2020] [Indexed: 12/12/2022]
Abstract
MOTIVATION High-throughput technologies allow comprehensive characterization of individuals on many molecular levels. However, training computational models to predict disease status based on omics data is challenging. A promising solution is the integration of external knowledge about structural and functional relationships into the modeling process. We compared four published random forest-based approaches using two simulation studies and nine experimental datasets. RESULTS The self-sufficient prediction error approach should be applied when large numbers of relevant pathways are expected. The competing methods hunting and learner of functional enrichment should be used when low numbers of relevant pathways are expected or the most strongly associated pathways are of interest. The hybrid approach synthetic features is not recommended because of its high false discovery rate. AVAILABILITY AND IMPLEMENTATION An R package providing functions for data analysis and simulation is available at GitHub (https://github.com/szymczak-lab/PathwayGuidedRF). An accompanying R data package (https://github.com/szymczak-lab/DataPathwayGuidedRF) stores the processed and quality controlled experimental datasets downloaded from Gene Expression Omnibus (GEO). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stephan Seifert
- Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
| | - Sven Gundlach
- Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
| | - Olaf Junge
- Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
| | - Silke Szymczak
- Institute of Medical Informatics and Statistics, Kiel University, University Hospital Schleswig-Holstein, Kiel 24105, Germany
| |
|
186
|
Hinzke T, Kleiner M, Meister M, Schlüter R, Hentschker C, Pané-Farré J, Hildebrandt P, Felbeck H, Sievert SM, Bonn F, Völker U, Becher D, Schweder T, Markert S. Bacterial symbiont subpopulations have different roles in a deep-sea symbiosis. eLife 2021; 10:58371. [PMID: 33404502 PMCID: PMC7787665 DOI: 10.7554/elife.58371] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/28/2020] [Accepted: 12/05/2020] [Indexed: 12/13/2022]
Abstract
The hydrothermal vent tubeworm Riftia pachyptila hosts a single 16S rRNA phylotype of intracellular sulfur-oxidizing symbionts, which vary considerably in cell morphology and exhibit a remarkable degree of physiological diversity and redundancy, even in the same host. To elucidate whether multiple metabolic routes are employed in the same cells or rather in distinct symbiont subpopulations, we enriched symbionts according to cell size by density gradient centrifugation. Metaproteomic analysis, microscopy, and flow cytometry strongly suggest that Riftia symbiont cells of different sizes represent metabolically dissimilar stages of a physiological differentiation process: While small symbionts actively divide and may establish cellular symbiont-host interaction, large symbionts apparently do not divide, but still replicate DNA, leading to DNA endoreduplication. Moreover, in large symbionts, carbon fixation and biomass production seem to be metabolic priorities. We propose that this division of labor between smaller and larger symbionts benefits the productivity of the symbiosis as a whole.
Affiliation(s)
- Tjorven Hinzke
- Institute of Pharmacy, University of Greifswald, Greifswald, Germany.,Institute of Marine Biotechnology, Greifswald, Germany.,Energy Bioengineering Group, University of Calgary, Calgary, Canada
| | - Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, United States
| | - Mareike Meister
- Institute of Microbiology, University of Greifswald, Greifswald, Germany.,Leibniz Institute for Plasma Science and Technology, Greifswald, Germany
| | - Rabea Schlüter
- Imaging Center of the Department of Biology, University of Greifswald, Greifswald, Germany
| | - Christian Hentschker
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
| | - Jan Pané-Farré
- Center for Synthetic Microbiology (SYNMIKRO), Philipps-University Marburg, Marburg, Germany
| | - Petra Hildebrandt
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
| | - Horst Felbeck
- Scripps Institution of Oceanography, University of California San Diego, San Diego, United States
| | - Stefan M Sievert
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, United States
| | - Florian Bonn
- Institute of Biochemistry, University Hospital, Goethe University School of Medicine Frankfurt, Frankfurt, Germany
| | - Uwe Völker
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
| | - Dörte Becher
- Institute of Microbiology, University of Greifswald, Greifswald, Germany
| | - Thomas Schweder
- Institute of Pharmacy, University of Greifswald, Greifswald, Germany.,Institute of Marine Biotechnology, Greifswald, Germany
| | - Stephanie Markert
- Institute of Pharmacy, University of Greifswald, Greifswald, Germany.,Institute of Marine Biotechnology, Greifswald, Germany
| |
|
187
|
Rosenbusch H, Soldner F, Evans AM, Zeelenberg M. Supervised machine learning methods in psychology: A practical introduction with annotated R code. Social and Personality Psychology Compass 2021. [DOI: 10.1111/spc3.12579] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/27/2023]
Affiliation(s)
- Hannes Rosenbusch
- Department of Social Psychology Tilburg University Tilburg The Netherlands
| | - Felix Soldner
- Department of Security and Crime Science University College London London UK
| | - Anthony M. Evans
- Department of Social Psychology Tilburg University Tilburg The Netherlands
| | - Marcel Zeelenberg
- Department of Social Psychology Tilburg University Tilburg The Netherlands
- Department of Marketing Tilburg University Tilburg The Netherlands
| |
|
188
|
Tsatsanis A, McCorkindale AN, Wong BX, Patrick E, Ryan TM, Evans RW, Bush AI, Sutherland GT, Sivaprasadarao A, Guennewig B, Duce JA. The acute phase protein lactoferrin is a key feature of Alzheimer's disease and predictor of Aβ burden through induction of APP amyloidogenic processing. Mol Psychiatry 2021; 26:5516-5531. [PMID: 34400772 PMCID: PMC8758478 DOI: 10.1038/s41380-021-01248-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 02/22/2021] [Revised: 07/17/2021] [Accepted: 07/23/2021] [Indexed: 02/06/2023]
Abstract
Amyloidogenic processing of the amyloid precursor protein (APP) forms the amyloid-β peptide (Aβ) component of pathognomonic extracellular plaques of AD. Additional early cortical changes in AD include neuroinflammation and elevated iron levels. Activation of the innate immune system in the brain is a neuroprotective response to infection; however, persistent neuroinflammation is linked to AD neuropathology by uncertain mechanisms. Non-parametric machine learning analysis on transcriptomic data from a large neuropathologically characterised patient cohort revealed the acute phase protein lactoferrin (Lf) as the key predictor of amyloid pathology. In vitro studies showed that an interaction between APP and the iron-bound form of Lf secreted from activated microglia diverted neuronal APP endocytosis from the canonical clathrin-dependent pathway to one requiring ADP ribosylation factor 6 trafficking. By rerouting APP recycling to the Rab11-positive compartment for amyloidogenic processing, Lf dramatically increased neuronal Aβ production. Lf emerges as a novel pharmacological target for AD that not only modulates APP processing but provides a link between Aβ production, neuroinflammation and iron dysregulation.
Affiliation(s)
- Andrew Tsatsanis
- The ALBORADA Drug Discovery Institute, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK; Faculty of Biological Sciences, School of Biomedical Sciences, University of Leeds, Leeds, West Yorkshire, UK
| | - Andrew N. McCorkindale
- Faculty of Medicine and Health, Charles Perkins Centre and School of Medical Sciences, University of Sydney, Camperdown, NSW, Australia
| | - Bruce X. Wong
- The ALBORADA Drug Discovery Institute, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK; Faculty of Biological Sciences, School of Biomedical Sciences, University of Leeds, Leeds, West Yorkshire, UK; Melbourne Dementia Research Centre, The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, VIC, Australia
| | - Ellis Patrick
- Faculty of Science, School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
| | - Tim M. Ryan
- Melbourne Dementia Research Centre, The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, VIC, Australia
| | - Robert W. Evans
- School of Engineering and Design, Brunel University, London, UK
| | - Ashley I. Bush
- Melbourne Dementia Research Centre, The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, VIC, Australia
| | - Greg T. Sutherland
- Faculty of Medicine and Health, Charles Perkins Centre and School of Medical Sciences, University of Sydney, Camperdown, NSW, Australia
| | - Asipu Sivaprasadarao
- Faculty of Biological Sciences, School of Biomedical Sciences, University of Leeds, Leeds, West Yorkshire, UK
| | - Boris Guennewig
- Faculty of Medicine and Health, Brain and Mind Centre and School of Medical Sciences, The University of Sydney, Camperdown, NSW, Australia
| | - James A. Duce
- The ALBORADA Drug Discovery Institute, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK; Faculty of Biological Sciences, School of Biomedical Sciences, University of Leeds, Leeds, West Yorkshire, UK; Melbourne Dementia Research Centre, The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, VIC, Australia
| |
|
189
|
Zandler H, Senftl T, Vanselow KA. Reanalysis datasets outperform other gridded climate products in vegetation change analysis in peripheral conservation areas of Central Asia. Sci Rep 2020; 10:22446. [PMID: 33384431 PMCID: PMC7775429 DOI: 10.1038/s41598-020-79480-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/26/2020] [Accepted: 12/08/2020] [Indexed: 11/29/2022]
Abstract
Global environmental research requires long-term climate data. Yet, meteorological infrastructure is missing in the vast majority of the world’s protected areas. Therefore, gridded products are frequently used as the only available climate data source in peripheral regions. However, associated evaluations are commonly biased towards well observed areas and consequently, station-based datasets. As evaluations on vegetation monitoring abilities are lacking for regions with poor data availability, we analyzed the potential of several state-of-the-art climate datasets (CHIRPS, CRU, ERA5-Land, GPCC-Monitoring-Product, IMERG-GPM, MERRA-2, MODIS-MOD10A1) for assessing NDVI anomalies (MODIS-MOD13Q1) in two particularly suitable remote conservation areas. We calculated anomalies of 156 climate variables and seasonal periods during 2001–2018, correlated these with vegetation anomalies while taking the multiple comparison problem into consideration, and computed their spatial performance to derive suitable parameters. Our results showed that four datasets (MERRA-2, ERA5-Land, MOD10A1, CRU) were suitable for vegetation analysis in both regions, by showing significant correlations controlled at a false discovery rate < 5% and in more than half of the analyzed areas. Cross-validated variable selection and importance assessment based on the Boruta algorithm indicated high importance of the reanalysis datasets ERA5-Land and MERRA-2 in both areas but higher differences and variability between the regions with all other products. CHIRPS, GPCC and the bias-corrected version of MERRA-2 were unsuitable and not important in both regions. We provide evidence that reanalysis datasets are most suitable for spatiotemporally consistent environmental analysis whereas gauge- or satellite-based products and their combinations are highly variable and may not be applicable in peripheral areas.
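The abstract states that correlations were controlled at a false discovery rate below 5% but does not name the procedure. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, one widely used way to control the false discovery rate; its use here is an assumption, and the p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, ascending p-values)
    with p_(k) <= k * alpha / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            largest_passing_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[index] = True
    return rejected


# Hypothetical p-values for four climate-variable correlations.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

Because the rule is step-up, a small p-value that misses its own threshold can still be rejected when a larger one further down the ranking passes.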
Affiliation(s)
- Harald Zandler
- Working Group of Climatology, Department of Geography, University of Bayreuth, Universitätsstr. 30, 95447, Bayreuth, Germany.,Bayreuth Center of Ecology and Environmental Research, University of Bayreuth, Dr. Hans-Frisch-Straße 1-3, 95448, Bayreuth, Germany
| | - Thomas Senftl
- Working Group of Climatology, Department of Geography, University of Bayreuth, Universitätsstr. 30, 95447, Bayreuth, Germany
| | - Kim André Vanselow
- Institute of Geography, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wetterkreuz 15, 91058, Erlangen, Germany
| |
|
190
|
Polewko-Klim A, Lesiński W, Golińska AK, Mnich K, Siwek M, Rudnicki WR. Sensitivity analysis based on the random forest machine learning algorithm identifies candidate genes for regulation of innate and adaptive immune response of chicken. Poult Sci 2020; 99:6341-6354. [PMID: 33248550 PMCID: PMC7704721 DOI: 10.1016/j.psj.2020.08.059] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/24/2019] [Revised: 07/14/2020] [Accepted: 08/11/2020] [Indexed: 11/25/2022]
Abstract
Two categories of immune responses, innate and adaptive immunity, both have polygenic backgrounds and a significant environmental component. The goal of the reported study was to define candidate genes and mutations for the immune traits of interest in chickens using machine learning-based sensitivity analysis for single-nucleotide polymorphisms (SNPs) located in candidate genes defined in quantitative trait loci regions. Here, adaptive immunity is represented by the specific antibody response toward keyhole limpet hemocyanin (KLH), whereas innate immunity is represented by natural antibodies toward lipopolysaccharide (LPS) and lipoteichoic acid (LTA). The analysis consisted of 3 basic steps: identification of candidate SNPs via feature selection, optimisation of the feature set using recursive feature elimination, and finally a gene-level sensitivity analysis for final selection of models. The predictive model based on 5 genes (MAPK8IP3, CRLF3, UNC13D, ILR9, and PRCKB) explains 14.9% of variance for KLH adaptive response. The models obtained for LTA and LPS use more genes and have lower predictive power, explaining 7.8% and 4.5% of total variance, respectively. In comparison, the linear models built on genes identified by a standard statistical analysis explain 1.5, 0.5, and 0.3% of variance for KLH, LTA, and LPS response, respectively. The present study shows that machine learning methods applied to systems with a complex interaction network can discover phenotype-genotype associations with much higher sensitivity than traditional statistical models. It adds to the evidence suggesting a role of MAPK8IP3 in the adaptive immune response. It also indicates that CRLF3 is involved in this process. Both findings need additional verification.
Affiliation(s)
- Aneta Polewko-Klim
- Institute of Computer Science, University of Bialystok, Białystok, Poland.
| | - Wojciech Lesiński
- Institute of Computer Science, University of Bialystok, Białystok, Poland
| | | | - Krzysztof Mnich
- Computational Centre, University of Bialystok, Białystok, Poland
| | - Maria Siwek
- Animal Biotechnology and Genetics Department, University of Technology and Life Sciences, Bydgoszcz, Poland
| | - Witold R Rudnicki
- Institute of Computer Science, University of Bialystok, Białystok, Poland; Computational Centre, University of Bialystok, Białystok, Poland; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
| |
|
191
|
Yang S, Koo BK, Hoshino M, Lee JM, Murai T, Park J, Zhang J, Hwang D, Shin ES, Doh JH, Nam CW, Wang J, Chen S, Tanaka N, Matsuo H, Akasaka T, Choi G, Petersen K, Chang HJ, Kakuta T, Narula J. CT Angiographic and Plaque Predictors of Functionally Significant Coronary Disease and Outcome Using Machine Learning. JACC Cardiovasc Imaging 2020; 14:629-641. [PMID: 33248965 DOI: 10.1016/j.jcmg.2020.08.025] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 06/04/2020] [Revised: 07/13/2020] [Accepted: 08/20/2020] [Indexed: 10/22/2022]
Abstract
OBJECTIVES The goal of this study was to investigate the association of stenosis and plaque features with myocardial ischemia and their prognostic implications. BACKGROUND Various anatomic, functional, and morphological attributes of coronary artery disease (CAD) have been independently explored to define ischemia and prognosis. METHODS A total of 1,013 vessels with fractional flow reserve (FFR) measurement and available coronary computed tomography angiography were analyzed. Stenosis and plaque features of the target lesion and vessel were evaluated by an independent core laboratory. Relevant features associated with low FFR (≤0.80) were identified by using machine learning, and their predictability of 5-year risk of vessel-oriented composite outcome, including cardiac death, target vessel myocardial infarction, or target vessel revascularization, were evaluated. RESULTS The mean percent diameter stenosis and invasive FFR were 48.5 ± 17.4% and 0.81 ± 0.14, respectively. Machine learning interrogation identified 6 clusters for low FFR, and the most relevant feature from each cluster was minimum lumen area, percent atheroma volume, fibrofatty and necrotic core volume, plaque volume, proximal left anterior descending coronary artery lesion, and remodeling index (in order of importance). These 6 features showed predictability for low FFR (area under the receiver-operating characteristic curve: 0.797). The risk of 5-year vessel-oriented composite outcome increased with every increment of the number of 6 relevant features, and it had incremental prognostic value over percent diameter stenosis and FFR (area under the receiver-operating characteristic curve: 0.706 vs. 0.611; p = 0.031). 
CONCLUSIONS Six functionally relevant features, including minimum lumen area, percent atheroma volume, fibrofatty and necrotic core volume, plaque volume, proximal left anterior descending coronary artery lesion, and remodeling index, help define the presence of myocardial ischemia and provide better prognostication in patients with CAD. (CCTA-FFR Registry for Risk Prediction; NCT04037163).
Affiliation(s)
- Seokhun Yang
- Department of Internal Medicine and Cardiovascular Center, Seoul National University Hospital, Seoul, South Korea
| | - Bon-Kwon Koo
- Department of Internal Medicine and Cardiovascular Center, Seoul National University Hospital, Seoul, South Korea; Institute on Aging, Seoul National University, Seoul, South Korea.
| | - Masahiro Hoshino
- Division of Cardiovascular Medicine, Tsuchiura Kyodo General Hospital, Ibaraki, Japan
| | - Joo Myung Lee
- Division of Cardiology, Department of Internal Medicine, Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Tadashi Murai
- Division of Cardiovascular Medicine, Tsuchiura Kyodo General Hospital, Ibaraki, Japan
| | - Jiesuck Park
- Department of Internal Medicine and Cardiovascular Center, Seoul National University Hospital, Seoul, South Korea
| | - Jinlong Zhang
- Department of Internal Medicine and Cardiovascular Center, Seoul National University Hospital, Seoul, South Korea
| | - Doyeon Hwang
- Department of Internal Medicine and Cardiovascular Center, Seoul National University Hospital, Seoul, South Korea
| | - Eun-Seok Shin
- Department of Cardiology, Ulsan Medical Center, Ulsan Hospital, Ulsan, South Korea
| | - Joon-Hyung Doh
- Department of Medicine, Inje University Ilsan Paik Hospital, Goyang, South Korea
| | - Chang-Wook Nam
- Department of Medicine, Keimyung University Dongsan Medical Center, Daegu, South Korea
| | - Jianan Wang
- Department of Cardiology, The Second Affiliated Hospital, School of Medicine, Zhejiang University, China
| | - Shaoliang Chen
- Department of Cardiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Nobuhiro Tanaka
- Department of Cardiology, Tokyo Medical University, Tokyo, Japan
| | - Hitoshi Matsuo
- Department of Cardiology, Gifu Heart Center, Gifu, Japan
| | | | - Gilwoo Choi
- HeartFlow, Inc., Redwood City, California, USA
| | | | - Hyuk-Jae Chang
- Division of Cardiology, Severance Cardiovascular Hospital, Yonsei-Cedars-Sinai Integrative Cardiovascular Imaging Research Center, Yonsei University College of Medicine, Seoul, South Korea
| | - Tsunekazu Kakuta
- Division of Cardiovascular Medicine, Tsuchiura Kyodo General Hospital, Ibaraki, Japan
| | - Jagat Narula
- Icahn School of Medicine at Mount Sinai, New York, New York, USA
|
192
|
Acharjee A, Larkman J, Xu Y, Cardoso VR, Gkoutos GV. A random forest based biomarker discovery and power analysis framework for diagnostics research. BMC Med Genomics 2020; 13:178. [PMID: 33228632 PMCID: PMC7685541 DOI: 10.1186/s12920-020-00826-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 04/03/2020] [Accepted: 11/15/2020] [Indexed: 11/25/2022] Open
Abstract
Background Biomarker identification is one of the major goals of functional genomics and translational medicine studies. Large-scale omics data are increasingly being accumulated and can provide vital means for the identification of biomarkers for the early diagnosis of complex disease and/or for advanced patient/disease stratification. These tasks are clearly interlinked, and it is essential that an unbiased and stable methodology is applied in order to address them. Although many biomarker identification approaches, primarily machine learning based, have recently been developed, the exploration of potential associations between biomarker identification and the design of future experiments remains a challenge. Methods In this study, using both simulated and published experimentally derived datasets, we assessed the performance of several state-of-the-art Random Forest (RF) based decision approaches, namely the Boruta method, the permutation based feature selection without correction method, the permutation based feature selection with correction method, and the backward elimination based feature selection method. Moreover, we conducted a power analysis to estimate the number of samples required for potential future studies. Results We present a number of different RF based stable feature selection methods and compare their performances using simulated, as well as published, experimentally derived, datasets. Across all of the scenarios considered, we found the Boruta method to be the most stable methodology, whilst the Permutation (Raw) approach offered the largest number of relevant features, when allowed to stabilise over a number of iterations. Finally, we developed and made available a web interface (https://joelarkman.shinyapps.io/PowerTools/) to streamline power calculations, thereby aiding the design of potential future studies within a translational medicine context.
Conclusions We developed an RF-based biomarker discovery framework and provide a web interface for it, termed PowerTools, that supports the design of appropriate and cost-effective future omics studies.
Affiliation(s)
- Animesh Acharjee
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TT, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospital Birmingham, Birmingham, B15 2WB, UK
| | - Joseph Larkman
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TT, UK
| | - Yuanwei Xu
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TT, UK
| | - Victor Roth Cardoso
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TT, UK; MRC Health Data Research UK (HDR UK), London, UK
| | - Georgios V Gkoutos
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TT, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospital Birmingham, Birmingham, B15 2WB, UK; MRC Health Data Research UK (HDR UK), London, UK; NIHR Experimental Cancer Medicine Centre, Birmingham, B15 2TT, UK; NIHR Biomedical Research Centre, University Hospital Birmingham, Birmingham, B15 2TT, UK
|
193
|
Genome Wide Epistasis Study of On-Statin Cardiovascular Events with Iterative Feature Reduction and Selection. J Pers Med 2020; 10:jpm10040212. [PMID: 33171725 PMCID: PMC7712544 DOI: 10.3390/jpm10040212] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/10/2020] [Revised: 10/30/2020] [Accepted: 11/04/2020] [Indexed: 12/25/2022] Open
Abstract
Predicting risk for major adverse cardiovascular events (MACE) is an evidence-based practice that incorporates lifestyle, history, and other risk factors. Statins reduce risk for MACE by decreasing lipids, but it is difficult to stratify risk following initiation of a statin. Genetic risk determinants for on-statin MACE are low-effect size and impossible to generalize. Our objective was to determine high-level epistatic risk factors for on-statin MACE with GWAS-scale data. Controlled-access data for 5890 subjects taking a statin collected from Vanderbilt University Medical Center's BioVU were obtained from dbGaP. We used Random Forest Iterative Feature Reduction and Selection (RF-IFRS) to select highly informative genetic and environmental features from a GWAS-scale dataset of patients taking statin medications. Variant-pairs were distilled into overlapping networks and assembled into individual decision trees to provide an interpretable set of variants and associated risk. 1718 cases who suffered MACE and 4172 controls were obtained from dbGaP. Pathway analysis showed that variants in genes related to vasculogenesis (FDR = 0.024), angiogenesis (FDR = 0.019), and carotid artery disease (FDR = 0.034) were related to risk for on-statin MACE. We identified six gene-variant networks that predicted odds of on-statin MACE. The most elevated risk was found in a small subset of patients carrying variants in COL4A2, TMEM178B, SZT2, and TBXAS1 (OR = 4.53, p < 0.001). The RF-IFRS method is a viable method for interpreting complex "black-box" findings from machine-learning. In this study, it identified epistatic networks that could be applied to risk estimation for on-statin MACE. Further study will seek to replicate these findings in other populations.
|
194
|
Fortino V, Scala G, Greco D. Feature set optimization in biomarker discovery from genome-scale data. Bioinformatics 2020; 36:3393-3400. [PMID: 32119073 DOI: 10.1093/bioinformatics/btaa144] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/27/2019] [Revised: 02/20/2020] [Accepted: 02/26/2020] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION Omics technologies have the potential to facilitate the discovery of new biomarkers. However, only a few omics-derived biomarkers have been successfully translated into clinical applications to date. Feature selection is a crucial step in this process that identifies small sets of features with high predictive power. Models consisting of a limited number of features are not only more robust in analytical terms, but also ensure cost effectiveness and clinical translatability of new biomarker panels. Here we introduce GARBO, a novel multi-island adaptive genetic algorithm to simultaneously optimize accuracy and set size in omics-driven biomarker discovery problems. RESULTS Compared to existing methods, GARBO enables the identification of biomarker sets that best optimize the trade-off between classification accuracy and number of biomarkers. We tested GARBO and six alternative selection methods on two highly relevant topics in precision medicine: cancer patient stratification and drug sensitivity prediction. We found multivariate biomarker models from different omics data types such as mRNA, miRNA, copy number variation, mutation and DNA methylation. The top performing models were evaluated by using two different strategies: the Pareto-based selection, and the weighted sum between accuracy and set size (w = 0.5). Pareto-based preferences show the ability of the proposed algorithm to search minimal subsets of relevant features that can be used to model accurate random forest-based classification systems. Moreover, GARBO systematically identified, on larger omics data types, such as gene expression and DNA methylation, biomarker panels exhibiting higher classification accuracy or employing a number of features much lower than those discovered with other methods. These results were confirmed on independent datasets. AVAILABILITY AND IMPLEMENTATION github.com/Greco-Lab/GARBO. CONTACT dario.greco@tuni.fi.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
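The two evaluation strategies named in this abstract, Pareto-based selection and a weighted sum of accuracy and set size, can be illustrated with a minimal Python sketch. This is not GARBO's implementation: the `Panel` type, the example values, and in particular the form of the weighted score (accuracy combined with a size term normalised by a hypothetical `max_size`) are assumptions made here for illustration, since the abstract gives only w = 0.5.

```python
from typing import List, NamedTuple


class Panel(NamedTuple):
    """A candidate biomarker panel (hypothetical container for this sketch)."""
    name: str
    accuracy: float  # classification accuracy, higher is better
    size: int        # number of features in the panel, lower is better


def pareto_front(panels: List[Panel]) -> List[Panel]:
    """Return the panels that no other panel dominates.

    q dominates p if q is at least as accurate and at least as small,
    and strictly better on one of the two criteria.
    """
    front = []
    for p in panels:
        dominated = any(
            q.accuracy >= p.accuracy
            and q.size <= p.size
            and (q.accuracy > p.accuracy or q.size < p.size)
            for q in panels
        )
        if not dominated:
            front.append(p)
    return front


def weighted_score(p: Panel, max_size: int, w: float = 0.5) -> float:
    """Assumed scalarisation: w * accuracy + (1 - w) * (1 - size / max_size)."""
    return w * p.accuracy + (1 - w) * (1 - p.size / max_size)


if __name__ == "__main__":
    # Made-up candidates, only to exercise the two functions.
    candidates = [
        Panel("A", 0.90, 50),
        Panel("B", 0.88, 10),
        Panel("C", 0.85, 12),
        Panel("D", 0.80, 5),
    ]
    print([p.name for p in pareto_front(candidates)])
    print(max(candidates, key=lambda p: weighted_score(p, max_size=50)).name)
```

The Pareto front keeps every non-dominated trade-off and leaves the final choice to the analyst, whereas the weighted sum collapses both criteria into one number and so depends on the chosen weight and on how set size is normalised.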
Affiliation(s)
- V Fortino
- Institute of Biomedicine, University of Eastern Finland, Kuopio 70210, Finland
| | - G Scala
- Faculty of Medicine and Health Technology, Tampere University, Tampere 33100, Finland
- Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland
| | - D Greco
- Faculty of Medicine and Health Technology, Tampere University, Tampere 33100, Finland
- Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland
|
195
|
Wavelength Selection Method Based on Partial Least Square from Hyperspectral Unmanned Aerial Vehicle Orthomosaic of Irrigated Olive Orchards. Remote Sensing 2020. [DOI: 10.3390/rs12203426] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 02/07/2023]
Abstract
Identifying and mapping irrigated areas is essential for a variety of applications such as agricultural planning and water resource management. Irrigated plots are mainly identified using supervised classification of multispectral images from satellite or manned aerial platforms. Recently, hyperspectral sensors on-board Unmanned Aerial Vehicles (UAV) have proven to be useful analytical tools in agriculture due to their high spectral resolution. However, few efforts have been made to identify which wavelengths could be applied to provide relevant information in specific scenarios. In this study, hyperspectral reflectance data from UAV were used to compare the performance of several wavelength selection methods based on Partial Least Square (PLS) regression with the purpose of discriminating two systems of irrigation commonly used in olive orchards. The tested PLS methods include filter methods (Loading Weights, Regression Coefficient and Variable Importance in Projection); Wrapper methods (Genetic Algorithm-PLS, Uninformative Variable Elimination-PLS, Backward Variable Elimination-PLS, Sub-window Permutation Analysis-PLS, Iterative Predictive Weighting-PLS, Regularized Elimination Procedure-PLS, Backward Interval-PLS, Forward Interval-PLS and Competitive Adaptive Reweighted Sampling-PLS); and an Embedded method (Sparse-PLS). In addition, two non-PLS based methods, Lasso and Boruta, were also used. Linear Discriminant Analysis and nonlinear K-Nearest Neighbors techniques were established for identification and assessment. The results indicate that wavelength selection methods, commonly used in other disciplines, provide utility in remote sensing for agronomical purposes, the identification of irrigation techniques being one such example. In addition to the aforementioned, these PLS and non-PLS based methods can play an important role in multivariate analysis, which can be used for subsequent model analysis. 
Of all the methods evaluated, Genetic Algorithm-PLS and Boruta eliminated nearly 90% of the original spectral wavelengths acquired from a hyperspectral sensor onboard a UAV while increasing the identification accuracy of the classification.
|
196
|
Arnoriaga-Rodríguez M, Mayneris-Perxachs J, Burokas A, Contreras-Rodríguez O, Blasco G, Coll C, Biarnés C, Miranda-Olivos R, Latorre J, Moreno-Navarrete JM, Castells-Nobau A, Sabater M, Palomo-Buitrago ME, Puig J, Pedraza S, Gich J, Pérez-Brocal V, Ricart W, Moya A, Fernández-Real X, Ramió-Torrentà L, Pamplona R, Sol J, Jové M, Portero-Otin M, Maldonado R, Fernández-Real JM. Obesity Impairs Short-Term and Working Memory through Gut Microbial Metabolism of Aromatic Amino Acids. Cell Metab 2020; 32:548-560.e7. [PMID: 33027674 DOI: 10.1016/j.cmet.2020.09.002] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 01/17/2020] [Revised: 05/12/2020] [Accepted: 08/31/2020] [Indexed: 02/07/2023]
Abstract
The gut microbiome has been linked to fear extinction learning in animal models. Here, we aimed to explore the gut microbiome and memory domains according to obesity status. A specific microbiome profile associated with short-term memory, working memory, and the volume of the hippocampus and frontal regions of the brain differentially in human subjects with and without obesity. Plasma and fecal levels of aromatic amino acids, their catabolites, and vegetable-derived compounds were longitudinally associated with short-term and working memory. Functionally, microbiota transplantation from human subjects with obesity led to decreased memory scores in mice, aligning this trait from humans with that of recipient mice. RNA sequencing of the medial prefrontal cortex of mice revealed that short-term memory associated with aromatic amino acid pathways, inflammatory genes, and clusters of bacterial species. These results highlight the potential therapeutic value of targeting the gut microbiota for memory impairment, specifically in subjects with obesity.
Affiliation(s)
- María Arnoriaga-Rodríguez
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain; Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain
| | - Jordi Mayneris-Perxachs
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
| | - Aurelijus Burokas
- Laboratory of Neuropharmacology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
| | - Oren Contreras-Rodríguez
- Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Psychiatry Department, Bellvitge University Hospital, Bellvitge Biomedical Research Institute (IDIBELL) and CIBERSAM, Barcelona, Spain
| | - Gerard Blasco
- Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain; Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
| | - Clàudia Coll
- Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Dr. Josep Trueta University Hospital, Girona, Spain
| | - Carles Biarnés
- Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain
| | - Romina Miranda-Olivos
- Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Psychiatry Department, Bellvitge University Hospital, Bellvitge Biomedical Research Institute (IDIBELL) and CIBERSAM, Barcelona, Spain
| | - Jèssica Latorre
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
| | - José-Maria Moreno-Navarrete
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain; Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain
| | - Anna Castells-Nobau
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
| | - Mònica Sabater
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain
| | - María Encarnación Palomo-Buitrago
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
| | - Josep Puig
- Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain; Institute of Diagnostic Imaging (IDI)-Research Unit (IDIR), Parc Sanitari Pere Virgili, Barcelona, Spain; Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain
| | - Salvador Pedraza
- Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain; Medical Imaging, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Department of Radiology, Dr. Josep Trueta University Hospital, Girona, Spain
| | - Jordi Gich
- Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain; Girona Neurodegeneration and Neuroinflammation Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
| | - Vicente Pérez-Brocal
- Department of Genomics and Health, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain; Biomedical Research Networking Center for Epidemiology and Public Health (CIBERESP), Madrid, Spain
| | - Wifredo Ricart
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain; Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain
| | - Andrés Moya
- Department of Genomics and Health, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain; Biomedical Research Networking Center for Epidemiology and Public Health (CIBERESP), Madrid, Spain; Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Spanish National Research Council (CSIC), Valencia, Spain
| | - Xavier Fernández-Real
- Institute of Mathematics, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Lluís Ramió-Torrentà
- Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain; Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Dr. Josep Trueta University Hospital, Girona, Spain; Girona Neurodegeneration and Neuroinflammation Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain
| | - Reinald Pamplona
- Metabolic Pathophysiology Research Group, Lleida Biomedical Research Institute (IRBLleida)-Universitat de Lleida, Lleida, Spain
| | - Joaquim Sol
- Metabolic Pathophysiology Research Group, Lleida Biomedical Research Institute (IRBLleida)-Universitat de Lleida, Lleida, Spain
| | - Mariona Jové
- Metabolic Pathophysiology Research Group, Lleida Biomedical Research Institute (IRBLleida)-Universitat de Lleida, Lleida, Spain
| | - Manuel Portero-Otin
- Metabolic Pathophysiology Research Group, Lleida Biomedical Research Institute (IRBLleida)-Universitat de Lleida, Lleida, Spain
| | - Rafael Maldonado
- Laboratory of Neuropharmacology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain; Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain.
| | - José Manuel Fernández-Real
- Department of Diabetes, Endocrinology and Nutrition, Dr. Josep Trueta University Hospital, Girona, Spain; Nutrition, Eumetabolism and Health Group, Girona Biomedical Research Institute (IdibGi), Girona, Spain; Biomedical Research Networking Center for Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain; Department of Medical Sciences, Faculty of Medicine, Girona University, Girona, Spain.
|
197
|
Atuegwu NC, Oncken C, Laubenbacher RC, Perez MF, Mortensen EM. Factors Associated with E-Cigarette Use in U.S. Young Adult Never Smokers of Conventional Cigarettes: A Machine Learning Approach. International Journal of Environmental Research and Public Health 2020; 17:ijerph17197271. [PMID: 33027932 PMCID: PMC7579019 DOI: 10.3390/ijerph17197271] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/17/2020] [Revised: 09/24/2020] [Accepted: 09/28/2020] [Indexed: 02/08/2023]
Abstract
E-cigarette use is increasing among young adult never smokers of conventional cigarettes, but the awareness of the factors associated with e-cigarette use in this population is limited. The goal of this work was to use machine learning (ML) algorithms to determine the factors associated with current e-cigarette use among US young adult never cigarette smokers. Young adult (18-34 years) never cigarette smokers from the 2016 and 2017 Behavioral Risk Factor Surveillance System (BRFSS) who reported current or never e-cigarette use were used for the analysis (n = 79,539). Variables associated with current e-cigarette use were selected by two ML algorithms (Boruta and Least absolute shrinkage and selection operator (LASSO)). Odds ratios were calculated to determine the association between e-cigarette use and the variables selected by the ML algorithms, after adjusting for age, gender and race/ethnicity and incorporating the BRFSS complex design. The prevalence of e-cigarette use varied across states. Factors previously reported in the literature, such as age, race/ethnicity, alcohol use, depression, as well as novel factors associated with e-cigarette use, such as disabilities, obesity, history of diabetes and history of arthritis were identified. These results can be used to generate further hypotheses for research, increase public awareness and help provide targeted e-cigarette education.
Affiliation(s)
- Nkiruka C. Atuegwu
- Department of Medicine, University of Connecticut School of Medicine, Farmington, CT 06030, USA
- Correspondence: Tel.: +1-860-0679-2372; Fax: +1-860-0679-8087
| | - Cheryl Oncken
- Department of Medicine, University of Connecticut School of Medicine, Farmington, CT 06030, USA
| | | | - Mario F. Perez
- Department of Medicine, University of Connecticut School of Medicine, Farmington, CT 06030, USA
| | - Eric M. Mortensen
- Department of Medicine, University of Connecticut School of Medicine, Farmington, CT 06030, USA
|
198
|
Wang W, Alzate-Correa D, Alves MJ, Jones M, Garcia AJ, Zhao J, Czeisler CM, Otero JJ. Machine learning-based data analytic approaches for evaluating post-natal mouse respiratory physiological evolution. Respir Physiol Neurobiol 2020; 283:103558. [PMID: 33010456 DOI: 10.1016/j.resp.2020.103558] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2020] [Revised: 09/24/2020] [Accepted: 09/26/2020] [Indexed: 11/16/2022]
Abstract
Respiratory parameters change during post-natal development, but the nature of these changes has not been well described. The advent of commercially available plethysmographic instruments provided improved repeatability of measurements and standardization of measured breathing in mice across laboratories. These technologies thus allowed for exploration of more precise respiratory pattern changes during the post-natal developmental epoch. Current methods to analyze respiratory behavior utilize plethysmography to acquire standing values of frequency, volume and flow at specific time points in murine maturation. These metrics have historically been analyzed independently as a function of time, with no further analysis of how these variables interact with each other, in the context of postnatal maturation, or during blood gas homeostasis. We posit that machine learning workflows can provide deeper physiological understanding of the postnatal development of respiration. In this manuscript, we delineate a machine learning workflow based on the R statistical programming language to examine how variation and relationships of frequency (f) and tidal volume (TV) change with respect to inspiratory and expiratory parameters. Our analytical workflows could successfully predict age and found that the variation and relationships between respiratory metrics shift dynamically with age and during hypercapnic breathing. Thus, our work demonstrates the utility of high dimensional analyses to provide reliable class label predictions using non-invasive respiratory metrics. These approaches may be useful in large-scale phenotyping across development and in disease.
Affiliation(s)
- Wesley Wang
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States
| | - Diego Alzate-Correa
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States
| | - Michele Joana Alves
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States
| | - Mikayla Jones
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States
| | - Alfredo J Garcia
- Department of Emergency Medicine, University of Chicago, Chicago, IL, United States
| | - Jing Zhao
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
| | - Catherine Miriam Czeisler
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States.
| | - José Javier Otero
- Department of Pathology, Division of Neuropathology, The Ohio State University College of Medicine, Columbus, OH, United States.
|
199
|
Lee SH, Han P, Hales RK, Voong KR, Noro K, Sugiyama S, Haller JW, McNutt TR, Lee J. Multi-view radiomics and dosiomics analysis with machine learning for predicting acute-phase weight loss in lung cancer patients treated with radiotherapy. Phys Med Biol 2020; 65:195015. [PMID: 32235058 DOI: 10.1088/1361-6560/ab8531] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 01/02/2023]
Abstract
We propose a multi-view data analysis approach using radiomics and dosiomics (R&D) texture features for predicting acute-phase weight loss (WL) in lung cancer radiotherapy. Baseline weight of 388 patients who underwent intensity modulated radiation therapy (IMRT) was measured between one month prior to and one week after the start of IMRT. Weight change between one week and two months after the commencement of IMRT was analyzed, and dichotomized at 5% WL. Each patient had a planning CT and contours of gross tumor volume (GTV) and esophagus (ESO). A total of 355 features including clinical parameter (CP), GTV and ESO (GTV&ESO) dose-volume histogram (DVH), GTV radiomics, and GTV&ESO dosiomics features were extracted. R&D features were categorized as first- (L1), second- (L2), higher-order (L3) statistics, and three combined groups, L1 + L2, L2 + L3 and L1 + L2 + L3. Multi-view texture analysis was performed to identify optimal R&D input features. In the training set (194 earlier patients), feature selection was performed using Boruta algorithm followed by collinearity removal based on variance inflation factor. Machine-learning models were developed using Laplacian kernel support vector machine (lpSVM), deep neural network (DNN) and their averaged ensemble classifiers. Prediction performance was tested on an independent test set (194 more recent patients), and compared among seven different input conditions: CP-only, DVH-only, R&D-only, DVH + CP, R&D + CP, R&D + DVH and R&D + DVH + CP. Combined GTV L1 + L2 + L3 radiomics and GTV&ESO L3 dosiomics were identified as optimal input features, which achieved the best performance with an ensemble classifier (AUC = 0.710), having statistically significantly higher predictability compared with DVH and/or CP features (p < 0.05). When this performance was compared to that with full R&D-only features which reflect traditional single-view data, there was a statistically significant difference (p < 0.05). 
Using optimized multi-view R&D input features is beneficial for predicting early WL in lung cancer radiotherapy, leading to improved performance compared to using conventional DVH and/or CP features.
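Two steps named in the abstract, dichotomizing weight change at 5% WL and averaging two classifiers into an ensemble, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the use of `>=` at the threshold, and the definition of weight loss as a fraction of baseline weight are assumptions made here for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def weight_loss_labels(
    baseline_kg: Sequence[float],
    followup_kg: Sequence[float],
    threshold: float = 0.05,  # 5% WL cut-off stated in the abstract
) -> List[int]:
    """Return 1 where fractional weight loss reaches the threshold, else 0.

    Assumption: loss is measured relative to baseline weight, and a loss
    exactly equal to the threshold counts as positive.
    """
    labels = []
    for base, follow in zip(baseline_kg, followup_kg):
        loss_fraction = (base - follow) / base
        labels.append(1 if loss_fraction >= threshold else 0)
    return labels


def averaged_ensemble(
    prob_a: Sequence[float], prob_b: Sequence[float]
) -> List[float]:
    """Average the predicted probabilities of two classifiers per patient.

    Here prob_a and prob_b stand in for the outputs of the two models
    (hypothetically, the lpSVM and the DNN).
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same patients")
    return [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]


if __name__ == "__main__":
    # Toy values, not data from the study.
    print(weight_loss_labels([80.0, 70.0, 60.0], [75.0, 69.0, 56.0]))
    print(averaged_ensemble([0.75, 0.25], [0.25, 0.25]))
```

Averaging probabilities rather than hard labels keeps the ensemble output continuous, so it can still be scored with a threshold-free metric such as AUC.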
Affiliation(s)
- Sang Ho Lee
- Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21231, United States of America
|
200
|
Monczak A, McKinney B, Mueller C, Montie EW. What's all that racket! Soundscapes, phenology, and biodiversity in estuaries. PLoS One 2020; 15:e0236874. [PMID: 32881856 PMCID: PMC7470342 DOI: 10.1371/journal.pone.0236874] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/04/2020] [Accepted: 07/15/2020] [Indexed: 12/02/2022] Open
Abstract
There is now clear evidence that climate change affects terrestrial and marine ecosystems and can cause phenological shifts in behavior. Utilizing sound to demonstrate phenology is gaining popularity in terrestrial environments. In marine ecosystems, this technique is yet to be used due to a lack of multiyear datasets. Our study demonstrates soundscape phenology in an estuary using a six-year dataset. In this study, we showed that an increase in acoustic activity of snapping shrimp and certain fish species occurred earlier in years with warmer springs. In addition, we combined passive acoustics and traditional sampling methods (seines) and detected positive relationships between temporal patterns of the soundscape and biodiversity. This study shows that passive acoustics can provide information on the ecological response of estuaries to climate variability.
Affiliation(s)
- Agnieszka Monczak
- Department of Natural Sciences, University of South Carolina Beaufort, Bluffton, South Carolina, United States of America
- Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, United Kingdom
- Bradshaw McKinney
- Department of Natural Sciences, University of South Carolina Beaufort, Bluffton, South Carolina, United States of America
- Claire Mueller
- Department of Natural Sciences, University of South Carolina Beaufort, Bluffton, South Carolina, United States of America
- Eric W. Montie
- Department of Natural Sciences, University of South Carolina Beaufort, Bluffton, South Carolina, United States of America
|