1
Sharma A, Lysenko A, Jia S, Boroevich KA, Tsunoda T. Advances in AI and machine learning for predictive medicine. J Hum Genet 2024:10.1038/s10038-024-01231-y. [PMID: 38424184 DOI: 10.1038/s10038-024-01231-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 03/02/2024]
Abstract
The field of omics, driven by advances in high-throughput sequencing, faces a data explosion. This abundance of data offers unprecedented opportunities for predictive modeling in precision medicine, but also presents formidable challenges in data analysis and interpretation. Traditional machine learning (ML) techniques have been partly successful in generating predictive models for omics analysis but exhibit limitations in handling potential relationships within the data for more accurate prediction. This review explores a revolutionary shift in predictive modeling through the application of deep learning (DL), specifically convolutional neural networks (CNNs). Using transformation methods such as DeepInsight, omics data with independent variables in tabular (table-like, including vector) form can be turned into image-like representations, enabling CNNs to capture latent features effectively. This approach not only enhances predictive power but also leverages transfer learning, reducing computational time, and improving performance. However, integrating CNNs in predictive omics data analysis is not without challenges, including issues related to model interpretability, data heterogeneity, and data size. Addressing these challenges requires a multidisciplinary approach, involving collaborations between ML experts, bioinformatics researchers, biologists, and medical doctors. This review illuminates these complexities and charts a course for future research to unlock the full predictive potential of CNNs in omics data analysis and related fields.
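As an illustration of the tabular-to-image idea summarized in this abstract, the following Python sketch places each feature at a pixel and renders every sample as a small image. It is a simplified stand-in, not the DeepInsight implementation: the function name `tabular_to_images`, the use of plain PCA for the 2-D layout, and the 8 x 8 grid are assumptions made here for illustration.

```python
import numpy as np

def tabular_to_images(X, size=8):
    """Map each row of X (samples x features) to a size x size image.

    Hypothetical helper (not from the paper). Assumes at least two
    samples and two features. Features with similar profiles across
    samples land in nearby pixels.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    # Treat every feature as a point described by its values over samples.
    F = X.T - X.T.mean(axis=0)
    # 2-D PCA coordinates via SVD (assumed stand-in for t-SNE or kernel PCA).
    U, S, _ = np.linalg.svd(F, full_matrices=False)
    coords = U[:, :2] * S[:2]
    # Rescale coordinates to integer pixel indices in [0, size - 1].
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0
    pix = np.rint((coords - lo) / span * (size - 1)).astype(int)
    images = np.zeros((n_samples, size, size))
    counts = np.zeros((size, size))
    # Count how many features fall on each pixel.
    np.add.at(counts, (pix[:, 0], pix[:, 1]), 1)
    # Accumulate feature values per pixel, then average collisions.
    for j in range(n_features):
        images[:, pix[j, 0], pix[j, 1]] += X[:, j]
    images /= np.maximum(counts, 1)
    return images

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 12))   # 6 samples, 12 synthetic "omics" features
imgs = tabular_to_images(X, size=8)
print(imgs.shape)              # (6, 8, 8)
```

The resulting array can be passed to any CNN that accepts single-channel images. Averaging features that collide on one pixel is only one possible choice.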
Affiliation(s)
- Alok Sharma
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Institute for Integrated and Intelligent Systems, Griffith University, Queensland, Australia
- Artem Lysenko
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Shangru Jia
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Keith A Boroevich
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan

2
Zhang Y, Jian X, Xu L, Zhao J, Lu M, Lin Y, Xie L. iTCep: a deep learning framework for identification of T cell epitopes by harnessing fusion features. Front Genet 2023; 14:1141535. [PMID: 37229205 PMCID: PMC10203616 DOI: 10.3389/fgene.2023.1141535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 04/20/2023] [Indexed: 05/27/2023] Open
Abstract
Neoantigens recognized by cytotoxic T cells are effective targets for tumor-specific immune responses in personalized cancer immunotherapy. Quite a few neoantigen identification pipelines and computational strategies have been developed to improve the accuracy of the peptide selection process. However, these methods mainly consider the neoantigen end and ignore the peptide-TCR interaction and the preference of each residue in TCRs, so the filtered peptides often fail to elicit an immune response. Here, we propose a novel encoding approach for peptide-TCR representation. Subsequently, a deep learning framework, namely iTCep, was developed to predict the interactions between peptides and TCRs using fusion features derived from a feature-level fusion strategy. iTCep achieved high predictive performance, with an AUC of up to 0.96 on the testing dataset and above 0.86 on independent datasets, outperforming other predictors. Our results provide strong evidence that iTCep can be a reliable and robust method for predicting the TCR binding specificities of given antigen peptides. iTCep can be accessed through a user-friendly web server at http://biostatistics.online/iTCep/, which supports two prediction modes: peptide-TCR pairs and peptide-only. A stand-alone software program for T cell epitope prediction is also available for convenient installation at https://github.com/kbvstmd/iTCep/.
Affiliation(s)
- Yu Zhang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Xingxing Jian
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
  - Bioinformatics Center, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Linfeng Xu
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, Institute of Bio-Diversity Science, School of Life Sciences, Fudan University, Shanghai, China
- Jingjing Zhao
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Manman Lu
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China
- Yong Lin
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Lu Xie
  - Shanghai-MOST Key Laboratory of Health and Disease Genomics, Institute of Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, China

3
Deep learning in regulatory genomics: from identification to design. Curr Opin Biotechnol 2023; 79:102887. [PMID: 36640453 DOI: 10.1016/j.copbio.2022.102887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 11/12/2022] [Accepted: 12/14/2022] [Indexed: 01/14/2023]
Abstract
Genomics and deep learning are a natural match since both are data-driven fields. Regulatory genomics refers to the study of functional noncoding DNA that regulates gene expression. In recent years, deep learning applications in regulatory genomics have advanced so remarkably that they have changed the rules of the game for computational methods in this field. Here, we review two emerging trends: (i) the modeling of very long input sequences (up to 200 kb), which requires self-matched modularization of the model architecture; and (ii) the balance between model predictability and model interpretability, because the latter is better able to meet biological demands. Finally, we discuss how to employ these two routes to design synthetic regulatory DNA, a promising strategy for optimizing crop agronomic properties.

4
Vegetable biology and breeding in the genomics era. Sci China Life Sci 2023; 66:226-250. [PMID: 36508122 DOI: 10.1007/s11427-022-2248-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 11/17/2022] [Indexed: 12/14/2022]
Abstract
Vegetable crops provide a rich source of essential nutrients for humanity and are of critical economic value to rural societies worldwide. However, genetic studies of vegetable crops have lagged behind those of major food crops, such as rice, wheat and maize, thereby limiting the application of molecular breeding. In the past decades, genome sequencing technologies have been increasingly applied in genetic studies and breeding of vegetables. In this review, we recapitulate recent progress on reference genome construction, population genomics and the exploitation of multi-omics datasets in vegetable crops. These advances have enabled an in-depth understanding of their domestication and evolution, and facilitated the genetic dissection of numerous agronomic traits, which jointly expedite the exploitation of state-of-the-art biotechnologies in vegetable breeding. We further provide perspectives on future directions for vegetable genomics and indicate how the ever-increasing omics data could accelerate genetic and biological studies and breeding in vegetable crops.

5
Yan J, Wang X. Machine learning bridges omics sciences and plant breeding. Trends Plant Sci 2023; 28:199-210. [PMID: 36153276 DOI: 10.1016/j.tplants.2022.08.018] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 03/08/2022] [Revised: 08/15/2022] [Accepted: 08/23/2022] [Indexed: 06/16/2023]
Abstract
Some of the biological knowledge obtained from fundamental research will be implemented in applied plant breeding. To bridge basic research and breeding practice, machine learning (ML) holds great promise to translate biological knowledge and omics data into precision-designed plant breeding. Here, we review ML for multi-omics analysis in plants, including data dimensionality reduction, inference of gene-regulation networks, and gene discovery and prioritization. These applications will facilitate understanding trait regulation mechanisms and identifying target genes potentially applicable to knowledge-driven molecular design breeding. We also highlight applications of deep learning in plant phenomics and ML in genomic selection-assisted breeding, such as various ML algorithms that model the correlations among genotypes (genes), phenotypes (traits), and environments, to ultimately achieve data-driven genomic design breeding.
Affiliation(s)
- Jun Yan
  - National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
  - Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing 100094, China
- Xiangfeng Wang
  - National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100094, China
  - Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing 100094, China

6
Chen S, Duan B, Zhu C, Tang C, Wang S, Gao Y, Fu S, Fan L, Yang Q, Liu Q. Privacy-preserving integration of multiple institutional data for single-cell type identification with scPrivacy. Sci China Life Sci 2022; 66:1183-1195. [PMID: 36543995 PMCID: PMC9771767 DOI: 10.1007/s11427-022-2224-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/14/2022] [Accepted: 09/15/2022] [Indexed: 12/24/2022]
Abstract
The rapid accumulation of large-scale single-cell RNA-seq datasets from multiple institutions presents remarkable opportunities for automatic cell annotation through integrative analyses. However, the privacy issue has long existed but has been ignored: data regulation laws prohibit data transmission across institutions, which limits access to and use of the reference datasets distributed in different institutions globally. To this end, we present scPrivacy, the first generalized prototype for automatic single-cell type identification that facilitates single-cell annotation through data privacy-preserving collaboration. We evaluated scPrivacy on a comprehensive set of publicly available benchmark datasets for single-cell type identification to simulate the scenario in which reference datasets are rapidly generated and distributed across multiple institutions but cannot be integrated directly or exposed to each other because of data privacy regulations. The evaluation demonstrated its effectiveness, time efficiency and robustness for privacy-preserving integration of multiple institutional datasets in single-cell annotation.
Affiliation(s)
- Shaoqi Chen
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Bin Duan
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Chenyu Zhu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Chen Tang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Shuguang Wang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Yicheng Gao
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Shaliu Fu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
- Lixin Fan
  - Department of AI, WeBank, Shenzhen, 518055 China
- Qiang Yang
  - Department of AI, WeBank, Shenzhen, 518055 China
- Qi Liu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
  - Translational Medical Center for Stem Cell Therapy and Institution for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092 China (GRID: grid.24516.34; ISNI: 0000000123704535)
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210 China

7
Naqvi RZ, Siddiqui HA, Mahmood MA, Najeebullah S, Ehsan A, Azhar M, Farooq M, Amin I, Asad S, Mukhtar Z, Mansoor S, Asif M. Smart breeding approaches in post-genomics era for developing climate-resilient food crops. Front Plant Sci 2022; 13:972164. [PMID: 36186056 PMCID: PMC9523482 DOI: 10.3389/fpls.2022.972164] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/17/2022] [Accepted: 08/15/2022] [Indexed: 06/16/2023]
Abstract
Improving crop traits is essential for developing superior crop varieties that can cope with climate change and the associated abiotic and biotic stresses. Climate change-driven global warming can increase insect pest pressure and plant diseases, severely affecting crop production. Genes controlling stress or disease tolerance traits are economically important in crop plants. In this scenario, extensive exploration of available wild, resistant or susceptible germplasm and unraveling of genetic diversity remain vital for breeding programs. The dawn of next-generation sequencing technologies and omics approaches has accelerated plant breeding by providing the genome sequences and transcriptomes of several plants. The availability of decoded plant genomes offers an opportunity to identify candidate genes, quantitative trait loci (QTLs), and molecular markers, and to perform genome-wide association studies, all of which can potentially aid high-throughput marker-assisted breeding. In recent years, genomics has been coupled with marker-assisted breeding to unravel the mechanisms underlying better crop yield and quality. In this review, we discuss aspects of marker-assisted breeding and recent perspectives on breeding approaches in the era of genomics, bioinformatics, high-tech phenomics, genome editing, and new plant breeding technologies for crop improvement. In a nutshell, the smart breeding toolkit of the post-genomics era can steadily help in developing climate-smart future food crops.

8
Yan J, Wang X. Unsupervised and semi-supervised learning: the next frontier in machine learning for plant systems biology. Plant J 2022; 111:1527-1538. [PMID: 35821601 DOI: 10.1111/tpj.15905] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/20/2022] [Revised: 07/05/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
Advances in high-throughput omics technologies are leading plant biology research into the era of big data. Machine learning (ML) performs an important role in plant systems biology because of its excellent performance and wide application in the analysis of big data. However, to achieve ideal performance, supervised ML algorithms require large numbers of labeled samples as training data. In some cases, it is impossible or prohibitively expensive to obtain enough labeled training data; here, the paradigms of unsupervised learning (UL) and semi-supervised learning (SSL) play an indispensable role. In this review, we first introduce the basic concepts of ML techniques, as well as some representative UL and SSL algorithms, including clustering, dimensionality reduction, self-supervised learning (self-SL), positive-unlabeled (PU) learning and transfer learning. We then review recent advances and applications of UL and SSL paradigms in both plant systems biology and plant phenotyping research. Finally, we discuss the limitations and highlight the significance and challenges of UL and SSL strategies in plant systems biology.
Affiliation(s)
- Jun Yan
  - Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, 100094, China
  - National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100094, China
- Xiangfeng Wang
  - Frontiers Science Center for Molecular Design Breeding, China Agricultural University, Beijing, 100094, China
  - National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100094, China

9
Pandi A, Diehl C, Yazdizadeh Kharrazi A, Scholz SA, Bobkova E, Faure L, Nattermann M, Adam D, Chapin N, Foroughijabbari Y, Moritz C, Paczia N, Cortina NS, Faulon JL, Erb TJ. A versatile active learning workflow for optimization of genetic and metabolic networks. Nat Commun 2022; 13:3876. [PMID: 35790733 PMCID: PMC9256728 DOI: 10.1038/s41467-022-31245-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/12/2022] [Accepted: 06/10/2022] [Indexed: 11/13/2022] Open
Abstract
Optimization of biological networks is often limited by wet lab labor and cost, and the lack of convenient computational tools. Here, we describe METIS, a versatile active machine learning workflow with a simple online interface for the data-driven optimization of biological targets with minimal experiments. We demonstrate our workflow for various applications, including cell-free transcription and translation, genetic circuits, and a 27-variable synthetic CO2-fixation cycle (CETCH cycle), improving these systems by between one and two orders of magnitude. For the CETCH cycle, we explore 10^25 conditions with only 1,000 experiments to yield the most efficient CO2-fixation cascade described to date. Beyond optimization, our workflow also quantifies the relative importance of individual factors to the performance of a system, identifying unknown interactions and bottlenecks. Overall, our workflow opens the way for convenient optimization and prototyping of genetic and metabolic networks with customizable adjustments according to user experience, experimental setup, and laboratory facilities.
Optimization of biological networks is often limited by wet lab labor and cost, and the lack of convenient computational tools. Here, aimed at democratization and standardization, the authors describe METIS, a modular and versatile active machine learning workflow with a simple online interface for the optimization of biological target functions with minimal experimental datasets.
Affiliation(s)
- Amir Pandi
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Christoph Diehl
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Scott A Scholz
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Elizaveta Bobkova
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Léon Faure
  - Micalis Institute, INRAE, AgroParisTech, University of Paris-Saclay, Jouy-en-Josas, France
- Maren Nattermann
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- David Adam
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Nils Chapin
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Yeganeh Foroughijabbari
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Charles Moritz
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Nicole Paczia
  - Core Facility for Metabolomics and Small Molecule Mass Spectrometry, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Niña Socorro Cortina
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - LiVeritas Biosciences, Inc., 432N Canal St.; Ste. 20, South San Francisco, CA, 94080, USA
- Jean-Loup Faulon
  - Micalis Institute, INRAE, AgroParisTech, University of Paris-Saclay, Jouy-en-Josas, France
  - Genomique Metabolique, Genoscope, Institut Francois Jacob, CEA, CNRS, Univ Evry, University of Paris-Saclay, Evry, France
  - Manchester Institute of Biotechnology, SYNBIOCHEM center, School of Chemistry, The University of Manchester, Manchester, UK
- Tobias J Erb
  - Department of Biochemistry & Synthetic Metabolism, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - SYNMIKRO Center of Synthetic Microbiology, Marburg, Germany

10
Saha P, Neogy S. Concat_CNN: A Model to Detect COVID-19 from Chest X-ray Images with Deep Learning. SN Comput Sci 2022; 3:305. [PMID: 35647557 PMCID: PMC9125955 DOI: 10.1007/s42979-022-01182-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 04/27/2022] [Indexed: 12/15/2022]
Abstract
COVID-19 is wreaking havoc on lives all around the world and continues to disrupt normal life. Because the number of cases is high, a cost-effective and fast system is required to detect COVID-19 in time to provide the necessary healthcare. Chest X-rays have emerged as an easy and quick way to detect COVID-19, whereas RT-PCR takes time to detect the infection. In this paper we propose a concatenation-based CNN model that detects COVID-19 from chest X-rays. We formulate a multiclass classification problem in which a chest X-ray image is classified as COVID-positive, viral pneumonia, or normal. We used chest X-rays collected from different open sources. To maintain class balance, we took 500 COVID images, 500 normal images, and 500 pneumonia images. We divided our dataset into training, validation, and test sets in a 70:10:20 ratio. We used four CNNs as feature extractors and concatenated their feature maps to improve the efficiency of the network. After training our model for 5 folds, we obtained around 96.31% accuracy, 95.8% precision, 92.99% recall, and 98.02% AUC. We compared our work with state-of-the-art pretrained transfer learning algorithms and other state-of-the-art CNN models reported in different research papers. The proposed model (Concat_CNN) exhibits better accuracy than the state-of-the-art models. We hope our proposed model will help to classify chest X-rays effectively and help medical professionals with their treatment.
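The data handling described in this abstract can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the helper `stratified_split`, the per-class (stratified) splitting, the random seeds, and the 64-dimensional stand-in feature vectors are all assumptions. Only the class sizes (500 each), the 70:10:20 ratio, and the use of four concatenated feature extractors come from the abstract.

```python
import numpy as np

def stratified_split(labels, ratios=(0.7, 0.1, 0.2), seed=0):
    """Return train/validation/test index arrays, split within each class."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    parts = ([], [], [])
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut them by ratio.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(ratios[0] * len(idx)))
        n_val = int(round(ratios[1] * len(idx)))
        parts[0].append(idx[:n_train])
        parts[1].append(idx[n_train:n_train + n_val])
        parts[2].append(idx[n_train + n_val:])
    return tuple(np.concatenate(p) for p in parts)

# 500 images per class (0 = COVID, 1 = normal, 2 = viral pneumonia).
labels = np.repeat([0, 1, 2], 500)
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 1050 150 300

# Feature-level concatenation of four extractors (random stand-ins here,
# each producing an assumed 64-dimensional vector per image).
rng = np.random.default_rng(1)
branch_features = [rng.normal(size=(len(labels), 64)) for _ in range(4)]
fused = np.concatenate(branch_features, axis=1)
print(fused.shape)  # (1500, 256)
```

In a real model the four branches would be CNN backbones and the concatenated vector would feed a classification head; the arithmetic of the split and the shape of the fused features are the only points this sketch is meant to show.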
Affiliation(s)
- Priyanka Saha
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, 700032 India
- Sarmistha Neogy
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, 700032 India

11
Alibakhshi A, Hartke B. Implicitly perturbed Hamiltonian as a class of versatile and general-purpose molecular representations for machine learning. Nat Commun 2022; 13:1245. [PMID: 35273170 PMCID: PMC8913769 DOI: 10.1038/s41467-022-28912-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 02/01/2022] [Indexed: 11/28/2022] Open
Abstract
Unraveling challenging problems by machine learning has recently become a hot topic in many scientific disciplines. For developing rigorous machine-learning models to study problems of interest in molecular sciences, translating molecular structures into quantitative representations suitable as machine-learning inputs plays a central role. Many existing molecular representations, including state-of-the-art ones, although efficient in studying numerous molecular features, are still suboptimal in many challenging cases, as discussed in the present research. The main aim of the present study is to introduce the Implicitly Perturbed Hamiltonian (ImPerHam) as a class of versatile representations for more efficient machine learning of challenging problems in molecular sciences. ImPerHam representations are defined as energy attributes of the molecular Hamiltonian, implicitly perturbed by a number of hypothetical or real arbitrary solvents based on continuum solvation models. We demonstrate the outstanding performance of machine-learning models based on ImPerHam representations for three diverse and challenging cases: predicting inhibition of the CYP450 enzyme, high-precision and transferable evaluation of non-covalent interaction energy of molecular systems, and accurately reproducing solvation free energies for large benchmark sets.
Molecular representations are fundamental tools for machine-learning models. The current work introduces a new set of molecular representations demonstrated to enable accurate predictions of molecular conformational energy and solvation free energy.
Affiliation(s)
- Amin Alibakhshi
  - Theoretical Chemistry, Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstr. 40, Kiel, Germany
- Bernd Hartke
  - Theoretical Chemistry, Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstr. 40, Kiel, Germany

12
Fernie AR, Alseekh S, Liu J, Yan J. Using precision phenotyping to inform de novo domestication. Plant Physiol 2021; 186:1397-1411. [PMID: 33848336 PMCID: PMC8260140 DOI: 10.1093/plphys/kiab160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2021] [Accepted: 03/22/2021] [Indexed: 05/09/2023]
Abstract
An update on the use of precision phenotyping to assess the potential of lesser cultivated species as candidates for de novo domestication or similar development for future agriculture.
Affiliation(s)
- Alisdair R Fernie
  - Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
  - Centre of Plant Systems Biology and Biotechnology, 4000 Plovdiv, Bulgaria
  - Author for communication: (A.R.F.)
- Saleh Alseekh
  - Max Planck Institute for Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
  - Centre of Plant Systems Biology and Biotechnology, 4000 Plovdiv, Bulgaria
- Jie Liu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, 430070 Wuhan, Hubei, China
- Jianbing Yan
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, 430070 Wuhan, Hubei, China

13
Zhang X, Man Y, Zhuang X, Shen J, Zhang Y, Cui Y, Yu M, Xing J, Wang G, Lian N, Hu Z, Ma L, Shen W, Yang S, Xu H, Bian J, Jing Y, Li X, Li R, Mao T, Jiao Y, Sodmergen, Ren H, Lin J. Plant multiscale networks: charting plant connectivity by multi-level analysis and imaging techniques. Sci China Life Sci 2021; 64:1392-1422. [PMID: 33974222 DOI: 10.1007/s11427-020-1910-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/15/2020] [Accepted: 03/04/2021] [Indexed: 12/21/2022]
Abstract
In multicellular and even single-celled organisms, individual components are interconnected at multiscale levels to produce enormously complex biological networks that help these systems maintain homeostasis for development and environmental adaptation. Systems biology studies initially adopted network analysis to explore how relationships between individual components give rise to complex biological processes. Network analysis has been applied to dissect the complex connectivity of mammalian brains across different scales in time and space in The Human Brain Project. In plant science, network analysis has similarly been applied to study the connectivity of plant components at the molecular, subcellular, cellular, organic, and organism levels. Analysis of these multiscale networks contributes to our understanding of how genotype determines phenotype. In this review, we summarized the theoretical framework of plant multiscale networks and introduced studies investigating plant networks by various experimental and computational modalities. We next discussed the currently available analytic methodologies and multi-level imaging techniques used to map multiscale networks in plants. Finally, we highlighted some of the technical challenges and key questions remaining to be addressed in this emerging field.
Affiliation(s)
- Xi Zhang
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Yi Man
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Xiaohong Zhuang
- School of Life Sciences, Centre for Cell & Developmental Biology and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong, 999077, China
- Jinbo Shen
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou, 311300, China
- Yi Zhang
- Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Science, Beijing Normal University, Beijing, 100875, China
- Yaning Cui
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Meng Yu
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Jingjing Xing
- Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, Kaifeng, 457004, China
- Guangchao Wang
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Na Lian
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Zijian Hu
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Lingyu Ma
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Weiwei Shen
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Shunyao Yang
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Huimin Xu
- College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Jiahui Bian
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Yanping Jing
- College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Xiaojuan Li
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Ruili Li
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
- Tonglin Mao
- State Key Laboratory of Plant Physiology and Biochemistry, Department of Plant Sciences, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Yuling Jiao
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and National Center for Plant Gene Research, Beijing, 100101, China
- Sodmergen
- Key Laboratory of Ministry of Education for Cell Proliferation and Differentiation, College of Life Sciences, Peking University, Beijing, 100871, China
- Haiyun Ren
- Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Science, Beijing Normal University, Beijing, 100875, China
- Jinxing Lin
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, 100083, China; College of Biological Sciences and Biotechnology, Beijing Forestry University, Beijing, 100083, China
14
Liu J, Fernie AR, Yan J. Crop breeding - From experience-based selection to precision design. JOURNAL OF PLANT PHYSIOLOGY 2021; 256:153313. [PMID: 33202375 DOI: 10.1016/j.jplph.2020.153313] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/19/2020] [Revised: 10/25/2020] [Accepted: 10/27/2020] [Indexed: 06/11/2023]
Abstract
Crops are the foundation of human society, not only by providing needed nutrition, but also by feeding livestock and serving as raw materials for industry. Cereal crops, which supply most of our calories, have been supporting humans for thousands of years. However, food security now faces many challenges, including growing populations, water shortage, and increased incidence of biotic and abiotic stresses. According to statistical data from the Food and Agriculture Organization of the United Nations (FAO, http://www.fao.org/), the share of people suffering severe food insecurity increased from 7.9% in 2015 to 9.7% in 2019, and the number of people exposed to moderate or severe food insecurity increased by 400 million over the same period. Although there are many ways to cope with these challenges, crop breeding remains the most crucial and direct approach. With the development of molecular genetics, the cloning of genetic variations underlying phenotypes of agricultural importance has become considerably more rapid. As a consequence, breeding methods have evolved from phenotype-based to genome-based selection. In the future, knowledge-driven crop design, which integrates multi-omics data to reveal the connections between genotypes and phenotypes and to build selection models, will undoubtedly become the most efficient way to shape plants, to improve crops, and to ensure food security.
Affiliation(s)
- Jie Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China.
- Alisdair R Fernie
- Department of Molecular Physiology, Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany
- Jianbing Yan
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China.