1
Harrison MC, Ubbelohde EJ, LaBella AL, Opulente DA, Wolters JF, Zhou X, Shen XX, Groenewald M, Hittinger CT, Rokas A. Machine learning enables identification of an alternative yeast galactose utilization pathway. Proc Natl Acad Sci U S A 2024; 121:e2315314121. [PMID: 38669185 PMCID: PMC11067038 DOI: 10.1073/pnas.2315314121] [Received: 09/05/2023] [Accepted: 02/27/2024] [Indexed: 04/28/2024]
Abstract
How genomic differences contribute to phenotypic differences is a major question in biology. The recently characterized genomes, isolation environments, and qualitative patterns of growth on 122 sources and conditions of 1,154 strains from 1,049 fungal species (nearly all known) in the yeast subphylum Saccharomycotina provide a powerful, yet complex, dataset for addressing this question. We used a random forest algorithm trained on these genomic, metabolic, and environmental data to predict growth on several carbon sources with high accuracy. Known structural genes involved in assimilation of these sources and presence/absence patterns of growth in other sources were important features contributing to prediction accuracy. By further examining growth on galactose, we found that it can be predicted with high accuracy from either genomic (92.2%) or growth data (82.6%) but not from isolation environment data (65.6%). Prediction accuracy was even higher (93.3%) when we combined genomic and growth data. After the GALactose utilization genes, the most important feature for predicting growth on galactose was growth on galactitol, raising the hypothesis that several species in two orders, Serinales and Pichiales (containing the emerging pathogen Candida auris and the genus Ogataea, respectively), have an alternative galactose utilization pathway because they lack the GAL genes. Growth and biochemical assays confirmed that several of these species utilize galactose through an alternative oxidoreductive D-galactose pathway, rather than the canonical GAL pathway. Machine learning approaches are powerful for investigating the evolution of the yeast genotype-phenotype map, and their application will uncover novel biology, even in well-studied traits.
Affiliation(s)
- Marie-Claire Harrison
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
- Emily J Ubbelohde
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Abigail L LaBella
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28262
- Dana A Opulente
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
  - Department of Biology, Villanova University, Villanova, PA 19085
- John F Wolters
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Xiaofan Zhou
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Xing-Xing Shen
  - Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
- Chris Todd Hittinger
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Antonis Rokas
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
2
Bai W, Li C, Li W, Wang H, Han X, Wang P, Wang L. Machine learning assists prediction of genes responsible for plant specialized metabolite biosynthesis by integrating multi-omics data. BMC Genomics 2024; 25:418. [PMID: 38679745 PMCID: PMC11057162 DOI: 10.1186/s12864-024-10258-6] [Received: 09/04/2023] [Accepted: 03/26/2024] [Indexed: 05/01/2024]
Abstract
BACKGROUND: Plant specialized (or secondary) metabolites (PSMs), also known as phytochemicals, natural products, or plant constituents, play essential roles in interactions between plants and their environment. Although many research efforts have focused on discovering novel metabolites and their biosynthetic genes, the resolution of metabolic pathways and of the identified biosynthetic genes has been limited by rudimentary analysis approaches and the enormous number of candidate genes.
RESULTS: Here we integrated the state-of-the-art automated machine learning (ML) framework AutoGluon-Tabular with multi-omics data from Arabidopsis to predict genes encoding enzymes involved in the biosynthesis of PSMs, focusing on the three main PSM categories: terpenoids, alkaloids, and phenolics. We found that genomic and proteomic features were the two most important feature categories contributing to model performance. Using only these key features, we built a new model in Arabidopsis, which performed better than models built with more features, including those related to transcriptomics and epigenomics. Finally, the models were validated in maize and tomato; models tested on maize and trained with data from two other species exhibited either equivalent or superior performance to intraspecies predictions.
CONCLUSIONS: Our external validation results in grape and poppy imply that our model is applicable to other species and show enormous potential to improve the prediction of PSM-synthesizing enzymes by including valid data from a wider range of species.
Affiliation(s)
- Wenhui Bai
  - College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Cheng Li
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Wei Li
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Hai Wang
  - National Maize Improvement Center, Key Laboratory of Crop Heterosis and Utilization, Joint Laboratory for International Cooperation in Crop Molecular Breeding, China Agricultural University, Beijing, 100193, China
- Xiaohong Han
  - College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China
- Peipei Wang
  - Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124, China
- Li Wang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
3
Kerwin RE, Hart JE, Fiesel PD, Lou YR, Fan P, Jones AD, Last RL. Tomato root specialized metabolites evolved through gene duplication and regulatory divergence within a biosynthetic gene cluster. Science Advances 2024; 10:eadn3991. [PMID: 38657073 PMCID: PMC11094762 DOI: 10.1126/sciadv.adn3991] [Received: 12/11/2023] [Accepted: 03/20/2024] [Indexed: 04/26/2024]
Abstract
Tremendous plant metabolic diversity arises from phylogenetically restricted specialized metabolic pathways. Specialized metabolites are synthesized in dedicated cells or tissues, with pathway genes sometimes colocalizing in biosynthetic gene clusters (BGCs). However, the mechanisms by which spatial expression patterns arise and the role of BGCs in pathway evolution remain underappreciated. In this study, we investigated the mechanisms driving acylsugar evolution in the Solanaceae. Previously thought to be restricted to glandular trichomes, acylsugars were recently found in cultivated tomato roots. We demonstrated that acylsugars in cultivated tomato roots and trichomes have different sugar cores, identified root-enriched paralogs of trichome acylsugar pathway genes, and characterized a key paralog required for root acylsugar biosynthesis, SlASAT1-LIKE (SlASAT1-L), which is nested within a previously reported trichome acylsugar BGC. Last, we provided evidence that ASAT1-L arose through duplication of its paralog, ASAT1, and was trichome-expressed before acquiring root-specific expression in the Solanum genus. Our results illuminate the genomic context and molecular mechanisms underpinning metabolic diversity in plants.
Affiliation(s)
- Rachel E. Kerwin
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Jaynee E. Hart
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Paul D. Fiesel
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Yann-Ru Lou
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Plant Biology, University of California, Davis, Davis, CA 95616, USA
- Pengxiang Fan
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Horticulture, Zhejiang University, Hangzhou, China
- A. Daniel Jones
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
- Robert L. Last
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
4
Thomas SK, Hoek KV, Ogoti T, Duong H, Angelovici R, Pires JC, Mendoza-Cozatl D, Washburn J, Schenck CA. Halophytes and heavy metals: A multi-omics approach to understand the role of gene and genome duplication in the abiotic stress tolerance of Cakile maritima. American Journal of Botany 2024:e16310. [PMID: 38600732 DOI: 10.1002/ajb2.16310] [Received: 12/18/2023] [Revised: 02/02/2024] [Accepted: 02/05/2024] [Indexed: 04/12/2024]
Abstract
PREMISE: The origin of diversity is a fundamental biological question. Gene duplications are one mechanism that provides raw material for the emergence of novel traits, but evolutionary outcomes depend on which genes are retained and how they become functionalized. Yet, following different duplication types (polyploidy and tandem duplication), the events driving gene retention and functionalization remain poorly understood. Here we used Cakile maritima, a species that is tolerant to salt and heavy metals and shares an ancient whole-genome triplication with closely related salt-sensitive mustard crops (Brassica), as a model to explore the evolution of abiotic stress tolerance following polyploidy.
METHODS: Using a combination of ionomics, free amino acid profiling, and comparative genomics, we characterize aspects of salt stress response in C. maritima and identify retained duplicate genes that have likely enabled adaptation to salt and mild levels of cadmium.
RESULTS: Cakile maritima is tolerant to both cadmium and salt treatments through uptake of cadmium in the roots. Proline constitutes greater than 30% of the free amino acid pool in C. maritima and likely contributes to abiotic stress tolerance. We find duplicated gene families are enriched in metabolic and transport processes and identify key transport genes that may be involved in C. maritima abiotic stress tolerance.
CONCLUSIONS: These findings identify pathways and genes that could be used to enhance plant resilience and provide a putative understanding of the roles of duplication types and retention on the evolution of abiotic stress response.
Affiliation(s)
- Shawn K Thomas
  - Division of Biological Sciences, University of Missouri, Columbia, 65211, MO, USA
  - Bioinformatics and Analytics Core, University of Missouri, Columbia, 65211, MO, USA
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
- Kathryn Vanden Hoek
  - Department of Biochemistry, University of Missouri, Columbia, 65211, MO, USA
- Tasha Ogoti
  - Department of Computer Science, University of Missouri, Columbia, 65211, MO, USA
- Ha Duong
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
  - Department of Biochemistry, University of Missouri, Columbia, 65211, MO, USA
- Ruthie Angelovici
  - Division of Biological Sciences, University of Missouri, Columbia, 65211, MO, USA
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
- J Chris Pires
  - Soil and Crop Sciences, Colorado State University, Fort Collins, 80523-1170, CO, USA
- David Mendoza-Cozatl
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
  - Division of Plant Sciences and Technology, University of Missouri, Columbia, 65211, MO, USA
- Jacob Washburn
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
  - Plant Genetics Research Unit, USDA-ARS, Columbia, 65211, MO, USA
- Craig A Schenck
  - Interdisciplinary Plant Group, University of Missouri, Columbia, 65211, MO, USA
  - Department of Biochemistry, University of Missouri, Columbia, 65211, MO, USA
5
Mehta N, Meng Y, Zare R, Kamenetsky-Goldstein R, Sattely E. A developmental gradient reveals biosynthetic pathways to eukaryotic toxins in monocot geophytes. bioRxiv: The Preprint Server for Biology 2023:2023.05.12.540595. [PMID: 37214939 PMCID: PMC10197729 DOI: 10.1101/2023.05.12.540595] [Indexed: 05/24/2023]
Abstract
Numerous eukaryotic toxins that accumulate in geophytic plants are valuable in the clinic, yet their biosynthetic pathways have remained elusive. A lead example is the >150 Amaryllidaceae alkaloids (AmAs) including galantamine, an FDA-approved treatment for Alzheimer's disease. We show that while AmAs accumulate to high levels in many tissues in daffodils, biosynthesis is localized to nascent, growing tissue at the base of leaves. A similar trend is found for the production of steroidal alkaloids (e.g. cyclopamine) in corn lily. This model of active biosynthesis enabled elucidation of a complete set of biosynthetic genes for the production of AmAs. Taken together, our work sheds light on the developmental and enzymatic logic of diverse alkaloid biosynthesis in daffodil. More broadly, it suggests a paradigm for biosynthesis regulation in monocot geophytes where plants are protected from herbivory through active charging of newly formed cells with eukaryotic toxins that persist as aboveground tissue develops.
Affiliation(s)
- Niraj Mehta
  - Department of Chemistry, Stanford University, Stanford, CA, 94305, USA
- Yifan Meng
  - Department of Chemistry, Stanford University, Stanford, CA, 94305, USA
- Richard Zare
  - Department of Chemistry, Stanford University, Stanford, CA, 94305, USA
- Elizabeth Sattely
  - Department of Chemical Engineering, Stanford University, Stanford, CA, 94305, USA
  - HHMI, Stanford University, Stanford, CA 94305
6
Kisiel A, Krzemińska A, Cembrowska-Lech D, Miller T. Data Science and Plant Metabolomics. Metabolites 2023; 13:metabo13030454. [PMID: 36984894 PMCID: PMC10054611 DOI: 10.3390/metabo13030454] [Received: 02/27/2023] [Revised: 03/16/2023] [Accepted: 03/17/2023] [Indexed: 03/30/2023]
Abstract
The study of plant metabolism is one of the most complex tasks, mainly due to the huge number and structural diversity of metabolites, as well as the fact that they react to changes in the environment and ultimately influence each other. Metabolic profiling is most often carried out using tools that include mass spectrometry (MS), which is one of the most powerful analytical methods. As a result, even the analysis of a single sample can yield thousands of data points. Data science has the potential to revolutionize our understanding of plant metabolism. This review demonstrates that machine learning, network analysis, and statistical modeling are among the techniques being used to analyze large quantities of complex data, providing insights into plant development, growth, and interactions with the environment. These findings could be key to improving crop yields, developing new forms of plant biotechnology, and understanding the relationship between plants and microbes. It is also necessary to consider the constraints that come with data science, such as the quality and availability of data, model complexity, and the need for deep knowledge of the subject in order to achieve reliable outcomes.
Affiliation(s)
- Anna Kisiel
  - Institute of Marine and Environmental Sciences, University of Szczecin, Wąska 13, 71-415 Szczecin, Poland
  - Polish Society of Bioinformatics and Data Science BIODATA, Popiełuszki 4c, 71-214 Szczecin, Poland
- Adrianna Krzemińska
  - Polish Society of Bioinformatics and Data Science BIODATA, Popiełuszki 4c, 71-214 Szczecin, Poland
- Danuta Cembrowska-Lech
  - Polish Society of Bioinformatics and Data Science BIODATA, Popiełuszki 4c, 71-214 Szczecin, Poland
  - Department of Physiology and Biochemistry, Institute of Biology, University of Szczecin, Felczaka 3c, 71-412 Szczecin, Poland
- Tymoteusz Miller
  - Institute of Marine and Environmental Sciences, University of Szczecin, Wąska 13, 71-415 Szczecin, Poland
  - Polish Society of Bioinformatics and Data Science BIODATA, Popiełuszki 4c, 71-214 Szczecin, Poland
7
Depuydt T, De Rybel B, Vandepoele K. Charting plant gene functions in the multi-omics and single-cell era. Trends in Plant Science 2023; 28:283-296. [PMID: 36307271 DOI: 10.1016/j.tplants.2022.09.008] [Received: 05/20/2022] [Revised: 09/09/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Despite the increased access to high-quality plant genome sequences, the set of genes with a known function remains far from complete. With the advent of novel bulk and single-cell omics profiling methods, we are entering a new era where advanced and highly integrative functional annotation strategies are being developed to elucidate the functions of all plant genes. Here, we review different multi-omics approaches to improve functional and regulatory gene characterization and highlight the power of machine learning and network biology to fully exploit the complementary information embedded in different omics layers. Finally, we discuss the potential of emerging single-cell methods and algorithms to further increase the resolution, allowing generation of functional insights about plant biology.
Affiliation(s)
- Thomas Depuydt
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium
- Bert De Rybel
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium
- Klaas Vandepoele
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium; Ghent University, Bioinformatics Institute Ghent, Ghent, Belgium
8
Maglietta R, Saccotelli L, Fanizza C, Telesca V, Dimauro G, Causio S, Lecci R, Federico I, Coppini G, Cipriano G, Carlucci R. Environmental variables and machine learning models to predict cetacean abundance in the Central-eastern Mediterranean Sea. Sci Rep 2023; 13:2600. [PMID: 36788321 PMCID: PMC9929343 DOI: 10.1038/s41598-023-29681-y] [Received: 06/10/2022] [Accepted: 02/08/2023] [Indexed: 02/16/2023]
Abstract
Although the Mediterranean Sea is a crucial hotspot of marine biodiversity, it is threatened by numerous anthropogenic pressures. As flagship species, cetaceans are exposed to those anthropogenic impacts and global changes. Assessing their conservation status becomes strategic to set effective management plans. The aim of this paper is to understand the habitat requirements of cetaceans, exploiting the advantages of a machine-learning framework. To this end, 28 physical and biogeochemical variables were identified as environmental predictors related to the abundance of three odontocete species in the Northern Ionian Sea (Central-eastern Mediterranean Sea). Habitat models were built using sighting data collected for striped dolphins Stenella coeruleoalba, common bottlenose dolphins Tursiops truncatus, and Risso's dolphins Grampus griseus between July 2009 and October 2021. Random Forest was a suitable machine learning algorithm for the cetacean abundance estimation. Nitrate, phytoplankton carbon biomass, temperature, and salinity were the most common influential predictors, followed by latitude, 3D-chlorophyll and density. The habitat models proposed here were validated using sighting data acquired during 2022 in the study area, confirming the good performance of the strategy. This study provides valuable information to support management decisions and conservation measures in the EU marine spatial planning context.
Affiliation(s)
- Rosalia Maglietta
  - Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, National Research Council, via Amendola 122/D-I, 70126, Bari, Italy
- Leonardo Saccotelli
  - Ocean Predictions and Applications Division, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
- Carmelo Fanizza
  - Jonian Dolphin Conservation, viale Virgilio 102, 74121, Taranto, Italy
- Vito Telesca
  - School of Engineering, University of Basilicata, viale Ateneo Lucano 10, 85100, Potenza, Italy
- Giovanni Dimauro
  - Department of Computer Science, University of Bari, via Orabona 4, 70125, Bari, Italy
- Salvatore Causio
  - Ocean Predictions and Applications Division, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
- Rita Lecci
  - Ocean Predictions and Applications Division, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
- Ivan Federico
  - Ocean Predictions and Applications Division, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
- Giovanni Coppini
  - Ocean Predictions and Applications Division, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
- Giulia Cipriano
  - Department of Biology, University of Bari, via Orabona 4, 70125, Bari, Italy
- Roberto Carlucci
  - Department of Biology, University of Bari, via Orabona 4, 70125, Bari, Italy
9
Gomes EN, Patel H, Yuan B, Lyu W, Juliani HR, Wu Q, Simon JE. Successive harvests affect the aromatic and polyphenol profiles of novel catnip (Nepeta cataria L.) cultivars in a genotype-dependent manner. Frontiers in Plant Science 2023; 14:1121582. [PMID: 36866384 PMCID: PMC9971627 DOI: 10.3389/fpls.2023.1121582] [Received: 12/12/2022] [Accepted: 01/25/2023] [Indexed: 06/18/2023]
Abstract
INTRODUCTION: Catnip (Nepeta cataria L.) produces volatile iridoid terpenes, mainly nepetalactones, with strong repellent activity against species of arthropods with commercial and medical importance. Recently, new catnip cultivars CR3 and CR9 have been developed, both characterized by producing copious amounts of nepetalactones. Because catnip is perennial, multiple harvests can be obtained from this specialty crop, but the effects of this practice on the phytochemical profile of the plants have not been extensively studied.
METHODS: In this study we assessed the productivity of biomass, chemical composition of the essential oil and polyphenol accumulation of new catnip cultivars CR3 and CR9 and their hybrid, CR9×CR3, across four successive harvests. The essential oil was obtained by hydrodistillation and the chemical composition was obtained via gas chromatography-mass spectrometry (GC-MS). Individual polyphenols were quantified by ultra-high-performance liquid chromatography-diode-array detection (UHPLC-DAD).
RESULTS: Although the effects on biomass accumulation were independent of genotypes, the aromatic profile and the accumulation of polyphenols had a genotype-dependent response to successive harvests. While cultivar CR3 had its essential oil dominated by E,Z-nepetalactone in all four harvests, cultivar CR9 showed Z,E-nepetalactone as the main component of its aromatic profile during the 1st, 3rd and 4th harvests. At the second harvest, the essential oil of CR9 was mainly composed of caryophyllene oxide and (E)-β-caryophyllene. The same sesquiterpenes represented the majority of the essential oil of the hybrid CR9×CR3 at the 1st and 2nd successive harvests, while Z,E-nepetalactone was the main component at the 3rd and 4th harvests. For CR9 and CR9×CR3, rosmarinic acid and luteolin diglucuronide were at the highest contents at the 1st and 2nd harvest, while for CR3 the peak occurred at the 3rd successive harvest.
DISCUSSION: The results emphasize that agronomic practices can significantly affect the accumulation of specialized metabolites in N. cataria and the genotype-specific interactions may indicate differential ecological adaptations of each cultivar. This is the first report on the effects of successive harvest on these novel catnip genotypes and highlights their potential for the supply of natural products for the pest control and other industries.
Affiliation(s)
- Erik Nunes Gomes
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
  - Federal Agency for Support and Evaluation of Graduate Education (CAPES), Ministry of Education of Brazil, Brasilia, DF, Brazil
- Harna Patel
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
- Bo Yuan
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
- Weiting Lyu
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
  - Department of Medicinal Chemistry, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, United States
- H. Rodolfo Juliani
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
- Qingli Wu
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
  - Department of Medicinal Chemistry, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, United States
  - Center for Agricultural Food Ecosystems, Institute of Food, Nutrition & Health, Rutgers University, New Brunswick, NJ, United States
- James E. Simon
  - New Use Agriculture and Natural Plant Products, Department of Plant Biology, Rutgers University, New Brunswick, NJ, United States
  - Department of Medicinal Chemistry, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, United States
  - Center for Agricultural Food Ecosystems, Institute of Food, Nutrition & Health, Rutgers University, New Brunswick, NJ, United States
10
Ji W, Mandal S, Rezenom YH, McKnight TD. Specialized metabolism by trichome-enriched Rubisco and fatty acid synthase components. Plant Physiology 2023; 191:1199-1213. [PMID: 36264116 PMCID: PMC9922422 DOI: 10.1093/plphys/kiac487] [Received: 07/12/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Acylsugars, specialized metabolites with defense activities, are secreted by trichomes of many solanaceous plants. Several acylsugar metabolic genes (AMGs) remain unknown. We previously reported multiple candidate AMGs. Here, using multiple approaches, we characterized additional AMGs. First, we identified differentially expressed genes between high- and low-acylsugar-producing F2 plants derived from a cross between cultivated tomato (Solanum lycopersicum) and a wild relative (Solanum pennellii), which produce acylsugars that are ∼1% and ∼20% of leaf dry weight, respectively. Expression levels of many known and candidate AMGs positively correlated with acylsugar amounts in F2 individuals. Next, we identified lycopersicum-pennellii putative orthologs with higher nonsynonymous to synonymous substitutions. These analyses identified four candidate genes, three of which showed enriched expression in stem trichomes compared to underlying tissues (shaved stems). Virus-induced gene silencing confirmed two candidates, Sopen05g009610 [beta-ketoacyl-(acyl-carrier-protein) reductase; fatty acid synthase component] and Sopen07g006810 (Rubisco small subunit), as AMGs. Phylogenetic analysis indicated that Sopen05g009610 is distinct from specialized metabolic cytosolic reductases but closely related to two capsaicinoid biosynthetic reductases, suggesting an evolutionary relationship between acylsugar and capsaicinoid biosynthesis. Analysis of publicly available datasets revealed enriched expression of Sopen05g009610 orthologs in trichomes of several acylsugar-producing species. Similarly, orthologs of Sopen07g006810 were identified as solanaceous trichome-enriched members, which form a phylogenetic clade distinct from those of mesophyll-expressed "regular" Rubisco small subunits. Furthermore, δ13C analyses indicated recycling of metabolic CO2 into acylsugars by Sopen07g006810 and showed how trichomes support high levels of specialized metabolite production. These findings have implications for genetic manipulation of trichome-specialized metabolism in solanaceous crops.
Collapse
Affiliation(s)
- Yohannes H Rezenom
- Department of Chemistry, Texas A&M University, College Station, Texas 77843, USA
11
Liu X, Zhang P, Zhao Q, Huang AC. Making small molecules in plants: A chassis for synthetic biology-based production of plant natural products. Journal of Integrative Plant Biology 2023; 65:417-443. [PMID: 35852486 DOI: 10.1111/jipb.13330] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/29/2022] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Abstract
Plant natural products have been extensively exploited in food, medicine, flavor, cosmetic, renewable fuel, and other industrial sectors. Synthetic biology has recently emerged as a promising means for the cost-effective and sustainable production of natural products. Compared with engineering microbes for the production of plant natural products, the potential of plants as chassis for producing these compounds is underestimated, largely due to challenges encountered in engineering plants. Knowledge in plant engineering is instrumental for enabling the effective and efficient production of valuable phytochemicals in plants, and also paves the way for a more sustainable future agriculture. In this manuscript, we briefly recap the biosynthesis of plant natural products, focusing primarily on industrially important terpenoids, alkaloids, and phenylpropanoids. We further summarize the plant hosts and strategies that have been used to engineer the production of natural products. The challenges and opportunities of using plant synthetic biology to achieve rapid and scalable production of high-value plant natural products are also discussed.
Affiliation(s)
- Xinyu Liu
- Key Laboratory of Molecular Design for Plant Cell Factory of Guangdong Higher Education Institutes, Department of Biology, School of Life Sciences, SUSTech-PKU Institute of Plant and Food Science, Southern University of Science and Technology, Shenzhen, 518055, China
- Peijun Zhang
- Key Laboratory of Molecular Design for Plant Cell Factory of Guangdong Higher Education Institutes, Department of Biology, School of Life Sciences, SUSTech-PKU Institute of Plant and Food Science, Southern University of Science and Technology, Shenzhen, 518055, China
- Qiao Zhao
- Shenzhen Institutes of Advanced Technology (SIAT), the Chinese Academy of Sciences, Shenzhen, 518055, China
- Ancheng C Huang
- Key Laboratory of Molecular Design for Plant Cell Factory of Guangdong Higher Education Institutes, Department of Biology, School of Life Sciences, SUSTech-PKU Institute of Plant and Food Science, Southern University of Science and Technology, Shenzhen, 518055, China
12
Current status and future prospects in cannabinoid production through in vitro culture and synthetic biology. Biotechnol Adv 2023; 62:108074. [PMID: 36481387 DOI: 10.1016/j.biotechadv.2022.108074] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 04/28/2022] [Revised: 10/27/2022] [Accepted: 11/30/2022] [Indexed: 12/12/2022]
Abstract
For centuries, cannabis has been a rich source of fibrous, pharmaceutical, and recreational ingredients. Phytocannabinoids are the most important and well-known class of cannabis-derived secondary metabolites and display a broad range of health-promoting and psychoactive effects. The unique characteristics of phytocannabinoids (e.g., metabolite likeness, multi-target spectrum, and safety profile) have resulted in the development and approval of several cannabis-derived drugs. While most work has focused on the two main cannabinoids produced in the plant, over 150 unique cannabinoids have been identified. To meet the rapidly growing phytocannabinoid demand, particularly for many of the minor cannabinoids found in low amounts in planta, biotechnology offers promising alternatives for biosynthesis through in vitro culture and heterologous systems. In recent years, the engineered production of phytocannabinoids has been obtained through synthetic biology both in vitro (cell suspension culture and hairy root culture) and in heterologous systems. However, there are still several bottlenecks (e.g., the complexity of the cannabinoid biosynthetic pathway and optimizing the bioprocess) hampering biosynthesis and scale-up of the biotechnological process. The current study reviews recent advances related to in vitro culture-mediated cannabinoid production. Additionally, an integrated overview of promising conventional approaches to cannabinoid production is presented. Progress toward cannabinoid production in heterologous systems and possible avenues for avoiding autotoxicity are also reviewed and highlighted. Machine learning is then introduced as a powerful tool to model and optimize bioprocesses related to cannabinoid production. Finally, regulation and manipulation of the cannabinoid biosynthetic pathway using CRISPR-mediated metabolic engineering are discussed.
13
Li J, Chroumpi T, Garrigues S, Kun RS, Meng J, Salazar-Cerezo S, Aguilar-Pontes MV, Zhang Y, Tejomurthula S, Lipzen A, Ng V, Clendinen CS, Tolić N, Grigoriev IV, Tsang A, Mäkelä MR, Snel B, Peng M, de Vries RP. The Sugar Metabolic Model of Aspergillus niger Can Only Be Reliably Transferred to Fungi of Its Phylum. J Fungi (Basel) 2022; 8:jof8121315. [PMID: 36547648 PMCID: PMC9781776 DOI: 10.3390/jof8121315] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/10/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022] Open
Abstract
Fungi play a critical role in the global carbon cycle by degrading plant polysaccharides to small sugars and metabolizing them as carbon and energy sources. We mapped the well-established sugar metabolic network of Aspergillus niger to five taxonomically distant species (Aspergillus nidulans, Penicillium subrubescens, Trichoderma reesei, Phanerochaete chrysosporium and Dichomitus squalens) using an orthology-based approach. The diversity of sugar metabolism correlates well with the taxonomic distance of the fungi. The pathways are highly conserved between the three studied Eurotiomycetes (A. niger, A. nidulans, P. subrubescens). A higher level of diversity was observed between T. reesei and A. niger, and even more so for the two Basidiomycetes. These results were confirmed by integrative analysis of transcriptome, proteome and metabolome, as well as growth profiles of the fungi growing on the corresponding sugars. In conclusion, the establishment of sugar pathway models in different fungi revealed the diversity of fungal sugar conversion and provided a valuable resource for the community, which would facilitate rational metabolic engineering of these fungi as microbial cell factories.
Affiliation(s)
- Jiajia Li
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Tania Chroumpi
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Sandra Garrigues
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Roland S. Kun
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Jiali Meng
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Sonia Salazar-Cerezo
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Yu Zhang
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Sravanthi Tejomurthula
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Anna Lipzen
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Vivian Ng
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Chaevien S. Clendinen
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Nikola Tolić
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Igor V. Grigoriev
- USA Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94598, USA
- Adrian Tsang
- Department of Biology, Concordia University, 7141 Sherbrooke Street West, Montreal, QC H4B 1R6, Canada
- Miia R. Mäkelä
- Department of Microbiology, University of Helsinki, Viikinkaari 9, 00014 Helsinki, Finland
- Berend Snel
- Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Mao Peng
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Ronald P. de Vries
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute & Fungal Molecular Physiology, Utrecht University, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
14
Ng JWX, Chua SK, Mutwil M. Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana. Frontiers in Plant Science 2022; 13:944992. [PMID: 36212273 PMCID: PMC9539877 DOI: 10.3389/fpls.2022.944992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 08/24/2022] [Indexed: 06/16/2023]
Abstract
Understanding how the different cellular components are working together to form a living cell requires multidisciplinary approaches combining molecular and computational biology. Machine learning shows great potential in life sciences, as it can find novel relationships between biological features. Here, we constructed a dataset of 11,801 gene features for 31,522 Arabidopsis thaliana genes and developed a machine learning workflow to identify linked features. The detected linked features are visualised as a Feature Important Network (FIN), which can be mined to reveal a variety of novel biological insights pertaining to gene function. We demonstrate how FIN can be used to generate novel insights into gene function. To make this network easily accessible to the scientific community, we present the FINder database, available at finder.plant.tools.
15
Yan S, Bhawal R, Yin Z, Thannhauser TW, Zhang S. Recent advances in proteomics and metabolomics in plants. Molecular Horticulture 2022; 2:17. [PMID: 37789425 PMCID: PMC10514990 DOI: 10.1186/s43897-022-00038-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/24/2022] [Accepted: 06/20/2022] [Indexed: 10/05/2023]
Abstract
Over the past decade, systems biology and plant-omics have increasingly become the mainstream in plant biology research. New developments in mass spectrometry and bioinformatics tools, and methodological schema to integrate multi-omics data, have leveraged recent advances in proteomics and metabolomics. This progress is driving a rapid evolution in the field of plant research, greatly facilitating our understanding of the mechanistic aspects of plant metabolism and the interactions of plants with their external environment. Here, we review the recent progress in MS-based proteomics and metabolomics tools and workflows with a special focus on their applications to plant biology research using several case studies related to mechanistic understanding of stress response, gene/protein function characterization, metabolic and signaling pathways exploration, and natural product discovery. We also present a projection concerning future perspectives in MS-based proteomics and metabolomics development, including their applications to and challenges for systems biology. This review is intended to provide readers with an overview of how advanced MS technology and integrated application of proteomics and metabolomics can be used to advance plant systems biology research.
Affiliation(s)
- Shijuan Yan
- Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Ruchika Bhawal
- Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA
- Zhibin Yin
- Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Sheng Zhang
- Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA.
16
Schenck CA, Busta L. Using interdisciplinary, phylogeny-guided approaches to understand the evolution of plant metabolism. Plant Molecular Biology 2022; 109:355-367. [PMID: 34816350 DOI: 10.1007/s11103-021-01220-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
To cope with relentless environmental pressures, plants produce an arsenal of structurally diverse chemicals, often called specialized metabolites. These lineage-specific compounds are derived from the simple building blocks made by ubiquitous core metabolic pathways. Although the structures of many specialized metabolites are known, the underlying metabolic pathways and the evolutionary events that have shaped the plant chemical diversity landscape are only beginning to be understood. However, with the advent of multi-omics data sets and the relative ease of studying pathways in previously intractable non-model species, plant specialized metabolic pathways are now being systematically identified. These large datasets also provide a foundation for comparative, phylogeny-guided studies of plant metabolism. Comparisons of metabolic traits and features like chemical abundances, enzyme activities, or gene sequences from phylogenetically diverse plants provide insights into how metabolic pathways evolved. This review highlights the power of studying evolution through the lens of comparative biochemistry, particularly how placing metabolism into a phylogenetic context can help a researcher identify the metabolic innovations enabling the evolution of structurally diverse plant metabolites.
Affiliation(s)
- Craig A Schenck
- Department of Biochemistry, University of Missouri, Columbia, MO, USA.
- Lucas Busta
- Department of Chemistry and Biochemistry, University of Minnesota Duluth, Duluth, MN, USA
17
Using genome and transcriptome analysis to elucidate biosynthetic pathways. Curr Opin Biotechnol 2022; 75:102708. [DOI: 10.1016/j.copbio.2022.102708] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/30/2021] [Revised: 02/19/2022] [Accepted: 02/23/2022] [Indexed: 12/21/2022]
18
Han X, Tsuda K. Evolutionary footprint of plant immunity. Current Opinion in Plant Biology 2022; 67:102209. [PMID: 35430538 DOI: 10.1016/j.pbi.2022.102209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/31/2021] [Revised: 02/24/2022] [Accepted: 03/04/2022] [Indexed: 06/14/2023]
Abstract
There are pieces of evidence from genomic footprints and fossil records indicating that plants have co-evolved with microbes after terrestrialization for more than 407 million years. Therefore, to truly comprehend plant evolution, we need to understand the co-evolutionary process and history between plants and microbes. Recent developments in genomes and transcriptomes of a vast number of plant species as well as microbes have greatly expanded our knowledge of the evolution of the plant immune system. In this review, we summarize recent advances in the co-evolution between plants and microbes with emphasis on the plant side and point out future research needed for understanding plant-microbial co-evolution. Knowledge of the evolution and variation of the plant immune system will better equip us on designing crops with boosted performance in agricultural fields.
Affiliation(s)
- Xiaowei Han
- State Key Laboratory of Agricultural Microbiology, Hubei Hongshan Laboratory, Hubei Key Lab of Plant Pathology, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kenichi Tsuda
- State Key Laboratory of Agricultural Microbiology, Hubei Hongshan Laboratory, Hubei Key Lab of Plant Pathology, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.
19
Mukherjee D, Saha D, Acharya D, Mukherjee A, Ghosh TC. Interplay between gene expression and gene architecture as a consequence of gene and genome duplications: evidence from metabolic genes of Arabidopsis thaliana. Physiology and Molecular Biology of Plants: An International Journal of Functional Plant Biology 2022; 28:1091-1108. [PMID: 35722515 PMCID: PMC9203644 DOI: 10.1007/s12298-022-01188-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 05/16/2022] [Accepted: 05/18/2022] [Indexed: 05/03/2023]
Abstract
Gene and genome duplications have been widespread during the evolution of flowering plants, resulting in increased biological complexity as well as genome plasticity that helps species adapt to changing environments. Duplicated genes with higher evolutionary rates can act as a mechanism for generating novel functions in secondary metabolism. In this study, we explored duplication as a potential factor governing the expression heterogeneity and gene architecture of Primary Metabolic Genes (PMGs) and Secondary Metabolic Genes (SMGs) of Arabidopsis thaliana. It is remarkable that different types of duplication processes controlled gene expression and tissue specificity differently in PMGs and SMGs. A complex relationship exists between gene architecture and expression patterns of primary and secondary metabolic genes. Our study shows that expression heterogeneity and gene structure variation of primary and secondary metabolism in Arabidopsis thaliana are partly the result of duplication events of different origins. Our study suggests that duplication has a differential effect on PMGs and SMGs regarding expression pattern by controlling gene structure, epigenetic modifications, multifunctionality and subcellular compartmentalization. This study provides an insight into the evolution of metabolism in plants in the light of gene and genome scale duplication. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-022-01188-2.
Affiliation(s)
- Dola Mukherjee
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Deeya Saha
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Debarun Acharya
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Ashutosh Mukherjee
- Department of Botany, Vivekananda College, 269, Diamond Harbour Road, Thakurpukur, Kolkata, West Bengal 700063 India
- Tapash Chandra Ghosh
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
20
Wang P, Schumacher AM, Shiu SH. Computational prediction of plant metabolic pathways. Current Opinion in Plant Biology 2022; 66:102171. [PMID: 35078130 DOI: 10.1016/j.pbi.2021.102171] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 12/07/2021] [Accepted: 12/18/2021] [Indexed: 06/14/2023]
Abstract
Uncovering genes encoding enzymes responsible for the biosynthesis of diverse plant metabolites is essential for metabolic engineering and production of plant metabolite-derived medicine. With the availability of multi-omics data for an ever-increasing number of plant species and the development of computational approaches, the metabolic pathways of many important plant compounds can be predicted, complementing a more traditional genetic and/or biochemical approach. Here, we summarize recent progress in predicting plant metabolic pathways using genome, transcriptome, proteome, interactome, and/or metabolome data, and the utility of integrating these data with machine learning to further improve metabolic pathway predictions.
Affiliation(s)
- Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.
- Ally M Schumacher
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA; Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
21
Fiesel PD, Parks HM, Last RL, Barry CS. Fruity, sticky, stinky, spicy, bitter, addictive, and deadly: evolutionary signatures of metabolic complexity in the Solanaceae. Nat Prod Rep 2022; 39:1438-1464. [PMID: 35332352 DOI: 10.1039/d2np00003b] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
Covering: 2000-2022
Plants collectively synthesize a huge repertoire of metabolites. General metabolites, also referred to as primary metabolites, are conserved across the plant kingdom and are required for processes essential to growth and development. These include amino acids, sugars, lipids, and organic acids. In contrast, specialized metabolites, historically termed secondary metabolites, are structurally diverse, exhibit lineage-specific distribution and provide selective advantage to host species to facilitate reproduction and environmental adaptation. Due to their potent bioactivities, plant specialized metabolites attract considerable attention for use as flavorings, fragrances, pharmaceuticals, and bio-pesticides. The Solanaceae (Nightshade family) consists of approximately 2700 species and includes crops of significant economic, cultural, and scientific importance: these include potato, tomato, pepper, eggplant, tobacco, and petunia. The Solanaceae has emerged as a model family for studying the biochemical evolution of plant specialized metabolism and multiple examples exist of lineage-specific metabolites that influence the senses and physiology of commensal and harmful organisms, including humans. These include alcohols, phenylpropanoids, and carotenoids that contribute to fruit aroma and color in tomato (fruity), glandular trichome-derived terpenoids and acylsugars that contribute to plant defense (stinky & sticky, respectively), capsaicinoids in chilli-peppers that influence seed dispersal (spicy), and steroidal glycoalkaloids (bitter) from Solanum, nicotine (addictive) from tobacco, as well as tropane alkaloids (deadly) from Deadly Nightshade that deter herbivory. Advances in genomics and metabolomics, coupled with the adoption of comparative phylogenetic approaches, resulted in deeper knowledge of the biosynthesis and evolution of these metabolites. This review highlights recent progress in this area and outlines opportunities for, and challenges of, developing a more comprehensive understanding of Solanaceae metabolism.
Affiliation(s)
- Paul D Fiesel
- Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Hannah M Parks
- Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Robert L Last
- Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824, USA; Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
- Cornelius S Barry
- Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA.
22
The ease and complexity of identifying and using specialized metabolites for crop engineering. Emerg Top Life Sci 2022; 6:153-162. [PMID: 35302160 PMCID: PMC9023015 DOI: 10.1042/etls20210248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 12/11/2022]
Abstract
Plants produce a broad variety of specialized metabolites with distinct biological activities and potential applications. Despite this potential, most biosynthetic pathways governing specialized metabolite production remain largely unresolved across the plant kingdom. The rapid advancement of genetics and biochemical tools has enhanced our ability to identify plant specialized metabolic pathways. Further advancements in transgenic technology and synthetic biology approaches have extended this to a desire to design new pathways or move existing pathways into new systems to address long-running difficulties in crop systems. This includes improving abiotic and biotic stress resistance, boosting nutritional content, etc. In this review, we assess the potential and limitations for (1) identifying specialized metabolic pathways in plants with multi-omics tools and (2) using these enzymes in synthetic biology or crop engineering. The goal of these topics is to highlight areas of research that may need further investment to enhance the successful application of synthetic biology for exploiting the myriad of specialized metabolic pathways.
23
Zhou X, Liu Z. Unlocking plant metabolic diversity: A (pan)-genomic view. Plant Communications 2022; 3:100300. [PMID: 35529944 PMCID: PMC9073316 DOI: 10.1016/j.xplc.2022.100300] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 10/13/2021] [Revised: 12/12/2021] [Accepted: 01/13/2022] [Indexed: 05/28/2023]
Abstract
Plants produce a remarkable diversity of structurally and functionally diverse natural chemicals that serve as adaptive compounds throughout their life cycles. However, unlocking this metabolic diversity is significantly impeded by the size, complexity, and abundant repetitive elements of typical plant genomes. As genome sequencing becomes routine, we anticipate that links between metabolic diversity and genetic variation will be strengthened. In addition, an ever-increasing number of plant genomes have revealed that biosynthetic gene clusters are not only a hallmark of microbes and fungi; gene clusters for various classes of compounds have also been found in plants, and many are associated with important agronomic traits. We present recent examples of plant metabolic diversification that have been discovered through the exploration and exploitation of various genomic and pan-genomic data. We also draw attention to the fundamental genomic and pan-genomic basis of plant chemodiversity and discuss challenges and future perspectives for investigating metabolic diversity in the coming pan-genomics era.
Affiliation(s)
- Xuan Zhou
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Collaborative Innovation Center of Agri-Seeds, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Zhenhua Liu
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Collaborative Innovation Center of Agri-Seeds, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
24
Lopez-Nieves S, El-Azaz J, Men Y, Holland CK, Feng T, Brockington SF, Jez JM, Maeda HA. Two independently evolved natural mutations additively deregulate TyrA enzymes and boost tyrosine production in planta. The Plant Journal: For Cell and Molecular Biology 2022; 109:844-855. [PMID: 34807484 DOI: 10.1111/tpj.15597] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/17/2021] [Revised: 10/29/2021] [Accepted: 11/15/2021] [Indexed: 06/13/2023]
Abstract
l-Tyrosine is an essential amino acid for protein synthesis and is also used in plants to synthesize diverse natural products. Plants primarily synthesize tyrosine via TyrA arogenate dehydrogenase (TyrAa or ADH), which are typically strongly feedback inhibited by tyrosine. However, two plant lineages, Fabaceae (legumes) and Caryophyllales, have TyrA enzymes that exhibit relaxed sensitivity to tyrosine inhibition and are associated with elevated production of tyrosine-derived compounds, such as betalain pigments uniquely produced in core Caryophyllales. Although we previously showed that a single D222N substitution is primarily responsible for the deregulation of legume TyrAs, it is unknown when and how the deregulated Caryophyllales TyrA emerged. Here, through phylogeny-guided TyrA structure-function analysis, we found that functionally deregulated TyrAs evolved early in the core Caryophyllales before the origin of betalains, where the E208D amino acid substitution in the active site, which is at a different and opposite location from D222N found in legume TyrAs, played a key role in the TyrA functionalization. Unlike legumes, however, additional substitutions on non-active site residues further contributed to the deregulation of TyrAs in Caryophyllales. The introduction of a mutation analogous to E208D partially deregulated tyrosine-sensitive TyrAs, such as Arabidopsis TyrA2 (AtTyrA2). Moreover, the combined introduction of D222N and E208D additively deregulated AtTyrA2, for which the expression in Nicotiana benthamiana led to highly elevated accumulation of tyrosine in planta. The present study demonstrates that phylogeny-guided characterization of key residues underlying primary metabolic innovations can provide powerful tools to boost the production of essential plant natural products.
Affiliation(s)
- Samuel Lopez-Nieves
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, UK
| | - Jorge El-Azaz
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Yusen Men
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Cynthia K Holland
- Department of Biology, Williams College, Williamstown, MA, 01267, USA
| | - Tao Feng
- Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, UK
| | | | - Joseph M Jez
- Department of Biology, Washington University in St Louis, St Louis, MO, 63130, USA
| | - Hiroshi A Maeda
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
| |
|
25
|
Suttiyut T, Auber RP, Ghaste M, Kane CN, McAdam SAM, Wisecaver JH, Widhalm JR. Integrative analysis of the shikonin metabolic network identifies new gene connections and reveals evolutionary insight into shikonin biosynthesis. HORTICULTURE RESEARCH 2022; 9:uhab087. [PMID: 35048120 PMCID: PMC8969065 DOI: 10.1093/hr/uhab087] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Accepted: 12/07/2021] [Indexed: 05/28/2023]
Abstract
Plant specialized 1,4-naphthoquinones present a remarkable case of convergent evolution. Species across multiple discrete orders of vascular plants produce diverse 1,4-naphthoquinones via one of several pathways using different metabolic precursors. Evolution of these pathways was preceded by events of metabolic innovation and many appear to share connections with biosynthesis of photosynthetic or respiratory quinones. Here, we sought to shed light on the metabolic connections linking shikonin biosynthesis with its precursor pathways and on the origins of shikonin metabolic genes. Downregulation of Lithospermum erythrorhizon geranyl diphosphate synthase (LeGPPS), recently shown to have been recruited from a cytoplasmic farnesyl diphosphate synthase (FPPS), resulted in reduced shikonin production and a decrease in expression of mevalonic acid and phenylpropanoid pathway genes. Next, we used LeGPPS and other known shikonin pathway genes to build a coexpression network model for identifying new gene connections to shikonin metabolism. Integrative in silico analyses of network genes revealed candidates for biochemical steps in the shikonin pathway arising from Boraginales-specific gene family expansion. Multiple genes in the shikonin coexpression network were also discovered to have originated from duplication of ubiquinone pathway genes. Taken together, our study provides evidence for transcriptional crosstalk between shikonin biosynthesis and its precursor pathways, identifies several shikonin pathway gene candidates and their evolutionary histories, and establishes additional evolutionary links between shikonin and ubiquinone metabolism. Moreover, we demonstrate that global coexpression analysis using limited transcriptomic data obtained from targeted experiments is effective for identifying gene connections within a defined metabolic network.
Affiliation(s)
- Thiti Suttiyut
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, Indiana, 47907, USA
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
| | - Robert P Auber
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
- Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907, USA
| | - Manoj Ghaste
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, Indiana, 47907, USA
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
| | - Cade N Kane
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana 47907, USA
| | - Scott A M McAdam
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana 47907, USA
| | - Jennifer H Wisecaver
- Purdue Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
- Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907, USA
| | - Joshua R Widhalm
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907, USA
| |
|
26
|
Peng M, de Vries RP. Machine learning prediction of novel pectinolytic enzymes in Aspergillus niger through integrating heterogeneous (post-) genomics data. Microb Genom 2021; 7. [PMID: 34874247 PMCID: PMC8767319 DOI: 10.1099/mgen.0.000674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Pectinolytic enzymes are a variety of enzymes involved in breaking down pectin, a complex and abundant plant cell-wall polysaccharide. In nature, pectinolytic enzymes play an essential role in allowing bacteria and fungi to depolymerize and utilize pectin. In addition, pectinases have been widely applied in various industries, such as the food, wine, textile, paper and pulp industries. Due to their important biological function and increasing industrial potential, discovery of novel pectinolytic enzymes has received global interest. However, traditional enzyme characterization relies heavily on biochemical experiments, which are time consuming, laborious and expensive. To accelerate identification of novel pectinolytic enzymes, an automatic approach is needed. We developed a machine learning (ML) approach for predicting pectinases in the industrial workhorse fungus, Aspergillus niger. The prediction integrated a diverse range of features, including evolutionary profile, gene expression, transcriptional regulation and biochemical characteristics. Results on both the training and the independent testing dataset showed that our method achieved over 90 % accuracy, and recalled over 60 % of pectinolytic genes. Application of the ML model on the A. niger genome led to the identification of 83 pectinases, covering both previously described pectinases and novel pectinases that do not belong to any known pectinolytic enzyme family. Our study demonstrated the tremendous potential of ML in discovery of new industrial enzymes through integrating heterogeneous (post-) genomics data.
Affiliation(s)
- Mao Peng
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Utrecht, The Netherlands
- *Correspondence: Mao Peng
| | - Ronald P. de Vries
- Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Utrecht, The Netherlands
| |
|
27
|
Wang H, Guo H, Wang N, Huo YX. Toward the Heterologous Biosynthesis of Plant Natural Products: Gene Discovery and Characterization. ACS Synth Biol 2021; 10:2784-2795. [PMID: 34757715 DOI: 10.1021/acssynbio.1c00315] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
Plant natural products (PNPs) represent a vast and diverse group of natural products, which have wide applications such as emulsifiers in cosmetics, sweeteners in foods, and active ingredients in medicines. Large-scale production of certain PNPs (e.g., artemisinin, taxol) has been implemented by reconstruction of biosynthetic pathways in heterologous hosts. However, unknown biosynthetic pathways greatly restrict wide applications of heterologous production of PNPs of interest. With the rapid development of sequencing and multiomics analysis technologies, huge amounts of omics data, i.e., genomics, transcriptomics, and proteomics, have been deposited in public databases, which is a precious resource for identification of the unknown biosynthetic pathway of PNPs. Herein, we have enumerated the approaches which have been widely used to screen candidate genes involved in the biosynthesis of PNPs of interest. We also discuss recent developments in the characterization of putative genes and elucidation of the complete biosynthetic pathway in heterologous hosts.
Affiliation(s)
- Huiyan Wang
- School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081 Beijing, China
| | - Hao Guo
- School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081 Beijing, China
| | - Ning Wang
- School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081 Beijing, China
| | - Yi-Xin Huo
- School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081 Beijing, China
- Tobacco Research Institute, Chinese Academy of Agricultural Sciences, Qingdao 266101, China
| |
|
28
|
Beniddir MA, Kang KB, Genta-Jouve G, Huber F, Rogers S, van der Hooft JJJ. Advances in decomposing complex metabolite mixtures using substructure- and network-based computational metabolomics approaches. Nat Prod Rep 2021; 38:1967-1993. [PMID: 34821250 PMCID: PMC8597898 DOI: 10.1039/d1np00023c] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2021] [Indexed: 12/13/2022]
Abstract
Covering: up to the end of 2020. Recently introduced computational metabolome mining tools have started to positively impact the chemical and biological interpretation of untargeted metabolomics analyses. We believe that these current advances make it possible to start decomposing complex metabolite mixtures into substructure and chemical class information, thereby supporting pivotal tasks in metabolomics analysis including metabolite annotation, the comparison of metabolic profiles, and network analyses. In this review, we highlight and explain key tools and emerging strategies covering 2015 up to the end of 2020. The majority of these tools aim at processing and analyzing liquid chromatography coupled to mass spectrometry fragmentation data. We start with defining what substructures are, how they relate to molecular fingerprints, and how recognizing them helps to decompose complex mixtures. We continue with chemical classes that are based on the presence or absence of particular molecular scaffolds and/or functional groups and are thus intrinsically related to substructures. We discuss novel tools to mine substructures, annotate chemical compound classes, and create mass spectral networks from metabolomics data and demonstrate them using two case studies. We also review and speculate about the opportunities that NMR spectroscopy-based metabolome mining of complex metabolite mixtures offers to discover substructures and chemical classes. Finally, we will describe the main benefits and limitations of the current tools and strategies that rely on them, and our vision on how this exciting field can develop toward repository-scale-sized metabolomics analyses. Complementary sources of structural information from genomics analyses and well-curated taxonomic records are also discussed.
Many research fields such as natural products discovery, pharmacokinetic and drug metabolism studies, and environmental metabolomics increasingly rely on untargeted metabolomics to gain biochemical and biological insights. The technical advances described here will benefit all those metabolomics disciplines by transforming spectral data into knowledge that can answer biological questions.
Affiliation(s)
- Mehdi A Beniddir
- Université Paris-Saclay, CNRS, BioCIS, 5 rue J.-B Clément, 92290 Châtenay-Malabry, France
| | - Kyo Bin Kang
- Research Institute of Pharmaceutical Sciences, College of Pharmacy, Sookmyung Women's University, Seoul 04310, Republic of Korea
| | - Grégory Genta-Jouve
- Laboratoire de Chimie-Toxicologie Analytique et Cellulaire (C-TAC), UMR CNRS 8038, CiTCoM, Université de Paris, 4, Avenue de l'Observatoire, 75006, Paris, France
- Laboratoire Ecologie, Evolution, Interactions des Systèmes Amazoniens (LEEISA), USR 3456, Université De Guyane, CNRS Guyane, 275 Route de Montabo, 97334 Cayenne, French Guiana, France
| | - Florian Huber
- Netherlands eScience Center, 1098 XG Amsterdam, The Netherlands
| | - Simon Rogers
- School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK
| | | |
|
29
|
Hawkins C, Ginzburg D, Zhao K, Dwyer W, Xue B, Xu A, Rice S, Cole B, Paley S, Karp P, Rhee SY. Plant Metabolic Network 15: A resource of genome-wide metabolism databases for 126 plants and algae. JOURNAL OF INTEGRATIVE PLANT BIOLOGY 2021; 63:1888-1905. [PMID: 34403192 DOI: 10.1111/jipb.13163] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Accepted: 08/14/2021] [Indexed: 05/18/2023]
Abstract
To understand and engineer plant metabolism, we need a comprehensive and accurate annotation of all metabolic information across plant species. As a step towards this goal, we generated genome-scale metabolic pathway databases of 126 algal and plant genomes, ranging from model organisms to crops to medicinal plants (https://plantcyc.org). Of these, 104 have not been reported before. We systematically evaluated the quality of the databases, which revealed that our semi-automated validation pipeline dramatically improves the quality. We then compared the metabolic content across the 126 organisms using multiple correspondence analysis and found that Brassicaceae, Poaceae, and Chlorophyta appeared as metabolically distinct groups. To demonstrate the utility of this resource, we used recently published sorghum transcriptomics data to discover previously unreported trends of metabolism underlying drought tolerance. We also used single-cell transcriptomics data from the Arabidopsis root to infer cell type-specific metabolic pathways. This work shows the quality and quantity of our resource and demonstrates its wide-ranging utility in integrating metabolism with other areas of plant biology.
Affiliation(s)
- Charles Hawkins
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Daniel Ginzburg
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Kangmei Zhao
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - William Dwyer
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Bo Xue
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Angela Xu
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Selena Rice
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| | - Benjamin Cole
- DOE-Joint Genome Institute, Lawrence Berkeley Laboratory, Berkeley, California, 94720, USA
| | - Suzanne Paley
- SRI International, Menlo Park, California, 94025, USA
| | - Peter Karp
- SRI International, Menlo Park, California, 94025, USA
| | - Seung Y Rhee
- Department of Plant Biology, Carnegie Institution for Science, Stanford, California, 94305, USA
| |
|
30
|
Tsugawa H, Rai A, Saito K, Nakabayashi R. Metabolomics and complementary techniques to investigate the plant phytochemical cosmos. Nat Prod Rep 2021; 38:1729-1759. [PMID: 34668509 DOI: 10.1039/d1np00014d] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Covering: up to 2021. Plants and their associated microbial communities are known to produce millions of metabolites, a majority of which are still not characterized and are speculated to possess novel bioactive properties. In addition to their role in plant physiology, these metabolites are also relevant as existing and next-generation medicine candidates. Elucidation of the plant metabolite diversity is thus valuable for the successful exploitation of natural resources for humankind. Herein, we present a comprehensive review on recent metabolomics approaches to illuminate molecular networks in plants, including chemical isolation and enzymatic production as well as the modern metabolomics approaches such as stable isotope labeling, ultrahigh-resolution mass spectrometry, metabolome imaging (spatial metabolomics), single-cell analysis, cheminformatics, and computational mass spectrometry. Mass spectrometry-based strategies to characterize plant metabolomes through metabolite identification and annotation are described in detail. We also highlight the use of phytochemical genomics to mine genes associated with specialized metabolites' biosynthesis. Understanding the metabolic diversity through biotechnological advances is fundamental to elucidate the functions of the plant-derived specialized metabolome.
Affiliation(s)
- Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Nakamachi, Koganei, Tokyo 184-8588, Japan; Graduate School of Medical Life Science, Yokohama City University, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
| | - Amit Rai
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Plant Molecular Science Center, Chiba University, 1-8-1 Inohana, Chuo-ku, Chiba 260-8675, Japan
| | - Kazuki Saito
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Plant Molecular Science Center, Chiba University, 1-8-1 Inohana, Chuo-ku, Chiba 260-8675, Japan
| | - Ryo Nakabayashi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
| |
|
31
|
Abstract
The same gene is often regulated differently in response to stress in even closely related plant species. Directly measuring stress-responsive gene expression can be financially and logistically challenging in nonmodel species. Here, we show that models trained using data on which genes respond to cold in one species can predict which genes will respond to cold in related species, even when the training and target species vary in their degree of tolerance to cold. The prediction models we used require only genomic sequence and gene models. As a result, data from well-studied model species may be used to predict which genes will respond to stress in less-studied species with sequenced genomes. Although genome-sequence assemblies are available for a growing number of plant species, gene-expression responses to stimuli have been cataloged for only a subset of these species. Many genes show altered transcription patterns in response to abiotic stresses. However, orthologous genes in related species often exhibit different responses to a given stress. Accordingly, data on the regulation of gene expression in one species are not reliable predictors of orthologous gene responses in a related species. Here, we trained a supervised classification model to identify genes that transcriptionally respond to cold stress. A model trained with only features calculated directly from genome assemblies exhibited only modest decreases in performance relative to models trained by using genomic, chromatin, and evolution/diversity features. Models trained with data from one species successfully predicted which genes would respond to cold stress in other related species. Cross-species predictions remained accurate when training was performed in cold-sensitive species and predictions were performed in cold-tolerant species and vice versa. 
Models trained with data on gene expression in multiple species provided at least equivalent performance to models trained and tested in a single species and outperformed single-species models in cross-species prediction. These results suggest that classifiers trained on stress data from well-studied species may suffice for predicting gene-expression patterns in related, less-studied species with sequenced genomes.
|
32
|
Wang P, Moore BM, Uygun S, Lehti-Shiu MD, Barry CS, Shiu SH. Optimising the use of gene expression data to predict plant metabolic pathway memberships. THE NEW PHYTOLOGIST 2021; 231:475-489. [PMID: 33749860 DOI: 10.1111/nph.17355] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/01/2021] [Accepted: 03/13/2021] [Indexed: 06/12/2023]
Abstract
Plant metabolites from diverse pathways are important for plant survival, human nutrition and medicine. The pathway memberships of most plant enzyme genes are unknown. While co-expression is useful for assigning genes to pathways, expression correlation may exist only under specific spatiotemporal and conditional contexts. Utilising > 600 tomato (Solanum lycopersicum) expression data combinations, three strategies for predicting memberships in 85 pathways were explored. Optimal predictions for different pathways require distinct data combinations indicative of pathway functions. Naive prediction (i.e. identifying pathways with the most similarly expressed genes) is error prone. In 52 pathways, unsupervised learning performed better than supervised approaches, possibly due to limited training data availability. Using gene-to-pathway expression similarities led to prediction models that outperformed those based simply on expression levels. Using 36 experimentally validated genes, the pathway-best model prediction accuracy is 58.3%, significantly better compared with that for predicting annotated genes without experimental evidence (37.0%) or random guess (1.2%), demonstrating the importance of data quality. Our study highlights the need to extensively explore expression-based features and prediction strategies to maximise the accuracy of metabolic pathway membership assignment. The prediction framework outlined here can be applied to other species and serves as a baseline model for future comparisons.
Affiliation(s)
- Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Bethany M Moore
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | | | - Melissa D Lehti-Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Cornelius S Barry
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA
| |
|
33
|
Gupta C, Ramegowda V, Basu S, Pereira A. Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance. Front Genet 2021; 12:652189. [PMID: 34249082 PMCID: PMC8264776 DOI: 10.3389/fgene.2021.652189] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2021] [Accepted: 05/13/2021] [Indexed: 12/13/2022] Open
Abstract
Gene regulatory networks underpin stress response pathways in plants. However, parsing these networks to prioritize key genes underlying a particular trait is challenging. Here, we have built the Gene Regulation and Association Network (GRAiN) of rice (Oryza sativa). GRAiN is an interactive query-based web-platform that allows users to study functional relationships between transcription factors (TFs) and genetic modules underlying abiotic-stress responses. We built GRAiN by applying a combination of different network inference algorithms to publicly available gene expression data. We propose a supervised machine learning framework that complements GRAiN in prioritizing genes that regulate stress signal transduction and modulate gene expression under drought conditions. Our framework converts intricate network connectivity patterns of 2160 TFs into a single drought score. We observed that TFs with the highest drought scores define the functional, structural, and evolutionary characteristics of drought resistance in rice. Our approach accurately predicted the function of OsbHLH148 TF, which we validated using in vitro protein-DNA binding assays and mRNA sequencing of loss-of-function mutants grown under control and drought stress conditions. Our network and the complementary machine learning strategy lend themselves to predicting key regulatory genes underlying other agricultural traits and will assist in the genetic engineering of desirable rice varieties.
Affiliation(s)
- Chirag Gupta
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Venkategowda Ramegowda
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Supratim Basu
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Andy Pereira
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| |
|
34
|
Chen Y, Pan H, Hao S, Pan D, Wang G, Yu W. Evaluation of phenolic composition and antioxidant properties of different varieties of Chinese citrus. Food Chem 2021; 364:130413. [PMID: 34175629 DOI: 10.1016/j.foodchem.2021.130413] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2021] [Revised: 05/17/2021] [Accepted: 06/16/2021] [Indexed: 01/27/2023]
Abstract
Citrus peels have health-promoting effects and are a rich source of antioxidant substances. This study evaluated the compositions of phenolic compounds and antioxidant activities in the peels of 52 citrus varieties with consistent planting time and management. The highest levels of total phenols (72.95 ± 37.60 mg/g DW) and total flavonoids (71.43 ± 37.64 mg/g DW) were found in mandarin. The highest phenolic acid content (18.78 ± 0.38 mg/g DW), dominated by protocatechuic acid, was found in kumquat. The antioxidant potency composite index was 6.23-94.56, suggesting mandarin varieties HJ, TWPG, TTPG, AY28, BZH and TCJC had the highest antioxidant activity. Statistical analysis indicated phenolic compounds and antioxidant activity were positively correlated. Principal component analysis and hierarchical cluster analysis suggested a strong relationship between phenolic compound composition and genetic background. This study indicated significant differences in the biological properties of various types of citrus peels, which is valuable for future utilization and research of citrus peels.
Affiliation(s)
- Yuan Chen
- Fujian Academy of Agricultural Sciences/Research Institute of Agri-engineering Technology, Fuzhou 350003, China; Fujian Academy of Agricultural Sciences, Fuzhou 350003, China.
| | - Heli Pan
- College of Horticulture, Fujian Agriculture and Forestry University, Fujian Engineering Research Center for Narcissus Breeding, Fuzhou 350003, China
| | - Shuxia Hao
- College of Horticulture, Fujian Agriculture and Forestry University, Fujian Engineering Research Center for Narcissus Breeding, Fuzhou 350003, China
| | - Dongming Pan
- College of Horticulture, Fujian Agriculture and Forestry University, Fujian Engineering Research Center for Narcissus Breeding, Fuzhou 350003, China
| | - Guojun Wang
- Harbor Branch Oceanographic Institute, Florida Atlantic University, Fort Pierce, FL 34946, USA.
| | - Wenquan Yu
- Fujian Academy of Agricultural Sciences, Fuzhou 350003, China.
| |
|
35
|
Ding Y, Northen TR, Khalil A, Huffaker A, Schmelz EA. Getting back to the grass roots: harnessing specialized metabolites for improved crop stress resilience. Curr Opin Biotechnol 2021; 70:174-186. [PMID: 34129999 DOI: 10.1016/j.copbio.2021.05.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Revised: 05/06/2021] [Accepted: 05/31/2021] [Indexed: 12/12/2022]
Abstract
Roots remain an understudied site of complex and important biological interactions mediating plant productivity. In grain and bioenergy crops, grass root specialized metabolites (GRSM) are central to key interactions, yet our basic knowledge of the chemical language remains fragmentary. Continued improvements in plant genome assembly and metabolomics are enabling large-scale advances in the discovery of specialized metabolic pathways as a means of regulating root-biotic interactions. Metabolomics, transcript coexpression analyses, forward genetic studies, gene synthesis and heterologous expression assays drive efficient pathway discoveries. Functional genetic variants identified through genome wide analyses, targeted CRISPR/Cas9 approaches, and both native and non-native overexpression studies critically inform novel strategies for bioengineering metabolic pathways to improve plant traits.
Affiliation(s)
- Yezhang Ding
- Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Trent R Northen
- Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Joint BioEnergy Institute, Emeryville, CA 94608, USA
| | - Ahmed Khalil
- Section of Cell and Developmental Biology, University of California at San Diego, La Jolla, CA, USA
| | - Alisa Huffaker
- Section of Cell and Developmental Biology, University of California at San Diego, La Jolla, CA, USA
| | - Eric A Schmelz
- Section of Cell and Developmental Biology, University of California at San Diego, La Jolla, CA, USA.
| |
|
36
|
Su W, Jing Y, Lin S, Yue Z, Yang X, Xu J, Wu J, Zhang Z, Xia R, Zhu J, An N, Chen H, Hong Y, Yuan Y, Long T, Zhang L, Jiang Y, Liu Z, Zhang H, Gao Y, Liu Y, Lin H, Wang H, Yant L, Lin S, Liu Z. Polyploidy underlies co-option and diversification of biosynthetic triterpene pathways in the apple tribe. Proc Natl Acad Sci U S A 2021; 118:e2101767118. [PMID: 33986115 PMCID: PMC8157987 DOI: 10.1073/pnas.2101767118] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Whole-genome duplication (WGD) plays important roles in plant evolution and function, yet little is known about how WGD underlies metabolic diversification of natural products that bear significant medicinal properties, especially in nonmodel trees. Here, we reveal how WGD laid the foundation for co-option and differentiation of medicinally important ursane triterpene pathway duplicates, generating distinct chemotypes between species and between developmental stages in the apple tribe. After generating chromosome-level assemblies of a widely cultivated loquat variety and Gillenia trifoliata, we define differentially evolved, duplicated gene pathways and date the WGD in the apple tribe at 13.5 to 27.1 Mya, much more recent than previously thought. We then functionally characterize contrasting metabolic pathways responsible for major triterpene biosynthesis in G. trifoliata and loquat, which pre- and postdate the Maleae WGD, respectively. Our work mechanistically details the metabolic diversity that arose post-WGD and provides insights into the genomic basis of medicinal properties of loquat, which has been used in both traditional and modern medicines.
Affiliation(s)
- Wenbing Su
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yi Jing
- Research Cooperation Department, Beijing Genomics Institute Genomics, Shenzhen 518083, China
| | - Shoukai Lin
- Key laboratory of Loquat Germplasm Innovation and Utilization (Fujian Province), Putian University, Putian 351100, China
| | - Zhen Yue
- Research Cooperation Department, Beijing Genomics Institute Genomics, Shenzhen 518083, China
| | - Xianghui Yang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Jiabao Xu
- Research Cooperation Department, Beijing Genomics Institute Genomics, Shenzhen 518083, China
| | - Jincheng Wu
- Key laboratory of Loquat Germplasm Innovation and Utilization (Fujian Province), Putian University, Putian 351100, China
| | - Zhike Zhang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Rui Xia
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Jiaojiao Zhu
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ning An
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Haixin Chen
- Research Cooperation Department, Beijing Genomics Institute Genomics, Shenzhen 518083, China
| | - Yanping Hong
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yuan Yuan
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Ting Long
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Ling Zhang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yuanyuan Jiang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Zongli Liu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Hailan Zhang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yongshun Gao
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yuexue Liu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Hailan Lin
- Key laboratory of Loquat Germplasm Innovation and Utilization (Fujian Province), Putian University, Putian 351100, China
| | - Huicong Wang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Levi Yant
- Future Food Beacon and School of Life Sciences, University of Nottingham, Nottingham NG7 2RD, United Kingdom
| | - Shunquan Lin
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Zhenhua Liu
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
|
37
|
Katz E, Li JJ, Jaegle B, Ashkenazy H, Abrahams SR, Bagaza C, Holden S, Pires CJ, Angelovici R, Kliebenstein DJ. Genetic variation, environment and demography intersect to shape Arabidopsis defense metabolite variation across Europe. eLife 2021; 10:67784. [PMID: 33949309 PMCID: PMC8205490 DOI: 10.7554/elife.67784] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/23/2021] [Accepted: 05/02/2021] [Indexed: 12/03/2022] Open
Abstract
Plants produce diverse metabolites to cope with the challenges presented by complex and ever-changing environments. These challenges drive the diversification of specialized metabolites within and between plant species. However, we are just beginning to understand how frequently new alleles arise controlling specialized metabolite diversity and how the geographic distribution of these alleles may be structured by ecological and demographic pressures. Here, we measure the variation in specialized metabolites across a population of 797 natural Arabidopsis thaliana accessions. We show that a combination of geography, environmental parameters, demography and different genetic processes all combine to influence the specific chemotypes and their distribution. This showed that causal loci in specialized metabolism contain frequent independently generated alleles with patterns suggesting potential within-species convergence. This provides a new perspective about the complexity of the selective forces and mechanisms that shape the generation and distribution of allelic variation that may influence local adaptation.
Since plants cannot move, they have evolved chemical defenses to help them respond to changes in their surroundings. For example, where animals run from predators, plants may produce toxins to put predators off. This approach is why plants are such a rich source of drugs, poisons, dyes and other useful substances. The chemicals plants produce are known as specialized metabolites, and they can change a lot between, and even within, plant species. The variety of specialized metabolites is a result of genetic changes and evolution over millions of years. Evolution is a slow process, yet plants are able to rapidly develop new specialized metabolites to protect them from new threats. Even different populations of the same species produce many distinct metabolites that help them survive in their surroundings.
However, the factors that lead plants to produce new metabolites are not well understood, and it is not known how this affects genetic variation. To gain a better understanding of this process, Katz et al. studied 797 European variants of a common weed species called Arabidopsis thaliana, which is widely studied. The investigation found that many factors affect the range of specialized metabolites in each variant. These included local geography and environment, as well as genetics and population history (demography). Katz et al. revealed a pattern of relationships between the variants that could mirror their evolutionary history as the species spread and adapted to new locations. These results highlight the complex network of factors that affect plant evolution. Rapid diversification is key to plant survival in new and changing environments and has resulted in a wide range of specialized metabolites. As such they are of interest both for studying plant evolution and for understanding their ecology. Expanding similar work to more populations and other species will broaden the scope of our ability to understand how plants adapt to their surroundings.
Affiliation(s)
- Ella Katz
- Department of Plant Sciences, University of California, Davis, Davis, United States
| | - Jia-Jie Li
- Department of Plant Sciences, University of California, Davis, Davis, United States
| | - Benjamin Jaegle
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter (VBC), Vienna, Austria
| | - Haim Ashkenazy
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Shawn R Abrahams
- Division of Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, United States
| | - Clement Bagaza
- Division of Biological Sciences, Interdisciplinary Plant Group, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, United States
| | - Samuel Holden
- Division of Biological Sciences, Interdisciplinary Plant Group, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, United States
| | - Chris J Pires
- Division of Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, United States
| | - Ruthie Angelovici
- Division of Biological Sciences, Interdisciplinary Plant Group, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, United States
| | - Daniel J Kliebenstein
- Department of Plant Sciences, University of California, Davis, Davis, United States.,DynaMo Center of Excellence, University of Copenhagen, Frederiksberg, Denmark
|
38
|
Cusack SA, Wang P, Lotreck SG, Moore BM, Meng F, Conner JK, Krysan PJ, Lehti-Shiu MD, Shiu SH. Predictive Models of Genetic Redundancy in Arabidopsis thaliana. Mol Biol Evol 2021; 38:3397-3414. [PMID: 33871641 PMCID: PMC8321531 DOI: 10.1093/molbev/msab111] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/21/2022] Open
Abstract
Genetic redundancy refers to a situation where an individual with a loss-of-function mutation in one gene (single mutant) does not show an apparent phenotype until one or more paralogs are also knocked out (double/higher-order mutant). Previous studies have identified some characteristics common among redundant gene pairs, but a predictive model of genetic redundancy incorporating a wide variety of features derived from accumulating omics and mutant phenotype data is yet to be established. In addition, the relative importance of these features for genetic redundancy remains largely unclear. Here, we establish machine learning models for predicting whether a gene pair is likely redundant or not in the model plant Arabidopsis thaliana based on six feature categories: functional annotations, evolutionary conservation including duplication patterns and mechanisms, epigenetic marks, protein properties including posttranslational modifications, gene expression, and gene network properties. The definition of redundancy, data transformations, feature subsets, and machine learning algorithms used significantly affected model performance based on holdout, testing phenotype data. Among the most important features in predicting gene pairs as redundant were having a paralog(s) from recent duplication events, annotation as a transcription factor, downregulation during stress conditions, and having similar expression patterns under stress conditions. We also explored the potential reasons underlying mispredictions and limitations of our studies. This genetic redundancy model sheds light on characteristics that may contribute to long-term maintenance of paralogs, and will ultimately allow for more targeted generation of functionally informative double mutants, advancing functional genomic studies.
Affiliation(s)
- Siobhan A Cusack
- Cell and Molecular Biology Program, Michigan State University, East Lansing, MI, USA
| | - Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
| | - Serena G Lotreck
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA.,Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA
| | - Bethany M Moore
- Department of Botany, University of Wisconsin-Madison, Madison, WI, USA
| | - Fanrui Meng
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
| | - Jeffrey K Conner
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA.,Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI, USA.,Kellogg Biological Station, Michigan State University, East Lansing, MI, USA
| | - Patrick J Krysan
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
| | | | - Shin-Han Shiu
- Cell and Molecular Biology Program, Michigan State University, East Lansing, MI, USA.,Department of Plant Biology, Michigan State University, East Lansing, MI, USA.,Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA.,Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI, USA
|
39
|
Wang P, Meng F, Moore BM, Shiu SH. Impact of short-read sequencing on the misassembly of a plant genome. BMC Genomics 2021; 22:99. [PMID: 33530937 PMCID: PMC7852129 DOI: 10.1186/s12864-021-07397-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2020] [Accepted: 01/19/2021] [Indexed: 12/16/2022] Open
Abstract
Background: Availability of plant genome sequences has led to significant advances. However, with few exceptions, the great majority of existing genome assemblies are derived from short read sequencing technologies with highly uneven read coverages indicative of sequencing and assembly issues that could significantly impact any downstream analysis of plant genomes. In tomato for example, 0.6% (5.1 Mb) and 9.7% (79.6 Mb) of short-read based assembly had significantly higher and lower coverage compared to background, respectively. Results: To understand what the causes may be for such uneven coverage, we first established machine learning models capable of predicting genomic regions with variable coverages and found that high coverage regions tend to have higher simple sequence repeat and tandem gene densities compared to background regions. To determine if the high coverage regions were misassembled, we examined a recently available tomato long-read based assembly and found that 27.8% (1.41 Mb) of high coverage regions were potentially misassembled of duplicate sequences, compared to 1.4% in background regions. In addition, using a predictive model that can distinguish correctly and incorrectly assembled high coverage regions, we found that misassembled, high coverage regions tend to be flanked by simple sequence repeats, pseudogenes, and transposon elements. Conclusions: Our study provides insights on the causes of variable coverage regions and a quantitative assessment of factors contributing to plant genome misassembly when using short reads and the generality of these causes and factors should be tested further in other species.
Affiliation(s)
- Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.,DOE Great Lake Bioenergy Research Center, Michigan State University, East Lansing, MI, 48824, USA
| | - Fanrui Meng
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.,DOE Great Lake Bioenergy Research Center, Michigan State University, East Lansing, MI, 48824, USA
| | - Bethany M Moore
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.,The Ecology, Evolution, and Behavioral Biology Program, Michigan State University, East Lansing, MI, 48824, USA
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.,DOE Great Lake Bioenergy Research Center, Michigan State University, East Lansing, MI, 48824, USA.,The Ecology, Evolution, and Behavioral Biology Program, Michigan State University, East Lansing, MI, 48824, USA.,Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA
|
40
|
Shoji T, Yuan L. ERF Gene Clusters: Working Together to Regulate Metabolism. Trends in Plant Science 2021; 26:23-32. [PMID: 32883605 DOI: 10.1016/j.tplants.2020.07.015] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 03/08/2020] [Revised: 07/28/2020] [Accepted: 07/30/2020] [Indexed: 05/18/2023]
Abstract
Plants produce structurally diverse specialized metabolites, including bioactive alkaloids and terpenoids, in response to biotic and abiotic environmental stresses. The APETALA2/ETHYLENE RESPONSE FACTOR (AP2/ERF) family of transcription factors (TFs) play key roles in regulating biosynthesis of specialized metabolites. Increasing genomic and functional evidence shows that a subset of the ERF genes occurs in clusters on the chromosomes. These jasmonate-responsive ERF TF gene clusters control the biosynthesis of many important metabolites, from natural products, such as nicotine and steroidal glycoalkaloids (SGAs), to pharmaceuticals, such as artemisinin, vinblastine, and vincristine. Here, we review the function, regulation, and evolution of ERF clusters and highlight recent advances in understanding the distinct roles of clustered ERF genes and their possible application in metabolic engineering.
Affiliation(s)
- Tsubasa Shoji
- Department of Biological Science, Nara Institute of Science and Technology, Ikoma, Japan.
| | - Ling Yuan
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, KY, USA; South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China.
|
41
|
Gupta C, Ramegowda V, Basu S, Pereira A. Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance. Front Genet 2021. [PMID: 34249082 DOI: 10.1101/2020.04.29.068379] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/13/2023] Open
Abstract
Gene regulatory networks underpin stress response pathways in plants. However, parsing these networks to prioritize key genes underlying a particular trait is challenging. Here, we have built the Gene Regulation and Association Network (GRAiN) of rice (Oryza sativa). GRAiN is an interactive query-based web-platform that allows users to study functional relationships between transcription factors (TFs) and genetic modules underlying abiotic-stress responses. We built GRAiN by applying a combination of different network inference algorithms to publicly available gene expression data. We propose a supervised machine learning framework that complements GRAiN in prioritizing genes that regulate stress signal transduction and modulate gene expression under drought conditions. Our framework converts intricate network connectivity patterns of 2160 TFs into a single drought score. We observed that TFs with the highest drought scores define the functional, structural, and evolutionary characteristics of drought resistance in rice. Our approach accurately predicted the function of OsbHLH148 TF, which we validated using in vitro protein-DNA binding assays and mRNA sequencing of loss-of-function mutants grown under control and drought stress conditions. Our network and the complementary machine learning strategy lend themselves to predicting key regulatory genes underlying other agricultural traits and will assist in the genetic engineering of desirable rice varieties.
Affiliation(s)
- Chirag Gupta
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Venkategowda Ramegowda
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Supratim Basu
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
| | - Andy Pereira
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
|
42
|
Xia J, Wang J, Niu S. Research challenges and opportunities for using big data in global change biology. Global Change Biology 2020; 26:6040-6061. [PMID: 32799353 DOI: 10.1111/gcb.15317] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/31/2020] [Accepted: 07/13/2020] [Indexed: 06/11/2023]
Abstract
Global change biology has been entering a big data era due to the vast increase in availability of both environmental and biological data. Big data refers to large data volume, complex data sets, and multiple data sources. The recent use of such big data is improving our understanding of interactions between biological systems and global environmental changes. In this review, we first explore how big data has been analyzed to identify the general patterns of biological responses to global changes at scales from gene to ecosystem. After that, we investigate how observational networks and space-based big data have facilitated the discovery of emergent mechanisms and phenomena on the regional and global scales. Then, we evaluate the predictions of terrestrial biosphere under global changes by big modeling data. Finally, we introduce some methods to extract knowledge from big data, such as meta-analysis, machine learning, traceability analysis, and data assimilation. The big data has opened new research opportunities, especially for developing new data-driven theories for improving biological predictions in Earth system models, tracing global change impacts across different organismic levels, and constructing cyberinfrastructure tools to accelerate the pace of model-data integrations. These efforts will uncork the bottleneck of using big data to understand biological responses and adaptations to future global changes.
Affiliation(s)
- Jianyang Xia
- Zhejiang Tiantong Forest Ecosystem National Observation and Research Station, Research Center for Global Change and Ecological Forecasting, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
| | - Jing Wang
- Zhejiang Tiantong Forest Ecosystem National Observation and Research Station, Research Center for Global Change and Ecological Forecasting, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
- Shanghai Institute of Pollution Control and Ecological Security, Shanghai, China
| | - Shuli Niu
- Key Laboratory of Ecosystem Network Observation and Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
|
43
|
Baranwal M, Magner A, Elvati P, Saldinger J, Violi A, Hero AO. A deep learning architecture for metabolic pathway prediction. Bioinformatics 2020; 36:2547-2553. [PMID: 31879763 DOI: 10.1093/bioinformatics/btz954] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 10/09/2019] [Revised: 12/02/2019] [Accepted: 12/22/2019] [Indexed: 01/14/2023] Open
Abstract
Motivation: Understanding the mechanisms and structural mappings between molecules and pathway classes is critical for design of reaction predictors for synthesizing new molecules. This article studies the problem of prediction of classes of metabolic pathways (series of chemical reactions occurring within a cell) in which a given biochemical compound participates. We apply a hybrid machine learning approach consisting of graph convolutional networks used to extract molecular shape features as input to a random forest classifier. In contrast to previously applied machine learning methods for this problem, our framework automatically extracts relevant shape features directly from input SMILES representations, which are atom-bond specifications of chemical structures composing the molecules. Results: Our method is capable of correctly predicting the respective metabolic pathway class of 95.16% of tested compounds, whereas competing methods only achieve an accuracy of 84.92% or less. Furthermore, our framework extends to the task of classification of compounds having mixed membership in multiple pathway classes. Our prediction accuracy for this multi-label task is 97.61%. We analyze the relative importance of various global physicochemical features to the pathway class prediction problem and show that simple linear/logistic regression models can predict the values of these global features from the shape features extracted using our framework. Availability and implementation: https://github.com/baranwa2/MetabolicPathwayPrediction.
Affiliation(s)
- Mayank Baranwal
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Abram Magner
- Department of Computer Science, University at Albany, SUNY, Albany, NY 12222, USA
| | | | | | - Angela Violi
- Department of Mechanical Engineering.,Department of Chemical Engineering and Biophysics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alfred O Hero
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
|
44
|
Ko DK, Brandizzi F. Network-based approaches for understanding gene regulation and function in plants. The Plant Journal: for Cell and Molecular Biology 2020; 104:302-317. [PMID: 32717108 PMCID: PMC8922287 DOI: 10.1111/tpj.14940] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/12/2020] [Accepted: 07/14/2020] [Indexed: 05/03/2023]
Abstract
Expression reprogramming directed by transcription factors is a primary gene regulation underlying most aspects of the biology of any organism. Our views of how gene regulation is coordinated are dramatically changing thanks to the advent and constant improvement of high-throughput profiling and transcriptional network inference methods: from activities of individual genes to functional interactions across genes. These technical and analytical advances can reveal the topology of transcriptional networks in which hundreds of genes are hierarchically regulated by multiple transcription factors at systems level. Here we review the state of the art of experimental and computational methods used in plant biology research to obtain large-scale datasets and model transcriptional networks. Examples of direct use of these network models and perspectives on their limitations and future directions are also discussed.
Affiliation(s)
- Dae Kwan Ko
- MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
- Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
| | - Federica Brandizzi
- MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
- Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
|
45
|
Arya SS, Rookes JE, Cahill DM, Lenka SK. Next-generation metabolic engineering approaches towards development of plant cell suspension cultures as specialized metabolite producing biofactories. Biotechnol Adv 2020; 45:107635. [PMID: 32976930 DOI: 10.1016/j.biotechadv.2020.107635] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/20/2020] [Revised: 09/04/2020] [Accepted: 09/17/2020] [Indexed: 12/11/2022]
Abstract
Plant cell suspension culture (PCSC) has emerged as a viable technology to produce plant specialized metabolites (PSM). While Taxol® and ginsenoside are two examples of successfully commercialized PCSC-derived PSM, widespread utilization of the PCSC platform has yet to be realized primarily due to a lack of understanding of the molecular genetics of PSM biosynthesis. Recent advances in computational, molecular and synthetic biology tools provide the opportunity to rapidly characterize and harness the specialized metabolic potential of plants. Here, we discuss the prospects of integrating computational modeling, artificial intelligence, and precision genome editing (CRISPR/Cas and its variants) toolboxes to discover the genetic regulators of PSM. We also explore how synthetic biology can be applied to develop metabolically optimized PSM-producing native and heterologous PCSC systems. Taken together, this review provides an interdisciplinary approach to realize and link the potential of next-generation computational and molecular tools to convert PCSC into commercially viable PSM-producing biofactories.
Affiliation(s)
- Sagar S Arya
  - TERI-Deakin Nano Biotechnology Centre, The Energy and Resources Institute, Gurugram, Haryana 122001, India
  - Deakin University, School of Life and Environmental Sciences, Waurn Ponds Campus, Geelong, Victoria 3216, Australia
- James E Rookes
  - Deakin University, School of Life and Environmental Sciences, Waurn Ponds Campus, Geelong, Victoria 3216, Australia
- David M Cahill
  - Deakin University, School of Life and Environmental Sciences, Waurn Ponds Campus, Geelong, Victoria 3216, Australia
- Sangram K Lenka
  - TERI-Deakin Nano Biotechnology Centre, The Energy and Resources Institute, Gurugram, Haryana 122001, India
46
Sen P, Lamichhane S, Mathema VB, McGlinchey A, Dickens AM, Khoomrung S, Orešič M. Deep learning meets metabolomics: a methodological perspective. Brief Bioinform 2020; 22:1531-1542. [PMID: 32940335 DOI: 10.1093/bib/bbaa204] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 05/04/2020] [Revised: 08/08/2020] [Accepted: 08/10/2020] [Indexed: 12/15/2022] Open
Abstract
Deep learning (DL), an emerging area of investigation in the fields of machine learning and artificial intelligence, has markedly advanced over the past years. DL techniques are being applied to assist medical professionals and researchers in improving clinical diagnosis, disease prediction and drug discovery. It is expected that DL will help to provide actionable knowledge from a variety of 'big data', including metabolomics data. In this review, we discuss the applicability of DL to metabolomics, while presenting and discussing several examples from recent research. We emphasize the use of DL in tackling bottlenecks in metabolomics data acquisition, processing, metabolite identification, as well as in metabolic phenotyping and biomarker discovery. Finally, we discuss how DL is used in genome-scale metabolic modelling and in interpretation of metabolomics data. The DL-based approaches discussed here may assist computational biologists with the integration, prediction and drawing of statistical inference about biological outcomes, based on metabolomics data.
Affiliation(s)
- Partho Sen
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Santosh Lamichhane
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Vivek B Mathema
  - Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Aidan McGlinchey
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Alex M Dickens
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Sakda Khoomrung
  - Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Center for Innovation in Chemistry (PERCH), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
- Matej Orešič
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
47
Moore BM, Wang P, Fan P, Lee A, Leong B, Lou YR, Schenck CA, Sugimoto K, Last R, Lehti-Shiu MD, Barry CS, Shiu SH. Within- and cross-species predictions of plant specialized metabolism genes using transfer learning. In Silico Plants 2020; 2:diaa005. [PMID: 33344884 PMCID: PMC7731531 DOI: 10.1093/insilicoplants/diaa005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/13/2020] [Accepted: 07/21/2020] [Indexed: 06/12/2023]
Abstract
Plant specialized metabolites mediate interactions between plants and the environment and have significant agronomical/pharmaceutical value. Most genes involved in specialized metabolism (SM) are unknown because of the large number of metabolites and the challenge in differentiating SM genes from general metabolism (GM) genes. Plant models like Arabidopsis thaliana have extensive, experimentally derived annotations, whereas many non-model species do not. Here we employed a machine learning strategy, transfer learning, where knowledge from A. thaliana is transferred to predict gene functions in cultivated tomato with fewer experimentally annotated genes. The first tomato SM/GM prediction model using only tomato data performs well (F-measure = 0.74, compared with 0.5 for random and 1.0 for perfect predictions), but from manually curating 88 SM/GM genes, we found many mis-predicted entries were likely mis-annotated. When the SM/GM prediction models built with A. thaliana data were used to filter out genes where the A. thaliana-based model predictions disagreed with tomato annotations, the new tomato model trained with filtered data improved significantly (F-measure = 0.92). Our study demonstrates that SM/GM genes can be better predicted by leveraging cross-species information. Additionally, our findings provide an example for transfer learning in genomics where knowledge can be transferred from an information-rich species to an information-poor one.
Affiliation(s)
- Bethany M Moore
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
  - Ecology, Evolutionary Biology, and Behavior Program, Michigan State University, East Lansing, MI, USA
- Peipei Wang
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Pengxiang Fan
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Aaron Lee
  - Department of Biology, The College of New Jersey, Ewing, NJ, USA
- Bryan Leong
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Yann-Ru Lou
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Craig A Schenck
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Koichi Sugimoto
  - MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, USA
  - Science Research Center, Yamaguchi University, Yamaguchi, Japan
- Robert Last
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Cornelius S Barry
  - Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Shin-Han Shiu
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
  - Ecology, Evolutionary Biology, and Behavior Program, Michigan State University, East Lansing, MI, USA
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI
48
Mahood EH, Kruse LH, Moghe GD. Machine learning: A powerful tool for gene function prediction in plants. Applications in Plant Sciences 2020; 8:e11376. [PMID: 32765975 PMCID: PMC7394712 DOI: 10.1002/aps3.11376] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 10/01/2019] [Accepted: 03/19/2020] [Indexed: 05/06/2023]
Abstract
Recent advances in sequencing and informatic technologies have led to a deluge of publicly available genomic data. While it is now relatively easy to sequence, assemble, and identify genic regions in diploid plant genomes, functional annotation of these genes is still a challenge. Over the past decade, there has been a steady increase in studies utilizing machine learning algorithms for various aspects of functional prediction, because these algorithms are able to integrate large amounts of heterogeneous data and detect patterns inconspicuous through rule-based approaches. The goal of this review is to introduce experimental plant biologists to machine learning, by describing how it is currently being used in gene function prediction to gain novel biological insights. In this review, we discuss specific applications of machine learning in identifying structural features in sequenced genomes, predicting interactions between different cellular components, and predicting gene function and organismal phenotypes. Finally, we also propose strategies for stimulating functional discovery using machine learning-based approaches in plants.
Affiliation(s)
- Elizabeth H. Mahood
  - Plant Biology Section, School of Integrative Plant Sciences, Cornell University, Ithaca, New York 14853, USA
- Lars H. Kruse
  - Plant Biology Section, School of Integrative Plant Sciences, Cornell University, Ithaca, New York 14853, USA
- Gaurav D. Moghe
  - Plant Biology Section, School of Integrative Plant Sciences, Cornell University, Ithaca, New York 14853, USA
49
Jamil IN, Remali J, Azizan KA, Nor Muhammad NA, Arita M, Goh HH, Aizat WM. Systematic Multi-Omics Integration (MOI) Approach in Plant Systems Biology. Frontiers in Plant Science 2020; 11:944. [PMID: 32754171 PMCID: PMC7371031 DOI: 10.3389/fpls.2020.00944] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 03/05/2020] [Accepted: 06/10/2020] [Indexed: 05/03/2023]
Abstract
Across all facets of biology, the rapid progress in high-throughput data generation has enabled us to perform multi-omics systems biology research. Transcriptomics, proteomics, and metabolomics data can answer targeted biological questions regarding the expression of transcripts, proteins, and metabolites, independently, but a systematic multi-omics integration (MOI) can comprehensively assimilate, annotate, and model these large data sets. Previous MOI studies and reviews have detailed its usage and practicality on various organisms including human, animals, microbes, and plants. Plants are especially challenging due to large poorly annotated genomes, multi-organelles, and diverse secondary metabolites. Hence, constructive and methodological guidelines on how to perform MOI for plants are needed, particularly for researchers newly embarking on this topic. In this review, we thoroughly classify multi-omics studies on plants and verify workflows to ensure successful omics integration with accurate data representation. We also propose three levels of MOI, namely element-based (level 1), pathway-based (level 2), and mathematical-based integration (level 3). These MOI levels are described in relation to recent publications and tools, to highlight their practicality and function. The drawbacks and limitations of these MOI are also discussed for future improvement toward more amenable strategies in plant systems biology.
Affiliation(s)
- Ili Nadhirah Jamil
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
- Juwairiah Remali
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
- Kamalrul Azlan Azizan
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
- Nor Azlan Nor Muhammad
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
- Masanori Arita
  - Bioinformation & DDBJ Center, National Institute of Genetics (NIG), Mishima, Japan
  - Metabolome Informatics Team, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Hoe-Han Goh
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
- Wan Mohd Aizat
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia
50
Lichman BR, Godden GT, Buell CR. Gene and genome duplications in the evolution of chemodiversity: perspectives from studies of Lamiaceae. Current Opinion in Plant Biology 2020; 55:74-83. [PMID: 32344371 DOI: 10.1016/j.pbi.2020.03.005] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 11/10/2019] [Revised: 02/19/2020] [Accepted: 03/04/2020] [Indexed: 05/28/2023]
Abstract
Plants are reservoirs of extreme chemical diversity, yet biosynthetic pathways remain underexplored in the majority of taxa. Access to improved, inexpensive genomic and computational technologies has recently enhanced our understanding of plant specialized metabolism at the biochemical and evolutionary levels including the elucidation of pathways leading to key metabolites. Furthermore, these approaches have provided insights into the mechanisms of chemical evolution, including neofunctionalization and subfunctionalization, structural variation, and modulation of gene expression. The broader utilization of genomic tools across the plant tree of life, and an expansion of genomic resources from multiple accessions within species or populations, will improve our overall understanding of chemodiversity. These data and knowledge will also lead to greater insight into the selective pressures contributing to and maintaining this diversity, which in turn will enable the development of more accurate predictive models of specialized metabolism in plants.
Affiliation(s)
- Benjamin R Lichman
  - Centre for Novel Agricultural Products, Department of Biology, University of York, York YO10 5DD, UK
- Grant T Godden
  - Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Carol Robin Buell
  - Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
  - Plant Resilience Institute, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
  - MSU AgBioResearch, Michigan State University, 446 West Circle Drive, East Lansing, MI 48824, USA