1
Wang H, Chen M, Wei X, Xia R, Pei D, Huang X, Han B. Computational tools for plant genomics and breeding. SCIENCE CHINA. LIFE SCIENCES 2024; 67:1579-1590. [PMID: 38676814 DOI: 10.1007/s11427-024-2578-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 03/25/2024] [Indexed: 04/29/2024]
Abstract
Plant genomics and crop breeding are at the intersection of biotechnology and information technology. Driven by a combination of high-throughput sequencing, molecular biology and data science, great advances have been made in omics technologies at every step along the central dogma, especially in genome assembling, genome annotation, epigenomic profiling, and transcriptome profiling. These advances further revolutionized three directions of development. One is genetic dissection of complex traits in crops, along with genomic prediction and selection. The second is comparative genomics and evolution, which open up new opportunities to depict the evolutionary constraints of biological sequences for deleterious variant discovery. The third direction is the development of deep learning approaches for the rational design of biological sequences, especially proteins, for synthetic biology. All three directions of development serve as the foundation for a new era of crop breeding where agronomic traits are enhanced by genome design.
Affiliation(s)
- Hai Wang
  - State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100193, China
  - Sanya Institute of China Agricultural University, Sanya, 572025, China
  - Hainan Yazhou Bay Seed Laboratory, Sanya, 572025, China
- Mengjiao Chen
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xin Wei
  - Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Rui Xia
  - College of Horticulture, South China Agricultural University, Guangzhou, 510640, China
- Dong Pei
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xuehui Huang
  - Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Bin Han
  - National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200233, China
2
Marand AP, Eveland AL, Kaufmann K, Springer NM. cis-Regulatory Elements in Plant Development, Adaptation, and Evolution. ANNUAL REVIEW OF PLANT BIOLOGY 2023; 74:111-137. [PMID: 36608347 PMCID: PMC9881396 DOI: 10.1146/annurev-arplant-070122-030236] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Indexed: 05/23/2023]
Abstract
cis-Regulatory elements encode the genomic blueprints that ensure the proper spatiotemporal patterning of gene expression necessary for appropriate development and responses to the environment. Accumulating evidence implicates changes to gene expression as a major source of phenotypic novelty in eukaryotes, including acute phenotypes such as disease and cancer in mammals. Moreover, genetic and epigenetic variation affecting cis-regulatory sequences over longer evolutionary timescales has become a recurring theme in studies of morphological divergence and local adaptation. Here, we discuss the functions of and methods used to identify various classes of cis-regulatory elements, as well as their role in plant development and response to the environment. We highlight opportunities to exploit cis-regulatory variants underlying plant development and environmental responses for crop improvement efforts. Although a comprehensive understanding of cis-regulatory mechanisms in plants has lagged behind that in animals, we showcase several breakthrough findings that have profoundly influenced plant biology and shaped the overall understanding of transcriptional regulation in eukaryotes.
Affiliation(s)
- Kerstin Kaufmann
  - Department for Plant Cell and Molecular Biology, Institute of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
- Nathan M Springer
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
3
Liang Z, Myers ZA, Petrella D, Engelhorn J, Hartwig T, Springer NM. Mapping responsive genomic elements to heat stress in a maize diversity panel. Genome Biol 2022; 23:234. [PMID: 36345007 PMCID: PMC9639295 DOI: 10.1186/s13059-022-02807-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/20/2022] [Accepted: 10/29/2022] [Indexed: 11/09/2022] [Open access]
Abstract
BACKGROUND: Many plant species exhibit genetic variation for coping with environmental stress. However, there are still limited approaches to effectively uncover the genomic region that regulates distinct responsive patterns of the gene across multiple varieties within the same species under abiotic stress. RESULTS: By analyzing the transcriptomes of more than 100 maize inbreds, we reveal many cis- and trans-acting eQTLs that influence the expression response to heat stress. The cis-acting eQTLs in response to heat stress are identified in genes with differential responses to heat stress between genotypes as well as genes that are only expressed under heat stress. The cis-acting variants for heat stress-responsive expression likely result from distinct promoter activities, and the differential heat responses of the alleles are confirmed for selected genes using transient expression assays. Global footprinting of transcription factor binding is performed in control and heat stress conditions to document regions with heat-enriched transcription factor binding occupancies. CONCLUSIONS: Footprints enriched near proximal regions of characterized heat-responsive genes in a large association panel can be utilized for prioritizing functional genomic regions that regulate genotype-specific responses under heat stress.
Affiliation(s)
- Zhikai Liang
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, 55108, USA
- Zachary A Myers
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, 55108, USA
- Dominic Petrella
  - Department of Horticulture, University of Minnesota, Saint Paul, MN, 55108, USA
  - Present address: Agricultural Technical Institute, The Ohio State University, Wooster, OH, 44691, USA
- Julia Engelhorn
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
  - Heinrich-Heine University, 40225, Dusseldorf, Germany
- Thomas Hartwig
  - Max Planck Institute for Plant Breeding Research, 50829, Cologne, Germany
  - Heinrich-Heine University, 40225, Dusseldorf, Germany
- Nathan M Springer
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, 55108, USA
4
Yocca AE, Edger PP. Current status and future perspectives on the evolution of cis-regulatory elements in plants. CURRENT OPINION IN PLANT BIOLOGY 2022; 65:102139. [PMID: 34837823 DOI: 10.1016/j.pbi.2021.102139] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/05/2021] [Revised: 09/20/2021] [Accepted: 10/06/2021] [Indexed: 06/13/2023]
Abstract
Cis-regulatory elements (CREs) are short stretches (∼5-15 base pairs) of DNA capable of being bound by a transcription factor and influencing the expression of nearby genes. These regions are of great interest to anyone studying the relationship between phenotype and genotype as these sequences often dictate genes' spatio-temporal expression. Indeed, several associative signals between genotype and phenotype are known to lie outside of protein-coding regions. Therefore, a key to understand evolutionary biology requires their characterization in current and future genome assemblies. In this review, we cover some recent examples of how CRE variation contributes to phenotypic evolution, discuss evidence for the selective pressures experienced by non-coding regions of the genome, and consider several studies on accessible chromatin regions in plants and what they can tell us about CREs. Finally, we discuss how current advances in sequencing technologies will improve our knowledge of CRE variation.
Affiliation(s)
- Alan E Yocca
  - Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA; Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Patrick P Edger
  - Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA; Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI, 48824, USA
5
Savadel SD, Hartwig T, Turpin ZM, Vera DL, Lung PY, Sui X, Blank M, Frommer WB, Dennis JH, Zhang J, Bass HW. The native cistrome and sequence motif families of the maize ear. PLoS Genet 2021; 17:e1009689. [PMID: 34383745 PMCID: PMC8360572 DOI: 10.1371/journal.pgen.1009689] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Accepted: 06/30/2021] [Indexed: 01/22/2023] [Open access]
Abstract
Elucidating the transcriptional regulatory networks that underlie growth and development requires robust ways to define the complete set of transcription factor (TF) binding sites. Although TF-binding sites are known to be generally located within accessible chromatin regions (ACRs), pinpointing these DNA regulatory elements globally remains challenging. Current approaches primarily identify binding sites for a single TF (e.g. ChIP-seq), or globally detect ACRs but lack the resolution to consistently define TF-binding sites (e.g. DNase-seq, ATAC-seq). To address this challenge, we developed MNase-defined cistrome-Occupancy Analysis (MOA-seq), a high-resolution (< 30 bp), high-throughput, and genome-wide strategy to globally identify putative TF-binding sites within ACRs. We used MOA-seq on developing maize ears as a proof of concept, able to define a cistrome of 145,000 MOA footprints (MFs). While a substantial majority (76%) of the known ATAC-seq ACRs intersected with the MFs, only a minority of MFs overlapped with the ATAC peaks, indicating that the majority of MFs were novel and not detected by ATAC-seq. MFs were associated with promoters and significantly enriched for TF-binding and long-range chromatin interaction sites, including for the well-characterized FASCIATED EAR4, KNOTTED1, and TEOSINTE BRANCHED1. Importantly, the MOA-seq strategy improved the spatial resolution of TF-binding prediction and allowed us to identify 215 motif families collectively distributed over more than 100,000 non-overlapping, putatively-occupied binding sites across the genome. Our study presents a simple, efficient, and high-resolution approach to identify putative TF footprints and binding motifs genome-wide, to ultimately define a native cistrome atlas.
Understanding gene regulation remains a central goal of modern biology. Delineating the full set of regulatory DNA elements that orchestrate this regulation requires information at two scales: the broad landscape of accessible chromatin, and the site-specific binding of transcription factors (TFs) at discrete cis-regulatory DNA elements. Here we describe a single assay that uses micrococcal nuclease (MNase) as a structural probe to simultaneously reveal regions of accessible chromatin in addition to high-resolution footprints with signatures of TF-occupied cis-elements. We have used maize developing ear tissue as proof of concept, showing the method detects known TF-binding sites. This genome-wide assay not only defines chromatin landscapes, but crucially enables global discovery and mapping of sequence motifs underlying small footprints of ~30 bp to produce an atlas of candidate TF occupancy.
Affiliation(s)
- Savannah D. Savadel
  - Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Thomas Hartwig
  - Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
  - Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Zachary M. Turpin
  - Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Daniel L. Vera
  - Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Pei-Yau Lung
  - Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- Xin Sui
  - Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- Max Blank
  - Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
  - Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Wolf B. Frommer
  - Institute for Molecular Physiologie, Heinrich-Heine-Universität, Düsseldorf, Germany
  - Independent research groups, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Jonathan H. Dennis
  - Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
- Jinfeng Zhang
  - Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- Hank W. Bass
  - Department of Biological Science, Florida State University, Tallahassee, Florida, United States of America
6
Song B, Buckler ES, Wang H, Wu Y, Rees E, Kellogg EA, Gates DJ, Khaipho-Burch M, Bradbury PJ, Ross-Ibarra J, Hufford MB, Romay MC. Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize. Genome Res 2021; 31:1245-1257. [PMID: 34045362 PMCID: PMC8256870 DOI: 10.1101/gr.266528.120] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 05/27/2020] [Accepted: 05/21/2021] [Indexed: 01/16/2023]
Abstract
Thousands of species will be sequenced in the next few years; however, understanding how their genomes work, without an unlimited budget, requires both molecular and novel evolutionary approaches. We developed a sensitive sequence alignment pipeline to identify conserved noncoding sequences (CNSs) in the Andropogoneae tribe (multiple crop species descended from a common ancestor ∼18 million years ago). The Andropogoneae share similar physiology while being tremendously genomically diverse, harboring a broad range of ploidy levels, structural variation, and transposons. These contribute to the potential of Andropogoneae as a powerful system for studying CNSs and are factors we leverage to understand the function of maize CNSs. We found that 86% of CNSs were comprised of annotated features, including introns, UTRs, putative cis-regulatory elements, chromatin loop anchors, noncoding RNA (ncRNA) genes, and several transposable element superfamilies. CNSs were enriched in active regions of DNA replication in the early S phase of the mitotic cell cycle and showed different DNA methylation ratios compared to the genome-wide background. More than half of putative cis-regulatory sequences (identified via other methods) overlapped with CNSs detected in this study. Variants in CNSs were associated with gene expression levels, and CNS absence contributed to loss of gene expression. Furthermore, the evolution of CNSs was associated with the functional diversification of duplicated genes in the context of maize subgenomes. Our results provide a quantitative understanding of the molecular processes governing the evolution of CNSs in maize.
Affiliation(s)
- Baoxing Song
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
- Edward S Buckler
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Hai Wang
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - National Maize Improvement Center, Key Laboratory of Crop Heterosis and Utilization, Joint Laboratory for International Cooperation in Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Yaoyao Wu
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Evan Rees
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Daniel J Gates
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
- Merritt Khaipho-Burch
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Peter J Bradbury
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Jeffrey Ross-Ibarra
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
  - Center for Population Biology and Genome Center, University of California Davis, Davis, California 95616, USA
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa 50011, USA
- M Cinta Romay
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
7
Yocca AE, Lu Z, Schmitz RJ, Freeling M, Edger PP. Evolution of Conserved Noncoding Sequences in Arabidopsis thaliana. Mol Biol Evol 2021; 38:2692-2703. [PMID: 33565589 PMCID: PMC8233505 DOI: 10.1093/molbev/msab042] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/27/2022] [Open access]
Abstract
Recent pangenome studies have revealed a large fraction of the gene content within a species exhibits presence-absence variation (PAV). However, coding regions alone provide an incomplete assessment of functional genomic sequence variation at the species level. Little to no attention has been paid to noncoding regulatory regions in pangenome studies, though these sequences directly modulate gene expression and phenotype. To uncover regulatory genetic variation, we generated chromosome-scale genome assemblies for thirty Arabidopsis thaliana accessions from multiple distinct habitats and characterized species level variation in Conserved Noncoding Sequences (CNS). Our analyses uncovered not only PAV and positional variation (PosV) but that diversity in CNS is nonrandom, with variants shared across different accessions. Using evolutionary analyses and chromatin accessibility data, we provide further evidence supporting roles for conserved and variable CNS in gene regulation. Additionally, our data suggests that transposable elements contribute to CNS variation. Characterizing species-level diversity in all functional genomic sequences may later uncover previously unknown mechanistic links between genotype and phenotype.
Affiliation(s)
- Alan E Yocca
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA; Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Zefu Lu
  - Department of Genetics, University of Georgia, Athens, GA, USA
- Michael Freeling
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Patrick P Edger
  - Department of Horticulture, Michigan State University, East Lansing, MI, USA; Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI, USA
8
Liang Z, Qiu Y, Schnable JC. Genome-Phenome Wide Association in Maize and Arabidopsis Identifies a Common Molecular and Evolutionary Signature. MOLECULAR PLANT 2020; 13:907-922. [PMID: 32171733 DOI: 10.1016/j.molp.2020.03.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/02/2019] [Revised: 01/20/2020] [Accepted: 03/08/2020] [Indexed: 06/10/2023]
Abstract
Linking natural genetic variation to trait variation can help determine the functional roles of different genes. Variations of one or several traits are often assessed separately. High-throughput phenotyping and data mining can capture dozens or hundreds of traits from the same individuals. Here, we test the association between markers within a gene and many traits simultaneously. This genome-phenome wide association study (GPWAS) is both a multi-marker and multi-trait test. Genes identified using GPWAS with 260 phenotypic traits in maize were enriched for genes independently linked to phenotypic variation. Traits associated with classical mutants were consistent with reported phenotypes for mutant alleles. Genes linked to phenomic variation in maize using GPWAS shared molecular, population genetic, and evolutionary features with classical mutants in maize. Genes linked to phenomic variation in Arabidopsis using GPWAS are significantly enriched in genes with known loss-of-function phenotypes. GPWAS may be an effective strategy to identify genes in which loss-of-function alleles produce mutant phenotypes. The shared signatures present in classical mutants and genes identified using GPWAS may be markers for genes with a role in specifying plant phenotypes generally or pleiotropy specifically.
Affiliation(s)
- Zhikai Liang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA; Plant Science Innovation Center, University of Nebraska-Lincoln, Lincoln, NE, USA
- Yumou Qiu
  - Department of Statistics, Iowa State University, Ames, IA, USA
- James C Schnable
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA; Plant Science Innovation Center, University of Nebraska-Lincoln, Lincoln, NE, USA
9
Lai X, Yan L, Lu Y, Schnable JC. Largely unlinked gene sets targeted by selection for domestication syndrome phenotypes in maize and sorghum. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2018; 93:843-855. [PMID: 29265526 DOI: 10.1111/tpj.13806] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/04/2017] [Revised: 11/27/2017] [Accepted: 12/04/2017] [Indexed: 05/14/2023]
Abstract
The domestication of diverse grain crops from wild grasses was a result of artificial selection for a suite of overlapping traits producing changes referred to in aggregate as 'domestication syndrome'. Parallel phenotypic change can be accomplished by either selection on orthologous genes or selection on non-orthologous genes with parallel phenotypic effects. To determine how often artificial selection for domestication traits in the grasses targeted orthologous genes, we employed resequencing data from wild and domesticated accessions of Zea (maize) and Sorghum (sorghum). Many 'classic' domestication genes identified through quantitative trait locus mapping in populations resulting from wild/domesticated crosses indeed show signatures of parallel selection in both maize and sorghum. However, the overall number of genes showing signatures of parallel selection in both species is not significantly different from that expected by chance. This suggests that while a small number of genes with extremely large phenotypic effects have been targeted repeatedly by artificial selection during domestication, the optimization part of domestication targeted small and largely non-overlapping subsets of all possible genes which could produce equivalent phenotypic alterations.
Affiliation(s)
- Xianjun Lai
  - Center for Plant Science Innovation and Department of Agronomy and Horticulture, University of Nebraska-Lincoln, NE, 68588, USA
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, China
- Lang Yan
  - Center for Plant Science Innovation and Department of Agronomy and Horticulture, University of Nebraska-Lincoln, NE, 68588, USA
  - Laboratory of Functional Genome and Application of Potato, Xichang College, Liangshan, 615000, China
  - College of Life Sciences, Sichuan University, Chengdu, 610065, China
- Yanli Lu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, China
- James C Schnable
  - Center for Plant Science Innovation and Department of Agronomy and Horticulture, University of Nebraska-Lincoln, NE, 68588, USA