1
Dai A, Lan W, Lyu Y, Zhou X, Mi X, Tang T, Liufu Z. MicroRNA-mediated network redundancy is constrained by purifying selection and contributes to expression robustness in Drosophila melanogaster. Commun Biol 2024; 7:1431. [PMID: 39496904 PMCID: PMC11535065 DOI: 10.1038/s42003-024-07162-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Accepted: 10/29/2024] [Indexed: 11/06/2024] Open
Abstract
MicroRNAs (miRNAs) are post-transcriptional, non-coding regulatory RNAs that function coordinately with transcription factors (TFs) in gene regulatory networks. TFs and their targets are often co-regulated by miRNAs, forming composite feedforward circuits (cFFCs) with varying degrees of redundancy, primarily mediated by miRNAs. However, the maintenance of miRNA-mediated regulatory redundancy and its impact on gene expression evolution remain elusive. By integrating ChIP-seq data from ENCODE and miRNA targeting from TargetScanFly, we quantified miRNA-mediated cFFC redundancy in Drosophila melanogaster embryos and larvae, revealing more than three quarters of miRNA targets are involved in redundant cFFCs. Higher cFFC redundancy, where more miRNAs target the same gene within a cFFC, is correlated with stronger purifying selection, reduced expression divergence between species, and increased expression stability under heat shock stress. Redundant cFFCs primarily regulate older or broadly expressed young genes. These findings highlight the role of miRNA-mediated cFFC redundancy in enhancing gene expression robustness through natural selection.
Affiliation(s)
- Aimei Dai
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Innovation Center for Evolutionary Synthetic Biology, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Wenqi Lan
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Innovation Center for Evolutionary Synthetic Biology, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Yang Lyu
- Department of Molecular Biology and Biochemistry, Rutgers, the State University of New Jersey, 604 Allison Road, Piscataway, NJ, 08854, USA
- Xuanyi Zhou
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Innovation Center for Evolutionary Synthetic Biology, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Xin Mi
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Innovation Center for Evolutionary Synthetic Biology, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Tian Tang
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China.
- Innovation Center for Evolutionary Synthetic Biology, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China.
- Zhongqi Liufu
- State Key Laboratory of Genetic Resources and Evolution / Yunnan Key Laboratory of Biodiversity Information, Kunming Institute of Zoology, The Chinese Academy of Sciences, Kunming, 650223, China.
2
Zhao L, Svetec N, Begun DJ. De Novo Genes. Annu Rev Genet 2024; 58:211-232. [PMID: 39088850 DOI: 10.1146/annurev-genet-111523-102413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2024]
Abstract
Although the majority of annotated new genes in a given genome appear to have arisen from duplication-related mechanisms, recent studies have shown that genes can also originate de novo from ancestrally nongenic sequences. Investigating de novo-originated genes offers rich opportunities to understand the origin and functions of new genes, their regulatory mechanisms, and the associated evolutionary processes. Such studies have uncovered unexpected and intriguing facets of gene origination, offering novel perspectives on the complexity of the genome and gene evolution. In this review, we provide an overview of the research progress in this field, highlight recent advancements, identify key technical and conceptual challenges, and underscore critical questions that remain to be addressed.
Affiliation(s)
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- David J Begun
- Department of Evolution and Ecology, University of California, Davis, California, USA
3
Li XC, Srinivasan V, Laiker I, Misunou N, Frankel N, Pallares LF, Crocker J. TF-High-Evolutionary: In Vivo Mutagenesis of Gene Regulatory Networks for the Study of the Genetics and Evolution of the Drosophila Regulatory Genome. Mol Biol Evol 2024; 41:msae167. [PMID: 39117360 PMCID: PMC11342961 DOI: 10.1093/molbev/msae167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 07/29/2024] [Accepted: 08/06/2024] [Indexed: 08/10/2024] Open
Abstract
Understanding the evolutionary potential of mutations in gene regulatory networks is essential to furthering the study of evolution and development. However, in multicellular systems, genetic manipulation of regulatory networks in a targeted and high-throughput way remains challenging. In this study, we designed TF-High-Evolutionary (HighEvo), a transcription factor (TF) fused with a base editor (activation-induced deaminase), to continuously induce germline mutations at TF-binding sites across regulatory networks in Drosophila. Populations of flies expressing TF-HighEvo in their germlines accumulated mutations at rates an order of magnitude higher than natural populations. Importantly, these mutations accumulated around the targeted TF-binding sites across the genome, leading to distinct morphological phenotypes consistent with the developmental roles of the tagged TFs. As such, this TF-HighEvo method allows the interrogation of the mutational space of gene regulatory networks at scale and can serve as a powerful reagent for experimental evolution and genetic screens focused on the regulatory genome.
Affiliation(s)
- Xueying C Li
- European Molecular Biology Laboratory, Heidelberg, Germany
- Ian Laiker
- Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) y Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Nicolás Frankel
- Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) y Universidad de Buenos Aires (UBA), Buenos Aires 1428, Argentina
- Luisa F Pallares
- Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany
- Justin Crocker
- European Molecular Biology Laboratory, Heidelberg, Germany
4
Lee U, Li C, Langer CB, Svetec N, Zhao L. Comparative Single Cell Analysis of Transcriptional Bursting Reveals the Role of Genome Organization on de novo Transcript Origination. bioRxiv 2024:2024.04.29.591771. [PMID: 38746255 PMCID: PMC11092510 DOI: 10.1101/2024.04.29.591771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Spermatogenesis is a key developmental process underlying the origination of newly evolved genes. However, rapid cell type-specific transcriptomic divergence of the Drosophila germline has posed a significant technical barrier for comparative single-cell RNA-sequencing (scRNA-Seq) studies. By quantifying a surprisingly strong correlation between species- and cell type-specific divergence in three closely related Drosophila species, we apply a simple statistical procedure to identify a core set of 198 genes that are highly predictive of cell type identity while remaining robust to species-specific differences that span over 25-30 million years of evolution. We then utilize cell type classifications based on the 198-gene set to show how transcriptional divergence in cell type increases throughout spermatogenic developmental time, contrasting with traditional hourglass models of whole-organism development. With these cross-species cell type classifications, we then investigate the influence of genome organization on the molecular evolution of spermatogenesis vis-a-vis transcriptional bursting. We first demonstrate how mechanistic control of pre-meiotic transcription is achieved by altering transcriptional burst size while post-meiotic control is exerted via altered bursting frequency. We then report how global differences in autosomal vs. X chromosomal transcription likely arise in a developmental stage preceding full testis organogenesis by showing evolutionarily conserved decreases in X-linked transcription bursting kinetics in all examined somatic and germline cell types. Finally, we provide evidence supporting the cultivator model of de novo gene origination by demonstrating how the appearance of newly evolved testis-specific transcripts potentially provides short-range regulation of the transcriptional bursting properties of neighboring genes during key stages of spermatogenesis.
5
Guo X, Wang C, Zhang Y, Wei R, Xi R. Cell-fate conversion of intestinal cells in adult Drosophila midgut by depleting a single transcription factor. Nat Commun 2024; 15:2656. [PMID: 38531872 DOI: 10.1038/s41467-024-46956-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 03/14/2024] [Indexed: 03/28/2024] Open
Abstract
The manipulation of cell identity by reprogramming holds immense potential in regenerative medicine, but is often limited by the inefficient acquisition of fully functional cells. This problem can potentially be resolved by better understanding the reprogramming process using in vivo genetic models, which are currently scarce. Here we report that both enterocytes (ECs) and enteroendocrine cells (EEs) in adult Drosophila midgut show a surprising degree of cell plasticity. Depleting the transcription factor Tramtrack in the differentiated ECs can initiate Prospero-mediated cell transdifferentiation, leading to EE-like cells. On the other hand, depletion of Prospero in the differentiated EEs can lead to the loss of EE-specific transcription programs and the gain of intestinal progenitor cell identity, allowing cell cycle re-entry or differentiation into ECs. We find that intestinal progenitor cells, ECs, and EEs have a similar chromatin accessibility profile, supporting the concept that cell plasticity is enabled by pre-existing chromatin accessibility with switchable transcription programs. Further genetic analysis with this system reveals that the NuRD chromatin remodeling complex, cell lineage confliction, and age act as barriers to EC-to-EE transdifferentiation. The establishment of this genetically tractable in vivo model should facilitate mechanistic investigation of cell plasticity at the molecular and genetic level.
Affiliation(s)
- Xingting Guo
- National Institute of Biological Sciences, No. 7 Science Park Road, Zhongguancun Life Science Park, Beijing, 102206, China
- Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, 102206, China
- Chenhui Wang
- National Institute of Biological Sciences, No. 7 Science Park Road, Zhongguancun Life Science Park, Beijing, 102206, China.
- School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, China.
- Yongchao Zhang
- National Institute of Biological Sciences, No. 7 Science Park Road, Zhongguancun Life Science Park, Beijing, 102206, China
- Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, 102206, China
- Ruxue Wei
- National Institute of Biological Sciences, No. 7 Science Park Road, Zhongguancun Life Science Park, Beijing, 102206, China
- Rongwen Xi
- National Institute of Biological Sciences, No. 7 Science Park Road, Zhongguancun Life Science Park, Beijing, 102206, China.
- Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, 102206, China.
6
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. Nat Commun 2024; 15:810. [PMID: 38280868 PMCID: PMC10821953 DOI: 10.1038/s41467-024-45028-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 02/23/2023] [Accepted: 01/09/2024] [Indexed: 01/29/2024] Open
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with their gene ages. Surprisingly, we find little overall protein structural changes in candidates from the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates are often born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA.
7
Yin Z, Ding G, Xue Y, Yu X, Dong J, Huang J, Ma J, He F. A postmeiotically bifurcated roadmap of honeybee spermatogenesis marked by phylogenetically restricted genes. PLoS Genet 2023; 19:e1011081. [PMID: 38048317 PMCID: PMC10721206 DOI: 10.1371/journal.pgen.1011081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/14/2023] [Accepted: 11/22/2023] [Indexed: 12/06/2023] Open
Abstract
Haploid males of hymenopteran species produce gametes through an abortive meiosis I followed by meiosis II that can either be symmetric or asymmetric in different species. Thus, one spermatocyte could give rise to two spermatids with either equal or unequal amounts of cytoplasm. It is currently unknown what molecular features accompany these postmeiotic sperm cells especially in species with asymmetric meiosis II such as bees. Here we present testis single-cell RNA sequencing datasets from the honeybee (Apis mellifera) drones of 3 and 14 days after emergence (3d and 14d). We show that, while 3d testes exhibit active, ongoing spermatogenesis, 14d testes only have late-stage spermatids. We identify a postmeiotic bifurcation in the transcriptional roadmap during spermatogenesis, with cells progressing toward the annotated spermatids (SPT) and small spermatids (sSPT), respectively. Despite an overall similarity in their transcriptomic profiles, sSPTs express the fewest genes and the least RNA content among all the sperm cell types. Intriguingly, sSPTs exhibit a relatively high expression level for Hymenoptera-restricted genes and a high mutation load, suggesting that the special meiosis II during spermatogenesis in the honeybee is accompanied by phylogenetically young gene activities.
Affiliation(s)
- Zhiyong Yin
- Center for Genetic Medicine, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Guiling Ding
- State Key Laboratory of Resource Insects, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Key Laboratory for Insect-Pollinator Biology of the Ministry of Agriculture and Rural Affairs, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Yingdi Xue
- Center for Genetic Medicine, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Xianghui Yu
- Center for Genetic Medicine, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Jie Dong
- Institute of Animal Husbandry and Veterinary Science, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Jiaxing Huang
- State Key Laboratory of Resource Insects, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Key Laboratory for Insect-Pollinator Biology of the Ministry of Agriculture and Rural Affairs, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Jun Ma
- Center for Genetic Medicine, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Institute of Genetics, Zhejiang University International School of Medicine, Hangzhou, Zhejiang, China
- Zhejiang Provincial Key Laboratory of Genetic and Developmental Disorder, Hangzhou, Zhejiang, China
- Feng He
- Center for Genetic Medicine, the Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Institute of Genetics, Zhejiang University International School of Medicine, Hangzhou, Zhejiang, China
- Zhejiang Provincial Key Laboratory of Genetic and Developmental Disorder, Hangzhou, Zhejiang, China
8
Clifton BD, Hariyani I, Kimura A, Luo F, Nguyen A, Ranz JM. Paralog transcriptional differentiation in the D. melanogaster-specific gene family Sdic across populations and spermatogenesis stages. Commun Biol 2023; 6:1069. [PMID: 37864070 PMCID: PMC10589255 DOI: 10.1038/s42003-023-05427-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 10/05/2023] [Indexed: 10/22/2023] Open
Abstract
How recently originated gene copies become stable genomic components remains uncertain as high sequence similarity of young duplicates precludes their functional characterization. The tandem multigene family Sdic is specific to Drosophila melanogaster and has been annotated across multiple reference-quality genome assemblies. Here we show the existence of a positive correlation between Sdic copy number and total expression, plus vast intrastrain differences in mRNA abundance among paralogs, using RNA-sequencing from testis of four strains with variable paralog composition. Single cell and nucleus RNA-sequencing data expose paralog expression differentiation in meiotic cell types within testis from third instar larva and adults. Additional RNA-sequencing across synthetic strains only differing in their Y chromosomes reveal a tissue-dependent trans-regulatory effect on Sdic: upregulation in testis and downregulation in male accessory gland. By leveraging paralog-specific expression information from tissue- and cell-specific data, our results elucidate the intraspecific functional diversification of a recently expanded tandem gene family.
Affiliation(s)
- Bryan D Clifton
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA.
- Imtiyaz Hariyani
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Ashlyn Kimura
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Fangning Luo
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- Alvin Nguyen
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA
- José M Ranz
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA, 92697, USA.
9
Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. Genome Biol 2023; 24:232. [PMID: 37845780 PMCID: PMC10578003 DOI: 10.1186/s13059-023-03079-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 09/29/2023] [Indexed: 10/18/2023] Open
Abstract
BACKGROUND The evolution of genomic regulatory regions plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems complicates the understanding of the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different species and tissues of Drosophila. RESULTS We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that our models generalize well across substantially evolutionarily diverged species of insects, implying that the sequence determinants of accessibility are highly conserved. Using our model to examine species-specific gains in accessibility, we find evidence suggesting that these regions may be ancestrally poised for evolution. Using in silico mutagenesis, we show that accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that accessibility is mutationally robust. Subsequently, we show that accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. Conversely, simulations under strong selection demonstrate that accessibility can be extremely malleable despite its robustness. Finally, we identify motifs predictive of accessibility, recovering both novel and previously known motifs. CONCLUSIONS These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks to explore fundamental questions in regulatory genomics and evolution.
Affiliation(s)
- Samuel Khodursky
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Eric B Zheng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Sylvia M Durkin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Present Address: Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA, USA
- Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Alice Gadau
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Xia Wu
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA.
10
Puixeu G, Macon A, Vicoso B. Sex-specific estimation of cis and trans regulation of gene expression in heads and gonads of Drosophila melanogaster. G3 (Bethesda) 2023; 13:jkad121. [PMID: 37259621 PMCID: PMC10411594 DOI: 10.1093/g3journal/jkad121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/17/2023] [Accepted: 05/17/2023] [Indexed: 06/02/2023]
Abstract
The regulatory architecture of gene expression is known to differ substantially between sexes in Drosophila, but most studies performed so far used whole-body data and only single crosses, which may have limited their scope to detect patterns that are robust across tissues and biological replicates. Here, we use allele-specific gene expression of parental and reciprocal hybrid crosses between 6 Drosophila melanogaster inbred lines to quantify cis- and trans-regulatory variation in heads and gonads of both sexes separately across 3 replicate crosses. Our results suggest that female and male heads, as well as ovaries, have a similar regulatory architecture. On the other hand, testes display more and substantially different cis-regulatory effects, suggesting that sex differences in the regulatory architecture that have been previously observed may largely derive from testis-specific effects. We also examine the difference in cis-regulatory variation of genes across different levels of sex bias in gonads and heads. Consistent with the idea that intersex correlations constrain expression and can lead to sexual antagonism, we find more cis variation in unbiased and moderately biased genes in heads. In ovaries, reduced cis variation is observed for male-biased genes, suggesting that cis variants acting on these genes in males do not lead to changes in ovary expression. Finally, we examine the dominance patterns of gene expression and find that sex- and tissue-specific patterns of inheritance as well as trans-regulatory variation are highly variable across biological crosses, although these were performed in highly controlled experimental conditions. This highlights the importance of using various genetic backgrounds to infer generalizable patterns.
Affiliation(s)
- Gemma Puixeu
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
- Ariana Macon
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
- Beatriz Vicoso
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
11
Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. bioRxiv 2023:2023.06.26.546587. [PMID: 37425760 PMCID: PMC10327059 DOI: 10.1101/2023.06.26.546587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
The evolution of regulatory regions in the genome plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems has made it difficult to understand the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different tissues of Drosophila. We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that a model trained in one species has nearly identical performance when tested in another species, implying that the sequence determinants of accessibility are highly conserved. Indeed, model performance remains excellent even in distantly-related species. By using our model to examine species-specific gains in chromatin accessibility, we find that their orthologous inaccessible regions in other species have surprisingly similar model outputs, suggesting that these regions may be ancestrally poised for evolution. We then use in silico saturation mutagenesis to reveal evidence of selective constraint acting specifically on inaccessible chromatin regions. We further show that chromatin accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that chromatin accessibility is mutationally robust. Subsequently, we demonstrate that chromatin accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. We also perform in silico evolution experiments under the regime of strong selection and weak mutation (SSWM) and show that chromatin accessibility can be extremely malleable despite its mutational robustness. However, selection acting in different directions in a tissue-specific manner can substantially slow adaptation. 
Finally, we identify motifs predictive of chromatin accessibility and recover motifs corresponding to known chromatin accessibility activators and repressors. These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks as tools to answer fundamental questions in regulatory genomics and evolution.
Affiliation(s)
- Samuel Khodursky
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- These authors contributed equally
- Eric B Zheng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- These authors contributed equally
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Sylvia M Durkin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Current Address: Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA, USA
| | - Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Alice Gadau
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Xia Wu
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| |
Collapse
|
12
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. bioRxiv 2023:2023.03.13.532420. [PMID: 37425675 PMCID: PMC10326970 DOI: 10.1101/2023.03.13.532420] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/11/2023]
Abstract
Although previously thought to be unlikely, recent studies have shown that de novo gene origination from previously non-genic sequences is a relatively common mechanism for gene innovation in many species and taxa. These young genes provide a unique set of candidates to study the structural and functional origination of proteins. However, our understanding of their protein structures and of how these structures originate and evolve is still limited, due to a lack of systematic studies. Here, we combined high-quality base-level whole genome alignments, bioinformatic analysis, and computational structure modeling to study the origination, evolution, and protein structure of lineage-specific de novo genes. We identified 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. We found a gradual shift in sequence composition, evolutionary rates, and expression patterns with their gene ages, which indicates possible gradual shifts or adaptations of their functions. Surprisingly, we found little overall protein structural change for de novo genes in the Drosophilinae lineage. Using AlphaFold2, ESMFold, and molecular dynamics, we identified a number of de novo gene candidates with protein products that are potentially well-folded, many of which are more likely to contain transmembrane and signal proteins compared to other annotated protein-coding genes. Using ancestral sequence reconstruction, we found that most potentially well-folded proteins are born folded. Interestingly, we observed one case where disordered ancestral proteins become ordered within a relatively short evolutionary time. Single-cell RNA-seq analysis in testis showed that although most de novo genes are enriched in spermatocytes, several young de novo genes are biased in the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
|
13
|
Huynh K, Smith BR, Macdonald SJ, Long AD. Genetic variation in chromatin state across multiple tissues in Drosophila melanogaster. PLoS Genet 2023; 19:e1010439. [PMID: 37146087 PMCID: PMC10191298 DOI: 10.1371/journal.pgen.1010439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 05/17/2023] [Accepted: 04/20/2023] [Indexed: 05/07/2023] Open
Abstract
We use ATAC-seq to examine chromatin accessibility for four different tissues in Drosophila melanogaster: adult female brain, ovaries, and both wing and eye-antennal imaginal discs from males. Each tissue is assayed in eight different inbred strain genetic backgrounds, seven associated with a reference quality genome assembly. We develop a method for the quantile normalization of ATAC-seq fragments and test for differences in coverage among genotypes, tissues, and their interaction at 44099 peaks throughout the euchromatic genome. For the strains with reference quality genome assemblies, we correct ATAC-seq profiles for read mis-mapping due to nearby polymorphic structural variants (SVs). Comparing coverage among genotypes without accounting for SVs results in a highly elevated rate (55%) of identifying false positive differences in chromatin state between genotypes. After SV correction, we identify 1050, 30383, and 4508 regions whose peak heights are polymorphic among genotypes, among tissues, or exhibit genotype-by-tissue interactions, respectively. Finally, we identify 3988 candidate causative variants that explain at least 80% of the variance in chromatin state at nearby ATAC-seq peaks.
Affiliation(s)
- Khoi Huynh
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California, United States of America
- Brittny R. Smith
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Stuart J. Macdonald
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Center for Computational Biology, University of Kansas, Lawrence, Kansas, United States of America
- Anthony D. Long
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California, United States of America
|
14
|
Transcriptional and mutational signatures of the Drosophila ageing germline. Nat Ecol Evol 2023; 7:440-449. [PMID: 36635344 PMCID: PMC10291629 DOI: 10.1038/s41559-022-01958-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/19/2021] [Accepted: 11/24/2022] [Indexed: 01/14/2023]
Abstract
Ageing is a complex biological process that is accompanied by changes in gene expression and mutational load. In many species, including humans, older fathers pass on more paternally derived de novo mutations; however, the cellular basis and cell types driving this pattern are still unclear. To explore the root causes of this phenomenon, we performed single-cell RNA sequencing on testes from young and old male Drosophila and genomic sequencing (DNA sequencing) on somatic tissues from the same flies. We found that early germ cells from old and young flies enter spermatogenesis with similar mutational loads but older flies are less able to remove mutations during spermatogenesis. Mutations in old cells may also increase during spermatogenesis. Our data reveal that old and young flies have distinct mutational biases. Many classes of genes show increased postmeiotic expression in the germlines of older flies. Late spermatogenesis-biased genes have higher dN/dS (ratio of non-synonymous to synonymous substitutions) than early spermatogenesis-biased genes, supporting the hypothesis that late spermatogenesis is a source of evolutionary innovation. Surprisingly, genes biased in young germ cells show higher dN/dS than genes biased in old germ cells. Our results provide new insights into the role of the germline in de novo mutation.
|
15
|
Venkataraman K, Shai N, Lakhiani P, Zylka S, Zhao J, Herre M, Zeng J, Neal LA, Molina H, Zhao L, Vosshall LB. Two novel, tightly linked, and rapidly evolving genes underlie Aedes aegypti mosquito reproductive resilience during drought. eLife 2023; 12:e80489. [PMID: 36744865 PMCID: PMC10076016 DOI: 10.7554/elife.80489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Accepted: 01/29/2023] [Indexed: 02/07/2023] Open
Abstract
Female Aedes aegypti mosquitoes impose a severe global public health burden as vectors of multiple viral pathogens. Under optimal environmental conditions, Aedes aegypti females have access to human hosts that provide blood proteins for egg development, conspecific males that provide sperm for fertilization, and freshwater that serves as an egg-laying substrate suitable for offspring survival. As global temperatures rise, Aedes aegypti females are faced with climate challenges like intense droughts and intermittent precipitation, which create unpredictable, suboptimal conditions for egg-laying. Here, we show that under drought-like conditions simulated in the laboratory, females retain mature eggs in their ovaries for extended periods, while maintaining the viability of these eggs until they can be laid in freshwater. Using transcriptomic and proteomic profiling of Aedes aegypti ovaries, we identify two previously uncharacterized genes named tweedledee and tweedledum, each encoding a small, secreted protein that both show ovary-enriched, temporally-restricted expression during egg retention. These genes are mosquito-specific, linked within a syntenic locus, and rapidly evolving under positive selection, raising the possibility that they serve an adaptive function. CRISPR-Cas9 deletion of both tweedledee and tweedledum demonstrates that they are specifically required for extended retention of viable eggs. These results highlight an elegant example of taxon-restricted genes at the heart of an important adaptation that equips Aedes aegypti females with 'insurance' to flexibly extend their reproductive schedule without losing reproductive capacity, thus allowing this species to exploit unpredictable habitats in a changing world.
Affiliation(s)
- Krithika Venkataraman
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Nadav Shai
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Howard Hughes Medical Institute, New York, United States
- Priyanka Lakhiani
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Laboratory of Evolutionary Genetics and Genomics, Rockefeller University, New York, United States
- Sarah Zylka
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Jieqing Zhao
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Margaret Herre
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Kavli Neural Systems Institute, New York, United States
- Joshua Zeng
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Lauren A Neal
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Henrik Molina
- Proteomics Resource Center, Rockefeller University, New York, United States
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, Rockefeller University, New York, United States
- Leslie B Vosshall
- Laboratory of Neurogenetics and Behavior, Rockefeller University, New York, United States
- Howard Hughes Medical Institute, New York, United States
- Kavli Neural Systems Institute, New York, United States
|
16
|
Ma J, Jiang Y, Pei W, Wu M, Ma Q, Liu J, Song J, Jia B, Liu S, Wu J, Zhang J, Yu J. Expressed genes and their new alleles identification during fibre elongation reveal the genetic factors underlying improvements of fibre length in cotton. Plant Biotechnology Journal 2022; 20:1940-1955. [PMID: 35718938 PMCID: PMC9491459 DOI: 10.1111/pbi.13874] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/28/2021] [Revised: 05/29/2022] [Accepted: 06/11/2022] [Indexed: 05/27/2023]
Abstract
Interspecific breeding in cotton takes advantage of genetic recombination among desirable genes from different parental lines. However, the expression new alleles (ENAs) from crossovers within genic regions and their significance in fibre length (FL) improvement are currently not understood. Here, we generated resequencing genomes of 191 interspecific backcross inbred lines (BILs) derived from CRI36 (Gossypium hirsutum) × Hai7124 (Gossypium barbadense) and 277 dynamic fibre transcriptomes to identify the ENAs and extremely expressed genes (eGenes) potentially influencing FL, and uncovered the dynamic regulatory network of fibre elongation. Of 35 420 eGenes in developing fibres, 10 366 ENAs were identified and preferentially distributed in subtelomeric regions of chromosomes. In total, 1056-1255 ENAs showed transgressive expression in fibres at 5-15 dpa (days post-anthesis) of some BILs, 520 of which were located in FL quantitative trait loci (QTLs), and GhFLA9 (recombination allele) was identified as having a larger effect on FL than the GhFLA9 CRI36 allele. Using ENAs as a type of marker, we identified three novel FL-QTLs. Additionally, 456 extremely eGenes were identified that were preferentially distributed in recombination hotspots. Importantly, 34 of them were significantly associated with FL. Gene expression quantitative trait locus analysis identified 1286, 1089 and 1059 eGenes that were colocalized with the FL trait at 5, 10 and 15 dpa, respectively. Finally, we verified, using the CRISPR-Cas9 system, that the Ghir_D10G011050 gene is linked to fibre elongation. This study provides the first glimpse into the occurrence, distribution and expression of developing-fibre genes (especially ENAs) in an introgression population, and their possible biological significance in FL.
Affiliation(s)
- Jianjiang Ma
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
- Yafei Jiang
- Novogene Bioinformatics Institute, Beijing, China
- Wenfeng Pei
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Man Wu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Qifeng Ma
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Ji Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Jikun Song
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Bing Jia
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Shang Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Jianyong Wu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
- Jinfa Zhang
- Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, New Mexico, USA
- Jiwen Yu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
|