1
Du D, Zhong F, Liu L. Enhancing recognition and interpretation of functional phenotypic sequences through fine-tuning pre-trained genomic models. J Transl Med 2024; 22:756. [PMID: 39135093 PMCID: PMC11318145 DOI: 10.1186/s12967-024-05567-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 08/03/2024] [Indexed: 08/16/2024]
Abstract
BACKGROUND Decoding human genomic sequences requires comprehensive analysis of DNA sequence functionality. Through computational and experimental approaches, researchers have studied the genotype-phenotype relationship and generated important datasets that help unravel complicated genetic blueprints. Thus, the recently developed artificial intelligence methods can be used to interpret the functions of those DNA sequences.
METHODS This study explores the use of deep learning, particularly pre-trained genomic models like DNA_bert_6 and human_gpt2-v1, in interpreting and representing human genome sequences. Initially, we meticulously constructed multiple datasets linking genotypes and phenotypes to fine-tune those models for precise DNA sequence classification. Additionally, we evaluate the influence of sequence length on classification results and analyze the impact of feature extraction in the hidden layers of our model using the HERV dataset. To enhance our understanding of phenotype-specific patterns recognized by the model, we perform enrichment, pathogenicity and conservation analyses of specific motifs in the human endogenous retrovirus (HERV) sequence with high average local representation weight (ALRW) scores.
RESULTS We have constructed multiple genotype-phenotype datasets displaying commendable classification performance in comparison with random genomic sequences, particularly in the HERV dataset, which achieved binary and multi-classification accuracies and F1 values exceeding 0.935 and 0.888, respectively. Notably, the fine-tuning of the HERV dataset not only improved our ability to identify and distinguish diverse information types within DNA sequences but also successfully identified specific motifs associated with neurological disorders and cancers in regions with high ALRW scores. Subsequent analysis of these motifs shed light on the adaptive responses of species to environmental pressures and their co-evolution with pathogens.
CONCLUSIONS These findings highlight the potential of pre-trained genomic models in learning DNA sequence representations, particularly when utilizing the HERV dataset, and provide valuable insights for future research endeavors. This study represents an innovative strategy that combines pre-trained genomic model representations with classical methods for analyzing the functionality of genome sequences, thereby promoting cross-fertilization between genomics and artificial intelligence.
Affiliation(s)
- Duo Du
  - School of Basic Medical Sciences and Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
- Fan Zhong
  - School of Basic Medical Sciences and Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
- Lei Liu
  - School of Basic Medical Sciences and Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
  - Shanghai Institute of Stem Cell Research and Clinical Translation, Shanghai, 200120, China
2
Shamshad A, Rashid M, Zaman QU. In-silico analysis of heat shock transcription factor (OsHSF) gene family in rice (Oryza sativa L.). BMC Plant Biology 2023; 23:395. [PMID: 37592226 PMCID: PMC10433574 DOI: 10.1186/s12870-023-04399-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/03/2023] [Accepted: 08/03/2023] [Indexed: 08/19/2023]
Abstract
BACKGROUND Rice (Oryza sativa L.) is one of the most important cash crops worldwide. Under varying climatic conditions, however, its yield is negatively affected. To create rice varieties that are resilient to abiotic stress, it is essential to explore the factors that control rice growth and development and are sources of resistance. HSFs (heat shock transcription factors) control a variety of plant biological processes and responses to environmental stress. In-silico analysis offers a platform for thorough genome-wide identification of OsHSF genes in the rice genome.
RESULTS In this study, 25 randomly dispersed HSF genes with significant DNA binding domains (DBD) were found in the rice genome. According to a gene structural analysis, all members of the OsHSF family share Gly-66, Phe-67, Lys-69, Trp-75, Glu-76, Phe-77, Ala-78, Phe-82, Ile-93, and Arg-96. Rice HSF family genes are widely distributed in the vegetative organs, first in the roots and then in the leaf and stem; in contrast, in reproductive tissues, the embryo and lemma exhibit the highest levels of gene expression. According to chromosomal localization, tandem duplication and repetition may have aided in the development of novel genes in the rice genome. According to gene networking analyses, OsHSFs have a significant role in the regulation of gene expression and primary metabolism and in tolerance to environmental stress.
CONCLUSION Six genes, viz. Os01g39020, Os01g53220, Os03g25080, Os01g54550, Os02g13800 and Os10g28340, were annotated as promising genes. This study provides novel insights for functional studies on the OsHSFs in rice breeding programs. With the ultimate goal of enhancing crops, the data collected in this survey will be valuable for performing genomic research to pinpoint the specific function of the HSF gene during stress responses.
Affiliation(s)
- Areeqa Shamshad
  - Nuclear Institute for Agriculture and Biology College, Pakistan Institute of Engineering and Applied Sciences (NIAB-C, PIEAS), Faisalabad, Pakistan
- Muhammad Rashid
  - Nuclear Institute for Agriculture and Biology College, Pakistan Institute of Engineering and Applied Sciences (NIAB-C, PIEAS), Faisalabad, Pakistan
- Qamar Uz Zaman
  - Department of Environmental Sciences, The University of Lahore, Lahore, 54590, Pakistan
3
Role of the Heme Activator Protein Complex in the Sexual Development of Cryptococcus neoformans. mSphere 2022; 7:e0017022. [PMID: 35638350 PMCID: PMC9241503 DOI: 10.1128/msphere.00170-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
The CCAAT-binding heme activator protein (HAP) complex, comprising the DNA-binding heterotrimeric complex Hap2/3/5 and transcriptional activation subunit HapX, is a key regulator of iron homeostasis, mitochondrial functions, and pathogenicity in Cryptococcus neoformans, which causes fatal meningoencephalitis. However, its role in the development of human fungal pathogens remains unclear. To elucidate the role of the HAP complex in C. neoformans development, we constructed hap2Δ, hap3Δ, hap5Δ, and hapXΔ mutants and their complemented congenic MATα H99 and MATa YL99a strains. The HAP complex plays a conserved role in iron utilization and stress responses in cells of both mating types. Deletion of any of the HAP complex components markedly enhances filamentation during bisexual mating. However, the Hap2/3/5 complex, but not HapX, is crucial in repressing pheromone production and cell fusion and is thus a critical repressor of sexual differentiation of C. neoformans. Interestingly, deletion of the heterotrimeric complex transcriptionally regulated both positive and negative regulators in the pheromone-responsive Cpk1 mitogen-activated protein kinase (MAPK) pathway. Chromatin immunoprecipitation-quantitative PCR analysis revealed that the HAP complex physically bound to the CCAAT motif of the CRG1 and GPA2 promoter regions. Notably, the HAP complex was differentially localized depending on the mating type in basal conditions; it was enriched in the nuclei of MATα cells but diffused in the cytoplasm of MATa cells. Interestingly, however, a portion of the HAP complex in both mating types relocalized to the cell membrane during mating. In conclusion, the Hap2/3/5 heterotrimeric complex and HapX play major and minor roles, respectively, in repressing the sexual development of C. neoformans in association with the Cpk1 MAPK pathway. 
IMPORTANCE Cryptococcus neoformans isolates are of two mating types: MATα strains, which are predominant, and MATa strains, isolated from the sub-Saharan African region, where cryptococcosis is most abundant and severe. Here, we demonstrated the function of the CCAAT-binding HAP complex (Hap2/3/5/X) as a transcriptional repressor of Cpk1 pathway-related genes in cells of both mating types. Deletion of any HAP complex component markedly enhanced filamentation without affecting normal sporulation. In particular, deletion of the DNA-binding HAP complex components (Hap2/3/5), but not HapX, markedly enhanced pheromone production and cell fusion efficiency, validating its repressive role in the early stage of mating in C. neoformans. The HAP complex regulates the expression of both negative and positive mating regulators and is thus crucial for the regulation of the Cpk1 MAPK pathway during mating. This study provides insights into the complex signaling networks governing the sexual differentiation of C. neoformans.
4
Mishra M, Rathore RS, Joshi R, Pareek A, Singla-Pareek SL. DTH8 overexpression induces early flowering, boosts yield, and improves stress recovery in rice cv IR64. Physiologia Plantarum 2022; 174:e13691. [PMID: 35575899 DOI: 10.1111/ppl.13691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Revised: 04/17/2022] [Accepted: 04/27/2022] [Indexed: 06/15/2023]
Abstract
Rice yield and heading date are two discrete traits controlled by quantitative trait loci (QTLs). Both traits are influenced by the genetic make-up of the plant as well as by the environment in which it grows. Drought and salinity adversely affect crop productivity in many parts of the world. Tolerance to these stresses is multigenic and complex in nature. In this study, we have characterized a QTL, DTH8 (days to heading), from Oryza sativa L. cv IR64 that encodes a putative HAP3/NF-YB/CBF subunit of the CCAAT-box binding protein (HAP complex). We demonstrate that DTH8 positively influences yield, heading date, and stress tolerance in IR64. DTH8 up-regulates the transcription of RFT1, Hd3a, GHD7, MOC1, and RCN1 in IR64 at the pre-flowering stage and plays a role in early flowering, increased number of tillers, enhanced panicle branching, and improved tolerance towards drought and salinity stress at the reproductive stage. The presence of DTH8 binding elements (CCAAT) in the promoter regions of all of these genes, predicted by in silico analysis, indicates that DTH8 regulates their expression. In addition, DTH8-overexpressing transgenic lines showed favorable physiological parameters, resulting in a smaller yield penalty under stress than in the WT plants. Taken together, DTH8 is a positive regulator of the network of genes related to early flowering/heading, higher yield, and salinity and drought stress tolerance, thus enabling the crops to adapt to a wide range of climatic conditions.
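The in silico search for CCAAT elements in promoter regions mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not describe; the function names and the example sequence are hypothetical, and a plain string scan stands in for whatever cis-element tool was actually used. Both strands are checked, because a CCAAT box on the reverse strand reads ATTGG on the forward strand.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(promoter: str, motif: str = "CCAAT") -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every motif hit on either strand."""
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one position so overlapping hits are not missed.
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical promoter fragment, made up for illustration only.
example_promoter = "TTGACCAATCGGATTGGTA"
print(find_motif(example_promoter))  # [(4, '+'), (12, '-')]
```

A literal match like this only flags candidate sites; whether a given CCAAT box is bound by DTH8 would still need experimental support.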
Affiliation(s)
- Manjari Mishra
  - Plant Stress Biology, International Center for Genetic Engineering and Biotechnology, New Delhi, India
- Ray Singh Rathore
  - Plant Stress Biology, International Center for Genetic Engineering and Biotechnology, New Delhi, India
- Rohit Joshi
  - Plant Stress Biology, International Center for Genetic Engineering and Biotechnology, New Delhi, India
- Ashwani Pareek
  - Stress Physiology and Molecular Biology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Sneh Lata Singla-Pareek
  - Plant Stress Biology, International Center for Genetic Engineering and Biotechnology, New Delhi, India
5
John E, Singh KB, Oliver RP, Tan K. Transcription factor control of virulence in phytopathogenic fungi. Molecular Plant Pathology 2021; 22:858-881. [PMID: 33973705 PMCID: PMC8232033 DOI: 10.1111/mpp.13056] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/22/2020] [Revised: 03/02/2021] [Accepted: 03/04/2021] [Indexed: 05/12/2023]
Abstract
Plant-pathogenic fungi are a significant threat to economic and food security worldwide. Novel protection strategies are required and therefore it is critical we understand the mechanisms by which these pathogens cause disease. Virulence factors and pathogenicity genes have been identified, but in many cases their roles remain elusive. It is becoming increasingly clear that gene regulation is vital to enable plant infection and transcription factors play an essential role. Efforts to determine their regulatory functions in plant-pathogenic fungi have expanded since the annotation of fungal genomes revealed the ubiquity of transcription factors from a broad range of families. This review establishes the significance of transcription factors as regulatory elements in plant-pathogenic fungi and provides a systematic overview of those that have been functionally characterized. Detailed analysis is provided on regulators from well-characterized families controlling various aspects of fungal metabolism, development, stress tolerance, and the production of virulence factors such as effectors and secondary metabolites. This covers conserved transcription factors with either specialized or nonspecialized roles, as well as recently identified regulators targeting key virulence pathways. Fundamental knowledge of transcription factor regulation in plant-pathogenic fungi provides avenues to identify novel virulence factors and improve our understanding of the regulatory networks linked to pathogen evolution, while transcription factors can themselves be specifically targeted for disease control. Areas requiring further insight regarding the molecular mechanisms and/or specific classes of transcription factors are identified, and direction for future investigation is presented.
Affiliation(s)
- Evan John
  - Centre for Crop and Disease Management, Curtin University, Bentley, Western Australia, Australia
  - School of Molecular and Life Sciences, Curtin University, Bentley, Western Australia, Australia
- Karam B. Singh
  - Agriculture and Food, Commonwealth Scientific and Industrial Research Organisation, Floreat, Western Australia, Australia
- Richard P. Oliver
  - School of Molecular and Life Sciences, Curtin University, Bentley, Western Australia, Australia
- Kar-Chun Tan
  - Centre for Crop and Disease Management, Curtin University, Bentley, Western Australia, Australia
  - School of Molecular and Life Sciences, Curtin University, Bentley, Western Australia, Australia
6
Tini F, Beccari G, Marconi G, Porceddu A, Sulyok M, Gardiner DM, Albertini E, Covarelli L. Identification of Putative Virulence Genes by DNA Methylation Studies in the Cereal Pathogen Fusarium graminearum. Cells 2021; 10:1192. [PMID: 34068122 PMCID: PMC8152758 DOI: 10.3390/cells10051192] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/19/2021] [Revised: 05/03/2021] [Accepted: 05/10/2021] [Indexed: 01/17/2023]
Abstract
DNA methylation mediates organisms’ adaptations to environmental changes in a wide range of species. We investigated whether such a strategy is also adopted by Fusarium graminearum in regulating virulence toward its natural hosts. A virulent strain of this fungus was sub-cultured 50 consecutive times (once a week) on potato dextrose agar. To assess the effect of subculturing on virulence, wheat seedlings and heads (cv. A416) were inoculated with subcultures (SC) 1, 23, and 50. SC50 was also used to re-infect wheat heads three times (SC50×3) to restore virulence. In vitro conidia production, colony growth, and secondary metabolite production were also determined for SC1, SC23, SC50, and SC50×3. Seedling stem base and head assays revealed a virulence decline in all subcultures, whereas virulence was restored in SC50×3. The same trend was observed in conidia production. The DNA isolated from SC50 and SC50×3 was subjected to a methylation content-sensitive enzyme and double-digest, restriction-site-associated DNA technique (ddRAD-MCSeEd). DNA methylation analysis identified 1024 genes whose methylation levels changed in response to inoculation on a healthy host after subculturing. Several of these genes are already known from functional analysis to be involved in virulence. These results demonstrate that the physiological shifts following sub-culturing have an impact on genomic DNA methylation levels and suggest that the ddRAD-MCSeEd approach can be an important tool for detecting genes potentially related to fungal virulence.
Affiliation(s)
- Francesco Tini
  - Department of Agricultural, Food and Environmental Sciences, University of Perugia, Borgo XX Giugno, 74, 06121 Perugia, Italy
- Giovanni Beccari
  - Department of Agricultural, Food and Environmental Sciences, University of Perugia, Borgo XX Giugno, 74, 06121 Perugia, Italy
- Gianpiero Marconi
  - Department of Agricultural, Food and Environmental Sciences, University of Perugia, Borgo XX Giugno, 74, 06121 Perugia, Italy
- Andrea Porceddu
  - Department of Agriculture, University of Sassari, Viale Italia, 39a, 07100 Sassari, Italy
- Michael Sulyok
  - Department of Agrobiotechnology (IFA-Tulln), University of Natural Resources and Applied Life Sciences, Vienna (BOKU), Konrad Lorenz Strasse, 20, A-3430 Tulln, Austria
- Donald M. Gardiner
  - Commonwealth Scientific and Industrial Research Organisation, Agriculture and Food, 306 Carmody Road, St Lucia, QLD 4067, Australia
- Emidio Albertini
  - Department of Agricultural, Food and Environmental Sciences, University of Perugia, Borgo XX Giugno, 74, 06121 Perugia, Italy
- Lorenzo Covarelli
  - Department of Agricultural, Food and Environmental Sciences, University of Perugia, Borgo XX Giugno, 74, 06121 Perugia, Italy
7
Yu J, Kim KH. A Phenome-Wide Association Study of the Effects of Fusarium graminearum Transcription Factors on Fusarium Graminearum Virus 1 Infection. Front Microbiol 2021; 12:622261. [PMID: 33643250 PMCID: PMC7904688 DOI: 10.3389/fmicb.2021.622261] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/29/2020] [Accepted: 01/07/2021] [Indexed: 11/13/2022]
Abstract
Fusarium graminearum virus 1 (FgV1) causes noticeable phenotypic changes such as reduced mycelial growth, increased pigmentation, and reduced pathogenicity in its host fungus, Fusarium graminearum. A previous study showed that numerous F. graminearum genes, including regulatory factors, were differentially expressed upon FgV1 infection; however, we have limited knowledge of the effect(s) of specific transcription factors (TFs) during FgV1 infection in the host fungus. Using a gene-deletion mutant library of 657 putative TFs in F. graminearum, we transferred FgV1 by hyphal anastomosis to screen for transcription factors that might be associated with viral replication or symptom induction. FgV1-infected TF deletion mutants were divided into three groups according to their mycelial growth phenotype compared to the FgV1-infected wild-type strain (WT-VI). The FgV1-infected TF deletion mutants in Group 1 exhibited slow or weak mycelial growth compared to that of WT-VI on complete medium at 5 dpi. In contrast, Group 3 consists of virus-infected TF deletion mutants showing faster mycelial growth and milder symptoms than WT-VI. The hyphal growth of FgV1-infected TF deletion mutants in Group 2 was not significantly different from that of WT-VI. We speculated that differences in mycelial growth among the FgV1-infected TF deletion mutant groups might be related to the level of FgV1 RNA accumulation in infected host fungi. By conducting real-time quantitative reverse transcription polymerase chain reaction, we observed a close association between FgV1 RNA accumulation and the phenotypic differences of FgV1-infected TF deletion mutants in each group, i.e., increased and decreased dsRNA accumulation in Group 1 and Group 3, respectively. Taken together, our analysis provides an opportunity to identify host regulator(s) of FgV1-triggered signaling and antiviral responses and helps to understand the complex regulatory networks underlying the interaction between FgV1 and F. graminearum.
Affiliation(s)
- Jisuk Yu
  - Plant Genomics and Breeding Institute, Seoul National University, Seoul, South Korea
- Kook-Hyung Kim
  - Plant Genomics and Breeding Institute, Seoul National University, Seoul, South Korea
  - Department of Agricultural Biotechnology, College of Agriculture and Life Sciences, Seoul, South Korea
  - Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, South Korea
8
Shen C, Liu H, Guan Z, Yan J, Zheng T, Yan W, Wu C, Zhang Q, Yin P, Xing Y. Structural Insight into DNA Recognition by CCT/NF-YB/YC Complexes in Plant Photoperiodic Flowering. The Plant Cell 2020; 32:3469-3484. [PMID: 32843433 PMCID: PMC7610279 DOI: 10.1105/tpc.20.00067] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 01/31/2020] [Revised: 08/07/2020] [Accepted: 08/25/2020] [Indexed: 05/18/2023]
Abstract
CONSTANS, CONSTANS-LIKE, and TIMING OF CAB EXPRESSION1 (CCT) domain-containing proteins are a large family unique to plants. They transcriptionally regulate photoperiodic flowering, circadian rhythms, vernalization, and other related processes. Through their CCT domains, CONSTANS and HEADING DATE1 (HD1) coordinate with the NUCLEAR FACTOR Y (NF-Y) B/C dimer to specifically target a conserved 'CCACA' motif within the promoters of their target genes. However, the mechanism underlying DNA recognition by the CCT domain remains unclear. Here we determined the crystal structures of the rice (Oryza sativa) NF-YB/YC dimer and the florigen gene Heading date 3a (Hd3a)-bound HD1CCT/NF-YB/YC trimer with resolutions of 2.0 Å and 2.55 Å, respectively. The CCT domain of HD1 displays an elongated structure containing two α-helices and two loops, tethering Hd3a to the NF-YB/YC dimer. Helix α2 and loop 2 are anchored into the minor groove of the 'CCACA' motif, which determines the specific base recognition. Our structures reveal the interaction mechanism among the CCT domain, NF-YB/YC dimer, and the target DNA. These results not only provide insight into the network between the CCT proteins and NF-Y subunits, but also offer potential approaches for improving productivity and global adaptability of crops by manipulating florigen expression.
Affiliation(s)
- Cuicui Shen
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
- Haiyang Liu
  - College of Agriculture, Yangtze University, Jingzhou 434000, China
- Zeyuan Guan
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
  - College of Life Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Junjie Yan
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
- Ting Zheng
  - College of Plant Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Wenhao Yan
  - College of Plant Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Changyin Wu
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
- Qifa Zhang
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
- Ping Yin
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China
- Yongzhong Xing
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research, Huazhong Agricultural University, Wuhan 430070, China