1
Qiao Y, Yang R, Liu Y, Chen J, Zhao L, Huo P, Wang Z, Bu D, Wu Y, Zhao Y. DeepFusion: A deep bimodal information fusion network for unraveling protein-RNA interactions using in vivo RNA structures. Comput Struct Biotechnol J 2024; 23:617-625. [PMID: 38274994 PMCID: PMC10808905 DOI: 10.1016/j.csbj.2023.12.040] [Received: 09/06/2023] [Revised: 12/04/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024] [Open access]
Abstract
RNA-binding proteins (RBPs) are key post-transcriptional regulators, and the malfunctions of RBP-RNA binding lead to diverse human diseases. However, prediction of RBP binding sites is largely based on RNA sequence features, whereas in vivo RNA structural features based on high-throughput sequencing are rarely incorporated. Here, we designed a deep bimodal information fusion network called DeepFusion for unraveling protein-RNA interactions by incorporating structural features derived from DMS-seq data. DeepFusion integrates two sub-models to extract local motif-like information and long-term context information. We show that DeepFusion performs best compared with other cutting-edge methods with only sequence inputs on two datasets. DeepFusion's performance is further improved with bimodal input after adding in vivo DMS-seq structural features. Furthermore, DeepFusion can be used for analyzing RNA degradation, demonstrating significantly different RBP-binding scores in genes with slow degradation rates versus those with rapid degradation rates. DeepFusion thus provides enhanced abilities for further analysis of functional RNAs. DeepFusion's code and data are available at http://bioinfo.org/deepfusion/.
Affiliation(s)
- Yixuan Qiao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Rui Yang
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Liu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jiaxin Chen
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Lianhe Zhao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Peipei Huo
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Zhihao Wang
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Dechao Bu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yang Wu
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Yi Zhao
  - Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China

2
Zhao Y, Xie Q, Yang Q, Cui J, Tan W, Zhang D, Xiang J, Deng L, Guo Y, Li M, Liu L, Yan M. Genome-wide identification and evolutionary analysis of the NRAMP gene family in the AC genomes of Brassica species. BMC Plant Biol 2024; 24:311. [PMID: 38649805 PMCID: PMC11036763 DOI: 10.1186/s12870-024-04981-1] [Received: 12/27/2023] [Accepted: 04/03/2024] [Indexed: 04/25/2024]
Abstract
BACKGROUND: Brassica napus, a hybrid resulting from the crossing of Brassica rapa and Brassica oleracea, is one of the most important oil crops. Despite its significance, B. napus productivity faces substantial challenges due to heavy metal stress, especially from cadmium (Cd), which poses a significant threat among heavy metals. Natural resistance-associated macrophage proteins (NRAMPs) play pivotal roles in Cd uptake and transport within plants. However, our understanding of the role of BnNRAMPs in B. napus is limited. Thus, this study aimed to conduct genome-wide identification and bioinformatics analysis of NRAMPs in three Brassica species: B. napus, B. rapa, and B. oleracea.
RESULTS: A total of 37 NRAMPs were identified across the three Brassica species and classified into two distinct subfamilies based on evolutionary relationships. Conserved motif analysis revealed that motif 6 and motif 8 might significantly contribute to the differentiation between subfamily I and subfamily II within Brassica species. Evolutionary analyses and chromosome mapping revealed a reduction in the NRAMP gene family during B. napus evolutionary history, resulting in the loss of an orthologous gene derived from BoNRAMP3.2. Cis-acting element analysis suggested potential regulation of the NRAMP gene family by specific plant hormones, such as abscisic acid (ABA) and methyl jasmonate (MeJA). However, gene expression pattern analyses under hormonal or stress treatments indicated limited responsiveness of the NRAMP gene family to these treatments, warranting further experimental validation. Under Cd stress in B. napus, expression pattern analysis of the NRAMP gene family revealed a decrease in the expression levels of most BnNRAMP genes with increasing Cd concentrations. Notably, BnNRAMP5.1/5.2 exhibited a unique response pattern, being stimulated at low Cd concentrations and inhibited at high Cd concentrations, suggesting potential response mechanisms distinct from those of other NRAMP genes.
CONCLUSIONS: In summary, this study indicates complex molecular dynamics within the NRAMP gene family under Cd stress, suggesting potential applications in enhancing plant resilience, particularly against Cd. The findings also offer valuable insights for further understanding the functionality and regulatory mechanisms of the NRAMP gene family.
Affiliation(s)
- Yuquan Zhao
  - Hunan Key Laboratory of Economic Crops Genetic Improvement and Integrated Utilization, School of Life and Health Sciences, Hunan University of Science and Technology, Xiangtan, 411201, China
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Qijun Xie
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - School of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
- Qian Yang
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Jiamin Cui
  - Hunan Key Laboratory of Economic Crops Genetic Improvement and Integrated Utilization, School of Life and Health Sciences, Hunan University of Science and Technology, Xiangtan, 411201, China
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
- Wenqing Tan
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Dawei Zhang
  - Hunan Key Laboratory of Economic Crops Genetic Improvement and Integrated Utilization, School of Life and Health Sciences, Hunan University of Science and Technology, Xiangtan, 411201, China
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
- Jianhua Xiang
  - Hunan Key Laboratory of Economic Crops Genetic Improvement and Integrated Utilization, School of Life and Health Sciences, Hunan University of Science and Technology, Xiangtan, 411201, China
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
- Lichao Deng
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
  - Hunan Engineering and Technology Research Center of Hybrid Rapeseed, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Yiming Guo
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
  - Hunan Engineering and Technology Research Center of Hybrid Rapeseed, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Mei Li
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
  - Hunan Engineering and Technology Research Center of Hybrid Rapeseed, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
- Lili Liu
  - Hunan Key Laboratory of Economic Crops Genetic Improvement and Integrated Utilization, School of Life and Health Sciences, Hunan University of Science and Technology, Xiangtan, 411201, China
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
- Mingli Yan
  - Yuelushan Laboratory, Hongqi Road, Changsha, 410125, China
  - Crop Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China
  - Hunan Engineering and Technology Research Center of Hybrid Rapeseed, Hunan Academy of Agricultural Sciences, Changsha, 410125, China

3
Ballarin CS, Vizentin-Bugoni J, Hachuy-Filho L, Amorim FW. Imprints of indirect interactions on a resource-mediated ant-plant network across different levels of network organization. Oecologia 2024; 204:661-673. [PMID: 38448764 DOI: 10.1007/s00442-024-05522-1] [Received: 08/31/2023] [Accepted: 01/30/2024] [Indexed: 03/08/2024]
Abstract
Indirect interactions are pivotal in the evolution of interacting species and the assembly of populations and communities. Nevertheless, despite recently being investigated in plant-animal mutualism at the community level, indirect interactions have not been studied in resource-mediated mutualisms involving plant individuals that share different animal species as partners within a population (i.e., individual-based networks). Here, we analyzed an individual-based ant-plant network to evaluate how resource properties affect indirect interaction patterns and how changes in indirect links leave imprints in the network across multiple levels of network organization. Using complementary analytical approaches, we described the patterns of indirect interactions at the micro-, meso-, and macro-scale. We predicted that plants offering intermediate levels of nectar quantity and quality interact with more diverse ant assemblages. The increased number of ant species would cause a higher potential for indirect interactions in all scales evaluated. We found that nectar properties modified patterns of indirect interactions of plant individuals that share mutualistic partners, leaving imprints across different network scales. To our knowledge, this is the first study tracking indirect interactions in multiple scales within an individual-based network. We show that functional traits of interacting species, such as nectar properties, may lead to changes in indirect interactions, which could be tracked across different levels of the network organization evaluated.
Affiliation(s)
- Caio S Ballarin
  - Laboratório de Ecologia da Polinização e Interações, LEPI, Departamento de Biodiversidade e Bioestatística, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho" (UNESP), Rua Prof. Dr. Antonio Celso Wagner Zanin, Botucatu, São Paulo, CEP 18618-689, Brazil
  - Programa de Pós-Graduação em Biologia Vegetal, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho", Botucatu, São Paulo, CEP 18618-689, Brazil
- Jeferson Vizentin-Bugoni
  - Programa de Pós-Graduação Em Biodiversidade Animal, Departamento de Ecologia, Zoologia e Genética, Universidade Federal de Pelotas, Campus Universitário, Capão do Leão, RS, CEP 96010-900, Brazil
- Leandro Hachuy-Filho
  - Laboratório de Ecologia da Polinização e Interações, LEPI, Departamento de Biodiversidade e Bioestatística, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho" (UNESP), Rua Prof. Dr. Antonio Celso Wagner Zanin, Botucatu, São Paulo, CEP 18618-689, Brazil
  - Programa de Pós-Graduação Em Zoologia, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho", Botucatu, São Paulo, CEP 18618-689, Brazil
- Felipe W Amorim
  - Laboratório de Ecologia da Polinização e Interações, LEPI, Departamento de Biodiversidade e Bioestatística, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho" (UNESP), Rua Prof. Dr. Antonio Celso Wagner Zanin, Botucatu, São Paulo, CEP 18618-689, Brazil
  - Programa de Pós-Graduação em Biologia Vegetal, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho", Botucatu, São Paulo, CEP 18618-689, Brazil
  - Programa de Pós-Graduação Em Zoologia, Instituto de Biociências, Universidade Estadual Paulista "Júlio de Mesquita Filho", Botucatu, São Paulo, CEP 18618-689, Brazil

4
Kapoor P, Rakhra G, Kumar V, Joshi R, Gupta M, Rakhra G. Insights into the functional characterization of DIR proteins through genome-wide in silico and evolutionary studies: a systematic review. Funct Integr Genomics 2023; 23:166. [PMID: 37202648 DOI: 10.1007/s10142-023-01095-z] [Received: 03/22/2023] [Revised: 05/04/2023] [Accepted: 05/10/2023] [Indexed: 05/20/2023]
Abstract
Dirigent proteins (DIRs) are a new class of proteins that were identified in the 8-8' lignan biosynthetic pathway and are involved in the formation of (+)- or (-)-pinoresinol through stereoselective coupling of E-coniferyl alcohol. These proteins are known to play a vital role in development and stress response in plants. Various studies have reported the functional and structural characterization of the dirigent gene family in different plants using in silico approaches. Here, we summarize the importance of dirigent proteins in plants and their role in plant stress tolerance by reviewing genome-wide analyses, including gene structure, chromosome mapping, phylogenetic evolution, conserved motifs, and gene duplications, in important plants. Overall, this review should help to compare and clarify the molecular and evolutionary characteristics of the dirigent gene family in different plants.
Affiliation(s)
- Preedhi Kapoor
  - Department of Biochemistry, School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, 144411, India
- Gurseen Rakhra
  - Department of Nutrition and Dietetics, Faculty of Allied Health Sciences, Manav Rachna International Institute of Research and Studies, Faridabad, Haryana, India
- Vineet Kumar
  - Department of Biotechnology, School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, 144411, India
- Ridhi Joshi
  - Department of Biotechnology, School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, 144411, India
- Mahiti Gupta
  - Department of Biotechnology, Maharishi Markandeshwar (Deemed to Be University), Mullana, Ambala, 133207, India
- Gurmeen Rakhra
  - Department of Biochemistry, School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, 144411, India
  - Department of Biotechnology, Maharishi Markandeshwar (Deemed to Be University), Mullana, Ambala, 133207, India

5
Pandey M, Gromiha MM. MutBLESS: A tool to identify disease-prone sites in cancer using deep learning. Biochim Biophys Acta Mol Basis Dis 2023; 1869:166721. [PMID: 37105446 DOI: 10.1016/j.bbadis.2023.166721] [Received: 02/23/2023] [Revised: 04/07/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Understanding the molecular basis and impact of mutations at different stages of cancer is a long-standing challenge in cancer biology. Identification of driver mutations from experiments is expensive and time intensive. In the present study, we collected data for experimentally known driver mutations in 22 different cancer types and classified them into six categories: breast cancer (BRCA), acute myeloid leukaemia (LAML), endometrial carcinoma (EC), stomach cancer (STAD), skin cancer (SKCM), and other cancer types. The dataset contains 5747 disease-prone and 5514 neutral sites in 516 proteins. The analysis of amino acid distribution along mutant sites revealed that the motifs AAA and LR are preferred in disease-prone sites, whereas QPP and QF are dominant in neutral sites. Further, we developed a method using deep neural networks to predict disease-prone sites with amino acid sequence-based features such as physicochemical properties, secondary structure, tri-peptide motifs and conservation scores. We obtained an average AUC of 0.97 for the five cancer types BRCA, LAML, EC, STAD and SKCM in a test dataset and 0.72 for all other cancer types together. Our method showed excellent performance for identifying cancer-specific mutations, with an average sensitivity, specificity, and accuracy of 96.56%, 97.39%, and 97.64%, respectively. We developed a web server for identifying cancer-prone sites, and it is available at https://web.iitm.ac.in/bioinfo2/MutBLESS/index.html. We suggest that our method can serve as an effective way to identify disease-prone sites and assist in developing therapeutic strategies.
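The sensitivity, specificity, and accuracy quoted in this abstract are standard confusion-matrix ratios. The following is a minimal sketch of how they are computed; the function name `binary_metrics` and the counts are made up for illustration and are not taken from the MutBLESS study.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        # Fraction of true disease-prone sites that were predicted as such
        "sensitivity": tp / (tp + fn),
        # Fraction of true neutral sites that were predicted as such
        "specificity": tn / (tn + fp),
        # Fraction of all sites that were classified correctly
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

With balanced classes, accuracy is the mean of sensitivity and specificity; with imbalanced classes the three values can diverge considerably.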
Affiliation(s)
- Medha Pandey
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India

6
Dervisi I, Valassakis C, Koletti A, Kouvelis VN, Flemetakis E, Ouzounis CA, Roussis A. Evolutionary Aspects of Selenium Binding Protein (SBP). J Mol Evol 2023:10.1007/s00239-023-10105-4. [PMID: 37039856 DOI: 10.1007/s00239-023-10105-4] [Received: 11/06/2022] [Accepted: 03/21/2023] [Indexed: 04/12/2023]
Abstract
Selenium-binding proteins represent a ubiquitous protein family, and recently SBP1 was described as a new stress response regulator in plants. SBP1 has been characterized as a methanethiol oxidase; however, its exact role remains unclear. Moreover, in mammals, it is involved in the regulation of anti-carcinogenic growth and progression as well as reduction/oxidation modulation and detoxification. In this work, we delineate the functional potential of certain motifs of SBP in the context of evolutionary relationships. The phylogenetic profiling approach revealed the absence of SBP in the fungi phylum as well as in most non-eukaryotic organisms. The phylogenetic tree also indicates the differentiation and evolution of characteristic SBP motifs. The main evolutionary events concern the CSSC motif, for which Acidobacteria, Fungi and Archaea carry modifications. Moreover, the CC motif is harbored by some bacteria and remains conserved in Plants, while it is modified to CxxC in Animals. Thus, the characteristic sequence motifs of SBPs mainly appeared in Archaea and Bacteria and were retained in Animals and Plants. Our results demonstrate the emergence of SBP from bacteria, most likely as a methanethiol oxidase.
Affiliation(s)
- Irene Dervisi
  - Section of Botany, Department of Biology, National & Kapodistrian University of Athens, 15784, Athens, Greece
- Chrysanthi Valassakis
  - Section of Botany, Department of Biology, National & Kapodistrian University of Athens, 15784, Athens, Greece
- Aikaterini Koletti
  - Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855, Athens, Greece
- Vassilis N Kouvelis
  - Section of Genetics and Biotechnology, Department of Biology, National & Kapodistrian University of Athens, 15784, Athens, Greece
- Emmanouil Flemetakis
  - Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855, Athens, Greece
- Christos A Ouzounis
  - Biological Computation & Process Laboratory, Centre for Research & Technology Hellas, Chemical Process & Energy Resources Institute, 54124, Thessaloníki, Greece
  - Biological Computation & Computational Biology Group, AIIA Lab, School of Informatics, Aristotle University of Thessalonica, 57001, Thessaloníki, Greece
- Andreas Roussis
  - Section of Botany, Department of Biology, National & Kapodistrian University of Athens, 15784, Athens, Greece

7
Naorem LD, Sharma N, Raghava GPS. A web server for predicting and scanning of IL-5 inducing peptides using alignment-free and alignment-based method. Comput Biol Med 2023; 158:106864. [PMID: 37058758 DOI: 10.1016/j.compbiomed.2023.106864] [Received: 11/01/2022] [Revised: 03/06/2023] [Accepted: 03/30/2023] [Indexed: 04/16/2023]
Abstract
Interleukin-5 (IL-5) is an attractive therapeutic target due to its pivotal role in several eosinophil-mediated diseases. The aim of this study is to develop a model for predicting IL-5 inducing antigenic regions in a protein with high precision. All models in this study have been trained, tested and validated on 1907 experimentally validated IL-5 inducing and 7759 non-IL-5 inducing peptides obtained from IEDB. Our primary analysis indicates that IL-5 inducing peptides are dominated by certain residues such as Ile, Asn, and Tyr. It was also observed that binders of a wide range of HLA alleles can induce IL-5. Initially, alignment-based methods were developed using similarity and motif search. These alignment-based methods provide high precision but poor coverage. To overcome this limitation, we explored alignment-free methods, which are mainly machine learning-based models. Firstly, models were developed using binary profiles, and an eXtreme Gradient Boosting-based model achieved a maximum AUC of 0.59. Secondly, composition-based models were developed, and our dipeptide-based random forest model achieved a maximum AUC of 0.74. Thirdly, a random forest model developed using 250 selected dipeptides achieved an AUC of 0.75 and an MCC of 0.29 on the validation dataset, the best among the alignment-free models. To improve the performance, we developed an ensemble or hybrid method that combines alignment-based and alignment-free methods. Our hybrid method achieved an AUC of 0.94 with an MCC of 0.60 on a validation/independent dataset. The best hybrid model developed in this study has been incorporated into a user-friendly web server and a standalone package named 'IL5pred' (https://webs.iiitd.edu.in/raghava/il5pred/).
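The dipeptide-based models mentioned in this abstract rely on composition features. As a minimal sketch, assuming the usual definition of dipeptide composition (the frequency of each of the 400 ordered amino-acid pairs among all adjacent pairs in a peptide), the feature vector could be computed as follows. The function name `dipeptide_composition` and the example peptide are hypothetical and are not taken from IL5pred.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids, in a fixed order
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(peptide: str) -> list:
    """Return the 400-dimensional dipeptide frequency vector of a peptide."""
    peptide = peptide.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(peptide) - 1):
        pair = peptide[i:i + 2]
        # Pairs containing non-standard residues are skipped
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy peptide, for illustration only
features = dipeptide_composition("ACACD")
```

A vector like this can be passed to any standard classifier; selecting a subset of the 400 dimensions corresponds to the "selected dipeptides" variant described in the abstract.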
Affiliation(s)
- Leimarembi Devi Naorem
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India

8
Kandoor A, Fierst J. Dauer fate in a Caenorhabditis elegans Boolean network model. PeerJ 2023; 11:e14713. [PMID: 36710867 PMCID: PMC9879150 DOI: 10.7717/peerj.14713] [Received: 09/04/2022] [Accepted: 12/16/2022] [Indexed: 01/24/2023] [Open access]
Abstract
Cellular fates are determined by genes interacting across large, complex biological networks. A critical question is how to identify causal relationships spanning distinct signaling pathways and underlying organismal phenotypes. Here, we address this question by constructing a Boolean model of a well-studied developmental network and analyzing information flows through the system. Depending on environmental signals Caenorhabditis elegans develop normally to sexual maturity or enter a reproductively delayed, developmentally quiescent 'dauer' state, progressing to maturity when the environment changes. The developmental network that starts with environmental signal and ends in the dauer/no dauer fate involves genes across 4 signaling pathways including cyclic GMP, Insulin/IGF-1, TGF-β and steroid hormone synthesis. We identified three stable motifs leading to normal development, each composed of genes interacting across the Insulin/IGF-1, TGF-β and steroid hormone synthesis pathways. Three genes known to influence dauer fate, daf-2, daf-7 and hsf-1, acted as driver nodes in the system. Using causal logic analysis, we identified a five gene cyclic subgraph integrating the information flow from environmental signal to dauer fate. Perturbation analysis showed that a multifactorial insulin profile determined the stable motifs the system entered and interacted with daf-12 as the switchpoint driving the dauer/no dauer fate. Our results show that complex organismal systems can be distilled into abstract representations that permit full characterization of the causal relationships driving developmental fates. Analyzing organismal systems from this perspective of logic and function has important implications for studies examining the evolution and conservation of signaling pathways.
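To illustrate what a Boolean network model and its stable states look like in practice, here is a minimal sketch with synchronous updates and exhaustive attractor search. The three nodes (`signal`, `growth`, `arrest`) and their rules are hypothetical and chosen only for illustration; they are not the published dauer network, and the authors' stable-motif and causal-logic analyses are more involved than this brute-force enumeration.

```python
from itertools import product

# Hypothetical three-node rules, NOT the published C. elegans network
RULES = {
    "signal": lambda s: s["signal"],                      # environmental input, held fixed
    "growth": lambda s: s["signal"] and not s["arrest"],  # needs signal, blocked by arrest
    "arrest": lambda s: not s["signal"],                  # switched on when signal is absent
}
NODES = list(RULES)


def step(state):
    """Apply all rules at once (synchronous update) to a tuple of node values."""
    current = dict(zip(NODES, state))
    return tuple(bool(RULES[node](current)) for node in NODES)


def attractors():
    """Follow every initial state until it repeats and collect the resulting cycles."""
    found = set()
    for start in product([False, True], repeat=len(NODES)):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = step(state)
        cycle = seen[seen.index(state):]
        # Rotate so the smallest state comes first, giving each cycle one canonical form
        k = cycle.index(min(cycle))
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found


stable_states = attractors()
```

In this toy system each value of the input leads to a single fixed point, which mirrors the idea of an environmental signal selecting one of two developmental fates. Exhaustive enumeration scales as 2^n in the number of nodes, so it is only practical for small networks.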
Affiliation(s)
- Alekhya Kandoor
  - Biomedical Engineering, University of Virginia, Charlottesville, VA, United States of America
- Janna Fierst
  - Biomolecular Sciences Institute and Department of Biology, Florida International University, Miami, FL, United States of America

9
Lau V, Provart NJ. AGENT for Exploring and Analyzing Gene Regulatory Networks from Arabidopsis. Methods Mol Biol 2023; 2698:351-360. [PMID: 37682484 DOI: 10.1007/978-1-0716-3354-0_20] [Indexed: 09/09/2023]
Abstract
Gene regulatory networks (GRNs) are important for determining how an organism develops and how it responds to external stimuli. In the case of Arabidopsis thaliana, several GRNs have been identified covering many important biological processes. We present AGENT, the Arabidopsis GEne Network Tool, for exploring and analyzing published GRNs. Using tools in AGENT, regulatory motifs such as feed-forward loops can be easily identified. Nodes with high centrality, and hence importance, can likewise be identified. Gene expression data can also be overlaid onto GRNs to help discover subnetworks acting in specific tissues or under certain conditions.
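A feed-forward loop is a three-node motif in which a regulator X targets Y, Y targets Z, and X also targets Z directly. The following is a minimal sketch of how such motifs can be found in a directed edge list; it is not AGENT's implementation, and the function name `feed_forward_loops` and the gene names are hypothetical.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with edges x->y, y->z and x->z."""
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # ignore self-regulation
            for z in targets.get(y, ()):
                # z must be a third, distinct node that x also regulates directly
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Hypothetical regulator -> target edges, for illustration only
edges = [("TF1", "TF2"), ("TF2", "G1"), ("TF1", "G1"), ("TF2", "G2")]
ffls = feed_forward_loops(edges)
```

Here `TF1 -> TF2 -> G1` together with the direct edge `TF1 -> G1` forms the only feed-forward loop; `G2` is regulated by `TF2` alone and so is not part of one.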
Affiliation(s)
- Vincent Lau
  - Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
- Nicholas J Provart
  - Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada

10
Ramlal A, Samanta A. In Silico functional and phylogenetic analyses of fungal immunomodulatory proteins of some edible mushrooms. AMB Express 2022; 12:159. [PMID: 36571664 PMCID: PMC9791630 DOI: 10.1186/s13568-022-01503-w] [Received: 10/17/2022] [Accepted: 12/13/2022] [Indexed: 12/27/2022] [Open access]
Abstract
Mushrooms are a well-known source of many bioactive and nutritional compounds with immense applicability in both the pharmaceutical and food industries. They are widely used to cure various kinds of ailments in traditional medicine. They are low in fats and cholesterol and high in protein. Immunomodulators can improve immunity and act as defensive agents against pathogens. One such class of immunomodulators is fungal immunomodulatory proteins (FIPs). FIPs have potential roles in the treatment of cancer, have immunostimulatory effects, and show anti-tumor activities. In the current study, 19 FIPs from edible mushrooms were used for comparison and analysis of the conserved motifs. Phylogenetic analysis was also carried out using the FIPs. The conserved motif analysis revealed that some of the motifs strongly supported their identity as FIPs, while some are novel. Fungal immunomodulatory proteins are important and have many properties that can be used for treating ailments and diseases. This preliminary study can be used for the identification and functional characterization of the proposed novel motifs and for unraveling the potential roles of FIPs in developing newer drugs.
Affiliation(s)
- Ayyagari Ramlal
  - Department of Botany, University of Delhi, New Delhi, Delhi 110007, India
  - School of Biological Sciences, Universiti Sains Malaysia (USM), 11800 Georgetown, Penang, Malaysia
- Aveek Samanta
  - Department of Botany, Prabhat Kumar College, Contai, 721401 West Bengal, India

11
Hernández-Montiel Á, Giffard-Mena I, Weidmann M, Bekaert M, Ulrich K, Benkaroun J. Virulence and genetic differences among white spot syndrome virus isolates inoculated in Penaeus vannamei. Dis Aquat Organ 2022; 152:85-98. [PMID: 36453457 DOI: 10.3354/dao03707] [Indexed: 06/17/2023]
Abstract
White spot syndrome virus (WSSV) infects several economically important aquaculture species, and has caused significant losses to the industry. This virus belongs to the Nimaviridae family and has a dsDNA genome ranging between 257 and 309 kb (more than 20 isolate genomes have been fully sequenced and published to date). Multiple routes of infection could be the cause of the high virulence and mortality rates detected in shrimp species. Particularly in Penaeus vannamei, differences in isolate virulence have been observed, along with controversy over whether deletions or insertions are associated with virulence gain or loss. The pathogenicity of 3 isolates from 3 localities in Mexico (2 from Sinaloa: 'CIAD' and 'Angostura'; and one from Sonora: 'Sonora') was evaluated in vivo in whiteleg shrimp P. vannamei infection assays. Differences were observed in shrimp mortality rates among the 3 isolates, of which Sonora was the most virulent. Subsequently, the complete genomes of the Sonora and Angostura isolates were sequenced in depth from infected shrimp tissues and assembled in reference to the genome of isolate strain CN01 (KT995472), comprising 289350 and 288995 bp, respectively. Three deletion zones were identified compared to CN01, comprising 15 genes, including 3 envelope proteins (VP41A, VP52A and VP41B), 1 non-structural protein (ICP35) and 11 other encoding proteins whose function is currently unknown. In addition, 5 genes (wsv129, wsv178, wsv204, wsv249 and wsv497) presented differences in their repetitive motifs, which could potentially be involved in the regulation of gene expression, causing virulence variations.
Affiliation(s)
- Álvaro Hernández-Montiel
- Facultad de Ciencias Marinas, Universidad Autónoma de Baja California, Carretera Tijuana-Ensenada No. 3917, Ensenada, Baja California 22860, Mexico
12
Khanal J, Kandel J, Tayara H, Chong KT. CapsNh-Kcr: Capsule network-based prediction of lysine crotonylation sites in human non-histone proteins. Comput Struct Biotechnol J 2022; 21:120-127. [PMID: 36544479 PMCID: PMC9735261 DOI: 10.1016/j.csbj.2022.11.056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 11/10/2022] [Accepted: 11/26/2022] [Indexed: 12/04/2022] Open
Abstract
Lysine crotonylation (Kcr) is one of the most important post-translational modifications (PTMs) and is widely detected in both histone and non-histone proteins. Kcr is reported to be involved in various biological processes, such as metabolism and cell differentiation. However, the available experimental methods for Kcr site identification are laborious and costly. To effectively replace existing experimental approaches, some computational methods have been developed in the last few years. The available computational methods still lack some important aspects, as they can only identify Kcr sites on either histone-only or combined histone and non-histone proteins. Although a tool was developed to identify Kcr sites on non-histone proteins only, its performance is inadequate and the exploration of hidden Kcr patterns (motifs), which might be significant for detailed Kcr studies, has been completely ignored. Therefore, algorithms that can more effectively predict Kcr sites on non-histone proteins with their biological meaning need to be designed. Accordingly, we developed a novel deep learning (capsule network)-based model, named CapsNh-Kcr, for Kcr site prediction, particularly focusing on non-histone proteins. Based on the independent results, the proposed model achieves an AUC of 0.9120, which is approximately 6% higher than that of the previous nhKcr model in the prediction of Kcr sites on non-histone proteins. Further, we revealed, for the first time, that the proposed model can represent obvious motif distribution across Kcr sites in non-histone proteins. The source code (in Python) is publicly available at https://github.com/Jhabindra-bioinfo/CapsNh-Kcr.
Affiliation(s)
- Jhabindra Khanal
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
- Jeevan Kandel
- Graduate School of Integrated Energy-AI, Jeonbuk National University, Jeonju 54896, South Korea
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea; Corresponding authors at: School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea (H. Tayara); Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea (K.T. Chong).
- Kil To Chong
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea; Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea; Corresponding authors at: School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea (H. Tayara); Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea (K.T. Chong).
13
Lal D, Pandey H, Lal R. Phylogenetic Analyses of Microbial Hydrolytic Dehalogenases Reveal Polyphyletic Origin. Indian J Microbiol 2022; 62:651-657. [PMID: 36458228 PMCID: PMC9705686 DOI: 10.1007/s12088-022-01043-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 10/11/2022] [Indexed: 11/15/2022] Open
Abstract
Hydrolytic dehalogenases form an important class of dehalogenases that includes haloacid dehalogenase, haloalkane dehalogenase, haloacetate dehalogenase, and atrazine chlorohydrolase. These enzymes are involved in biodegradation of various environmental pollutants, and it is therefore important to understand their phylogeny. In the present study, it was found that haloalkane and haloacetate dehalogenases share a common ancestry with enzymes such as carboxyesterase, epoxide hydrolase, and lipases, which can be traced to an ancestral α/β hydrolase fold enzyme. Haloacid dehalogenases and atrazine chlorohydrolases have probably evolved from ancestral enzymes with phosphatase and deaminase activity, respectively. These findings were supported by similarities in the secondary structure, key catalytic motifs and placement of catalytic residues. The phylogeny of haloalkane dehalogenases and haloacid dehalogenases differs from 16S rRNA gene phylogeny, suggesting spread through horizontal gene transfer. Hydrolytic dehalogenases are polyphyletic and do not share a common evolutionary history; their functional similarities are due to convergent evolution. The present study also identifies key functional residues whose mutation can help in generating better enzymes for cleanup of persistent environmental pollutants using enzymatic bioremediation. Supplementary Information The online version contains supplementary material available at 10.1007/s12088-022-01043-8.
Affiliation(s)
- Devi Lal
- Ramjas College, University of Delhi, New Delhi, Delhi 110007 India
- Himani Pandey
- Redcliffe Genetics, H55, Electronic City, Noida, Uttar Pradesh 201301 India
- Rup Lal
- The Energy Resources Institute, India Habitat Center Complex, Lodhi Road, New Delhi, Delhi 110003 India
- Phixgen Private Limited, 101 GH-11 Atlantis CGHS Ltd, Gurgaon, Haryana 122001 India
14
Liu J, Yuan Y, Zhao P, Liu G, Huo H, Li Z, Fang T. Change of motifs in C. elegans reveals developmental principle of neural network. Biochem Biophys Res Commun 2022; 624:112-9. [PMID: 35940123 DOI: 10.1016/j.bbrc.2022.07.108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Accepted: 07/28/2022] [Indexed: 11/23/2022]
Abstract
Revealing the organizing principles of developing neural networks is a difficult but significant task in neuroscience. As a creature with a rather compact and well-studied neural network, C. elegans is an ideal subject for neuroscience study. However, research on its developing neural network remains challenging. Changes in specific properties of the neural network across development may uncover part of its organizing principles. The motif is a typical structural property that applies well to various complex networks. Here, we study the motif changes in the C. elegans neural network across development. By counting the occurrences of all three-node subgraph structures in its neural network at different stages of C. elegans development, along with those in corresponding random networks, we determine which of these structures are motifs for C. elegans and identify regular changes of motifs during its development. Combining the potential function of these subgraph motifs with synaptic information, we gain insight into the organizing principle of the neural network during development, which may increase our understanding of neuroscience and inspire the construction of artificial neural networks.
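The counting procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the degree-preserving edge-swap null model, and the z-score criterion are assumptions chosen for illustration, and the abstract does not state which random-network model or significance threshold was used.

```python
import itertools
import random
import statistics


def triad_code(nodes, edges):
    """Canonical 6-bit code of the directed subgraph induced on three nodes.

    The minimum over all node orderings makes the code identical for
    isomorphic subgraphs.
    """
    best = None
    for a, b, c in itertools.permutations(nodes):
        pairs = [(a, b), (b, a), (a, c), (c, a), (b, c), (c, b)]
        code = sum(1 << i for i, p in enumerate(pairs) if p in edges)
        best = code if best is None else min(best, code)
    return best


def is_connected(nodes, edges):
    """True if the three nodes form a weakly connected subgraph."""
    a, b, c = nodes

    def linked(u, v):
        return (u, v) in edges or (v, u) in edges

    return linked(a, b) + linked(a, c) + linked(b, c) >= 2


def count_triads(edges):
    """Count connected three-node subgraphs by isomorphism class."""
    nodes = sorted({n for e in edges for n in e})
    counts = {}
    for trio in itertools.combinations(nodes, 3):
        if is_connected(trio, edges):
            code = triad_code(trio, edges)
            counts[code] = counts.get(code, 0) + 1
    return counts


def rewire(edges, n_swaps, rng):
    """Assumed null model: swap edge targets, preserving in- and out-degrees."""
    edge_list = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_set


def motif_z_scores(edges, n_random=100, seed=0):
    """Z-score of each observed subgraph class against rewired networks."""
    rng = random.Random(seed)
    observed = count_triads(edges)
    samples = {code: [] for code in observed}
    for _ in range(n_random):
        random_counts = count_triads(rewire(edges, 10 * len(edges), rng))
        for code in observed:
            samples[code].append(random_counts.get(code, 0))
    scores = {}
    for code, obs in observed.items():
        mean = statistics.mean(samples[code])
        sd = statistics.pstdev(samples[code])
        scores[code] = (obs - mean) / sd if sd > 0 else float("nan")
    return scores
```

Under this sketch, a subgraph class with a large positive z-score would be called a motif. The brute-force enumeration over all node triples is cubic in the number of nodes, which is acceptable for a network of a few hundred neurons but not for large graphs.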
15
Kshirsagar M, Yuan H, Ferres JL, Leslie C. BindVAE: Dirichlet variational autoencoders for de novo motif discovery from accessible chromatin. Genome Biol 2022; 23:174. [PMID: 35971180 PMCID: PMC9380350 DOI: 10.1186/s13059-022-02723-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 06/28/2022] [Indexed: 11/10/2022] Open
Abstract
We present a novel unsupervised deep learning approach called BindVAE, based on Dirichlet variational autoencoders, for jointly decoding multiple TF binding signals from open chromatin regions. BindVAE can disentangle an input DNA sequence into distinct latent factors that encode cell-type specific in vivo binding signals for individual TFs, composite patterns for TFs involved in cooperative binding, and genomic context surrounding the binding sites. On the task of retrieving the motifs of expressed TFs in a given cell type, BindVAE is competitive with existing motif discovery approaches.
Affiliation(s)
- Han Yuan
- Calico Life Sciences, South San Francisco, CA, USA
16
Das S. Analysis of domain organization and functional signatures of trypanosomatid keIF4Gs. Mol Cell Biochem 2022; 477:2415-2431. [PMID: 35585276 DOI: 10.1007/s11010-022-04464-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Accepted: 05/02/2022] [Indexed: 11/25/2022]
Abstract
Translation initiation is the first step in three essential processes leading to protein synthesis. It is carried out by proteins called translation initiation factors and ribosomes on the mRNA. One of the critical translation initiation factors in eukaryotes is eIF4G which is a scaffold protein that helps assemble translation initiation complexes that carry out translation initiation which ultimately leads to polypeptide synthesis. Trypanosomatids are a large family of kinetoplastids, some of which are protozoan parasites that cause diseases in humans through transmission by vectors. While the protein translation mechanisms in eukaryotes and prokaryotes are well understood, the protein translation factors and mechanisms in trypanosomatids are poorly understood necessitating further studies. Unlike other eukaryotes, trypanosomatids contain five eIF4G orthologues with diversity in length and sequences. Here, I have used bioinformatics tools to look at trypanosomatid keIF4G orthologue sequences and report that there are similarities and considerable differences in their domains/motifs organization and signature amino acid sequences that are required for different functions as compared to human eIF4G. My analysis suggests that there is likely to be considerable diversity and complexity in trypanosomatid keIF4G functions as compared to other eukaryotes.
Affiliation(s)
- Supratik Das
- Infection and Immunology, Translational Health Science and Technology Institute, Faridabad, Haryana, 121001, India.
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, PO Box #04, Faridabad, Haryana, 121001, India.
17
Chumin EJ, Faskowitz J, Esfahlani FZ, Jo Y, Merritt H, Tanner J, Cutts SA, Pope M, Betzel R, Sporns O. Cortico-subcortical interactions in overlapping communities of edge functional connectivity. Neuroimage 2022; 250:118971. [PMID: 35131435 DOI: 10.1016/j.neuroimage.2022.118971] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/05/2021] [Revised: 01/25/2022] [Accepted: 02/03/2022] [Indexed: 02/01/2023] Open
Abstract
Both cortical and subcortical regions can be functionally organized into networks. Regions of the basal ganglia are extensively interconnected with the cortex via reciprocal connections that relay and modulate cortical function. Here we employ an edge-centric approach, which computes co-fluctuations among region pairs in a network, to investigate the role and interaction of subcortical regions with cortical systems. By clustering edges into communities, we show that cortical systems and subcortical regions couple via multiple edge communities, with hippocampus and amygdala having a distinct pattern from striatum and thalamus. We show that the edge community structure of cortical networks is highly similar to one obtained from cortical nodes when the subcortex is present in the network. Additionally, we show that the edge community profile of both cortical and subcortical nodes can be estimated solely from cortico-subcortical interactions. Finally, we used a motif analysis focusing on edge community triads where a subcortical region coupled to two cortical regions and found that two-community triads where one community couples the subcortex to the cortex were overrepresented. In summary, our results show organized coupling of the subcortex to the cortex that may play a role in cortical organization of primary sensorimotor/attention and heteromodal systems, and put forth the motif analysis of edge community triads as a promising method for investigation of communication patterns in networks.
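The co-fluctuation computation behind the edge-centric approach can be sketched as follows. This is a minimal illustration under the usual definition of an edge time series (the element-wise product of two z-scored regional time series); the function names and the dictionary-based input are assumptions, not the authors' pipeline.

```python
import math


def zscore(series):
    """Standardize a time series to zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / sd for x in series]


def edge_time_series(regions):
    """Map each region pair (a, b) to its frame-by-frame co-fluctuation.

    `regions` is a hypothetical dict from region name to a list of samples,
    all of equal length.
    """
    z = {name: zscore(ts) for name, ts in regions.items()}
    names = sorted(z)
    return {
        (a, b): [x * y for x, y in zip(z[a], z[b])]
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

With population z-scores, the time average of each edge series equals the Pearson correlation of the two regions, so the edge time series can be read as a frame-wise decomposition of functional connectivity. Clustering these series into edge communities is a separate step not shown here.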
18
Sivakumar HP, Sundararajan S, Rajendran V, Ramalingam S. Genome wide survey, and expression analysis of Ornithine decarboxylase gene associated with alkaloid biosynthesis in plants. Genomics 2022; 114:84-94. [PMID: 34839021 DOI: 10.1016/j.ygeno.2021.11.029] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/13/2021] [Revised: 09/21/2021] [Accepted: 11/23/2021] [Indexed: 11/04/2022]
Abstract
Plant ODC (ornithine decarboxylase) plays a vital role in normalizing cell division in actively growing tissues. The ODC is a key precursor enzyme for nicotine and nornicotine biosynthesis in plants. ODCs are widely present in many plant families but have not been functionally validated and characterized at the molecular level. In the present study, 58 plant ODCs were identified and were found to contain two putative regulatory motifs, specifically PLP (Pyridoxal 5'-phosphate) and Orn/DAP/Arg decarboxylase family 2 pyridoxal-phosphate, that are highly conserved among diverse plant species. Further, the cis-regulatory elements and interacting partners of the gene revealed the importance of ODC in various metabolic pathways. The qRT-PCR revealed highest relative expression of ODC in floral meristem and roots. Our results suggest that ODC can be effectively used as an ideal candidate for engineering polyamine biosynthesis and would be crucial for developing ultra-low nicotine content tobacco lines via genome editing.
Affiliation(s)
- Hari Priya Sivakumar
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore 641 046, India; DRDO-BU Center for Life Sciences, Bharathiar University campus, Coimbatore 641 046, India
- Sathish Sundararajan
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore 641 046, India
- Venkatesh Rajendran
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore 641 046, India
- Sathishkumar Ramalingam
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore 641 046, India.
19
Khanal J, Tayara H, Zou Q, To Chong K. DeepCap-Kcr: accurate identification and investigation of protein lysine crotonylation sites based on capsule network. Brief Bioinform 2021; 23:6457166. [PMID: 34882222 DOI: 10.1093/bib/bbab492] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/03/2021] [Revised: 10/13/2021] [Accepted: 10/25/2021] [Indexed: 12/22/2022] Open
Abstract
Lysine crotonylation (Kcr) is a posttranslational modification widely detected in histone and nonhistone proteins. It plays a vital role in human disease progression and various cellular processes, including cell cycle, cell organization, chromatin remodeling and a key mechanism to increase proteomic diversity. Thus, accurate information on such sites is beneficial for both drug development and basic research. Existing computational methods can be improved to more effectively identify Kcr sites in proteins. In this study, we proposed a deep learning model, DeepCap-Kcr, a capsule network (CapsNet) based on a convolutional neural network (CNN) and long short-term memory (LSTM) for robust prediction of Kcr sites on histone and nonhistone proteins (mammals). The proposed model outperformed the existing CNN architecture Deep-Kcr and other well-established tools in most cases and provided promising outcomes for practical use; in particular, the proposed model characterized the internal hierarchical representation as well as the important features from multiple levels of abstraction automatically learned from a small number of samples. The trained model was well generalized in other species (papaya). Moreover, we showed the features and properties generated by the internal capsule layer that can explore the internal data distribution related to biological significance (as a motif detector). The source code and data are freely available at https://github.com/Jhabindra-bioinfo/DeepCap-Kcr.
Affiliation(s)
- Jhabindra Khanal
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Kil To Chong
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea; Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea
20
Abstract
BACKGROUND Identification of motifs and quantification of their occurrences are important for the study of genetic diseases, gene evolution, transcription sites, and other biological mechanisms. Exact formulae for estimating count distributions of motifs under Markovian assumptions have high computational complexity and are impractical to use on large motif sets. Approximated formulae, e.g. based on compound Poisson, are faster, but reliable p-value calculation remains challenging. Here, we introduce 'motif_prob', a fast implementation of an exact formula for motif count distribution through progressive approximation with arbitrary precision. Our implementation speeds up the exact calculation, which is usually impractical, making it feasible and a candidate to replace currently employed heuristics. RESULTS We implement motif_prob in both Perl and C++, using an efficient error-bound iterative process for the exact formula, and provide a comparison with state-of-the-art tools (e.g. MoSDi) in terms of precision and run time benchmarks, along with a real-world use case on bacterial motif characterization. Our software is able to process a million motifs (13-31 bases) over genome lengths of 5 million bases within a minute on a regular laptop, and the run times for both the Perl and C++ code are several orders of magnitude smaller (50-1000× faster) than MoSDi, even when using their fast compound Poisson approximation (60-120× faster). In the real-world use cases, we first show the consistency of motif_prob with MoSDi, and then how the p-value quantification is crucial for enrichment quantification when bacteria have different GC content, using motifs found in antimicrobial resistance genes. The software and the code sources are available under the MIT license at https://github.com/DataIntellSystLab/motif_prob. CONCLUSIONS The motif_prob software is a multi-platform and efficient open source solution for calculating exact frequency distributions of motifs. It can be integrated with motif discovery/characterization tools for quantifying enrichment and deviation from expected frequency ranges with exact p-values, without loss in data processing efficiency.
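To make the kind of heuristic this abstract contrasts with the exact formula concrete, here is a minimal sketch of a plain Poisson approximation for a motif's occurrence count. It is not motif_prob and not the compound Poisson method of MoSDi: it assumes independent, identically distributed bases (an order-0 background) and ignores motif self-overlap, and all function names are hypothetical.

```python
import math


def expected_count(motif, genome_length, base_freq):
    """Expected number of occurrences under an i.i.d. base model (assumption)."""
    p = 1.0
    for base in motif:
        p *= base_freq[base]
    return (genome_length - len(motif) + 1) * p


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed term by term.

    Adequate for small lam, as with long motifs; exp(-lam) underflows
    for very large lam.
    """
    term = math.exp(-lam)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)


# Usage: the same GC-rich motif and count in two genomes of different GC content.
at_rich = {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35}
gc_rich = {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}
for name, freq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    lam = expected_count("GCGCGCGCGCGCG", 5_000_000, freq)
    print(name, lam, poisson_upper_tail(2, lam))
```

The usage lines show why background composition matters for enrichment: the same observed count yields very different tail probabilities when the base frequencies change, which is the GC-content effect the abstract refers to.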
Affiliation(s)
- Mattia Prosperi
- Data Intelligence Systems Lab, Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA.
- Simone Marini
- Data Intelligence Systems Lab, Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA
- Christina Boucher
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
21
Abstract
Discovering motifs and repeats in data sequences is of great importance in biology, and many efficient tools for finding them have been developed. As the number of results found can be very large, our goal is to provide a tool that, on a mathematical basis, can precisely find all motifs and repeats, filter them according to input arguments and output the results in a convenient way. RepeatsPlus is a program that provides statistical filtering according to input sequence length and number of repeat occurrences, motif mask filtering, filtering related to ambiguous letters in the input sequence, and a large number of other options. RepeatsPlus is implemented in Python and C[Formula: see text]. It is freely available for public use. The user manual and examples of usage are also available.
Affiliation(s)
- Ana Jelović
- Department of Mathematics, Faculty of Transport and Traffic Engineering, Belgrade University, Vojvode Stepe 305, 11000 Belgrade, Serbia
22
Aman Beshir J, Kebede M. In silico analysis of promoter regions and regulatory elements (motifs and CpG islands) of the genes encoding for alcohol production in Saccharomyces cerevisiaea S288C and Schizosaccharomyces pombe 972h. J Genet Eng Biotechnol 2021; 19:8. [PMID: 33428031 PMCID: PMC7801573 DOI: 10.1186/s43141-020-00097-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/19/2020] [Accepted: 11/17/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND The crucial factor in the production of bio-fuels is the choice of potent microorganisms used in fermentation processes. Despite the evolving trend of using bacteria, yeast is still the primary choice for fermentation. Molecular characterization of many genes from baker's yeast (Saccharomyces cerevisiaea), and fission yeast (Schizosaccharomyces pombe), have improved our understanding in gene structure and the regulation of its expression. This in silico study was done with the aim of analyzing the promoter regions, transcription start site (TSS), and CpG islands of genes encoding for alcohol production in S. cerevisiaea S288C and S. pombe 972h-. RESULTS The analysis revealed the highest promoter prediction scores (1.0) were obtained in five sequences (AAD4, SFA1, GRE3, YKL071W, and YPR127W) for S. cerevisiaea S288C TSS while the lowest (0.8) were found in three sequences (AAD6, ADH5, and BDH2). Similarly, in S. pombe 972h-, the highest (0.99) and lowest (0.88) prediction scores were obtained in five (Adh1, SPBC8E4.04, SPBC215.11c, SPAP32A8.02, and SPAC19G12.09) and one (erg27) sequences, respectively. Determination of common motifs revealed that S. cerevisiaea S288C had 100% coverage at MSc1 with an E value of 3.7e-007 while S. pombe 972h- had 95.23% at MSp1 with an E value of 2.6e+002. Furthermore, comparison of identified transcription factor proteins indicated that 88.88% of MSp1 were exactly similar to MSc1. It also revealed that only 21.73% in S. cerevisiaea S288C and 28% in S. pombe 972h- of the gene body regions had CpG islands. A combined phylogenetic analysis indicated that all sequences from both S. cerevisiaea S288C and S. pombe 972h- were divided into four subgroups (I, II, III, and IV). The four clades are respectively colored in blue, red, green, and violet. 
CONCLUSION This in silico analysis of gene promoter regions and transcription factors through the actions of regulatory structure such as motifs and CpG islands of genes encoding alcohol production could be used to predict gene expression profiles in yeast species.
Affiliation(s)
- Jemal Aman Beshir
- Department of Applied Biology, School of Applied Natural Science, Adama Science and Technology University, P.O. Box 1888, Adama, Ethiopia
- Ethiopian Sugar Corporation, Sugar Academy, Wonji, Ethiopia
- Mulugeta Kebede
- Department of Applied Biology, School of Applied Natural Science, Adama Science and Technology University, P.O. Box 1888, Adama, Ethiopia
23
Lees-Miller JP, Cobban A, Katsonis P, Bacolla A, Tsutakawa SE, Hammel M, Meek K, Anderson DW, Lichtarge O, Tainer JA, Lees-Miller SP. Uncovering DNA-PKcs ancient phylogeny, unique sequence motifs and insights for human disease. Prog Biophys Mol Biol 2020; 163:87-108. [PMID: 33035590 PMCID: PMC8021618 DOI: 10.1016/j.pbiomolbio.2020.09.010] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 07/19/2020] [Revised: 09/12/2020] [Accepted: 09/29/2020] [Indexed: 01/26/2023]
Abstract
DNA-dependent protein kinase catalytic subunit (DNA-PKcs) is a key member of the phosphatidylinositol-3 kinase-like (PIKK) family of protein kinases with critical roles in DNA-double strand break repair, transcription, metastasis, mitosis, RNA processing, and innate and adaptive immunity. The absence of DNA-PKcs from many model organisms has led to the assumption that DNA-PKcs is a vertebrate-specific PIKK. Here, we find that DNA-PKcs is widely distributed in invertebrates, fungi, plants, and protists, and that threonines 2609, 2638, and 2647 of the ABCDE cluster of phosphorylation sites are highly conserved amongst most Eukaryotes. Furthermore, we identify highly conserved amino acid sequence motifs and domains that are characteristic of DNA-PKcs relative to other PIKKs. These include residues in the Forehead domain and a novel motif we have termed YRPD, located in an α helix C-terminal to the ABCDE phosphorylation site loop. Combining sequence with biochemistry plus structural data on human DNA-PKcs unveils conserved sequence and conformational features with functional insights and implications. The defined generally progressive DNA-PKcs sequence diversification uncovers conserved functionality supported by Evolutionary Trace analysis, suggesting that for many organisms both functional sites and evolutionary pressures remain identical due to fundamental cell biology. The mining of cancer genomic data and germline mutations causing human inherited disease reveal that robust DNA-PKcs activity in tumors is detrimental to patient survival, whereas germline mutations compromising function are linked to severe immunodeficiency and neuronal degeneration. We anticipate that these collective results will enable ongoing DNA-PKcs functional analyses with biological and medical implications.
Affiliation(s)
- James P Lees-Miller
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, T2N 4N1, Canada
- Alexander Cobban
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, T2N 4N1, Canada
- Panagiotis Katsonis
- Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Albino Bacolla
- Departments of Cancer Biology and of Molecular and Cellular Oncology, University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA
- Susan E Tsutakawa
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Michal Hammel
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Katheryn Meek
- College of Veterinary Medicine, Department of Microbiology & Molecular Genetics, and Department of Pathobiology & Diagnostic Investigation, Michigan State University, East Lansing, MI, 48824, USA
- Dave W Anderson
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, T2N 4N1, Canada
- Olivier Lichtarge
- Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- John A Tainer
- Departments of Cancer Biology and of Molecular and Cellular Oncology, University of Texas MD Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA; Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
- Susan P Lees-Miller
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, T2N 4N1, Canada.
24
Beg AZ, Khan AU. Motifs and interface amino acid-mediated regulation of amyloid biogenesis in microbes to humans: potential targets for intervention. Biophys Rev 2020; 12:1249-1256. [PMID: 32930961 DOI: 10.1007/s12551-020-00759-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2020] [Accepted: 09/04/2020] [Indexed: 02/08/2023] Open
Abstract
Amyloids are linked to many debilitating diseases in mammals. Some organisms produce amyloids that have a functional role in the maintenance of their biological processes. Microbes utilize functional bacterial amyloids (FuBA) for pathogenicity and infections. Amyloid biogenesis is regulated differentially in various systems to avoid its toxic accumulation. A familiar feature in the process of amyloid biogenesis from humans to microbes is its regulation by protein-protein interactions (PPI). The spatial arrangement of amino acid residues in proteins generates topologies such as flat interfaces and linear motifs, which participate in protein interactions. Motif- and interface residue-mediated interactions have a direct or an indirect impact on amyloid secretion and assembly. Some motifs undergo post-translational modifications (PTM), which affect the interactions and dynamics of the amyloid biogenesis cascade. Interaction-induced local changes stimulate global conformational transitions in the PPI complex, which indirectly affect amyloid formation. Perturbation of such motifs and interface residues results in amyloid abolishment. Interface residues, motifs and their respective interactive protein partners could serve as potential targets for intervention to inhibit amyloid biogenesis.
Affiliation(s)
- Ayesha Z Beg
- Medical Microbiology and Molecular Biology, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, 202002, India
- Asad U Khan
- Medical Microbiology and Molecular Biology, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, 202002, India.
25
Mossé B, Remy É. A Combinatorial Exploration of Boolean Dynamics Generated by Isolated and Chorded Circuits. Acta Biotheor 2020; 68:87-117. [PMID: 31407132 DOI: 10.1007/s10441-019-09355-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2019] [Accepted: 07/27/2019] [Indexed: 10/26/2022]
Abstract
Most studies of motifs of biological regulatory networks focus on the analysis of asymptotical behaviours (attractors, and even often only stable states), but transient properties are rarely addressed. In the line of our previous study devoted to isolated circuits (Remy et al. in Bioinformatics (Oxford, England) 19(Suppl. 2):172-178, 2003), we consider chorded circuits, that are motifs made of an elementary positive or negative circuit with a chord, possibly a self-loop. We provide detailed descriptions of the boolean dynamics of chorded circuits versus isolated circuits, under the synchronous and asynchronous updating schemes within the logical formalism. To this end, we address the description of the trajectories in the dynamics of isolated circuits with coding techniques and adapt them for chorded circuits. The use of the logical modeling gives access to mathematical tools (group actions, analysis of recurrent sequences, coding of trajectories, specific abacus...) allowing complete analytical analysis of basic yet important motifs. In particular, we show that whatever the chosen updating rule, the dynamics depends on a small number of parameters.
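As a rough illustration of the synchronous updating scheme mentioned in this abstract, the sketch below enumerates the attractors of a small isolated circuit by brute force. It is not the authors' coding technique: the function names `synchronous_step` and `attractor_lengths`, and the convention that node i is regulated only by node i-1 with a sign of +1 (activation) or -1 (inhibition), are assumptions made for this example.

```python
from itertools import product

def synchronous_step(state, signs):
    """Update all nodes at once: node i copies node i-1 (sign +1) or negates it (sign -1)."""
    n = len(state)
    return tuple(
        state[i - 1] if signs[i] == 1 else 1 - state[i - 1]
        for i in range(n)
    )

def attractor_lengths(signs):
    """Return the sorted cycle lengths of the synchronous dynamics of an isolated circuit.

    In an isolated circuit each node has exactly one regulator, so the synchronous
    map is a bijection on the 2^n states and every state lies on exactly one cycle.
    """
    n = len(signs)
    seen = set()
    lengths = []
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        state, length = start, 0
        # Follow the trajectory until it closes back on itself.
        while state not in seen:
            seen.add(state)
            state = synchronous_step(state, signs)
            length += 1
        lengths.append(length)
    return sorted(lengths)

# Positive 3-node circuit (no inhibition) versus a negative one (one inhibition).
print(attractor_lengths((1, 1, 1)))
print(attractor_lengths((-1, 1, 1)))
```

Under these assumptions the positive circuit has two stable states plus two cycles of length 3, while the negative circuit has no stable state at all, only cycles of length 2 and 6.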
26
Nolte M, Gal E, Markram H, Reimann MW. Impact of higher order network structure on emergent cortical activity. Netw Neurosci 2020; 4:292-314. [PMID: 32181420 PMCID: PMC7069066 DOI: 10.1162/netn_a_00124] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/14/2019] [Accepted: 12/23/2019] [Indexed: 11/04/2022] Open
Abstract
Synaptic connectivity between neocortical neurons is highly structured. The network structure of synaptic connectivity includes first-order properties that can be described by pairwise statistics, such as strengths of connections between different neuron types and distance-dependent connectivity, and higher order properties, such as an abundance of cliques of all-to-all connected neurons. The relative impact of first- and higher order structure on emergent cortical network activity is unknown. Here, we compare network structure and emergent activity in two neocortical microcircuit models with different synaptic connectivity. Both models have a similar first-order structure, but only one model includes higher order structure arising from morphological diversity within neuronal types. We find that such morphological diversity leads to more heterogeneous degree distributions, increases the number of cliques, and contributes to a small-world topology. The increase in higher order network structure is accompanied by more nuanced changes in neuronal firing patterns, such as an increased dependence of pairwise correlations on the positions of neurons in cliques. Our study shows that circuit models with very similar first-order structure of synaptic connectivity can have a drastically different higher order network structure, and suggests that the higher order structure imposed by morphological diversity within neuronal types has an impact on emergent cortical activity.
Affiliation(s)
- Max Nolte
- Blue Brain Project, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland
- Eyal Gal
- Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University, Jerusalem, Israel
- Henry Markram
- Blue Brain Project, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland
- Laboratory of Neural Microcircuitry, Brain Mind Institute, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Michael W. Reimann
- Blue Brain Project, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland
27
Bauermeister C, Keren H, Braun J. Unstructured network topology begets order-based representation by privileged neurons. Biol Cybern 2020; 114:113-135. [PMID: 32107622 PMCID: PMC7062672 DOI: 10.1007/s00422-020-00819-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2019] [Accepted: 02/01/2020] [Indexed: 06/10/2023]
Abstract
How spiking activity reverberates through neuronal networks, how evoked and spontaneous activity interacts and blends, and how the combined activities represent external stimulation are pivotal questions in neuroscience. We simulated minimal models of unstructured spiking networks in silico, asking whether and how gentle external stimulation might be subsequently reflected in spontaneous activity fluctuations. Consistent with earlier findings in silico and in vitro, we observe a privileged subpopulation of 'pioneer neurons' that, by their firing order, reliably encode previous external stimulation. We also confirm that pioneer neurons are 'sensitive' in that they are recruited by small fluctuations of population activity. We show that order-based representations rely on a 'chain' of pioneer neurons with different degrees of sensitivity and thus constitute an emergent property of collective dynamics. The forming of such representations is greatly favoured by a broadly heterogeneous connection topology, that is, a broad 'middle class' in degree of connectedness. In conclusion, we offer a minimal model for the representational role of pioneer neurons, as observed experimentally in vitro. In addition, we show that broadly heterogeneous connectivity enhances the representational capacity of unstructured networks.
Affiliation(s)
- Christoph Bauermeister
- Institute of Biology, Otto-von-Guericke University, Leipziger Str. 44, Haus 91, 39120, Magdeburg, Germany
- Center for Behavioral Brain Sciences, Leipziger Str. 44, 39120, Magdeburg, Germany
- Hanna Keren
- Network Biology Research Laboratory, Electrical Engineering, Technion-Israel Institute of Technology, 3200003, Haifa, Israel
- Jochen Braun
- Institute of Biology, Otto-von-Guericke University, Leipziger Str. 44, Haus 91, 39120, Magdeburg, Germany.
- Center for Behavioral Brain Sciences, Leipziger Str. 44, 39120, Magdeburg, Germany.
| |
28
Rendsvig JKH, Workman CT, Hoof JB. Bidirectional histone-gene promoters in Aspergillus: characterization and application for multi-gene expression. Fungal Biol Biotechnol 2019; 6:24. [PMID: 31867115 DOI: 10.1186/s40694-019-0088-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/21/2019] [Accepted: 11/23/2019] [Indexed: 02/01/2023] Open
Abstract
Background: Filamentous fungi are important producers of enzymes and bioactive secondary metabolites and are exploited for industrial purposes. Expression and characterization of biosynthetic pathways requires stable expression of multiple genes in the production host. Fungal promoters are indispensable for the accomplishment of this task, and libraries of promoters that show functionality across diverse fungal species facilitate synthetic biology approaches, pathway expression, and cell-factory construction. Results: In this study, we characterized the intergenic region between the genes encoding histones H4.1 and H3, from five phylogenetically diverse species of Aspergillus, as bidirectional promoters (Ph4h3). By expression of the genes encoding fluorescent proteins mRFP1 and mCitrine, we show at the translational and transcriptional level that this region from diverse species is applicable as strong and constitutive bidirectional promoters in Aspergillus nidulans. Bioinformatic analysis showed that the divergent gene orientation of h4.1 and h3 appears maintained among fungi, and that the Ph4h3 display conserved DNA motifs among the investigated 85 Aspergilli. Two of the heterologous Ph4h3s were utilized for single-locus expression of four genes from the putative malformin producing pathway from Aspergillus brasiliensis in A. nidulans. Strikingly, heterologous expression of mlfA encoding the non-ribosomal peptide synthetase is sufficient for biosynthesis of malformins in A. nidulans, which indicates an iterative use of one adenylation domain in the enzyme. However, this resulted in highly stressed colonies, which was reverted to a healthy phenotype by co-expressing the residual four genes from the putative biosynthetic gene cluster. Conclusions: Our study has documented that Ph4h3 is a strong constitutive bidirectional promoter and a valuable new addition to the genetic toolbox of at least the genus Aspergillus.
29
Calkins TL, Tamborindeguy C, Pietrantonio PV. GPCR annotation, G proteins, and transcriptomics of fire ant (Solenopsis invicta) queen and worker brain: An improved view of signaling in an invasive superorganism. Gen Comp Endocrinol 2019; 278:89-103. [PMID: 30576645 DOI: 10.1016/j.ygcen.2018.12.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/14/2018] [Revised: 12/13/2018] [Accepted: 12/17/2018] [Indexed: 10/27/2022]
Abstract
Knowledge of G protein-coupled receptors (GPCRs) and their signaling modalities is crucial to advancing insect endocrinology, specifically in highly successful invasive social insects, such as the red imported fire ant, Solenopsis invicta Buren. In the first published draft genome of S. invicta, emphasis was placed on the annotation of olfactory receptors, and only the number of predicted GPCR genes was reported. Without an organized and curated resource for GPCRs, it will be difficult to test hypotheses on the endocrine role of neuropeptide hormones, or the function of neurotransmitters and neuromodulators. Therefore, we mined the S. invicta genome for GPCRs and found 324 predicted transcripts encoded by 125 predicted loci and improved the annotation of 55 of these loci. Among them are sixteen GPCRs that are currently annotated as "uncharacterized proteins". Further, the phylogenetic analysis of class A neuropeptide receptors presented here and the comparative listing of GPCRs in the hymenopterans S. invicta, Apis mellifera (both eusocial), Nasonia vitripennis (solitary), and the solitary model dipteran Drosophila melanogaster will facilitate comparative endocrinological studies related to social insect evolution and diversity. We compiled the 24 G protein transcripts predicted (15 α, 7 β, and 2 γ) from 12 G protein genes (5 α, 5 β, and 2 γ). Reproductive division of labor is extreme in this ant species; therefore, we compared GPCR and G protein gene expression among worker, mated queen and alate virgin queen ant brain transcriptomes. Transcripts for ten GPCRs and two G proteins were differentially expressed between queen and worker brains. The differentially expressed GPCRs are candidate receptors to explore hypotheses on division of labor in this species.
Affiliation(s)
- Travis L Calkins
- Department of Entomology, Texas A&M University, College Station, TX 77843-2475, USA
30
Abstract
Cap Analysis of Gene Expression (CAGE) is one of the most popular 5'-end sequencing methods. In a single experiment, CAGE can be used to locate and quantify the expression of both Transcription Start Sites (TSSs) and enhancers. This workflow is a case study on how to use the CAGEfightR package to orchestrate analysis of CAGE data within the Bioconductor project. The workflow starts from BigWig files and covers basic CAGE analyses, such as identifying, quantifying and annotating TSSs and enhancers; advanced analyses, such as finding interacting TSS-enhancer pairs and enhancer clusters; and differential expression analysis and alternative TSS usage. R code, discussion and references are intertwined to help provide guidelines for future CAGE studies of the same kind.
Affiliation(s)
- Malte Thodberg
- Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark
- Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark
- Albin Sandelin
- Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark
- Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark
31
Wozniak JM, Gonzalez DJ. PTMphinder: an R package for PTM site localization and motif extraction from proteomic datasets. PeerJ 2019; 7:e7046. [PMID: 31198645 PMCID: PMC6555389 DOI: 10.7717/peerj.7046] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/27/2019] [Accepted: 04/24/2019] [Indexed: 11/30/2022] Open
Abstract
Background: Mass-spectrometry-based proteomics is a prominent field of study that allows for the unbiased quantification of thousands of proteins from a particular sample. A key advantage of these techniques is the ability to detect protein post-translational modifications (PTMs) and localize them to specific amino acid residues. These approaches have led to many significant findings in a wide range of biological disciplines, from developmental biology to cancer and infectious diseases. However, there is a current lack of tools available to connect raw PTM site information to biologically meaningful results in a high-throughput manner. Furthermore, many of the available tools require significant programming knowledge to implement. Results: The R package PTMphinder was designed to enable researchers, particularly those with minimal programming background, to thoroughly analyze PTMs in proteomic data sets. The package contains three functions: parseDB, phindPTMs and extractBackground. Together, these functions allow users to reformat proteome databases for easier analysis, localize PTMs within full proteins, extract motifs surrounding the identified sites and create proteome-specific motif backgrounds for statistical purposes. Beta-testing of this R package has demonstrated its simplicity and ease of integration with existing tools. Conclusion: PTMphinder empowers researchers to fully analyze and interpret PTMs derived from proteomic data. This package is simple enough for researchers with limited programming experience to understand and implement. The data produced from this package can inform subsequent research by itself and also be used in conjunction with other tools, such as motif-x, for further analysis.
Affiliation(s)
- Jacob M Wozniak
- Department of Pharmacology, University of California, San Diego, La Jolla, CA, United States of America
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- David J Gonzalez
- Department of Pharmacology, University of California, San Diego, La Jolla, CA, United States of America
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
32
Chen L, Wang X, Wang L, Fang Y, Pan X, Gao X, Zhang W. Functional characterization of chloroplast transit peptide in the small subunit of Rubisco in maize. J Plant Physiol 2019; 237:12-20. [PMID: 30999073 DOI: 10.1016/j.jplph.2019.04.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/18/2018] [Revised: 04/04/2019] [Accepted: 04/04/2019] [Indexed: 06/09/2023]
Abstract
Functions of domains or motifs, which are encoded by the transit peptide (TP) of the precursor of the small subunit of Rubisco (prSSU), have been investigated intensively in dicots. Functional characterization of the prSSU TP, however, is still understudied in maize. In this study, we found that the TP of maize prSSU1 did not function fully in chloroplast targeting in Arabidopsis or vice versa, indicating the divergent function of TPs in chloroplast targeting between maize and Arabidopsis. Through deletion or substitution assays, we found that the N-terminal region of maize or Arabidopsis prSSU1 was necessary and sufficient for specifically importing the fused green fluorescent protein (GFP) into each corresponding chloroplast. Finally, we found that the first five amino acids and the MM motif in the N-terminal domain of the maize TP played an essential role in maize chloroplast targeting. Thus, our analyses demonstrate that the N-terminal domain of the prSSU1 TP is the key determinant in chloroplast targeting between maize and Arabidopsis. Our study highlights the unique properties of the maize prSSU1 TP in chloroplast targeting, thus helping to understand the role of the N-terminal domain in chloroplast targeting across species. It will help to manipulate chloroplast transit peptides (cTPs) for crop bioengineering.
Affiliation(s)
- Lifen Chen
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Ximeng Wang
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Lei Wang
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Yuan Fang
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Xiucai Pan
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Xiquan Gao
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Wenli Zhang
- State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China.
33
Abstract
Background: Our previous study found that more than 500 transcripts significantly increased in abundance in the zebrafish and mouse several hours to days postmortem relative to live controls. The current literature suggests that most mRNAs are post-transcriptionally regulated in stressful conditions. We rationalized that the postmortem transcripts must contain sequence features (3- to 9-mers) that are unique from those in the rest of the transcriptome and that these features putatively serve as binding sites for proteins and/or non-coding RNAs involved in post-transcriptional regulation. Results: We identified 5117 and 2245 over-represented sequence features in the mouse and zebrafish, respectively, which represent less than 1.5% of all possible features. Some of these features were disproportionately distributed along the transcripts with high densities in the 3' untranslated regions of the zebrafish (0.3 mers/nt) and the open reading frames of the mouse (0.6 mers/nt). Yet, the highest density (2.3 mers/nt) occurred in the open reading frames of 11 mouse transcripts that lacked 3' or 5' untranslated regions. These results suggest the transcripts with high density of features might serve as 'molecular sponges' that sequester RNA binding proteins and/or microRNAs, and thus indirectly increase the stability and gene expression of other transcripts. In addition, some of the features were identified as binding sites for Rbfox and Hud proteins that are also involved in increasing transcript stability and gene expression. Conclusions: Our results are consistent with the hypothesis that transcripts involved in responding to extreme stress, such as organismal death, have sequence features that make them different from the rest of the transcriptome. Some of these features serve as putative binding sites for proteins and non-coding RNAs that determine transcript stability and fate. A small number of the transcripts have high density sequence features, which are presumably involved in sequestering RNA binding proteins and microRNAs and thus preventing regulatory interactions among other transcripts. Our results provide baseline data on post-transcriptional regulation in stressful conditions that has implications for regulation in disease, starvation, and cancer.
Affiliation(s)
- Peter A. Noble
- Department of Periodontics, University of Washington, Box 357444, Seattle, WA 98195 USA
- Alexander E. Pozhitkov
- City of Hope, Information Sciences - Beckman Research Institute, 4920 Rivergrade Rd., Irwindale, CA 91706 USA
34
Mohapatra RK, Nanda S. In silico analysis of onion chitinases using transcriptome data. Bioinformation 2018; 14:440-445. [PMID: 30310251 PMCID: PMC6166401 DOI: 10.6026/97320630014440] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/21/2018] [Revised: 08/25/2018] [Accepted: 08/25/2018] [Indexed: 12/24/2022] Open
Abstract
Chitinases are a glycoside hydrolase (GH) family of proteins with multifaceted roles in plants. It is of interest to identify and characterize chitinase-encoding genes from the popular bulbous plant onion (Allium cepa L.). We have used the EST sequences for onion chitinases to elucidate their functional features using sequence, structure and functional analysis. These contigs belong to the GH19 chitinase family according to domain architecture analysis. They have highly conserved chitinase motifs, including motifs exclusive to plant chitinases, as implied by the MEME-based structural characterization. Estimation of biochemical properties suggested that these sequences encode stable, hydrophilic proteins capable of localizing extracellularly and in vacuoles. Further, they participate in multiple cellular processes, including a defense role, as inferred by DeepGO function prediction. Phylogenetic analysis grouped them as class I and class VII plant chitinases, with a possible abundance of class I chitinases in onion. These observations help in the isolation and functional validation of onion chitinases.
Affiliation(s)
- Rupesh Kumar Mohapatra
- Center for Biotechnology, Siksha 'O' Anusandhan University, Bhubaneswar, Odisha 751003, India
- Satyabrata Nanda
- Center for Biotechnology, Siksha 'O' Anusandhan University, Bhubaneswar, Odisha 751003, India
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou, Zhejiang 311440, China
35
Pandey V, Krishnan V, Basak N, Marathe A, Thimmegowda V, Dahuja A, Jolly M, Sachdev A. Molecular modeling and in silico characterization of GmABCC5: a phytate transporter and potential target for low-phytate crops. 3 Biotech 2018; 8:54. [PMID: 29354365 DOI: 10.1007/s13205-017-1053-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/04/2017] [Accepted: 12/17/2017] [Indexed: 02/06/2023] Open
Abstract
Designing low-phytate crops without affecting the developmental process in plants has led to the identification of the ABCC5 gene in soybean. The GmABCC5 gene was identified and a partial gene sequence was cloned from the popular Indian soybean genotype Pusa16. Conserved domains and motifs unique to ABC transporters were identified in the 30 homologous sequences retrieved by BLASTP analysis. The homologs were analyzed for their evolutionary relationship and physicochemical properties. Conserved domains, transmembrane architecture and secondary structure of GmABCC5 were predicted with the aid of computational tools. Analysis identified 53 alpha helices and 31 beta strands, predicting 60% of residues in alpha conformation. A three-dimensional (3D) model for GmABCC5 was developed based on 5twv.1.B (Homo sapiens) template homology to gain better insight into its molecular mechanism of transport and sequestration. Spatio-temporal real-time PCR analysis identified mid-to-late seed developmental stages as the time window for the maximum GmABCC5 gene expression, a potential target stage for phytate reduction. Results of this study provide valuable insights into the structural and functional characteristics of GmABCC5, which may be further utilized for the development of nutritionally enriched low-phytate soybean with improved mineral bioavailability.
Affiliation(s)
- Vanita Pandey
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Quality and Basic Sciences, ICAR-Indian Institute of Wheat and Barley Research, Karnal, New Delhi 132 001 India
- Veda Krishnan
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Nabaneeta Basak
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Crop Physiology and Biochemistry, ICAR-National Rice Research Institute, Cuttack, 753006 India
- Ashish Marathe
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Vinutha Thimmegowda
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Anil Dahuja
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Monica Jolly
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
- Archana Sachdev
- Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012 India
36
Wiener HW, Shrestha S, Lu H, Karita E, Kilembe W, Allen S, Hunter E, Goepfert PA, Tang J. Immunogenetic factors in early immune control of human immunodeficiency virus type 1 (HIV-1) infection: Evaluation of HLA class I amino acid variants in two African populations. Hum Immunol 2017; 79:166-171. [PMID: 29289742 DOI: 10.1016/j.humimm.2017.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/15/2017] [Revised: 12/12/2017] [Accepted: 12/13/2017] [Indexed: 01/07/2023]
Abstract
Immune control of HIV-1 infection depends heavily on cytotoxic T-lymphocyte responses restricted by diverse HLA class I molecules. Recent work has uncovered specific amino acid residues (AARs) that seem to dictate the extent of immune control in African Americans, which prompted us to test these emerging hypotheses in seroconverters (SCs) from southern and eastern Africa. Based on data from 196 Zambians and 76 Rwandans with fully resolved HLA alleles and pre-therapy HIV-1 viral loads (VL) in the first 3 to 36 months of infection (>2300 person-visits), four AARs of primary interest (positions 63, 97, 116 and 245 in the mature HLA-B protein) were found to explain 8.1% and 15.8% of variance in set-point VL for these cohorts (P = .024 and 7.5 × 10⁻⁶, respectively). Two AARs not reported previously (167S in HLA-B and 116F in HLA-C) also showed relatively consistent associations with VL (adjusted P = .009-.069), while many population-specific associations were also noted (false discovery rate <0.05). Extensive and often strong linkage disequilibrium among neighboring AAR variants calls for more extensive analyses of AAR haplotypes in diverse cohorts before the structural basis of antigen presentation can be fully comprehended.
Affiliation(s)
- Howard W Wiener
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
- Sadeep Shrestha
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
- Hailin Lu
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Susan Allen
- Zambia-Emory HIV Research Project, Lusaka, Zambia; Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, USA
- Eric Hunter
- Vaccine Research Center, Emory University, Atlanta, GA, USA
- Paul A Goepfert
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Jianming Tang
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA.
37
Reed M, Best J, Golubitsky M, Stewart I, Nijhout HF. Analysis of Homeostatic Mechanisms in Biochemical Networks. Bull Math Biol 2017; 79:2534-2557. [PMID: 28884446 PMCID: PMC5842936 DOI: 10.1007/s11538-017-0340-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/30/2017] [Accepted: 08/25/2017] [Indexed: 12/18/2022]
Abstract
Cell metabolism is an extremely complicated dynamical system that maintains important cellular functions despite large changes in inputs. This "homeostasis" does not mean that the dynamical system is rigid and fixed. Typically, large changes in external variables cause large changes in some internal variables so that, through various regulatory mechanisms, certain other internal variables (concentrations or velocities) remain approximately constant over a finite range of inputs. Outside that range, the mechanisms cease to function and concentrations change rapidly with changes in inputs. In this paper we analyze four different common biochemical homeostatic mechanisms: feedforward excitation, feedback inhibition, kinetic homeostasis, and parallel inhibition. We show that all four mechanisms can occur in a single biological network, using folate and methionine metabolism as an example. Golubitsky and Stewart have proposed a method to find homeostatic nodes in networks. We show that their method works for two of these mechanisms but not the other two. We discuss the many interesting mathematical and biological questions that emerge from this analysis, and we explain why understanding homeostatic control is crucial for precision medicine.
Affiliation(s)
- Michael Reed
- Department of Mathematics, Duke University, Durham, NC, 27708, USA.
| | - Janet Best
- Department of Mathematics, The Ohio State University, Columbus, OH, 43210, USA
| | - Martin Golubitsky
- Department of Mathematics, The Ohio State University, Columbus, OH, 43210, USA
| | - Ian Stewart
- Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
| | | |
|
38
|
Ortmann M, Brandes U. Efficient orbit-aware triad and quad census in directed and undirected graphs. Appl Netw Sci 2017; 2:13. [PMID: 30443568 PMCID: PMC6214268 DOI: 10.1007/s41109-017-0027-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/21/2016] [Accepted: 04/20/2017] [Indexed: 06/09/2023]
Abstract
The prevalence of select substructures is an indicator of network effects in applications such as social network analysis and systems biology. Moreover, subgraph statistics are pervasive in stochastic network models, and they need to be assessed repeatedly in MCMC sampling and estimation algorithms. We present a new approach to count all induced and non-induced four-node subgraphs (the quad census) on a per-node and per-edge basis, complete with a separation into their non-automorphic roles in these subgraphs. It is the first approach to do so in a unified manner, and is based on only a clique-listing subroutine. Computational experiments indicate that, despite its simplicity, the approach outperforms previous, less general approaches. By way of the more presentable triad census, we additionally show how to extend the quad census to directed graphs. As a byproduct we obtain the asymptotically fastest triad census algorithm to date.
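To make the term "triad census" concrete, here is a minimal brute-force sketch for undirected graphs that classifies every node triple by how many of its three possible edges are present. It is a cubic-time baseline written for illustration only; it is not the clique-listing algorithm of the paper and it does not separate nodes into orbits. The function name `triad_census` is hypothetical.

```python
from itertools import combinations


def triad_census(nodes, edges):
    """Count node triples of an undirected graph by number of edges present (0 to 3)."""
    edge_set = {frozenset(e) for e in edges}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    for a, b, c in combinations(nodes, 3):
        present = sum(frozenset(pair) in edge_set
                      for pair in ((a, b), (a, c), (b, c)))
        census[present] += 1
    return census


# Toy graph: a triangle 1-2-3 with a pendant edge 3-4.
census = triad_census([1, 2, 3, 4], [(1, 2), (2, 3), (1, 3), (3, 4)])
print(census)  # {0: 0, 1: 1, 2: 2, 3: 1}
```

The four triples of the toy graph fall into one triangle, two open paths, and one single-edge triple.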
Affiliation(s)
- Mark Ortmann
- Department of Computer & Information Science, University of Konstanz, Box 67, Konstanz, 78457 Germany
| | - Ulrik Brandes
- Department of Computer & Information Science, University of Konstanz, Box 67, Konstanz, 78457 Germany
| |
|
39
|
Abstract
The ProFunc web server is a tool for helping identify the function of a given protein whose 3D coordinates have been experimentally determined or homology modeled. It uses a cocktail of both sequence- and structure-based methods to identify matches to other proteins that may, in turn, suggest the query protein's most likely function. The server was originally developed to aid the worldwide structural genomics effort at the start of the millennium. It accepts a file containing the protein's 3D coordinates in PDB format, and, when processing is complete, sends an email containing a link to the password-protected result pages. The results include an at-a-glance summary, as well as separate pages containing more detailed analyses. The server can be found at: http://www.ebi.ac.uk/thornton-srv/databases/profunc.
Affiliation(s)
- Roman A Laskowski
- European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| |
|
40
|
Abstract
Analysis of gene co-expression networks is a powerful "data-driven" tool, invaluable for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise "meta-analysis" framework. Here we describe an integrated approach to cancer expression meta-analysis, which combines generation of "data-driven" co-expression networks with detailed statistical detection of promoter sequence motifs within the co-expression clusters. First, we applied the Weighted Gene Co-Expression Network Analysis (WGCNA) workflow and Pearson's correlation to generate a comprehensive set of over 3000 co-expression clusters in 82 normalized microarray datasets from nine cancers of different origin. Next, we designed a genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. The approach, realized as the cisExpress software module, was specifically designed for analysis of very large data sets such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We discovered that although co-expression modules are populated with different sets of genes, they share distinct stable patterns of co-regulation based on promoter sequence analysis. The number of motifs per co-expression cluster varies widely in accordance with cancer tissue of origin, with the largest number in colon (68 motifs) and the lowest in ovary (18 motifs). The top scored motifs are typically shared between several tissues; they define sets of target genes responsible for certain functionality of cancerogenesis.
Both the co-expression modules and a database of precalculated motifs are publicly available and accessible for further studies.
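The first step described above, turning expression profiles into a weighted co-expression network, can be sketched as follows. The code computes Pearson's correlation between gene profiles and raises its absolute value to a power `beta`, which is the unsigned soft-threshold adjacency used in WGCNA. The gene names, expression values, and the choice `beta=6` are illustrative assumptions, not data or settings from the study, and the clustering and motif-detection steps are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor|**beta for every gene pair."""
    genes = sorted(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for i, g in enumerate(genes) for h in genes[i + 1:]}


# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with geneA
    "geneC": [1.0, -1.0, 1.0, -1.0],  # weakly anti-correlated with geneA
}
adjacency = soft_threshold_adjacency(expr)
for pair, weight in adjacency.items():
    print(pair, round(weight, 4))
```

Raising the correlation to a power keeps strongly co-expressed pairs near 1 while pushing weak correlations toward 0, so that later clustering is dominated by the strong links.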
Affiliation(s)
- Martin Triska
- Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA
| | | | - Yuri Nikolsky
- Prosapia Genetics, Solana Beach, CA, USA; School of Systems Biology, George Mason University, Fairfax, VA, USA
| | - Tatiana V Tatarinova
- Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA; Center for Personalized Medicine, Children's Hospital Los Angeles, 4640 Hollywood Blvd, Los Angeles, CA, 90027, USA; A.A. Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia.
| |
|
41
|
Langhans M, Weber W, Babel L, Grunewald M, Meckel T. The right motifs for plant cell adhesion: what makes an adhesive site? Protoplasma 2017; 254:95-108. [PMID: 27091341 DOI: 10.1007/s00709-016-0970-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/23/2016] [Accepted: 03/31/2016] [Indexed: 06/05/2023]
Abstract
Cells of multicellular organisms are surrounded by and attached to a matrix of fibrous polysaccharides and proteins known as the extracellular matrix. This fibrous network not only serves as a structural support to cells and tissues but also plays an integral part in processes as important as proliferation, differentiation, or defense. While at first sight the extracellular matrices of plants and animals do not have much in common, a closer look reveals remarkable similarities. In particular, the proteins involved in the adhesion of the cell to the extracellular matrix share many functional properties. At the sequence level, however, a surprising lack of homology is found between adhesion-related proteins of plants and animals. Both protein machineries only reveal similarities between small subdomains and motifs, which further underlines their functional relationship. In this review, we provide an overview of the similarities between motifs in proteins known to be located at the plant cell wall-plasma membrane-cytoskeleton interface and proteins of the animal adhesome. We also show that by comparing the proteomes of both adhesion machineries at the level of motifs, we are able to identify potentially new candidate proteins that functionally contribute to the adhesion of the plant plasma membrane to the cell wall.
Affiliation(s)
- Markus Langhans
- Membrane Dynamics, Department of Biology, Technische Universität Darmstadt, Germany, Schnittspahnstrasse 3, 64297, Darmstadt, Germany
| | - Wadim Weber
- Membrane Dynamics, Department of Biology, Technische Universität Darmstadt, Germany, Schnittspahnstrasse 3, 64297, Darmstadt, Germany
| | - Laura Babel
- Membrane Dynamics, Department of Biology, Technische Universität Darmstadt, Germany, Schnittspahnstrasse 3, 64297, Darmstadt, Germany
| | - Miriam Grunewald
- Membrane Dynamics, Department of Biology, Technische Universität Darmstadt, Germany, Schnittspahnstrasse 3, 64297, Darmstadt, Germany
| | - Tobias Meckel
- Membrane Dynamics, Department of Biology, Technische Universität Darmstadt, Germany, Schnittspahnstrasse 3, 64297, Darmstadt, Germany.
| |
|
42
|
Abstract
Background Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation. Results We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the large class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods. 
Conclusions Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision-recall (PR) curves is a more precise and informative metric than that of receiver-operating-characteristic (ROC) curves, confirming similar findings in other class-imbalanced settings. Our systematic assessment of transfer methods reveals that simple approaches to both motif- and network-based transfer of regulatory information provide equal or better results than more elaborate methods. We also show that there are no effective predictors of transfer efficacy, substantiating the long-standing practice of manual curation in comparative genomics analyses. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1113-7) contains supplementary material, which is available to authorized users.
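The contrast between PR and ROC summaries under class imbalance can be seen on a small synthetic ranking. The sketch below computes the ROC area as the fraction of correctly ordered positive-negative pairs and summarizes the PR curve by average precision, one common estimator of its area. The data (2 positives among 100 candidates, ranked 5th and 10th) are invented for illustration and do not come from the study; the paper's exact estimator may differ.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive.

    Assumes distinct scores and at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Synthetic imbalanced ranking: 100 candidates, positives at ranks 5 and 10.
scores = [100 - i for i in range(100)]            # rank 1 has the highest score
labels = [1 if rank in (5, 10) else 0 for rank in range(1, 101)]

roc = roc_auc(scores, labels)
ap = average_precision(scores, labels)
print(f"ROC AUC = {roc:.3f}, average precision = {ap:.3f}")
```

Here the ROC area is about 0.94 because almost all of the 98 negatives sit below both positives, while average precision is 0.2 because most of the top-ranked candidates are false positives. That gap is the reason a PR-based metric is more informative when true sites are rare.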
Affiliation(s)
- Sefa Kılıç
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA
| | - Ivan Erill
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), Baltimore, MD, 21250, USA.
| |
|
43
|
Moses V, Hatherley R, Tastan Bishop Ö. Bioinformatic characterization of type-specific sequence and structural features in auxiliary activity family 9 proteins. Biotechnol Biofuels 2016; 9:239. [PMID: 27833654 PMCID: PMC5101804 DOI: 10.1186/s13068-016-0655-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 06/23/2016] [Accepted: 10/25/2016] [Indexed: 05/22/2023]
Abstract
BACKGROUND Due to the impending depletion of fossil fuels, it has become important to identify alternative energy sources. The biofuel industry has proven to be a promising alternative. However, owing to the complex nature of plant biomass, and hence of its degradation, biofuel production remains a challenge. The copper-dependent Auxiliary Activity family 9 (AA9) proteins have been found to act synergistically with other cellulose-degrading enzymes, resulting in an increased rate of cellulose breakdown. AA9 proteins are lytic polysaccharide monooxygenase (LPMO) enzymes, otherwise known as polysaccharide monooxygenases (PMOs). They are further classified as Type 1, 2 or 3 PMOs, depending on the different cleavage products formed. As AA9 proteins are known to exhibit low sequence conservation, the analysis of unique features of AA9 domains of these enzymes should provide insights for the better understanding of how different AA9 PMO types function. RESULTS Bioinformatics approaches were used to identify features specific to the catalytic AA9 domains of each type of AA9 PMO. Sequence analysis showed the N terminus to be highly variable with type-specific inserts evident in this region. Phylogenetic analysis was performed to cluster AA9 domains based on their types. Motif analysis enabled the identification of sub-groups within each AA9 PMO type with the majority of these motifs occurring within the highly variable N terminus of AA9 domains. AA9 domain structures were manually docked to crystalline cellulose and used to analyze both the type-specific inserts and motifs at a structural level. The results indicated that these regions influence the AA9 domain active site topology and may contribute to the regioselectivity displayed by different AA9 PMO types. Physicochemical property analysis was performed and detected significant differences in aromaticity, isoelectric point and instability index between certain AA9 PMO types.
CONCLUSIONS In this study, a type-specific characterisation of AA9 domains was performed using various bioinformatics approaches. These highly variable proteins were found to have a greater degree of conservation within their respective types. Type-specific features were identified for AA9 domains, which could be observed at a sequence, structural and physicochemical level. This provides a basis on which to identify and group new AA9 LPMOs in the future.
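Of the physicochemical properties mentioned in the results, aromaticity is the simplest to compute. The sketch below uses one common definition, the relative frequency of Phe, Trp and Tyr in the sequence. The abstract does not state which tool or definition the authors used, so this definition is an assumption, and the example sequence is made up rather than a real AA9 domain.

```python
def aromaticity(sequence):
    """Fraction of residues that are Phe (F), Trp (W) or Tyr (Y)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(sequence.count(residue) for residue in "FWY") / len(sequence)


# Made-up 10-residue peptide containing one F, one W and one Y.
example = "MFWYAAAAAA"
value = aromaticity(example)
print(value)  # 0.3
```

Comparing such per-sequence values between groups of domains, for example with a rank-based test, would be one way to look for the type-specific differences the abstract reports.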
Affiliation(s)
- Vuyani Moses
- Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Grahamstown, 6140 South Africa
| | - Rowan Hatherley
- Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Grahamstown, 6140 South Africa
| | - Özlem Tastan Bishop
- Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Grahamstown, 6140 South Africa
| |
|
44
|
Zhong Y, Zhang J, Yu H, Zhang J, Sun XX, Chen W, Bian H, Li Z. Characterization and sub-cellular localization of GalNAc-binding proteins isolated from human hepatic stellate cells. Biochem Biophys Res Commun 2015; 468:906-12. [PMID: 26616059 DOI: 10.1016/j.bbrc.2015.11.055] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/02/2015] [Accepted: 11/11/2015] [Indexed: 01/28/2023]
Abstract
Although the expression levels of total GalNAc-binding proteins (GNBPs) were up-regulated significantly in human hepatic stellate cells (HSCs) activated with transforming growth factor-β1 (TGF-β1), little is known about the precise types, distribution and sub-cellular localization of the GNBPs in HSCs. Here, 264 GNBPs from the activated HSCs and 257 GNBPs from the quiescent HSCs were identified and annotated. A total of 46 GNBPs were estimated to be significantly up-regulated and 40 GNBPs were estimated to be significantly down-regulated in the activated HSCs. For example, the GNBPs (i.e. BTF3, COX17, and ATP5A1) responsible for the regulation of protein binding were up-regulated, and those (i.e. FAM114A1, ENO3, and TKT) responsible for the regulation of protein binding were down-regulated in the activated HSCs. The motifs of the isolated GNBPs showed that proline residues had the highest preference in consensus sequences. Western blotting showed that the expression levels of COX17 and PRMT1 were significantly up-regulated, while the expression level of CLIC1(B5) was down-regulated in the activated HSCs and liver cirrhosis tissues. Moreover, the GNBPs were sub-localized in the Golgi apparatus of HSCs. In conclusion, the precise alterations of the GNBPs related to pathological changes in liver fibrosis/cirrhosis may provide useful information for finding new molecular mechanisms of HSC activation, discovering biomarkers for the diagnosis of liver fibrosis/cirrhosis, and developing new anti-fibrotic strategies.
|
45
|
Zaghlool M, Al-Khayyat S. In silico structural analysis of quorum sensing genes in Vibrio fischeri. Mol Biol Res Commun 2015; 4:115-124. [PMID: 27844003 PMCID: PMC5019203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Quorum sensing controls the luminescence of Vibrio fischeri through the transcriptional activator LuxR and the specific autoinducer signal produced by LuxI. Amino acid sequences of these two genes were analyzed using bioinformatics tools. LuxI consists of 193 amino acids and appears to contain five α-helices and six β-sheets when analyzed by SSpro8. LuxI belongs to the autoinducer synthetase family and contains an acetyltransferase domain extending from residues 24 to 110, as predicted by MOTIF. LuxR, on the other hand, contains 250 amino acids and has ten α-helices and four β-sheets. MOTIF predicted LuxR to possess functional motifs: the inducer binding site extending from amino acid residues 23 to 147 and the LuxR activator site extending between amino acids 182 and 236. The InterProScan5 server identified a winged helix-turn-helix DNA binding motif.
Affiliation(s)
- Mohammed Zaghlool
- Biology Department, College of Education for Pure Sciences, University of Mosul, Mosul city, Iraq
| | - Saeed Al-Khayyat
- Biology Department, College of Education for Pure Sciences, University of Mosul, Mosul city, Iraq
| |
|
46
|
Polavarapu R, Meetei PA, Midha M, Bharne D, Vindal V. ClosIndb: A resource for computationally derived information from clostridial genomes. Infect Genet Evol 2015; 33:127-30. [PMID: 25913159 DOI: 10.1016/j.meegid.2015.04.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/25/2014] [Revised: 04/20/2015] [Accepted: 04/21/2015] [Indexed: 11/24/2022]
Abstract
Over the past few years, several clostridial genomes have been sequenced, and new sequencing projects are under way. Clostridia is one of the most sequenced genera, and presently, complete genome sequences of 49 clostridial species are available in public archives. Unraveling this wealth of genomic information opens up potential avenues in clostridial research. In the present study, we have carried out in silico analysis to decipher the genomic data. Subsequently, a web resource, ClosIndb, has been developed which collates the computationally derived information associated with all clostridial genes. It features various aspects of coding regions as well as non-coding regions, such as putative orthologs, protein physicochemical properties, operons and cis-regulatory elements. It provides users with comparative details of all clostridial proteins across the Firmicutes. ClosIndb is a comprehensive resource for all completely sequenced clostridial genomes and is under constant development. ClosIndb is freely accessible at http://bif.uohyd.ac.in/closindb/.
|
47
|
Hutchins AP, Jauch R, Dyla M, Miranda-Saavedra D. glbase: a framework for combining, analyzing and displaying heterogeneous genomic and high-throughput sequencing data. Cell Regen 2014; 3:1. [PMID: 25408880 DOI: 10.1186/2045-9769-3-1] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 10/25/2013] [Accepted: 01/23/2014] [Indexed: 12/13/2022]
Abstract
Genomic datasets and the tools to analyze them have proliferated at an astonishing rate. However, such tools are often poorly integrated with each other: each program typically produces its own custom output in a variety of non-standard file formats. Here we present glbase, a framework that uses a flexible set of descriptors that can quickly parse non-binary data files. glbase includes many functions to intersect two lists of data, including operations on genomic interval data and support for the efficient random access to huge genomic data files. Many glbase functions can produce graphical outputs, including scatter plots, heatmaps, boxplots and other common analytical displays of high-throughput data such as RNA-seq, ChIP-seq and microarray expression data. glbase is designed to rapidly bring biological data into a Python-based analytical environment to facilitate analysis and data processing. In summary, glbase is a flexible and multifunctional toolkit that allows the combination and analysis of high-throughput data (especially next-generation sequencing and genome-wide data), and which has been instrumental in the analysis of complex data sets. glbase is freely available at http://bitbucket.org/oaxiom/glbase/.
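The kind of interval operation the abstract refers to, intersecting two lists of genomic features, can be sketched in plain Python as below. This is a generic illustration and deliberately does not use or imitate the glbase API. The tuple layout `(chrom, start, end)` with inclusive coordinates, the function names, and the example coordinates are all assumptions.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def intersect(list_a, list_b):
    """Intervals from list_a that overlap at least one interval in list_b.

    Quadratic scan for clarity; large datasets would need an index or sorted sweep.
    """
    return [a for a in list_a if any(overlaps(a, b) for b in list_b)]


# Hypothetical ChIP-seq peaks and gene bodies.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
genes = [("chr1", 150, 300), ("chr2", 900, 1000)]

shared = intersect(peaks, genes)
print(shared)  # [('chr1', 100, 200)]
```

Only the first peak is kept: the second lies on the right chromosome but outside the gene, and the third matches coordinates only on a different chromosome.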
|
48
|
Mhaindarkar V, Sharma K, Lole KS. Mutagenesis of hepatitis E virus helicase motifs: effects on enzyme activity. Virus Res 2013; 179:26-33. [PMID: 24333153 DOI: 10.1016/j.virusres.2013.11.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 08/30/2013] [Revised: 11/27/2013] [Accepted: 11/29/2013] [Indexed: 11/17/2022]
Abstract
Hepatitis E virus (HEV), the causative agent of hepatitis E, is a non-enveloped RNA virus. The open reading frame 1 encoded non-structural polyprotein has putative domains for methyltransferase, cysteine protease, helicase and RNA-dependent RNA polymerase; however, processing of this polyprotein is still uncertain. HEV helicase belongs to superfamily 1 and has all seven conserved motifs typical of the family. NTPase and RNA duplex unwinding activities of the HEV helicase domain were recently demonstrated by us. A non-radioactive RNA unwinding assay was developed using biotin and digoxigenin labeled duplex RNA substrate with 5' overhangs for measuring strand displacement activity of the helicase. A series of deletion mutants were constructed to investigate the role of individual motifs in the enzymatic activities. Deletion mutants for motifs M I and M IV showed an increase in ATPase activity. Deletion mutant M VI retained ATPase activity comparable to wild type protein. Mutant M II showed reduced ATPase activity (P=0.003) with no significant decrease in unwinding activity, while mutants M Ia and M III showed major reduction of both ATPase and unwinding activities, indicating a crucial role of these motifs in the helicase function. Overall analysis of deletion mutants showed that motifs I, IV, V and VI have alternative motifs to carry out enzymatic functions of the protein, while motifs Ia and III are critical as well as unique motifs in the protein. Given the important role of the helicase protein during positive sense RNA virus replication, these unique motifs could be good antiviral targets.
Affiliation(s)
- Vaibhav Mhaindarkar
- Hepatitis Division, National Institute of Virology, Microbial Containment Complex, Sus Road, Pashan, Pune 411021, India
| | - Kavyanjali Sharma
- Hepatitis Division, National Institute of Virology, Microbial Containment Complex, Sus Road, Pashan, Pune 411021, India
| | - Kavita S Lole
- Hepatitis Division, National Institute of Virology, Microbial Containment Complex, Sus Road, Pashan, Pune 411021, India.
| |
|