51
Janda JO, Popal A, Bauer J, Busch M, Klocke M, Spitzer W, Keller J, Merkl R. H2rs: deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments. BMC Bioinformatics 2014; 15:118. [PMID: 24766829 PMCID: PMC4021312 DOI: 10.1186/1471-2105-15-118] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 01/13/2014] [Accepted: 04/17/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The identification of functionally important residue positions is an important task of computational biology. Methods of correlation analysis allow for the identification of pairs of residue positions, whose occupancy is mutually dependent due to constraints imposed by protein structure or function. A common measure assessing these dependencies is the mutual information, which is based on Shannon's information theory that utilizes probabilities only. Consequently, such approaches do not consider the similarity of residue pairs, which may degrade the algorithm's performance. One typical algorithm is H2r, which characterizes each individual residue position k by the conn(k)-value, which is the number of significantly correlated pairs it belongs to. RESULTS To improve specificity of H2r, we developed a revised algorithm, named H2rs, which is based on the von Neumann entropy (vNE). To compute the corresponding mutual information, a matrix A is required, which assesses the similarity of residue pairs. We determined A by deducing substitution frequencies from contacting residue pairs observed in the homologs of 35 809 proteins, whose structure is known. In analogy to H2r, the enhanced algorithm computes a normalized conn(k)-value. Within the framework of H2rs, only statistically significant vNE values were considered. To decide on significance, the algorithm calculates a p-value by performing a randomization test for each individual pair of residue positions. The analysis of a large in silico testbed demonstrated that specificity and precision were higher for H2rs than for H2r and two other methods of correlation analysis. The gain in prediction quality is further confirmed by a detailed assessment of five well-studied enzymes. The outcome of H2rs and of a method that predicts contacting residue positions (PSICOV) overlapped only marginally. H2rs can be downloaded from http://www-bioinf.uni-regensburg.de. 
CONCLUSIONS Considering substitution frequencies for residue pairs by means of the von Neumann entropy and a p-value improved the success rate in identifying important residue positions. The integration of proven statistical concepts and normalization allows for an easier comparison of results obtained with different proteins. Comparing the outcome of the local method H2rs and of the global method PSICOV indicates that such methods supplement each other and have different scopes of application.
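The probability-only mutual information that the Background names as the common measure can be illustrated with a minimal sketch. This is not the H2rs implementation: it uses plain Shannon entropy, omits the von Neumann entropy, the similarity matrix A, the normalization and the randomization test, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa, k):
    """Return the residues found at position k of every aligned sequence."""
    return [seq[k] for seq in msa]


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(msa, k, l):
    """MI(k, l) = H(k) + H(l) - H(k, l), estimated from column frequencies."""
    col_k, col_l = column(msa, k), column(msa, l)
    joint = list(zip(col_k, col_l))
    return shannon_entropy(col_k) + shannon_entropy(col_l) - shannon_entropy(joint)


# Toy alignments (assumed data, two columns each).
coupled_msa = ["AC", "AC", "GT", "GT"]      # column 1 is fully determined by column 0
independent_msa = ["AC", "AT", "GC", "GT"]  # every residue combination occurs once

mi_coupled = mutual_information(coupled_msa, 0, 1)
mi_independent = mutual_information(independent_msa, 0, 1)
```

Because the estimate depends on frequencies alone, a conservative and a radical substitution pair contribute identically, which is the limitation the abstract attributes to such approaches.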
Affiliation(s)
- Rainer Merkl
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040 Regensburg, Germany.
52
Computational prediction of protein function based on weighted mapping of domains and GO terms. BIOMED RESEARCH INTERNATIONAL 2014; 2014:641469. [PMID: 24868539 PMCID: PMC4017789 DOI: 10.1155/2014/641469] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/21/2013] [Accepted: 03/12/2014] [Indexed: 11/17/2022]
Abstract
In this paper, we propose a novel method, SeekFun, to predict protein function based on a weighted mapping of domains and GO terms. Firstly, a weighted mapping of domains and GO terms is constructed according to the GO annotations and domain composition of the proteins. The association strength between a domain and a GO term is weighted by symmetrical conditional probability. Secondly, the mapping is extended along the true paths of the terms based on the GO hierarchy. Finally, the terms associated with resident domains are transferred to the host protein, and the real annotations of the host protein are determined by association strengths. Our careful comparisons demonstrate that SeekFun outperforms the compared methods in most cases. SeekFun provides a flexible and effective way to predict protein function. It benefits from the well-constructed mapping of domains and GO terms, as well as the reasonable strategy for inferring the annotations of a protein from those of its domains.
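The weighting step can be sketched as follows. The abstract does not give the formula, so this assumes the commonly used definition of symmetrical conditional probability, SCP(d, g) = P(d, g)^2 / (P(d) P(g)), estimated from co-occurrence counts over annotated proteins. The function name, the input layout and the domain and term identifiers are hypothetical, and the extension along GO true paths is not modeled.

```python
from collections import Counter


def scp_weights(annotations):
    """Weight each co-occurring (domain, GO term) pair by its SCP.

    annotations: list of (domains, go_terms) pairs, one per protein,
    where both elements are sets of identifiers.
    """
    n = len(annotations)
    domain_count, term_count, pair_count = Counter(), Counter(), Counter()
    for domains, terms in annotations:
        domain_count.update(domains)
        term_count.update(terms)
        pair_count.update((d, g) for d in domains for g in terms)
    return {
        (d, g): (c / n) ** 2 / ((domain_count[d] / n) * (term_count[g] / n))
        for (d, g), c in pair_count.items()
    }


# Toy data: placeholder identifiers, not real Pfam or GO accessions.
proteins = [
    ({"D1"}, {"GO:A"}),
    ({"D1"}, {"GO:A"}),
    ({"D1", "D2"}, {"GO:B"}),
    ({"D2"}, {"GO:B"}),
]
weights = scp_weights(proteins)
```

Under this definition a pair reaches the maximum weight of 1 only when the domain and the term always occur together, so a domain found in many differently annotated proteins is penalized.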
53
Mesiti M, Re M, Valentini G. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction. Gigascience 2014; 3:5. [PMID: 24843788 PMCID: PMC4006453 DOI: 10.1186/2047-217x-3-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 12/04/2013] [Accepted: 04/01/2014] [Indexed: 01/08/2023] Open
Abstract
Background Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence, their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle address these issues, but they raise further algorithmic problems and require resources that simple off-the-shelf computers cannot provide. Results We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve “locally” the AFP problem, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins.
Conclusions The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and could in the future enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines.
Affiliation(s)
- Marco Mesiti
- AnacletoLab - Department of Computer Science, University of Milano, Via Comelico 39/41, 20135 Milano, Italy
- Matteo Re
- AnacletoLab - Department of Computer Science, University of Milano, Via Comelico 39/41, 20135 Milano, Italy
- Giorgio Valentini
- AnacletoLab - Department of Computer Science, University of Milano, Via Comelico 39/41, 20135 Milano, Italy
54
Muñoz-Mérida A, Viguera E, Claros MG, Trelles O, Pérez-Pulido AJ. Sma3s: a three-step modular annotator for large sequence datasets. DNA Res 2014; 21:341-53. [PMID: 24501397 PMCID: PMC4131829 DOI: 10.1093/dnares/dsu001] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 01/21/2023] Open
Abstract
Automatic sequence annotation is an essential component of modern 'omics' studies, which aim to extract information from large collections of sequence data. Most existing tools use sequence homology to establish evolutionary relationships and assign putative functions to sequences. However, it can be difficult to define a similarity threshold that achieves sufficient coverage without sacrificing annotation quality. Defining the correct configuration is critical and can be challenging for non-specialist users. Thus, the development of robust automatic annotation techniques that generate high-quality annotations without needing expert knowledge would be very valuable for the research community. We present Sma3s, a tool for automatically annotating very large collections of biological sequences from any kind of gene library or genome. Sma3s is composed of three modules that progressively annotate query sequences using either: (i) very similar homologues, (ii) orthologous sequences or (iii) terms enriched in groups of homologous sequences. We trained the system using several random sets of known sequences, demonstrating average sensitivity and specificity values of ~85%. In conclusion, Sma3s is a versatile tool for high-throughput annotation of a wide variety of sequence datasets that outperforms the accuracy of other well-established annotation algorithms, and it can enrich existing database annotations and uncover previously hidden features. Importantly, Sma3s has already been used in the functional annotation of two published transcriptomes.
Affiliation(s)
- Antonio Muñoz-Mérida
- Integrated Bioinformatics, National Institute for Bioinformatics, University of Málaga, Campus de Teatinos, Spain
- Enrique Viguera
- Cellular Biology, Genetics and Physiology Department, University of Málaga, Campus de Teatinos, Spain
- M Gonzalo Claros
- Molecular Biology and Biochemistry Department, University of Málaga, Campus de Teatinos, Spain
- Oswaldo Trelles
- Integrated Bioinformatics, National Institute for Bioinformatics, University of Málaga, Campus de Teatinos, Spain
- Computer Architecture Department, University of Málaga, Campus de Teatinos, Spain
- Antonio J Pérez-Pulido
- Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC-JA), Facultad de Ciencias Experimentales (Área de Genética), Universidad Pablo de Olavide, Sevilla 41013, Spain
55
Yi F, Xie S, Liu Y, Qi X, Yu J. Genome-wide characterization of microRNA in foxtail millet (Setaria italica). BMC PLANT BIOLOGY 2013; 13:212. [PMID: 24330712 PMCID: PMC3878754 DOI: 10.1186/1471-2229-13-212] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 04/15/2013] [Accepted: 11/27/2013] [Indexed: 05/23/2023]
Abstract
BACKGROUND MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. RESULTS In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. CONCLUSIONS We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally.
Affiliation(s)
- Fei Yi
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shaojun Xie
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yuwei Liu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xin Qi
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Jingjuan Yu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
56
Hirakawa H, Shirasawa K, Kosugi S, Tashiro K, Nakayama S, Yamada M, Kohara M, Watanabe A, Kishida Y, Fujishiro T, Tsuruoka H, Minami C, Sasamoto S, Kato M, Nanri K, Komaki A, Yanagi T, Guoxin Q, Maeda F, Ishikawa M, Kuhara S, Sato S, Tabata S, Isobe SN. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species. DNA Res 2013; 21:169-81. [PMID: 24282021 PMCID: PMC3989489 DOI: 10.1093/dnares/dst049] [Citation(s) in RCA: 130] [Impact Index Per Article: 11.8] [Indexed: 11/14/2022] Open
Abstract
Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.
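The N50 statistic quoted for FANhybrid_r1.2 follows the standard assembly definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. A minimal sketch of that definition is given below. The function name and the toy lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 is undefined for an empty collection")


# Toy contig lengths in bp (assumed values).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

In the toy case the total is 20 bp, and the two longest contigs (6 and 5 bp) are the first to reach half of it, so the N50 is 5 bp.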
Affiliation(s)
- Hideki Hirakawa
- Kazusa DNA Research Institute, Kazusa-Kamatari 2-6-7, Kisarazu, Chiba 292-0818, Japan
57
An innovative portal for rare genetic diseases research: the semantic Diseasecard. J Biomed Inform 2013; 46:1108-15. [PMID: 23973272 DOI: 10.1016/j.jbi.2013.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 04/05/2013] [Revised: 06/26/2013] [Accepted: 08/13/2013] [Indexed: 12/17/2022]
Abstract
Advances in "omics" hardware and software technologies are bringing rare diseases research back from the sidelines. Whereas in the past these disorders were seldom considered relevant, in the era of whole genome sequencing the direct connections between rare phenotypes and a reduced set of genes are of vital relevance. This increased interest in rare genetic diseases research is pushing forward investment and effort towards the creation of software in the field, and leveraging the wealth of available life sciences data. Alas, most of these tools target one or more rare diseases, are focused solely on a single type of user, or are limited to the most relevant scientific breakthroughs for a specific niche. Furthermore, despite some high quality efforts, the ever-growing number of resources, databases, services and applications is still a burden to this area. Hence, there is a clear interest in new strategies to deliver a holistic perspective over the entire rare genetic diseases research domain. This is Diseasecard's reasoning, to build a true lightweight knowledge base covering rare genetic diseases. Developed with the latest semantic web technologies, this portal delivers unified access to a comprehensive network for researchers, clinicians, patients and bioinformatics developers. With in-context access covering over 20 distinct heterogeneous resources, Diseasecard's workspace provides access to the most relevant scientific knowledge regarding a given disorder, whether through direct common identifiers or through full-text search over all connected resources. In addition to its user-oriented features, Diseasecard's semantic knowledge base is also available for direct querying, enabling everyone to include rare genetic diseases knowledge in new or existing information systems. Diseasecard is publicly available at http://bioinformatics.ua.pt/diseasecard/.
58
Fujinami S, Takarada H, Kasai H, Sekine M, Omata S, Harada T, Fukai R, Hosoyama A, Horikawa H, Kato Y, Nakazawa H, Fujita N. Complete genome sequence of Ilumatobacter coccineum YM16-304(T.). Stand Genomic Sci 2013; 8:430-40. [PMID: 24501628 PMCID: PMC3910706 DOI: 10.4056/sigs.4007734] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022] Open
Abstract
Ilumatobacter coccineum YM16-304(T) (=NBRC 103263(T)) is a novel marine actinobacterium isolated from a sand sample collected at a beach in Shimane Prefecture, Japan. Strain YM16-304(T) is the type strain of the species. Phylogenetically, strain YM16-304(T) is close to Ilumatobacter nonamiense YM16-303(T) (=NBRC 109120(T)), Ilumatobacter fluminis YM22-133(T) and some uncultured bacteria including putative marine sponge symbionts. Whole-genome sequences of these species have not been reported. Here we report the complete genome sequence of strain YM16-304(T). The 4,830,181 bp chromosome was predicted to encode a total of 4,291 protein-coding genes.
Affiliation(s)
- Shun Fujinami
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Bio-Nano Electronics Research Centre, Toyo University, 2100 Kujirai, Kawagoe Saitama, Japan
- Hiromi Takarada
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Hiroaki Kasai
- Marine Biosciences Kamaishi Research Laboratory, Kitasato University, Ofunato, Iwate, Japan
- Mitsuo Sekine
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Seiha Omata
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Takeshi Harada
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Rieko Fukai
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Akira Hosoyama
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Hiroshi Horikawa
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Yumiko Kato
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Hidekazu Nakazawa
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
- Nobuyuki Fujita
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
59
Genotypic and phenotypic versatility of Aspergillus flavus during maize exploitation. PLoS One 2013; 8:e68735. [PMID: 23894339 PMCID: PMC3716879 DOI: 10.1371/journal.pone.0068735] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/09/2013] [Accepted: 05/31/2013] [Indexed: 11/19/2022] Open
Abstract
Aspergillus flavus is a cosmopolitan fungus able to respond to external stimuli and to shift both its trophic behaviour and the production of secondary metabolites, including that of the carcinogen aflatoxin (AF). To better understand the adaptability of this fungus, we examined genetic and phenotypic responses within the fungus when grown under four conditions that mimic different ecological niches ranging from saprophytic growth to parasitism. Global transcription changes were observed in both primary and secondary metabolism in response to these conditions, particularly in secondary metabolism where transcription of nearly half of the predicted secondary metabolite clusters changed in response to the trophic states of the fungus. The greatest transcriptional change was found between saprophytic and parasitic growth, which resulted in expression changes in over 800 genes in A. flavus. The fungus also responded to growth conditions, putatively by adaptive changes in conidia, resulting in differences in their ability to utilize carbon sources. We also examined tolerance of A. flavus to oxidative stress and found that growth and secondary metabolism were altered in a superoxide dismutase (sod) mutant and an alkyl-hydroperoxide reductase (ahp) mutant of A. flavus. Data presented in this study show a multifaceted response of A. flavus to its environment and suggest that oxidative stress and secondary metabolism are important in the ecology of this fungus, notably in its interaction with host plant and in relation to changes in its lifestyle (i.e. saprobic to pathogenic).
60
Shiraishi A, Niijima S, Brown JB, Nakatsui M, Okuno Y. Chemical genomics approach for GPCR-ligand interaction prediction and extraction of ligand binding determinants. J Chem Inf Model 2013; 53:1253-62. [PMID: 23721295 DOI: 10.1021/ci300515z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023]
Abstract
Chemical genomics research has revealed that G-protein coupled receptors (GPCRs) interact with a variety of ligands and that a large number of ligands are known to bind GPCRs even with low transmembrane (TM) sequence similarity. It is crucial to extract informative binding region propensities from large quantities of bioactivity data. To address this issue, we propose a machine learning approach that enables identification of both chemical substructures and amino acid properties that are associated with ligand binding, which can be applied to virtual ligand screening on a GPCR-wide scale. We also address the question of how to select plausible negative noninteraction pairs based on a statistical approach in order to develop reliable prediction models for GPCR-ligand interactions. The key interaction sites estimated by our approach can be of great use not only for screening of active compounds but also for modification of active compounds with the aim of improving activity or selectivity.
Affiliation(s)
- Akira Shiraishi
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto
61
Peng FY, Weselake RJ. Genome-wide identification and analysis of the B3 superfamily of transcription factors in Brassicaceae and major crop plants. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2013; 126:1305-19. [PMID: 23377560 DOI: 10.1007/s00122-013-2054-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 06/28/2012] [Accepted: 01/09/2013] [Indexed: 05/04/2023]
Abstract
The plant-specific B3 superfamily of transcription factors has diverse functions in plant growth and development. Using a genome-wide domain analysis, we identified 92, 187, 58, 90, 81, 55, and 77 B3 transcription factor genes in the sequenced genomes of Arabidopsis, Brassica rapa, castor bean (Ricinus communis), cocoa (Theobroma cacao), soybean (Glycine max), maize (Zea mays), and rice (Oryza sativa), respectively. The B3 superfamily has substantially expanded during evolution in eudicots, particularly in Brassicaceae, compared with the monocots analysed. We observed domain duplication in some of these B3 proteins, forming more complex domain architectures than currently understood. We found that the length of B3 domains exhibits a large variation, which may affect the exact number of α-helices and β-sheets in their core structure, and possibly have functional implications. Analysis of the public microarray data indicated that most of the B3 gene pairs encoding Arabidopsis-rice orthologs are preferentially expressed in different tissues, suggesting different roles in these two species. Using ESTs in crops, we identified many B3 genes preferentially expressed in reproductive tissues. In a sequence-based quantitative trait loci analysis in rice and maize, we found many B3 genes associated with traits such as grain yield, seed weight and number, and protein content. Our results provide a framework for future studies into the function of B3 genes in different phases of plant development, especially the ones related to traits in major crops.
Affiliation(s)
- Fred Y Peng
- Agricultural Lipid Biotechnology Program, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
62
Zhang Y, Li Q, Huang W, Zhang J, Han Z, Wei H, Cui J, Wang Y, Yan W. Increased expression of apoptosis-related protein 3 is highly associated with tumorigenesis and progression of cervical squamous cell carcinoma. Hum Pathol 2013; 44:388-93. [DOI: 10.1016/j.humpath.2012.05.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/20/2012] [Revised: 05/23/2012] [Accepted: 05/25/2012] [Indexed: 02/03/2023]
63
Katano Y, Fujinami S, Kawakoshi A, Nakazawa H, Oji S, Iino T, Oguchi A, Ankai A, Fukui S, Terui Y, Kamata S, Harada T, Tanikawa S, Suzuki KI, Fujita N. Complete genome sequence of Oscillibacter valericigenes Sjm18-20(T) (=NBRC 101213(T)). Stand Genomic Sci 2013; 6:406-14. [PMID: 23408234 PMCID: PMC3558957 DOI: 10.4056/sigs.2826118] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022] Open
Abstract
Oscillibacter valericigenes is a mesophilic, strictly anaerobic bacterium belonging to the clostridial cluster IV. Strain Sjm18-20(T) (=NBRC 101213(T) =DSM 18026(T)) is the type strain of the species and represents the genus Oscillibacter Iino et al. 2007. It was isolated from the alimentary canal of a Japanese corbicula clam (Corbicula japonica) collected on a seacoast in Shimane Prefecture in Japan. Phylogenetically, strain Sjm18-20(T) is closest to uncultured bacteria in digestive tracts, including the enriched cells thought to represent Oscillospira guilliermondii Chatton and Perard 1913. The isolated phylogenetic position and some distinct characteristics prompted us to determine the complete genome sequence. The 4,410,036 bp chromosome and the 60,586 bp plasmid were predicted to encode a total of 4,723 protein-coding genes.
Affiliation(s)
- Yoko Katano
- Biological Resource Center, National Institute of Technology and Evaluation, Shibuya, Tokyo, Japan
64
Koo HJ, McDowell ET, Ma X, Greer KA, Kapteyn J, Xie Z, Descour A, Kim H, Yu Y, Kudrna D, Wing RA, Soderlund CA, Gang DR. Ginger and turmeric expressed sequence tags identify signature genes for rhizome identity and development and the biosynthesis of curcuminoids, gingerols and terpenoids. BMC PLANT BIOLOGY 2013; 13:27. [PMID: 23410187 PMCID: PMC3608961 DOI: 10.1186/1471-2229-13-27] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 10/11/2012] [Accepted: 02/11/2013] [Indexed: 05/23/2023]
Abstract
BACKGROUND Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. RESULTS In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. CONCLUSION A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. 
Specific transcription factors and other regulatory genes were found that were common to the two species and that are excellent candidates for involvement in rhizome growth, differentiation and development. Large classes of enzymes involved in specialized metabolism were also found to have apparent tissue-specific expression, suggesting that gene expression itself may play an important role in regulating metabolite production in these plants.
Affiliation(s)
- Hyun Jo Koo
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Present address: Salk Institute for Biological Studies, PO Box 85800, San Diego, CA, 92186, USA
- Eric T McDowell
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Xiaoqiang Ma
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Present address: XenoBiotic Laboratories, Inc., Morgan Ln 107, Plainsboro, NJ, 08536, USA
- Kevin A Greer
- Arizona Genomics Computational Laboratory and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Present address: Department of Surgery, College of Medicine, The University of Arizona, Tucson, AZ, 85724, USA
- Jeremy Kapteyn
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Zhengzhi Xie
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Department of Pharmaceutical Sciences, The University of Arizona, Tucson, AZ, 85721, USA
- Present address: Division of Cardiovascular Medicine, University of Louisville, Louisville, KY, 40202, USA
- Anne Descour
- Arizona Genomics Computational Laboratory and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- HyeRan Kim
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Arizona Genomics Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Present address: Plant Genome Research Center, KRIBB, Daejeon, 305-803, South Korea
- Yeisoo Yu
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Arizona Genomics Institute, The University of Arizona, Tucson, AZ, 85721, USA
- David Kudrna
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Arizona Genomics Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Rod A Wing
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Arizona Genomics Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Carol A Soderlund
- Arizona Genomics Computational Laboratory and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- David R Gang
- School of Plant Sciences and BIO5 Institute, The University of Arizona, Tucson, AZ, 85721, USA
- Institute of Biological Chemistry, Washington State University, Pullman, WA, 99164, USA
- Institute of Biological Chemistry, Washington State University, P.O. Box 646340, Pullman, WA, 99164-6340, USA
|
65
|
Blandin G, Marchand S, Charton K, Danièle N, Gicquel E, Boucheteil JB, Bentaib A, Barrault L, Stockholm D, Bartoli M, Richard I. A human skeletal muscle interactome centered on proteins involved in muscular dystrophies: LGMD interactome. Skelet Muscle 2013; 3:3. [PMID: 23414517 PMCID: PMC3610214 DOI: 10.1186/2044-5040-3-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 08/03/2012] [Accepted: 02/07/2013] [Indexed: 02/01/2023] Open
Abstract
Background The complexity of the skeletal muscle and the identification of numerous human disease-causing mutations in its constitutive proteins make it an interesting tissue for proteomic studies aimed at understanding functional relationships of interacting proteins in both health and diseases. Method We undertook a large-scale study using two-hybrid screens and a human skeletal-muscle cDNA library to establish a proteome-scale map of protein-protein interactions centered on proteins involved in limb-girdle muscular dystrophies (LGMD). LGMD is a group of more than 20 different neuromuscular disorders that principally affect the proximal pelvic and shoulder girdle muscles. Results and conclusion The interaction network we unraveled incorporates 1018 proteins connected by 1492 direct binary interactions and includes 1420 novel protein-protein interactions. Computational, experimental and literature-based analyses were performed to assess the overall quality of this network. Interestingly, LGMD proteins were shown to be highly interconnected, in particular indirectly through sarcomeric proteins. In-depth mining of the LGMD-centered interactome identified new candidate genes for orphan LGMDs and other neuromuscular disorders. The data also suggest the existence of functional links between LGMD2B/dysferlin and gene regulation, between LGMD2C/γ-sarcoglycan and energy control and between LGMD2G/telethonin and maintenance of genome integrity. This dataset represents a valuable resource for future functional investigations.
Affiliation(s)
- Gaëlle Blandin
- Généthon CNRS UMR8587, 1, rue de l'Internationale, Evry 91000, France.
|
66
|
Intricate interplay between astrocytes and motor neurons in ALS. Proc Natl Acad Sci U S A 2013; 110:E756-65. [PMID: 23388633 DOI: 10.1073/pnas.1222361110] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Indexed: 12/14/2022] Open
Abstract
ALS results from the selective and progressive degeneration of motor neurons. Although the underlying disease mechanisms remain unknown, glial cells have been implicated in ALS disease progression. Here, we examine the effects of glial cell/motor neuron interactions on gene expression using the hSOD1(G93A) (the G93A allele of the human superoxide dismutase gene) mouse model of ALS. We detect striking cell autonomous and nonautonomous changes in gene expression in cocultured motor neurons and glia, revealing that the two cell types profoundly affect each other. In addition, we found a remarkable concordance between the cell culture data and expression profiles of whole spinal cords and acutely isolated spinal cord cells during disease progression in the G93A mouse model, providing validation of the cell culture approach. Bioinformatics analyses identified changes in the expression of specific genes and signaling pathways that may contribute to motor neuron degeneration in ALS, among which are TGF-β signaling pathways.
|
67
|
Renier S, Chambon C, Viala D, Chagnot C, Hébraud M, Desvaux M. Exoproteomic analysis of the SecA2-dependent secretion in Listeria monocytogenes EGD-e. J Proteomics 2013; 80:183-95. [PMID: 23291529 DOI: 10.1016/j.jprot.2012.11.027] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 09/22/2012] [Revised: 11/12/2012] [Accepted: 11/29/2012] [Indexed: 12/21/2022]
Abstract
As part of the Sec translocase, the accessory ATPase SecA2 is present in some pathogenic Gram-positive bacteria. In Listeria monocytogenes, deletion of secA2 results in filamentous cells that form rough colonies and have lower virulence. However, only a few proteins have been identified that are secreted by this pathway. This investigation aims to provide the first exoproteomic analysis of the SecA2-dependent secretion in L. monocytogenes EGD-e. By using media and temperatures relevant to bacterial physiology, we demonstrated that the rough colony and elongated bacterial cell morphotypes are highly dependent on growth conditions. Subsequently, comparative exoproteomic analyses of the ΔsecA2 versus wt strains were performed in chemically defined medium at 20°C and 37°C. When the proteomic data were analyzed with the secretomics-based method, some of the proteins appeared to be routed towards the Sec pathway and exhibited an N-terminal signal peptide. Another significant fraction consisted of primarily cytoplasmic proteins, which lack a signal peptide and have no predictable secretion pathway. In total, 13 proteins were newly identified as secreted via SecA2; these were essentially associated with cell-wall metabolism, adhesion and/or biofilm formation. From this comparative exoproteomic analysis, new insights into the L. monocytogenes physiology are discussed in relation to its saprophytic and pathogenic lifestyle.
Affiliation(s)
- Sandra Renier
- INRA, UR454 Microbiologie, F-63122 Saint-Genès Champanelle, France
- Christophe Chambon
- INRA, Plate-forme d'Exploration du Métabolisme, F-63122 Saint-Genès Champanelle, France
- Didier Viala
- INRA, Plate-forme d'Exploration du Métabolisme, F-63122 Saint-Genès Champanelle, France
- Caroline Chagnot
- INRA, UR454 Microbiologie, F-63122 Saint-Genès Champanelle, France
- Michel Hébraud
- INRA, UR454 Microbiologie, F-63122 Saint-Genès Champanelle, France; INRA, Plate-forme d'Exploration du Métabolisme, F-63122 Saint-Genès Champanelle, France
- Mickaël Desvaux
- INRA, UR454 Microbiologie, F-63122 Saint-Genès Champanelle, France.
|
68
|
Menon R, Gasser RB, Mitreva M, Ranganathan S. An analysis of the transcriptome of Teladorsagia circumcincta: its biological and biotechnological implications. BMC Genomics 2012; 13 Suppl 7:S10. [PMID: 23282110 PMCID: PMC3521389 DOI: 10.1186/1471-2164-13-s7-s10] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Teladorsagia circumcincta (order Strongylida) is an economically important parasitic nematode of small ruminants (including sheep and goats) in temperate climatic regions of the world. Improved insights into the molecular biology of this parasite could underpin alternative methods required to control this and related parasites, in order to circumvent major problems associated with anthelmintic resistance. The aims of the present study were to define the transcriptome of the adult stage of T. circumcincta and to infer the main pathways linked to molecules known to be expressed in this nematode. Since sheep develop acquired immunity against T. circumcincta, there is some potential for the development of a vaccine against this parasite. Hence, we infer excretory/secretory molecules for T. circumcincta as possible immunogens and vaccine candidates. RESULTS A total of 407,357 ESTs were assembled, yielding 39,852 putative gene sequences. Conceptual translation predicted 24,013 proteins, which were then subjected to detailed annotation, including pathway mapping of predicted proteins (including 112 excreted/secreted [ES] and 226 transmembrane peptides); domain analysis and GO annotation were carried out using InterProScan along with BLAST2GO. Further analysis was carried out for secretory signal peptides using SignalP and for the non-classical secretory pathway using SecretomeP. For ES proteins, key pathways, including Fc epsilon RI, T cell receptor, and chemokine signalling as well as leukocyte transendothelial migration, were inferred to be linked to immune responses, along with other pathways related to neurodegenerative diseases and infectious diseases, which warrant detailed future studies. KAAS could identify new and updated pathways such as phagosome and protein processing in endoplasmic reticulum. Domain analysis for the assembled dataset revealed families of serine, cysteine and proteinase inhibitors which might represent targets for parasite intervention. 
InterProScan could identify GO terms pertaining to the extracellular region. Some of the important domain families identified included the SCP-like extracellular proteins, which belong to the pathogenesis-related proteins (PRPs) superfamily, along with C-type lectin and saposin-like proteins. The 'extracellular region' that corresponds to allergen V5/Tpx-1 related, considered important in parasite-host interactions, was also identified. Six cysteine motif (SXC1) proteins, transthyretin proteins, C-type lectins and activation-associated secreted proteins (ASPs), which could represent potential candidates for developing novel anthelmintics or vaccines, were a few other important findings. Of these, SXC1, protein kinase domain-containing protein, trypsin family protein, trypsin-like protease family member (TRY-1), putative major allergen and putative lipid binding protein have not been reported in the published T. circumcincta proteomics analysis. Detailed analysis of 6,058 raw EST sequences from dbEST revealed 315 putatively secreted proteins. Amongst them, C-type single domain activation associated secreted protein ASP3 precursor, activation-associated secreted proteins (ASP-like protein), cathepsin B-like cysteine protease, cathepsin L cysteine protease, cysteine protease, TransThyretin-Related and Venom-Allergen-like proteins were the key findings. CONCLUSIONS We have annotated a large dataset of ESTs of T. circumcincta and undertaken detailed comparative bioinformatics analyses. The results provide a comprehensive insight into the molecular biology of this parasite and disease manifestation, which provides a potential focal point for future research. We identified a number of pathways responsible for immune response. This type of large-scale computational scanning could be coupled with proteomic and metabolomic studies of this parasite, leading to novel therapeutic intervention and disease control strategies. 
We have also successfully affirmed the use of bioinformatics tools for the study of ESTs, which could now serve as a benchmark for the development of new computational EST analysis pipelines.
Affiliation(s)
- Ranjeeta Menon
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
|
69
|
Wang XR, Moreno YA, Wu HR, Ma C, Li YF, Zhang JA, Yang C, Sun S, Ma WJ, Geary TG. Proteomic profiles of soluble proteins from the esophageal gland in female Meloidogyne incognita. Int J Parasitol 2012; 42:1177-83. [PMID: 23142006 DOI: 10.1016/j.ijpara.2012.10.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/18/2012] [Revised: 10/09/2012] [Accepted: 10/10/2012] [Indexed: 12/17/2022]
Abstract
Meloidogyne incognita can infect multiple plant species. Proteins synthesized in the esophageal glands and secreted through the stylet of plant parasitic nematodes play critical roles in the plant-nematode interactions. Female M. incognita live for approximately 15 days, embedded in a host plant, but their esophageal gland proteins have not yet been comprehensively analyzed. In this study, a new bacterium-contamination-resistant method for collecting soluble proteins from esophageal gland cells (SPEGC) of female M. incognita was established. Approximately 5 μg of freeze-dried proteins could be extracted from 150 female M. incognita. Bands of a one-dimensional SDS-polyacrylamide gel were excised after electrophoresis of 20 μg of protein and were analyzed. Two hundred and forty-six proteins from SPEGC of female M. incognita were identified by LC-MS/MS. Gene Ontology analysis suggests that many of the secreted proteins are involved in protein or carbohydrate metabolism and proteolysis. Some of the SPEGC (46.3%) were predicted to be secreted through classical or non-classical secretory pathways. The described method presents a new approach for the identification of proteins stored in SPEGC of an important plant parasitic nematode. This global proteomic profile of SPEGC provides a basis for future studies to elucidate the functions of proteins secreted from female M. incognita on plant responses.
Affiliation(s)
- Xin-Rong Wang
- Laboratory of Plant Nematology, College of Natural Resources and Environment, South China Agricultural University, Guangzhou, China
|
70
|
A Novel Type III Endosome Transmembrane Protein, TEMP. Cells 2012; 1:1029-44. [PMID: 24710541 PMCID: PMC3901140 DOI: 10.3390/cells1041029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/15/2012] [Revised: 10/26/2012] [Accepted: 10/30/2012] [Indexed: 12/18/2022] Open
Abstract
As part of a high-throughput subcellular localisation project, the protein encoded by the RIKEN mouse cDNA 2610528J11 was expressed and identified to be associated with both endosomes and the plasma membrane. Based on this, we have assigned the name TEMP for Type III Endosome Membrane Protein. TEMP encodes a short protein of 111 amino acids with a single, alpha-helical transmembrane domain. Experimental analysis of its membrane topology demonstrated it is a Type III membrane protein with the amino-terminus in the lumenal, or extracellular region, and the carboxy-terminus in the cytoplasm. In addition to the plasma membrane, TEMP was localized to Rab5-positive early endosomes and Rab5/Rab11-positive recycling endosomes, but not to Rab7-positive late endosomes. Video microscopy in living cells confirmed TEMP’s plasma membrane localization and identified the intracellular endosome compartments to be tubulovesicular. Overexpression of TEMP resulted in the early/recycling endosomes clustering at the cell periphery, which was dependent on the presence of intact microtubules. The cellular function of TEMP cannot be inferred based on bioinformatics comparison, but its cellular distribution between early/recycling endosomes and the plasma membrane suggests a role in membrane transport.
|
71
|
Galetto CD, Izaguirre MF, Bessone V, Casco VH. Isolation and nucleotide sequence analysis of the Rhinella arenarum β-catenin: an mRNA and protein expression study during the larval stages of the digestive tract development. Gene 2012; 511:256-64. [PMID: 23000021 DOI: 10.1016/j.gene.2012.09.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/20/2012] [Revised: 05/16/2012] [Accepted: 09/05/2012] [Indexed: 12/18/2022]
Abstract
β-catenin interacts with several proteins mediating key biological processes, such as cadherin-mediated cell-cell adhesion as well as signal transduction. This work was done to establish the molecular basis and regulation of the formation pattern of cadherin/β-catenin-mediated adherens junctions, using an animal model of unknown gene sequence, the toad Rhinella arenarum. A Rhinella arenarum β-catenin homolog was isolated from larval tissue, and its sequence was compared and analyzed with those of eight other vertebrates using bioinformatics tools. The mRNA and protein expression levels of β-catenin were determined during the development of the Rhinella arenarum digestive tract by Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR) and immunohistochemistry-morphometry, respectively. Using Xenopus laevis frog-specific primers, a 539 bp fragment of Rhinella arenarum toad β-catenin cDNA was obtained and sequenced. The resulting putative sequence of 177 amino acids showed high similarity at the amino acid level (97%) when compared to six other vertebrates (Xenopus laevis, Xenopus tropicalis, Mus musculus, Rattus norvegicus, Bos taurus and Homo sapiens), with sequences and structural domains characteristic of catenins. Subsequently, using primers specifically designed for the Rhinella arenarum nucleotide sequence, increasing β-catenin mRNA levels were found during Rhinella arenarum metamorphosis. Finally, increasing β-catenin protein expression during development confirmed the specificity of the detection of Rhinella arenarum β-catenin. In summary, we have isolated and sequenced a β-catenin homologue from the Rhinella arenarum toad, which is highly conserved between species, and subsequently detected β-catenin mRNA and protein levels during development of its digestive tract.
Affiliation(s)
- C D Galetto
- Laboratorio de Microscopia Aplicada a Estudios Moleculares y Celulares, Facultad de Ingeniería (Bioingeniería-Bioinformática), Universidad Nacional de Entre Ríos. Ruta 11, Km, 10, Oro Verde, Entre Ríos, Argentina.
|
72
|
Messih MA, Chitale M, Bajic VB, Kihara D, Gao X. Protein domain recurrence and order can enhance prediction of protein functions. Bioinformatics 2012; 28:i444-i450. [PMID: 22962465 PMCID: PMC3436825 DOI: 10.1093/bioinformatics/bts398] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/25/2022] Open
Abstract
MOTIVATION Burgeoning sequencing technologies have generated massive amounts of genomic and proteomic data. Annotating the functions of proteins identified in this data has become a big and crucial problem. Various computational methods have been developed to infer the protein functions based on either the sequences or domains of proteins. The existing methods, however, ignore the recurrence and the order of the protein domains in this function inference. RESULTS We developed two new methods to infer protein functions based on protein domain recurrence and domain order. Our first method, DRDO, calculates the posterior probability of the Gene Ontology terms based on domain recurrence and domain order information, whereas our second method, DRDO-NB, relies on the naïve Bayes methodology using the same domain architecture information. Our large-scale benchmark comparisons show strong improvements in the accuracy of the protein function inference achieved by our new methods, demonstrating that domain recurrence and order can provide important information for inference of protein functions. AVAILABILITY The new models are provided as open source programs at http://sfb.kaust.edu.sa/Pages/Software.aspx. CONTACT dkihara@cs.purdue.edu, xin.gao@kaust.edu.sa SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics Online.
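The idea of scoring Gene Ontology terms from domain recurrence and domain order can be illustrated with a small sketch. This is not the published DRDO or DRDO-NB implementation: it is a simplified naive Bayes over presence features with Laplace smoothing, and the feature encoding, the function names (`architecture_features`, `train`, `posterior`) and the toy training data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def architecture_features(domains):
    """Encode an ordered domain list as recurrence and order features (hypothetical encoding)."""
    counts = Counter(domains)
    recurrence = {f"count:{d}={n}" for d, n in counts.items()}       # how often each domain recurs
    order = {f"pair:{a}>{b}" for a, b in zip(domains, domains[1:])}  # adjacent domains, in order
    return recurrence | order


def train(examples):
    """examples: iterable of (ordered domain list, GO term) pairs."""
    term_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for domains, term in examples:
        term_counts[term] += 1
        for feature in architecture_features(domains):
            feature_counts[term][feature] += 1
            vocabulary.add(feature)
    return term_counts, feature_counts, vocabulary


def posterior(domains, model):
    """Posterior over GO terms, using only the features present in the query."""
    term_counts, feature_counts, vocabulary = model
    total = sum(term_counts.values())
    features = architecture_features(domains) & vocabulary
    log_scores = {}
    for term, n in term_counts.items():
        score = math.log(n / total)  # prior P(term)
        for feature in features:
            # Laplace-smoothed P(feature present | term)
            score += math.log((feature_counts[term][feature] + 1) / (n + 2))
        log_scores[term] = score
    # Normalize in log space for numerical stability
    top = max(log_scores.values())
    weights = {t: math.exp(s - top) for t, s in log_scores.items()}
    z = sum(weights.values())
    return {t: w / z for t, w in weights.items()}


# Toy data: the domain names and GO labels are invented placeholders.
examples = [
    (["kinase", "SH2"], "GO:signal"),
    (["kinase", "kinase", "SH2"], "GO:signal"),
    (["SH2", "kinase"], "GO:binding"),
]
model = train(examples)
post = posterior(["kinase", "SH2"], model)
```

In this toy setting the query shares its domain order with the `GO:signal` examples, so that term receives the larger posterior; a plain bag-of-domains encoding could not separate it from the reversed `SH2`, `kinase` architecture.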
Affiliation(s)
- Mario Abdel Messih
- Mathematical and Computer Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
|
73
|
|
74
|
Kawakoshi A, Nakazawa H, Fukada J, Sasagawa M, Katano Y, Nakamura S, Hosoyama A, Sasaki H, Ichikawa N, Hanada S, Kamagata Y, Nakamura K, Yamazaki S, Fujita N. Deciphering the genome of polyphosphate accumulating actinobacterium Microlunatus phosphovorus. DNA Res 2012; 19:383-94. [PMID: 22923697 PMCID: PMC3473371 DOI: 10.1093/dnares/dss020] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 12/04/2022] Open
Abstract
Polyphosphate accumulating organisms (PAOs) belong mostly to Proteobacteria and Actinobacteria and are quite divergent. Under aerobic conditions, they accumulate intracellular polyphosphate (polyP), while they typically synthesize polyhydroxyalkanoates (PHAs) under anaerobic conditions. Many ecological, physiological, and genomic analyses have been performed with proteobacterial PAOs, but few with actinobacterial PAOs. In this study, the whole genome sequence of an actinobacterial PAO, Microlunatus phosphovorus NM-1T (NBRC 101784T), was determined. The number of genes for polyP metabolism was greater in M. phosphovorus than in other actinobacteria; it possesses genes for four polyP kinases (ppks), two polyP-dependent glucokinases (ppgks), and three phosphate transporters (pits). In contrast, it harbours only a single ppx gene for exopolyphosphatase, although two copies of ppx are generally present in other actinobacteria. Furthermore, M. phosphovorus lacks the phaABC genes for PHA synthesis and the actP gene encoding an acetate/H+ symporter, both of which play crucial roles in anaerobic PHA accumulation in proteobacterial PAOs. Thus, while the general features of M. phosphovorus regarding aerobic polyP accumulation are similar to those of proteobacterial PAOs, its anaerobic polyP use and PHA synthesis appear to be different.
Affiliation(s)
- Akatsuki Kawakoshi
- Biological Resource Center, National Institute of Technology and Evaluation, 2-10-49 Nishihara, Tokyo 151-0066, Japan
|
75
|
Renier S, Micheau P, Talon R, Hébraud M, Desvaux M. Subcellular localization of extracytoplasmic proteins in monoderm bacteria: rational secretomics-based strategy for genomic and proteomic analyses. PLoS One 2012; 7:e42982. [PMID: 22912771 PMCID: PMC3415414 DOI: 10.1371/journal.pone.0042982] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/10/2012] [Accepted: 07/13/2012] [Indexed: 11/20/2022] Open
Abstract
Genome-scale prediction of subcellular localization (SCL) is not only useful for inferring protein function but also for supporting proteomic data. In line with the secretome concept, a rational and original analytical strategy mimicking the secretion steps that determine ultimate SCL was developed for Gram-positive (monoderm) bacteria. Based on the biology of protein secretion, a flowchart and decision trees were designed considering (i) membrane targeting, (ii) protein secretion systems, (iii) membrane retention, and (iv) cell-wall retention by domains or post-translocational modifications, as well as (v) incorporation into cell-surface supramolecular structures. Using Listeria monocytogenes as a case study, results were compared with known data sets from SCL predictors and experimental proteomics. While in good agreement with experimental extracytoplasmic fractions, the secretomics-based method outperforms other genomic analyses, which were simply not intended to be as inclusive. Unlike all other localization predictors, this method not only supplies a static snapshot of protein SCL but also offers the full picture of the secretion process dynamics: (i) the protein routing is detailed, (ii) the number of distinct SCLs and protein categories is comprehensive, (iii) the description of protein type and topology is provided, (iv) the SCL is unambiguously differentiated from the protein category, and (v) multiple SCLs and protein categories are fully considered. In that sense, the secretomics-based method is much more than an SCL predictor. Besides being a major step forward in genomics and proteomics of protein secretion, the secretomics-based method appears to be a strategy of choice to generate in silico hypotheses for experimental testing.
Affiliation(s)
- Sandra Renier
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Pierre Micheau
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Régine Talon
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Michel Hébraud
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Mickaël Desvaux
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
|
76
|
Characterization of microRNAs expression during maize seed development. BMC Genomics 2012; 13:360. [PMID: 22853295 PMCID: PMC3468377 DOI: 10.1186/1471-2164-13-360] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 04/06/2012] [Accepted: 07/09/2012] [Indexed: 12/21/2022] Open
Abstract
Background MicroRNAs (miRNAs) are approximately 20-22 nt non-coding RNAs that play key roles in many biological processes in both animals and plants. Although a number of miRNAs have been identified in maize, the function of miRNAs in seed development has rarely been discussed. Results In this study, two small RNA libraries were sequenced, and totals of 9,705,761 and 9,005,563 reads were generated from developing seeds and growing leaves, respectively. Further analysis identified 125 known miRNAs in seeds and 127 known miRNAs in leaves. In addition, 54 novel miRNAs that have not been reported in other plants were identified. Some miRNA*s of these novel miRNAs were also detected. Potential targets of all novel miRNAs were predicted based on our strict criteria. In addition to deep sequencing, a miRNA microarray study confirmed the higher expression of several miRNAs in seeds. In summary, our results indicate distinct expression of miRNAs during seed development. Conclusions We identified 125 and 127 known miRNAs from seeds and leaves in maize, respectively, and a total of 54 novel miRNAs were discovered. The different miRNA expression profiles in developing seeds were revealed by both sequencing and microarray studies.
|
77
|
Geary J, Satti M, Moreno Y, Madrill N, Whitten D, Headley SA, Agnew D, Geary T, Mackenzie C. First analysis of the secretome of the canine heartworm, Dirofilaria immitis. Parasit Vectors 2012; 5:140. [PMID: 22781075 PMCID: PMC3439246 DOI: 10.1186/1756-3305-5-140] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 04/18/2012] [Accepted: 06/13/2012] [Indexed: 12/18/2022] Open
Abstract
Background The characterization of proteins released from filariae is an important step in addressing many of the needs in the diagnosis and treatment of these clinically important parasites, as well as contributing to a clearer understanding of their biology. This report describes findings on the proteins released during in vitro cultivation of adult Dirofilaria immitis, the causative agent of canine and feline heartworm disease. Differences in protein secretion among nematodes in vivo may relate to the ecological niche of each parasite and the pathological changes that they induce. Methods The proteins in the secretions of cultured adult worms were run on Tris-Glycine gels, bands were separated, and peptides from each band were analysed by ultra mass spectrometry and compared with a FastA dataset of predicted tryptic peptides derived from a genome sequence of D. immitis. Results This study identified 110 proteins. Of these proteins, 52 were unique to D. immitis. A total of 23 (44%) were recognized as proteins likely to be secreted. Although these proteins were unique, the motifs were conserved compared with proteins secreted by other nematodes. Conclusion The present data indicate that D. immitis secretes proteins that are unique to this species, when compared with Brugia malayi. The two major functional groups of molecules represented were those involved in cellular and metabolic processes. Unique proteins might be important for maintaining an infection in the host environment, might be intimately involved in the pathogenesis of disease, and may also provide new tools for the diagnosis of heartworm infection.
Affiliation(s)
- James Geary
- Department of Pathobiology and Diagnostic Investigation, College of Veterinary Medicine, Michigan State University, East Lansing, MI 48824, USA
|
78
|
The yak genome and adaptation to life at high altitude. Nat Genet 2012; 44:946-9. [DOI: 10.1038/ng.2343] [Citation(s) in RCA: 540] [Impact Index Per Article: 45.0] [Received: 11/29/2011] [Accepted: 06/06/2012] [Indexed: 01/17/2023]
|
79
|
He QF, Li D, Xu QY, Zheng S. Predicted essential proteins of Plasmodium falciparum for potential drug targets. ASIAN PAC J TROP MED 2012; 5:352-4. [PMID: 22546649 DOI: 10.1016/s1995-7645(12)60057-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/15/2012] [Revised: 03/15/2012] [Accepted: 04/15/2012] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVE To identify novel drug targets for the treatment of Plasmodium falciparum. METHODS Local BLASTP was used to find proteins non-homologous to human essential proteins as novel drug targets. Functional domains of the novel drug targets were identified with InterPro and Pfam, and 3D structures of the potential drug targets were predicted with the SWISS-MODEL workspace. Ligands and ligand-binding sites of the proteins were searched with Ef-seek. RESULTS Three essential proteins were identified that might be considered as potential drug targets. AAN37254.1 belonged to 1-deoxy-D-xylulose 5-phosphate reductoisomerase, CAD50499.1 belonged to chorismate synthase, and CAD51220.1 belonged to the FAD binding 3 family, but the function of CAD51220.1 was unknown. The 3D structures, ligands and ligand-binding sites of AAN37254.1 and CAD50499.1 were successfully predicted. CONCLUSIONS Two of these potential drug targets are key enzymes in the 2-C-methyl-D-erythritol 4-phosphate pathway and the shikimate pathway, which are absent in humans, so these two essential proteins are good potential drug targets. The function and 3D structure of CAD51220.1 are still unknown and need further study.
Affiliation(s)
- Qing-Feng He
- Department of Parasitology, Guangdong Medical College, Dongguan, Guangdong, China.
|
80
|
Abstract
Background A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods.
Conclusions Validation of clusters against known gene classifications demonstrates that, for these data, graph-based techniques outperform conventional clustering approaches, suggesting that further development and application of combinatorial strategies is warranted.
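The scoring procedure described in the Methods can be illustrated with a minimal Python sketch. It assumes clusters and annotation sets are plain sets of gene identifiers. The function names, the bin cut-offs (`small_max`, `medium_max`) and the example gene names are hypothetical and are not taken from the study, which does not state its bin boundaries here.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of one cluster against every annotation set."""
    return max((jaccard(cluster, genes) for genes in annotation_sets.values()),
               default=0.0)


def top5_average_per_bin(clusters, annotation_sets,
                         small_max=10, medium_max=100, top_n=5):
    """Average the top-N cluster scores within each size bin.

    small_max and medium_max are placeholder cut-offs (an assumption).
    """
    bins = {"small": [], "medium": [], "large": []}
    for cluster in clusters:
        score = best_annotation_score(cluster, annotation_sets)
        if len(cluster) <= small_max:
            bins["small"].append(score)
        elif len(cluster) <= medium_max:
            bins["medium"].append(score)
        else:
            bins["large"].append(score)
    # Bins without any cluster are omitted from the result.
    return {name: mean(sorted(scores, reverse=True)[:top_n])
            for name, scores in bins.items() if scores}


# Hypothetical toy data: two small clusters and two annotation sets.
annotations = {"GO:example": {"g1", "g2"}, "KEGG:example": {"g3", "g4"}}
clusters = [{"g1", "g2"}, {"g1", "g2", "g3"}]
print(top5_average_per_bin(clusters, annotations))
```

How the per-bin averages are combined across parameter settings into a single "best" value per method is not spelled out in the abstract, so the sketch stops at the per-bin averages.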
|
81
|
Reimand J, Hui S, Jain S, Law B, Bader GD. Domain-mediated protein interaction prediction: From genome to network. FEBS Lett 2012; 586:2751-63. [PMID: 22561014 DOI: 10.1016/j.febslet.2012.04.027] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 03/25/2012] [Accepted: 04/17/2012] [Indexed: 11/19/2022]
Abstract
Protein-protein interactions (PPIs), involved in many biological processes such as cellular signaling, are ultimately encoded in the genome. Solving the problem of predicting protein interactions from the genome sequence will lead to increased understanding of complex networks, evolution and human disease. We can learn the relationship between genomes and networks by focusing on an easily approachable subset of high-resolution protein interactions that are mediated by peptide recognition modules (PRMs) such as PDZ, WW and SH3 domains. This review focuses on computational prediction and analysis of PRM-mediated networks and discusses sequence- and structure-based interaction predictors, techniques and datasets for identifying physiologically relevant PPIs, and interpreting high-resolution interaction networks in the context of evolution and human disease.
Affiliation(s)
- Jüri Reimand
- The Donnelly Centre, University of Toronto, 160 College Street, Toronto, Ontario, Canada.
|
82
|
Wang MC, Chen FC, Chen YZ, Huang YT, Chuang TJ. LDGIdb: a database of gene interactions inferred from long-range strong linkage disequilibrium between pairs of SNPs. BMC Res Notes 2012; 5:212. [PMID: 22551073 PMCID: PMC3441865 DOI: 10.1186/1756-0500-5-212] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/28/2011] [Accepted: 04/26/2012] [Indexed: 12/22/2022] Open
Abstract
Background Complex human diseases may be associated with many gene interactions. Gene interactions take several different forms and it is difficult to identify all of the interactions that are potentially associated with human diseases. One approach that may fill this knowledge gap is to infer previously unknown gene interactions via identification of non-physical linkages between different mutations (or single nucleotide polymorphisms, SNPs) to avoid hitchhiking effect or lack of recombination. Strong non-physical SNP linkages are considered to be an indication of biological (gene) interactions. These interactions can be physical protein interactions, regulatory interactions, functional compensation/antagonization or many other forms of interactions. Previous studies have shown that mutations in different genes can be linked to the same disorders. Therefore, non-physical SNP linkages, coupled with knowledge of SNP-disease associations may shed more light on the role of gene interactions in human disorders. A user-friendly web resource that integrates information about non-physical SNP linkages, gene annotations, SNP information, and SNP-disease associations may thus be a good reference for biomedical research. Findings Here we extracted the SNPs located within the promoter or exonic regions of protein-coding genes from the HapMap database to construct a database named the Linkage-Disequilibrium-based Gene Interaction database (LDGIdb). The database stores 646,203 potential human gene interactions, which are potential interactions inferred from SNP pairs that are subject to long-range strong linkage disequilibrium (LD), or non-physical linkages. To minimize the possibility of hitchhiking, SNP pairs inferred to be non-physically linked were required to be located in different chromosomes or in different LD blocks of the same chromosomes. 
According to the genomic locations of the involved SNPs (i.e., promoter, untranslated region (UTR) and coding region (CDS)), the SNP linkages inferred were categorized into promoter-promoter, promoter-UTR, promoter-CDS, CDS-CDS, CDS-UTR and UTR-UTR linkages. For the CDS-related linkages, the coding SNPs were further classified into nonsynonymous and synonymous variations, which represent potential gene interactions at the protein and RNA level, respectively. The LDGIdb also incorporates human disease-association databases such as Genome-Wide Association Studies (GWAS) and Online Mendelian Inheritance in Man (OMIM), so that the user can search for potential disease-associated SNP linkages. The inferred SNP linkages are also classified in the context of population stratification to provide a resource for investigating potential population-specific gene interactions. Conclusion The LDGIdb is a user-friendly resource that integrates non-physical SNP linkages and SNP-disease associations for studies of gene interactions in human diseases. With the help of the LDGIdb, it is plausible to infer population-specific SNP linkages for more focused studies, an avenue that is potentially important for pharmacogenetics. Moreover, by referring to disease-association information such as the GWAS data, the LDGIdb may help identify previously uncharacterized disease-associated gene interactions and potentially lead to new discoveries in studies of human diseases. Keywords Gene interaction, SNP, Linkage disequilibrium, Systems biology, Bioinformatics
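As a rough illustration of the filtering and categorization described above, the following Python sketch computes the standard r² measure of linkage disequilibrium from phased two-SNP haplotypes, checks the "non-physical" condition (different chromosomes, or different LD blocks of the same chromosome), and labels a pair by the regions of its two SNPs. The `Snp` record, its field names and the helper names are hypothetical. The abstract does not state which LD statistic or threshold LDGIdb uses, so r² is an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str      # identifier, e.g. a dbSNP rs number
    chrom: str     # chromosome name
    ld_block: int  # hypothetical index of the LD block on that chromosome
    region: str    # "promoter", "CDS" or "UTR"


def r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2), alleles coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


def is_nonphysical(s1, s2):
    """True if the SNPs lie on different chromosomes or in different LD blocks."""
    return s1.chrom != s2.chrom or s1.ld_block != s2.ld_block


_ORDER = {"promoter": 0, "CDS": 1, "UTR": 2}


def linkage_category(s1, s2):
    """Label such as 'promoter-CDS', ordered promoter < CDS < UTR."""
    first, second = sorted((s1.region, s2.region), key=_ORDER.__getitem__)
    return f"{first}-{second}"


a = Snp("rsA", "1", 3, "UTR")
b = Snp("rsB", "7", 1, "promoter")
print(is_nonphysical(a, b), linkage_category(a, b))
```

The region ordering is chosen only so that the labels match the six categories named in the abstract.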
Affiliation(s)
- Ming-Chih Wang
- Genomics Research Center, Academia Sinica, Taipei, 11529, Taiwan
|
83
|
Aguileta G, Lengelle J, Chiapello H, Giraud T, Viaud M, Fournier E, Rodolphe F, Marthey S, Ducasse A, Gendrault A, Poulain J, Wincker P, Gout L. Genes under positive selection in a model plant pathogenic fungus, Botrytis. Infect Genet Evol 2012; 12:987-96. [PMID: 22406010 DOI: 10.1016/j.meegid.2012.02.012] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 12/06/2011] [Revised: 02/15/2012] [Accepted: 02/23/2012] [Indexed: 11/29/2022]
Abstract
The rapid evolution of particular genes is essential for the adaptation of pathogens to new hosts and new environments. Powerful methods have been developed for detecting targets of selection in the genome. Here we used divergence data to compare genes among four closely related fungal pathogens adapted to different hosts to elucidate the functions putatively involved in adaptive processes. For this goal, ESTs were sequenced in the specialist fungal pathogens Botrytis tulipae and Botrytis ficariarum, and compared with genome sequences of Botrytis cinerea and Sclerotinia sclerotiorum, responsible for diseases on over 200 plant species. A maximum likelihood-based analysis of 642 predicted orthologs detected 21 genes showing footprints of positive selection. These results were validated by resequencing nine of these genes in additional Botrytis species, showing they have also been rapidly evolving in other related species. Twenty of the 21 genes had not previously been identified as pathogenicity factors in B. cinerea, but some had functions related to plant-fungus interactions. The putative functions were involved in respiratory and energy metabolism, protein and RNA metabolism, signal transduction or virulence, similarly to what was detected in previous studies using the same approach in other pathogens. Mutants of B. cinerea were generated for four of these genes as a first attempt to elucidate their functions.
Affiliation(s)
- Gabriela Aguileta
- Ecologie, Systématique et Evolution, Université Paris-Sud UMR8079, F-91405 Orsay Cedex, France
|
84
|
Chowdhary R, Tan SL, Pavesi G, Jin J, Dong D, Mathur SK, Burkart A, Narang V, Glurich I, Raby BA, Weiss ST, Wong L, Liu JS, Bajic VB. A database of annotated promoters of genes associated with common respiratory and related diseases. Am J Respir Cell Mol Biol 2012; 47:112-9. [PMID: 22383585 DOI: 10.1165/rcmb.2011-0419oc] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/19/2023] Open
Abstract
Many genes have been implicated in the pathogenesis of common respiratory and related diseases (RRDs), yet the underlying mechanisms are largely unknown. Differential gene expression patterns in diseased and healthy individuals suggest that RRDs affect or are affected by modified transcription regulation programs. It is thus crucial to characterize implicated genes in terms of transcriptional regulation. For this purpose, we conducted a promoter analysis of genes associated with 11 common RRDs including allergic rhinitis, asthma, bronchiectasis, bronchiolitis, bronchitis, chronic obstructive pulmonary disease, cystic fibrosis, emphysema, eczema, psoriasis, and urticaria, many of which are thought to be genetically related. The objective of the present study was to obtain deeper insight into the transcriptional regulation of these disease-associated genes by annotating their promoter regions with transcription factors (TFs) and TF binding sites (TFBSs). We discovered many TFs that are significantly enriched in the target disease groups, including associations that have been documented in the literature. We also identified a number of putative TFs/TFBSs that appear to be novel. The results of our analysis are provided in an online database that is freely accessible to researchers at http://www.respiratorygenomics.com. Promoter-associated TFBS information and related genomic features, such as histone modification sites, microsatellites, CpG islands, and SNPs, are graphically summarized in the database. Users can compare and contrast underlying mechanisms of specific RRDs relative to candidate genes, TFs, gene ontology terms, microRNAs, and biological pathways for the conduct of meta-analyses. This database represents a novel, useful resource for RRD researchers.
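The abstract does not say which statistical test was used to call a TF "significantly enriched". One common choice for this kind of question is a one-sided hypergeometric test, sketched below in Python purely as an assumption. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: promoters in the background set
    K: background promoters that carry a binding site for the TF
    n: promoters in the disease gene group
    k: disease-group promoters that carry the binding site
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 8 of 20 disease promoters carry the site,
# versus 100 of 1000 promoters in the background.
print(hypergeom_enrichment_p(k=8, n=20, K=100, N=1000))
```

When many TFs are tested against many disease groups, the resulting p-values would normally also need a multiple-testing correction, which this sketch leaves out.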
Affiliation(s)
- Rajesh Chowdhary
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield Clinic, Wisconsin 54449, USA.
|
85
|
Abstract
Here we discuss proteomic analyses of whole cell preparations of the mosquito stages of malaria parasite development (i.e. gametocytes, microgamete, ookinete, oocyst and sporozoite) of Plasmodium berghei. We also include critiques of the proteomes of two cell fractions from the purified ookinete, namely the micronemes and cell surface. Whereas we summarise key biological interpretations of the data, we also try to identify key methodological constraints we have met, only some of which we were able to resolve. Recognising the need to translate the potential of current genome sequencing into functional understanding, we report our efforts to develop more powerful combinations of methods for the in silico prediction of protein function and location. We have applied this analysis to the proteome of the male gamete, a cell whose very simple structural organisation facilitated interpretation of data. Some of the in silico predictions made have now been supported by ongoing protein tagging and genetic knockout studies. We hope this discussion may assist future studies.
|
86
|
Rao RU, Huang Y, Abubucker S, Heinz M, Crosby SD, Mitreva M, Weil GJ. Effects of doxycycline on gene expression in Wolbachia and Brugia malayi adult female worms in vivo. J Biomed Sci 2012; 19:21. [PMID: 22321609 PMCID: PMC3352068 DOI: 10.1186/1423-0127-19-21] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 10/19/2011] [Accepted: 02/09/2012] [Indexed: 12/28/2022] Open
Abstract
Background Most filarial nematodes contain Wolbachia symbionts. The purpose of this study was to examine the effects of doxycycline on gene expression in Wolbachia and adult female Brugia malayi. Methods Brugia malayi-infected gerbils were treated with doxycycline for 6 weeks. This treatment largely cleared Wolbachia and arrested worm reproduction. RNA recovered from treated and control female worms was labeled by random priming and hybridized to the Version 2 filarial microarray to obtain expression profiles. Results and discussion Results showed significant changes in expression for 200 Wolbachia (29% of Wolbachia genes with expression signals in untreated worms) and 546 B. malayi array elements after treatment. These elements correspond to known genes and also to novel genes with unknown biological functions. Most differentially expressed Wolbachia genes were down-regulated after treatment (98.5%). In contrast, doxycycline had a mixed effect on B. malayi gene expression, with many more genes being significantly up-regulated after treatment (85% of differentially expressed genes). Genes and processes involved in reproduction (gender-regulated genes, collagen, amino acid metabolism, ribosomal processes, and cytoskeleton) were down-regulated after doxycycline, while up-regulated genes and pathways suggest adaptations for survival in response to stress (energy metabolism, electron transport, anti-oxidants, nutrient transport, bacterial signaling pathways, and immune evasion). Conclusions Doxycycline reduced Wolbachia and significantly decreased bacterial gene expression. Wolbachia ribosomes are believed to be the primary biological target for doxycycline in filarial worms. B. malayi genes essential for reproduction, growth and development were also down-regulated; these changes are consistent with doxycycline effects on embryo development and reproduction. On the other hand, many B. malayi genes involved in energy production, electron transport, metabolism, anti-oxidants, and others with unknown functions had increased expression signals after doxycycline treatment. These results suggest that female worms are able to compensate in part for the loss of Wolbachia so that they can survive, albeit without reproductive capacity. This study of doxycycline-induced changes in gene expression has provided new clues regarding the symbiotic relationship between Wolbachia and B. malayi.
Affiliation(s)
- Ramakrishna U Rao
- Infectious Diseases Division, Department of Internal Medicine, St. Louis, Missouri, USA.
|
87
|
Shahbaba B, Shachaf CM, Yu Z. A pathway analysis method for genome-wide association studies. Stat Med 2012; 31:988-1000. [PMID: 22302470 DOI: 10.1002/sim.4477] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/22/2011] [Revised: 10/20/2011] [Accepted: 11/02/2011] [Indexed: 12/20/2022]
Abstract
For genome-wide association studies, we propose a new method for identifying significant biological pathways. In this approach, we aggregate data across single-nucleotide polymorphisms to obtain summary measures at the gene level. We then use a hierarchical Bayesian model, which takes the gene-level summary measures as data, in order to evaluate the relevance of each pathway to an outcome of interest (e.g., disease status). Although shifting the focus of analysis from individual genes to pathways has proven to improve the statistical power and provide more robust results, such methods tend to eliminate a large number of genes whose pathways are unknown. For these genes, we propose to use a Bayesian multinomial logit model to predict the associated pathways by using the genes with known pathways as the training data. Our hierarchical Bayesian model takes the uncertainty regarding the pathway predictions into account while assessing the significance of pathways. We apply our method to two independent studies on type 2 diabetes and show that the overlap between the results from the two studies is statistically significant. We also evaluate our approach on the basis of simulated data.
Affiliation(s)
- Babak Shahbaba
- Department of Statistics, University of California, Irvine, CA, USA
|
88
|
Childs KL, Konganti K, Buell CR. The Biofuel Feedstock Genomics Resource: a web-based portal and database to enable functional genomics of plant biofuel feedstock species. Database (Oxford) 2012; 2012:bar061. [PMID: 22250003 PMCID: PMC3259624 DOI: 10.1093/database/bar061] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
Major feedstock sources for future biofuel production are likely to be high-biomass-producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomics Resource (BFGR), a database and web portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu
Affiliation(s)
- Kevin L Childs
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA.
|
89
|
Ellis JT, Sims RC, Miller CD. Monitoring microbial diversity of bioreactors using metagenomic approaches. Subcell Biochem 2012; 64:73-94. [PMID: 23080246 DOI: 10.1007/978-94-007-5055-5_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/15/2023]
Abstract
With the rapid development of molecular techniques, particularly 'omics' technologies, the field of microbial ecology is growing rapidly. The applications of next generation sequencing have allowed researchers to produce massive amounts of genetic data on individual microbes, providing information about microbial communities and their interactions through in situ and in vitro measurements. The ability to identify novel microbes, functions, and enzymes, along with developing an understanding of microbial interactions and functions, is necessary for efficient production of useful and high value products in bioreactors. The ability to optimize bioreactors fully and understand microbial interactions and functions within these systems will establish highly efficient industrial processes for the production of bioproducts. This chapter will provide an overview of bioreactors and metagenomic technologies to help the reader understand microbial communities, interactions, and functions in bioreactors.
Affiliation(s)
- Joshua T Ellis
- Department of Biological Engineering, Utah State University, 4105 Old Main Hill, Logan, UT, 84322-4105, USA
|
90
|
Generation and Analysis of Large-Scale Data-Driven Mycobacterium tuberculosis Functional Networks for Drug Target Identification. Adv Bioinformatics 2011; 2011:801478. [PMID: 22190924 PMCID: PMC3235424 DOI: 10.1155/2011/801478] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 05/05/2011] [Accepted: 08/28/2011] [Indexed: 11/18/2022] Open
Abstract
Technological developments in large-scale biological experiments, coupled with bioinformatics tools, have opened the doors to computational approaches for the global analysis of whole genomes. This has provided the opportunity to look at genes within their context in the cell. The integration of vast amounts of data generated by these technologies provides a strategy for identifying potential drug targets within microbial pathogens, the causative agents of infectious diseases. As proteins are druggable targets, functional interaction networks between proteins are used to identify proteins essential to the survival, growth, and virulence of these microbial pathogens. Here we have integrated functional genomics data to generate functional interaction networks between Mycobacterium tuberculosis proteins and carried out computational analyses to dissect the functional interaction network produced for identifying drug targets using network topological properties. This study has provided the opportunity to expand the range of potential drug targets and to move towards optimal target-based strategies.
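The abstract does not name the topological properties that were used. As one simple illustration of ranking candidates by network topology, the Python sketch below orders proteins by degree, meaning the number of distinct interaction partners. The choice of degree, the function name and the protein labels are all assumptions made for the example.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank proteins by number of distinct interaction partners.

    edges: iterable of (protein_a, protein_b) pairs from a functional network.
    Returns (protein, degree) tuples, highest degree first, ties by name.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    return sorted(((protein, len(partners))
                   for protein, partners in neighbours.items()),
                  key=lambda item: (-item[1], item[0]))


# Hypothetical interaction list; the labels are placeholders, not real loci.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
for protein, degree in degree_ranking(edges):
    print(protein, degree)
```

Highly connected proteins are only candidates under this heuristic. Other measures such as betweenness or closeness could be substituted without changing the overall shape of the analysis.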
|
91
|
Hamilton JP, Neeno-Eckwall EC, Adhikari BN, Perna NT, Tisserat N, Leach JE, Lévesque CA, Buell CR. The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes. Database (Oxford) 2011; 2011:bar053. [PMID: 22120664 PMCID: PMC3225079 DOI: 10.1093/database/bar053] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022]
Abstract
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Affiliation(s)
- John P. Hamilton, Eric C. Neeno-Eckwall, Bishwo N. Adhikari, Nicole T. Perna, Ned Tisserat, Jan E. Leach, C. André Lévesque, C. Robin Buell
- Department of Plant Biology, 178 Wilson Lane, Michigan State University, East Lansing, MI, 48824, USA, Department of Genetics, 4434 Genetics-Biotech Center BLDG, 425 Henry Mall, University of Wisconsin, Madison, WI, 53706, USA, Department of Bioagricultural Sciences and Pest Management, Plant Science C129, Colorado State University, Fort Collins, CO, 80523–1177, USA, Agriculture and Agri-Food Canada, 960 Carling Ave., ON, K1A 0C6 and Department of Biology, Carleton University, ON, K1S 5B6, Ottawa, Canada
|
92
|
Forslund K, Pekkari I, Sonnhammer ELL. Domain architecture conservation in orthologs. BMC Bioinformatics 2011; 12:326. [PMID: 21819573 PMCID: PMC3215765 DOI: 10.1186/1471-2105-12-326] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 07/08/2011] [Accepted: 08/05/2011] [Indexed: 11/16/2022] Open
Abstract
Background As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. 
This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance.
Affiliation(s)
- Kristoffer Forslund
- Stockholm Bioinformatics Centre, Science for Life Laboratory, Box 1031, Solna, 17121 Sweden
|
93
|
Silkov A, Yoon Y, Lee H, Gokhale N, Adu-Gyamfi E, Stahelin RV, Cho W, Murray D. Genome-wide structural analysis reveals novel membrane binding properties of AP180 N-terminal homology (ANTH) domains. J Biol Chem 2011; 286:34155-63. [PMID: 21828048 DOI: 10.1074/jbc.m111.265611] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022] Open
Abstract
An increasing number of cytosolic proteins are shown to interact with membrane lipids during diverse cellular processes, but computational prediction of these proteins and their membrane binding behaviors remains challenging. Here, we introduce a new combinatorial computation protocol for systematic and robust functional prediction of membrane-binding proteins through high throughput homology modeling and in-depth calculation of biophysical properties. The approach was applied to the genomic scale identification of the AP180 N-terminal homology (ANTH) domain, one of the modular lipid binding domains, and prediction of their membrane binding properties. Our analysis yielded comprehensive coverage of the ANTH domain family and allowed classification and functional annotation of proteins based on the differences in local structural and biophysical features. Our analysis also identified a group of plant ANTH domains with unique structural features that may confer novel functionalities. Experimental characterization of a representative member of this subfamily confirmed its unique membrane binding mechanism and unprecedented membrane deforming activity. Collectively, these studies suggest that our new computational approach can be applied to genome-wide functional prediction of other lipid binding domains.
Affiliation(s)
- Antonina Silkov
- Department of Pharmacology, Columbia University, New York, New York 11032, USA
|
94
|
De Martino A, Bartual A, Willis A, Meichenin A, Villazán B, Maheswari U, Bowler C. Physiological and Molecular Evidence that Environmental Changes Elicit Morphological Interconversion in the Model Diatom Phaeodactylum tricornutum. Protist 2011; 162:462-81. [DOI: 10.1016/j.protis.2011.02.002] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Received: 07/15/2010] [Accepted: 01/17/2011] [Indexed: 11/30/2022]
|
95
|
Airoldi EM, Heller KA, Silva R. Small sets of interacting proteins suggest functional linkage mechanisms via Bayesian analogical reasoning. Bioinformatics 2011; 27:i374-82. [PMID: 21685095 PMCID: PMC3117334 DOI: 10.1093/bioinformatics/btr236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Proteins and protein complexes coordinate their activity to execute cellular functions. In a number of experimental settings, including synthetic genetic arrays, genetic perturbations and RNAi screens, scientists identify a small set of protein interactions of interest. A working hypothesis is often that these interactions are the observable phenotypes of some functional process, which is not directly observable. Confirmatory analysis requires finding other pairs of proteins whose interaction may be additional phenotypical evidence about the same functional process. Extant methods for finding additional protein interactions rely heavily on the information in the newly identified set of interactions. For instance, these methods leverage the attributes of the individual proteins directly, in a supervised setting, in order to find relevant protein pairs. A small set of protein interactions provides a small sample to train parameters of prediction methods, thus leading to low confidence. RESULTS We develop RBSets, a computational approach to ranking protein interactions rooted in analogical reasoning; that is, the ability to learn and generalize relations between objects. Our approach is tailored to situations where the training set of protein interactions is small, and leverages the attributes of the individual proteins indirectly, in a Bayesian ranking setting that is perhaps closest to propensity scoring in mathematical psychology. We find that RBSets leads to good performance in identifying additional interactions starting from a small evidence set of interacting proteins, for which an underlying biological logic in terms of functional processes and signaling pathways can be established with some confidence. Our approach is scalable and can be applied to large databases with minimal computational overhead. Our results suggest that analogical reasoning within a Bayesian ranking problem is a promising new approach for real-time biological discovery. 
AVAILABILITY Java code is available at: www.gatsby.ucl.ac.uk/~rbas. CONTACT airoldi@fas.harvard.edu; kheller@mit.edu; ricardo@stats.ucl.ac.uk.
Affiliation(s)
- Edoardo M Airoldi
- Department of Statistics and FAS Center for Systems Biology, Harvard University, Cambridge, MA 02138, USA.
|
96
|
NELL-1 binds to APR3 affecting human osteoblast proliferation and differentiation. FEBS Lett 2011; 585:2410-8. [PMID: 21723284 DOI: 10.1016/j.febslet.2011.06.024] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 11/02/2010] [Revised: 06/11/2011] [Accepted: 06/17/2011] [Indexed: 11/23/2022]
Abstract
Nel-like protein 1 (NELL-1) is an osteoinductive molecule associated with premature calvarial suture closure. Here we identified apoptosis related protein 3 (APR3), a membrane protein known as a proliferation suppressor, as a binding protein of NELL-1 by biopanning. NELL-1 and APR3 colocalized on the nuclear envelope of human osteoblasts. NELL-1 significantly inhibited proliferation of osteoblasts co-transfected with APR3 through further down-regulation of Cyclin D1. The co-expression of NELL-1 and APR3 enhanced Ocn and Bsp expression and mineralization. RNAi of APR3 significantly reduced the differentiation effect of NELL-1. These findings suggest that the effects of NELL-1 on osteoblastic differentiation and proliferation are partly through binding to APR3.
|
97
|
Wang Y, Wu W, Negre NN, White KP, Li C, Shah PK. Determinants of antigenicity and specificity in immune response for protein sequences. BMC Bioinformatics 2011; 12:251. [PMID: 21693021 PMCID: PMC3133554 DOI: 10.1186/1471-2105-12-251] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 12/16/2010] [Accepted: 06/21/2011] [Indexed: 11/22/2022] Open
Abstract
Background Target specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the properties of protein sequences that are important for antigenicity and to identify small peptide epitopes and large regions in the linear sequence of the proteins whose utilization result in specific antibodies. Results Our analysis using protein properties suggested that sequence composition combined with evolutionary information and predicted secondary structure, as well as solvent accessibility is sufficient to predict successful peptide epitopes. The antigenicity and the specificity in immune response were also found to depend on the epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, for the identification of continuous B-Cell epitopes with these protein properties as learning features. The BEOracle achieved an F1-measure of 81.37% on a large validation set. The BEOracle classifier outperformed the classical methods based on propensity and sophisticated methods like BCPred and Bepipred for B-Cell epitope prediction. The BEOracle classifier also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle score for peptides showed some correlation with the antibody intensity on Immunofluorescence studies done on fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle) was trained with the BEOracle scores as features to predict the performance of antibodies generated with large protein regions with high accuracy. The BROracle classifier achieved accuracies of 75.26-63.88% on a validation set with immunofluorescence, immunohistochemistry, protein arrays and western blot results from Protein Atlas database. 
Conclusions Together our results suggest that antigenicity is a local property of the protein sequences and that protein sequence properties of composition, secondary structure, solvent accessibility and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without the knowledge of the protein tertiary structure or the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/.
Affiliation(s)
- Yulong Wang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute & Harvard School of Public Health, Boston 02115 MA, USA.
|
98
|
Woo NS, Gordon MJ, Graham SR, Rossel JB, Badger MR, Pogson BJ. A mutation in the purine biosynthetic enzyme ATASE2 impacts high light signalling and acclimation responses in green and chlorotic sectors of Arabidopsis leaves. Funct Plant Biol 2011; 38:401-419. [PMID: 32480896 DOI: 10.1071/fp10218] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/17/2010] [Accepted: 03/22/2011] [Indexed: 05/14/2023]
Abstract
In this report, we investigate the altered APX2 expression 13 (alx13) mutation of Arabidopsis thaliana, a mutation in glutamine phosphoribosyl pyrophosphate amidotransferase 2 (ATASE2), the primary isoform of the enzyme mediating the first committed step of purine biosynthesis. Light-dependent leaf variegation was exhibited by alx13 plants, with partial shading of alx13 rosettes revealing that the development of chlorosis in emerging leaves is influenced by the growth irradiance of established leaves. Chlorotic sectors arose from emerging green alx13 leaves during a phase of rapid cell division and expansion, which shows that each new cell's fate is independent of its progenitor. In conjunction with the variegated phenotype, alx13 plants showed altered high light stress responses, including changed expression of genes encoding proteins with antioxidative functions, impaired anthocyanin production and over-accumulation of reactive oxygen species. These characteristics were observed in both photosynthetically-normal green tissues and chlorotic tissues. Chlorotic tissues of alx13 leaves accumulated mRNAs of nuclear-encoded photosynthesis genes that are repressed in other variegated mutants of Arabidopsis. Thus, defective purine biosynthesis impairs chloroplast biogenesis in a light-dependent manner and alters the induction of high light stress pathways and nuclear-encoded photosynthesis genes.
Affiliation(s)
- Nick S Woo
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
- Matthew J Gordon
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
- Stephen R Graham
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
- Jan Bart Rossel
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
- Murray R Badger
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
- Barry J Pogson
- Australian Research Council Centre of Excellence in Plant Energy Biology, Research School of Biology, Australian National University, Canberra, ACT 0200, Australia
|
99
|
Panphut W, Senapin S, Sriurairatana S, Withyachumnarnkul B, Flegel TW. A novel integrase-containing element may interact with Laem-Singh virus (LSNV) to cause slow growth in giant tiger shrimp. BMC Vet Res 2011; 7:18. [PMID: 21569542 PMCID: PMC3117699 DOI: 10.1186/1746-6148-7-18] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 11/16/2010] [Accepted: 05/14/2011] [Indexed: 11/24/2022] Open
Abstract
Background From 2001-2003 monodon slow growth syndrome (MSGS) caused severe economic losses for Thai shrimp farmers who cultivated the native, giant tiger shrimp, and this led them to adopt exotic stocks of the domesticated whiteleg shrimp as the species of cultivation choice, despite the higher value of giant tiger shrimp. In 2008, newly discovered Laem-Singh virus (LSNV) was proposed as a necessary but insufficient cause of MSGS, and this stimulated the search for the additional component cause(s) of MSGS in the hope that discovery would lead to preventative measures that could revive cultivation of the higher value native shrimp species. Results Using a universal shotgun cloning protocol, a novel RNA, integrase-containing element (ICE) was found in giant tiger shrimp from MSGS ponds (GenBank accession number FJ498866). In situ hybridization probes and RT-PCR tests revealed that ICE and Laem-Singh virus (LSNV) occurred together in lymphoid organs (LO) of shrimp from MSGS ponds but not in shrimp from normal ponds. Tissue homogenates of shrimp from MSGS ponds yielded a fraction that gave positive RT-PCR reactions for both ICE and LSNV and showed viral-like particles by transmission electron microscopy (TEM). Bioassays of this fraction with juvenile giant tiger shrimp resulted in retarded growth with gross signs of MSGS, and in situ hybridization assays revealed ICE and LSNV together in LO, eyes and gills. Viral-like particles similar to those seen in tissue extracts from natural infections were also seen by TEM. Conclusions ICE and LSNV were found together only in shrimp from MSGS ponds and only in shrimp showing gross signs of MSGS after injection with a preparation containing ICE and LSNV. ICE was never found in the absence of LSNV although LSNV was sometimes found in normal shrimp in the absence of ICE. 
The results suggest that ICE and LSNV may act together as component causes of MSGS, but this cannot be proven conclusively without single and combined bioassays using purified preparations of both ICE and LSNV. Despite this ambiguity, it is recommended in the interim that ICE be added to the agents such as LSNV already listed for exclusion from domesticated stocks of the black tiger shrimp.
Affiliation(s)
- Wattana Panphut
- Centex Shrimp, Faculty of Science, Mahidol University, Bangkok 10400, Thailand
|
100
|
Cohen-Gihon I, Sharan R, Nussinov R. Processes of fungal proteome evolution and gain of function: gene duplication and domain rearrangement. Phys Biol 2011; 8:035009. [PMID: 21572172 DOI: 10.1088/1478-3975/8/3/035009] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/16/2023]
Abstract
During evolution, organisms have gained functional complexity mainly by modifying and improving existing functioning systems rather than creating new ones ab initio. Here we explore the interplay between two processes which during evolution have had major roles in the acquisition of new functions: gene duplication and protein domain rearrangements. We consider four possible evolutionary scenarios: gene families that have undergone none of these event types; only gene duplication; only domain rearrangement, or both events. We characterize each of the four evolutionary scenarios by functional attributes. Our analysis of ten fungal genomes indicates that at least for the fungi clade, species significantly appear to gain complexity by gene duplication accompanied by the expansion of existing domain architectures via rearrangements. We show that paralogs gaining new domain architectures via duplication tend to adopt new functions compared to paralogs that preserve their domain architectures. We conclude that evolution of protein families through gene duplication and domain rearrangement is correlated with their functional properties. We suggest that in general, new functions are acquired via the integration of gene duplication and domain rearrangements rather than each process acting independently.
Affiliation(s)
- Inbar Cohen-Gihon
- Department of Human Genetics, Sackler Faculty of Medicine, Sackler Institute of Molecular Medicine, Tel Aviv University, Tel Aviv, Israel
|