1
|
Vanniya S P, Chandru J, Jeffrey JM, Rabinowitz T, Brownstein Z, Krishnamoorthy M, Avraham KB, Cheng L, Shomron N, Srisailapathy CRS. PNPT1, MYO15A, PTPRQ, and SLC12A2-associated genetic and phenotypic heterogeneity among hearing impaired assortative mating families in Southern India. Ann Hum Genet 2021; 86:1-13. [PMID: 34374074 DOI: 10.1111/ahg.12442] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/06/2021] [Revised: 07/15/2021] [Accepted: 07/19/2021] [Indexed: 12/24/2022]
Abstract
The study was conducted between 2018 and 2020. From a cohort of 113 hearing impaired (HI), five non-DFNB12 probands identified with heterozygous CDH23 variants were subjected to exome analysis. This resolved the etiology of hearing loss (HL) in four South Indian assortative mating families. Six variants, including three novel ones, were identified in four genes: PNPT1 p.(Ala46Gly) and p.(Asn540Ser), MYO15A p.(Leu1485Pro) and p.(Tyr1891Ter), PTPRQ p.(Gln1336Ter), and SLC12A2 p.(Pro988Ser). Compound heterozygous PNPT1 variants were associated with DFNB70 causing prelingual profound sensorineural hearing loss (SNHL), vestibular dysfunction, and unilateral progressive vision loss in one family. In the second family, MYO15A variants in the myosin motor domain, including a novel variant, causing DFNB3, were found to be associated with prelingual profound SNHL. A novel PTPRQ variant was associated with postlingual progressive sensorineural/mixed HL and vestibular dysfunction in the third family with DFNB84A. In the fourth family, the SLC12A2 novel variant was found to segregate with severe-to-profound HL causing DFNA78, across three generations. Our results suggest a high level of allelic, genotypic, and phenotypic heterogeneity of HL in these families. This study is the first to report the association of PNPT1, PTPRQ, and SLC12A2 variants with HL in the Indian population.
Affiliation(s)
- Paridhy Vanniya S
  - Department of Genetics, Dr. ALM PG Institute of Basic Medical Sciences, University of Madras, Chennai, India
- Jayasankaran Chandru
  - Department of Genetics, Dr. ALM PG Institute of Basic Medical Sciences, University of Madras, Chennai, India
  - LifeBytes India Pvt. Ltd., Bengaluru, India
- Justin Margret Jeffrey
  - Department of Genetics, Dr. ALM PG Institute of Basic Medical Sciences, University of Madras, Chennai, India
- Tom Rabinowitz
  - Department of Cell and Developmental Biology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Zippora Brownstein
  - Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Mathuravalli Krishnamoorthy
  - Department of Genetics, Dr. ALM PG Institute of Basic Medical Sciences, University of Madras, Chennai, India
- Karen B Avraham
  - Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Le Cheng
  - BGI Genomics, Shenzhen, P. R. China
- Noam Shomron
  - Department of Cell and Developmental Biology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- C R Srikumari Srisailapathy
  - Department of Genetics, Dr. ALM PG Institute of Basic Medical Sciences, University of Madras, Chennai, India
|
2
|
Song B, Li Z, Lin X, Wang J, Wang T, Fu X. Pretraining model for biological sequence data. Brief Funct Genomics 2021; 20:181-195. [PMID: 34050350 PMCID: PMC8194843 DOI: 10.1093/bfgp/elab025] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/23/2021] [Revised: 04/13/2021] [Accepted: 04/21/2021] [Indexed: 12/26/2022]
Abstract
With the development of high-throughput sequencing technology, biological sequence data reflecting life information have become increasingly accessible. Particularly against the background of the COVID-19 pandemic, biological sequence data play an important role in detecting diseases, analyzing mechanisms and discovering specific drugs. In recent years, pretraining models that emerged in natural language processing have attracted widespread attention in many research fields, both for decreasing training cost and for improving performance on downstream tasks. Pretraining models are used to embed biological sequences and extract features from large biological sequence corpora in order to understand biological sequence data comprehensively. In this survey, we provide a broad review of pretraining models for biological sequence data. We first introduce biological sequences and corresponding datasets, including brief descriptions and accessible links. Subsequently, we systematically summarize popular pretraining models for biological sequences in four categories: CNN, word2vec, LSTM and Transformer. Then, we present some applications of the proposed pretraining models to downstream tasks to explain the role of pretraining models. Next, we provide a novel pretraining scheme for protein sequences and a multitask benchmark for protein pretraining models. Finally, we discuss the challenges and future directions in pretraining models for biological sequences.
Affiliation(s)
- Xiangzheng Fu
  - Corresponding author: Xiangzheng Fu, College of Information Science and Engineering, Hunan University, Changsha, Hunan, China. Tel: 86-0731-88821907
|
3
|
Bano S, Fatima S, Ahamad S, Ansari S, Gupta D, Tabish M, Rehman SU, Jairajpuri MA. Identification and characterization of a novel isoform of heparin cofactor II in human liver. IUBMB Life 2020; 72:2180-2193. [PMID: 32827448 DOI: 10.1002/iub.2361] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/30/2020] [Revised: 07/05/2020] [Accepted: 07/07/2020] [Indexed: 11/07/2022]
Abstract
Heparin cofactor II (HCII) is predominantly expressed in the liver and inhibits thrombin in blood plasma to influence the blood coagulation cascade. Its deficiency is associated with arterial thrombosis. Its cleavage by neutrophil elastase produces a fragment that helps in neutrophil chemotaxis in the acute inflammatory response in humans. In the present study, we have identified a novel alternatively spliced transcript of the HCII gene in human liver. This novel transcript includes an additional novel region in continuation with exon 3 called exon 3b. Exon 3b acts like an alternate last exon, and hence its inclusion in the transcript due to alternative splicing removes exon 4 and encodes a different C-terminal region to give a novel protein, HCII-N. MD simulations and the three-dimensional structure of HCII-N showed a unique 51-amino-acid sequence at the C-terminus with a unique RCL-like structure. The HCII-N protein purified from bacterial culture migrated at a lower molecular weight (MW 55 kDa) than native HCII (MW 66 kDa). A fluorescence-based analysis revealed a more compact structure of HCII-N that was in a more hydrophilic environment. The HCII-N protein, however, showed no inhibitory activity against thrombin. Due to the large conformational variation observed in comparison with native HCII, HCII-N may have alternate protease specificity or a non-inhibitory role. Western blot of HCII purified from a large plasma volume showed the presence of a low-MW 59 kDa band with no thrombin activity. This study provides the first evidence of an alternatively spliced novel isoform of the HCII gene.
Affiliation(s)
- Shadabi Bano
  - Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Sana Fatima
  - Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Shahzaib Ahamad
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
- Shoyab Ansari
  - Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Dinesh Gupta
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
- Mohammad Tabish
  - Department of Biochemistry, Faculty of Life Sciences, Aligarh M. University, Aligarh, India
- Sayeed Ur Rehman
  - Department of Biochemistry, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi, India
|
4
|
Gaffney S, Ad O, Smaga S, Schepartz A, Townsend JP. GEM-NET: Lessons in Multi-Institution Teamwork Using Collaboration Software. ACS CENTRAL SCIENCE 2019; 5:1159-1169. [PMID: 31404233 PMCID: PMC6661976 DOI: 10.1021/acscentsci.9b00111] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/05/2019] [Indexed: 06/10/2023]
Abstract
The Center for Genetically Encoded Materials (C-GEM) is an NSF Phase I Center for Chemical Innovation that comprises six laboratories spread across three university campuses. Our success as a multi-institution research team demanded the development of a software infrastructure, GEM-NET, that allows all C-GEM members to work together seamlessly, as though everyone was in the same room. GEM-NET was designed to support both science and communication by integrating task management, scheduling, data sharing, and collaborative document and code editing with frictionless internal and public communication; it also maintains security over data and internal communications. In this Article, we document the design and implementation of GEM-NET: our objectives and motivating goals, how each component contributes to these goals, and the lessons learned throughout development. We also share open source code for several custom applications and document how GEM-NET can benefit users in multiple fields and teams that are both small and large. We anticipate that this knowledge will guide other multi-institution teams, regardless of discipline, to plan their software infrastructure and utilize it as swiftly and smoothly as possible.
Affiliation(s)
- Stephen G. Gaffney
  - Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut 06510, United States
- Omer Ad
  - Department of Chemistry, Yale University, New Haven, Connecticut 06510, United States
- Sarah Smaga
  - Department of Chemistry, Yale University, New Haven, Connecticut 06510, United States
- Alanna Schepartz
  - Department of Chemistry, Yale University, New Haven, Connecticut 06510, United States
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut 06510, United States
- Jeffrey P. Townsend
  - Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut 06510, United States
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06510, United States
|
5
|
Sandhu P, Akhter Y. Siderophore transport by MmpL5-MmpS5 protein complex in Mycobacterium tuberculosis. J Inorg Biochem 2017; 170:75-84. [DOI: 10.1016/j.jinorgbio.2017.02.013] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/04/2016] [Revised: 12/27/2016] [Accepted: 02/10/2017] [Indexed: 12/17/2022]
|
6
|
He S, Reif JC, Korzun V, Bothe R, Ebmeyer E, Jiang Y. Genome-wide mapping and prediction suggests presence of local epistasis in a vast elite winter wheat populations adapted to Central Europe. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2017; 130:635-647. [PMID: 27995275 DOI: 10.1007/s00122-016-2840-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 09/05/2016] [Accepted: 12/01/2016] [Indexed: 05/05/2023]
Abstract
Genome-wide association mapping as well as marker- and haplotype-based genome-wide selection unraveled a complex genetic architecture of grain yield with absence of large effect QTL and presence of local epistatic effects. The genetic architecture of grain yield determines to a large extent the optimum design of genomic-assisted wheat breeding programs. The main goal of our study was to examine the potential and limitations to dissect the genetic architecture of grain yield in wheat using a large experimental data set. Our study was based on phenotypic information and genomic data of 13,901 SNPs of a diverse set of 3816 elite wheat lines adapted to Central Europe. We applied genome-wide association mapping based on experimental and simulated data sets and performed marker- and haplotype-based genomic prediction. Computer simulations revealed for our mapping population a high power to detect QTL, even if they individually explained only 2.5% of the genetic variation. Despite this, we found no stable marker-trait associations when validating in independent subsets. A two-dimensional scan for marker-marker interactions indicated presence of local epistasis which was further supported by improved prediction abilities when shifting from marker- to haplotype-based genome-wide prediction approaches. We observed that marker effects estimated using genome-wide prediction approaches strongly varied across years albeit resulting in high prediction abilities. Thus, our results suggested that the prediction accuracy of genomic selection in wheat is mainly driven by relatedness rather than by exploiting knowledge of the genetic architecture.
Affiliation(s)
- Sang He
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Jochen C Reif
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
- Yong Jiang
  - Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany
|
7
|
Basharat Z, Zaib S, Yasmin A. Computational study of some amoebicidal phytochemicals against heat shock protein of Naegleria fowleri. GENE REPORTS 2017. [DOI: 10.1016/j.genrep.2016.09.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
|
8
|
Larsen PA, Lutz MW, Hunnicutt KE, Mihovilovic M, Saunders AM, Yoder AD, Roses AD. The Alu neurodegeneration hypothesis: A primate-specific mechanism for neuronal transcription noise, mitochondrial dysfunction, and manifestation of neurodegenerative disease. Alzheimers Dement 2017; 13:828-838. [PMID: 28242298 PMCID: PMC6647845 DOI: 10.1016/j.jalz.2017.01.017] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 09/30/2016] [Revised: 01/12/2017] [Accepted: 01/24/2017] [Indexed: 01/13/2023]
Abstract
It is hypothesized that retrotransposons have played a fundamental role in primate evolution and that enhanced neurologic retrotransposon activity in humans may underlie the origin of higher cognitive function. As a potential consequence of this enhanced activity, it is likely that neurons are susceptible to deleterious retrotransposon pathways that can disrupt mitochondrial function. An example is observed in the TOMM40 gene, encoding a β-barrel protein critical for mitochondrial preprotein transport. Primate-specific Alu retrotransposons have repeatedly inserted into TOMM40 introns, and at least one variant associated with late-onset Alzheimer’s disease originated from an Alu insertion event. We provide evidence of enriched Alu content in mitochondrial genes and postulate that Alus can disrupt mitochondrial populations in neurons, thereby setting the stage for progressive neurologic dysfunction. This Alu neurodegeneration hypothesis is compatible with decades of research and offers a plausible mechanism for the disruption of neuronal mitochondrial homeostasis, ultimately cascading into neurodegenerative disease.
Affiliation(s)
- Peter A Larsen
  - Department of Biology, Duke University, Durham, NC, USA
- Michael W Lutz
  - Department of Neurology, Duke University School of Medicine, Durham, NC, USA
- Mirta Mihovilovic
  - Department of Neurology, Duke University School of Medicine, Durham, NC, USA
- Ann M Saunders
  - Department of Neurology, Duke University School of Medicine, Durham, NC, USA
- Anne D Yoder
  - Department of Biology, Duke University, Durham, NC, USA
  - Duke Lemur Center, Duke University, Durham, NC, USA
- Allen D Roses
  - Department of Neurology, Duke University School of Medicine, Durham, NC, USA
  - Zinfandel Pharmaceuticals, Inc, Durham, NC, USA
|
9
|
Zhang YF, Ho M. Humanization of rabbit monoclonal antibodies via grafting combined Kabat/IMGT/Paratome complementarity-determining regions: Rationale and examples. MAbs 2017; 9:419-429. [PMID: 28165915 DOI: 10.1080/19420862.2017.1289302] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/23/2022]
Abstract
Rabbit monoclonal antibodies (RabMAbs) can recognize diverse epitopes, including those poorly immunogenic in mice and humans. However, there have been only a few reports on RabMAb humanization, an important antibody engineering step usually done before clinical applications are investigated. To pursue a general method for humanization of RabMAbs, we analyzed the complex structures of 5 RabMAbs with their antigens currently available in the Protein Data Bank, and identified antigen-contacting residues on the rabbit Fv within the 6 Angstrom distance to its antigen. We also analyzed the supporting residues for antigen-contacting residues on the same heavy or light chain. We identified "HV4" and "LV4" in rabbit Fvs, non-complementarity-determining region (CDR) loops that are structurally close to the antigen and located in framework 3 of the heavy chain and light chain, respectively. Based on our structural and sequence analysis, we designed a humanization strategy by grafting the combined Kabat/IMGT/Paratome CDRs, which cover most antigen-contacting residues, into a human germline framework sequence. Using this strategy, we humanized 4 RabMAbs that recognize poorly immunogenic epitopes in the cancer target mesothelin. Three of the 4 humanized rabbit Fvs have similar or improved functional binding affinity for mesothelin-expressing cells. Interestingly, 4 immunotoxins composed of the humanized scFvs fused to a clinically used fragment of Pseudomonas exotoxin (PE38) showed stronger cytotoxicity against tumor cells than the immunotoxins derived from their original rabbit scFvs. Our data suggest that grafting the combined Kabat/IMGT/Paratome CDRs to a stable human germline framework can be a general approach to humanize RabMAbs.
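The contact analysis described in this abstract (antibody residues lying within 6 Angstrom of the antigen) can be illustrated with a minimal sketch. The atom-record layout, the function name `contact_residues`, the use of atom-to-atom distance, and the toy coordinates are all assumptions made for illustration; the abstract does not state how the authors computed distances.

```python
import math

# Hypothetical atom record: (chain_id, residue_number, (x, y, z)), coordinates in Angstroms.
def contact_residues(antibody_atoms, antigen_atoms, cutoff=6.0):
    """Return sorted (chain, residue) pairs having any atom within `cutoff` of any antigen atom."""
    contacts = set()
    for chain, resnum, coord in antibody_atoms:
        if (chain, resnum) in contacts:
            continue  # residue already flagged by another of its atoms
        for _, _, antigen_coord in antigen_atoms:
            if math.dist(coord, antigen_coord) <= cutoff:
                contacts.add((chain, resnum))
                break
    return sorted(contacts)

# Toy coordinates (not taken from any PDB entry).
antibody = [
    ("H", 31, (0.0, 0.0, 0.0)),
    ("H", 52, (10.0, 0.0, 0.0)),
    ("L", 91, (0.0, 20.0, 0.0)),
]
antigen = [("A", 5, (4.0, 3.0, 0.0)), ("A", 6, (14.0, 0.0, 0.0))]

print(contact_residues(antibody, antigen))
```

On real structures the atom lists would come from a parsed PDB or mmCIF file, and a spatial index would replace the nested loop for speed.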
Affiliation(s)
- Yi-Fan Zhang
  - Laboratory of Molecular Biology, National Cancer Institute, Bethesda, MD, USA
- Mitchell Ho
  - Laboratory of Molecular Biology, National Cancer Institute, Bethesda, MD, USA
|
10
|
Dixit VA, Deshpande S. Advances in Computational Prediction of Regioselective and Isoform-Specific Drug Metabolism Catalyzed by CYP450s. ChemistrySelect 2016. [DOI: 10.1002/slct.201601051] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
Affiliation(s)
- Vaibhav A. Dixit
  - Department of Pharmaceutical Chemistry; School of Pharmacy and Technology Management (SPTM), Shri Vile Parle Kelavani Mandal's (SVKM's) Narsee Monjee Institute of Management Studies (NMIMS), Mukesh Patel Technology Park, Babulde, Bank of Tapi River; Mumbai-Agra Road Shirpur, Dist. Dhule−425405 India
- Shirish Deshpande
  - Department of Pharmaceutical Chemistry; School of Pharmacy and Technology Management (SPTM), Shri Vile Parle Kelavani Mandal's (SVKM's) Narsee Monjee Institute of Management Studies (NMIMS), Mukesh Patel Technology Park, Babulde, Bank of Tapi River; Mumbai-Agra Road Shirpur, Dist. Dhule−425405 India
|
11
|
Doumayrou J, Sheber M, Bonning BC, Miller WA. Role of Pea Enation Mosaic Virus Coat Protein in the Host Plant and Aphid Vector. Viruses 2016; 8:E312. [PMID: 27869713 PMCID: PMC5127026 DOI: 10.3390/v8110312] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 07/30/2016] [Revised: 10/14/2016] [Accepted: 11/02/2016] [Indexed: 11/16/2022]
Abstract
Understanding the molecular mechanisms involved in plant virus-vector interactions is essential for the development of effective control measures for aphid-vectored epidemic plant diseases. The coat proteins (CP) are the main component of the viral capsids, and they are implicated in practically every stage of the viral infection cycle. Pea enation mosaic virus 1 (PEMV1, Enamovirus, Luteoviridae) and Pea enation mosaic virus 2 (PEMV2, Umbravirus, Tombusviridae) are two RNA viruses in an obligate symbiosis causing the pea enation mosaic disease. Sixteen mutant viruses were generated with mutations in different domains of the CP to evaluate the role of specific amino acids in viral replication, virion assembly, long-distance movement in Pisum sativum, and aphid transmission. Twelve mutant viruses were unable to assemble but were able to replicate in inoculated leaves, move long-distance, and express the CP in newly infected leaves. Four mutant viruses produced virions, but three were not transmissible by the pea aphid, Acyrthosiphon pisum. Three-dimensional modeling of the PEMV CP, combined with biological assays for virion assembly and aphid transmission, allowed for a model of the assembly of PEMV coat protein subunits.
Affiliation(s)
- Juliette Doumayrou
  - Department of Plant Pathology & Microbiology, 351 Bessey Hall, Iowa State University, Ames, IA 50011, USA
- Melissa Sheber
  - Department of Plant Pathology & Microbiology, 351 Bessey Hall, Iowa State University, Ames, IA 50011, USA
- Bryony C Bonning
  - Department of Entomology, 339 Science II, Iowa State University, Ames, IA 50011, USA
- W Allen Miller
  - Department of Plant Pathology & Microbiology, 351 Bessey Hall, Iowa State University, Ames, IA 50011, USA
|
12
|
Demidov G, Simakova T, Vnuchkova J, Bragin A. A statistical approach to detection of copy number variations in PCR-enriched targeted sequencing data. BMC Bioinformatics 2016; 17:429. [PMID: 27770783 PMCID: PMC5075217 DOI: 10.1186/s12859-016-1272-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/01/2016] [Accepted: 09/21/2016] [Indexed: 12/24/2022]
Abstract
Background: Multiplex polymerase chain reaction (PCR) is a common enrichment technique for targeted massively parallel sequencing (MPS) protocols. MPS is widely used in biomedical research and clinical diagnostics as a fast and accurate tool for the detection of short genetic variations. However, identification of larger variations such as structural variants and copy number variations (CNVs) remains a challenge for targeted MPS. Some approaches and tools for structural variant detection have been proposed, but they have limitations and often require datasets of a certain type, size and expected number of amplicons affected by CNVs. In this paper, we describe a novel algorithm for high-resolution germinal CNV detection in PCR-enriched targeted sequencing data and present an accompanying tool. Results: We have developed a machine learning algorithm for the detection of large duplications and deletions in targeted sequencing data generated with a PCR-based enrichment step. We have performed verification studies and established the algorithm's sensitivity and specificity. We have compared the developed tool with other available methods applicable to the described data and found that it performs better. Conclusion: We showed, using a large cohort of samples, that our method has high specificity and sensitivity for high-resolution copy number detection in targeted sequencing data.
Affiliation(s)
- German Demidov
  - Parseq Lab, Birzhevaya, 16, Saint-Petersburg, 199053 Russia
  - Department of Mathematics and Information Technology in SPbAU RAS, Khlopina, 8/3, Saint-Petersburg, 194021 Russia
  - Genomic and Epigenomic Variation in Disease Group, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003 Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Anton Bragin
  - Parseq Lab, Birzhevaya, 16, Saint-Petersburg, 199053 Russia
|
13
|
Antczak M, Kasprzak M, Lukasiak P, Blazewicz J. Structural alignment of protein descriptors - a combinatorial model. BMC Bioinformatics 2016; 17:383. [PMID: 27639380 PMCID: PMC5027075 DOI: 10.1186/s12859-016-1237-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/11/2016] [Accepted: 09/02/2016] [Indexed: 11/17/2022]
Abstract
Background: Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction.
Results: In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency.
Conclusions: All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub (https://github.com/mantczak/descs-standalone).
Affiliation(s)
- Maciej Antczak
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
- Marta Kasprzak
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
- Piotr Lukasiak
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
- Jacek Blazewicz
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
|
14
|
A gene-signature progression approach to identifying candidate small-molecule cancer therapeutics with connectivity mapping. BMC Bioinformatics 2016; 17:211. [PMID: 27170106 PMCID: PMC4864913 DOI: 10.1186/s12859-016-1066-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/29/2015] [Accepted: 04/29/2016] [Indexed: 01/14/2023]
Abstract
Background Gene expression connectivity mapping has gained much popularity recently with a number of successful applications in biomedical research testifying its utility and promise. Previously methodological research in connectivity mapping mainly focused on two of the key components in the framework, namely, the reference gene expression profiles and the connectivity mapping algorithms. The other key component in this framework, the query gene signature, has been left to users to construct without much consensus on how this should be done, albeit it has been an issue most relevant to end users. As a key input to the connectivity mapping process, gene signature is crucially important in returning biologically meaningful and relevant results. This paper intends to formulate a standardized procedure for constructing high quality gene signatures from a user’s perspective. Results We describe a two-stage process for making quality gene signatures using gene expression data as initial inputs. First, a differential gene expression analysis comparing two distinct biological states; only the genes that have passed stringent statistical criteria are considered in the second stage of the process, which involves ranking genes based on statistical as well as biological significance. We introduce a “gene signature progression” method as a standard procedure in connectivity mapping. Starting from the highest ranked gene, we progressively determine the minimum length of the gene signature that allows connections to the reference profiles (drugs) being established with a preset target false discovery rate. We use a lung cancer dataset and a breast cancer dataset as two case studies to demonstrate how this standardized procedure works, and we show that highly relevant and interesting biological connections are returned. Of particular note is gefitinib, identified as among the candidate therapeutics in our lung cancer case study. 
Our gene signature was based on gene expression data from Taiwan female non-smoker lung cancer patients, while there is evidence from independent studies that gefitinib is highly effective in treating women, non-smoker or former light smoker, advanced non-small cell lung cancer patients of Asian origin. Conclusions In summary, we introduced a gene signature progression method into connectivity mapping, which enables a standardized procedure for constructing high quality gene signatures. This progression method is particularly useful when the number of differentially expressed genes identified is large, and when there is a need to prioritize them to be included in the query signature. The results from two case studies demonstrate that the approach we have developed is capable of obtaining pertinent candidate drugs with high precision. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1066-x) contains supplementary material, which is available to authorized users.
|
15
|
Investigating the structural impact of S311C mutation in DRD2 receptor by molecular dynamics & docking studies. Biochimie 2016; 123:52-64. [DOI: 10.1016/j.biochi.2016.01.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/19/2015] [Accepted: 01/16/2016] [Indexed: 01/11/2023]
|
16
|
Forero-Baena N, Sánchez-Lancheros D, Buitrago JC, Bustos V, Ramírez-Hernández MH. Identification of a nicotinamide/nicotinate mononucleotide adenylyltransferase in Giardia lamblia (GlNMNAT). Biochimie Open 2015; 1:61-69. [PMID: 29632831 PMCID: PMC5889475 DOI: 10.1016/j.biopen.2015.11.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/09/2015] [Revised: 10/18/2015] [Indexed: 01/19/2023]
Abstract
Giardia lamblia is an intestinal protozoan parasite that causes giardiasis, a disease of high prevalence in Latin America, Asia and Africa. Giardiasis leads to poor absorption of nutrients, severe electrolyte loss and growth retardation. In addition to its clinical importance, this parasite is of special biological interest due to its basal evolutionary position and simplified metabolism, which has not been studied thoroughly. One of the most important and conserved metabolic pathways is the biosynthesis of nicotinamide adenine dinucleotide (NAD). This molecule is widely known as a coenzyme in multiple redox reactions and as a substrate in cellular processes such as synthesis of Ca2+ mobilizing agents, DNA repair and gene expression regulation. There are two pathways for NAD biosynthesis, which converge at the step catalyzed by nicotinamide/nicotinate mononucleotide adenylyltransferase (NMNAT, EC 2.7.7.1/18). Using bioinformatics tools, we found two NMNAT sequences in Giardia lamblia (glnmnat-a and glnmnat-b). We first verified the identity of the sequences in silico. Subsequently, glnmnat-a was cloned into an expression vector. The recombinant protein (His-GlNMNAT) was purified by nickel-affinity binding and was used in direct in vitro enzyme assays assessed by C18-HPLC, verifying adenylyltransferase activity with both nicotinamide (NMN) and nicotinic acid (NAMN) mononucleotides. Optimal reaction pH and temperature were 7.3 and 26 °C. Michaelis-Menten kinetics were observed for NMN and ATP, but saturation was not accomplished with NAMN, implying low affinity yet detectable activity with this substrate. Double-reciprocal plots showed no cooperativity for this enzyme. This represents an advance in the study of NAD metabolism in Giardia spp.
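The kinetic analysis mentioned in this abstract can be illustrated with a short sketch. It generates ideal Michaelis-Menten rates and recovers the parameters from a double-reciprocal (Lineweaver-Burk) fit, where 1/v plotted against 1/[S] is a straight line for a non-cooperative enzyme. The function names and all numeric values are assumptions chosen for illustration; none of them come from the paper.

```python
from typing import List, Sequence, Tuple


def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(
    substrate: Sequence[float], velocity: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares line through (1/[S], 1/v); returns (Vmax, Km).

    The line is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, so the intercept
    gives 1/Vmax and the slope gives Km/Vmax.
    """
    xs: List[float] = [1.0 / s for s in substrate]
    ys: List[float] = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Illustrative, made-up values (arbitrary units), not measurements from the study.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
velocity = [michaelis_menten(s, vmax=10.0, km=2.0) for s in substrate]
vmax_fit, km_fit = lineweaver_burk_fit(substrate, velocity)
print(round(vmax_fit, 3), round(km_fit, 3))
```

With real, noisy data a direct nonlinear fit is usually preferred, because the reciprocal transform amplifies error at low substrate concentrations; the linear form is shown here only because the abstract refers to double-reciprocal plots.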
Key Words
- Enzyme activity
- Giardia lamblia
- NA, nicotinic acid
- NAAD, nicotinic acid adenine dinucleotide
- NAD metabolism
- NAD synthetase, EC. 6.3.5.1
- NAD, nicotinamide adenine dinucleotide
- NAM, nicotinamide
- NAMN, nicotinic acid mononucleotide
- NAMPRT, nicotinamide phosphoribosyltransferase
- NAPRT, nicotinic acid phosphoribosyltransferase
- NMN, nicotinamide mononucleotide
- NMNAT
- NMNAT, nicotinamide/nicotinic acid mononucleotide adenylyltransferase
- NR, nicotinamide riboside
- NRK, nicotinamide riboside kinase
- QA, quinolinic acid
- QAPRT, quinolinic acid phosphoribosyltransferase
Affiliation(s)
- Nicolás Forero-Baena
- Department of Chemistry, Universidad Nacional de Colombia, Bogotá Cundinamarca, Colombia
- Victor Bustos
- Department of Chemistry, Universidad Nacional de Colombia, Bogotá Cundinamarca, Colombia
|
17
|
Guardia GDA, Pires LF, Vêncio RZN, Malmegrim KCR, de Farias CRG. A Methodology for the Development of RESTful Semantic Web Services for Gene Expression Analysis. PLoS One 2015. [PMID: 26207740 PMCID: PMC4514690 DOI: 10.1371/journal.pone.0134011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023] Open
Abstract
Gene expression studies are generally performed through multi-step analysis processes, which require the integrated use of a number of analysis tools. In order to facilitate tool/data integration, an increasing number of analysis tools have been developed as or adapted to semantic web services. In recent years, some approaches have been defined for the development and semantic annotation of web services created from legacy software tools, but these approaches still present many limitations. In addition, to the best of our knowledge, no suitable approach has been defined for the functional genomics domain. Therefore, this paper aims at defining an integrated methodology for the implementation of RESTful semantic web services created from gene expression analysis tools and the semantic annotation of such services. We have applied our methodology to the development of a number of services to support the analysis of different types of gene expression data, including microarray and RNASeq. All developed services are publicly available in the Gene Expression Analysis Services (GEAS) Repository at http://dcm.ffclrp.usp.br/lssb/geas. Additionally, we have used a number of the developed services to create different integrated analysis scenarios to reproduce parts of two gene expression studies documented in the literature. The first study involves the analysis of one-color microarray data obtained from multiple sclerosis patients and healthy donors. The second study comprises the analysis of RNA-Seq data obtained from melanoma cells to investigate the role of the remodeller BRG1 in the proliferation and morphology of these cells. Our methodology provides concrete guidelines and technical details in order to facilitate the systematic development of semantic web services. Moreover, it encourages the development and reuse of these services for the creation of semantically integrated solutions for gene expression analysis.
Affiliation(s)
- Gabriela D. A. Guardia
- Department of Computer Science and Mathematics—Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP)—University of São Paulo (USP), Ribeirão Preto, Brazil
- Luís Ferreira Pires
- Faculty of Electrical Engineering, Mathematics and Computer Science—University of Twente, Enschede, the Netherlands
- Ricardo Z. N. Vêncio
- Department of Computer Science and Mathematics—Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP)—University of São Paulo (USP), Ribeirão Preto, Brazil
- Kelen C. R. Malmegrim
- Department of Clinical, Toxicological and Bromatological Analysis—Faculty of Pharmaceutical Sciences of Ribeirão Preto—University of São Paulo (USP), Ribeirão Preto, Brazil
- Cléver R. G. de Farias
- Department of Computer Science and Mathematics—Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP)—University of São Paulo (USP), Ribeirão Preto, Brazil
|
18
|
Rahman MS, Thomas P. Molecular characterization and hypoxia-induced upregulation of neuronal nitric oxide synthase in Atlantic croaker: Reversal by antioxidant and estrogen treatments. Comp Biochem Physiol A Mol Integr Physiol 2015; 185:91-106. [DOI: 10.1016/j.cbpa.2015.03.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/28/2014] [Revised: 03/20/2015] [Accepted: 03/25/2015] [Indexed: 01/27/2023]
|
19
|
Bousova K, Jirku M, Bumba L, Bednarova L, Sulc M, Franek M, Vyklicky L, Vondrasek J, Teisinger J. PIP2 and PIP3 interact with N-terminus region of TRPM4 channel. Biophys Chem 2015; 205:24-32. [PMID: 26071843 DOI: 10.1016/j.bpc.2015.06.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/14/2015] [Revised: 06/04/2015] [Accepted: 06/06/2015] [Indexed: 01/07/2023]
Abstract
The transient receptor potential melastatin 4 (TRPM4) is a calcium-activated non-selective ion channel broadly expressed in a variety of tissues. The receptor has been identified as a crucial modulator of numerous calcium-dependent mechanisms in the cell, such as immune response, cardiac conduction, neurotransmission and insulin secretion. It is known that phosphoinositide lipids (PIPs) play a unique role in the regulation of TRP channel function. However, the molecular mechanism of this process is still unknown. We characterized the binding site of PIP2 and its structural analogue PIP3 in the E733-W772 proximal region of the TRPM4 N-terminus via biophysical and molecular modeling methods. The specific positions R755 and R767 in this domain were identified as being important for interactions with PIP2/PIP3 ligands. Their mutations caused a partial loss of PIP2/PIP3 binding specificity. The interaction of PIP3 with TRPM4 channels has never been described before. These findings provide new insight into the ligand binding domains of the TRPM4 channel.
Affiliation(s)
- Kristyna Bousova
- 2nd Faculty of Medicine, Charles University in Prague, 15006 Prague, Czech Republic; Institute of Physiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic
- Michaela Jirku
- Institute of Physiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic; Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Ladislav Bumba
- Institute of Microbiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic
- Lucie Bednarova
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, 16610 Prague, Czech Republic
- Miroslav Sulc
- Institute of Microbiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic
- Miloslav Franek
- 3rd Faculty of Medicine, Charles University in Prague, 10000 Prague, Czech Republic
- Ladislav Vyklicky
- Institute of Physiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic
- Jiri Vondrasek
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, 16610 Prague, Czech Republic
- Jan Teisinger
- Institute of Physiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic
|
20
|
Cuervo-Soto LI, Valdés-García G, Batista-García R, del Rayo Sánchez-Carbente M, Balcázar-López E, Lira-Ruan V, Pastor N, Folch-Mallol JL. Identification of a novel carbohydrate esterase from Bjerkandera adusta: structural and function predictions through bioinformatics analysis and molecular modeling. Proteins 2015; 83:533-46. [PMID: 25586442 DOI: 10.1002/prot.24760] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/19/2014] [Revised: 12/19/2014] [Accepted: 12/31/2014] [Indexed: 11/07/2022]
Abstract
A new gene from Bjerkandera adusta strain UAMH 8258 encoding a carbohydrate esterase (designated BaCesI) was isolated and expressed in Pichia pastoris. The gene had an open reading frame of 1410 bp encoding a polypeptide of 470 amino acid residues, the first 18 serving as a secretion signal peptide. Homology and phylogenetic analyses showed that BaCesI belongs to carbohydrate esterase family 4. Three-dimensional modeling of the protein and normal mode analysis revealed a breathing mode of the active site that could be relevant for esterase activity. Furthermore, the overall negative electrostatic potential of this enzyme suggests that it degrades neutral substrates and will not act on negatively charged substrates such as peptidoglycan or p-nitrophenol derivatives. The enzyme shows a specific activity of 1.118 U mg⁻¹ protein on 2-naphthyl acetate. No activity was detected on p-nitrophenol derivatives, as predicted from the electrostatic potential data. The deacetylation activity of the recombinant BaCesI was confirmed by measuring the release of acetic acid from several substrates, including oat xylan, shrimp shell chitin, N-acetylglucosamine, and natural substrates such as sugar cane bagasse and grass. This makes the protein of considerable interest for the production of biofuels from lignocellulosic materials and for the production of chitosan from chitin.
Affiliation(s)
- Laura I Cuervo-Soto
- Department of Biochemistry and Molecular Biology, Facultad de Ciencias, Universidad Autónoma del Estado de Morelos. Av. Universidad 1001, Col., Chamilpa, Cuernavaca, Morelos México; Department of Environmental Biotechnology, Centro de Investigación en Biotecnología, Universidad Autónoma del Estado de Morelos. Av. Universidad 1001, Col., Chamilpa, Cuernavaca, Morelos México
|
21
|
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
|
22
|
Ma L, Li A, Zou D, Xu X, Xia L, Yu J, Bajic VB, Zhang Z. LncRNAWiki: harnessing community knowledge in collaborative curation of human long non-coding RNAs. Nucleic Acids Res 2014; 43:D187-92. [PMID: 25399417 PMCID: PMC4383965 DOI: 10.1093/nar/gku1167] [Citation(s) in RCA: 103] [Impact Index Per Article: 10.3] [Indexed: 01/23/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) perform a diversity of functions in numerous important biological processes and are implicated in many human diseases. In this report we present lncRNAWiki (http://lncrna.big.ac.cn), a wiki-based platform that is open-content and publicly editable and aimed at community-based curation and collection of information on human lncRNAs. Current related databases depend primarily on curation by experts, making it laborious to annotate the exponentially accumulating information on lncRNAs, which inevitably requires collective efforts in community-based curation. Unlike existing databases, lncRNAWiki features comprehensive integration of information on human lncRNAs obtained from multiple different resources and allows not only existing lncRNAs to be edited, updated and curated by different users but also the addition of newly identified lncRNAs by any user. It harnesses community collective knowledge in collecting, editing and annotating human lncRNAs and rewards community-curated efforts by providing explicit authorship based on quantified contributions. LncRNAWiki relies on the underlying knowledge of the scientific community for collective and collaborative curation of human lncRNAs and thus has the potential to serve as an up-to-date and comprehensive knowledgebase for human lncRNAs.
Affiliation(s)
- Lina Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Ang Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Dong Zou
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Xingjian Xu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Lin Xia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Vladimir B Bajic
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
|
23
|
Panigrahi R, Whelan J, Vrielink A. Exploring ligand recognition, selectivity and dynamics of TPR domains of chloroplast Toc64 and mitochondria Om64 from Arabidopsis thaliana. J Mol Recognit 2014; 27:402-14. [DOI: 10.1002/jmr.2360] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 09/15/2013] [Revised: 01/14/2014] [Accepted: 01/15/2014] [Indexed: 01/31/2023]
Affiliation(s)
- Rashmi Panigrahi
- School of Chemistry and Biochemistry; University of Western Australia; 35 Stirling Highway Crawley WA 6009 Australia
- James Whelan
- ARC Centre of Excellence in Plant Energy Biology; University of Western Australia; 35 Stirling Highway Crawley WA 6009 Australia
- Department of Botany, School of Life Science; La Trobe University; Bundoora Victoria 3086 Australia
- Alice Vrielink
- School of Chemistry and Biochemistry; University of Western Australia; 35 Stirling Highway Crawley WA 6009 Australia
|
24
|
Du P, Xu C. Predicting multisite protein subcellular locations: progress and challenges. Expert Rev Proteomics 2014; 10:227-37. [DOI: 10.1586/epr.13.16] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 11/08/2022]
|
25
|
Zhang Z, Sang J, Ma L, Wu G, Wu H, Huang D, Zou D, Liu S, Li A, Hao L, Tian M, Xu C, Wang X, Wu J, Xiao J, Dai L, Chen LL, Hu S, Yu J. RiceWiki: a wiki-based database for community curation of rice genes. Nucleic Acids Res 2013; 42:D1222-8. [PMID: 24136999 PMCID: PMC3964990 DOI: 10.1093/nar/gkt926] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023] Open
Abstract
Rice is the most important staple food for a large part of the world’s human population and also a key model organism for biological studies of crops as well as other related plants. Here we present RiceWiki (http://ricewiki.big.ac.cn), a wiki-based, publicly editable and open-content platform for community curation of rice genes. Most existing related biological databases are based on expert curation; with the exponentially exploding volume of rice knowledge and other relevant data, however, expert curation becomes increasingly laborious and time-consuming to keep knowledge up-to-date, accurate and comprehensive, struggling with the flood of data and requiring a large number of people getting involved in rice knowledge curation. Unlike extant relevant databases, RiceWiki features harnessing collective intelligence in community curation of rice genes, quantifying users' contributions in each curated gene and providing explicit authorship for each contributor in any given gene, with the aim to exploit the full potential of the scientific community for rice knowledge curation. Based on community curation, RiceWiki bears the potential to make it possible to build a rice encyclopedia by and for the scientific community that harnesses community intelligence for collaborative knowledge curation, covers all aspects of biological knowledge and keeps evolving with novel knowledge.
Affiliation(s)
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China, Research Institute of Subtropical Forestry, Chinese Academy of Forestry, Fuyang, Zhejiang 311400, China, School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China and College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
|
26
|
Walsh P, Carroll J, Sleator RD. Accelerating in silico research with workflows: a lesson in Simplicity. Comput Biol Med 2013; 43:2028-35. [PMID: 24290918 DOI: 10.1016/j.compbiomed.2013.09.011] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 12/03/2012] [Revised: 09/09/2013] [Accepted: 09/12/2013] [Indexed: 10/26/2022]
Abstract
Bioinformatics is the application of computer science and related disciplines to the field of molecular biology. While there are currently several web based and desktop tools available for biologists to perform routine bioinformatics tasks, these tools often require users to manually and repeatedly co-ordinate multiple applications before reaching a result. In an effort to reduce time and error, workflow tools have been developed to automate these tasks. However, many of these tools require expert knowledge of the techniques and supporting databases which more often than not lies outside the scope of most biologists. Herein, we describe the development of sequence information management platform (Simplicity), a workflow-based bioinformatics management tool, which allows non-bioinformaticians to rapidly annotate large amounts of DNA and protein sequence data.
Affiliation(s)
- Paul Walsh
- nSilico LifeSciences, Ltd., Melbourne Building, Bishopstown, Cork, Ireland
|
27
|
Di Silvio E, Di Matteo A, Malatesta F, Travaglini-Allocatelli C. Recognition and binding of apocytochrome c to P. aeruginosa CcmI, a component of cytochrome c maturation machinery. Biochimica et Biophysica Acta-Proteins and Proteomics 2013; 1834:1554-61. [PMID: 23648553 DOI: 10.1016/j.bbapap.2013.04.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/12/2013] [Revised: 04/23/2013] [Accepted: 04/25/2013] [Indexed: 01/13/2023]
Abstract
The biogenesis of c-type cytochromes (Cytc) is a process that in Gram-negative bacteria demands the coordinated action of different periplasmic proteins (CcmA-I), whose specific roles are still being investigated. Activities of Ccm proteins span from the chaperoning of heme b in the periplasm to the specific reduction of oxidized apocytochrome (apoCyt) cysteine residues and to chaperoning and recognition of the unfolded apoCyt before covalent attachment of the heme to the cysteine thiols can occur. We present here the functional characterization of the periplasmic domain of CcmI from the pathogen Pseudomonas aeruginosa (Pa-CcmI*). Pa-CcmI* is composed of a TPR domain and a peculiar C-terminal domain. Pa-CcmI* fulfills both the ability to recognize and bind to P. aeruginosa apo-cytochrome c551 (Pa-apoCyt) and a chaperoning activity towards unfolded proteins, as it prevents citrate synthase aggregation in a concentration-dependent manner. Equilibrium and kinetic experiments with Pa-CcmI*, or its isolated domains, with peptides mimicking portions of Pa-apoCyt sequence allow us to quantify the molecular details of the interaction between Pa-apoCyt and Pa-CcmI*. Binding experiments show that the interaction occurs at the level of the TPR domain and that the recognition is mediated mainly by the C-terminal sequence of Pa-apoCyt. The affinity of Pa-CcmI* to full-length Pa-apoCyt or to its C-terminal sequence is in the range expected for a component of a multi-protein complex, whose task is to receive the apoCyt and to deliver it to other components of the apoCyt:heme b ligation protein machinery.
Affiliation(s)
- Eva Di Silvio
- Department of Biochemical Sciences, Università di Roma La Sapienza, Roma, Italy
|
28
|
Singh S, Singh A. Emerging Web Tools and Their Applications in Bioinformatics. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Bioinformatics is an emerging area of interest for many researchers and scientists, with applications in many areas. Its most important application is the study of genes, but research has also started in the emerging areas of network security and threats using bioinformatics. At present, we are highly dependent on the Internet. The Web has enabled people from different backgrounds to work together from distant locations. To meet the needs of these users, many Web-based tools have been developed, and many others are being developed. In this chapter, the area of bioinformatics is introduced along with its applications, the Web, the Web-based tools that have been developed, and a case study of one such tool.
|
29
|
Dai L, Xu C, Tian M, Sang J, Zou D, Li A, Liu G, Chen F, Wu J, Xiao J, Wang X, Yu J, Zhang Z. Community intelligence in knowledge curation: an application to managing scientific nomenclature. PLoS One 2013; 8:e56961. [PMID: 23451119 PMCID: PMC3581571 DOI: 10.1371/journal.pone.0056961] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/19/2012] [Accepted: 01/16/2013] [Indexed: 11/22/2022] Open
Abstract
Harnessing community intelligence in knowledge curation bears significant promise in dealing with communication and education in the flood of scientific knowledge. As knowledge is accumulated at ever-faster rates, scientific nomenclature, a particular kind of knowledge, is concurrently generated in all kinds of fields. Since nomenclature is a system of terms used to name things in a particular discipline, accurate translation of scientific nomenclature into different languages is of critical importance, not only for communications and collaborations with English-speaking people, but also for knowledge dissemination among people in the non-English-speaking world, particularly young students and researchers. However, accuracy and standardization are lacking when scientific nomenclature is translated from English to other languages, especially languages that do not belong to the same language family as English. To address this issue, here we propose for the first time the application of community intelligence in scientific nomenclature management, namely, harnessing collective intelligence for translation of scientific nomenclature from English to other languages. As community intelligence applied to knowledge curation is primarily aided by wikis and Chinese is the native language of about one-fifth of the world’s population, we put the proposed application into practice by developing a wiki-based English-to-Chinese Scientific Nomenclature Dictionary (ESND; http://esnd.big.ac.cn). ESND is a wiki-based, publicly editable and open-content platform, exploiting the whole power of the scientific community in collectively and collaboratively managing scientific nomenclature. Based on community curation, ESND is capable of achieving accurate, standard, and comprehensive scientific nomenclature, demonstrating a valuable application of community intelligence in knowledge curation.
Affiliation(s)
- Lin Dai
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Chao Xu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Ming Tian
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Jian Sang
- Research Institute of Subtropical Forestry, Chinese Academy of Forestry, Fuyang, Zhejiang, China
- Dong Zou
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Ang Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Guocheng Liu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Fei Chen
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Jiayan Wu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Jingfa Xiao
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Xumin Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
|
30
|
Jimenez-Lopez JC, Gachomo EW, Sharma S, Kotchoni SO. Genome sequencing and next-generation sequence data analysis: A comprehensive compilation of bioinformatics tools and databases. American Journal of Molecular Biology 2013. [DOI: 10.4236/ajmb.2013.32016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
|
31
|
Powers ML, McDermott AG, Shaner NC, Haddock SHD. Expression and characterization of the calcium-activated photoprotein from the ctenophore Bathocyroe fosteri: insights into light-sensitive photoproteins. Biochem Biophys Res Commun 2012; 431:360-6. [PMID: 23262181 DOI: 10.1016/j.bbrc.2012.12.026] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 11/30/2012] [Accepted: 12/06/2012] [Indexed: 11/19/2022]
Abstract
Calcium-binding photoproteins have been discovered in a variety of luminous marine organisms [1]. Recent interest in photoproteins from the phylum Ctenophora has stemmed from cloning and expression of several photoproteins from this group [2-5]. Additional characterization has revealed unique biochemical properties found only in ctenophore photoproteins, such as inactivation by light. Here we report the cloning, expression, and characterization of the photoprotein responsible for luminescence in the deep-sea ctenophore Bathocyroe fosteri. This animal was of particular interest due to the unique broad color spectrum observed in live specimens [6]. Full-length sequences were identified by BLAST searches of known photoprotein sequences against Bathocyroe transcripts obtained from 454 sequencing. Recombinantly expressed Bathocyroe photoprotein (BfosPP) displayed an optimal coelenterazine-loading pH of 8.5, and produced calcium-triggered luminescence with peak wavelengths closely matching the 493 nm peak observed in the spectrum of live B. fosteri specimens. Luminescence from recombinant BfosPP was inactivated most efficiently by UV and blue light. Primary structure alignment of BfosPP with other characterized photoproteins showed very strong sequence similarity to other ctenophore photoproteins and conservation of EF-hand motifs. Both alignment and structural prediction data provide more insight into the formation of the coelenterazine-binding domain and the probable mechanism of photoinactivation.
Affiliation(s)
- Meghan L Powers
- Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039, USA.
|
32
|
Abstract
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.
Affiliation(s)
- Lin Dai
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No.7 Beitucheng West Road, Building G, Chaoyang District, Beijing 100029, China
|
33
|
Almeida JS, Iriabho EE, Gorrepati VL, Wilkinson SR, Grüneberg A, Robbins DE, Hackney JR. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser. J Pathol Inform 2012; 3:25. [PMID: 22934238 PMCID: PMC3424663 DOI: 10.4103/2153-3539.98813] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 04/06/2012] [Accepted: 06/06/2012] [Indexed: 11/19/2022] Open
Abstract
Background: Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. Materials and Methods: ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Results: Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH's popular ImageJ application. Conclusions: The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. 
This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures to machines where access to sensitive data is controlled, that is, without local “download and installation”.
Affiliation(s)
- Jonas S Almeida
- Division of Informatics, Department of Pathology, University of Alabama at Birmingham, Alabama, USA
|
34
|
Kong L, Wang J, Zhao S, Gu X, Luo J, Gao G. ABrowse: a customizable next-generation genome browser framework. BMC Bioinformatics 2012; 13:2. [PMID: 22222089 PMCID: PMC3265404 DOI: 10.1186/1471-2105-13-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 08/10/2011] [Accepted: 01/05/2012] [Indexed: 11/14/2022] Open
Abstract
Background: With the rapid growth of genome sequencing projects, the genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Results: Based on next-generation web technologies, we have developed a general-purpose genome browser framework, ABrowse, which provides an interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users a highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers can specify SQL statements according to the database schema, and customized pages for detailed display of annotation entries can be easily plugged in. For developers, new drawing strategies can be integrated into ABrowse for new types of annotation data. In addition, a standard web service is provided for remote data retrieval, offering an underlying machine-oriented programming interface for open data access. Conclusions: The ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under the GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/.
To demonstrate all the features of ABrowse, a live demo for the Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.
Affiliation(s)
- Lei Kong
- College of Life Sciences, State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, Peking University, Beijing, 100871, P.R. China
|
35
|
Sreenivasaiah PK, Kim DH. Current trends and new challenges of databases and web applications for systems driven biological research. Front Physiol 2010; 1:147. [PMID: 21423387 PMCID: PMC3059952 DOI: 10.3389/fphys.2010.00147] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 08/31/2010] [Accepted: 10/18/2010] [Indexed: 12/17/2022] Open
Abstract
The dynamic and rapidly evolving nature of systems-driven research imposes special requirements on the technology, approach, design and architecture of computational infrastructure, including databases and Web applications. Several solutions have been proposed to meet these expectations, and novel methods have been developed to address the persistent problems of data integration. It is important for researchers to understand the different technologies and approaches. Having become familiar with the pros and cons of existing technologies, researchers can exploit their capabilities to the fullest for integrating data. In this review we discuss the architecture, design and key technologies underlying some of the prominent databases and Web applications. We describe their roles in the integration of biological data and examine some of the emerging design concepts and computational technologies that are likely to play a key role in the future of systems-driven biomedical research.
Affiliation(s)
- Pradeep Kumar Sreenivasaiah
- Systems Biology Research Center and College of Life Science, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- Do Han Kim
- Systems Biology Research Center and College of Life Science, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
|
36
|
Do LH, Esteves FF, Karten HJ, Bier E. Booly: a new data integration platform. BMC Bioinformatics 2010; 11:513. [PMID: 20942966 PMCID: PMC2970612 DOI: 10.1186/1471-2105-11-513] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/15/2010] [Accepted: 10/13/2010] [Indexed: 01/22/2023] Open
Abstract
Background: Data integration is an escalating problem in bioinformatics. We have developed a web tool and warehousing system, Booly, that features a simple yet flexible data model coupled with the ability to perform powerful comparative analysis, including the use of Boolean logic to merge datasets together, and an integrated aliasing system to decipher differing names of the same gene or protein. Furthermore, Booly features a collaborative sharing system and a public repository so that users can retrieve new datasets while contributors can easily disseminate new content. Results: We illustrate the uses of Booly with several examples including: the versatile creation of homebrew datasets, the integration of heterogeneous data to identify genes useful for comparing avian and mammalian brain architecture, and generation of a list of Food and Drug Administration (FDA) approved drugs with possible alternative disease targets. Conclusions: The Booly paradigm for data storage and analysis should facilitate integration between disparate biological and medical fields and result in novel discoveries that can then be validated experimentally. Booly can be accessed at http://booly.ucsd.edu.
Affiliation(s)
- Long H Do
- Section of Cell and Developmental Biology, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0349, USA.
|
37
|
Zhang Z, Townsend JP. The filamentous fungal gene expression database (FFGED). Fungal Genet Biol 2009; 47:199-204. [PMID: 20025988 DOI: 10.1016/j.fgb.2009.12.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 08/18/2009] [Revised: 11/18/2009] [Accepted: 12/09/2009] [Indexed: 01/25/2023]
Abstract
Filamentous fungal gene expression assays provide essential information for understanding systemic cellular regulation. To aid research on fungal gene expression, we constructed a novel, comprehensive, free database, the filamentous fungal gene expression database (FFGED), available at http://bioinfo.townsend.yale.edu. FFGED features user-friendly management of gene expression data, which are assorted into experimental metadata, experimental design, raw data, normalized details, and analysis results. Data may be submitted in the process of an experiment, and any user can submit multiple experiments, thus classifying the FFGED as an "active experiment" database. Most importantly, FFGED functions as a collective and collaborative platform, by connecting each experiment with similar related experiments made public by other users, maximizing data sharing among different users, and correlating diverse gene expression levels under multiple experimental designs within different experiments. A clear and efficient web interface is provided with enhancement by AJAX (Asynchronous JavaScript and XML) and through a collection of tools to effectively facilitate data submission, sharing, retrieval and visualization.
Affiliation(s)
- Zhang Zhang
- Department of Ecology and Evolutionary Biology, Yale University, 165 Prospect Street, New Haven, CT 06520, USA.
|
38
|
Pathway projector: web-based zoomable pathway browser using KEGG atlas and Google Maps API. PLoS One 2009; 4:e7710. [PMID: 19907644 PMCID: PMC2770834 DOI: 10.1371/journal.pone.0007710] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Received: 07/28/2009] [Accepted: 10/11/2009] [Indexed: 11/19/2022] Open
Abstract
Background: Biochemical pathways provide an essential context for understanding comprehensive experimental data and the systematic workings of a cell. Therefore, the availability of online pathway browsers will facilitate post-genomic research, just as genome browsers have contributed to genomics. Many pathway maps have been provided online as part of public pathway databases. Most of these maps, however, function as the gateway interface to a specific database, and the comprehensiveness of their represented entities, data mapping capabilities, and user interfaces are not always sufficient for generic usage. Methodology/Principal Findings: We have identified five central requirements for a pathway browser: (1) availability of large integrated maps showing genes, enzymes, and metabolites; (2) comprehensive search features and data access; (3) data mapping for transcriptomic, proteomic, and metabolomic experiments, as well as the ability to edit and annotate pathway maps; (4) easy exchange of pathway data; and (5) intuitive user experience without the requirement for installation and regular maintenance. According to these requirements, we have evaluated existing pathway databases and tools and implemented a web-based pathway browser named Pathway Projector as a solution. Conclusions/Significance: Pathway Projector provides integrated pathway maps that are based upon the KEGG Atlas, with the addition of nodes for genes and enzymes, and is implemented as a scalable, zoomable map utilizing the Google Maps API. Users can search pathway-related data using keywords, molecular weights, nucleotide sequences, and amino acid sequences, or as possible routes between compounds. In addition, experimental data from transcriptomic, proteomic, and metabolomic analyses can be readily mapped. Pathway Projector is freely available for academic users at http://www.g-language.org/PathwayProjector/.
|
39
|
A web server for interactive and zoomable Chaos Game Representation images. Source Code for Biology and Medicine 2009; 4:6. [PMID: 19761591 PMCID: PMC2753581 DOI: 10.1186/1751-0473-4-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/05/2009] [Accepted: 09/17/2009] [Indexed: 11/10/2022]
Abstract
Chaos Game Representation (CGR) is a generalized scale-independent Markov transition table, which is useful for the visualization and comparative study of genomic signatures, or for the study of characteristic sequence motifs. However, in order to fully utilize the scale-independent properties of CGR, it should be accessible through a scale-independent user interface instead of static images. Here we describe a web server and Perl library for generating zoomable CGR images using the Google Maps API; the images are also easily searchable for specific motifs. The web server is freely accessible at http://www.g-language.org/wiki/cgr/, and the Perl library as well as the source code is distributed with the G-language Genome Analysis Environment under the GNU General Public License.
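The scale-independent property mentioned in this abstract can be illustrated with a minimal sketch of the standard CGR construction. This is not the Perl library described in the paper; the function names `cgr_points` and `cgr_grid` are hypothetical, and the corner assignment for A, C, G and T is one common convention assumed here. Each nucleotide moves the current point halfway toward its corner, so binning the points on a 2^k by 2^k grid yields k-mer counts at any chosen resolution.

```python
from collections import Counter

# Assumed corner layout (one common convention, not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return one CGR coordinate per recognised nucleotide in seq."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Count CGR points per cell of a 2^k x 2^k grid (one cell per k-mer)."""
    size = 2 ** k
    counts = Counter()
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        row = min(int(y * size), size - 1)
        col = min(int(x * size), size - 1)
        counts[(row, col)] += 1
    return counts
```

Calling `cgr_grid(sequence, 1)` gives single-nucleotide counts, while larger `k` values subdivide the same picture into finer cells, which is why a zoomable interface suits this representation.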
|
40
|
Antezana E, Kuiper M, Mironov V. Biological knowledge management: the emerging role of the Semantic Web technologies. Brief Bioinform 2009; 10:392-407. [PMID: 19457869 DOI: 10.1093/bib/bbp024] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022] Open
Abstract
New knowledge is produced at a continuously increasing speed, and the list of papers, databases and other knowledge sources that a researcher in the life sciences needs to cope with is actually turning into a problem rather than an asset. The adequate management of knowledge is therefore becoming fundamentally important for life scientists, especially if they work with approaches that thoroughly depend on knowledge integration, such as systems biology. Several initiatives to organize biological knowledge sources into a readily exploitable resourceome are presently being carried out. Ontologies and Semantic Web technologies revolutionize these efforts. Here, we review the benefits, trends, current possibilities, and the potential this holds for the biosciences.
Affiliation(s)
- Erick Antezana
- Department of Biology at the Norwegian University of Science and Technology
|
41
|
Glez-Peña D, Gómez-López G, Pisano DG, Fdez-Riverola F. WhichGenes: a web-based tool for gathering, building, storing and exporting gene sets with application in gene set enrichment analysis. Nucleic Acids Res 2009; 37:W329-34. [PMID: 19406925 PMCID: PMC2703947 DOI: 10.1093/nar/gkp263] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022] Open
Abstract
WhichGenes is a web-based interactive gene set building tool offering a very simple interface to extract always-updated gene lists from multiple databases and unstructured biological data sources. While the user can specify new gene sets of interest by following a simple four-step wizard, the tool is able to run several queries in parallel. Every time a new set is generated, it is automatically added to the private gene-set cart and the user is notified by an e-mail containing a direct link to the new set stored on the server. WhichGenes provides functionalities to edit, delete and rename existing sets as well as the capability of generating new ones by combining existing sets (intersection, union and difference operators). Users can export their sets, configuring the output format and selecting among multiple gene identifiers. In addition to the user-friendly environment, WhichGenes allows programmers to access its functionalities programmatically through a Representational State Transfer web service. The WhichGenes front-end is freely available at http://www.whichgenes.org/, and the WhichGenes API is accessible at http://www.whichgenes.org/api/.
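The three set-combination operators named in this abstract can be sketched with Python's built-in sets. This is only an illustration of the operators, not the WhichGenes implementation or its API; the function `combine_gene_sets` and the gene symbols are hypothetical.

```python
def combine_gene_sets(a, b, operator):
    """Combine two gene sets using intersection, union or difference."""
    operations = {
        "intersection": lambda: a & b,  # genes present in both sets
        "union": lambda: a | b,         # genes present in either set
        "difference": lambda: a - b,    # genes in a that are absent from b
    }
    if operator not in operations:
        raise ValueError(f"unknown operator: {operator}")
    return operations[operator]()


# Illustrative gene symbols only; these are not results from WhichGenes.
set_a = {"TP53", "BRCA1", "EGFR", "MYC"}
set_b = {"EGFR", "MYC", "KRAS"}
shared = combine_gene_sets(set_a, set_b, "intersection")
```

Note that difference is not symmetric: swapping the arguments returns the genes unique to the second set instead.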
Affiliation(s)
- Daniel Glez-Peña
- Higher Technical School of Computer Engineering, University of Vigo, Ourense, Spain
|
42
|
Arakawa K, Tamaki S, Kono N, Kido N, Ikegami K, Ogawa R, Tomita M. Genome Projector: zoomable genome map with multiple views. BMC Bioinformatics 2009; 10:31. [PMID: 19166610 PMCID: PMC2636772 DOI: 10.1186/1471-2105-10-31] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 11/20/2008] [Accepted: 01/23/2009] [Indexed: 01/24/2023] Open
Abstract
Background: Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional nature of biological information, software interfaces for database browsing should provide an intuitive interface that allows for rapid migration across different views and scales. The Zoomable User Interface (ZUI) and tabbed browsing have proven successful for this purpose in other areas, especially to navigate the vast information in the World Wide Web. Results: This paper presents Genome Projector, a Web-based gateway for genomics information with a zoomable user interface using Google Maps API, equipped with four seamlessly accessible and searchable views: a circular genome map, a traditional genome map, a biochemical pathways map, and a DNA walk map. The Web application for 320 bacterial genomes is available at http://www.g-language.org/GenomeProjector/. All data and software including the source code, documentation, and development API are freely available under the GNU General Public License. Zoomable maps can be easily created from any image file using the development API, and an online data mapping service for Genome Projector is also available at our Web site. Conclusion: Genome Projector is an intuitive Web application for browsing genomics information, implemented with a zoomable user interface and tabbed browsing utilising Google Maps API and Asynchronous JavaScript and XML (AJAX) technology.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, 252-8520, Japan.
|