Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 5762 (from Reference Citation Analysis)

Article PDFs: 912
Cited by ≥ 1: 3588

Searched Name: Sequence Analysis, Protein

Journal Articles

Rank	Citation Analysis	Article Type	Number of Years	Citation(s) in RCA
1	Tamura K, Dudley J, Nei M, Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 2007;24:1596-9. [PMID: 17488738 DOI: 10.1093/molbev/msm092] [Citation(s) in RCA: 19658] [Impact Index Per Article: 1092.1] [Indexed: 12/22/2022] Abstract: We announce the release of the fourth version of MEGA software, which expands on the existing facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique facility to generate captions, written in figure legend format, in order to provide natural language descriptions of the models and methods used in the analyses. This facility aims to promote a better understanding of the underlying assumptions used in analyses, and of the results generated. Another new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances between all pairs of sequences simultaneously, with and without incorporating rate variation among sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and multi-user supports, and it is also available to run in a Linux desktop environment (via the Wine compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The current version of MEGA is available free of charge at (http://www.megasoftware.net). MeSH Headings: Databases, Genetic; Evolution, Molecular; Internet; Phylogeny; Sequence Alignment/methods; Sequence Analysis, DNA; Sequence Analysis, Protein; Software.	Research Support, Non-U.S. Gov't	18	19658
2	Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005;21:3674-6. [PMID: 16081474 DOI: 10.1093/bioinformatics/bti610] [Citation(s) in RCA: 8340] [Impact Index Per Article: 417.0] [Indexed: 11/13/2022] Abstract: SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joints in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation. MeSH Headings: Algorithms; Computational Biology/methods; Computer Graphics; Database Management Systems; Databases, Protein; Gene Expression Profiling/methods; Genome; Genomics; Information Storage and Retrieval/methods; Internet; Oligonucleotide Array Sequence Analysis; Sequence Alignment; Sequence Analysis, Protein; Software; User-Computer Interface.	Research Support, Non-U.S. Gov't	20	8340
3	Kumar S, Tamura K, Nei M. MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform 2005;5:150-63. [PMID: 15260895 DOI: 10.1093/bib/5.2.150] [Citation(s) in RCA: 8079] [Impact Index Per Article: 404.0] [Indexed: 02/06/2023] Abstract: With its theoretical basis firmly established in molecular evolutionary and population genetics, the comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of the DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of the phylogenetic trees, estimation of evolutionary distances and testing evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA. MeSH Headings: Databases, Genetic; Evolution, Molecular; Internet; Phylogeny; Sequence Alignment/methods; Sequence Analysis, DNA; Sequence Analysis, Protein; Software.	Research Support, U.S. Gov't, P.H.S.	20	8079
4	Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ. Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics 2009;25:1189-91. [PMID: 19151095 PMCID: PMC2672624 DOI: 10.1093/bioinformatics/btp033] [Citation(s) in RCA: 7181] [Impact Index Per Article: 448.8] [Received: 11/24/2008] [Revised: 11/24/2008] [Accepted: 01/08/2009] [Indexed: 12/11/2022] Abstract: Jalview Version 2 is a system for interactive WYSIWYG editing, analysis and annotation of multiple sequence alignments. Core features include keyboard and mouse-based editing, multiple views and alignment overviews, and linked structure display with Jmol. Jalview 2 is available in two forms: a lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment, secondary structure prediction and the retrieval of alignments, sequences, annotation and structures from public databases and any DAS 1.53 compliant sequence or annotation server. AVAILABILITY: The Jalview 2 Desktop application and JalviewLite applet are made freely available under the GPL, and can be downloaded from www.jalview.org. MeSH Headings: Computational Biology/methods; Databases, Protein; Proteins/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein; Software. Grants: BB/G022682/1 Biotechnology and Biological Sciences Research Council; BBSB16542 Biotechnology and Biological Sciences Research Council.	research-article	16	7181
5	Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 2009;37:W202-8. [PMID: 19458158 PMCID: PMC2703892 DOI: 10.1093/nar/gkp335] [Citation(s) in RCA: 6817] [Impact Index Per Article: 426.1] [Received: 02/10/2009] [Revised: 04/10/2009] [Accepted: 04/21/2009] [Indexed: 11/13/2022] Abstract: The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms--MAST, FIMO and GLAM2SCAN--allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm TOMTOM. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and TOMTOM), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net. MeSH Headings: Algorithms; Binding Sites; Databases, Genetic; Internet; Regulatory Elements, Transcriptional; Sequence Analysis, DNA; Sequence Analysis, Protein; Software; Transcription Factors/metabolism. Grants: P41 RR008605 NCRR NIH HHS; R01 RR021692 NCRR NIH HHS; P41 RR08605 NCRR NIH HHS.	Research Support, N.I.H., Extramural	16	6817
6	The UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res 2020;47:D506-D515. [PMID: 30395287 PMCID: PMC6323992 DOI: 10.1093/nar/gky1049] [Citation(s) in RCA: 5175] [Impact Index Per Article: 1035.0] [Received: 09/14/2018] [Accepted: 10/18/2018] [Indexed: 12/14/2022] Abstract: The UniProt Knowledgebase is a collection of sequences and annotations for over 120 million proteins across all branches of life. Detailed annotations extracted from the literature by expert curators have been collected for over half a million of these proteins. These annotations are supplemented by annotations provided by rule based automated systems, and those imported from other resources. In this article we describe significant updates that we have made over the last 2 years to the resource. We have greatly expanded the number of Reference Proteomes that we provide and in particular we have focussed on improving the number of viral Reference Proteomes. The UniProt website has been augmented with new data visualizations for the subcellular localization of proteins as well as their structure and interactions. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/. MeSH Headings: Data Curation; Databases, Protein; Knowledge Bases; Molecular Sequence Annotation; Proteome/genetics; Sequence Analysis, Protein. Grants: R01 GM080646 NIGMS NIH HHS; U41 HG002273 NHGRI NIH HHS; UL1 TR001409 NCATS NIH HHS; U01 GM120953 NIGMS NIH HHS; P20 GM103446 NIGMS NIH HHS; U41 HG007822 NHGRI NIH HHS; BB/M011674/1 Biotechnology and Biological Sciences Research Council; RG/13/5/30112 British Heart Foundation.	Research Support, Non-U.S. Gov't	5	5175
7	Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, Buchner A, Lai T, Steppi S, Jobb G, Förster W, Brettske I, Gerber S, Ginhart AW, Gross O, Grumann S, Hermann S, Jost R, König A, Liss T, Lüssmann R, May M, Nonhoff B, Reichel B, Strehlow R, Stamatakis A, Stuckmann N, Vilbig A, Lenke M, Ludwig T, Bode A, Schleifer KH. ARB: a software environment for sequence data. Nucleic Acids Res 2004;32:1363-71. [PMID: 14985472 PMCID: PMC390282 DOI: 10.1093/nar/gkh293] [Citation(s) in RCA: 4658] [Impact Index Per Article: 221.8] [Received: 01/13/2004] [Revised: 01/28/2004] [Accepted: 01/28/2004] [Indexed: 11/12/2022] Abstract: The ARB (from Latin arbor, tree) project was initiated almost 10 years ago. The ARB program package comprises a variety of directly interacting software tools for sequence database maintenance and analysis which are controlled by a common graphical user interface. Although it was initially designed for ribosomal RNA data, it can be used for any nucleic and amino acid sequence data as well. A central database contains processed (aligned) primary structure data. Any additional descriptive data can be stored in database fields assigned to the individual sequences or linked via local or worldwide networks. A phylogenetic tree visualized in the main window can be used for data access and visualization. The package comprises additional tools for data import and export, sequence alignment, primary and secondary structure editing, profile and filter calculation, phylogenetic analyses, specific hybridization probe design and evaluation and other components for data analysis. Currently, the package is used by numerous working groups worldwide. MeSH Headings: Data Display; Databases, Genetic; Internet; Phylogeny; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, Protein; Sequence Analysis, RNA; Software; Time Factors.	research-article	21	4658
8	Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer ELL, Tate J, Punta M. Pfam: the protein families database. Nucleic Acids Res 2014;42:D222-30. [PMID: 24288371 PMCID: PMC3965110 DOI: 10.1093/nar/gkt1223] [Citation(s) in RCA: 4487] [Impact Index Per Article: 407.9] [Received: 09/26/2013] [Revised: 11/04/2013] [Accepted: 11/05/2013] [Indexed: 01/17/2023] Abstract: Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures. MeSH Headings: Databases, Protein; Internet; Intrinsically Disordered Proteins/chemistry; Protein Conformation; Proteins/chemistry; Proteins/classification; Proteins/genetics; Proteome/chemistry; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, Protein. Grants: BB/F010435/1 Biotechnology and Biological Sciences Research Council; Howard Hughes Medical Institute.	research-article	11	4487
9	Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res 2003;31:3381-5. [PMID: 12824332 PMCID: PMC168927 DOI: 10.1093/nar/gkg520] [Citation(s) in RCA: 4224] [Impact Index Per Article: 192.0] [Indexed: 11/13/2022] Abstract: SWISS-MODEL (http://swissmodel.expasy.org) is a server for automated comparative modeling of three-dimensional (3D) protein structures. It pioneered the field of automated modeling starting in 1993 and is the most widely-used free web-based automated modeling facility today. In 2002 the server computed 120 000 user requests for 3D protein models. SWISS-MODEL provides several levels of user interaction through its World Wide Web interface: in the 'first approach mode' only an amino acid sequence of a protein is submitted to build a 3D model. Template selection, alignment and model building are done completely automated by the server. In the 'alignment mode', the modeling process is based on a user-defined target-template alignment. Complex modeling tasks can be handled with the 'project mode' using DeepView (Swiss-PdbViewer), an integrated sequence-to-structure workbench. All models are sent back via email with a detailed modeling report. WhatCheck analyses and ANOLEA evaluations are provided optionally. The reliability of SWISS-MODEL is continuously evaluated in the EVA-CM project. The SWISS-MODEL server is under constant development to improve the successful implementation of expert knowledge into an easy-to-use server. MeSH Headings: Computer Graphics; Internet; Models, Molecular; Proteins/chemistry; Sequence Alignment; Sequence Analysis, Protein; Software; Structural Homology, Protein; User-Computer Interface.	research-article	22	4224
10	The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res 2014;43:D204-12. [PMID: 25348405 PMCID: PMC4384041 DOI: 10.1093/nar/gku989] [Citation(s) in RCA: 3642] [Impact Index Per Article: 331.1] [Indexed: 11/23/2022] Abstract: UniProt is an important collection of protein sequences and their annotations, which has doubled in size to 80 million sequences during the past year. This growth in sequences has prompted an extension of UniProt accession number space from 6 to 10 characters. An increasing fraction of new sequences are identical to a sequence that already exists in the database with the majority of sequences coming from genome sequencing projects. We have created a new proteome identifier that uniquely identifies a particular assembly of a species and strain or subspecies to help users track the provenance of sequences. We present a new website that has been designed using a user-experience design process. We have introduced an annotation score for all entries in UniProt to represent the relative amount of knowledge known about each protein. These scores will be helpful in identifying which proteins are the best characterized and most informative for comparative analysis. All UniProt data is provided freely and is available on the web at http://www.uniprot.org/. MeSH Headings: Databases, Protein; Molecular Sequence Annotation; Proteome; Sequence Analysis, Protein. Grants: R01 GM080646 NIGMS NIH HHS; RG/13/5/30112 British Heart Foundation; G08LM010720 NLM NIH HHS; U41 HG002273 NHGRI NIH HHS; U41HG007822 NHGRI NIH HHS; U41HG006104 NHGRI NIH HHS; Wellcome Trust; U41 HG007822 NHGRI NIH HHS; U41HG002273 NHGRI NIH HHS; P20GM103446 NIGMS NIH HHS; G-1307 Parkinson's UK; R01GM080646 NIGMS NIH HHS; P20 GM103446 NIGMS NIH HHS.	Research Support, U.S. Gov't, Non-P.H.S.	11	3642
11	Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, Tivey ARN, Potter SC, Finn RD, Lopez R. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res 2019;47:W636-W641. [PMID: 30976793 PMCID: PMC6602479 DOI: 10.1093/nar/gkz268] [Citation(s) in RCA: 3042] [Impact Index Per Article: 507.0] [Received: 01/31/2019] [Revised: 03/22/2019] [Accepted: 04/03/2019] [Indexed: 02/07/2023] Abstract: The EMBL-EBI provides free access to popular bioinformatics sequence analysis applications as well as to a full-featured text search engine with powerful cross-referencing and data retrieval capabilities. Access to these services is provided via user-friendly web interfaces and via established RESTful and SOAP Web Services APIs (https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/EMBL-EBI+Web+Services+APIs+-+Data+Retrieval). Both systems have been developed with the same core principles that allow them to integrate an ever-increasing volume of biological data, making them an integral part of many popular data resources provided at the EMBL-EBI. Here, we describe the latest improvements made to the frameworks which enhance the interconnectivity between public EMBL-EBI resources and ultimately enhance biological data discoverability, accessibility, interoperability and reusability. MeSH Headings: Databases, Nucleic Acid; Databases, Protein; Sequence Alignment; Sequence Analysis; Sequence Analysis, Protein; Software.	research-article	6	3042
12	Bailey TL, Johnson J, Grant CE, Noble WS. The MEME Suite. Nucleic Acids Res 2015;43:W39-49. [PMID: 25953851 PMCID: PMC4489269 DOI: 10.1093/nar/gkv416] [Citation(s) in RCA: 2709] [Impact Index Per Article: 270.9] [Received: 01/31/2015] [Revised: 04/10/2015] [Accepted: 04/18/2015] [Indexed: 11/13/2022] Abstract: The MEME Suite is a powerful, integrated set of web-based tools for studying sequence motifs in proteins, DNA and RNA. Such motifs encode many biological functions, and their detection and characterization is important in the study of molecular interactions in the cell, including the regulation of gene expression. Since the previous description of the MEME Suite in the 2009 Nucleic Acids Research Web Server Issue, we have added six new tools. Here we describe the capabilities of all the tools within the suite, give advice on their best use and provide several case studies to illustrate how to combine the results of various MEME Suite tools for successful motif-based analyses. The MEME Suite is freely available for academic use at http://meme-suite.org, and source code is also available for download and local installation. MeSH Headings: Amino Acid Motifs; DNA/chemistry; Internet; Nucleotide Motifs; Plasmodium falciparum; Protein Interaction Domains and Motifs; Protein Sorting Signals; Protozoan Proteins/chemistry; Receptors, Calcitriol/chemistry; Sequence Analysis, DNA; Sequence Analysis, Protein; Sequence Analysis, RNA; Software. Grants: R01 GM103544 NIGMS NIH HHS.	Research Support, N.I.H., Extramural	10	2709
13	Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R. InterProScan: protein domains identifier. Nucleic Acids Res 2005;33:W116-20. [PMID: 15980438 PMCID: PMC1160203 DOI: 10.1093/nar/gki442] [Citation(s) in RCA: 2158] [Impact Index Per Article: 107.9] [Indexed: 11/26/2022] Abstract: InterProScan [E. M. Zdobnov and R. Apweiler (2001) Bioinformatics, 17, 847–848] is a tool that combines different protein signature recognition methods from the InterPro [N. J. Mulder, R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bradley, P. Bork, P. Bucher, L. Cerutti et al. (2005) Nucleic Acids Res., 33, D201–D205] consortium member databases into one resource. At the time of writing there are 10 distinct publicly available databases in the application. Protein as well as DNA sequences can be analysed. A web-based version is accessible for academic and commercial organizations from the EBI (). In addition, a standalone Perl version and a SOAP Web Service [J. Snell, D. Tidwell and P. Kulchenko (2001) Programming Web Services with SOAP, 1st edn. O'Reilly Publishers, Sebastopol, CA, ] are also available to the users. Various output formats are supported and include text tables, XML documents, as well as various graphs to help interpret the results. MeSH Headings: Databases, Protein; Internet; Protein Structure, Tertiary; Sequence Analysis, DNA; Sequence Analysis, Protein; Software; User-Computer Interface.	Research Support, Non-U.S. Gov't	20	2158
14	Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res 2006;34:W609-12. [PMID: 16845082 PMCID: PMC1538804 DOI: 10.1093/nar/gkl315] [Citation(s) in RCA: 2157] [Impact Index Per Article: 113.5] [Indexed: 11/22/2022] Abstract: PAL2NAL is a web server that constructs a multiple codon alignment from the corresponding aligned protein sequences. Such codon alignments can be used to evaluate the type and rate of nucleotide substitutions in coding DNA for a wide range of evolutionary analyses, such as the identification of levels of selective constraint acting on genes, or to perform DNA-based phylogenetic studies. The server takes a protein sequence alignment and the corresponding DNA sequences as input. In contrast to other existing applications, this server is able to construct codon alignments even if the input DNA sequence has mismatches with the input protein sequence, or contains untranslated regions and polyA tails. The server can also deal with frame shifts and inframe stop codons in the input models, and is thus suitable for the analysis of pseudogenes. Another distinct feature is that the user can specify a subregion of the input alignment in order to specifically analyze functional domains or exons of interest. The PAL2NAL server is available at . MeSH Headings: Codon/chemistry; Exons; Internet; Sequence Alignment/methods; Sequence Analysis, DNA; Sequence Analysis, Protein; Software; User-Computer Interface.	Research Support, Non-U.S. Gov't	19	2157
15	Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 2011;27:1164-5. [PMID: 21335321 PMCID: PMC5215816 DOI: 10.1093/bioinformatics/btr088] [Citation(s) in RCA: 2078] [Impact Index Per Article: 148.4] [Indexed: 11/13/2022] Abstract: We have implemented a high-performance computing (HPC) version of ProtTest that can be executed in parallel in multicore desktops and clusters. This version, called ProtTest 3, includes new features and extended capabilities. AVAILABILITY: ProtTest 3 source code and binaries are freely available under GNU license for download from http://darwin.uvigo.es/software/prottest3, linked to a Mercurial repository at Bitbucket (https://bitbucket.org/). CONTACT: dposada@uvigo.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. MeSH Headings: Evolution, Molecular; Models, Statistical; Phylogeny; Sequence Alignment/methods; Sequence Analysis, Protein; Software. Grants: 203161 European Research Council.	Evaluation Study	14	2078
16	Fredriksson R, Lagerström MC, Lundin LG, Schiöth HB. The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol Pharmacol 2003;63:1256-72. [PMID: 12761335 DOI: 10.1124/mol.63.6.1256] [Citation(s) in RCA: 2074] [Impact Index Per Article: 94.3] [Indexed: 12/29/2022] Abstract: The superfamily of G-protein-coupled receptors (GPCRs) is very diverse in structure and function and its members are among the most pursued targets for drug development. We identified more than 800 human GPCR sequences and simultaneously analyzed 342 unique functional nonolfactory human GPCR sequences with phylogenetic analyses. Our results show, with high bootstrap support, five main families, named glutamate, rhodopsin, adhesion, frizzled/taste2, and secretin, forming the GRAFS classification system. The rhodopsin family is the largest and forms four main groups with 13 sub-branches. Positions of the GPCRs in chromosomal paralogons regions indicate the importance of tetraploidizations or local gene duplication events for their creation. We also searched for "fingerprint" motifs using Hidden Markov Models delineating the putative inter-relationship of the GRAFS families. We show several common structural features indicating that the human GPCRs in the GRAFS families share a common ancestor. This study represents the first overall map of the GPCRs in a single mammalian genome. Our novel approach of analyzing such large and diverse sequence sets may be useful for studies on GPCRs in other genomes and divergent protein families. MeSH Headings: Chromosome Mapping; GTP-Binding Proteins/classification; GTP-Binding Proteins/genetics; Genome, Human; Humans; Membrane Glycoproteins; Membrane Proteins/classification; Membrane Proteins/genetics; Phylogeny; Platelet Glycoprotein GPIb-IX Complex; Receptors, Cell Surface/classification; Receptors, Cell Surface/genetics; Receptors, G-Protein-Coupled; Receptors, Gastrointestinal Hormone/classification; Receptors, Gastrointestinal Hormone/genetics; Receptors, Glutamate/classification; Receptors, Glutamate/genetics; Rhodopsin/classification; Rhodopsin/genetics; Sequence Analysis, Protein; Sequence Homology, Amino Acid.		22	2074
17	Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer ELL, Bateman A. The Pfam protein families database. Nucleic Acids Res 2008;36:D281-8. [PMID: 18039703 PMCID: PMC2238907 DOI: 10.1093/nar/gkm960] [Citation(s) in RCA: 1709] [Impact Index Per Article: 100.5] [Received: 09/15/2007] [Revised: 10/10/2007] [Accepted: 10/16/2007] [Indexed: 12/14/2022] Abstract: Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/), as well as from mirror sites in France (http://pfam.jouy.inra.fr/) and South Korea (http://pfam.ccbb.re.kr/). MeSH Headings: Animals; Databases, Protein; Genomics; Internet; Protein Structure, Tertiary; Proteins/classification; Proteins/genetics; Sequence Alignment; Sequence Analysis, Protein; User-Computer Interface. Grants: 087656 Wellcome Trust; BB/F010435/1 Biotechnology and Biological Sciences Research Council; G0100305 Medical Research Council.	research-article	17	1709
18	Zimmermann L, Stephens A, Nam SZ, Rau D, Kübler J, Lozajic M, Gabler F, Söding J, Lupas AN, Alva V. A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core. J Mol Biol 2017;430:2237-2243. [PMID: 29258817 DOI: 10.1016/j.jmb.2017.12.007] [Citation(s) in RCA: 1694] [Impact Index Per Article: 211.8] [Received: 11/04/2017] [Revised: 12/10/2017] [Accepted: 12/11/2017] [Indexed: 12/12/2022] Abstract: The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) is a free, one-stop web service for protein bioinformatic analysis. It currently offers 34 interconnected external and in-house tools, whose functionality covers sequence similarity searching, alignment construction, detection of sequence features, structure prediction, and sequence classification. This breadth has made the Toolkit an important resource for experimental biology and for teaching bioinformatic inquiry. Recently, we replaced the first version of the Toolkit, which was released in 2005 and had served around 2.5 million queries, with an entirely new version, focusing on improved features for the comprehensive analysis of proteins, as well as on promoting teaching. For instance, our popular remote homology detection server, HHpred, now allows pairwise comparison of two sequences or alignments and offers additional profile HMMs for several model organisms and domain databases. Here, we introduce the new version of our Toolkit and its application to the analysis of proteins.
Key Words: HHblits; HHpred; MPI Bioinformatics Toolkit; remote homology detection; structure prediction. MESH Headings: Amino Acid Sequence; Animals; Computational Biology/methods; Humans; Internet; Models, Molecular; Protein Conformation; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Sequence Alignment/methods; Sequence Analysis, Protein; Sequence Homology, Amino Acid; Software.	Research Support, Non-U.S. Gov't	8	1694
19	Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics 2005;4:1265-72. [PMID: 15958392 DOI: 10.1074/mcp.m500061-mcp200] [Citation(s) in RCA: 1646] [Impact Index Per Article: 82.3] [Indexed: 11/06/2022] Abstract: To estimate absolute protein contents in complex mixtures, we previously defined a protein abundance index (PAI) as the number of observed peptides divided by the number of observable peptides per protein (Rappsilber, J., Ryder, U., Lamond, A. I., and Mann, M. (2002) Large-scale proteomic analysis of the human spliceosome. Genome. Res. 12, 1231-1245). Here we report that PAI values obtained at different concentrations of serum albumin show a linear relationship with the logarithm of protein concentration in LC-MS/MS experiments. This was also the case for 46 proteins in a mouse whole cell lysate. For absolute quantitation, PAI was converted to exponentially modified PAI (emPAI), equal to 10^PAI minus one, which is proportional to protein content in a protein mixture. For the 46 proteins in the whole lysate, the deviation percentages of the emPAI-based abundances from the actual values were within 63% on average, similar or better than determination of abundance by protein staining. emPAI was applied to comprehensive protein expression analysis and to a comparison study between gene and protein expression in a human cancer cell line, HCT116. The values of emPAI are easily calculated and add important quantitation information to proteomic experiments; therefore we suggest that they should be reported in large scale proteomic identification projects.
MESH Headings: Animals; Cell Extracts; Cell Line, Tumor; Chromatography, Liquid; Clone Cells; Colonic Neoplasms/pathology; Gene Expression; Humans; Mass Spectrometry; Mice; Neuroblastoma/pathology; Peptides/chemistry; Proteins/analysis; Proteins/genetics; Proteomics; Sequence Analysis, Protein; Serum Albumin/analysis.	Research Support, Non-U.S. Gov't	20	1646
20	Cheng RP, Gellman SH, DeGrado WF. beta-Peptides: from structure to function. Chem Rev 2001;101:3219-32. [PMID: 11710070 DOI: 10.1021/cr000045i] [Citation(s) in RCA: 1593] [Impact Index Per Article: 66.4] [Indexed: 11/29/2022] MESH Headings: Amino Acid Sequence; Amino Acids/chemistry; Anti-Infective Agents/metabolism; Carbohydrate Sequence; Cholesterol/metabolism; Circular Dichroism; Lipid Metabolism; Lipids/antagonists & inhibitors; Magnetic Resonance Spectroscopy; Models, Molecular; Molecular Sequence Data; Molecular Structure; Peptides/chemistry; Peptides, Cyclic/chemistry; Protein Conformation; Protein Folding; Sequence Analysis, Protein. Grants: GM-19664 NIGMS NIH HHS; GM-56414 NIGMS NIH HHS.	Review	24	1593
21	Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res 2004;32:W327-31. [PMID: 15215404 PMCID: PMC441592 DOI: 10.1093/nar/gkh454] [Citation(s) in RCA: 1518] [Impact Index Per Article: 72.3] [Received: 02/19/2004] [Revised: 04/21/2004] [Accepted: 04/21/2004] [Indexed: 11/14/2022] Abstract: We describe the Conserved Domain Search service (CD-Search), a web-based tool for the detection of structural and functional domains in protein sequences. CD-Search uses BLAST(R) heuristics to provide a fast, interactive service, and searches a comprehensive collection of domain models. Search results are displayed as domain architecture cartoons and pairwise alignments between the query and domain-model consensus sequences. Search results may be visualized in further detail by embedding the query sequence into multiple alignment displays and by mapping onto three-dimensional molecular graphic displays of known structures within the domain family. CD-Search can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. MESH Headings: Amino Acid Sequence; Computer Graphics; Consensus Sequence; Internet; Protein Structure, Tertiary; Proteins/chemistry; Sequence Alignment; Sequence Analysis, Protein; Software; Time Factors; User-Computer Interface.	research-article	21	1518
22	Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJA, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res 2009;37:D211-5. [PMID: 18940856 PMCID: PMC2686546 DOI: 10.1093/nar/gkn785] [Citation(s) in RCA: 1511] [Impact Index Per Article: 94.4] [Received: 09/15/2008] [Revised: 10/08/2008] [Accepted: 10/09/2008] [Indexed: 11/13/2022] Abstract: The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total approximately 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein-protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries.
InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/). MESH Headings: Databases, Protein; Proteins/chemistry; Proteins/classification; Sequence Analysis, Protein; Systems Integration. Grants: 087656 Wellcome Trust; BB/F010435/1 Biotechnology and Biological Sciences Research Council; GM081084 NIGMS NIH HHS; BB/F010508/1 Biotechnology and Biological Sciences Research Council.	Research Support, N.I.H., Extramural	16	1511
23	Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res 2004;32:W526-31. [PMID: 15215442 PMCID: PMC441606 DOI: 10.1093/nar/gkh468] [Citation(s) in RCA: 1505] [Impact Index Per Article: 71.7] [Indexed: 11/14/2022] Abstract: The Robetta server (http://robetta.bakerlab.org) provides automated tools for protein structure prediction and analysis. For structure prediction, sequences submitted to the server are parsed into putative domains and structural models are generated using either comparative modeling or de novo structure prediction methods. If a confident match to a protein of known structure is found using BLAST, PSI-BLAST, FFAS03 or 3D-Jury, it is used as a template for comparative modeling. If no match is found, structure predictions are made using the de novo Rosetta fragment insertion method. Experimental nuclear magnetic resonance (NMR) constraints data can also be submitted with a query sequence for RosettaNMR de novo structure determination. Other current capabilities include the prediction of the effects of mutations on protein-protein interactions using computational interface alanine scanning. The Rosetta protein design and protein-protein docking methodologies will soon be available through the server as well. MESH Headings: Alanine/genetics; Internet; Models, Molecular; Nuclear Magnetic Resonance, Biomolecular; Protein Conformation; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Reproducibility of Results; Sequence Analysis, Protein; Software; Structural Homology, Protein; User-Computer Interface. Grants: P50 GM064655 NIGMS NIH HHS.	Journal Article	21	1505
24	Huerta-Cepas J, Szklarczyk D, Forslund K, Cook H, Heller D, Walter MC, Rattei T, Mende DR, Sunagawa S, Kuhn M, Jensen LJ, von Mering C, Bork P. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res 2016;44:D286-93. [PMID: 26582926 PMCID: PMC4702882 DOI: 10.1093/nar/gkv1248] [Citation(s) in RCA: 1467] [Impact Index Per Article: 163.0] [Received: 09/14/2015] [Revised: 10/30/2015] [Accepted: 11/02/2015] [Indexed: 01/19/2023] Abstract: eggNOG is a public resource that provides Orthologous Groups (OGs) of proteins at different taxonomic levels, each with integrated and summarized functional annotations. Developments since the latest public release include changes to the algorithm for creating OGs across taxonomic levels, making nested groups hierarchically consistent. This allows for a better propagation of functional terms across nested OGs and led to the novel annotation of 95 890 previously uncharacterized OGs, increasing overall annotation coverage from 67% to 72%. The functional annotations of OGs have been expanded to also provide Gene Ontology terms, KEGG pathways and SMART/Pfam domains for each group. Moreover, eggNOG now provides pairwise orthology relationships within OGs based on analysis of phylogenetic trees. We have also incorporated a framework for quickly mapping novel sequences to OGs based on precomputed HMM profiles. Finally, eggNOG version 4.5 incorporates a novel data set spanning 2605 viral OGs, covering 5228 proteins from 352 viral proteomes. All data are accessible for bulk downloading, as a web-service, and through a completely redesigned web interface.
The new access points provide faster searches and a number of new browsing and visualization capabilities, facilitating the needs of both experts and less experienced users. eggNOG v4.5 is available at http://eggnog.embl.de. MESH Headings: Algorithms; Archaeal Proteins/chemistry; Bacterial Proteins/chemistry; Databases, Protein; Eukaryota; Molecular Sequence Annotation; Phylogeny; Proteome/chemistry; Sequence Analysis, Protein; Viral Proteins/chemistry.	research-article	9	1467
25	Potter SC, Luciani A, Eddy SR, Park Y, Lopez R, Finn RD. HMMER web server: 2018 update. Nucleic Acids Res 2018;46:W200-W204. [PMID: 29905871 PMCID: PMC6030962 DOI: 10.1093/nar/gky448] [Citation(s) in RCA: 1415] [Impact Index Per Article: 202.1] [Received: 03/13/2018] [Revised: 04/18/2018] [Accepted: 06/12/2018] [Indexed: 12/25/2022] Abstract: The HMMER webserver [http://www.ebi.ac.uk/Tools/hmmer] is a free-to-use service which provides fast searches against widely used sequence databases and profile hidden Markov model (HMM) libraries using the HMMER software suite (http://hmmer.org). The results of a sequence search may be summarized in a number of ways, allowing users to view and filter the significant hits by domain architecture or taxonomy. For large scale usage, we provide an application programmatic interface (API) which has been expanded in scope, such that all result presentations are available via both HTML and API. Furthermore, we have refactored our JavaScript visualization library to provide standalone components for different result representations. These consume the aforementioned API and can be integrated into third-party websites. The range of databases that can be searched against has been expanded, adding four sequence datasets (12 in total) and one profile HMM library (6 in total). To help users explore the biological context of their results, and to discover new data resources, search results are now supplemented with cross references to other EMBL-EBI databases. MESH Headings: Catalytic Domain; Databases, Genetic; Internet; Markov Chains; Sequence Analysis; Sequence Analysis, Protein; Software; User-Computer Interface. Grants: R01 HG009116 NHGRI NIH HHS.	Research Support, N.I.H., Extramural	7	1415
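
Note on entry 19 (emPAI): the abstract defines PAI as the number of observed peptides divided by the number of observable peptides per protein, and emPAI as 10 raised to the power PAI, minus one. A minimal Python sketch of that arithmetic follows. The function names and the example numbers are hypothetical, chosen only for illustration. The abstract says only that emPAI is proportional to protein content, so the percentage normalisation in `molar_percent` is an illustrative assumption rather than something stated in this listing.

```python
from typing import Dict


def pai(observed_peptides: int, observable_peptides: int) -> float:
    """Protein abundance index: observed / observable peptides for one protein."""
    if observable_peptides <= 0:
        raise ValueError("observable_peptides must be positive")
    if observed_peptides < 0:
        raise ValueError("observed_peptides must be non-negative")
    return observed_peptides / observable_peptides


def empai(pai_value: float) -> float:
    """Exponentially modified PAI: 10 ** PAI - 1, as defined in the abstract."""
    return 10 ** pai_value - 1


def molar_percent(empai_by_protein: Dict[str, float]) -> Dict[str, float]:
    """Assumed normalisation: each protein's emPAI as a percentage of the summed emPAI."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {name: 100.0 * value / total for name, value in empai_by_protein.items()}


if __name__ == "__main__":
    # Hypothetical peptide counts: (observed, observable) per protein.
    counts = {"protein_A": (12, 20), "protein_B": (3, 15)}
    scores = {name: empai(pai(obs, total)) for name, (obs, total) in counts.items()}
    for name, share in molar_percent(scores).items():
        print(f"{name}: emPAI={scores[name]:.3f}, share={share:.1f}%")
```

Because emPAI is exponential in PAI, a protein with all observable peptides detected (PAI = 1) scores 9, while one with none detected scores 0.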
