Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 492 (from Reference Citation Analysis)
Article PDFs: 231
Cited by ≥ 1: 315
Searched Name: Web Browser

Journal Articles

Rank	Citation Analysis	Article Type	Number of Years	Citation(s) in RCA
1	Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis 2020;20:533-534. [PMID: 32087114 PMCID: PMC7159018 DOI: 10.1016/s1473-3099(20)30120-1] [Citation(s) in RCA: 6113] [Impact Index Per Article: 1222.6] [Received: 02/11/2020] [Accepted: 02/13/2020] [Indexed: 02/07/2023] MeSH Headings: Betacoronavirus; COVID-19; Coronavirus Infections/epidemiology; Humans; Pandemics; Patient Identification Systems; Pneumonia, Viral/epidemiology; SARS-CoV-2; Time Factors; Web Browser.	Letter	5	6113
2	The UniProt Consortium (Alex Bateman, Maria Jesus Martin, Claire O’Donovan, Michele Magrane, Emanuele Alpi, Ricardo Antunes, Benoit Bely, Mark Bingley, Carlos Bonilla, Ramona Britto, Borisas Bursteinas, Hema Bye-A-Jee, Andrew Cowley, Alan Da Silva, Maurizio De Giorgi, Tunca Dogan, Francesco Fazzini, Leyla Garcia Castro, Luis Figueira, Penelope Garmiri, George Georghiou, Daniel Gonzalez, Emma Hatton-Ellis, Weizhong Li, Wudong Liu, Rodrigo Lopez, Jie Luo, Yvonne Lussi, Alistair MacDougall, Andrew Nightingale, Barbara Palka, Klemens Pichler, Diego Poggioli, Sangya Pundir, Luis Pureza, Guoying Qi, Alexandre Renaux, Steven Rosanoff, Rabie Saidi, Tony Sawford, Aleksandra Shypitsyna, Elena Speretta, Edward Turner, Nidhi Tyagi, Vladimir Volynkin, Tony Wardell, Kate Warner, Xavier Watkins, Rossana Zaru, Hermann Zellner, Ioannis Xenarios, Lydie Bougueleret, Alan Bridge, Sylvain Poux, Nicole Redaschi, Lucila Aimo, Ghislaine Argoud-Puy, Andrea Auchincloss, Kristian Axelsen, Parit Bansal, Delphine Baratin, Marie-Claude Blatter, Brigitte Boeckmann, Jerven Bolleman, Emmanuel Boutet, Lionel Breuza, Cristina Casal-Casas, Edouard de Castro, Elisabeth Coudert, Beatrice Cuche, Mikael Doche, Dolnide Dornevil, Severine Duvaud, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Elisabeth Gasteiger, Sebastien Gehant, Vivienne Gerritsen, Arnaud Gos, Nadine Gruaz-Gumowski, Ursula Hinz, Chantal Hulo, Florence Jungo, Guillaume Keller, Vicente Lara, Philippe Lemercier, Damien Lieberherr, Thierry Lombardot, Xavier Martin, Patrick Masson, Anne Morgat, Teresa Neto, Nevila Nouspikel, Salvo Paesano, Ivo Pedruzzi, Sandrine Pilbout, Monica Pozzato, Manuela Pruess, Catherine Rivoire, Bernd Roechert, Michel Schneider, Christian Sigrist, Karin Sonesson, Sylvie Staehli, Andre Stutz, Shyamala Sundaram, Michael Tognolli, Laure Verbregue, Anne-Lise Veuthey, Cathy H Wu, Cecilia N Arighi, Leslie Arminski, Chuming Chen, Yongxing Chen, John S Garavelli, Hongzhan Huang, Kati Laiho, Peter McGarvey, Darren A Natale, Karen Ross, C R Vinayaka, Qinghua Wang, Yuqi Wang, Lai-Su Yeh, Jian Zhang). UniProt: the universal protein knowledgebase. Nucleic Acids Res 2016;45:D158-D169. [PMID: 27899622 PMCID: PMC5210571 DOI: 10.1093/nar/gkw1099] [Citation(s) in RCA: 3337] [Impact Index Per Article: 370.8] [Received: 09/19/2016] [Revised: 10/25/2016] [Accepted: 11/05/2016] [Indexed: 02/06/2023] Abstract: The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation. The database contains over 60 million sequences, of which over half a million sequences have been curated by experts who critically review experimental and predicted data for each protein. The remainder are automatically annotated based on rule systems that rely on the expert curated knowledge. Since our last update in 2014, we have more than doubled the number of reference proteomes to 5631, giving a greater coverage of taxonomic diversity. We implemented a pipeline to remove redundant highly similar proteomes that were causing excessive redundancy in UniProt. The initial run of this pipeline reduced the number of sequences in UniProt by 47 million. For our users interested in the accessory proteomes, we have made available sets of pan proteome sequences that cover the diversity of sequences for each species that is found in its strains and sub-strains. To help interpretation of genomic variants, we provide tracks of detailed protein information for the major genome browsers. We provide a SPARQL endpoint that allows complex queries of the more than 22 billion triples of data in UniProt (http://sparql.uniprot.org/). UniProt resources can be accessed via the website at http://www.uniprot.org/. MeSH Headings: Computational Biology/methods; Databases, Protein; Genomics/methods; Proteome; Proteomics/methods; Web Browser. Grants: R13 GM109648 NIGMS NIH HHS; RG/13/5/30112 British Heart Foundation; R01 GM080646 NIGMS NIH HHS; U41 HG002273 NHGRI NIH HHS; G-1307 Parkinson's UK; U01 GM120953 NIGMS NIH HHS; P20 GM103446 NIGMS NIH HHS; Wellcome Trust; U41 HG007822 NHGRI NIH HHS.	Research Support, Non-U.S. Gov't	9	3337
3	Zerbino DR, Achuthan P, Akanni W, Amode M, Barrell D, Bhai J, Billis K, Cummins C, Gall A, Girón CG, Gil L, Gordon L, Haggerty L, Haskell E, Hourlier T, Izuogu OG, Janacek SH, Juettemann T, To JK, Laird MR, Lavidas I, Liu Z, Loveland JE, Maurel T, McLaren W, Moore B, Mudge J, Murphy DN, Newman V, Nuhn M, Ogeh D, Ong CK, Parker A, Patricio M, Riat HS, Schuilenburg H, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Zadissa A, Frankish A, Hunt SE, Kostadima M, Langridge N, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Aken BL, Cunningham F, Yates A, Flicek P. Ensembl 2018. Nucleic Acids Res 2018;46:D754-D761. [PMID: 29155950 PMCID: PMC5753206 DOI: 10.1093/nar/gkx1098] [Citation(s) in RCA: 1992] [Impact Index Per Article: 284.6] [Received: 09/22/2017] [Revised: 10/17/2017] [Accepted: 10/21/2017] [Indexed: 01/29/2023] Abstract: The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser. MeSH Headings: Animals; Databases, Genetic; Datasets as Topic; Epigenomics; Genome; Genome, Human; Genome-Wide Association Study; Genomics; High-Throughput Nucleotide Sequencing; Humans; Information Dissemination; Molecular Sequence Annotation; Vertebrates/genetics; Web Browser. Grants: 201535/Z/16/Z Wellcome Trust; U41 HG007234 NHGRI NIH HHS; U41 HG007823 NHGRI NIH HHS; WT108749/Z/15/Z Wellcome Trust.	Research Support, N.I.H., Extramural	7	1992
4	Shen W, Le S, Li Y, Hu F. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLoS One 2016;11:e0163962. [PMID: 27706213 PMCID: PMC5051824 DOI: 10.1371/journal.pone.0163962] [Citation(s) in RCA: 1640] [Impact Index Per Article: 182.2] [Received: 05/23/2016] [Accepted: 09/16/2016] [Indexed: 11/23/2022] Abstract: FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q file include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. This paper describes a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OSX, and can be directly used without any dependencies or pre-configurations. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. SeqKit is open source and available on Github at https://github.com/shenwei356/seqkit. MeSH Headings: Amino Acid Sequence; Base Sequence; Data Mining/methods; Sequence Alignment; Software; Web Browser. Grants: National Natural Science Foundation of China.	Journal Article	9	1640
5	Nilsson RH, Larsson KH, Taylor AF, Bengtsson-Palme J, Jeppesen TS, Schigel D, Kennedy P, Picard K, Glöckner FO, Tedersoo L, Saar I, Kõljalg U, Abarenkov K. The UNITE database for molecular identification of fungi: handling dark taxa and parallel taxonomic classifications. Nucleic Acids Res 2019;47:D259-D264. [PMID: 30371820 PMCID: PMC6324048 DOI: 10.1093/nar/gky1022] [Citation(s) in RCA: 1582] [Impact Index Per Article: 263.7] [Received: 09/14/2018] [Revised: 10/11/2018] [Accepted: 10/12/2018] [Indexed: 12/12/2022] Abstract: UNITE (https://unite.ut.ee/) is a web-based database and sequence management environment for the molecular identification of fungi. It targets the formal fungal barcode, the nuclear ribosomal internal transcribed spacer (ITS) region, and offers all ∼1 000 000 public fungal ITS sequences for reference. These are clustered into ∼459 000 species hypotheses and assigned digital object identifiers (DOIs) to promote unambiguous reference across studies. In-house and web-based third-party sequence curation and annotation have resulted in more than 275 000 improvements to the data over the past 15 years. UNITE serves as a data provider for a range of metabarcoding software pipelines and regularly exchanges data with all major fungal sequence databases and other community resources. Recent improvements include redesigned handling of unclassifiable species hypotheses, integration with the taxonomic backbone of the Global Biodiversity Information Facility, and support for an unlimited number of parallel taxonomic classification systems. MeSH Headings: Computational Biology/methods; DNA Barcoding, Taxonomic/methods; Databases, Nucleic Acid; Fungi/classification; Fungi/genetics; Genome, Fungal; Genomics/methods; Software; Web Browser.	research-article	6	1582
6	The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res 2016;45:D331-D338. [PMID: 27899567 PMCID: PMC5210579 DOI: 10.1093/nar/gkw1108] [Citation(s) in RCA: 1374] [Impact Index Per Article: 152.7] [Received: 10/25/2016] [Accepted: 11/16/2016] [Indexed: 12/11/2022] Abstract: The Gene Ontology (GO) is a comprehensive resource of computable knowledge regarding the functions of genes and gene products. As such, it is extensively used by the biomedical research community for the analysis of -omics and related data. Our continued focus is on improving the quality and utility of the GO resources, and we welcome and encourage input from researchers in all areas of biology. In this update, we summarize the current contents of the GO knowledgebase, and present several new features and improvements that have been made to the ontology, the annotations and the tools. Among the highlights are 1) developments that facilitate access to, and application of, the GO knowledgebase, and 2) extensions to the resource as well as increasing support for descriptions of causal models of biological systems and network biology. To learn more, visit http://geneontology.org/. MeSH Headings: Computational Biology/methods; Databases, Genetic; Gene Ontology; Genomics/methods; Molecular Sequence Annotation; Phylogeny; Web Browser. Grants: R13 GM109648 NIGMS NIH HHS; RG/13/5/30112 British Heart Foundation; R01 GM089636 NIGMS NIH HHS; U41 HG002273 NHGRI NIH HHS; 104967 Wellcome Trust; G-1307 Parkinson's UK; UL1 TR001422 NCATS NIH HHS; U41 HG002659 NHGRI NIH HHS; G1000968 Medical Research Council.	Research Support, N.I.H., Extramural	9	1374
7	Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, Modi BP, Correard S, Gheorghe M, Baranašić D, Santana-Garcia W, Tan G, Chèneby J, Ballester B, Parcy F, Sandelin A, Lenhard B, Wasserman WW, Mathelier A. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 2020;48:D87-D92. [PMID: 31701148 PMCID: PMC7145627 DOI: 10.1093/nar/gkz1001] [Citation(s) in RCA: 857] [Impact Index Per Article: 171.4] [Received: 09/15/2019] [Revised: 10/15/2019] [Accepted: 10/16/2019] [Indexed: 02/07/2023] Abstract: JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) for TFs across multiple species in six taxonomic groups. In this 8th release of JASPAR, the CORE collection has been expanded with 245 new PFMs (169 for vertebrates, 42 for plants, 17 for nematodes, 10 for insects, and 7 for fungi), and 156 PFMs were updated (125 for vertebrates, 28 for plants and 3 for insects). These new profiles represent an 18% expansion compared to the previous release. JASPAR 2020 comes with a novel collection of unvalidated TF-binding profiles for which our curators did not find orthogonal supporting evidence in the literature. This collection has a dedicated web form to engage the community in the curation of unvalidated TF-binding profiles. Moreover, we created a Q&A forum to ease the communication between the user community and JASPAR curators. Finally, we updated the genomic tracks, inference tool, and TF-binding profile similarity clusters. All the data is available through the JASPAR website, its associated RESTful API, and through the JASPAR2020 R/Bioconductor package. MeSH Headings: Animals; Binding Sites; Computational Biology; Databases, Genetic; Genomics/methods; Protein Binding; Software; Transcription Factors/metabolism; User-Computer Interface; Web Browser. Grants: MC_EX_MR/S300007/1 Medical Research Council; MC_UP_1102/1 Medical Research Council.	research-article	5	857
8	Yates AD, Achuthan P, Akanni W, Allen J, Allen J, Alvarez-Jarreta J, Amode MR, Armean IM, Azov AG, Bennett R, Bhai J, Billis K, Boddu S, Marugán JC, Cummins C, Davidson C, Dodiya K, Fatima R, Gall A, Giron CG, Gil L, Grego T, Haggerty L, Haskell E, Hourlier T, Izuogu OG, Janacek SH, Juettemann T, Kay M, Lavidas I, Le T, Lemos D, Martinez JG, Maurel T, McDowall M, McMahon A, Mohanan S, Moore B, Nuhn M, Oheh DN, Parker A, Parton A, Patricio M, Sakthivel MP, Abdul Salam AI, Schmitt BM, Schuilenburg H, Sheppard D, Sycheva M, Szuba M, Taylor K, Thormann A, Threadgold G, Vullo A, Walts B, Winterbottom A, Zadissa A, Chakiachvili M, Flint B, Frankish A, Hunt SE, IIsley G, Kostadima M, Langridge N, Loveland JE, Martin FJ, Morales J, Mudge JM, Muffato M, Perry E, Ruffier M, Trevanion SJ, Cunningham F, Howe KL, Zerbino DR, Flicek P. Ensembl 2020. Nucleic Acids Res 2020;48:D682-D688. [PMID: 31691826 PMCID: PMC7145704 DOI: 10.1093/nar/gkz966] [Citation(s) in RCA: 767] [Impact Index Per Article: 153.4] [Received: 09/23/2019] [Revised: 10/09/2019] [Accepted: 10/10/2019] [Indexed: 12/11/2022] Abstract: The Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license) and data updates made available four times a year. MeSH Headings: Algorithms; Animals; Computational Biology/methods; Computer Graphics; Databases, Genetic; Databases, Protein; Epigenome; Genetic Variation; Genome-Wide Association Study; Genomics; Histones/metabolism; Humans; Imaging, Three-Dimensional; Internet; Ligands; Molecular Sequence Annotation; Search Engine; Software; Species Specificity; Transcriptome; User-Computer Interface; Web Browser. Grants: WT108749/Z/15/Z Wellcome Trust; U41 HG007823 NHGRI NIH HHS; U41 HG007234 NHGRI NIH HHS; 201535/Z/16/Z Wellcome Trust; Wellcome Trust.	Research Support, N.I.H., Extramural	5	767
9	Anwyl-Irvine AL, Massonnié J, Flitton A, Kirkham N, Evershed JK. Gorilla in our midst: An online behavioral experiment builder. Behav Res Methods 2020;52:388-407. [PMID: 31016684 PMCID: PMC7005094 DOI: 10.3758/s13428-019-01237-x] [Citation(s) in RCA: 718] [Impact Index Per Article: 143.6] [Indexed: 12/25/2022] Abstract: Behavioral researchers are increasingly conducting their studies online, to gain access to large and diverse samples that would be difficult to get in a laboratory environment. However, there are technical access barriers to building experiments online, and web browsers can present problems for consistent timing, an important issue with reaction-time-sensitive measures. For example, to ensure accuracy and test-retest reliability in presentation and response recording, experimenters need a working knowledge of programming languages such as JavaScript. We review some of the previous and current tools for online behavioral research, as well as how well they address the issues of usability and timing. We then present the Gorilla Experiment Builder (gorilla.sc), a fully tooled experiment authoring and deployment platform, designed to resolve many timing issues and make reliable online experimentation open and accessible to a wider range of technical abilities. To demonstrate the platform's aptitude for accessible, reliable, and scalable research, we administered a task with a range of participant groups (primary school children and adults), settings (without supervision, at home, and under supervision, in both schools and public engagement events), equipment (participant's own computer, computer supplied by the researcher), and connection types (personal internet connection, mobile phone 3G/4G). We used a simplified flanker task taken from the attentional network task (Rueda, Posner, & Rothbart, 2004). We replicated the "conflict network" effect in all these populations, demonstrating the platform's capability to run reaction-time-sensitive experiments. Unresolved limitations of running experiments online are then discussed, along with potential solutions and some future features of the platform. Key Words: Attentional control; Browser timing; Online methods; Online research; Remote testing; Timing accuracy. MeSH Headings: Adult; Attention; Behavioral Research; Cell Phone; Child; Child, Preschool; Cognition; Data Collection; Female; Humans; Internet; Male; Reaction Time; Reproducibility of Results; Research Design; Web Browser. Grants: MC_UP_A060_1103 Medical Research Council; MC_UU_00005/2 Medical Research Council; Templeton World Charity Foundation; Economic and Social Research Council.	research-article	5	718
10	Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, Wu S, Wang Y. SRplot: A free online platform for data visualization and graphing. PLoS One 2023;18:e0294236. [PMID: 37943830 PMCID: PMC10635526 DOI: 10.1371/journal.pone.0294236] [Citation(s) in RCA: 689] [Impact Index Per Article: 344.5] [Received: 05/08/2023] [Accepted: 10/27/2023] [Indexed: 11/12/2023] Abstract: Graphics are widely used to provide summarization of complex data in scientific publications. Although there are many tools available for drawing graphics, their use is limited by programming skills, costs, and platform specificities. Here, we presented a freely accessible easy-to-use web server named SRplot that integrated more than a hundred of commonly used data visualization and graphing functions together. It can be run easily using all Web browsers and there are no strong requirements on the computing power of users' machines. With a user-friendly graphical interface, users can simply paste the contents of the input file into the text box according to the defined file format. Modification operations can be easily performed, and graphs can be generated in real-time. The resulting graphs can be easily downloaded in bitmap (PNG or TIFF) or vector (PDF or SVG) format in publication quality. The website is updated promptly and continuously. Functions in SRplot have been improved, optimized and updated depend on feedback and suggestions from users. The graphs prepared with SRplot have been featured in more than five hundred peer-reviewed publications. The SRplot web server is now freely available at http://www.bioinformatics.com.cn/SRplot. MeSH Headings: Software; Data Visualization; Computer Graphics; Web Browser; Internet; User-Computer Interface. Grants: Natural Science Foundation of Hunan Province; Health and Family Planning Commission of Hunan Province.	research-article	2	689
11	Shen L, Shao N, Liu X, Nestler E. ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 2014;15:284. [PMID: 24735413 PMCID: PMC4028082 DOI: 10.1186/1471-2164-15-284] [Citation(s) in RCA: 681] [Impact Index Per Article: 61.9] [Received: 02/04/2014] [Accepted: 04/04/2014] [Indexed: 12/01/2022] Abstract: BACKGROUND: Understanding the relationship between the millions of functional DNA elements and their protein regulators, and how they work in conjunction to manifest diverse phenotypes, is key to advancing our understanding of the mammalian genome. Next-generation sequencing technology is now used widely to probe these protein-DNA interactions and to profile gene expression at a genome-wide scale. As the cost of DNA sequencing continues to fall, the interpretation of the ever increasing amount of data generated represents a considerable challenge. RESULTS: We have developed ngs.plot - a standalone program to visualize enrichment patterns of DNA-interacting proteins at functionally important regions based on next-generation sequencing data. We demonstrate that ngs.plot is not only efficient but also scalable. We use a few examples to demonstrate that ngs.plot is easy to use and yet very powerful to generate figures that are publication ready. CONCLUSIONS: We conclude that ngs.plot is a useful tool to help fill the gap between massive datasets and genomic information in this era of big sequencing data. Key Words: next-generation sequencing; visualization; epigenomics; data mining; genomic databases. MeSH Headings: Algorithms; Animals; Computational Biology/methods; Data Mining/methods; Databases, Genetic; Embryonic Stem Cells/metabolism; Epigenomics/methods; Genomics/methods; High-Throughput Nucleotide Sequencing; Humans; Promoter Regions, Genetic; Reproducibility of Results; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; Software; Web Browser; Workflow. Grants: P01 DA008227 NIDA NIH HHS; P50 MH096890 NIMH NIH HHS; P01DA008227 NIDA NIH HHS; P50MH096890 NIMH NIH HHS.	Research Support, N.I.H., Extramural	11	681
12	Zhou Z, Alikhan NF, Mohamed K, Fan Y, Achtman M. The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny, and Escherichia core genomic diversity. Genome Res 2020;30:138-152. [PMID: 31809257 PMCID: PMC6961584 DOI: 10.1101/gr.251678.119] [Citation(s) in RCA: 605] [Impact Index Per Article: 121.0] [Received: 04/20/2019] [Accepted: 12/03/2019] [Indexed: 01/08/2023] Abstract: EnteroBase is an integrated software environment that supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella and genotyped those assemblies by core genome multilocus sequence typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case Study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports single nucleotide polymorphism (SNP) calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by Case Study 2 which summarizes the microevolution of Yersinia pestis over the last 5000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by Case Study 3, which presents a novel, global overview of the population structure of all of the species, subspecies, and clades within Escherichia. MeSH Headings: Databases, Genetic; Escherichia/classification; Escherichia/genetics; Genome, Bacterial; Genomics/methods; Metagenome; Metagenomics/methods; Multilocus Sequence Typing; Phylogeny; Salmonella/classification; Salmonella/genetics; Software; User-Computer Interface; Web Browser; Yersinia pestis/classification; Yersinia pestis/genetics. Grants: Wellcome Trust; 202792/Z/16/Z Wellcome Trust; BB/L020319/1 Biotechnology and Biological Sciences Research Council; Biotechnology and Biological Sciences Research Council; Wellcome Trust.	research-article	5	605
13	Lánczky A, Nagy Á, Bottai G, Munkácsy G, Szabó A, Santarpia L, Győrffy B. miRpower: a web-tool to validate survival-associated miRNAs utilizing expression data from 2178 breast cancer patients. Breast Cancer Res Treat 2016;160:439-446. [PMID: 27744485 DOI: 10.1007/s10549-016-4013-7] [Citation(s) in RCA: 588] [Impact Index Per Article: 65.3] [Received: 07/08/2016] [Accepted: 10/08/2016] [Indexed: 02/06/2023] Abstract: PURPOSE: The proper validation of prognostic biomarkers is an important clinical issue in breast cancer research. MicroRNAs (miRNAs) have emerged as a new class of promising breast cancer biomarkers. In the present work, we developed an integrated online bioinformatic tool to validate the prognostic relevance of miRNAs in breast cancer. METHODS: A database was set up by searching the GEO, EGA, TCGA, and PubMed repositories to identify datasets with published miRNA expression and clinical data. Kaplan-Meier survival analysis was performed to validate the prognostic value of a set of 41 previously published survival-associated miRNAs. RESULTS: All together 2178 samples from four independent datasets were integrated into the system including the expression of 1052 distinct human miRNAs. In addition, the web-tool allows for the selection of patients, which can be filtered by receptors status, lymph node involvement, histological grade, and treatments. The complete analysis tool can be accessed online at: www.kmplot.com/mirpower. We used this tool to analyze a large number of deregulated miRNAs associated with breast cancer features and outcome, and confirmed the prognostic value of 26 miRNAs. A significant correlation in three out of four datasets was validated only for miR-29c and miR-101. CONCLUSIONS: In summary, we established an integrated platform capable to mine all available miRNA data to perform a survival analysis for the identification and validation of prognostic miRNA markers in breast cancer. Key Words: Biomarkers; Breast cancer; Gene expression; MicroRNAs; Prognosis; Survival. MeSH Headings: Biomarkers, Tumor; Breast Neoplasms/genetics; Breast Neoplasms/mortality; Computational Biology/methods; Databases, Genetic; Female; Gene Expression Regulation, Neoplastic; Humans; MicroRNAs/genetics; Prognosis; Reproducibility of Results; Software; User-Computer Interface; Web Browser.	Research Support, Non-U.S. Gov't	9	588
14	Haeussler M, Zweig AS, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, Lee CM, Lee BT, Hinrichs AS, Gonzalez JN, Gibson D, Diekhans M, Clawson H, Casper J, Barber GP, Haussler D, Kuhn RM, Kent W. The UCSC Genome Browser database: 2019 update. Nucleic Acids Res 2019;47:D853-D858. [PMID: 30407534 PMCID: PMC6323953 DOI: 10.1093/nar/gky1095] [Citation(s) in RCA: 546] [Impact Index Per Article: 91.0] [Received: 09/14/2018] [Revised: 10/17/2018] [Accepted: 10/19/2018] [Indexed: 01/17/2023] Abstract: The UCSC Genome Browser (https://genome.ucsc.edu) is a graphical viewer for exploring genome annotations. For almost two decades, the Browser has provided visualization tools for genetics and molecular biology and continues to add new data and features. This year, we added a new tool that lets users interactively arrange existing graphing tracks into new groups. Other software additions include new formats for chromosome interactions, a ChIP-Seq peak display for track hubs and improved support for HGVS. On the annotation side, we have added gnomAD, TCGA expression, RefSeq Functional elements, GTEx eQTLs, CRISPR Guides, SNPpedia and created a 30-way primate alignment on the human genome. Nine assemblies now have RefSeq-mapped gene models. MeSH Headings: Animals; Chromosome Mapping; Databases, Genetic; Genome/genetics; Genome, Human/genetics; Genomics; Humans; Molecular Sequence Annotation; Software; Web Browser. Grants: U01 MH114825 NIMH NIH HHS; U41 HG002371 NHGRI NIH HHS; U41 HG007234 NHGRI NIH HHS; U54 HG007990 NHGRI NIH HHS.	Research Support, N.I.H., Extramural	6	546
15	Karczewski KJ, Weisburd B, Thomas B, Solomonson M, Ruderfer DM, Kavanagh D, Hamamsy T, Lek M, Samocha KE, Cummings BB, Birnbaum D, Daly MJ, MacArthur DG. The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res 2017;45:D840-D845. [PMID: 27899611 PMCID: PMC5210650 DOI: 10.1093/nar/gkw971] [Citation(s) in RCA: 538] [Impact Index Per Article: 67.3] [Received: 08/31/2016] [Revised: 10/09/2016] [Accepted: 10/11/2016] [Indexed: 11/30/2022] [Open] Abstract: Worldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60 706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available at http://exac.broadinstitute.org, and has already been used extensively by clinical laboratories worldwide. MESH Headings: Computational Biology/methods; Databases, Genetic; Exome; Genome-Wide Association Study/methods; Genomics/methods; Humans; Software; User-Computer Interface; Web Browser. Grants: P30 DK043351 NIDDK NIH HHS; F32 GM115208 NIGMS NIH HHS; MC_UP_1102/20 Medical Research Council; U54 DK105566 NIDDK NIH HHS; R01 GM104371 NIGMS NIH HHS.	Research Support, N.I.H., Extramural	8	538
16	Boutet E, Lieberherr D, Tognolli M, Schneider M, Bansal P, Bridge AJ, Poux S, Bougueleret L, Xenarios I. UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 2016;1374:23-54. [PMID: 26519399 DOI: 10.1007/978-1-4939-3167-5_2] [Citation(s) in RCA: 515] [Impact Index Per Article: 57.2] [Indexed: 05/08/2023] Abstract: The Universal Protein Resource (UniProt, http://www.uniprot.org) consortium is an initiative of the SIB Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI) and the Protein Information Resource (PIR) to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB), updated every 4 weeks, and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc). The Swiss-Prot section of the UniProt KnowledgeBase (UniProtKB/Swiss-Prot) contains publicly available expertly manually annotated protein sequences obtained from a broad spectrum of organisms. Plant protein entries are produced in the frame of the Plant Proteome Annotation Program (PPAP), with an emphasis on characterized proteins of Arabidopsis thaliana and Oryza sativa. High level annotations provided by UniProtKB/Swiss-Prot are widely used to predict annotation of newly available proteins through automatic pipelines. The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry. We will also present some of the tools and databases that are linked to each entry. Key Words: Amino-acid sequence; Manual annotation; Protein database; Swiss-Prot; TrEMBL; UniProt. MESH Headings: Animals; Computational Biology/methods; Databases, Protein; Humans; Web Browser. Grants: 5R01GM080646-07 NIGMS NIH HHS; 2P41HG02273 NHGRI NIH HHS; 8P20GM103446-12 NIGMS NIH HHS; 5G08LM010720-03 NLM NIH HHS; 1 U41 HG006104 NHGRI NIH HHS; 3R01GM080646-07S1 NIGMS NIH HHS; U41 HG007822 NHGRI NIH HHS.	Research Support, N.I.H., Extramural	9	515
17	Quang D, Xie X. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res 2016;44:e107. [PMID: 27084946 PMCID: PMC4914104 DOI: 10.1093/nar/gkw226] [Citation(s) in RCA: 454] [Impact Index Per Article: 50.4] [Received: 12/19/2015] [Revised: 02/27/2016] [Accepted: 03/22/2016] [Indexed: 01/19/2023] [Open] Abstract: Modeling the properties and functions of DNA sequences is an important, but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the github repository http://github.com/uci-cbcl/DanQ. MESH Headings: DNA; Genome-Wide Association Study; Genomics/methods; Humans; Neural Networks, Computer; Polymorphism, Single Nucleotide; Quantitative Trait Loci; ROC Curve; Sequence Analysis, DNA; Software; Web Browser. Grants: T32 EB009418 NIBIB NIH HHS; R01 HG006870 NHGRI NIH HHS.	research-article	9	454
18	Gramates LS, Marygold SJ, Santos GD, Urbano JM, Antonazzo G, Matthews BB, Rey AJ, Tabone CJ, Crosby MA, Emmert DB, Falls K, Goodman JL, Hu Y, Ponting L, Schroeder AJ, Strelets VB, Thurmond J, Zhou P. FlyBase at 25: looking to the future. Nucleic Acids Res 2017;45:D663-D671. [PMID: 27799470 PMCID: PMC5210523 DOI: 10.1093/nar/gkw1016] [Citation(s) in RCA: 407] [Impact Index Per Article: 50.9] [Received: 09/30/2016] [Revised: 10/14/2016] [Accepted: 10/18/2016] [Indexed: 01/12/2023] [Open] Abstract: Since 1992, FlyBase (flybase.org) has been an essential online resource for the Drosophila research community. Concentrating on the most extensively studied species, Drosophila melanogaster, FlyBase includes information on genes (molecular and genetic), transgenic constructs, phenotypes, genetic and physical interactions, and reagents such as stocks and cDNAs. Access to data is provided through a number of tools, reports, and bulk-data downloads. Looking to the future, FlyBase is expanding its focus to serve a broader scientific community. In this update, we describe new features, datasets, reagent collections, and data presentations that address this goal, including enhanced orthology data, Human Disease Model Reports, protein domain search and visualization, concise gene summaries, a portal for external resources, video tutorials and the FlyBase Community Advisory Group. MESH Headings: Animals; Computational Biology/methods; Databases, Genetic; Disease Models, Animal; Drosophila/genetics; Genetic Association Studies; Genomics/methods; Humans; Web Browser. Grants: Wellcome Trust; G1000968 Medical Research Council; U41 HG000739 NHGRI NIH HHS.	Research Support, N.I.H., Extramural	8	407
19	Wang Y, Zhang S, Li F, Zhou Y, Zhang Y, Wang Z, Zhang R, Zhu J, Ren Y, Tan Y, Qin C, Li Y, Li X, Chen Y, Zhu F. Therapeutic target database 2020: enriched resource for facilitating research and early development of targeted therapeutics. Nucleic Acids Res 2020;48:D1031-D1041. [PMID: 31691823 PMCID: PMC7145558 DOI: 10.1093/nar/gkz981] [Citation(s) in RCA: 402] [Impact Index Per Article: 80.4] [Received: 08/26/2019] [Revised: 10/10/2019] [Accepted: 10/12/2019] [Indexed: 12/12/2022] [Open] Abstract: Knowledge of therapeutic targets and early drug candidates is useful for improved drug discovery. In particular, information about target regulators and the patented therapeutic agents facilitates research regarding druggability, systems pharmacology, new trends, molecular landscapes, and the development of drug discovery tools. To complement other databases, we constructed the Therapeutic Target Database (TTD) with expanded information about (i) target-regulating microRNAs and transcription factors, (ii) target-interacting proteins, and (iii) patented agents and their targets (structures and experimental activity values if available), which can be conveniently retrieved and is further enriched with regulatory mechanisms or biochemical classes. We also updated the TTD with the recently released International Classification of Diseases ICD-11 codes and additional sets of successful, clinical trial, and literature-reported targets that emerged since the last update. TTD is accessible at http://bidd.nus.edu.sg/group/ttd/ttd.asp. In case of possible web connectivity issues, two mirror sites of TTD are also constructed (http://db.idrblab.org/ttd/ and http://db.idrblab.net/ttd/). MESH Headings: Biomarkers; Computational Biology/methods; Databases, Factual; Drug Discovery/methods; Humans; Ligands; Molecular Targeted Therapy; Software; User-Computer Interface; Web Browser.	research-article	5	402
20	Volders PJ, Anckaert J, Verheggen K, Nuytens J, Martens L, Mestdagh P, Vandesompele J. LNCipedia 5: towards a reference set of human long non-coding RNAs. Nucleic Acids Res 2020;47:D135-D139. [PMID: 30371849 PMCID: PMC6323963 DOI: 10.1093/nar/gky1031] [Citation(s) in RCA: 371] [Impact Index Per Article: 74.2] [Received: 09/03/2018] [Accepted: 10/17/2018] [Indexed: 12/20/2022] [Open] Abstract: While long non-coding RNA (lncRNA) research in the past has primarily focused on the discovery of novel genes, today it has shifted towards functional annotation of this large class of genes. With thousands of lncRNA studies published every year, the current challenge lies in keeping track of which lncRNAs are functionally described. This is further complicated by the fact that lncRNA nomenclature is not straightforward and lncRNA annotation is scattered across different resources with their own quality metrics and definition of a lncRNA. To overcome this issue, large scale curation and annotation is needed. Here, we present the fifth release of the human lncRNA database LNCipedia (https://lncipedia.org). The most notable improvements include manual literature curation of 2482 lncRNA articles and the use of official gene symbols when available. In addition, an improved filtering pipeline results in a higher quality reference lncRNA gene set. MESH Headings: Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; Humans; Molecular Sequence Annotation; RNA, Long Noncoding/genetics; Web Browser.	Research Support, Non-U.S. Gov't	5	371
21	Wang Y, Song F, Zhang B, Zhang L, Xu J, Kuang D, Li D, Choudhary MNK, Li Y, Hu M, Hardison R, Wang T, Yue F. The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol 2018;19:151. [PMID: 30286773 PMCID: PMC6172833 DOI: 10.1186/s13059-018-1519-9] [Citation(s) in RCA: 360] [Impact Index Per Article: 51.4] [Received: 12/23/2017] [Accepted: 08/29/2018] [Indexed: 12/20/2022] [Open] Abstract: Here, we introduce the 3D Genome Browser, http://3dgenome.org, which allows users to conveniently explore both their own and over 300 publicly available chromatin interaction data of different types. We design a new binary data format for Hi-C data that reduces the file size by at least an order of magnitude and allows users to visualize chromatin interactions over millions of base pairs within seconds. Our browser provides multiple methods linking distal cis-regulatory elements with their potential target genes. Users can seamlessly integrate thousands of other omics data to gain a comprehensive view of both regulatory landscape and 3D genome structure. MESH Headings: Animals; Chromatin/metabolism; DNA, Intergenic/genetics; Databases, Genetic; Deoxyribonuclease I/metabolism; Genetic Variation; Genome, Human; Humans; Imaging, Three-Dimensional; Mice; Neoplasms/genetics; Polymorphism, Single Nucleotide/genetics; Species Specificity; Web Browser. Grants: U01 HG009391 NHGRI NIH HHS; R01 HG006292 NHGRI NIH HHS; R01 HG009906 NHGRI NIH HHS; U54DK107977 NIH HHS; R01 HG007175 NHGRI NIH HHS; U54 DK107977 NIDDK NIH HHS; 1U01CA200060 NIH HHS; R01 HL129132 NHLBI NIH HHS; R01HG006292 NIH HHS; U01HG009391 NIH HHS; U01 CA200060 NCI NIH HHS; R01HG007175 NIH HHS; R01 ES024992 NIEHS NIH HHS; R01ES024992 NIH HHS; R24 DK106766 NIDDK NIH HHS; R25 DA027995 NIDA NIH HHS; R35GM124820 NIH HHS; R01HL129132 NIH HHS; U24ES026699 NIH HHS; U24 ES026699 NIEHS NIH HHS; R35 GM124820 NIGMS NIH HHS; R01 HG007354 NHGRI NIH HHS; R01HG007354 NIH HHS; National Institutes of Health.	Research Support, N.I.H., Extramural	7	360
22	Deutsch EW, Bandeira N, Sharma V, Perez-Riverol Y, Carver JJ, Kundu DJ, García-Seisdedos D, Jarnuczak AF, Hewapathirana S, Pullman BS, Wertz J, Sun Z, Kawano S, Okuda S, Watanabe Y, Hermjakob H, MacLean B, MacCoss MJ, Zhu Y, Ishihama Y, Vizcaíno JA. The ProteomeXchange consortium in 2020: enabling 'big data' approaches in proteomics. Nucleic Acids Res 2020;48:D1145-D1152. [PMID: 31686107 PMCID: PMC7145525 DOI: 10.1093/nar/gkz984] [Citation(s) in RCA: 356] [Impact Index Per Article: 71.2] [Received: 09/14/2019] [Revised: 10/11/2019] [Accepted: 10/14/2019] [Indexed: 11/24/2022] [Open] Abstract: The ProteomeXchange (PX) consortium of proteomics resources (http://www.proteomexchange.org) has standardized data submission and dissemination of mass spectrometry proteomics data worldwide since 2012. In this paper, we describe the main developments since the previous update manuscript was published in Nucleic Acids Research in 2017. Since then, in addition to the four PX existing members at the time (PRIDE, PeptideAtlas including the PASSEL resource, MassIVE and jPOST), two new resources have joined PX: iProX (China) and Panorama Public (USA). We first describe the updated submission guidelines, now expanded to include six members. Next, with current data submission statistics, we demonstrate that the proteomics field is now actively embracing public open data policies. At the end of June 2019, more than 14 100 datasets had been submitted to PX resources since 2012, and from those, more than 9 500 in just the last three years. In parallel, an unprecedented increase of data re-use activities in the field, including 'big data' approaches, is enabling novel research and new data resources. Finally, we also outline some of our future plans for the coming years. MESH Headings: Big Data; Computational Biology/methods; Data Mining; Databases, Protein; Proteomics/methods; Software; Software Design; Web Browser. Grants: P41 GM103533 NIGMS NIH HHS; R24 GM127667 NIGMS NIH HHS; Wellcome Trust; R01 GM103551 NIGMS NIH HHS; R01 GM087221 NIGMS NIH HHS.	Research Support, N.I.H., Extramural	5	356
23	Diesh C, Stevens GJ, Xie P, De Jesus Martinez T, Hershberg EA, Leung A, Guo E, Dider S, Zhang J, Bridge C, Hogue G, Duncan A, Morgan M, Flores T, Bimber BN, Haw R, Cain S, Buels RM, Stein LD, Holmes IH. JBrowse 2: a modular genome browser with views of synteny and structural variation. Genome Biol 2023;24:74. [PMID: 37069644 PMCID: PMC10108523 DOI: 10.1186/s13059-023-02914-z] [Citation(s) in RCA: 341] [Impact Index Per Article: 170.5] [Received: 08/24/2022] [Accepted: 03/20/2023] [Indexed: 04/19/2023] [Open] Abstract: We present JBrowse 2, a general-purpose genome annotation browser offering enhanced visualization of complex structural variation and evolutionary relationships. It retains core features of JBrowse while adding new views for synteny, dotplots, breakpoints, gene fusions, and whole-genome overviews. It allows users to share sessions, open multiple genomes, and navigate between views. It can be embedded in a web page, used as a standalone application, or run from Jupyter notebooks or R sessions. These improvements are enabled by a ground-up redesign using modern web technology. We describe application functionality, use cases, performance benchmarks, and implementation notes for web administrators and developers. MESH Headings: Genomics; Software; Synteny; Genome; Biological Evolution; Web Browser; Internet. Grants: P51 OD011092 NIH HHS; R01 GM080203 NIGMS NIH HHS; R01 HG004483 NHGRI NIH HHS; R24 OD021324 NIH HHS; U24 CA220441 NCI NIH HHS; National Human Genome Research Institute; National Institute of General Medical Sciences; Division of Cancer Epidemiology and Genetics, National Cancer Institute.	Research Support, N.I.H., Extramural	2	341
24	Robinson J, Barker DJ, Georgiou X, Cooper MA, Flicek P, Marsh SGE. IPD-IMGT/HLA Database. Nucleic Acids Res 2020;48:D948-D955. [PMID: 31667505 PMCID: PMC7145640 DOI: 10.1093/nar/gkz950] [Citation(s) in RCA: 334] [Impact Index Per Article: 66.8] [Received: 09/13/2019] [Revised: 10/03/2019] [Accepted: 10/29/2019] [Indexed: 11/14/2022] [Open] Abstract: The IPD-IMGT/HLA Database, http://www.ebi.ac.uk/ipd/imgt/hla/, currently contains over 25 000 allele sequences for 45 genes, which are located within the Major Histocompatibility Complex (MHC) of the human genome. This region is the most polymorphic region of the human genome, and the levels of polymorphism seen exceed most other genes. Some of the genes have several thousand variants and are now termed hyperpolymorphic, rather than just simply polymorphic. The IPD-IMGT/HLA Database has provided a stable, highly accessible, user-friendly repository for this information, providing the scientific and medical community access to the many variant sequences of this gene system, that are critical for the successful outcome of transplantation. The number of currently known variants, and dramatic increase in the number of new variants being identified has necessitated a dedicated resource with custom tools for curation and publication. The challenge for the database is to continue to provide a highly curated database of sequence variants, while supporting the increased number of submissions and complexity of sequences. In order to do this, traditional methods of accessing and presenting data will be challenged, and new methods will need to be utilized to keep pace with new discoveries. MESH Headings: Alleles; Computational Biology/methods; Databases, Genetic; High-Throughput Nucleotide Sequencing; Histocompatibility Antigens/genetics; Humans; Major Histocompatibility Complex/genetics; Software; Web Browser.	research-article	5	334
25	Alanis-Lobato G, Andrade-Navarro MA, Schaefer MH. HIPPIE v2.0: enhancing meaningfulness and reliability of protein-protein interaction networks. Nucleic Acids Res 2016;45:D408-D414. [PMID: 27794551 PMCID: PMC5210659 DOI: 10.1093/nar/gkw985] [Citation(s) in RCA: 322] [Impact Index Per Article: 35.8] [Received: 08/11/2016] [Revised: 09/28/2016] [Accepted: 10/14/2016] [Indexed: 01/01/2023] [Open] Abstract: The increasing number of experimentally detected interactions between proteins makes it difficult for researchers to extract the interactions relevant for specific biological processes or diseases. This makes it necessary to accompany the large-scale detection of protein–protein interactions (PPIs) with strategies and tools to generate meaningful PPI subnetworks. To this end, we generated the Human Integrated Protein–Protein Interaction rEference or HIPPIE (http://cbdm.uni-mainz.de/hippie/). HIPPIE is a one-stop resource for the generation and interpretation of PPI networks relevant to a specific research question. We provide means to generate highly reliable, context-specific PPI networks and to make sense out of them. We just released the second major update of HIPPIE, implementing various new features. HIPPIE grew substantially over the last years and now contains more than 270 000 confidence scored and annotated PPIs. We integrated different types of experimental information for the confidence scoring and the construction of context-specific networks. We implemented basic graph algorithms that highlight important proteins and interactions. HIPPIE's graphical interface implements several ways for wet lab and computational scientists alike to access the PPI data. MESH Headings: Computational Biology/methods; Databases, Protein; Humans; Protein Interaction Mapping/methods; Protein Interaction Maps; Reproducibility of Results; Software; Web Browser.	Research Support, Non-U.S. Gov't	9	322
