26
|
Walsh I, Nguyen-Khuong T, Wongtrakul-Kish K, Jie Tay S, Chew D, José T, Taron CH, Rudd PM. GlycanAnalyzer: software for automated interpretation of N-glycan profiles after exoglycosidase digestions. Bioinformatics 2019; 35:3214. [PMID: 30789217 PMCID: PMC6735761 DOI: 10.1093/bioinformatics/btz077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/03/2022] Open
|
27
|
Wongtrakul-Kish K, Walsh I, Sim LC, Mak A, Liau B, Ding V, Hayati N, Wang H, Choo A, Rudd PM, Nguyen-Khuong T. Combining Glucose Units, m/z, and Collision Cross Section Values: Multiattribute Data for Increased Accuracy in Automated Glycosphingolipid Glycan Identifications and Its Application in Triple Negative Breast Cancer. Anal Chem 2019; 91:9078-9085. [DOI: 10.1021/acs.analchem.9b01476] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
|
28
|
Carswell C, Reid J, Walsh I, McAneney H, Noble H. Implementing an arts-based intervention for patients with end-stage kidney disease whilst receiving haemodialysis: a feasibility study protocol. Pilot Feasibility Stud 2019; 5:1. [PMID: 30622728 PMCID: PMC6320589 DOI: 10.1186/s40814-018-0389-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/21/2018] [Accepted: 12/19/2018] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND End-stage kidney disease is a life-changing illness. Many patients require haemodialysis, a treatment that impacts profoundly on quality of life and mental health. Arts-based interventions have been used in other healthcare settings to improve mental health and quality of life; therefore, they may help address the impact of haemodialysis by improving these outcomes. However, there is a lack of evidence assessing their effectiveness in this population and few randomised controlled trials (RCTs) evaluating the effectiveness of complex arts-based interventions. METHODS The aims of this study are to establish the feasibility of a cluster RCT of an arts-based intervention for patients with end-stage kidney disease whilst receiving haemodialysis through a cluster randomised pilot study, explore the acceptability of the intervention with a process evaluation and explore the feasibility of an economic evaluation. The study will have three phases. The first phase consists of a cluster randomised pilot study to establish recruitment, participation and retention rates. This will involve the recruitment of 30 participants who will be randomly allocated through cluster randomisation according to shift pattern to experimental and control group. The second phase will be a qualitative process evaluation to establish the acceptability of the intervention within a clinical setting. This will involve semi-structured interviews with 13 patients and three focus groups with healthcare professionals. The third phase will be a feasibility economic evaluation to establish the best methods for data collection within a future cluster RCT. DISCUSSION Arts-based interventions have been shown to improve quality of life in healthcare settings, but there is a lack of evidence evaluating arts-based interventions for patients receiving haemodialysis. 
This study aims to assess the feasibility of a future cluster RCT assessing the impact of an arts-based intervention on the wellbeing and mental health of patients receiving haemodialysis and identify the key factors leading to successful implementation. The hope is this study will inform a trial that can influence future healthcare policy by providing robust evidence for arts-based interventions within the haemodialysis setting. TRIAL REGISTRATION The trial was prospectively registered on clinicaltrials.gov on 14/8/2018, registration number NCT03629496.
|
29
|
Carswell C, Reid J, Walsh I, Noble H. Arts-based interventions for hospitalised patients with cancer: a systematic literature review. Br J Healthc Manag 2018. [DOI: 10.12968/bjhc.2018.24.12.611] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
|
30
|
Walsh I, Quinn J, Spencer A, Noble H. AnArtomy: Arts, Anatomy and Medicine - Human Beings Being Human. MedEdPublish 2018; 7:204. [PMID: 38074618 PMCID: PMC10701823 DOI: 10.15694/mep.2018.0000204.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024] Open
Abstract
This article was migrated. The article was marked as recommended. Background The study of anatomy underpins medical education and is an important facet of clinical practice in various diverse disciplines. We explored the dynamic relationship between arts, anatomy and medicine, along the continuum axis of anatomy, medicine, healthcare and art. Aims 1. to foster and gauge artistic and creative expression within the context of medical science and practice. 2. to generate representative artwork examining the relationship between art, medicine and healthcare. Methods Two purposefully open and expressive creative workshops were held within the cadaveric dissection laboratory of the Queen's University Department of Anatomy, with a wide variety of artistic substrates available for faculty and student participants. Themes included: the relationship between art and medicine, the impact art and science have upon each other and the effects of creativity on wellbeing. Accompanying questionnaires included a quantification of perceived relationships between art and medicine, with an estimation of connectedness to feelings. Qualitative items within each questionnaire also addressed key humanistic questions. Comparative analysis of quantitative results was by Student's t-testing, with statistical significance defined as p values <0.05. Results, Summary and Conclusions There was a statistically significant increase in "connectedness to feelings" amongst participants over the course of the workshop. There was a trend for participants to agree or strongly agree that art and medicine were important to each other. Qualitative responses changed from specific, task-oriented hopes to responses more aligned with social/gregarious themes and those related to higher order functioning. Humanistic responses changed across the entire group from a largely fixed inclusion of the concept of emotions to broader, more altruistic visions, inclusive of communal, social views. 
There was a noticeable shift in emphasis from succinctly defined descriptive terms to more expressive terms; reflective and inclusive of caring, holistic practice. The most arresting and compelling results were those of the resulting representative artwork.
|
31
|
Doherty M, Theodoratou E, Walsh I, Adamczyk B, Stöckmann H, Agakov F, Timofeeva M, Trbojević-Akmačić I, Vučković F, Duffy F, McManus CA, Farrington SM, Dunlop MG, Perola M, Lauc G, Campbell H, Rudd PM. Plasma N-glycans in colorectal cancer risk. Sci Rep 2018; 8:8655. [PMID: 29872119 PMCID: PMC5988698 DOI: 10.1038/s41598-018-26805-7] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 10/27/2017] [Accepted: 05/16/2018] [Indexed: 12/22/2022] Open
Abstract
Aberrant glycosylation has been associated with a number of diseases including cancer. Our aim was to elucidate changes in whole plasma N-glycosylation between colorectal cancer (CRC) cases and controls in one of the largest cohorts of its kind. A set of 633 CRC patients and 478 age and gender matched controls was analysed. Additionally, patients were stratified into four CRC stages. Moreover, N-glycan analysis was carried out in plasma of 40 patients collected prior to the initial diagnosis of CRC. Statistically significant differences were observed in the plasma N-glycome at all stages of CRC; these included a highly significant decrease in relation to the core fucosylated bi-antennary glycans F(6)A2G2 and F(6)A2G2S(6)1 (P < 0.0009). Stage 1 showed a unique biomarker signature compared to stages 2, 3 and 4. There were indications that at-risk groups could be identified from the glycome (retrospective AUC = 0.77 and prospective AUC = 0.65). N-glycome biomarkers related to the pathogenic progress of the disease would be a considerable asset in a clinical setting and could enable novel therapeutics to be developed to target the disease in patients at risk of progression.
|
32
|
Piovesan D, Walsh I, Minervini G, Tosatto SCE. FELLS: fast estimator of latent local structure. Bioinformatics 2018; 33:1889-1891. [PMID: 28186245 DOI: 10.1093/bioinformatics/btx085] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 11/07/2016] [Accepted: 02/04/2017] [Indexed: 11/12/2022] Open
Abstract
Motivation The behavior of a protein is encoded in its sequence, which can be used to predict distinct features such as secondary structure, intrinsic disorder or amphipathicity. Integrating these and other features can help explain the context-dependent behavior of proteins. However, most tools focus on a single aspect, hampering a holistic understanding of protein structure. Here, we present Fast Estimator of Latent Local Structure (FELLS) to visualize structural features from the protein sequence. FELLS provides disorder, aggregation and low complexity predictions as well as estimated local propensities including amphipathicity. A novel fast estimator of secondary structure (FESS) is also trained to provide a fast response. The calculations required for FELLS are extremely fast and suited for large-scale analysis while providing a detailed analysis of difficult cases. Availability and Implementation The FELLS web server is available from URL: http://protein.bio.unipd.it/fells/ . The server also exposes RESTful functionality allowing programmatic prediction requests. An executable version of FESS for Linux can be downloaded from URL: protein.bio.unipd.it/download/. Contact silvio.tosatto@unipd.it. Supplementary information Supplementary data are available at Bioinformatics online.
|
33
|
Saldova R, Haakensen VD, Rødland E, Walsh I, Stöckmann H, Engebraaten O, Børresen-Dale AL, Rudd PM. Serum N-glycome alterations in breast cancer during multimodal treatment and follow-up. Mol Oncol 2017; 11:1361-1379. [PMID: 28657165 PMCID: PMC5623820 DOI: 10.1002/1878-0261.12105] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/13/2017] [Revised: 06/01/2017] [Accepted: 06/01/2017] [Indexed: 11/09/2022] Open
Abstract
Using our recently developed high-throughput automated platform, N-glycans from all serum glycoproteins from patients with breast cancer were analysed at diagnosis, after neoadjuvant chemotherapy, surgery, radiotherapy and up to 3 years after surgery. Surprisingly, alterations in the serum N-glycome after chemotherapy were pro-inflammatory with an increase in glycan structures associated with cancer. Surgery, on the other hand, induced anti-inflammatory changes in the serum N-glycome, towards a noncancerous phenotype. At the time of first follow-up, glycosylation in patients with affected lymph nodes changed towards a malignant phenotype. C-reactive protein showed a different pattern, increasing after first line of neoadjuvant chemotherapy, then decreasing throughout treatment until 1 year after surgery. This may reflect a switch from acute to chronic inflammation, where chronic inflammation is reflected in the serum after the acute phase response subsides. In conclusion, we here present the first time-course serum N-glycome profiling of patients with breast cancer during and after treatment. We identify significant glycosylation changes with chemotherapy, surgery and follow-up, reflecting the host response to therapy and tumour removal.
|
34
|
Sonke J, B. Lee J, Helgemo M, Rollins J, Carytsas F, Imus S, Lambert PD, Mullen T, Pabst M, Rosal M, Spooner H, Walsh I. Arts in health: considering language from an educational perspective in the United States. Arts Health 2017. [DOI: 10.1080/17533015.2017.1334680] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/19/2022]
|
35
|
McCauley SHJ, Walsh I. A case of repetitive penile fracture: an increasingly observed phenomenon. Journal of Clinical Urology 2017. [DOI: 10.1177/2051415816664276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
|
36
|
Isaac A, Davey P, Gilliland R, Loughrey MB, Walsh I. The road less travelled: a novel description of a urachal remnant causing small bowel obstruction. Journal of Clinical Urology 2017. [DOI: 10.1177/2051415816686779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
37
|
Cunningham P, Noble H, Al-Modhefer AK, Walsh I. Kidney stones: pathophysiology, diagnosis and management. Br J Nurs 2017; 25:1112-1116. [PMID: 27834524 DOI: 10.12968/bjon.2016.25.20.1112] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 11/11/2022]
Abstract
The prevalence of kidney stones is increasing, and approximately 12 000 hospital admissions every year are due to this condition. This article will use a case study to focus on a patient diagnosed with a calcium oxalate kidney stone. It will discuss the affected structures in relation to kidney stones and describe the pathology of the condition. Investigations for kidney stones, differential diagnosis and diagnosis, possible complications and prognosis, will be discussed. Finally, a detailed account of management strategies for the patient with kidney stones will be given, looking at pain management, medical procedures and dietary interventions.
|
38
|
Walsh I, Zhao S, Campbell M, Taron CH, Rudd PM. Quantitative profiling of glycans and glycopeptides: an informatics' perspective. Curr Opin Struct Biol 2016; 40:70-80. [PMID: 27522273 DOI: 10.1016/j.sbi.2016.07.022] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 05/06/2016] [Revised: 07/25/2016] [Accepted: 07/30/2016] [Indexed: 12/16/2022]
Abstract
Experimental techniques to identify and quantify glycan structures in a given sample are continuously improving. However, as they advance, data analysis and annotation seem to become more complex. To address this issue, much progress has been made in developing software for interpretation of quantitative glycan profiles. Here, we focus on these informatics tools for high/ultra performance liquid chromatography (H/UPLC), mass spectrometry (MS), tandem mass spectrometry (MSn) and combinations thereof. Software for biomarker discovery and for pathway, genomic and disease analysis is also mentioned, along with a final note on some future prospects for glycoinformatics.
|
39
|
Walsh I, Pollastri G, Tosatto SCE. Correct machine learning on protein sequences: a peer-reviewing perspective. Brief Bioinform 2015; 17:831-40. [PMID: 26411473 DOI: 10.1093/bib/bbv082] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/06/2015] [Indexed: 12/20/2022] Open
Abstract
Machine learning methods are becoming increasingly popular to predict protein features from sequences. Machine learning in bioinformatics can be powerful but carries also the risk of introducing unexpected biases, which may lead to an overestimation of the performance. This article espouses a set of guidelines to allow both peer reviewers and authors to avoid common machine learning pitfalls. Understanding biology is necessary to produce useful data sets, which have to be large and diverse. Separating the training and test process is imperative to avoid over-selling method performance, which is also dependent on several hidden parameters. A novel predictor has always to be compared with several existing methods, including simple baseline strategies. Using the presented guidelines will help nonspecialists to appreciate the critical issues in machine learning.
|
40
|
Potenza E, Di Domenico T, Walsh I, Tosatto SCE. MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res 2014; 43:D315-20. [PMID: 25361972 PMCID: PMC4384034 DOI: 10.1093/nar/gku982] [Citation(s) in RCA: 152] [Impact Index Per Article: 15.2] [Indexed: 02/07/2023] Open
Abstract
MobiDB (http://mobidb.bio.unipd.it/) is a database of intrinsically disordered and mobile proteins. Intrinsically disordered regions are key for the function of numerous proteins. Here we provide a new version of MobiDB, a centralized source aimed at providing the most complete picture on different flavors of disorder in protein structures covering all UniProt sequences (currently over 80 million). The database features three levels of annotation: manually curated, indirect and predicted. Manually curated data is extracted from the DisProt database. Indirect data is inferred from PDB structures that are considered an indication of intrinsic disorder. The 10 predictors currently included (three ESpritz flavors, two IUPred flavors, two DisEMBL flavors, GlobPlot, VSL2b and JRONN) enable MobiDB to provide disorder annotations for every protein in absence of more reliable data. The new version also features a consensus annotation and classification for long disordered regions. In order to complement the disorder annotations, MobiDB features additional annotations from external sources. Annotations from the UniProt database include post-translational modifications and linear motifs. Pfam annotations are displayed in graphical form and are link-enabled, allowing the user to visit the corresponding Pfam page for further information. Experimental protein–protein interactions from STRING are also classified for disorder content.
|
41
|
Walsh I, Giollo M, Di Domenico T, Ferrari C, Zimmermann O, Tosatto SCE. Comprehensive large-scale assessment of intrinsic protein disorder. Bioinformatics 2014; 31:201-8. [PMID: 25246432 DOI: 10.1093/bioinformatics/btu625] [Citation(s) in RCA: 123] [Impact Index Per Article: 12.3] [Indexed: 12/23/2022]
Abstract
MOTIVATION Intrinsically disordered regions are key for the function of numerous proteins. Due to the difficulties in experimental disorder characterization, many computational predictors have been developed with various disorder flavors. Their performance is generally measured on small sets mainly from experimentally solved structures, e.g. Protein Data Bank (PDB) chains. MobiDB has only recently started to collect disorder annotations from multiple experimental structures. RESULTS MobiDB annotates disorder for UniProt sequences, allowing us to conduct the first large-scale assessment of fast disorder predictors on 25 833 different sequences with X-ray crystallographic structures. In addition to a comprehensive ranking of predictors, this analysis produced the following interesting observations. (i) The predictors cluster according to their disorder definition, with a consensus giving more confidence. (ii) Previous assessments appear over-reliant on data annotated at the PDB chain level and performance is lower on entire UniProt sequences. (iii) Long disordered regions are harder to predict. (iv) Depending on the structural and functional types of the proteins, differences in prediction performance of up to 10% are observed. AVAILABILITY The datasets are available from Web site at URL: http://mobidb.bio.unipd.it/lsd. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
42
|
Walsh I, Seno F, Tosatto SCE, Trovato A. PASTA 2.0: an improved server for protein aggregation prediction. Nucleic Acids Res 2014; 42:W301-7. [PMID: 24848016 PMCID: PMC4086119 DOI: 10.1093/nar/gku399] [Citation(s) in RCA: 280] [Impact Index Per Article: 28.0] [Indexed: 01/16/2023] Open
Abstract
The formation of amyloid aggregates upon protein misfolding is related to several devastating degenerative diseases. The propensities of different protein sequences to aggregate into amyloids, how they are enhanced by pathogenic mutations, the presence of aggregation hot spots stabilizing pathological interactions, the establishing of cross-amyloid interactions between co-aggregating proteins, all rely at the molecular level on the stability of the amyloid cross-beta structure. Our redesigned server, PASTA 2.0, provides a versatile platform where all of these different features can be easily predicted on a genomic scale given input sequences. The server provides other pieces of information, such as intrinsic disorder and secondary structure predictions, that complement the aggregation data. The PASTA 2.0 energy function evaluates the stability of putative cross-beta pairings between different sequence stretches. It was re-derived on a larger dataset of globular protein domains. The resulting algorithm was benchmarked on comprehensive peptide and protein test sets, leading to improved, state-of-the-art results with more amyloid forming regions correctly detected at high specificity. The PASTA 2.0 server can be accessed at http://protein.bio.unipd.it/pasta2/.
|
43
|
Giollo M, Martin AJM, Walsh I, Ferrari C, Tosatto SCE. NeEMO: a method using residue interaction networks to improve prediction of protein stability upon mutation. BMC Genomics 2014; 15 Suppl 4:S7. [PMID: 25057121 PMCID: PMC4083412 DOI: 10.1186/1471-2164-15-s4-s7] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The rapid growth of un-annotated missense variants poses challenges requiring novel strategies for their interpretation. From the thermodynamic point of view, amino acid changes can lead to a change in the internal energy of a protein and induce structural rearrangements. This is of great relevance for the study of diseases and protein design, justifying the development of prediction methods for variant-induced stability changes. RESULTS Here we propose NeEMO, a tool for the evaluation of stability changes using an effective representation of proteins based on residue interaction networks (RINs). RINs are used to extract useful features describing interactions of the mutant amino acid with its structural environment. Benchmarking shows NeEMO to be very effective, allowing reliable predictions in different parts of the protein such as β-strands and buried residues. Validation on a previously published independent dataset shows that NeEMO has a Pearson correlation coefficient of 0.77 and a standard error of 1 kcal/mol, outperforming nine recent methods. The NeEMO web server can be freely accessed from URL: http://protein.bio.unipd.it/neemo/. CONCLUSIONS NeEMO offers an innovative and reliable tool for the annotation of amino acid changes. RINs are a key contribution, as they can be used for modeling proteins and their interactions effectively. Interestingly, the approach is very general, and can motivate the development of a new family of RIN-based protein structure analyzers. NeEMO may suggest innovative strategies for bioinformatics tools beyond protein stability prediction.
|
44
|
Kukic P, Mirabello C, Tradigo G, Walsh I, Veltri P, Pollastri G. Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks. BMC Bioinformatics 2014; 15:6. [PMID: 24410833 PMCID: PMC3893389 DOI: 10.1186/1471-2105-15-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 01/28/2013] [Accepted: 12/20/2013] [Indexed: 11/21/2022] Open
Abstract
Background Protein inter-residue contact maps provide a translation and rotation invariant topological representation of a protein. They can be used as an intermediary step in protein structure predictions. However, the prediction of contact maps represents an unbalanced problem as far fewer examples of contacts than non-contacts exist in a protein structure. In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-value distances between residues. Predicting full inter-residue distance maps and applying them in protein structure predictions has been relatively unexplored in the past. Results We initially demonstrate that the use of native-like distance maps is able to reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5Å. In addition, the corrupted physical maps with an introduced random error of ±6Å are able to reconstruct the targets within an average RMSD of 2Å. After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor is able to reproduce distances with an RMSD of 6Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology and outperforms the best template hit with a clear margin of up to 3.7Å. Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to the state-of-the-art machine learning approach implemented in the Distill server. 
Conclusions The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improve machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure predictions either on its own or complemented by NMR restraints.
|
45
|
Walsh I, Di Domenico T, Tosatto SCE. RUBI: rapid proteomic-scale prediction of lysine ubiquitination and factors influencing predictor performance. Amino Acids 2013; 46:853-62. [PMID: 24363213 DOI: 10.1007/s00726-013-1645-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 09/07/2013] [Accepted: 12/11/2013] [Indexed: 11/25/2022]
Abstract
Post-translational modification of protein lysines was recently shown to be a common feature of eukaryotic organisms. The ubiquitin modification is regarded as a versatile regulatory mechanism with many important cellular roles. Large-scale datasets are becoming available for H. sapiens ubiquitination. However, using current experimental techniques the vast majority of their sites remain unidentified and in silico tools may offer an alternative. Here, we introduce Rapid UBIquitination (RUBI), a sequence-based ubiquitination predictor designed for rapid application on a genome scale. RUBI was constructed using an iterative approach. At each iteration, important factors which influenced performance and usability were investigated. The final RUBI model has an AUC of 0.868 on a large cross-validation set and is shown to outperform other available methods on independent sets. Predicted intrinsic disorder is shown to be weakly anti-correlated to ubiquitination for the H. sapiens dataset and improves performance slightly. RUBI predicts the number of ubiquitination sites correctly within three sites for ca. 80% of the tested proteins. The average potentially ubiquitinated proteome fraction is predicted to be at least 25% across a variety of model organisms, including several thousand possible H. sapiens proteins awaiting experimental characterization. RUBI can accurately predict ubiquitination on unseen examples and has a signal across different eukaryotic organisms. The factors which influenced the construction of RUBI could also be tested in other post-translational modification predictors. One of the more interesting factors is the influence of intrinsic protein disorder on ubiquitinated lysines, where residues with low disorder probability are preferred.
|
46
|
Di Domenico T, Potenza E, Walsh I, Parra RG, Giollo M, Minervini G, Piovesan D, Ihsan A, Ferrari C, Kajava AV, Tosatto SCE. RepeatsDB: a database of tandem repeat protein structures. Nucleic Acids Res 2013; 42:D352-7. [PMID: 24311564 PMCID: PMC3964956 DOI: 10.1093/nar/gkt1175] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 01/24/2023] Open
Abstract
RepeatsDB (http://repeatsdb.bio.unipd.it/) is a database of annotated tandem repeat protein structures. Tandem repeats pose a difficult problem for the analysis of protein structures, as the underlying sequence can be highly degenerate. Several repeat types have been studied over the years, but their annotation was done on a case-by-case basis, thus making large-scale analysis difficult. We developed RepeatsDB to fill this gap. Using state-of-the-art repeat detection methods and manual curation, we systematically annotated the Protein Data Bank, predicting 10 745 repeat structures. In all, 2797 structures were classified according to a recently proposed classification schema, which was expanded to accommodate new findings. In addition, detailed annotations were performed in a subset of 321 proteins. These annotations feature information on start and end positions for the repeat regions and units. RepeatsDB is an ongoing effort to systematically classify and annotate structural protein repeats in a consistent way. It provides users with the possibility to access and download high-quality datasets either interactively or programmatically through web services.
|
47
|
Martin AJM, Walsh I, Domenico TD, Mičetić I, Tosatto SCE. PANADA: protein association network annotation, determination and analysis. PLoS One 2013; 8:e78383. [PMID: 24265686 PMCID: PMC3827049 DOI: 10.1371/journal.pone.0078383] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/04/2013] [Accepted: 09/20/2013] [Indexed: 11/18/2022] Open
Abstract
Increasingly large numbers of proteins require methods for functional annotation. This is typically based on pairwise inference from the homology of either protein sequence or structure. Recently, similarity networks have been presented to leverage both the ability to visualize relationships between proteins and assess the transferability of functional inference. Here we present PANADA, a novel toolkit for the visualization and analysis of protein similarity networks in Cytoscape. Networks can be constructed based on pairwise sequence or structural alignments either on a set of proteins or, alternatively, by database search from a single sequence. The PANADA web server, a downloadable executable, examples and extensive help files are available at http://protein.bio.unipd.it/panada/.
|
48
|
Di Domenico T, Walsh I, Tosatto SCE. Analysis and consensus of currently available intrinsic protein disorder annotation sources in the MobiDB database. BMC Bioinformatics 2013; 14 Suppl 7:S3. [PMID: 23815411 PMCID: PMC3633070 DOI: 10.1186/1471-2105-14-s7-s3] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Intrinsic protein disorder is becoming an increasingly important topic in protein science. During the last few years, intrinsically disordered proteins (IDPs) have been shown to play a role in many important biological processes, e.g. protein signalling and regulation. This has sparked a need to better understand and characterize different types of IDPs, their functions and roles. Our recently published database, MobiDB, provides a centralized resource for accessing and analysing intrinsic protein disorder annotations. RESULTS Here, we present a thorough description and analysis of the data made available by MobiDB, providing descriptive statistics on the various available annotation sources. Version 1.2.1 of the database contains annotations for ca. 4,500,000 UniProt sequences, covering all eukaryotic proteomes. In addition, we describe a novel consensus annotation calculation and its related weighting scheme. The comparison between disorder information sources highlights how the MobiDB consensus captures the main features of intrinsic disorder and correlates well with manually curated datasets. Finally, we demonstrate the annotation of 13 eukaryotic model organisms through MobiDB's datasets, and of an example protein through the interactive user interface. CONCLUSIONS MobiDB is a central resource for intrinsic disorder research, containing both experimental data and predictions. In the future it will be expanded to include additional information for all known proteins.
|
49
|
Walsh I, Sirocco FG, Minervini G, Di Domenico T, Ferrari C, Tosatto SCE. RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures. Bioinformatics 2012; 28:3257-64. [PMID: 22962341 DOI: 10.1093/bioinformatics/bts550] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022]
Abstract
MOTIVATION Repeat proteins form a distinct class of structures where folding is greatly simplified. Several classes have been defined, with solenoid repeats of periodicity between ca. 5 and 40 being the most challenging to detect. Such proteins evolve quickly and their periodicity may be rapidly hidden at sequence level. From a structural point of view, finding solenoids may be complicated by the presence of insertions or multiple domains. To the best of our knowledge, no automated methods are available to characterize solenoid repeats from structure. RESULTS Here we introduce RAPHAEL, a novel method for the detection of solenoids in protein structures. It reliably solves three problems of increasing difficulty: (1) recognition of solenoid domains, (2) determination of their periodicity and (3) assignment of insertions. RAPHAEL uses a geometric approach mimicking manual classification, producing several numeric parameters that are optimized for maximum performance. The resulting method is very accurate, with 89.5% of solenoid proteins and 97.2% of non-solenoid proteins correctly classified. RAPHAEL periodicities have a Spearman correlation coefficient of 0.877 against the manually established ones. A baseline algorithm for insertion detection in identified solenoids has a Q(2) value of 79.8%, suggesting room for further improvement. RAPHAEL finds 1931 highly confident repeat structures not previously annotated as solenoids in the Protein Data Bank records.
|
50
|
Walsh I, Minervini G, Corazza A, Esposito G, Tosatto SCE, Fogolari F. Bluues server: electrostatic properties of wild-type and mutated protein structures. Bioinformatics 2012; 28:2189-90. [PMID: 22711791 DOI: 10.1093/bioinformatics/bts343] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Electrostatic calculations are an important tool for deciphering many functional mechanisms in proteins. Generalized Born (GB) models offer a fast and convenient computational approximation over other implicit solvent-based electrostatic models. Here we present a novel GB-based web server, using the program Bluues, to calculate numerous electrostatic features including pKa values and surface potentials. The output is organized so that both experts and beginners can rapidly sift through the data. A novel feature of the Bluues server is that it explicitly allows users to find electrostatic differences between wild-type and mutant structures. AVAILABILITY The Bluues server, examples and extensive help files are available for non-commercial use at http://protein.bio.unipd.it/bluues/.
|