Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Harmston N, Filsell W, Stumpf MPH. What the papers say: text mining for genomics and systems biology. Hum Genomics 2010;5:17-29. [PMID: 21106487 PMCID: PMC3500154 DOI: 10.1186/1479-7364-5-1-17] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 08/06/2010] [Accepted: 08/06/2010] [Indexed: 12/11/2022]

Cited by Other Article(s)

Voskamp M, Vinhoven L, Stanke F, Hafkemeyer S, Nietert MM. Integrating Text Mining into the Curation of Disease Maps. Biomolecules 2022;12:1278. [PMID: 36139119 PMCID: PMC9496510 DOI: 10.3390/biom12091278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 09/02/2022] [Accepted: 09/07/2022] [Indexed: 11/16/2022]

Kanakoglou DS, Pampalou A, Vrachnos DM, Karatrasoglou EA, Zouki DN, Dimonitsas E, Klonou A, Kokla G, Theologi V, Christofidou E, Sakellariou S, Lakiotaki E, Piperi C, Korkolopoulou P. Laying the groundwork for the Biobank of Rare Malignant Neoplasms at the service of the Hellenic Network of Precision Medicine on Cancer. Int J Oncol 2022;60:31. [PMID: 35169862 PMCID: PMC8878762 DOI: 10.3892/ijo.2022.5321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2021] [Accepted: 12/23/2021] [Indexed: 11/06/2022]

Affiliation(s)

Dimitrios S. Kanakoglou First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Andromachi Pampalou First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Dimitrios M. Vrachnos First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Eleni A. Karatrasoglou First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Dionysia N. Zouki First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Emmanouil Dimonitsas Department of Plastic and Reconstructive Surgery, Greek Anticancer Institute, Saint Savvas Hospital, 11522 Athens, Greece
Alexia Klonou Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Georgia Kokla First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Varvara Theologi Department of Pathology, Andreas Syggros Hospital of Cutaneous and Venereal Diseases, 16121 Athens, Greece
Errieta Christofidou Department of Pathology, Andreas Syggros Hospital of Cutaneous and Venereal Diseases, 16121 Athens, Greece
Stratigoula Sakellariou First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Eleftheria Lakiotaki First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Christina Piperi Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
Penelope Korkolopoulou First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece

Zafeiropoulos H, Paragkamian S, Ninidakis S, Pavlopoulos GA, Jensen LJ, Pafilis E. PREGO: A Literature and Data-Mining Resource to Associate Microorganisms, Biological Processes, and Environment Types. Microorganisms 2022;10:293. [PMID: 35208748 PMCID: PMC8879827 DOI: 10.3390/microorganisms10020293] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/27/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 12/12/2022]

Ali I, Dreij K, Baker S, Högberg J, Korhonen A, Stenius U. Application of Text Mining in Risk Assessment of Chemical Mixtures: A Case Study of Polycyclic Aromatic Hydrocarbons (PAHs). Environ Health Perspect 2021;129:67008. [PMID: 34165340 PMCID: PMC8318069 DOI: 10.1289/ehp6702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/19/2019] [Revised: 05/07/2021] [Accepted: 05/10/2021] [Indexed: 05/08/2023]

Abstract

BACKGROUND

Cancer risk assessment of complex exposures, such as exposure to mixtures of polycyclic aromatic hydrocarbons (PAHs), is challenging due to the diverse biological activities of these compounds. With the help of text mining (TM), we have developed two TM tools, the latest iteration of the Cancer Risk Assessment using Biomedical literature tool (CRAB3) and a Cancer Hallmarks Analytics Tool (CHAT), that could be useful for automatic literature analyses in cancer risk assessment and research. Whereas CRAB3 analyses are based on carcinogenic modes of action (MOAs) and cover almost all the key characteristics of carcinogens, CHAT evaluates literature according to the hallmarks of cancer, that is, the alterations in cellular behavior that characterize the cancer cell.

OBJECTIVES

The objective was to evaluate the usefulness of these tools to support cancer risk assessment by performing a case study of 22 European Union and U.S. Environmental Protection Agency priority PAHs and diesel exhaust and a case study of PAH interactions with silica.

METHODS

We analyzed PubMed literature, comprising 57,498 references concerning priority PAHs and complex PAH mixtures, using CRAB3 and CHAT.

RESULTS

CRAB3 analyses correctly identified similarities and differences in genotoxic and nongenotoxic MOAs of the 22 priority PAHs and grouped them according to their known carcinogenic potential. CHAT had the same capacity and complemented the CRAB output when comparing, for example, benzo[a]pyrene and dibenzo[a,l]pyrene. Both CRAB3 and CHAT analyses highlighted potentially interacting mechanisms within and across complex PAH mixtures and mechanisms of possible importance for interactions with silica.

CONCLUSION

These data suggest that our TM approach can be useful in the hazard identification of PAHs and mixtures including PAHs. The tools can assist in grouping chemicals and identifying similarities and differences in carcinogenic MOAs and their interactions. https://doi.org/10.1289/EHP6702.

Herrgårdh T, Madai VI, Kelleher JD, Magnusson R, Gustafsson M, Milani L, Gennemark P, Cedersund G. Hybrid modelling for stroke care: Review and suggestions of new approaches for risk assessment and simulation of scenarios. Neuroimage Clin 2021;31:102694. [PMID: 34000646 PMCID: PMC8141769 DOI: 10.1016/j.nicl.2021.102694] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/04/2020] [Revised: 04/27/2021] [Accepted: 05/04/2021] [Indexed: 11/28/2022]

Singh G, Papoutsoglou EA, Keijts-Lalleman F, Vencheva B, Rice M, Visser RG, Bachem CW, Finkers R. Extracting knowledge networks from plant scientific literature: potato tuber flesh color as an exemplary trait. BMC Plant Biol 2021;21:198. [PMID: 33894758 PMCID: PMC8070292 DOI: 10.1186/s12870-021-02943-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2020] [Accepted: 03/29/2021] [Indexed: 06/12/2023]

Abstract

BACKGROUND

Scientific literature carries a wealth of information crucial for research, but only a fraction of it is present as structured information in databases and therefore can be analyzed using traditional data analysis tools. Natural language processing (NLP) is often and successfully employed to support humans by distilling relevant information from large corpora of free text and structuring it in a way that lends itself to further computational analyses. For this pilot, we developed a pipeline that uses NLP on biological literature to produce knowledge networks. We focused on the flesh color of potato, a well-studied trait with known associations, and we investigated whether these knowledge networks can assist us in formulating new hypotheses on the underlying biological processes.

RESULTS

We trained an NLP model based on a manually annotated corpus of 34 full-text potato articles to recognize relevant biological entities (genes, proteins, metabolites and traits) and the relationships between them in text. This model detected biological entities with a precision of 97.65% and a recall of 88.91% on the training set. We conducted a time series analysis on 4,023 PubMed abstracts of plant genetics-based articles that focus on 4 major Solanaceous crops (tomato, potato, eggplant and capsicum), and determined that the networks contained both previously known and contemporaneously unknown leads to subsequently discovered biological phenomena relating to flesh color. A novel time-based analysis of these networks indicates a connection between our trait and a candidate gene (zeaxanthin epoxidase) two years before that connection was stated explicitly in the literature.

CONCLUSIONS

Our time-based analysis indicates that network-assisted hypothesis generation shows promise for knowledge discovery, data integration and hypothesis generation in scientific research.
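
The precision and recall figures quoted in the Results above follow the standard definitions for entity recognition. The sketch below is only an illustration of those definitions in Python; the function name `precision_recall` and the toy annotations are hypothetical and are not taken from the authors' pipeline.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (entity text, entity type) pairs.
gold = {
    ("zeaxanthin epoxidase", "gene"),
    ("flesh color", "trait"),
    ("zeaxanthin", "metabolite"),
    ("lutein", "metabolite"),
}
predicted = {
    ("zeaxanthin epoxidase", "gene"),
    ("flesh color", "trait"),
    ("tuber shape", "trait"),
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

In this toy case two of the three predictions are correct and two of the four gold entities are recovered, so precision is higher than recall, the same direction as in the reported figures.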

Kaushik V, Plazzer J, Macrae F. Evaluation of literature searching tools for curation of mismatch repair gene variants in hereditary colon cancer. Adv Genet (Hoboken) 2021;2:e10039. [PMID: 36618447 PMCID: PMC9744508 DOI: 10.1002/ggn2.10039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2020] [Revised: 01/12/2021] [Accepted: 01/14/2021] [Indexed: 01/11/2023]

Abstract

Pathogenic constitutional genomic variants in the mismatch repair (MMR) genes are the drivers of Lynch syndrome; optimal variant interpretation is required for the management of suspected and confirmed cases. The International Society for Hereditary Gastrointestinal Tumours (InSiGHT) provides expert classifications for MMR variants for the US National Human Genome Research Institute's (NHGRI) ClinGen initiative and interprets variants with discordant classifications and those of uncertain significance (VUSs). Given the onerous nature of extracting information related to variants, literature searching tools which harness artificial intelligence may aid in retrieving information to allow optimum variant classification. In this study, we described the nature of discordance in a sample of 80 variants from a list of variants requiring updating by InSiGHT for ClinGen by comparing their existing InSiGHT classifications with the various submissions for each variant on the US National Centre for Biotechnology Information's (NCBI) ClinVar database. To identify the potential value of a literature searching tool in extracting information related to classification, all variants were searched for using a traditional method (Google Scholar) and a literature searching tool (Mastermind) independently. Descriptive statistics were used to compare the number of articles before and after screening for relevance and the number of relevant articles unique to either method. Relevance was defined as containing the variant in question as well as data informing variant interpretation. A total of 916 articles were returned by both methods and Mastermind averaged four relevant articles per search compared to Google Scholar's three. Of relevant Mastermind articles, 193/308 (62.7%) were unique to it, compared to 87/202 (43.0%) for Google Scholar. For 24 variants, either or both methods found no information. All 6/80 (20%) variants with pathogenic or likely pathogenic InSiGHT classifications have newer VUS assertions on ClinVar. Our study demonstrated that for a sample of variants with varying discordant interpretations, Mastermind was able to return, on average, a more relevant and unique literature search. Google Scholar was able to retrieve information that Mastermind did not, which supports a conclusion that Mastermind could play a complementary role in literature searching for classification. This work will aid InSiGHT in its role of classifying MMR variants.

Bao Y, Deng Z, Wang Y, Kim H, Armengol VD, Acevedo F, Ouardaoui N, Wang C, Parmigiani G, Barzilay R, Braun D, Hughes KS. Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes. JCO Clin Cancer Inform 2020;3:1-9. [PMID: 31545655 DOI: 10.1200/cci.19.00042] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/15/2022]

Abstract

PURPOSE

The medical literature relevant to germline genetics is growing exponentially. Clinicians need tools that help to monitor and prioritize the literature to understand the clinical implications of pathogenic genetic variants. We developed and evaluated two machine learning models to classify abstracts as relevant to the penetrance (the risk of cancer for germline mutation carriers) or the prevalence of germline genetic mutations.

MATERIALS AND METHODS

We conducted literature searches in PubMed and retrieved paper titles and abstracts to create an annotated data set for training and evaluating the two machine learning classification models. Our first model is a support vector machine (SVM) which learns a linear decision rule on the basis of the bag-of-ngrams representation of each title and abstract. Our second model is a convolutional neural network (CNN) which learns a complex nonlinear decision rule on the basis of the raw title and abstract. We evaluated the performance of the two models on the classification of papers as relevant to penetrance or prevalence.

RESULTS

For penetrance classification, we annotated 3,740 paper titles and abstracts and evaluated the two models using 10-fold cross-validation. The SVM model achieved 88.93% accuracy (the percentage of papers that were correctly classified), whereas the CNN model achieved 88.53% accuracy. For prevalence classification, we annotated 3,753 paper titles and abstracts. The SVM model achieved 88.92% accuracy and the CNN model achieved 88.52% accuracy.

CONCLUSION

Our models achieve high accuracy in classifying abstracts as relevant to penetrance or prevalence. By facilitating literature review, this tool could help clinicians and researchers keep abreast of the burgeoning knowledge of gene-cancer associations and keep the knowledge bases for clinical decision support tools up to date.
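
The evaluation described above (a linear decision rule over a bag-of-n-grams representation, scored by k-fold cross-validated accuracy) can be sketched in a few lines of standard-library Python. This is a rough illustration under stated assumptions, not the authors' implementation: a simple perceptron stands in for the SVM, tokenization is plain whitespace splitting, and all function names are hypothetical.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Bag-of-n-grams: counts of all word 1..n_max-grams in the text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def train_perceptron(examples, epochs=10):
    """Learn a linear rule; labels are +1 (relevant) or -1 (not relevant).

    Assumption: a perceptron replaces the SVM used in the paper.
    """
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for feats, label in examples:
            score = bias + sum(weights[f] * v for f, v in feats.items())
            if label * score <= 0:  # misclassified: move toward the label
                for f, v in feats.items():
                    weights[f] += label * v
                bias += label
    return weights, bias


def predict(model, feats):
    weights, bias = model
    score = bias + sum(weights[f] * v for f, v in feats.items())
    return 1 if score > 0 else -1


def cross_val_accuracy(texts, labels, k=10):
    """Fraction of papers classified correctly across k held-out folds."""
    data = [(ngrams(t), y) for t, y in zip(texts, labels)]
    correct = 0
    for fold in range(k):
        held_out = data[fold::k]
        train = [ex for i, ex in enumerate(data) if i % k != fold]
        model = train_perceptron(train)
        correct += sum(predict(model, f) == y for f, y in held_out)
    return correct / len(data)
```

Calling `cross_val_accuracy(texts, labels, k=10)` on a list of title-plus-abstract strings and matching +1/-1 labels returns a single accuracy value comparable in kind, though not in magnitude, to the percentages reported in the Results.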

Ogris C, Guala D, Sonnhammer ELL. FunCoup 4: new species, data, and visualization. Nucleic Acids Res 2019;46:D601-D607. [PMID: 29165593 PMCID: PMC5755233 DOI: 10.1093/nar/gkx1138] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 09/08/2017] [Accepted: 10/31/2017] [Indexed: 01/22/2023]

Guala D, Ogris C, Müller N, Sonnhammer ELL. Genome-wide functional association networks: background, data & state-of-the-art resources. Brief Bioinform 2019;21:1224-1237. [PMID: 31281921 PMCID: PMC7373183 DOI: 10.1093/bib/bbz064] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/05/2019] [Revised: 04/29/2019] [Accepted: 05/04/2019] [Indexed: 02/06/2023]

Couch D, Yu Z, Nam JH, Allen C, Ramos PS, da Silveira WA, Hunt KJ, Hazard ES, Hardiman G, Lawson A, Chung D. GAIL: An interactive webserver for inference and dynamic visualization of gene-gene associations based on gene ontology guided mining of biomedical literature. PLoS One 2019;14:e0219195. [PMID: 31260503 PMCID: PMC6602258 DOI: 10.1371/journal.pone.0219195] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/23/2019] [Accepted: 06/18/2019] [Indexed: 01/08/2023]

Xu J, Yang P, Xue S, Sharma B, Sanchez-Martin M, Wang F, Beaty KA, Dehan E, Parikh B. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet 2019;138:109-124. [PMID: 30671672 PMCID: PMC6373233 DOI: 10.1007/s00439-019-01970-5] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 09/27/2018] [Accepted: 01/02/2019] [Indexed: 02/07/2023]

Zhao N, Zheng G, Li J, Zhao HY, Lu C, Jiang M, Zhang C, Guo HT, Lu AP. Text Mining of Rheumatoid Arthritis and Diabetes Mellitus to Understand the Mechanisms of Chinese Medicine in Different Diseases with Same Treatment. Chin J Integr Med 2018;24:777-784. [PMID: 29327123 DOI: 10.1007/s11655-018-2825-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Accepted: 03/25/2016] [Indexed: 11/28/2022]

Fergadis A, Baziotis C, Pappas D, Papageorgiou H, Potamianos A. Hierarchical bi-directional attention-based RNNs for supporting document classification on protein-protein interactions affected by genetic mutations. Database (Oxford) 2018;2018:5077305. [PMID: 30137284 PMCID: PMC6105093 DOI: 10.1093/database/bay076] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/15/2018] [Accepted: 06/22/2018] [Indexed: 02/03/2023]

Babtie AC, Stumpf MPH. How to deal with parameters for whole-cell modelling. J R Soc Interface 2017;14:20170237. [PMID: 28768879 PMCID: PMC5582120 DOI: 10.1098/rsif.2017.0237] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 03/30/2017] [Accepted: 06/22/2017] [Indexed: 11/12/2022]

Gomez-Cabrero D, Tegnér J. Iterative Systems Biology for Medicine – Time for advancing from network signatures to mechanistic equations. Curr Opin Syst Biol 2017. [DOI: 10.1016/j.coisb.2017.05.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]

Gomez-Cabrero D, Menche J, Vargas C, Cano I, Maier D, Barabási AL, Tegnér J, Roca J. From comorbidities of chronic obstructive pulmonary disease to identification of shared molecular mechanisms by data integration. BMC Bioinformatics 2016;17:441. [PMID: 28185567 PMCID: PMC5133493 DOI: 10.1186/s12859-016-1291-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022]

Abstract

Background

Deep mining of healthcare data has provided maps of comorbidity relationships between diseases. In parallel, integrative multi-omics investigations have generated high-resolution molecular maps of putative relevance for understanding disease initiation and progression. Yet, it is unclear how to advance an observation of comorbidity relations (one disease to others) to a molecular understanding of the driver processes and associated biomarkers.

Results

Since chronic obstructive pulmonary disease (COPD) has emerged as a central hub in temporal comorbidity networks, we developed a systematic integrative data-driven framework to identify shared disease-associated genes and pathways, as a proxy for the underlying generative mechanisms inducing comorbidity. We integrated records from approximately 13 million patients from the Medicare database with disease-gene maps that we derived from several resources, including a semantic-derived knowledge base. Using rank-based statistics, we not only recovered known comorbidities but also discovered a novel association between COPD and digestive diseases. Furthermore, our analysis provides the first set of COPD comorbidity candidate biomarkers, including IL15, TNF and JUP, and characterizes their association with aging and lifestyle conditions, such as smoking and physical activity.

Conclusions

The developed framework provides novel insights into COPD and especially into COPD comorbidity-associated mechanisms. The methodology could be used to discover and decipher the molecular underpinning of other comorbidity relationships and, furthermore, allow the identification of candidate comorbidity biomarkers.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1291-3) contains supplementary material, which is available to authorized users.

Affiliation(s)

David Gomez-Cabrero Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden; Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden; Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden; Science for Life Laboratory, Solna, 17121, Sweden; Mucosal and Salivary Biology Division, King's College London Dental Institute, London, UK.
Jörg Menche Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Network Science, Central European University, Budapest, Hungary.
Claudia Vargas Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain.
Isaac Cano Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain.
Dieter Maier Biomax Informatics AG, Planegg, Germany.
Albert-László Barabási Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Network Science, Central European University, Budapest, Hungary; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
Jesper Tegnér Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden; Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden; Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden; Science for Life Laboratory, Solna, 17121, Sweden.
Josep Roca Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain.

Tennant JP, Waldner F, Jacques DC, Masuzzo P, Collister LB, Hartgerink CHJ. The academic, economic and societal impacts of Open Access: an evidence-based review. F1000Res 2016;5:632. [PMID: 27158456 DOI: 10.12688/f1000research.8460.1] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Accepted: 04/08/2016] [Indexed: 11/20/2022]

Abstract

Ongoing debates surrounding Open Access to the scholarly literature are multifaceted and complicated by disparate and often polarised viewpoints from engaged stakeholders. At the current stage, Open Access has become such a global issue that it is critical for all involved in scholarly publishing, including policymakers, publishers, research funders, governments, learned societies, librarians, and academic communities, to be well-informed on the history, benefits, and pitfalls of Open Access. In spite of this, there is a general lack of consensus regarding the potential pros and cons of Open Access at multiple levels. This review aims to be a resource for current knowledge on the impacts of Open Access by synthesizing important research in three major areas: academic, economic and societal. While there is clearly much scope for additional research, several key trends are identified, including a broad citation advantage for researchers who publish openly, as well as additional benefits to the non-academic dissemination of their work. The economic impact of Open Access is less well-understood, although it is clear that access to the research literature is key for innovative enterprises, and a range of governmental and non-governmental services. Furthermore, Open Access has the potential to save both publishers and research funders considerable amounts of financial resources, and can provide some economic benefits to traditionally subscription-based journals. The societal impact of Open Access is strong, in particular for advancing citizen science initiatives, and leveling the playing field for researchers in developing countries. Open Access supersedes all potential alternative modes of access to the scholarly literature through enabling unrestricted re-use, and long-term stability independent of financial constraints of traditional publishers that impede knowledge sharing. 
However, Open Access has the potential to become unsustainable for research communities if high-cost options are allowed to continue to prevail in a widely unregulated scholarly publishing market. Open Access remains only one of the multiple challenges that the scholarly publishing system is currently facing. Yet, it provides one foundation for increasing engagement with researchers regarding ethical standards of publishing and the broader implications of 'Open Research'.

Tennant JP, Waldner F, Jacques DC, Masuzzo P, Collister LB, Hartgerink CHJ. The academic, economic and societal impacts of Open Access: an evidence-based review. F1000Res 2016;5:632. [PMID: 27158456 PMCID: PMC4837983 DOI: 10.12688/f1000research.8460.3] [Citation(s) in RCA: 167] [Impact Index Per Article: 20.9] [Accepted: 09/20/2016] [Indexed: 12/22/2022]

Tennant JP, Waldner F, Jacques DC, Masuzzo P, Collister LB, Hartgerink CHJ. The academic, economic and societal impacts of Open Access: an evidence-based review. F1000Res 2016;5:632. [PMID: 27158456 PMCID: PMC4837983 DOI: 10.12688/f1000research.8460.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 09/20/2016] [Indexed: 12/02/2023]

Abascal MF, Besso MJ, Rosso M, Mencucci MV, Aparicio E, Szapiro G, Furlong LI, Vazquez-Levin MH. CDH1/E-cadherin and solid tumors. An updated gene-disease association analysis using bioinformatics tools. Comput Biol Chem 2015;60:9-20. [PMID: 26674224 DOI: 10.1016/j.compbiolchem.2015.10.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/24/2015] [Revised: 10/17/2015] [Accepted: 10/19/2015] [Indexed: 12/13/2022]

Abstract

Cancer is a group of diseases that causes millions of deaths worldwide. Among cancers, Solid Tumors (ST) stand-out due to their high incidence and mortality rates. Disruption of cell-cell adhesion is highly relevant during tumor progression. Epithelial-cadherin (protein: E-cadherin, gene: CDH1) is a key molecule in cell-cell adhesion and an abnormal expression or/and function(s) contributes to tumor progression and is altered in ST. A systematic study was carried out to gather and summarize current knowledge on CDH1/E-cadherin and ST using bioinformatics resources. The DisGeNET database was exploited to survey CDH1-associated diseases. Reported mutations in specific ST were obtained by interrogating COSMIC and IntOGen tools. CDH1 Single Nucleotide Polymorphisms (SNP) were retrieved from the dbSNP database. DisGeNET analysis identified 609 genes annotated to ST, among which CDH1 was listed. Using CDH1 as query term, 26 disease concepts were found, 21 of which were neoplasms-related terms. Using DisGeNET ALL Databases, 172 disease concepts were identified. Of those, 80 ST disease-related terms were subjected to manual curation and 75/80 (93.75%) associations were validated. On selected ST, 489 CDH1 somatic mutations were listed in COSMIC and IntOGen databases. Breast neoplasms had the highest CDH1-mutation rate. CDH1 was positioned among the 20 genes with highest mutation frequency and was confirmed as driver gene in breast cancer. Over 14,000 SNP for CDH1 were found in the dbSNP database. This report used DisGeNET to gather/compile current knowledge on gene-disease association for CDH1/E-cadherin and ST; data curation expanded the number of terms that relate them. An updated list of CDH1 somatic mutations was obtained with COSMIC and IntOGen databases and of SNP from dbSNP. This information can be used to further understand the role of CDH1/E-cadherin in health and disease.

Affiliation(s)

María Florencia Abascal Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
María José Besso Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
Marina Rosso Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
María Victoria Mencucci Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
Evangelina Aparicio Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
Gala Szapiro Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.
Laura Inés Furlong Research Programme on Biomedical Informatics (GRIB) (IMIM), DCEXS, Universitat Pompeu Fabra, C/Dr Aiguader 88, Zip Code 08003, Barcelona, Spain.
Mónica Hebe Vazquez-Levin Laboratory of Cell-Cell Interaction in Cancer and Reproduction, Instituto de Biología & Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Fundación IBYME (FIBYME), Vuelta de Obligado 2490, Zip Code C1428ADN, Buenos Aires, Argentina.

Kiela D, Guo Y, Stenius U, Korhonen A. Unsupervised discovery of information structure in biomedical documents. Bioinformatics 2015;31:1084-92. [PMID: 25411329 DOI: 10.1093/bioinformatics/btu758] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/14/2014] [Accepted: 11/10/2014] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Information structure (IS) analysis is a text mining technique that classifies text in biomedical articles into categories capturing different types of information, such as the objectives, methods, results and conclusions of research. It is a highly useful technique that can support a range of Biomedical Text Mining tasks and can help readers of biomedical literature find information of interest faster, accelerating the highly time-consuming process of literature review. Several approaches to IS analysis have been presented in the past, with promising results in real-world biomedical tasks. However, all existing approaches, even weakly supervised ones, require several hundred hand-annotated training sentences specific to the domain in question. Because biomedicine is subject to considerable domain variation, such annotations are expensive to obtain. This makes the application of IS analysis across biomedical domains difficult. In this article, we investigate an unsupervised approach to IS analysis and evaluate the performance of several unsupervised methods on a large corpus of biomedical abstracts collected from PubMed.

RESULTS

Our best unsupervised algorithm (a multilevel-weighted graph clustering algorithm) performs very well on the task, obtaining F-scores above 0.70 for most IS categories when applied to well-known IS schemes. This level of performance is close to that of lightly supervised IS methods and has proven sufficient to aid a range of practical tasks. Thus, using an unsupervised approach, IS could be applied to support a wide range of tasks across sub-domains of biomedicine. We also demonstrate that unsupervised learning brings novel insights into the IS of biomedical literature and discovers information categories that are not present in any of the existing IS schemes.

AVAILABILITY AND IMPLEMENTATION

The annotated corpus and software are available at http://www.cl.cam.ac.uk/~dk427/bio14info.html.

Tamaddoni-Nezhad A, Milani GA, Raybould A, Muggleton S, Bohan DA. Construction and Validation of Food Webs Using Logic-Based Machine Learning and Text Mining. Adv Ecol Res 2013. [DOI: 10.1016/b978-0-12-420002-9.00004-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 01/01/2023]

Approaches to verb subcategorization for biomedicine. J Biomed Inform 2012;46:212-27. [PMID: 23276747 DOI: 10.1016/j.jbi.2012.12.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/14/2012] [Revised: 12/05/2012] [Accepted: 12/06/2012] [Indexed: 11/23/2022]

Harmston N, Filsell W, Stumpf MPH. Which species is it? Species-driven gene name disambiguation using random walks over a mixture of adjacency matrices. Bioinformatics 2011;28:254-60. [PMID: 22135416 DOI: 10.1093/bioinformatics/btr640] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]

Carreira R, Carneiro S, Pereira R, Rocha M, Rocha I, Ferreira EC, Lourenço A. Semantic annotation of biological concepts interplaying microbial cellular responses. BMC Bioinformatics 2011;12:460. [PMID: 22122862 PMCID: PMC3259143 DOI: 10.1186/1471-2105-12-460] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/18/2011] [Accepted: 11/28/2011] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Automated extraction systems have become a time-saving necessity in Systems Biology. Considerable human effort is needed to model, analyse and simulate biological networks. Thus, one of the challenges posed to Biomedical Text Mining tools is that of learning to recognise a wide variety of biological concepts with different functional roles to assist in these processes.

RESULTS

Here, we present a novel corpus concerning the integrated cellular responses to nutrient starvation in the model organism Escherichia coli. Our corpus is a unique resource in that it annotates biomedical concepts that play a functional role in expression, regulation and metabolism. Namely, it includes annotations for genetic information carriers (genes, DNA and RNA molecules), proteins (transcription factors, enzymes and transporters), small metabolites, physiological states and laboratory techniques. The corpus consists of 130 full-text papers with a total of 59043 annotations for 3649 different biomedical concepts; the two dominant classes are genes (highest number of unique concepts) and compounds (most frequently annotated concepts), whereas other important cellular concepts such as proteins account for no more than 10% of the annotated concepts.

CONCLUSIONS

To the best of our knowledge, a corpus that details such a wide range of biological concepts has never been presented to the text mining community. The inter-annotator agreement statistics provide evidence of the importance of a consolidated background when dealing with such complex descriptions, the ambiguities naturally arising from the terminology and their impact on modelling purposes. Availability is granted for the full-text corpora of 130 freely accessible documents, the annotation scheme and the annotation guidelines. Also, we include a corpus of 340 abstracts.

Verification of systems biology research in the age of collaborative competition. Nat Biotechnol 2011;29:811-5. [DOI: 10.1038/nbt.1968] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Indexed: 11/08/2022]