1
Ma Y, Li Q, Tang Y, Zhang Z, Liu R, Luo Q, Wang Y, Hu J, Chen Y, Li Z, Zhao C, Ran Y, Mu Y, Li Y, Xu X, Gong Y, He Z, Ba Y, Guo K, Dong K, Li X, Tan W, Zhu Y, Xiang Z, Xu H. The architecture of silk-secreting organs during the final larval stage of silkworms revealed by single-nucleus and spatial transcriptomics. Cell Rep 2024; 43:114460. [PMID: 38996068 DOI: 10.1016/j.celrep.2024.114460] [Received: 02/01/2024] [Revised: 04/26/2024] [Accepted: 06/22/2024] [Indexed: 07/14/2024] Open
Abstract
Natural silks are renewable proteins with impressive mechanical properties and biocompatibility that are useful in various fields. However, the cellular and spatial organization of silk-secreting organs remains unclear. Here, we combined single-nucleus and spatially resolved transcriptomics to systematically map the cellular and spatial composition of the silk glands (SGs) of mulberry silkworms late in larval development. This approach allowed us to profile SG cell types and cell state dynamics and identify regulatory networks and cell-cell communication related to efficient silk protein synthesis; key markers were validated via transgenic approaches. Notably, we demonstrated the indispensable role of the ecdysone receptor (ultraspiracle) in regulating endoreplication in SG cells. Our atlas presents the results of spatiotemporal analysis of silk-secreting organ architecture late in larval development; this atlas provides a valuable reference for elucidating the mechanism of efficient silk protein synthesis and developing sustainable products made from natural silk.
Affiliation(s)
- Yan Ma
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Qingjun Li
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yiyun Tang
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Zhiyong Zhang
- Beijing SeekGene BioSciences Co., Ltd., Beijing 102206, China
- Rongpeng Liu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Qin Luo
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yuting Wang
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Jie Hu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yuqin Chen
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Zhiwei Li
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Chen Zhao
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yiting Ran
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yuanyuan Mu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yinghao Li
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Xiaoqing Xu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yuyan Gong
- Beijing SeekGene BioSciences Co., Ltd., Beijing 102206, China
- Zihan He
- Beijing SeekGene BioSciences Co., Ltd., Beijing 102206, China
- Yongbing Ba
- Shanghai OE Biotech. Co., Ltd., Shanghai 201212, China
- Kaiqi Guo
- Shanghai OE Biotech. Co., Ltd., Shanghai 201212, China
- Keshu Dong
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Xiao Li
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Wei Tan
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yumeng Zhu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Zhonghuai Xiang
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Hanfu Xu
- State Key Laboratory of Resource Insects, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China.
2
Ullah S, Rahman W, Ullah F, Ullah A, Ahmad G, Ijaz M, Ullah H, Sharafmal DM. The HABD: Home of All Biological Databases Empowering Biological Research With Cutting-Edge Database Systems. Curr Protoc 2024; 4:e1063. [PMID: 38808697 DOI: 10.1002/cpz1.1063] [Indexed: 05/30/2024]
Abstract
The emergence of computer technologies and computing power has led to the development of several database systems that provide standardized access to vast quantities of data, making it possible to collect, search, index, evaluate, and extract useful knowledge across various fields. The Home of All Biological Databases (HABD) has been established as a continually expanding platform that aims to store, organize, and distribute biological data in a searchable manner, removing all dead and non-accessible data. The platform sorts data into categories such as the COVID-19 Pandemic Database (CO-19PDB), Database relevant to Human Research (DBHR), Cancer Research Database (CRDB), Latest Database of Protein Research (LDBPR), Fungi Databases Collection (FDBC), and many other databases that are categorized based on biological phenomena. It currently provides a total of 22 databases, including 6 published, 5 submitted, and the remainder in various stages of development. These databases encompass a range of areas, including phytochemical-specific and plastic biodegradation databases. HABD is equipped with a search engine optimization (SEO) analyzer and Neil Patel tools, which ensure excellent SEO and high-speed value. With timely updates, HABD aims to facilitate the processing and visualization of data for scientists, providing a one-stop shop for all biological databases. Technologies such as PHP, HTML, CSS, JavaScript, and Biopython are used to build all the databases. © 2024 Wiley Periodicals LLC.
Affiliation(s)
- Shahid Ullah
- S-Khan Lab, Mardan, Khyber Pakhtunkhwa, Pakistan
- Farhan Ullah
- S-Khan Lab, Mardan, Khyber Pakhtunkhwa, Pakistan
- Anees Ullah
- S-Khan Lab, Mardan, Khyber Pakhtunkhwa, Pakistan
- Gulzar Ahmad
- S-Khan Lab, Mardan, Khyber Pakhtunkhwa, Pakistan
- Hameed Ullah
- S-Khan Lab, Mardan, Khyber Pakhtunkhwa, Pakistan
3
Ross KE, Bastian FB, Buys M, Cook CE, D’Eustachio P, Harrison M, Hermjakob H, Li D, Lord P, Natale DA, Peters B, Sternberg PW, Su AI, Thakur M, Thomas PD, Bateman A. Perspectives on tracking data reuse across biodata resources. BIOINFORMATICS ADVANCES 2024; 4:vbae057. [PMID: 38721398 PMCID: PMC11076920 DOI: 10.1093/bioadv/vbae057] [Received: 01/26/2024] [Revised: 03/13/2024] [Accepted: 04/11/2024] [Indexed: 06/14/2024]
Abstract
Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge. Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources. Availability and implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).
Affiliation(s)
- Karen E Ross
- Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States
- Frederic B Bastian
- Evolutionary Bioinformatics Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Peter D’Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10012, United States
- Melissa Harrison
- Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Henning Hermjakob
- Molecular Systems, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Donghui Li
- Chan Zuckerberg Initiative, Redwood City, CA 94063, United States
- Phillip Lord
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, United Kingdom
- Darren A Natale
- Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, United States
- Bjoern Peters
- Center for Vaccine Innovation, La Jolla Institute of Immunology, La Jolla, CA 92037, United States
- Paul W Sternberg
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States
- Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Matthew Thakur
- Data Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90089, United States
- Alex Bateman
- MSCB, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
4
Naidoo L, Arumugam T, Ramsuran V. Narrative Review Explaining the Role of HLA-A, -B, and -C Molecules in COVID-19 Disease in and around Africa. Infect Dis Rep 2024; 16:380-406. [PMID: 38667755 PMCID: PMC11049896 DOI: 10.3390/idr16020029] [Received: 02/02/2024] [Revised: 04/15/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open
Abstract
The coronavirus disease 2019 (COVID-19) has left a devastating effect on various regions globally. Africa has exceptionally high rates of other infectious diseases, such as tuberculosis (TB), human immunodeficiency virus (HIV), and malaria, and was not impacted by COVID-19 to the extent of other continents. Globally, COVID-19 has caused approximately 7 million deaths and 700 million infections thus far. COVID-19 disease severity and susceptibility vary among individuals and populations, which could be attributed to various factors, including the viral strain, host genetics, environment, lifespan, and co-existing conditions. Host genetics play a substantial part in COVID-19 disease severity among individuals. Human leukocyte antigen (HLA) has previously been shown to be very important across host immune responses against viruses. HLA has been a widely studied gene region for various disease associations that have been identified. HLA proteins present peptides to the cytotoxic lymphocytes, which causes an immune response to kill infected cells. The HLA molecule serves as the central region for infectious disease association; therefore, we expect HLA disease association with COVID-19. Therefore, in this narrative review, we look at the HLA gene region, particularly HLA class I, to understand its role in COVID-19 disease.
Affiliation(s)
- Lisa Naidoo
- School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Thilona Arumugam
- School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Veron Ramsuran
- School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban 4041, South Africa
5
Wu Z, Liu X, Xie F, Ma C, Lam EWF, Kang N, Jin D, Yan J, Jin B. Comprehensive pan-cancer analysis identifies the RNA-binding protein LRPPRC as a novel prognostic and immune biomarker. Life Sci 2024; 343:122527. [PMID: 38417544 DOI: 10.1016/j.lfs.2024.122527] [Received: 11/20/2023] [Revised: 01/25/2024] [Accepted: 02/21/2024] [Indexed: 03/01/2024]
Abstract
AIMS: RNA-binding proteins (RBPs) play pivotal roles in carcinogenesis and immunotherapy. Leucine-rich pentapeptide repeat-containing protein (LRPPRC) is crucial for RNA polyadenylation, transport, and stability. Although recent studies have suggested LRPPRC's potential role in tumor progression, its significance in tumor prognosis, diagnosis, and immunology remains unclear. MAIN METHODS: We comprehensively analyzed LRPPRC expression in tumors using various databases, including Human Transcriptome Cell Atlas (HTCA), University of California Santa Cruz (UCSC), Human Protein Atlas (HPA), Sangerbox, TISIDB, GeneMANIA, GSCALite, and CellMiner. We examined the correlation between LRPPRC expression level and prognosis, immune infiltration, immunotherapy, methylation, biological function, and drug sensitivity. Single-cell analysis was performed using Tumor Immune Single Cell Hub (TISCH) and CancerSEA software. Patients with acute myeloid leukemia (AML) were categorized based on LRPPRC levels for functional and immune infiltration analyses. The role of LRPPRC in cancer was validated using in vitro experiments. KEY FINDINGS: Our findings revealed that LRPPRC was highly expressed in almost all cancer types, indicating its significant prognostic and diagnostic potential. Notably, LRPPRC was associated with diverse immune features, such as immune cell infiltration, immune checkpoint genes, tumor mutational burden, and microsatellite instability, suggesting its value in guiding immunotherapy strategies. Within AML, the high-expression group had lower levels of immune cells, including CD8+ T cells. In vitro experiments confirmed the inhibitory effects of LRPPRC knockdown on AML cell proliferation. SIGNIFICANCE: This study highlights LRPPRC as a reliable pan-cancer prognostic and immune biomarker, particularly in AML. It lays the groundwork for future research on LRPPRC-targeted cancer therapies.
Affiliation(s)
- Zheng Wu
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China; Department of Hematology, Liaoning Key Laboratory of Hematopoietic Stem Cell Transplantation and Translational Medicine, Liaoning Medical Center for Hematopoietic Stem Cell Transplantation, Dalian Key Laboratory of Hematology, Diamond Bay Institute of Hematology, The Second Hospital of Dalian Medical University, Dalian 116027, Liaoning, China
- Xinyue Liu
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China
- Fang Xie
- Department of Hematology, Liaoning Key Laboratory of Hematopoietic Stem Cell Transplantation and Translational Medicine, Liaoning Medical Center for Hematopoietic Stem Cell Transplantation, Dalian Key Laboratory of Hematology, Diamond Bay Institute of Hematology, The Second Hospital of Dalian Medical University, Dalian 116027, Liaoning, China
- Chao Ma
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China
- Eric W-F Lam
- Department of Surgery and Cancer, Imperial College London, London W12 0NN, UK
- Ning Kang
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China
- Di Jin
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China.
- Jinsong Yan
- Department of Hematology, Liaoning Key Laboratory of Hematopoietic Stem Cell Transplantation and Translational Medicine, Liaoning Medical Center for Hematopoietic Stem Cell Transplantation, Dalian Key Laboratory of Hematology, Diamond Bay Institute of Hematology, The Second Hospital of Dalian Medical University, Dalian 116027, Liaoning, China.
- Bilian Jin
- Institute of Cancer Stem Cell, Liaoning Key Laboratory of Nucleic Acid Biology, Dalian Medical University, Dalian 116044, Liaoning, China.
6
Barakat A, Munro G, Heegaard AM. Finding new analgesics: Computational pharmacology faces drug discovery challenges. Biochem Pharmacol 2024; 222:116091. [PMID: 38412924 DOI: 10.1016/j.bcp.2024.116091] [Received: 10/02/2023] [Revised: 01/10/2024] [Accepted: 02/23/2024] [Indexed: 02/29/2024]
Abstract
Despite the worldwide prevalence and huge burden of pain, pain is an undertreated phenomenon. Currently used analgesics have several limitations regarding their efficacy and safety. The discovery of analgesics possessing a novel mechanism of action has faced multiple challenges, including a limited understanding of biological processes underpinning pain and analgesia and poor animal-to-human translation. Computational pharmacology is currently employed to face these challenges. In this review, we discuss the theory, methods, and applications of computational pharmacology in pain research. Computational pharmacology encompasses a wide variety of theoretical concepts and practical methodological approaches, with the overall aim of gaining biological insight through data acquisition and analysis. Data are acquired from patients or animal models with pain or analgesic treatment, at different levels of biological organization (molecular, cellular, physiological, and behavioral). Distinct methodological algorithms can then be used to analyze and integrate data. This helps to facilitate the identification of biological molecules and processes associated with pain phenotype, build quantitative models of pain signaling, and extract translatable features between humans and animals. However, computational pharmacology has several limitations, and its predictions can provide false positive and negative findings. Therefore, computational predictions are required to be validated experimentally before drawing solid conclusions. In this review, we discuss several case study examples of combining and integrating computational tools with experimental pain research tools to meet drug discovery challenges.
Affiliation(s)
- Ahmed Barakat
- Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Pharmacology and Toxicology, Faculty of Pharmacy, Assiut University, Assiut, Egypt.
- Anne-Marie Heegaard
- Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
7
Honda S, Misawa N, Sato Y, Oikawa D, Tokunaga F. The hypothetical molecular mechanism of the ethnic variations in the manifestation of age-related macular degeneration; focuses on the functions of the most significant susceptibility genes. Graefes Arch Clin Exp Ophthalmol 2024:10.1007/s00417-024-06442-9. [PMID: 38507046 DOI: 10.1007/s00417-024-06442-9] [Received: 08/26/2023] [Revised: 02/27/2024] [Accepted: 03/08/2024] [Indexed: 03/22/2024] Open
Abstract
Age-related macular degeneration (AMD) is the leading sight-threatening disease in developed countries. Recent studies, however, indicate ethnic variation in the AMD phenotype. For example, several reports have shown that drusen are less frequent in Asian than in Caucasian AMD patients, although the reason has not yet been clarified. In recent decades, several genome association studies have identified many AMD susceptibility genes and revealed that the association strength of some genes differs among races and AMD phenotypes. In this review article, the essential findings of clinical and genome association studies of the most significant genes, CFH and ARMS2/HTRA1, in AMD across different races are summarized, and theoretical hypotheses about the molecular mechanisms underlying the ethnic variation in AMD manifestation between Caucasians and Asians, focused mainly on those genes, are discussed.
Affiliation(s)
- Shigeru Honda
- Department of Ophthalmology and Visual Sciences, Osaka Metropolitan University Graduate School of Medicine, 1-4-3 Asahi-Machi, Abeno-Ku, Osaka, Japan.
- Norihiko Misawa
- Department of Ophthalmology and Visual Sciences, Osaka Metropolitan University Graduate School of Medicine, 1-4-3 Asahi-Machi, Abeno-Ku, Osaka, Japan
- Yusuke Sato
- Center for Research On Green Sustainable Chemistry, Graduate School of Engineering, Tottori University, Tottori, Japan
- Department of Chemistry and Biotechnology, Graduate School of Engineering, Tottori University, Tottori, Japan
- Daisuke Oikawa
- Department of Medical Biochemistry, Osaka Metropolitan University Graduate School of Medicine, Osaka, Japan
- Fuminori Tokunaga
- Department of Medical Biochemistry, Osaka Metropolitan University Graduate School of Medicine, Osaka, Japan
8
Sale JE, Stoddard BL. Fifty years of Nucleic Acids Research. Nucleic Acids Res 2024; 52:1-3. [PMID: 38178306 PMCID: PMC10783492 DOI: 10.1093/nar/gkad1156] [Received: 11/15/2023] [Accepted: 11/15/2023] [Indexed: 01/06/2024] Open
Affiliation(s)
- Julian E Sale
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
- Barry L Stoddard
- Division of Basic Sciences, Fred Hutchinson Cancer Center, 1100 Fairview Ave. N., Seattle WA 98109, USA
9
Kasperski A, Heng HH. The Digital World of Cytogenetic and Cytogenomic Web Resources. Methods Mol Biol 2024; 2825:361-391. [PMID: 38913321 DOI: 10.1007/978-1-0716-3946-7_21] [Indexed: 06/25/2024]
Abstract
The dynamic growth of technological capabilities at the cellular and molecular level has led to a rapid increase in the amount of data on the genes and genomes of organisms. In order to store, access, compare, validate, classify, and understand the massive data generated by different researchers, and to promote effective communication among research communities, various genome and cytogenetic online databases have been established. These data platforms/resources are essential not only for computational analyses and theoretical syntheses but also for helping researchers select future research topics and prioritize molecular targets. Furthermore, they are valuable for identifying shared recurrent genomic patterns related to human diseases and for avoiding unnecessary duplications among different researchers. The website interface, menu, graphics, animations, text layout, and data from databases are displayed by a front end on the screen of a monitor or smartphone. A database front-end refers to the user interface or application that enables accessing tabular, structured, or raw data stored in the database. The Internet makes it possible to reach a greater number of users around the world and gives them quick access to information stored in databases. The number of ways of presenting this data by front-ends increases as well. This requires unifying the ways of operating and presenting information by front-ends and ensuring contextual switching between front-ends of different databases. This chapter aims to present selected cytogenetic and cytogenomic Internet resources in terms of obtaining the needed information and to indicate how to increase the efficiency of access to stored information. Through a brief introduction of these databases and by providing examples of their usage in cytogenetic analyses, we aim to bridge the gap between cytogenetics and molecular genomics by encouraging their utilization.
Affiliation(s)
- Andrzej Kasperski
- Institute of Biological Sciences, Department of Biotechnology, Laboratory of Bioinformatics and Control of Bioprocesses, University of Zielona Góra, Zielona Góra, Poland.
- Henry H Heng
- Center for Molecular Medicine and Genetics, and Department of Pathology, Wayne State University School of Medicine, Detroit, MI, USA
10
Imker HJ, Schackart KE, Istrate AM, Cook CE. A machine learning-enabled open biodata resource inventory from the scientific literature. PLoS One 2023; 18:e0294812. [PMID: 38015968 PMCID: PMC10684096 DOI: 10.1371/journal.pone.0294812] [Received: 05/26/2023] [Accepted: 11/07/2023] [Indexed: 11/30/2023] Open
Abstract
Modern biological research depends on data resources. These resources archive difficult-to-reproduce data and provide added-value aggregation, curation, and analyses. Collectively, they constitute a global infrastructure of biodata resources. While the organic proliferation of biodata resources has enabled incredible research, sustained support for the individual resources that make up this distributed infrastructure is a challenge. The Global Biodata Coalition (GBC) was established by research funders in part to aid in developing sustainable funding strategies for biodata resources. An important component of this work is understanding the scope of the resource infrastructure: how many biodata resources there are, where they are, and how they are supported. Existing registries require self-registration and/or extensive curation, and we sought to develop a method for assembling a global inventory of biodata resources that could be periodically updated with minimal human intervention. The approach we developed identifies biodata resources using open data from the scientific literature. Specifically, we used a machine learning-enabled natural language processing approach to identify biodata resources from titles and abstracts of life sciences publications contained in Europe PMC. Pretrained BERT (Bidirectional Encoder Representations from Transformers) models were fine-tuned to classify publications as describing a biodata resource or not and to predict the resource name using named entity recognition. To improve the quality of the resulting inventory, low-confidence predictions and potential duplicates were manually reviewed. Further information about the resources was then obtained using article metadata, such as funder and geolocation information. These efforts yielded an inventory of 3112 unique biodata resources based on articles published from 2011-2021. The code was developed to facilitate reuse and includes automated pipelines. All products of this effort are released under permissive licensing, including the biodata resource inventory itself (CC0) and all associated code (BSD/MIT).
Affiliation(s)
- Heidi J. Imker
- Global Biodata Coalition, Strasbourg, France
- University Library, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Kenneth E. Schackart
- Global Biodata Coalition, Strasbourg, France
- Department of Biosystems Engineering, The University of Arizona, Tucson, Arizona, United States of America
- Ana-Maria Istrate
- Chan Zuckerberg Initiative, Redwood City, California, United States of America
11
Friedrichs M, Königs C. A web-based platform for the annotation and analysis of NAR-published databases. PLoS One 2023; 18:e0293134. [PMID: 37871106 PMCID: PMC10593211 DOI: 10.1371/journal.pone.0293134] [Received: 07/19/2023] [Accepted: 10/06/2023] [Indexed: 10/25/2023] Open
Abstract
Biological databases are essential resources for life science research, but finding and selecting the most relevant and up-to-date databases can be challenging due to the large number and diversity of available databases. The Nucleic Acids Research (NAR) journal publishes annual database issues that provide a comprehensive list of databases in the molecular biology domain. However, the information provided by NAR is limited and sometimes does not reflect the current status and quality of the databases. In this article, we present a web-based platform for the annotation and analysis of NAR-published databases. The platform allows users to manually curate and enrich the NAR entries with additional information such as availability, downloadability, source code links, cross-references, and duplicates. Statistics and visualizations on various aspects of the database landscape, such as recency, status, category, and curation history are also provided. Currently, it contains a total of 2,246 database entries of which 2,025 are unique with the majority updated within the last five years. Around 75% of all databases are still available and more than half provide a download option. Cross references to Database Commons are available for 1,889 entries. The platform is freely available online at https://nardbstatus.kalis-amts.de and aims to help researchers in database selection and decision-making. It also provides insights into the current state and challenges of a subset of all databases in the life sciences.
Affiliation(s)
- Marcel Friedrichs
- Bioinformatics / Medical Informatics Department, Bielefeld University, Bielefeld, NRW, Germany
| | - Cassandra Königs
- Bioinformatics / Medical Informatics Department, Bielefeld University, Bielefeld, NRW, Germany
| |
|
12
|
Bao Y, Xue Y. From BIG Data Center to China National Center for Bioinformation. Genomics, Proteomics & Bioinformatics 2023; 21:900-903. [PMID: 37832784 PMCID: PMC10928365 DOI: 10.1016/j.gpb.2023.10.001] [Received: 05/10/2023] [Revised: 09/30/2023] [Accepted: 10/07/2023] [Indexed: 10/15/2023]
Affiliation(s)
- Yiming Bao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
| | - Yongbiao Xue
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
| |
|
13
|
Ritsch M, Cassman NA, Saghaei S, Marz M. Navigating the Landscape: A Comprehensive Review of Current Virus Databases. Viruses 2023; 15:1834. [PMID: 37766241 PMCID: PMC10537806 DOI: 10.3390/v15091834] [Received: 07/04/2023] [Revised: 08/18/2023] [Accepted: 08/21/2023] [Indexed: 09/29/2023] Open
Abstract
Viruses are abundant and diverse entities that have important roles in public health, ecology, and agriculture. The identification and surveillance of viruses rely on an understanding of their genome organization, sequences, and replication strategy. Despite technological advancements in sequencing methods, our current understanding of virus diversity remains incomplete, highlighting the need to explore undiscovered viruses. Virus databases play a crucial role in providing access to sequences, annotations and other metadata, and analysis tools for studying viruses. However, there has not been a comprehensive review of virus databases in the last five years. This study aimed to fill this gap by identifying 24 active virus databases and included an extensive evaluation of their content, functionality and compliance with the FAIR principles. In this study, we thoroughly assessed the search capabilities of five database catalogs, which serve as comprehensive repositories housing a diverse array of databases and offering essential metadata. Moreover, we conducted a comprehensive review of different types of errors, encompassing taxonomy, names, missing information, sequences, sequence orientation, and chimeric sequences, with the intention of empowering users to effectively tackle these challenges. We expect this review to aid users in selecting suitable virus databases and other resources, and to help databases manage errors and improve their adherence to the FAIR principles. The databases listed here represent the current knowledge of viruses and will help users find databases of interest based on content, functionality, and scope. The use of virus databases is integral to gaining new insights into the biology, evolution, and transmission of viruses, and developing new strategies to manage virus outbreaks and preserve global health.
Affiliation(s)
- Muriel Ritsch
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
| | - Noriko A. Cassman
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
| | - Shahram Saghaei
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
| | - Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
- European Virus Bioinformatics Center, 07743 Jena, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany
- FLI Leibniz Institute for Age Research, 07745 Jena, Germany
| |
|
14
|
Hegde M, Girisa S, Kunnumakkara AB. A compilation of bioinformatic approaches to identify novel downstream targets for the detection and prophylaxis of cancer. Advances in Protein Chemistry and Structural Biology 2023; 134:75-113. [PMID: 36858743 DOI: 10.1016/bs.apcsb.2022.11.015] [Indexed: 02/22/2023]
Abstract
The paradigm of cancer genomics has been radically changed by developments in next-generation sequencing (NGS) technologies, making it possible to envisage individualized treatment based on the genomes of tumor and stromal cells in a clinical setting within a short timeframe. The abundance of data has led to new avenues for studying coordinated alterations that impair biological processes, which in turn has increased the demand for bioinformatic tools for pathway analysis. While most of this work has concentrated on optimizing certain algorithms to obtain quicker and more accurate results, large volumes of the resulting algorithm-based data remain difficult for biologists and clinicians to access, download, and reanalyze. In the present study, we have listed the bioinformatics algorithms and user-friendly graphical user interface (GUI) tools that enable code-independent analysis of big data without compromising quality or time. We have also described the advantages and drawbacks of each of these platforms. Additionally, we emphasize the importance of creating new, more user-friendly solutions to provide better access to open data, and we discuss relevant problems such as data sharing and patient privacy.
Affiliation(s)
- Mangala Hegde
- Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
| | - Sosmitha Girisa
- Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
| | - Ajaikumar B Kunnumakkara
- Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
| |
|