1
Tan M, Xia J, Luo H, Meng G, Zhu Z. Applying the digital data and the bioinformatics tools in SARS-CoV-2 research. Comput Struct Biotechnol J 2023; 21:4697-4705. [PMID: 37841328 PMCID: PMC10568291 DOI: 10.1016/j.csbj.2023.09.044] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/12/2023] [Revised: 09/29/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023]
Abstract
Bioinformatics has played a crucial role in the scientific effort against the coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Advances in novel algorithms, big data technology, artificial intelligence and deep learning have supported the development of new bioinformatics tools to analyze the daily growing volume of SARS-CoV-2 data. These tools have been applied in genomic analyses, evolutionary tracking, epidemiological analyses, protein structure interpretation, and studies of virus-host interaction and clinical performance. To promote future in silico analysis, we conducted a review summarizing the databases, web services and software applied in SARS-CoV-2 research. These digital resources may also contribute to research on other coronaviruses and on non-coronavirus viruses.
Affiliation(s)
- Meng Tan
  - School of Life Sciences, Chongqing University, Chongqing, China
- Jiaxin Xia
  - School of Life Sciences, Chongqing University, Chongqing, China
- Haitao Luo
  - School of Life Sciences, Chongqing University, Chongqing, China
- Geng Meng
  - College of Veterinary Medicine, China Agricultural University, Beijing, China
- Zhenglin Zhu
  - School of Life Sciences, Chongqing University, Chongqing, China
2
Gupta T, He X, Uddin MR, Zeng X, Zhou A, Zhang J, Freyberg Z, Xu M. Self-supervised learning for macromolecular structure classification based on cryo-electron tomograms. Front Physiol 2022; 13:957484. [PMID: 36111160 PMCID: PMC9468634 DOI: 10.3389/fphys.2022.957484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 08/02/2022] [Indexed: 11/21/2022]
Abstract
Macromolecular structure classification from cryo-electron tomography (cryo-ET) data is important for understanding macromolecular dynamics. It has a wide range of applications and is essential in enhancing our knowledge of the sub-cellular environment. However, a major limitation has been insufficient labelled cryo-ET data. In this work, we use Contrastive Self-supervised Learning (CSSL) to improve on previous approaches for macromolecular structure classification from cryo-ET data with limited labels. We first pretrain an encoder with unlabelled data using CSSL and then fine-tune the pretrained weights on the downstream classification task. To this end, we design a cryo-ET domain-specific data-augmentation pipeline. The benefit of augmenting cryo-ET datasets is most prominent when the original dataset is limited in size. Overall, extensive experiments performed on real and simulated cryo-ET data in the semi-supervised learning setting demonstrate the effectiveness of our approach in macromolecular labelling and classification.
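The contrastive pretraining step described in this abstract can be illustrated with a minimal sketch. It assumes an NT-Xent style objective, which is one common contrastive loss; the abstract does not say which loss the authors used, and the function names and toy embeddings below are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Average NT-Xent loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmentations of the
    same unlabelled sample (a positive pair). Every other embedding in
    the batch is treated as a negative. This loss choice is an assumption.
    """
    embeddings = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same sample
        logits = [cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(2 * n) if j != i]
        pos_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)

# Toy embeddings (hypothetical). With aligned_b the two views of each
# sample agree; with shuffled_b the views are mismatched.
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = nt_xent_loss(aligned_a, aligned_b)
loss_shuffled = nt_xent_loss(aligned_a, shuffled_b)
```

In this toy case the loss is lower when the two augmented views of each sample map to nearby embeddings, which is the property an encoder is pushed toward during pretraining before it is fine-tuned on the labelled classification task.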
Affiliation(s)
- Tarun Gupta
  - Department of Computer Science and Engineering, Indian Institute of Technology, Indore, India
- Xuehai He
  - Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, United States
- Mostofa Rafid Uddin
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Xiangrui Zeng
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Andrew Zhou
  - Irvington High School, Irvington, NY, United States
- Jing Zhang
  - Department of Computer Science, University of California, Irvine, Irvine, CA, United States
- Zachary Freyberg
  - Departments of Psychiatry and Cell Biology, University of Pittsburgh, Pittsburgh, PA, United States
- Min Xu
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
  - Correspondence: Min Xu
3
Nawirska-Olszańska A, Zaczyńska E, Czarny A, Kolniak-Ostek J. Chemical Characteristics of Ethanol and Water Extracts of Black Alder (Alnus glutinosa L.) Acorns and Their Antibacterial, Anti-Fungal and Antitumor Properties. Molecules 2022; 27:2804. [PMID: 35566154 PMCID: PMC9105167 DOI: 10.3390/molecules27092804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Revised: 04/22/2022] [Accepted: 04/26/2022] [Indexed: 11/29/2022]
Abstract
The aim of this study was to identify polyphenolic compounds contained in ethanol and water extracts of black alder (Alnus glutinosa L.) acorns and to evaluate their anti-cancer and antimicrobial effects. Significant anti-cancer potential was observed on the human skin epidermoid carcinoma cell line A431 and on the human epithelial cell line A549, derived from lung carcinoma tissue. Aqueous and ethanolic extracts of alder acorns inhibited the growth mainly of Gram-positive microorganisms (Staphylococcus aureus, Bacillus subtilis, Streptococcus mutans) and yeast-like fungi (Candida albicans, Candida glabrata), as well as of Gram-negative strains (Escherichia coli, Citrobacter freundii, Proteus mirabilis, Pseudomonas aeruginosa). The identification of polyphenols was carried out using an ACQUITY UPLC-PDA-MS system. The extracts were composed of 29 compounds belonging to phenolic acids, flavonols, ellagitannins and ellagic acid derivatives. Ellagitannins were identified as the predominant phenolics in the ethanol and aqueous extracts (2171.90 and 1593.13 mg/100 g DM, respectively). The results may explain the use of A. glutinosa extracts in folk medicine.
Affiliation(s)
- Agnieszka Nawirska-Olszańska
  - Department of Fruit, Vegetable and Plant Nutraceutical Technology, Wrocław University of Environmental and Life Sciences, 37 Chelmonskiego Street, 51-630 Wroclaw, Poland
- Ewa Zaczyńska
  - Hirszfeld Institute of Immunology and Experimental Therapy, Polish Academy of Sciences, 12 R. Weigla Street, 53-114 Wroclaw, Poland
- Anna Czarny
  - Hirszfeld Institute of Immunology and Experimental Therapy, Polish Academy of Sciences, 12 R. Weigla Street, 53-114 Wroclaw, Poland
- Joanna Kolniak-Ostek
  - Department of Fruit, Vegetable and Plant Nutraceutical Technology, Wrocław University of Environmental and Life Sciences, 37 Chelmonskiego Street, 51-630 Wroclaw, Poland
4
Wittmann BJ, Yue Y, Arnold FH. Informed training set design enables efficient machine learning-assisted directed protein evolution. Cell Syst 2021; 12:1026-1045.e7. [PMID: 34416172 DOI: 10.1016/j.cels.2021.07.008] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 12/15/2020] [Revised: 05/06/2021] [Accepted: 07/26/2021] [Indexed: 11/17/2022]
Abstract
Directed evolution of proteins often involves a greedy optimization in which the mutation in the highest-fitness variant identified in each round of single-site mutagenesis is fixed. The efficiency of such a single-step greedy walk depends on the order in which beneficial mutations are identified; that is, the process is path dependent. Here, we investigate and optimize a path-independent machine learning-assisted directed evolution (MLDE) protocol that allows in silico screening of full combinatorial libraries. In particular, we evaluate the importance of different protein encoding strategies, training procedures, models, and training set design strategies on MLDE outcome, finding the most important consideration to be the implementation of strategies that reduce inclusion of minimally informative "holes" (protein variants with zero or extremely low fitness) in training data. When applied to an epistatic, hole-filled, four-site combinatorial fitness landscape, our optimized protocol achieved the global fitness maximum up to 81-fold more frequently than single-step greedy optimization. A record of this paper's transparent peer review process is included in the supplemental information.
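The path dependence of a single-step greedy walk can be seen in a minimal sketch on a toy landscape. The two-site landscape, its fitness values, and the function names below are hypothetical, and the exhaustive search scores true fitness directly as a stand-in for screening a full combinatorial library; it is not the MLDE protocol, which relies on model predictions.

```python
from itertools import product

ALPHABET = "AB"

# Hypothetical two-site fitness landscape with epistasis (values are made up).
FITNESS = {"AA": 1.0, "BA": 2.0, "AB": 1.5, "BB": 0.5}

def greedy_walk(start, site_order):
    """Fix the best residue at one site per round, visiting sites in site_order."""
    current = start
    for site in site_order:
        # Single-site mutagenesis: vary only this site, keep the rest fixed.
        candidates = [current[:site] + residue + current[site + 1:]
                      for residue in ALPHABET]
        current = max(candidates, key=FITNESS.get)
    return current

def exhaustive_best(n_sites):
    """Score every variant in the full combinatorial library."""
    library = ("".join(combo) for combo in product(ALPHABET, repeat=n_sites))
    return max(library, key=FITNESS.get)

end_site0_first = greedy_walk("AA", site_order=[0, 1])
end_site1_first = greedy_walk("AA", site_order=[1, 0])
global_best = exhaustive_best(2)
```

On this landscape, mutating site 0 first ends at the global maximum `BA`, while mutating site 1 first ends at `AB`, because fixing `B` at site 1 makes the otherwise beneficial mutation at site 0 deleterious. Scoring the whole library does not depend on any visiting order.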
Affiliation(s)
- Bruce J Wittmann
  - Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA
- Yisong Yue
  - Department of Computing and Mathematical Sciences, California Institute of Technology, MC 305-16, 1200 E. California Blvd., Pasadena, CA 91125, USA
- Frances H Arnold
  - Division of Biology and Biological Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, MC 210-41, 1200 E. California Blvd., Pasadena, CA 91125, USA
5
Zhu Z, Meng G. ASFVdb: an integrative resource for genomic and proteomic analyses of African swine fever virus. Database (Oxford) 2020; 2020:baaa023. [PMID: 32294195 PMCID: PMC7159030 DOI: 10.1093/database/baaa023] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/20/2020] [Revised: 02/23/2020] [Accepted: 03/03/2020] [Indexed: 11/17/2022]
Abstract
The recent outbreaks of African swine fever (ASF) in China and Europe have threatened the swine industry globally. To control the transmission of ASF virus (ASFV), we developed the African swine fever virus database (ASFVdb), an online data visualization and analysis platform for comparative genomics and proteomics. On the basis of known ASFV genes, ASFVdb reannotates the genomes of every strain and newly annotates 5352 possible open reading frames (ORFs) of 45 strains. Moreover, ASFVdb performs a thorough analysis of the population genetics of all the published genomes of ASFV strains and performs functional and structural predictions for all genes. Users can obtain not only basic information for each gene but also its distribution in strains and conserved or high mutation regions, possible subcellular location and topology. In the genome browser, ASFVdb provides a sliding window for results of population genetic analysis, which facilitates genetic and evolutionary analyses at the genomic level. The web interface was constructed based on SWAV 1.0. ASFVdb is freely accessible at http://asfvdb.popgenetics.net.
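A sliding-window view of a population genetic statistic, as mentioned in this abstract, can be sketched as follows. The abstract does not say which statistics ASFVdb computes, so the use of nucleotide diversity (mean pairwise difference per site) is an assumption, and the function names and the toy alignment are hypothetical rather than real ASFV data.

```python
from itertools import combinations

def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def sliding_windows(aligned, window, step):
    """Yield (start, diversity) for each full window along an alignment."""
    length = len(aligned[0])
    for start in range(0, length - window + 1, step):
        chunk = [seq[start:start + window] for seq in aligned]
        yield start, pairwise_diversity(chunk)

# Hypothetical aligned sequences of equal length (not real ASFV data).
alignment = [
    "ATGCATGCAT",
    "ATGCATGAAT",
    "ATGCATTAAT",
]
profile = list(sliding_windows(alignment, window=5, step=5))
```

Here the first window is identical across all three sequences and the second carries all of the variation, so plotting `profile` along the genome would separate a conserved region from a variable one.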
Affiliation(s)
- Zhenglin Zhu
  - School of Life Sciences, Chongqing University, Chongqing, China
- Geng Meng
  - Laboratory of Biomedical Research and College of Veterinary Medicine, China Agricultural University, Beijing, China