1
Sun H, Murphy RF. Learning Morphological, Spatial, and Dynamic Models of Cellular Components. Methods Mol Biol 2024; 2800:231-244. [PMID: 38709488 DOI: 10.1007/978-1-0716-3834-7_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2024]
Abstract
In this chapter, we describe protocols for using the CellOrganizer software on the Jupyter Notebook platform to analyze and model cell and organelle shape and spatial arrangement. CellOrganizer is an open-source system for using microscope images to learn statistical models of the structure of cell components and how those components are organized relative to each other. Such models capture the statistical variation in the organization of cellular components by jointly modeling the distributions of their number, shape, and spatial distributions. These models can be created for different cell types or conditions and compared to reflect differences in their spatial organizations. The models are also generative, in that they can be used to synthesize new cell instances reflecting what a model learned and to provide well-structured cell geometries that can be used for biochemical simulations.
Affiliation(s)
- Huangqingbo Sun
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Robert F Murphy
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
2
Hasan MN, Mosharaf MP, Uddin KS, Das KR, Sultana N, Noorunnahar M, Naim D, Mollah MNH. Genome-Wide Identification and Characterization of Major RNAi Genes Highlighting Their Associated Factors in Cowpea (Vigna unguiculata (L.) Walp.). BioMed Research International 2023; 2023:8832406. [PMID: 38046903 PMCID: PMC10691899 DOI: 10.1155/2023/8832406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 09/07/2023] [Accepted: 10/30/2023] [Indexed: 12/05/2023]
Abstract
In different regions of the world, cowpea (Vigna unguiculata (L.) Walp.) is an important vegetable and an excellent source of protein. It lessens malnutrition among the underprivileged in developing nations and has some positive effects on health, such as a reduction in the prevalence of cancer and cardiovascular disease. However, certain biotic and abiotic stresses occasionally cause a sharp fall in cowpea yield. Major RNA interference (RNAi) genes such as Dicer-like (DCL), Argonaute (AGO), and RNA-dependent RNA polymerase (RDR) are essential for the synthesis of their associated factors, such as domains, small RNAs (sRNAs), transcription factors, micro-RNAs, and cis-acting factors, that shield plants from biotic and abiotic stresses. In this study, applying BLASTP search and phylogenetic tree analysis with reference to the Arabidopsis RNAi (AtRNAi) genes, we discovered 28 VuRNAi genes, including 7 VuDCL, 14 VuAGO, and 7 VuRDR genes in cowpea. We examined the domains, motifs, gene structures, chromosomal locations, subcellular locations, gene ontology (GO) terms, and regulatory factors (transcription factors, micro-RNAs, and cis-acting elements (CAEs)) to characterize the VuRNAi genes and proteins in cowpea in response to stresses. The predicted VuDCL1, VuDCL2(a, b), VuAGO7, VuAGO10, and VuRDR6 genes might have an impact on cowpea growth, development of the vegetative and flowering stages, and antiviral defense. The VuRNAi gene regulatory features miR395 and miR396 might contribute to grain quality improvement, immunity boosting, and pathogen infection resistance under salinity and drought conditions. Predicted CAEs from the VuRNAi genes might play a role in plant growth and development, improving grain quality and production and protecting plants from biotic and abiotic stresses.
Therefore, our study provides crucial information about the functional roles of VuRNAi genes and their associated components, which would aid in the development of future cowpeas that are more resilient to biotic and abiotic stress. The manuscript is available as a preprint at this link: doi:10.1101/2023.02.15.528631v1.
Affiliation(s)
- Mohammad Nazmol Hasan
  - Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur 1706, Bangladesh
- Md Parvez Mosharaf
  - School of Business, Faculty of Business, Education, Law and Arts, University of Southern Queensland, Toowoomba, QLD 4350, Australia
- Khandoker Saif Uddin
  - Department of Quantitative Science (Statistics), International University of Business Agriculture and Technology (IUBAT), Uttara, Bangladesh
- Keya Rani Das
  - Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur 1706, Bangladesh
- Nasrin Sultana
  - Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur 1706, Bangladesh
- Mst. Noorunnahar
  - Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur 1706, Bangladesh
- Darun Naim
  - Department of Botany, Faculty of Biological Sciences, University of Rajshahi, Rajshahi 6205, Bangladesh
  - Bioinformatics Lab, Department of Statistics, Faculty of Science, University of Rajshahi, Rajshahi 6205, Bangladesh
- Md. Nurul Haque Mollah
  - Bioinformatics Lab, Department of Statistics, Faculty of Science, University of Rajshahi, Rajshahi 6205, Bangladesh
3
Faysal Ahmed F, Dola FS, Zohra FT, Rahman SM, Konak JN, Sarkar MAR. Genome-wide identification, classification, and characterization of lectin gene superfamily in sweet orange (Citrus sinensis L.). PLoS One 2023; 18:e0294233. [PMID: 37956187 PMCID: PMC10642848 DOI: 10.1371/journal.pone.0294233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Lectins are sugar-binding proteins found abundantly in plants. Lectin superfamily members have diverse roles, including plant growth, development, cellular processes, stress responses, and defense against microbes. However, the genome-wide identification and functional analysis of lectin genes in sweet orange (Citrus sinensis L.) remain unexplored. Therefore, we used integrated bioinformatics approaches (IBA) for in-depth genome-wide identification, characterization, and regulatory factor analysis of sweet orange lectin genes. Through genome-wide comparative analysis, we identified a total of 141 lectin genes distributed across 10 distinct gene families: 68 CsB-Lectin, 13 CsLysin Motif (LysM), 4 CsChitin-Bind1, 1 CsLec-C, 3 CsGal-B, 1 CsCalreticulin, 3 CsJacalin, 13 CsPhloem, 11 CsGal-Lec, and 24 CsLectinlegB. This classification relied on characteristic domain and phylogenetic analysis, showing significant homology with Arabidopsis thaliana's lectin gene families. A thorough analysis unveiled similarities within specific groups and notable variations across different protein groups. Gene Ontology (GO) enrichment analysis highlighted the predicted genes' roles in diverse cellular components, metabolic processes, and stress-related regulation. Additionally, network analysis of lectin genes with transcription factors (TFs) identified pivotal regulators such as ERF, MYB, NAC, WRKY, bHLH, bZIP, and TCP. The cis-acting regulatory elements (CAREs) found in sweet orange lectin genes indicated roles in crucial pathways, including light-responsive (LR), stress-responsive (SR), and hormone-responsive (HR) pathways. These findings will aid in the in-depth molecular examination of these potential genes and their regulatory elements, contributing to targeted enhancements of sweet orange species in breeding programs.
Affiliation(s)
- Fee Faysal Ahmed
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
- Farah Sumaiya Dola
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore, Bangladesh
- Fatema Tuz Zohra
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Rajshahi, Rajshahi, Bangladesh
- Shaikh Mizanur Rahman
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore, Bangladesh
- Jesmin Naher Konak
  - Department of Biochemistry and Molecular Biology, Faculty of Life Science, Mawlana Bhashani Science and Technology University, Santosh, Tangail, Bangladesh
- Md. Abdur Rauf Sarkar
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore, Bangladesh
4
Sarkar MAR, Sarkar S, Islam MSU, Zohra FT, Rahman SM. A genome-wide approach to the systematic and comprehensive analysis of LIM gene family in sorghum (Sorghum bicolor L.). Genomics Inform 2023; 21:e36. [PMID: 37813632 PMCID: PMC10584642 DOI: 10.5808/gi.23007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 06/23/2023] [Accepted: 08/09/2023] [Indexed: 10/11/2023] Open
Abstract
LIM domain-containing proteins are predominantly found in plants and play a significant role in various biological processes such as gene transcription and actin cytoskeletal organization. Nevertheless, genome-wide identification and functional analysis of the LIM gene family have not yet been reported in the economically important plant sorghum (Sorghum bicolor L.). Therefore, we conducted an in silico identification and characterization of LIM genes in the S. bicolor genome using integrated bioinformatics approaches. Based on phylogenetic tree analysis and conserved domains, we identified five LIM genes in the S. bicolor (SbLIM) genome corresponding to Arabidopsis LIM (AtLIM) genes. The conserved domain, motif, and gene structure analyses of the SbLIM gene family showed similarity among the SbLIM and AtLIM members. The gene ontology (GO) enrichment study revealed that the candidate LIM genes are directly involved in cytoskeletal organization and various other important biological and molecular pathways. Some important families of regulating transcription factors such as ERF, MYB, WRKY, NAC, bZIP, C2H2, Dof, and G2-like were detected by analyzing their interaction network with the identified SbLIM genes. The cis-acting regulatory elements related to the predicted SbLIM genes were identified as responsive to light, hormones, stress, and other functions. The present study provides valuable information about LIM genes in sorghum, which would pave the way for future wet-lab studies of the functional pathways of candidate SbLIM genes and their regulatory factors.
Affiliation(s)
- Md. Abdur Rauf Sarkar
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Salim Sarkar
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Md Shohel Ul Islam
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Fatema Tuz Zohra
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Sciences, University of Rajshahi, Rajshahi 6205, Bangladesh
- Shaikh Mizanur Rahman
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
5
Podder A, Ahmed FF, Suman MZH, Mim AY, Hasan K. Genome-wide identification of DCL, AGO and RDR gene families and their associated functional regulatory element analyses in sunflower (Helianthus annuus). PLoS One 2023; 18:e0286994. [PMID: 37294803 PMCID: PMC10256174 DOI: 10.1371/journal.pone.0286994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 05/27/2023] [Indexed: 06/11/2023] Open
Abstract
RNA interference (RNAi) regulates the expression of a variety of eukaryotic genes that are engaged in stress response, growth, and the conservation of genomic stability during developmental phases. It is also intimately connected to the post-transcriptional gene silencing (PTGS) process and chromatin modification levels. RNAi pathway gene families mediate RNA silencing. The main factors of RNA silencing are the Dicer-Like (DCL), Argonaute (AGO), and RNA-dependent RNA polymerase (RDR) gene families. To the best of our knowledge, genome-wide identification of RNAi gene families such as DCL, AGO, and RDR in sunflower (Helianthus annuus) has not yet been studied, despite these families having been identified in some other species. The goal of this study is therefore to find the RNAi gene families DCL, AGO, and RDR in sunflower using bioinformatics approaches. We carried out a comprehensive in silico investigation for genome-wide identification of the RNAi pathway gene families DCL, AGO, and RDR through bioinformatics approaches such as sequence homogeneity, phylogenetic relationship, gene structure, chromosomal localization, PPIs, GO, and sub-cellular localization. In this study, we identified five DCL (HaDCLs), fifteen AGO (HaAGOs), and ten RDR (HaRDRs) genes in the sunflower genome database corresponding to the RNAi genes of the model plant Arabidopsis thaliana, based on genome-wide analysis and a phylogenetic method. The analyses of gene structure (exon-intron numbers), conserved domains, and motif composition for all HaDCL, HaAGO, and HaRDR gene families indicated near homogeneity within each gene family. The protein-protein interaction (PPI) network analysis illustrated that the three identified gene families are interconnected. The Gene Ontology (GO) enrichment analysis showed that the detected genes directly contribute to RNA gene silencing and are involved in crucial pathways.
The cis-acting regulatory elements associated with the identified genes were found to be responsive to hormones, light, stress, and other functions, and were present in HaDCL, HaAGO, and HaRDR genes associated with the development and growth of plants. Finally, our genome-wide comparison and integrated bioinformatics analysis provide essential information about the components of sunflower RNA silencing, which opens the door for further research into the functional mechanisms of the identified genes and their regulatory elements.
Affiliation(s)
- Anamika Podder
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
- Fee Faysal Ahmed
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
- Md. Zahid Hasan Suman
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
- Afsana Yeasmin Mim
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
- Khadiza Hasan
  - Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore, Bangladesh
6
Genome-Wide Identification of Strawberry C2H2-ZFP C1-2i Subclass and the Potential Function of FaZAT10 in Abiotic Stress. Int J Mol Sci 2022; 23:ijms232113079. [PMID: 36361867 PMCID: PMC9654774 DOI: 10.3390/ijms232113079] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/30/2022] [Revised: 10/25/2022] [Accepted: 10/26/2022] [Indexed: 11/17/2022] Open
Abstract
C2H2-type zinc finger proteins (C2H2-ZFPs) play a key role in various plant biological processes and responses to environmental stresses. In Arabidopsis thaliana, C2H2-ZFP members with two zinc finger domains have been well characterized in response to abiotic stresses. To date, the functions of these genes in strawberries are still uncharacterized. Here, 126 C2H2-ZFPs in cultivated strawberry were first identified using the recently sequenced Fragaria × ananassa genome. Among these C2H2-ZFPs, 46 members containing two zinc finger domains in cultivated strawberry were further identified as the C1-2i subclass. These genes were unevenly distributed on 21 chromosomes and classified into five groups according to their phylogenetic relationships, with similar physicochemical properties and motif compositions in the same group. Analyses of conserved domains and gene structures indicated the evolutionary conservation of the C1-2i subclass. A Ka/Ks analysis indicated that the C1-2i members were subjected to purifying selection during evolution. Furthermore, FaZAT10, a typical C2H2-ZFP, was isolated. FaZAT10 was most highly expressed in roots, and it was induced by drought, salt, low-temperature, ABA, and MeJA treatments. It was localized in the nucleus and showed no transactivation activity in yeast cells. Overall, these results provide useful information for enriching the analysis of the ZFP gene family in strawberry, and they provide support for revealing the mechanism of FaZAT10 in the regulatory network of abiotic stress.
7
Babi M, Neuman K, Peng CY, Maiuri T, Suart CE, Truant R. Recent Microscopy Advances and the Applications to Huntington’s Disease Research. J Huntingtons Dis 2022; 11:269-280. [PMID: 35848031 PMCID: PMC9484089 DOI: 10.3233/jhd-220536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Huntingtin is a 3144 amino acid protein defined as a scaffold protein with many intracellular locations that suggest functions in these compartments. Expansion of the CAG DNA tract in the huntingtin first exon is the cause of Huntington’s disease. An important tool in understanding the biological functions of huntingtin is molecular imaging at the single-cell level by microscopy and nanoscopy. The evolution of these technologies has accelerated since the Nobel Prize in Chemistry was awarded in 2014 for super-resolution nanoscopy. We are in a new era of light imaging at the single-cell level, not just for protein location, but also for protein conformation and biochemical function. Large-scale microscopy-based screening is also being accelerated by a coincident development of machine-based learning that offers a framework for truly unbiased data acquisition and analysis at very large scales. This review will summarize the newest technologies in light, electron, and atomic force microscopy in the context of unique challenges with huntingtin cell biology and biochemistry.
Affiliation(s)
- Mouhanad Babi
  - McMaster Centre for Advanced Light Microscopy (CALM), McMaster University, Hamilton, Canada
- Kaitlyn Neuman
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
- Christina Y. Peng
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
- Tamara Maiuri
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
- Celeste E. Suart
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
- Ray Truant
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada
  - McMaster Centre for Advanced Light Microscopy (CALM), McMaster University, Hamilton, Canada
8
Vaish S, Parveen R, Gupta D, Basantani MK. Genome-wide identification and characterization of glutathione S-transferase gene family in Musa acuminata L. AAA group and gaining an insight to their role in banana fruit development. J Appl Genet 2022; 63:609-631. [PMID: 35689012 DOI: 10.1007/s13353-022-00707-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 05/31/2022] [Accepted: 06/02/2022] [Indexed: 10/18/2022]
Abstract
Glutathione S-transferases (GSTs) are a multifunctional protein superfamily involved in diverse plant functions such as defense mechanisms, signaling, stress response, secondary metabolism, and plant growth and development. Although the banana whole-genome sequence is available, the distribution of GST genes on banana chromosomes, their subcellular localization, gene structure, evolutionary relationships, conserved motifs, and roles in banana are still unknown. A total of 62 full-length GST genes with the canonical thioredoxin fold were identified, belonging to nine GST classes, namely tau, phi, theta, zeta, lambda, DHAR, EF1G, GHR, and TCHQD. The 62 GST genes were distributed across 11 banana chromosomes. The MaGSTs were mostly localized in the cytoplasm. Gene architecture showed conservation of exon numbers within individual GST classes. Multiple Em for Motif Elicitation (MEME) analyses revealed few class-specific motifs, whereas many motifs were found in all the GST classes. Multiple sequence alignment of banana GST amino acid sequences with rice, Arabidopsis, and soybean sequences revealed Ser and Cys as conserved catalytic residues. Gene duplication analyses showed tandem duplication to be a driving force for GST gene family expansion in banana. Cis-regulatory element analysis showed the dominance of light-responsive elements, followed by stress- and hormone-responsive elements. Expression profiling was also performed using RNA-seq data. It was observed that MaGSTs are involved in various stages of fruit development. MaGSTU1 was highly upregulated. This comprehensive study of the MaGST gene family provides groundwork for further functional analysis of MaGST genes in banana at the molecular level and for plant breeding approaches.
Affiliation(s)
- Swati Vaish
  - Faculty of Biosciences, Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, 225003, Uttar Pradesh, India
- Reshma Parveen
  - Faculty of Biosciences, Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, 225003, Uttar Pradesh, India
- Divya Gupta
  - Faculty of Biosciences, Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, 225003, Uttar Pradesh, India
- Mahesh Kumar Basantani
  - Faculty of Biosciences, Institute of Biosciences and Technology, Shri Ramswaroop Memorial University, Lucknow-Deva Road, Barabanki, 225003, Uttar Pradesh, India
9
Tu Y, Lei H, Shen HB, Yang Y. SIFLoc: a self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images. Brief Bioinform 2022; 23:6527276. [DOI: 10.1093/bib/bbab605] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2021] [Revised: 12/15/2021] [Accepted: 12/27/2021] [Indexed: 12/19/2022] Open
Abstract
With the rapid growth of high-resolution microscopy imaging data, revealing the subcellular map of human proteins has become a central task in spatial proteomics. The cell atlas of the Human Protein Atlas (HPA) provides precious resources for recognizing subcellular localization patterns at the cell level, and the large-scale annotated data enable learning via advanced deep neural networks. However, the existing predictors still suffer from the imbalanced class distribution and the lack of labeled data for minor classes. Thus, it is necessary to develop new methods for coping with these issues. We leverage the self-supervised learning protocol to address these problems. Specifically, we propose a pre-training scheme, called SIFLoc, to enhance the conventional supervised learning framework. The pre-training features a hybrid data augmentation method and a modified contrastive loss function, aiming to learn good feature representations from microscopic images. The experiments are performed on a large-scale immunofluorescence microscopic image dataset collected from the HPA database. Using the same deep neural networks as the classifier, the model pre-trained via SIFLoc not only outperforms the model without pre-training by a large margin but also shows advantages over the state-of-the-art self-supervised learning methods. In particular, SIFLoc significantly improves the prediction accuracy for minor organelles.
Affiliation(s)
- Yanlun Tu
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200240 Shanghai, China
- Houchao Lei
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200240 Shanghai, China
- Hong-Bin Shen
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200240 Shanghai, China
  - Institute of Image Processing and Pattern Recognition and Key Laboratory of System Control and Information Processing, Shanghai Jiao Tong University, 200240 Shanghai, China
- Yang Yang
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200240 Shanghai, China
10
Deep localization of subcellular protein structures from fluorescence microscopy images. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06715-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
11
Ullah M, Han K, Hadi F, Xu J, Song J, Yu DJ. PScL-HDeep: image-based prediction of protein subcellular location in human tissue using ensemble learning of handcrafted and deep learned features with two-layer feature selection. Brief Bioinform 2021; 22:bbab278. [PMID: 34337652 PMCID: PMC8574991 DOI: 10.1093/bib/bbab278] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 04/11/2021] [Revised: 06/30/2021] [Accepted: 07/01/2021] [Indexed: 01/17/2023] Open
Abstract
Protein subcellular localization plays a crucial role in characterizing the function of proteins and understanding various cellular processes. Therefore, accurate identification of protein subcellular location is an important yet challenging task. Numerous computational methods have been proposed to predict the subcellular location of proteins. However, most existing methods have limited capability in terms of the overall accuracy, time consumption and generalization power. To address these problems, in this study, we developed a novel computational approach based on human protein atlas (HPA) data, referred to as PScL-HDeep, for accurate and efficient image-based prediction of protein subcellular location in human tissues. We extracted different handcrafted and deep learned (by employing pretrained deep learning model) features from different viewpoints of the image. The step-wise discriminant analysis (SDA) algorithm was applied to generate the optimal feature set from each original raw feature set. To further obtain a more informative feature subset, support vector machine-based recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) feature selection algorithm was applied to the integrated feature set. Finally, the classification models, namely support vector machine with radial basis function (SVM-RBF) and support vector machine with linear kernel (SVM-LNR), were learned on the final selected feature set. To evaluate the performance of the proposed method, a new gold standard benchmark training dataset was constructed from the HPA databank. PScL-HDeep achieved the maximum performance on 10-fold cross validation test on this dataset and showed a better efficacy over existing predictors. Furthermore, we also illustrated the generalization ability of the proposed method by conducting a stringent independent validation test.
Affiliation(s)
- Matee Ullah
  - Nanjing University of Science and Technology, China
- Ke Han
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
- Fazal Hadi
  - Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
- Jian Xu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
12
Hu JX, Yang Y, Xu YY, Shen HB. Incorporating label correlations into deep neural networks to classify protein subcellular location patterns in immunohistochemistry images. Proteins 2021; 90:493-503. [PMID: 34546597 DOI: 10.1002/prot.26244] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/16/2020] [Revised: 03/16/2021] [Accepted: 09/13/2021] [Indexed: 12/17/2022]
Abstract
Analysis of protein subcellular localization is a critical part of proteomics. In recent years, as both the number and quality of microscopic images are increasing rapidly, many automated methods, especially convolutional neural networks (CNN), have been developed to predict protein subcellular location(s) based on bioimages, but their performance always suffers from some inherent properties of the problem. First, many microscopic images have non-informative or noisy sections, like unstained stroma and unspecific background, which affect the extraction of protein expression information. Second, the patterns of protein subcellular localization are very complex, as a lot of proteins locate in more than one compartment. In this study, we propose a new label-correlation enhanced deep neural network, laceDNN, to classify the subcellular locations of multi-label proteins from immunohistochemistry images. The model uses small representative patches as input to alleviate the image noise issue, and its backbone is a hybrid architecture of CNN and recurrent neural network, where the former network extracts representative image features and the latter learns the organelle dependency relationships. Our experimental results indicate that the proposed model can improve the performance of multi-label protein subcellular classification.
Collapse
Affiliation(s)
- Jin-Xian Hu
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
| | - Yang Yang
- Department of Computer Science and Engineering, Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University, Shanghai, China
| | - Ying-Ying Xu
- School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
|
13
|
Ahmed FF, Hossen MI, Sarkar MAR, Konak JN, Zohra FT, Shoyeb M, Mondal S. Genome-wide identification of DCL, AGO and RDR gene families and their associated functional regulatory elements analyses in banana (Musa acuminata). PLoS One 2021; 16:e0256873. [PMID: 34473743 PMCID: PMC8412350 DOI: 10.1371/journal.pone.0256873] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Accepted: 08/17/2021] [Indexed: 12/15/2022] Open
Abstract
RNA silencing is mediated through RNA interference (RNAi) pathway gene families, i.e., Dicer-Like (DCL), Argonaute (AGO), and RNA-dependent RNA polymerase (RDR) and their cis-acting regulatory elements. The RNAi pathway is also directly connected with the post-transcriptional gene silencing (PTGS) mechanism, and the pathway controls eukaryotic gene regulation during growth, development, and stress response. Nevertheless, genome-wide identification of RNAi pathway gene families such as DCL, AGO, and RDR and their regulatory network analyses related to transcription factors have not been studied in many fruit crop species, including banana (Musa acuminata). In this study, we studied in silico genome-wide identification and characterization of DCL, AGO, and RDR genes in bananas thoroughly via integrated bioinformatics approaches. A genome-wide analysis identified 3 MaDCL, 13 MaAGO, and 5 MaRDR candidate genes based on multiple sequence alignment and phylogenetic tree related to the RNAi pathway in banana genomes. These genes correspond to the Arabidopsis thaliana RNAi silencing genes. The analysis of the conserved domain, motif, and gene structure (exon-intron numbers) for MaDCL, MaAGO, and MaRDR genes showed higher homogeneity within the same gene family. The Gene Ontology (GO) enrichment analysis exhibited that the identified RNAi genes could be involved in RNA silencing and associated metabolic pathways. A number of important transcription factors (TFs), e.g., ERF, Dof, C2H2, TCP, GATA and MIKC_MADS families, were identified by network and sub-network analyses between TFs and candidate RNAi gene families. Furthermore, the cis-acting regulatory elements related to light-responsive (LR), stress-responsive (SR), hormone-responsive (HR), and other activities (OT) functions were identified in candidate MaDCL, MaAGO, and MaRDR genes. 
These genome-wide analyses of these RNAi gene families provide valuable information related to RNA silencing, which would shed light on further characterization of RNAi genes, their regulatory elements, and functional roles, which might be helpful for banana improvement in the breeding program.
Affiliation(s)
- Fee Faysal Ahmed
- Faculty of Science, Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
- * E-mail:
| | - Md. Imran Hossen
- Faculty of Science, Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
| | - Md. Abdur Rauf Sarkar
- Faculty of Biological Science and Technology, Department of Genetic Engineering and Biotechnology, Jashore University of Science and Technology, Jashore, Bangladesh
| | - Jesmin Naher Konak
- Faculty of Life Science, Department of Biochemistry and Molecular Biology, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
| | - Fatema Tuz Zohra
- Faculty of Agriculture, Laboratory of Fruit Science, Saga University, Honjo-machi, Saga, Japan
| | - Md. Shoyeb
- Faculty of Biological Science and Technology, Department of Genetic Engineering and Biotechnology, Jashore University of Science and Technology, Jashore, Bangladesh
| | - Samiran Mondal
- Faculty of Science, Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
|
14
|
Cottle L, Gilroy I, Deng K, Loudovaris T, Thomas HE, Gill AJ, Samra JS, Kebede MA, Kim J, Thorn P. Machine Learning Algorithms, Applied to Intact Islets of Langerhans, Demonstrate Significantly Enhanced Insulin Staining at the Capillary Interface of Human Pancreatic β Cells. Metabolites 2021; 11:metabo11060363. [PMID: 34200432 PMCID: PMC8229564 DOI: 10.3390/metabo11060363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2021] [Revised: 05/27/2021] [Accepted: 05/28/2021] [Indexed: 11/16/2022] Open
Abstract
Pancreatic β cells secrete the hormone insulin into the bloodstream and are critical in the control of blood glucose concentrations. β cells are clustered in the micro-organs of the islets of Langerhans, which have a rich capillary network. Recent work has highlighted the intimate spatial connections between β cells and these capillaries, which lead to the targeting of insulin secretion to the region where the β cells contact the capillary basement membrane. In addition, β cells orientate with respect to the capillary contact point and many proteins are differentially distributed at the capillary interface compared with the rest of the cell. Here, we set out to develop an automated image analysis approach to identify individual β cells within intact islets and to determine if the distribution of insulin across the cells was polarised. Our results show that a U-Net machine learning algorithm correctly identified β cells and their orientation with respect to the capillaries. Using this information, we then quantified insulin distribution across the β cells to show enrichment at the capillary interface. We conclude that machine learning is a useful analytical tool to interrogate large image datasets and analyse sub-cellular organisation.
Affiliation(s)
- Louise Cottle
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Camperdown 2006, Australia
| | - Ian Gilroy
- School of Computer Science, University of Sydney, Camperdown 2006, Australia
| | - Kylie Deng
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Camperdown 2006, Australia
| | | | - Helen E Thomas
- St Vincent's Institute, Fitzroy 3065, Australia
- Department of Medicine, St Vincent's Hospital, University of Melbourne, Fitzroy 3065, Australia
| | - Anthony J Gill
- Northern Clinical School, University of Sydney, St Leonards 2065, Australia
- Department of Anatomical Pathology, Royal North Shore Hospital, St Leonards 2065, Australia
- Cancer Diagnosis and Pathology Research Group, Kolling Institute of Medical Research, St Leonards 2065, Australia
| | - Jaswinder S Samra
- Northern Clinical School, University of Sydney, St Leonards 2065, Australia
- Upper Gastrointestinal Surgical Unit, Royal North Shore Hospital, St Leonards 2065, Australia
| | - Melkam A Kebede
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Camperdown 2006, Australia
| | - Jinman Kim
- School of Computer Science, University of Sydney, Camperdown 2006, Australia
| | - Peter Thorn
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Camperdown 2006, Australia
|
15
|
Sadau SB, Ahmad A, Tajo SM, Ibrahim S, Kazeem BB, Wei H, Yu S. Overexpression of GhMPK3 from Cotton Enhances Cold, Drought, and Salt Stress in Arabidopsis. AGRONOMY 2021; 11:1049. [DOI: 10.3390/agronomy11061049] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
Cotton production is hampered by a variety of abiotic stresses that wreak havoc on the growth and development of plants, resulting in significant financial losses. According to reports, cotton production areas have declined around the world as a result of the ongoing stress. Therefore, plant breeding programs are concentrating on abiotic stress-tolerant cotton varieties. Mitogen-activated protein kinase (MAPK) cascades are involved in plant growth, stress responses, and the hormonal signaling pathway. In this research, three abiotic stresses (cold, drought, and salt) were analyzed on GhMPK3 transformed Arabidopsis plants. The transgenic plant’s gene expression and morphologic analysis were studied under cold, drought, and salt stress. Physiological parameters such as relative leaf water content, excised leaf water loss, chlorophyll content, and ion leakage showed that overexpressed plants possess more stable content under stress conditions compared with the WT plants. Furthermore, GhMPK3 overexpressed plants had greater antioxidant activities and weaker oxidant activities. Silencing GhMPK3 in cotton inhibited its tolerance to drought stress. Our research findings strongly suggest that GhMPK3 can be regarded as an essential gene for abiotic stress tolerance in cotton plants.
|
16
|
Abstract
Cell imaging has entered the 'Big Data' era. New technologies in light microscopy and molecular biology have led to an explosion in high-content, dynamic and multidimensional imaging data. Similar to the 'omics' fields two decades ago, our current ability to process, visualize, integrate and mine this new generation of cell imaging data is becoming a critical bottleneck in advancing cell biology. Computation, traditionally used to quantitatively test specific hypotheses, must now also enable iterative hypothesis generation and testing by deciphering hidden biologically meaningful patterns in complex, dynamic or high-dimensional cell image data. Data science is uniquely positioned to aid in this process. In this Perspective, we survey the rapidly expanding new field of data science in cell imaging. Specifically, we highlight how data science tools are used within current image analysis pipelines, propose a computation-first approach to derive new hypotheses from cell image data, identify challenges and describe the next frontiers where we believe data science will make an impact. We also outline steps to ensure broad access to these powerful tools - democratizing infrastructure availability, developing sensitive, robust and usable tools, and promoting interdisciplinary training to both familiarize biologists with data science and expose data scientists to cell imaging.
Affiliation(s)
- Meghan K Driscoll
- Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX 75390, USA
| | - Assaf Zaritsky
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
|
17
|
Li W, Zhang S, Yang G. Dynamic organization of intracellular organelle networks. WIREs Mech Dis 2020; 13:e1505. [PMID: 32865347 DOI: 10.1002/wsbm.1505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2020] [Revised: 06/06/2020] [Accepted: 07/09/2020] [Indexed: 01/07/2023]
Abstract
Intracellular organelles are membrane-bound and biochemically distinct compartments constructed to serve specialized functions in eukaryotic cells. Through extensive interactions, they form networks to coordinate and integrate their specialized functions for cell physiology. A fundamental property of these organelle networks is that they constantly undergo dynamic organization via membrane fusion and fission to remodel their internal connections and to mediate direct material exchange between compartments. The dynamic organization not only enables them to serve critical physiological functions adaptively but also differentiates them from many other biological networks such as gene regulatory networks and cell signaling networks. This review examines this fundamental property of the organelle networks from a systems point of view. The focus is exclusively on homotypic networks formed by mitochondria, lysosomes, endosomes, and the endoplasmic reticulum, respectively. First, key mechanisms that drive the dynamic organization of these networks are summarized. Then, several distinct organizational properties of these networks are highlighted. Next, spatial properties of the dynamic organization of these networks are emphasized, and their functional implications are examined. Finally, some representative molecular machineries that mediate the dynamic organization of these networks are surveyed. Overall, the dynamic organization of intracellular organelle networks is emerging as a fundamental and unifying paradigm in the internal organization of eukaryotic cells. This article is categorized under: Metabolic Diseases > Molecular and Cellular Physiology.
Affiliation(s)
- Wenjing Li
- Laboratory of Computational Biology and Machine Intelligence, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China.,National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
| | - Shuhao Zhang
- Laboratory of Computational Biology and Machine Intelligence, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China.,National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China.,College of Life Sciences, Nankai University, Tianjin, China
| | - Ge Yang
- Laboratory of Computational Biology and Machine Intelligence, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China.,National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China.,Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.,Department of Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
|
18
|
Liu Y, Yuan H, Wang Z, Ji S. Global Pixel Transformers for Virtual Staining of Microscopy Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:2256-2266. [PMID: 31985413 DOI: 10.1109/tmi.2020.2968504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Visualizing the details of different cellular structures is of great importance to elucidate cellular functions. However, it is challenging to obtain high quality images of different structures directly due to complex cellular environments. Fluorescence staining is a popular technique to label different structures but has several drawbacks. In particular, label staining is time consuming and may affect cell morphology, and simultaneous labels are inherently limited. This raises the need of building computational models to learn relationships between unlabeled microscopy images and labeled fluorescence images, and to infer fluorescence labels of other microscopy images excluding the physical staining process. We propose to develop a novel deep model for virtual staining of unlabeled microscopy images. We first propose a novel network layer, known as the global pixel transformer layer, that fuses global information from inputs effectively. The proposed global pixel transformer layer can generate outputs with arbitrary dimensions, and can be employed for all the regular, down-sampling, and up-sampling operators. We then incorporate our proposed global pixel transformer layers and dense blocks to build an U-Net like network. We believe such a design can promote feature reusing between layers. In addition, we propose a multi-scale input strategy to encourage networks to capture features at different scales. We conduct evaluations across various fluorescence image prediction tasks to demonstrate the effectiveness of our approach. Both quantitative and qualitative results show that our method outperforms the state-of-the-art approach significantly. It is also shown that our proposed global pixel transformer layer is useful to improve the fluorescence image prediction results.
|
19
|
Boby N, Abbas MA, Lee EB, Park SC. Pharmacodynamics of Ceftiofur Selected by Genomic and Proteomic Approaches of Streptococcus parauberis Isolated from the Flounder, Paralichthys olivaceus. Int J Genomics 2020; 2020:4850290. [PMID: 32318593 PMCID: PMC7150728 DOI: 10.1155/2020/4850290] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2019] [Revised: 02/14/2020] [Accepted: 03/13/2020] [Indexed: 11/17/2022] Open
Abstract
We employed an integrative strategy to present subtractive and comparative metabolic and genomic-based findings of therapeutic targets against Streptococcus parauberis. For the first time, we not only identified potential targets based on genomic and proteomic database analyses but also recommend a new antimicrobial drug for the treatment of olive flounder (Paralichthys olivaceus) infected with S. parauberis. To do that, 102 total annotated metabolic pathways of this bacterial strain were extracted from computational comparative metabolic and genomic databases. Six druggable proteins were identified from these metabolic pathways from the DrugBank database with their respective genes as mtnN, penA, pbp2, murB, murA, coaA, and fni out of 112 essential nonhomologous proteins. Among these hits, 26 transmembrane proteins and 77 cytoplasmic proteins were extracted as potential vaccines and drug targets, respectively. From the FDA DrugBank, ceftiofur was selected to prevent antibiotic resistance as it inhibited our selected identified target. Florfenicol is used for treatment of S. parauberis infection in flounder and was chosen as a comparator drug. All tested strains of fish isolates with S. parauberis were susceptible to ceftiofur and florfenicol with minimum inhibitory concentrations (MIC) of 0.0039-1 μg/mL and 0.5-8 μg/mL, IC50 of 0.001-0.5 μg/mL and 0.7-2.7 μg/mL, and minimum biofilm eradication concentrations (MBEC) of 2-256 μg/mL and 4-64 μg/mL, respectively. Similar susceptibility profiles for ceftiofur and florfenicol were found, with ceftiofur observed as an effective and potent antimicrobial drug against both planktonic and biofilm-forming strains of the fish pathogen Streptococcus parauberis, and it can be applied in the aquaculture industry. Thus, our predictive approach not only showed novel therapeutic agents but also indicated that marketed drugs should also be tested for efficacy against newly identified targets of this important fish pathogen.
Affiliation(s)
- Naila Boby
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41569, Republic of Korea
| | - Muhammad Aleem Abbas
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41569, Republic of Korea
| | - Eon-Bee Lee
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41569, Republic of Korea
| | - Seung-Chun Park
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41569, Republic of Korea
|
20
|
Liu XX, Chou KC. pLoc_Deep-mGneg: Predict Subcellular Localization of Gram Negative Bacterial Proteins by Deep Learning. 2020. [DOI: 10.4236/abb.2020.115011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
21
|
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mPlant: Predict Subcellular Localization of Plant Proteins by Deep Learning. 2020. [DOI: 10.4236/ns.2020.125021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
22
|
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mHum: Predict Subcellular Localization of Human Proteins by Deep Learning. 2020. [DOI: 10.4236/ns.2020.127042] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
23
|
Lu Z, Chou KC. pLoc_Deep-mGpos: Predict Subcellular Localization of Gram Positive Bacteria Proteins by Deep Learning. 2020. [DOI: 10.4236/jbise.2020.135005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
24
|
Shao Y, Chou KC. pLoc_Deep-mVirus: A CNN Model for Predicting Subcellular Localization of Virus Proteins by Deep Learning. 2020. [DOI: 10.4236/ns.2020.126033] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
25
|
Shao Y, Chou KC. pLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning. 2020. [DOI: 10.4236/ns.2020.126034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
26
|
Ba Q, Raghavan G, Kiselyov K, Yang G. Whole-Cell Scale Dynamic Organization of Lysosomes Revealed by Spatial Statistical Analysis. Cell Rep 2018; 23:3591-3606. [PMID: 29925001 DOI: 10.1016/j.celrep.2018.05.079] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2017] [Revised: 04/14/2018] [Accepted: 05/23/2018] [Indexed: 01/22/2023] Open
Abstract
In eukaryotic cells, lysosomes are distributed in the cytoplasm as individual membrane-bound compartments to degrade macromolecules and to control cellular metabolism. A fundamental yet unanswered question is whether and, if so, how individual lysosomes are organized spatially to coordinate and integrate their functions. To address this question, we analyzed their collective behavior in cultured cells using spatial statistical techniques. We found that in single cells, lysosomes maintain non-random, stable, yet distinct spatial distributions mediated by the cytoskeleton, the endoplasmic reticulum (ER), and lysosomal biogenesis. Throughout the intracellular space, lysosomes form dynamic clusters that significantly increase their interactions with endosomes. Cluster formation is associated with local increases in ER spatial density but does not depend on fusion with endosomes or spatial exclusion by mitochondria. Taken together, our findings reveal whole-cell scale spatial organization of lysosomes and provide insights into how organelle interactions are mediated and regulated across the entire intracellular space.
Affiliation(s)
- Qinle Ba
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Guruprasad Raghavan
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kirill Kiselyov
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
| | - Ge Yang
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
|
27
|
Javed F, Hayat M. Predicting subcellular localization of multi-label proteins by incorporating the sequence features into Chou's PseAAC. Genomics 2019; 111:1325-1332. [DOI: 10.1016/j.ygeno.2018.09.004] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2018] [Accepted: 09/04/2018] [Indexed: 12/13/2022]
|
28
|
pLoc_bal-mHum: Predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset. Genomics 2019; 111:1274-1282. [DOI: 10.1016/j.ygeno.2018.08.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2018] [Revised: 08/14/2018] [Accepted: 08/16/2018] [Indexed: 12/17/2022]
|
29
|
Xiao X, Cheng X, Chen G, Mao Q, Chou KC. pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset. Med Chem 2019; 15:496-509. [DOI: 10.2174/1573406415666181217114710] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/17/2022]
Abstract
Background/Objective: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. Facing the avalanche of protein sequences emerging in the post-genomic age, it is urgent to develop computational tools for timely and effectively identifying their subcellular localization based on the sequence information alone. Recently, a predictor called “pLoc-mVirus” was developed for identifying the subcellular localization of virus proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, known as “multiplex proteins”, may simultaneously occur in, or move between, two or more subcellular location sites. Despite the fact that it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mVirus was trained by an extremely skewed dataset in which some subset was over 10 times the size of the other subsets. Accordingly, it cannot avoid the biased consequence caused by such an uneven training dataset. Methods: Using Chou's general PseAAC (Pseudo Amino Acid Composition) approach and the IHTS (Inserting Hypothetical Training Samples) treatment to balance out the training dataset, we have developed a new predictor called “pLoc_bal-mVirus” for predicting the subcellular localization of multi-label virus proteins. Results: Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mVirus, the existing state-of-the-art predictor for the same purpose. Conclusion: Its user-friendly web-server is available at http://www.jci-bioinfo.cn/pLoc_balmVirus/, by which the majority of experimental scientists can easily get their desired results without the need to go through the detailed complicated mathematics. Accordingly, pLoc_bal-mVirus will become a very useful tool for designing multi-target drugs and in-depth understanding of the biological process in a cell.
Affiliation(s)
- Xuan Xiao
- Gordon Life Science Institute, Boston, MA 02478, United States
| | - Xiang Cheng
- Gordon Life Science Institute, Boston, MA 02478, United States
| | - Genqiang Chen
- College of Chemistry, Chemical Engineering and Biotechnology, Donghua University, Shanghai 201620, China
| | - Qi Mao
- College of Information Science and Technology, Donghua University, Shanghai, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
30
|
Chou KC, Cheng X, Xiao X. pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset. Med Chem 2019; 15:472-485. [DOI: 10.2174/1573406415666181218102517] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/24/2022]
Abstract
Background/Objective: Information of protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, it is highly demanded to develop powerful bioinformatics tools for timely and effectively identifying their subcellular localization based on the sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mEuk was trained by an extremely skewed dataset where some subset was about 200 times the size of the other subsets. Accordingly, it cannot avoid the biased consequence caused by such an uneven training dataset. Methods: To alleviate such bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor in identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems. Results: To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/. Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool in identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very hot trend in drug development.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
| | - Xiang Cheng
- Gordon Life Science Institute, Boston, MA 02478, United States
| | - Xuan Xiao
- Gordon Life Science Institute, Boston, MA 02478, United States
|
31
|
Abstract
Background: Revealing the subcellular location of a newly discovered protein can bring insight into its function and guide research at the cellular level. The experimental methods currently used to identify protein subcellular locations are both time-consuming and expensive. Thus, it is highly desirable to develop computational methods for efficiently and effectively identifying protein subcellular locations. In particular, the rapidly increasing number of protein sequences entering the genome databases has called for the development of automated analysis methods. Methods: In this review, we describe recent advances in predicting protein subcellular locations with machine learning from the following aspects: i) protein subcellular location benchmark dataset construction, ii) protein feature representation and feature descriptors, iii) common machine learning algorithms, iv) cross-validation test methods and assessment metrics, v) web servers. Result & Conclusion: Concomitant with the large number of protein sequences generated by high-throughput technologies, four future directions for predicting protein subcellular locations with machine learning deserve attention. One direction is the selection of novel and effective features (e.g., statistical, physical-chemical, evolutionary) from the sequences and structures of proteins. Another is the feature fusion strategy. The third is the design of a powerful predictor, and the fourth is the prediction of multiple location sites for a protein.
Affiliation(s)
- Ting-He Zhang
- School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| | - Shao-Wu Zhang
- School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| |
|
32
|
Govyadinov PA, Womack T, Eriksen JL, Chen G, Mayerich D. Robust Tracing and Visualization of Heterogeneous Microvascular Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:1760-1773. [PMID: 29993636 PMCID: PMC6360128 DOI: 10.1109/tvcg.2018.2818701] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 05/29/2023]
Abstract
Advances in high-throughput imaging allow researchers to collect three-dimensional images of whole organ microvascular networks. These extremely large images contain networks that are highly complex, time consuming to segment, and difficult to visualize. In this paper, we present a framework for segmenting and visualizing vascular networks from terabyte-sized three-dimensional images collected using high-throughput microscopy. While these images require terabytes of storage, the volume devoted to the fiber network is ≈ 4 percent of the total volume size. While the networks themselves are sparse, they are tremendously complex, interconnected, and vary widely in diameter. We describe a parallel GPU-based predictor-corrector method for tracing filaments that is robust to noise and sampling errors common in these data sets. We also propose a number of visualization techniques designed to convey the complex statistical descriptions of fibers across large tissue sections-including commonly studied microvascular characteristics, such as orientation and volume.
|
33
|
Sauvat A, Leduc M, Müller K, Kepp O, Kroemer G. ColocalizR: An open-source application for cell-based high-throughput colocalization analysis. Comput Biol Med 2019; 107:227-234. [DOI: 10.1016/j.compbiomed.2019.02.024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/24/2018] [Revised: 02/20/2019] [Accepted: 02/26/2019] [Indexed: 12/14/2022]
|
34
|
Cheng X, Xiao X, Chou KC. pLoc_bal-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. J Theor Biol 2018; 458:92-102. [DOI: 10.1016/j.jtbi.2018.09.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 08/16/2018] [Revised: 09/05/2018] [Accepted: 09/07/2018] [Indexed: 01/03/2023]
|
35
|
Samacoits A, Chouaib R, Safieddine A, Traboulsi AM, Ouyang W, Zimmer C, Peter M, Bertrand E, Walter T, Mueller F. A computational framework to study sub-cellular RNA localization. Nat Commun 2018; 9:4584. [PMID: 30389932 PMCID: PMC6214940 DOI: 10.1038/s41467-018-06868-w] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 11/30/2017] [Accepted: 10/01/2018] [Indexed: 02/01/2023] Open
Abstract
RNA localization is a crucial process for cellular function and can be quantitatively studied by single molecule FISH (smFISH). Here, we present an integrated analysis framework to analyze sub-cellular RNA localization. Using simulated images, we design and validate a set of features describing different RNA localization patterns including polarized distribution, accumulation in cell extensions or foci, at the cell membrane or nuclear envelope. These features are largely invariant to RNA levels, work in multiple cell lines, and can measure localization strength in perturbation experiments. Most importantly, they allow classification by supervised and unsupervised learning at unprecedented accuracy. We successfully validate our approach on representative experimental data. This analysis reveals a surprisingly high degree of localization heterogeneity at the single cell level, indicating a dynamic and plastic nature of RNA localization. Automated analysis of RNA localisation in smFISH data has been elusive. Here, the authors simulate and use a large dataset of images to design and validate a framework for highly accurate classification of sub-cellular RNA localisation patterns from smFISH experiments.
Affiliation(s)
- Aubin Samacoits
- Unité Imagerie et Modélisation, Institut Pasteur and CNRS UMR 3691, 28 rue du Docteur Roux, 75015, Paris, France; C3BI, USR 3756 IP CNRS, 28 rue du Docteur Roux, 75015, Paris, France
| | - Racha Chouaib
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; Equipe labellisée Ligue Nationale Contre le Cancer, Paris, France
| | - Adham Safieddine
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; Equipe labellisée Ligue Nationale Contre le Cancer, Paris, France
| | - Abdel-Meneem Traboulsi
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; Equipe labellisée Ligue Nationale Contre le Cancer, Paris, France
| | - Wei Ouyang
- Unité Imagerie et Modélisation, Institut Pasteur and CNRS UMR 3691, 28 rue du Docteur Roux, 75015, Paris, France; C3BI, USR 3756 IP CNRS, 28 rue du Docteur Roux, 75015, Paris, France
| | - Christophe Zimmer
- Unité Imagerie et Modélisation, Institut Pasteur and CNRS UMR 3691, 28 rue du Docteur Roux, 75015, Paris, France; C3BI, USR 3756 IP CNRS, 28 rue du Docteur Roux, 75015, Paris, France
| | - Marion Peter
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; Equipe labellisée Ligue Nationale Contre le Cancer, Paris, France
| | - Edouard Bertrand
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; Equipe labellisée Ligue Nationale Contre le Cancer, Paris, France
| | - Thomas Walter
- MINES ParisTech, PSL-Research University, CBIO-Centre for Computational Biology, 75006, Paris, France; Institut Curie, PSL Research University, 75005, Paris, France; INSERM, U900, 75005, Paris, France
| | - Florian Mueller
- Unité Imagerie et Modélisation, Institut Pasteur and CNRS UMR 3691, 28 rue du Docteur Roux, 75015, Paris, France; C3BI, USR 3756 IP CNRS, 28 rue du Docteur Roux, 75015, Paris, France
| |
|
36
|
van Beers JJBC, Hahn M, Fraune J, Mallet K, Krause C, Hormann W, Fechner K, Damoiseaux JGMC. Performance analysis of automated evaluation of antinuclear antibody indirect immunofluorescent tests in a routine setting. AUTOIMMUNITY HIGHLIGHTS 2018; 9:8. [PMID: 30238164 PMCID: PMC6147779 DOI: 10.1007/s13317-018-0108-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/29/2018] [Accepted: 09/10/2018] [Indexed: 12/20/2022]
Abstract
Purpose: Indirect immunofluorescence (IIF) on the human epithelial cell-line HEp-2 (or derivatives) serves as the gold standard in antinuclear antibody (ANA) screening. IIF, and its evaluation, is a labor-intensive method, making ANA testing a major challenge for present clinical laboratories. Nowadays, several automated ANA pattern recognition systems are on the market. In the current study, the EUROPattern Suite is evaluated for its use in daily practice in a routine setting. Methods: A total of 1033 consecutive routine samples was used to screen for ANA. Results (positive/negative ANA screening, pattern identification and titer) were compared between software-generated results (EUROPattern) and visual interpretation (observer) of automatically acquired digital images. Results: Considering the visual interpretation as reference, a relative sensitivity of 99.3% and a relative specificity of 88.9% were obtained for negative and positive discrimination by the software (EPa). A good agreement between visual and software-based interpretation was observed with respect to pattern recognition (mean kappa for 7 patterns: 0.7). Interestingly, EPa software distinguished more patterns per positive sample than the observer (on average 1.5 and 1.2, respectively). Finally, a concordance of 99.3% was observed within the range of 1 titer step difference between EPa and observer. Conclusions: The ANA IIF results reported by the EPa software are in very good agreement with the results reported by the observer with respect to being negative/positive, pattern recognition and titer, making automated ANA IIF evaluation an objective and time-efficient tool for routine testing.
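For readers unfamiliar with the agreement measures quoted above, the sketch below shows how relative sensitivity, relative specificity, and Cohen's kappa follow from a 2×2 table of software versus observer calls. The counts are hypothetical and are not taken from the study, and the study's kappa is a mean over 7 patterns, whereas this simplified example covers a single positive/negative comparison.

```python
def screening_agreement(tp, fp, fn, tn):
    """Return (sensitivity, specificity, kappa) with the observer as reference.

    tp: both positive, fp: software positive / observer negative,
    fn: software negative / observer positive, tn: both negative.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: fraction of samples where both calls match.
    observed = (tp + tn) / total
    # Chance agreement expected from the marginal totals of each rater.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, kappa = screening_agreement(tp=40, fp=10, fn=5, tn=45)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.2f}")
```

Kappa discounts the agreement that two raters would reach by chance, which is why it is reported alongside the raw concordance figures.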
Affiliation(s)
- Joyce J B C van Beers
- Central Diagnostic Laboratory, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX, Maastricht, The Netherlands
| | - Melanie Hahn
- Institute for Experimental Immunology, EUROIMMUN Medizinische Labordiagnostika AG, Seekamp 31, 23560, Lübeck, Germany
| | - Johanna Fraune
- Institute for Experimental Immunology, EUROIMMUN Medizinische Labordiagnostika AG, Seekamp 31, 23560, Lübeck, Germany
| | - Kathleen Mallet
- Central Diagnostic Laboratory, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX, Maastricht, The Netherlands
| | - Christopher Krause
- Institute for Experimental Immunology, EUROIMMUN Medizinische Labordiagnostika AG, Seekamp 31, 23560, Lübeck, Germany
| | - Wymke Hormann
- Institute for Experimental Immunology, EUROIMMUN Medizinische Labordiagnostika AG, Seekamp 31, 23560, Lübeck, Germany
| | - Kai Fechner
- Institute for Experimental Immunology, EUROIMMUN Medizinische Labordiagnostika AG, Seekamp 31, 23560, Lübeck, Germany
| | - Jan G M C Damoiseaux
- Central Diagnostic Laboratory, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX, Maastricht, The Netherlands.
| |
|
37
|
Fan Y, Sun J, Chen Q, Zhang J, Zuo C. Wide-field anti-aliased quantitative differential phase contrast microscopy. OPTICS EXPRESS 2018; 26:25129-25146. [PMID: 30469639 DOI: 10.1364/oe.26.025129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2018] [Accepted: 08/23/2018] [Indexed: 06/09/2023]
Abstract
Differential phase contrast (DPC) microscopy is a popular methodology to recover quantitative phase information of thin transparent samples under multi-axis asymmetric illumination patterns. Based on spatially partially coherent illuminations, DPC provides high-quality, speckle-free 3D reconstructions with lateral resolution up to twice the coherent diffraction limit, under the precondition that the pixel size of the imaging sensor is small enough to prevent spatial aliasing/undersampling. However, microscope cameras are in general designed to have a large pixel size so that the intensity information transmitted by the optical system cannot be adequately sampled or digitized. On the other hand, using an image sensor with a smaller pixel size or adding a magnification camera adapter to the camera can resolve the undersampling at the expense of a reduced field of view (FOV). To solve this tradeoff, we introduce a new variation of quantitative DPC approach, termed anti-aliased DPC (AADPC), which uses several aliased intensity images under asymmetric illuminations to recover wide-field aliasing-free phase images. Besides, phase transfer functions under different illumination patterns in DPC are analyzed to design an illumination scheme with better phase transfer characteristics. AADPC starts from an initial phase estimate obtained by a DPC-like deconvolution based on the system's weak phase transfer function under discrete half-annular illumination. Then the obtained initial phase map is further refined by the iterative de-multiplexing algorithm to overcome pixel-aliasing and improve the imaging resolution. 
The data redundancy requirement as well as the optimal illumination scheme of AADPC are analyzed and discussed based on several simulations, suggesting that the spatial undersampling can be mitigated through the iterative algorithm that uses only 4 images, yielding a nearly 4-fold increase in the space-bandwidth product (SBP) compared to the conventional DPC approach. We experimentally verify that AADPC can achieve a half-pitch imaging resolution of 345 nm, corresponding to 1.88× of the theoretical Nyquist-Shannon sampling resolution limit imposed by the sensor pixel size. The high-speed, high-throughput quantitative phase imaging capabilities of AADPC are also demonstrated by imaging HeLa cell mitosis in vitro, achieving a full-pitch lateral resolution of 665 nm across a wide FOV of 1.77 mm² at 25 fps.
|
38
|
Birhanu BT, Lee SJ, Park NH, Song JB, Park SC. In silico analysis of putative drug and vaccine targets of the metabolic pathways of Actinobacillus pleuropneumoniae using a subtractive/comparative genomics approach. J Vet Sci 2018; 19:188-199. [PMID: 29032659 PMCID: PMC5879067 DOI: 10.4142/jvs.2018.19.2.188] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/24/2017] [Revised: 08/04/2017] [Accepted: 10/07/2017] [Indexed: 11/20/2022] Open
Abstract
Actinobacillus pleuropneumoniae is a Gram-negative bacterium that resides in the respiratory tract of pigs and causes porcine respiratory disease complex, which leads to significant losses in the pig industry worldwide. The incidence of drug resistance in this bacterium is increasing; thus, identifying new protein/gene targets for drug and vaccine development is critical. In this study, we used an in silico approach, utilizing several databases including the Kyoto Encyclopedia of Genes and Genomes (KEGG), the Database of Essential Genes (DEG), DrugBank, and Swiss-Prot to identify non-homologous essential genes and prioritize these proteins for their druggability. The results showed 20 metabolic pathways that were unique and contained 273 non-homologous proteins, of which 122 were essential. Of the 122 essential proteins, there were 95 cytoplasmic proteins and 11 transmembrane proteins, which are potentially suitable for drug and vaccine targets, respectively. Among these, 25 had at least one hit in DrugBank, and three had similarity to metabolic proteins from Mycoplasma hyopneumoniae, another pathogen causing porcine respiratory disease complex; thus, they could serve as common therapeutic targets. In conclusion, we identified glyoxylate and dicarboxylate pathways as potential targets for antimicrobial therapy and tetra-acyldisaccharide 4'-kinase and 3-deoxy-D-manno-octulosonic-acid transferase as vaccine candidates against A. pleuropneumoniae.
Affiliation(s)
- Biruk T Birhanu
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41566, Korea
| | - Seung-Jin Lee
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41566, Korea
| | - Na-Hye Park
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41566, Korea
| | - Ju-Beom Song
- Department of Chemistry Education, Teachers College, Kyungpook National University, Daegu 41566, Korea
| | - Seung-Chun Park
- Laboratory of Veterinary Pharmacokinetics and Pharmacodynamics, College of Veterinary Medicine, Kyungpook National University, Daegu 41566, Korea
| |
|
39
|
Zhang L, Khattar N, Kemenes I, Kemenes G, Zrinyi Z, Pirger Z, Vertes A. Subcellular Peptide Localization in Single Identified Neurons by Capillary Microsampling Mass Spectrometry. Sci Rep 2018; 8:12227. [PMID: 30111831 PMCID: PMC6093924 DOI: 10.1038/s41598-018-29704-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/07/2018] [Accepted: 07/17/2018] [Indexed: 12/22/2022] Open
Abstract
Single cell mass spectrometry (MS) is uniquely positioned for the sequencing and identification of peptides in rare cells. Small peptides can take on different roles in subcellular compartments. Whereas some peptides serve as neurotransmitters in the cytoplasm, they can also function as transcription factors in the nucleus. Thus, there is a need to analyze the subcellular peptide compositions in identified single cells. Here, we apply capillary microsampling MS with ion mobility separation for the sequencing of peptides in single neurons of the mollusk Lymnaea stagnalis, and the analysis of peptide distributions between the cytoplasm and nucleus of identified single neurons that are known to express cardioactive Phe-Met-Arg-Phe amide-like (FMRFamide-like) neuropeptides. Nuclei and cytoplasm of Type 1 and Type 2 F group (Fgp) neurons were analyzed for neuropeptides cleaved from the protein precursors encoded by alternative splicing products of the FMRFamide gene. Relative abundances of nine neuropeptides were determined in the cytoplasm. The nuclei contained six of these peptides at different abundances. Enabled by its relative enrichment in Fgp neurons, a new 28-residue neuropeptide was sequenced by tandem MS.
Affiliation(s)
- Linwen Zhang
- Department of Chemistry, The George Washington University, Washington, DC, 20052, USA
| | - Nikkita Khattar
- Department of Chemistry, The George Washington University, Washington, DC, 20052, USA
| | - Ildiko Kemenes
- Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
| | - Gyorgy Kemenes
- Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
| | - Zita Zrinyi
- Department of Experimental Zoology, Balaton Limnological Institute, MTA Center for Ecological Research, 8237, Tihany, Hungary
| | - Zsolt Pirger
- Department of Experimental Zoology, Balaton Limnological Institute, MTA Center for Ecological Research, 8237, Tihany, Hungary
| | - Akos Vertes
- Department of Chemistry, The George Washington University, Washington, DC, 20052, USA.
| |
|
40
|
Mirzaei Mehrabad E, Hassanzadeh R, Eslahchi C. PMLPR: A novel method for predicting subcellular localization based on recommender systems. Sci Rep 2018; 8:12006. [PMID: 30104743 PMCID: PMC6089892 DOI: 10.1038/s41598-018-30394-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/25/2016] [Accepted: 07/30/2018] [Indexed: 12/16/2022] Open
Abstract
The protein subcellular localization problem matters because a protein's function depends on the part of the cell in which it acts. Moreover, prediction of subcellular locations helps to identify potential molecular targets for drugs and plays an important role in genome annotation. Most existing prediction methods assign only one location to each protein. However, since some proteins move between different subcellular locations, they can have multiple locations. In recent years, some multiple-location predictors have been introduced, but they are not accurate enough and there is much room for improvement. In this paper, we introduce a method, PMLPR, to predict locations for a protein. PMLPR predicts a list of locations for each protein based on recommender systems, and it can properly handle the multiple-location prediction problem. To evaluate the performance of PMLPR, we considered six datasets: RAT, FLY, HUMAN, Du et al., DBMLoc and Höglund. The performance of this algorithm is compared with six state-of-the-art algorithms: YLoc, WOLF-PSORT, prediction channel, MDLoc, Du et al. and MultiLoc2-HighRes. The results indicate that our proposed method is significantly superior on RAT and FLY proteins, and decent on HUMAN proteins. Moreover, on the datasets introduced by Du et al., DBMLoc and Höglund, PMLPR has comparable results. For the case study, we applied the algorithms to 8 proteins that are important in cancer research. The results of the comparison with other methods indicate the efficiency of PMLPR.
Affiliation(s)
- Elnaz Mirzaei Mehrabad
- Department of Computer Science, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
| | - Reza Hassanzadeh
- Department of Engineering Sciences, Faculty of Advanced Technologies, University of Mohaghegh Ardabili, Namin, Iran
- Department of Bioinformatics, Faculty of Computer Engineering and Information Technology, Sabalan University of Advanced Technologies (SUAT), Namin, Iran
| | - Changiz Eslahchi
- Department of Computer Science, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
| |
|
41
|
Cheng X, Lin WZ, Xiao X, Chou KC. pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC. Bioinformatics 2018; 35:398-406. [DOI: 10.1093/bioinformatics/bty628] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 04/21/2018] [Accepted: 07/11/2018] [Indexed: 12/25/2022] Open
Affiliation(s)
- Xiang Cheng
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
| | - Wei-Zhong Lin
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Xuan Xiao
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
| | - Kuo-Chen Chou
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
|
42
|
pLoc_bal-mGpos: Predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC. Genomics 2018; 111:886-892. [PMID: 29842950 DOI: 10.1016/j.ygeno.2018.05.017] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 04/24/2018] [Revised: 05/14/2018] [Accepted: 05/18/2018] [Indexed: 12/12/2022]
Abstract
Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desired to develop computational tools for timely and effectively identifying their subcellular localization purely based on the sequence information alone. Recently, a predictor called "pLoc-mGpos" was developed for identifying the subcellular localization of Gram-positive bacterial proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called "multiplex proteins", may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mGpos was trained by an extremely skewed dataset in which some subset (subcellular location) was over 11 times the size of the other subsets. Accordingly, it cannot avoid the bias consequence caused by such an uneven training dataset. To alleviate such bias consequence, we have developed a new and bias-reducing predictor called pLoc_bal-mGpos by quasi-balancing the training dataset. Rigorous target jackknife tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mGpos, the existing state-of-the-art predictor in identifying the subcellular localization of Gram-positive bacterial proteins. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGpos/, by which users can easily get their desired results without the need to go through the detailed mathematics.
|
43
|
High-speed Fourier ptychographic microscopy based on programmable annular illuminations. Sci Rep 2018; 8:7669. [PMID: 29769558 PMCID: PMC5956106 DOI: 10.1038/s41598-018-25797-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/14/2018] [Accepted: 04/27/2018] [Indexed: 11/08/2022] Open
Abstract
High-throughput quantitative phase imaging (QPI) is essential to the characterization of cellular phenotypes, as it allows high-content cell analysis and avoids adverse effects of staining reagents on cellular viability and cell signaling. Among different approaches, Fourier ptychographic microscopy (FPM) is probably the most promising technique to realize high-throughput QPI by synthesizing a wide-field, high-resolution complex image from multiple angle-variably illuminated, low-resolution images. However, the large dataset requirement in conventional FPM significantly limits its imaging speed, resulting in low temporal throughput. Moreover, the underlying theoretical mechanism as well as the optimum illumination scheme for high-accuracy phase imaging in FPM remain unclear. Herein, we report a high-speed FPM technique based on programmable annular illuminations (AIFPM). The optical-transfer-function (OTF) analysis of FPM reveals that the low-frequency phase information can only be correctly recovered if the LEDs are precisely located at the edge of the objective numerical aperture (NA) in the frequency space. By using only 4 low-resolution images corresponding to 4 tilted illuminations matching a 10×, 0.4 NA objective, we present high-speed imaging results of in vitro HeLa cell mitosis and apoptosis at a frame rate of 25 Hz with a full-pitch resolution of 655 nm at a wavelength of 525 nm (effective NA = 0.8) across a wide field-of-view (FOV) of 1.77 mm², corresponding to a space-bandwidth-time product of 411 megapixels per second. Our work reveals an important capability of FPM towards high-speed high-throughput imaging of in vitro live cells, achieving video-rate QPI performance across a wide range of scales, both spatial and temporal.
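The space-bandwidth-time product quoted above can be checked with a short calculation. The sketch below assumes, as an interpretation not stated in the abstract, that each resolved full-pitch period is sampled by two pixels (Nyquist sampling), so the effective pixel size is half the full-pitch resolution; the function name is hypothetical.

```python
def space_bandwidth_time_product(fov_mm2, full_pitch_nm, frame_rate_hz):
    """Estimate resolvable pixels per second for a given FOV and resolution.

    Assumption: Nyquist sampling, i.e. effective pixel size = full pitch / 2.
    """
    pixel_um = full_pitch_nm / 2 / 1000      # effective pixel size in micrometres
    fov_um2 = fov_mm2 * 1e6                  # 1 mm^2 = 1e6 um^2
    pixels_per_frame = fov_um2 / pixel_um**2 # space-bandwidth product per frame
    return pixels_per_frame * frame_rate_hz


# Values reported in the abstract: 1.77 mm^2 FOV, 655 nm full pitch, 25 Hz.
rate = space_bandwidth_time_product(1.77, 655, 25)
print(f"{rate / 1e6:.1f} megapixels per second")
```

Under this assumption the estimate comes out at roughly 4.1 × 10^8 pixels per second, in line with the reported 411 megapixels per second; the small difference presumably reflects rounding of the inputs.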
|
44
|
Ricchiuti V, Adams J, Hardy DJ, Katayev A, Fleming JK. Automated Processing and Evaluation of Anti-Nuclear Antibody Indirect Immunofluorescence Testing. Front Immunol 2018; 9:927. [PMID: 29780386 PMCID: PMC5946161 DOI: 10.3389/fimmu.2018.00927] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/24/2017] [Accepted: 04/13/2018] [Indexed: 01/18/2023] Open
Abstract
Indirect immunofluorescence (IIF) is considered by the American College of Rheumatology (ACR) and the international consensus on ANA patterns (ICAP) the gold standard for the screening of anti-nuclear antibodies (ANA). As conventional IIF is labor intensive, time-consuming, subjective, and poorly standardized, there have been ongoing efforts to improve the standardization of reagents and to develop automated platforms for assay incubation, microscopy, and evaluation. In this study, the workflow and performance characteristics of a fully automated ANA IIF system (Sprinter XL, EUROPattern Suite, IFA 40: HEp-20-10 cells) were compared to a manual approach using visual microscopy with a filter device for single-well titration and to technologist reading. The Sprinter/EUROPattern system enabled the processing of large daily workload cohorts in less than 8 h and the reduction of labor hands-on time by more than 4 h. Regarding the discrimination of positive from negative samples, the overall agreement of the EUROPattern software with technologist reading was higher (95.6%) than when compared to the current method (89.4%). Moreover, the software was consistent with technologist reading in 80.6–97.5% of patterns and 71.0–93.8% of titers. In conclusion, the Sprinter/EUROPattern system provides substantial labor savings and good concordance with technologist ANA IIF microscopy, thus increasing standardization, laboratory efficiency, and removing subjectivity.
Affiliation(s)
- Vincent Ricchiuti
- North Central Division, Laboratory Corporation of America Holdings (LabCorp), Dublin, OH, United States
| | - Joseph Adams
- North Central Division, Laboratory Corporation of America Holdings (LabCorp), Dublin, OH, United States
| | - Donna J Hardy
- North Central Division, Laboratory Corporation of America Holdings (LabCorp), Dublin, OH, United States
| | - Alexander Katayev
- Department of Science and Technology, Laboratory Corporation of America Holdings (LabCorp), Elon, NC, United States
| | - James K Fleming
- Department of Science and Technology, Laboratory Corporation of America Holdings (LabCorp), Elon, NC, United States
| |
|
45
|
Nanni L, Brahnam S, Ghidoni S, Lumini A. Bioimage Classification with Handcrafted and Learned Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 16:874-885. [PMID: 29994096 DOI: 10.1109/tcbb.2018.2821127] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 06/08/2023]
Abstract
Bioimage classification is becoming increasingly important in many biological studies, including those that require accurate cell phenotype recognition, subcellular localization, and histopathological classification. In this paper, we present a new General Purpose (GenP) bioimage classification method that can be applied to a large range of classification problems. The GenP system we propose is an ensemble that combines multiple texture features (both handcrafted and learned descriptors) for superior and generalizable discriminative power. Our ensemble boosts performance by combining local features, dense sampling features, and deep learning features. Each descriptor is used to train a separate Support Vector Machine, and the resulting classifiers are combined by sum rule. We evaluate our method on a diverse set of bioimage classification tasks, each represented by a benchmark database, including some of those available in the IICBU 2008 database. Each bioimage classification task represents a typical subcellular, cellular, or tissue level classification problem. Our evaluation on these datasets demonstrates that the proposed GenP bioimage ensemble obtains state-of-the-art performance without any ad-hoc dataset tuning of the parameters (thereby avoiding any risk of overfitting/overtraining). To reproduce the experiments reported in this paper, the MATLAB code of all the descriptors is available at https://github.com/LorisNanni and https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.
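The sum rule mentioned in the abstract can be illustrated independently of the authors' MATLAB code. In the minimal sketch below, each classifier is assumed to output one score per class for every sample, with scores already normalized to a comparable scale; the fused decision is the class with the largest summed score. The function name and the toy scores are hypothetical.

```python
def sum_rule(score_sets):
    """Fuse classifiers by summing their per-class scores for each sample.

    score_sets: one entry per classifier; each entry is a list with one
    list of per-class scores per sample. Returns the predicted class index
    for every sample.
    """
    n_samples = len(score_sets[0])
    n_classes = len(score_sets[0][0])
    predictions = []
    for i in range(n_samples):
        # Add up the scores that every classifier gives to each class.
        fused = [sum(scores[i][c] for scores in score_sets) for c in range(n_classes)]
        # Pick the class with the highest fused score.
        predictions.append(max(range(n_classes), key=fused.__getitem__))
    return predictions


# Two hypothetical classifiers, two samples, three classes.
classifier_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
classifier_b = [[0.1, 0.2, 0.7], [0.3, 0.6, 0.1]]
print(sum_rule([classifier_a, classifier_b]))
```

Because the scores are simply added, a classifier whose outputs span a wider numeric range would dominate the fusion, which is why normalization before summing is assumed here.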
|
46
|
Shankar A, Fernandes JL, Kaur K, Sharma M, Kundu S, Pandey GK. Rice phytoglobins regulate responses under low mineral nutrients and abiotic stresses in Arabidopsis thaliana. Plant, Cell & Environment 2018; 41:215-230. [PMID: 29044557 DOI: 10.1111/pce.13081] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/22/2016] [Revised: 09/25/2017] [Accepted: 09/27/2017] [Indexed: 06/07/2023]
Abstract
Just like animals, plants also contain haemoglobins (known as phytoglobins in plants). Plant phytoglobins (Pgbs) have been categorized into 6 different classes, namely, Phytogb0 (Pgb0), Phytogb1 (Pgb1), Phytogb2 (Pgb2), SymPhytogb (sPgb), Leghaemoglobin (Lb), and Phytogb3 (Pgb3). Among the 6 Phytogbs, sPgb and Lb have been functionally characterized, whereas understanding of the roles of other Pgbs is still evolving. In our present study, we have explored the function of 2 rice Pgbs (OsPgb1.1 and OsPgb1.2). OsPgb1.1, OsPgb1.2, OsPgb1.3, and OsPgb1.4 displayed increased transcript levels upon salt, drought, cold, and ABA treatment. The overexpression (OX) lines of OsPgb1.2 in Arabidopsis showed a tolerant phenotype in terms of better root growth in low potassium (K+) conditions. The expression of the known K+ gene markers such as LOX2, HAK5, and CAX3 was much higher in the OsPgb1.2 OX lines as compared to wild type. Furthermore, the OsPgb1.2 OX lines showed a decrease in reactive oxygen species (ROS) production and conversely an increase in the K+ content, both in root and shoot, as compared to wild type in K+-limiting conditions. Our results indicated the potential involvement of OsPgb1.2 in signalling networks triggered by nutrient deficiency stresses.
Affiliation(s)
- Alka Shankar
- Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| | - Joel Lars Fernandes
- Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| | - Kanwaljeet Kaur
- Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| | - Manisha Sharma
- Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| | - Suman Kundu
- Department of Biochemistry, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| | - Girdhar K Pandey
- Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi, 110021, India
| |
|
47
|
pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 2018; 110:50-58. [DOI: 10.1016/j.ygeno.2017.08.005] [Citation(s) in RCA: 180] [Impact Index Per Article: 30.0] [Received: 07/06/2017] [Revised: 08/10/2017] [Accepted: 08/11/2017] [Indexed: 11/22/2022]
|
48
|
Wan S, Duan Y, Zou Q. HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source. Proteomics 2017; 17. [PMID: 28776938 DOI: 10.1002/pmic.201700262] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 06/29/2017] [Revised: 07/19/2017] [Indexed: 11/11/2022]
Abstract
Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, the multiple locations of particular proteins imply vital and unique biological significance that deserves special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary, but never employed. To address these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied for multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred.
Affiliation(s)
- Shixiang Wan
- School of Computer Science and Technology, Tianjin University, Tianjin, P. R. China
| | - Yucong Duan
- State Key Laboratory of Marine Resource Utilization in the South China Sea, College of Information and Technology, Hainan University, Haikou, Hainan, P. R. China
| | - Quan Zou
- School of Computer Science and Technology, Tianjin University, Tianjin, P. R. China
| |
|
49
|
Cheng X, Xiao X, Chou KC. pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics 2017; 34:1448-1456. [DOI: 10.1093/bioinformatics/btx711] [Citation(s) in RCA: 127] [Impact Index Per Article: 18.1] [Received: 07/28/2017] [Accepted: 10/31/2017] [Indexed: 01/19/2023] Open
Affiliation(s)
- Xiang Cheng
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
| | - Xuan Xiao
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
| | - Kuo-Chen Chou
- Computer Science, Jingdezhen Ceramic Institute, Jingdezhen, China
- Computational Biology, Gordon Life Science Institute, Boston, MA, USA
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
|
50
|
Cheng X, Xiao X, Chou KC. pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC. Genomics 2017; 110:S0888-7543(17)30102-7. [PMID: 28989035 DOI: 10.1016/j.ygeno.2017.10.002] [Citation(s) in RCA: 92] [Impact Index Per Article: 13.1] [Received: 09/12/2017] [Revised: 09/28/2017] [Accepted: 10/04/2017] [Indexed: 01/21/2023]
Abstract
Information on proteins' subcellular localization is crucially important for revealing their biological functions in a cell, the basic unit of life. With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable to develop computational tools for the timely identification of their subcellular locations based on the sequence information alone. The current study is focused on the Gram-negative bacterial proteins. Although considerable efforts have been made in protein subcellular prediction, the problem is far from solved. This is because mounting evidence has indicated that many Gram-negative bacterial proteins exist in two or more location sites. Unfortunately, most existing methods can be used to deal with single-location proteins only. Actually, proteins with multiple locations may have some special biological functions important for both basic research and drug design. In this study, by using the multi-label theory, we developed a new predictor called "pLoc-mGneg" for predicting the subcellular localization of Gram-negative bacterial proteins with both single and multiple locations. Rigorous cross-validation on a high quality benchmark dataset indicated that the proposed predictor is remarkably superior to "iLoc-Gneg", the state-of-the-art predictor for the same purpose. For the convenience of most experimental scientists, a user-friendly web-server for the novel predictor has been established at http://www.jci-bioinfo.cn/pLoc-mGneg/, by which users can easily get their desired results without the need to go through the complicated mathematics involved.
Affiliation(s)
- Xiang Cheng
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Xuan Xiao
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Kuo-Chen Chou
- The Gordon Life Science Institute, Boston, MA 02478, USA; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah, Saudi Arabia.
| |
|