Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 18 (from Reference Citation Analysis)

Article PDFs: 8

Cited by > 0: 13

Searched Name: position weight matrix

Rudenko V, Korotkov E. Study of Dispersed Repeats in the Cyanidioschyzon merolae Genome. Int J Mol Sci 2024;25:4441. [PMID: 38674025 PMCID: PMC11050394 DOI: 10.3390/ijms25084441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/08/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024]

Lavezzo GM, Lauretto MDS, Andrioli LPM, Machado-Lima A. Position Weight Matrix or Acyclic Probabilistic Finite Automaton: Which model to use? A decision rule inferred for the prediction of transcription factor binding sites. Genet Mol Biol 2024;46:e20230048. [PMID: 38285430 PMCID: PMC10945726 DOI: 10.1590/1678-4685-gmb-2023-0048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 10/18/2023] [Indexed: 01/30/2024]

Ali S, Bello B, Chourasia P, Punathil RT, Zhou Y, Patterson M. PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences. Biology (Basel) 2022;11:418. [PMID: 35336792 DOI: 10.3390/biology11030418] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/04/2022] [Revised: 02/24/2022] [Accepted: 03/07/2022] [Indexed: 01/14/2023]

Abstract

Simple Summary

The family of coronaviruses comprises a diverse set of strains and variants which cause diseases from the common cold to COVID-19. Moreover, they infect a wide array of hosts, from bats, camels, and birds to humans. Studying coronaviruses through the lens of host specificity provides a unique perspective for understanding the evolution, diversity and dynamics of this family. In particular, this can reveal groups of different hosts infected by similar strains, giving clues on strains which were more likely to have evolved to jump from one host to another. In this work, we frame host specificity as a classification task by designing a very compact numerical representation of the spike sequences of different coronaviruses. Based on this numerical representation, classification methods are able to detect the target host with high accuracy. Such an approach can be used to efficiently scale to large volumes of sequences, in order to unveil trends in the host specificity of different coronavirus strains.

Abstract

The study of host specificity has important connections to the question about the origin of SARS-CoV-2 in humans which led to the COVID-19 pandemic, an important open question. There are speculations that bats are a possible origin. Likewise, there are many closely related (corona)viruses, such as SARS, which was found to be transmitted through civets. The study of the different hosts which can be potential carriers and transmitters of deadly viruses to humans is crucial to understanding, mitigating, and preventing current and future pandemics. In coronaviruses, the surface (S) protein, or spike protein, is important in determining host specificity, since it is the point of contact between the virus and the host cell membrane. In this paper, we classify the hosts of over five thousand coronaviruses from their spike protein sequences, segregating them into clusters of distinct hosts among birds, bats, camels, swine, humans, and weasels, to name a few. We propose a feature embedding based on the well-known position weight matrix (PWM), which we call PWM2Vec, and we use it to generate feature vectors from the spike protein sequences of these coronaviruses. While our embedding is inspired by the success of PWMs in biological applications, such as determining protein function and identifying transcription factor binding sites, we are the first (to the best of our knowledge) to use PWMs from viral sequences to generate fixed-length feature vector representations, and use them in the context of host classification. The results on real-world data show that when using PWM2Vec, machine learning classifiers are able to perform comparably to the baseline models in terms of predictive performance and runtime; in some cases, the performance is better. We also measure the importance of different amino acids using information gain to show the amino acids which are important for predicting the host of a given coronavirus. Finally, we perform some statistical analyses on these results to show that our embedding is more compact than the embeddings of the baseline models.
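
For readers unfamiliar with the position weight matrix that this abstract builds on, the following is a minimal, generic sketch of a log-odds PWM and of sliding it along a sequence to obtain one score per k-mer. It is not the PWM2Vec procedure itself; the function names (`build_pwm`, `score`, `kmer_scores`), the DNA alphabet, the uniform background, and the pseudocount of 1 are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_pwm(sequences, alphabet="ACGT", pseudocount=1.0):
    """Build a log-odds PWM from equal-length sequences.

    Assumes a uniform background distribution and additive smoothing;
    both are illustrative choices, not taken from the cited paper.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(alphabet)
    total = len(sequences) + pseudocount * len(alphabet)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sequences)
        # One column per position: symbol -> log2(observed / background).
        column = {
            sym: math.log2(((counts[sym] + pseudocount) / total) / background)
            for sym in alphabet
        }
        pwm.append(column)
    return pwm


def score(pwm, kmer):
    """Sum the per-position weights of a k-mer (symbols must be in the alphabet)."""
    return sum(column[sym] for column, sym in zip(pwm, kmer))


def kmer_scores(pwm, sequence):
    """Score every window of width len(pwm) along a longer sequence."""
    k = len(pwm)
    return [score(pwm, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


sites = ["ACGT", "ACGA", "ACGT", "TCGT"]   # toy aligned sites
pwm = build_pwm(sites)
window_scores = kmer_scores(pwm, "AACGTT")
best_window = window_scores.index(max(window_scores))
```

For sequences of equal length, the list returned by `kmer_scores` has a fixed size, which is the general idea behind turning PWM scores into a fixed-length feature vector.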

Xia X. Post-Alignment Adjustment and Its Automation. Genes (Basel) 2021;12:1809. [PMID: 34828415 PMCID: PMC8623120 DOI: 10.3390/genes12111809] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/28/2021] [Revised: 11/13/2021] [Accepted: 11/16/2021] [Indexed: 11/16/2022]

Jin Y, Jiang J, Wang R, Qin ZS. Systematic Evaluation of DNA Sequence Variations on in vivo Transcription Factor Binding Affinity. Front Genet 2021;12:667866. [PMID: 34567058 PMCID: PMC8458901 DOI: 10.3389/fgene.2021.667866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/15/2021] [Accepted: 08/02/2021] [Indexed: 02/01/2023]

Yu CP, Kuo CH, Nelson CW, Chen CA, Soh ZT, Lin JJ, Hsiao RX, Chang CY, Li WH. Discovering unknown human and mouse transcription factor binding sites and their characteristics from ChIP-seq data. Proc Natl Acad Sci U S A 2021;118:e2026754118. [PMID: 33975951 DOI: 10.1073/pnas.2026754118] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]

Cui Y, Xu Z, Li J. [Identification of nucleosome positioning using support vector machine method based on comprehensive DNA sequence feature]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2020;37:496-501. [PMID: 32597092 PMCID: PMC10319573 DOI: 10.7507/1001-5515.201911064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2019] [Indexed: 11/03/2022]

Hu X, Feng Z, Zhang X, Liu L, Wang S. The Identification of Metal Ion Ligand-Binding Residues by Adding the Reclassified Relative Solvent Accessibility. Front Genet 2020;11:214. [PMID: 32265982 PMCID: PMC7096583 DOI: 10.3389/fgene.2020.00214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/09/2019] [Accepted: 02/24/2020] [Indexed: 11/13/2022]

Cui Y, Xu Z, Li J. ZCMM: A Novel Method Using Z-Curve Theory-Based and Position Weight Matrix for Predicting Nucleosome Positioning. Genes (Basel) 2019;10:E765. [PMID: 31569414 DOI: 10.3390/genes10100765] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/17/2019] [Revised: 09/25/2019] [Accepted: 09/26/2019] [Indexed: 02/04/2023]

Abstract

Nucleosomes are the basic structural units of eukaryotic chromatin. The accurate positioning of nucleosomes plays a significant role in understanding many biological processes such as transcriptional regulation mechanisms and DNA replication and repair. Here, we describe the development of a novel method, termed ZCMM, based on Z-curve theory and position weight matrix (PWM). The ZCMM was trained and tested using the nucleosomal and linker sequences determined by support vector machine (SVM) in Saccharomyces cerevisiae (S. cerevisiae), and experimental results showed that the sensitivity (Sn), specificity (Sp), accuracy (Acc), and Matthews correlation coefficient (MCC) values for ZCMM were 91.40%, 96.56%, 96.75%, and 0.88, respectively, and the average area under the receiver operating characteristic curve (AUC) value was 0.972. A ZCMM predictor was developed to predict nucleosome positioning in Homo sapiens (H. sapiens), Caenorhabditis elegans (C. elegans), and Drosophila melanogaster (D. melanogaster) genomes, and the accuracy (Acc) values were 77.72%, 85.34%, and 93.62%, respectively. The maximum AUC values of the four species were 0.982, 0.861, 0.912 and 0.911, respectively. Another independent dataset for S. cerevisiae was used to predict nucleosome positioning. Compared with the results of Wu's method, it was found that the Sn, Sp, Acc, and MCC of ZCMM results for S. cerevisiae were all higher, reaching 96.72%, 96.54%, 94.10%, and 0.88. Compared with Guo's method 'iNuc-PseKNC', the results of ZCMM for D. melanogaster were better. Meanwhile, the ZCMM was compared with some experimental data in vitro and in vivo for S. cerevisiae, and the results showed that the nucleosomes predicted by ZCMM were highly consistent with those confirmed by these experiments. Therefore, it was further confirmed that the ZCMM method has good accuracy and reliability in predicting nucleosome positioning.
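
The four evaluation measures named in this abstract (Sn, Sp, Acc, MCC) have standard definitions in terms of confusion-matrix counts. The sketch below computes them from hypothetical counts; the function name `classification_metrics` and the example numbers are assumptions for illustration and are not data from the cited study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return Sn, Sp, Acc and MCC from binary confusion-matrix counts."""
    sn = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0          # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)              # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts: 100 nucleosomal (positive) and 100 linker (negative) sequences.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

Because accuracy is a weighted average of sensitivity and specificity (weighted by the numbers of positives and negatives), it always lies between the two, which can serve as a quick sanity check when reading reported figures.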

Townley RA, Bülow HE. Deciphering functional glycosaminoglycan motifs in development. Curr Opin Struct Biol 2018;50:144-154. [PMID: 29579579 PMCID: PMC6078790 DOI: 10.1016/j.sbi.2018.03.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 10/25/2017] [Revised: 03/07/2018] [Accepted: 03/08/2018] [Indexed: 01/12/2023]

Javed M, Solanki M, Sinha A, Shukla LI. Position Based Nucleotide Analysis of miR168 Family in Higher Plants and its Targets in Mammalian Transcripts. Microrna 2018;6:136-142. [PMID: 28215140 DOI: 10.2174/2211536606666170215154151] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/18/2016] [Revised: 01/20/2017] [Accepted: 02/10/2017] [Indexed: 11/22/2022]

Zhang N, Rao RSP, Salvato F, Havelund JF, Møller IM, Thelen JJ, Xu D. MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized Proteins in Plants. Front Plant Sci 2018;9:634. [PMID: 29875778 PMCID: PMC5974146 DOI: 10.3389/fpls.2018.00634] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/19/2018] [Accepted: 04/23/2018] [Indexed: 05/19/2023]

Dresch JM, Zellers RG, Bork DK, Drewell RA. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome. Gene Regul Syst Bio 2016;10:21-33. [PMID: 27330274 PMCID: PMC4907338 DOI: 10.4137/grsb.s38462] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/05/2016] [Revised: 04/17/2016] [Accepted: 04/28/2016] [Indexed: 01/14/2023]

Abstract

A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.
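
One simple way to see what "interdependency between nucleotides" means is to measure the mutual information between two columns of aligned binding sites: a plain PWM treats columns as independent, so a clearly nonzero value hints at a dependency it cannot capture. The sketch below is an illustrative assumption of ours and is not the MARZ algorithm; the function name `mutual_information` and the toy sites are hypothetical.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites.

    Uses raw empirical frequencies with no small-sample correction,
    so estimates from few sites are biased upward.
    """
    n = len(sites)
    col_i = Counter(s[i] for s in sites)
    col_j = Counter(s[j] for s in sites)
    pairs = Counter((s[i], s[j]) for s in sites)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy sites: column 0 and column 2 always change together (nonadjacent dependency),
# while column 1 is constant and therefore carries no shared information.
toy_sites = ["AAT", "CAG", "AAT", "CAG"]
mi_dependent = mutual_information(toy_sites, 0, 2)
mi_independent = mutual_information(toy_sites, 0, 1)
```

In this toy example the first pair of columns shares one full bit of information and the second pair shares none.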

Zemlyanskaya EV, Levitsky VG, Oshchepkov DY, Grosse I, Mironova VV. The Interplay of Chromatin Landscape and DNA-Binding Context Suggests Distinct Modes of EIN3 Regulation in Arabidopsis thaliana. Front Plant Sci 2016;7:2044. [PMID: 28119721 PMCID: PMC5220190 DOI: 10.3389/fpls.2016.02044] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/20/2016] [Accepted: 12/21/2016] [Indexed: 05/08/2023]

Nettling M, Treutler H, Grau J, Keilwagen J, Posch S, Grosse I. DiffLogo: a comparative visualization of sequence motifs. BMC Bioinformatics 2015;16:387. [PMID: 26577052 PMCID: PMC4650857 DOI: 10.1186/s12859-015-0767-x] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 04/10/2015] [Accepted: 10/08/2015] [Indexed: 11/10/2022]

Bragin EY, Shtratnikova VY, Dovbnya DV, Schelkunov MI, Pekov YA, Malakho SG, Egorova OV, Ivashina TV, Sokolov SL, Ashapkin VV, Donova MV. Comparative analysis of genes encoding key steroid core oxidation enzymes in fast-growing Mycobacterium spp. strains. J Steroid Biochem Mol Biol 2013;138:41-53. [PMID: 23474435 DOI: 10.1016/j.jsbmb.2013.02.016] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 10/24/2012] [Revised: 01/28/2013] [Accepted: 02/24/2013] [Indexed: 11/27/2022]

Nandi S, Ioshikhes I. Optimizing the GATA-3 position weight matrix to improve the identification of novel binding sites. BMC Genomics 2012;13:416. [PMID: 22913572 PMCID: PMC3481455 DOI: 10.1186/1471-2164-13-416] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/08/2011] [Accepted: 08/02/2012] [Indexed: 11/21/2022]

Sinha S, Ling X, Whitfield CW, Zhai C, Robinson GE. Genome scan for cis-regulatory DNA motifs associated with social behavior in honey bees. Proc Natl Acad Sci U S A 2006;103:16352-7. [PMID: 17065326 PMCID: PMC1637586 DOI: 10.1073/pnas.0607448103] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 12/23/2022]