Reference Citation Analysis

Cited by Other Article(s)

Stafford IS, Ashton JJ, Mossotto E, Cheng G, Mark Beattie R, Ennis S. Supervised Machine Learning Classifies Inflammatory Bowel Disease Patients by Subtype Using Whole Exome Sequencing Data. J Crohns Colitis 2023;17:1672-1680. [PMID: 37205778 PMCID: PMC10637043 DOI: 10.1093/ecco-jcc/jjad084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Indexed: 05/21/2023]

Teichman G, Cohen D, Ganon O, Dunsky N, Shani S, Gingold H, Rechavi O. RNAlysis: analyze your RNA sequencing data without writing a single line of code. BMC Biol 2023;21:74. [PMID: 37024838 PMCID: PMC10080885 DOI: 10.1186/s12915-023-01574-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 03/17/2023] [Indexed: 04/08/2023] Open

Robin V, Bodein A, Scott-Boyer MP, Leclercq M, Périn O, Droit A. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. Front Mol Biosci 2022;9:962799. [PMID: 36158572 PMCID: PMC9494275 DOI: 10.3389/fmolb.2022.962799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 08/16/2022] [Indexed: 11/26/2022] Open

Bar N, Nikparvar B, Jayavelu ND, Roessler FK. Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions. BMC Bioinformatics 2022;23:330. [PMID: 35945515 PMCID: PMC9364503 DOI: 10.1186/s12859-022-04839-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Accepted: 07/12/2022] [Indexed: 01/15/2023] Open

Abstract

BACKGROUND

Biological data suffers from noise that is inherent in the measurements. This is particularly true for time-series gene expression measurements. Nevertheless, to explore cellular dynamics, scientists employ such noisy measurements in predictive and clustering tools. However, noisy data can not only obscure the genes' temporal patterns, but applying predictive and clustering tools to noisy data may also yield inconsistent, and potentially incorrect, results.

RESULTS

To reduce the noise of short-term (< 48 h) time-series expression data, we relied on the three basic temporal patterns of gene expression: waves, impulses and sustained responses. We constrained the estimation of the true signals to these patterns by estimating the parameters of first- and second-order Fourier functions and using the nonlinear least-squares trust-region optimization technique. Our approach lowered the noise in at least 85% of synthetic time-series expression data, significantly more than the spline method ([Formula: see text]). When the data contained a higher signal-to-noise ratio, our method allowed downstream network component analyses to calculate consistent and accurate predictions, particularly when the noise variance was high. Conversely, these tools led to erroneous results from untreated noisy data. Our results suggest that at least 5-7 time points are required to efficiently de-noise logarithmically scaled time-series expression data. Investing in sampling additional time points provides little benefit to clustering and prediction accuracy.

CONCLUSIONS

Our constrained Fourier de-noising method helps to cluster noisy gene expression and interpret dynamic gene networks more accurately. The benefit of noise reduction is large and can constitute the difference between a successful application and a failing one.
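
The fitting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes trust-region nonlinear least squares, whereas this sketch swaps in a simpler scheme, a grid search over candidate periods with ordinary linear least squares for the Fourier coefficients at each period. The function names (`solve_linear`, `design_row`, `fit_fourier`), the time points, the candidate periods and the noise values are all assumptions chosen for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Returns None when the system is numerically singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-10:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design_row(t, omega, order):
    """Basis values [1, cos(wt), sin(wt), cos(2wt), sin(2wt), ...] at time t."""
    row = [1.0]
    for k in range(1, order + 1):
        row += [math.cos(k * omega * t), math.sin(k * omega * t)]
    return row


def fit_fourier(times, values, periods, order=1):
    """Return (sse, period, coeffs, fitted) for the best candidate period.

    Hypothetical helper: for each period the coefficients come from the
    normal equations; the period with the lowest squared error wins.
    """
    best = None
    for period in periods:
        omega = 2 * math.pi / period
        rows = [design_row(t, omega, order) for t in times]
        p = len(rows[0])
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
        xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
        coeffs = solve_linear(xtx, xty)
        if coeffs is None:  # basis columns collinear at these time points
            continue
        fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
        sse = sum((y - f) ** 2 for y, f in zip(values, fitted))
        if best is None or sse < best[0]:
            best = (sse, period, coeffs, fitted)
    return best


# Seven made-up time points (hours) and a first-order wave with additive noise.
times = [0, 2, 4, 8, 12, 18, 24]
noise = [0.3, -0.2, 0.25, -0.3, 0.2, -0.25, 0.1]
noisy = [1 + 2 * math.cos(2 * math.pi * t / 24) - 0.5 * math.sin(2 * math.pi * t / 24) + e
         for t, e in zip(times, noise)]
sse, period, coeffs, denoised = fit_fourier(times, noisy, [12, 16, 24, 32, 48])
```

Here `denoised` holds the fitted values, which stand in for the de-noised signal. Replacing the grid search with a trust-region solver would bring the sketch closer to what the abstract describes.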

Koyuncu E. Centroidal Clustering of Noisy Observations by Using rth Power Distortion Measures. IEEE Transactions on Neural Networks and Learning Systems 2022;PP:1430-1438. [PMID: 35731771 DOI: 10.1109/tnnls.2022.3183294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

DTIP-TC2A: An analytical framework for drug-target interactions prediction methods. Comput Biol Chem 2022;99:107707. [DOI: 10.1016/j.compbiolchem.2022.107707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 05/01/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]

Sadeghi SS, Keyvanpour MR. Computational Drug Repurposing: Classification of the Research Opportunities and Challenges. Curr Comput Aided Drug Des 2021;16:354-364. [PMID: 31198115 DOI: 10.2174/1573409915666190613113822] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/21/2018] [Revised: 02/13/2019] [Accepted: 05/18/2019] [Indexed: 11/22/2022]

Peterson EJR, Abidi AA, Arrieta-Ortiz ML, Aguilar B, Yurkovich JT, Kaur A, Pan M, Srinivas V, Shmulevich I, Baliga NS. Intricate Genetic Programs Controlling Dormancy in Mycobacterium tuberculosis. Cell Rep 2021;31:107577. [PMID: 32348771 PMCID: PMC7605849 DOI: 10.1016/j.celrep.2020.107577] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/14/2019] [Revised: 12/18/2019] [Accepted: 04/06/2020] [Indexed: 11/24/2022] Open

Stacey RG, Skinnider MA, Foster LJ. On the Robustness of Graph-Based Clustering to Random Network Alterations. Mol Cell Proteomics 2020;20:100002. [PMID: 33592499 PMCID: PMC7896145 DOI: 10.1074/mcp.ra120.002275] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/04/2020] [Revised: 10/30/2020] [Accepted: 11/04/2020] [Indexed: 11/23/2022] Open

Abstract

Biological functions emerge from complex and dynamic networks of protein-protein interactions. Because these protein-protein interaction networks, or interactomes, represent pairwise connections within a hierarchically organized system, it is often useful to identify higher-order associations embedded within them, such as multimember protein complexes. Graph-based clustering techniques are widely used to accomplish this goal, and dozens of field-specific and general clustering algorithms exist. However, interactomes can be prone to errors, especially when inferred from high-throughput biochemical assays. Therefore, robustness to network-level noise is an important criterion. Here, we tested the robustness of a range of graph-based clustering algorithms in the presence of noise, including algorithms common across domains and those specific to protein networks. Strikingly, we found that all of the clustering algorithms tested here markedly amplified network-level noise. Randomly rewiring only 1% of network edges yielded more than a 50% change in clustering results. Moreover, we found the impact of network noise on individual clusters was not uniform: some clusters were consistently robust to injected noise, whereas others were not. Therefore we developed the clust.perturb R package and Shiny web application to measure the reproducibility of clusters by randomly perturbing the network. We show that clust.perturb results are predictive of real-world cluster stability: poorly reproducible clusters as identified by clust.perturb are significantly less likely to be reclustered across experiments. We conclude that graph-based clustering amplifies noise in protein interaction networks, but quantifying the robustness of a cluster to network noise can separate stable protein complexes from spurious associations.
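
The perturbation idea in this abstract (rewire a small fraction of edges at random, re-cluster, and score how well each original cluster is recovered) can be sketched as follows. This does not reproduce the clust.perturb R package or its API. The function names are hypothetical, connected components serve as a deliberately trivial stand-in for a real graph clustering algorithm, and the best-match Jaccard index is one reasonable reproducibility score assumed here for illustration.

```python
import random


def connected_components(nodes, edges):
    """Stand-in clustering: every connected component is one cluster."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters


def rewire(nodes, edges, fraction, rng):
    """Swap a fraction of edges for random node pairs that were not edges.

    Assumes the graph is not complete, otherwise no replacement pair exists.
    """
    original = {frozenset(e) for e in edges}
    n_swap = round(fraction * len(original))
    removed = rng.sample(sorted(original, key=sorted), n_swap)
    kept = original - set(removed)
    node_list = sorted(nodes)
    while len(kept) < len(original):
        u, v = rng.sample(node_list, 2)
        if frozenset((u, v)) not in original:  # each swap must be a real change
            kept.add(frozenset((u, v)))
    return [tuple(sorted(e)) for e in kept]


def jaccard(a, b):
    """Jaccard index of two non-empty node sets."""
    return len(a & b) / len(a | b)


def reproducibility(nodes, edges, fraction=0.1, n_iter=50, seed=0):
    """Mean best-match Jaccard of each original cluster over perturbed networks."""
    rng = random.Random(seed)
    original = connected_components(nodes, edges)
    scores = {c: 0.0 for c in original}
    for _ in range(n_iter):
        perturbed = connected_components(nodes, rewire(nodes, edges, fraction, rng))
        for c in original:
            scores[c] += max(jaccard(c, p) for p in perturbed) / n_iter
    return scores


# Toy network: two disjoint four-node cliques, with 25% of edges rewired per round.
nodes = list(range(8))
edges = [(i, j) for g in (range(0, 4), range(4, 8)) for i in g for j in g if i < j]
scores = reproducibility(nodes, edges, fraction=0.25, n_iter=20, seed=1)
```

A score near 1 means the cluster reappears almost unchanged after rewiring, while a low score flags a cluster that depends on a few specific edges.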

Wong YKE, Lam KW, Ho KY, Yu CSA, Cho CSW, Tsang HF, Chu MKM, Ng PWL, Tai CSW, Chan LWC, Wong EYL, Wong SCC. The applications of big data in molecular diagnostics. Expert Rev Mol Diagn 2019;19:905-917. [PMID: 31422710 DOI: 10.1080/14737159.2019.1657834] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022]

Clustering data with the presence of attribute noise: a study of noise completely at random and ensemble of multiple k-means clusterings. Int J Mach Learn Cyb 2019. [DOI: 10.1007/s13042-019-00989-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/25/2022]

de Ridder M, Klein K, Kim J. A review and outlook on visual analytics for uncertainties in functional magnetic resonance imaging. Brain Inform 2018;5:5. [PMID: 29968092 PMCID: PMC6170942 DOI: 10.1186/s40708-018-0083-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/05/2018] [Accepted: 06/18/2018] [Indexed: 11/10/2022] Open

Ronan T, Qi Z, Naegle KM. Avoiding common pitfalls when clustering biological data. Sci Signal 2016;9:re6. [PMID: 27303057 DOI: 10.1126/scisignal.aad1932] [Citation(s) in RCA: 86] [Impact Index Per Article: 10.8] [Indexed: 12/16/2022]

Reeb PD, Bramardi SJ, Steibel JP. Assessing Dissimilarity Measures for Sample-Based Hierarchical Clustering of RNA Sequencing Data Using Plasmode Datasets. PLoS One 2015;10:e0132310. [PMID: 26162080 PMCID: PMC4498680 DOI: 10.1371/journal.pone.0132310] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/09/2015] [Accepted: 06/11/2015] [Indexed: 01/03/2023] Open

Abstract

Sample- and gene-based hierarchical cluster analyses have been widely adopted as tools for exploring gene expression data in high-throughput experiments. Gene expression values (read counts) generated by RNA sequencing technology (RNA-seq) are discrete variables with special statistical properties, such as over-dispersion and right-skewness. Additionally, read counts are subject to technology artifacts such as differences in sequencing depth. This poses a challenge to finding distance measures suitable for hierarchical clustering. Normalization and transformation procedures have been proposed to favor the use of Euclidean and correlation-based distances. Additionally, novel model-based dissimilarities that account for RNA-seq data characteristics have also been proposed. Adequacy of dissimilarity measures has been assessed using parametric simulations or exemplar datasets that may limit the scope of the conclusions. Here, we propose the simulation of realistic conditions through creation of plasmode datasets, to assess the adequacy of dissimilarity measures for sample-based hierarchical clustering of RNA-seq data. Consistent results were obtained using plasmode datasets based on RNA-seq experiments conducted under widely different conditions. Dissimilarity measures based on Euclidean distance that only considered data normalization or data standardization did not reliably represent the expected hierarchical structure. Conversely, using either a Poisson-based dissimilarity, a rank-correlation-based dissimilarity, or an appropriate data transformation resulted in dendrograms that resemble the expected hierarchical structure. Plasmode datasets can be generated for a wide range of scenarios upon which dissimilarity measures can be evaluated for sample-based hierarchical clustering analysis. We showed different ways of generating such plasmodes and applied them to the problem of selecting a suitable dissimilarity measure. We report several measures that are satisfactory, and the choice of a particular measure may depend on its availability in the preferred software pipeline.
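
One of the measures named in this abstract, a rank-correlation-based dissimilarity, can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: it assumes the common definition of one minus the Spearman correlation, and the function names and read counts are made up.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_dissimilarity(x, y):
    """1 - Spearman correlation: 0 for identical rankings, 2 for reversed ones."""
    return 1.0 - pearson(average_ranks(x), average_ranks(y))


def euclidean(x, y):
    """Plain Euclidean distance on the raw values, for comparison."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


# Two made-up samples with the same expression profile, where the second
# is sequenced ten times deeper than the first.
sample_a = [0, 5, 120, 33, 7]
sample_b = [10 * c for c in sample_a]

rank_based = spearman_dissimilarity(sample_a, sample_b)
raw_distance = euclidean(sample_a, sample_b)
```

Because ranks ignore scale, `rank_based` treats the two samples as identical, whereas `raw_distance` on unnormalized counts is dominated by the difference in sequencing depth. The resulting pairwise values could then be passed to any hierarchical clustering routine that accepts a precomputed dissimilarity matrix.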

Dinov ID, Petrosyan P, Liu Z, Eggert P, Hobel S, Vespa P, Woo Moon S, Van Horn JD, Franco J, Toga AW. High-throughput neuroimaging-genetics computational infrastructure. Front Neuroinform 2014;8:41. [PMID: 24795619 PMCID: PMC4005931 DOI: 10.3389/fninf.2014.00041] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 01/07/2014] [Accepted: 03/27/2014] [Indexed: 01/01/2023] Open

Abstract

Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Results interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at the University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating, and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure.

Ren X, Wang Y, Zhang XS, Jin Q. iPcc: a novel feature extraction method for accurate disease class discovery and prediction. Nucleic Acids Res 2013;41:e143. [PMID: 23761440 PMCID: PMC3737526 DOI: 10.1093/nar/gkt343] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023] Open

A Noise Removal Algorithm for Time Series Microarray Data. Progress in Artificial Intelligence 2013. [DOI: 10.1007/978-3-642-40669-0_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]