Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 31 (from Reference Citation Analysis)
Article PDFs: 16
Cited by > 0: 25
Searched Name: Sara Ballouz

Phetsouphanh C, Jacka B, Ballouz S, Jackson KJL, Wilson DB, Manandhar B, Klemm V, Tan HX, Wheatley A, Aggarwal A, Akerman A, Milogiannakis V, Starr M, Cunningham P, Turville SG, Kent SJ, Byrne A, Brew BJ, Darley DR, Dore GJ, Kelleher AD, Matthews GV. Improvement of immune dysregulation in individuals with long COVID at 24-months following SARS-CoV-2 infection. Nat Commun 2024;15:3315. [PMID: 38632311 PMCID: PMC11024141 DOI: 10.1038/s41467-024-47720-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 04/11/2024] [Indexed: 04/19/2024] Open

Affiliation(s)

Chansavath Phetsouphanh: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Brendan Jacka: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Sara Ballouz: Garvan Institute for Medical research, Sydney, NSW, Australia; School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, NSW, Australia.
Katherine J L Jackson: Garvan Institute for Medical research, Sydney, NSW, Australia.
Daniel B Wilson: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Bikash Manandhar: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Vera Klemm: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Hyon-Xhi Tan: Department of Microbiology and Immunology, Peter Doherty Institute, University of Melbourne, Victoria, VIC, Australia.
Adam Wheatley: Department of Microbiology and Immunology, Peter Doherty Institute, University of Melbourne, Victoria, VIC, Australia.
Anupriya Aggarwal: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Anouschka Akerman: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Vanessa Milogiannakis: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Mitchell Starr: NSW State Reference Laboratory for HIV, St. Vincent's Centre for Applied Medical Research, Sydney, NSW, Australia.
Phillip Cunningham: NSW State Reference Laboratory for HIV, St. Vincent's Centre for Applied Medical Research, Sydney, NSW, Australia.
Stuart G Turville: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia.
Stephen J Kent: Department of Microbiology and Immunology, Peter Doherty Institute, University of Melbourne, Victoria, VIC, Australia.
Anthony Byrne: Heart Lung Clinic, St. Vincent's Hospital Sydney and Faculty of Medicine and Health (UNSW), Sydney, NSW, Australia.
Bruce J Brew: Peter Duncan Neurosciences Unit, St Vincent's Centre for Applied Medical Research, Sydney, NSW, Australia.
David R Darley: St. Vincent's Hospital, Darlinghurst, NSW, Australia.
Gregory J Dore: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia; St. Vincent's Hospital, Darlinghurst, NSW, Australia.
Anthony D Kelleher: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia; St. Vincent's Hospital, Darlinghurst, NSW, Australia.
Gail V Matthews: The Kirby Institute, University of New South Wales, Sydney, NSW, Australia; St. Vincent's Hospital, Darlinghurst, NSW, Australia.

Ballouz S, Kawaguchi RK, Pena MT, Fischer S, Crow M, French L, Knight FM, Adams LB, Gillis J. The transcriptional legacy of developmental stochasticity. Nat Commun 2023;14:7226. [PMID: 37940702 PMCID: PMC10632366 DOI: 10.1038/s41467-023-43024-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2022] [Accepted: 10/30/2023] [Indexed: 11/10/2023] Open

Stavrou MR, So SS, Finch AM, Ballouz S, Smith NJ. Gene expression analyses of TAS1R taste receptors relevant to the treatment of cardiometabolic disease. Chem Senses 2023;48:bjad027. [PMID: 37539767 DOI: 10.1093/chemse/bjad027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Indexed: 08/05/2023] Open

Werner JM, Ballouz S, Hover J, Gillis J. Variability of cross-tissue X-chromosome inactivation characterizes timing of human embryonic lineage specification events. Dev Cell 2022;57:1995-2008.e5. [PMID: 35914524 PMCID: PMC9398941 DOI: 10.1016/j.devcel.2022.07.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/18/2021] [Revised: 05/11/2022] [Accepted: 07/07/2022] [Indexed: 12/14/2022]

Muzumdar S, Ballouz S, Lam F, Degrange M, Kreuzburg S, Chong H, Zerbe C, Jongco A, Gillis J. A granular view of X-linked chronic granulomatous disease exploiting single-cell transcriptomics. The Journal of Immunology 2022. [DOI: 10.4049/jimmunol.208.supp.159.04] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]

Abstract

X-linked chronic granulomatous disease (X-CGD) is a rare monogenetic immunodeficiency primarily affecting phagocytes. Precipitated by mutations in the CYBB gene, patients exhibit a compromised oxidative burst, leading to recurrent infections which can be life-threatening. Curiously, autoimmune manifestations are also common in patients and carriers. Here, exploiting the cell type-specific nature of this disorder, we characterize X-CGD on a transcriptional level using single-cell sequencing. Peripheral blood from 14 X-CGD probands and 10 carriers signed onto IRB-approved protocol NCT00404560, as well as from 15 controls, was sampled, and PBMCs and isolated monocytes were subjected to single-cell sequencing. Probands exhibited a strong differential expression signal relative to controls. This was composed of not only genes previously described to be up-regulated in X-CGD such as IFI27, and indeed an autoimmunity-associated broader type I interferon response, but also previously undescribed genes involved in monocyte function (ARG1), antimicrobial proteins (CAMP, SLPI), and inflammasome components (AIM2). Surprisingly, expression variability was not greater in carriers relative to probands or controls, indicating a lack of cell autonomous effects from the deletion of CYBB. Interestingly, aggregate expression of differentially expressed genes in the probands was able to classify carriers from sex-matched controls with high accuracy (AUROC = 0.92), indicating the presence of an X-CGD-specific gene signature. This gene signature was also strongly co-expressed across 17 chordate species, pointing towards the disruption of ancestral pathways important in antimicrobial immunity in X-CGD probands and carriers. This work was partially supported by a Swiss National Science Foundation fellowship to S.M.

Kaminow B, Ballouz S, Gillis J, Dobin A. Pan-human consensus genome significantly improves the accuracy of RNA-seq analyses. Genome Res 2022;32:738-749. [PMID: 35256454 PMCID: PMC8997357 DOI: 10.1101/gr.275613.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 03/02/2022] [Indexed: 11/25/2022]

Lee J, Shah M, Ballouz S, Crow M, Gillis J. CoCoCoNet: conserved and comparative co-expression across a diverse set of species. Nucleic Acids Res 2020;48:W566-W571. [PMID: 32392296 PMCID: PMC7319556 DOI: 10.1093/nar/gkaa348] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/12/2020] [Revised: 04/21/2020] [Accepted: 04/24/2020] [Indexed: 12/19/2022] Open

Pang CNI, Ballouz S, Weissberger D, Thibaut LM, Hamey JJ, Gillis J, Wilkins MR, Hart-Smith G. Analytical Guidelines for co-fractionation Mass Spectrometry Obtained through Global Profiling of Gold Standard Saccharomyces cerevisiae Protein Complexes. Mol Cell Proteomics 2020;19:1876-1895. [PMID: 32817346 PMCID: PMC7664123 DOI: 10.1074/mcp.ra120.002154] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/02/2020] [Revised: 07/14/2020] [Indexed: 11/06/2022] Open

Abstract

Co-fractionation MS (CF-MS) is a technique with potential to characterize endogenous and unmanipulated protein complexes on an unprecedented scale. However, this potential has been offset by a lack of guidelines for best-practice CF-MS data collection and analysis. To obtain such guidelines, this study thoroughly evaluates novel and published Saccharomyces cerevisiae CF-MS data sets using very high proteome coverage libraries of yeast gold standard complexes. A new method for identifying gold standard complexes in CF-MS data, Reference Complex Profiling, and the Extending 'Guilt-by-Association' by Degree (EGAD) R package are used for these evaluations, which are verified with concurrent analyses of published human data. By evaluating data collection designs, which involve fractionation of cell lysates, it is found that near-maximum recall of complexes can be achieved with fewer samples than published studies. Distributing sample collection across orthogonal fractionation methods, rather than a single high resolution data set, leads to particularly efficient recall. By evaluating 17 different similarity scoring metrics, which are central to CF-MS data analysis, it is found that two metrics rarely used in past CF-MS studies (Spearman and Kendall correlations) and the recently introduced Co-apex metric frequently maximize recall, whereas a popular metric (Euclidean distance) delivers poor recall. The common practice of integrating external genomic data into CF-MS data analysis is also evaluated, revealing that this practice may improve the precision and recall of known complexes but is generally unsuitable for predicting novel complexes in model organisms. If studying nonmodel organisms using orthologous genomic data, it is found that particular subsets of fractionation profiles (e.g. the lowest abundance quartile) should be excluded to minimize false discovery. These assessments are summarized in a series of universally applicable guidelines for precise, sensitive and efficient CF-MS studies of known complexes, and effective predictions of novel complexes for orthogonal experimental validation.
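
The abstract above compares similarity metrics for pairs of fractionation profiles. As a rough illustration only, the sketch below computes Spearman correlation, Kendall correlation (the tau-a variant, chosen here for simplicity) and Euclidean distance for toy elution profiles. The profile values and function names are hypothetical, and this is not the study's evaluation pipeline; it only shows one way rank-based metrics can disagree with Euclidean distance when two proteins share an elution shape but differ in abundance.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def euclidean(x, y):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical elution profiles across six fractions (illustrative values).
protein_a = [0.0, 1.0, 5.0, 9.0, 4.0, 0.5]
protein_b = [0.0, 10.0, 50.0, 90.0, 40.0, 5.0]  # same shape, 10x abundance
protein_c = [8.0, 5.0, 1.0, 0.2, 0.1, 0.0]      # elutes in earlier fractions

for name, other in (("a vs b", protein_b), ("a vs c", protein_c)):
    print(name,
          round(spearman(protein_a, other), 3),
          round(kendall_tau_a(protein_a, other), 3),
          round(euclidean(protein_a, other), 3))
```

In this toy case the rank-based correlations treat `protein_a` and `protein_b` as perfectly co-eluting, while Euclidean distance places `protein_a` closer to `protein_c`, because it is sensitive to the abundance difference.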

Ballouz S, Mangala MM, Perry MD, Heitmann S, Gillis JA, Hill AP, Vandenberg JI. Co-expression of calcium and hERG potassium channels reduces the incidence of proarrhythmic events. Cardiovasc Res 2020;117:2216-2227. [PMID: 33002116 DOI: 10.1093/cvr/cvaa280] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/16/2020] [Revised: 08/25/2020] [Accepted: 09/17/2020] [Indexed: 01/02/2023] Open

Abstract

AIMS

Cardiac electrical activity is extraordinarily robust. However, when it goes wrong it can have fatal consequences. Electrical activity in the heart is controlled by the carefully orchestrated activity of more than a dozen different ion conductances. While there is considerable variability in cardiac ion channel expression levels between individuals, studies in rodents have indicated that there are modules of ion channels whose expression co-varies. The aim of this study was to investigate whether meta-analytic co-expression analysis of large-scale gene expression datasets could identify modules of co-expressed cardiac ion channel genes in human hearts that are of functional importance.

METHODS AND RESULTS

Meta-analysis of 3653 public human RNA-seq datasets identified a strong correlation between expression of CACNA1C (L-type calcium current, ICaL) and KCNH2 (rapid delayed rectifier K+ current, IKr), which was also observed in human adult heart tissue samples. In silico modelling suggested that co-expression of CACNA1C and KCNH2 would limit the variability in action potential duration seen with variations in expression of ion channel genes and reduce susceptibility to early afterdepolarizations, a surrogate marker for proarrhythmia. We also found that levels of KCNH2 and CACNA1C expression are correlated in human-induced pluripotent stem cell-derived cardiac myocytes and the levels of CACNA1C and KCNH2 expression were inversely correlated with the magnitude of changes in repolarization duration following inhibition of IKr.

CONCLUSION

Meta-analytic approaches of multiple independent human gene expression datasets can be used to identify gene modules that are important for regulating heart function. Specifically, we have verified that there is co-expression of CACNA1C and KCNH2 ion channel genes in human heart tissue, and in silico analyses suggest that CACNA1C-KCNH2 co-expression increases the robustness of cardiac electrical activity.

Ballouz S, Dobin A, Gingeras TR, Gillis J. The fractured landscape of RNA-seq alignment: the default in our STARs. Nucleic Acids Res 2019;46:5125-5138. [PMID: 29718481 PMCID: PMC6007662 DOI: 10.1093/nar/gky325] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/08/2017] [Accepted: 04/16/2018] [Indexed: 12/28/2022] Open

Ballouz S, Dobin A, Gillis JA. Is it time to change the reference genome? Genome Biol 2019;20:159. [PMID: 31399121 PMCID: PMC6688217 DOI: 10.1186/s13059-019-1774-4] [Citation(s) in RCA: 97] [Impact Index Per Article: 19.4] [Indexed: 12/18/2022] Open

Ballouz S, Pavlidis P, Gillis J. Using predictive specificity to determine when gene set analysis is biologically meaningful. Nucleic Acids Res 2018;45:e20. [PMID: 28204549 PMCID: PMC5389513 DOI: 10.1093/nar/gkw957] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/29/2016] [Revised: 10/04/2016] [Accepted: 10/10/2016] [Indexed: 11/14/2022] Open

Crow M, Paul A, Ballouz S, Huang ZJ, Gillis J. Characterizing the replicability of cell types defined by single cell RNA-sequencing data using MetaNeighbor. Nat Commun 2018;9:884. [PMID: 29491377 PMCID: PMC5830442 DOI: 10.1038/s41467-018-03282-0] [Citation(s) in RCA: 142] [Impact Index Per Article: 23.7] [Received: 11/29/2017] [Accepted: 02/02/2018] [Indexed: 12/19/2022] Open

Ballouz S, Weber M, Pavlidis P, Gillis J. EGAD: ultra-fast functional analysis of gene networks. Bioinformatics 2018;33:612-614. [PMID: 27993773 DOI: 10.1093/bioinformatics/btw695] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 04/20/2016] [Accepted: 11/03/2016] [Indexed: 12/25/2022] Open

Ballouz S, Gillis J. Strength of functional signature correlates with effect size in autism. Genome Med 2017;9:64. [PMID: 28687074 PMCID: PMC5501949 DOI: 10.1186/s13073-017-0455-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/20/2017] [Accepted: 06/23/2017] [Indexed: 01/03/2023] Open

Abstract

BACKGROUND

Disagreements over genetic signatures associated with disease have been particularly prominent in the field of psychiatric genetics, creating a sharp divide between disease burdens attributed to common and rare variation, with study designs independently targeting each. Meta-analysis within each of these study designs is routine, whether using raw data or summary statistics, but combining results across study designs is atypical. However, tests of functional convergence are used across all study designs, where candidate gene sets are assessed for overlaps with previously known properties. This suggests one possible avenue for combining not study data, but the functional conclusions that they reach.

METHOD

In this work, we test for functional convergence in autism spectrum disorder (ASD) across different study types, and specifically whether the degree to which a gene is implicated in autism is correlated with the degree to which it drives functional convergence. Because different study designs are distinguishable by their differences in effect size, this also provides a unified means of incorporating the impact of study design into the analysis of convergence.

RESULTS

We detected remarkably significant positive trends in aggregate (p < 2.2e-16) with 14 individually significant properties (false discovery rate <0.01), many in areas researchers have targeted based on different reasoning, such as the fragile X mental retardation protein (FMRP) interactor enrichment (false discovery rate 0.003). We are also able to detect novel technical effects and we see that network enrichment from protein-protein interaction data is heavily confounded with study design, arising readily in control data.

CONCLUSIONS

We see a convergent functional signal for a subset of known and novel functions in ASD from all sources of genetic variation. Meta-analytic approaches explicitly accounting for different study designs can be adapted to other diseases to discover novel functional associations and increase statistical power.

O’Meara MJ, Ballouz S, Shoichet BK, Gillis J. Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction. PLoS One 2016;11:e0160098. [PMID: 27467773 PMCID: PMC4965129 DOI: 10.1371/journal.pone.0160098] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/12/2016] [Accepted: 07/13/2016] [Indexed: 12/13/2022] Open

Crow M, Paul A, Ballouz S, Huang ZJ, Gillis J. Exploiting single-cell expression to characterize co-expression replicability. Genome Biol 2016;17:101. [PMID: 27165153 PMCID: PMC4862082 DOI: 10.1186/s13059-016-0964-6] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Received: 02/29/2016] [Accepted: 04/25/2016] [Indexed: 01/25/2023] Open

Abstract

Background

Co-expression networks have been a useful tool for functional genomics, providing important clues about the cellular and biochemical mechanisms that are active in normal and disease processes. However, co-expression analysis is often treated as a black box with results being hard to trace to their basis in the data. Here, we use both published and novel single-cell RNA sequencing (RNA-seq) data to understand fundamental drivers of gene-gene connectivity and replicability in co-expression networks.

Results

We perform the first major analysis of single-cell co-expression, sampling from 31 individual studies. Using neighbor voting in cross-validation, we find that single-cell network connectivity is less likely to overlap with known functions than co-expression derived from bulk data, with functional variation within cell types strongly resembling that also occurring across cell types. To identify features and analysis practices that contribute to this connectivity, we perform our own single-cell RNA-seq experiment of 126 cortical interneurons in an experimental design targeted to co-expression. By assessing network replicability, semantic similarity and overall functional connectivity, we identify technical factors influencing co-expression and suggest how they can be controlled for. Many of the technical effects we identify are expression-level dependent, making expression level itself highly predictive of network topology. We show this occurs generally through re-analysis of the BrainSpan RNA-seq data.

Conclusions

Technical properties of single-cell RNA-seq data create confounds in co-expression networks which can be identified and explicitly controlled for in any supervised analysis. This is useful both in improving co-expression performance and in characterizing single-cell data in generally applicable terms, permitting cross-laboratory comparison within a common framework.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-016-0964-6) contains supplementary material, which is available to authorized users.
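
The Results section above mentions neighbor voting in cross-validation. A minimal sketch of one common form of neighbor voting follows: each gene is scored by the fraction of its total connection weight that goes to genes already annotated with a function, and held-out genes are then checked against that ranking. The network, gene names and single hold-out fold are hypothetical, and this is not presented as the authors' implementation.

```python
def neighbor_voting_scores(network, known_genes):
    """Score each gene by the share of its edge weight linked to known genes."""
    scores = {}
    for gene, neighbors in network.items():
        total = sum(neighbors.values())
        to_known = sum(w for n, w in neighbors.items() if n in known_genes)
        scores[gene] = to_known / total if total > 0 else 0.0
    return scores

# Hypothetical weighted co-expression network as a symmetric nested dict.
network = {
    "g1": {"g2": 0.9, "g3": 0.8, "g4": 0.1},
    "g2": {"g1": 0.9, "g3": 0.7, "g4": 0.2},
    "g3": {"g1": 0.8, "g2": 0.7, "g4": 0.1},
    "g4": {"g1": 0.1, "g2": 0.2, "g3": 0.1, "g5": 0.9},
    "g5": {"g4": 0.9},
}

# One cross-validation fold: g1, g2 and g3 are assumed to share a function,
# and g3's label is hidden so we can ask whether the network recovers it.
training_genes = {"g1", "g2"}
held_out = "g3"

scores = neighbor_voting_scores(network, training_genes)
candidates = {g: s for g, s in scores.items() if g not in training_genes}
ranked = sorted(candidates, key=candidates.get, reverse=True)
print(ranked)
```

Repeating this over several folds and summarizing how highly the held-out genes rank gives a performance estimate for the network; the summary statistic used for that is a separate design choice.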

Ballouz S, Gillis J. AuPairWise: A Method to Estimate RNA-Seq Replicability through Co-expression. PLoS Comput Biol 2016;12:e1004868. [PMID: 27082953 PMCID: PMC4833304 DOI: 10.1371/journal.pcbi.1004868] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/22/2015] [Accepted: 03/14/2016] [Indexed: 11/23/2022] Open

Abstract

In addition to detecting novel transcripts and higher dynamic range, a principal claim for RNA-sequencing has been greater replicability, typically measured in sample-sample correlations of gene expression levels. Through a re-analysis of ENCODE data, we show that replicability of transcript abundances will provide misleading estimates of the replicability of conditional variation in transcript abundances (i.e., most expression experiments). Heuristics which implicitly address this problem have emerged in quality control measures to obtain ‘good’ differential expression results. However, these methods involve strict filters such as discarding low expressing genes or using technical replicates to remove discordant transcripts, and are costly or simply ad hoc. As an alternative, we model gene-level replicability of differential activity using co-expressing genes. We find that sets of housekeeping interactions provide a sensitive means of estimating the replicability of expression changes, where the co-expressing pair can be regarded as pseudo-replicates of one another. We model the effects of noise that perturbs a gene’s expression within its usual distribution of values and show that perturbing expression by only 5% within that range is readily detectable (AUROC~0.73). We have made our method available as a set of easily implemented R scripts.

RNA-sequencing has become a popular means to detect the expression levels of genes. However, quality control is still challenging, requiring both extreme measures and rules which are set in stone from extensive previous analysis. Instead of relying on these rules, we show that co-expression can be used to measure biological replicability with extremely high precision. Co-expression is a well-studied phenomenon in which two genes that are known to form a functional unit are also expressed at similar levels, and change in similar ways across conditions. Using this concept, we can detect how well an experiment replicates by measuring how well it has retained the co-expression pattern across defined gene-pairs. We do this by measuring how easy it is to detect a sample to which some noise has been added. We show this method is a useful tool for quality control.
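
The abstract above reports detectability of a perturbed sample as an AUROC. As a sketch of how such a number can be computed, the function below uses the standard rank-sum (Mann-Whitney U) formulation of AUROC. The per-sample "disruption" scores and labels are hypothetical, and how AuPairWise itself derives its scores is not shown here.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum formulation; tied scores share an average rank."""
    paired = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(paired)
    n_pos = sum(1 for _, is_pos in paired if is_pos)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and paired[j + 1][0] == paired[i][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            if paired[k][1]:
                rank_sum_pos += mean_rank
        i = j + 1
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical scores: higher means the co-expression pattern looks more
# disrupted in that sample. True marks samples that had noise added.
disruption_scores = [0.10, 0.40, 0.35, 0.80]
noise_added = [False, False, True, True]

example_auroc = auroc(disruption_scores, noise_added)
print(example_auroc)
```

An AUROC of 0.5 corresponds to chance-level detection and 1.0 to perfect separation of perturbed from unperturbed samples.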

Verleyen W, Ballouz S, Gillis J. Positive and negative forms of replicability in gene network analysis. Bioinformatics 2015;32:1065-73. [PMID: 26668004 DOI: 10.1093/bioinformatics/btv734] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/13/2015] [Accepted: 12/09/2015] [Indexed: 02/07/2023] Open

Ballouz S, Verleyen W, Gillis J. Guidance for RNA-seq co-expression network construction and analysis: safety in numbers. Bioinformatics 2015;31:2123-30. [PMID: 25717192 DOI: 10.1093/bioinformatics/btv118] [Citation(s) in RCA: 134] [Impact Index Per Article: 14.9] [Received: 06/19/2014] [Accepted: 02/19/2015] [Indexed: 12/11/2022]

Verleyen W, Ballouz S, Gillis J. Measuring the wisdom of the crowds in network-based gene function inference. Bioinformatics 2014;31:745-52. [DOI: 10.1093/bioinformatics/btu715] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open

Grover MP, Ballouz S, Mohanasundaram KA, George RA, Sherman CDH, Crowley TM, Wouters MA. Identification of novel therapeutics for complex diseases from genome-wide association data. BMC Med Genomics 2014;7 Suppl 1:S8. [PMID: 25077696 PMCID: PMC4101352 DOI: 10.1186/1755-8794-7-s1-s8] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/03/2022] Open

Abstract

Background

Human genome sequencing has enabled the association of phenotypes with genetic loci, but our ability to effectively translate this data to the clinic has not kept pace. Over the past 60 years, pharmaceutical companies have successfully demonstrated the safety and efficacy of over 1,200 novel therapeutic drugs via costly clinical studies. While this process must continue, better use can be made of the existing valuable data. In silico tools such as candidate gene prediction systems allow rapid identification of disease genes by identifying the most probable candidate genes linked to genetic markers of the disease or phenotype under investigation. Integration of drug-target data with candidate gene prediction systems can identify novel phenotypes which may benefit from current therapeutics. Such a drug repositioning tool can save valuable time and money spent on preclinical studies and phase I clinical trials.

Methods

We previously used Gentrepid (http://www.gentrepid.org) as a platform to predict 1,497 candidate genes for the seven complex diseases considered in the Wellcome Trust Case-Control Consortium genome-wide association study; namely Type 2 Diabetes, Bipolar Disorder, Crohn's Disease, Hypertension, Type 1 Diabetes, Coronary Artery Disease and Rheumatoid Arthritis. Here, we adopted a simple approach to integrate drug data from three publicly available drug databases: the Therapeutic Target Database, the Pharmacogenomics Knowledgebase and DrugBank; with candidate gene predictions from Gentrepid at the systems level.

Results

Using the publicly available drug databases as sources of drug-target association data, we identified a total of 428 candidate genes as novel therapeutic targets for the seven phenotypes of interest, and 2,130 drugs feasible for repositioning against the predicted novel targets.

Conclusions

By integrating genetic, bioinformatic and drug data, we have demonstrated that currently available drugs may be repositioned as novel therapeutics for the seven diseases studied here, quickly taking advantage of prior work in pharmaceutics to translate ground-breaking results in genetics to clinical treatments.
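
The Methods above describe integrating drug-target data with candidate gene predictions. One simple way to picture that integration is a join between a phenotype-to-gene table and a drug-to-target table, sketched below. All identifiers are placeholders, the exclusion of drugs already indicated for a phenotype is an assumption added for illustration, and nothing here reflects the actual schemas of Gentrepid, the Therapeutic Target Database, the Pharmacogenomics Knowledgebase or DrugBank.

```python
from collections import defaultdict

def repositioning_candidates(candidate_genes, drug_targets, known_indications):
    """For each phenotype, list (drug, gene) pairs where the drug targets a
    predicted candidate gene and is not already indicated for that phenotype."""
    drugs_by_gene = defaultdict(set)
    for drug, gene in drug_targets:
        drugs_by_gene[gene].add(drug)

    result = {}
    for phenotype, genes in candidate_genes.items():
        already_used = known_indications.get(phenotype, set())
        result[phenotype] = sorted(
            (drug, gene)
            for gene in genes
            for drug in drugs_by_gene.get(gene, ())
            if drug not in already_used
        )
    return result

# Placeholder inputs (hypothetical identifiers, not real genes or drugs).
candidate_genes = {
    "phenotype_X": {"GENE_A", "GENE_B"},
    "phenotype_Y": {"GENE_C"},
}
drug_targets = [
    ("drug_1", "GENE_A"),
    ("drug_2", "GENE_A"),
    ("drug_3", "GENE_C"),
    ("drug_4", "GENE_Z"),  # targets a gene that is not a candidate
]
known_indications = {"phenotype_X": {"drug_1"}}

candidates = repositioning_candidates(candidate_genes, drug_targets, known_indications)
for phenotype, pairs in candidates.items():
    print(phenotype, pairs)
```

Merging several drug databases would amount to concatenating their drug-target pairs before the join; reconciling identifiers across databases is the harder practical step and is left out of this sketch.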

Gillis J, Ballouz S, Pavlidis P. Bias tradeoffs in the creation and analysis of protein-protein interaction networks. J Proteomics 2014;100:44-54. [PMID: 24480284 DOI: 10.1016/j.jprot.2014.01.020] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 06/07/2013] [Revised: 01/13/2014] [Accepted: 01/17/2014] [Indexed: 02/04/2023]

Abstract

Networks constructed from aggregated protein-protein interaction data are commonplace in biology. But the studies these data are derived from were conducted with their own hypotheses and foci. Focusing on data from budding yeast present in BioGRID, we determine that many of the downstream signals present in network data are significantly impacted by biases in the original data. We determine the degree to which selection bias in favor of biologically interesting bait proteins goes down with study size, while we also find that promiscuity in prey contributes more substantially in larger studies. We analyze interaction studies over time with respect to data in the Gene Ontology and find that reproducibly observed interactions are less likely to favor multifunctional proteins. We find that strong alignment between co-expression and protein-protein interaction data occurs only for extreme co-expression values, and use this data to suggest candidates for targets likely to reveal novel biology in follow-up studies.

BIOLOGICAL SIGNIFICANCE

Protein-protein interaction data finds particularly heavy use in the interpretation of disease-causal variants. In principle, network data allows researchers to find novel commonalities among candidate genes. In this study, we detail several of the most salient biases contributing to aggregated protein-protein interaction databases. We find strong evidence for the role of selection and laboratory biases. Many of these effects contribute to the commonalities researchers find for disease genes. In order for characterization of disease genes and their interactions to not simply be an artifact of researcher preference, it is imperative to identify data biases explicitly. Based on this, we also suggest ways to move forward in producing candidates less influenced by prior knowledge. This article is part of a Special Issue entitled: Can Proteomics Fill the Gap Between Genomics and Phenotypes?


Ballouz S, Liu JY, Oti M, Gaeta B, Fatkin D, Bahlo M, Wouters MA. Candidate disease gene prediction using Gentrepid: application to a genome-wide association study on coronary artery disease. Mol Genet Genomic Med 2013;2:44-57. [PMID: 24498628 PMCID: PMC3907915 DOI: 10.1002/mgg3.40] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 12/05/2012] [Accepted: 08/19/2013] [Indexed: 12/12/2022] Open

Ballouz S, Liu JY, George RA, Bains N, Liu A, Oti M, Gaeta B, Fatkin D, Wouters MA. Gentrepid V2.0: a web server for candidate disease gene prediction. BMC Bioinformatics 2013;14:249. [PMID: 23947436 PMCID: PMC3844418 DOI: 10.1186/1471-2105-14-249] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/22/2012] [Accepted: 08/13/2013] [Indexed: 01/06/2023] Open

Ballouz S, Liu JY, Oti M, Gaeta B, Fatkin D, Bahlo M, Wouters MA. Analysis of genome-wide association study data using the protein knowledge base. BMC Genet 2011;12:98. [PMID: 22077927 PMCID: PMC3261104 DOI: 10.1186/1471-2156-12-98] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/30/2011] [Accepted: 11/13/2011] [Indexed: 12/25/2022] Open

Abstract

BACKGROUND

Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control.

RESULTS

Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high-sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either fixed-width or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input, or without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data.

CONCLUSIONS

Results suggest testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies compared to previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than Mendelian diseases.
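The two kinds of SNP-to-gene search space mentioned in the Results above can be illustrated with a minimal Python sketch. The abstract does not define either construction precisely, so everything here is an assumption made for illustration: the `Gene` record, the function names, the reading of "fixed-width" as a symmetric window around the SNP, and the reading of "proximity-based" as the nearest few genes on the same chromosome. It is not Gentrepid's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: name, chromosome, start and end coordinates."""
    name: str
    chrom: str
    start: int
    end: int


def distance(gene, pos):
    """Base-pair distance from a SNP position to a gene (0 if the SNP is inside it)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(gene.start - pos), abs(gene.end - pos))


def fixed_width_space(genes, chrom, pos, half_width):
    """Assumed fixed-width space: genes overlapping pos +/- half_width."""
    lo, hi = pos - half_width, pos + half_width
    return [g.name for g in genes
            if g.chrom == chrom and g.end >= lo and g.start <= hi]


def proximity_space(genes, chrom, pos, n_nearest):
    """Assumed proximity-based space: the n_nearest genes on the same chromosome."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    same_chrom.sort(key=lambda g: distance(g, pos))
    return [g.name for g in same_chrom[:n_nearest]]


# Toy coordinates, invented for the example.
genes = [
    Gene("A", "1", 100, 200),
    Gene("B", "1", 1000, 1500),
    Gene("C", "1", 5000, 6000),
    Gene("D", "2", 100, 200),
]
narrow = fixed_width_space(genes, "1", 900, 200)
nearest_two = proximity_space(genes, "1", 900, 2)
```

Under these assumptions, the fixed-width space depends on local gene density, while the proximity-based space always returns the same number of candidates. That difference is one reason testing several search spaces per SNP could matter.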


Oti M, Ballouz S, Wouters MA. Web tools for the prioritization of candidate disease genes. Methods Mol Biol 2011;760:189-206. [PMID: 21779998 DOI: 10.1007/978-1-61779-176-5_12] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 05/31/2023]

Ballouz S, Francis AR, Lan R, Tanaka MM. Conditions for the evolution of gene clusters in bacterial genomes. PLoS Comput Biol 2010;6:e1000672. [PMID: 20168992 PMCID: PMC2820515 DOI: 10.1371/journal.pcbi.1000672] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 06/15/2009] [Accepted: 01/07/2010] [Indexed: 11/18/2022] Open

Abstract

Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters.

Genes involved in a common pathway or function are frequently found near each other on bacterial chromosomes. A number of hypotheses have been previously presented to explain this observation. A particularly influential theory is the selfish operon model, which posits that horizontal transfer could promote gene clustering by favouring transfer of arrangements of genes that are close together. Subsequent theoretical development and analysis of genomic data have contributed to the debate about the plausibility of this model. Here, by re-examining the evolutionary dynamics of gene clusters, we provide and discuss conditions under which gene clusters can evolve. We find that some form of bias for clustering is required for clusters to evolve. This bias can be in the form of bias in horizontal transfer towards genes that are close together, or direct natural selection for gene proximity. Our computational work does not present a theoretical obstacle to the selfish operon model as a possible explanation for the evolution of gene clusters.
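To make the second model in the abstract above more concrete, the following is a minimal Python sketch of a Moran process with selection for gene clustering and rearrangement by inversion. The abstract gives no model details, so the specifics are assumptions made for illustration: a linear genome stored as a tuple of gene labels, pathway genes labelled `0..k-1`, a fitness of `1 + s` when those genes are contiguous and `1` otherwise, and the parameter names `s` (selection) and `mu` (inversion probability per birth). It is not the authors' simulation and makes no claim about their results.

```python
import random


def span(genome, k):
    """Length of the shortest stretch containing all pathway genes 0..k-1."""
    positions = [i for i, gene in enumerate(genome) if gene < k]
    return max(positions) - min(positions) + 1


def fitness(genome, k, s):
    """Assumed fitness: 1 + s if the k pathway genes are contiguous, else 1."""
    return 1.0 + s if span(genome, k) == k else 1.0


def invert(genome, rng):
    """Reverse a randomly chosen segment of the genome (an inversion event)."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    return genome[:i] + genome[i:j][::-1] + genome[j:]


def moran_step(pop, k, s, mu, rng):
    """One Moran event: fitness-weighted birth, then uniform random death."""
    weights = [fitness(g, k, s) for g in pop]
    parent = rng.choices(pop, weights=weights, k=1)[0]
    child = invert(parent, rng) if rng.random() < mu else parent
    pop[rng.randrange(len(pop))] = child  # the child replaces a random individual


def simulate(n_pop, n_genes, k, s, mu, steps, seed=0):
    """Return the fraction of genomes with a contiguous cluster after `steps` events."""
    rng = random.Random(seed)
    start = tuple(rng.sample(range(n_genes), n_genes))  # random initial gene order
    pop = [start] * n_pop
    for _ in range(steps):
        moran_step(pop, k, s, mu, rng)
    return sum(span(g, k) == k for g in pop) / n_pop


clustered_fraction = simulate(n_pop=50, n_genes=12, k=3, s=0.1, mu=0.05, steps=2000)
```

Varying `k`, `n_pop`, and `s` in a sketch like this is one way to explore the kind of dependence on pathway size, population size, and selection strength that the abstract reports.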


Robertson J, Ballouz S, Jaiyesimi I, Jury R, Margolis J. A Phase I Study of Dose Escalating Conformal Radiation Therapy with Concurrent Full-dose Gemcitabine and Erlotinib for Unresected Pancreas Cancer. Int J Radiat Oncol Biol Phys 2009. [DOI: 10.1016/j.ijrobp.2009.07.620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Teber ET, Liu JY, Ballouz S, Fatkin D, Wouters MA. Comparison of automated candidate gene prediction systems using genes implicated in type 2 diabetes by genome-wide association studies. BMC Bioinformatics 2009;10 Suppl 1:S69. [PMID: 19208173 PMCID: PMC2648789 DOI: 10.1186/1471-2105-10-s1-s69] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open

Robertson J, Hardy M, Ballouz S, Jaiyesimi I, Margolis J, Jury R, Wallace M, Maino H. Conformal Radiation Therapy with Concurrent Full-dose Gemcitabine and Erlotinib for Unresected Pancreas Cancer: A Phase I Trial. Int J Radiat Oncol Biol Phys 2008. [DOI: 10.1016/j.ijrobp.2008.06.1725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]