Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mehmood T, Martens H, Saebø S, Warringer J, Snipen L. Mining for genotype-phenotype relations in Saccharomyces using partial least squares. BMC Bioinformatics 2011;12:318. [PMID: 21812956 PMCID: PMC3175482 DOI: 10.1186/1471-2105-12-318] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 02/01/2011] [Accepted: 08/03/2011] [Indexed: 11/18/2022]

Cited by Other Article(s)

Aponte JD, Katz DC, Roth DM, Vidal-García M, Liu W, Andrade F, Roseman CC, Murray SA, Cheverud J, Graf D, Marcucio RS, Hallgrímsson B. Relating multivariate shapes to genescapes using phenotype-biological process associations for craniofacial shape. eLife 2021;10:68623. [PMID: 34779766 PMCID: PMC8631940 DOI: 10.7554/elife.68623] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/21/2021] [Accepted: 11/12/2021] [Indexed: 12/20/2022]

Abstract

Realistic mappings of genes to morphology are inherently multivariate on both sides of the equation. The importance of coordinated gene effects on morphological phenotypes is clear from the intertwining of gene actions in signaling pathways, gene regulatory networks, and developmental processes underlying the development of shape and size. Yet, current approaches tend to focus on identifying and localizing the effects of individual genes and rarely leverage the information content of high-dimensional phenotypes. Here, we explicitly model the joint effects of biologically coherent collections of genes on a multivariate trait – craniofacial shape – in a sample of n = 1145 mice from the Diversity Outbred (DO) experimental line. We use biological process Gene Ontology (GO) annotations to select skeletal and facial development gene sets and solve for the axis of shape variation that maximally covaries with gene set marker variation. We use our process-centered, multivariate genotype-phenotype (process MGP) approach to determine the overall contributions to craniofacial variation of genes involved in relevant processes and how variation in different processes corresponds to multivariate axes of shape variation. Further, we compare the directions of effect in phenotype space of mutations to the primary axis of shape variation associated with broader pathways within which they are thought to function. Finally, we leverage the relationship between mutational and pathway-level effects to predict phenotypic effects beyond craniofacial shape in specific mutants. We also introduce an online application that provides users the means to customize their own process-centered craniofacial shape analyses in the DO. The process-centered approach is generally applicable to any continuously varying phenotype and thus has wide-reaching implications for complex trait genetics.

Affiliation(s)

Jose D Aponte Department of Cell Biology & Anatomy, Alberta Children's Hospital Research Institute and McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
David C Katz Department of Cell Biology & Anatomy, Alberta Children's Hospital Research Institute and McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
Daniela M Roth School of Dentistry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
Marta Vidal-García Department of Cell Biology & Anatomy, Alberta Children's Hospital Research Institute and McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
Wei Liu Department of Cell Biology & Anatomy, Alberta Children's Hospital Research Institute and McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
Fernando Andrade Department of Biology, Loyola University Chicago, Chicago, United States
Charles C Roseman Department of Biology, Loyola University Chicago, Chicago, United States
Steven A Murray The Jackson Laboratory, Bar Harbor, United States
James Cheverud Department of Biology, Loyola University Chicago, Chicago, United States
Daniel Graf School of Dentistry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada; Department of Medical Genetics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
Ralph S Marcucio Department of Orthopaedic Surgery, School of Medicine, University of California, San Francisco, San Francisco, United States
Benedikt Hallgrímsson Department of Cell Biology & Anatomy, Alberta Children's Hospital Research Institute and McCaig Bone and Joint Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada; Department of Animal Biology, University of Illinois Urbana Champaign, Urbana, United States

Guo C, Wang H, Feng G, Li J, Su C, Zhang J, Wang Z, Du W, Zhang B. Spatiotemporal predictions of obesity prevalence in Chinese children and adolescents: based on analyses of obesogenic environmental variability and Bayesian model. Int J Obes (Lond) 2019;43:1380-1390. [PMID: 30568273 PMCID: PMC6584073 DOI: 10.1038/s41366-018-0301-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/08/2018] [Revised: 11/03/2018] [Accepted: 11/30/2018] [Indexed: 01/22/2023]

Abstract

OBJECTIVE

To find variations in Chinese obesogenic environmental priorities from 2000 to 2011, predict the spatiotemporal distribution of obesity prevalence among those aged 7-17 years in 31 provinces, and provide foundations for policy-makers to reduce obesity in children and adolescents.

METHODS

Based on an examination of provincial obesity prevalence among those aged 7-17 years from three rounds of China Health and Nutrition Surveys (in 9 [2000], 9 [2006], and 12 [2011] provinces) and the corresponding years' environments in 31 provinces from China Statistical Yearbooks and other sources, 12 predictors were selected. We used 30 surveyed provinces in three rounds as training samples to fit three analytic models with partial least-square regressions and prioritized predictors by variable importance projection to find variations. We then fitted a spatiotemporal prediction model with Bayesian analysis to make inferences in space and time.

RESULTS

Variations of obesogenic environmental priorities were found at different times. A Bayesian spatiotemporal prediction model with deviance information criterion of 155.60 and statistically significant (P < 0.05) parameter estimates of intercept (-717.0400, 95% confidence intervals [CI]: -1186.0300, -248.0480), year (0.3584, CI: 0.1245, 0.5924), square of food industry level (0.0003, CI: 0.0002, 0.0004), and log (healthcare) (5.3742, CI: 2.5138, 8.2347) was optimized. Overall, the inferred average obesity prevalence among children and adolescents was 2.23%, 5.11%, 10.77%, 12.20%, 13.99%, and 17.58% in 31 provinces in China in 2000, 2006, 2011, 2015, 2020, and 2030, respectively. Obesity clusters in the north and east of China on the predicted maps.

CONCLUSIONS

Obesity prevalence in children and adolescents in China is rapidly increasing, growing at 0.3584% annually from 2000 to 2011. From longitudinal observation, prevalence was significantly influenced by the food industry ("Amplifier") and healthcare service ("Balancer"). Targeted interventions in the north and east of China are pressing. Further research on the mechanisms underlying the influence of the food industry, healthcare service, and other factors in children and adolescents is needed.

Hanssen EN, Liland KH, Gill P, Snipen L. Optimizing body fluid recognition from microbial taxonomic profiles. Forensic Sci Int Genet 2018;37:13-20. [PMID: 30071492 DOI: 10.1016/j.fsigen.2018.07.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/26/2018] [Revised: 07/12/2018] [Accepted: 07/13/2018] [Indexed: 12/17/2022]

Abstract

In forensics the DNA-profile is used to identify the person who left a biological trace, but information on body fluid can also be essential in the evidence evaluation process. Microbial composition data could potentially be used for body fluid recognition as an improved alternative to the currently used presumptive tests. We have developed a customized workflow for interpretation of bacterial 16S sequence data based on a model composed of Partial Least Squares (PLS) in combination with Linear Discriminant Analysis (LDA). Large data sets from the Human Microbiome Project (HMP) and the American Gut Project (AGP) were used to test different settings in order to optimize performance. From the initial cross-validation of body fluid recognition within the HMP data, the optimal overall accuracy was close to 98%. Sensitivity values for the fecal and oral samples were ≥0.99, followed by the vaginal samples with 0.98 and the skin and nasal samples with 0.96 and 0.81 respectively. Specificity values were high for all 5 categories, mostly >0.99. This optimal performance was achieved by using the following settings: Taxonomic profiles based on operational taxonomic units (OTUs) with 0.98 identity (OTU98), Aitchison's simplex transform with C = 1 pseudo-count and no regularization (r = 1) in the PLS step. Variable selection did not improve the performance further. To test for robustness across sequencing platforms, we also trained the classifier on HMP data and tested on the AGP data set. In this case, the standard OTU based approach showed a moderate decline in accuracy. However, by using taxonomic profiles made by direct assignment of reads to a genus, we were able to nearly maintain the high accuracy levels. The optimal combination of settings was still used, except the taxonomic level being genus instead of OTU98. The performance may be improved even further by using higher resolution taxonomic bins.

Li C, Gong W, Zhang L, Yang Z, Nong W, Bian Y, Kwan HS, Cheung MK, Xiao Y. Association Mapping Reveals Genetic Loci Associated with Important Agronomic Traits in Lentinula edodes, Shiitake Mushroom. Front Microbiol 2017;8:237. [PMID: 28261189 PMCID: PMC5314409 DOI: 10.3389/fmicb.2017.00237] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/26/2016] [Accepted: 02/03/2017] [Indexed: 12/28/2022]

Märtens K, Hallin J, Warringer J, Liti G, Parts L. Predicting quantitative traits from genome and phenome with near perfect accuracy. Nat Commun 2016;7:11512. [PMID: 27160605 PMCID: PMC4866306 DOI: 10.1038/ncomms11512] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/30/2015] [Accepted: 04/01/2016] [Indexed: 12/20/2022]

Multivariate Analysis of Genotype-Phenotype Association. Genetics 2016;202:1345-63. [PMID: 26896328 DOI: 10.1534/genetics.115.181339] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 07/29/2015] [Accepted: 02/15/2016] [Indexed: 11/18/2022]

Abstract

With the advent of modern imaging and measurement technology, complex phenotypes are increasingly represented by large numbers of measurements, which may not bear biological meaning one by one. For such multivariate phenotypes, studying the pairwise associations between all measurements and all alleles is highly inefficient and prevents insight into the genetic pattern underlying the observed phenotypes. We present a new method for identifying patterns of allelic variation (genetic latent variables) that are maximally associated, in terms of effect size, with patterns of phenotypic variation (phenotypic latent variables). This multivariate genotype-phenotype mapping (MGP) separates phenotypic features under strong genetic control from less genetically determined features and thus permits an analysis of the multivariate structure of genotype-phenotype association, including its dimensionality and the clustering of genetic and phenotypic variables within this association. Different variants of MGP maximize different measures of genotype-phenotype association: genetic effect, genetic variance, or heritability. In an application to a mouse sample, scored for 353 SNPs and 11 phenotypic traits, the first dimension of genetic and phenotypic latent variables accounted for >70% of genetic variation present in all 11 measurements; 43% of variation in this phenotypic pattern was explained by the corresponding genetic latent variable. The first three dimensions together sufficed to account for almost 90% of genetic variation in the measurements and for all the interpretable genotype-phenotype association. Each dimension can be tested as a whole against the hypothesis of no association, thereby reducing the number of statistical tests from 7766 to 3, the maximal number of meaningful independent tests. Important alleles can be selected based on their effect size (additive or nonadditive effect on the phenotypic latent variable). This low dimensionality of the genotype-phenotype map has important consequences for gene identification and may shed light on the evolvability of organisms.

Dumancas GG, Ramasahayam S, Bello G, Hughes J, Kramer R. Chemometric regression techniques as emerging, powerful tools in genetic association studies. Trends Analyt Chem 2015. [DOI: 10.1016/j.trac.2015.05.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/23/2022]

Mehmood T, Rasheed Z. Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data. Communications for Statistical Applications and Methods 2015. [DOI: 10.5351/csam.2015.22.6.575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]

Comparing K-mer based methods for improved classification of 16S sequences. BMC Bioinformatics 2015;16:205. [PMID: 26130333 PMCID: PMC4487979 DOI: 10.1186/s12859-015-0647-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 02/13/2015] [Accepted: 06/06/2015] [Indexed: 11/10/2022]

Abstract

Background

The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length.

Results

The differences in classification error obtained by the methods seemed to be small, but they were stable for both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau.

Conclusions

We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data is needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and more robust training data set is crucial.
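
As an illustration of the K-mer counting by sliding windows that this abstract refers to, the following is a minimal Python sketch. It is not taken from the cited article: the function name `count_kmers`, the example sequence, and the choice of K = 3 are all hypothetical, and the article's own feature extraction and classifiers are not reproduced here.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count overlapping K-mers by sliding a window of width k one base at a time.

    Hypothetical helper for illustration only; not the cited article's code.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 windows (none if n < k).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy example: an 8-base sequence gives 8 - 3 + 1 = 6 overlapping 3-mers.
counts = count_kmers("ACGTACGT", 3)
```

In a classifier of the kind compared above, such counts would typically be collected into a fixed-length vector (one entry per possible K-mer) for each sequence before model fitting; that step is left out of this sketch.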

Franco-Duarte R, Mendes I, Umek L, Drumonde-Neves J, Zupan B, Schuller D. Computational models reveal genotype-phenotype associations in Saccharomyces cerevisiae. Yeast 2014;31:265-77. [PMID: 24752995 DOI: 10.1002/yea.3016] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 11/26/2013] [Revised: 04/09/2014] [Accepted: 04/10/2014] [Indexed: 11/11/2022]

Vinje H, Almøy T, Liland KH, Snipen L. A systematic search for discriminating sites in the 16S ribosomal RNA gene. Microbial Informatics and Experimentation 2014;4:2. [PMID: 24467869 PMCID: PMC3910680 DOI: 10.1186/2042-5783-4-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/16/2013] [Accepted: 12/16/2013] [Indexed: 02/01/2023]

Mehmood T, Warringer J, Snipen L, Sæbø S. Improving stability and understandability of genotype-phenotype mapping in Saccharomyces using regularized variable selection in L-PLS regression. BMC Bioinformatics 2012;13:327. [PMID: 23216988 PMCID: PMC3598729 DOI: 10.1186/1471-2105-13-327] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/24/2012] [Accepted: 12/05/2012] [Indexed: 11/26/2022]

Abstract

Background

Multivariate approaches have been successfully applied to genome wide association studies. Recently, a Partial Least Squares (PLS) based approach was introduced for mapping yeast genotype-phenotype relations, where background information such as gene function classification, gene dispensability, recent or ancient gene copy number variations and the presence of premature stop codons or frameshift mutations in reading frames was used post hoc to explain selected genes. One of the latest advancements in PLS, named L-Partial Least Squares (L-PLS), where 'L' represents the data structure used, enables the use of background information at the modeling level. Here, a modification of L-PLS with variable importance on projection (VIP) was implemented using a stepwise regularized procedure for gene and background information selection. Results were compared to PLS-based procedures, where no background information was used.

Results

Applying the proposed methodology to yeast Saccharomyces cerevisiae data, we found the relationship between genotype-phenotype to have improved understandability. Phenotypic variations were explained by the variations of relatively stable genes and stable background variations. The suggested procedure provides an automatic way for genotype-phenotype mapping. The selected phenotype influencing genes were evolving 29% faster than non-influential genes, and the current results are supported by a recently conducted study. Further power analysis on simulated data verified that the proposed methodology selects relevant variables.

Conclusions

A modification of L-PLS with VIP in a stepwise regularized elimination procedure can improve the understandability and stability of selected genes and background information. The approach is recommended for genome wide association studies where background information is available.

On the prospects of whole-genome association mapping in Saccharomyces cerevisiae. Genetics 2012;191:1345-53. [PMID: 22673807 DOI: 10.1534/genetics.112.141168] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022]

Mehmood T, Martens H, Sæbø S, Warringer J, Snipen L. A Partial Least Squares based algorithm for parsimonious variable selection. Algorithms Mol Biol 2011;6:27. [PMID: 22142365 PMCID: PMC3287970 DOI: 10.1186/1748-7188-6-27] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 09/28/2011] [Accepted: 12/05/2011] [Indexed: 11/15/2022]