1. Lu L, Anderson-Cook CM, Zhang M. Understanding the merits of winning data competition solutions for varied sets of objectives. Stat Anal Data Min 2021. [DOI: 10.1002/sam.11494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Lu Lu: University of South Florida, Tampa, Florida, USA
2. Klammer M, Dybowski JN, Hoffmann D, Schaab C. Pareto Optimization Identifies Diverse Set of Phosphorylation Signatures Predicting Response to Treatment with Dasatinib. PLoS One 2015; 10:e0128542. [PMID: 26083411 PMCID: PMC4470654 DOI: 10.1371/journal.pone.0128542] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/05/2015] [Accepted: 04/26/2015] [Indexed: 01/17/2023] [Open access]
Abstract
Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here, we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network. To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which then were validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and in their size, the central protein of the originally published signature — integrin β4 (ITGB4) — was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
Affiliation(s)
- Martin Klammer: Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany
- J. Nikolaj Dybowski: Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany
- Daniel Hoffmann: Center for Medical Biotechnology, University of Duisburg-Essen, Universitätsstrasse 1-4, 45141 Essen, Germany
- Christoph Schaab: Evotec (München) GmbH, Dept. of Bioinformatics, Am Klopferspitz 19a, 82152 Martinsried, Germany; Max-Planck Institute for Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
3. Lu L, Anderson-Cook CM, Lin DK. Optimal designed experiments using a Pareto front search for focused preference of multiple objectives. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2013.04.008] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022]
4. Lu L, Chapman JL, Anderson-Cook CM. A Case Study on Selecting a Best Allocation of New Data for Improving the Estimation Precision of System and Subsystem Reliability Using Pareto Fronts. Technometrics 2013. [DOI: 10.1080/00401706.2013.831776] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 10/26/2022]
5. Levene C, Correa E, Blanch EW, Goodacre R. Enhancing Surface Enhanced Raman Scattering (SERS) Detection of Propranolol with Multiobjective Evolutionary Optimization. Anal Chem 2012; 84:7899-905. [DOI: 10.1021/ac301647a] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 01/22/2023]
Affiliation(s)
- Clare Levene: Faculty of Life Sciences and School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN
- Elon Correa: Faculty of Life Sciences and School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN
- Ewan W. Blanch: Faculty of Life Sciences and School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN
- Royston Goodacre: Faculty of Life Sciences and School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN
6. He L, Friedman AM, Bailey-Kellogg C. A divide-and-conquer approach to determine the Pareto frontier for optimization of protein engineering experiments. Proteins 2012; 80:790-806. [PMID: 22180081 PMCID: PMC4939273 DOI: 10.1002/prot.23237] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 07/07/2011] [Revised: 10/06/2011] [Accepted: 10/21/2011] [Indexed: 01/07/2023]
Abstract
In developing improved protein variants by site-directed mutagenesis or recombination, there are often competing objectives that must be considered in designing an experiment (selecting mutations or breakpoints): stability versus novelty, affinity versus specificity, activity versus immunogenicity, and so forth. Pareto optimal experimental designs make the best trade-offs between competing objectives. Such designs are not "dominated"; that is, no other design is better than a Pareto optimal design for one objective without being worse for another objective. Our goal is to produce all the Pareto optimal designs (the Pareto frontier), to characterize the trade-offs and suggest designs most worth considering, but to avoid explicitly considering the large number of dominated designs. To do so, we develop a divide-and-conquer algorithm, Protein Engineering Pareto FRontier (PEPFR), that hierarchically subdivides the objective space, using appropriate dynamic programming or integer programming methods to optimize designs in different regions. This divide-and-conquer approach is efficient in that the number of divisions (and thus calls to the optimizer) is directly proportional to the number of Pareto optimal designs. We demonstrate PEPFR with three protein engineering case studies: site-directed recombination for stability and diversity via dynamic programming, site-directed mutagenesis of interacting proteins for affinity and specificity via integer programming, and site-directed mutagenesis of a therapeutic protein for activity and immunogenicity via integer programming. We show that PEPFR is able to effectively produce all the Pareto optimal designs, discovering many more designs than previous methods. The characterization of the Pareto frontier provides additional insights into the local stability of design choices as well as global trends leading to trade-offs between competing criteria.
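The dominance relation described in this abstract can be illustrated with a minimal sketch. This is a brute-force filter over an explicit list of candidates, not the PEPFR divide-and-conquer algorithm, which is designed to avoid enumerating dominated designs. The function names, the toy two-objective scores, and the assumption that every objective is maximized are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if design a dominates design b.

    Assumes every objective is to be maximized: a must be at least as
    good as b on all objectives and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(designs: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the designs that no other design dominates (O(n^2) scan)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical (objective 1, objective 2) scores for five candidate designs.
designs = [(0.9, 0.1), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9), (0.5, 0.2)]
front = pareto_front(designs)
print(front)
```

Here (0.4, 0.4) and (0.5, 0.2) are both dominated by (0.5, 0.5), so only the other three designs remain on the frontier.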
Affiliation(s)
- Lu He: Department of Computer Science, Dartmouth College, Hanover NH 03755
- Alan M. Friedman: Department of Biological Sciences, Markey Center for Structural Biology, Purdue Cancer Center, and Bindley Bioscience Center, Purdue University
7. Fjell CD, Hiss JA, Hancock REW, Schneider G. Designing antimicrobial peptides: form follows function. Nat Rev Drug Discov 2011; 11:37-51. [PMID: 22173434 DOI: 10.1038/nrd3591] [Citation(s) in RCA: 1367] [Impact Index Per Article: 105.2] [Indexed: 12/15/2022]
Abstract
Multidrug-resistant bacteria are a severe threat to public health. Conventional antibiotics are becoming increasingly ineffective as a result of resistance, and it is imperative to find new antibacterial strategies. Natural antimicrobials, known as host defence peptides or antimicrobial peptides, defend host organisms against microbes but most have modest direct antibiotic activity. Enhanced variants have been developed using straightforward design and optimization strategies and are being tested clinically. Here, we describe advanced computer-assisted design strategies that address the difficult problem of relating primary sequence to peptide structure, and are delivering more potent, cost-effective, broad-spectrum peptides as potential next-generation antibiotics.
Affiliation(s)
- Christopher D Fjell: Centre for Microbial Diseases and Immunity Research, University of British Columbia, 2259 Lower Mall, Vancouver, British Columbia V6T 1Z4, Canada
8. Dybowski JN, Riemenschneider M, Hauke S, Pyka M, Verheyen J, Hoffmann D, Heider D. Improved Bevirimat resistance prediction by combination of structural and sequence-based classifiers. BioData Min 2011; 4:26. [PMID: 22082002 PMCID: PMC3248369 DOI: 10.1186/1756-0381-4-26] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 06/15/2011] [Accepted: 11/14/2011] [Indexed: 01/07/2023] [Open access]
Abstract
BACKGROUND: Maturation inhibitors such as Bevirimat are a new class of antiretroviral drugs that hamper the cleavage of HIV-1 proteins into their functional active forms. They bind to these preproteins and inhibit their cleavage by the HIV-1 protease, resulting in non-functional virus particles. Nevertheless, there exist mutations in this region leading to resistance against Bevirimat. Highly specific and accurate tools to predict resistance to maturation inhibitors can help to identify patients who might benefit from the usage of these new drugs.
RESULTS: We tested several methods to improve Bevirimat resistance prediction in HIV-1. It turned out that combining structural and sequence-based information in classifier ensembles led to accurate and reliable predictions. Moreover, we were able to identify the most crucial regions for Bevirimat resistance computationally, which are in line with experimental results from other studies.
CONCLUSIONS: Our analysis demonstrated the use of machine learning techniques to predict HIV-1 resistance against maturation inhibitors such as Bevirimat. New maturation inhibitors are already under development and might enlarge the arsenal of antiretroviral drugs in the future. Thus, accurate prediction tools are very useful to enable a personalized therapy.
Affiliation(s)
- J Nikolaj Dybowski: Department of Bioinformatics, Center of Medical Biotechnology, University of Duisburg-Essen, Universitaetsstr. 2, 45117 Essen, Germany
9. Lu L, Anderson-Cook CM, Robinson TJ. Optimization of Designed Experiments Based on Multiple Criteria Utilizing a Pareto Frontier. Technometrics 2011. [DOI: 10.1198/tech.2011.10087] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Indexed: 11/21/2022]
10. Computational Design of a DNA- and Fc-Binding Fusion Protein. Adv Bioinformatics 2011; 2011:457578. [PMID: 21941539 PMCID: PMC3173724 DOI: 10.1155/2011/457578] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/29/2011] [Revised: 06/16/2011] [Accepted: 06/22/2011] [Indexed: 12/23/2022] [Open access]
Abstract
Computational design of novel proteins with well-defined functions is an ongoing topic in computational biology. In this work, we generated and optimized a new synthetic fusion protein using an evolutionary approach. The optimization was guided by directed evolution based on hydrophobicity scores, molecular weight, and secondary structure predictions. Several methods were used to refine the models built from the resulting sequences. We have successfully combined two unrelated naturally occurring binding sites, the immunoglobulin Fc-binding site of the Z domain and the DNA-binding motif of MyoD bHLH, into a novel stable protein.
11. Petrov D, Zagrovic B. Microscopic analysis of protein oxidative damage: effect of carbonylation on structure, dynamics, and aggregability of villin headpiece. J Am Chem Soc 2011; 133:7016-24. [PMID: 21506564 PMCID: PMC3088313 DOI: 10.1021/ja110577e] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
Abstract
One of the most important irreversible oxidative modifications of proteins is carbonylation, the process of introducing a carbonyl group in reaction with reactive oxygen species. Notably, carbonylation increases with the age of cells and is associated with the formation of intracellular protein aggregates and the pathogenesis of age-related disorders such as neurodegenerative diseases and cancer. However, it is still largely unclear how carbonylation affects protein structure, dynamics, and aggregability at the atomic level. Here, we use classical molecular dynamics simulations to study structure and dynamics of the carbonylated headpiece domain of villin, a key actin-organizing protein. We perform an exhaustive set of molecular dynamics simulations of a native villin headpiece together with every possible combination of carbonylated versions of its seven lysine, arginine, and proline residues, quantitatively the most important carbonylable amino acids. Surprisingly, our results suggest that high levels of carbonylation, far above those associated with cell death in vivo, may be required to destabilize and unfold protein structure through the disruption of specific stabilizing elements, such as salt bridges or proline kinks, or tampering with the hydrophobic effect. On the other hand, by using thermodynamic integration and molecular hydrophobicity potential approaches, we quantitatively show that carbonylation of hydrophilic lysine and arginine residues is equivalent to introducing hydrophobic, charge-neutral mutations in their place, and, by comparison with experimental results, we demonstrate that this by itself significantly increases the intrinsic aggregation propensity of both structured, native proteins and their unfolded states. Finally, our results provide a foundation for a novel experimental strategy to study the effects of carbonylation on protein structure, dynamics, and aggregability using site-directed mutagenesis.
Affiliation(s)
- Drazen Petrov: Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Campus Vienna Biocenter 5, Vienna AT-1030, Austria
12. Dynamic causal modeling with genetic algorithms. J Neurosci Methods 2010; 194:402-6. [PMID: 21094663 DOI: 10.1016/j.jneumeth.2010.11.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 05/27/2010] [Revised: 10/13/2010] [Accepted: 11/10/2010] [Indexed: 11/23/2022]
Abstract
In recent years, dynamic causal modeling has gained popularity in the neuroimaging community as an approach for the estimation of effective connectivity from functional magnetic resonance imaging (fMRI) data. The algorithm calls for an a priori defined model, whose parameter estimates are subsequently computed from the given data. As the number of possible models increases exponentially with additional areas, it rapidly becomes inefficient to compute parameter estimates for all models in order to reveal the family of models with the highest posterior probability. In the present study, we developed a genetic algorithm for dynamic causal models and investigated whether this evolutionary approach can accelerate the model search. In this context, the configuration of the intrinsic, extrinsic and bilinear connection matrices represents the genetic code and Bayesian model selection serves as a fitness function. Using crossover and mutation, populations of models are created and compared with each other. The most probable ones survive the current generation and serve as a source for the next generation of models. Tests with artificially created data sets show that the genetic algorithm approximates the most plausible models faster than a random-driven brute-force search. The fitness landscape revealed by the genetic algorithm indicates that dynamic causal modeling has excellent properties for evolution-driven optimization techniques.
13
|
|