1
Whitsitt Q, Saxena A, Patel B, Evans BM, Hunt B, Purcell EK. Spatial transcriptomics at the brain-electrode interface in rat motor cortex and the relationship to recording quality. J Neural Eng 2024; 21:046033. [PMID: 38885679 PMCID: PMC11289622 DOI: 10.1088/1741-2552/ad5936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 04/15/2024] [Accepted: 06/17/2024] [Indexed: 06/20/2024]
Abstract
Study of the foreign body reaction to implanted electrodes in the brain is an important area of research for the future development of neuroprostheses and experimental electrophysiology. After electrode implantation in the brain, microglial activation, reactive astrogliosis, and neuronal cell death create an environment immediately surrounding the electrode that is significantly altered from its homeostatic state. Objective. To uncover physiological changes potentially affecting device function and longevity, spatial transcriptomics (ST) was implemented to identify changes in gene expression driven by electrode implantation and compare this differential gene expression to traditional metrics of glial reactivity, neuronal loss, and electrophysiological recording quality. Approach. For these experiments, rats were chronically implanted with functional Michigan-style microelectrode arrays, from which electrophysiological recordings (multi-unit activity, local field potential) were taken over a six-week time course. Brain tissue cryosections surrounding each electrode were then mounted for ST processing. The tissue was immunolabeled for neurons and astrocytes, which provided both a spatial reference for ST and a quantitative measure of glial fibrillary acidic protein and neuronal nuclei immunolabeling surrounding each implant. Main results. Results from rat motor cortex within 300 µm of the implanted electrodes at 24 h, 1 week, and 6 weeks post-implantation showed up to 553 significantly differentially expressed (DE) genes between implanted and non-implanted tissue sections. Regression on the significant DE genes identified the 6-7 genes that had the strongest relationship to histological and electrophysiological metrics, revealing potential candidate biomarkers of recording quality and the tissue response to implanted electrodes. Significance. Our analysis has shed new light onto the potential mechanisms involved in the tissue response to implanted electrodes while generating hypotheses regarding potential biomarkers related to recorded signal quality. A new approach has been developed to understand the tissue response to electrodes implanted in the brain using genes identified through transcriptomics, and to screen those results for potential relationships with functional outcomes.
Affiliation(s)
- Quentin Whitsitt
  - Department of Biomedical Engineering and Institute of Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, United States of America
- Akash Saxena
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, United States of America
- Bella Patel
  - Department of Biomedical Engineering and Institute of Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, United States of America
- Blake M Evans
  - Department of Biomedical Engineering and Institute of Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, United States of America
- Bradley Hunt
  - Department of Biomedical Engineering and Institute of Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, United States of America
- Erin K Purcell
  - Department of Biomedical Engineering and Institute of Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, United States of America
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, United States of America
2
Bresci A, Kobayashi-Kirschvink KJ, Cerullo G, Vanna R, So PTC, Polli D, Kang JW. Label-free morpho-molecular phenotyping of living cancer cells by combined Raman spectroscopy and phase tomography. Commun Biol 2024; 7:785. [PMID: 38951178 PMCID: PMC11217291 DOI: 10.1038/s42003-024-06496-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 06/23/2024] [Indexed: 07/03/2024] Open
Abstract
Accurate, rapid and non-invasive cancer cell phenotyping is a pressing concern across the life sciences, as standard immuno-chemical imaging and omics require extended sample manipulation. Here we combine Raman micro-spectroscopy and phase tomography to achieve label-free morpho-molecular profiling of human colon cancer cells, following the adenoma, carcinoma, and metastasis disease progression, in living and unperturbed conditions. We describe how to decode and interpret quantitative chemical and co-registered morphological cell traits from Raman fingerprint spectra and refractive index tomograms. Our multimodal imaging strategy rapidly distinguishes cancer phenotypes, limiting observations to a low number of pristine cells in culture. This synergistic dataset allows us to study independent or correlated information in spectral and tomographic maps, and how it benefits cell type inference. This method is a valuable asset in biomedical research, particularly when biological material is in short supply, and it holds the potential for non-invasive monitoring of cancer progression in living organisms.
Affiliation(s)
- Arianna Bresci
  - G. R. Harrison Spectroscopy Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Physics, Politecnico di Milano, Milan, 20133, Italy
- Koseki J Kobayashi-Kirschvink
  - G. R. Harrison Spectroscopy Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Giulio Cerullo
  - Department of Physics, Politecnico di Milano, Milan, 20133, Italy
  - CNR-Institute for Photonics and Nanotechnologies (CNR-IFN), Milan, 20133, Italy
- Renzo Vanna
  - CNR-Institute for Photonics and Nanotechnologies (CNR-IFN), Milan, 20133, Italy
- Peter T C So
  - G. R. Harrison Spectroscopy Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Dario Polli
  - Department of Physics, Politecnico di Milano, Milan, 20133, Italy
  - CNR-Institute for Photonics and Nanotechnologies (CNR-IFN), Milan, 20133, Italy
- Jeon Woong Kang
  - G. R. Harrison Spectroscopy Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
3
Nguyen MB, Venet M, Fan CPS, Dragulescu A, Rusin CG, Mertens LL, Mital S, Villemain O. Modeling the Relationship Between Diastolic Phenotype and Outcomes in Pediatric Hypertrophic Cardiomyopathy. J Am Soc Echocardiogr 2024; 37:508-517.e3. [PMID: 38097053 DOI: 10.1016/j.echo.2023.11.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 01/12/2024]
Abstract
BACKGROUND Pediatric hypertrophic cardiomyopathy (HCM) is associated with adverse events. The contribution of diastolic dysfunction to adverse events is poorly understood. The aim of this study was to explore the association between diastolic phenotype and outcomes in pediatric patients with HCM. METHODS Children <18 years of age diagnosed with HCM were included. Diastolic function parameters were measured from the first echocardiogram at the time of diagnosis, including Doppler flow velocities, tissue Doppler velocities, and left atrial volume and function. Using principal-component analysis, key features in echocardiographic parameters were identified. The principal components were regressed to freedom from major adverse cardiac events (MACE), defined as implantable cardioverter-defibrillator insertion, myectomy, aborted sudden cardiac death, transplantation, need for mechanical circulatory support, and death. RESULTS Variables that estimate left ventricular filling pressures were highly collinear and associated with MACE (hazard ratio, 0.86; 95% CI, 0.75-1.00), though this was no longer significant after controlling for left ventricular thickness and genetic variation. Left atrial size parameters adjusted for body surface area were independently associated with outcomes in the covariate-adjusted model (hazard ratio, 0.69; 95% CI, 0.5-0.94). The covariate-adjusted model had an Akaike information criterion of 213, an adjusted R² value of 0.78, and a concordance index of 0.82 for association with MACE. CONCLUSION Echocardiographic parameters of diastolic dysfunction were associated with MACE in this population study, in combination with the severity of left ventricular hypertrophy and genetic variation. Left atrial size parameters adjusted for body surface area were independently associated with adverse events. Additional study of diastolic function parameters adjusted for patient size could facilitate the prediction of adverse events in pediatric patients with HCM.
Affiliation(s)
- Minh B Nguyen
  - Department of Pediatric Cardiology, Baylor College of Medicine, Houston, Texas
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
- Maelys Venet
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
- Chun-Po Steve Fan
  - Ted Rogers Computational Program, Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada
- Andreea Dragulescu
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
- Craig G Rusin
  - Department of Pediatric Cardiology, Baylor College of Medicine, Houston, Texas
- Luc L Mertens
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
- Seema Mital
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
  - Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada
- Olivier Villemain
  - Division of Cardiology, Department of Paediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
4
Liu H, Yu S. A dimensionality-reduction genomic prediction method without direct inverse of the genomic relationship matrix for large genomic data. Plant Cell Reports 2023; 42:1825-1832. [PMID: 37750948 DOI: 10.1007/s00299-023-03069-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/08/2023] [Indexed: 09/27/2023]
Abstract
KEY MESSAGE A new genomic prediction method (RHPP) was developed by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of the core population, and the preconditioned conjugate gradient (PCG) algorithm. Computational efficiency is becoming a pressing issue in the practical application of genomic prediction because of the large volume of data generated by high-throughput genotyping technology. In this study, we developed a fast genomic prediction method, RHPP, by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of the core population, and the preconditioned conjugate gradient (PCG) algorithm. The simulation results demonstrated similar prediction accuracy between RHPP and GBLUP, and significantly higher computational efficiency of the former as the number of individuals increased. The results on real datasets of both bread wheat and loblolly pine demonstrated that RHPP had similar or better predictive accuracy in most cases compared with GBLUP. In the future, RHPP may be an attractive choice for analyzing large-scale and high-dimensional data.
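As a rough illustration of how a preconditioned conjugate gradient (PCG) iteration can solve a linear system without forming a matrix inverse, a minimal pure-Python sketch follows. It is not the RHPP implementation: the Jacobi (diagonal) preconditioner, the function name `pcg_solve`, and the small 2x2 system are assumptions chosen only for illustration.

```python
def pcg_solve(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (list of lists)
    using conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # assumed preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Hypothetical coefficient matrix and right-hand side (illustrative only).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = pcg_solve(A, b)
print(solution)  # approximately [0.0909, 0.6364], i.e. [1/11, 7/11]
```

Only matrix-vector products are needed, which is why iterative solvers of this kind scale better than explicit inversion as the number of individuals grows.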
Affiliation(s)
- Hailan Liu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Shizhou Yu
  - Molecular Genetics Key Laboratory of China Tobacco, Guizhou Academy of Tobacco Science, Guiyang, 550081, Guizhou, China
5
Pervez MN, Yeo WS, Lin L, Xiong X, Naddeo V, Cai Y. Optimization and prediction of the cotton fabric dyeing process using Taguchi design-integrated machine learning approach. Sci Rep 2023; 13:12363. [PMID: 37524835 PMCID: PMC10390507 DOI: 10.1038/s41598-023-39528-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2023] [Accepted: 07/26/2023] [Indexed: 08/02/2023] Open
Abstract
The typical textile dyeing process involves a wide range of operational parameters, and it has always been difficult to pinpoint which of these parameters matters most for dyeing performance. Consequently, this research combined design of experiments with machine learning prediction models to offer a sustainable and beneficial reactive cotton fabric dyeing process. To be more precise, we built a least square support vector regression (LSSVR) model based on Taguchi's statistical orthogonal design (L27) to predict exhaustion percentage (E%), fixation rate (F%), total fixation efficiency (T%), and color strength (K/S) in the reactive cotton dyeing process. The model's prediction accuracy was assessed using several measures, including root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). Principal component regression (PCR), partial least square regression (PLSR), and fuzzy modelling were among the other regression models used for comparison. Our findings reveal that the LSSVR model greatly outperformed competing models in predicting the E%, F%, T%, and K/S. This is shown by the LSSVR model's much smaller RMSE and MAE values. Overall, it provided the highest R² values, which reached 0.9819.
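The three accuracy measures named in this abstract (RMSE, MAE, and R²) have standard definitions, which the following minimal sketch computes in plain Python. The function name `regression_metrics` and the observed and predicted values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean square error
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot   # undefined when all y_true values are equal
    return rmse, mae, r2


# Hypothetical measured vs. predicted responses (illustrative only).
observed = [60.0, 65.0, 70.0, 75.0]
predicted = [61.0, 64.0, 71.0, 74.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
print(rmse, mae, r2)  # RMSE = 1.0, MAE = 1.0, R2 = 0.968
```

Lower RMSE and MAE and an R² closer to 1 indicate a better fit, which is the sense in which the abstract ranks the competing models.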
Affiliation(s)
- Md Nahid Pervez
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-Based Textile Materials, Wuhan Textile University, Wuhan, 430200, China
  - School of Computing, Huanggang Normal University, Huanggang, 438000, China
  - Sanitary Environmental Engineering Division (SEED), Department of Civil Engineering, University of Salerno, 84084, Fisciano, Italy
- Wan Sieng Yeo
  - Department of Chemical and Energy Engineering, Faculty of Engineering and Science, Curtin University Malaysia, CDT 250, 98009, Miri, Sarawak, Malaysia
- Lina Lin
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-Based Textile Materials, Wuhan Textile University, Wuhan, 430200, China
  - State Key Laboratory of New Textile Materials and Advanced Processing Technologies, Wuhan Textile University, Wuhan, 430073, China
- Xiaorong Xiong
  - School of Computing, Huanggang Normal University, Huanggang, 438000, China
- Vincenzo Naddeo
  - Sanitary Environmental Engineering Division (SEED), Department of Civil Engineering, University of Salerno, 84084, Fisciano, Italy
- Yingjie Cai
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-Based Textile Materials, Wuhan Textile University, Wuhan, 430200, China
6
Alemu A, Batista L, Singh PK, Ceplitis A, Chawade A. Haplotype-tagged SNPs improve genomic prediction accuracy for Fusarium head blight resistance and yield-related traits in wheat. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2023; 136:92. [PMID: 37009920 PMCID: PMC10068637 DOI: 10.1007/s00122-023-04352-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 03/21/2023] [Indexed: 06/19/2023]
Abstract
Linkage disequilibrium (LD)-based haplotyping with subsequent SNP tagging improved the genomic prediction accuracy up to 0.07 and 0.092 for Fusarium head blight resistance and spike width, respectively, across six different models. Genomic prediction is a powerful tool to enhance genetic gain in plant breeding. However, the method is accompanied by various complications leading to low prediction accuracy. One of the major challenges arises from the complex dimensionality of marker data. To overcome this issue, we applied two pre-selection methods for SNP markers, viz. LD-based haplotype-tagging and GWAS-based trait-linked marker identification. Six different models were tested with preselected SNPs to predict the genomic estimated breeding values (GEBVs) of four traits measured in 419 winter wheat genotypes. Ten different sets of haplotype-tagged SNPs were selected by adjusting the level of LD thresholds. In addition, various sets of trait-linked SNPs were identified with different scenarios from the training-test combined and only from the training populations. The BRR and RR-BLUP models developed from haplotype-tagged SNPs had a higher prediction accuracy for FHB and SPW by 0.07 and 0.092, respectively, compared to the corresponding models developed without marker pre-selection. The highest prediction accuracy for SPW and FHB was achieved with tagged SNPs pruned at weak LD thresholds (r² < 0.5), while stringent LD was required for spike length (SPL) and flag leaf area (FLA). Trait-linked SNPs identified only from training populations failed to improve the prediction accuracy of the four studied traits. Pre-selection of SNPs via LD-based haplotype-tagging could play a vital role in optimizing genomic selection and reducing genotyping costs. Furthermore, the method could pave the way for developing low-cost genotyping methods through customized genotyping platforms targeting key SNP markers tagged to essential haplotype blocks.
Affiliation(s)
- Admas Alemu
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Pawan K Singh
  - International Maize and Wheat Improvement Center, Texcoco, Mexico
- Aakash Chawade
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
7
Establishment and Validation of Fourier Transform Infrared Spectroscopy (FT–MIR) Methodology for the Detection of Linoleic Acid in Buffalo Milk. Foods 2023; 12:foods12061199. [PMID: 36981127 PMCID: PMC10048274 DOI: 10.3390/foods12061199] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Revised: 02/28/2023] [Accepted: 03/10/2023] [Indexed: 03/14/2023] Open
Abstract
Buffalo milk is a dairy product that is considered to have a higher nutritional value compared to cow’s milk. Linoleic acid (LA) is an essential fatty acid that is important for human health. This study aimed to investigate and validate the use of Fourier transform mid-infrared spectroscopy (FT-MIR) for the quantification of the linoleic acid in buffalo milk. Three machine learning models were used to predict linoleic acid content, and random forest was employed to select the most important subset of spectra for improved model performance. The validity of the FT-MIR methods was evaluated in accordance with ICH Q2 (R1) guidelines using the accuracy profile method, and the precision, the accuracy, and the limit of quantification were determined. The results showed that Fourier transform infrared spectroscopy is a suitable technique for the analysis of linoleic acid, with a lower limit of quantification of 0.15 mg/mL milk. Our results showed that FT-MIR spectroscopy is a viable method for LA concentration analysis.
8
Pervez MN, Yeo WS, Shafiq F, Jilani MM, Sarwar Z, Riza M, Lin L, Xiong X, Naddeo V, Cai Y. Sustainable fashion: Design of the experiment assisted machine learning for the environmental-friendly resin finishing of cotton fabric. Heliyon 2023; 9:e12883. [PMID: 36691543 PMCID: PMC9860286 DOI: 10.1016/j.heliyon.2023.e12883] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/10/2022] [Revised: 01/06/2023] [Accepted: 01/06/2023] [Indexed: 01/11/2023] Open
Abstract
Given the carcinogenic properties of formaldehyde-based chemicals, an alternative method for resin-finishing cotton textiles is urgently needed. Therefore, the primary objective of this study is to introduce a sustainable resin-finishing process for cotton fabric via an industrial procedure. For this purpose, a Bluesign®-approved, formaldehyde-free Knittex RCT® resin was used, and the process parameters were designed and optimized according to the Taguchi L27 method. XRD analysis confirmed the crosslinking formation between the resin and neighboring molecules of the cotton fabric, as no change was observed in the cellulose crystallization phase. Several machine learning models were built in sequence to predict the crease recovery angle (CRA), tearing strength (TE), and whiteness index (WI). Model performance was evaluated using various metrics such as root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). Results were compared to those from other regression models, such as principal component regression (PCR), partial least squares regression (PLSR), and fuzzy modelling. Based on the results of our research, the LSSVR model predicted the CRA, TE, and WI with substantially more accuracy than other models, as shown by its significantly lower RMSE and MAE values. In addition, it offered the highest R² values, reaching up to 0.9627.
Affiliation(s)
- Md Nahid Pervez
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-based Textile Materials, Wuhan Textile University, Wuhan 430200, China
  - School of Computing, Huanggang Normal University, Huanggang 438000, China
  - Sanitary Environmental Engineering Division (SEED), Department of Civil Engineering, University of Salerno, Fisciano 84084, Italy
- Wan Sieng Yeo
  - Department of Chemical and Energy Engineering, Faculty of Engineering and Science, Curtin University Malaysia, CDT 250, 98009 Miri, Sarawak, Malaysia
- Faizan Shafiq
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-based Textile Materials, Wuhan Textile University, Wuhan 430200, China
- Muhammad Munib Jilani
  - Department of Textile Processing, National Textile University, Faisalabad, Punjab 37610, Pakistan
- Zahid Sarwar
  - School of Engineering and Technology, National Textile University, Faisalabad, Punjab 37610, Pakistan
- Mumtahina Riza
  - Department of Applied Ecology, North Carolina State University, Campus Box 7617 Raleigh, NC 27695-7617, USA
- Lina Lin (corresponding author)
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-based Textile Materials, Wuhan Textile University, Wuhan 430200, China
- Xiaorong Xiong (corresponding author)
  - School of Computing, Huanggang Normal University, Huanggang 438000, China
- Vincenzo Naddeo (corresponding author)
  - Sanitary Environmental Engineering Division (SEED), Department of Civil Engineering, University of Salerno, Fisciano 84084, Italy
- Yingjie Cai
  - Hubei Provincial Engineering Laboratory for Clean Production and High Value Utilization of Bio-based Textile Materials, Wuhan Textile University, Wuhan 430200, China
9
Comparison of artificial intelligence algorithms and their ranking for the prediction of genetic merit in sheep. Sci Rep 2022; 12:18726. [PMID: 36333409 PMCID: PMC9636184 DOI: 10.1038/s41598-022-23499-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2022] [Accepted: 11/01/2022] [Indexed: 11/06/2022] Open
Abstract
As the amount of data on farms grows, it is important to evaluate the potential of artificial intelligence for making farming predictions. This study was therefore undertaken to evaluate various machine learning (ML) algorithms using 52 years of data for sheep. Data preparation was done before analysis. Breeding values were estimated using Best Linear Unbiased Prediction. Twelve ML algorithms were evaluated for their ability to predict the breeding values. The variance inflation factor for all features selected through principal component analysis (PCA) was 1. The correlation coefficients between true and predicted values for artificial neural networks, Bayesian ridge regression, classification and regression trees, gradient boosting algorithm, K nearest neighbours, multivariate adaptive regression splines (MARS) algorithm, polynomial regression, principal component regression (PCR), random forests, support vector machines, and XGBoost algorithm were 0.852, 0.742, 0.869, 0.915, 0.781, 0.746, 0.742, 0.746, 0.917, 0.777, and 0.915, respectively, for breeding value prediction. Random forests had the highest correlation coefficient. Among the prediction equations generated using OLS, the highest coefficient of determination was 0.569. A total of 12 machine learning models were developed for the prediction of breeding values in sheep in the present study. Machine learning techniques can perform predictions with reasonable accuracy and can thus be viable alternatives to conventional strategies for breeding value prediction.
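The statement that PCA-derived features have a variance inflation factor (VIF) of 1 follows from principal component scores being mutually uncorrelated. The sketch below checks this for the two-feature case, where VIF reduces to 1 / (1 - r²). The data values and function names are hypothetical, and the closed-form 2-D rotation is an assumption made to keep the example dependency-free; the abstract does not say how the study computed its VIFs.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_features(x, y):
    """With exactly two predictors, both share VIF = 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


def pca_scores_2d(x, y):
    """Project centered 2-D data onto its principal axes.
    Uses the closed-form rotation angle for a 2x2 covariance matrix."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xc = [a - mx for a in x]
    yc = [b - my for b in y]
    sxx = sum(a * a for a in xc)
    syy = sum(b * b for b in yc)
    sxy = sum(a * b for a, b in zip(xc, yc))
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # angle of the first axis
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * a + s * b for a, b in zip(xc, yc)]
    pc2 = [-s * a + c * b for a, b in zip(xc, yc)]
    return pc1, pc2


# Hypothetical, strongly collinear features (illustrative only).
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.1, 3.9, 6.2, 7.8, 10.1]

vif_raw = vif_two_features(feature_a, feature_b)  # large: features are collinear
pc1, pc2 = pca_scores_2d(feature_a, feature_b)
vif_pcs = vif_two_features(pc1, pc2)              # close to 1: scores uncorrelated
print(vif_raw, vif_pcs)
```

A VIF of 1 therefore says only that the components carry no redundant linear information among themselves, not that they are good predictors of the breeding values.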
10
Zhou HJ, Li L, Li Y, Li W, Li JJ. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome Biol 2022; 23:210. [PMID: 36221136 PMCID: PMC9552461 DOI: 10.1186/s13059-022-02761-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 04/11/2022] [Accepted: 08/26/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Estimating and accounting for hidden variables is widely practiced as an important step in molecular quantitative trait locus (molecular QTL, henceforth "QTL") analysis for improving the power of QTL identification. However, few benchmark studies have been performed to evaluate the efficacy of the various methods developed for this purpose. RESULTS Here we benchmark popular hidden variable inference methods including surrogate variable analysis (SVA), probabilistic estimation of expression residuals (PEER), and hidden covariates with prior (HCP) against principal component analysis (PCA), a well-established dimension reduction and factor discovery method, via 362 synthetic and 110 real data sets. We show that PCA not only underlies the statistical methodology behind the popular methods but is also orders of magnitude faster, better-performing, and much easier to interpret and use. CONCLUSIONS To help researchers use PCA in their QTL analysis, we provide an R package PCAForQTL along with a detailed guide, both of which are freely available at https://github.com/heatherjzhou/PCAForQTL. We believe that using PCA rather than SVA, PEER, or HCP will substantially improve and simplify hidden variable inference in QTL mapping as well as increase the transparency and reproducibility of QTL research.
Affiliation(s)
- Heather J Zhou
  - Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Lei Li
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Yumei Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Wei Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
11
Bartholomé J, Prakash PT, Cobb JN. Genomic Prediction: Progress and Perspectives for Rice Improvement. Methods Mol Biol 2022; 2467:569-617. [PMID: 35451791 DOI: 10.1007/978-1-0716-2205-6_21] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/14/2023]
Abstract
Genomic prediction can be a powerful tool to achieve greater rates of genetic gain for quantitative traits if thoroughly integrated into a breeding strategy. In rice as in other crops, the interest in genomic prediction is very strong with a number of studies addressing multiple aspects of its use, ranging from the more conceptual to the more practical. In this chapter, we review the literature on rice (Oryza sativa) and summarize important considerations for the integration of genomic prediction in breeding programs. The irrigated breeding program at the International Rice Research Institute is used as a concrete example on which we provide data and R scripts to reproduce the analysis but also to highlight practical challenges regarding the use of predictions. The adage "To someone with a hammer, everything looks like a nail" describes a common psychological pitfall that sometimes plagues the integration and application of new technologies to a discipline. We have designed this chapter to help rice breeders avoid that pitfall and appreciate the benefits and limitations of applying genomic prediction, as it is not always the best approach nor the first step to increasing the rate of genetic gain in every context.
Affiliation(s)
- Jérôme Bartholomé
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Montpellier SupAgro, Montpellier, France
  - Rice Breeding Platform, International Rice Research Institute, Manila, Philippines
12
Ahmar S, Ballesta P, Ali M, Mora-Poblete F. Achievements and Challenges of Genomics-Assisted Breeding in Forest Trees: From Marker-Assisted Selection to Genome Editing. Int J Mol Sci 2021; 22:10583. [PMID: 34638922 PMCID: PMC8508745 DOI: 10.3390/ijms221910583] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/03/2021] [Revised: 09/26/2021] [Accepted: 09/27/2021] [Indexed: 12/23/2022] Open
Abstract
Forest tree breeding efforts have focused mainly on improving traits of economic importance, selecting trees suited to new environments or generating trees that are more resilient to biotic and abiotic stressors. This review describes various methods of forest tree selection assisted by genomics and the main technological challenges and achievements in research at the genomic level. Due to the long rotation time of a forest plantation and the resulting long generation times necessary to complete a breeding cycle, the use of advanced techniques alongside traditional breeding has been necessary, allowing the use of more precise methods for determining the genetic architecture of traits of interest, such as genome-wide association studies (GWASs) and genomic selection (GS). The main factors that determine the accuracy of genomic prediction models are also addressed. In turn, the introduction of genome editing opens the door to new possibilities in forest trees, especially clustered regularly interspaced short palindromic repeats and CRISPR-associated protein 9 (CRISPR/Cas9). It is a highly efficient genome editing technique that has been used to implement targeted changes at specific places in the genome of a forest tree. However, forest trees still lack a transformation method and a sufficient number of genotypes for CRISPR/Cas9. This challenge could be addressed with the emerging GRF-GIF technique combined with speed breeding.
Affiliation(s)
- Sunny Ahmar
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
- Paulina Ballesta
- The National Fund for Scientific and Technological Development, Av. del Agua 3895, Talca 3460000, Chile
- Mohsin Ali
- Department of Forestry and Range Management, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Freddy Mora-Poblete
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
|
13
|
Lin ZM, Chen JF, Xu FT, Liu CM, Chen JS, Wang Y, Zhang C, Huang PT. Principal component regression-based contrast-enhanced ultrasound evaluation system for the management of BI-RADS US 4A breast masses: objective assistance for radiologists. Ultrasound in Medicine & Biology 2021; 47:1737-1746. [PMID: 33838937 DOI: 10.1016/j.ultrasmedbio.2021.02.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/15/2020] [Revised: 02/23/2021] [Accepted: 02/26/2021] [Indexed: 06/12/2023]
Abstract
A portion of detected breast masses might be overrated by using the Breast Imaging-Reporting and Data System ultrasonography (BI-RADS US) lexicon. A principal component regression-based contrast-enhanced ultrasound (PCR-CEUS) evaluation system was built to quantitatively illustrate whether CEUS could help radiologists to differentiate 4A masses. The PCR-CEUS evaluation system, based on principal component analysis (PCA) and logistic regression, was verified by random assignment into training and test sets and shown to reduce the data dimension and avoid collinearity in CEUS variables. This prospective study consecutively collected 238 patients with 238 4A masses confirmed pathologically. All enrolled patients underwent CEUS examination. The diagnostic performance of senior and junior radiologists, PCR-CEUS and combined methods was compared. The PCR-CEUS system had consistent diagnostic performance in both the training and test sets, with an area under the curve (AUC) of 0.831 (0.765-0.897), 0.798 (0.7034-0.892) and 0.854 (0.765-0.943) (all P > 0.05). The AUC of the combined diagnostic model (PCR-CEUS + Senior radiologists) was higher than that of senior radiologists, and the combined model had higher sensitivity (0.875 (0.781-0.969) vs. 0.729 (0.603-0.855)) without compromising specificity. Furthermore, the AUC and specificity of the combined model (PCR-CEUS + Junior radiologists) (0.852 (0.787-0.916)) were higher than those of junior radiologists (0.665 (0.592-0.737) (P < 0.00001)). PCR-CEUS demonstrated good ability in differentiating malignant BI-RADS-US 4A masses and was helpful for both senior and junior radiologists.
Affiliation(s)
- Zi-Mei Lin
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Ji-Fan Chen
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Fang-Ting Xu
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Chun-Mei Liu
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Jian-She Chen
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Yao Wang
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Chao Zhang
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China
- Pin-Tong Huang
- Department of Ultrasound in Medicine, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China; Research Center of Ultrasound in Medicine and Biomedical Engineering, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310009, China.
|
14
|
Farhadi S, Salehi M, Moieni A, Safaie N, Sabet MS. Modeling of paclitaxel biosynthesis elicitation in Corylus avellana cell culture using adaptive neuro-fuzzy inference system-genetic algorithm (ANFIS-GA) and multiple regression methods. PLoS One 2020; 15:e0237478. [PMID: 32853208 PMCID: PMC7451515 DOI: 10.1371/journal.pone.0237478] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/24/2020] [Accepted: 07/27/2020] [Indexed: 01/28/2023] Open
Abstract
Paclitaxel, a microtubule-stabilizing agent, is widely used for the treatment of a vast range of cancers. Corylus avellana cell suspension culture (CSC) is a promising strategy for paclitaxel production. Elicitation of the paclitaxel biosynthesis pathway is a key approach for improving its production in cell culture. However, optimization of this process is time-consuming and costly. Modeling the paclitaxel elicitation process can help predict the optimal conditions for high production in cell culture. The objective of this study was to model and forecast paclitaxel biosynthesis in C. avellana cell culture in response to cell extract (CE), culture filtrate (CF) and cell wall (CW) derived from an endophytic fungus, either individually or in combined treatment with methyl-β-cyclodextrin (MBCD), based on four input variables (concentration levels of fungal elicitors and MBCD, elicitor adding day and CSC harvesting time), using adaptive neuro-fuzzy inference system (ANFIS) and multiple regression methods. The results displayed a higher accuracy of ANFIS models (0.94-0.97) as compared to regression models (0.16-0.54). The close agreement between the predicted and observed values of paclitaxel biosynthesis for both training and testing subsets supports the excellent performance of the developed ANFIS models. Optimization of the developed ANFIS models with a genetic algorithm (GA) showed that optimal MBCD (47.65 mM) and CW (2.77% (v/v)) concentration levels, elicitor adding day (16) and CSC harvesting time (139 h and 41 min after elicitation) can lead to the highest paclitaxel biosynthesis (427.92 μg l⁻¹). The validation experiment showed that the ANFIS-GA method can be a promising tool for selecting the optimal conditions for maximum paclitaxel biosynthesis, as a case study.
Affiliation(s)
- Siamak Farhadi
- Department of Plant Genetics and Breeding, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Mina Salehi
- Department of Plant Genetics and Breeding, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Ahmad Moieni
- Department of Plant Genetics and Breeding, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Naser Safaie
- Department of Plant Pathology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Mohammad Sadegh Sabet
- Department of Plant Genetics and Breeding, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
|
15
|
Salehi M, Moieni A, Safaie N, Farhadi S. Whole fungal elicitors boost paclitaxel biosynthesis induction in Corylus avellana cell culture. PLoS One 2020; 15:e0236191. [PMID: 32673365 PMCID: PMC7365444 DOI: 10.1371/journal.pone.0236191] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/19/2020] [Accepted: 06/30/2020] [Indexed: 12/29/2022] Open
Abstract
Paclitaxel is an effective natural-source chemotherapeutic agent commonly applied to treat a vast range of cancers. In vitro Corylus avellana culture has been reported as a promising and inexpensive system for paclitaxel production. Fungal elicitors have been reported as the most efficient strategy for inducing the biosynthesis of secondary metabolites in plant in vitro culture. In this research, C. avellana cell suspension culture (CSC) was exposed to cell extract (CE) and culture filtrate (CF) derived from Camarosporomyces flavigenus, either individually or in combination, in mid and late log phase. There has been no previous report on the use of whole fungal elicitors (the combined treatment of CE and CF) for the elicitation of secondary metabolite biosynthesis in plant in vitro culture. The combined treatment of CE and CF led to significantly more paclitaxel biosynthesis and secretion than either used individually. Also, multivariate statistical approaches including stepwise regression (SR), ordinary least squares regression (OLSR), principal component regression (PCR) and partial least squares regression (PLSR) were used to model and predict paclitaxel biosynthesis and secretion. Based on variance accounted for (VAF), root mean square error (RMSE), coefficient of determination (R²), mean absolute percentage error (MAPE) and relative percent difference (RPD), it can be concluded that the regression models worked effectively only for modeling and predicting the extracellular paclitaxel portion in C. avellana cell culture.
Affiliation(s)
- Mina Salehi
- Department of Plant Breeding and Biotechnology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Ahmad Moieni
- Department of Plant Breeding and Biotechnology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Naser Safaie
- Department of Plant Pathology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Siamak Farhadi
- Department of Plant Breeding and Biotechnology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
|
16
|
Ballesta P, Bush D, Silva FF, Mora F. Genomic Predictions Using Low-Density SNP Markers, Pedigree and GWAS Information: A Case Study with the Non-Model Species Eucalyptus cladocalyx. Plants (Basel, Switzerland) 2020; 9:E99. [PMID: 31941085 PMCID: PMC7020392 DOI: 10.3390/plants9010099] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/25/2019] [Revised: 12/20/2019] [Accepted: 01/09/2020] [Indexed: 11/16/2022]
Abstract
High-throughput genotyping techniques have enabled large-scale genomic analysis to precisely predict complex traits in many plant species. However, not all species can be well represented in commercial SNP (single nucleotide polymorphism) arrays. In this study, a high-density SNP array (60 K) developed for commercial Eucalyptus was used to genotype a breeding population of Eucalyptus cladocalyx, yielding only ~3.9 K informative SNPs. Traditional Bayesian genomic models were investigated to predict flowering, stem quality and growth traits by considering the following effects: (i) polygenic background and all informative markers (GS model) and (ii) polygenic background, QTL-genotype effects (determined by GWAS) and SNP markers that were not associated with any trait (GSq model). The estimates of pedigree-based heritability and genomic heritability varied from 0.08 to 0.34 and 0.002 to 0.5, respectively, whereas the predictive ability varied from 0.19 (GS) to 0.45 (GSq). The GSq approach outperformed GS models in terms of predictive ability when the proportion of the variance explained by the significant marker-trait associations was higher than that explained by the polygenic background and non-significant markers. This approach can be particularly useful for plant/tree species poorly represented in the high-density SNP arrays developed for economically important species, or when high-density marker panels are not available.
Affiliation(s)
- Paulina Ballesta
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
- David Bush
- CSIRO–Australian Tree Seed Centre, Acton 2601, Australia
- Fabyano Fonseca Silva
- Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Freddy Mora
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
|
17
|
Abstract
Genomic Selection (GS) is a method in plant breeding to predict the genetic value of untested lines based on genome-wide marker data. The method has been widely explored with simulated data and also in real plant breeding programs. However, the optimal strategy and stage for implementation of GS in a plant-breeding program is still uncertain. The accuracy of GS has proven to be affected by the data used in the GS model, including size of the training population, relationships between individuals, marker density, and use of pedigree information. GS is commonly used to predict the additive genetic value of a line, whereas non-additive genetic effects are often disregarded. In this review, we provide background knowledge on genomic prediction models used for GS and a view on important considerations concerning data used in these models. We compare within- and across-breeding cycle strategies for implementation of GS in cereal breeding and possibilities for using GS to select untested lines as parents. We further discuss the difference between estimating additive and non-additive genetic values and its usefulness for selecting either new parents or new candidate varieties.
|
18
|
Genomic Prediction of Growth and Stem Quality Traits in Eucalyptus globulus Labill. at Its Southernmost Distribution Limit in Chile. Forests 2018. [DOI: 10.3390/f9120779] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
The present study was undertaken to examine the ability of different genomic selection (GS) models to predict growth traits (diameter at breast height, tree height and wood volume), stem straightness and branching quality of Eucalyptus globulus Labill. trees using a genome-wide Single Nucleotide Polymorphism (SNP) chip (60 K), in one of the southernmost progeny trials of the species, close to its southern distribution limit in Chile. The GS methods examined were Ridge Regression-BLUP (RRBLUP), Bayes-A, Bayes-B, Bayesian least absolute shrinkage and selection operator (BLASSO), principal component regression (PCR), supervised PCR and a variant of the RRBLUP method that involves the previous selection of predictor variables (RRBLUP-B). RRBLUP-B and supervised PCR models presented the greatest predictive ability (PA), followed by the PCR method, for most of the traits studied. The highest PA was obtained for the branching quality (~0.7). For the growth traits, the maximum values of PA varied from 0.43 to 0.54, while for stem straightness, the maximum value of PA reached 0.62 (supervised PCR). The study population presented a more extended linkage disequilibrium (LD) than other populations of E. globulus previously studied. The genome-wide LD decayed rapidly within 0.76 Mbp (threshold value of r² = 0.1). The average LD on all chromosomes was r² = 0.09. In addition, 0.15% of all pairs of linked SNPs were in complete LD (r² = 1), and 3% had an r² value >0.5. Genomic prediction based on dimensionality reduction and variable selection may be a promising method, considering the early growth of the trees and the low-to-moderate values of heritability found in the traits evaluated. These findings provide new understanding of how to develop novel breeding strategies for tree improvement of E. globulus at its southernmost range limit in Chile, which could represent new opportunities for forest planting that can benefit the local economy.
|