1. Shojaie A, Al Khleifat A, Sarraf P, Al-Chalabi A. Analysis of non-motor symptoms in amyotrophic lateral sclerosis. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:237-241. [PMID: 37981575 DOI: 10.1080/21678421.2023.2280618] [Received: 09/27/2023] [Accepted: 11/01/2023] [Indexed: 11/21/2023]
Abstract
OBJECTIVE: We investigated non-motor symptoms in ALS using sequential questionnaires; here we report the findings of the second questionnaire. METHODS: A social media platform (Twitter, now known as X) was used to publicize the questionnaires. Data were downloaded from SurveyMonkey and analyzed by descriptive statistics, comparison of means, and regression models. RESULTS: There were 182 people with ALS and 57 controls. The most important non-motor symptoms were cold limbs (60.4% of cases, 14% of controls, p = 9.67 × 10⁻¹⁰) and appetite loss (29.7% of cases, 5.3% of controls, p = 1.6 × 10⁻⁴). The weaker limb was most likely to feel cold (p = 9.67 × 10⁻¹⁰), and symptoms were more apparent in the evening and at night. Appetite loss was reported as due to feeling full and the time taken to eat. People with ALS experienced medium-intensity pain, more usually shock-like pain than burning or cold-like pain, although the most prevalent type of pain was non-differentiated. CONCLUSIONS: Non-motor symptoms are an important feature of ALS. Further investigation is needed to understand their physiological basis and whether they represent phenotypic differences useful for subtyping ALS.
Affiliation(s)
- Ali Shojaie
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Ahmad Al Khleifat
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Payam Sarraf
  - Department of Neuromuscular Diseases, Iranian Centre of Neurological Research, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
- Ammar Al-Chalabi
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
  - Department of Neurology, King's College Hospital, London, UK
2. Hudson A, Shojaie A. Statistical inference on qualitative differences in the magnitude of an effect. Stat Med 2024; 43:1419-1440. [PMID: 38305667 PMCID: PMC10947912 DOI: 10.1002/sim.10025] [Accepted: 12/15/2023] [Indexed: 02/03/2024]
Abstract
Qualitative interactions occur when a treatment effect or measure of association varies in sign by sub-population. Of particular interest in many biomedical settings are absence/presence qualitative interactions, which occur when an effect is present in one sub-population but absent in another. Absence/presence interactions arise in emerging applications in precision medicine, where the objective is to identify a set of predictive biomarkers that have prognostic value for clinical outcomes in some sub-population but not others. They also arise naturally in gene regulatory network inference, where the goal is to identify differences in networks corresponding to diseased and healthy individuals, or to different subtypes of disease; such differences lead to identification of network-based biomarkers for diseases. In this paper, we argue that while the absence/presence hypothesis is important, developing a statistical test for this hypothesis is an intractable problem. To overcome this challenge, we approximate the problem in a novel inference framework. In particular, we propose to make inferences about absence/presence interactions by quantifying the relative difference in effect size, reasoning that when the relative difference is large, an absence/presence interaction occurs. The proposed methodology is illustrated through a simulation study as well as an analysis of breast cancer data from the Cancer Genome Atlas.
Affiliation(s)
- Aaron Hudson
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Washington, United States
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Washington, United States
3. Xia L, Hantrakun V, Teparrukkul P, Wongsuvan G, Kaewarpai T, Dulsuk A, Day NPJ, Lemaitre RN, Chantratita N, Limmathurotsakul D, Shojaie A, Gharib SA, West TE. Plasma Metabolomics Reveals Distinct Biological and Diagnostic Signatures for Melioidosis. Am J Respir Crit Care Med 2024; 209:288-298. [PMID: 37812796 PMCID: PMC10840774 DOI: 10.1164/rccm.202207-1349oc] [Received: 07/18/2022] [Accepted: 10/09/2023] [Indexed: 10/11/2023] Open
Abstract
Rationale: The global burden of sepsis is greatest in low-resource settings. Melioidosis, infection with the gram-negative bacterium Burkholderia pseudomallei, is a frequent cause of fatal sepsis in endemic tropical regions such as Southeast Asia. Objectives: To investigate whether plasma metabolomics would identify biological pathways specific to melioidosis and yield clinically meaningful biomarkers. Methods: Using a comprehensive approach, differential enrichment of plasma metabolites and pathways was systematically evaluated in individuals selected from a prospective cohort of patients hospitalized in rural Thailand with infection. Statistical and bioinformatics methods were used to distinguish metabolomic features and processes specific to patients with melioidosis and between fatal and nonfatal cases. Measurements and Main Results: Metabolomic profiling and pathway enrichment analysis of plasma samples from patients with melioidosis (n = 175) and nonmelioidosis infections (n = 75) revealed a distinct immuno-metabolic state among patients with melioidosis, as suggested by excessive tryptophan catabolism in the kynurenine pathway and significantly increased levels of sphingomyelins and ceramide species. We derived a 12-metabolite classifier to distinguish melioidosis from other infections, yielding an area under the receiver operating characteristic curve of 0.87 in a second validation set of patients. Melioidosis nonsurvivors (n = 94) had a significantly disturbed metabolome compared with survivors (n = 81), with increased leucine, isoleucine, and valine metabolism, and elevated circulating free fatty acids and acylcarnitines. A limited eight-metabolite panel showed promise as an early prognosticator of mortality in melioidosis. Conclusions: Melioidosis induces a distinct metabolomic state that can be examined to distinguish underlying pathophysiological mechanisms associated with death. 
A 12-metabolite signature accurately differentiates melioidosis from other infections and may have diagnostic applications.
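The area under the receiver operating characteristic curve quoted in this abstract can be read as the probability that a randomly chosen case scores higher on the classifier than a randomly chosen non-case. A minimal sketch of that rank-based computation follows. The scores and labels are made-up toy values, not study data, and `roc_auc` is a hypothetical helper name, not anything taken from the paper.

```python
def roc_auc(scores, labels):
    """Return the AUC as P(positive score > negative score), ties counted as 1/2.

    scores: classifier outputs (higher means more likely positive)
    labels: 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a sort-based rank computation would be the usual choice for larger data.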
Affiliation(s)
- Lu Xia
  - Department of Biostatistics
- Prapit Teparrukkul
  - Department of Internal Medicine, Sunpasitthiprasong Hospital, Ubon Ratchathani, Thailand
- Adul Dulsuk
  - Department of Microbiology and Immunology
- Nicholas P. J. Day
  - Mahidol Oxford Tropical Medicine Research Unit
  - Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Narisara Chantratita
  - Mahidol Oxford Tropical Medicine Research Unit
  - Department of Microbiology and Immunology
- Direk Limmathurotsakul
  - Mahidol Oxford Tropical Medicine Research Unit
  - Department of Tropical Hygiene, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Sina A. Gharib
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine
- T. Eoin West
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine
  - Department of Global Health, University of Washington, Seattle, Washington
4. Shojaie A, Al Khleifat A, Opie-Martin S, Sarraf P, Al-Chalabi A. Non-motor symptoms in amyotrophic lateral sclerosis. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:61-66. [PMID: 37798838 DOI: 10.1080/21678421.2023.2263868] [Received: 06/15/2023] [Accepted: 09/19/2023] [Indexed: 10/07/2023]
Abstract
OBJECTIVE: While motor symptoms are well-known in ALS, non-motor symptoms are often under-reported and may have a significant impact on quality of life. In this study, we aimed to examine the nature and extent of non-motor symptoms in ALS. METHODS: A 20-item questionnaire was developed covering the domains of autonomic function, sleep, pain, gastrointestinal disturbance, and emotional lability. It was posted online and shared on social media platforms to target people with ALS and controls. RESULTS: A total of 1018 responses were received, of which 927 were complete: 506 from people with ALS and 421 from unaffected individuals. Cold limbs (p = 1.66 × 10⁻³⁶), painful limbs (p = 6.14 × 10⁻²⁸), and urinary urgency (p = 4.70 × 10⁻²³) were associated with ALS. People with ALS were more likely to report autonomic symptoms, pain, and psychiatric symptoms than controls (autonomic domain B = 0.043, p = 6.10 × 10⁻⁵; pain domain B = 0.18, p = 3.72 × 10⁻¹¹; psychiatric domain B = 0.173, p = 1.32 × 10⁻⁴). CONCLUSIONS: Non-motor symptoms in ALS are common. The identification and management of non-motor symptoms should be integrated into routine clinical care for people with ALS. Further research is warranted to investigate the relationship between non-motor symptoms and disease progression, as well as to develop targeted interventions to improve the quality of life for people with ALS.
Affiliation(s)
- Ali Shojaie
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Ahmad Al Khleifat
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Sarah Opie-Martin
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Payam Sarraf
  - Department of Neuromuscular Diseases, Iranian Centre of Neurological Research, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
- Ammar Al-Chalabi
  - Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
  - Department of Neurology, King's College Hospital, London, UK
5. Hampe CS, Shojaie A, Brooks-Worrell B, Dibay S, Utzschneider K, Kahn SE, Larkin ME, Johnson ML, Younes N, Rasouli N, Desouza C, Cohen RM, Park JY, Florez HJ, Valencia WM, Palmer JP, Balasubramanyam A. GAD65Abs Are Not Associated With Beta-Cell Dysfunction in Patients With T2D in the GRADE Study. J Endocr Soc 2024; 8:bvad179. [PMID: 38333889 PMCID: PMC10853002 DOI: 10.1210/jendso/bvad179] [Received: 09/27/2023] [Indexed: 02/10/2024] Open
Abstract
Context: Autoantibodies directed against the 65-kilodalton isoform of glutamic acid decarboxylase (GAD65Abs) are markers of autoimmune type 1 diabetes (T1D) but are also present in patients with Latent Autoimmune Diabetes of Adults and autoimmune neuromuscular diseases, and in healthy individuals. Phenotypic differences between these conditions are reflected in epitope-specific GAD65Abs and anti-idiotypic antibodies (anti-Id) against GAD65Abs. We previously reported that 7.8% of T2D patients in the GRADE study have GAD65Abs but found that GAD65Ab positivity was not correlated with beta-cell function, glycated hemoglobin (HbA1c), or fasting glucose levels. Objective: In this study, we aimed to better characterize islet autoantibodies in this T2D cohort. This is an ancillary study to NCT01794143. Methods: We stringently defined GAD65Ab positivity with a competition assay, analyzed GAD65Ab-specific epitopes, and measured GAD65Ab-specific anti-Id in serum. Results: Competition assays confirmed that 5.9% of the patients were GAD65Ab positive, but beta-cell function was not associated with GAD65Ab positivity, GAD65Ab epitope specificity, or GAD65Ab-specific anti-Id. GAD65-related autoantibody responses in GRADE T2D patients resemble profiles in healthy individuals (low GAD65Ab titers, presence of a single autoantibody, lack of a distinct epitope pattern, and presence of anti-Id to diabetes-associated GAD65Ab). In this T2D cohort, GAD65Ab positivity is likely unrelated to the pathogenesis of beta-cell dysfunction. Conclusion: Evidence for islet autoimmunity in the pathophysiology of T2D beta-cell dysfunction is growing, but T1D-associated autoantibodies may not accurately reflect the nature of the autoimmune process in T2D.
Affiliation(s)
- Ali Shojaie
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
- Barbara Brooks-Worrell
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
  - Department of Medicine, VA Puget Sound Health Care System, Seattle, WA 98108, USA
- Sepideh Dibay
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
- Kristina Utzschneider
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
  - Department of Medicine, VA Puget Sound Health Care System, Seattle, WA 98108, USA
- Steven E Kahn
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
  - Department of Medicine, VA Puget Sound Health Care System, Seattle, WA 98108, USA
- Mary E Larkin
  - Massachusetts General Hospital Diabetes Center, Harvard Medical School, Boston, MA 02114, USA
- Mary L Johnson
  - International Diabetes Center, Minneapolis, MN 55416, USA
- Naji Younes
  - The Biostatistics Center, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Rockville, MD 20852, USA
- Neda Rasouli
  - Department of Medicine, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Cyrus Desouza
  - Division of Diabetes, Endocrinology and Metabolism, University of Nebraska and Omaha VA Medical Center, Omaha, NE 68198, USA
- Robert M Cohen
  - Division of Endocrinology, Diabetes and Metabolism, University of Cincinnati and Cincinnati VA Medical Center, Cincinnati, OH 45221, USA
- Hermes J Florez
  - Department of Medicine, University of Miami, Miami, FL 33135, USA
  - Division of Endocrinology, Diabetes and Metabolic Diseases, Medical University of South Carolina, Charleston, SC 29425, USA
- Willy Marcos Valencia
  - Division of Endocrinology, Diabetes and Metabolic Diseases, Medical University of South Carolina, Charleston, SC 29425, USA
  - Geriatric Research, Education and Clinical Center, Bruce W. Carter Veterans Affairs Medical Center, Miami, FL 33125, USA
  - Robert Stempel Department of Public Health, College of Health and Urban Affairs, Florida International University, Miami, FL 33181, USA
- Jerry P Palmer
  - Department of Biostatistics, Department of Medicine, University of Washington, Seattle, WA 98185, USA
  - Department of Medicine, VA Puget Sound Health Care System, Seattle, WA 98108, USA
- Ashok Balasubramanyam
  - Department of Medicine: Endocrinology, Diabetes and Metabolism, Baylor College of Medicine, Houston, TX 77030, USA
6. Wang Y, Shojaie A, Randolph T, Knight P, Ma J. Generalized matrix decomposition regression: Estimation and inference for two-way structured data. Ann Appl Stat 2023; 17:2944-2969. [PMID: 38149262 PMCID: PMC10751029 DOI: 10.1214/23-aoas1746] [Indexed: 12/28/2023]
Abstract
Motivated by emerging applications in ecology, microbiology, and neuroscience, this paper studies high-dimensional regression with two-way structured data. To estimate the high-dimensional coefficient vector, we propose the generalized matrix decomposition regression (GMDR) to efficiently leverage auxiliary information on row and column structures. GMDR extends the principal component regression (PCR) to two-way structured data, but unlike PCR, GMDR selects the components that are most predictive of the outcome, leading to more accurate prediction. For inference on regression coefficients of individual variables, we propose the generalized matrix decomposition inference (GMDI), a general high-dimensional inferential framework for a large family of estimators that include the proposed GMDR estimator. GMDI provides more flexibility for incorporating relevant auxiliary row and column structures. As a result, GMDI does not require the true regression coefficients to be sparse, but constrains the coordinate system representing the regression coefficients according to the column structure. GMDI also allows dependent and heteroscedastic observations. We study the theoretical properties of GMDI in terms of both the type-I error rate and power and demonstrate the effectiveness of GMDR and GMDI in simulation studies and an application to human microbiome data.
Affiliation(s)
- Yue Wang
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus
- Ali Shojaie
  - Department of Biostatistics, University of Washington
- Jing Ma
  - Public Health Sciences Division, Fred Hutchinson Cancer Center
7. Tin A, Fohner AE, Yang Q, Brody JA, Davies G, Yao J, Liu D, Caro I, Lindbohm JV, Duggan MR, Meirelles O, Harris SE, Gudmundsdottir V, Taylor AM, Henry A, Beiser AS, Shojaie A, Coors A, Fitzpatrick AL, Langenberg C, Satizabal CL, Sitlani CM, Wheeler E, Tucker-Drob EM, Bressler J, Coresh J, Bis JC, Candia J, Jennings LL, Pietzner M, Lathrop M, Lopez OL, Redmond P, Gerszten RE, Rich SS, Heckbert SR, Austin TR, Hughes TM, Tanaka T, Emilsson V, Vasan RS, Guo X, Zhu Y, Tzourio C, Rotter JI, Walker KA, Ferrucci L, Kivimäki M, Breteler MMB, Cox SR, Debette S, Mosley TH, Gudnason VG, Launer LJ, Psaty BM, Seshadri S, Fornage M. Identification of circulating proteins associated with general cognitive function among middle-aged and older adults. Commun Biol 2023; 6:1117. [PMID: 37923804 PMCID: PMC10624811 DOI: 10.1038/s42003-023-05454-1] [Received: 05/09/2023] [Accepted: 10/12/2023] [Indexed: 11/06/2023] Open
Abstract
Identifying circulating proteins associated with cognitive function may point to biomarkers and molecular processes of cognitive impairment. Few studies have investigated the association between circulating proteins and cognitive function. We identify 246 protein measures quantified by the SomaScan assay as associated with cognitive function (p < 4.9E-5, n up to 7289). Of these, 45 were replicated using SomaScan data, and three were replicated using Olink data at Bonferroni-corrected significance. Enrichment analysis linked the proteins associated with general cognitive function to cell signaling pathways and synapse architecture. Mendelian randomization analysis associated higher levels of NECTIN2, a protein mediating viral entry into neuronal cells, with higher Alzheimer's disease (AD) risk (p = 2.5E-26). Levels of 14 other protein measures were implicated as consequences of AD susceptibility (p < 2.0E-4). Proteins implicated as causes or consequences of AD susceptibility may provide new insight into the potential relationship between immunity and AD susceptibility as well as potential therapeutic targets.
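The replication results in this abstract are reported at Bonferroni-corrected significance, which divides the family-wise error rate by the number of tests performed. A minimal sketch of that correction follows, assuming a family-wise level of 0.05 and a small set of made-up p-values (neither is taken from the abstract). The names `bonferroni_threshold` and `bonferroni_significant` are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Toy p-values (illustrative only, not from the study).
toy_p_values = [0.001, 0.02, 0.04, 0.0001]
flags = bonferroni_significant(toy_p_values)
print(flags)
```

With four tests the per-test threshold is 0.05 / 4 = 0.0125, so only the first and last toy p-values are flagged.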
Grants
- N01 HC095163 NHLBI NIH HHS
- RC2 HL102419 NHLBI NIH HHS
- HHSN268201500003C NHLBI NIH HHS
- UH3 NS100605 NINDS NIH HHS
- R01 HL103612 NHLBI NIH HHS
- 75N92020D00002 NHLBI NIH HHS
- U01 HL096812 NHLBI NIH HHS
- MC_UU_00006/1 Medical Research Council
- UF1 NS125513 NINDS NIH HHS
- 75N92020D00005 NHLBI NIH HHS
- N01AG12100 NIA NIH HHS
- N01HC95160 NHLBI NIH HHS
- R01 AG054076 NIA NIH HHS
- R01 HL120393 NHLBI NIH HHS
- BB/F019394/1 Biotechnology and Biological Sciences Research Council
- RF1 AG059421 NIA NIH HHS
- R01 HL131136 NHLBI NIH HHS
- N01 HC095168 NHLBI NIH HHS
- UL1 RR025005 NCRR NIH HHS
- R01 AG015928 NIA NIH HHS
- HHSN268201800004I NHLBI NIH HHS
- U01 HL080295 NHLBI NIH HHS
- N01HC95163 NHLBI NIH HHS
- N01 AG012100 NIA NIH HHS
- HHSN268201500001C NHLBI NIH HHS
- UL1 TR001079 NCATS NIH HHS
- N01 HC085082 NHLBI NIH HHS
- U01 HL096917 NHLBI NIH HHS
- R01 HL059367 NHLBI NIH HHS
- U01 HL130114 NHLBI NIH HHS
- HHSN268200800007C NHLBI NIH HHS
- R01 HL085251 NHLBI NIH HHS
- N01HC95169 NHLBI NIH HHS
- R01 NS087541 NINDS NIH HHS
- 75N92020D00001 NHLBI NIH HHS
- R01 HL086694 NHLBI NIH HHS
- R01 AG054628 NIA NIH HHS
- U01 HL096902 NHLBI NIH HHS
- R01 HL087652 NHLBI NIH HHS
- N01 HC095162 NHLBI NIH HHS
- U01 HG004402 NHGRI NIH HHS
- N01HC95164 NHLBI NIH HHS
- N01 HC085086 NHLBI NIH HHS
- N01HC55222 NHLBI NIH HHS
- R01 AG049607 NIA NIH HHS
- R01 AG065596 NIA NIH HHS
- N01 HC095165 NHLBI NIH HHS
- N01HC95162 NHLBI NIH HHS
- MR/R024227/1 Medical Research Council
- N01HC85086 NHLBI NIH HHS
- 75N92020D00003 NHLBI NIH HHS
- R01 HL105756 NHLBI NIH HHS
- N01HC95168 NHLBI NIH HHS
- N01 HC095169 NHLBI NIH HHS
- HHSN268201800003I NHLBI NIH HHS
- P30 DK063491 NIDDK NIH HHS
- HHSN268201800007I NHLBI NIH HHS
- HHSN268201700002C NHLBI NIH HHS
- R01 AG066524 NIA NIH HHS
- RF1 AG063507 NIA NIH HHS
- HHSN268201200036C NHLBI NIH HHS
- R01 HL144483 NHLBI NIH HHS
- HHSN268201800001C NHLBI NIH HHS
- HHSN268201700001I NHLBI NIH HHS
- R01 AG056477 NIA NIH HHS
- HHSN268201700004I NHLBI NIH HHS
- N01HC95165 NHLBI NIH HHS
- N01 HC095159 NHLBI NIH HHS
- U01 AG058589 NIA NIH HHS
- N01HC95159 NHLBI NIH HHS
- N01 HC095161 NHLBI NIH HHS
- HHSN268201500001I NHLBI NIH HHS
- HHSN271201200022C NIDA NIH HHS
- N01 HC025195 NHLBI NIH HHS
- N01HC95161 NHLBI NIH HHS
- UL1 TR001420 NCATS NIH HHS
- 75N92020D00004 NHLBI NIH HHS
- U01 HL096814 NHLBI NIH HHS
- P30 AG066509 NIA NIH HHS
- R01 HL132320 NHLBI NIH HHS
- 75N92020D00007 NHLBI NIH HHS
- P30 AG066546 NIA NIH HHS
- R01 AG033040 NIA NIH HHS
- MR/S011676/1 Medical Research Council
- U01 AG052409 NIA NIH HHS
- HHSN268201500003I NHLBI NIH HHS
- K01 AG071689 NIA NIH HHS
- 75N92021D00006 NHLBI NIH HHS
- R01 AG026307 NIA NIH HHS
- R01 AG020098 NIA NIH HHS
- HHSN268201700005C NHLBI NIH HHS
- HHSN268201700001C NHLBI NIH HHS
- N01HC85082 NHLBI NIH HHS
- HHSN268201700003C NHLBI NIH HHS
- N01 HC095166 NHLBI NIH HHS
- N01HC95167 NHLBI NIH HHS
- N01HC85083 NHLBI NIH HHS
- UH2 NS100605 NINDS NIH HHS
- N01HC25195 NHLBI NIH HHS
- 75N92019D00031 NHLBI NIH HHS
- U01 HL096899 NHLBI NIH HHS
- HHSN268201700004C NHLBI NIH HHS
- UL1 TR000040 NCATS NIH HHS
- HHSN268201700002I NHLBI NIH HHS
- HHSN268201700005I NHLBI NIH HHS
- P30 AG072947 NIA NIH HHS
- R01 AG025941 NIA NIH HHS
- Chief Scientist Office
- 75N92020D00006 NHLBI NIH HHS
- N01HC95166 NHLBI NIH HHS
- R01 AG023629 NIA NIH HHS
- R01 HL087641 NHLBI NIH HHS
- N01HC85079 NHLBI NIH HHS
- N01 HC085080 NHLBI NIH HHS
- UL1 TR001881 NCATS NIH HHS
- N01 HC095167 NHLBI NIH HHS
- HHSN268201800005I NHLBI NIH HHS
- N01HC85080 NHLBI NIH HHS
- HHSN268201700003I NHLBI NIH HHS
- HHSN268201800006I NHLBI NIH HHS
- N01 HC095164 NHLBI NIH HHS
- N01HC85081 NHLBI NIH HHS
- N01 HC095160 NHLBI NIH HHS
- The ARIC study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services (contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I and HHSN268201700005I), R01HL087641, R01HL059367 and R01HL086694; National Human Genome Research Institute contract U01HG004402; and National Institutes of Health contract HHSN268200625226C. Funding was also provided by 5RC2HL102419, R01NS087541 and R01HL131136. Neurocognitive data were collected by U01 2U01HL096812, 2U01HL096814, 2U01HL096899, 2U01HL096902, 2U01HL096917 from the NIH (NHLBI, NINDS, NIA and NIDCD). Infrastructure was partly supported by Grant Number UL1RR025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research. This Cardiovascular Health Study (CHS) research was supported by NHLBI contracts HHSN268201200036C, HHSN268200800007C, HHSN268201800001C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, 75N92021D00006; and NHLBI grants U01HL080295, R01HL087652, R01HL105756, R01HL103612, R01HL120393, R01HL085251, R01HL144483, and U01HL130114 with additional contribution from the National Institute of Neurological Disorders and Stroke (NINDS). Additional support was provided through R01AG023629, R01AG15928, and R01AG20098 from the National Institute on Aging (NIA). AEF is supported by K01AG071689. The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University (Contract No. N01-HC-25195, HHSN268201500001I and 75N92019D00031).
This work was also supported by grants R01AG063507, R01AG054076, R01AG049607, R01AG059421, R01AG033040, R01AG066524, P30AG066546, U01 AG052409, U01 AG058589 from the National Institute on Aging and R01 AG017950, UH2/3 NS100605, UF1 NS125513 from the National Institute of Neurological Disorders and Stroke and R01HL132320. AGES has been funded by NIA contracts N01-AG012100 and HHSN271201200022C, NIH Grant No. 1R01AG065596-01A1, Hjartavernd (the Icelandic Heart Association), and the Althingi (the Icelandic Parliament). M. R. Duggan, T. Tanaka, J. Candia, K. A. Walker, L. Ferrucci, L.J. Launer, and O. Meirelles are funded by the National Institute on Aging Intramural Research Program. This study was funded, in part, by the National Institute on Aging Intramural Research Program. The Coronary Artery Risk Development in Young Adults Study (CARDIA) is supported by contracts HHSN268201800003I, HHSN268201800004I, HHSN268201800005I, HHSN268201800006I, and HHSN268201800007I from the National Heart, Lung, and Blood Institute (NHLBI). The LBC1921 was supported by the UK’s Biotechnology and Biological Sciences Research Council (BBSRC), The Royal Society, and The Chief Scientist Office of the Scottish Government. Genotyping was funded by the BBSRC (BB/F019394/1). LBC1936 is supported by the Biotechnology and Biological Sciences Research Council, and the Economic and Social Research Council [BB/W008793/1], Age UK (Disconnected Mind project), and the University of Edinburgh. Genotyping was funded by the BBSRC (BB/F019394/1). The Olink® Neurology Proteomics assay was supported by a National Institutes of Health (NIH) research grant R01AG054628. Phenotype harmonization, data management, sample-identity QC, and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1), and TOPMed MESA Multi-Omics (HHSN2682015000031/HSN26800004).
The Multi-Ethnic Study of Atherosclerosis (MESA) projects are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1TR001881, DK063491, and R01HL105756. The Three City (3C) Study is conducted under a partnership agreement among the Institut National de la Santé et de la Recherche Médicale (INSERM), the University of Bordeaux, and Sanofi-Aventis. The Fondation pour la Recherche Médicale funded the preparation and initiation of the study. The 3C Study is also supported by the Caisse Nationale Maladie des Travailleurs Salariés, Direction Générale de la Santé, Mutuelle Générale de l’Education Nationale (MGEN), Institut de la Longévité, Conseils Régionaux of Aquitaine and Bourgogne, Fondation de France, and Ministry of Research–INSERM Programme “Cohortes et collections de données biologiques.” Ilana Caro received a grant from the EUR digital public health. This PhD program is supported within the framework of the PIA3 (Investment for the future), project reference 17-EURE-0019.
Affiliation(s)
- Adrienne Tin
  - Memory Impairment and Neurodegenerative Dementia (MIND) Center, University of Mississippi Medical Center, Jackson, MS, USA
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Alison E Fohner
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Institute for Public Health Genetics, University of Washington, Seattle, WA, USA
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Qiong Yang
  - Department of Biostatistics, Boston University, Boston, MA, USA
- Jennifer A Brody
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Gail Davies
  - Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh, EH8 9JZ, UK
- Jie Yao
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Dan Liu
  - Population Health Sciences, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Ilana Caro
  - University of Bordeaux, Institut National de la Santé et de la Recherche Médicale (INSERM), Bordeaux Population Health Research Center, UMR 1219, CHU Bordeaux, Bordeaux, France
- Joni V Lindbohm
  - Broad Institute of the Massachusetts Institute of Technology and Harvard University, The Klarman Cell Observatory, Cambridge, MA, USA
  - Clinicum, Department of Public Health, University of Helsinki, Helsinki, Finland
  - Department of Epidemiology and Public Health, University College London, London, UK
- Michael R Duggan
  - Laboratory of Behavioral Neuroscience, National Institute on Aging, Baltimore, MD, USA
- Osorio Meirelles
  - National Institute on Aging, National Institutes of Health, Laboratory of Epidemiology and Population Science, Bethesda, MD, USA
- Sarah E Harris
  - Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh, EH8 9JZ, UK
- Valborg Gudmundsdottir
  - Faculty of Medicine, University of Iceland, Reykjavik, Iceland
  - Icelandic Heart Association, Kopavogur, Iceland
- Adele M Taylor
  - Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh, EH8 9JZ, UK
- Albert Henry
  - Institute of Cardiovascular Science, University of London, London, UK
- Alexa S Beiser
  - Department of Biostatistics, Boston University, Boston, MA, USA
  - Framingham Heart Study, Framingham, MA, USA
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Annabell Coors
  - Population Health Sciences, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Annette L Fitzpatrick
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Departments of Family Medicine, University of Washington, Seattle, WA, USA
- Claudia Langenberg
  - Precision Healthcare Institute, Queen Mary University of London, London, UK
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
  - Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Claudia L Satizabal
  - Framingham Heart Study, Framingham, MA, USA
  - Department of Population Health Sciences and Glenn Biggs Institute for Alzheimer's & Neurodegenerative Diseases, UT Health San Antonio, San Antonio, TX, USA
  - Department of Neurology, Boston University School of Medicine, Boston, MA, USA
- Colleen M Sitlani
  - Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Eleanor Wheeler
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Jan Bressler
  - Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
- Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Julián Candia
- Translational Gerontology Branch, National Institute on Aging, Baltimore, MD, USA
| | - Lori L Jennings
- Novartis Institutes for Biomedical Research, 22 Windsor Street, Cambridge, MA, USA
| | - Maik Pietzner
- Precision Healthcare Institute, Queen Mary University of London, London, UK
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
| | | | - Oscar L Lopez
- Departments of Neurology and Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
| | - Paul Redmond
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh, EH8 9JZ, UK
| | - Robert E Gerszten
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Stephen S Rich
- Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Susan R Heckbert
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Thomas R Austin
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Timothy M Hughes
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Department of Epidemiology and Prevention, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Toshiko Tanaka
- Translational Gerontology Branch, National Institute on Aging, Baltimore, MD, USA
| | - Valur Emilsson
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Icelandic Heart Association, Kopavogur, Iceland
| | - Ramachandran S Vasan
- Framingham Heart Study, Framingham, MA, USA
- University of Texas School of Public Health in San Antonio, San Antonio, TX, USA
- University of Texas Health Sciences Center, San Antonio, TX, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Yineng Zhu
- Department of Biostatistics, Boston University, Boston, MA, USA
| | - Christophe Tzourio
- University of Bordeaux, Institut National de la Santé et de la Recherche Médicale (INSERM), Bordeaux Population Health Research Center, UMR 1219, CHU Bordeaux, Bordeaux, France
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Keenan A Walker
- Laboratory of Behavioral Neuroscience, National Institute on Aging, Baltimore, MD, USA
| | - Luigi Ferrucci
- Translational Gerontology Branch, National Institute on Aging, Baltimore, MD, USA
| | - Mika Kivimäki
- UCL Brain Sciences, University College London, London, UK
- Clinicum, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Monique M B Breteler
- Population Health Sciences, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Institute for Medical Biometry, Informatics and Epidemiology (IMBIE), Faculty of Medicine, University of Bonn, Bonn, Germany
| | - Simon R Cox
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh, EH8 9JZ, UK
| | - Stephanie Debette
- University of Bordeaux, Institut National de la Santé et de la Recherche Médicale (INSERM), Bordeaux Population Health Research Center, UMR 1219, CHU Bordeaux, Bordeaux, France
- Department of Neurology, Institute for Neurodegenerative Diseases, CHU de Bordeaux, Bordeaux, France
| | - Thomas H Mosley
- Memory Impairment and Neurodegenerative Dementia (MIND) Center, University of Mississippi Medical Center, Jackson, MS, USA
| | | | - Lenore J Launer
- Laboratory of Epidemiology and Population Science, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Bruce M Psaty
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
| | - Sudha Seshadri
- Framingham Heart Study, Framingham, MA, USA
- Department of Population Health Sciences and Glenn Biggs Institute for Alzheimer's & Neurodegenerative Diseases, UT Health San Antonio, San Antonio, TX, USA
| | - Myriam Fornage
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
- Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
| |
|
8
|
Pershad Y, Mack T, Poisner H, Jakubek YA, Stilp AM, Mitchell BD, Lewis JP, Boerwinkle E, Loos RJ, Chami N, Wang Z, Barnes K, Pankratz N, Fornage M, Redline S, Psaty BM, Bis JC, Shojaie A, Silverman EK, Cho MH, Yun J, DeMeo D, Levy D, Johnson A, Mathias R, Taub M, Arnett D, North K, Raffield LM, Carson A, Doyle MF, Rich SS, Rotter JI, Guo X, Cox N, Roden DM, Franceschini N, Desai P, Reiner A, Auer PL, Scheet P, Jaiswal S, Weinstock JS, Bick AG. Determinants of mosaic chromosomal alteration fitness. medRxiv 2023:2023.10.20.23297280. [PMID: 37905118 PMCID: PMC10615010 DOI: 10.1101/2023.10.20.23297280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Clonal hematopoiesis (CH) is characterized by the acquisition of a somatic mutation in a hematopoietic stem cell that results in a clonal expansion. These driver mutations can be single nucleotide variants in cancer driver genes or larger structural rearrangements called mosaic chromosomal alterations (mCAs). The factors that influence the variations in mCA fitness and ultimately result in different clonal expansion rates are not well understood. We used the Passenger-Approximated Clonal Expansion Rate (PACER) method to estimate clonal expansion rate for 6,381 individuals in the NHLBI TOPMed cohort with gain, loss, and copy-neutral loss of heterozygosity mCAs. Our estimates of mCA fitness were correlated (R² = 0.49) with an alternative approach that estimated fitness of mCAs in the UK Biobank using a theoretical probability distribution. Individuals with lymphoid-associated mCAs had a significantly higher white blood cell count and faster clonal expansion rate. In a cross-sectional analysis, genome-wide association study of estimates of mCA expansion rate identified TCL1A, NRIP1, and TERT locus variants as modulators of mCA clonal expansion rate.
|
9
|
Li T, Ferraro N, Strober BJ, Aguet F, Kasela S, Arvanitis M, Ni B, Wiel L, Hershberg E, Ardlie K, Arking DE, Beer RL, Brody J, Blackwell TW, Clish C, Gabriel S, Gerszten R, Guo X, Gupta N, Johnson WC, Lappalainen T, Lin HJ, Liu Y, Nickerson DA, Papanicolaou G, Pritchard JK, Qasba P, Shojaie A, Smith J, Sotoodehnia N, Taylor KD, Tracy RP, Van Den Berg D, Wheeler MT, Rich SS, Rotter JI, Battle A, Montgomery SB. The functional impact of rare variation across the regulatory cascade. Cell Genom 2023; 3:100401. [PMID: 37868038 PMCID: PMC10589633 DOI: 10.1016/j.xgen.2023.100401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 03/08/2023] [Accepted: 08/10/2023] [Indexed: 10/24/2023]
Abstract
Each human genome has tens of thousands of rare genetic variants; however, identifying impactful rare variants remains a major challenge. We demonstrate how use of personal multi-omics can enable identification of impactful rare variants by using the Multi-Ethnic Study of Atherosclerosis, which included several hundred individuals, with whole-genome sequencing, transcriptomes, methylomes, and proteomes collected across two time points, 10 years apart. We evaluated each multi-omics phenotype's ability to separately and jointly inform functional rare variation. By combining expression and protein data, we observed rare stop variants 62 times and rare frameshift variants 216 times as frequently as controls, compared to 13-27 times as frequently for expression or protein effects alone. We extended a Bayesian hierarchical model, "Watershed," to prioritize specific rare variants underlying multi-omics signals across the regulatory cascade. With this approach, we identified rare variants that exhibited large effect sizes on multiple complex traits including height, schizophrenia, and Alzheimer's disease.
Affiliation(s)
- Taibo Li
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Nicole Ferraro
- Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA
| | - Benjamin J. Strober
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Harvard School of Public Health, Epidemiology Department, Boston, MA, USA
| | | | - Silva Kasela
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Marios Arvanitis
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Medicine, Division of Cardiology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Bohan Ni
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Laurens Wiel
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | | | | | - Dan E. Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Rebecca L. Beer
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jennifer Brody
- Cardiovascular Health Research Unit, Departments of Medicine and Epidemiology, University of Washington, Seattle, WA, USA
| | - Thomas W. Blackwell
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Clary Clish
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Robert Gerszten
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Namrata Gupta
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - W. Craig Johnson
- Collaborative Health Studies Coordinating Center, University of Washington, Seattle, WA, USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Henry J. Lin
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Yongmei Liu
- Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | | | - George Papanicolaou
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Pankaj Qasba
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington School of Public Health, Seattle, WA, USA
| | - Josh Smith
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Nona Sotoodehnia
- Cardiovascular Health Research Unit, Departments of Medicine and Epidemiology, University of Washington, Seattle, WA, USA
| | - Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Russell P. Tracy
- Laboratory for Clinical Biochemistry Research, University of Vermont, Burlington, VT, USA
| | - David Van Den Berg
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Matthew T. Wheeler
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Stephen S. Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
| | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Malone Center for Engineering of Healthcare, Johns Hopkins University, Baltimore, MD, USA
| | - Stephen B. Montgomery
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Pathology, Stanford University, Stanford, CA, USA
| |
|
10
|
Prater KE, Green KJ, Mamde S, Sun W, Cochoit A, Smith CL, Chiou KL, Heath L, Rose SE, Wiley J, Keene CD, Kwon RY, Snyder-Mackler N, Blue EE, Logsdon B, Young JE, Shojaie A, Garden GA, Jayadev S. Human microglia show unique transcriptional changes in Alzheimer's disease. Nat Aging 2023; 3:894-907. [PMID: 37248328 PMCID: PMC10353942 DOI: 10.1038/s43587-023-00424-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/07/2022] [Accepted: 04/25/2023] [Indexed: 05/31/2023]
Abstract
Microglia, the innate immune cells of the brain, influence Alzheimer's disease (AD) progression and are potential therapeutic targets. However, microglia exhibit diverse functions, the regulation of which is not fully understood, complicating therapeutics development. To better define the transcriptomic phenotypes and gene regulatory networks associated with AD, we enriched for microglia nuclei from 12 AD and 10 control human dorsolateral prefrontal cortices (7 males and 15 females, all aged >60 years) before single-nucleus RNA sequencing. Here we describe both established and previously unrecognized microglial molecular phenotypes, the inferred gene networks driving observed transcriptomic change, and apply trajectory analysis to reveal the putative relationships between microglial phenotypes. We identify microglial phenotypes more prevalent in AD cases compared with controls. Further, we describe the heterogeneity in microglia subclusters expressing homeostatic markers. Our study demonstrates that deep profiling of microglia in human AD brain can provide insight into microglial transcriptional changes associated with AD.
Affiliation(s)
| | - Kevin J Green
- Department of Neurology, University of Washington, Seattle, WA, USA
| | - Sainath Mamde
- Department of Neurology, University of Washington, Seattle, WA, USA
| | - Wei Sun
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | | | - Carole L Smith
- Department of Neurology, University of Washington, Seattle, WA, USA
| | - Kenneth L Chiou
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- School of Life Sciences, Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| | | | - Shannon E Rose
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | | | - C Dirk Keene
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Ronald Y Kwon
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Department of Orthopaedics and Sports Medicine, University of Washington, Seattle, WA, USA
| | - Noah Snyder-Mackler
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- School of Life Sciences, Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, USA
| | - Elizabeth E Blue
- Division of Medical Genetics, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Benjamin Logsdon
- Sage Bionetworks, Seattle, WA, USA
- Cajal Neuroscience, Seattle, WA, USA
| | - Jessica E Young
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Gwenn A Garden
- Department of Neurology, University of North Carolina, Chapel Hill, NC, USA
| | - Suman Jayadev
- Department of Neurology, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Division of Medical Genetics, University of Washington, Seattle, WA, USA
| |
|
11
|
Shojaie A, Rota S, Al Khleifat A, Ray Chaudhuri K, Al-Chalabi A. Non-motor symptoms in amyotrophic lateral sclerosis: lessons from Parkinson's disease. Amyotroph Lateral Scler Frontotemporal Degener 2023:1-10. [PMID: 37349906 DOI: 10.1080/21678421.2023.2220748] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/24/2023]
Abstract
Amyotrophic lateral sclerosis and Parkinson's disease are neurodegenerative diseases of the motor system which are now recognized also to affect non-motor pathways. Non-motor symptoms have been acknowledged as important determinants of quality of life in Parkinson's disease, and there is increasing interest in understanding the extent and role of non-motor symptoms in amyotrophic lateral sclerosis. We therefore reviewed what is known about non-motor symptoms in amyotrophic lateral sclerosis, using lessons from Parkinson's disease.
Affiliation(s)
- Ali Shojaie
- Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Silvia Rota
- Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Parkinson's Foundation Centre of Excellence, King's College Hospital, London, UK
- Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Ahmad Al Khleifat
- Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - K Ray Chaudhuri
- Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Parkinson's Foundation Centre of Excellence, King's College Hospital, London, UK
- National Institute for Health Research Biomedical Research Centre and Dementia Unit at South London and Maudsley NHS Foundation Trust and King's College London, London, UK, and
- Department of Neurology, King's College Hospital, London, UK
| | - Ammar Al-Chalabi
- Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- National Institute for Health Research Biomedical Research Centre and Dementia Unit at South London and Maudsley NHS Foundation Trust and King's College London, London, UK, and
- Department of Neurology, King's College Hospital, London, UK
| |
|
12
|
Kalani R, Bartz TM, Psaty BM, Elkind MSV, Floyd JS, Gerszten RE, Shojaie A, Heckbert SR, Bis JC, Austin TR, Tirschwell DL, Delaney JAC, Longstreth WT. Plasma Proteomic Associations With Incident Ischemic Stroke in Older Adults: The Cardiovascular Health Study. Neurology 2023; 100:e2182-e2190. [PMID: 37015819 DOI: 10.1212/wnl.0000000000207242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Accepted: 02/16/2023] [Indexed: 04/06/2023]
Abstract
BACKGROUND Plasma proteomics may elucidate novel insights into the pathophysiology of ischemic stroke (IS), identify biomarkers of IS risk, and guide development of nascent prevention strategies. We evaluated the relationship between the plasma proteome and IS risk in the population-based Cardiovascular Health Study (CHS). METHODS Eligible CHS participants were free of prevalent stroke and underwent quantification of 1298 plasma proteins using the aptamer-based SOMAScan assay platform from the 1992-1993 study visit. Multivariable Cox proportional hazards regression was used to evaluate associations between a 1-standard deviation increase in the log-2 transformed estimated plasma protein concentrations and incident IS, adjusting for demographics, IS risk factors, and estimated glomerular filtration rate. For proteins independently associated with incident IS, a secondary stratified analysis evaluated associations in subgroups defined by sex and race. Exploratory analyses evaluated plasma proteomic associations with cardioembolic and non-cardioembolic IS as well as proteins associated with IS risk in participants with left atrial dysfunction but without atrial fibrillation. RESULTS Of 2983 eligible participants, the mean age was 74.3 (± 4.8) years, 61.2% were women, and 15.4% were Black. Over a median follow-up of 12.6 years, 450 participants experienced an incident IS. N-terminal pro-brain natriuretic peptide (NTproBNP, adjusted HR 1.37, 95% CI 1.23-1.53, P = 2.08 × 10⁻⁸) and macrophage metalloelastase (MMP12, adjusted HR 1.30, 95% CI 1.16-1.45, P = 4.55 × 10⁻⁶) were independently associated with IS risk. These two associations were similar in men and women and in Black and non-Black participants. In exploratory analyses, NTproBNP was independently associated with incident cardioembolic IS, E-selectin with incident non-cardioembolic IS, and secreted frizzled-related protein 1 with IS risk in participants with left atrial dysfunction.
CONCLUSIONS In a cohort of older adults, NTproBNP and MMP12 were independently associated with IS risk. We identified plasma proteomic determinants of incident cardioembolic and non-cardioembolic IS and found a novel protein associated with IS risk in those with left atrial dysfunction.
Affiliation(s)
- Rizwan Kalani
- Department of Neurology, University of Washington, Seattle, WA, USA
| | - Traci M Bartz
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Services, University of Washington, Seattle, WA, USA
| | - Mitchell S V Elkind
- Department of Neurology, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - James S Floyd
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Robert E Gerszten
- Division of Cardiovascular Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Susan R Heckbert
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Thomas R Austin
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | | | - Joseph A C Delaney
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- College of Pharmacy, University of Manitoba, Winnipeg, Manitoba, Canada
| | - W T Longstreth
- Department of Neurology, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| |
|
13
|
Stanaway IB, Wallace JC, Hong S, Wilder CS, Green FH, Tsai J, Knight M, Workman T, Vigoren EM, Smith MN, Griffith WC, Thompson B, Shojaie A, Faustman EM. Alteration of oral microbiome composition in children living with pesticide-exposed farm workers. Int J Hyg Environ Health 2023; 248:114090. [PMID: 36516690 PMCID: PMC9898171 DOI: 10.1016/j.ijheh.2022.114090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 08/30/2022] [Accepted: 11/30/2022] [Indexed: 12/14/2022]
Abstract
Our prior work shows that azinphos-methyl pesticide exposure is associated with altered oral microbiomes in exposed farmworkers. Here we extend this analysis to show the same association pattern is also evident in their children. Oral buccal swab samples were analyzed at two time points, the apple thinning season in spring-summer 2005 for 78 children and 101 adults and the non-spray season in winter 2006 for 62 children and 82 adults. The pesticide exposure for the children was defined by the farmworker occupation of the cohabitating household adult and the blood azinphos-methyl detection of the cohabitating adult. Oral buccal swab 16S rRNA sequencing determined taxonomic microbiota proportional composition from concurrent samples from both adults and children. Analysis of the identified bacteria showed significant proportional changes for 12 of 23 common oral microbiome genera in association with azinphos-methyl detection and farmworker occupation. The most common significantly altered genera had reductions in the abundance of Streptococcus, suggesting an anti-microbial effect of the pesticide. Principal component analysis of the microbiome identified two primary clusters, with association of principal component 1 to azinphos-methyl blood detection and farmworker occupational status of the household. The children's buccal microbiota composition clustered with their household adult in ∼95% of the households. Household adult farmworker occupation and household pesticide exposure are associated with significant alterations in their children's oral microbiome composition. This suggests that parental occupational exposure and pesticide take-home exposure pathways elicit alteration of their children's microbiomes.
Affiliation(s)
- Ian B Stanaway
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - James C Wallace
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Sungwoo Hong
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Carly S Wilder
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Foad H Green
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Jesse Tsai
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Misty Knight
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Tomomi Workman
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Eric M Vigoren
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Marissa N Smith
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - William C Griffith
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA
| | - Beti Thompson
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Elaine M Faustman
- Department of Environmental and Occupational Health Sciences, Institute for Risk Analysis and Risk Communication, University of Washington, Seattle, WA, USA.
| |
Collapse
|
14
|
Pi H, Xia L, Ralph DD, Rayner SG, Shojaie A, Leary PJ, Gharib SA. Metabolomic Signatures Associated With Pulmonary Arterial Hypertension Outcomes. Circ Res 2023; 132:254-266. [PMID: 36597887 PMCID: PMC9904878 DOI: 10.1161/circresaha.122.321923] [Received: 08/29/2022] [Accepted: 12/20/2022] [Indexed: 01/05/2023]
Abstract
BACKGROUND Pulmonary arterial hypertension (PAH) is a complex disease characterized by progressive right ventricular (RV) failure leading to significant morbidity and mortality. Investigating metabolic features and pathways associated with RV dilation, mortality, and measures of disease severity can provide insight into molecular mechanisms, identify subphenotypes, and suggest potential therapeutic targets. METHODS We collected data from a prospective cohort of PAH participants and performed untargeted metabolomic profiling on 1045 metabolites from circulating blood. Analyses were intended to identify metabolomic differences across a range of common metrics in PAH (eg, dilated versus nondilated RV). Partial least squares discriminant analysis was first applied to assess the distinguishability of relevant outcomes. Significantly altered metabolites were then identified using linear regression, and Cox regression models (as appropriate for the specific outcome) with adjustments for age, sex, body mass index, and PAH cause. Models exploring RV maladaptation were further adjusted for pulmonary vascular resistance. Pathway enrichment analysis was performed to identify significantly dysregulated processes. RESULTS A total of 117 participants with PAH were included. Partial least squares discriminant analysis showed cluster differentiation between participants with dilated versus nondilated RVs, survivors versus nonsurvivors, and across a range of NT-proBNP (N-terminal pro-B-type natriuretic peptide) levels, REVEAL 2.0 composite scores, and 6-minute-walk distances. Polyamine and histidine pathways were associated with differences in RV dilation, mortality, NT-proBNP, REVEAL score, and 6-minute walk distance. Acylcarnitine pathways were associated with NT-proBNP, REVEAL score, and 6-minute walk distance. Sphingomyelin pathways were associated with RV dilation and NT-proBNP after adjustment for pulmonary vascular resistance. 
CONCLUSIONS Distinct plasma metabolomic profiles are associated with RV dilation, mortality, and measures of disease severity in PAH. Polyamine, histidine, and sphingomyelin metabolic pathways represent promising candidates for identifying patients at high risk for poor outcomes and for investigation into their roles as markers or mediators of disease progression and RV adaptation.
Affiliation(s)
- Hongyang Pi
- University of Washington, Department of Medicine
- Lu Xia
- University of Washington, Department of Biostatistics
- Ali Shojaie
- University of Washington, Department of Biostatistics
- Peter J. Leary
- University of Washington, Department of Medicine
- University of Washington, Department of Epidemiology
|
15
|
Prater KE, Green KJ, Sun W, Smith CL, Chiou KL, Mamde S, Heath LM, Rose S, Keene CD, Kwon RY, Snyder-Mackler N, Blue EE, Young JE, Logsdon BA, Shojaie A, Garden GA, Jayadev S. Transcriptomic profiling of myeloid cells in Alzheimer's Disease brain illustrates heterogeneity of microglia endolysosomal subtypes. Alzheimers Dement 2022. [DOI: 10.1002/alz.062391] [Indexed: 12/24/2022]
Affiliation(s)
- Wei Sun
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
|
16
|
Zhao S, Shojaie A. NETWORK DIFFERENTIAL CONNECTIVITY ANALYSIS. Ann Appl Stat 2022; 16:2166-2182. [PMID: 37842097 PMCID: PMC10569671 DOI: 10.1214/21-aoas1581] [Indexed: 10/17/2023]
Abstract
Identifying differences in networks has become a canonical problem in many biological applications. Existing methods try to accomplish this goal by either directly comparing the estimated structures of two networks, or testing the null hypothesis that the covariance or inverse covariance matrices in two populations are identical. However, estimation approaches do not provide measures of uncertainty, e.g., p-values, whereas existing testing approaches could lead to misleading results, as we illustrate in this paper. To address these shortcomings, we propose a qualitative hypothesis testing framework, which tests whether the connectivity structures in the two networks are the same. Our framework is especially appropriate if the goal is to identify nodes or edges that are differentially connected. No existing approach could test such hypotheses and provide corresponding measures of uncertainty. Theoretically, we show that under appropriate conditions, our proposal correctly controls the type-I error rate in testing the qualitative hypothesis. Empirically, we demonstrate the performance of our proposal using simulation studies and applications in cancer genomics.
Affiliation(s)
- Ali Shojaie
- Department of Biostatistics, University of Washington
|
17
|
Abstract
While most classical approaches to Granger causality detection assume linear dynamics, many interactions in real-world applications, like neuroscience and genomics, are inherently nonlinear. In these cases, using linear models may lead to inconsistent estimation of Granger causal interactions. We propose a class of nonlinear methods by applying structured multilayer perceptrons (MLPs) or recurrent neural networks (RNNs) combined with sparsity-inducing penalties on the weights. By encouraging specific sets of weights to be zero (in particular, through the use of convex group-lasso penalties), we can extract the Granger causal structure. To further contrast with traditional approaches, our framework naturally enables us to efficiently capture long-range dependencies between series either via our RNNs or through an automatic lag selection in the MLP. We show that our neural Granger causality methods outperform state-of-the-art nonlinear Granger causality methods on the DREAM3 challenge data. These data consist of nonlinear gene expression and regulation time courses with only a limited number of time points. The successes we show in this challenging dataset provide a powerful example of how deep learning can be useful in cases that go beyond prediction on large datasets. We likewise illustrate our methods in detecting nonlinear interactions in a human motion capture dataset.
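The role of the group-lasso penalty described in this abstract can be illustrated with a small sketch. The abstract does not say how the penalized objective is optimized; the block below assumes a proximal (group soft-thresholding) step, which is one common way to handle group-lasso penalties, and it assumes that each input series owns one group of first-layer weights. The function names and the toy weights are hypothetical.

```python
import math


def group_soft_threshold(weights, lam):
    """Proximal operator of lam * ||w||_2 applied to one weight group.

    The whole group is shrunk toward zero together and is set exactly
    to zero when its Euclidean norm does not exceed lam.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= lam:
        return [0.0] * len(weights)
    scale = 1.0 - lam / norm
    return [scale * w for w in weights]


def granger_causal_inputs(first_layer_groups):
    """Return the input series whose weight group is not entirely zero.

    Assumption for this sketch: a series is read as Granger causal for
    the modeled output exactly when its input-weight group is nonzero.
    """
    return [name for name, group in first_layer_groups.items()
            if any(w != 0.0 for w in group)]


# Hypothetical first-layer weight groups, one per input series.
groups = {"series_a": [3.0, 4.0], "series_b": [0.3, 0.4]}
shrunk = {name: group_soft_threshold(w, lam=1.0) for name, w in groups.items()}
causal = granger_causal_inputs(shrunk)
```

In this toy case the group for `series_b` has norm 0.5, which is below the threshold, so it is zeroed out and only `series_a` remains as a candidate Granger cause.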
|
18
|
Austin TR, McHugh CP, Brody JA, Bis JC, Sitlani CM, Bartz TM, Biggs ML, Bansal N, Buzkova P, Carr SA, deFilippi CR, Elkind MSV, Fink HA, Floyd JS, Fohner AE, Gerszten RE, Heckbert SR, Katz DH, Kizer JR, Lemaitre RN, Longstreth WT, McKnight B, Mei H, Mukamal KJ, Newman AB, Ngo D, Odden MC, Vasan RS, Shojaie A, Simon N, Smith GD, Davies NM, Siscovick DS, Sotoodehnia N, Tracy RP, Wiggins KL, Zheng J, Psaty BM. Proteomics and Population Biology in the Cardiovascular Health Study (CHS): design of a study with mentored access and active data sharing. Eur J Epidemiol 2022; 37:755-765. [PMID: 35790642 PMCID: PMC9255954 DOI: 10.1007/s10654-022-00888-z] [Received: 10/22/2021] [Accepted: 06/03/2022] [Indexed: 11/08/2022]
Abstract
BACKGROUND In the last decade, genomic studies have identified and replicated thousands of genetic associations with measures of health and disease and contributed to the understanding of the etiology of a variety of health conditions. Proteins are key biomarkers in clinical medicine and often drug-therapy targets. Like genomics, proteomics can advance our understanding of biology. METHODS AND RESULTS In the setting of the Cardiovascular Health Study (CHS), a cohort study of older adults, an aptamer-based method that has high sensitivity for low-abundance proteins was used to assay 4979 proteins in frozen, stored plasma from 3188 participants (61% women, mean age 74 years). CHS provides active support, including central analysis, for seven phenotype-specific working groups (WGs). Each CHS WG is led by one or two senior investigators and includes 10 to 20 early or mid-career scientists. In this setting of mentored access, the proteomic data and analytic methods are widely shared with the WGs and investigators so that they may evaluate associations between baseline levels of circulating proteins and the incidence of a variety of health outcomes in prospective cohort analyses. We describe the design of CHS, the CHS Proteomics Study, characteristics of participants, quality control measures, and structural characteristics of the data provided to CHS WGs. We additionally highlight plans for validation and replication of novel proteomic associations. CONCLUSION The CHS Proteomics Study offers an opportunity for collaborative data sharing to improve our understanding of the etiology of a variety of health conditions in older adults.
Affiliation(s)
- Thomas R Austin
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Epidemiology, University of Washington, Seattle, WA, USA
- Jennifer A Brody
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- Joshua C Bis
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- Colleen M Sitlani
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- Traci M Bartz
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
- Mary L Biggs
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
- Nisha Bansal
- Division of Nephrology, University of Washington, Seattle, WA, USA
- Petra Buzkova
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Howard A Fink
- Geriatric Research Education & Clinical Center, Minneapolis VA Healthcare System, Minneapolis, MN, USA
- James S Floyd
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Epidemiology, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- Alison E Fohner
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Epidemiology, University of Washington, Seattle, WA, USA; Institute of Public Health Genetics, University of Washington, Seattle, WA, USA
- Robert E Gerszten
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Susan R Heckbert
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Epidemiology, University of Washington, Seattle, WA, USA
- Daniel H Katz
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Jorge R Kizer
- Cardiology Section, San Francisco VA Health Care System, San Francisco, CA, USA; Department of Biostatistics, University of California San Francisco, San Francisco, CA, USA; Department of Epidemiology, University of California San Francisco, San Francisco, CA, USA; Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- Rozenn N Lemaitre
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- W T Longstreth
- Department of Epidemiology, University of Washington, Seattle, WA, USA; Department of Neurology, University of Washington, Seattle, WA, USA
- Barbara McKnight
- Department of Medicine, University of Washington, Seattle, WA, USA
- Hao Mei
- Department of Data Science, University of Mississippi Medical Center, Jackson, MS, USA
- Anne B Newman
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Debby Ngo
- Beth Israel Deaconess Medical Center, Boston, MA, USA
- Michelle C Odden
- Department of Epidemiology and Population Health, Stanford University, Stanford, CA, USA
- Ramachandran S Vasan
- Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA; Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Noah Simon
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- George Davey Smith
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK
- Neil M Davies
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK; K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, Norwegian, Norway; Bristol Medical School, Population Health Sciences, University of Bristol, Bristol, UK
- Nona Sotoodehnia
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Division of Cardiology, University of Washington, Seattle, WA, USA
- Russell P Tracy
- Departments of Pathology & Laboratory Medicine, and Biochemistry, Larner College of Medicine, University of Vermont, Burlington, VT, USA
- Kerri L Wiggins
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
- Jie Zheng
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK
- Bruce M Psaty
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA; Department of Epidemiology, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA; Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
|
19
|
Abstract
Estimation of density functions supported on general domains arises when the data are naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants but as originally proposed is limited to densities on $\mathbb{R}^m$ and $\mathbb{R}_+^m$. In this paper, we offer a natural generalization of score matching that accommodates densities supported on a very general class of domains. We apply the framework to truncated graphical and pairwise interaction models and provide theoretical guarantees for the resulting estimators. We also generalize a recently proposed method from bounded to unbounded domains and empirically demonstrate the advantages of our method.
Affiliation(s)
- Mathias Drton
- Department of Mathematics, Technical University of Munich, 85748 Garching bei München, Germany
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA
|
20
|
Bloch J, Greaves-Tunnell A, Shea-Brown E, Harchaoui Z, Shojaie A, Yazdan-Shahmorad A. Network structure mediates functional reorganization induced by optogenetic stimulation of non-human primate sensorimotor cortex. iScience 2022; 25:104285. [PMID: 35573193 PMCID: PMC9095749 DOI: 10.1016/j.isci.2022.104285] [Received: 11/21/2021] [Revised: 03/22/2022] [Accepted: 04/19/2022] [Indexed: 11/04/2022]
Abstract
Because aberrant network-level functional connectivity underlies a variety of neural disorders, the ability to induce targeted functional reorganization would be a profound development toward therapies for neural disorders. Brain stimulation has been shown to induce large-scale network-wide functional connectivity changes (FCC), but the mapping from stimulation to the induced changes is unclear. Here, we develop a model which jointly considers the stimulation protocol and the cortical network structure to accurately predict network-wide FCC in response to optogenetic stimulation of non-human primate primary sensorimotor cortex. We observe that the network structure has a much stronger effect than the stimulation protocol on the resulting FCC. We also observe that the mappings from these input features to the FCC diverge over frequency bands and successive stimulations. Our framework represents a paradigm shift for targeted neural stimulation and can be used to interrogate, improve, and develop stimulation-based interventions for neural disorders.
Highlights: Optogenetic stimulation drives connectivity changes over sensorimotor cortex. Nonparametric model informed by protocol and network features predicts the changes. The underlying network is the primary mediator of the induced changes. The mappings governing the changes diverge over time and frequency bands.
|
21
|
Abstract
Introduced more than a half-century ago, Granger causality has become a popular tool for analyzing time series data in many application domains, from economics and finance to genomics and neuroscience. Despite this popularity, the validity of this framework for inferring causal relationships among time series has remained the topic of continuous debate. Moreover, while the original definition was general, limitations in computational tools have constrained the applications of Granger causality to primarily simple bivariate vector autoregressive processes. Starting with a review of early developments and debates, this article discusses recent advances that address various shortcomings of the earlier approaches, from models for high-dimensional time series to more recent developments that account for nonlinear and non-Gaussian observations and allow for subsampled and mixed-frequency time series.
Affiliation(s)
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington 98195-4322, USA
- Emily B Fox
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
|
22
|
Chakraborty S, Shojaie A. Nonparametric Causal Structure Learning in High Dimensions. Entropy 2022; 24:e24030351. [PMID: 35327862 PMCID: PMC8947566 DOI: 10.3390/e24030351] [Received: 01/20/2022] [Revised: 02/21/2022] [Accepted: 02/25/2022] [Indexed: 12/10/2022]
Abstract
The PC and FCI algorithms are popular constraint-based methods for learning the structure of directed acyclic graphs (DAGs) in the absence and presence of latent and selection variables, respectively. These algorithms (and their order-independent variants, PC-stable and FCI-stable) have been shown to be consistent for learning sparse high-dimensional DAGs based on partial correlations. However, inferring conditional independences from partial correlations is valid if the data are jointly Gaussian or generated from a linear structural equation model, an assumption that may be violated in many applications. To broaden the scope of high-dimensional causal structure learning, we propose nonparametric variants of the PC-stable and FCI-stable algorithms that employ the conditional distance covariance (CdCov) to test for conditional independence relationships. As the key theoretical contribution, we prove that the high-dimensional consistency of the PC-stable and FCI-stable algorithms carries over to general distributions over DAGs when we implement CdCov-based nonparametric tests for conditional independence. Numerical studies demonstrate that our proposed algorithms perform nearly as well as PC-stable and FCI-stable for Gaussian distributions, and offer advantages in non-Gaussian graphical models.
|
23
|
Brooks-Worrell B, Hampe CS, Hattery EG, Palomino B, Zangeneh SZ, Utzschneider K, Kahn SE, Larkin ME, Johnson ML, Mather KJ, Younes N, Rasouli N, Desouza C, Cohen RM, Park JY, Florez HJ, Valencia WM, Shojaie A, Palmer JP, Balasubramanyam A. Islet Autoimmunity is Highly Prevalent and Associated With Diminished β-Cell Function in Patients With Type 2 Diabetes in the GRADE Study. Diabetes 2022; 71:db210590. [PMID: 35061024 PMCID: PMC9375448 DOI: 10.2337/db21-0590] [Received: 12/23/2021] [Accepted: 01/08/2021] [Indexed: 11/13/2022]
Abstract
Islet autoimmunity may contribute to β-cell dysfunction in type 2 diabetes (T2D). Its prevalence and clinical significance have not been rigorously determined. In this ancillary study to the Glycemia Reduction Approaches in Diabetes: A Comparative Effectiveness (GRADE) Study, we investigated the prevalence of cellular and humoral islet autoimmunity in patients with T2D duration 4.0 ± 3.0 y and HbA1c 7.5 ± 0.5% on metformin alone. We measured T cell autoreactivity against islet proteins, islet autoantibodies against GAD65, IA2, and ZnT8, and β-cell function. Cellular islet autoimmunity was present in 41.3%, humoral islet autoimmunity in 13.5%, and both in 5.3%. β-cell function calculated as iAUC-CG and ΔC-peptide(0-30)/Δglucose(0-30) from an oral glucose tolerance test was lower among T cell-positives (T+) than T cell-negatives (T-) using two different adjustments for insulin sensitivity (iAUC-CG: 13.2% [95% CI 0.3, 24.4%] or 11.4% [95% CI 0.4, 21.2%] lower; ΔC-peptide(0-30)/Δglucose(0-30): 19% [95% CI 3.1, 32.3%] or 17.7% [95% CI 2.6, 30.5%] lower). T+ patients had 17% higher HbA1c (95% CI 0.07, 0.28) and 7.7 mg/dL higher fasting plasma glucose levels (95% CI 0.2, 15.3) than T- patients. We conclude that islet autoimmunity is much more prevalent in T2D patients than previously reported. T cell-mediated autoimmunity is associated with diminished β-cell function and worse glycemic control.
Affiliation(s)
- Steven E. Kahn
- VA Puget Sound Health Care System, Seattle, WA
- University of Washington, Seattle, WA
- Naji Younes
- The Biostatistics Center, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Rockville, MD
- Neda Rasouli
- University of Colorado School of Medicine, Aurora, CO
- Cyrus Desouza
- University of Nebraska and Omaha VA Medical Center, Omaha, NE
- Robert M. Cohen
- University of Cincinnati and Cincinnati VA Medical Center, Cincinnati, OH
- Hermes J. Florez
- University of Miami, Miami, FL
- Medical University of South Carolina, Charleston, SC
- Jerry P. Palmer
- VA Puget Sound Health Care System, Seattle, WA
- University of Washington, Seattle, WA
|
24
|
Zhang K, Safikhani A, Tank A, Shojaie A. Penalized estimation of threshold auto-regressive models with many components and thresholds. Electron J Stat 2022; 16:1891-1951. [PMID: 37051046 PMCID: PMC10088520 DOI: 10.1214/22-ejs1982] [Indexed: 11/19/2022]
Abstract
Thanks to their simplicity and interpretable structure, autoregressive processes are widely used to model time series data. However, many real time series data sets exhibit nonlinear patterns, requiring nonlinear modeling. The threshold autoregressive (TAR) process provides a family of nonlinear autoregressive time series models in which the process dynamics are specific step functions of a thresholding variable. While estimation and inference for low-dimensional TAR models have been investigated, high-dimensional TAR models have received less attention. In this article, we develop a new framework for estimating high-dimensional TAR models, and propose two different sparsity-inducing penalties. The first penalty corresponds to a natural extension of the classical TAR model to high-dimensional settings, where the same threshold is enforced for all model parameters. Our second penalty develops a more flexible TAR model, where different thresholds are allowed for different autoregressive coefficients. We show that both penalized estimation strategies can be utilized in a three-step procedure that consistently learns both the thresholds and the corresponding autoregressive coefficients. However, our theoretical and empirical investigations show that the direct extension of the TAR model is not appropriate for high-dimensional settings and is better suited for moderate dimensions. In contrast, the more flexible extension of the TAR model leads to consistent estimation and superior empirical performance in high dimensions.
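To make the phrase "step functions of a thresholding variable" concrete, the sketch below simulates a univariate, two-regime TAR process of order 1. It is only an illustration under stated assumptions: the thresholding variable is taken to be the previous value of the series itself (a self-exciting TAR), and the function name, parameters, and values are hypothetical. It does not implement the penalized high-dimensional estimators discussed in the abstract.

```python
import random


def simulate_tar(n, threshold, phi_low, phi_high, noise_sd=1.0, x0=0.0, seed=0):
    """Simulate n values of a two-regime TAR(1) process.

    The autoregressive coefficient is a step function of the previous
    value: phi_low when x[t-1] <= threshold, phi_high otherwise.
    """
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n - 1):
        prev = x[-1]
        phi = phi_low if prev <= threshold else phi_high
        x.append(phi * prev + rng.gauss(0.0, noise_sd))
    return x


# Noise-free path: the regime switch is visible in the first step.
path = simulate_tar(4, threshold=0.0, phi_low=0.5, phi_high=-0.5,
                    noise_sd=0.0, x0=2.0)

# A noisy path of the kind an estimator would be fitted to.
noisy = simulate_tar(200, threshold=0.0, phi_low=0.5, phi_high=-0.5)
```

With the noise switched off, the path starts at 2.0 (upper regime, coefficient -0.5), jumps to -1.0, and then decays under the lower-regime coefficient 0.5.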
Affiliation(s)
- Kunhui Zhang
- University of Washington, Department of Statistics, Padelford Hall, W Stevens Way NE, Seattle, WA 98195
- Abolfazl Safikhani
- University of Florida, Department of Statistics, 102 Griffin-Floyd Hall, Gainesville, FL 32611
- Alex Tank
- University of Washington, Department of Statistics, Padelford Hall, W Stevens Way NE, Seattle, WA 98195
- Ali Shojaie
- University of Washington, Department of Statistics, Padelford Hall, W Stevens Way NE, Seattle, WA 98195
|
25
|
Haris A, Simon N, Shojaie A. Generalized Sparse Additive Models. J Mach Learn Res 2022; 23:70. [PMID: 37873545 PMCID: PMC10593424] [Indexed: 10/25/2023]
Abstract
We present a unified framework for estimation and analysis of generalized additive models in high dimensions. The framework defines a large class of penalized regression estimators, encompassing many existing methods. An efficient computational algorithm for this class is presented that easily scales to thousands of observations and features. We prove minimax optimal convergence bounds for this class under a weak compatibility condition. In addition, we characterize the rate of convergence when this compatibility condition is not met. Finally, we also show that the optimal penalty parameters for structure and sparsity penalties in our framework are linked, allowing cross-validation to be conducted over only a single tuning parameter. We complement our theoretical results with empirical studies comparing some existing methods within this framework.
Affiliation(s)
- Asad Haris
- Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, 2020 - 2207 Main Mall, Vancouver, BC, Canada V6T 1Z4
- Noah Simon
- Department of Biostatistics, University of Washington, Seattle, WA 98195-7232, USA
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA 98195-7232, USA
|
26
|
Prater KE, Green KJ, Chiou KL, Smith CL, Sun W, Shojaie A, Heath LM, Rose S, Keene CD, Logsdon BA, Snyder-Mackler N, Blue EE, Young JE, Garden GA, Jayadev S. Microglia subtype transcriptomes differ between Alzheimer Disease and control human postmortem brain samples. Alzheimers Dement 2022. [PMID: 34971137 DOI: 10.1002/alz.058474] [Indexed: 11/12/2022]
Abstract
BACKGROUND Microglia-mediated neuroinflammation is hypothesized to contribute to disease progression in neurodegenerative diseases such as Alzheimer's Disease (AD). Microglia subtypes are complex, with beneficial and harmful phenotypes. Understanding the gene expression networks which define the spectrum of microglia phenotypes is critical to identifying specific targets for neuroinflammation modulating therapies. METHOD Our study utilized post-mortem brain tissue from 22 total (7 male) participants; 12 (3 male) had significant AD neuropathic change. Nuclei isolated from prefrontal cortex were sorted for the myeloid marker PU.1 using fluorescence activated nucleus sorting (FANS). The FANS approach yields larger numbers of nuclei annotated as microglia with high quality sequence from each individual. We performed single-nucleus RNA-seq using the 10X Genomics Chromium platform. RESULTS We isolated more than 120,000 microglia nuclei, facilitating group comparisons based on disease state. Unbiased clustering revealed 10 microglia clusters and improved resolution of microglia heterogeneity compared to standard single-cell approaches. We identify clusters of microglia enriched for biological pathways implicating defined myeloid roles including interferon-stimulated, endo/lysosomal, neurodegenerative with a "disease-associated microglia" (DAM) signature, as well as a metabolically active and autophagic cluster. Interestingly, the cluster proportionately enriched for AD individuals' nuclei is not the DAM cluster but instead one of the clusters in which endo/lysosomal genes are highly upregulated. Furthermore, many of the genes in known AD risk loci are strongly differentially regulated in this AD associated cluster. We also identify a cluster of microglia that is proportionately enriched for control samples with upregulated cell cycle and proliferation genes. 
Trajectory analysis suggests that the paths AD and control nuclei take from unactivated "homeostatic" to various phenotypic states are also distinct. CONCLUSION Using human AD tissue collected with uniform protocols, we characterize the transcriptomic profiles of microglia subtypes in human brain. By enriching for myeloid cells prior to analysis, we can resolve microglia subtypes, revealing the diversity of microglia which are "inflammatory" as well as other microglia subtypes responding with induction of metabolic and lysosomal pathways. Our data identify subtypes of microglia that are unique to AD and control individuals. These results support the possibility of pharmacological targeting of specific subtypes of microglia to alter AD progression.
Affiliation(s)
| | - Wei Sun
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
|
27
|
Wang X, Shojaie A. Causal Discovery in High-Dimensional Point Process Networks with Hidden Nodes. Entropy (Basel) 2021; 23:1622. [PMID: 34945928 PMCID: PMC8700240 DOI: 10.3390/e23121622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Revised: 11/20/2021] [Accepted: 11/27/2021] [Indexed: 12/01/2022]
Abstract
Thanks to technological advances leading to near-continuous time observations, emerging multivariate point process data offer new opportunities for causal discovery. However, a key obstacle in achieving this goal is that many relevant processes may not be observed in practice. Naïve estimation approaches that ignore these hidden variables can generate misleading results because of the unadjusted confounding. To plug this gap, we propose a deconfounding procedure to estimate high-dimensional point process networks with only a subset of the nodes being observed. Our method allows flexible connections between the observed and unobserved processes. It also allows the number of unobserved processes to be unknown and potentially larger than the number of observed nodes. Theoretical analyses and numerical studies highlight the advantages of the proposed method in identifying causal interactions among the observed processes.
Affiliation(s)
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA;
|
28
|
Abstract
A great deal of interest has recently focused on conducting inference on the parameters in a high-dimensional linear model. In this paper, we consider a simple and very naïve two-step procedure for this task, in which we (i) fit a lasso model in order to obtain a subset of the variables, and (ii) fit a least squares model on the lasso-selected set. Conventional statistical wisdom tells us that we cannot make use of the standard statistical inference tools for the resulting least squares model (such as confidence intervals and p-values), since we peeked at the data twice: once in running the lasso, and again in fitting the least squares model. However, in this paper, we show that under a certain set of assumptions, with high probability, the set of variables selected by the lasso is identical to the one selected by the noiseless lasso and is hence deterministic. Consequently, the naïve two-step approach can yield asymptotically valid inference. We utilize this finding to develop the naïve confidence interval, which can be used to draw inference on the regression coefficients of the model selected by the lasso, as well as the naïve score test, which can be used to test the hypotheses regarding the full-model regression coefficients.
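The two-step procedure described in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: the coordinate-descent lasso, the Gauss-Jordan least squares solver, the toy data, and the penalty value `lam=0.3` are all assumptions chosen for the example, and no confidence intervals or score tests are computed.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j excluded)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[c][k] / A[c][c] for c in range(k)]


def naive_two_step(X, y, lam):
    """Step (i): the lasso selects variables; step (ii): OLS refit on that set."""
    selected = [j for j, b in enumerate(lasso_cd(X, y, lam)) if b != 0.0]
    if not selected:
        return selected, []
    X_sel = [[row[j] for j in selected] for row in X]
    return selected, ols(X_sel, y)


random.seed(0)
n, p = 100, 8
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# Toy data (assumption): only variables 0 and 3 have nonzero coefficients.
y = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0, 0.1) for row in X]
selected, coef = naive_two_step(X, y, lam=0.3)
print(selected, [round(c, 2) for c in coef])
```

The refit in step (ii) removes the shrinkage that the lasso penalty applies to the selected coefficients; the abstract's claim is that, under its stated assumptions, standard least squares inference on this refit is asymptotically valid.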
Affiliation(s)
- Sen Zhao
- 1600 Amphitheatre Parkway, Mountain View, California 94043, USA
| | - Daniela Witten
- University of Washington, Health Sciences Building, Box 357232, Seattle, Washington 98195, USA
| | - Ali Shojaie
- University of Washington, Health Sciences Building, Box 357232, Seattle, Washington 98195, USA
|
29
|
Yue K, Ma J, Thornton T, Shojaie A. REHE: Fast variance components estimation for linear mixed models. Genet Epidemiol 2021; 45:891-905. [PMID: 34658056 DOI: 10.1002/gepi.22432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2021] [Revised: 06/11/2021] [Accepted: 10/04/2021] [Indexed: 11/07/2022]
Abstract
Linear mixed models are widely used in ecological and biological applications, especially in genetic studies. Reliable estimation of variance components is crucial for using linear mixed models. However, standard methods, such as the restricted maximum likelihood (REML), are computationally inefficient in large samples and may be unstable with small samples. Other commonly used methods, such as the Haseman-Elston (HE) regression, may yield negative estimates of variances. Utilizing regularized estimation strategies, we propose the restricted Haseman-Elston (REHE) regression and REHE with resampling (reREHE) estimators, along with an inference framework for REHE, as fast and robust alternatives that provide nonnegative estimates with comparable accuracy to REML. The merits of REHE are illustrated using real data and benchmark simulation studies.
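To make the contrast between HE regression and a nonnegative variant concrete, here is a minimal sketch under strong simplifying assumptions: a zero-mean model with a single relatedness matrix `K` and an identity noise term, fitted by least squares of `y y^T` on `K` and `I`. The function name `he_regression`, the toy data, and the two-component model are hypothetical; this is not the REHE software or its regularized estimator, only an illustration of why a nonnegativity constraint changes the answer.

```python
def he_regression(y, K, nonneg=True):
    """Fit y y^T ~ s_g * K + s_e * I by least squares over all matrix entries.

    With nonneg=False this is a plain moment-matching (HE-style) fit and can
    return negative variances. With nonneg=True the estimates are constrained
    to be >= 0. Assumes K is not proportional to the identity matrix.
    """
    n = len(y)
    # Entries of the 2x2 normal equations for regressors vec(K) and vec(I)
    kk = sum(K[i][j] ** 2 for i in range(n) for j in range(n))
    ki = sum(K[i][i] for i in range(n))
    ii = float(n)
    ky = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    iy = sum(v * v for v in y)

    det = kk * ii - ki * ki
    s_g = (ky * ii - ki * iy) / det
    s_e = (kk * iy - ki * ky) / det
    if not nonneg or (s_g >= 0 and s_e >= 0):
        return s_g, s_e

    def loss(s):
        # Squared error up to an additive constant that does not depend on s
        g, e = s
        return g * g * kk + 2 * g * e * ki + e * e * ii - 2 * (g * ky + e * iy)

    # The objective is a convex quadratic, so when the unconstrained optimum is
    # infeasible the constrained optimum lies on one of the two boundaries.
    candidates = [(max(ky / kk, 0.0), 0.0), (0.0, max(iy / ii, 0.0))]
    return min(candidates, key=loss)


# Toy data (assumption): three individuals, the first two related.
K = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [1.0, -1.0, 0.5]

print(he_regression(y, K, nonneg=False))  # unconstrained fit
print(he_regression(y, K))                # nonnegative fit
```

In this toy case the related pair has opposite-signed outcomes, so the unconstrained fit assigns a negative variance to the `K` component, while the constrained fit sets it to zero and attributes all variance to the noise term.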
Affiliation(s)
- Kun Yue
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Jing Ma
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Timothy Thornton
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
|
30
|
Yu S, Drton M, Promislow DEL, Shojaie A. CorDiffViz: an R package for visualizing multi-omics differential correlation networks. BMC Bioinformatics 2021; 22:486. [PMID: 34627139 PMCID: PMC8501646 DOI: 10.1186/s12859-021-04383-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2020] [Accepted: 09/20/2021] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND Differential correlation networks are increasingly used to delineate changes in interactions among biomolecules. They characterize differences between omics networks under two different conditions, and can be used to delineate mechanisms of disease initiation and progression. RESULTS We present a new R package, CorDiffViz, that facilitates the estimation and visualization of differential correlation networks using multiple correlation measures and inference methods. The software is implemented in R, HTML and Javascript, and is available at https://github.com/sqyu/CorDiffViz . Visualization has been tested for the Chrome and Firefox web browsers. A demo is available at https://diffcornet.github.io/CorDiffViz/demo.html . CONCLUSIONS Our software offers considerable flexibility by allowing the user to interact with the visualization and choose from different estimation methods and visualizations. It also allows the user to easily toggle between correlation networks for samples under one condition and differential correlations between samples under two conditions. Moreover, the software facilitates integrative analysis of cross-correlation networks between two omics data sets.
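As a rough illustration of what a differential correlation network is, the sketch below computes Pearson correlations under two conditions and keeps the feature pairs whose correlation changes by at least a threshold. It does not use or reproduce the CorDiffViz API (an R package); the function names, the toy data, and the fixed threshold in place of a formal inference method are all assumptions made for the example.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def correlation_matrix(samples):
    """samples: list of rows (one per sample), each a list of feature values."""
    cols = list(zip(*samples))
    p = len(cols)
    return [[pearson(cols[i], cols[j]) for j in range(p)] for i in range(p)]


def differential_correlation(cond_a, cond_b, threshold):
    """Return edges (i, j, r_b - r_a) whose correlation differs by >= threshold."""
    ra, rb = correlation_matrix(cond_a), correlation_matrix(cond_b)
    p = len(ra)
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            diff = rb[i][j] - ra[i][j]
            if abs(diff) >= threshold:
                edges.append((i, j, diff))
    return edges


# Toy data (assumption): features 0 and 1 move together under condition A
# and in opposite directions under condition B; feature 2 is unchanged.
cond_a = [[1, 2, 5], [2, 4, 3], [3, 6, 4], [4, 8, 1]]
cond_b = [[1, 8, 5], [2, 6, 3], [3, 4, 4], [4, 2, 1]]

edges = differential_correlation(cond_a, cond_b, threshold=0.5)
for i, j, diff in edges:
    print(i, j, round(diff, 3))
```

The pair of features 0 and 2 has the same correlation under both conditions and therefore produces no edge, which is the sense in which the network shows changes in interactions rather than the interactions themselves.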
Affiliation(s)
- Shiqing Yu
- Department of Statistics, University of Washington, NE Stevens Way, Seattle, WA, 98195, USA.
| | - Mathias Drton
- Department of Mathematics, Technical University of Munich, Boltzmannstraße, 85748, Garching bei München, Germany
| | - Daniel E L Promislow
- Departments of Pathology and Biology, University of Washington, NE Pacific St, Seattle, WA, 98195, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, NE Pacific St, Seattle, WA, 98195, USA
|
31
|
Hellstern M, Ma J, Yue K, Shojaie A. netgsa: Fast computation and interactive visualization for topology-based pathway enrichment analysis. PLoS Comput Biol 2021; 17:e1008979. [PMID: 34115744 PMCID: PMC8221786 DOI: 10.1371/journal.pcbi.1008979] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/15/2020] [Revised: 06/23/2021] [Accepted: 04/18/2021] [Indexed: 01/26/2023] Open
Abstract
Existing software tools for topology-based pathway enrichment analysis are either computationally inefficient, have undesirable statistical power, or require expert knowledge to leverage the methods' capabilities. To address these limitations, we have overhauled NetGSA, an existing topology-based method, to provide a computationally-efficient user-friendly tool that offers interactive visualization. Pathway enrichment analysis for thousands of genes can be performed in minutes on a personal computer without sacrificing statistical power. The new software also removes the need for expert knowledge by directly curating gene-gene interaction information from multiple external databases. Lastly, by utilizing the capabilities of Cytoscape, the new software also offers interactive and intuitive network visualization.
Affiliation(s)
- Michael Hellstern
- Department of Biostatistics, University of Washington, Seattle, Washington
| | - Jing Ma
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Kun Yue
- Department of Biostatistics, University of Washington, Seattle, Washington
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington
|
32
|
Shojaie A. Differential Network Analysis: A Statistical Perspective. Wiley Interdiscip Rev Comput Stat 2021; 13:e1508. [PMID: 37050915 PMCID: PMC10088462 DOI: 10.1002/wics.1508] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/09/2019] [Accepted: 03/03/2020] [Indexed: 11/06/2022]
Abstract
Networks effectively capture interactions among components of complex systems, and have thus become a mainstay in many scientific disciplines. Growing evidence, especially from biology, suggests that networks undergo changes over time, and in response to external stimuli. In biology and medicine, these changes have been found to be predictive of complex diseases. They have also been used to gain insight into mechanisms of disease initiation and progression. Primarily motivated by biological applications, this article provides a review of recent statistical machine learning methods for inferring networks and identifying changes in their structures.
Affiliation(s)
- Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle WA
|
33
|
Manzour H, Küçükyavuz S, Wu HH, Shojaie A. Integer Programming for Learning Directed Acyclic Graphs from Continuous Data. INFORMS Journal on Optimization 2021; 3:46-73. [PMID: 37051459 PMCID: PMC10088505 DOI: 10.1287/ijoo.2019.0040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model that can naturally incorporate a superstructure to reduce the set of possible candidate DAGs. We use a negative log-likelihood score function with both [Formula: see text] and [Formula: see text] penalties and propose a new mixed-integer quadratic program, referred to as a layered network (LN) formulation. The LN formulation is a compact model that enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only [Formula: see text] regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse superstructure.
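The superexponential growth mentioned at the start of this abstract can be checked directly. The sketch below counts labeled DAGs with Robinson's recurrence, a standard combinatorial result that is not taken from the paper itself; the function name `count_dags` is chosen for the example.

```python
from math import comb


def count_dags(n):
    """Number of labeled DAGs on n nodes, via Robinson's recurrence:

    a(0) = 1,
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k*(m-k)) * a(m-k).
    """
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]


for n in range(1, 11):
    print(n, count_dags(n))
```

The count passes 29,000 at five nodes and keeps accelerating, which is why restricting the search to a sparse superstructure, as the abstract describes, matters so much for exact methods.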
Affiliation(s)
- Hasan Manzour
- Department of Industrial and Systems Engineering, University of Washington, Seattle, Washington 98195
| | - Simge Küçükyavuz
- Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208
| | - Hao-Hsiang Wu
- Department of Management Science, National Chiao Tung University, Hsinchu, Taiwan
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington 98195
|
34
|
Simon N, Shojaie A. Convergence Rates of Nonparametric Penalized Regression under Misspecified Smoothness. Stat Sin 2021. [DOI: 10.5705/ss.202018.0144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
35
|
Abstract
We present a framework for learning Granger causality networks for multivariate categorical time series based on the mixture transition distribution (MTD) model. Traditionally, MTD is plagued by a nonconvex objective, non-identifiability, and presence of local optima. To circumvent these problems, we recast inference in the MTD as a convex problem. The new formulation facilitates the application of MTD to high-dimensional multivariate time series. As a baseline, we also formulate a multi-output logistic autoregressive model (mLTD), which, while a straightforward extension of autoregressive Bernoulli generalized linear models, has not been previously applied to the analysis of multivariate categorical time series. We establish identifiability conditions of the MTD model and compare them to those for mLTD. We further devise novel and efficient optimization algorithms for MTD based on our proposed convex formulation, and compare the MTD and mLTD in both simulated and real data experiments. Finally, we establish consistency of the convex MTD in high dimensions. Our approach simultaneously provides a comparison of methods for network inference in categorical time series and opens the door to modern, regularized inference with the MTD model.
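For readers unfamiliar with the MTD model, the sketch below evaluates its conditional distribution in the common textbook form, where the next-state distribution is a convex combination of per-series transition rows. The abstract does not spell out the parameterization, so this form (without an intercept term), the function names, and the toy numbers are assumptions; no estimation, convex or otherwise, is performed.

```python
def mtd_probability(gammas, transition_mats, prev_states):
    """P(x_t = k | previous states) = sum_j gamma_j * P_j[prev_state_j][k].

    gammas: nonnegative mixture weights summing to 1 (one per parent series).
    transition_mats: one row-stochastic matrix per parent series.
    prev_states: previous category of each parent series.
    """
    n_cat = len(transition_mats[0][0])
    probs = [0.0] * n_cat
    for g, P, s in zip(gammas, transition_mats, prev_states):
        for k in range(n_cat):
            probs[k] += g * P[s][k]
    return probs


def granger_parents(gammas, tol=1e-12):
    """Series with nonzero weight are the candidate Granger-causal parents."""
    return [j for j, g in enumerate(gammas) if g > tol]


# Toy example (assumption): a binary target with three candidate parent series.
gammas = [0.75, 0.25, 0.0]
transition_mats = [
    [[0.9, 0.1], [0.2, 0.8]],  # series 0: strongly informative
    [[0.5, 0.5], [0.5, 0.5]],  # series 1: weight > 0 but uninformative rows
    [[0.3, 0.7], [0.6, 0.4]],  # series 2: weight 0, so it has no influence
]
probs = mtd_probability(gammas, transition_mats, prev_states=(1, 0, 1))
print(probs, granger_parents(gammas))
```

Series 1 shows why identifiability is delicate in this model: it has positive weight yet identical rows, so it carries no information about the next state, which is consistent with the abstract's emphasis on establishing identifiability conditions.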
Affiliation(s)
| | - Xiudi Li
- Department of Biostatistics, University of Washington, Seattle WA
| | - Emily B Fox
- Departments of Computer Science & Engineering and Statistics, University of Washington, Seattle WA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle WA
|
36
|
Li X, Shojaie A. Discussion of “A Tuning-Free Robust and Efficient Approach to High-Dimensional Regression”. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1837139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Xiudi Li
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
|
37
|
Dibay Moghadam S, Navarro SL, Shojaie A, Randolph TW, Bettcher LF, Le CB, Hullar MA, Kratz M, Neuhouser ML, Lampe PD, Raftery D, Lampe JW. Plasma lipidomic profiles after a low and high glycemic load dietary pattern in a randomized controlled crossover feeding study. Metabolomics 2020; 16:121. [PMID: 33219392 PMCID: PMC8116047 DOI: 10.1007/s11306-020-01746-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Accepted: 11/09/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND Dietary patterns low in glycemic load are associated with reduced risk of cardiometabolic diseases. Improvements in serum lipid concentrations may play a role in these observed associations. OBJECTIVE We investigated how dietary patterns differing in glycemic load affect clinical lipid panel measures and plasma lipidomics profiles. METHODS In a crossover, controlled feeding study, 80 healthy participants (n = 40 men, n = 40 women), 18-45 y were randomized to receive low-glycemic load (LGL) or high glycemic load (HGL) diets for 28 days each with at least a 28-day washout period between controlled diets. Fasting plasma samples were collected at baseline and end of each diet period. Lipids on a clinical panel including total-, VLDL-, LDL-, and HDL-cholesterol and triglycerides were measured using an auto-analyzer. Lipidomics analysis using mass-spectrometry provided the concentrations of 863 species. Linear mixed models and lipid ontology enrichment analysis were implemented. RESULTS Lipids from the clinical panel were not significantly different between diets. Univariate analysis showed that 67 species on the lipidomics panel, predominantly in the triacylglycerol class, were higher after the LGL diet compared to the HGL (FDR < 0.05). Three species with FA 17:0 were lower after LGL diet with enrichment analysis (FDR < 0.05). CONCLUSION In the context of controlled eucaloric diets with similar macronutrient distribution, these results suggest that there are relative shifts in lipid species, but the overall pool does not change. Further studies are needed to better understand in which compartment the different lipid species are transported in blood, and how these shifts are related to health outcomes. This trial was registered at clinicaltrials.gov as NCT00622661.
Affiliation(s)
- Sepideh Dibay Moghadam
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Sandi L Navarro
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Timothy W Randolph
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Lisa F Bettcher
- Department of Anesthesiology and Pain Medicine, Northwest Metabolomics Research Center, University of Washington, Seattle, WA, USA
| | - Cynthia B Le
- Department of Anesthesiology and Pain Medicine, Northwest Metabolomics Research Center, University of Washington, Seattle, WA, USA
| | - Meredith A Hullar
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Mario Kratz
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Marian L Neuhouser
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Paul D Lampe
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
| | - Daniel Raftery
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA
- Department of Anesthesiology and Pain Medicine, Northwest Metabolomics Research Center, University of Washington, Seattle, WA, USA
| | - Johanna W Lampe
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, USA.
|
38
|
Abstract
This paper concerns the development of an inferential framework for high-dimensional linear mixed effect models. These are suitable models, for instance, when we have n repeated measurements for M subjects. We consider a scenario where the number of fixed effects p is large (and may be larger than M), but the number of random effects q is small. Our framework is inspired by a recent line of work that proposes de-biasing penalized estimators to perform inference for high-dimensional linear models with fixed effects only. In particular, we demonstrate how to correct a 'naive' ridge estimator in extension of work by Bühlmann (2013) to build asymptotically valid confidence intervals for mixed effect models. We validate our theoretical results with numerical experiments, in which we show our method outperforms those that fail to account for correlation induced by the random effects. For a practical demonstration we consider a riboflavin production dataset that exhibits group structure, and show that conclusions drawn using our method are consistent with those obtained on a similar dataset without group structure.
Affiliation(s)
- Lina Lin
- Department of Statistics, University of Washington
| | - Mathias Drton
- Department of Mathematics, Technical University of Munich
| | - Ali Shojaie
- Department of Biostatistics, University of Washington
|
39
|
Jin K, Wilson KA, Beck JN, Nelson CS, Brownridge GW, Harrison BR, Djukovic D, Raftery D, Brem RB, Yu S, Drton M, Shojaie A, Kapahi P, Promislow D. Genetic and metabolomic architecture of variation in diet restriction-mediated lifespan extension in Drosophila. PLoS Genet 2020; 16:e1008835. [PMID: 32644988 PMCID: PMC7347105 DOI: 10.1371/journal.pgen.1008835] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/08/2020] [Accepted: 05/06/2020] [Indexed: 01/08/2023] Open
Abstract
In most organisms, dietary restriction (DR) increases lifespan. However, several studies have found that genotypes within the same species vary widely in how they respond to DR. To explore the mechanisms underlying this variation, we exposed 178 inbred Drosophila melanogaster lines to a DR or ad libitum (AL) diet, and measured a panel of 105 metabolites under both diets. Twenty four out of 105 metabolites were associated with the magnitude of the lifespan response. These included proteinogenic amino acids and metabolites involved in α-ketoglutarate (α-KG)/glutamine metabolism. We confirm the role of α-KG/glutamine synthesis pathways in the DR response through genetic manipulations. We used covariance network analysis to investigate diet-dependent interactions between metabolites, identifying the essential amino acids threonine and arginine as “hub” metabolites in the DR response. Finally, we employ a novel metabolic and genetic bipartite network analysis to reveal multiple genes that influence DR lifespan response, some of which have not previously been implicated in DR regulation. One of these is CCHa2R, a gene that encodes a neuropeptide receptor that influences satiety response and insulin signaling. Across the lines, variation in an intronic single nucleotide variant of CCHa2R correlated with variation in levels of five metabolites, all of which in turn were correlated with DR lifespan response. Inhibition of adult CCHa2R expression extended DR lifespan of flies, confirming the role of CCHa2R in lifespan response. These results provide support for the power of combined genomic and metabolomic analysis to identify key pathways underlying variation in this complex quantitative trait. Dietary restriction extends lifespan across most organisms in which it has been tested. However, several studies have now demonstrated that this effect can vary dramatically across different genotypes within a population. 
Within a population, dietary restriction might be beneficial for some, yet detrimental for others. Here, we measure the metabolome of 178 genetically characterized fly strains on fully fed and restricted diets. The fly strains vary widely in their lifespan response to dietary restriction. We then use information about each strain’s genome and metabolome (a measure of small molecules circulating in flies) to pinpoint cellular pathways that govern this variation in response. We identify a novel pathway involving the gene CCHa2R, which encodes a neuropeptide receptor that has not previously been implicated in dietary restriction or age-related signaling pathways. This study demonstrates the power of leveraging systems biology and network biology methods to understand how and why different individuals vary in their response to health and lifespan-extending interventions.
Affiliation(s)
- Kelly Jin
- Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
| | - Kenneth A. Wilson
- Buck Institute for Research on Aging, Novato, California, United States of America
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
| | - Jennifer N. Beck
- Buck Institute for Research on Aging, Novato, California, United States of America
| | - George W. Brownridge
- Buck Institute for Research on Aging, Novato, California, United States of America
- Dominican University of California, San Rafael, California, United States of America
| | - Benjamin R. Harrison
- Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
| | - Danijel Djukovic
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, Washington, United States of America
| | - Daniel Raftery
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, Washington, United States of America
| | - Rachel B. Brem
- Buck Institute for Research on Aging, Novato, California, United States of America
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Shiqing Yu
- Department of Statistics, University of Washington, Seattle, Washington, United States of America
| | - Mathias Drton
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
| | - Pankaj Kapahi
- Buck Institute for Research on Aging, Novato, California, United States of America
- Davis School of Gerontology, University of Southern California, University Park, Los Angeles, California, United States of America
| | - Daniel Promislow
- Department of Pathology, University of Washington School of Medicine, Seattle, Washington, United States of America
- Department of Biology, University of Washington, Seattle, Washington, United States of America
|
40
|
Safikhani A, Shojaie A. Joint Structural Break Detection and Parameter Estimation in High-Dimensional Non-Stationary VAR Models. J Am Stat Assoc 2020; 117:251-264. [PMID: 38375186 PMCID: PMC10874880 DOI: 10.1080/01621459.2020.1770097] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/18/2018] [Revised: 11/01/2019] [Accepted: 05/11/2020] [Indexed: 10/24/2022]
Abstract
Assuming stationarity is unrealistic in many time series applications. A more realistic alternative is to assume piecewise stationarity, where the model can change at potentially many change points. We propose a three-stage procedure for simultaneous estimation of change points and parameters of high-dimensional piecewise vector autoregressive (VAR) models. In the first step, we reformulate the change point detection problem as a high-dimensional variable selection one, and solve it using a penalized least square estimator with a total variation penalty. We show that the penalized estimation method over-estimates the number of change points, and propose a selection criterion to identify the change points. In the last step of our procedure, we estimate the VAR parameters in each of the segments. We prove that the proposed procedure consistently detects the number and location of change points, and provides consistent estimates of VAR parameters. The performance of the method is illustrated through several simulated and real data examples.
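The first step of this procedure rests on a reparameterization that is easy to show in one dimension. The sketch below is a deliberately simplified illustration using a piecewise-constant scalar sequence rather than VAR coefficient matrices, and it performs no estimation. Writing each parameter as a running sum of increments turns change points into nonzero increments, so a total variation penalty on the sequence equals an L1 penalty on the increments, which is what makes the problem look like sparse variable selection. All function names are hypothetical.

```python
def increments(signal):
    """theta[0] = signal[0]; theta[t] = signal[t] - signal[t-1] for t >= 1."""
    return [signal[0]] + [signal[t] - signal[t - 1] for t in range(1, len(signal))]


def cumulative(theta):
    """Inverse map: signal[t] = theta[0] + ... + theta[t]."""
    out, total = [], 0.0
    for v in theta:
        total += v
        out.append(total)
    return out


def total_variation(signal):
    """Sum of absolute successive differences of the sequence."""
    return sum(abs(signal[t] - signal[t - 1]) for t in range(1, len(signal)))


def change_points(signal, tol=1e-9):
    """Indices t >= 1 at which the increment is nonzero."""
    theta = increments(signal)
    return [t for t in range(1, len(theta)) if abs(theta[t]) > tol]


# Toy piecewise-constant sequence (assumption) with breaks at t = 4 and t = 7.
mean = [1.0] * 4 + [3.0] * 3 + [0.5] * 5
theta = increments(mean)

print(change_points(mean))
print(total_variation(mean), sum(abs(v) for v in theta[1:]))
```

Only two of the twelve increments after the first are nonzero here, so selecting the nonzero increments is the same task as locating the change points. The abstract notes that the penalized estimator tends to select too many of them, which is why a separate screening step follows.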
Affiliation(s)
| | - Ali Shojaie
- Department of Biostatistics, University of Washington
|
41
|
Affiliation(s)
| | - Ali Shojaie
- Department of Biostatistics, University of Washington
| | - Marco Carone
- Department of Biostatistics, University of Washington
|
42
|
Wang Y, Randolph TW, Shojaie A, Ma J. The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data. mSystems 2019; 4:e00504-19. [PMID: 31848304 PMCID: PMC6918030 DOI: 10.1128/msystems.00504-19] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/19/2019] [Accepted: 11/13/2019] [Indexed: 11/20/2022] Open
Abstract
Exploratory analysis of human microbiome data is often based on dimension-reduced graphical displays derived from similarities based on non-Euclidean distances, such as UniFrac or Bray-Curtis. However, a display of this type, often referred to as the principal-coordinate analysis (PCoA) plot, does not reveal which taxa are related to the observed clustering because the configuration of samples is not based on a coordinate system in which both the samples and variables can be represented. The reason is that the PCoA plot is based on the eigen-decomposition of a similarity matrix and not the singular value decomposition (SVD) of the sample-by-abundance matrix. We propose a novel biplot that is based on an extension of the SVD, called the generalized matrix decomposition biplot (GMD-biplot), which involves an arbitrary matrix of similarities and the original matrix of variable measures, such as taxon abundances. As in a traditional biplot, points represent the samples, and arrows represent the variables. The proposed GMD-biplot is illustrated by analyzing multiple real and simulated data sets which demonstrate that the GMD-biplot provides improved clustering capability and a more meaningful relationship between the arrows and points. IMPORTANCE Biplots that simultaneously display the sample clustering and the important taxa have gained popularity in the exploratory analysis of human microbiome data. Traditional biplots, assuming Euclidean distances between samples, are not appropriate for microbiome data, when non-Euclidean distances are used to characterize dissimilarities among microbial communities. Thus, incorporating information from non-Euclidean distances into a biplot becomes useful for graphical displays of microbiome data. The proposed GMD-biplot accounts for any arbitrary non-Euclidean distances and provides a robust and computationally efficient approach for graphical visualization of microbiome data. 
In addition, the proposed GMD-biplot displays both the samples and taxa with respect to the same coordinate system, which further allows the configuration of future samples.
Affiliation(s)
- Yue Wang
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Timothy W Randolph
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Jing Ma
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Statistics, Texas A&M University, College Station, Texas, USA
| |
|
43
|
Abstract
BACKGROUND Pathway enrichment is extensively used in the analysis of Omics data for gaining biological insights into the functional roles of pre-defined subsets of genes, proteins and metabolites. A large number of methods have been proposed in the literature for this task. The vast majority of these methods use as input expression levels of the biomolecules under study together with their membership in pathways of interest. The latest generation of pathway enrichment methods also leverages information on the topology of the underlying pathways, which, as evidence from their evaluation reveals, leads to improved sensitivity and specificity. Nevertheless, a systematic empirical comparison of such methods is still lacking, making selection of the most suitable method for a specific experimental setting challenging. This comparative study of nine network-based methods for pathway enrichment analysis aims to provide a systematic evaluation of their performance based on three real data sets with different numbers of features (genes/metabolites) and samples. RESULTS The findings highlight both methodological and empirical differences across the nine methods. In particular, certain methods assess pathway enrichment due to differences both across expression levels and in the strength of the interconnectedness of the members of the pathway, while others only leverage differential expression levels. In the more challenging setting involving a metabolomics data set, the results show that methods that utilize both pieces of information (with NetGSA being a prototypical one) exhibit superior statistical power in detecting pathway enrichment. CONCLUSION The analysis reveals that a number of methods perform equally well when testing large pathways, which is the case with genomic data. 
On the other hand, NetGSA, which takes into consideration both differential expression of the biomolecules in the pathway and changes in the topology, exhibits superior performance when testing small pathways, which is usually the case for metabolomics data.
Affiliation(s)
- Jing Ma
- Texas A&M University, Department of Statistics, College Station, 77840 USA
- Fred Hutchinson Cancer Research Center, Public Health Sciences Division, Seattle, 98107 USA
| | - Ali Shojaie
- University of Washington, Department of Biostatistics, Seattle, 98105 USA
| | | |
|
44
|
Abstract
An optimal and flexible multiple hypothesis testing procedure is constructed for dependent data based on Bayesian techniques, aiming at handling two challenges, namely dependence structure and non-null distribution specification. Ignoring dependence among hypothesis tests may lead to loss of efficiency and bias in decisions. Misspecification of the non-null distribution, on the other hand, can result in both false positive and false negative errors. Hidden Markov models are used to accommodate the dependence structure among the tests. A Dirichlet mixture process prior is applied on the non-null distribution to overcome the potential pitfalls in distribution misspecification. The testing algorithm based on Bayesian techniques optimizes the false negative rate (FNR) while controlling the false discovery rate (FDR). The procedure is applied to pointwise and clusterwise analysis. Its performance is compared with existing approaches using both simulated and real data examples.
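As an illustration of how posterior probabilities can drive an FDR-controlling decision, the sketch below applies a step-up rule that is common in this family of methods: sort hypotheses by their posterior probability of being null and reject as long as the running average stays at or below the target level. The abstract does not spell out the authors' exact rule, so this is an assumption; in practice the probabilities would come from the fitted hidden Markov model, whereas the values in `post_null` are made up.

```python
def reject_by_posterior_null(post_null, alpha=0.05):
    """Reject the hypotheses with the smallest posterior null probabilities
    while their running average stays at or below alpha."""
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    rejected, running_sum = [], 0.0
    for rank, i in enumerate(order, start=1):
        running_sum += post_null[i]
        # The running average estimates the FDR among the first `rank` rejections.
        if running_sum / rank > alpha:
            break
        rejected.append(i)
    return sorted(rejected)

# Hypothetical posterior probabilities that each hypothesis is null.
post_null = [0.01, 0.90, 0.03, 0.20, 0.002, 0.60]
rejected = reject_by_posterior_null(post_null, alpha=0.05)
print(rejected)  # [0, 2, 4]
```

Because the probabilities are visited in ascending order, the running average never decreases, so stopping at the first violation gives the largest admissible rejection set.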
Affiliation(s)
- Xia Wang
- Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio 45221, U.S.A
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, Washington 98195, U.S.A
| | - Jian Zou
- Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, Massachusetts 01609, U.S.A
| |
|
45
|
Navarro SL, Tarkhan A, Shojaie A, Randolph TW, Gu H, Djukovic D, Osterbauer KJ, Hullar MA, Kratz M, Neuhouser ML, Lampe PD, Raftery D, Lampe JW. Plasma metabolomics profiles suggest beneficial effects of a low-glycemic load dietary pattern on inflammation and energy metabolism. Am J Clin Nutr 2019; 110:984-992. [PMID: 31432072 PMCID: PMC6766441 DOI: 10.1093/ajcn/nqz169] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/16/2019] [Accepted: 07/02/2019] [Indexed: 02/06/2023]
Abstract
BACKGROUND Low-glycemic load dietary patterns, characterized by consumption of whole grains, legumes, fruits, and vegetables, are associated with reduced risk of several chronic diseases. METHODS Using samples from a randomized, controlled, crossover feeding trial, we evaluated the effects on metabolic profiles of a low-glycemic whole-grain dietary pattern (WG) compared with a dietary pattern high in refined grains and added sugars (RG) for 28 d. LC-MS-based targeted metabolomics analysis was performed on fasting plasma samples from 80 healthy participants (n = 40 men, n = 40 women) aged 18-45 y. Linear mixed models were used to evaluate differences in response between diets for individual metabolites. Kyoto Encyclopedia of Genes and Genomes (KEGG)-defined pathways and 2 novel data-driven analyses were conducted to consider differences at the pathway level. RESULTS There were 121 metabolites with detectable signal in >98% of all plasma samples. Eighteen metabolites were significantly different between diets at day 28 [false discovery rate (FDR) < 0.05]. Inositol, hydroxyphenylpyruvate, citrulline, ornithine, 13-hydroxyoctadecadienoic acid, glutamine, and oxaloacetate were higher after the WG diet than after the RG diet, whereas melatonin, betaine, creatine, acetylcholine, aspartate, hydroxyproline, methylhistidine, tryptophan, cystamine, carnitine, and trimethylamine were lower. Analyses using KEGG-defined pathways revealed statistically significant differences in tryptophan metabolism between diets, with kynurenine and melatonin positively associated with serum C-reactive protein concentrations. Novel data-driven methods at the metabolite and network levels found correlations among metabolites involved in branched-chain amino acid (BCAA) degradation, trimethylamine-N-oxide production, and β oxidation of fatty acids (FDR < 0.1) that differed between diets, with more favorable metabolic profiles detected after the WG diet. 
Higher BCAAs and trimethylamine were positively associated with homeostasis model assessment-insulin resistance. CONCLUSIONS These exploratory metabolomics results support beneficial effects of a low-glycemic load dietary pattern characterized by whole grains, legumes, fruits, and vegetables, compared with a diet high in refined grains and added sugars on inflammation and energy metabolism pathways. This trial was registered at clinicaltrials.gov as NCT00622661.
Affiliation(s)
- Sandi L Navarro
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA. Address correspondence to SLN.
| | - Aliasghar Tarkhan
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Ali Shojaie
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Timothy W Randolph
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Haiwei Gu
- College of Health Solutions, Arizona State University, Phoenix, AZ, USA
| | - Danijel Djukovic
- Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, USA
| | - Katie J Osterbauer
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Meredith A Hullar
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Mario Kratz
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Marian L Neuhouser
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Paul D Lampe
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Daniel Raftery
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, USA
| | - Johanna W Lampe
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| |
|
46
|
Haris A, Shojaie A, Simon N. Nonparametric regression with adaptive truncation via a convex hierarchical penalty. Biometrika 2019; 106:87-107. [PMID: 31427821 DOI: 10.1093/biomet/asy056] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/17/2017] [Indexed: 11/13/2022]
Abstract
We consider the problem of nonparametric regression with a potentially large number of covariates. We propose a convex, penalized estimation framework that is particularly well suited to high-dimensional sparse additive models and combines the appealing features of finite basis representation and smoothing penalties. In the case of additive models, a finite basis representation provides a parsimonious representation for fitted functions but is not adaptive when component functions possess different levels of complexity. In contrast, a smoothing spline-type penalty on the component functions is adaptive but does not provide a parsimonious representation. Our proposal simultaneously achieves parsimony and adaptivity in a computationally efficient way. We demonstrate these properties through empirical studies and show that our estimator converges at the minimax rate for functions within a hierarchical class. We further establish minimax rates for a large class of sparse additive models. We also develop an efficient algorithm that scales similarly to the lasso with the number of covariates and sample size.
Affiliation(s)
- Asad Haris
- Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
| | - Noah Simon
- Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
| |
|
47
|
Moghadam SD, Navarro S, Shojaie A, Randolph T, Bettcher L, Le C, Hullar M, Kratz M, Neuhouser M, Lampe P, Raftery D, Lampe J. Plasma Lipidomics Profiles After a Diet Characterized by Whole Grains Compared to a Diet High in Refined Grains and Added Sugars (FS03-07-19). Curr Dev Nutr 2019. [DOI: 10.1093/cdn/nzz046.fs03-07-19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Objectives
Dietary patterns high in fiber from sources including whole grains, legumes, fruits, vegetables, nuts and seeds, are associated with lower risk of chronic disease, such as cardiovascular disease and cancer. We investigated how plasma lipidomics profiles differed between a diet high in whole grains (WG) versus a diet high in refined grains and added sugars (RG).
Methods
Using a randomized, crossover, controlled feeding study, 80 healthy participants (n = 40 men, n = 40 women, 40 normal weight, 40 overweight/obese), 18–45 y, were randomized to receive either a WG or RG diet for 28 days. After a 28-day washout period where participants resumed their habitual diet, they crossed over to the other diet. Targeted, differential mobility mass spectrometry was performed on fasting plasma samples collected at the baseline and end of each diet period and quantified the concentrations of 863 lipids from 13 classes. Paired t-tests and pairwise partial least squares-discriminant analysis (PLS-DA) were used to evaluate differences in lipid profiles between the two diets.
Results
At a class level, only ceramides were significantly different when comparing the two diets. After removing lipid species with > 20% missing values or CVs < 25%, 606 were retained for species analysis. Sixty-seven lipid species were significantly different between diets at day 28 (FDR < 0.05): 38 of 414 detected triglycerides, 9 of 59 phosphatidylethanolamines, 9 of 63 phosphatidylcholines, 4 of 22 cholesterol esters, 3 of 11 sphingomyelins, 2 of 13 lysophosphatidylcholines, and 1 of 5 ceramides. The majority of significant lipids were higher in plasma after the WG diet. PLS-DA showed the first and second components explaining 49% and 8.4%, respectively. Based on the selected components, lipidomic profiles showed fair separation between the two diets. R2 values were 0.07 and 0.43, and Q2 values were -0.03 and 0.04 for components 1 and 2, respectively.
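The species-level screen above relies on an FDR threshold. The abstract does not state which FDR procedure was applied; the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice, and runs it on made-up p-values (the function name `benjamini_hochberg` and the numbers are illustrative only).

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices rejected by the BH step-up procedure at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its own threshold.
        if pvalues[i] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical per-lipid p-values, e.g. from paired t-tests between diets.
pvals = [0.035, 0.9, 0.001, 0.028, 0.025]
significant = benjamini_hochberg(pvals, alpha=0.05)
print(significant)  # [0, 2, 3, 4]
```

Note the step-up behavior: the p-value 0.025 exceeds its own rank threshold (0.02) but is still rejected because a larger p-value further down the ordering passes its threshold.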
Conclusions
Higher concentrations of some lipid species such as cholesterol ester 12:0, a carrier of high-density lipoprotein, could indicate a favorable shift in lipid profiles. Further investigations using more complex models are being conducted.
Funding Sources
National Cancer Institute - National Institutes of Health.
Affiliation(s)
- Lisa Bettcher
- Mitochondria and Metabolism Center (MMC), University of Washington
| | - Daniel Raftery
- Mitochondria and Metabolism Center (MMC), University of Washington
| |
|
48
|
Yu S, Drton M, Shojaie A. Generalized Score Matching for Non-Negative Data. J Mach Learn Res 2019; 20:76. [PMID: 34290571 PMCID: PMC8291733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen (2005) avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over $\mathbb{R}^m$. Hyvärinen (2007) extended the approach to distributions supported on the non-negative orthant, $\mathbb{R}_+^m$. In this paper, we give a generalized form of score matching for non-negative data that improves estimation efficiency. As an example, we consider a general class of pairwise interaction models. Addressing an overlooked inexistence problem, we generalize the regularized score matching method of Lin et al. (2016) and improve its theoretical guarantees for non-negative Gaussian graphical models.
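To show why score matching sidesteps the normalizing constant, here is a minimal sketch of the original (unrestricted) method for a one-dimensional toy model, p(x; k) proportional to exp(-k x^2 / 2). It is not the generalized estimator for non-negative data proposed in the paper. The score is -k x and its derivative is -k, so the empirical objective is quadratic in k with minimizer 1 / mean(x^2). The function names and the simulated sample are assumptions for illustration.

```python
import random

def score_matching_loss(k, xs):
    """Empirical Hyvarinen objective for p(x; k) proportional to exp(-k x^2 / 2)."""
    # score = d/dx log p = -k x; its derivative with respect to x is -k.
    return sum(-k + 0.5 * (k * x) ** 2 for x in xs) / len(xs)

def score_matching_estimate(xs):
    """Closed-form minimizer of the quadratic objective above: 1 / mean(x^2)."""
    return len(xs) / sum(x * x for x in xs)

random.seed(0)
true_precision = 4.0  # corresponds to standard deviation 0.5
xs = [random.gauss(0.0, 0.5) for _ in range(20000)]
k_hat = score_matching_estimate(xs)
print(round(k_hat, 2))
```

Neither function ever evaluates the normalizing constant of the density; only derivatives of the log-density with respect to x appear, which is the property the abstract highlights.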
Affiliation(s)
- Shiqing Yu
- Department of Statistics, University of Washington, Seattle, WA, U.S.A
| | - Mathias Drton
- Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Statistics, University of Washington, Seattle, WA, U.S.A
| | - Ali Shojaie
- Department of Biostatistics, University of Washington, Seattle, WA, U.S.A
| |
|
49
|
Sondhi A, Shojaie A. The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks. J Mach Learn Res 2019; 20:https://jmlr.org/papers/v20/17-601.html. [PMID: 37799538 PMCID: PMC10552884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2023]
Abstract
We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new algorithm that requires conditioning only on small sets of variables. The proposed algorithm, which is essentially a modified version of the PC-Algorithm, offers significant gains in both computational complexity and estimation accuracy. In particular, it results in more efficient and accurate estimation in large networks containing hub nodes, which are common in biological systems. We prove the consistency of the proposed algorithm, and show that it also requires a less stringent faithfulness assumption than the PC-Algorithm. Simulations in low and high-dimensional settings are used to illustrate these findings. An application to gene expression data suggests that the proposed algorithm can identify a greater number of clinically relevant genes than current methods.
Affiliation(s)
- Arjun Sondhi
- Department of Biostatistics, University of Washington
| | - Ali Shojaie
- Department of Biostatistics, University of Washington
| |
|
50
|
Sedaghat N, Fathy M, Modarressi MH, Shojaie A. Combining Supervised and Unsupervised Learning for Improved miRNA Target Prediction. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:1594-1604. [PMID: 28715336 PMCID: PMC7001746 DOI: 10.1109/tcbb.2017.2727042] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/07/2023]
Abstract
MicroRNAs (miRNAs) are short non-coding RNAs which bind to mRNAs and regulate their expression. MiRNAs have been found to be associated with initiation and progression of many complex diseases. Investigating miRNAs and their targets can thus help develop new therapies by designing anti-miRNA oligonucleotides. While existing computational approaches can predict miRNA targets, these predictions have low accuracy. In this paper, we propose a two-step approach to refine the results of sequence-based prediction algorithms. The first step, which is based on our previous work, uses an ensemble learning approach that combines multiple existing methods. The second step utilizes support vector machine (SVM) classifiers in one- and two-class modes to infer miRNA-mRNA interactions based on both binding features and network features extracted from a gene regulatory network. Experimental results using two real data sets from TCGA indicate that the use of two-class SVM classification significantly improves the precision of miRNA-mRNA prediction.
|