1
Li C, Luo Y, Xie Y, Zhang Z, Liu Y, Zou L, Xiao F. Structural and functional prediction, evaluation, and validation in the post-sequencing era. Comput Struct Biotechnol J 2024; 23:446-451. [PMID: 38223342 PMCID: PMC10787220 DOI: 10.1016/j.csbj.2023.12.031] [Received: 10/20/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 01/16/2024] Open
Abstract
The surge of genome sequencing data has underlined substantial genetic variants of uncertain significance (VUS). The decryption of VUS discovered by sequencing poses a major challenge in the post-sequencing era. Although experimental assays have progressed in classifying VUS, only a tiny fraction of the human genes have been explored experimentally. Thus, it is urgently needed to generate state-of-the-art functional predictors of VUS in silico. Artificial intelligence (AI) is an invaluable tool to assist in the identification of VUS with high efficiency and accuracy. An increasing number of studies indicate that AI has brought an exciting acceleration in the interpretation of VUS, and our group has already used AI to develop protein structure-based prediction models. In this review, we provide an overview of the previous research on AI-based prediction of missense variants, and elucidate the challenges and opportunities for protein structure-based variant prediction in the post-sequencing era.
Affiliation(s)
- Chang Li
  - Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
  - The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Yixuan Luo
  - Beijing Normal University, Beijing, China
- Yibo Xie
  - Information Center, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Zaifeng Zhang
  - The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Ye Liu
  - The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Lihui Zou
  - The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Fei Xiao
  - Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
  - The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Normal University, Beijing, China
2
Engreitz JM, Lawson HA, Singh H, Starita LM, Hon GC, Carter H, Sahni N, Reddy TE, Lin X, Li Y, Munshi NV, Chahrour MH, Boyle AP, Hitz BC, Mortazavi A, Craven M, Mohlke KL, Pinello L, Wang T, Kundaje A, Yue F, Cody S, Farrell NP, Love MI, Muffley LA, Pazin MJ, Reese F, Van Buren E, Dey KK, Kircher M, Ma J, Radivojac P, Balliu B, Williams BA, Huangfu D, Park CY, Quertermous T, Das J, Calderwood MA, Fowler DM, Vidal M, Ferreira L, Mooney SD, Pejaver V, Zhao J, Gazal S, Koch E, Reilly SK, Sunyaev S, Carpenter AE, Buenrostro JD, Leslie CS, Savage RE, Giric S, Luo C, Plath K, Barrera A, Schubach M, Gschwind AR, Moore JE, Ahituv N, Yi SS, Hallgrimsdottir I, Gaulton KJ, Sakaue S, Booeshaghi S, Mattei E, Nair S, Pachter L, Wang AT, Shendure J, Agarwal V, Blair A, Chalkiadakis T, Chardon FM, Dash PM, Deng C, Hamazaki N, Keukeleire P, Kubo C, Lalanne JB, Maass T, Martin B, McDiarmid TA, Nobuhara M, Page NF, Regalado S, Sims J, Ushiki A, Best SM, Boyle G, Camp N, Casadei S, Da EY, Dawood M, Dawson SC, Fayer S, Hamm A, James RG, Jarvik GP, McEwen AE, Moore N, Pendyala S, Popp NA, Post M, Rubin AF, Smith NT, Stone J, Tejura M, Wang ZR, Wheelock MK, Woo I, Zapp BD, Amgalan D, Aradhana A, Arana SM, Bassik MC, Bauman JR, Bhattacharya A, Cai XS, Chen Z, Conley S, Deshpande S, Doughty BR, Du PP, Galante JA, Gifford C, Greenleaf WJ, Guo K, Gupta R, Isobe S, Jagoda E, Jain N, Jones H, Kang HY, Kim SH, Kim Y, Klemm S, Kundu R, Kundu S, Lago-Docampo M, Lee-Yow YC, Levin-Konigsberg R, Li DY, Lindenhofer D, Ma XR, Marinov GK, Martyn GE, McCreery CV, Metzl-Raz E, Monteiro JP, Montgomery MT, Mualim KS, Munger C, Munson G, Nguyen TC, Nguyen T, Palmisano BT, Pampari A, Rabinovitch M, Ramste M, Ray J, Roy KR, Rubio OM, Schaepe JM, Schnitzler G, Schreiber J, Sharma D, Sheth MU, Shi H, Singh V, Sinha R, Steinmetz LM, Tan J, Tan A, Tycko J, Valbuena RC, Amiri VVP, van Kooten MJFM, Vaughan-Jackson A, Venida A, Weldy CS, Worssam MD, Xia F, Yao D, Zeng T, Zhao Q, Zhou R, Chen ZS, Cimini BA, Coppin G, 
Coté AG, Haghighi M, Hao T, Hill DE, Lacoste J, Laval F, Reno C, Roth FP, Singh S, Spirohn-Fitzgerald K, Taipale M, Teelucksingh T, Tixhon M, Yadav A, Yang Z, Kraus WL, Armendariz DA, Dederich AE, Gogate A, El Hayek L, Goetsch SC, Kaur K, Kim HB, McCoy MK, Nzima MZ, Pinzón-Arteaga CA, Posner BA, Schmitz DA, Sivakumar S, Sundarrajan A, Wang L, Wang Y, Wu J, Xu L, Xu J, Yu L, Zhang Y, Zhao H, Zhou Q, Won H, Bell JL, Broadaway KA, Degner KN, Etheridge AS, Koller BH, Mah W, Mu W, Ritola KD, Rosen JD, Schoenrock SA, Sharp RA, Bauer D, Lettre G, Sherwood R, Becerra B, Blaine LJ, Che E, Francoeur MJ, Gibbs EN, Kim N, King EM, Kleinstiver BP, Lecluze E, Li Z, Patel ZM, Phan QV, Ryu J, Starr ML, Wu T, Gersbach CA, Crawford GE, Allen AS, Majoros WH, Iglesias N, Rai R, Venukuttan R, Li B, Anglen T, Bounds LR, Hamilton MC, Liu S, McCutcheon SR, McRoberts Amador CD, Reisman SJ, ter Weele MA, Bodle JC, Streff HL, Siklenka K, Strouse K, Bernstein BE, Babu J, Corona GB, Dong K, Duarte FM, Durand NC, Epstein CB, Fan K, Gaskell E, Hall AW, Ham AM, Knudson MK, Shoresh N, Wekhande S, White CM, Xi W, Satpathy AT, Corces MR, Chang SH, Chin IM, Gardner JM, Gardell ZA, Gutierrez JC, Johnson AW, Kampman L, Kasowski M, Lareau CA, Liu V, Ludwig LS, McGinnis CS, Menon S, Qualls A, Sandor K, Turner AW, Ye CJ, Yin Y, Zhang W, Wold BJ, Carilli M, Cheong D, Filibam G, Green K, Kawauchi S, Kim C, Liang H, Loving R, Luebbert L, MacGregor G, Merchan AG, Rebboah E, Rezaie N, Sakr J, Sullivan DK, Swarna N, Trout D, Upchurch S, Weber R, Castro CP, Chou E, Feng F, Guerra A, Huang Y, Jiang L, Liu J, Mills RE, Qian W, Qin T, Sartor MA, Sherpa RN, Wang J, Wang Y, Welch JD, Zhang Z, Zhao N, Mukherjee S, Page CD, Clarke S, Doty RW, Duan Y, Gordan R, Ko KY, Li S, Li B, Thomson A, Raychaudhuri S, Price A, Ali TA, Dey KK, Durvasula A, Kellis M, Iakoucheva LM, Kakati T, Chen Y, Benazouz M, Jain S, Zeiberg D, De Paolis Kaluza MC, Velyunskiy M, Gasch A, Huang K, Jin Y, Lu Q, Miao J, Ohtake M, Scopel E, Steiner RD, 
Sverchkov Y, Weng Z, Garber M, Fu Y, Haas N, Li X, Phalke N, Shan SC, Shedd N, Yu T, Zhang Y, Zhou H, Battle A, Jerby L, Kotler E, Kundu S, Marderstein AR, Montgomery SB, Nigam A, Padhi EM, Patel A, Pritchard J, Raine I, Ramalingam V, Rodrigues KB, Schreiber JM, Singhal A, Sinha R, Wang AT, Abundis M, Bisht D, Chakraborty T, Fan J, Hall DR, Rarani ZH, Jain AK, Kaundal B, Keshari S, McGrail D, Pease NA, Yi VF, Wu H, Kannan S, Song H, Cai J, Gao Z, Kurzion R, Leu JI, Li F, Liang D, Ming GL, Musunuru K, Qiu Q, Shi J, Su Y, Tishkoff S, Xie N, Yang Q, Yang W, Zhang H, Zhang Z, Beer MA, Hadjantonakis AK, Adeniyi S, Cho H, Cutler R, Glenn RA, Godovich D, Hu N, Jovanic S, Luo R, Oh JW, Razavi-Mohseni M, Shigaki D, Sidoli S, Vierbuchen T, Wang X, Williams B, Yan J, Yang D, Yang Y, Sander M, Gaulton KJ, Ren B, Bartosik W, Indralingam HS, Klie A, Mummey H, Okino ML, Wang G, Zemke NR, Zhang K, Zhu H, Zaitlen N, Ernst J, Langerman J, Li T, Sun Y, Rudensky AY, Periyakoil PK, Gao VR, Smith MH, Thomas NM, Donlin LT, Lakhanpal A, Southard KM, Ardy RC, Cherry JM, Gerstein MB, Andreeva K, Assis PR, Borsari B, Douglass E, Dong S, Gabdank I, Graham K, Jolanki O, Jou J, Kagda MS, Lee JW, Li M, Lin K, Miyasato SR, Rozowsky J, Small C, Spragins E, Tanaka FY, Whaling IM, Youngworth IA, Sloan CA, Belter E, Chen X, Chisholm RL, Dickson P, Fan C, Fulton L, Li D, Lindsay T, Luan Y, Luo Y, Lyu H, Ma X, Macias-Velasco J, Miga KH, Quaid K, Stitziel N, Stranger BE, Tomlinson C, Wang J, Zhang W, Zhang B, Zhao G, Zhuo X, Brennand K, Ciccia A, Hayward SB, Huang JW, Leuzzi G, Taglialatela A, Thakar T, Vaitsiankova A, Dey KK, Ali TA, Kim A, Grimes HL, Salomonis N, Gupta R, Fang S, Lee-Kim V, Heinig M, Losert C, Jones TR, Donnard E, Murphy M, Roberts E, Song S, Mostafavi S, Sasse A, Spiro A, Pennacchio LA, Kato M, Kosicki M, Mannion B, Slaven N, Visel A, Pollard KS, Drusinsky S, Whalen S, Ray J, Harten IA, Ho CH, Sanjana NE, Caragine C, Morris JA, Seruggia D, Kutschat AP, Wittibschlager S, Xu H, Fu R, 
He W, Zhang L, Osorio D, Bly Z, Calluori S, Gilchrist DA, Hutter CM, Morris SA, Samer EK. Deciphering the impact of genomic variation on function. Nature 2024; 633:47-57. [PMID: 39232149 DOI: 10.1038/s41586-024-07510-0] [Received: 04/11/2023] [Accepted: 05/02/2024] [Indexed: 09/06/2024]
Abstract
Our genomes influence nearly every aspect of human biology, from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.
3
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND: Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS: The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS: VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
  - Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
4
Badonyi M, Marsh JA. Proteome-scale prediction of molecular mechanisms underlying dominant genetic diseases. PLoS One 2024; 19:e0307312. [PMID: 39172982 PMCID: PMC11341024 DOI: 10.1371/journal.pone.0307312] [Received: 03/07/2024] [Accepted: 06/26/2024] [Indexed: 08/24/2024] Open
Abstract
Many dominant genetic disorders result from protein-altering mutations, acting primarily through dominant-negative (DN), gain-of-function (GOF), and loss-of-function (LOF) mechanisms. Deciphering the mechanisms by which dominant diseases exert their effects is often experimentally challenging and resource intensive, but is essential for developing appropriate therapeutic approaches. Diseases that arise via a LOF mechanism are more amenable to be treated by conventional gene therapy, whereas DN and GOF mechanisms may require gene editing or targeting by small molecules. Moreover, pathogenic missense mutations that act via DN and GOF mechanisms are more difficult to identify than those that act via LOF using nearly all currently available variant effect predictors. Here, we introduce a tripartite statistical model made up of support vector machine binary classifiers trained to predict whether human protein coding genes are likely to be associated with DN, GOF, or LOF molecular disease mechanisms. We test the utility of the predictions by examining biologically and clinically meaningful properties known to be associated with the mechanisms. Our results strongly support that the models are able to generalise on unseen data and offer insight into the functional attributes of proteins associated with different mechanisms. We hope that our predictions will serve as a springboard for researchers studying novel variants and those of uncertain clinical significance, guiding variant interpretation strategies and experimental characterisation. Predictions for the human UniProt reference proteome are available at https://osf.io/z4dcp/.
Affiliation(s)
- Mihaly Badonyi
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
- Joseph A. Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
5
McDonnell AF, Plech M, Livesey BJ, Gerasimavicius L, Owen LJ, Hall HN, FitzPatrick DR, Marsh JA, Kudla G. Deep mutational scanning quantifies DNA binding and predicts clinical outcomes of PAX6 variants. Mol Syst Biol 2024; 20:825-844. [PMID: 38849565 PMCID: PMC11219921 DOI: 10.1038/s44320-024-00043-8] [Received: 09/04/2023] [Revised: 04/05/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open
Abstract
Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.
Affiliation(s)
- Alexander F McDonnell
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Marcin Plech
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Benjamin J Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Lukas Gerasimavicius
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Liusaidh J Owen
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Hildegard Nikki Hall
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- David R FitzPatrick
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
6
Rubin AF. A new way of looking at transcription factor assays. Mol Syst Biol 2024; 20:741-743. [PMID: 38849564 PMCID: PMC11219719 DOI: 10.1038/s44320-024-00044-7] [Received: 03/22/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open
Abstract
AF Rubin discusses a new high-throughput functional assay for transcription factors applied for a deep mutational scanning study of the transcription factor PAX6 by Kudla and colleagues (McDonnell et al, 2023) in this issue of Molecular Systems Biology.
Affiliation(s)
- Alan F Rubin
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria, Australia
7
Tabet DR, Kuang D, Lancaster MC, Li R, Liu K, Weile J, Coté AG, Wu Y, Hegele RA, Roden DM, Roth FP. Benchmarking computational variant effect predictors by their ability to infer human traits. Genome Biol 2024; 25:172. [PMID: 38951922 PMCID: PMC11218265 DOI: 10.1186/s13059-024-03314-7] [Received: 10/10/2022] [Accepted: 06/17/2024] [Indexed: 07/03/2024] Open
Abstract
BACKGROUND: Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts. RESULTS: AlphaMissense outperformed all other predictors in inferring human traits based on rare missense variants in UK Biobank and All of Us participants. The overall rankings of computational variant effect predictors in these two cohorts showed a significant positive correlation. CONCLUSION: We describe a method to assess computational variant effect predictors that sidesteps the limitations of previous evaluations. This approach is generalizable to future predictors and could continue to inform predictor choice for personal and clinical genetics.
Affiliation(s)
- Daniel R Tabet
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Da Kuang
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Megan C Lancaster
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Roujia Li
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Karen Liu
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Jochen Weile
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Atina G Coté
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Yingzhou Wu
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Robert A Hegele
  - Department of Medicine, Department of Biochemistry, Schulich School of Medicine and Dentistry, Robarts Research Institute, Western University, London, ON, Canada
- Dan M Roden
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Pharmacology, Vanderbilt University Medical Centre, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Frederick P Roth
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
8
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Indexed: 07/10/2024]
Abstract
Background: Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results: The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions: VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability: VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
9
Ma K, Huang S, Ng KK, Lake NJ, Joseph S, Xu J, Lek A, Ge L, Woodman KG, Koczwara KE, Cohen J, Ho V, O'Connor CL, Brindley MA, Campbell KP, Lek M. Deep Mutational Scanning in Disease-related Genes with Saturation Mutagenesis-Reinforced Functional Assays (SMuRF). bioRxiv 2024:2023.07.12.548370. [PMID: 37873263 PMCID: PMC10592615 DOI: 10.1101/2023.07.12.548370] [Indexed: 10/25/2023]
Abstract
Interpretation of disease-causing genetic variants remains a challenge in human genetics. Current costs and complexity of deep mutational scanning methods hamper crowd-sourcing approaches toward genome-wide resolution of variants in disease-related genes. Our framework, Saturation Mutagenesis-Reinforced Functional assays (SMuRF), addresses these issues by offering simple and cost-effective saturation mutagenesis, as well as streamlining functional assays to enhance the interpretation of unresolved variants. Applying SMuRF to neuromuscular disease genes FKRP and LARGE1, we generated functional scores for all possible coding single nucleotide variants, which aid in resolving clinically reported variants of uncertain significance. SMuRF also demonstrates utility in predicting disease severity, resolving critical structural regions, and providing training datasets for the development of computational predictors. Our approach opens new directions for enabling variant-to-function insights for disease genes in a manner that is broadly useful for crowd-sourcing implementation across standard research laboratories.
Affiliation(s)
- Kaiyue Ma
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Shushu Huang
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Equal second authors
- Kenneth K Ng
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Equal second authors
- Nicole J Lake
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Soumya Joseph
  - Howard Hughes Medical Institute, Senator Paul D. Wellstone Muscular Dystrophy Specialized Research Center, Department of Molecular Physiology and Biophysics and Department of Neurology, Roy J. and Lucille A. Carver College of Medicine, The University of Iowa, Iowa City, IA, USA
- Jenny Xu
  - Yale University, New Haven, CT, USA
- Angela Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Muscular Dystrophy Association, Chicago, IL, USA
- Lin Ge
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Department of Neurology, National Center for Children's Health, Beijing Children's Hospital, Capital Medical University, Beijing, China
- Keryn G Woodman
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Justin Cohen
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Vincent Ho
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Melinda A Brindley
  - Department of Infectious Diseases, Department of Population Health, University of Georgia, Athens, GA, USA
  - Senior Authors
- Kevin P Campbell
  - Howard Hughes Medical Institute, Senator Paul D. Wellstone Muscular Dystrophy Specialized Research Center, Department of Molecular Physiology and Biophysics and Department of Neurology, Roy J. and Lucille A. Carver College of Medicine, The University of Iowa, Iowa City, IA, USA
  - Senior Authors
- Monkol Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Senior Authors
  - Lead Contact
|
10
|
Biar CG, Pfeifer C, Carvill GL, Calhoun JD. Multimodal framework to resolve variants of uncertain significance in TSC2. bioRxiv 2024:2024.06.07.597916. [PMID: 38895336 PMCID: PMC11185720 DOI: 10.1101/2024.06.07.597916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Efforts to resolve the functional impact of variants of uncertain significance (VUS) have lagged behind the identification of new VUS; as such, there is a critical need for scalable VUS resolution technologies. Computational variant effect predictors (VEPs), once trained, can predict pathogenicity for all missense variants in a gene, set of genes, or the exome. Existing tools have employed information on known pathogenic and benign variants throughout the genome to predict pathogenicity of VUS. We hypothesize that taking a gene-specific approach will improve pathogenicity prediction over globally-trained VEPs. We tested this hypothesis using the gene TSC2, whose loss of function results in tuberous sclerosis, a multisystem mTORopathy affecting about 1 in 6,000 individuals born in the United States. TSC2 has been identified as a high-priority target for VUS resolution, with (1) well-characterized molecular and patient phenotypes associated with loss-of-function variants, and (2) more than 2,700 VUS already documented in ClinVar. We developed Tuberous sclerosis classifier to Resolve variants of Uncertain Significance in TSC2 (TRUST), a machine learning model to predict pathogenicity of TSC2 missense VUS. To test whether these predictions are accurate, we further introduce curated loci prime editing (cliPE) as an accessible strategy for performing scalable multiplexed assays of variant effect (MAVEs). Using cliPE, we tested the effects of more than 200 TSC2 variants, including 106 VUS. It is highly likely this functional data alone would be sufficient to reclassify 92 VUS with most being reclassified as likely benign. We found that TRUST's classifications were correlated with the functional data, providing additional validation for the in silico predictions. We provide our pathogenicity predictions and MAVE data to aid with VUS resolution.
In the near future, we plan to host these data on a public website and deposit into relevant databases such as MAVEdb as a community resource. Ultimately, this study provides a framework to complete variant effect maps of TSC1 and TSC2 and adapt this approach to other mTORopathy genes.
Affiliation(s)
- Carina G Biar
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Cole Pfeifer
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Gemma L Carvill
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Jeffrey D Calhoun
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
|
11
|
Kernohan KD, Boycott KM. The expanding diagnostic toolbox for rare genetic diseases. Nat Rev Genet 2024; 25:401-415. [PMID: 38238519 DOI: 10.1038/s41576-023-00683-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 11/22/2023] [Indexed: 05/23/2024]
Abstract
Genomic technologies, such as targeted, exome and short-read genome sequencing approaches, have revolutionized the care of patients with rare genetic diseases. However, more than half of patients remain without a diagnosis. Emerging approaches from research-based settings such as long-read genome sequencing and optical genome mapping hold promise for improving the identification of disease-causal genetic variants. In addition, new omic technologies that measure the transcriptome, epigenome, proteome or metabolome are showing great potential for variant interpretation. As genetic testing options rapidly expand, the clinical community needs to be mindful of their individual strengths and limitations, as well as remaining challenges, to select the appropriate diagnostic test, correctly interpret results and drive innovation to address insufficiencies. If used effectively - through truly integrative multi-omics approaches and data sharing - the resulting large quantities of data from these established and emerging technologies will greatly improve the interpretative power of genetic and genomic diagnostics for rare diseases.
Affiliation(s)
- Kristin D Kernohan
  - CHEO Research Institute, University of Ottawa, Ottawa, ON, Canada
  - Newborn Screening Ontario, CHEO, Ottawa, ON, Canada
- Kym M Boycott
  - CHEO Research Institute, University of Ottawa, Ottawa, ON, Canada
  - Department of Genetics, CHEO, Ottawa, ON, Canada
|
12
|
Ma K, Gauthier LO, Cheung F, Huang S, Lek M. High-throughput assays to assess variant effects on disease. Dis Model Mech 2024; 17:dmm050573. [PMID: 38940340 PMCID: PMC11225591 DOI: 10.1242/dmm.050573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
Interpreting the wealth of rare genetic variants discovered in population-scale sequencing efforts and deciphering their associations with human health and disease present a critical challenge due to the lack of sufficient clinical case reports. One promising avenue to overcome this problem is deep mutational scanning (DMS), a method of introducing and evaluating large-scale genetic variants in model cell lines. DMS allows unbiased investigation of variants, including those that are not found in clinical reports, thus improving rare disease diagnostics. Currently, the main obstacle limiting the full potential of DMS is the availability of functional assays that are specific to disease mechanisms. Thus, we explore high-throughput functional methodologies suitable to examine broad disease mechanisms. We specifically focus on methods that do not require robotics or automation but instead use well-designed molecular tools to transform biological mechanisms into easily detectable signals, such as cell survival rate, fluorescence or drug resistance. Here, we aim to bridge the gap between disease-relevant assays and their integration into the DMS framework.
Affiliation(s)
- Kaiyue Ma
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Logan O. Gauthier
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Frances Cheung
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Shushu Huang
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Monkol Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
|
13
|
Cooper S, Obolenski S, Waters AJ, Bassett AR, Coelho MA. Analyzing the functional effects of DNA variants with gene editing. Cell Rep Methods 2024; 4:100776. [PMID: 38744287 PMCID: PMC11133854 DOI: 10.1016/j.crmeth.2024.100776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/01/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]
Abstract
Continual advancements in genomics have led to an ever-widening disparity between the rate of discovery of genetic variants and our current understanding of their functions and potential roles in disease. Systematic methods for phenotyping DNA variants are required to effectively translate genomics data into improved outcomes for patients with genetic diseases. To make the biggest impact, these approaches must be scalable and accurate, faithfully reflect disease biology, and define complex disease mechanisms. We compare current methods to analyze the function of variants in their endogenous DNA context using genome editing strategies, such as saturation genome editing, base editing and prime editing. We discuss how these technologies can be linked to high-content readouts to gain deep mechanistic insights into variant effects. Finally, we highlight key challenges that need to be addressed to bridge the genotype to phenotype gap, and ultimately improve the diagnosis and treatment of genetic diseases.
Affiliation(s)
- Sarah Cooper
  - Cellular and Gene Editing Research, Wellcome Sanger Institute, Hinxton, UK
- Sofia Obolenski
  - Experimental Cancer Genetics, Wellcome Sanger Institute, Hinxton, UK; Department of Dermatology, Leiden University Medical Center, Leiden, the Netherlands
- Andrew J Waters
  - Experimental Cancer Genetics, Wellcome Sanger Institute, Hinxton, UK
- Andrew R Bassett
  - Cellular and Gene Editing Research, Wellcome Sanger Institute, Hinxton, UK
|
14
|
Clausen L, Okarmus J, Voutsinos V, Meyer M, Lindorff-Larsen K, Hartmann-Petersen R. PRKN-linked familial Parkinson's disease: cellular and molecular mechanisms of disease-linked variants. Cell Mol Life Sci 2024; 81:223. [PMID: 38767677 PMCID: PMC11106057 DOI: 10.1007/s00018-024-05262-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/25/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024]
Abstract
Parkinson's disease (PD) is a common and incurable neurodegenerative disorder that arises from the loss of dopaminergic neurons in the substantia nigra and is mainly characterized by progressive loss of motor function. Monogenic familial PD is associated with highly penetrant variants in specific genes, notably the PRKN gene, where homozygous or compound heterozygous loss-of-function variants predominate. PRKN encodes Parkin, an E3 ubiquitin-protein ligase important for protein ubiquitination and mitophagy of damaged mitochondria. Accordingly, Parkin plays a central role in mitochondrial quality control but is itself also subject to a strict protein quality control system that rapidly eliminates certain disease-linked Parkin variants. Here, we summarize the cellular and molecular functions of Parkin, highlighting the various mechanisms by which PRKN gene variants result in loss-of-function. We emphasize the importance of high-throughput assays and computational tools for the clinical classification of PRKN gene variants and how detailed insights into the pathogenic mechanisms of PRKN gene variants may impact the development of personalized therapeutics.
Affiliation(s)
- Lene Clausen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Justyna Okarmus
  - Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
- Vasileios Voutsinos
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Morten Meyer
  - Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
  - Department of Neurology, Odense University Hospital, 5000, Odense, Denmark
  - Department of Clinical Research, BRIDGE, Brain Research Inter Disciplinary Guided Excellence, University of Southern Denmark, 5230, Odense, Denmark
- Kresten Lindorff-Larsen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Rasmus Hartmann-Petersen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
|
15
|
Tenthorey JL, del Banco S, Ramzan I, Klingenberg H, Liu C, Emerman M, Malik HS. Indels allow antiviral proteins to evolve functional novelty inaccessible by missense mutations. bioRxiv 2024:2024.05.07.592993. [PMID: 38765965 PMCID: PMC11100679 DOI: 10.1101/2024.05.07.592993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Antiviral proteins often evolve rapidly at virus-binding interfaces to defend against new viruses. We investigated whether antiviral adaptation via missense mutations might face limits, which insertion or deletion mutations (indels) could overcome. We report one such case of a nearly insurmountable evolutionary challenge: the human anti-retroviral protein TRIM5α requires more than five missense mutations in its specificity-determining v1 loop to restrict a divergent simian immunodeficiency virus (SIV). However, duplicating just one amino acid in v1 enables human TRIM5α to potently restrict SIV in a single evolutionary step. Moreover, natural primate TRIM5α v1 loops have evolved indels that confer novel antiviral specificities. Thus, indels enable antiviral proteins to overcome viral challenges inaccessible by missense mutations, revealing the potential of these often-overlooked mutations in driving protein innovation.
Affiliation(s)
- Jeannette L. Tenthorey
  - Cellular and Molecular Pharmacology Department, University of California, San Francisco; San Francisco, 94158, USA
- Serena del Banco
  - Division of Basic Sciences, Fred Hutchinson Cancer Center; Seattle, USA
- Ishrak Ramzan
  - Cellular and Molecular Pharmacology Department, University of California, San Francisco; San Francisco, 94158, USA
- Hayley Klingenberg
  - Cellular and Molecular Pharmacology Department, University of California, San Francisco; San Francisco, 94158, USA
- Chang Liu
  - Cellular and Molecular Pharmacology Department, University of California, San Francisco; San Francisco, 94158, USA
- Michael Emerman
  - Division of Basic Sciences, Fred Hutchinson Cancer Center; Seattle, USA
  - Division of Human Biology, Fred Hutchinson Cancer Center; Seattle, USA
- Harmit S. Malik
  - Division of Basic Sciences, Fred Hutchinson Cancer Center; Seattle, USA
  - Howard Hughes Medical Investigator, Fred Hutchinson Cancer Center; Seattle, USA
|
16
|
Allen S, Garrett A, Muffley L, Fayer S, Foreman J, Adams DJ, Hurles M, Rubin AF, Roth FP, Starita LM, Biesecker LG, Turnbull C. Workshop report: the clinical application of data from multiplex assays of variant effect (MAVEs), 12 July 2023. Eur J Hum Genet 2024; 32:593-600. [PMID: 38433264 PMCID: PMC11061192 DOI: 10.1038/s41431-024-01566-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 02/05/2024] [Accepted: 02/08/2024] [Indexed: 03/05/2024] Open
Affiliation(s)
- Sophie Allen
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
- Alice Garrett
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
  - St George's University Hospitals NHS Foundation Trust, Tooting, London, UK
- Lara Muffley
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Shawn Fayer
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Julia Foreman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- David J Adams
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Matthew Hurles
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Alan F Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
  - Donnelly Centre and Departments of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Lea M Starita
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Leslie G Biesecker
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Clare Turnbull
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
  - The Royal Marsden NHS Foundation Trust, London, UK
|
17
|
Livesey BJ, Badonyi M, Dias M, Frazer J, Kumar S, Lindorff-Larsen K, McCandlish DM, Orenbuch R, Shearer CA, Muffley L, Foreman J, Glazer AM, Lehner B, Marks DS, Roth FP, Rubin AF, Starita LM, Marsh JA. Guidelines for releasing a variant effect predictor. arXiv 2024:arXiv:2404.10807v1. [PMID: 38699161 PMCID: PMC11065047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs. Emphasising open-source availability, transparent methodologies, clear variant effect score interpretations, standardised scales, accessible predictions, and rigorous training data disclosure, we aim to improve the usability and interpretability of VEPs, and promote their integration into analysis and evaluation pipelines. We also provide a large, categorised list of currently available VEPs, aiming to facilitate the discovery and encourage the usage of novel methods within the scientific community.
Affiliation(s)
- Benjamin J. Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Mihaly Badonyi
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Mafalda Dias
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jonathan Frazer
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Sushant Kumar
  - Department of Medical Biophysics, University of Toronto; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Kresten Lindorff-Larsen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- David M. McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Rose Orenbuch
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Lara Muffley
  - Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Julia Foreman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ben Lehner
  - Wellcome Sanger Institute, Cambridge, UK; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Debora S. Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Boston, MA, USA
- Frederick P. Roth
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Alan F. Rubin
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research; Department of Medical Biology, University of Melbourne, Parkville, Australia
- Lea M. Starita
  - Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Joseph A. Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
|
18
|
Dawood M, Fayer S, Pendyala S, Post M, Kalra D, Patterson K, Venner E, Muffley LA, Fowler DM, Rubin AF, Posey JE, Plon SE, Lupski JR, Gibbs RA, Starita LM, Robles-Espinoza CD, Coyote-Maestas W, Gallego Romero I. Defining and Reducing Variant Classification Disparities. medRxiv 2024:2024.04.11.24305690. [PMID: 38645101 PMCID: PMC11030469 DOI: 10.1101/2024.04.11.24305690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style data may help resolve variant classification disparities between populations, especially for variants of uncertain significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign and variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were higher in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate in comparison to reclassified VUS from European-like genetic ancestry (p = 9.1e-03) effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.
|
19
|
Takahashi S, Kojima T, Wasano K, Homma K. Functional Studies of Deafness-Associated Pendrin and Prestin Variants. Int J Mol Sci 2024; 25:2759. [PMID: 38474007 PMCID: PMC10931795 DOI: 10.3390/ijms25052759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 02/21/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024] Open
Abstract
Pendrin and prestin are evolutionary-conserved membrane proteins that are essential for normal hearing. Dysfunction of these proteins results in hearing loss in humans, and numerous deafness-associated pendrin and prestin variants have been identified in patients. However, the pathogenic impacts of many of these variants are ambiguous. Here, we report results from our ongoing efforts to experimentally characterize pendrin and prestin variants using in vitro functional assays. With previously established fluorometric anion transport assays, we determined that many of the pendrin variants identified on transmembrane (TM) 10, which contains the essential anion binding site, and on the neighboring TM9 within the core domain resulted in impaired anion transport activity. We also determined the range of functional impairment in three deafness-associated prestin variants by measuring nonlinear capacitance (NLC), a proxy for motor function. Using the results from our functional analyses, we also evaluated the performance of AlphaMissense (AM), a computational tool for predicting the pathogenicity of missense variants. AM prediction scores correlated well with our experimental results; however, some variants were misclassified, underscoring the necessity of experimentally assessing the effects of variants. Together, our experimental efforts provide invaluable information regarding the pathogenicity of deafness-associated pendrin and prestin variants.
Affiliation(s)
- Satoe Takahashi
  - Department of Otolaryngology—Head and Neck Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Takashi Kojima
  - Department of Otolaryngology—Head and Neck Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
  - Department of Otolaryngology, Head and Neck Surgery, National Hospital Organization Tochigi Medical Center, Tochigi 320-0057, Japan
- Koichiro Wasano
  - Department of Otolaryngology—Head and Neck Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
  - Department of Otolaryngology, Head and Neck Surgery, Tokai University School of Medicine, Isehara 259-1193, Japan
- Kazuaki Homma
  - Department of Otolaryngology—Head and Neck Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
  - The Hugh Knowles Center for Clinical and Basic Science in Hearing and Its Disorders, Northwestern University, Evanston, IL 60208, USA
|
20
|
Barili V, Ambrosini E, Bortesi B, Minari R, De Sensi E, Cannizzaro IR, Taiani A, Michiara M, Sikokis A, Boggiani D, Tommasi C, Serra O, Bonatti F, Adorni A, Luberto A, Caggiati P, Martorana D, Uliana V, Percesepe A, Musolino A, Pellegrino B. Genetic Basis of Breast and Ovarian Cancer: Approaches and Lessons Learnt from Three Decades of Inherited Predisposition Testing. Genes (Basel) 2024; 15:219. [PMID: 38397209 PMCID: PMC10888198 DOI: 10.3390/genes15020219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/02/2024] [Accepted: 02/05/2024] [Indexed: 02/25/2024] Open
Abstract
Germline variants occurring in BRCA1 and BRCA2 give rise to hereditary breast and ovarian cancer (HBOC) syndrome, predisposing to breast, ovarian, fallopian tube, and peritoneal cancers marked by elevated incidences of genomic aberrations that correspond to poor prognoses. These genes are in fact involved in genetic integrity, particularly in the process of homologous recombination (HR) DNA repair, a high-fidelity repair system for mending DNA double-strand breaks. In addition to its implication in HBOC pathogenesis, the impairment of HR has become a prime target for therapeutic intervention utilizing poly (ADP-ribose) polymerase (PARP) inhibitors. In the present review, we introduce the molecular roles of HR orchestrated by BRCA1 and BRCA2 within the framework of sensitivity to PARP inhibitors. We examine the genetic architecture underneath breast and ovarian cancer ranging from high- and mid- to low-penetrant predisposing genes and taking into account both germline and somatic variations. Finally, we consider higher levels of complexity of the genomic landscape such as polygenic risk scores and other approaches aiming to optimize therapeutic and preventive strategies for breast and ovarian cancer.
Affiliation(s)
- Valeria Barili
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
| | - Enrico Ambrosini
- Medical Genetics, University Hospital of Parma, 43126 Parma, Italy
| | - Beatrice Bortesi
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Roberta Minari
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Erika De Sensi
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
| | | | - Antonietta Taiani
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
| | - Maria Michiara
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Angelica Sikokis
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Daniela Boggiani
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Chiara Tommasi
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Olga Serra
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Francesco Bonatti
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Alessia Adorni
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Anita Luberto
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
| | | | - Davide Martorana
- Medical Genetics, University Hospital of Parma, 43126 Parma, Italy
| | - Vera Uliana
- Medical Genetics, University Hospital of Parma, 43126 Parma, Italy
| | - Antonio Percesepe
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
- Medical Genetics, University Hospital of Parma, 43126 Parma, Italy
| | - Antonino Musolino
- Department of Medicine and Surgery, University of Parma, 43126 Parma, Italy
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
| | - Benedetta Pellegrino
- Medical Oncology Unit, University Hospital of Parma, 43126 Parma, Italy
- Breast Unit, University Hospital of Parma, 43126 Parma, Italy
|
21
|
Andorf CM, Haley OC, Hayford RK, Portwood JL, Harding S, Sen S, Cannon EK, Gardiner JM, Kim HS, Woodhouse MR. PanEffect: a pan-genome visualization tool for variant effects in maize. Bioinformatics 2024; 40:btae073. [PMID: 38337024 PMCID: PMC10881103 DOI: 10.1093/bioinformatics/btae073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/12/2024] Open
Abstract
SUMMARY: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement.
AVAILABILITY AND IMPLEMENTATION: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
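The log-likelihood ratio score mentioned in the summary above can be illustrated with a minimal sketch. It is not PanEffect's code: it assumes that a protein language model has already produced, for each residue position, a log-probability for every amino acid, and the function name `llr_score` and the toy probabilities are hypothetical placeholders rather than real ESM output.

```python
import math


def llr_score(log_probs, position, wild_type, mutant):
    """Log-likelihood ratio of a substitution at a 0-based position.

    log_probs: list with one dict per residue position, mapping an
    amino-acid letter to its log-probability (assumed model output).
    A negative score means the mutant residue is less likely than the
    reference residue at that site.
    """
    site = log_probs[position]
    return site[mutant] - site[wild_type]


# Toy per-position probabilities (hypothetical numbers, not model output).
toy_probs = [
    {"A": 0.70, "V": 0.20, "D": 0.10},
    {"A": 0.05, "V": 0.15, "D": 0.80},
]
toy_log_probs = [
    {aa: math.log(p) for aa, p in site.items()} for site in toy_probs
]

# Score the substitution A -> D at the first position.
score = llr_score(toy_log_probs, 0, "A", "D")
print(f"A1D log-likelihood ratio: {score:.3f}")
```

Under these assumptions, a strongly negative score flags a substitution that the model considers unlikely relative to the reference residue, while a score near zero suggests a tolerated change.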
Affiliation(s)
- Carson M Andorf
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Department of Computer Science, Iowa State University, Ames, IA 50011, United States
| | - Olivia C Haley
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
| | - Rita K Hayford
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
| | - John L Portwood
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
| | - Stephen Harding
- USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
| | - Shatabdi Sen
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, United States
| | - Ethalinda K Cannon
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
| | - Jack M Gardiner
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, United States
| | - Hye-Seon Kim
- USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
| | - Margaret R Woodhouse
- USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
|
22
|
Takahashi S, Kojima T, Wasano K, Homma K. Functional studies of deafness-associated pendrin and prestin variants. bioRxiv 2024:2024.01.23.576877. [PMID: 38328051 PMCID: PMC10849616 DOI: 10.1101/2024.01.23.576877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Pendrin and prestin are evolutionarily conserved membrane proteins that are essential for normal hearing. Pendrin is an anion transporter required for normal development and maintenance of ion homeostasis in the inner ear, while prestin is a voltage-dependent motor responsible for cochlear amplification, which is essential for the high sensitivity and frequency selectivity of mammalian hearing. Dysfunction of these proteins results in hearing loss in humans, and numerous deafness-associated pendrin and prestin variants have been identified in patients. However, the pathogenic impacts of many of these variants are ambiguous. Here we report results from our ongoing efforts to experimentally characterize pendrin and prestin variants using in vitro functional assays, providing invaluable information regarding their pathogenicity.
|
23
|
Nuttle X, Burt ND, Currall B, Moysés-Oliveira M, Mohajeri K, Bhavsar R, Lucente D, Yadav R, Tai DJC, Gusella JF, Talkowski ME. Parallelized engineering of mutational models using piggyBac transposon delivery of CRISPR libraries. Cell Rep Methods 2024; 4:100672. [PMID: 38091988 PMCID: PMC10831954 DOI: 10.1016/j.crmeth.2023.100672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 08/14/2023] [Accepted: 11/21/2023] [Indexed: 01/25/2024]
Abstract
New technologies and large-cohort studies have enabled novel variant discovery and association at unprecedented scale, yet functional characterization of these variants remains paramount to deciphering disease mechanisms. Approaches that facilitate parallelized genome editing of cells of interest or induced pluripotent stem cells (iPSCs) have become critical tools toward this goal. Here, we developed an approach that incorporates libraries of CRISPR-Cas9 guide RNAs (gRNAs) together with inducible Cas9 into a piggyBac (PB) transposon system to engineer dozens to hundreds of genomic variants in parallel against isogenic cellular backgrounds. This method empowers loss-of-function (LoF) studies through the introduction of insertions or deletions (indels) and copy-number variants (CNVs), though generating specific nucleotide changes is possible with prime editing. The ability to rapidly establish high-quality mutational models at scale will facilitate the development of isogenic cellular collections and catalyze comparative functional genomic studies investigating the roles of hundreds of genes and mutations in development and disease.
Affiliation(s)
- Xander Nuttle
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA.
| | - Nicholas D Burt
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
| | - Benjamin Currall
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
| | - Mariana Moysés-Oliveira
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
| | - Kiana Mohajeri
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA; PhD program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
| | - Riya Bhavsar
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
| | - Diane Lucente
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
| | - Rachita Yadav
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
| | - Derek J C Tai
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
| | - James F Gusella
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Harvard Stem Cell Institute, Cambridge, MA, USA
| | - Michael E Talkowski
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA.
|
24
|
Harrison PW, Amode MR, Austine-Orimoloye O, Azov A, Barba M, Barnes I, Becker A, Bennett R, Berry A, Bhai J, Bhurji SK, Boddu S, Branco Lins PR, Brooks L, Ramaraju S, Campbell L, Martinez MC, Charkhchi M, Chougule K, Cockburn A, Davidson C, De Silva N, Dodiya K, Donaldson S, El Houdaigui B, Naboulsi T, Fatima R, Giron CG, Genez T, Grigoriadis D, Ghattaoraya G, Martinez JG, Gurbich T, Hardy M, Hollis Z, Hourlier T, Hunt T, Kay M, Kaykala V, Le T, Lemos D, Lodha D, Marques-Coelho D, Maslen G, Merino G, Mirabueno L, Mushtaq A, Hossain S, Ogeh D, Sakthivel MP, Parker A, Perry M, Piližota I, Poppleton D, Prosovetskaia I, Raj S, Pérez-Silva J, Salam A, Saraf S, Saraiva-Agostinho N, Sheppard D, Sinha S, Sipos B, Sitnik V, Stark W, Steed E, Suner MM, Surapaneni L, Sutinen K, Tricomi FF, Urbina-Gómez D, Veidenberg A, Walsh TA, Ware D, Wass E, Willhoft N, Allen J, Alvarez-Jarreta J, Chakiachvili M, Flint B, Giorgetti S, Haggerty L, Ilsley G, Keatley J, Loveland J, Moore B, Mudge J, Naamati G, Tate J, Trevanion S, Winterbottom A, Frankish A, Hunt SE, Cunningham F, Dyer S, Finn R, Martin F, Yates A. Ensembl 2024. Nucleic Acids Res 2024; 52:D891-D899. [PMID: 37953337 PMCID: PMC10767893 DOI: 10.1093/nar/gkad1049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/20/2023] [Accepted: 10/24/2023] [Indexed: 11/14/2023] Open
Abstract
Ensembl (https://www.ensembl.org) is a freely available genomic resource that has produced high-quality annotations, tools, and services for vertebrates and model organisms for more than two decades. In recent years, there has been a dramatic shift in the genomic landscape, with a large increase in the number and phylogenetic breadth of high-quality reference genomes, alongside major advances in the pan-genome representations of higher species. In order to support these efforts and accelerate downstream research, Ensembl continues to focus on scaling for the rapid annotation of new genome assemblies, developing new methods for comparative analysis, and expanding the depth and quality of our genome annotations. This year we have continued our expansion to support global biodiversity research, doubling the number of annotated genomes we support on our Rapid Release site to over 1700, driven by our close collaboration with biodiversity projects such as Darwin Tree of Life. We have also strengthened support for key agricultural species, including the first regulatory builds for farmed animals, and have updated key tools and resources that support the global scientific community, notably the Ensembl Variant Effect Predictor. Ensembl data, software, and tools are freely available.
Affiliation(s)
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - M Ridwan Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Olanrewaju Austine-Orimoloye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Andrey G Azov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Matthieu Barba
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - If Barnes
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Arne Becker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Ruth Bennett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Andrew Berry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Simarpreet Kaur Bhurji
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Sanjay Boddu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Paulo R Branco Lins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Lucy Brooks
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Shashank Budhanuru Ramaraju
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Lahcen I Campbell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Manuel Carbajo Martinez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Mehrnaz Charkhchi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Alexander Cockburn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Claire Davidson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Nishadi H De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Kamalkumar Dodiya
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Sarah Donaldson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Bilal El Houdaigui
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Tamara El Naboulsi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Reham Fatima
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Thiago Genez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Dionysios Grigoriadis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Gurpreet S Ghattaoraya
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jose Gonzalez Martinez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Tatiana A Gurbich
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Matthew Hardy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Zoe Hollis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Toby Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Mike Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Vinay Kaykala
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Tuan Le
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Diana Lemos
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Disha Lodha
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Diego Marques-Coelho
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Gareth Maslen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Gabriela Alejandra Merino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Louisse Paola Mirabueno
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Aleena Mushtaq
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Syed Nakib Hossain
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Denye N Ogeh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Manoj Pandian Sakthivel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Malcolm Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Ivana Piližota
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Daniel Poppleton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Irina Prosovetskaia
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - José G Pérez-Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Ahamed Imran Abdul Salam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Shradha Saraf
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Nuno Saraiva-Agostinho
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Dan Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Swati Sinha
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Botond Sipos
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Vasily Sitnik
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - William Stark
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Emily Steed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Marie-Marthe Suner
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Likhitha Surapaneni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Kyösti Sutinen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Francesca Floriana Tricomi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - David Urbina-Gómez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Andres Veidenberg
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Thomas A Walsh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Doreen Ware
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
| | - Elizabeth Wass
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Natalie L Willhoft
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jamie Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jorge Alvarez-Jarreta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Marc Chakiachvili
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Bethany Flint
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Stefano Giorgetti
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Garth R Ilsley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jon Keatley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jane E Loveland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - John Tate
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Andrea Winterbottom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
| | - Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire CB10 1SD, UK
|
25
|
Fowler DM, Rehm HL. Will variants of uncertain significance still exist in 2030? Am J Hum Genet 2024; 111:5-10. [PMID: 38086381 PMCID: PMC10806733 DOI: 10.1016/j.ajhg.2023.11.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/22/2023] [Revised: 11/12/2023] [Accepted: 11/13/2023] [Indexed: 12/28/2023] Open
Abstract
In 2020, the National Human Genome Research Institute (NHGRI) made ten "bold predictions," including that "the clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation 'variant of uncertain significance (VUS)' obsolete." We discuss the prospects for this prediction, arguing that many, if not most, VUS in coding regions will be resolved by 2030. We outline a confluence of recent changes making this possible, especially advances in the standards for variant classification that better leverage diverse types of evidence, improvements in computational variant effect predictor performance, scalable multiplexed assays of variant effect capable of saturating the genome, and data-sharing efforts that will maximize the information gained from each new individual sequenced and variant interpreted. We suggest that clinicians and researchers can realize a future where VUSs have largely been eliminated, in line with the NHGRI's bold prediction. The length of time taken to reach this future, and thus whether we are able to achieve the goal of largely eliminating VUSs by 2030, is largely a consequence of the choices made now and in the next few years. We believe that investing in eliminating VUSs is worthwhile, since their predominance remains one of the biggest challenges to precision genomic medicine.
Affiliation(s)
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Heidi L Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
26
Kwon S, Safer J, Nguyen DT, Hoksza D, May P, Arbesfeld JA, Rubin AF, Campbell AJ, Burgin A, Iqbal S. Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures. bioRxiv 2024:2024.01.02.573913. [PMID: 38260256 PMCID: PMC10802383 DOI: 10.1101/2024.01.02.573913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.
27
Calame DG, Emrick LT. Functional genomics and small molecules in mitochondrial neurodevelopmental disorders. Neurotherapeutics 2024; 21:e00316. [PMID: 38244259 PMCID: PMC10903096 DOI: 10.1016/j.neurot.2024.e00316] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2023] [Revised: 12/16/2023] [Accepted: 01/02/2024] [Indexed: 01/22/2024]
Abstract
Mitochondria are critical for brain development and homeostasis. Therefore, pathogenic variation in the mitochondrial or nuclear genome which disrupts mitochondrial function frequently results in developmental disorders and neurodegeneration at the organismal level. Large-scale application of genome-wide technologies to individuals with mitochondrial diseases has dramatically accelerated identification of mitochondrial disease-gene associations in humans. Multi-omic and high-throughput studies involving transcriptomics, proteomics, metabolomics, and saturation genome editing are providing deeper insights into the functional consequence of mitochondrial genomic variation. Integration of deep phenotypic and genomic data through allelic series continues to uncover novel mitochondrial functions and permit mitochondrial gene function dissection on an unprecedented scale. Finally, mitochondrial disease-gene associations illuminate disease mechanisms and thereby direct therapeutic strategies involving small molecules and RNA-DNA therapeutics. This review summarizes progress in functional genomics and small molecule therapeutics in mitochondrial neurodevelopmental disorders.
Affiliation(s)
- Daniel G Calame
- Section of Pediatric Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA; Texas Children's Hospital, Houston, TX, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
- Lisa T Emrick
- Section of Pediatric Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA; Texas Children's Hospital, Houston, TX, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
28
Abakarova M, Marquet C, Rera M, Rost B, Laine E. Alignment-based Protein Mutational Landscape Prediction: Doing More with Less. Genome Biol Evol 2023; 15:evad201. [PMID: 37936309 PMCID: PMC10653582 DOI: 10.1093/gbe/evad201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 10/27/2023] [Accepted: 11/01/2023] [Indexed: 11/09/2023]
Abstract
The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the whole human proteome mutational landscape and simplified access to our predictive pipeline.
Affiliation(s)
- Marina Abakarova
- CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), Sorbonne Université, UMR 7238, Paris 75005, France
- Université Paris Cité, INSERM UMR U1284, 75004 Paris, France
- Céline Marquet
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748 Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
- Michael Rera
- Université Paris Cité, INSERM UMR U1284, 75004 Paris, France
- Burkhard Rost
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748 Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, Garching, 85748 Munich, Germany
- TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
- Elodie Laine
- CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), Sorbonne Université, UMR 7238, Paris 75005, France
- Institut universitaire de France (IUF)
29
Hollstein R, Peron A, Wendt KS, Parenti I. Editorial: Pathogenic mechanisms in neurodevelopmental disorders: advances in cellular models and multi-omics approaches. Front Cell Dev Biol 2023; 11:1296885. [PMID: 37868909 PMCID: PMC10588624 DOI: 10.3389/fcell.2023.1296885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 09/27/2023] [Indexed: 10/24/2023]
Affiliation(s)
- R. Hollstein
- Institute of Human Genetics, University of Bonn and University Hospital Bonn, Bonn, Germany
- A. Peron
- Medical Genetics, Meyer Children’s Hospital IRCCS, Florence, Italy
- Department of Experimental and Clinical Biomedical Sciences “Mario Serio”, Università Degli Studi Di Firenze, Florence, Italy
- K. S. Wendt
- Department of Cell Biology, Erasmus MC, Rotterdam, Netherlands
- I. Parenti
- Institute of Human Genetics, University Hospital Essen, University Duisburg-Essen, Essen, Germany
30
Abstract
Machine-learning algorithm uses structure prediction to spot disease-causing mutations.
Affiliation(s)
- Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Sarah A Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Theory of Condensed Matter, Cavendish Laboratory, University of Cambridge, Cambridge, UK
31
Livesey BJ, Marsh JA. Advancing variant effect prediction using protein language models. Nat Genet 2023; 55:1426-1427. [PMID: 37563330 DOI: 10.1038/s41588-023-01470-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Affiliation(s)
- Benjamin J Livesey
- MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, UK
- Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, UK.
32
The Impact of Genomic Variation on Function (IGVF) Consortium. arXiv 2023:arXiv:2307.13708v1. [PMID: 37547663 PMCID: PMC10402186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Our genomes influence nearly every aspect of human biology from molecular and cellular functions to phenotypes in health and disease. Human genetics studies have now associated hundreds of thousands of differences in our DNA sequence ("genomic variation") with disease risk and other phenotypes, many of which could reveal novel mechanisms of human biology and uncover the basis of genetic predispositions to diseases, thereby guiding the development of new diagnostics and therapeutics. Yet, understanding how genomic variation alters genome function to influence phenotype has proven challenging. To unlock these insights, we need a systematic and comprehensive catalog of genome function and the molecular and cellular effects of genomic variants. Toward this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations, and predictive modeling to investigate the relationships among genomic variation, genome function, and phenotypes. Through systematic comparisons and benchmarking of experimental and computational methods, we aim to create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how both coding and noncoding variants may connect through gene regulatory and protein interaction networks. These experimental data, computational predictions, and accompanying standards and pipelines will be integrated into an open resource that will catalyze community efforts to explore genome function and the impact of genetic variation on human biology and disease across populations.