1. Uttam V, Vohra V, Chhotaray S, Santhosh A, Diwakar V, Patel V, Gahlyan RK. Exome-wide comparative analyses revealed differentiating genomic regions for performance traits in Indian native buffaloes. Anim Biotechnol 2024; 35:2277376. [PMID: 37934017 DOI: 10.1080/10495398.2023.2277376] [Indexed: 11/08/2023]
Abstract
In India, 20 breeds of buffalo have been identified and registered, yet limited studies have been conducted to explore the performance potential of these breeds, especially in the Indian native breeds. This study is a maiden attempt to delineate the important variants and unique genes through exome sequencing for milk yield, milk composition, fertility, and adaptation traits in Indian local breeds of buffalo. In the present study, whole exome sequencing was performed on Chhattisgarhi (n = 3), Chilika (n = 4), Gojri (n = 3), and Murrah (n = 4) buffalo breeds and after stringent quality control, 4333, 6829, 4130, and 4854 InDels were revealed, respectively. Exome-wide FST along 100-kb sliding windows detected 27, 98, 38, and 35 outlier windows in Chhattisgarhi, Chilika, Gojri, and Murrah, respectively. The comparative exome analysis of InDels and subsequent gene ontology revealed unique breed-specific genes for milk yield (CAMSAP3), milk composition (CLCN1, NUDT3), fertility (PTGER3) and adaptation (KCNA3, TH) traits. The study provides insight into how these breeds have evolved under natural selection, the impact of these events on their respective genomes, and their importance in maintaining purity of these breeds for the traits under study. Additionally, these results will contribute to the genetic knowledge of these breeds for breeding applications and to the understanding of the evolution of these Indian local breeds.
Affiliation(s)
- Vishakha Uttam
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Vikas Vohra
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Supriya Chhotaray
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Ameya Santhosh
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Vikas Diwakar
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Vaibhav Patel
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
- Rajesh Kumar Gahlyan
  - Animal Genetics & Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana, India
2. Zhu L, Akhmet N, Bo D, Pan C, Wu J, Lan X. Genetic variant of the sheep E2F8 gene and its associations with litter size. Anim Biotechnol 2024; 35:2337751. [PMID: 38597900 DOI: 10.1080/10495398.2024.2337751] [Indexed: 04/11/2024]
Abstract
The economic efficiency of sheep breeding, aiming to enhance productivity, is a focal point for improvement of sheep breeding. Recent studies highlight the involvement of the Early Region 2 Binding Factor transcription factor 8 (E2F8) gene in female reproduction. Our group's recent genome-wide association study (GWAS) emphasizes the potential impact of the E2F8 gene on prolificacy traits in Australian White sheep (AUW). Herein, the purpose of this study was to assess the correlation of the E2F8 gene with litter size in the AUW sheep breed. This work encompassed 659 AUW sheep, genotyped using PCR-based genotyping technology. The results showed significant associations between the P1-del-32bp InDel and litter size at the fourth and fifth parities in AUW sheep; the litter size of animals with genotype ID was superior to that of animals with DD and II genotypes. Thus, these results indicate that the P1-del-32bp InDel within the E2F8 gene can be useful in marker-assisted selection (MAS) in sheep.
Affiliation(s)
- Leijing Zhu
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, PR China
- Nazar Akhmet
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, PR China
- Didi Bo
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, PR China
- Chuanying Pan
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, PR China
- Jiyao Wu
  - College of Animal Science, Fujian Agriculture and Forestry University, Fuzhou, PR China
- Xianyong Lan
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, PR China
3. Yang Y, Braga MV, Dean MD. Insertion-Deletion Events Are Depleted in Protein Regions with Predicted Secondary Structure. Genome Biol Evol 2024; 16:evae093. [PMID: 38735759 PMCID: PMC11102076 DOI: 10.1093/gbe/evae093] [Received: 02/16/2024] [Revised: 04/16/2024] [Accepted: 04/21/2024] [Indexed: 05/14/2024] [Open access]
Abstract
A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
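As a rough illustration of what "54% as much as expected" can mean, the sketch below computes an observed-to-expected overlap ratio under the simplest possible null: indels fall uniformly at random along the sequence, so the expected number overlapping secondary structure is the total indel count times the fraction of residues covered by secondary structure. This null model, the function name `overlap_ratio`, and all counts are assumptions made for illustration; the abstract does not state how the authors computed their expectation.

```python
def overlap_ratio(n_overlapping: int, n_total: int, covered_fraction: float) -> float:
    """Observed / expected count of indels overlapping secondary structure.

    Assumes a uniform-placement null: expected = n_total * covered_fraction.
    """
    if not 0.0 < covered_fraction <= 1.0:
        raise ValueError("covered_fraction must be in (0, 1]")
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    expected = n_total * covered_fraction
    return n_overlapping / expected


# Hypothetical counts, chosen only to show how a ratio near 0.54 could arise:
# 1000 indels, half of all residues in secondary structure, 270 overlaps.
ratio = overlap_ratio(n_overlapping=270, n_total=1000, covered_fraction=0.5)
print(f"observed/expected = {ratio:.2f}")
```

A ratio below 1 indicates depletion relative to the null and a ratio above 1 indicates enrichment; a real analysis would also need a significance test, for example by permuting indel positions.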
Affiliation(s)
- Yi Yang
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew V Braga
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew D Dean
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
4. Carvalho VHV, Rodrigues JCG, Vinagre LWMS, Pereira EEB, Monte N, Fernandes MR, Ribeiro-Dos-Santos AM, Guerreiro JF, Ribeiro-Dos-Santos Â, Dos Santos SEB, Dos Santos NPC. Genomic investigation on genes related to mercury metabolism in Amazonian indigenous populations. The Science of the Total Environment 2024; 923:171232. [PMID: 38402986 DOI: 10.1016/j.scitotenv.2024.171232] [Received: 01/16/2024] [Revised: 02/21/2024] [Accepted: 02/22/2024] [Indexed: 02/27/2024]
Abstract
Studies have identified elevated levels of mercury in Amazonian Indigenous individuals, highlighting them as one of the most exposed to risks. In the unique context of the Brazilian Indigenous population, it is crucial to identify genetic variants with clinical significance to better understand vulnerability to mercury and its adverse effects. Currently, there is a lack of research on the broader genomic profile of Indigenous people, particularly those from the Amazon region, concerning mercury contamination. Therefore, the aim of this study was to assess the genomic profile related to the processes of mercury absorption, distribution, metabolism, and excretion in 64 Indigenous individuals from the Brazilian Amazon. We aimed to determine whether these individuals exhibit a higher susceptibility to mercury exposure. Our study identified three high-impact variants (GSTA1 rs1051775, GSTM1 rs1183423000, and rs1241704212), with the latter two showing a higher frequency in the study population compared to global populations. Additionally, we discovered seven new variants with modifier impact and a genomic profile different from the worldwide populations. These genetic variants may predispose the study population to more harmful mercury exposure compared to global populations. As the first study to analyze broader genomics of mercury metabolism pathways in Brazilian Amazonian Amerindians, we emphasize that our research aims to contribute to public policies by utilizing genomic investigation as a method to identify populations with a heightened susceptibility to mercury exposure.
Affiliation(s)
- Victor Hugo Valente Carvalho
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Juliana Carla Gomes Rodrigues
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Lui Wallacy Morikawa Souza Vinagre
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Esdras Edgar Batista Pereira
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Natasha Monte
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Marianne Rodrigues Fernandes
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- André Maurício Ribeiro-Dos-Santos
  - Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, 66075-110, Belém, Pará, Brazil
- João Farias Guerreiro
  - Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, 66075-110, Belém, Pará, Brazil
- Ândrea Ribeiro-Dos-Santos
  - Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, 66075-110, Belém, Pará, Brazil
- Sidney Emanuel Batista Dos Santos
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
- Ney Pereira Carneiro Dos Santos
  - Núcleo de Pesquisas em Oncologia, Unidade de Alta Complexidade em Oncologia, Hospital Universitário João de Barros Barreto, 66073-005 Belém, Pará, Brazil
5. Gong B, Lababidi S, Kusko R, Bouri K, Prezek S, Thovarai V, Prasanna A, Maier EJ, Golkaram M, Sun X, Kyriakidis K, Kitajima JP, Ebrahim Sahraeian SM, Guo Y, Johanson E, Jones W, Tong W, Xu J. Towards accurate indel calling for oncopanel sequencing through an international pipeline competition at precisionFDA. Sci Rep 2024; 14:8165. [PMID: 38589653 PMCID: PMC11001604 DOI: 10.1038/s41598-024-58573-y] [Received: 09/25/2023] [Accepted: 04/01/2024] [Indexed: 04/10/2024] [Open access]
Abstract
Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to the top performers, which produced a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can further be improved by restricting the calls within high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the applications.
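For readers unfamiliar with the three ranking metrics named in this abstract, the following minimal sketch shows their standard definitions in terms of true-positive, false-positive, and false-negative call counts. The function name and the counts are hypothetical and are not taken from the challenge; how the challenge matched calls against its truth set is not described here.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard definitions from true-positive, false-positive, false-negative counts."""
    # Precision: fraction of called indels that are in the truth set.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of truth-set indels that were called.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical pipeline: 90 correct calls, 10 spurious calls, 30 missed indels.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a pipeline cannot score well by maximizing only one of precision or recall, which is one reason it is commonly used to rank variant callers.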
Affiliation(s)
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Samir Lababidi
  - Health Informatics Staff, Office of Data, Analytics, and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Rebecca Kusko
  - Cellino Biotech, 750 Main Street, Cambridge, MA, 02143, USA
- Khaled Bouri
  - Office of Regulatory Science and Innovation, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Yunfei Guo
  - Roche Sequencing Solutions, Santa Clara, CA, 95050, USA
- Elaine Johanson
  - Health Informatics Staff, Office of Data, Analytics, and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Wendell Jones
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, Office of Research, National Center for Toxicological Research, Office of the Chief Scientist, Office of the Commissioner, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
6. Dickson ZW, Golding GB. Evolution of Transcript Abundance is Influenced by Indels in Protein Low Complexity Regions. J Mol Evol 2024; 92:153-168. [PMID: 38485789 DOI: 10.1007/s00239-024-10158-z] [Received: 10/05/2023] [Accepted: 01/24/2024] [Indexed: 04/02/2024]
Abstract
Protein low complexity regions (LCRs) are compositionally biased amino acid sequences, many of which have significant evolutionary impacts on the proteins which contain them. They are mutationally unstable, experiencing higher rates of indels and substitutions than higher complexity regions. LCRs also impact the expression of their proteins, likely through multiple effects along the path from gene transcription, through translation, and eventual protein degradation. It has been observed that proteins which contain LCRs are associated with elevated transcript abundance (TAb), despite having lower protein abundance. We have gathered and integrated human data to investigate the co-evolution of TAb and LCRs through ancestral reconstructions and model inference using an approximate Bayesian calculation based method. We observe that on short evolutionary timescales TAb evolution is significantly impacted by changes in LCR length, with insertions driving TAb down. But in contrast, the observed data is best explained by indel rates in LCRs which are unaffected by shifts in TAb. Our work demonstrates a coupling between LCR and TAb evolution, and the utility of incorporating multiple responses into evolutionary analyses.
Affiliation(s)
- G Brian Golding
  - Department of Biology, McMaster University, Hamilton, ON, Canada
7. Gong B, Li D, Zhang Y, Kusko R, Lababidi S, Cao Z, Chen M, Chen N, Chen Q, Chen Q, Dai J, Gan Q, Gao Y, Guo M, Hariani G, He Y, Hou W, Jiang H, Kushwaha G, Li JL, Li J, Li Y, Liu LC, Liu R, Liu S, Meriaux E, Mo M, Moore M, Moss TJ, Niu Q, Patel A, Ren L, Saremi NF, Shang E, Shang J, Song P, Sun S, Urban BJ, Wang D, Wang S, Wen Z, Xiong X, Yang J, Yin L, Zhang C, Zhang R, Bhandari A, Cai W, Eterovic AK, Megherbi DB, Shi T, Suo C, Yu Y, Zheng Y, Novoradovskaya N, Sears RL, Shi L, Jones W, Tong W, Xu J. Extend the benchmarking indel set by manual review using the individual cell line sequencing data from the Sequencing Quality Control 2 (SEQC2) project. Sci Rep 2024; 14:7028. [PMID: 38528062 DOI: 10.1038/s41598-024-57439-7] [Received: 09/25/2023] [Accepted: 03/18/2024] [Indexed: 03/27/2024] [Open access]
Abstract
Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.
Affiliation(s)
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Dan Li
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Yifan Zhang
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Rebecca Kusko
  - Cellino Bio, 750 Main Street, Cambridge, MA, 02143, USA
- Samir Lababidi
  - Office of Data Analytics and Research, Office of Digital Transformation, Office of the Commissioner, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Zehui Cao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Mingyang Chen
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Ning Chen
  - iGeneTech Bioscience Co., Ltd., 8 Shengmingyuan Rd., Changping, Beijing, China
- Qiaochu Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Qingwang Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Jiacheng Dai
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Qiang Gan
  - Clinical Diagnostics Division, Thermo Fisher Scientific, 46500 Kato Rd., Fremont, CA, 94538, USA
- Yuechen Gao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Mingkun Guo
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Gunjan Hariani
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Yujie He
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- He Jiang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Garima Kushwaha
  - Guardant Health, Inc., 505 Penobscot Drive, Redwood City, CA, 94063, USA
- Jian-Liang Li
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
- Jianying Li
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, 27709, USA
- Yulan Li
  - College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Liang-Chun Liu
  - Clinical Diagnostics Division, Thermo Fisher Scientific, 46500 Kato Rd., Fremont, CA, 94538, USA
- Ruimei Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Shiming Liu
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Edwin Meriaux
  - CMINDS Research Center, University of Massachusetts, Lowell, MA, 01854, USA
- Mengqing Mo
  - Department of Epidemiology, School of Public Health, Fudan University, Shanghai, 200032, China
- Tyler J Moss
  - Eurofins Viracor, LLC, 18000 W 99th St., Lenexa, KS, 66219, USA
- Quanne Niu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ananddeep Patel
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Nedda F Saremi
  - Agilent Technologies, Inc., 11011 N Torrey Pines Rd., La Jolla, CA, 92037, USA
- Erfei Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Jun Shang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ping Song
  - Cancer Genomics Laboratory, Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX, 77030, USA
- Siqi Sun
  - ResearchDx, Irvine, CA, 92618, USA
- Brent J Urban
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Danke Wang
  - Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Shangzi Wang
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Zhining Wen
  - College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
- Xiangyi Xiong
  - College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Lihui Yin
  - PathGroup, Nashville, TN, 37217, USA
- Chao Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Ruolan Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Wanshi Cai
  - iGeneTech Bioscience Co., Ltd., 8 Shengmingyuan Rd., Changping, Beijing, China
- Agda Karina Eterovic
  - Eurofins Viracor Biopharma Services, Inc., 18000 W 99th St., Lenexa, KS, 66219, USA
- Dalila B Megherbi
  - CMINDS Research Center, University of Massachusetts, Lowell, MA, 01854, USA
- Tieliu Shi
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Chen Suo
  - Department of Epidemiology, School of Public Health, Fudan University, Shanghai, 200032, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Renee L Sears
  - Velsera, 6 Cityplace Dr Suite 550, Creve Coeur, MO, 63141, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Wendell Jones
  - Q squared Solutions Genomics, 2400 Ellis Road, Durham, NC, 27703, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
8. Lee CG, Kim HJ, Seol CA, Ki CS. Novel In-Frame Deletion CNOT3 Variant in a Family With Intellectual Developmental Disorder With Speech Delay and Dysmorphic Facies. Neurol Genet 2024; 10:e200116. [PMID: 38179413 PMCID: PMC10766080 DOI: 10.1212/nxg.0000000000200116] [Received: 04/03/2023] [Accepted: 10/17/2023] [Indexed: 01/06/2024]
Abstract
Objectives: Intellectual developmental disorder with speech delay, autism, and dysmorphic facies (IDDSADF) is caused by heterozygous CNOT3 (MIM# 604910) variants on chromosome 19q13. This study aimed to identify and describe the clinical features of a Korean family with maternally inherited speech delay and intellectual and developmental disability to elucidate the underlying genetic mechanism. Methods: We conducted whole-exome sequencing and confirmatory Sanger sequencing on the proband, the mother, and unaffected grandparents with wild-type genotypes. Results: The mother and 2 daughters presented with muscular hypotonia, global developmental delay, speech delay, intellectual disability, macrocephaly, facial dysmorphic features, and focal corpus callosum hypoplasia. Whole-exome sequencing identified a novel in-frame deletion, c.2017_2019del (p.Phe673del) in CNOT3, located in the C-terminal negative on the TATA-less-box domain. Discussion: This report presents a new possible mechanism underlying IDDSADF caused by CNOT3 variants: an in-frame deletion. The findings enhance our understanding of early-life neurodevelopment and the genotype-phenotype relationships of IDDSADF caused by CNOT3 variants. In addition, this report could assist in early diagnosis and facilitate genetic counseling.
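The coding-level and protein-level names of this variant can be related by simple codon arithmetic, sketched below. The helper `describe_cds_deletion` is hypothetical and handles only plain `c.N_Mdel` strings with coding positions numbered from 1; it ignores introns, UTR offsets, and HGVS normalization rules, so it is an illustration of the arithmetic rather than a variant-annotation tool.

```python
import re


def describe_cds_deletion(hgvs_c: str) -> dict:
    """Summarize a simple coding-sequence deletion written as 'c.START_ENDdel'."""
    m = re.fullmatch(r"c\.(\d+)_(\d+)del", hgvs_c)
    if m is None:
        raise ValueError(f"unsupported HGVS string: {hgvs_c!r}")
    start, end = int(m.group(1)), int(m.group(2))
    if end < start:
        raise ValueError("end position precedes start position")
    length = end - start + 1
    return {
        "length": length,
        # A deletion keeps the reading frame only if its length is a multiple of 3.
        "in_frame": length % 3 == 0,
        # Coding position n lies in codon (n - 1) // 3 + 1.
        "first_codon": (start - 1) // 3 + 1,
        "last_codon": (end - 1) // 3 + 1,
    }


info = describe_cds_deletion("c.2017_2019del")
print(info)
```

For c.2017_2019del, the three deleted bases are exactly codon 673 and the length is a multiple of three, which is consistent with the reported in-frame loss of a single residue, p.Phe673del.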
Affiliation(s)
- Cha Gon Lee
  - From the Department of Pediatrics (C.G.L.); Department of Rehabilitation Medicine (H.J.K.), Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul; and GC Genome (C.A.S., C.-S.K.), Yongin, Republic of Korea
- Hyun Jung Kim
  - From the Department of Pediatrics (C.G.L.); Department of Rehabilitation Medicine (H.J.K.), Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul; and GC Genome (C.A.S., C.-S.K.), Yongin, Republic of Korea
- Chang Ahn Seol
  - From the Department of Pediatrics (C.G.L.); Department of Rehabilitation Medicine (H.J.K.), Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul; and GC Genome (C.A.S., C.-S.K.), Yongin, Republic of Korea
- Chang-Seok Ki
  - From the Department of Pediatrics (C.G.L.); Department of Rehabilitation Medicine (H.J.K.), Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul; and GC Genome (C.A.S., C.-S.K.), Yongin, Republic of Korea
9. Ramos RM, Petroli RJ, D'Alessandre NDR, Guardia GDA, Afonso ACDF, Nishi MY, Domenice S, Galante PAF, Mendonca BB, Batista RL. Small Indels in the Androgen Receptor Gene: Phenotype Implications and Mechanisms of Mutagenesis. J Clin Endocrinol Metab 2023; 109:68-79. [PMID: 37572362 DOI: 10.1210/clinem/dgad470] [Received: 04/28/2023] [Revised: 08/02/2023] [Accepted: 08/07/2023] [Indexed: 08/14/2023]
Abstract
CONTEXT: Despite high abundance of small indels in human genomes, their precise roles and underlying mechanisms of mutagenesis in Mendelian disorders require further investigation. OBJECTIVE: To profile the distribution, functional implications, and mechanisms of small indels in the androgen receptor (AR) gene in individuals with androgen insensitivity syndrome (AIS). METHODS: We conducted a systematic review of previously reported indels within the coding region of the AR gene, including 3 novel indels. Distribution throughout the AR coding region was examined and compared with genomic population data. Additionally, we assessed their impact on the AIS phenotype and investigated potential mechanisms driving their occurrence. RESULTS: A total of 82 indels in AIS were included. Notably, all frameshift indels exhibited complete AIS. The distribution of indels across the AR gene showed a predominance in the N-terminal domain, most leading to frameshift mutations. Small deletions accounted for 59.7%. Most indels occurred in nonrepetitive sequences, with 15.8% situated within triplet regions. Gene burden analysis demonstrated significant enrichment of frameshift indels in AIS compared with controls (P < .00001), and deletions were overrepresented in AIS (P < .00001). CONCLUSION: Our findings underscore a robust genotype-phenotype relationship regarding small indels in the AR gene in AIS, with a vast majority presenting complete AIS. Triplet regions and homopolymeric runs emerged as prone loci for small indels within the AR. Most were frameshift indels, with polymerase slippage potentially explaining half of AR indel occurrences. Complex frameshift indels exhibited association with palindromic runs. These discoveries advance understanding of the genetic basis of AIS and shed light on potential mechanisms underlying pathogenic small indel events.
Affiliation(s)
- Raquel Martinez Ramos
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
- Reginaldo José Petroli
  - Faculdade de Medicina da Universidade Federal de Alagoas (UFAL), Programa de Pós-Graduação em Ciências Médicas-UFAL, Maceió, AL, 57072-900, Brazil
- Ana Caroline de Freitas Afonso
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
- Mirian Yumie Nishi
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
- Sorahia Domenice
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
- Berenice Bilharinho Mendonca
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
- Rafael Loch Batista
  - Developmental Endocrinology Unit, Hormone and Molecular Genetics Laboratory (LIM/42), Endocrinology Division, Internal Medicine Department, Medical School, University of São Paulo (USP), São Paulo, SP, 05403-000, Brazil
  - Instituto do Câncer do Estado de São Paulo da Faculdade de Medicina da Universidade de São Paulo (ICESP), São Paulo, SP, 01246-000, Brazil
10
Pan X, Huang P, Ali SS, Renslo B, Hutchinson TE, Erwin N, Greenberg Z, Ding Z, Li Y, Warnecke A, Fernandez NE, Staecker H, He M. CRISPR-Cas9 Engineered Extracellular Vesicles for the Treatment of Dominant Progressive Hearing Loss. bioRxiv 2023:2023.09.14.557853. [PMID: 38168224 PMCID: PMC10760051 DOI: 10.1101/2023.09.14.557853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Clinical translation of gene therapy has been challenging, due to limitations in current delivery vehicles such as traditional viral vectors. Herein, we report the use of extracellular vesicles (EVs) engineered with gRNA:Cas9 ribonucleoprotein (RNP) complexes for in vivo gene therapy. By leveraging a novel high-throughput microfluidic droplet-based electroporation system (μDES), we achieved a 10-fold enhancement of loading efficiency and a more than 1000-fold increase in processing throughput when loading RNP complexes into EVs (RNP-EVs), compared with conventional bulk electroporation. The flow-through droplets serve as enormous bioreactors that offer millisecond-pulsed, low-voltage electroporation in a continuous-flow and scalable manner, which minimizes the influence of Joule heating and surface alteration and so retains natural EV stability and integrity. In the Shaker-1 mouse model of dominant progressive hearing loss, we demonstrated the effective delivery of RNP-EVs into inner ear hair cells, with a clear reduction of Myo7a(sh1) mRNA expression compared to RNP-loaded lipid-like nanoparticles (RNP-LNPs), leading to significant hearing recovery measured by auditory brainstem responses (ABR).
Affiliation(s)
- Xiaoshu Pan
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Peixin Huang
  - Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
- Samantha S. Ali
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Bryan Renslo
  - Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
- Tarun E Hutchinson
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Nina Erwin
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Zachary Greenberg
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Zuo Ding
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Yanjun Li
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, Florida, 32610, United States
- Athanasia Warnecke
  - Department of Otolaryngology, Hannover Medical School, 30625 Hannover, Germany
- Natalia E. Fernandez
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
- Hinrich Staecker
  - Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
- Mei He
  - College of Pharmacy, Department of Pharmaceutics, University of Florida, Gainesville, Florida 32611, United States
11
Struck TH, Golombek A, Hoesel C, Dimitrov D, Elgetany AH. Mitochondrial Genome Evolution in Annelida-A Systematic Study on Conservative and Variable Gene Orders and the Factors Influencing its Evolution. Syst Biol 2023; 72:925-945. [PMID: 37083277 PMCID: PMC10405356 DOI: 10.1093/sysbio/syad023] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/17/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 04/22/2023] [Open Access]
Abstract
The mitochondrial genomes of Bilateria are relatively conserved in their protein-coding, rRNA, and tRNA gene complement, but the order of these genes can range from very conserved to very variable depending on the taxon. The supposedly conserved gene order of Annelida has been used to support the placement of some taxa within Annelida. Recently, authors have cast doubts on the conserved nature of the annelid gene order. Various factors may influence gene order variability including, among others, increased substitution rates, base composition differences, structure of noncoding regions, parasitism, living in extreme habitats, short generation times, and biomineralization. However, these analyses were neither done systematically nor based on well-established reference trees. Several focused on only a few of these factors and biological factors were usually explored ad-hoc without rigorous testing or correlation analyses. Herein, we investigated the variability and evolution of the annelid gene order and the factors that potentially influenced its evolution, using a comprehensive and systematic approach. The analyses were based on 170 genomes, including 33 previously unrepresented species. Our analyses included 706 different molecular properties, 20 life-history and ecological traits, and a reference tree corresponding to recent improvements concerning the annelid tree. The results showed that the gene order with and without tRNAs is generally conserved. However, individual taxa exhibit higher degrees of variability. None of the analyzed life-history and ecological traits explained the observed variability across mitochondrial gene orders. In contrast, the combination and interaction of the best-predicting factors for substitution rate and base composition explained up to 30% of the observed variability. 
Accordingly, correlation analyses of different molecular properties of the mitochondrial genomes showed an intricate network of direct and indirect correlations between the different molecular factors. Hence, gene order evolution seems to be driven by molecular evolutionary aspects rather than by life history or ecology. On the other hand, variability of the gene order does not predict whether a taxon is difficult to place in molecular phylogenetic reconstructions using sequence data. We also discuss the molecular properties of annelid mitochondrial genomes considering canonical views on gene evolution and potential reasons why the canonical views do not always fit the observed patterns without some adjustments.
Keywords: Annelida; compositional biases; ecology; gene order; life history; macroevolution; mitochondrial genomes; substitution rates.
Affiliation(s)
- Torsten H Struck
  - Natural History Museum, University of Oslo, P.O. Box 1172, Blindern, 0318 Oslo, Norway
  - Centre of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Bonn 53113, Germany
  - FB05 Biology/Chemistry; University of Osnabrück, Osnabrück 49069, Germany
- Anja Golombek
  - Centre of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Bonn 53113, Germany
  - FB05 Biology/Chemistry; University of Osnabrück, Osnabrück 49069, Germany
- Christoph Hoesel
  - FB05 Biology/Chemistry; University of Osnabrück, Osnabrück 49069, Germany
- Dimitar Dimitrov
  - Department of Natural History, University Museum of Bergen, University of Bergen, P.O. Box 7800, 5020 Bergen, Norway
- Asmaa Haris Elgetany
  - Natural History Museum, University of Oslo, P.O. Box 1172, Blindern, 0318 Oslo, Norway
  - Zoology Department, Faculty of Science, Damietta University, New Damietta, Central zone, 34517, Egypt
12
Martin SD, Bhuiyan I, Soleimani M, Wang G. Biomarkers for Immune Checkpoint Inhibitors in Renal Cell Carcinoma. J Clin Med 2023; 12:4987. [PMID: 37568390 PMCID: PMC10419620 DOI: 10.3390/jcm12154987] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/08/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] [Open Access]
Abstract
Immune checkpoint inhibitor (ICI) therapy has revolutionized renal cell carcinoma treatment. Patients previously thought to be palliative now occasionally achieve complete cures from ICI. However, since immunotherapies stimulate the immune system to induce anti-tumor immunity, they often lead to adverse autoimmunity. Furthermore, some patients receive no benefit from ICI, thereby unnecessarily risking adverse events. In many tumor types, PD-L1 expression levels, immune infiltration, and tumor mutation burden predict the response to ICI and help inform clinical decision making to better target ICI to patients most likely to experience benefits. Unfortunately, renal cell carcinoma is an outlier, as these biomarkers fail to discriminate between positive and negative responses to ICI therapy. Emerging biomarkers such as gene expression profiles and the loss of pro-angiogenic proteins VHL and PBRM-1 show promise for identifying renal cell carcinoma cases likely to respond to ICI. This review provides an overview of the mechanistic underpinnings of different biomarkers and describes the theoretical rationale for their use. We discuss the effectiveness of each biomarker in renal cell carcinoma and other cancer types, and we introduce novel biomarkers that have demonstrated some promise in clinical trials.
Affiliation(s)
- Spencer D. Martin
  - Department of Pathology and Laboratory Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC V5Z 1M9, Canada
- Ishmam Bhuiyan
  - Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Maryam Soleimani
  - Division of Medical Oncology, Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
  - British Columbia Cancer Vancouver Centre, Vancouver, BC V5Z 4E6, Canada
- Gang Wang
  - Department of Pathology and Laboratory Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC V5Z 1M9, Canada
  - British Columbia Cancer Vancouver Centre, Vancouver, BC V5Z 4E6, Canada
13
Wolf MM, Rathmell WK, de Cubas AA. Immunogenicity in renal cell carcinoma: shifting focus to alternative sources of tumour-specific antigens. Nat Rev Nephrol 2023; 19:440-450. [PMID: 36973495 PMCID: PMC10801831 DOI: 10.1038/s41581-023-00700-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2023] [Indexed: 03/29/2023]
Abstract
Renal cell carcinoma (RCC) comprises a group of malignancies arising from the kidney with unique tumour-specific antigen (TSA) signatures that can trigger cytotoxic immunity. Two classes of TSAs are now considered potential drivers of immunogenicity in RCC: small-scale insertions and deletions (INDELs) that result in coding frameshift mutations, and activation of human endogenous retroviruses. The presence of neoantigen-specific T cells is a hallmark of solid tumours with a high mutagenic burden, which typically have abundant TSAs owing to non-synonymous single nucleotide variations within the genome. However, RCC exhibits high cytotoxic T cell reactivity despite only having an intermediate non-synonymous single nucleotide variation mutational burden. Instead, RCC tumours have a high pan-cancer proportion of INDEL frameshift mutations, and coding frameshift INDELs are associated with high immunogenicity. Moreover, cytotoxic T cells in RCC subtypes seem to recognize tumour-specific endogenous retrovirus epitopes, whose presence is associated with clinical responses to immune checkpoint blockade therapy. Here, we review the distinct molecular landscapes in RCC that promote immunogenic responses, discuss clinical opportunities for discovery of biomarkers that can inform therapeutic immune checkpoint blockade strategies, and identify gaps in knowledge for future investigations.
Affiliation(s)
- Melissa M Wolf
  - Department of Medicine, Program in Cancer Biology, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- W Kimryn Rathmell
  - Department of Medicine, Program in Cancer Biology, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Aguirre A de Cubas
  - Department of Microbiology and Immunology, Medical University of South Carolina, Charleston, SC, USA
  - Hollings Cancer Center, Medical University of South Carolina, Charleston, SC, USA
14
Liu Z, Zhao Y, Zhang Y, Xu L, Zhou L, Yang W, Zhao H, Zhao J, Wang F. Development of Omni InDel and supporting database for maize. Front Plant Sci 2023; 14:1216505. [PMID: 37457340 PMCID: PMC10344896 DOI: 10.3389/fpls.2023.1216505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 06/12/2023] [Indexed: 07/18/2023]
Abstract
Insertions-deletions (InDels) are the second most abundant molecular marker in the genome and have been widely used in molecular biology research along with simple sequence repeats (SSR) and single-nucleotide polymorphisms (SNP). However, InDel variant mining and marker development usually focus on a single type of dimorphic InDel, which does not reflect the overall InDel diversity across the genome. Here, we developed Omni InDels for maize, soybean, and rice based on sequencing data and genome assembly that included InDel variants with base lengths from 1 bp to several Mb, and we conducted a detailed classification of Omni InDels. Moreover, we screened a set of InDels that are easily detected and typed (Perfect InDels) from the Omni InDels, verified the site authenticity using 3,587 germplasm resources from 11 groups, and analyzed the germplasm resources. Furthermore, we developed a Multi-InDel set based on the Omni InDels; each Multi-InDel contains multiple InDels, which greatly increases site polymorphism, and can be detected on multiple platforms such as fluorescent capillary electrophoresis and sequencing. Finally, we developed an online database website to make Omni InDels easy to use and share, and we developed a visual browsing function called "Variant viewer" for all Omni InDel sites to better display the variant distribution.
Affiliation(s)
- Zhihao Liu
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
  - College of Agriculture, Jilin Agricultural University, Changchun, China
- Yikun Zhao
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
- Yunlong Zhang
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
- Liwen Xu
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
- Ling Zhou
  - Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu, China
- Weiguang Yang
  - College of Agriculture, Jilin Agricultural University, Changchun, China
- Han Zhao
  - Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu, China
- Jiuran Zhao
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
- Fengge Wang
  - Key Laboratory of Crop DNA Fingerprinting Innovation and Utilization (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Beijing Academy of Agricultural and Forest Sciences (BAAFS), Beijing, China
15
Hassan MM, Hussain MA, Ali SS, Mahdi MA, Mohamed NS, AbdElbagi H, Mohamed O, Sherif AE, Osman W, Ibrahim SRM, Ghazawi KF, Miski SF, Mohamed GA, Ashour A. Detection of Nonsynonymous Single Variants in Human HLA-DRB1 Exon 2 Associated with Renal Transplant Rejection. Medicina (Kaunas) 2023; 59:1116. [PMID: 37374320 DOI: 10.3390/medicina59061116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/01/2023] [Accepted: 06/06/2023] [Indexed: 06/29/2023]
Abstract
Background: HLA-DRB1 is the most polymorphic gene in the human leukocyte antigen (HLA) class II region, and exon 2 is critical because it encodes antigen-binding sites. This study aimed to detect functional or marker genetic variants of HLA-DRB1 exon 2 in renal transplant recipients (acceptance and rejection) using Sanger sequencing. Methods: This hospital-based case-control study collected samples from two hospitals over seven months. The 60 participants were equally divided into three groups: rejection, acceptance, and control. The target regions were amplified by PCR and sequenced by Sanger sequencing. Several bioinformatics tools were used to assess the impact of non-synonymous single-nucleotide variants (nsSNVs) on protein function and structure. The sequence data supporting the findings of this study (accession numbers OQ747803-OQ747862) are available in the National Center for Biotechnology Information GenBank database. Results: Seven SNVs were identified, two of which were novel (chr6(GRCh38.p12): 32584356C>A (K41N) and 32584113C>A (R122R)). Three of the seven SNVs were non-synonymous and found in the rejection group (chr6(GRCh38.p12): 32584356C>A (K41N), 32584304A>G (Y59H), and 32584152T>A (R109S)). The nsSNVs had varying effects on protein function, structure, and physicochemical parameters and could play a role in renal transplant rejection. The chr6(GRCh38.p12):32584152T>A variant showed the greatest impact because of its conserved nature, main domain location, and pathogenic effects on protein structure, function, and stability. Finally, no significant markers were identified in the acceptance samples. Conclusion: Pathogenic variants can affect intramolecular/intermolecular interactions of amino acid residues, protein function/structure, and disease risk.
HLA typing based on functional SNVs could be a comprehensive, accurate, and low-cost method for covering all HLA genes while shedding light on previously unknown causes in many graft rejection cases.
Affiliation(s)
- Mohamed M Hassan
  - Department of Hematology, Faculty of Medical Laboratory Sciences, National University, Khartoum 11111, Sudan
- Mohamed A Hussain
  - Department of Pharmaceutical Microbiology, Faculty of Pharmacy, International University of Africa, Khartoum 11111, Sudan
- Sababil S Ali
  - Department of Parasitology and Medical Entomology, Faculty of Medical Laboratory Sciences, National University, Khartoum 11111, Sudan
- Mohammed A Mahdi
  - Department of Chemical Pathology, Faculty of Medical Laboratory Sciences, National University, Khartoum 11111, Sudan
- Nouh Saad Mohamed
  - Molecular Biology Unit, Sirius Training and Research Centre, Khartoum 11111, Sudan
- Hanadi AbdElbagi
  - Molecular Biology Unit, Sirius Training and Research Centre, Khartoum 11111, Sudan
- Osama Mohamed
  - Department of Molecular Biology, National University Biomedical Research Institute, National University, Khartoum 11111, Sudan
- Asmaa E Sherif
  - Department of Pharmacognosy, Faculty of Pharmacy, Prince Sattam Bin Abdulaziz University, Al-kharj 11942, Saudi Arabia
  - Department of Pharmacognosy, Faculty of Pharmacy, Mansoura University, Mansoura 35516, Egypt
- Wadah Osman
  - Department of Pharmacognosy, Faculty of Pharmacy, Prince Sattam Bin Abdulaziz University, Al-kharj 11942, Saudi Arabia
  - Department of Pharmacognosy, Faculty of Pharmacy, University of Khartoum, Al-Qasr Ave, Khartoum 11111, Sudan
- Sabrin R M Ibrahim
  - Preparatory Year Program, Department of Chemistry, Batterjee Medical College, Jeddah 21442, Saudi Arabia
  - Department of Pharmacognosy, Faculty of Pharmacy, Assiut University, Assiut 71526, Egypt
- Kholoud F Ghazawi
  - Clinical Pharmacy Department, College of Pharmacy, Umm Al-Qura University, Makkah 24382, Saudi Arabia
- Samar F Miski
  - Department of Pharmacology and Toxicology, College of Pharmacy, Taibah University, Al-Madinah Al-Munawwarah 30078, Saudi Arabia
- Gamal A Mohamed
  - Department of Natural Products and Alternative Medicine, Faculty of Pharmacy, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Ahmed Ashour
  - Department of Pharmacognosy, Faculty of Pharmacy, Prince Sattam Bin Abdulaziz University, Al-kharj 11942, Saudi Arabia
  - Department of Pharmacognosy, Faculty of Pharmacy, Mansoura University, Mansoura 35516, Egypt
16
Banerjee A, Bahar I. Structural Dynamics Predominantly Determine the Adaptability of Proteins to Amino Acid Deletions. Int J Mol Sci 2023; 24:8450. [PMID: 37176156 PMCID: PMC10179678 DOI: 10.3390/ijms24098450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/01/2023] [Accepted: 05/06/2023] [Indexed: 05/15/2023] [Open Access]
Abstract
The insertion or deletion (indel) of amino acids has a variety of effects on protein function, ranging from disease-forming changes to gaining new functions. Despite their importance, indels have not been systematically characterized towards protein engineering or modification goals. In the present work, we focus on deletions composed of multiple contiguous amino acids (mAA-dels) and their effects on the protein (mutant) folding ability. Our analysis reveals that the mutant retains the native fold when the mAA-del obeys well-defined structural dynamics properties: localization in intrinsically flexible regions, showing low resistance to mechanical stress, and separation from allosteric signaling paths. Motivated by the possibility of distinguishing the features that underlie the adaptability of proteins to mAA-dels, and by the rapid evaluation of these features using elastic network models, we developed a positive-unlabeled learning-based classifier that can be adopted for protein design purposes. Trained on a consolidated set of features, including those reflecting the intrinsic dynamics of the regions where the mAA-dels occur, the new classifier yields a high recall of 84.3% for identifying mAA-dels that are stably tolerated by the protein. The comparative examination of the relative contribution of different features to the prediction reveals the dominant role of structural dynamics in enabling the adaptation of the mutant to mAA-del without disrupting the native fold.
Affiliation(s)
- Anupam Banerjee
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Ivet Bahar
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY 11794, USA
17
Lemire BD, Uppuluri P. Coding Sequence Insertions in Fungal Genomes are Intrinsically Disordered and can Impart Functionally-Important Properties on the Host Protein. bioRxiv 2023:2023.04.06.535715. [PMID: 37066283 PMCID: PMC10104129 DOI: 10.1101/2023.04.06.535715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Insertion and deletion mutations (indels) are important mechanisms of generating protein diversity. Indels in coding sequences are under considerable selective pressure to maintain reading frames and to preserve protein function, but once generated, indels provide raw material for the acquisition of new protein properties and functions. We reported recently that coding sequence insertions in the Candida albicans NDU1 protein, a mitochondrial protein involved in the assembly of the NADH:ubiquinone oxidoreductase, are imperative for respiration, biofilm formation and pathogenesis. NDU1 inserts are specific to CTG-clade fungi, are absent from the human ortholog, and have been successfully harnessed as drug targets. Here, we present the first comprehensive report investigating indels and clade-defining insertions (CDIs) in fungal proteomes. We investigated 80 ascomycete proteomes encompassing CTG clade species, the Saccharomycetaceae family, the Aspergillaceae family and the Herpotrichiellaceae (black yeasts) family. We identified over 30,000 insertions, 4,000 CDIs and 2,500 clade-defining deletions (CDDs). Insert sizes range from 1 to over 1,000 residues in length, while the maximum deletion length is 19 residues. Inserts are strikingly over-represented in protein kinases, and excluded from structural domains and transmembrane segments. Inserts are predicted to be highly disordered. The amino acid compositions of the inserts are highly depleted in hydrophobic residues and enriched in polar residues. An indel in the Saccharomyces cerevisiae Sth1 protein, the catalytic subunit of the RSC (Remodel the Structure of Chromatin) complex, is predicted to be disordered until it forms a β-strand upon interaction. This interaction performs a vital role in RSC-mediated transcriptional regulation, thereby expanding protein function.
Affiliation(s)
- Bernard D. Lemire
  - Department of Biochemistry, University of Alberta, Edmonton, Canada (retired)
- Priya Uppuluri
  - Institute for Infection and Immunity, Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, USA
  - David Geffen School of Medicine at UCLA, Los Angeles, California, USA
18
Miton CM, Tokuriki N. Insertions and Deletions (Indels): A Missing Piece of the Protein Engineering Jigsaw. Biochemistry 2023; 62:148-157. [PMID: 35830609 DOI: 10.1021/acs.biochem.2c00188] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023]
Abstract
Over the years, protein engineers have studied nature and borrowed its tricks to accelerate protein evolution in the test tube. While there have been considerable advances, our ability to generate new proteins in the laboratory is seemingly limited. One explanation for these shortcomings may be that insertions and deletions (indels), which frequently arise in nature, are largely overlooked during protein engineering campaigns. The profound effect of indels on protein structures, by way of drastic backbone alterations, could be perceived as "saltation" events that bring about significant phenotypic changes in a single mutational step. Should we leverage these effects to accelerate protein engineering and gain access to unexplored regions of adaptive landscapes? In this Perspective, we describe the role played by indels in the functional diversification of proteins in nature and discuss their untapped potential for protein engineering, despite their often-destabilizing nature. We hope to spark a renewed interest in indels, emphasizing that their wider study and use may prove insightful and shape the future of protein engineering by unlocking unique functional changes that substitutions alone could never achieve.
Affiliation(s)
- Charlotte M Miton
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 BC, Canada
- Nobuhiko Tokuriki
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 BC, Canada
19
Kang Y, Bi Y, Tang Q, Xu H, Lan X, Zhang Q, Pan C. A 7-nt nucleotide sequence variant within the sheep KDM3B gene affects female reproduction traits. Anim Biotechnol 2022; 33:1661-1667. [PMID: 34081570 DOI: 10.1080/10495398.2021.1929270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022]
Abstract
The lysine demethylase 3B (KDM3B) gene encodes a histone demethylase that specifically demethylates histone H3 lysine 9. It was detected as a sheep reproductive candidate gene by genome-wide scans, and related studies also showed its significance in the female reproductive process. However, few studies have examined its polymorphisms. Herein, we hypothesized that polymorphisms of the KDM3B gene are associated with sheep reproduction traits. A 7-nt nucleotide sequence variant (rs1088697156) within the KDM3B gene was identified in a total of 888 individuals, including Australian White (AUW) sheep and Lanzhou Fat-tailed (LFT) sheep. The II (insertion/insertion) and ID (insertion/deletion) genotypes of the 7-nt variant were detected and were at Hardy-Weinberg equilibrium (HWE) in the breeds tested. Association analysis showed that the 7-nt variant was significantly associated with litter size, duration of pregnancy, live lamb number, live lamb rate, stillbirth number, and stillbirth rate, both on average and across parities (P < 0.05), in AUW sheep. Moreover, 'ID' was the dominant genotype, with excellent consistency in reproductive traits. Selecting individuals with the ID genotype could therefore help improve sheep reproduction traits. These findings suggest that the 7-nt variant within the KDM3B gene can be used as a candidate marker of reproduction traits for sheep breeding improvement by marker-assisted selection.
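The Hardy-Weinberg check mentioned in the abstract above can be sketched as a chi-square goodness-of-fit test on genotype counts. This is a generic textbook calculation, not the authors' analysis; the function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical, since the abstract reports no counts:

```python
def hwe_chi_square(n_ii: int, n_id: int, n_dd: int) -> float:
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Arguments are observed counts of the II, ID and DD genotypes.
    """
    n = n_ii + n_id + n_dd
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_ii + n_id) / (2 * n)  # frequency of the insertion allele
    q = 1.0 - p                      # frequency of the deletion allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ii, n_id, n_dd)
    # Skip classes with zero expectation to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    chi2 = hwe_chi_square(50, 40, 10)
    # With 1 degree of freedom, values below 3.84 are consistent
    # with HWE at the 0.05 significance level.
    print(round(chi2, 4), chi2 < 3.84)
```

For small samples or rare genotypes an exact test is usually preferred over the chi-square approximation.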
Affiliation(s)
- Yuxin Kang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Yi Bi
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Qi Tang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Hongwei Xu
- College of Life Science and Engineering, Northwest Minzu University, Lanzhou, China
- Gansu Tech Innovation Center of Animal Cell, Biomedical Research Center, Northwest Minzu University, Lanzhou, China
| | - Xianyong Lan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Qingfeng Zhang
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China
- Tianjin Aoqun Animal Husbandry Co., Ltd, Tianjin, China
| | - Chuanying Pan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| |
|
20
|
Jilani M, Turcan A, Haspel N, Jagodzinski F. Elucidating the Structural Impacts of Protein InDels. Biomolecules 2022; 12:1435. [PMID: 36291643 PMCID: PMC9599607 DOI: 10.3390/biom12101435] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/12/2022] [Revised: 09/23/2022] [Accepted: 09/27/2022] [Indexed: 09/17/2023]
Abstract
The effects of amino acid insertions and deletions (InDels) remain a rather under-explored area of structural biology. These variations oftentimes are the cause of numerous disease phenotypes. In spite of this, research to study InDels and their structural significance remains limited, primarily due to a lack of experimental information and computational methods. In this work, we fill this gap by modeling InDels computationally; we investigate the rigidity differences between the wildtype and a mutant variant with one or more InDels. Further, we compare how structural effects due to InDels differ from the effects of amino acid substitutions, which are another type of amino acid mutation. We finish by performing a correlation analysis between our rigidity-based metrics and wet lab data for their ability to infer the effects of InDels on protein fitness.
Affiliation(s)
- Muneeba Jilani
- Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
| | - Alistair Turcan
- Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA
| | - Nurit Haspel
- Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
| | - Filip Jagodzinski
- Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA
| |
|
21
|
Target-Based Small Molecule Drug Discovery for Colorectal Cancer: A Review of Molecular Pathways and In Silico Studies. Biomolecules 2022; 12:878. [PMID: 35883434 PMCID: PMC9312989 DOI: 10.3390/biom12070878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 06/05/2022] [Accepted: 06/17/2022] [Indexed: 01/27/2023]
Abstract
Colorectal cancer is one of the most prevalent cancer types. Although there have been breakthroughs in its treatment, a better understanding of the molecular mechanisms and genetic involvement in colorectal cancer will play a substantial role in producing novel, targeted treatments with better safety profiles. In this review, we present the main molecular pathways and driver genes responsible for initiating and propagating the signaling cascades that lead to carcinoma and the aggressive metastatic stages of colorectal cancer. Protein kinases involved in colorectal cancer, as in other cancers, have received much attention and committed effort because of their crucial role in supporting, inhibiting, or changing the disease course. We also discuss notable improvements in colorectal cancer treatments achieved with in silico studies and enhanced selectivity for specific macromolecular targets. In addition, the development of selective multi-target agents has been made easier by employing in silico methods in molecular de novo synthesis or target identification and drug repurposing.
|
22
|
Patterson Rosa L, Martin K, Vierra M, Foster G, Brooks SA, Lafayette C. Non-frameshift deletion on MITF is associated with a novel splashed white spotting pattern in horses (Equus caballus). Anim Genet 2022; 53:538-540. [PMID: 35672910 DOI: 10.1111/age.13225] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/23/2022] [Revised: 05/25/2022] [Accepted: 05/25/2022] [Indexed: 11/29/2022]
Affiliation(s)
| | | | | | | | - Samantha A Brooks
- Department of Animal Science, UF Genetics Institute, University of Florida, Gainesville, Florida, USA
| | | |
|
23
|
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| | - Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| |
|
24
|
Dash M, Somvanshi VS, Godwin J, Budhwar R, Sreevathsa R, Rao U. Exploring Genomic Variations in Nematode-Resistant Mutant Rice Lines. Front Plant Sci 2022; 13:823372. [PMID: 35401589 PMCID: PMC8988285 DOI: 10.3389/fpls.2022.823372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2021] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
Rice (Oryza sativa) production is seriously affected by the root-knot nematode Meloidogyne graminicola, which has emerged as a menace in upland and irrigated rice cultivation systems. Previously, activation tagging in rice was utilized to identify candidate gene(s) conferring resistance against M. graminicola. T-DNA insertional mutants were developed in a rice landrace (acc. JBT 36/14), and four mutant lines showed nematode resistance. Whole-genome sequencing of JBT 36/14 was done along with the four nematode-resistant mutant lines to identify the structural genetic variations that might be contributing to M. graminicola resistance. Sequencing on the Illumina NovaSeq 6000 platform identified 482,234 genetic variations in JBT 36/14, including 448,989 SNPs and 33,245 InDels, compared to the reference indica genome. In addition, 293,238-553,648 unique SNPs and 32,395-65,572 unique InDels were found in the four mutant lines compared to their JBT 36/14 background, of which 93,224 SNPs and 8,170 InDels were common to all the mutant lines. Functional annotation of genes containing these structural variations showed that the majority of them were involved in metabolism and growth. Trait analysis revealed that most of these genes were involved in morphological traits, physiological traits and stress resistance. Additionally, several families of transcription factors, such as FAR1, bHLH, and NAC, and putative susceptibility (S) genes, showed the presence of SNPs and InDels. Our results indicate that, subject to further genetic validation, these structural genetic variations may be involved in conferring nematode resistance to the rice mutant lines.
Affiliation(s)
- Manoranjan Dash
- Division of Nematology, ICAR-Indian Agricultural Research Institute, New Delhi, India
| | | | | | - Roli Budhwar
- Bionivid Technology Private Limited, Bangalore, India
| | | | - Uma Rao
- Division of Nematology, ICAR-Indian Agricultural Research Institute, New Delhi, India
| |
|
25
|
Pini V, Mariot V, Dumonceaux J, Counsell J, O'Neill HC, Farmer S, Conti F, Muntoni F. Transiently expressed CRISPR/Cas9 induces wild-type dystrophin in vitro in DMD patient myoblasts carrying duplications. Sci Rep 2022; 12:3756. [PMID: 35260651 PMCID: PMC8904532 DOI: 10.1038/s41598-022-07671-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/27/2021] [Accepted: 02/09/2022] [Indexed: 01/14/2023]
Abstract
Among the mutations arising in the DMD gene and causing Duchenne Muscular Dystrophy (DMD), 10–15% are multi-exon duplications. There are no current therapeutic approaches with the ability to excise large multi-exon duplications, leaving this patient cohort without mutation-specific treatment. Using CRISPR/Cas9 could provide a valid alternative to achieve targeted excision of genomic duplications of any size. Here we show that the expression of a single CRISPR/Cas9 nuclease targeting a genomic region within a DMD duplication can restore the production of wild-type dystrophin in vitro. We assessed the extent of dystrophin repair following both constitutive and transient nuclease expression by either transducing DMD patient-derived myoblasts with integrating lentiviral vectors or electroporating them with CRISPR/Cas9 expressing plasmids. Comparing genomic, transcript and protein data, we observed that both continuous and transient nuclease expression resulted in approximately 50% dystrophin protein restoration in treated myoblasts. Our data demonstrate that a high transient expression profile of Cas9 circumvents its requirement of continuous expression within the cell for targeting DMD duplications. This proof-of-concept study therefore helps progress towards a clinically relevant gene editing strategy for in vivo dystrophin restoration, by highlighting important considerations for optimizing future therapeutic approaches.
Affiliation(s)
- Veronica Pini
- Dubowitz Neuromuscular Centre, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK.
| | - Virginie Mariot
- Translational Myology Laboratory, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
| | - Julie Dumonceaux
- Translational Myology Laboratory, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
| | - John Counsell
- Dubowitz Neuromuscular Centre, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
| | - Helen C O'Neill
- Genome Editing and Reproductive Genetics Group, Institute for Women's Health, University College London, 86-96 Chenies Mews, London, WC1E 6HX, UK
| | - Sarah Farmer
- Dubowitz Neuromuscular Centre, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
| | - Francesco Conti
- Dubowitz Neuromuscular Centre, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
| | - Francesco Muntoni
- Dubowitz Neuromuscular Centre, Molecular Neurosciences Section, Developmental Neuroscience Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, WC1N 1EH, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, Great Ormond Street Institute of Child Health, University College London, & Great Ormond Street Hospital Trust, London, UK
| |
|
26
|
Choi H, Choi Y, Choi J, Lee AC, Yeom H, Hyun J, Ryu T, Kwon S. Purification of multiplex oligonucleotide libraries by synthesis and selection. Nat Biotechnol 2022; 40:47-53. [PMID: 34326548 DOI: 10.1038/s41587-021-00988-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/27/2021] [Accepted: 06/16/2021] [Indexed: 02/07/2023]
Abstract
Complex oligonucleotide (oligo) libraries are essential materials for diverse applications in synthetic biology, pharmaceutical production, nanotechnology and DNA-based data storage. However, the error rates in synthesizing complex oligo libraries can be substantial, increasing the cost and labor of these applications. As most synthesis errors arise from faulty insertions and deletions, we developed a length-based method with single-base resolution for purification of complex libraries containing oligos of identical or different lengths. Our method, purification of multiplex oligonucleotide libraries by synthesis and selection, can be performed either step-by-step manually or using a next-generation sequencer. When applied to a digital data-encoded library containing oligos of identical length, the method increased the purity of full-length oligos from 83% to 97%. We also show that libraries encoding the complementarity-determining region H3 with three different lengths (with an empirically achieved diversity >10^6) can be simultaneously purified in one pot, increasing the in-frame oligo fraction from 49.6% to 83.5%.
Affiliation(s)
- Hansol Choi
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Yeongjae Choi
- Nano Systems Institute, Seoul National University, Seoul, Republic of Korea
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
| | - Jaewon Choi
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- Integrated Major in Innovative Medical Science, Seoul National University, Seoul, Republic of Korea
| | - Amos Chungwon Lee
- Bio-MAX Institute, Seoul National University, Seoul, Republic of Korea
| | - Huiran Yeom
- Bio-MAX Institute, Seoul National University, Seoul, Republic of Korea
| | - Jinwoo Hyun
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
| | - Taehoon Ryu
- ATG Lifetech Inc., Seoul, Republic of Korea.
| | - Sunghoon Kwon
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
- Nano Systems Institute, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- Bio-MAX Institute, Seoul National University, Seoul, Republic of Korea
| |
|
27
|
Rao RSP, Ahsan N, Xu C, Su L, Verburgt J, Fornelli L, Kihara D, Xu D. Evolutionary Dynamics of Indels in SARS-CoV-2 Spike Glycoprotein. Evol Bioinform Online 2021; 17:11769343211064616. [PMID: 34898980 PMCID: PMC8655444 DOI: 10.1177/11769343211064616] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/07/2021] [Accepted: 11/12/2021] [Indexed: 01/28/2023]
Abstract
SARS-CoV-2, responsible for the current COVID-19 pandemic that claimed over 5.0 million lives, belongs to a class of enveloped viruses that undergo quick evolutionary adjustments under selection pressure. Numerous variants have emerged in SARS-CoV-2, posing a serious challenge to the global vaccination effort and COVID-19 management. The evolutionary dynamics of this virus are only beginning to be explored. In this work, we have analysed 1.79 million spike glycoprotein sequences of SARS-CoV-2 and found that the virus is fine-tuning the spike with numerous amino acid insertions and deletions (indels). Indels seem to have a selective advantage as the proportions of sequences with indels steadily increased over time, currently at over 89%, with similar trends across countries/variants. There were as many as 420 unique indel positions and 447 unique combinations of indels. Despite their high frequency, indels resulted in only minimal alteration of N-glycosylation sites, including both gain and loss. As indels and point mutations are positively correlated and sequences with indels have significantly more point mutations, they have implications in the evolutionary dynamics of the SARS-CoV-2 spike glycoprotein.
Affiliation(s)
- R Shyama Prasad Rao
- Biostatistics and Bioinformatics Division, Yenepoya Research Center, Yenepoya University, Mangaluru, Karnataka, India
| | - Nagib Ahsan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Mass Spectrometry, Proteomics and Metabolomics Core Facility, Stephenson Life Sciences Research Center, University of Oklahoma, Norman, OK, USA
| | - Chunhui Xu
- Department of Electrical Engineering and Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| | - Lingtao Su
- Department of Electrical Engineering and Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Luca Fornelli
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Department of Biology, University of Oklahoma, Norman, OK, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Dong Xu
- Department of Electrical Engineering and Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| |
|
28
|
Chen J, Guo JT. Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Sci Rep 2021; 11:21178. [PMID: 34707120 PMCID: PMC8551294 DOI: 10.1038/s41598-021-00583-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022]
Abstract
Insertions and deletions (Indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to the germline indels in healthy genomes, somatic indels contain more coding indels with higher than expected frame-shift (FS) indels in cancer genomes. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to locate in sequences with important functions, which can affect the core secondary structures of proteins and have a bigger overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
Affiliation(s)
- Jing Chen
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
| | - Jun-Tao Guo
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA.
| |
|
29
|
Martin NS, Ahnert SE. Insertions and deletions in the RNA sequence-structure map. J R Soc Interface 2021; 18:20210380. [PMID: 34610259 PMCID: PMC8492174 DOI: 10.1098/rsif.2021.0380] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/08/2021] [Accepted: 09/13/2021] [Indexed: 12/21/2022]
Abstract
Genotype-phenotype maps link genetic changes to their fitness effect and are thus an essential component of evolutionary models. The map between RNA sequences and their secondary structures is a key example and has applications in functional RNA evolution. For this map, the structural effect of substitutions is well understood, but models usually assume a constant sequence length and do not consider insertions or deletions. Here, we expand the sequence-structure map to include single nucleotide insertions and deletions by using the RNAshapes concept. To quantify the structural effect of insertions and deletions, we generalize existing definitions for robustness and non-neutral mutation probabilities. We find striking similarities between substitutions, deletions and insertions: robustness to substitutions is correlated with robustness to insertions and, for most structures, to deletions. In addition, frequent structural changes after substitutions also tend to be common for insertions and deletions. This is consistent with the connection between energetically suboptimal folds and possible structural transitions. The similarities observed hold both for genotypic and phenotypic robustness and mutation probabilities, i.e. for individual sequences and for averages over sequences with the same structure. Our results could have implications for the rate of neutral and non-neutral evolution.
Affiliation(s)
- Nora S. Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
- Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge CB2 1LR, UK
| | - Sebastian E. Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK
- The Alan Turing Institute, British Library, Euston Road, London NW1 2DB, UK
| |
|
30
|
Gershony LC, Belanger JM, Hytönen MK, Lohi H, Oberbauer AM. Whole Genome Sequencing Reveals Multiple Linked Genetic Variants on Canine Chromosome 12 Associated with Risk for Symmetrical Lupoid Onychodystrophy (SLO) in the Bearded Collie. Genes (Basel) 2021; 12:1265. [PMID: 34440439 PMCID: PMC8394396 DOI: 10.3390/genes12081265] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2021] [Revised: 08/17/2021] [Accepted: 08/17/2021] [Indexed: 01/16/2023]
Abstract
In dogs, symmetrical lupoid onychodystrophy (SLO) results in nail loss and an abnormal regrowth of the claws. In Bearded Collies, an autoimmune nature has been suggested because certain dog leukocyte antigen (DLA) class II haplotypes are associated with the condition. A genome-wide association study of the Bearded Collie revealed two regions of association that conferred risk for disease: one on canine chromosome (CFA) 12 that encompasses the DLA genes, and one on CFA17. Case-control association was employed on whole genome sequencing data to uncover putative causative variants in SLO within the CFA12 and CFA17 associated regions. Genotype imputation was then employed to refine variants of interest. Although no SLO-associated protein-coding variants were identified on CFA17, multiple variants, many with predicted damaging effects, were identified within potential candidate genes on CFA12. Furthermore, many potentially damaging alleles were fully correlated with the presence of DLA class II risk haplotypes for SLO, suggesting that the variants may reflect DLA class II haplotype association with disease or vice versa. Strong linkage disequilibrium in the region precluded the ability to isolate and assess the individual or combined effect of variants on disease development. Nonetheless, all were predictive of risk for SLO and, with judicious assessment, their application in selective breeding may prove useful to reduce the incidence of SLO in the breed.
Affiliation(s)
- Liza C. Gershony
- Department of Animal Science, University of California, Davis, CA 95616, USA
| | - Janelle M. Belanger
- Department of Animal Science, University of California, Davis, CA 95616, USA
| | - Marjo K. Hytönen
- Department of Medical and Clinical Genetics, University of Helsinki, 00014 Helsinki, Finland
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Folkhälsan Research Center, 00290 Helsinki, Finland
| | - Hannes Lohi
- Department of Medical and Clinical Genetics, University of Helsinki, 00014 Helsinki, Finland
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Folkhälsan Research Center, 00290 Helsinki, Finland
| | - Anita M. Oberbauer
- Department of Animal Science, University of California, Davis, CA 95616, USA
| |
|
31
|
Gupta A, Alland D. Reversible gene silencing through frameshift indels and frameshift scars provide adaptive plasticity for Mycobacterium tuberculosis. Nat Commun 2021; 12:4702. [PMID: 34349104 PMCID: PMC8339072 DOI: 10.1038/s41467-021-25055-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/24/2019] [Accepted: 07/16/2021] [Indexed: 12/15/2022]
Abstract
Mycobacterium tuberculosis can adapt to changing environments by non-heritable mechanisms. Frame-shifting insertions and deletions (indels) may also participate in adaptation through gene disruption, which could be reversed by secondary introduction of a frame-restoring indel. We present ScarTrek, a program that scans genomic data for indels, including those that together disrupt and restore a gene's reading frame, producing "frame-shift scars" suggestive of reversible gene inactivation. We use ScarTrek to analyze 5977 clinical M. tuberculosis isolates. We show that indel frequency inversely correlates with genomic linguistic complexity and varies with gene-position and gene-essentiality. Using ScarTrek, we detect 74 unique frame-shift scars in 48 genes, with a 3.74% population-level incidence of unique scar events. We find multiple scars in the ESX-1 gene cluster. Six scars show evidence of convergent evolution while the rest shared a common ancestor. Our results suggest that sequential indels are a mechanism for reversible gene silencing and adaptation in M. tuberculosis.
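The idea of a frame-shift scar, where a second indel restores the reading frame that an earlier indel disrupted, can be sketched in a few lines. This is only an illustration of the concept as described in the abstract, not ScarTrek's actual algorithm; the function name `find_frameshift_scars` and the `(position, net_length)` input format are assumptions made for this example.

```python
def find_frameshift_scars(indels):
    """Locate frame-disrupting and frame-restoring indel runs within one gene.

    indels: iterable of (position, net_length) tuples, where net_length is
    positive for an insertion and negative for a deletion, in nucleotides.
    This input format is hypothetical.

    Returns (scars, residual_offset): scars is a list of (start, end)
    positions where the reading frame is lost at `start` and restored at
    `end`; residual_offset is the frame offset (0, 1 or 2) left at the end
    of the gene, so a non-zero value indicates an unrepaired frameshift.
    """
    scars = []
    offset = 0        # current reading-frame offset relative to the reference
    scar_start = None

    for position, net_length in sorted(indels):
        # Python's % always returns a value in 0..2 here, even for deletions.
        new_offset = (offset + net_length) % 3
        if offset == 0 and new_offset != 0:
            scar_start = position                   # frame is disrupted here
        elif offset != 0 and new_offset == 0:
            scars.append((scar_start, position))    # frame is restored here
            scar_start = None
        offset = new_offset

    return scars, offset
```

For instance, a 1-nt insertion at position 100 followed by a 1-nt deletion at position 130 yields one scar spanning 100 to 130, whereas a single 3-nt deletion keeps the frame intact and produces no scar.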
Affiliation(s)
- Aditi Gupta
- Center for Emerging Pathogens, New Jersey Medical School, Rutgers University, Newark, NJ, USA.
| | - David Alland
- Center for Emerging Pathogens, New Jersey Medical School, Rutgers University, Newark, NJ, USA.
| |
|
32
|
Seaby EG, Ennis S. Challenges in the diagnosis and discovery of rare genetic disorders using contemporary sequencing technologies. Brief Funct Genomics 2021; 19:243-258. [PMID: 32393978 DOI: 10.1093/bfgp/elaa009] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 12/13/2022]
Abstract
Next generation sequencing (NGS) has revolutionised rare disease diagnostics. Concomitant with advancing technologies has been a rise in the number of new gene disorders discovered and diagnoses made for patients and their families. However, despite the trend towards whole exome and whole genome sequencing, diagnostic rates remain suboptimal. On average, only ~30% of patients receive a molecular diagnosis. National sequencing projects launched in the last 5 years are integrating clinical diagnostic testing with research avenues to widen the spectrum of known genetic disorders. Consequently, efforts to diagnose genetic disorders in a clinical setting are now often shared with efforts to prioritise candidate variants for the detection of new disease genes. Herein we discuss some of the biggest obstacles precluding molecular diagnosis and discovery of new gene disorders. We consider bioinformatic and analytical challenges faced when interpreting next generation sequencing data and showcase some of the newest tools available to mitigate these issues. We consider how incomplete penetrance, non-coding variation and structural variants are likely to impact diagnostic rates, and we further discuss methods for uplifting novel gene discovery by adopting a gene-to-patient-based approach.
|
33
|
Shin JE, Riesselman AJ, Kollasch AW, McMahon C, Simon E, Sander C, Manglik A, Kruse AC, Marks DS. Protein design and variant prediction using autoregressive generative models. Nat Commun 2021; 12:2403. [PMID: 33893299 PMCID: PMC8065141 DOI: 10.1038/s41467-021-22732-w] [Citation(s) in RCA: 122] [Impact Index Per Article: 40.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2021] [Accepted: 03/26/2021] [Indexed: 12/11/2022] Open
Abstract
The ability to design functional sequences and predict effects of variation is central to protein engineering and biotherapeutics. State-of-art computational methods rely on models that leverage evolutionary information but are inadequate for important applications where multiple sequence alignments are not robust. Such applications include the prediction of variant effects of indels, disordered proteins, and the design of proteins such as antibodies due to the highly variable complementarity determining regions. We introduce a deep generative model adapted from natural language processing for prediction and design of diverse functional sequences without the need for alignments. The model performs state-of-art prediction of missense and indel effects and we successfully design and test a diverse 10^5-nanobody library that shows better expression than a 1000-fold larger synthetic library. Our results demonstrate the power of the alignment-free autoregressive model in generalizing to regions of sequence space traditionally considered beyond the reach of prediction and design.
Affiliation(s)
- Jung-Eun Shin
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Adam J Riesselman
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- insitro, South San Francisco, CA, USA
| | - Aaron W Kollasch
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Conor McMahon
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA
- Vertex Pharmaceuticals, Boston, MA, USA
| | - Elana Simon
- Harvard College, Cambridge, MA, USA
- Reverie Labs, Cambridge, MA, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Aashish Manglik
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, USA
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA, USA
| | - Andrew C Kruse
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA.
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
34
|
An extended catalogue of tandem alternative splice sites in human tissue transcriptomes. PLoS Comput Biol 2021; 17:e1008329. [PMID: 33826604 PMCID: PMC8055015 DOI: 10.1371/journal.pcbi.1008329] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2020] [Revised: 04/19/2021] [Accepted: 03/22/2021] [Indexed: 12/18/2022] Open
Abstract
Tandem alternative splice sites (TASS) is a special class of alternative splicing events that are characterized by a close tandem arrangement of splice sites. Most TASS lack functional characterization and are believed to arise from splicing noise. Based on the RNA-seq data from the Genotype Tissue Expression project, we present an extended catalogue of TASS in healthy human tissues and analyze their tissue-specific expression. The expression of TASS is usually dominated by one major splice site (maSS), while the expression of minor splice sites (miSS) is at least an order of magnitude lower. Among 46k miSS with sufficient read support, 9k (20%) are significantly expressed above the expected noise level, and among them 2.5k are expressed tissue-specifically. We found significant correlations between tissue-specific expression of RNA-binding proteins (RBP), tissue-specific expression of miSS, and miSS response to RBP inactivation by shRNA. In combination with RBP profiling by eCLIP, this allowed prediction of novel cases of tissue-specific splicing regulation including a miSS in QKI mRNA that is likely regulated by PTBP1. The analysis of human primary cell transcriptomes suggested that both tissue-specific and cell-type-specific factors contribute to the regulation of miSS expression. More than 20% of tissue-specific miSS affect structured protein regions and may adjust protein-protein interactions or modify the stability of the protein core. The significantly expressed miSS evolve under the same selection pressure as maSS, while other miSS lack signatures of evolutionary selection and conservation. Using mixture models, we estimated that not more than 15% of maSS and not more than 54% of tissue-specific miSS are noisy, while the proportion of noisy splice sites among non-significantly expressed miSS is above 63%. Pre-mRNA splicing is an important step in the processing of the genomic information during gene expression. 
During splicing, introns are excised from a gene transcript, and the remaining exons are ligated. Our work concerns one particular subtype of splicing, which involves the so-called tandem alternative splice sites, a group of closely located exon borders that are used alternatively. We analyzed RNA-seq measurements of gene expression provided by the Genotype-Tissue Expression (GTEx) project, the largest collection to date of such measurements in healthy human tissues, and constructed a detailed catalogue of tandem alternative splice sites. Within this catalogue, we characterized patterns of tissue-specific expression, regulation, impact on protein structure, and evolutionary selection acting on tandem alternative splice sites. In a number of genes, we predicted regulatory mechanisms that could be responsible for choosing one of many tandem alternative splice sites. The results of this study provide an invaluable resource for molecular biologists studying alternative splicing.
|
35
|
Lin M, Malik FK, Guo JT. A comparative study of protein-ssDNA interactions. NAR Genom Bioinform 2021; 3:lqab006. [PMID: 33655206 PMCID: PMC7902235 DOI: 10.1093/nargab/lqab006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2020] [Revised: 11/24/2020] [Accepted: 01/26/2021] [Indexed: 12/18/2022] Open
Abstract
Single-stranded DNA-binding proteins (SSBs) play crucial roles in DNA replication, recombination and repair, and serve as key players in the maintenance of genomic stability. While a number of SSBs bind single-stranded DNA (ssDNA) non-specifically, the others recognize and bind specific ssDNA sequences. The mechanisms underlying this binding discrepancy, however, are largely unknown. Here, we present a comparative study of protein-ssDNA interactions by annotating specific and non-specific SSBs and comparing structural features such as DNA-binding propensities and secondary structure types of residues in SSB-ssDNA interactions, protein-ssDNA hydrogen bonding and π-π interactions between specific and non-specific SSBs. Our results suggest that protein side chain-DNA base hydrogen bonds are the major contributors to protein-ssDNA binding specificity, while π-π interactions may mainly contribute to binding affinity. We also found the enrichment of aspartate in the specific SSBs, a key feature in specific protein-double-stranded DNA (dsDNA) interactions as reported in our previous study. In addition, no significant differences between specific and non-specific groups with respect of conformational changes upon ssDNA binding were found, suggesting that the flexibility of SSBs plays a lesser role than that of dsDNA-binding proteins in conferring binding specificity.
Affiliation(s)
- Maoxuan Lin
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Fareeha K Malik
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Research Center of Modeling and Simulation, National University of Science and Technology, Islamabad, 44000, Pakistan
| | - Jun-tao Guo
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
36
|
Chen J, Guo JT. Comparative assessments of indel annotations in healthy and cancer genomes with next-generation sequencing data. BMC Med Genomics 2020; 13:170. [PMID: 33167946 PMCID: PMC7653722 DOI: 10.1186/s12920-020-00818-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2020] [Accepted: 10/29/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: Insertion and deletion (indel) is one of the major variation types in human genomes. Accurate annotation of indels is of paramount importance in genetic variation analysis and investigation of their roles in human diseases. Previous studies revealed a high number of false positives from existing indel calling methods, which limits downstream analyses of the effects of indels on both healthy and disease genomes. In this study, we evaluated seven commonly used general indel calling programs for germline indels and four somatic indel calling programs through comparative analysis to investigate their common features and differences and to explore ways to improve indel annotation accuracy. METHODS: In our comparative analysis, we adopted a more stringent evaluation approach by considering both the indel positions and the indel types (insertion or deletion sequences) between the samples and the reference set. In addition, we applied an efficient way to use a benchmark for improved performance comparisons for the general indel calling programs. RESULTS: We found that germline indels in healthy genomes derived by combining several indel calling tools could help remove a large number of false positive indels from individual programs without compromising the number of true positives. The performance comparisons of somatic indel calling programs are more complicated due to the lack of a reliable and comprehensive benchmark. Nevertheless, our results revealed large variations among the programs and among cancer types. CONCLUSIONS: While more accurate indel calling programs are needed, we found that the performance for germline indel annotations can be improved by combining the results from several programs. In addition, well-designed benchmarks for both germline and somatic indels are key in program development and evaluations.
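The idea of combining several callers, with matching on both position and indel sequence, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `consensus_indels`, the `min_callers` threshold, the tuple layout and the toy call sets are all assumptions made for illustration.

```python
from collections import Counter


def consensus_indels(call_sets, min_callers=2):
    """Keep indels reported by at least `min_callers` of the call sets.

    Each call is a (chrom, pos, ref, alt) tuple, so two calls match only
    when both the position and the inserted/deleted sequence agree.
    This matching rule and the threshold are illustrative assumptions.
    """
    counts = Counter()
    for calls in call_sets:
        counts.update(set(calls))  # count each caller at most once per indel
    return sorted(call for call, n in counts.items() if n >= min_callers)


# Hypothetical toy call sets from three unnamed callers
caller_a = [("chr1", 100, "A", "AT"), ("chr1", 250, "GCC", "G")]
caller_b = [("chr1", 100, "A", "AT"), ("chr2", 40, "T", "TGG")]
caller_c = [("chr1", 100, "A", "AG"), ("chr1", 250, "GCC", "G")]

consensus = consensus_indels([caller_a, caller_b, caller_c])
```

In this toy input the call at chr1:100 from `caller_c` shares a position with the others but inserts a different base, so it does not count toward the consensus for the `A>AT` insertion.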
Affiliation(s)
- Jing Chen
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
| | - Jun-Tao Guo
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA.
|
37
|
Marian AJ. Clinical Interpretation and Management of Genetic Variants. JACC Basic Transl Sci 2020; 5:1029-1042. [PMID: 33145465 PMCID: PMC7591931 DOI: 10.1016/j.jacbts.2020.05.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Revised: 05/22/2020] [Accepted: 05/27/2020] [Indexed: 01/31/2023]
Abstract
The human genome contains approximately 4 million variants, whose population frequencies vary according to the ethnic backgrounds. Genetic diversity of humans in part determines interindividual variability in susceptibility to diseases, response to therapy, and the clinical outcomes. Genetic variants exert a gradient of biological and clinical effect sizes. In general, variants with the largest effect sizes are responsible for the single-gene disorders, whereas those with moderate and modest effect sizes are responsible for oligogenic and polygenic diseases, respectively. A phenotype is the consequence of nonlinear stochastic interactions among multiple genetic and nongenetic determinants. Discerning pathogenicity of the genetic variants, identified through genetic testing, in the clinical phenotype is challenging and requires complementary expertise in human molecular genetics and clinical medicine.
Genetic variants are major determinants of susceptibility to disease, response to therapy, and clinical outcomes. Advances in the short-read sequencing technologies, despite some shortcomings, have enabled identification of the vast majority of the genetic variants in each genome. The major challenge is in identifying the pathogenic variants in cardiovascular diseases. The yield of the genetic testing has been limited because of technological shortcomings and our incomplete understanding of the genetic basis of cardiovascular disorders. To advance the field, a shift to long-read sequencing platforms is necessary. In addition, to discern the pathogenic variants, genetic diseases should be considered as a continuum and the genetic variants as probabilistic factors with a gradient of effect sizes. Moreover, disease-specific physician-scientists with expertise in the clinical medicine and molecular genetics are best equipped to discern functional and clinical significance of the genetic variants. The changes would be expected to enhance clinical utilities of the genetic discoveries.
Affiliation(s)
- Ali J Marian
- Center for Cardiovascular Genetics, Institute of Molecular Medicine and Department of Medicine, University of Texas Health Sciences Center at Houston, Houston, Texas
|
38
|
Akhatayeva Z, Mao C, Jiang F, Pan C, Lin C, Hao K, Lan T, Chen H, Zhang Q, Lan X. Indel variants within the PRL and GHR genes associated with sheep litter size. Reprod Domest Anim 2020; 55:1470-1478. [PMID: 32762057 DOI: 10.1111/rda.13796] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2020] [Accepted: 08/01/2020] [Indexed: 12/27/2022]
Abstract
Growth hormone and prolactin belong to the class of peptide hormones that have a wide range of regulatory functions. In this study, polymorphisms of the growth hormone receptor (GHR) and prolactin (PRL) genes were analysed as candidate genes for litter size in Australian White (AUW) sheep. According to the statistical analyses, the polymorphism information content (PIC) values of the PRL-P1-ins-23 bp, GHR-P2-del-23 bp and GHR-P8-del-23 bp loci were 0.371, 0.366 and 0.375, respectively, which indicates high genetic polymorphism in AUW sheep. Moreover, none of the indel loci conformed to Hardy-Weinberg equilibrium (HWE) (p < .05). Further, our findings revealed that the PRL-P1-ins-23 bp polymorphism in the ovine PRL gene was significantly related to first-parity litter size (p = .001), with the DD genotype displaying the highest genotypic mean. Meanwhile, the GHR-P2-del-23 bp and GHR-P8-del-23 bp indels in the ovine GHR gene were significantly correlated with first-parity litter size (p < .05), and individuals with genotype II showed significantly higher litter size than others. Collectively, these results could be useful for future sheep breeding strategies based on molecular-assisted selection (MAS).
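The abstract does not state how PIC or the HWE test was computed. As a hedged illustration, the sketch below uses the standard Botstein-style PIC formula, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², and a plain chi-square goodness-of-fit statistic for a biallelic indel locus. The function names and the genotype labels II, ID and DD (with ID assumed to be the heterozygote) are illustrative choices, not taken from the paper.

```python
def pic(allele_freqs):
    """Polymorphism information content from a list of allele frequencies."""
    n = len(allele_freqs)
    sum_sq = sum(p * p for p in allele_freqs)
    cross = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - sum_sq - cross


def hwe_chi_square(n_ii, n_id, n_dd):
    """Chi-square statistic against Hardy-Weinberg expected genotype counts.

    Assumes both alleles are observed; otherwise an expected count is zero
    and the statistic is undefined.
    """
    n = n_ii + n_id + n_dd
    p = (2 * n_ii + n_id) / (2 * n)  # frequency of the I allele
    q = 1 - p                        # frequency of the D allele
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_ii, n_id, n_dd)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

Under this formula a biallelic locus reaches its maximum PIC of 0.375 when both alleles have frequency 0.5, which is the value reported for GHR-P8-del-23 bp.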
Affiliation(s)
- Zhanerke Akhatayeva
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
| | - Cui Mao
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China.,Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China
| | - Fugui Jiang
- Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China.,Shandong Key Lab of Animal Disease Control and Breeding, Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan, China
| | - Chuanying Pan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
| | - Chunjian Lin
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China.,Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China
| | - Kunjie Hao
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China.,Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China
| | - Tianxin Lan
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China.,Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China
| | - Hong Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
| | - Qingfeng Zhang
- Tianjin Aoqun Sheep Industry Academy Company, Tianjin, China.,Tianjin Aoqun Animal Husbandry Co., Ltd., Tianjin, China
| | - Xianyong Lan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
|
39
|
Bak I, Kim DJ, Kim HC, Shin HJ, Yu E, Yoo KW, Yu DY. Two base pair deletion in IL2 receptor γ gene in NOD/SCID mice induces a highly severe immunodeficiency. Lab Anim Res 2020; 36:27. [PMID: 32817844 PMCID: PMC7427935 DOI: 10.1186/s42826-020-00048-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2019] [Accepted: 05/13/2020] [Indexed: 11/17/2022] Open
Abstract
Genome editing has recently emerged as a powerful tool for generating mutant mice. Small deletions of nucleotides in the target genes are frequently found in CRISPR/Cas9-mediated mutant mice. However, there are very few reports analyzing the phenotypes of mutant mice with small deletions generated by CRISPR/Cas9. In this study, we generated a mutant by microinjecting sgRNAs targeting the IL2 receptor γ gene, together with Cas9 protein, into the cytoplasm of IVF-derived NOD.CB17/Prkdcscid/JKrb (NOD/SCID) mouse embryos, and further investigated whether a 2 bp deletion in the IL2 receptor γ gene causes a severe deficiency of immune cells as seen in NOD/LtSz-scid IL2 receptor γ−/− (NSG) mice. Our results show that the thymus weight of the mutant mice was significantly less than that of NOD/SCID mice, whereas the spleen weight was marginally less. T and B cells in the mutant mice were severely deficient, and NK cells were almost absent. In addition, tumor growth was markedly increased in the mutant mice transplanted with HepG2, Raji and A549 cells, but not in nude and NOD/SCID mice. These results suggest that NOD/SCID mice with a 2 bp deletion in the IL2 receptor γ gene show the same phenotype as NSG mice. Taken together, our data indicate that small deletions introduced by genome editing are sufficient to generate null mutant mice.
Affiliation(s)
- Inseon Bak
- Korea Research Institute of Bioscience and Biotechnology (KRIBB), 125 Gwahak-ro, Yuseong-gu, Daejeon, 34141 Korea.,Genome engineering laboratory, GHBIO Inc., C406, 17 Techno4-ro Yuseong-gu, Daejeon, 34013 Korea
| | - Doo-Jin Kim
- Korea Research Institute of Bioscience and Biotechnology (KRIBB), 125 Gwahak-ro, Yuseong-gu, Daejeon, 34141 Korea
| | - Hyoung-Chin Kim
- Korea Research Institute of Bioscience and Biotechnology (KRIBB), 30 Yeongudanji-ro, Ochang-eup, Cheongwon-gu, Cheongju, Chungcheongbukdo 28116 Korea
| | - Hye-Jun Shin
- Genome engineering laboratory, GHBIO Inc., C406, 17 Techno4-ro Yuseong-gu, Daejeon, 34013 Korea
| | - Eunhye Yu
- Genome engineering laboratory, GHBIO Inc., C406, 17 Techno4-ro Yuseong-gu, Daejeon, 34013 Korea
| | - Kyeong-Won Yoo
- Genome engineering laboratory, GHBIO Inc., C406, 17 Techno4-ro Yuseong-gu, Daejeon, 34013 Korea
| | - Dae-Yeul Yu
- Korea Research Institute of Bioscience and Biotechnology (KRIBB), 125 Gwahak-ro, Yuseong-gu, Daejeon, 34141 Korea
|
40
|
Emond S, Petek M, Kay EJ, Heames B, Devenish SRA, Tokuriki N, Hollfelder F. Accessing unexplored regions of sequence space in directed enzyme evolution via insertion/deletion mutagenesis. Nat Commun 2020; 11:3469. [PMID: 32651386 PMCID: PMC7351745 DOI: 10.1038/s41467-020-17061-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2019] [Accepted: 06/01/2020] [Indexed: 11/22/2022] Open
Abstract
Insertions and deletions (InDels) are frequently observed in natural protein evolution, yet their potential remains untapped in laboratory evolution. Here we introduce a transposon-based mutagenesis approach (TRIAD) to generate libraries of random variants with short in-frame InDels, and screen TRIAD libraries to evolve a promiscuous arylesterase activity in a phosphotriesterase. The evolution exhibits features that differ from previous point mutagenesis campaigns: while the average activity of TRIAD variants is more compromised, a larger proportion has successfully adapted for the activity. Different functional profiles emerge: (i) both strong and weak trade-off between activities are observed; (ii) trade-off is more severe (20- to 35-fold increased kcat/KM in arylesterase with 60-400-fold decreases in phosphotriesterase activity) and (iii) improvements are present in kcat rather than just in KM, suggesting adaptive solutions. These distinct features make TRIAD an alternative to widely used point mutagenesis, accessing functional innovations and traversing unexplored fitness landscape regions.
Affiliation(s)
- Stephane Emond
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
- Evonetix Ltd, Coldhams Business Park, Norman Way, Cambridge, CB1 3LH, UK.
| | - Maya Petek
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
| | - Emily J Kay
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Cancer Research UK Beatson Institute, Glasgow, G61 1BD, UK
| | - Brennen Heames
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Institute for Evolution and Biodiversity, Westfälische Wilhelms-Universität, Hüfferstrasse 1, 48149, Münster, Germany
| | - Sean R A Devenish
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
- Fluidic Analytics, The Paddocks Business Centre, Cherry Hinton Road, Cambridge, CB1 8DH, UK
| | - Nobuhiko Tokuriki
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - Florian Hollfelder
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
|
41
|
Lin M, Guo JT. New insights into protein-DNA binding specificity from hydrogen bond based comparative study. Nucleic Acids Res 2020; 47:11103-11113. [PMID: 31665426 PMCID: PMC6868434 DOI: 10.1093/nar/gkz963] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Revised: 10/06/2019] [Accepted: 10/10/2019] [Indexed: 12/25/2022] Open
Abstract
Knowledge of protein-DNA binding specificity has important implications in understanding DNA metabolism, transcriptional regulation and developing therapeutic drugs. Previous studies demonstrated hydrogen bonds between amino acid side chains and DNA bases play major roles in specific protein-DNA interactions. In this paper, we investigated the roles of individual DNA strands and protein secondary structure types in specific protein-DNA recognition based on side chain-base hydrogen bonds. By comparing the contribution of each DNA strand to the overall binding specificity between DNA-binding proteins with different degrees of binding specificity, we found that highly specific DNA-binding proteins show balanced hydrogen bonding with each of the two DNA strands while multi-specific DNA binding proteins are generally biased towards one strand. Protein-base pair hydrogen bonds, in which both bases of a base pair are involved in forming hydrogen bonds with amino acid side chains, are more prevalent in the highly specific protein-DNA complexes than those in the multi-specific group. Amino acids involved in side chain-base hydrogen bonds favor strand and coil secondary structure types in highly specific DNA-binding proteins while multi-specific DNA-binding proteins prefer helices.
Affiliation(s)
- Maoxuan Lin
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Jun-Tao Guo
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
42
|
Seif Y, Choudhary KS, Hefner Y, Anand A, Yang L, Palsson BO. Metabolic and genetic basis for auxotrophies in Gram-negative species. Proc Natl Acad Sci U S A 2020; 117:6264-6273. [PMID: 32132208 PMCID: PMC7084086 DOI: 10.1073/pnas.1910499117] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Auxotrophies constrain the interactions of bacteria with their environment, but are often difficult to identify. Here, we develop an algorithm (AuxoFind) using genome-scale metabolic reconstruction to predict auxotrophies and apply it to a series of available genome sequences of over 1,300 Gram-negative strains. We identify 54 auxotrophs, along with the corresponding metabolic and genetic basis, using a pangenome approach, and highlight auxotrophies conferring a fitness advantage in vivo. We show that the metabolic basis of auxotrophy is species-dependent and varies with 1) pathway structure, 2) enzyme promiscuity, and 3) network redundancy. Various levels of complexity constitute the genetic basis, including 1) deleterious single-nucleotide polymorphisms (SNPs), in-frame indels, and deletions; 2) single/multigene deletion; and 3) movement of mobile genetic elements (including prophages) combined with genomic rearrangements. Fourteen out of 19 predictions agree with experimental evidence, with the remaining cases highlighting shortcomings of sequencing, assembly, annotation, and reconstruction that prevent predictions of auxotrophies. We thus develop a framework to identify the metabolic and genetic basis for auxotrophies in Gram-negatives.
Affiliation(s)
- Yara Seif
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122
| | - Kumari Sonal Choudhary
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122
| | - Ying Hefner
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122
| | - Amitesh Anand
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122
| | - Laurence Yang
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122
- Department of Chemical Engineering, Queen's University, Kingston, ON K7L 3N6, Canada
| | - Bernhard O Palsson
- Systems Biology Research Group, Department of Bioengineering, University of California San Diego, CA 92122;
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
|
43
|
Toward in silico Identification of Tumor Neoantigens in Immunotherapy. Trends Mol Med 2019; 25:980-992. [PMID: 31494024 DOI: 10.1016/j.molmed.2019.08.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2019] [Revised: 07/13/2019] [Accepted: 08/02/2019] [Indexed: 12/30/2022]
Abstract
Cancer immunotherapy includes cancer vaccination, adoptive T cell transfer (ACT) with chimeric antigen receptor (CAR) T cells, and administration of tumor-infiltrating lymphocytes and immune-checkpoint blockade such as anti-CTLA4/anti-PD1 inhibitors that can directly or indirectly target tumor neoantigens and elicit a T cell response. Accurate, rapid, and cost-effective identification of neoantigens, however, is critical for successful immunotherapy. Here, we review computational issues for neoantigen identification by summarizing the various sources of neoantigens and their identification from high-throughput sequencing data. Several opinions are presented to inspire further discussions toward improving neoantigen identification. Continuing efforts are required to improve the sensitivity and specificity of bona fide neoantigens, taking advantage of the development of high-throughput sequencing techniques for effective and personalized cancer immunotherapy.
|
44
|
Yue Z, Zhao L, Cheng N, Yan H, Xia J. dbCID: a manually curated resource for exploring the driver indels in human cancer. Brief Bioinform 2019; 20:1925-1933. [DOI: 10.1093/bib/bby059] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2018] [Revised: 05/22/2018] [Indexed: 12/12/2022] Open
Abstract
While recent advances in next-generation sequencing technologies have enabled the creation of a multitude of databases in cancer genomic research, there is not yet a comprehensive database focusing on the annotation of driver indels (insertions and deletions). Therefore, we have developed the database of Cancer driver InDels (dbCID), which is a collection of known coding indels that are likely to be engaged in cancer development, progression or therapy. dbCID contains experimentally supported and putative driver indels derived from manual curation of literature and is freely available online at http://bioinfo.ahu.edu.cn:8080/dbCID. Using the data deposited in dbCID, we summarized features of driver indels at four levels (gene, DNA, transcript and protein) by comparing them with putative neutral indels. We found that most of the genes containing driver indels in dbCID are known cancer genes playing a role in tumorigenesis. Contrary to expectation, the sequences affected by driver frameshift indels are not larger than those affected by neutral ones. In addition, frameshift and inframe driver indels tend to disrupt highly conserved regions both in DNA sequences and in protein domains. Finally, we developed a computational method for discriminating cancer driver from neutral frameshift indels based on the data deposited in dbCID. The proposed method outperformed other widely used non-cancer-specific predictors on an external test set, which demonstrated the usefulness of the data deposited in dbCID. We hope dbCID will be a benchmark for improving and evaluating prediction algorithms, and the characteristics summarized here may assist with investigating the mechanism of indel–cancer association.
Affiliation(s)
- Zhenyu Yue
- Institute of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
| | - Le Zhao
- Institute of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
| | - Na Cheng
- Institute of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
| | - Hua Yan
- School of Life Sciences, Anhui University, Hefei, Anhui, China
| | - Junfeng Xia
- Institute of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
| |
|
45
|
Shukla HG, Bawa PS, Srinivasan S. hg19KIndel: ethnicity normalized human reference genome. BMC Genomics 2019; 20:459. [PMID: 31170919 PMCID: PMC6555027 DOI: 10.1186/s12864-019-5854-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/25/2019] [Accepted: 05/29/2019] [Indexed: 11/22/2022] Open
Abstract
Background: The most widely used human genome reference assembly, hg19, harbors minor alleles at 2.18 million positions, as revealed by the 1000 Genomes Phase 3 dataset. Although this is less than 2% of the 89 million variants reported, it has been shown that the minor alleles can result in 30% false positives in individual genomes, thus misleading and burdening downstream interpretation. More alarming is the fact that a significant percentage of variants that are homozygous recessive for these minor alleles, with potential disease implications, are masked from reporting. Results: We have demonstrated that the false positives (FP) and false negatives (FN) can be corrected for by simply replacing nucleotides at the minor allele positions in hg19 with the corresponding major allele. Here, we have replaced 2.18 million minor alleles (single nucleotide polymorphisms (SNPs), insertions and deletions (INDELs), and multiple nucleotide polymorphisms (MNPs)) in hg19 with the corresponding major alleles to create an ethnically normalized reference genome called hg19KIndel. In doing so, hg19KIndel has both corrected for sequencing errors acknowledged to be present in hg19 and improved read alignment near the minor alleles in hg19. Conclusion: We have created and made available a new version of the human reference genome called hg19KIndel. It has been shown that variant calling using hg19KIndel significantly reduces false-positive calls, which in turn reduces the burden on downstream analysis and validation. It also reduces false negatives: variants that were being missed because of the presence of minor alleles in hg19 are now called using hg19KIndel. Using hg19KIndel, one also gets a better mapping percentage than with the currently available human reference genome. The hg19KIndel reference genome and its auxiliary datasets are available at 10.5281/zenodo.2638113
Affiliation(s)
- Harsh G Shukla
- Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore, 560100, India
| | - Pushpinder Singh Bawa
- Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore, 560100, India
- Manipal Academy of Higher Education (MAHE), Manipal, India
| | - Subhashini Srinivasan
- Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore, 560100, India
| |
|
46
|
Pagel KA, Antaki D, Lian A, Mort M, Cooper DN, Sebat J, Iakoucheva LM, Mooney SD, Radivojac P. Pathogenicity and functional impact of non-frameshifting insertion/deletion variation in the human genome. PLoS Comput Biol 2019; 15:e1007112. [PMID: 31199787 PMCID: PMC6594643 DOI: 10.1371/journal.pcbi.1007112] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 11/16/2018] [Revised: 06/26/2019] [Accepted: 05/17/2019] [Indexed: 11/19/2022] Open
Abstract
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
Affiliation(s)
- Kymberleigh A. Pagel
- School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
| | - Danny Antaki
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - AoJie Lian
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - David N. Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - Jonathan Sebat
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - Lilia M. Iakoucheva
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
| | - Predrag Radivojac
- School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
- Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts, United States of America
| |
|
47
|
Grover CE, Arick MA, Thrash A, Conover JL, Sanders WS, Peterson DG, Frelichowski JE, Scheffler JA, Scheffler BE, Wendel JF. Insights into the Evolution of the New World Diploid Cottons (Gossypium, Subgenus Houzingenia) Based on Genome Sequencing. Genome Biol Evol 2019; 11:53-71. [PMID: 30476109 PMCID: PMC6320677 DOI: 10.1093/gbe/evy256] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Accepted: 11/20/2018] [Indexed: 12/24/2022] Open
Abstract
We employed phylogenomic methods to study molecular evolutionary processes and phylogeny in the geographically widely dispersed New World diploid cottons (Gossypium, subg. Houzingenia). Whole genome resequencing data (average of 33× genomic coverage) were generated to reassess the phylogenetic history of the subgenus and provide a temporal framework for its diversification. Phylogenetic analyses indicate that the subgenus likely originated following transoceanic dispersal from Africa about 6.6 Ma, but that nearly all of the biodiversity evolved following rapid diversification in the mid-Pleistocene (0.5-2.0 Ma), with multiple long-distance dispersals required to account for range expansion to Arizona, the Galapagos Islands, and Peru. Comparative analyses of cpDNA versus nuclear data indicate that this history was accompanied by several clear cases of interspecific introgression. Repetitive DNAs contribute roughly half of the total 880 Mb genome, but most transposable element families are relatively old and stable among species. In the genic fraction, pairwise synonymous mutation rates average 1% per Myr, with nonsynonymous changes being about seven times less frequent. Over 1.1 million indels were detected and phylogenetically polarized, revealing a 2-fold bias toward deletions over small insertions. We suggest that this genome down-sizing bias counteracts genome size growth by TE amplification and insertions, and helps explain the relatively small genomes that are restricted to this subgenus. Compared with the rate of nucleotide substitution, the rate of indel occurrence is much lower, averaging about 17 nucleotide substitutions per indel event.
Affiliation(s)
- Corrinne E Grover
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University
| | - Mark A Arick
- Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University
| | - Adam Thrash
- Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University
| | - Justin L Conover
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University
| | - William S Sanders
- Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University
- Department of Computer Science & Engineering, Mississippi State University
- The Jackson Laboratory, Connecticut
| | - Daniel G Peterson
- Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University
| | - Brian E Scheffler
- USDA, Genomics and Bioinformatics Research Unit, Stoneville, Mississippi
| | - Jonathan F Wendel
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University
| |
|