1
Xu K, Zhu J, Zhai H, Yang Q, Zhou K, Song Q, Wu J, Liu D, Li Y, Xia Z. A single-nucleotide polymorphism in PvPW1 encoding β-1,3-glucanase 9 is associated with pod width in Phaseolus vulgaris L. J Genet Genomics 2024:S1673-8527(24)00258-3. [PMID: 39389459 DOI: 10.1016/j.jgg.2024.09.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Revised: 09/25/2024] [Accepted: 09/26/2024] [Indexed: 10/12/2024]
Abstract
Pod width influences pod size, shape, yield, and consumer preference in snap beans (Phaseolus vulgaris L.). In this study, we map PvPW1, a quantitative trait locus associated with pod width in snap beans, through genotyping and phenotyping of recombinant plants. We identify Phvul.006G072800, encoding the β-1,3-glucanase 9 protein, as the causal gene for PvPW1. The PvPW1^G3555 allele is found to positively regulate pod width, as revealed by an association analysis between pod width phenotype and the PvPW1^G3555C genotype across 17 bi-parental F2 populations. 97.7% of the 133 wide pod accessions carry PvPW1^G3555, while 82.1% of the 78 narrow pod accessions carry PvPW1^C3555, indicating strong selection pressure on PvPW1 during common bean breeding. Re-sequencing data from 59 common bean cultivars identify an 8-bp deletion in the intron linked to PvPW1^C3555, leading to the development of the InDel marker PvM436. Genotyping 317 common bean accessions with PvM436 demonstrated that accessions with the PvM436^247 and PvM436^227 alleles have wider pods than those with the PvM436^219 allele, establishing PvM436 as a reliable marker for molecular breeding in snap beans. These findings highlight PvPW1 as a critical gene regulating pod width and underscore the utility of PvM436 in marker-assisted selection for snap bean breeding.
Affiliation(s)
- Kun Xu
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Jinlong Zhu
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Hong Zhai
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Qiang Yang
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Keqin Zhou
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Qijian Song
  - USDA ARS, Soybean Genome & Improvement Lab, Beltsville 20705, USA
- Jing Wu
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 10081, China
- Dajun Liu
  - Horticulture Department, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, Heilongjiang 150000, China
- Yanhua Li
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
- Zhengjun Xia
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, Heilongjiang 150081, China
2
Panrat T, Phongdara A, Wuthisathid K, Meemetta W, Phiwsaiya K, Vanichviriyakit R, Senapin S, Sangsuriya P. Structural modelling and preventive strategy targeting of WSSV hub proteins to combat viral infection in shrimp Penaeus monodon. PLoS One 2024; 19:e0307976. [PMID: 39074084 DOI: 10.1371/journal.pone.0307976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 07/15/2024] [Indexed: 07/31/2024] Open
Abstract
White spot syndrome virus (WSSV) presents a considerable peril to the aquaculture sector, leading to notable financial consequences on a global scale. Previous studies have identified hub proteins, including WSSV051 and WSSV517, as essential binding elements in the protein interaction network of WSSV. This work further investigates the functional structures and potential applications of WSSV hub complexes in managing WSSV infection. Using computational methodologies, we have successfully generated comprehensive three-dimensional (3D) representations of hub proteins along with their three mutual binding counterparts, elucidating crucial interaction locations. The results of our study indicate that the WSSV051 hub protein demonstrates higher binding energy than WSSV517. Moreover, a unique motif, denoted as "S-S-x(5)-S-x(2)-P," was discovered among the binding proteins. This pattern perhaps contributes to the detection of partners by the hub proteins of WSSV. An antiviral strategy targeting WSSV hub proteins was demonstrated through the oral administration of dual hub double-stranded RNAs to the black tiger shrimp, Penaeus monodon, followed by a challenge assay. The findings demonstrate a decrease in shrimp mortality and a cessation of WSSV multiplication. In conclusion, our research unveils the structural features and dynamic interactions of hub complexes, shedding light on their significance in the WSSV protein network. This highlights the potential of hub protein-based interventions to mitigate the impact of WSSV infection in aquaculture.
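The motif quoted in this abstract, "S-S-x(5)-S-x(2)-P", reads like PROSITE-style pattern notation, where `x(n)` stands for any n residues. As a minimal sketch under that assumption, the pattern can be translated into a regular expression and scanned against a protein sequence. The helper names and the test sequence below are hypothetical illustrations and are not taken from the paper.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Only the elements needed here are handled: single residues, the 'x'
    wildcard, and repeat counts such as x(5). Other PROSITE syntax
    (residue classes, ranges, anchors) is deliberately not covered.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"([A-Za-z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        token = "." if residue.lower() == "x" else residue.upper()
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_motif(sequence: str, pattern: str = "S-S-x(5)-S-x(2)-P"):
    """Return (0-based start, matched text) for every motif occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({prosite_to_regex(pattern)}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence, made up purely for illustration.
    print(prosite_to_regex("S-S-x(5)-S-x(2)-P"))
    print(find_motif("MKSSAAAAASGGPLL"))
```

The lookahead form is a design choice: motif instances in serine-rich stretches can overlap, and a plain `finditer` would silently skip the second of two overlapping hits.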
Affiliation(s)
- Tanate Panrat
  - Prince of Songkla University International College, Prince of Songkla University, Hatyai Campus, Songkhla, Thailand
  - Center for Genomics and Bioinformatics Research, Faculty of Science, Prince of Songkla University, Songkhla, Thailand
- Amornrat Phongdara
  - Center for Genomics and Bioinformatics Research, Faculty of Science, Prince of Songkla University, Songkhla, Thailand
- Kitti Wuthisathid
  - Center of Excellence for Shrimp Molecular Biology and Biotechnology (Centex Shrimp), Faculty of Science, Mahidol University, Bangkok, Thailand
- Watcharachai Meemetta
  - Center of Excellence for Shrimp Molecular Biology and Biotechnology (Centex Shrimp), Faculty of Science, Mahidol University, Bangkok, Thailand
- Kornsunee Phiwsaiya
  - Center of Excellence for Shrimp Molecular Biology and Biotechnology (Centex Shrimp), Faculty of Science, Mahidol University, Bangkok, Thailand
  - National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Rapeepun Vanichviriyakit
  - Center of Excellence for Shrimp Molecular Biology and Biotechnology (Centex Shrimp), Faculty of Science, Mahidol University, Bangkok, Thailand
  - Department of Anatomy, Faculty of Science, Mahidol University, Bangkok, Thailand
- Saengchan Senapin
  - Center of Excellence for Shrimp Molecular Biology and Biotechnology (Centex Shrimp), Faculty of Science, Mahidol University, Bangkok, Thailand
  - National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Pakkakul Sangsuriya
  - Aquatic Molecular Genetics and Biotechnology Research Team, BIOTEC, NSTDA, Pathum Thani, Thailand
3
Teekas L, Sharma S, Vijay N. Terminal regions of a protein are a hotspot for low complexity regions and selection. Open Biol 2024; 14:230439. [PMID: 38862022 DOI: 10.1098/rsob.230439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/13/2024] [Indexed: 06/13/2024] Open
Abstract
Volatile low complexity regions (LCRs) are a novel source of adaptive variation, functional diversification and evolutionary novelty. An interplay of selection and mutation governs the composition and length of low complexity regions. High %GC and mutations provide length variability because of mechanisms like replication slippage. Owing to the complex dynamics between selection and mutation, we need a better understanding of their coexistence. Our findings underscore that positively selected sites (PSS) and low complexity regions prefer the terminal regions of genes, co-occurring in most Tetrapoda clades. We observed that positively selected sites within a gene have position-specific roles. Central-positively selected site genes primarily participate in defence responses, whereas terminal-positively selected site genes exhibit non-specific functions. Low complexity region-containing genes in the Tetrapoda clade exhibit a significantly higher %GC and lower ω (dN/dS: non-synonymous substitution rate/synonymous substitution rate) compared with genes without low complexity regions. This lower ω implies that despite providing rapid functional diversity, low complexity region-containing genes are subjected to intense purifying selection. Furthermore, we observe that low complexity regions consistently display ubiquitous prevalence at lower purity levels, but exhibit a preference for specific positions within a gene as the purity of the low complexity region stretch increases, implying a composition-dependent evolutionary role. Our findings collectively contribute to the understanding of how genetic diversity and adaptation are shaped by the interplay of selection and low complexity regions in the Tetrapoda clade.
Affiliation(s)
- Lokdeep Teekas
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Sandhya Sharma
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
4
Dickson ZW, Golding GB. Evolution of Transcript Abundance is Influenced by Indels in Protein Low Complexity Regions. J Mol Evol 2024; 92:153-168. [PMID: 38485789 DOI: 10.1007/s00239-024-10158-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 01/24/2024] [Indexed: 04/02/2024]
Abstract
Protein low complexity regions (LCRs) are compositionally biased amino acid sequences, many of which have significant evolutionary impacts on the proteins which contain them. They are mutationally unstable, experiencing higher rates of indels and substitutions than higher complexity regions. LCRs also impact the expression of their proteins, likely through multiple effects along the path from gene transcription, through translation, and eventual protein degradation. It has been observed that proteins which contain LCRs are associated with elevated transcript abundance (TAb), despite having lower protein abundance. We have gathered and integrated human data to investigate the co-evolution of TAb and LCRs through ancestral reconstructions and model inference using an approximate Bayesian calculation based method. We observe that on short evolutionary timescales TAb evolution is significantly impacted by changes in LCR length, with insertions driving TAb down. In contrast, the observed data are best explained by indel rates in LCRs which are unaffected by shifts in TAb. Our work demonstrates a coupling between LCR and TAb evolution, and the utility of incorporating multiple responses into evolutionary analyses.
Affiliation(s)
- G Brian Golding
  - Department of Biology, McMaster University, Hamilton, ON, Canada
5
Lee Y, Kim SJ, Kim YJ, Kim YH, Yoon JY, Shin J, Ok SM, Kim EJ, Choi EJ, Oh JW. Sensor development for multiple simultaneous classifications using genetically engineered M13 bacteriophages. Biosens Bioelectron 2023; 241:115642. [PMID: 37703643 DOI: 10.1016/j.bios.2023.115642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 07/17/2023] [Accepted: 08/25/2023] [Indexed: 09/15/2023]
Abstract
Sensors for detecting infinitesimal amounts of chemicals in air have been widely developed because they can identify the origin of chemicals. These sensing technologies are also used to determine the variety and freshness of fresh food and detect explosives, hazardous chemicals, environmental hormones, and diseases using exhaled gases. However, there is still a need to rapidly develop portable and highly sensitive sensors that respond to complex environments. Here, we show an efficient method for optimising an M13 bacteriophage-based multi-array colourimetric sensor for multiple simultaneous classifications. Apples, which are difficult to classify due to many varieties in distribution, were selected for classifying targets. M13 was adopted to fabricate a multi-array colourimetric sensor using the self-templating process since a chemical property of major coat protein p8 consisting of the M13 body can be manipulated by genetic engineering to respond to various target substances. The twenty sensor units, which consisted of different types of manipulated M13, exhibited colour changes because of the change of photonic crystal-like nanostructure when they were exposed to target substances associated with apples. The classification success rate of the optimal sensor combinations was achieved with high accuracy for the apple variety (100%), four standard fragrances (100%), and aging (84.5%) simultaneously. We expect that this optimisation technique can be used for rapid sensor development capable of multiple simultaneous classifications in various fields, such as medical diagnosis, hazardous environment monitoring, and the food industry, where sensors need to be developed in response to complex environments consisting of various targets.
Affiliation(s)
- Yujin Lee
  - Department of Nano Fusion Technology, Pusan National University, 46241, Busan, Republic of Korea
- Sung-Jo Kim
  - Bio-IT Fusion Technology Research Institute, Pusan National University, 46241, Busan, Republic of Korea
- Ye-Ji Kim
  - Department of Nano Fusion Technology, Pusan National University, 46241, Busan, Republic of Korea
- You Hwan Kim
  - Department of Nano Fusion Technology, Pusan National University, 46241, Busan, Republic of Korea
- Ji-Young Yoon
  - Dental Research Institute, Dental and Life Science Institute, Pusan National University, 50612, Yangsan, Republic of Korea; Department of Dental Anesthesia and Pain Medicine, School of Dentistry, Pusan National University, 50612, Yangsan, Republic of Korea
- Jonghyun Shin
  - Dental Research Institute, Dental and Life Science Institute, Pusan National University, 50612, Yangsan, Republic of Korea; Department of Pediatric Dentistry, School of Dentistry, Pusan National University, 50612, Yangsan, Republic of Korea
- Soo-Min Ok
  - Dental Research Institute, Dental and Life Science Institute, Pusan National University, 50612, Yangsan, Republic of Korea; Department of Oral Medicine, School of Dentistry, Pusan National University, 50612, Yangsan, Republic of Korea
- Eun-Jung Kim
  - Dental Research Institute, Dental and Life Science Institute, Pusan National University, 50612, Yangsan, Republic of Korea; Department of Dental Anesthesia and Pain Medicine, School of Dentistry, Pusan National University, 50612, Yangsan, Republic of Korea
- Eun Jung Choi
  - Bio-IT Fusion Technology Research Institute, Pusan National University, 46241, Busan, Republic of Korea; Korea Nanobiotechnology Center, Pusan National University, 46241, Busan, Republic of Korea
- Jin-Woo Oh
  - Department of Nano Fusion Technology, Pusan National University, 46241, Busan, Republic of Korea; Bio-IT Fusion Technology Research Institute, Pusan National University, 46241, Busan, Republic of Korea; Korea Nanobiotechnology Center, Pusan National University, 46241, Busan, Republic of Korea; Department of Nanoenergy Engineering and Research Center for Energy Convergence Technology, Pusan National University, 46241, Busan, Republic of Korea
6
Rich KD, Srivastava S, Muthye VR, Wasmuth JD. Identification of potential molecular mimicry in pathogen-host interactions. PeerJ 2023; 11:e16339. [PMID: 37953771 PMCID: PMC10637249 DOI: 10.7717/peerj.16339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 10/02/2023] [Indexed: 11/14/2023] Open
Abstract
Pathogens have evolved sophisticated strategies to manipulate host signaling pathways, including the phenomenon of molecular mimicry, where pathogen-derived biomolecules imitate host biomolecules. In this study, we resurrected, updated, and optimized a sequence-based bioinformatics pipeline to identify potential molecular mimicry candidates between humans and 32 pathogenic species whose proteomes' 3D structure predictions were available at the start of this study. We observed considerable variation in the number of mimicry candidates across pathogenic species, with pathogenic bacteria exhibiting fewer candidates compared to fungi and protozoans. Further analysis revealed that the candidate mimicry regions were enriched in solvent-accessible regions, highlighting their potential functional relevance. We identified a total of 1,878 mimicked regions in 1,439 human proteins, and clustering analysis indicated diverse target proteins across pathogen species. The human proteins containing mimicked regions revealed significant associations between these proteins and various biological processes, with an emphasis on host extracellular matrix organization and cytoskeletal processes. However, immune-related proteins were underrepresented as targets of mimicry. Our findings provide insights into the broad range of host-pathogen interactions mediated by molecular mimicry and highlight potential targets for further investigation. This comprehensive analysis contributes to our understanding of the complex mechanisms employed by pathogens to subvert host defenses and we provide a resource to assist researchers in the development of novel therapeutic strategies.
Affiliation(s)
- Kaylee D. Rich
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- Shruti Srivastava
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- Viraj R. Muthye
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- James D. Wasmuth
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
7
Vaglietti S, Villeri V, Dell’Oca M, Marchetti C, Cesano F, Rizzo F, Miller D, LaPierre L, Pelassa I, Monje FJ, Colnaghi L, Ghirardi M, Fiumara F. PolyQ length-based molecular encoding of vocalization frequency in FOXP2. iScience 2023; 26:108036. [PMID: 37860754 PMCID: PMC10582585 DOI: 10.1016/j.isci.2023.108036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 07/18/2023] [Accepted: 09/21/2023] [Indexed: 10/21/2023] Open
Abstract
The transcription factor FOXP2, a regulator of vocalization- and speech/language-related phenotypes, contains two long polyQ repeats (Q1 and Q2) displaying marked, still enigmatic length variation across mammals. We found that the Q1/Q2 length ratio quantitatively encodes vocalization frequency ranges, from the infrasonic to the ultrasonic, displaying striking convergent evolution patterns. Thus, species emitting ultrasonic vocalizations converge with bats in having a low ratio, whereas species vocalizing in the low-frequency/infrasonic range converge with elephants and whales, which have higher ratios. Similar, taxon-specific patterns were observed for the FOXP2-related protein FOXP1. At the molecular level, we observed that the FOXP2 polyQ tracts form coiled coils, assembling into condensates and fibrils, and drive liquid-liquid phase separation (LLPS). By integrating evolutionary and molecular analyses, we found that polyQ length variation related to vocalization frequency impacts FOXP2 structure, LLPS, and transcriptional activity, thus defining a novel form of polyQ length-based molecular encoding of vocalization frequency.
Affiliation(s)
- Serena Vaglietti
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Veronica Villeri
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Marco Dell’Oca
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Chiara Marchetti
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Federico Cesano
  - Department of Chemistry, University of Turin, 10125 Turin, Italy
- Francesca Rizzo
  - Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR 518057, China
- Dave Miller
  - Cascades Pika Watch, Oregon Zoo, Portland, OR 97221, USA
- Louis LaPierre
  - Department of Natural Science, Lower Columbia College, Longview, WA 98632, USA
- Ilaria Pelassa
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Francisco J. Monje
  - Department of Neurophysiology and Neuropharmacology, Medical University of Vienna, 1090 Vienna, Austria
- Luca Colnaghi
  - Division of Neuroscience, IRCCS San Raffaele Scientific Institute, 20132 Milan, Italy
  - School of Medicine, Vita-Salute San Raffaele University, 20132 Milan, Italy
- Mirella Ghirardi
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Ferdinando Fiumara
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
8
Chan-Yao-Chong M, Chan J, Kono H. Benchmarking of force fields to characterize the intrinsically disordered R2-FUS-LC region. Sci Rep 2023; 13:14226. [PMID: 37648703 PMCID: PMC10468508 DOI: 10.1038/s41598-023-40801-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 08/16/2023] [Indexed: 09/01/2023] Open
Abstract
Intrinsically Disordered Proteins (IDPs) play crucial roles in numerous diseases like Alzheimer's and ALS by forming irreversible amyloid fibrils. The effectiveness of force fields (FFs) developed for globular proteins and their modified versions for IDPs varies depending on the specific protein. This study assesses 13 FFs, including AMBER and CHARMM, by simulating the R2 region of the FUS-LC domain (R2-FUS-LC region), an IDP implicated in ALS. Due to the flexibility of the region, we show that utilizing multiple measures, which evaluate the local and global conformations, and combining them into a final score are important for a comprehensive evaluation of force fields. The results suggest c36m2021s3p with the mTIP3P water model is the most balanced FF, capable of generating various conformations compatible with known ones. In addition, the mTIP3P water model is computationally more efficient than the four-site water models used with the top-ranked AMBER FFs. The evaluation also reveals that AMBER FFs tend to generate more compact conformations compared to CHARMM FFs but also more non-native contacts. The top-ranking AMBER and CHARMM FFs can reproduce intra-peptide contacts but underperform for inter-peptide contacts, indicating there is room for improvement.
Affiliation(s)
- Maud Chan-Yao-Chong
  - Molecular Modeling and Simulation (MMS) Team, Institute for Quantum Life Science, National Institutes for Quantum Science and Technology (QST), 4-9-1, Anagawa, Inage Ward, Chiba City, Chiba, 263-8555, Japan
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 135, Avenue de Rangueil, 31077, Toulouse Cedex 04, France
- Justin Chan
  - Molecular Modeling and Simulation (MMS) Team, Institute for Quantum Life Science, National Institutes for Quantum Science and Technology (QST), 4-9-1, Anagawa, Inage Ward, Chiba City, Chiba, 263-8555, Japan
- Hidetoshi Kono
  - Molecular Modeling and Simulation (MMS) Team, Institute for Quantum Life Science, National Institutes for Quantum Science and Technology (QST), 4-9-1, Anagawa, Inage Ward, Chiba City, Chiba, 263-8555, Japan
9
Birchard K, Driver HG, Ademidun D, Bedolla-Guzmán Y, Birt T, Chown EE, Deane P, Harkness BAS, Morrin A, Masello JF, Taylor RS, Friesen VL. Circadian gene variation in relation to breeding season and latitude in allochronic populations of two pelagic seabird species complexes. Sci Rep 2023; 13:13692. [PMID: 37608061 PMCID: PMC10444859 DOI: 10.1038/s41598-023-40702-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Accepted: 08/16/2023] [Indexed: 08/24/2023] Open
Abstract
Annual cues in the environment result in physiological changes that allow organisms to time reproduction during periods of optimal resource availability. Understanding how circadian rhythm genes sense these environmental cues and stimulate the appropriate physiological changes in response is important for determining the adaptability of species, especially in the advent of changing climate. A first step involves characterizing the environmental correlates of natural variation in these genes. Band-rumped and Leach's storm-petrels (Hydrobates spp.) are pelagic seabirds that breed across a wide range of latitudes. Importantly, some populations have undergone allochronic divergence, in which sympatric populations use the same breeding sites at different times of year. We investigated the relationship between variation in key functional regions of four genes that play an integral role in the cellular clock mechanism (Clock, Bmal1, Cry2 and Per2) with both breeding season and absolute latitude in these two species complexes. We discovered that allele frequencies in two genes, Clock and Bmal1, differed between seasonal populations in one archipelago, and also correlated with absolute latitude of breeding colonies. These results indicate that variation in these circadian rhythm genes may be involved in allochronic speciation, as well as adaptation to photoperiod at breeding locations.
Affiliation(s)
- Katie Birchard
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Apex Resource Management Solutions, Ottawa, ON, K2A 3K2, Canada
- Hannah G Driver
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Children's Hospital of Eastern Ontario Research Institute, Ottawa, ON, K1H 8L1, Canada
- Dami Ademidun
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
- Tim Birt
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
- Erin E Chown
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
- Petra Deane
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Mascoma LLC, Lallemand Inc., Lebanon, NH, 03766, USA
- Bronwyn A S Harkness
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Environment and Climate Change Canada, Wildlife Research Division, Ottawa, ON, K1S 5B6, Canada
- Austin Morrin
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Sims Animal Hospital, Kingston, ON, K7K 7E9, Canada
- Juan F Masello
  - Department of Animal Behaviour, University of Bielefeld, 33615, Bielefeld, Germany
- Rebecca S Taylor
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
  - Environment and Climate Change Canada, Landscape Science and Technology Division, Ottawa, ON, K1S 5R1, Canada
- Vicki L Friesen
  - Biology Department, Queen's University, Kingston, ON, K7L 3N6, Canada
10
Kim SJ, Lee Y, Choi EJ, Lee JM, Kim KH, Oh JW. The development progress of multi-array colourimetric sensors based on the M13 bacteriophage. NANO CONVERGENCE 2023; 10:1. [PMID: 36595116 PMCID: PMC9808696 DOI: 10.1186/s40580-022-00351-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 12/08/2022] [Indexed: 06/17/2023]
Abstract
Techniques for detecting chemicals dispersed at low concentrations in air continue to evolve. These techniques can be applied not only to manage the quality of agricultural products using a post-ripening process but also to establish a safety prevention system by detecting harmful gases and diagnosing diseases. Recently, techniques for rapid response to various chemicals and detection in complex and noisy environments have been developed using M13 bacteriophage-based sensors. In this review, M13 bacteriophage-based multi-array colourimetric sensors for the development of an electronic nose are discussed. The self-templating process was adapted to fabricate a colour band structure consisting of an M13 bacteriophage. To detect diverse target chemicals, the colour band was utilised with wild and genetically engineered M13 bacteriophages to enhance their sensing abilities. Multi-array colourimetric sensors were optimised for application in complex and noisy environments based on simulation and deep learning analysis. The development of a multi-array colourimetric sensor platform based on the M13 bacteriophage is likely to result in significant advances in the detection of various harmful gases and the diagnosis of various diseases based on exhaled gas in the future.
Affiliation(s)
- Sung-Jo Kim
- Bio-IT Fusion Technology Research Institute, Pusan National University, Busan, Republic of Korea
| | - Yujin Lee
- Department of Nano Fusion Technology, Pusan National University, Busan, Republic of Korea
| | - Eun Jung Choi
- Bio-IT Fusion Technology Research Institute, Pusan National University, Busan, Republic of Korea
- Korea Nanobiotechnology Center, Pusan National University, Busan, Republic of Korea
| | - Jong-Min Lee
- School of Nano Convergence Technology, Hallym University, Chuncheon, Republic of Korea
- Korea and Nano Convergence Technology Center, Hallym University, Chuncheon, Republic of Korea
| | - Kwang Ho Kim
- School of Materials Science and Engineering, Pusan National University, Busan, Republic of Korea
- Global Frontier Research and Development Center for Hybrid Interface Materials, Pusan National University, Busan, Republic of Korea
| | - Jin-Woo Oh
- Bio-IT Fusion Technology Research Institute, Pusan National University, Busan, Republic of Korea
- Department of Nano Fusion Technology, Pusan National University, Busan, Republic of Korea
- Korea Nanobiotechnology Center, Pusan National University, Busan, Republic of Korea
- Department of Nanoenergy Engineering and Research Center for Energy Convergence Technology, Pusan National University, Busan, Republic of Korea
|
11
|
Lee B, Jaberi-Lashkari N, Calo E. A unified view of low complexity regions (LCRs) across species. eLife 2022; 11:e77058. [PMID: 36098382 PMCID: PMC9470157 DOI: 10.7554/elife.77058] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/14/2022] [Accepted: 08/17/2022] [Indexed: 11/13/2022] Open
Abstract
Low complexity regions (LCRs) play a role in a variety of important biological processes, yet we lack a unified view of their sequences, features, relationships, and functions. Here, we use dotplots and dimensionality reduction to systematically define LCR type/copy relationships and create a map of LCR sequence space capable of integrating LCR features and functions. By defining LCR relationships across the proteome, we provide insight into how LCR type and copy number contribute to higher order assemblies, such as the importance of K-rich LCR copy number for assembly of the nucleolar protein RPA43 in vivo and in vitro. With LCR maps, we reveal the underlying structure of LCR sequence space, and relate differential occupancy in this space to the conservation and emergence of higher order assemblies, including the metazoan extracellular matrix and plant cell wall. Together, LCR relationships and maps uncover and identify scaffold-client relationships among E-rich LCR-containing proteins in the nucleolus, and reveal previously undescribed regions of LCR sequence space with signatures of higher order assemblies, including a teleost-specific T/H-rich sequence space. Thus, this unified view of LCRs enables discovery of how LCRs encode higher order assemblies of organisms.
Affiliation(s)
- Byron Lee
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
| | - Nima Jaberi-Lashkari
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
| | - Eliezer Calo
- Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, United States
|
12
|
Aledo JC. A Census of Human Methionine-Rich Prion-like Domain-Containing Proteins. Antioxidants (Basel) 2022; 11:antiox11071289. [PMID: 35883780 PMCID: PMC9312190 DOI: 10.3390/antiox11071289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2022] [Revised: 06/24/2022] [Accepted: 06/27/2022] [Indexed: 11/16/2022] Open
Abstract
Methionine-rich prion-like proteins can regulate liquid–liquid phase separation processes in response to stresses. To date, however, very few proteins have been identified as methionine-rich prion-like. Herein, we have performed a computational survey of the human proteome to search for methionine-rich prion-like domains. We present a census of 51 manually curated methionine-rich prion-like proteins. Our results show that these proteins tend to be modular in nature, with molecular sizes significantly greater than those we would expect due to random sampling effects. These proteins also exhibit a remarkably high degree of spatial compaction when compared to average human proteins, even when protein size is accounted for. Computational evidence suggests that such a high degree of compactness might be due to the aggregation of methionine residues, pointing to a potential redox regulation of compactness. Gene ontology and network analyses, performed to shed light on the biological processes in which these proteins might participate, indicate that methionine-rich and non-methionine-rich prion-like proteins share gene ontology terms related to the regulation of transcription and translation but, more interestingly, these analyses also reveal that proteins from the methionine-rich group tend to share more gene ontology terms among them than they do with their non-methionine-rich prion-like counterparts.
Affiliation(s)
- Juan Carlos Aledo
- Department of Molecular Biology and Biochemistry, University of Malaga, 29071 Malaga, Spain
|
13
|
Santonia D, Felici G. An immunological glimpse of human virus peptides: distance from self, MHC class I binding, Proteasome Cleavage, TAP Transport and sequence composition entropy. Virus Res 2022; 317:198814. [PMID: 35588940 DOI: 10.1016/j.virusres.2022.198814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 05/13/2022] [Accepted: 05/15/2022] [Indexed: 10/18/2022]
Abstract
Adaptive immune response is triggered when specific pathogen peptides called epitopes are recognised as exogenous according to the paradigm of self/non-self. To be recognised by immune cells, epitopes have to be exposed (presented) on the surface of the cell. Predicting whether a peptide is exposed is important to shed light on the rules that govern immune response, and thus to identify potential targets and to design vaccines and drugs. We focused on peptides exposed on the cell surface and made accessible to the immune system through the MHC Class I complex. Before this can happen, three successive selection steps have to take place: a) Proteasome Cleavage, b) TAP Transport, and c) Binding to MHC Class I. Starting from a set of 211 host human reference viruses, we computed the set of unique peptides occurring in the corresponding proteomes. Then, we obtained the probability values of Proteasome Cleavage, TAP Transport and Binding to MHC Class I associated with those peptides through established prediction software tools. Such values were analysed in conjunction with two other features that could play a major role: the distance from self, strictly linked to the concept of nullomers, and the sequence entropy, measuring the complexity of the peptide amino acid composition. The analysis confirmed and extended previous results on a larger, more significant and consistent data set; we showed that the higher the distance from self, the higher the scores of TAP Transport and Binding to MHC Class I; no significant association was instead found between distance from self and Proteasome Cleavage. Additionally, amino acid peptide composition entropy was significantly associated with the other features. In particular, higher entropies were linked with higher scores of Proteasome Cleavage, TAP Transport, Binding to MHC Class I, and higher distance from self.
The relationship among the three selection steps provided evidence of a tight correlation among them, clearly suggesting it could be the product of a co-evolutive process. We believe that these results give new insights into the complex processes that regulate peptide presentation through MHC Class I, and unveil the mechanisms that allow the immune system to distinguish self and viral non-self peptides.
Affiliation(s)
- Daniele Santonia
- Institute for System Analysis and Computer Science "Antonio Ruberti", National Research Council of Italy, Via dei Taurini 19, Rome 00185, Italy.
| | - Giovanni Felici
- Institute for System Analysis and Computer Science "Antonio Ruberti", National Research Council of Italy, Via dei Taurini 19, Rome 00185, Italy
|
14
|
Becerra A, Muñoz-Velasco I, Aguilar-Cámara A, Cottom-Salas W, Cruz-González A, Vázquez-Salazar A, Hernández-Morales R, Jácome R, Campillo-Balderas JA, Lazcano A. Two short low complexity regions (LCRs) are hallmark sequences of the Delta SARS-CoV-2 variant spike protein. Sci Rep 2022; 12:936. [PMID: 35042962 PMCID: PMC8766472 DOI: 10.1038/s41598-022-04976-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Accepted: 01/04/2022] [Indexed: 11/24/2022] Open
Abstract
Low complexity regions (LCRs) are protein sequences formed by a set of compositionally biased residues. LCRs are extremely abundant in cellular proteins and have also been reported in viruses, where they may partake in evasion of the host immune system. Analyses of 28,231 SARS-CoV-2 whole proteomes and of 261,051 spike protein sequences revealed the presence of four extremely conserved LCRs in the spike protein of several SARS-CoV-2 variants. With the exception of Iota, where it is absent, the Spike LCR-1 is present in the signal peptide of 80.57% of the Delta variant sequences, and in other variants of concern and interest. The Spike LCR-2 is highly prevalent (79.87%) in Iota. Two distinctive LCRs are present in the Delta spike protein. The Delta Spike LCR-3 is present in 99.19% of the analyzed sequences, and the Delta Spike LCR-4 in 98.3% of the same set of proteins. These two LCRs are located in the furin cleavage site and HR1 domain, respectively, and may be considered hallmark traits of the Delta variant. The presence of the medically-important point mutations P681R and D950N in these LCRs, combined with the ubiquity of these regions in the highly contagious Delta variant opens the possibility that they may play a role in its rapid spread.
Affiliation(s)
- Arturo Becerra
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| | - Israel Muñoz-Velasco
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| | | | - Wolfgang Cottom-Salas
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
- Escuela Nacional Preparatoria, Plantel 8 Miguel E. Schulz, Universidad Nacional Autónoma de México, 01600, Mexico City, Mexico
| | - Adrián Cruz-González
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| | - Alberto Vázquez-Salazar
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, CA, 90095, USA
| | | | - Rodrigo Jácome
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| | | | - Antonio Lazcano
- Facultad de Ciencias, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico.
- El Colegio Nacional, 06470, Mexico City, Mexico.
|
15
|
Aledo JC. The Role of Methionine Residues in the Regulation of Liquid-Liquid Phase Separation. Biomolecules 2021; 11:biom11081248. [PMID: 34439914 PMCID: PMC8394241 DOI: 10.3390/biom11081248] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/24/2021] [Revised: 08/12/2021] [Accepted: 08/18/2021] [Indexed: 02/07/2023] Open
Abstract
Membraneless organelles are non-stoichiometric supramolecular structures in the micron scale. These structures can be quickly assembled/disassembled in a regulated fashion in response to specific stimuli. Membraneless organelles contribute to the spatiotemporal compartmentalization of the cell, and they are involved in diverse cellular processes often, but not exclusively, related to RNA metabolism. Liquid-liquid phase separation, a reversible event involving demixing into two distinct liquid phases, provides a physical framework to gain insights concerning the molecular forces underlying the process and how they can be tuned according to the cellular needs. Proteins able to undergo phase separation usually present a modular architecture, which favors a multivalency-driven demixing. We discuss the role of low complexity regions in establishing networks of intra- and intermolecular interactions that collectively control the phase regime. Post-translational modifications of the residues present in these domains provide a convenient strategy to reshape the residue-residue interaction networks that determine the dynamics of phase separation. Focus will be placed on those proteins with low complexity domains exhibiting a biased composition towards the amino acid methionine and the prominent role that reversible methionine sulfoxidation plays in the assembly/disassembly of biomolecular condensates.
Affiliation(s)
- Juan Carlos Aledo
- Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, 29071 Málaga, Spain
|
16
|
Prentice MB, Bowman J, Murray DL, Khidas K, Wilson PJ. Spatial and environmental influences on selection in a clock gene coding trinucleotide repeat in Canada lynx (Lynx canadensis). Mol Ecol 2020; 29:4637-4652. [PMID: 32989809 DOI: 10.1111/mec.15652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2020] [Accepted: 09/09/2020] [Indexed: 11/30/2022]
Abstract
Clock genes exhibit substantial control over gene expression and ultimately life-histories using external cues such as photoperiod, and are thus likely to be critical for adaptation to shifting seasonal conditions and novel environments as species redistribute their ranges under climate change. Coding trinucleotide repeats (cTNRs) are found within several clock genes, and may be interesting targets of selection due to their containment within exonic regions and elevated mutation rates. Here, we conduct inter-specific characterization of the NR1D1 cTNR between Canada lynx and bobcat, and intra-specific spatial and environmental association analyses of neutral microsatellites and our functional cTNR marker, to investigate the role of selection on this locus in Canada lynx. We report signatures of divergent selection between lynx and bobcat, with the potential for hybrid-mediated gene flow in the area of range overlap. We also provide evidence that this locus is under selection across Canada lynx in eastern Canada, with both spatial and environmental variables significantly contributing to the explained variation, after controlling for neutral population structure. These results suggest that cTNRs may play an important role in the generation of functional diversity within some mammal species, and allow for contemporary rates of adaptation in wild populations in response to environmental change. We encourage continued investment into the study of cTNR markers to better understand their broader relevance to the evolution and adaptation of mammals.
Affiliation(s)
- Melanie B Prentice
- Department of Environmental & Life Sciences, Trent University, Peterborough, ON, Canada
| | - Jeff Bowman
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, ON, Canada
| | - Dennis L Murray
- Biology Department, Trent University, Peterborough, ON, Canada
| | - Kamal Khidas
- Vertebrate Zoology and Beaty Centre for Species Discovery, Canadian Museum of Nature, Ottawa, ON, Canada
| | - Paul J Wilson
- Biology Department, Trent University, Peterborough, ON, Canada
|
17
|
Pelassa I, Cibelli M, Villeri V, Lilliu E, Vaglietti S, Olocco F, Ghirardi M, Montarolo PG, Corà D, Fiumara F. Compound Dynamics and Combinatorial Patterns of Amino Acid Repeats Encode a System of Evolutionary and Developmental Markers. Genome Biol Evol 2020; 11:3159-3178. [PMID: 31589292 PMCID: PMC6839033 DOI: 10.1093/gbe/evz216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 09/27/2019] [Indexed: 01/05/2023] Open
Abstract
Homopolymeric amino acid repeats (AARs) like polyalanine (polyA) and polyglutamine (polyQ) in some developmental proteins (DPs) regulate certain aspects of organismal morphology and behavior, suggesting an evolutionary role for AARs as developmental "tuning knobs." It is still unclear, however, whether these are occasional protein-specific phenomena or hints at the existence of a whole AAR-based regulatory system in DPs. Using novel approaches to trace their functional and evolutionary history, we find quantitative evidence supporting a generalized, combinatorial role of AARs in developmental processes with evolutionary implications. We observe nonrandom AAR distributions and combinations in HOX and other DPs, as well as in their interactomes, defining elements of a proteome-wide combinatorial functional code whereby different AARs and their combinations appear preferentially in proteins involved in the development of specific organs/systems. Such functional associations can be either static or display detectable evolutionary dynamics. These findings suggest that progressive changes in AAR occurrence/combination, by altering embryonic development, may have contributed to taxonomic divergence, leaving detectable traces in the evolutionary history of proteomes. Consistent with this hypothesis, we find that the evolutionary trajectories of the 20 AARs in eukaryotic proteomes are highly interrelated and their individual or compound dynamics can sharply mark taxonomic boundaries, or display clock-like trends, carrying overall a strong phylogenetic signal. These findings provide quantitative evidence and an interpretive framework outlining a combinatorial system of AARs whose compound dynamics mark at the same time DP functions and evolutionary transitions.
Affiliation(s)
- Ilaria Pelassa
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Marica Cibelli
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Veronica Villeri
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Elena Lilliu
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Serena Vaglietti
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Federica Olocco
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy
| | - Mirella Ghirardi
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy; National Institute of Neuroscience (INN), Torino, Italy
| | - Pier Giorgio Montarolo
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy; National Institute of Neuroscience (INN), Torino, Italy
| | - Davide Corà
- Department of Translational Medicine, Piemonte Orientale University, Novara, Italy; Center for Translational Research on Autoimmune and Allergic Disease (CAAD), Novara, Italy
| | - Ferdinando Fiumara
- Department of Neuroscience Rita Levi Montalcini, University of Torino, Italy; National Institute of Neuroscience (INN), Torino, Italy
|
18
|
Atypical structural tendencies among low-complexity domains in the Protein Data Bank proteome. PLoS Comput Biol 2020; 16:e1007487. [PMID: 31986130 PMCID: PMC7004392 DOI: 10.1371/journal.pcbi.1007487] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/10/2019] [Revised: 02/06/2020] [Accepted: 12/23/2019] [Indexed: 11/29/2022] Open
Abstract
A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the Protein Data Bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure tendencies across the entire PDB proteome. Secondary structure tendencies varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure tendencies. Comparison of LCD secondary structure tendencies with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure tendencies as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural tendencies among LCDs parsed by the nature and magnitude of single amino acid enrichment. The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. 
For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
|
19
|
Abstract
The endoplasmic reticulum (ER) is the site for folding and maturation of secreted and membrane proteins. When the ER protein-folding machinery is overwhelmed, misfolded proteins trigger ER stress, which is frequently linked to human diseases, including cancer and neurodegeneration. Inositol-requiring enzyme 1 (IRE1) is an ER membrane-resident sensor that assembles into large clusters of previously unknown organization upon its activation by unfolded peptides. We demonstrate that IRE1 clusters are topologically complex dynamic structures that remain contiguous with the ER membrane throughout their lifetime. The majority of clustered IRE1 molecules are diffusionally trapped inside the clusters until IRE1 signaling attenuates, at which point they are released back into the ER through a pathway that is functionally distinct from cluster assembly. The endoplasmic reticulum (ER) membrane-resident stress sensor inositol-requiring enzyme 1 (IRE1) governs the most evolutionarily conserved branch of the unfolded protein response. Upon sensing an accumulation of unfolded proteins in the ER lumen, IRE1 activates its cytoplasmic kinase and ribonuclease domains to transduce the signal. IRE1 activity correlates with its assembly into large clusters, yet the biophysical characteristics of IRE1 clusters remain poorly characterized. We combined superresolution microscopy, single-particle tracking, fluorescence recovery, and photoconversion to examine IRE1 clustering quantitatively in living human and mouse cells. 
Our results revealed that: 1) In contrast to qualitative impressions gleaned from microscopic images, IRE1 clusters comprise only a small fraction (∼5%) of the total IRE1 in the cell; 2) IRE1 clusters have complex topologies that display features of higher-order organization; 3) IRE1 clusters contain a diffusionally constrained core, indicating that they are not phase-separated liquid condensates; 4) IRE1 molecules in clusters remain diffusionally accessible to the free pool of IRE1 molecules in the general ER network; 5) when IRE1 clusters disappear at later time points of ER stress as IRE1 signaling attenuates, their constituent molecules are released back into the ER network and not degraded; 6) IRE1 cluster assembly and disassembly are mechanistically distinct; and 7) IRE1 clusters’ mobility is nearly independent of cluster size. Taken together, these insights define the clusters as dynamic assemblies with unique properties. The analysis tools developed for this study will be widely applicable to investigations of clustering behaviors in other signaling proteins.
|
20
|
Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD. Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved. Nucleic Acids Res 2019; 47:9998-10009. [PMID: 31504783 PMCID: PMC6821194 DOI: 10.1093/nar/gkz730] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/15/2019] [Revised: 07/16/2019] [Accepted: 08/15/2019] [Indexed: 01/27/2023] Open
Abstract
We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs actually have a significant, persistent and highly conserved presence and role in many diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal ions and polysaccharides. In addition, LCRs have been repeatedly identified in very ancient, and usually highly expressed, proteins of the translation machinery. Finally, based on the amino acid content enriched in certain categories, we have developed a neural network web server to identify LCRs and accurately predict whether they can bind nucleic acids or metal ions, or are involved in chaperone functions. An evaluation of the tool showed that it is highly accurate for eukaryotic proteins as well.
Affiliation(s)
- Chrysa Ntountoumi
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | - Panayotis Vlastaridis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | - Dimitris Mossialos
- Microbial Biotechnology-Molecular Bacteriology-Virology Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | | | | | - Vasilios Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, New Campus, University of Cyprus, PO Box 20537, CY-1678 Nicosia, Cyprus
| | - Stephen G Oliver
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, CB2 1GA, UK
| | - Grigoris D Amoutzias
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
|
21
|
Zhu Z, Qu J, Yu L, Jiang X, Liu G, Wang L, Qu Y, Qin Y. Three glycoside hydrolase family 12 enzymes display diversity in substrate specificities and synergistic action between each other. Mol Biol Rep 2019; 46:5443-5454. [PMID: 31359382 DOI: 10.1007/s11033-019-04999-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/26/2019] [Accepted: 07/23/2019] [Indexed: 12/01/2022]
Abstract
PoCel12A, PoCel12B, and PoCel12C are genes that encode glycoside hydrolase family 12 (GH12) enzymes in Penicillium oxalicum. PoCel12A and PoCel12B are typical GH12 enzymes that belong to fungal subfamilies 12-1 and 12-2, respectively. PoCel12C contains a low-complexity region (LCR) domain, which is not found in PoCel12A or PoCel12B, and is independent of fungal subfamily 12-1 or 12-2. The recombinant enzymes (named rCel12A, rCel12B and rCel12C) display diverse substrate specificities. Although most members of GH family 12 are typical endoglucanases and preferentially hydrolyze β-1,4-glucan (e.g., carboxymethylcellulose), recombinant PoCel12A is a non-typical endo-(1-4)-β-glucanase; it preferentially hydrolyzes mixed-linkage β-glucan (barley β-glucan, β-1,3-1,4-glucan) and slightly hydrolyzes β-1,4-glucan (carboxymethylcellulose). Recombinant PoCel12B possesses significantly high activity against xyloglucan. The specific activity of rCel12B toward xyloglucan (239 µmol/min/mg) is the second-highest value known. Recombinant PoCel12C shows low activity toward β-glucan, carboxymethylcellulose, and xyloglucan. All three enzymes can degrade phosphoric acid-swollen cellulose (PASC). However, the hydrolysis products from PASC differ between the enzymes: the main hydrolysis products are cellotriose, cellotetraose, and cellobiose for rCel12A, rCel12B, and rCel12C, respectively. A synergistic action toward PASC between rCel12A and rCel12B is observed, suggesting a potential application in preparing enzyme cocktails for lignocellulose hydrolysis.
Affiliation(s)
- Zhu Zhu
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Jingyao Qu
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Lele Yu
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Xukai Jiang
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; Department of Microbiology, Faculty of Medicine, Nursing and Health Sciences, Monash Biomedicine Discovery Institute, Monash University, Melbourne, 3800, Australia
| | - Guodong Liu
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Lushan Wang
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Yinbo Qu
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| | - Yuqi Qin
- National Glycoengineering Research Center, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China; State Key Lab of Microbial Technology, Shandong University, No. 72 Binhai Road, Qingdao, 266237, China
| |
Collapse
|
22
|
Polyserine repeats promote coiled coil-mediated fibril formation and length-dependent protein aggregation. J Struct Biol 2018; 204:572-584. [PMID: 30194983 DOI: 10.1016/j.jsb.2018.09.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/19/2018] [Revised: 08/06/2018] [Accepted: 09/01/2018] [Indexed: 12/13/2022]
Abstract
Short polyserine (polyS) repeats are frequently found in proteins and longer ones are produced in neurological disorders such as Huntington disease (HD) owing to translational frameshifting or non-ATG-dependent translation, together with polyglutamine (polyQ) and polyalanine (polyA) repeats, forming intracellular aggregates. However, the physiological and pathological structures of polyS repeats are not clearly understood. Early studies highlighted their structural versatility, similar to other homopolymers whose conformation is influenced by the surrounding protein context. As polyS stretches are frequently near polyQ and polyA repeats, which can be part of coiled coil (CC) structures, and the frameshift-derived polyS repeats in HD directly flank CC heptads important for aggregation, we investigate here the structural and aggregation properties of polyS in the context of CC structures. We have taken advantage of peptide models, previously used to study polyQ and polyA in CCs, in which we inserted polyS repeats of variable length and studied them in comparison with polyQ and polyA peptides. We found that polyS repeats promote CC-mediated polymerization and fibrillization as revealed by circular dichroism, chemical crosslinking, and atomic force microscopy. Furthermore, they promote CC-based, length-dependent intracellular aggregation, which is negligible with 7 and widespread with 49 serines. These findings show that polyS repeats can participate in the formation of CCs, as previously found for polyQ and polyA, conferring on peptides distinctive structural properties with aggregation kinetics that are intermediate between those of polyA and polyQ CCs, and contribute to an overall structural definition of the pathophysiological roles of homopolymeric repeats in CC structures.
|
23
|
Homan EJ, Bremel RD. A Role for Epitope Networking in Immunomodulation by Helminths. Front Immunol 2018; 9:1763. [PMID: 30108588 PMCID: PMC6079203 DOI: 10.3389/fimmu.2018.01763] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/15/2018] [Accepted: 07/17/2018] [Indexed: 12/19/2022]
Abstract
Helminth infections, by nematodes, trematodes, or cestodes, can lead to the modulation of host immune responses. This allows long-duration parasite infections and also impacts responses to co-infections. Surface, secreted, excreted, and shed proteins are thought to play a major role in modulation. A commonly reported feature of such immune modulation is the role of T regulatory (Treg) cells and IL-10. Efforts to identify helminth proteins, which cause immunomodulation, have identified candidates but not provided clarity as to a uniform mechanism driving modulation. In this study, we applied a bioinformatics systems approach, allowing us to analyze predicted T-cell epitopes of 17 helminth species and the responses to their surface proteins. In addition to major histocompatibility complex (MHC) binding, we analyzed amino acid motifs that would be recognized by T-cell receptors [T-cell-exposed motifs (TCEMs)]. All the helminth species examined have, within their surface proteins, peptides, which combine very common TCEMs with predicted high affinity binding to many human MHC alleles. This combination of features would result in large cognate T cell and a high probability of eliciting Treg responses. The TCEMs, which determine recognition by responding T-cell clones, are shared to a high degree between helminth species and with Plasmodium falciparum and Mycobacterium tuberculosis, both common co-infecting organisms. The implication of our observations is not only that Treg cells play a significant role in helminth-induced immune modulation but also that the epitope specificities of Treg responses are shared across species and genera of helminth. 
Hence, the immune response to a given helminth cannot be considered in isolation but rather forms part of an epitope ecosystem, or microenvironment, in which potentially immunosuppressive peptides in the helminth network via their common T-cell receptor recognition signals with T-cell epitopes in self proteins, microbiome, other helminths, and taxonomically unrelated pathogens. Such a systems approach provides a high-level view of the antigen-immune system signaling dynamics that may bias a host's immune response to helminth infections toward immune modulation. It may indicate how helminths have evolved to select for peptides that favor long-term parasite host coexistence.
|
24
|
Chaudhry SR, Lwin N, Phelan D, Escalante AA, Battistuzzi FU. Comparative analysis of low complexity regions in Plasmodia. Sci Rep 2018; 8:335. [PMID: 29321589 PMCID: PMC5762703 DOI: 10.1038/s41598-017-18695-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/23/2017] [Accepted: 12/14/2017] [Indexed: 12/20/2022]
Abstract
Low complexity regions (LCRs) are a common feature shared by many genomes, but their evolutionary and functional significance remains mostly unknown. At the core of the uncertainty is a poor understanding of the mechanisms that regulate their retention in genomes, whether driven by natural selection or neutral evolution. Applying a comparative approach of LCRs to multiple strains and species is a powerful approach to identify patterns of conservation in these regions. Using this method, we investigate the evolutionary history of LCRs in the genus Plasmodium based on orthologous protein coding genes shared by 11 species and strains from primate and rodent-infecting pathogens. We find multiple lines of evidence in support of natural selection as a major evolutionary force shaping the composition and conservation of LCRs through time and signatures that their evolutionary paths are species specific. Our findings add a comparative analysis perspective to the debate on the evolution of LCRs and harness the power of sequence comparisons to identify potential functionally important LCR candidates.
Affiliation(s)
- S R Chaudhry
- Department of Biological Sciences, Oakland University, Rochester, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- N Lwin
- Department of Biological Sciences, Oakland University, Rochester, MI, USA
- D Phelan
- Department of Biological Sciences, Oakland University, Rochester, MI, USA
- A A Escalante
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- F U Battistuzzi
- Department of Biological Sciences, Oakland University, Rochester, MI, USA; Center for Data Science and Big Data Analytics, Oakland University, Rochester, MI, USA
|
25
|
Prentice MB, Bowman J, Lalor JL, McKay MM, Thomson LA, Watt CM, McAdam AG, Murray DL, Wilson PJ. Signatures of selection in mammalian clock genes with coding trinucleotide repeats: Implications for studying the genomics of high-pace adaptation. Ecol Evol 2017; 7:7254-7276. [PMID: 28944015 PMCID: PMC5606889 DOI: 10.1002/ece3.3223] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/03/2017] [Revised: 05/31/2017] [Accepted: 06/06/2017] [Indexed: 12/14/2022]
Abstract
Climate change is predicted to affect the reproductive ecology of wildlife; however, we have yet to understand if and how species can adapt to the rapid pace of change. Clock genes are functional genes likely critical for adaptation to shifting seasonal conditions through shifts in timing cues. Many of these genes contain coding trinucleotide repeats, which offer the potential for higher rates of change than single nucleotide polymorphisms (SNPs) at coding sites, and, thus, may translate to faster rates of adaptation in changing environments. We characterized repeats in 22 clock genes across all annotated mammal species and evaluated the potential for selection on repeat motifs in three clock genes (NR1D1, CLOCK, and PER1) in three congeneric species pairs with different latitudinal range limits: Canada lynx and bobcat (Lynx canadensis and L. rufus), northern and southern flying squirrels (Glaucomys sabrinus and G. volans), and white‐footed and deer mouse (Peromyscus leucopus and P. maniculatus). Signatures of positive selection were found in both the interspecific comparison of Canada lynx and bobcat, and intraspecific analyses in Canada lynx. Northern and southern flying squirrels showed differing frequencies at common CLOCK alleles and a signature of balancing selection. Regional excess homozygosity was found in the deer mouse at PER1 suggesting disruptive selection, and further analyses suggested balancing selection in the white‐footed mouse. These preliminary signatures of selection and the presence of trinucleotide repeats within many clock genes warrant further consideration of the importance of candidate gene motifs for adaptation to climate change.
Affiliation(s)
- Melanie B Prentice
- Department of Environmental and Life Sciences, Trent University, Peterborough, ON, Canada
- Jeff Bowman
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, ON, Canada
- Michelle M McKay
- Department of Environmental and Life Sciences, Trent University, Peterborough, ON, Canada
- Cristen M Watt
- Department of Environmental and Life Sciences, Trent University, Peterborough, ON, Canada
- Andrew G McAdam
- Department of Integrative Biology, University of Guelph, Guelph, ON, Canada
- Paul J Wilson
- Biology Department, Trent University, Peterborough, ON, Canada
|
26
|
Cordeiro TN, Herranz-Trillo F, Urbanek A, Estaña A, Cortés J, Sibille N, Bernadó P. Structural Characterization of Highly Flexible Proteins by Small-Angle Scattering. Adv Exp Med Biol 2017; 1009:107-129. [DOI: 10.1007/978-981-10-6038-0_7] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 12/14/2022]
|
27
|
Adaptive Variation and Introgression of a CONSTANS-Like Gene in North American Red Oaks. Forests 2016. [DOI: 10.3390/f8010003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022]
|
28
|
A bioinformatics pipeline to search functional motifs within whole-proteome data: a case study of poxviruses. Virus Genes 2016; 53:173-178. [PMID: 28000080 PMCID: PMC5357487 DOI: 10.1007/s11262-016-1416-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/30/2016] [Accepted: 12/01/2016] [Indexed: 12/19/2022]
Abstract
Proteins harbor domains or short linear motifs, which facilitate their functions and interactions. Finding functional motifs in protein sequences could predict the putative cellular roles or characteristics of hypothetical proteins. In this study, we present Shetti-Motif, an interactive tool to (i) map UniProt and PROSITE flat files, (ii) search for multiple pre-defined consensus patterns or experimentally validated functional motifs in large, proteome-wide datasets of protein sequences, and (iii) search for motifs containing repeated residues (low-complexity regions, e.g., Leu-, SR-, and PEST-rich motifs). As proof of principle, using this comparative proteomics pipeline, eleven proteomes encoded by members of the Poxviridae family were searched against about 100 experimentally validated functional motifs. Closely related viruses and viruses that infect the same host cells (e.g., vaccinia and variola viruses) show similar profiles of motif-containing proteins. The motifs encoded by these viruses are correlated, which explains why poxviruses are able to interact with a wide range of host cells. In conclusion, this in silico analysis is useful for establishing datasets of candidate proteins for further investigation or for comparison between species.
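The consensus-pattern search described in this abstract can be illustrated with a minimal Python sketch. It is not the Shetti-Motif implementation: it assumes a simplified subset of the PROSITE pattern syntax (`x` wildcards, `[...]` allowed sets, `{...}` excluded sets, `(n)` and `(n,m)` repeat counts, `<` and `>` terminal anchors), and the function names `prosite_to_regex` and `find_motifs` are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE-style pattern into a Python regex string.

    Assumption: only the syntax subset listed in the lead-in is handled.
    """
    pattern = pattern.rstrip(".")  # PROSITE patterns conventionally end with a period
    parts = []
    for element in pattern.split("-"):
        anchor_start = element.startswith("<")  # N-terminal anchor
        anchor_end = element.endswith(">")      # C-terminal anchor
        element = element.strip("<>")
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        # A bracketed set [ABC] or a single residue letter is already valid regex.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if anchor_start else "") + core + ("$" if anchor_end else ""))
    return "".join(parts)


def find_motifs(sequence: str, pattern: str):
    """Return (1-based start position, matched text) for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]"))
    print(find_motifs("MKCAACQWELTT", "C-x(2)-C"))
```

Compiling each pattern once and reusing it across a whole proteome keeps a scan of this kind linear in the total sequence length; overlapping hits would need a lookahead-based regex instead of `finditer` on the plain pattern.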
|
29
|
Zhang Y, Man VH, Roland C, Sagui C. Amyloid Properties of Asparagine and Glutamine in Prion-like Proteins. ACS Chem Neurosci 2016; 7:576-87. [PMID: 26911543 DOI: 10.1021/acschemneuro.5b00337] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
Sequences rich in glutamine (Q) and asparagine (N) are intrinsically disordered in monomeric form, but can aggregate into highly ordered amyloids, as seen in Q/N-rich prion domains (PrDs). Amyloids are fibrillar protein aggregates rich in β-sheet structures that can self-propagate through protein-conformational chain reactions. Here, we present a comprehensive theoretical study of N/Q-rich peptides, including sequences found in the yeast Sup35 PrD, in parallel and antiparallel β-sheet aggregates, and probe via fully atomistic molecular dynamics simulations all their possible steric-zipper interfaces in order to determine their protofibril structure and their relative stability. Our results show that polyglutamine aggregates are more stable than polyasparagine aggregates. Enthalpic contributions to the free energy favor the formation of polyQ protofibrils, while entropic contributions favor the formation of polyN protofibrils. The considerably larger phase space that disordered polyQ must sample on its way to aggregation probably is at the root of the associated slower kinetics observed experimentally. When other amino acids are present, such as in the Sup35 PrD, their shorter side chains favor steric-zipper formation for N but not Q, as they preclude the in-register association of the long Q side chains.
Affiliation(s)
- Yuan Zhang
- Department of Physics, and Center for High Performance Simulations (CHiPS), North Carolina State University, Raleigh, North Carolina 27695, United States
- Viet Hoang Man
- Department of Physics, and Center for High Performance Simulations (CHiPS), North Carolina State University, Raleigh, North Carolina 27695, United States
- Christopher Roland
- Department of Physics, and Center for High Performance Simulations (CHiPS), North Carolina State University, Raleigh, North Carolina 27695, United States
- Celeste Sagui
- Department of Physics, and Center for High Performance Simulations (CHiPS), North Carolina State University, Raleigh, North Carolina 27695, United States
|
30
|
Battistuzzi FU, Schneider KA, Spencer MK, Fisher D, Chaudhry S, Escalante AA. Profiles of low complexity regions in Apicomplexa. BMC Evol Biol 2016; 16:47. [PMID: 26923229 PMCID: PMC4770516 DOI: 10.1186/s12862-016-0625-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/12/2015] [Accepted: 02/17/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND Low complexity regions (LCRs) are a ubiquitous feature in genomes and yet their evolutionary history and functional roles are unclear. Previous studies have shown contrasting evidence in favor of both neutral and selective mechanisms of evolution for different sets of LCRs, suggesting that modes of identification of these regions may play a role in our ability to discern their evolutionary history. To further investigate this issue, we used a multiple threshold approach to identify species-specific profiles of proteome complexity and, by comparing properties of these sets, determine the influence that starting parameters have on evolutionary inferences. RESULTS We find that, although qualitatively similar, quantitatively each species has a unique LCR profile which represents the frequency of these regions within each genome. Inferences based on these profiles are more accurate in comparative analyses of genome complexity as they allow us to determine the relative complexity of multiple genomes as well as the type of repetitiveness that is most common in each. Based on the multiple threshold LCR sets obtained, we identified predominant evolutionary mechanisms at different complexity levels, which show neutral mechanisms acting on highly repetitive LCRs (e.g., homopolymers) and selective forces becoming more important as heterogeneity of the LCRs increases. CONCLUSIONS Our results show how inferences based on LCRs are influenced by the parameters used to identify these regions. Sets of LCRs are heterogeneous aggregates of regions that include homo- and heteropolymers and, as such, evolve according to different mechanisms. LCR profiles provide a new way to investigate genome complexity across species and to determine the driving mechanism of their evolution.
Affiliation(s)
- Kristan A Schneider
- Department of MNI, University of Applied Sciences Mittweida, Mittweida, Germany
- Matthew K Spencer
- Department of Geology and Physics, Lake Superior State University, Sault Ste. Marie, MI, USA
- David Fisher
- David Eccles School of Business, University of Utah, Salt Lake City, UT, USA
- Sophia Chaudhry
- Department of Biological Sciences, Oakland University, Rochester, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Ananias A Escalante
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
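The idea of an LCR profile built from multiple thresholds, as described in the Apicomplexa abstract above, can be sketched in Python as follows. This is a generic illustration and not the authors' pipeline: the abstract does not name the detection tool or its parameters, so the sliding-window Shannon entropy measure, the window length of 12, the threshold values, and the function name `lcr_profile` are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition of a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def lcr_profile(sequence: str, window: int = 12,
                thresholds=(1.0, 1.5, 2.0, 2.5)) -> dict:
    """Fraction of sliding windows whose entropy is at or below each threshold.

    Low thresholds capture only highly repetitive windows (e.g. homopolymers);
    higher thresholds also admit more heterogeneous low-complexity windows.
    """
    entropies = [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
    if not entropies:  # sequence shorter than the window
        return {t: 0.0 for t in thresholds}
    return {t: sum(e <= t for e in entropies) / len(entropies) for t in thresholds}


if __name__ == "__main__":
    print(lcr_profile("MKT" + "N" * 15 + "ACDEFGHIKLMNPQRSTVWY"))
```

Because the fraction can only grow as the threshold rises, comparing the shape of this curve between proteomes gives a rough view of whether low complexity is dominated by strict repeats or by more heterogeneous biased regions.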
|
31
|
Wu R, Liu Q, Zhang P, Liang D. Tandem amino acid repeats in the green anole (Anolis carolinensis) and other squamates may have a role in increasing genetic variability. BMC Genomics 2016; 17:109. [PMID: 26868501 PMCID: PMC4751654 DOI: 10.1186/s12864-016-2430-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2015] [Accepted: 02/02/2016] [Indexed: 01/04/2023]
Abstract
Background Tandem amino acid repeats are characterised by the consecutive recurrence of a single amino acid. They exhibit high rates of length mutations in addition to point mutations and have been proposed to be involved in genetic plasticity. Squamate reptiles (lizards and snakes) diversify in both morphology and physiology. The underlying mechanism is yet to be understood. In a previous phylogenomic analysis of reptiles, the density of tandem repeats in an anole lizard diverged heavily from that of the other reptiles. To gain further insight into the tandem amino acid repeats in squamates, we analysed the repeat content in the green anole (Anolis carolinensis) proteome and compared the amino acid repeats in a large orthologous protein data set from six vertebrates (the Western clawed frog, the green anole, the Chinese softshell turtle, the zebra finch, mouse and human). Results Our results revealed that the number of amino acid repeats in the green anole exceeded those found in the other five species studied. Species-only repeats were found in high proportion in the green anole but not in the other five species, suggesting that the green anole had gained many amino acid repeats in either the Anolis or the squamate lineage. Since the amino acid repeat containing genes in the green anole were highly enriched in genes related to transcription and development, an important family of developmental genes, i.e., the Hox family, was further studied in a wide collection of squamates. Abundant amino acid repeats were also observed, implying the general high tolerance of amino acid repeats in squamates. A particular enrichment of amino acid repeats was observed in the central class Hox genes that are known to be responsible for defining cervical to lumbar regions. 
Conclusions Our study suggests that the abundant amino acid repeats in the green anole, and possibly in other squamates, may play a role in increasing the genetic variability, and contribute to the evolutionary diversity of this clade.
Affiliation(s)
- Riga Wu
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, People's Republic of China
- Qingfeng Liu
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, People's Republic of China
- Peng Zhang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, People's Republic of China
- Dan Liang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, People's Republic of China
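The green anole abstract above defines a tandem amino acid repeat as the consecutive recurrence of a single amino acid. A minimal Python sketch of that definition is given below. The minimum run length of 5 is an assumption chosen for illustration (the abstract does not state the cutoff the authors used), and the function name `find_homopolymer_runs` is hypothetical.

```python
import re


def find_homopolymer_runs(sequence: str, min_length: int = 5):
    """Find runs of one repeated residue that are at least min_length long.

    Returns a list of (1-based start position, residue, run length).
    """
    # (.) captures one residue; \1{k,} requires at least k further copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_length - 1))
    return [
        (m.start() + 1, m.group(1), len(m.group()))
        for m in pattern.finditer(sequence)
    ]


if __name__ == "__main__":
    seq = "MA" + "Q" * 6 + "PL" + "S" * 5 + "AGG"
    for start, residue, length in find_homopolymer_runs(seq):
        print(f"poly{residue} run of {length} starting at position {start}")
```

Counting such runs per protein, and then per species over a shared set of orthologs, is one simple way to arrive at the kind of between-species comparison the abstract reports.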
|
32
|
Sobhy H. A Review of Functional Motifs Utilized by Viruses. Proteomes 2016; 4:proteomes4010003. [PMID: 28248213 PMCID: PMC5217368 DOI: 10.3390/proteomes4010003] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 10/22/2015] [Revised: 01/07/2016] [Accepted: 01/13/2016] [Indexed: 01/05/2023]
Abstract
Short linear motifs (SLiM) are short peptides that facilitate protein function and protein-protein interactions. Viruses utilize these motifs to enter into the host, interact with cellular proteins, or egress from host cells. Studying functional motifs may help to predict protein characteristics, interactions, or the putative cellular role of a protein. In virology, it may reveal aspects of the virus tropism and help find antiviral therapeutics. This review highlights the recent understanding of functional motifs utilized by viruses. Special attention was paid to the function of proteins harboring these motifs, and viruses encoding these proteins. The review highlights motifs involved in (i) immune response and post-translational modifications (e.g., ubiquitylation, SUMOylation or ISGylation); (ii) virus-host cell interactions, including virus attachment, entry, fusion, egress and nuclear trafficking; (iii) virulence and antiviral activities; (iv) virion structure; and (v) low-complexity regions (LCRs) or motifs enriched with residues (Xaa-rich motifs).
Affiliation(s)
- Haitham Sobhy
- Department of Molecular Biology, Umeå University, 901 87 Umeå, Sweden
|
33
|
Pelassa I, Fiumara F. Differential Occurrence of Interactions and Interaction Domains in Proteins Containing Homopolymeric Amino Acid Repeats. Front Genet 2015; 6:345. [PMID: 26734058 PMCID: PMC4683181 DOI: 10.3389/fgene.2015.00345] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/01/2015] [Accepted: 11/20/2015] [Indexed: 12/13/2022]
Abstract
Homopolymeric amino acid repeats (AARs), which are widespread in proteomes, have often been viewed simply as spacers between protein domains, or even as "junk" sequences with no obvious function but with a potential to cause harm upon expansion, as in genetic diseases associated with polyglutamine or polyalanine expansions, including Huntington disease and cleidocranial dysplasia. A growing body of evidence indicates, however, that at least some AARs can form organized, functional protein structures, and can regulate protein function. In particular, certain AARs can mediate protein-protein interactions, either through homotypic AAR-AAR contacts or through heterotypic contacts with other protein domains. It is still unclear, however, whether AARs may have a generalized, proteome-wide role in shaping protein-protein interaction networks. Therefore, we have undertaken here a bioinformatics screening of the human proteome and interactome in search of quantitative evidence of such a role. We first identified the sets of proteins that contain repeats of any one of the 20 amino acids, as well as control sets of proteins chosen at random in the proteome. We then analyzed the connectivity between the proteins of the AAR-containing protein sets and we compared it with that observed in the corresponding control networks. We find evidence for different degrees of connectivity in the different AAR-containing protein networks. Indeed, networks of proteins containing polyglutamine, polyglutamate, polyproline, and other AARs show significantly increased levels of connectivity, whereas networks containing polyleucine and other hydrophobic repeats show lower degrees of connectivity. Furthermore, we observed that numerous protein-protein, -nucleic acid, and -lipid interaction domains are significantly enriched in specific AAR protein groups.
These findings support the notion of a generalized, combinatorial role of AARs, together with conventional protein interaction domains, in shaping the interaction networks of the human proteome, and define proteome-wide knowledge that may guide the informed biological exploration of the role of AARs in protein interactions.
Affiliation(s)
- Ilaria Pelassa
- Department of Neuroscience, University of Torino, Torino, Italy
- Ferdinando Fiumara
- Department of Neuroscience, University of Torino, Torino, Italy; National Institute of Neuroscience (INN), Torino, Italy
|
34
|
Prentice MB, Bowman J, Wilson PJ. A test of somatic mosaicism in the androgen receptor gene of Canada lynx (Lynx canadensis). BMC Genet 2015; 16:125. [PMID: 26503624 PMCID: PMC4623281 DOI: 10.1186/s12863-015-0284-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2015] [Accepted: 10/19/2015] [Indexed: 11/11/2022]
Abstract
Background The androgen receptor, an X-linked gene, has been widely studied in human populations because it contains highly polymorphic trinucleotide repeat motifs that have been associated with a number of adverse human health and behavioral effects. A previous study on the androgen receptor gene in carnivores reported somatic mosaicism in the tissues of a number of species including Eurasian lynx (Lynx lynx). We investigated this claim in a closely related species, Canada lynx (Lynx canadensis). The presence of somatic mosaicism in lynx tissues could have implications for the future study of exonic trinucleotide repeats in landscape genomic studies, in which the accurate reporting of genotypes would be highly problematic. Methods To determine whether mosaicism occurs in Canada lynx, two lynx individuals were sampled for a variety of tissue types (lynx 1) and tissue locations (lynx 1 and 2), and 1,672 individuals of known sex were genotyped to further rule out mosaicism. Results We found no evidence of mosaicism in tissues from the two necropsied individuals, or any of our genotyped samples. Conclusions Our results indicate that mosaicism does not manifest in Canada lynx. Therefore, the use of hide samples for further work involving trinucleotide repeat polymorphisms in Canada lynx is warranted.
Affiliation(s)
- Melanie B Prentice
- Department of Environmental & Life Sciences, Trent University, 1600 West Bank Drive, Peterborough, K9J 7B8, ON, Canada
- Jeff Bowman
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, 2140 East Bank Drive, Peterborough, K9J 7B8, ON, Canada
- Paul J Wilson
- Biology Department, Trent University, 1600 West Bank Drive, Peterborough, K9J 7B8, ON, Canada
|
35
|
Martins F, Gonçalves R, Oliveira J, Cruz-Monteagudo M, Nieto-Villar JM, Paz-y-Miño C, Rebelo I, Tejera E. Unravelling the relationship between protein sequence and low-complexity regions entropies: Interactome implications. J Theor Biol 2015; 382:320-7. [PMID: 26164061 DOI: 10.1016/j.jtbi.2015.06.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2015] [Revised: 06/12/2015] [Accepted: 06/28/2015] [Indexed: 10/23/2022]
Abstract
Low-complexity regions are sub-sequences of biased composition in a protein sequence. The influence of these regions on protein evolution, specific functions, and high interaction capacity is well known. Although protein sequence entropy has been widely studied, its relationship with low-complexity regions and the subsequent effects on protein function remain unclear. In this work we propose a theoretical and empirical model integrating the sequence entropy with local complexity parameters. Our results indicate that the protein sequence entropy is related to the protein length, the entropies inside and outside the low-complexity regions, and the number and average size of those regions. We found a small but significant increase in the sequence entropy of hub proteins. In agreement with our theoretical model, this increase is highly dependent on the balance between the increase in protein length and the average size of the low-complexity regions. Finally, our models and protein analyses provide evidence that changes in the average size of low-complexity regions are more relevant in hub proteins than changes in their number.
Affiliation(s)
- F Martins
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
- R Gonçalves
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
- J Oliveira
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal
- M Cruz-Monteagudo
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
- J M Nieto-Villar
- Dpto. de Química-Física, Fac. de Química, Universidad de La Habana, Cuba; Cátedra de Sistemas Complejos "H. Poincaré", Universidad de La Habana, Cuba
- C Paz-y-Miño
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
- I Rebelo
- Department of Biochemistry, Faculty of Pharmacy, University of Porto, Portugal; UCIBIO@REQUIMTE, Portugal
- E Tejera
- Instituto de Investigaciones Biomédicas, Universidad de las Américas, Quito, Ecuador
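The quantities that the entropy abstract above relates to one another (whole-sequence entropy, entropy inside and outside low-complexity regions, and the number and average size of those regions) can be computed with a short Python sketch. This is not the authors' model, only the bookkeeping behind it: the LCR coordinates are assumed to come from some external detector, are given as 0-based half-open, non-overlapping intervals, and the function name `entropy_summary` is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(residues: str) -> float:
    """Shannon entropy (bits per residue) of a residue composition."""
    n = len(residues)
    if n == 0:
        return 0.0
    counts = Counter(residues)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_summary(sequence: str, lcr_intervals) -> dict:
    """Summarize entropy inside and outside LCRs.

    lcr_intervals: list of (start, end) pairs, 0-based, half-open, and
    non-overlapping (an assumption of this sketch).
    """
    inside_parts, outside_parts, previous_end = [], [], 0
    for start, end in sorted(lcr_intervals):
        outside_parts.append(sequence[previous_end:start])
        inside_parts.append(sequence[start:end])
        previous_end = end
    outside_parts.append(sequence[previous_end:])
    sizes = [end - start for start, end in lcr_intervals]
    return {
        "length": len(sequence),
        "entropy_total": shannon_entropy(sequence),
        "entropy_inside": shannon_entropy("".join(inside_parts)),
        "entropy_outside": shannon_entropy("".join(outside_parts)),
        "lcr_count": len(sizes),
        "lcr_mean_size": sum(sizes) / len(sizes) if sizes else 0.0,
    }


if __name__ == "__main__":
    print(entropy_summary("MKV" + "Q" * 8 + "LTE", [(3, 11)]))
```

Pooling all LCR residues before computing the inside entropy is one of several reasonable choices; averaging per-region entropies instead would weight short and long regions equally.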
|
36
|
Abstract
Amino acid repeats (AARs) are abundant in protein sequences. They have particular roles in protein function and evolution. Simple repeat patterns generated by DNA slippage tend to introduce length variations and point mutations in repeat regions. Loss of normal and gain of abnormal function owing to their variable length are potential risks leading to diseases. Repeats with complex patterns mostly refer to the functional domain repeats, such as the well-known leucine-rich repeat and WD repeat, which are frequently involved in protein–protein interaction. They are mainly derived from internal gene duplication events and stabilized by ‘gate-keeper’ residues, which play crucial roles in preventing inter-domain aggregation. AARs are widely distributed in different proteomes across a variety of taxonomic ranges, and especially abundant in eukaryotic proteins. However, their specific evolutionary and functional scenarios are still poorly understood. Identifying AARs in protein sequences is the first step for the further investigation of their biological function and evolutionary mechanism. In principle, this is an NP-hard problem, as most of the repeat fragments are shaped by a series of sophisticated evolutionary events and become latent periodical patterns. It is not possible to define a uniform criterion for detecting and verifying various repeat patterns. Instead, different algorithms based on different strategies have been developed to cope with different repeat patterns. In this review, we attempt to describe the amino acid repeat-detection algorithms currently available and compare their strategies based on an in-depth analysis of the biological significance of protein repeats.
|
37
|
Kirmitzoglou I, Promponas VJ. LCR-eXXXplorer: a web platform to search, visualize and share data for low complexity regions in protein sequences. Bioinformatics 2015; 31:2208-10. [PMID: 25712690 PMCID: PMC4481844 DOI: 10.1093/bioinformatics/btv115] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 07/21/2014] [Accepted: 02/17/2015] [Indexed: 11/20/2022]
Abstract
Motivation: Local compositionally biased and low complexity regions (LCRs) in amino acid sequences have initially attracted the interest of researchers due to their implication in generating artifacts in sequence database searches. There is accumulating evidence of the biological significance of LCRs both in physiological and in pathological situations. Nonetheless, LCR-related algorithms and tools have not gained wide appreciation across the research community, partly due to the fact that only a handful of user-friendly software is currently freely available. Results: We developed LCR-eXXXplorer, an extensible online platform attempting to fill this gap. LCR-eXXXplorer offers tools for displaying LCRs from the UniProt/SwissProt knowledgebase, in combination with other relevant protein features, predicted or experimentally verified. Moreover, users may perform powerful queries against a custom designed sequence/LCR-centric database. We anticipate that LCR-eXXXplorer will be a useful starting point in research efforts for the elucidation of the structure, function and evolution of proteins with LCRs. Availability and implementation: LCR-eXXXplorer is freely available at the URL http://repeat.biol.ucy.ac.cy/lcr-exxxplorer. Contact: vprobon@ucy.ac.cy Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ioannis Kirmitzoglou
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, CY 1678, Nicosia, Cyprus
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, CY 1678, Nicosia, Cyprus
|
38
|
Lenz C, Haerty W, Golding GB. Increased substitution rates surrounding low-complexity regions within primate proteins. Genome Biol Evol 2014; 6:655-65. [PMID: 24572016 PMCID: PMC3971593 DOI: 10.1093/gbe/evu042] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022] Open
Abstract
Previous studies have found that DNA-flanking low-complexity regions (LCRs) have an increased substitution rate. Here, the substitution rate was confirmed to increase in the vicinity of LCRs in several primate species, including humans. This effect was also found among human sequences from the 1000 Genomes Project. A strong correlation was found between average substitution rate per site and distance from the LCR, as well as the proportion of genes with gaps in the alignment at each site and distance from the LCR. Along with substitution rates, dN/dS ratios were also determined for each site, and the proportion of sites undergoing negative selection was found to have a negative relationship with distance from the LCR.
Affiliation(s)
- Carolyn Lenz
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
|
39
|
Ogilvie HA, Imin N, Djordjevic MA. Diversification of the C-TERMINALLY ENCODED PEPTIDE (CEP) gene family in angiosperms, and evolution of plant-family specific CEP genes. BMC Genomics 2014; 15:870. [PMID: 25287121 PMCID: PMC4197245 DOI: 10.1186/1471-2164-15-870] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 06/27/2014] [Accepted: 09/24/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Small, secreted signaling peptides work in parallel with phytohormones to control important aspects of plant growth and development. Genes from the C-TERMINALLY ENCODED PEPTIDE (CEP) family produce such peptides which negatively regulate plant growth, especially under stress, and affect other important developmental processes. To illuminate how the CEP gene family has evolved within the plant kingdom, including its emergence, diversification and variation between lineages, a comprehensive survey was undertaken to identify and characterize CEP genes in 106 plant genomes. RESULTS Using a motif-based system developed for this study to identify canonical CEP peptide domains, a total of 916 CEP genes and 1,223 CEP domains were found in angiosperms and for the first time in gymnosperms. This defines a narrow band for the emergence of CEP genes in plants, from the divergence of lycophytes to the angiosperm/gymnosperm split. Both CEP genes and domains were found to have diversified in angiosperms, particularly in the Poaceae and Solanaceae plant families. Multispecies orthologous relationships were determined for 22% of identified CEP genes, and further analysis of those groups found selective constraints upon residues within the CEP peptide and within the previously little-characterized variable region. An examination of public Oryza sativa RNA-Seq datasets revealed an expression pattern that links OsCEP5 and OsCEP6 to panicle development and flowering, and CEP gene trees reveal these emerged from a duplication event associated with the Poaceae plant family. CONCLUSIONS The characterization of the plant-family specific CEP genes OsCEP5 and OsCEP6, the association of CEP genes with angiosperm-specific development processes like panicle development, and the diversification of CEP genes in angiosperms provides further support for the hypothesis that CEP genes have been integral to the evolution of novel traits within the angiosperm lineage. 
Beyond these findings, the comprehensive set of CEP genes and their properties reported here will be a resource for future research on CEP genes and peptides.
Affiliation(s)
- Huw A Ogilvie
- Research School of Biology, The Australian National University, Canberra, ACT 0200 Australia
| | - Nijat Imin
- Research School of Biology, The Australian National University, Canberra, ACT 0200 Australia
| | - Michael A Djordjevic
- Research School of Biology, The Australian National University, Canberra, ACT 0200 Australia
|
40
|
Lind-Riehl JF, Sullivan AR, Gailing O. Evidence for selection on a CONSTANS-like gene between two red oak species. Ann Bot 2014; 113:967-75. [PMID: 24615344 PMCID: PMC3997637 DOI: 10.1093/aob/mcu019] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 12/10/2013] [Accepted: 01/27/2014] [Indexed: 05/25/2023]
Abstract
BACKGROUND AND AIMS Hybridizing species such as oaks may provide a model to study the role of selection in speciation with gene flow. Discrete species' identities and different adaptations are maintained among closely related oak species despite recurrent gene flow. This is probably due to ecologically mediated selection at a few key genes or genomic regions. Neutrality tests can be applied to identify so-called outlier loci, which demonstrate locus-specific signatures of divergent selection and are candidate genes for further study. METHODS Thirty-six genic microsatellite markers, some with putative functions in flowering time and drought tolerance, and eight non-genic microsatellite markers were screened in two population pairs (n = 160) of the interfertile species Quercus rubra and Q. ellipsoidalis, which are characterized by contrasting adaptations to drought. Putative outliers were then tested in additional population pairs from two different geographic regions (n = 159) to support further their potential role in adaptive divergence. KEY RESULTS A marker located in the coding sequence of a putative CONSTANS-like (COL) gene was repeatedly identified as under strong divergent selection across all three geographically disjunct population pairs. COL genes are involved in the photoperiodic control of growth and development and are implicated in the regulation of flowering time. CONCLUSIONS The location of the polymorphism in the Quercus COL gene and given the potential role of COL genes in adaptive divergence and reproductive isolation makes this a promising candidate speciation gene. Further investigation of the phenological characteristics of both species and flowering time pathway genes is suggested in order to elucidate the importance of phenology genes for the maintenance of species integrity. 
Next-generation sequencing in multiple population pairs in combination with high-density genetic linkage maps could reveal the genome-wide distribution of outlier genes and their potential role in reproductive isolation between these species.
Affiliation(s)
| | | | - Oliver Gailing
- Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA
|
41
|
Frenkel ZM, Barzily Z, Volkovich Z, Trifonov EN. Hidden ancient repeats in DNA: Mapping and quantification. Gene 2013; 528:282-7. [DOI: 10.1016/j.gene.2013.06.059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/17/2013] [Accepted: 06/21/2013] [Indexed: 01/27/2023]
|
42
|
Mary Rajathei D, Selvaraj S. Analysis of sequence repeats of proteins in the PDB. Comput Biol Chem 2013; 47:156-66. [PMID: 24121644 DOI: 10.1016/j.compbiolchem.2013.09.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/26/2013] [Revised: 08/27/2013] [Accepted: 09/05/2013] [Indexed: 10/26/2022]
Abstract
Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain.
Affiliation(s)
- David Mary Rajathei
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620024, Tamilnadu, India
|
43
|
Almeida B, Fernandes S, Abreu IA, Macedo-Ribeiro S. Trinucleotide repeats: a structural perspective. Front Neurol 2013; 4:76. [PMID: 23801983 PMCID: PMC3687200 DOI: 10.3389/fneur.2013.00076] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 03/08/2013] [Accepted: 06/04/2013] [Indexed: 11/29/2022] Open
Abstract
Trinucleotide repeat (TNR) expansions are present in a wide range of genes involved in several neurological disorders, being directly involved in the molecular mechanisms underlying pathogenesis through modulation of gene expression and/or the function of the RNA or protein it encodes. Structural and functional information on the role of TNR sequences in RNA and protein is crucial to understand the effect of TNR expansions in neurodegeneration. Therefore, this review intends to provide the reader with a structural and functional view of TNR and encoded homopeptide expansions, with a particular emphasis on polyQ expansions and their role in inducing the self-assembly, aggregation and functional alterations of the carrier protein, which culminate in neuronal toxicity and cell death. Detail will be given to the Machado-Joseph Disease-causative and polyQ-containing protein, ataxin-3, providing clues for the impact of polyQ expansion and its flanking regions in the modulation of ataxin-3 molecular interactions, function, and aggregation.
Affiliation(s)
- Bruno Almeida
- Instituto de Biologia Molecular e Celular, Universidade do Porto , Porto , Portugal
|
44
|
C-terminal low-complexity sequence repeats of Mycobacterium smegmatis Ku modulate DNA binding. Biosci Rep 2013; 33:175-84. [PMID: 23167261 PMCID: PMC3553676 DOI: 10.1042/bsr20120105] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/30/2023] Open
Abstract
Ku protein is an integral component of the NHEJ (non-homologous end-joining) pathway of DSB (double-strand break) repair. Both eukaryotic and prokaryotic Ku homologues have been characterized and shown to bind DNA ends. A unique feature of Mycobacterium smegmatis Ku is its basic C-terminal tail that contains several lysine-rich low-complexity PAKKA repeats that are absent from homologues encoded by obligate parasitic mycobacteria. Such PAKKA repeats are also characteristic of mycobacterial Hlp (histone-like protein) for which they have been shown to confer the ability to appose DNA ends. Unexpectedly, removal of the lysine-rich extension enhances DNA-binding affinity, but an interaction between DNA and the PAKKA repeats is indicated by the observation that only full-length Ku forms multiple complexes with a short stem-loop-containing DNA previously designed to accommodate only one Ku dimer. The C-terminal extension promotes DNA end-joining by T4 DNA ligase, suggesting that the PAKKA repeats also contribute to efficient end-joining. We suggest that low-complexity lysine-rich sequences have evolved repeatedly to modulate the function of unrelated DNA-binding proteins.
|
45
|
Stolle E, Kidner JH, Moritz RFA. Patterns of evolutionary conservation of microsatellites (SSRs) suggest a faster rate of genome evolution in Hymenoptera than in Diptera. Genome Biol Evol 2013; 5:151-62. [PMID: 23292136 PMCID: PMC3595035 DOI: 10.1093/gbe/evs133] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Accepted: 12/21/2012] [Indexed: 12/25/2022] Open
Abstract
Microsatellites, or simple sequence repeats (SSRs), are common and widespread DNA elements in genomes of many organisms. However, their dynamics in genome evolution is unclear, whereby they are thought to evolve neutrally. More available genome sequences along with dated phylogenies allowed for studying the evolution of these repetitive DNA elements along evolutionary time scales. This could be used to compare rates of genome evolution. We show that SSRs in insects can be retained for several hundred million years. Different types of microsatellites seem to be retained longer than others. By comparing Dipteran with Hymenopteran species, we found very similar patterns of SSR loss during their evolution, but both taxa differ profoundly in the rate. Relative to divergence time, Diptera lost SSRs twice as fast as Hymenoptera. The loss of SSRs on the Drosophila melanogaster X-chromosome was higher than on the other chromosomes. However, accounting for generation time, the Diptera show an 8.5-fold slower rate of SSR loss than the Hymenoptera, which, in contrast to previous studies, suggests a faster genome evolution in the latter. This shows that generation time differences can have a profound effect. A faster genome evolution in these insects could be facilitated by several factors very different to Diptera, which is discussed in light of our results on the haplodiploid D. melanogaster X-chromosome. Furthermore, large numbers of SSRs can be found to be in synteny and thus could be exploited as a tool to investigate genome structure and evolution.
Affiliation(s)
- Eckart Stolle
- Department of Zoology, Institute of Biology, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany.
|
46
|
Ryan CP, Crespi BJ. Androgen receptor polyglutamine repeat number: models of selection and disease susceptibility. Evol Appl 2012; 6:180-96. [PMID: 23467468 PMCID: PMC3586616 DOI: 10.1111/j.1752-4571.2012.00275.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 01/18/2012] [Accepted: 05/04/2012] [Indexed: 12/14/2022] Open
Abstract
Variation in polyglutamine repeat number in the androgen receptor (AR CAGn) is negatively correlated with the transcription of androgen-responsive genes and is associated with susceptibility to an extensive list of human diseases. Only a small portion of the heritability for many of these diseases is explained by conventional SNP-based genome-wide association studies, and the forces shaping AR CAGn among humans remain largely unexplored. Here, we propose evolutionary models for understanding selection at the AR CAG locus, namely balancing selection, sexual conflict, accumulation-selection, and antagonistic pleiotropy. We evaluate these models by examining AR CAGn-linked susceptibility to eight extensively studied diseases representing the diverse physiological roles of androgens, and consider the costs of these diseases by their frequency and fitness effects. Five diseases could contribute to the distribution of AR CAGn observed among contemporary human populations. With support for disease susceptibilities associated with long and short AR CAGn, balancing selection provides a useful model for studying selection at this locus. Gender-specific differences in AR CAGn health effects also support this locus as a candidate for sexual conflict over repeat number. Accompanied by the accumulation of AR CAGn in humans, these models help explain the distribution of repeat number in contemporary human populations.
Affiliation(s)
- Calen P Ryan
- Department of Biological Sciences, Simon Fraser University Burnaby, BC, Canada
|
47
|
Frenkel ZM, Trifonov EN. Origin and evolution of genes and genomes. Crucial role of triplet expansions. J Biomol Struct Dyn 2012; 30:201-10. [DOI: 10.1080/07391102.2012.677771] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/24/2023]
|
48
|
Luo H, Lin K, David A, Nijveen H, Leunissen JAM. ProRepeat: an integrated repository for studying amino acid tandem repeats in proteins. Nucleic Acids Res 2011; 40:D394-9. [PMID: 22102581 PMCID: PMC3245022 DOI: 10.1093/nar/gkr1019] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023] Open
Abstract
ProRepeat (http://prorepeat.bioinformatics.nl/) is an integrated curated repository and analysis platform for in-depth research on the biological characteristics of amino acid tandem repeats. ProRepeat collects repeats from all proteins included in the UniProt knowledgebase, together with 85 completely sequenced eukaryotic proteomes contained within the RefSeq collection. It contains non-redundant perfect tandem repeats, approximate tandem repeats and simple, low-complexity sequences, covering the majority of the amino acid tandem repeat patterns found in proteins. The ProRepeat web interface allows querying the repeat database using repeat characteristics like repeat unit and length, number of repetitions of the repeat unit and position of the repeat in the protein. Users can also search for repeats by the characteristics of repeat containing proteins, such as entry ID, protein description, sequence length, gene name and taxon. ProRepeat offers powerful analysis tools for finding biologically interesting properties of repeats, such as the strong position bias of leucine repeats in the N-terminus of eukaryotic protein sequences, the differences of repeat abundance among proteomes, the functional classification of repeat containing proteins and GC content constraints of repeats’ corresponding codons.
Affiliation(s)
- Hong Luo
- Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 569, 6700 AN Wageningen, Netherlands
|
49
|
Haerty W, Golding GB. Increased polymorphism near low-complexity sequences across the genomes of Plasmodium falciparum isolates. Genome Biol Evol 2011; 3:539-50. [PMID: 21602572 PMCID: PMC3140889 DOI: 10.1093/gbe/evr045] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open
Abstract
Low-complexity regions (LCRs) within protein sequences are often considered to evolve neutrally even though recent studies reported evidence for selection acting on some of them. Because of their widespread distribution among eukaryotic genomes and the potential deleterious effect of expansion/contraction of some of them in humans, low-complexity sequences are of major interest and numerous studies have attempted to describe their dynamics between genomes as well as the factors correlated to their variation and to assess their selective value. However, due to the scarcity of individual genomes within a species, most of the analyses so far have been performed at the species level with the implicit assumption that the variation both in composition and size within species is too small relative to the between-species divergence to affect the conclusions of the analysis. Here we used the available genomes of 14 Plasmodium falciparum isolates to assess the relationship between low-complexity sequence variation and factors such as nucleotide polymorphism across strains, sequence composition, and protein expression. We report that more than half of the 7,711 low-complexity sequences found within aligned coding sequences are variable in size among strains. Across strains, we observed an increasing density of polymorphic sites toward the LCR boundaries. This observation strongly suggests the joint effects of lowered selective constraints on low-complexity sequences and a mutagenic effect of these simple sequences.
Affiliation(s)
- Wilfried Haerty
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
|