Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Journal Articles

Rank	Citation Analysis	Article Type	Number of Years	Citation(s) in RCA
1	Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, Hook PW, Koren S, Rautiainen M, Alexandrov IA, Allen J, Asri M, Bzikadze AV, Chen NC, Chin CS, Diekhans M, Flicek P, Formenti G, Fungtammasan A, Garcia Giron C, Garrison E, Gershman A, Gerton JL, Grady PGS, Guarracino A, Haggerty L, Halabian R, Hansen NF, Harris R, Hartley GA, Harvey WT, Haukness M, Heinz J, Hourlier T, Hubley RM, Hunt SE, Hwang S, Jain M, Kesharwani RK, Lewis AP, Li H, Logsdon GA, Lucas JK, Makalowski W, Markovic C, Martin FJ, Mc Cartney AM, McCoy RC, McDaniel J, McNulty BM, Medvedev P, Mikheenko A, Munson KM, Murphy TD, Olsen HE, Olson ND, Paulin LF, Porubsky D, Potapova T, Ryabov F, Salzberg SL, Sauria MEG, Sedlazeck FJ, Shafin K, Shepelev VA, Shumate A, Storer JM, Surapaneni L, Taravella Oill AM, Thibaud-Nissen F, Timp W, Tomaszkiewicz M, Vollger MR, Walenz BP, Watwood AC, Weissensteiner MH, Wenger AM, Wilson MA, Zarate S, Zhu Y, Zook JM, Eichler EE, O'Neill RJ, Schatz MC, Miga KH, Makova KD, Phillippy AM. The complete sequence of a human Y chromosome. Nature 2023;621:344-354. [PMID: 37612512 PMCID: PMC10752217 DOI: 10.1038/s41586-023-06457-y] [Citation(s) in RCA: 174] [Impact Index Per Article: 87.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2022] [Accepted: 07/19/2023] [Indexed: 08/25/2023] Abstract The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. 
Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes. MeSH Headings: Humans; Base Sequence; Chromosomes, Human, Y/genetics; DNA, Satellite/genetics; Genetic Variation/genetics; Genetics, Population; Genomics/methods; Genomics/standards; Heterochromatin/genetics; Multigene Family/genetics; Reference Standards; Segmental Duplications, Genomic/genetics; Sequence Analysis, DNA/standards; Tandem Repeat Sequences/genetics; Telomere/genetics. Grants: U01 HG010961 NHGRI NIH HHS; R35 GM124827 NIGMS NIH HHS; R01 GM130691 NIGMS NIH HHS; T32 GM007454 NIGMS NIH HHS; UM1 HG010971 NHGRI NIH HHS; Z99 HG999999 Intramural NIH HHS; R01 HG002939 NHGRI NIH HHS; K99 GM147352 NIGMS NIH HHS; R01 HG009190 NHGRI NIH HHS; ZIA HG200398 Intramural NIH HHS; R35 GM133747 NIGMS NIH HHS; U24 HG010263 NHGRI NIH HHS; R01 GM136684 NIGMS NIH HHS; R01 HG010040 NHGRI NIH HHS; U41 HG010972 NHGRI NIH HHS; R21 CA240199 NCI NIH HHS; R01 CA266339 NCI NIH HHS; R00 GM147352 NIGMS NIH HHS; U41 HG006620 NHGRI NIH HHS; R01 HG010169 NHGRI NIH HHS; U41 HG007234 NHGRI NIH HHS; U01 CA253481 NCI NIH HHS; U24 HG007234 NHGRI NIH HHS; R01 HG011274 NHGRI NIH HHS; U24 HG006620 NHGRI NIH HHS; U24 HG010136 NHGRI NIH HHS; R21 HG010548 NHGRI NIH HHS; S10 OD028587 NIH HHS; U01 HG010971 NHGRI NIH HHS; U01 DA047638 NIDA NIH HHS; R01 GM123312 NIGMS NIH HHS;
R01 GM072264 NIGMS NIH HHS; R01 HG002385 NHGRI NIH HHS; U01 HG011758 NHGRI NIH HHS; Howard Hughes Medical Institute	research-article	2	174
2	Tomaszkiewicz M, Rangavittal S, Cechova M, Campos Sanchez R, Fescemyer HW, Harris R, Ye D, O'Brien PCM, Chikhi R, Ryder OA, Ferguson-Smith MA, Medvedev P, Makova KD. A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y. Genome Res 2016;26:530-40. [PMID: 26934921 PMCID: PMC4817776 DOI: 10.1101/gr.199448.115] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2015] [Accepted: 01/21/2016] [Indexed: 01/25/2023] Abstract The mammalian Y Chromosome sequence, critical for studying male fertility and dispersal, is enriched in repeats and palindromes, and thus, is the most difficult component of the genome to assemble. Previously, expensive and labor-intensive BAC-based techniques were used to sequence the Y for a handful of mammalian species. Here, we present a much faster and more affordable strategy for sequencing and assembling mammalian Y Chromosomes of sufficient quality for most comparative genomics analyses and for conservation genetics applications. The strategy combines flow sorting, short- and long-read genome and transcriptome sequencing, and droplet digital PCR with novel and existing computational methods. It can be used to reconstruct sex chromosomes in a heterogametic sex of any species. We applied our strategy to produce a draft of the gorilla Y sequence. The resulting assembly allowed us to refine gene content, evaluate copy number of ampliconic gene families, locate species-specific palindromes, examine the repetitive element content, and produce sequence alignments with human and chimpanzee Y Chromosomes. Our results inform the evolution of the hominine (human, chimpanzee, and gorilla) Y Chromosomes. 
Surprisingly, we found the gorilla Y Chromosome to be similar to the human Y Chromosome, but not to the chimpanzee Y Chromosome. Moreover, we have utilized the assembled gorilla Y Chromosome sequence to design genetic markers for studying the male-specific dispersal of this endangered species.	Research Support, U.S. Gov't, Non-P.H.S.	9	81
3	Tomaszkiewicz M, Medvedev P, Makova KD. Y and W Chromosome Assemblies: Approaches and Discoveries. Trends Genet 2017;33:266-282. [DOI: 10.1016/j.tig.2017.01.008] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2016] [Revised: 12/05/2016] [Accepted: 01/24/2017] [Indexed: 01/19/2023]		8	64
4	Sahlin K, Tomaszkiewicz M, Makova KD, Medvedev P. Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon. Nat Commun 2018;9:4601. [PMID: 30389934 PMCID: PMC6214943 DOI: 10.1038/s41467-018-06910-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2018] [Accepted: 09/29/2018] [Indexed: 12/30/2022] Open Abstract A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases. 
MeSH Headings: Aged; Algorithms; Computer Simulation; Exons/genetics; Fragile X Mental Retardation Protein/genetics; Gene Dosage; Humans; Male; Middle Aged; Multigene Family; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA Splicing/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Reproducibility of Results; Sequence Analysis, RNA/methods; Testis/metabolism. Grants: DBI-1356529 NSF | BIO | Division of Biological Infrastructure (DBI); IIS-1453527 NSF | Directorate for Computer & Information Science and Engineering | Division of Information and Intelligent Systems (Information & Intelligent Systems); CCF-1439057 NSF | Directorate for Computer & Information Science and Engineering | Division of Computing and Communication Foundations (CCF); IIS-1421908 NSF | Directorate for Computer & Information Science and Engineering | Division of Information and Intelligent Systems (Information & Intelligent Systems); DBI-ABI 0965596 NSF | BIO | Division of Biological Infrastructure (DBI); UL1 TR002014 NCATS NIH HHS; UL1TR000127 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS); UL1 TR000127 NCATS NIH HHS	Research Support, N.I.H., Extramural	7	35
5	Cechova M, Harris RS, Tomaszkiewicz M, Arbeithuber B, Chiaromonte F, Makova KD. High Satellite Repeat Turnover in Great Apes Studied with Short- and Long-Read Technologies. Mol Biol Evol 2019;36:2415-2431. [PMID: 31273383 PMCID: PMC6805231 DOI: 10.1093/molbev/msz156] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2019] [Revised: 06/12/2019] [Accepted: 06/13/2019] [Indexed: 12/23/2022] Open Abstract Satellite repeats are a structural component of centromeres and telomeres, and in some instances, their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: 1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and 2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males versus females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. 
Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions. Key Words: great apes; heterochromatin; long sequencing reads; satellite repeats. Grants: R01 GM130691 NIGMS NIH HHS; T32 GM102057 NIGMS NIH HHS	research-article	6	32
6	Tomaszkiewicz M, Abou Najm M, Beysens D, Alameddine I, Bou Zeid E, El-Fadel M. Projected climate change impacts upon dew yield in the Mediterranean basin. Sci Total Environ 2016;566-567:1339-1348. [PMID: 27266520 DOI: 10.1016/j.scitotenv.2016.05.195] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2016] [Revised: 05/27/2016] [Accepted: 05/27/2016] [Indexed: 06/06/2023] Abstract Water scarcity is increasingly raising the need for non-conventional water resources, particularly in arid and semi-arid regions. In this context, atmospheric moisture can potentially be harvested in the form of dew, which is commonly disregarded from the water budget, although its impact may be significant when compared to rainfall during the dry season. In this study, a dew atlas for the Mediterranean region is presented illustrating dew yields using the yield data collected for the 2013 dry season. The results indicate that cumulative monthly dew yield in the region can exceed 2.8 mm at the end of the dry season and 1.5 mm during the driest months, compared to <1 mm of rainfall during the same period in some areas. Dew yields were compared with potential evapotranspiration (PET) and actual evapotranspiration (ET) during summer months thus highlighting the role of dew to many native plants in the region. Furthermore, forecasted trends in temperature and relative humidity were used to estimate dew yields under future climatic scenarios. The results showed a 27% decline in dew yield during the critical summer months at the end of the century (2080). Key Words: Climate change adaptation; Dew; Geostatistical analysis; Non-conventional water resources		9	13
7	Vegesna R, Tomaszkiewicz M, Medvedev P, Makova KD. Dosage regulation, and variation in gene expression and copy number of human Y chromosome ampliconic genes. PLoS Genet 2019;15:e1008369. [PMID: 31525193 PMCID: PMC6772104 DOI: 10.1371/journal.pgen.1008369] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Revised: 10/01/2019] [Accepted: 08/13/2019] [Indexed: 12/28/2022] Open Abstract The Y chromosome harbors nine multi-copy ampliconic gene families expressed exclusively in testis. The gene copies within each family are >99% identical to each other, which poses a major challenge in evaluating their copy number. Recent studies demonstrated high variation in Y ampliconic gene copy number among humans. However, how this variation affects expression levels in human testis remains understudied. Here we developed a novel computational tool Ampliconic Copy Number Estimator (AmpliCoNE) that utilizes read sequencing depth information to estimate Y ampliconic gene copy number per family. We applied this tool to whole-genome sequencing data of 149 men with matched testis expression data whose samples are part of the Genotype-Tissue Expression (GTEx) project. We found that the Y ampliconic gene families with low copy number in humans were deleted or pseudogenized in non-human great apes, suggesting relaxation of functional constraints. Among the Y ampliconic gene families, higher copy number leads to higher expression. Within the Y ampliconic gene families, copy number does not influence gene expression, rather a high tolerance for variation in gene expression was observed in testis of presumably healthy men. No differences in gene expression levels were found among major Y haplogroups. 
Age positively correlated with expression levels of the HSFY and PRY gene families in the African subhaplogroup E1b, but not in the European subhaplogroups R1b and I1. We also found that expression of five Y ampliconic gene families is coordinated with that of their non-Y (i.e. X or autosomal) homologs. Indeed, five ampliconic gene families had consistently lower expression levels when compared to their non-Y homologs suggesting dosage regulation, while the HSFY family had higher expression levels than its X homolog and thus lacked dosage regulation. MeSH Headings: Animals; Chromosomes, Human, Y/genetics; Chromosomes, Human, Y/physiology; DNA Copy Number Variations/genetics; Databases, Genetic; Dosage Compensation, Genetic/genetics; Dosage Compensation, Genetic/physiology; Epigenesis, Genetic/genetics; Gene Dosage/genetics; Gene Expression/genetics; Gene Expression Regulation/genetics; Genes, Y-Linked/genetics; Genes, Y-Linked/physiology; Heat Shock Transcription Factors/genetics; Heat Shock Transcription Factors/metabolism; Humans; Male; Multigene Family/genetics; Sequence Analysis, DNA/methods; Testis/metabolism. Grants: R01 GM130691 NIGMS NIH HHS; T32 GM102057 NIGMS NIH HHS; Pennsylvania Department of Health; National Institutes of Health; National Science Foundation; Clinical and Translational Sciences Institute; Institute for CyberScience; Eberly College of Sciences at Penn State	Research Support, N.I.H., Extramural	6	13
8	Fungtammasan A, Tomaszkiewicz M, Campos-Sánchez R, Eckert KA, DeGiorgio M, Makova KD. Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats. Mol Biol Evol 2016;33:2744-58. [PMID: 27413049 PMCID: PMC5026258 DOI: 10.1093/molbev/msw139] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open Abstract Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA–DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. 
STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD. Key Words: RNA sequencing; RNA–DNA differences; error correction model; microsatellites; reverse transcription errors; sequencing errors; tandem repeats; transcription errors	Research Support, Non-U.S. Gov't	9	12
9	Rangavittal S, Harris RS, Cechova M, Tomaszkiewicz M, Chikhi R, Makova KD, Medvedev P. RecoverY: k-mer-based read classification for Y-chromosome-specific sequencing and assembly. Bioinformatics 2019;34:1125-1131. [PMID: 29194476 DOI: 10.1093/bioinformatics/btx771] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2017] [Accepted: 11/27/2017] [Indexed: 11/13/2022] Open Abstract Motivation The haploid mammalian Y chromosome is usually under-represented in genome assemblies due to high repeat content and low depth due to its haploid nature. One strategy to ameliorate the low coverage of Y sequences is to experimentally enrich Y-specific material before assembly. As the enrichment process is imperfect, algorithms are needed to identify putative Y-specific reads prior to downstream assembly. A strategy that uses k-mer abundances to identify such reads was used to assemble the gorilla Y. However, the strategy required the manual setting of key parameters, a time-consuming process leading to sub-optimal assemblies. Results We develop a method, RecoverY, that selects Y-specific reads by automatically choosing the abundance level at which a k-mer is deemed to originate from the Y. This algorithm uses prior knowledge about the Y chromosome of a related species or known Y transcript sequences. We evaluate RecoverY on both simulated and real data, for human and gorilla, and investigate its robustness to important parameters. We show that RecoverY leads to a vastly superior assembly compared to alternate strategies of filtering the reads or contigs. Compared to the preliminary strategy used by Tomaszkiewicz et al., we achieve a 33% improvement in assembly size and a 20% improvement in the NG50, demonstrating the power of automatic parameter selection. 
Availability and implementation Our tool RecoverY is freely available at https://github.com/makovalab-psu/RecoverY. Contact kmakova@bx.psu.edu or pashadag@cse.psu.edu. Supplementary information Supplementary data are available at Bioinformatics online.	Research Support, U.S. Gov't, Non-P.H.S.	6	12
10	Ye D, Zaidi AA, Tomaszkiewicz M, Anthony K, Liebowitz C, DeGiorgio M, Shriver MD, Makova KD. High Levels of Copy Number Variation of Ampliconic Genes across Major Human Y Haplogroups. Genome Biol Evol 2018;10:1333-1350. [PMID: 29718380 PMCID: PMC6007357 DOI: 10.1093/gbe/evy086] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/27/2018] [Indexed: 01/11/2023] Open Abstract Because of its highly repetitive nature, the human male-specific Y chromosome remains understudied. It is important to investigate variation on the Y chromosome to understand its evolution and contribution to phenotypic variation, including infertility. Approximately 20% of the human Y chromosome consists of ampliconic regions which include nine multi-copy gene families. These gene families are expressed exclusively in testes and usually implicated in spermatogenesis. Here, to gain a better understanding of the role of the Y chromosome in human evolution and in determining sexually dimorphic traits, we studied ampliconic gene copy number variation in 100 males representing ten major Y haplogroups world-wide. Copy number was estimated with droplet digital PCR. In contrast to low nucleotide diversity observed on the Y in previous studies, here we show that ampliconic gene copy number diversity is very high. A total of 98 copy-number-based haplotypes were observed among 100 individuals, and haplotypes were sometimes shared by males from very different haplogroups, suggesting homoplasies. The resulting haplotypes did not cluster according to major Y haplogroups. Overall, only two gene families (RBMY and TSPY) showed significant differences in copy number among major Y haplogroups, and the haplogroup of a male could not be predicted based on his ampliconic gene copy numbers. 
Finally, we did not find significant correlations either between copy number variation and individual's height, or between the former and facial masculinity/femininity. Our results suggest rapid evolution of ampliconic gene copy numbers on the human Y, and we discuss its causes. Key Words: ampliconic genes; y chromosome haplotypes. MeSH Headings: Body Height; Chromosomes, Human, Y; DNA Copy Number Variations; Evolution, Molecular; Gene Amplification; Genome, Human; Haplotypes; Humans; Male; Masculinity; Multigene Family; Phenotype	research-article	7	11
11	Vegesna R, Tomaszkiewicz M, Ryder OA, Campos-Sánchez R, Medvedev P, DeGiorgio M, Makova KD. Ampliconic Genes on the Great Ape Y Chromosomes: Rapid Evolution of Copy Number but Conservation of Expression Levels. Genome Biol Evol 2021;12:842-859. [PMID: 32374870 PMCID: PMC7313670 DOI: 10.1093/gbe/evaa088] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/28/2020] [Indexed: 12/16/2022] Open Abstract Multicopy ampliconic gene families on the Y chromosome play an important role in spermatogenesis. Thus, studying their genetic variation in endangered great ape species is critical. We estimated the sizes (copy number) of nine Y ampliconic gene families in population samples of chimpanzee, bonobo, and orangutan with droplet digital polymerase chain reaction, combined these estimates with published data for human and gorilla, and produced genome-wide testis gene expression data for great apes. Analyzing this comprehensive data set within an evolutionary framework, we, first, found high inter- and intraspecific variation in gene family size, with larger families exhibiting higher variation as compared with smaller families, a pattern consistent with random genetic drift. Second, for four gene families, we observed significant interspecific size differences, sometimes even between sister species—chimpanzee and bonobo. Third, despite substantial variation in copy number, Y ampliconic gene families’ expression levels did not differ significantly among species, suggesting dosage regulation. Fourth, for three gene families, size was positively correlated with gene expression levels across species, suggesting that, given sufficient evolutionary time, copy number influences gene expression. 
Our results indicate high variability in size but conservation in gene expression levels in Y ampliconic gene families, significantly advancing our understanding of Y-chromosome evolution in great apes. Key Words: Y chromosome; ampliconic genes; bonobo; gene copy number; gene expression; great apes; orangutan	Research Support, U.S. Gov't, Non-P.H.S.	4	9
12	Rangavittal S, Stopa N, Tomaszkiewicz M, Sahlin K, Makova KD, Medvedev P. DiscoverY: a classifier for identifying Y chromosome sequences in male assemblies. BMC Genomics 2019;20:641. [PMID: 31399045 PMCID: PMC6688218 DOI: 10.1186/s12864-019-5996-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2019] [Accepted: 07/26/2019] [Indexed: 11/10/2022] Open Abstract Background Although the Y chromosome plays an important role in male sex determination and fertility, it is currently understudied due to its haploid and repetitive nature. Methods to isolate Y-specific contigs from a whole-genome assembly broadly fall into two categories. The first involves retrieving Y-contigs using proportion sharing with a female, but such a strategy is prone to false positives in the absence of a high-quality, complete female reference. A second strategy uses the ratio of depth of coverage from male and female reads to select Y-contigs, but such a method requires high-depth sequencing of a female and cannot utilize existing female references. Results We develop a k-mer based method called DiscoverY, which combines proportion sharing with female with depth of coverage from male reads to classify contigs as Y-chromosomal. We evaluate the performance of DiscoverY on human and gorilla genomes, across different sequencing platforms including Illumina, 10X, and PacBio. In the cases where the male and female data are of high quality, DiscoverY has a high precision and recall and outperforms existing methods. For cases when a high quality female reference is not available, we quantify the effect of using draft reference or even just raw sequencing reads from a female. Conclusion DiscoverY is an effective method to isolate Y-specific contigs from a whole-genome assembly. 
However, regions homologous to the X chromosome remain difficult to detect. Electronic supplementary material The online version of this article (10.1186/s12864-019-5996-3) contains supplementary material, which is available to authorized users. Key Words: Genome assembly; Male genome; Y chromosome	Journal Article	6	9
13	Böhne A, Schultheis C, Galiana-Arnoux D, Froschauer A, Zhou Q, Schmidt C, Selz Y, Ozouf-Costaz C, Dettai A, Segurens B, Couloux A, Bernard-Samain S, Barbe V, Chilmonczyk S, Brunet F, Darras A, Tomaszkiewicz M, Semon M, Schartl M, Volff JN. Molecular analysis of the sex chromosomes of the platyfish Xiphophorus maculatus: Towards the identification of a new type of master sexual regulator in vertebrates. Integr Zool 2011;4:277-84. [PMID: 21392300 DOI: 10.1111/j.1749-4877.2009.00166.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Abstract In contrast to mammals and birds, fish display an amazing diversity of genetic sex determination systems, with frequent changes during evolution possibly associated with the emergence of new sex chromosomes and sex-determining genes. To better understand the molecular and evolutionary mechanisms driving this diversity, several fish models are studied in parallel. Besides the medaka (Oryzias latipes Temminck and Schlegel, 1846) for which the master sex-determination gene has been identified, one of the most advanced models for studying sex determination is the Southern platyfish (Xiphophorus maculatus, Günther 1966). Xiphophorus maculatus belongs to the Poeciliids, a family of live-bearing freshwater fish, including platyfish, swordtails and guppies that perfectly illustrates the diversity of genetic sex-determination mechanisms observed in teleosts. For X. maculatus, bacterial artificial chromosome contigs covering the sex-determination region of the X and Y sex chromosomes have been constructed. Initial molecular analysis demonstrated that the sex-determination region is very unstable and frequently undergoes duplications, deletions, inversions and other rearrangements. 
Eleven gene candidates linked to the master sex-determining gene have been identified, some of them corresponding to pseudogenes. All putative genes are present on both the X and the Y chromosomes, suggesting a poor degree of differentiation and a young evolutionary age for platyfish sex chromosomes. When compared with other fish and tetrapod genomes, syntenies were detected only with autosomes. This observation supports an independent origin of sex chromosomes, not only in different vertebrate lineages but also between different fish species.	Research Support, Non-U.S. Gov't	14	4
14	Tomaszkiewicz M, Sahlin K, Medvedev P, Makova KD. Transcript Isoform Diversity of Ampliconic Genes on the Y Chromosome of Great Apes. Genome Biol Evol 2023;15:evad205. [PMID: 37967251 PMCID: PMC10673640 DOI: 10.1093/gbe/evad205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Revised: 10/20/2023] [Accepted: 11/03/2023] [Indexed: 11/17/2023] Open Abstract Y chromosomal ampliconic genes (YAGs) are important for male fertility, as they encode proteins functioning in spermatogenesis. The variation in copy number and expression levels of these multicopy gene families has been studied in great apes; however, the diversity of splicing variants remains unexplored. Here, we deciphered the sequences of polyadenylated transcripts of all nine YAG families (BPY2, CDY, DAZ, HSFY, PRY, RBMY, TSPY, VCY, and XKRY) from testis samples of six great ape species (human, chimpanzee, bonobo, gorilla, Bornean orangutan, and Sumatran orangutan). To achieve this, we enriched YAG transcripts with capture probe hybridization and sequenced them with long (Pacific Biosciences) reads. Our analysis of this data set resulted in several findings. First, we observed evolutionarily conserved alternative splicing patterns for most YAG families except for BPY2 and PRY. Second, our results suggest that BPY2 transcripts and proteins originate from separate genomic regions in bonobo versus human, which is possibly facilitated by acquiring new promoters. Third, our analysis indicates that the PRY gene family, having the highest representation of noncoding transcripts, has been undergoing pseudogenization. Fourth, we have not detected signatures of selection in the five YAG families shared among great apes, even though we identified many species-specific protein-coding transcripts. 
Fifth, we predicted consensus disorder regions across most gene families and species, which could be used for future investigations of male infertility. Overall, our work illuminates the YAG isoform landscape and provides a genomic resource for future functional studies focusing on infertility phenotypes in humans and critically endangered great apes. Key Words: Y chromosome ampliconic gene diversity great apes transcript isoform. MESH Headings: Animals; Male; Humans; Pan paniscus/genetics; Hominidae/genetics; Y Chromosome/genetics; Pan troglodytes/genetics; Protein Isoforms/genetics. Grants: R01 GM130691 NIGMS NIH HHS; R01 GM146462 NIGMS NIH HHS	Research Support, N.I.H., Extramural	2
15	Tomaszkiewicz M, Sahlin K, Medvedev P, Makova KD. Transcript Isoform Diversity of Ampliconic Genes on the Y Chromosome of Great Apes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.02.530874. [PMID: 36993458 PMCID: PMC10054944 DOI: 10.1101/2023.03.02.530874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023] Abstract: Y-chromosomal Ampliconic Genes (YAGs) are important for male fertility, as they encode proteins functioning in spermatogenesis. The variation in copy number and expression levels of these multicopy gene families has been recently studied in great apes, however, the diversity of splicing variants remains unexplored. Here we deciphered the sequences of polyadenylated transcripts of all nine YAG families (BPY2, CDY, DAZ, HSFY, PRY, RBMY, TSPY, VCY, and XKRY) from testis samples of six great ape species (human, chimpanzee, bonobo, gorilla, Bornean orangutan, and Sumatran orangutan). To achieve this, we enriched YAG transcripts with capture-probe hybridization and sequenced them with long (Pacific Biosciences) reads. Our analysis of this dataset resulted in several findings. First, we uncovered a high diversity of YAG transcripts across great apes. Second, we observed evolutionarily conserved alternative splicing patterns for most YAG families except for BPY2 and PRY. Our results suggest that BPY2 transcripts and predicted proteins in several great ape species (bonobo and the two orangutans) have independent evolutionary origins and are not homologous to human reference transcripts and proteins. In contrast, our results suggest that the PRY gene family, having the highest representation of transcripts without open reading frames, has been undergoing pseudogenization. 
Third, even though we have identified many species-specific protein-coding YAG transcripts, we have not detected any signatures of positive selection. Overall, our work illuminates the YAG isoform landscape and its evolutionary history, and provides a genomic resource for future functional studies focusing on infertility phenotypes in humans and critically endangered great apes. Grants: R01 GM130691 NIGMS NIH HHS; R01 GM146462 NIGMS NIH HHS	Preprint	2
16	Sokirniy I, Inam H, Tomaszkiewicz M, Reynolds J, McCandlish D, Pritchard J. A side-by-side comparison of variant function measurements using deep mutational scanning and base editing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.30.601444. [PMID: 39005366 PMCID: PMC11244880 DOI: 10.1101/2024.06.30.601444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024] Abstract: Variant annotation is a crucial objective in mammalian functional genomics. Deep Mutational Scanning (DMS) is a well-established method for annotating human gene variants, but CRISPR base editing (BE) is emerging as an alternative. However, questions remain about how well high-throughput base editing measurements can annotate variant function and the extent of downstream experimental validation required. This study presents the first direct comparison of DMS and BE in the same lab and cell line. Results indicate that focusing on the most likely edits and highest efficiency sgRNAs enhances the agreement between a "gold standard" DMS dataset and a BE screen. A simple filter for sgRNAs making single edits in their window could sufficiently annotate a large proportion of variants directly from sgRNA sequencing of large pools. When multi-edit guides are unavoidable, directly measuring the variants created in the pool, rather than sgRNA abundance, can recover high-quality variant annotation measurements in multiplexed pools. Taken together, our data show a surprising degree of correlation between base editor data and gold standard deep mutational scanning. Grants: R35 GM133613 NIGMS NIH HHS; T32 GM108563 NIGMS NIH HHS; U01 CA265709 NCI NIH HHS	Preprint	1
