1
|
Matveeva OV, Ogurtsov AY, Nazipova NN, Shabalina SA. Sequence characteristics define trade-offs between on-target and genome-wide off-target hybridization of oligoprobes. PLoS One 2018; 13:e0199162. [PMID: 29928000 PMCID: PMC6013149 DOI: 10.1371/journal.pone.0199162] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/12/2018] [Accepted: 06/02/2018] [Indexed: 12/20/2022]
Abstract
Off-target interaction of oligoprobes with partially complementary nucleotide sequences is a problem for many bio-techniques. The goal of the study was to identify oligoprobe sequence characteristics that control the ratio between on-target and off-target hybridization. To understand the complex interplay between specific and genome-wide off-target (cross-hybridization) signals, we analyzed a database derived from genomic comparison hybridization experiments performed with an Affymetrix tiling array. The database included two types of probes with signals derived from (i) a combination of specific signal and cross-hybridization and (ii) genomic cross-hybridization only. All probes from the database were grouped into bins according to their sequence characteristics, and both hybridization signals were averaged separately within each bin. For selection of specific probes, we analyzed the following sequence characteristics: vulnerability to self-folding, nucleotide composition bias, numbers of G nucleotides and GGG-blocks, and occurrence of the probe's k-mers in the human genome. Increases in bin ranges for these characteristics are accompanied by a decrease in hybridization specificity, defined as the ratio between specific and cross-hybridization signals. However, both averaged hybridization signals grow as the probes' binding energy increases, with the specific signal increasing significantly faster than the cross-hybridization signal. The same trend is evident for the S function, which serves as a combined evaluation of probe binding energy and occurrence of the probe's k-mers in the genome. Application of S allows extracting a larger number of specific probes than using binding energy alone. Thus, we showed that high values of specific and cross-hybridization signals are not mutually exclusive for probes with high values of binding energy and S.
In this study, the application of a new set of sequence characteristics allows detection of probes that are highly specific to their targets for array design and other bio-techniques that require selection of specific probes.
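The binning procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of G count as the binning characteristic, the bin width, and the assumption that the specific signal equals the total signal minus the cross-hybridization signal are all hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from statistics import mean


def g_count_bin(seq, width=2):
    """Assign a probe to a bin by its number of G nucleotides.

    The bin width is an illustrative assumption, not a value from the study.
    """
    return seq.upper().count("G") // width


def specificity_by_bin(target_probes, background_probes, bin_of=g_count_bin):
    """Average both signal types per bin and compute a specificity ratio.

    target_probes:     (sequence, signal) pairs for probes with a genomic target
                       (signal = specific + cross-hybridization).
    background_probes: (sequence, signal) pairs for probes without a target
                       (signal = cross-hybridization only).
    Returns {bin: (mean specific signal, mean cross signal, ratio)}.
    """
    total = defaultdict(list)
    cross = defaultdict(list)
    for seq, signal in target_probes:
        total[bin_of(seq)].append(signal)
    for seq, signal in background_probes:
        cross[bin_of(seq)].append(signal)

    result = {}
    # Only bins that contain both probe types can be compared.
    for b in sorted(total.keys() & cross.keys()):
        cross_mean = mean(cross[b])
        if cross_mean <= 0:
            continue  # ratio undefined without a positive background signal
        # Assumption: the two signal components are additive.
        specific_mean = mean(total[b]) - cross_mean
        result[b] = (specific_mean, cross_mean, specific_mean / cross_mean)
    return result


if __name__ == "__main__":
    # Made-up sequences and signal values, for demonstration only.
    targets = [("ACGT", 30.0), ("AGCT", 50.0), ("GGGGA", 100.0)]
    background = [("TGCA", 10.0), ("GGGGT", 50.0)]
    for b, (spec, cr, ratio) in specificity_by_bin(targets, background).items():
        print(b, spec, cr, ratio)
```

Any other characteristic named in the abstract (for example a count of GGG-blocks) could be passed as `bin_of` in the same way.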
Affiliation(s)
- Olga V. Matveeva
- Biopolymer Design LLC, Acton, Massachusetts, United States of America
- * E-mail: (OVM); (SAS)
- Aleksey Y. Ogurtsov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Nafisa N. Nazipova
- Institute of Mathematical Problems of Biology, RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Svetlana A. Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
|
2
|
Matveeva OV, Nechipurenko YD, Riabenko E, Ragan C, Nazipova NN, Ogurtsov AY, Shabalina SA. Optimization of signal-to-noise ratio for efficient microarray probe design. Bioinformatics 2017; 32:i552-i558. [PMID: 27587674 DOI: 10.1093/bioinformatics/btw451] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
MOTIVATION Target-specific hybridization depends on oligo-probe characteristics that improve hybridization specificity and minimize genome-wide cross-hybridization. Interplay between specific hybridization and genome-wide cross-hybridization has been insufficiently studied, despite its crucial role in efficient probe design and in data analysis. RESULTS In this study, we defined hybridization specificity as a ratio between oligo target-specific hybridization and oligo genome-wide cross-hybridization. A microarray database, derived from a Genomic Comparison Hybridization (GCH) experiment performed using the Affymetrix platform, contains two different types of probes. The first type of oligo-probes does not have a specific target on the genome, and their hybridization signals are derived from genome-wide cross-hybridization alone. The second type includes oligonucleotides that have a specific target on the genomic DNA, and their signals are derived from specific and cross-hybridization components combined in a total signal. A comparative analysis of the hybridization specificity of oligo-probes, as well as their nucleotide sequences and thermodynamic features, was performed on the database. The comparison revealed that hybridization specificity was negatively affected by low stability of the fully-paired oligo-target duplex, stable probe self-folding, G-rich content, including GGG motifs, low sequence complexity and nucleotide composition symmetry. CONCLUSION Filtering out the probes with defined 'negative' characteristics significantly increases specific hybridization and dramatically decreases genome-wide cross-hybridization. Selected oligo-probes have two times higher hybridization specificity on average, compared to the probes that were filtered from the analysis by applying suggested cutoff thresholds to the described parameters. A new approach for efficient oligo-probe design is described in our study.
CONTACT shabalin@ncbi.nlm.nih.gov or olga.matveeva@gmail.com SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Olga V Matveeva
- Biopolymer Design LLC, Acton, MA 01721, USA Engelhardt Institute of Molecular Biology, Moscow 119991, Russia
- Evgeniy Riabenko
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Chikako Ragan
- Queensland Brain Institute, University of Queensland, Brisbane, QLD 4072 Australia
- Nafisa N Nazipova
- Institute of Mathematical Problems of Biology, Pushchino, Moscow Region, 142290, Russia
- Aleksey Y Ogurtsov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Svetlana A Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
3
|
Gregory WF, Parkinson J. Caenorhabditis elegans-applications to nematode genomics. Comp Funct Genomics 2011; 4:194-202. [PMID: 18629128 PMCID: PMC2447415 DOI: 10.1002/cfg.260] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/28/2003] [Accepted: 01/30/2003] [Indexed: 11/06/2022]
Abstract
The complete genome sequence of the free-living nematode Caenorhabditis elegans was published 4 years ago. Since then, we have seen great strides in technologies that seek to exploit these data. Here we describe the application of some of these techniques and other advances that are helping us to understand not only the biology of this important model organism but also the entire phylum Nematoda.
Affiliation(s)
- William F Gregory
- Institute of Cell Animal and Population Biology Kings Buildings West Mains Rd Edinburgh EH9 3JT UK
|
4
|
Mah AK, Tu DK, Johnsen RC, Chu JS, Chen N, Baillie DL. Characterization of the octamer, a cis-regulatory element that modulates excretory cell gene-expression in Caenorhabditis elegans. BMC Mol Biol 2010; 11:19. [PMID: 20211011 PMCID: PMC2841177 DOI: 10.1186/1471-2199-11-19] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/06/2009] [Accepted: 03/08/2010] [Indexed: 11/23/2022]
Abstract
BACKGROUND We have previously demonstrated that the POU transcription factor CEH-6 is required for driving aqp-8 expression in the C. elegans excretory (canal) cell, an osmotic regulatory organ that is functionally analogous to the kidney. This transcriptional regulation occurs through CEH-6 binding to a cis-regulatory element called the octamer (ATTTGCAT), which is located in the aqp-8 promoter. RESULTS Here, we further characterize octamer-driven transcription in C. elegans. First, we analyzed the positional requirements of the octamer. To do so, we assayed the effects on excretory cell expression by placing the octamer within the well-characterized promoter of vit-2. Second, using phylogenetic footprinting between three Caenorhabditis species, we identified a set of 165 genes that contain conserved upstream octamers in their promoters. Third, we used promoter::GFP fusions to examine the expression patterns of 107 of the 165 genes. This analysis demonstrated that conservation of octamers in promoters increases the likelihood that the gene is expressed in the excretory cell. Furthermore, we found that the sequences flanking the octamers may have functional importance. Finally, we altered the octamer using site-directed mutagenesis. Thus, we demonstrated that some nucleotide substitutions within the octamer do not affect the expression pattern of nearby genes, but change their overall expression. Therefore, we have expanded the core octamer to include flanking regions and variants of the motif. CONCLUSIONS Taken together, we have demonstrated that octamer-containing regions are associated with excretory cell expression of several genes that have putative roles in osmoregulation. Moreover, our analysis of the octamer sequence and its sequence variants could aid in the identification of additional genes that are expressed in the excretory cell and that may also be regulated by CEH-6.
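As a rough illustration of how promoter sequences might be screened for the octamer (ATTTGCAT) named in this abstract, the Python sketch below scans both strands for exact matches. It is an assumption-laden example, not the authors' pipeline: the function names are hypothetical, the example sequence is made up, and neither phylogenetic footprinting nor the motif variants discussed in the abstract are modeled.

```python
OCTAMER = "ATTTGCAT"  # core motif as given in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(promoter, motif=OCTAMER):
    """Find exact motif matches on both strands.

    Returns a sorted list of (position, strand) tuples, where position is the
    0-based start on the forward strand and strand is "+" or "-".
    """
    promoter = promoter.upper()
    hits = set()
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.add((start, strand))
            # Step by one so that overlapping matches are also reported.
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence with one forward and one reverse-strand occurrence.
    print(find_motif("CCATTTGCATGGATGCAAATCC"))
```

Exact matching is the simplest possible choice here; a scan that tolerates the nucleotide substitutions described in the abstract would need a position weight matrix or a mismatch-tolerant search instead.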
Affiliation(s)
- Allan K Mah
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, University of British Columbia, 950 West 28th Avenue, Vancouver, British Columbia, Canada V5Z H4H
- Domena K Tu
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
- Robert C Johnsen
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
- Jeffrey S Chu
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
- Nansheng Chen
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
- David L Baillie
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada, V5A 1S6
|
5
|
Evolution of Transcription Factor Binding Sites in Mammalian Gene Regulatory Regions: Handling Counterintuitive Results. J Mol Evol 2009; 68:654-64. [DOI: 10.1007/s00239-009-9238-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/04/2007] [Revised: 03/30/2009] [Accepted: 04/15/2009] [Indexed: 01/26/2023]
|
6
|
Cutter AD, Dey A, Murray RL. Evolution of the Caenorhabditis elegans genome. Mol Biol Evol 2009; 26:1199-234. [PMID: 19289596 DOI: 10.1093/molbev/msp048] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.7] [Indexed: 12/17/2022]
Abstract
A fundamental problem in genome biology is to elucidate the evolutionary forces responsible for generating nonrandom patterns of genome organization. As the first metazoan to benefit from full-genome sequencing, Caenorhabditis elegans has been at the forefront of research in this area. Studies of genomic patterns, and their evolutionary underpinnings, continue to be augmented by the recent push to obtain additional full-genome sequences of related Caenorhabditis taxa. In the near future, we expect to see major advances with the onset of whole-genome resequencing of multiple wild individuals of the same species. In this review, we synthesize many of the important insights to date in our understanding of genome organization and function that derive from the evolutionary principles made explicit by theoretical population genetics and molecular evolution and highlight fertile areas for future research on unanswered questions in C. elegans genome evolution. We call attention to the need for C. elegans researchers to generate and critically assess nonadaptive hypotheses for genomic and developmental patterns, in addition to adaptive scenarios. We also emphasize the potential importance of evolution in the gonochoristic (female and male) ancestors of the androdioecious (hermaphrodite and male) C. elegans as the source for many of its genomic and developmental patterns.
Affiliation(s)
- Asher D Cutter
- Department of Ecology & Evolutionary Biology and the Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada.
|
7
|
Keightley PD, Halligan DL. Analysis and implications of mutational variation. Genetica 2008; 136:359-69. [PMID: 18663587 DOI: 10.1007/s10709-008-9304-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 07/16/2008] [Accepted: 07/16/2008] [Indexed: 11/25/2022]
Abstract
Variation from new mutations is important for several questions in quantitative genetics. Key parameters are the genomic mutation rate and the distribution of effects of mutations (DEM), which determine the amount of new quantitative variation that arises per generation from mutation (V(M)). Here, we review methods and empirical results concerning mutation accumulation (MA) experiments that have shed light on properties of mutations affecting quantitative traits. Surprisingly, most data on fitness traits from laboratory assays of MA lines indicate that the DEM is platykurtic in form (i.e., substantially less leptokurtic than an exponential distribution), and imply that most variation is produced by mutations of moderate to large effect. This finding contrasts with results from MA or mutagenesis experiments in which mutational changes to the DNA can be assayed directly, which imply that the vast majority of mutations have very small phenotypic effects, and that the distribution has a leptokurtic form. We compare these findings with recent approaches that attempt to infer the DEM for fitness based on comparing the frequency spectra of segregating nucleotide polymorphisms at putatively neutral and selected sites in population samples. When applied to data for humans and Drosophila, these analyses also indicate that the DEM is strongly leptokurtic. However, by combining the resultant estimates of parameters of the DEM with estimates of the mutation rate per nucleotide, the predicted V(M) for fitness is only a tiny fraction of V(M) observed in MA experiments. This discrepancy can be explained if we postulate that a few deleterious mutations of large effect contribute most of the mutational variation observed in MA experiments and that such mutations segregate at very low frequencies in natural populations, and effectively are never seen in population samples.
Affiliation(s)
- Peter D Keightley
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Road, Edinburgh, EH9 3JT, UK.
|
8
|
Loewe L, Cutter AD. On the potential for extinction by Muller's ratchet in Caenorhabditis elegans. BMC Evol Biol 2008; 8:125. [PMID: 18447910 PMCID: PMC2408595 DOI: 10.1186/1471-2148-8-125] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Received: 10/30/2007] [Accepted: 04/30/2008] [Indexed: 11/10/2022]
Abstract
Background The self-fertile hermaphrodite worm C. elegans is an important model organism for biology, yet little is known about the origin and persistence of the self-fertilizing mode of reproduction in this lineage. Recent work has demonstrated an extraordinary degree of selfing combined with a high deleterious mutation rate in contemporary populations. These observations raise the question as to whether the mutation load might rise to such a degree as to eventually threaten the species with extinction. The potential for such a process to occur would inform our understanding of the time since the origin of self-fertilization in C. elegans history. Results To address this issue, here we quantify the rate of fitness decline expected to occur via Muller's ratchet for a purely selfing population, using both analytical approximations and globally distributed individual-based simulations from the evolution@home system to compute the rate of deleterious mutation accumulation. Using the best available estimates for parameters of how C. elegans evolves, we conclude that pure selfing can persist for only short evolutionary intervals, and is expected to lead to extinction within thousands of years for a plausible portion of parameter space. Credible lower-bound estimates of nuclear mutation rates do not extend the expected time to extinction much beyond a million years. Conclusion Thus we conclude that either the extreme self-fertilization implied by current patterns of genetic variation in C. elegans arose relatively recently or that low levels of outcrossing and other factors are key to the persistence of C. elegans into the present day. We also discuss results for the mitochondrial genome and the implications for C. briggsae, a close relative that made the transition to selfing independently of C. elegans.
Affiliation(s)
- Laurence Loewe
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, UK.
|
9
|
Abstract
The distribution of fitness effects (DFE) of new mutations is a fundamental entity in genetics that has implications ranging from the genetic basis of complex disease to the stability of the molecular clock. It has been studied by two different approaches: mutation accumulation and mutagenesis experiments, and the analysis of DNA sequence data. The proportion of mutations that are advantageous, effectively neutral and deleterious varies between species, and the DFE differs between coding and non-coding DNA. Despite these differences between species and genomic regions, some general principles have emerged: advantageous mutations are rare, and those that are strongly selected are exponentially distributed; and the DFE of deleterious mutations is complex and multi-modal.
Affiliation(s)
- Adam Eyre-Walker
- Centre for the Study of Evolution, University of Sussex, Brighton, BN1 9QG, UK.
|
10
|
Halligan DL, Keightley PD. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Res 2006; 16:875-84. [PMID: 16751341 PMCID: PMC1484454 DOI: 10.1101/gr.5022906] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.1] [Indexed: 01/09/2023]
Abstract
Non-coding DNA comprises approximately 80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
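The notion of selective constraint used in this abstract (the fraction of mutations eliminated by selection, standardized to the substitution rate in putatively unconstrained sequence) can be written as one minus the ratio of observed to expected substitutions. The Python sketch below is a minimal illustration under that reading only. The function name, parameters and numbers are hypothetical, and it omits the corrections for base composition and multiple hits that a real analysis would need.

```python
def selective_constraint(observed_substitutions, sites, neutral_rate):
    """Estimate constraint as 1 - observed / expected substitutions.

    observed_substitutions: substitutions counted in the test sequence.
    sites:                  number of aligned sites in the test sequence.
    neutral_rate:           substitutions per site in putatively
                            unconstrained reference sequence.
    A value near 0 suggests no constraint; a value near 1 suggests that
    almost all mutations were removed by selection. Negative values can
    occur when the test sequence evolves faster than the reference.
    """
    expected = sites * neutral_rate
    if expected <= 0:
        raise ValueError("expected substitutions must be positive")
    return 1.0 - observed_substitutions / expected


if __name__ == "__main__":
    # Hypothetical numbers: 25 substitutions over 1000 sites, with a
    # reference rate of 0.1 substitutions per site, give about 0.75.
    print(selective_constraint(25, 1000, 0.1))
```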
Affiliation(s)
- Daniel L Halligan
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
11
|
Oda-Ishii I, Bertrand V, Matsuo I, Lemaire P, Saiga H. Making very similar embryos with divergent genomes: conservation of regulatory mechanisms of Otx between the ascidians Halocynthia roretzi and Ciona intestinalis. Development 2005; 132:1663-74. [PMID: 15743880 DOI: 10.1242/dev.01707] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.6] [Indexed: 12/20/2022]
Abstract
Ascidian embryos develop with a fixed cell lineage into simple tadpoles. Their lineage is almost perfectly conserved, even between the evolutionarily distant species Halocynthia roretzi and Ciona intestinalis, which show no detectable sequence conservation in the non-coding regions of studied orthologous genes. To address how a common developmental program can be maintained without detectable cis-regulatory sequence conservation, we compared in both species the regulation of Otx, a gene with a shared complex expression pattern. We found that in Halocynthia, the regulatory logic is based on the use of very simple cell line-specific regulatory modules, the activities of which are conserved, in most cases, in the Ciona embryo. The activity of each of these enhancer modules relies on the conservation of a few repeated crucial binding sites for transcriptional activators, without obvious constraints on their precise number, order or orientation, or on the surrounding sequences. We propose that a combination of simplicity and degeneracy allows the conservation of the regulatory logic, despite drastic sequence divergence. The regulation of Otx in the anterior endoderm by Lhx and Fox factors may even be conserved with vertebrates.
Affiliation(s)
- Izumi Oda-Ishii
- Department of Biological Sciences, Graduate School of Science, Tokyo Metropolitan University, 1-1 Minamiohsawa, Hachiohji, Tokyo 192-0397, Japan
|
12
|
Petalcorin MIR, Joshua GW, Agapow PM, Dolphin CT. The fmo genes of Caenorhabditis elegans and C. briggsae: characterisation, gene expression and comparative genomic analysis. Gene 2004; 346:83-96. [PMID: 15716098 DOI: 10.1016/j.gene.2004.09.021] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 05/15/2004] [Revised: 08/18/2004] [Accepted: 09/28/2004] [Indexed: 10/26/2022]
Abstract
The flavin-containing monooxygenase (FMO) gene family is conserved and ancient with representatives present in almost all phyla so far examined. The genes encode FAD-, NADP- and O(2)-dependent enzymes that catalyse oxygenation of soft-nucleophilic heteroatom centres in a range of substrates. Although usually classified as xenobiotic-metabolising enzymes, examples of FMOs exist that have evolved to metabolise specific endogenous substrates as part of a discrete physiological process. The genome of Caenorhabditis elegans contains five predicted genes encoding putative homologs of mammalian FMOs, K08C7.2, K08C7.5, Y39A1A.19, F53F4.5 and H24K24.5, which we have named fmo and numbered fmo-1 to fmo-5, respectively. As a first step towards determining their functional role(s), we have experimentally characterised these C. elegans fmo genes including analysing reporter gene expression patterns and RNAi phenotypes. Two major gene expression patterns were observed, either intestinal or hypodermal, but no gross RNAi phenotypes were found possibly due to functional redundancy. The internal structures of fmo-2, fmo-3 and fmo-4 have been compared with orthologs identified in the related nematode C. briggsae. For each orthologous pair, a global comparison of the paired upstream intergenic regions was performed and a number of conserved noncoding sequences, which may represent potential cis-regulatory elements, identified. Phylogenetic analysis reveals that several of the fmo homologs are the result of gene duplication along the lineage leading to the nematodes.
Affiliation(s)
- Mark I R Petalcorin
- Section of Molecular Genetics, Pharmaceutical Science Research Division, Franklin-Wilkins Building, 150 Stamford Street, King's College London, London SE1 9NN, UK
|
13
|
Estes S, Phillips PC, Denver DR, Thomas WK, Lynch M. Mutation accumulation in populations of varying size: the distribution of mutational effects for fitness correlates in Caenorhabditis elegans. Genetics 2004; 166:1269-79. [PMID: 15082546 PMCID: PMC1470770 DOI: 10.1534/genetics.166.3.1269] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Indexed: 01/20/2023]
Abstract
The consequences of mutation for population-genetic and evolutionary processes depend on the rate and, especially, the frequency distribution of mutational effects on fitness. We sought to approximate the form of the distribution of mutational effects by conducting divergence experiments in which lines of a DNA repair-deficient strain of Caenorhabditis elegans, msh-2, were maintained at a range of population sizes. Assays of these lines conducted in parallel with the ancestral control suggest that the mutational variance is dominated by contributions from highly detrimental mutations. This was evidenced by the ability of all but the smallest population-size treatments to maintain relatively high levels of mean fitness even under the 100-fold increase in mutational pressure caused by knocking out the msh-2 gene. However, we show that the mean fitness decline experienced by larger populations is actually greater than expected on the basis of our estimates of mutational parameters, which could be consistent with the existence of a common class of mutations with small individual effects. Further, comparison of the total mutation rate estimated from direct sequencing of DNA to that detected from phenotypic analyses implies the existence of a large class of evolutionarily relevant mutations with no measurable effect on laboratory fitness.
Affiliation(s)
- Suzanne Estes
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon 97403, USA.
|
14
|
Natarajan L, Jackson BM, Szyleyko E, Eisenmann DM. Identification of evolutionarily conserved promoter elements and amino acids required for function of the C. elegans beta-catenin homolog BAR-1. Dev Biol 2004; 272:536-57. [PMID: 15282167 DOI: 10.1016/j.ydbio.2004.05.027] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 10/03/2003] [Revised: 04/14/2004] [Accepted: 05/02/2004] [Indexed: 10/26/2022]
Abstract
beta-catenins are conserved transcription factors regulated posttranslationally by Wnt signaling. bar-1 encodes a Caenorhabditis elegans beta-catenin acting in multiple Wnt-mediated processes, including cell fate specification by vulval precursor cells (VPCs) and migration of the Q(L) neuroblast progeny. We took two approaches to extend our knowledge of bar-1 function. First, we undertook a bar-1 promoter analysis using transcriptional GFP reporter fusions and found that bar-1 expression is regulated in specific cells at the transcriptional level. We identified promoter elements necessary for bar-1 expression in several cell types, including a 321-bp element sufficient for expression in ventral cord neurons (VCNs) and a 1.1-kb element sufficient for expression in the developing vulva and adult seam cells. Expression of bar-1 from the 321-bp element rescued the Uncoordinated (Unc) phenotype of bar-1 mutants, but not the vulval phenotype, suggesting that a Wnt pathway may act in ventral cord neurons to mediate proper locomotion. By comparison of the 1.1-kb element to homologous sequences from Caenorhabditis briggsae, we identified evolutionarily conserved sequences necessary for expression in vulval or seam cells. Second, we analyzed 24 mutations in bar-1 and identified several residues required for BAR-1 activity in C. elegans. By phylogenetic comparison, we found that most of these residues are conserved and may identify amino acids necessary for beta-catenin function in all species.
Affiliation(s)
- L Natarajan
- Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA
|
15
|
Kern AD, Begun DJ. Patterns of Polymorphism and Divergence from Noncoding Sequences of Drosophila melanogaster and D. simulans: Evidence for Nonequilibrium Processes. Mol Biol Evol 2004; 22:51-62. [PMID: 15456897 DOI: 10.1093/molbev/msh269] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023]
Abstract
Although D. melanogaster and D. simulans have been the central model system for molecular population genetics, few data are available for noncoding regions. Here, we present an analysis of population genetic data from intergenic regions and comparisons of these data to previously collected data from introns and exons. Polymorphisms and fixations were categorized as A/T to G/C or G/C to A/T changes and were polarized by inferring the ancestral state using both parsimony and maximum likelihood. Noncoding fixations in both D. melanogaster and D. simulans were consistent with equilibrium base-composition evolution. However, polarized noncoding polymorphisms revealed a different pattern. Although A/T to G/C and G/C to A/T polymorphisms in D. simulans were consistent with equilibrium, we observed a highly significant dearth of A/T to G/C polymorphisms in D. melanogaster introns but not in intergenic sequences. Such data could be explained by recent evolution of mutational biases associated with transcription or by lineage-specific selection on base composition. These data reveal the complexity of evolutionary processes acting even on noncoding DNA in Drosophila.
Affiliation(s)
- Andrew D Kern
- Center for Population Biology, University of California, Davis, USA.
|
16
|
Ohler U, Yekta S, Lim LP, Bartel DP, Burge CB. Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification. RNA 2004; 10:1309-22. [PMID: 15317971 PMCID: PMC1370619 DOI: 10.1261/rna.5206304] [Citation(s) in RCA: 117] [Impact Index Per Article: 5.9] [Received: 10/14/2003] [Revised: 06/17/2004] [Accepted: 01/13/2004] [Indexed: 05/19/2023]
Abstract
MicroRNAs are approximately 22-nucleotide (nt) RNAs processed from foldback segments of endogenous transcripts. Some are known to play important gene regulatory roles during animal and plant development by pairing to the messages of protein-coding genes to direct the post-transcriptional repression of these messages. Previously, we developed a computational method called MiRscan, which scores features related to the foldbacks, and used this algorithm to identify new miRNA genes in the nematode Caenorhabditis elegans. In the present study, to identify sequences that might be involved in processing or transcriptional regulation of miRNAs, we aligned sequences upstream and downstream of orthologous nematode miRNA foldbacks. These alignments showed a pronounced peak in sequence conservation about 200 bp upstream of the miRNA foldback and revealed a highly significant sequence motif, with consensus CTCCGCCC, that is present upstream of almost all independently transcribed nematode miRNA genes. Scoring the pattern of upstream/downstream conservation, the occurrence of this sequence motif, and orthology of host genes for intronic miRNA candidates yielded substantial improvements in the accuracy of MiRscan. Nine new C. elegans miRNA gene candidates were validated using a PCR-sequencing protocol. As previously seen for bacterial RNA genes, sequence features outside of the RNA secondary structure can therefore be very useful for the computational identification of eukaryotic noncoding RNA genes. The total number of confidently identified nematode miRNAs now approaches 100. The improved analysis supports our previous assertion that miRNA gene identification is nearing completion in C. elegans, with apparently no more than 20 miRNA genes now remaining to be identified.
Affiliation(s)
- Uwe Ohler
- Department of Biology, Massachusetts Institute of Technology, Cambridge 02142, USA
|
17
|
Abstract
Background: Computational gene prediction continues to be an important problem, especially for genomes with little experimental data. Results: I introduce the SNAP gene finder which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate. Conclusion: Since gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.
Affiliation(s)
- Ian Korf
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
|
18
|
Zagrobelny M, Jeffares DC, Arctander P. Differences in non-LTR retrotransposons within C. elegans and C. briggsae genomes. Gene 2004; 330:61-6. [PMID: 15087124 DOI: 10.1016/j.gene.2004.01.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 09/10/2003] [Revised: 09/15/2003] [Accepted: 01/08/2004] [Indexed: 11/21/2022]
Abstract
An exhaustive study of the Sam/Frodo family of non-LTR retrotransposons in the Caenorhabditis elegans and Caenorhabditis briggsae genomes demonstrated that C. briggsae contains 60 Sam/Frodo elements including a new subfamily designated Merry, while at least 1000 elements are present in C. elegans. In contrast to C. elegans, C. briggsae does not contain any other non-LTR retrotransposons. The Sam/Frodo/Merry sequences in C. briggsae are shorter and less complete than the Sam/Frodo sequences in C. elegans probably because they all lack a functional first open reading frame (ORF1) and because the genome only encodes one functional reverse transcriptase gene of a non-LTR retrotransposon. Evidence of purifying selection for a functional reverse transcriptase sequence in master/leader elements was found in both nematodes in spite of low copy numbers in C. briggsae. Sam elements in C. elegans are the most abundant Sam/Frodo/Merry family members. They contain the only functional ORF1 copies and, unlike Frodo and Merry members, have a higher GC content than the genomic regions in which they reside. This may indicate a higher transcription rate within this subfamily.
Affiliation(s)
- Mika Zagrobelny
- Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Copenhagen, Denmark.
|
19
|
Bigelow HR, Wenick AS, Wong A, Hobert O. CisOrtho: a program pipeline for genome-wide identification of transcription factor target genes using phylogenetic footprinting. BMC Bioinformatics 2004; 5:27. [PMID: 15113408 PMCID: PMC406492 DOI: 10.1186/1471-2105-5-27] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 01/21/2004] [Accepted: 03/12/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: All known genomes code for a large number of transcription factors. It is important to develop methods that will reveal how these transcription factors act on a genome-wide level, that is, through what target genes they exert their function. RESULTS: We describe here a program pipeline aimed at identifying transcription factor target genes in whole genomes. Starting from a consensus binding site, represented as a weight matrix, potential sites in a pre-filtered genome are identified and then further filtered by assessing conservation of the putative site in the genome of a related species, a process called phylogenetic footprinting. CisOrtho has been successfully used to identify targets for two homeodomain transcription factors in the genomes of the nematodes Caenorhabditis elegans and Caenorhabditis briggsae. CONCLUSIONS: CisOrtho will identify targets of other nematode transcription factors whose DNA binding specificity is known and can be easily adapted to search other genomes for transcription factor targets.
Affiliation(s)
- Henry R Bigelow
- Department of Biochemistry and Molecular Biophysics, Columbia University, College of Physicians and Surgeons, 701 West 168th Street, New York, NY 10032, USA
- Adam S Wenick
- Department of Biochemistry and Molecular Biophysics, Columbia University, College of Physicians and Surgeons, 701 West 168th Street, New York, NY 10032, USA
- Allan Wong
- Department of Biochemistry and Molecular Biophysics, Columbia University, College of Physicians and Surgeons, 701 West 168th Street, New York, NY 10032, USA
- Oliver Hobert
- Department of Biochemistry and Molecular Biophysics, Columbia University, College of Physicians and Surgeons, 701 West 168th Street, New York, NY 10032, USA
|
20
|
|
21
|
Abstract
Identifying genomic locations of transcription-factor binding sites, particularly in higher eukaryotic genomes, has been an enormous challenge. Various experimental and computational approaches have been used to detect these sites; methods involving computational comparisons of related genomes have been particularly successful.
Affiliation(s)
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|
22
|
Ruvinsky I, Ruvkun G. Functional tests of enhancer conservation between distantly related species. Development 2003; 130:5133-42. [PMID: 12944426 DOI: 10.1242/dev.00711] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.1] [Indexed: 11/20/2022]
Abstract
Expression patterns of orthologous genes are often conserved, even between distantly related organisms, suggesting that once established, developmental programs can be stably maintained over long periods of evolutionary time. Because many orthologous transcription factors are also functionally conserved, one possible model to account for homologous gene expression patterns is conservation of specific binding sites within cis-regulatory elements of orthologous genes. If this model is correct, a cis-regulatory element from one organism would be expected to function in a distantly related organism. To test this hypothesis, we fused the green fluorescent protein gene to neuronal and muscular enhancer elements from a variety of Drosophila melanogaster genes, and tested whether these would activate expression in the homologous cell types in Caenorhabditis elegans. Regulatory elements from several genes directed appropriate expression in homologous tissue types, suggesting conservation of regulatory sites. However, enhancers of most Drosophila genes tested were not properly recognized in C. elegans, implying that over this evolutionary distance enough changes occurred in cis-regulatory sequences and/or transcription factors to prevent proper recognition of heterospecific enhancers. Comparisons of enhancer elements of orthologous genes between C. elegans and C. briggsae revealed extensive conservation, as well as specific instances of functional divergence. Our results indicate that functional changes in cis-regulatory sequences accumulate on timescales much shorter than the divergence of arthropods and nematodes, and that mechanisms other than conservation of individual binding sites within enhancer elements are responsible for the conservation of expression patterns of homologous genes between distantly related species.
Affiliation(s)
- Ilya Ruvinsky
- Department of Molecular Biology, Massachusetts General Hospital and Department of Genetics, Harvard Medical School, Wellman 8, Boston, MA 02114, USA
|
23
|
Shabalina SA, Ogurtsov AY, Lipman DJ, Kondrashov AS. Patterns in interspecies similarity correlate with nucleotide composition in mammalian 3'UTRs. Nucleic Acids Res 2003; 31:5433-9. [PMID: 12954780 PMCID: PMC203331 DOI: 10.1093/nar/gkg751] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022] Open
Abstract
Post-transcriptional regulation and the formation of mRNA 3' ends are crucial for gene expression in eukaryotes. Interspecies conservation of many sequences within 3'UTRs reveals selective constraint due to similar function. To study the pattern of conservation within 3'UTRs, we compiled and aligned 50 sets of complete orthologous 3'UTRs from four orders of mammals. We observed a mosaic pattern of conservation, with alternating regions of high (phylogenetic footprints) and low similarity. Conservation in 3'UTRs correlates with their base composition and also with the synonymous substitution rate in corresponding coding regions. The non-uniform distribution of conservation is more pronounced for 3'UTRs with a moderate or low level of overall conservation, where invariant nucleotides are more numerous, and their runs of lengths 4-7 occur more frequently than if conservation were random. Many runs of invariant nucleotides are AU-rich or pyrimidine-rich. Some of these runs coincide with known functional cis-elements of eukaryotic mRNAs, such as the U-rich upstream element, polyadenylation signal and DICE regulatory signal. More divergent regions of multiple alignments of 3'UTRs are often more G- and/or C-rich. Our results provide evidence on the importance of moderately conserved regions in 3'UTRs and suggest that regulatory functions of 3'UTRs might utilize gene-specific information in these regions.
Affiliation(s)
- Svetlana A Shabalina
- National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Building 38A, Bethesda, MD 20894, USA.
|
24
|
Kirouac M, Sternberg PW. cis-Regulatory control of three cell fate-specific genes in vulval organogenesis of Caenorhabditis elegans and C. briggsae. Dev Biol 2003; 257:85-103. [PMID: 12710959 DOI: 10.1016/s0012-1606(03)00032-0] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.1] [Indexed: 11/18/2022]
Abstract
The great-grandprogeny of the Caenorhabditis elegans vulval precursor cells (VPCs) adopt one of the final vulA, B1, B2, C, D, E, and F cell fates in a precise spatial pattern. This pattern of vulval cell types is likely to depend on the cis-regulatory regions of the transcriptional targets of intercellular signals in vulval development. egl-17, zmp-1, and cdh-3 are expressed differentially in the developing vulva cells, providing a potential readout for different signaling pathways. To understand how such pathways interact to specify unique vulval cell types in a precise pattern, we have identified cis-regulatory regions sufficient to confer vulval cell type-specific regulation when fused in cis to the basal pes-10 promoter. We have identified the C. briggsae homologs of these three genes, with their corresponding control regions, and tested these regions in both C. elegans and C. briggsae. These regions of similarity in C. elegans and C. briggsae upstream of egl-17, zmp-1, and cdh-3 promote expression in vulval cells and the anchor cell (AC). By using the cis-regulatory analysis and phylogenetic footprinting, we have identified overrepresented sequences involved in conferring vulval and AC expression.
Affiliation(s)
- Martha Kirouac
- Howard Hughes Medical Institute and Division of Biology, mail code 156-29, California Institute of Technology, Pasadena, CA 91125, USA
|
25
|
Ureta-Vidal A, Ettwiller L, Birney E. Comparative genomics: genome-wide analysis in metazoan eukaryotes. Nat Rev Genet 2003; 4:251-62. [PMID: 12671656 DOI: 10.1038/nrg1043] [Citation(s) in RCA: 156] [Impact Index Per Article: 7.4] [Indexed: 11/10/2022]
Abstract
The increasing number of complete and nearly complete metazoan genome sequences provides a significant amount of material for large-scale comparative genomic analysis. Finding new effective methods to analyse such enormous datasets has been the object of intense research. Three main areas in comparative genomics have recently shown important developments: whole-genome alignment, gene prediction and regulatory-region prediction. Each of these areas improves the methods of deciphering long genomic sequences and uncovering what lies hidden in them.
Affiliation(s)
- Abel Ureta-Vidal
- EnsEMBL Project, Room A2-06, EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
26
|
Farrer T, Roller AB, Kent WJ, Zahler AM. Analysis of the role of Caenorhabditis elegans GC-AG introns in regulated splicing. Nucleic Acids Res 2002; 30:3360-7. [PMID: 12140320 PMCID: PMC137088 DOI: 10.1093/nar/gkf465] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open
Abstract
GC-AG introns represent 0.7% of total human pre-mRNA introns. To study the function of GC-AG introns in splicing regulation, 196 cDNA-confirmed GC-AG introns were identified in Caenorhabditis elegans. These represent 0.6% of the cDNA-confirmed intron data set for this organism. Eleven of these GC-AG introns are involved in alternative splicing. In a comparison of the genomic sequences of homologous genes between C. elegans and Caenorhabditis briggsae for 26 GC-AG introns, the C at the +2 position is conserved in only five of these introns. A system to experimentally test the function of GC-AG introns in alternative splicing was developed. Results from these experiments indicate that the conserved C at the +2 position of the tenth intron of the let-2 gene is essential for developmentally regulated alternative splicing. This C allows the splice donor to function as a very weak splice site that works in balance with an alternative GT splice donor. A weak GT splice donor can functionally replace the GC splice donor and allow for splicing regulation. These results indicate that while the majority of GC-AG introns appear to be constitutively spliced and have no evolutionary constraints to prevent them from being GT-AG introns, a subset of GC-AG introns is involved in alternative splicing and the C at the +2 position of these introns can have an important role in splicing regulation.
Affiliation(s)
- Tracy Farrer
- Department of MCD Biology and Center for Molecular Biology of RNA, Sinsheimer Laboratories, University of California, Santa Cruz, CA 95064, USA
|
27
|
Baillie DL. Genomes in motion. Genome Res 2002; 12:843. [PMID: 12045137 DOI: 10.1101/gr.293102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Affiliation(s)
- David L Baillie
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada.
|
28
|
Bergman CM, Pfeiffer BD, Rincón-Limas DE, Hoskins RA, Gnirke A, Mungall CJ, Wang AM, Kronmiller B, Pacleb J, Park S, Stapleton M, Wan K, George RA, de Jong PJ, Botas J, Rubin GM, Celniker SE. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome. Genome Biol 2002; 3:RESEARCH0086. [PMID: 12537575 PMCID: PMC151188 DOI: 10.1186/gb-2002-3-12-research0086] [Citation(s) in RCA: 103] [Impact Index Per Article: 4.7] [Received: 10/08/2002] [Revised: 11/25/2002] [Accepted: 12/05/2002] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. RESULTS: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences. CONCLUSIONS: Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.
Affiliation(s)
- Casey M Bergman
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- These authors contributed equally to this work
- Barret D Pfeiffer
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- These authors contributed equally to this work
- Diego E Rincón-Limas
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Current address: Departamento de Biologia Molecular, Universidad Autonoma de Tamaulipas-UAMRA, Reynosa, CP 88740, Mexico
- Roger A Hoskins
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Chris J Mungall
- Howard Hughes Medical Institute, Department of Molecular and Cellular Biology, University of California, Berkeley, CA 94720, USA
- Adrienne M Wang
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Current address: Department of Physiology, University of California, San Francisco, CA 94143, USA
- Brent Kronmiller
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Current address: Department of Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA
- Joanne Pacleb
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Soo Park
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Mark Stapleton
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Kenneth Wan
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Reed A George
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Pieter J de Jong
- Children's Hospital and Research Center at Oakland, Oakland, CA 94609, USA
- Juan Botas
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Gerald M Rubin
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
- Howard Hughes Medical Institute, Department of Molecular and Cellular Biology, University of California, Berkeley, CA 94720, USA
- Susan E Celniker
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
|