251
|
Zhang Y, Mei P, Lou R, Zhang MQ, Wu G, Qiang B, Zhang Z, Shen Y. Gene expression profiling in developing human hippocampus. J Neurosci Res 2002; 70:200-8. [PMID: 12271469 DOI: 10.1002/jnr.10322] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
Abstract
The gene expression profile of developing human hippocampus is of particular interest and importance to neurobiologists devoted to development of the human brain and related diseases. To gain further molecular insight into the developmental and functional characteristics, we analyzed the expression profile of active genes in developing human hippocampus. Expressed sequence tags (ESTs) were selected by sequencing randomly selected clones from an original 3'-directed cDNA library of 150-day human fetal hippocampus, and a digital expression profile of 946 known genes that could be divided into 16 categories was generated. We also used for comparison 14 other expression profiles of related human neural cells/tissues, including human adult hippocampus. To yield more confidence regarding differential expression, a method was applied to attach normalized expression data to genes with a low false-positive rate (<0.05). Finally, hierarchical cluster analysis was used to exhibit related gene expression patterns. Our results are in accordance with anatomical and physiological observations made during the developmental process of the human hippocampus. Furthermore, some novel findings appeared to be unique to our results. The abundant expression of genes for cell surface components and disease-related genes drew our attention. Twenty-four genes are significantly different from adult, and 13 genes might be developing hippocampus-specific candidate genes, including wnt2b and some Alzheimer's disease-related genes. Our results could provide useful information on the ontogeny, development, and function of cells in the human hippocampus at the molecular level and underscore the utility of large-scale, parallel gene expression analyses in the study of complex biological phenomena.
|
252
|
Zhang MQ. Extracting functional information from microarrays: a challenge for functional genomics. Proc Natl Acad Sci U S A 2002; 99:12509-11. [PMID: 12271149 PMCID: PMC130487 DOI: 10.1073/pnas.212532499] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
|
253
|
Abstract
The human genome sequence is the book of our life. Buried in this large volume are our genes, which are scattered as small DNA fragments throughout the genome and comprise a small percentage of the total text. Finding these indistinct 'needles' in a vast genomic 'haystack' can be extremely challenging. In response to this challenge, computational prediction approaches have proliferated in recent years that predict the location and structure of genes. Here, I discuss these approaches and explain why they have become essential for the analyses of newly sequenced genomes.
|
254
|
Xuan Z, McCombie WR, Zhang MQ. GFScan: a gene family search tool at genomic DNA level. Genome Res 2002; 12:1142-9. [PMID: 12097353 PMCID: PMC186623 DOI: 10.1101/gr.220102] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Received: 10/26/2001] [Accepted: 04/11/2002] [Indexed: 11/25/2022]
Abstract
We have developed GFScan (Gene Family Scan), a tool that identifies members of a gene family by searching genomic DNA sequences with genomic DNA motifs (or matrices) that are representative of the family. We have tested GFScan on four human gene families: the neurotransmitter-gated ion-channel (NGIC) family, the carbonic anhydrase (CA) family, the Dbl homology (DH) domain family, and the ETS-domain family. All known members of these families with motifs mapped to sequenced genomic DNA regions were found; some novel genomic locations were also found to match the motifs, which may indicate new members of these families. Compared with other methods, GFScan recognized all true positives with far fewer false positives. We also showed that motifs constructed from human genes could be used to search the mouse genome to identify orthologous family members. This program is available at http://www.cshl.org/mzhanglab/.
|
255
|
Zhang MQ. Predicting full-length transcripts. Trends Biotechnol 2002. [DOI: 10.1016/s0167-7799(02)01977-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
256
|
Banerjee N, Zhang MQ. Functional genomics as applied to mapping transcription regulatory networks. Curr Opin Microbiol 2002; 5:313-7. [PMID: 12057687 DOI: 10.1016/s1369-5274(02)00322-3] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.8] [Indexed: 12/18/2022]
Abstract
The sequencing of the human genome and the entire genomes of many model organisms has resulted in the identification of many genes. Many large-scale experiments for generating gene disruptions and analyzing the phenotypes are underway to ascertain gene function. A future challenge will be to determine interaction and regulation of all the genes of an organism. Recent advances in functional genomic technology have begun to shine light on such gene network problems at both transcriptomic and proteomic levels. Functional genomics will not only elucidate what the genes do, but will also help determine when, where and how they are expressed as an orchestrated system. In this review, we discuss the functional genomics approaches to extract knowledge about transcription regulatory mechanisms from combinations of sequence data, microarray data and ChIP data. We focus in particular on the budding yeast Saccharomyces cerevisiae.
|
257
|
Davuluri RV, Grosse I, Zhang MQ. Computational identification of promoters and first exons in the human genome. Nat Genet 2001; 29:412-7. [PMID: 11726928 DOI: 10.1038/ng780] [Citation(s) in RCA: 299] [Impact Index Per Article: 13.0] [Indexed: 11/09/2022]
Abstract
The identification of promoters and first exons has been one of the most difficult problems in gene-finding. We present a set of discriminant functions that can recognize structural and compositional features such as CpG islands, promoter regions and first splice-donor sites. We explain the implementation of the discriminant functions into a decision tree that constitutes a new program called FirstEF. By using different models to predict CpG-related and non-CpG-related first exons, we showed by cross-validation that the program could predict 86% of the first exons with 17% false positives. We also demonstrated the prediction accuracy of FirstEF at the genome level by applying it to the finished sequences of human chromosomes 21 and 22 as well as by comparing the predictions with the locations of the experimentally verified first exons. Finally, we present the analysis of the predicted first exons for all of the 24 chromosomes of the human genome.
|
258
|
Weinmann AS, Bartley SM, Zhang T, Zhang MQ, Farnham PJ. Use of chromatin immunoprecipitation to clone novel E2F target promoters. Mol Cell Biol 2001; 21:6820-32. [PMID: 11564866 PMCID: PMC99859 DOI: 10.1128/mcb.21.20.6820-6832.2001] [Citation(s) in RCA: 324] [Impact Index Per Article: 14.1] [Received: 05/09/2001] [Accepted: 07/05/2001] [Indexed: 01/14/2023]
Abstract
We have taken a new approach to the identification of E2F-regulated promoters. After modification of a chromatin immunoprecipitation assay, we cloned nine chromatin fragments which represent both strong and weak in vivo E2F binding sites. Further characterization of three of the cloned fragments revealed that they are bound in vivo not only by E2Fs but also by members of the retinoblastoma tumor suppressor protein family and by RNA polymerase II, suggesting that these fragments represent promoters regulated by E2F transcription complexes. In fact, database analysis indicates that all three fragments correspond to genomic DNA located just upstream of start sites for previously identified mRNAs. One clone, ChET 4, corresponds to the promoter region for beclin 1, a candidate tumor suppressor protein. We demonstrate that another of the clones, ChET 8, is strongly bound by E2F family members in vivo but does not contain a consensus E2F binding site. However, this fragment functions as a promoter whose activity can be repressed by E2F1. Finally, we demonstrate that the ChET 9 promoter contains a consensus E2F binding site, can be activated by E2F1, and drives expression of an mRNA that is upregulated in colon and liver tumors. Interestingly, the characterized ChET promoters do not display regulation patterns typical of known E2F target genes in a U937 cell differentiation system. In summary, we have provided evidence that chromatin immunoprecipitation can be used to identify E2F-regulated promoters which contain both consensus and nonconsensus binding sites and have shown that not all E2F-regulated promoters show identical expression profiles.
|
259
|
Ando S, Sarlis NJ, Krishnan J, Feng X, Refetoff S, Zhang MQ, Oldfield EH, Yen PM. Aberrant alternative splicing of thyroid hormone receptor in a TSH-secreting pituitary tumor is a mechanism for hormone resistance. Mol Endocrinol 2001; 15:1529-38. [PMID: 11518802 DOI: 10.1210/mend.15.9.0687] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.7] [Indexed: 01/24/2023]
Abstract
Patients with TSH-secreting pituitary tumors (TSHomas) have high serum TSH levels despite elevated thyroid hormone levels. The mechanism for this defect in the negative regulation of TSH secretion is not known. We performed RT-PCR to detect mutations in TRbeta from a surgically resected TSHoma. Analyses of the RT-PCR products revealed a 135-bp deletion within the sixth exon that encodes the ligand-binding domain of TRbeta2. This deletion was caused by alternative splicing of TRbeta2 mRNA, as near-consensus splice sequences were found at the junction site and no deletion or mutations were detected in the tumoral genomic DNA. This TRbeta variant (TRbeta2spl) lacked thyroid hormone binding and had impaired T3-dependent negative regulation of both TSHbeta and glycoprotein hormone alpha-subunit genes in cotransfection studies. Furthermore, TRbeta2spl showed dominant negative activity against the wild-type TRbeta2. These findings strongly suggest that aberrant alternative splicing of TRbeta2 mRNA generated an abnormal TR protein that accounted for the defective negative regulation of TSH in the TSHoma. This is the first example of aberrant alternative splicing of a nuclear hormone receptor causing hormonal dysregulation. This novel posttranscriptional mechanism for generating abnormal receptors may occur in other hormone-resistant states or tumors in which no receptor mutation is detected in genomic DNA.
|
260
|
Abstract
MOTIVATION: We present JTEF, a new program for finding 3' terminal exons in human DNA sequences. This program is based on quadratic discriminant analysis, a standard non-linear statistical pattern recognition method. The quadratic discriminant functions used for building the algorithm were trained on a set of 3' terminal exons of type 3tuexon (those containing the true STOP codon). RESULTS: We showed that the average predictive accuracy of JTEF is higher than that of the best currently available programs (GenScan and Genemark.hmm) on a test set of 65 human DNA sequences with 121 genes. In particular, JTEF performs well on larger genomic contigs containing multiple genes and significant amounts of intergenic DNA. It will become a valuable tool for genome annotation and gene functional studies. AVAILABILITY: JTEF is available free for academic users on request from ftp://cshl.org/pub/science/mzhanglab/JTEF and will be made available through the World Wide Web (http://argon.cshl.org/).
|
261
|
Kel AE, Kel-Margoulis OV, Farnham PJ, Bartley SM, Wingender E, Zhang MQ. Computer-assisted identification of cell cycle-related genes: new targets for E2F transcription factors. J Mol Biol 2001; 309:99-120. [PMID: 11491305 DOI: 10.1006/jmbi.2001.4650] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.8] [Indexed: 11/22/2022]
Abstract
The processes that take place during development and differentiation are directed through coordinated regulation of expression of a large number of genes. One such gene regulatory network provides cell cycle control in eukaryotic organisms. In this work, we have studied the structural features of the 5' regulatory regions of cell cycle-related genes. We developed a new method for identifying composite substructures (modules) in regulatory regions of genes consisting of a binding site for a key transcription factor and additional contextual motifs: potential targets for other transcription factors that may synergistically regulate gene transcription. Applying this method to cell cycle-related promoters, we created a program for context-specific identification of binding sites for transcription factors of the E2F family which are key regulators of the cell cycle. We found that E2F composite modules are found at a high frequency and in close proximity to the start of transcription in cell cycle-related promoters in comparison with other promoters. Using this information, we then searched for E2F sites in genomic sequences with the goal of identifying new genes which play important roles in controlling cell proliferation, differentiation and apoptosis. Using a chromatin immunoprecipitation assay, we then experimentally verified the binding of E2F in vivo to the promoters predicted by the computer-assisted methods. Our identification of new E2F target genes provides new insight into gene regulatory networks and provides a framework for continued analysis of the role of contextual promoter features in transcriptional regulation. The tools described are available at http://compel.bionet.nsc.ru/FunSite/SiteScan.html.
|
262
|
Takizawa T, Matsumoto J, Tohma T, Kanke T, Wada Y, Nagao M, Inagaki N, Nagai H, Zhang MQ, Timmerman H. VUF-K-8788, a periphery-selective histamine H1 antagonist with anti-pruritic activities. Jpn J Pharmacol 2001; 86:55-64. [PMID: 11430473 DOI: 10.1254/jjp.86.55] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Abstract
The pharmacological properties of 7-[3-[4-(2-quinolinylmethyl)-1-piperazinyl]-propoxy]-2,3-dihydro-4H-1,4-benzothiazin-3-one (VUF-K-8788) were investigated in vitro and in vivo. VUF-K-8788 inhibited [3H]-mepyramine from binding to the cell membrane of lung parenchyma (Ki value: 5.0 nM) and the histamine-induced contraction of isolated guinea pig ileum (pA2: 9.71) without affecting ileal contractions induced by acetylcholine, serotonin, KCl and BaCl2. The increase of vascular permeabilities induced by histamine and passive cutaneous anaphylaxis (PCA) in guinea pigs were inhibited by VUF-K-8788 in a dose-dependent fashion (ED50: 0.24 and 0.26 mg/kg, p.o., respectively). Moreover, the anti-histaminic effect of VUF-K-8788 was also observed in rats. In experiments on the effects on the central nervous system, VUF-K-8788 at 1 mg/kg, p.o. hardly antagonized the H1 receptor at all in the cerebral cortex of guinea pigs. VUF-K-8788 inhibited the PCA-induced scratching behavior completely without affecting thiopental-induced sleep in mice. These results suggested that VUF-K-8788 would be useful in the treatment of allergic disorders such as atopic dermatitis and eczema.
|
263
|
Wu Q, Zhang T, Cheng JF, Kim Y, Grimwood J, Schmutz J, Dickson M, Noonan JP, Zhang MQ, Myers RM, Maniatis T. Comparative DNA sequence analysis of mouse and human protocadherin gene clusters. Genome Res 2001; 11:389-404. [PMID: 11230163 PMCID: PMC311048 DOI: 10.1101/gr.167301] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.5] [Indexed: 11/24/2022]
Abstract
The genomic organization of the human protocadherin alpha, beta, and gamma gene clusters (designated Pcdh alpha [gene symbol PCDHA], Pcdh beta [PCDHB], and Pcdh gamma [PCDHG]) is remarkably similar to that of immunoglobulin and T-cell receptor genes. The extracellular and transmembrane domains of each protocadherin protein are encoded by an unusually large "variable" region exon, while the intracellular domains are encoded by three small "constant" region exons located downstream from a tandem array of variable region exons. Here we report the results of a comparative DNA sequence analysis of the orthologous human (750 kb) and mouse (900 kb) protocadherin gene clusters. The organization of Pcdh alpha and Pcdh gamma gene clusters in the two species is virtually identical, whereas the mouse Pcdh beta gene cluster is larger and contains more genes than the human Pcdh beta gene cluster. We identified conserved DNA sequences upstream of the variable region exons, and found that these sequences are more conserved between orthologs than between paralogs. Within this region, there is a highly conserved DNA sequence motif located at about the same position upstream of the translation start codon of each variable region exon. In addition, the variable region of each gene cluster contains a rich array of CpG islands, whose location corresponds to the position of each variable region exon. These observations are consistent with the proposal that the expression of each variable region exon is regulated by a distinct promoter, which is highly conserved between orthologous variable region exons in mouse and human.
|
264
|
Liu HX, Cartegni L, Zhang MQ, Krainer AR. A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes. Nat Genet 2001; 27:55-8. [PMID: 11137998 DOI: 10.1038/83762] [Citation(s) in RCA: 311] [Impact Index Per Article: 13.5] [Indexed: 11/08/2022]
Abstract
Point mutations can generate defective and sometimes harmful proteins. The nonsense-mediated mRNA decay (NMD) pathway minimizes the potential damage caused by nonsense mutations. In-frame nonsense codons located at a minimum distance upstream of the last exon-exon junction are recognized as premature termination codons (PTCs), targeting the mRNA for degradation. Some nonsense mutations cause skipping of one or more exons, presumably during pre-mRNA splicing in the nucleus; this phenomenon is termed nonsense-mediated altered splicing (NAS), and its underlying mechanism is unclear. By analyzing NAS in BRCA1, we show here that inappropriate exon skipping can be reproduced in vitro, and results from disruption of a splicing enhancer in the coding sequence. Enhancers can be disrupted by single nonsense, missense and translationally silent point mutations, without recognition of an open reading frame as such. These results argue against a nuclear reading-frame scanning mechanism for NAS. Coding-region single-nucleotide polymorphisms (cSNPs) within exonic splicing enhancers or silencers may affect the patterns or efficiency of mRNA splicing, which may in turn cause phenotypic variability and variable penetrance of mutations elsewhere in a gene.
|
265
|
Stamm S, Zhu J, Nakai K, Stoilov P, Stoss O, Zhang MQ. An alternative-exon database and its statistical analysis. DNA Cell Biol 2000; 19:739-56. [PMID: 11177572 DOI: 10.1089/104454900750058107] [Citation(s) in RCA: 129] [Impact Index Per Article: 5.4] [Indexed: 11/12/2022]
Abstract
We compiled a comprehensive database of alternative exons from the literature and analyzed them statistically. Most alternative exons are cassette exons and are expressed in more than two tissues. Of all exons whose expression was reported to be specific for a certain tissue, the majority were expressed in the brain. Whereas the length of constitutive exons follows a normal distribution, the distribution of alternative exons is skewed toward smaller ones. Furthermore, alternative-exon splice sites deviate more from the consensus: their 3' splice sites are characterized by a higher purine content in the polypyrimidine stretch, and their 5' splice sites deviate from the consensus sequence mostly at the +4 and +5 positions. Furthermore, for exons expressed in a single tissue, adenosine is more frequently used at the -3 position of the 3' splice site. In addition to the known AC-rich and purine-rich exonic sequence elements, sequence comparison using a Gibbs algorithm identified several motifs in exons surrounded by weak splice sites and in tissue-specific exons. Together, these data indicate a combinatorial effect of weak splice sites, atypical nucleotide usage at certain positions, and functional enhancers as an important contribution to alternative-exon regulation.
|
266
|
Abstract
A nonredundant database of 2312 full-length human 5'-untranslated regions (UTRs) was carefully prepared using state-of-the-art experimental and computational technologies. A comprehensive computational analysis of these data was conducted to characterize the 5' UTR features. Classification and regression tree (CART) analysis was used to classify the data into three distinct classes. Class I consists of mRNAs that are believed to be poorly translated, with long 5' UTRs filled with potential inhibitory features. Class II consists of terminal oligopyrimidine tract (TOP) mRNAs that are regulated in a growth-dependent manner, and class III consists of mRNAs with favorable 5' UTR features that may help efficient translation. The most accurate tree we found has 92.5% classification accuracy as estimated by cross validation. The classification model included the presence of TOP, a secondary structure, 5' UTR length, and the presence of upstream AUGs (uAUGs) as the most relevant variables. The present classification and characterization of the 5' UTRs provide valuable information for better understanding the translational regulation of human mRNAs. Furthermore, this database and classification can help researchers build better computational models for predicting the 5'-terminal exon and separating the 5' UTR from the coding region.
|
267
|
Zhang MQ. Discriminant analysis and its application in DNA sequence motif recognition. Brief Bioinform 2000; 1:331-42. [PMID: 11465051 DOI: 10.1093/bib/1.4.331] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
Identification of functional motifs in a DNA sequence is fundamentally a statistical pattern recognition problem. Discriminant analysis is widely used for solving such problems. This paper will review two basic parametric methods: LDA (linear discriminant analysis) and QDA (quadratic discriminant analysis). Their usage in recognition of splice sites and exons in the human genome will be demonstrated.
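As a rough illustration of the quadratic discriminant idea mentioned in this abstract, the sketch below classifies a single numeric feature (for example, a hypothetical splice-site score) by fitting one Gaussian per class with its own variance. The function names, the one-dimensional setting, and the toy values are assumptions made for illustration and are not taken from the paper; LDA would differ only in sharing one pooled variance across classes.

```python
import math

def fit_class(values):
    """Estimate the mean and (unbiased) variance of one class from training values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var

def qda_score(x, mean, var, prior):
    """Log of the Gaussian class density times the prior, dropping shared constants."""
    return -0.5 * math.log(var) - (x - mean) ** 2 / (2 * var) + math.log(prior)

def classify(x, models):
    """Return the label whose quadratic discriminant score is largest.

    models maps label -> (mean, var, prior).
    """
    return max(models, key=lambda label: qda_score(x, *models[label]))

# Hypothetical training scores for true sites and for background positions.
site_mean, site_var = fit_class([4.0, 5.0, 6.0])
bg_mean, bg_var = fit_class([-1.0, 0.0, 1.0, 2.0])
models = {
    "site": (site_mean, site_var, 0.5),
    "background": (bg_mean, bg_var, 0.5),
}
```

Because each class keeps its own variance, the decision boundary is quadratic in the feature; with equal variances the quadratic terms cancel and the rule reduces to a linear one.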
|
268
|
Zhu J, Zhang MQ. Cluster, function and promoter: analysis of yeast expression array. Pac Symp Biocomput 2000:479-90. [PMID: 10902195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
Gene clusters can be derived from expression profiles, function categorization and promoter regions. To obtain a thorough understanding of gene expression and regulation, the three aspects should be combined in an organic way. In this study, we explored possible ways to analyze large-scale gene expression data. Three approaches were used to analyze yeast temporal expression data: 1) start from clustering on the expression profiles followed by function categorization and promoter analysis, 2) start from function categorization followed by clustering on expression profiles and promoter analysis, and 3) start from clustering on the promoter region followed by clustering on expression profiles. For clustering analysis on the time-series data, we developed a largest-first algorithm, which provides a mechanism for quality control on clusters. For promoter analysis, we developed a core-extension algorithm.
|
269
|
Walczyński K, Guryn R, Zuiderveld OP, Zhang MQ, Timmerman H. Histamine H1 receptor ligands: part II. Synthesis and in vitro pharmacology of 2-[2-(phenylamino)thiazol-4-yl]ethanamine and 2-(2-benzhydrylthiazol-4-yl)ethanamine derivatives. Farmaco 2000; 55:569-74. [PMID: 11152236 DOI: 10.1016/s0014-827x(00)00087-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
Abstract
New 2-[2-(phenylamino)thiazol-4-yl]ethanamine and 2-(2-benzhydrylthiazol-4-yl)ethanamine derivatives were prepared and tested in vitro as H1 receptor antagonists. The compounds with 2-phenylamino substitution and meta-halide substituents on the phenyl ring showed weak H1-antagonistic activity (pA2: 4.62-5.04), and this activity was completely lost with a meta-methyl substituent (pA2 < 4). When the phenylamino group was replaced by the benzhydryl groups of classic antihistamines, the resulting compounds exhibited slightly improved H1-antagonistic activity (at the meta-position pA2: 6.38-6.15; at the para-position pA2: 6.04-5.87).
|
270
|
Abstract
Vertebrate genomic DNA is generally CpG depleted, possibly because methylation of cytosines at 80% of CpG dinucleotides results in their frequent mutation to thymine, and thus of CpG to TpG dinucleotides. There are, however, genomic regions of high G+C content (CpG islands), where the occurrence of CpGs is significantly higher, close to the expected frequency, whereas the methylation concentration is significantly lower than in the overall genome. CpG islands are longer than 200 bp, have a G+C content above 50%, and have a CpG frequency at least 0.6 of that statistically expected. Approximately 50% of mammalian gene promoters are associated with one or more CpG islands. Although biologists often intuitively use CpG islands for 5' gene identification, this has not been rigorously quantified. We have determined the features that discriminate the promoter-associated and non-associated CpG islands. This led to an effective algorithm for large-scale promoter mapping (with 2-kb resolution) with a rate of false-positive predictions of promoters much lower than previously obtained. Using this algorithm, we correctly discriminated approximately 85% of the CpG islands within an interval (-500 to +1500) around a transcriptional start site (TSS) from those that lie further away from TSSs. We also correctly mapped approximately 93% of the promoters containing CpG islands.
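The three CpG island criteria stated in this abstract (length over 200 bp, G+C content over 50%, observed CpG frequency at least 0.6 of the expected value) can be checked with a short sketch like the one below. The function names are hypothetical, and the expected CpG count is taken here as (number of C × number of G) / length, which is a common convention but an assumption about what "statistically expected" means in this paper. The sketch only tests the island definition; it does not reproduce the promoter-discrimination algorithm.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def cpg_obs_exp(seq):
    """Ratio of observed CpG dinucleotides to the number expected by chance.

    Expected count is assumed to be (#C * #G) / length.
    """
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c_count * g_count / len(seq)
    return observed / expected

def meets_cpg_island_criteria(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the length, G+C, and CpG observed/expected thresholds from the text."""
    return (
        len(seq) > min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) >= min_obs_exp
    )
```

In practice such a check would be applied to sliding windows along a genomic sequence; the thresholds are exposed as parameters so that stricter definitions can be tried.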
|
271
|
Abstract
Acetylcholinesterase (AChE) inhibitors are an important class of medicinal agents useful for the treatment of Alzheimer's disease, glaucoma and myasthenia gravis, and for the recovery of neuromuscular block in surgery. To rationalize the structural requirements of AChE inhibitors, we attempt to derive a coherent AChE-inhibitor recognition pattern based on literature data of molecular modelling and quantitative structure-activity relationship (QSAR) analyses. These data are summarised from nearly all therapeutically important chemical classes of reversible AChE inhibitors, e.g., derivatives of physostigmine, tacrine, donepezil and huperzine A. Interactions observed from X-ray crystallography between these inhibitors and AChE have also been incorporated and compared with modelling and QSAR results. It is concluded that hydrophobicity and the presence of an ionizable nitrogen are the prerequisites for the inhibitors to interact with AChE. However, the mode of interaction, i.e., the 3-dimensional (3D) positioning of the inhibitor in the active site of the enzyme, varies among different chemical classes. It is also recognised that water molecules play crucial roles in defining these different 3D positionings. The information on AChE-inhibitor interactions provided should be useful for future discovery of new chemical classes of AChE inhibitors, especially from de novo design and hybrid construction.
|
272
|
Liu HX, Chew SL, Cartegni L, Zhang MQ, Krainer AR. Exonic splicing enhancer motif recognized by human SC35 under splicing conditions. Mol Cell Biol 2000; 20:1063-71. [PMID: 10629063 PMCID: PMC85223 DOI: 10.1128/mcb.20.3.1063-1071.2000] [Citation(s) in RCA: 175] [Impact Index Per Article: 7.3] [Indexed: 11/20/2022]
Abstract
Exonic splicing enhancers (ESEs) are important cis elements required for exon inclusion. Using an in vitro functional selection and amplification procedure, we have identified a novel ESE motif recognized by the human SR protein SC35 under splicing conditions. The selected sequences are functional and specific: they promote splicing in nuclear extract or in S100 extract complemented by SC35 but not by SF2/ASF. They can also function in a different exonic context from the one used for the selection procedure. The selected sequences share one or two close matches to a short and highly degenerate octamer consensus, GRYYcSYR. A score matrix was generated from the selected sequences according to the nucleotide frequency at each position of their best match to the consensus motif. The SC35 score matrix, along with our previously reported SF2/ASF score matrix, was used to search the sequences of two well-characterized splicing substrates derived from the mouse immunoglobulin M (IgM) and human immunodeficiency virus tat genes. Multiple SC35 high-score motifs, but only two widely separated SF2/ASF motifs, were found in the IgM C4 exon, which can be spliced in S100 extract complemented by SC35. In contrast, multiple high-score motifs for both SF2/ASF and SC35 were found in a variant of the Tat T3 exon (lacking an SC35-specific silencer) whose splicing can be complemented by either SF2/ASF or SC35. The motif score matrix can help locate SC35-specific enhancers in natural exon sequences.
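The score-matrix step described in this abstract (per-position nucleotide frequencies from aligned selected sequences, then a search of a substrate sequence for high-scoring motifs) can be sketched as follows. This is a generic log-odds position matrix, not the paper's exact scoring scheme: the pseudocount, the uniform 0.25 background, the function names, and any example octamers passed in are all assumptions for illustration.

```python
import math

BASES = "ACGT"

def build_score_matrix(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds score for each base at each position of equal-length sites."""
    width = len(aligned_sites[0])
    matrix = []
    for pos in range(width):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + pseudocount * len(BASES)
        scores = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        matrix.append(scores)
    return matrix

def score_window(window, matrix):
    """Sum the per-position scores for one window of the same width as the matrix."""
    return sum(column[base] for base, column in zip(window, matrix))

def best_match(sequence, matrix):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(matrix)
    windows = [
        (score_window(sequence[i:i + width], matrix), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(windows)
```

A caller would build the matrix from aligned octamers and then scan an exon sequence, keeping windows whose score exceeds a chosen threshold; the threshold itself would have to be calibrated on real data.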
|
273
|
Machida M, Yamazaki S, Kunihiro S, Tanaka T, Kushida N, Jinnno K, Haikawa Y, Yamazaki J, Yamamoto S, Sekine M, Oguchi A, Nagai Y, Sakai M, Aoki K, Ogura K, Kudoh Y, Kikuchi H, Zhang MQ, Yanagida M. A 38 kb segment containing the cdc2 gene from the left arm of fission yeast chromosome II: sequence analysis and characterization of the genomic DNA and cDNAs encoded on the segment. Yeast 2000; 16:71-80. [PMID: 10620777 DOI: 10.1002/(sici)1097-0061(20000115)16:1<71::aid-yea505>3.0.co;2-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
A 38 kbp genomic segment on the c1750 cosmid clone containing the cdc2 gene, located in the left arm of chromosome II from Schizosaccharomyces pombe, was sequenced. The segment was found to have five previously known genes, pht1, cdc2, his3, act1 and mei4. Among 11 coding sequences (CDSs) predicted by the gene finding software INTRON.PLOT, four CDSs, pi007, pi010, pi014 and pi016, had considerable similarity to 40S ribosomal protein, glycosyltransferase, cdc2-related protein kinase and alpha-1,2-mannosyltransferase, respectively. An unusually large open reading frame (ORF), pi011, encoding 2233 amino acids, was also found; it had significant homology to alpha-amylase, granule-bound glycogen synthase and the Sz. pombe YS 1110 clone product at the N-terminal, middle and C-terminal regions, respectively. All 11 predicted CDSs were experimentally analysed by RACE PCR. The sequencing of the RACE products revealed that there were two small overlaps at the 3' untranslated regions (UTRs) between pi004 and pi005 (17 bp) and between pi007 and pi008 (2 bp). The distances between the 5' end of the 5'UTR and the putative translation initiation codon varied from 10 to 302 nucleotides (nt) among the nine CDSs successfully analysed by 5'-RACE. The expression level of each CDS on this clone was determined. Among the 16 genes on this clone, the previously determined genes, pht1, cdc2, his3 and act1, were found to be most highly expressed. Finally, cDNAs of all the newly identified genes were detected by RACE, proving the actual expression of these genes. The nucleotide sequence has been submitted to the EMBL database under Accession No. AB004534.
|
274
|
Walczyński K, Timmerman H, Zuiderveld OP, Zhang MQ, Glinka R. Histamine H1 receptor ligands. Part I. Novel thiazol-4-ylethanamine derivatives: synthesis and in vitro pharmacology. FARMACO (SOCIETA CHIMICA ITALIANA : 1989) 1999; 54:533-41. [PMID: 10510850 DOI: 10.1016/s0014-827x(99)00060-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
A series of 2-substituted thiazol-4-ylethanamines have been synthesized and tested for their histaminergic H1-receptor activities. The compounds with 2-phenyl substitution, regardless of the different physicochemical properties of the meta-substituents at the phenyl ring, showed weak H1-agonistic activity with pD2 values ranging from 4.35 to 5.36. When the phenyl group was replaced by a benzyl group, the resulting compounds all exhibited weak H1-antagonistic activity (pA2: 4.14-4.82).
|
275
|
Mayeda A, Badolato J, Kobayashi R, Zhang MQ, Gardiner EM, Krainer AR. Purification and characterization of human RNPS1: a general activator of pre-mRNA splicing. EMBO J 1999; 18:4560-70. [PMID: 10449421 PMCID: PMC1171530 DOI: 10.1093/emboj/18.16.4560] [Citation(s) in RCA: 120] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Biochemical purification of a pre-mRNA splicing activity from HeLa cells that stimulates distal alternative 3' splice sites in a concentration-dependent manner resulted in the identification of RNPS1, a novel general activator of pre-mRNA splicing. RNPS1 cDNAs, encoding a putative nucleic-acid-binding protein of unknown function, were previously identified in mouse and human. RNPS1 is conserved in metazoans and has an RNA-recognition motif preceded by an extensive serine-rich domain. Recombinant human RNPS1 expressed in baculovirus functionally synergizes with SR proteins and strongly activates splicing of both constitutively and alternatively spliced pre-mRNAs. We conclude that RNPS1 is not only a potential regulator of alternative splicing but may also play a more fundamental role as a general activator of pre-mRNA splicing.
|
276
|
Zhang MQ. Large-Scale Gene Expression Data Analysis: A New Challenge to Computational Biologists. Genome Res 1999. [DOI: 10.1101/gr.9.8.681] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
The use of high-density DNA arrays to monitor gene expression at a genome-wide scale constitutes a fundamental advance in biology. In particular, the expression pattern of all genes in Saccharomyces cerevisiae can be interrogated using microarray analysis where cDNAs are hybridized to an array of each of the ∼6000 genes in the yeast genome. In this survey I review three recent experiments related to transcriptional regulation and discuss the great challenge for computational biologists trying to extract functional information from such large-scale gene expression data.
|
277
|
Zhang MQ. Large-scale gene expression data analysis: a new challenge to computational biologists. Genome Res 1999; 9:681-8. [PMID: 10447504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/13/2023]
Abstract
The use of high-density DNA arrays to monitor gene expression at a genome-wide scale constitutes a fundamental advance in biology. In particular, the expression pattern of all genes in Saccharomyces cerevisiae can be interrogated using microarray analysis where cDNAs are hybridized to an array of each of the approximately 6000 genes in the yeast genome. In this survey I review three recent experiments related to transcriptional regulation and discuss the great challenge for computational biologists trying to extract functional information from such large-scale gene expression data.
|
278
|
Abstract
MOTIVATION: In order to facilitate a systematic study of the promoters and transcriptional regulatory cis-elements of the yeast Saccharomyces cerevisiae on a genomic scale, we have developed a comprehensive yeast-specific promoter database, SCPD. RESULTS: Currently SCPD contains 580 experimentally mapped transcription factor (TF) binding sites and 425 transcriptional start sites (TSS) as its primary data entries. It also contains relevant binding affinity and expression data where available. In addition to mechanisms for promoter information (including sequence) retrieval and a data submission form, SCPD also provides some simple but useful tools for promoter sequence analysis. AVAILABILITY: SCPD can be accessed from the URL http://cgsigma.cshl.org/jian. The database is continually updated.
|
279
|
Abstract
The use of high-density DNA arrays to monitor gene expression at a genome-wide scale constitutes a fundamental advance in biology. In particular, the expression pattern of all genes in Saccharomyces cerevisiae can be interrogated using microarray analysis where cDNAs are hybridized to an array of more than 6000 genes in the yeast genome. In an effort to build a comprehensive Yeast Promoter Database and to develop new computational methods for mapping upstream regulatory elements, we recently began an ongoing collaboration with experimental biologists on the analysis of large-scale expression data. It is well known that complex gene expression patterns result from dynamic interacting networks of genes in the genetic regulatory circuitry. The hierarchical and modular organization of regulatory DNA sequence elements is important information for our understanding of combinatorial control of gene expression. As a bioinformatics attempt in this new direction, we have done some computational exploration of various initial experimental data. We will use cell-cycle regulated gene expression as a specific example to demonstrate how one may extract promoter information computationally from such genome-wide screening. A full report of the experiments and of the complete analysis will be published elsewhere when all the experiments are finished later this year (Spellman, P.T., et al. 1998. Mol. Biol. Cell 9, 3273-3297).
|
280
|
Abstract
We present polyadq, a program for detection of human polyadenylation signals. To avoid training on possibly flawed data, the development of polyadq began with a de novo characterization of human mRNA 3' processing signals. This information was used in training two quadratic discriminant functions that polyadq uses to evaluate potential polyA signals. In our tests, polyadq predicts polyA signals with a correlation coefficient of 0.413 on whole genes and 0.512 in the last two exons of genes, substantially outperforming other published programs on the same data set. polyadq is also the only program that is able to consistently detect the ATTAAA variant of the polyA signal.
|
281
|
Ioshikhes I, Trifonov EN, Zhang MQ. Periodical distribution of transcription factor sites in promoter regions and connection with chromatin structure. Proc Natl Acad Sci U S A 1999; 96:2891-5. [PMID: 10077607 PMCID: PMC15865 DOI: 10.1073/pnas.96.6.2891] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Nucleosomes regulate transcriptional initiation when positioned in the promoter area. This may require the transcription factor (TF) sites to be correlated with the nucleosome positions and phased on the nucleosome surface. If this is the case, one would expect a periodical distribution of TF sites in the vicinity of promoters, with the nucleosomal period of 10.1-10.5 bp. We examined the distributions of putative binding sites of 323 different TFs along 1,057 sequences of the Eukaryotic Promoter Database (release 50) [Cavin Perier, R., Junier, T. & Bucher, P. (1998) Nucleic Acids Res. 26, 353-357] and of 218 TFs on 673 sequences of the Lead Exon Database of human promoter sequences. We obtained a statistically significant overrepresentation of TF sites distributed with the main period of 10.1-10.5 bp in the region -50 to +120 around the transcription start site and in a few locations nearby. Correlation of the positioning of the TF sites with the nucleosomes is further reinforced by sequence-directed mapping of the nucleosomes, a method previously developed.
|
282
|
Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998; 9:3273-97. [PMID: 9843569 PMCID: PMC25624 DOI: 10.1091/mbc.9.12.3273] [Citation(s) in RCA: 2726] [Impact Index Per Article: 104.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/1998] [Accepted: 10/15/1998] [Indexed: 12/13/2022] Open
Abstract
We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures synchronized by three independent methods: alpha factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant. Using periodicity and correlation algorithms, we identified 800 genes that meet an objective minimum criterion for cell cycle regulation. In separate experiments, designed to examine the effects of inducing either the G1 cyclin Cln3p or the B-type cyclin Clb2p, we found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins. Furthermore, we analyzed our set of cell cycle-regulated genes for known and new promoter elements and show that several known elements (or variations thereof) contain information predictive of cell cycle regulation. A full description and complete data sets are available at http://cellcycle-www.stanford.edu
|
283
|
Zhang MQ. A discrimination study of human core-promoters. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 1998:240-51. [PMID: 9697186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
A core-promoter, spanning approximately -60 bp upstream to +40 bp downstream of an RNA polymerase (RNAP) II transcription start site (TSS), binds to the preinitiation complex (PIC) and determines the position of the TSS. Using position-specific k-tuple feature variables, a quadratic discriminant analysis (QDA) method is shown to be very effective in identifying human core-promoters.
|
284
|
Zhang MQ. Identification of protein-coding regions in Arabidopsis thaliana genome based on quadratic discriminant analysis. PLANT MOLECULAR BIOLOGY 1998; 37:803-806. [PMID: 9678575 DOI: 10.1023/a:1006023912378] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
A new method (MZEF) for predicting internal coding exons in genomic DNA sequences has been developed. This method is based on a prediction algorithm that uses the quadratic discriminant function for multivariate statistical pattern recognition. With improved feature measures, an Arabidopsis thaliana-specific implementation of MZEF is completed and made available to the plant genome community.
|
285
|
Abstract
A special program developed by the authors, called Pombe, identifies protein coding regions in the Schizosaccharomyces pombe genome. Linear discriminant analysis was applied to predict 5'-terminal, internal, 3'-terminal exons (coding-exon) and introns. The accuracy of the prediction was tested by cross verifications. The sensitivity, specificity and correlation coefficient for the internal exon prediction were 98.5%, 99.9% and 98.3% respectively at the nucleotide level. Open reading frames were studied and used to predict intron-less genes: 99.0% of such genes were identified with correct stopping sites. The gene structure was determined by dynamic programming and the prediction achieved 97.0% correlation coefficient at the nucleotide level. The program is available at http://clio.cshl.org/genefinder.
|
286
|
Abstract
To facilitate gene finding and for the investigation of human molecular genetics on a genome scale, we present a comprehensive survey on various statistical features of human exons. We first show that human exons with flanking genomic DNA sequences can be classified into 12 mutually exclusive categories. This classification could serve as a standard for future studies so that direct comparisons of results can be made. A database for eight categories (related to human genes in which coding regions are split by introns) was built from GenBank release 87.0 and analyzed by a number of methods to characterize statistical features of these sequences that may serve as controls or regulatory signals for gene expression. The statistical information compiled includes profiles of signals for transcription, splicing and translation, various compositional statistics and size distributions. Further analyses reveal novel correlations and constraints among different splicing features across an internal exon that are consistent with the Exon Definition model. This information is fundamental for a quantitative view of human gene organization, and should be invaluable for individual scientists to design human molecular genetics experiments.
|
287
|
Zwaagstra ME, Timmerman H, van de Stolpe AC, de Kanter FJ, Tamura M, Wada Y, Zhang MQ. Synthesis and structure-activity relationships of carboxyflavones as structurally rigid CysLT1 (LTD4) receptor antagonists. J Med Chem 1998; 41:1428-38. [PMID: 9554876 DOI: 10.1021/jm970179x] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
The synthesis and CysLT1 receptor affinities of a new series of highly rigid 3'- and 4'-(2-quinolinylmethoxy)- or 3'- and 4'-[2-(2-quinolinyl)ethenyl]-substituted, 6-, 7-, or 8-carboxylated flavones are described. CysLT1 receptor affinities of the flavones (down to 11 nM) were determined by their ability to displace [3H]LTD4 from its receptor in guinea pig lung membranes. Structure-affinity relationship studies showed that the relative positions of the carboxylic acid and the quinoline moiety were critical for CysLT1 affinities. While the carboxyl is optimal in the 8 position but tolerated in the 6 position, only the 6- and not the 8-tetrazole has significant activity. The quinoline moiety may be connected to the flavone skeleton by an ethenyl or a methoxy linker, but the substitution position is important for high affinity, especially in the 6-carboxylated flavones. 4'-Substituted 6-carboxyflavones are essentially inactive, whereas the 3'-substituted analogues have submicromolar CysLT1 affinity. Replacement of the quinoline by other heteroaromates generally leads to decreased affinities, with the phenyl and naphthyl analogues displaying only little or no affinity, while the 7-chloroquinoline analogue is comparable in activity to the quinoline. Flavones having CysLT1 receptor affinities of 10-30 nM were selected for determination of their inhibitory effects on the LTD4-induced contraction of guinea pig ileum in vitro. The IC50 values ranged between 15 and 100 nM. Compound 5d (8-carboxy-6-chloro-3'-(2-quinolinylmethoxy)flavone, VUF 5087) was selected for further research because of its high potency in the functional assay. This series contains the most rigid CysLT1 receptor antagonists known to date, and they are useful in the development of a CysLT1 antagonist model, which is discussed in the companion paper.
|
288
|
Zwaagstra ME, Schoenmakers SH, Nederkoorn PH, Gelens E, Timmerman H, Zhang MQ. Development of a three-dimensional CysLT1 (LTD4) antagonist model with an incorporated amino acid residue from the receptor. J Med Chem 1998; 41:1439-45. [PMID: 9554877 DOI: 10.1021/jm970180w] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
This paper describes the molecular modeling of leukotriene CysLT1 (or LTD4) receptor antagonists. Several different structural classes of CysLT1 antagonists were superimposed onto the new and highly rigid CysLT1 antagonist 8-carboxy-3'-[2-(2-quinolinyl)ethenyl]flavone (1, VUF 5017) to generate a common pharmacophoric arrangement. On the basis of known structure-activity relationships of CysLT1 antagonists, the quinoline nitrogen (or a bioisosteric equivalent thereof) and an acidic function were taken as the matching points. In order to optimize the fitting of acidic moieties of all antagonists, an arginine residue from the receptor was proposed as the interaction site for the acidic moieties. Incorporation of this amino acid residue into the model revealed additional interactions between the guanidine group and the nitrogen atoms of quinoline-containing CysLT1 antagonists. In some cases, the arginine may even interact with pi-clouds of phenyl residues of CysLT1 antagonists. The alignment of Montelukast (MK-476) suggests the presence of an additional pocket in the binding site for CysLT1 antagonists. The derived model should be useful for a better understanding of the molecular recognition of the leukotriene CysLT1 receptor.
|
289
|
Zhang MQ. Identification of human gene core promoters in silico. Genome Res 1998; 8:319-26. [PMID: 9521935 PMCID: PMC310696 DOI: 10.1101/gr.8.3.319] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/1997] [Accepted: 02/12/1998] [Indexed: 02/06/2023]
Abstract
Identification of the 5'-end of human genes requires identification of functional promoter elements. In silico identification of those elements is difficult because of the hierarchical and modular nature of promoter architecture. To address this problem, I propose a new stepwise strategy based on initial localization of a functional promoter into a 1- to 2-kb (extended promoter) region from within a large genomic DNA sequence of 100 kb or larger and further localization of a transcriptional start site (TSS) into a 50- to 100-bp (core-promoter) region. Using position-dependent 5-tuple measures, a quadratic discriminant analysis (QDA) method has been implemented in a new program, CorePromoter. Our experiments indicate that when given a 1- to 2-kb extended promoter, CorePromoter will correctly localize the TSS to a 100-bp interval approximately 60% of the time. [Figure 3 can be found in its entirety as an online supplement at http://www.genome.org.]
|
290
|
Zwaagstra ME, Timmerman H, Tamura M, Tohma T, Wada Y, Onogi K, Zhang MQ. Synthesis and structure-activity relationships of carboxylated chalcones: a novel series of CysLT1 (LTD4) receptor antagonists. J Med Chem 1997; 40:1075-89. [PMID: 9089329 DOI: 10.1021/jm960628d] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
The synthesis and CysLT1 antagonistic activities of a new series of 2-, 3-, and 4-(2-quinolinylmethoxy)- and 3- and 4-[2-(2-quinolinyl)ethenyl]-substituted, 2'-, 3'-, 4'-, or 5'-carboxylated chalcones are described. Structure-activity relationship studies indicate a preference for the presence of a negatively charged (acidic) moiety, although in some cases nitrile or ester analogues also exhibit moderate activity. The quinoline moiety may be substituted at either the 3- or the 4-position. Replacement of this heterocycle by other aromatic groups results in compounds with comparable affinities [2-(7-chloroquinoline), 1-(1-methyl-2-benzimidazole), or 1-(2-benzothiazole)] or substantially lower activities [1-(1-ethoxyethyl)-2-benzimidazole, 2-naphthyl, or phenyl]. The quinoline and chalcone moieties may be connected by either an ethenyl or a methoxy spacer. The acidic moiety at the chalcone B ring may be attached to the 2'-, 3'-, 4'-, or 5'-position, for both the 3- and 4-substituted chalcones. There are no general patterns to specify which substitution positions gave the most potent compounds. The series contains several potent CysLT1 receptor antagonists, with K(D) values approaching the nanomolar range, as measured by the displacement of [3H]LTD4 from guinea pig lung membranes. Antagonism of LTD4-induced contraction of guinea pig ileum, the inhibition of antigen-induced contraction of guinea pig trachea in vitro, and the inhibition of LTD4-induced increase of vascular permeability in vivo are determined for chalcones with high CysLT1 receptor affinities (K(D) values below 0.1 microM). 2'-Hydroxy-4-(2-quinolinylmethoxy)-5'-(5-tetrazolyl)chalcone (14, VUF 4819) showed good activity in both in vitro and in vivo assays and has been selected for further evaluation.
|
291
|
Zhang MQ, Timmerman H. Leukotriene cysLT1 (LTD4) receptor antagonism of H1-antihistamines: an in vitro study. Inflamm Res 1997; 46 Suppl 1:S93-4. [PMID: 9098782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
|
292
|
Zhang MQ. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc Natl Acad Sci U S A 1997; 94:565-8. [PMID: 9012824 PMCID: PMC19553 DOI: 10.1073/pnas.94.2.565] [Citation(s) in RCA: 198] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/1996] [Accepted: 10/29/1996] [Indexed: 02/03/2023] Open
Abstract
A new method for predicting internal coding exons in genomic DNA sequences has been developed. This method is based on a prediction algorithm that uses the quadratic discriminant function for multivariate statistical pattern recognition. Substantial improvements have been made (with only 9 discriminant variables) when compared with existing methods: HEXON [Solovyev, V. V., Salamov, A. A. & Lawrence, C. B. (1994) Nucleic Acids Res. 22, 5156-5163] (based on linear discriminant analysis) and GRAIL2 [Uberbacher, E. C. & Mural, R. J. (1991) Proc. Natl. Acad. Sci. USA 88, 11261-11265] (based on neural networks). A computer program called MZEF is freely available to the genome community and allows users to adjust prior probability and to output alternative overlapping exons.
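To make the quadratic discriminant function concrete, here is a minimal two-feature sketch. It is not MZEF and does not use its 9 discriminant variables: the feature values, the class names, and the priors below are hypothetical. Each class gets its own mean and covariance, and a candidate is assigned to the class with the larger discriminant value; the `priors` argument mirrors the adjustable prior probability mentioned in the abstract.

```python
import math

def fit_class(samples):
    """Mean and unbiased 2x2 covariance (sxx, sxy, syy) of 2-feature samples."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def discriminant(point, mean, cov, prior):
    """log(prior) - 0.5*log|S| - 0.5*(squared Mahalanobis distance)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # d^T S^-1 d written out with the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha

def classify(point, models, priors):
    """Label whose quadratic discriminant is largest at this point."""
    return max(models, key=lambda lab: discriminant(point, *models[lab], priors[lab]))

# Hypothetical training data: (splice-signal score, coding score) pairs.
exon = [(2.0, 1.5), (2.5, 2.0), (3.0, 1.8), (2.2, 2.4)]
non_exon = [(0.0, 0.2), (0.5, -0.3), (-0.4, 0.1), (0.1, 0.6)]
models = {"exon": fit_class(exon), "non_exon": fit_class(non_exon)}
priors = {"exon": 0.5, "non_exon": 0.5}
print(classify((2.4, 1.9), models, priors))
```

Because each class keeps its own covariance, the decision boundary is a quadratic curve rather than a line, which is the difference from the linear discriminant analysis used by the methods it is compared against.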
|
293
|
Zwaagstra ME, Timmerman H, Abdoelgafoer RS, Zhang MQ. Synthesis of carboxylated flavonoids as new leads for LTD4 antagonists. Eur J Med Chem 1996. [DOI: 10.1016/s0223-5234(97)89849-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
294
|
Zhang MQ, Wada Y, Sato F, Timmerman H. (Piperidinylalkoxy)chromones: novel antihistamines with additional antagonistic activity against leukotriene D4. J Med Chem 1995; 38:2472-7. [PMID: 7608912 DOI: 10.1021/jm00013a023] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
A series of novel chromone derivatives, in which the chromone moiety is connected to a (diphenylmethylene)-, (diphenylmethyl)-, or (diphenylmethoxy)piperidine via an alkyloxy spacer, were synthesized as antiallergic and antiasthmatic agents. In addition to their potent antihistaminic activity, the compounds also inhibit contraction in guinea pig ileum induced by leukotriene D4. When analyzed by radioligand binding assays in guinea pig lung membranes, one of the compounds, 7-[[3-[4-(diphenylmethylene)piperidin-1-yl]propyl]oxy]-2-(5-tetrazolyl)-4-oxo-4H-1-benzopyran, showed dissociation constants (KD) of 5.62 nM and 2.34 microM for H1- and LTD4-receptors, respectively. In vivo at the dose of 10 mg/kg, the compound inhibited the histamine- and LTD4-induced increase of vascular permeability in guinea pigs by 95 and 30%, respectively. The inhibition of LTD4-induced increase in vascular permeability by the compound was increased to 56% when a dose of 50 mg/kg was employed. Similar to terfenadine, the compound does not readily occupy the brain H1-receptors when given intraperitoneally to mice, implying no sedating side effects.
|
295
|
Abstract
We propose a generating functional method--random path analysis (RPA)--that generalizes the classical dynamic programming (DP) method widely used in sequence alignments. For a given cost function, DP is a deterministic method that finds an optimal alignment by minimizing the total cost function for all possible alignments. By allowing uncertainty, RPA is a statistical method that weights fluctuating alignments by probabilities. Therefore, DP may be thought of as the deterministic limit of RPA when the fluctuations approach zero. DP is the method of choice if one is only interested in optimal alignment. But we argue that, when information beyond the optimal alignment is desired, RPA gives a natural extension of DP for biological applications. As an algebraic approach, RPA is computationally intensive for long sequences, but it can provide better parametric control for developing analytical or perturbational results and it is more informative and biologically relevant. The idea of RPA opens up new opportunities for simulational approaches and more importantly it suggests a novel hardware implementation that has the potential of improving the way a sequence alignment is done. Here we focus on deriving a mathematically rigorous solution to RPA both in its combinatorial form and in its graphical representation; this puts DP in logical perspective under a more general conceptual framework.
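The relation between a deterministic optimum and a probability-weighted sum over alignments can be illustrated with a small sketch. This is an analogy under stated assumptions, not the RPA formalism of the paper: the `temperature` parameter, the Boltzmann-style weights, and the unit mismatch and gap costs are all hypothetical choices. The same grid recursion is run twice, once taking the minimum over predecessor cells (classical DP) and once taking a "soft minimum" that sums the weights of all alignment paths. As the temperature approaches zero, the soft value approaches the DP optimum.

```python
import math

def align(a, b, combine, mismatch=1.0, gap=1.0):
    """Fill the alignment grid; `combine` merges the three predecessor costs."""
    rows, cols = len(a) + 1, len(b) + 1
    grid = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        grid[i][0] = i * gap          # only one path along the left edge
    for j in range(1, cols):
        grid[0][j] = j * gap          # only one path along the top edge
    for i in range(1, rows):
        for j in range(1, cols):
            sub = 0.0 if a[i - 1] == b[j - 1] else mismatch
            grid[i][j] = combine([
                grid[i - 1][j - 1] + sub,   # match or mismatch
                grid[i - 1][j] + gap,       # gap in b
                grid[i][j - 1] + gap,       # gap in a
            ])
    return grid[-1][-1]

def soft_min(temperature):
    """-T * log(sum(exp(-v/T))), computed stably; tends to min(v) as T -> 0."""
    def combine(values):
        low = min(values)
        total = sum(math.exp(-(v - low) / temperature) for v in values)
        return low - temperature * math.log(total)
    return combine

optimal = align("ACGT", "AGT", min)              # deterministic DP cost
warm = align("ACGT", "AGT", soft_min(1.0))       # many paths contribute
cold = align("ACGT", "AGT", soft_min(0.01))      # nearly deterministic
print(optimal, warm, cold)
```

The soft value is always at or below the optimal cost because suboptimal alignments add weight to the sum; how far below indicates how much weight lies outside the single best alignment.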
|
296
|
Zhang MQ, Walczynski K, Timmerman H. A steric approach for the design of antihistamines with low muscarinic receptor antagonism. Inflamm Res 1995; 44 Suppl 1:S90-1. [PMID: 8521019 DOI: 10.1007/bf01674411] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023] Open
|
297
|
Zhang MQ, Caldirola P, Timmerman H. Chiral manipulation of drug selectivity: studies on a series of terfenadine-derived dual antagonists on H1-receptors and calcium channels. AGENTS AND ACTIONS 1994; 41 Spec No:C140-2. [PMID: 7976802 DOI: 10.1007/bf02007802] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
A series of terfenadine derivatives were evaluated for enantioselectivity on histamine H1-receptors and calcium channels. Whereas H1-receptors are only sterically discriminative against the benzhydryl part of the molecules, calcium channels showed enantioselectivity to either the phenylbutyl part or the benzhydryl part provided that an appropriate lipophilicity is preserved at the chiral site. It is speculated that the hydrophilicity of the butanol moiety is responsible for the lack of stereoselectivity of terfenadine enantiomers since it drives the side chain out of the stereoselective site of calcium channels, which are lipophilic. In four different test systems (guinea-pig ileum, guinea-pig lung membranes, rat aorta and rat cortex membranes), this series of compounds generally showed about 10 times higher activity on H1-receptors than on calcium channels. By introducing a chiral center in the different parts of the molecule we were able to increase the selectivity of an enantiomer VUF4648 to calcium channels.
|
298
|
Stamm S, Zhang MQ, Marr TG, Helfman DM. A sequence compilation and comparison of exons that are alternatively spliced in neurons. Nucleic Acids Res 1994; 22:1515-26. [PMID: 8202349 PMCID: PMC308024 DOI: 10.1093/nar/22.9.1515] [Citation(s) in RCA: 82] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
Alternative splicing is an important regulatory mechanism to create protein diversity. In order to elucidate possible regulatory elements common to neuron-specific exons, we created and statistically analysed a database of exons that are alternatively spliced in neurons. The splice site comparison of alternatively and constitutively spliced exons reveals that some, but not all, alternatively spliced exons have splice sites deviating from the consensus sequence, implying diverse patterns of regulation. The deviation from the consensus is most evident at the -3 position of the 3' splice site and the +4 and -3 positions of the 5' splice site. The nucleotide composition of alternatively and constitutively spliced exons is different, with alternatively spliced exons being more AU rich. We performed overlapping k-tuple analysis to identify common motifs. We found that alternatively and constitutively spliced exons differ in the frequency of several trinucleotides that cannot be explained by the amino acid composition and may be important for splicing regulation.
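Overlapping k-tuple counting of the kind mentioned in this abstract can be sketched as follows. The two short RNA strings are hypothetical placeholders, not sequences from the database, and the log-ratio comparison is an assumed way of contrasting the two sets; the abstract does not state which statistic was used, and the correction for amino acid composition is not attempted here.

```python
import math
from collections import Counter

def ktuple_frequencies(sequences, k=3):
    """Relative frequency of every overlapping k-tuple across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):   # step of 1 gives overlapping windows
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def log_ratio(freq_a, freq_b):
    """log2(freq_a / freq_b) for k-tuples observed in both sets."""
    shared = freq_a.keys() & freq_b.keys()
    return {kmer: math.log2(freq_a[kmer] / freq_b[kmer]) for kmer in shared}

# Hypothetical exon sets (assumption, for illustration only).
alternative = ["AUAUAUUAGC", "UUAUAUAGCA"]
constitutive = ["GCGCAUAGCC", "CCGAUAGCGC"]
ratios = log_ratio(ktuple_frequencies(alternative), ktuple_frequencies(constitutive))
for kmer in sorted(ratios, key=ratios.get, reverse=True):
    print(kmer, round(ratios[kmer], 2))
```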
|
299
|
Abstract
A database of 210 Schizosaccharomyces pombe DNA sequences (524,794 bp) was extracted from GenBank (release number 81.0) and examined by a number of methods in order to characterize statistical features of these sequences that might serve as signals or constraints for messenger RNA splicing. The statistical information compiled includes splicing signal (donor, acceptor and branch site) profiles, translational initiation start profile, exon/intron length distributions, ORF distribution, CDS size distribution, codon usage table, and 6-tuple distribution. The information content of the various signals is also presented. A rule-based interactive computer program for finding introns called INTRON.PLOT has been developed and was used to successfully analyze 7 newly sequenced genes.
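The information content of a signal profile can be computed per position as the maximum entropy (2 bits for DNA) minus the observed Shannon entropy of that column. The sketch below assumes this standard definition without any small-sample correction, which may differ from the calculation in the paper, and the six-base donor sites are hypothetical.

```python
import math
from collections import Counter

def information_content(sites, alphabet_size=4):
    """Bits of information at each position of equal-length aligned sites."""
    max_bits = math.log2(alphabet_size)
    profile = []
    for column in zip(*sites):              # one column per aligned position
        n = len(column)
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        profile.append(max_bits - entropy)
    return profile

# Hypothetical donor-site alignment (assumption): invariant GT, variable tail.
donor_sites = ["GTAAGT", "GTATGT", "GTAAGA", "GTGAGT"]
print([round(bits, 2) for bits in information_content(donor_sites)])
```

Invariant positions reach the full 2 bits and positions with uniformly mixed bases fall to 0, so the profile highlights which positions of a signal are constrained.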
|
300
|
Zhang MQ, Steinbusch HWM, Kooij PJ, Timmerman H. Synthesis and preliminary evaluation of hydroquinone-substituted histidine derivatives as putative histaminergic neurotoxins. Eur J Med Chem 1994. [DOI: 10.1016/0223-5234(94)90142-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|