26
|
Cheng J, Tegge A, Baldi P. Machine Learning Methods for Protein Structure Prediction. IEEE Rev Biomed Eng 2008; 1:41-9. [DOI: 10.1109/rbme.2008.2008239] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.9] [Indexed: 11/08/2022]
|
27
|
|
28
|
Abstract
The metabolism of solid tumors is associated with high lactate production while growing in oxygen (aerobic glycolysis), suggesting that tumors may have defects in mitochondrial function. The mitochondria produce cellular energy by oxidative phosphorylation (OXPHOS), generate reactive oxygen species (ROS) as a by-product, and regulate apoptosis via the mitochondrial permeability transition pore (mtPTP). The mitochondria are assembled from both nuclear DNA (nDNA) and mitochondrial DNA (mtDNA) genes. The mtDNA codes for 37 genes essential for OXPHOS, is present in thousands of copies per cell, and has a very high mutation rate. In humans, severe mtDNA mutations result in multisystem disease, while some functional population-specific polymorphisms appear to have permitted humans to adapt to new environments. Mutations in the nDNA-encoded mitochondrial genes for fumarate hydratase and succinate dehydrogenase have been linked to uterine leiomyomas and paragangliomas, and cancer cells have been shown to induce hexokinase II, which harnesses OXPHOS adenosine triphosphate (ATP) production to drive glycolysis. Germline mtDNA mutations at nucleotides 10398 and 16189 have been associated with breast cancer and endometrial cancer. Tumor mtDNA somatic mutations range from severe insertion-deletion and chain termination mutations to mild missense mutations. Surprisingly, of the 190 tumor-specific somatic mtDNA mutations reported, 72% are also mtDNA sequence variants found in the general population. These include 52% of the tumor somatic mRNA missense mutations, 83% of the tRNA mutations, 38% of the rRNA mutations, and 85% of the control region mutations. Some associations might reflect mtDNA sequencing errors, but analysis of several of the tumor-specific somatic missense mutations with population counterparts indicates that they are legitimate.
Therefore, mtDNA mutations in tumors may fall into two main classes: (1) severe mutations that inhibit OXPHOS, increase ROS production and promote tumor cell proliferation and (2) milder mutations that may permit tumors to adapt to new environments. The former may be lost during subsequent tumor oxygenation while the latter may become fixed. Hence, mitochondrial dysfunction does appear to be a factor in cancer etiology, an insight that may suggest new approaches for diagnosis and treatment.
|
29
|
Tanzilli S, Tittel W, Halder M, Alibart O, Baldi P, Gisin N, Zbinden H. A photonic quantum information interface. Nature 2005; 437:116-20. [PMID: 16136138 DOI: 10.1038/nature04009] [Citation(s) in RCA: 302] [Impact Index Per Article: 15.9] [Received: 05/27/2005] [Accepted: 07/07/2005] [Indexed: 11/08/2022]
Abstract
Quantum communication requires the transfer of quantum states, or quantum bits of information (qubits), from one place to another. From a fundamental perspective, this allows the distribution of entanglement and the demonstration of quantum non-locality over significant distances. Within the context of applications, quantum cryptography offers a provably secure way to establish a confidential key between distant partners. Photons represent the natural flying qubit carriers for quantum communication, and the presence of telecommunications optical fibres makes the wavelengths of 1,310 nm and 1,550 nm particularly suitable for distribution over long distances. However, qubits encoded into alkaline atoms that absorb and emit at wavelengths around 800 nm have been considered for the storage and processing of quantum information. Hence, future quantum information networks made of telecommunications channels and alkaline memories will require interfaces that enable qubit transfers between these useful wavelengths, while preserving quantum coherence and entanglement. Here we report a demonstration of qubit transfer between photons of wavelength 1,310 nm and 710 nm. The mechanism is a nonlinear up-conversion process, with a success probability of greater than 5 per cent. In the event of a successful qubit transfer, we observe strong two-photon interference between the 710 nm photon and a third photon at 1,550 nm, initially entangled with the 1,310 nm photon, although they never directly interacted. The corresponding fidelity is higher than 98 per cent.
|
30
|
|
31
|
Cheng J, Randall AZ, Sweredoski MJ, Baldi P. SCRATCH: a protein structure and structural feature prediction server. Nucleic Acids Res 2005; 33:W72-6. [PMID: 15980571 PMCID: PMC1160157 DOI: 10.1093/nar/gki396] [Citation(s) in RCA: 697] [Impact Index Per Article: 36.7] [Indexed: 11/12/2022]
Abstract
SCRATCH is a server for predicting protein tertiary structure and structural features. The SCRATCH software suite includes predictors for secondary structure, relative solvent accessibility, disordered regions, domains, disulfide bridges, single mutation stability, residue contacts versus average, individual residue contacts and tertiary structure. The user simply provides an amino acid sequence and selects the desired predictions, then submits to the server. Results are emailed to the user. The server is available at .
|
32
|
Alibart O, Ostrowsky DB, Baldi P, Tanzilli S. High-performance guided-wave asynchronous heralded single-photon source. OPTICS LETTERS 2005; 30:1539-41. [PMID: 16007800 DOI: 10.1364/ol.30.001539] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/03/2023]
Abstract
We report on a guided-wave asynchronous heralded single-photon source based on the creation of nondegenerate photon pairs by spontaneous parametric downconversion in a periodically poled lithium niobate waveguide. We show that, by use of the signal photon at 1310 nm as a trigger, a gated detection process permits announcement of the arrival of single photons at 1550 nm at the output of a single-mode optical fiber with a high probability of 0.37. At the same time the multiphoton emission probability is reduced by a factor of 10 compared with Poissonian light sources. Furthermore, the model we have developed to calculate those figures of merit is shown to be accurate. This study can therefore serve as a paradigm for the conception of new quantum communication and computation networks.
|
33
|
Baldi P, Campari EG, Casula G, Focardi S, Levi G, Palmonari F. Gravitational constant G measured with a superconducting gravimeter. Phys Rev D 2005; 71:022002. [DOI: 10.1103/physrevd.71.022002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
|
34
|
Hampson SE, Gaut BS, Baldi P. Statistical detection of chromosomal homology using shared-gene density alone. Bioinformatics 2004; 21:1339-48. [PMID: 15585535 DOI: 10.1093/bioinformatics/bti168] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
Abstract
MOTIVATION Over evolutionary time, various processes including point mutations and insertions, deletions and inversions of variable sized segments progressively degrade the homology of duplicated chromosomal regions making identification of the homologous regions correspondingly difficult. Existing algorithms that attempt to detect homology are based on shared-gene density and colinearity and possibly also strand information. RESULTS Here, we develop a new algorithm for the statistical detection of chromosomal homology, CloseUp, which uses shared-gene density alone to fully exploit the observation that relaxing colinearity requirements in general is beneficial for homology detection and at the same time optimizes computation time. CloseUp has two components: the identification of candidate homologous regions followed by their statistical evaluation using Monte Carlo methods and data randomization. Using both artificial and real data, we compared CloseUp with two existing programs (ADHoRe and LineUp) for chromosomal homology detection and found that in general CloseUp compares favorably. AVAILABILITY CloseUp and supplementary information are available at http://www.igb.uci.edu/servers/cgss.html CONTACT pfbaldi@ics.uci.edu.
|
35
|
Pollastri G, Baldi P. Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics 2004; 18 Suppl 1:S62-70. [PMID: 12169532 DOI: 10.1093/bioinformatics/18.suppl_1.s62] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022]
Abstract
MOTIVATION Accurate prediction of protein contact maps is an important step in computational structural proteomics. Because contact maps provide a translation and rotation invariant topological representation of a protein, they can be used as a fundamental intermediary step in protein structure prediction. RESULTS We develop a new set of flexible machine learning architectures for the prediction of contact maps, as well as other information processing and pattern recognition tasks. The architectures can be viewed as recurrent neural network implementations of a class of Bayesian networks we call generalized input-output HMMs (GIOHMMs). For the specific case of contact maps, contextual information is propagated laterally through four hidden planes, one for each cardinal corner. We show that these architectures can be trained from examples and yield contact map predictors that outperform previously reported methods. While several extensions and improvements are in progress, the current version can accurately predict 60.5% of contacts at a distance cutoff of 8 Å and 45% of distant contacts at 10 Å, for proteins of length up to 300.
|
36
|
Baldi P, Patocchi A, Zini E, Toller C, Velasco R, Komjanc M. Cloning and linkage mapping of resistance gene homologues in apple. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2004; 109:231-239. [PMID: 15052401 DOI: 10.1007/s00122-004-1624-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 10/26/2003] [Accepted: 01/29/2004] [Indexed: 05/24/2023]
Abstract
Apple (Malus x domestica Borkh.) sequences sharing homology with known resistance genes were cloned using a PCR-based approach with degenerate oligonucleotide primers designed on conserved regions of the nucleotide-binding site (NBS). Sequence analysis of the amplified fragments indicated the presence of at least 27 families of NBS-containing genes in apple, each composed of several very similar or nearly identical sequences. The NBS-leucine-rich repeat homologues appeared to include members of the two major groups that have been described in dicot plants: one possessing a toll-interleukin receptor element and one lacking such a domain. Genetic mapping of the cloned sequences was achieved through the development of CAPS and SSCP markers using a segregating population of a cross between the two apple cultivars Fiesta and Discovery. Several of the apple resistance gene homologues mapped in the vicinity, or at least on the same linkage group, of known loci controlling resistance to various pathogens. The utility of resistance gene-homologue sequences as molecular markers for breeding purposes and for gene cloning is discussed.
|
37
|
Reali D, Deriu MG, Baldi P, Baggiani A, Pinto B. [Mycobacteria in swimming pool water and the meaning of microbiological conventional indicators]. ANNALI DI IGIENE : MEDICINA PREVENTIVA E DI COMUNITA 2004; 16:247-53. [PMID: 15554531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
A monitoring program of the hygienic quality of water in twelve public swimming pools was performed. The legally required microbiological indicators of safety with respect to gastrointestinal illness were measured, together with the prevalence of Pseudomonas spp. and Staphylococcus spp. and the frequency of recovery and number of nontuberculous mycobacteria. Samples were positive for coliforms at a lower rate (29.3%) than for Pseudomonas (75.5%), Staphylococcus spp. (46%) and Mycobacteria (59.4%). We found a statistically significant correlation (r=0.67, p=0.0001) between Mycobacteria and Pseudomonas, so we think that the latter might be a good predictive marker. As 82% of samples had a free chlorine residual within the limits set by Italian law, the efficacy of chlorination in preventing the transmission of infectious diseases by routes other than the gastroenteric one is discussed. A revision of both the sanitary significance of conventional microbial parameters and the related regulations appears necessary.
|
38
|
Pollastri G, Baldi P, Fariselli P, Casadio R. Improved prediction of the number of residue contacts in proteins by recurrent neural networks. Bioinformatics 2002; 17 Suppl 1:S234-42. [PMID: 11473014 DOI: 10.1093/bioinformatics/17.suppl_1.s234] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
Knowing the number of residue contacts in a protein is crucial for deriving constraints useful in modeling protein folding, protein structure, and/or scoring remote homology searches. Here we use an ensemble of bi-directional recurrent neural network architectures and evolutionary information to improve the state-of-the-art in contact prediction using a large corpus of curated data. The ensemble is used to discriminate between two different states of residue contacts, characterized by a contact number higher or lower than the average value of the residue distribution. The ensemble achieves performances ranging from 70.1% to 73.1% depending on the radius adopted to discriminate contacts (6 Å to 12 Å). These performances represent gains of 15% to 20% over the baseline statistical predictors that always assign an amino acid to the most numerous state, and are 3% to 7% better than any previous method. Combination of different radius predictors further improves the performance. SERVER: http://promoter.ics.uci.edu/BRNN-PRED/.
|
39
|
Long AD, Mangalam HJ, Chan BY, Tolleri L, Hatfield GW, Baldi P. Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework. Analysis of global gene expression in Escherichia coli K12. J Biol Chem 2001; 276:19937-44. [PMID: 11259426 DOI: 10.1074/jbc.m010192200] [Citation(s) in RCA: 288] [Impact Index Per Article: 12.5] [Indexed: 01/14/2023]
Abstract
We describe statistical methods based on the t test that can be conveniently used on high density array data to test for statistically significant differences between treatments. These t tests employ either the observed variance among replicates within treatments or a Bayesian estimate of the variance among replicates within treatments based on a prior estimate obtained from a local estimate of the standard deviation. The Bayesian prior allows statistical inference to be made from microarray data even when experiments are only replicated at nominal levels. We apply these new statistical tests to a data set that examined differential gene expression patterns in IHF(+) and IHF(-) Escherichia coli cells (Arfin, S. M., Long, A. D., Ito, E. T., Tolleri, L., Riehle, M. M., Paegle, E. S., and Hatfield, G. W. (2000) J. Biol. Chem. 275, 29672-29684). These analyses identify a more biologically reasonable set of candidate genes than those identified using statistical tests not incorporating a Bayesian prior. We also show that statistical tests based on analysis of variance and a Bayesian prior identify genes that are up- or down-regulated following an experimental manipulation more reliably than approaches based only on a t test or fold change. All the described tests are implemented in a simple-to-use web interface called Cyber-T that is located on the University of California at Irvine genomics web site.
|
40
|
Baldi P, Long AD. A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001; 17:509-19. [PMID: 11395427 DOI: 10.1093/bioinformatics/17.6.509] [Citation(s) in RCA: 1262] [Impact Index Per Article: 54.9] [Indexed: 11/12/2022]
Abstract
MOTIVATION DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data. RESULTS We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
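The variance regularization described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the hyperparameter name `nu0`, and the exact weighting (a convex combination with denominator `nu0 + n - 1`) are assumptions made for illustration, and the published estimator may use a different normalization.

```python
from math import sqrt
from statistics import mean, variance


def regularized_variance(values, background_var, nu0):
    """Blend the empirical variance with a background variance.

    Assumed form (illustrative only): a weighted average in which the
    background variance counts as `nu0` pseudo-observations and the
    sample variance counts as its n - 1 degrees of freedom.
    """
    n = len(values)
    s2 = variance(values)  # unbiased sample variance (n - 1 denominator)
    return (nu0 * background_var + (n - 1) * s2) / (nu0 + n - 1)


def regularized_t(control, treatment, bg_control, bg_treatment, nu0):
    """t-like statistic computed with regularized variances."""
    v_c = regularized_variance(control, bg_control, nu0)
    v_t = regularized_variance(treatment, bg_treatment, nu0)
    se = sqrt(v_c / len(control) + v_t / len(treatment))
    return (mean(treatment) - mean(control)) / se
```

With `nu0 = 0` the estimate reduces to the ordinary sample variance; larger `nu0` pulls it toward the background value, which matters most when there are only two or three replicates per condition.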
|
41
|
Baisnée PF, Baldi P, Brunak S, Pedersen AG. Flexibility of the genetic code with respect to DNA structure. Bioinformatics 2001; 17:237-48. [PMID: 11294789 DOI: 10.1093/bioinformatics/17.3.237] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
MOTIVATION The primary function of DNA is to carry genetic information through the genetic code. DNA, however, contains a variety of other signals related, for instance, to reading frame, codon bias, pairwise codon bias, splice sites and transcription regulation, nucleosome positioning and DNA structure. Here we study the relationship between the genetic code and DNA structure and address two questions. First, to which degree does the degeneracy of the genetic code and the acceptable amino acid substitution patterns allow for the superimposition of DNA structural signals to protein coding sequences? Second, is the origin or evolution of the genetic code likely to have been constrained by DNA structure? RESULTS We develop an index for code flexibility with respect to DNA structure. Using five different di- or tri-nucleotide models of sequence-dependent DNA structure, we show that the standard genetic code provides a fair level of flexibility at the level of broad amino acid categories. Thus the code generally allows for the superimposition of any structural signal on any protein-coding sequence, through amino acid substitution. The flexibility observed at the level of single amino acids allows only for the superimposition of punctual and loosely positioned signals to conserved amino acid sequences. The degree of flexibility of the genetic code is low or average with respect to several classes of alternative codes. This result is consistent with the view that DNA structure is not likely to have played a significant role in the origin and evolution of the genetic code.
|
42
|
Baldi P, Pollastri G, Andersen CA, Brunak S. Matching protein beta-sheet partners by feedforward and recurrent neural networks. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 2001; 8:25-36. [PMID: 10977063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
Predicting the secondary structure (alpha-helices, beta-sheets, coils) of proteins is an important step towards understanding their three dimensional conformations. Unlike alpha-helices that are built up from one contiguous region of the polypeptide chain, beta-sheets are more complex, resulting from a combination of two or more disjoint regions. The exact nature of these long distance interactions remains unclear. Here we introduce two neural-network based methods for the prediction of amino acid partners in parallel as well as anti-parallel beta-sheets. The neural architectures predict whether two residues located at the center of two distant windows are paired or not in a beta-sheet structure. Variations on these architectures, including also profiles and ensembles, are trained and tested via five-fold cross validation using a large corpus of curated data. Prediction on both coupled and non-coupled residues currently approaches 84% accuracy, better than any previously reported method.
|
43
|
Hampson S, Baldi P, Kibler D, Sandmeyer SB. Analysis of yeast's ORF upstream regions by parallel processing, microarrays, and computational methods. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 2001; 8:190-201. [PMID: 10977080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
|
44
|
Solli P, Carbognani P, Cattelani L, Baldi P, Rusca M. Unusually located hydatid cysts miming a pulmonary tumor invaliding the spine. THE JOURNAL OF CARDIOVASCULAR SURGERY 2001; 42:147-9. [PMID: 11292925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023]
Abstract
Hydatid disease is a worldwide encountered zoonosis but at present very rare in Europe, liver and lungs being the most frequently involved sites. Bone involvement is very uncommon and the vertebral spine is the most common site of skeletal involvement (less than 1% overall). We report a case of vertebral hydatid disease with secondary pleuro-pulmonary involvement successfully treated by emergency spinal decompression followed by lung resection en bloc with chest wall and partial vertebrectomy.
|
45
|
Baldi P, Baisnée PF. Sequence analysis by additive scales: DNA structure for sequences and repeats of all lengths. Bioinformatics 2000; 16:865-89. [PMID: 11120677 DOI: 10.1093/bioinformatics/16.10.865] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
MOTIVATION DNA structure plays an important role in a variety of biological processes. Different di- and tri-nucleotide scales have been proposed to capture various aspects of DNA structure including base stacking energy, propeller twist angle, protein deformability, bendability, and position preference. Yet, a general framework for the computational analysis and prediction of DNA structure is still lacking. Such a framework should in particular address the following issues: (1) construction of sequences with extremal properties; (2) quantitative evaluation of sequences with respect to a given genomic background; (3) automatic extraction of extremal sequences and profiles from genomic databases; (4) distribution and asymptotic behavior as the length N of the sequences increases; and (5) complete analysis of correlations between scales. RESULTS We develop a general framework for sequence analysis based on additive scales, structural or other, that addresses all these issues. We show how to construct extremal sequences and calibrate scores for automatic genomic and database extraction. We show that distributions rapidly converge to normality as N increases. Pairwise correlations between scales depend both on background distribution and sequence length and rapidly converge to an analytically predictable asymptotic value. For di- and tri-nucleotide scales, normal behavior and asymptotic correlation values are attained over a characteristic window length of about 10-15 bp. With a uniform background distribution, pairwise correlations between empirically-derived scales remain relatively small and roughly constant at all lengths, except for propeller twist and protein deformability which are positively correlated. There is a positive (resp. negative) correlation between dinucleotide base stacking (resp. propeller twist and protein deformability) and AT-content that increases in magnitude with length. The framework is applied to the analysis of various DNA tandem repeats.
We derive exact expressions for counting the number of repeat unit classes at all lengths. Tandem repeats are likely to result from a variety of different mechanisms, a fraction of which is likely to depend on profiles characterized by extreme structural features.
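As a rough illustration of what an additive scale means here, the sketch below scores a sequence by summing a per-k-mer value over all overlapping windows. The function name and the toy dinucleotide values are hypothetical and are not taken from any published structural scale.

```python
def additive_score(seq, scale, k=2):
    """Sum the scale value of every overlapping k-mer in `seq`.

    `scale` maps each k-mer (e.g. a dinucleotide) to a number; the score
    of the whole sequence is the sum over its len(seq) - k + 1 windows.
    """
    return sum(scale[seq[i:i + k]] for i in range(len(seq) - k + 1))


# Toy scale with made-up values, for illustration only.
toy_scale = {"AA": 1.0, "AT": -1.0, "TA": 0.5, "TT": 0.0}
score = additive_score("AATA", toy_scale)  # AA + AT + TA
```

Because the score is a sum of many local terms, it is natural to expect approximately normal behavior for longer sequences, which is the kind of asymptotic result the abstract reports.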
|
46
|
Baldi P, Marè C, Terzi V, Galiba G, Cattivelli L. The cold dependent accumulation of COR TMC-AP3 in cereals with contrasting frost tolerance is regulated by different mRNA expression and protein turnover. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2000; 156:47-54. [PMID: 10908804 DOI: 10.1016/s0168-9452(00)00228-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/23/2023]
Abstract
The accumulation of specific cold-regulated (COR) proteins is a component of the hardening process, and different amounts of COR proteins have been related to different degrees of cold tolerance. A number of different mechanisms control the accumulation of the COR proteins in plant cells. In this work we describe the mechanisms controlling the accumulation of the COR protein TMC-AP3, a putative chloroplastic amino acid selective channel protein [1], in barley, durum wheat, emmer and bread wheat. Winter barley and, to a lesser extent, winter bread wheat showed a higher cor tmc-ap3 expression at low temperature than the spring ones, while no significant differences were detected between the emmer and the durum wheat genotypes. After 2 days of de-hardening the transcript level dropped in the same way in all tested genotypes; nevertheless, the decrease in protein content was genotype dependent. In all frost resistant genotypes the amount of COR TMC-AP3 after 9 days of de-hardening was higher than in susceptible ones. These findings suggest that resistant and susceptible genotypes have different protein degradation rates and/or mRNA translational efficiency. Differences in the protein degradation rate were not dependent on the amino acid sequence of the protein, which was extremely similar in all tested genotypes. A genetic study based on Chinese Spring/Cheyenne chromosome substitution lines showed that the turnover of TMC-AP3 is a polygenic trait controlled by a number of loci, the most important of which are located on chromosomes 1B, 2B, 2D and 4D.
|
47
|
Gallo K, Baldi P, De Micheli M, Ostrowsky DB, Assanto G. Cascading phase shift and multivalued response in counterpropagating frequency-nondegenerate parametric amplifiers. OPTICS LETTERS 2000; 25:966-968. [PMID: 18064242 DOI: 10.1364/ol.25.000966] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 05/25/2023]
Abstract
We propose and analyze a novel geometry for all-optical processing of low-power signals that is based on a frequency-nondegenerate counterpropagating parametric amplifier. The stationary response in the cascading regime exhibits multivalued solutions with enhanced nonlinear phase shifts. Implementation in a quasi-phase-matched LiNbO3 waveguide is discussed.
|
48
|
Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 2000; 16:412-24. [PMID: 10871264 DOI: 10.1093/bioinformatics/16.5.412] [Citation(s) in RCA: 1048] [Impact Index Per Article: 43.7] [Indexed: 11/15/2022]
Abstract
We provide a unified overview of methods that are currently widely used to assess the accuracy of prediction algorithms, ranging from raw percentages, quadratic error measures and other distances, and correlation coefficients to information theoretic measures such as relative entropy and mutual information. We briefly discuss the advantages and disadvantages of each approach. For classification tasks, we derive new learning algorithms for the design of prediction systems by directly optimising the correlation coefficient. We observe and prove several results relating sensitivity and specificity of optimal systems. While the principles are general, we illustrate the applicability on specific problems such as protein secondary structure and signal peptide prediction.
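For the binary case, the correlation coefficient mentioned in this abstract is commonly computed from the four confusion-matrix counts (often called the Matthews correlation coefficient). The sketch below uses that standard formula; the function names and the choice to return 0.0 when the denominator vanishes are conventions assumed here, not details taken from the paper.

```python
from math import sqrt


def correlation_coefficient(tp, tn, fp, fn):
    """Correlation between predicted and true binary labels.

    Computed from true/false positive and negative counts. Returns 0.0
    when any marginal total is zero (an assumed convention).
    """
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / sqrt(denom)


def sensitivity(tp, fn):
    """Fraction of true positives that are recovered: TP / (TP + FN)."""
    return tp / (tp + fn)


def raw_accuracy(tp, tn, fp, fn):
    """Raw percentage of correct predictions, as a fraction."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Unlike raw accuracy, the correlation coefficient stays near zero for a predictor that always outputs the majority class, which makes it a more informative summary on unbalanced data.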
|
49
|
Baldi P. On the convergence of a clustering algorithm for protein-coding regions in microbial genomes. Bioinformatics 2000; 16:367-71. [PMID: 10869034 DOI: 10.1093/bioinformatics/16.4.367] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
MOTIVATION As the number of fully sequenced prokaryotic genomes continues to grow rapidly, computational methods for reliably detecting protein-coding regions become even more important. Audic and Claverie (1998) Proc. Natl Acad. Sci. USA, 95, 10026-10031, have proposed a clustering algorithm for protein-coding regions in microbial genomes. The algorithm is based on three Markov models of order k associated with subsequences extracted from a given genome. The parameters of the three Markov models are recursively updated by the algorithm which, in simulations, always appears to converge to a unique stable partition of the genome. The partition corresponds to three kinds of regions: (1) coding on the direct strand, (2) coding on the complementary strand, (3) non-coding. RESULTS Here we provide an explanation for the convergence of the algorithm by observing that it is essentially a form of the expectation maximization (EM) algorithm applied to the corresponding mixture model. We also provide a partial justification for the uniqueness of the partition based on identifiability. Other possible variations and improvements are briefly discussed.
|
50
|
Baldi P, Brunak S, Frasconi P, Soda G, Pollastri G. Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 1999; 15:937-46. [PMID: 10743560 DOI: 10.1093/bioinformatics/15.11.937] [Citation(s) in RCA: 320] [Impact Index Per Article: 12.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION Predicting the secondary structure of a protein (alpha-helix, beta-sheet, coil) is an important step towards elucidating its three-dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network architectures with a fixed, and relatively short, input window of amino acids, centered at the prediction site. Although a fixed small window avoids overfitting problems, it does not permit capturing variable long-range information. RESULTS We introduce a family of novel architectures which can learn to make predictions based on variable ranges of dependencies. These architectures extend recurrent neural networks, introducing non-causal bidirectional dynamics to capture both upstream and downstream information. The prediction algorithm is completed by the use of mixtures of estimators that leverage evolutionary information, expressed in terms of multiple alignments, both at the input and output levels. While our system currently achieves an overall performance close to 76% correct prediction--at least comparable to the best existing systems--the main emphasis here is on the development of new algorithmic ideas. AVAILABILITY The executable program for predicting protein secondary structure is available from the authors free of charge. CONTACT pfbaldi@ics.uci.edu, gpollast@ics.uci.edu, brunak@cbs.dtu.dk, paolo@dsi.unifi.it.
|