1
Gunter HM, Youlten SE, Reis ALM, McCubbin T, Madala BS, Wong T, Stevanovski I, Cipponi A, Deveson IW, Santini NS, Kummerfeld S, Croucher PI, Marcellin E, Mercer TR. A universal molecular control for DNA, mRNA and protein expression. Nat Commun 2024; 15:2480. [PMID: 38509097 PMCID: PMC10954659 DOI: 10.1038/s41467-024-46456-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 02/28/2024] [Indexed: 03/22/2024]
Abstract
The expression of genes encompasses their transcription into mRNA followed by translation into protein. In recent years, next-generation sequencing and mass spectrometry methods have profiled DNA, RNA and protein abundance in cells. However, there are currently no reference standards that are compatible across these genomic, transcriptomic and proteomic methods, and provide an integrated measure of gene expression. Here, we use synthetic biology principles to engineer a multi-omics control, termed pREF, that can act as a universal molecular standard for next-generation sequencing and mass spectrometry methods. The pREF sequence encodes 21 synthetic genes that can be in vitro transcribed into spike-in mRNA controls, and in vitro translated to generate matched protein controls. The synthetic genes provide qualitative controls that can measure sensitivity and quantitative accuracy of DNA, RNA and peptide detection. We demonstrate the use of pREF in metagenome DNA sequencing and RNA sequencing experiments and evaluate the quantification of proteins using mass spectrometry. Unlike previous spike-in controls, pREF can be independently propagated and the synthetic mRNA and protein controls can be sustainably prepared by recipient laboratories using common molecular biology techniques. Together, this provides a universal synthetic standard able to integrate genomic, transcriptomic and proteomic methods.
Affiliation(s)
- Helen M Gunter
- Australian Institute of Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia
- BASE mRNA Facility, The University of Queensland, Brisbane, Queensland, Australia
- ARC Centre of Excellence in Synthetic Biology, The University of Queensland, Brisbane, Queensland, Australia
- Scott E Youlten
- Department of Genetics, Yale University School of Medicine, New Haven, CT, 06510, USA
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Andre L M Reis
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, New South Wales, Australia
- School of Electrical and Information Engineering, University of Sydney, Sydney, New South Wales, Australia
- Tim McCubbin
- Australian Institute of Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia
- ARC Centre of Excellence in Synthetic Biology, The University of Queensland, Brisbane, Queensland, Australia
- Bindu Swapna Madala
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, New South Wales, Australia
- Ted Wong
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Igor Stevanovski
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, New South Wales, Australia
- Arcadi Cipponi
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Ira W Deveson
- Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, New South Wales, Australia
- School of Electrical and Information Engineering, University of Sydney, Sydney, New South Wales, Australia
- Nadia S Santini
- Centro Nacional de Investigación Disciplinaria en Conservación y Mejoramiento de Ecosistemas Forestales, INIFAP, Ciudad de México, 04010, Mexico
- Sarah Kummerfeld
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Peter I Croucher
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- St Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Esteban Marcellin
- Australian Institute of Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia
- ARC Centre of Excellence in Synthetic Biology, The University of Queensland, Brisbane, Queensland, Australia
- Tim R Mercer
- Australian Institute of Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia
- BASE mRNA Facility, The University of Queensland, Brisbane, Queensland, Australia
- ARC Centre of Excellence in Synthetic Biology, The University of Queensland, Brisbane, Queensland, Australia
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
2
Alexandari AM, Horton CA, Shrikumar A, Shah N, Li E, Weilert M, Pufall MA, Zeitlinger J, Fordyce PM, Kundaje A. De novo distillation of thermodynamic affinity from deep learning regulatory sequence models of in vivo protein-DNA binding. bioRxiv 2023:2023.05.11.540401. [PMID: 37214836 PMCID: PMC10197627 DOI: 10.1101/2023.05.11.540401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Transcription factors (TF) are proteins that bind DNA in a sequence-specific manner to regulate gene transcription. Despite their unique intrinsic sequence preferences, in vivo genomic occupancy profiles of TFs differ across cellular contexts. Hence, deciphering the sequence determinants of TF binding, both intrinsic and context-specific, is essential to understand gene regulation and the impact of regulatory, non-coding genetic variation. Biophysical models trained on in vitro TF binding assays can estimate intrinsic affinity landscapes and predict occupancy based on TF concentration and affinity. However, these models cannot adequately explain context-specific, in vivo binding profiles. Conversely, deep learning models, trained on in vivo TF binding assays, effectively predict and explain genomic occupancy profiles as a function of complex regulatory sequence syntax, albeit without a clear biophysical interpretation. To reconcile these complementary models of in vitro and in vivo TF binding, we developed Affinity Distillation (AD), a method that extracts thermodynamic affinities de-novo from deep learning models of TF chromatin immunoprecipitation (ChIP) experiments by marginalizing away the influence of genomic sequence context. Applied to neural networks modeling diverse classes of yeast and mammalian TFs, AD predicts energetic impacts of sequence variation within and surrounding motifs on TF binding as measured by diverse in vitro assays with superior dynamic range and accuracy compared to motif-based methods. Furthermore, AD can accurately discern affinities of TF paralogs. Our results highlight thermodynamic affinity as a key determinant of in vivo binding, suggest that deep learning models of in vivo binding implicitly learn high-resolution affinity landscapes, and show that these affinities can be successfully distilled using AD. 
This new biophysical interpretation of deep learning models enables high-throughput in silico experiments to explore the influence of sequence context and variation on both intrinsic affinity and in vivo occupancy.
Affiliation(s)
- Amr M. Alexandari
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Avanti Shrikumar
- Department of Earth System Science, Stanford University, Stanford, CA 94305
- Nilay Shah
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Eileen Li
- Department of Genetics, Stanford University, Stanford, CA 94305
- Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Miles A. Pufall
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- The University of Kansas Medical Center, Kansas City, KS, USA
- Polly M. Fordyce
- Department of Genetics, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- ChEM-H Institute, Stanford University, Stanford, CA 94305
- Chan Zuckerberg Biohub, San Francisco, CA 94110
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Department of Genetics, Stanford University, Stanford, CA 94305
3
Orenstein Y. Reverse de Bruijn: Utilizing Reverse Peptide Synthesis to Cover All Amino Acid k-mers. J Comput Biol 2020; 27:376-385. [PMID: 31995404 DOI: 10.1089/cmb.2019.0448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Peptide arrays measure the binding intensity of a specific protein to thousands of amino acid peptides. By using peptides that cover all k-mers, a comprehensive picture of the binding spectrum is obtained. Researchers would like to measure binding to the longest k-mer possible but are constrained by the number of peptides that can fit into a single microarray. A key challenge is designing a minimum number of peptides that cover all k-mers. Here, we suggest a novel idea to reduce the length of the sequence covering all k-mers by utilizing a unique property of the peptide synthesis process. Since the synthesis can start from both ends of the peptide template, it is enough to cover each k-mer or its reverse and to use the same template twice: in forward and reverse. Then, the computational problem is to generate a minimum length sequence that for each k-mer either contains the k-mer or its reverse. In this study, we present a new algorithm, called ReverseCAKE, to generate such a sequence. ReverseCAKE runs in time linear in the output size and is guaranteed to produce a sequence that is longer by at most Θ(nlogn) characters compared with the optimum n. The obtained saving factor by ReverseCAKE approaches the theoretical lower bound as k increases. In addition, we formulated the problem as an integer linear program and empirically observed that the solutions obtained by ReverseCAKE are near-optimal. Through this work, we enable more effective design of peptide microarrays.
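The covering condition stated in this abstract (each k-mer or its reverse must appear) can be illustrated with a minimal sketch. This is not ReverseCAKE: the function names and the toy two-letter alphabet are hypothetical and chosen only for illustration. The sketch counts the k-mer classes that remain once a k-mer and its reverse are treated as equivalent, and checks whether a given sequence covers every class.

```python
from itertools import product


def reversal_classes(alphabet: str, k: int) -> set[str]:
    """Return one representative per {k-mer, reversed k-mer} pair.

    The representative is the lexicographically smaller of the two strings.
    """
    return {min(w, w[::-1]) for w in map("".join, product(alphabet, repeat=k))}


def covers_up_to_reversal(seq: str, alphabet: str, k: int) -> bool:
    """Check that, for every k-mer, seq contains the k-mer or its reverse."""
    seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return all(w in seen or w[::-1] in seen for w in reversal_classes(alphabet, k))


# Toy example over a two-letter alphabet (an assumption for readability).
classes = reversal_classes("AB", 3)
print(len(classes))                               # number of classes to cover
print(covers_up_to_reversal("AAABABBB", "AB", 3))
```

Because palindromic k-mers are their own reverse, the number of classes is (|Σ|^k + |Σ|^⌈k/2⌉)/2, and any covering sequence needs at least that many windows, so at least that many characters plus k − 1.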
Affiliation(s)
- Yaron Orenstein
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
4
Dans PD, Balaceanu A, Pasi M, Patelli AS, Petkevičiūtė D, Walther J, Hospital A, Bayarri G, Lavery R, Maddocks JH, Orozco M. The static and dynamic structural heterogeneities of B-DNA: extending Calladine-Dickerson rules. Nucleic Acids Res 2019; 47:11090-11102. [PMID: 31624840 PMCID: PMC6868377 DOI: 10.1093/nar/gkz905] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/18/2019] [Revised: 09/25/2019] [Accepted: 10/06/2019] [Indexed: 12/12/2022]
Abstract
We present a multi-laboratory effort to describe the structural and dynamical properties of duplex B-DNA under physiological conditions. By processing a large amount of atomistic molecular dynamics simulations, we determine the sequence-dependent structural properties of DNA as expressed in the equilibrium distribution of its stochastic dynamics. Our analysis includes a study of first and second moments of the equilibrium distribution, which can be accurately captured by a harmonic model, but with nonlocal sequence-dependence. We characterize the sequence-dependent choreography of backbone and base movements modulating the non-Gaussian or anharmonic effects manifested in the higher moments of the dynamics of the duplex when sampling the equilibrium distribution. Contrary to prior assumptions, such anharmonic deformations are not rare in DNA and can play a significant role in determining DNA conformation within complexes. Polymorphisms in helical geometries are particularly prevalent for certain tetranucleotide sequence contexts and are always coupled to a complex network of coordinated changes in the backbone. The analysis of our simulations, which contain instances of all tetranucleotide sequences, allows us to extend Calladine-Dickerson rules used for decades to interpret the average geometry of DNA, leading to a set of rules with quantitative predictive power that encompass nonlocal sequence-dependence and anharmonic fluctuations.
Affiliation(s)
- Pablo D Dans
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Department of Biological Sciences, University of the Republic (UdelaR), CENUR Gral. Rivera 1350, 50000 Salto, Uruguay
- Alexandra Balaceanu
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Marco Pasi
- LBPA, École normale supérieure Paris-Saclay, 61 Av. du Pdt Wilson, Cachan 94235, France
- Bases Moléculaires et Structurales des Systèmes Infectieux, Univ. Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, Lyon 69367, France
- Alessandro S Patelli
- Institute of Mathematics, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Daiva Petkevičiūtė
- Institute of Mathematics, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Faculty of Mathematics and Natural Sciences, Kaunas University of Technology, Studentų g. 50, 51368 Kaunas, Lithuania
- Jürgen Walther
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Adam Hospital
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Genís Bayarri
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Richard Lavery
- Bases Moléculaires et Structurales des Systèmes Infectieux, Univ. Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, Lyon 69367, France
- John H Maddocks
- Institute of Mathematics, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Modesto Orozco
- Institute for Research in Biomedicine (IRB Barcelona). The Barcelona Institute of Science and Technology. Baldiri Reixac 10–12, 08028 Barcelona, Spain
- Department of Biochemistry and Molecular Biology. University of Barcelona, 08028 Barcelona, Spain
5
Orenstein Y, Puccinelli R, Kim R, Fordyce P, Berger B. Optimized Sequence Library Design for Efficient In Vitro Interaction Mapping. Cell Syst 2017; 5:230-236.e5. [PMID: 28957657 DOI: 10.1016/j.cels.2017.07.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2017] [Revised: 04/14/2017] [Accepted: 07/27/2017] [Indexed: 11/27/2022]
Abstract
Sequence libraries that cover all k-mers enable universal, unbiased measurements of binding to both oligonucleotides and peptides. While the number of k-mers grows exponentially in k, space on all experimental platforms is limited. Here, we shrink k-mer library sizes by using joker characters, which represent all characters in the alphabet simultaneously. We present the JokerCAKE (joker covering all k-mers) algorithm for generating a short sequence such that each k-mer appears at least p times with at most one joker character per k-mer. By running our algorithm on a range of parameters and alphabets, we show that JokerCAKE produces near-optimal sequences. Moreover, through comparison with data from hundreds of DNA-protein binding experiments and with new experimental results for both standard and JokerCAKE libraries, we establish that accurate binding scores can be inferred for high-affinity k-mers using JokerCAKE libraries. JokerCAKE libraries allow researchers to search a significantly larger sequence space using the same number of experimental measurements and at the same cost.
Affiliation(s)
- Yaron Orenstein
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Robert Puccinelli
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Ryan Kim
- Research Science Institute, Center for Excellence in Education, McLean, VA 22102, USA
- Polly Fordyce
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
6
Li W, Thanos D, Provata A. Quantifying local randomness in human DNA and RNA sequences using Erdös motifs. J Theor Biol 2018; 461:41-50. [PMID: 30336158 DOI: 10.1016/j.jtbi.2018.09.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2018] [Revised: 08/14/2018] [Accepted: 09/25/2018] [Indexed: 10/28/2022]
Abstract
In 1932, Paul Erdös asked whether a random walk constructed from a binary sequence can achieve the lowest possible deviation (lowest discrepancy), for the sequence itself and for all its subsequences formed by homogeneous arithmetic progressions. Although avoiding low discrepancy is impossible for infinite sequences, as recently proven by Terence Tao, attempts were made to construct such sequences with finite lengths. We recognize that such constructed sequences (we call these "Erdös sequences") exhibit certain hallmarks of randomness at the local level: they show roughly equal frequencies of short subsequences, and at the same time exclude trivial periodic patterns. For the human DNA we examine the frequency of a set of Erdös motifs of length-10 using three nucleotides-to-binary mappings. The particular length-10 Erdös sequence is derived from the length-11 Mathias sequence and is identical with the first 10 digits of the Thue-Morse sequence, underscoring the fact that both are deficient in periodicities. Our calculations indicate that: (1) the purine (A and G)/pyrimidine (C and T) based Erdös motifs are greatly underrepresented in the human genome, (2) the strong (G and C)/weak (A and T) based Erdös motifs are slightly overrepresented, (3) the densities of the two are negatively correlated, (4) the Erdös motifs based on all three mappings being combined are slightly underrepresented, and (5) the strong/weak based Erdös motifs are greatly overrepresented in the human messenger RNA sequences.
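As a rough illustration of the motif counting described in this abstract, the sketch below generates the first 10 digits of the Thue-Morse sequence and counts its occurrences in a DNA string after a nucleotide-to-binary mapping. The assignment purine → 1, pyrimidine → 0 and the function names are assumptions made for this example; the paper's exact motif set and mappings are not reproduced here.

```python
def thue_morse(n: int) -> str:
    """First n digits of the Thue-Morse sequence (parity of the 1-bits of the index)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


# Hypothetical mapping for illustration: purines (A, G) -> 1, pyrimidines (C, T) -> 0.
PURINE_PYRIMIDINE = {"A": "1", "G": "1", "C": "0", "T": "0"}


def count_motif(dna: str, motif: str, mapping: dict[str, str]) -> int:
    """Count (possibly overlapping) occurrences of a binary motif in a mapped DNA string.

    Assumes dna contains only the upper-case letters present in `mapping`.
    """
    binary = "".join(mapping[base] for base in dna)
    k = len(motif)
    return sum(binary[i:i + k] == motif for i in range(len(binary) - k + 1))


motif = thue_morse(10)
print(motif)
print(count_motif("CAGCATCAGC", motif, PURINE_PYRIMIDINE))
```

Comparing such counts with the expectation for a random sequence of the same composition would be the next step toward an over- or under-representation estimate.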
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA
- Dimitrios Thanos
- Department of Mathematics, National and Kapodistrian University of Athens, Athens GR-15784, Greece; Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", Athens GR-15341, Greece
- Astero Provata
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", Athens GR-15341, Greece
7
Orenstein Y, Yu YW, Berger B. Joker de Bruijn: Covering k-Mers Using Joker Characters. J Comput Biol 2018; 25:1171-1178. [PMID: 30117747 PMCID: PMC6247992 DOI: 10.1089/cmb.2018.0032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Sequence libraries that cover all k-mers enable universal and unbiased measurements of nucleotide and peptide binding. The shortest sequence to cover all k-mers is a de Bruijn sequence of length $|\Sigma|^k + k - 1$. Researchers would like to increase k to measure interactions at greater detail, but face a challenging problem: the number of k-mers grows exponentially in k, while the space on the experimental device is limited. In this study, we introduce a novel advance to shrink k-mer library sizes by using joker characters, which represent all characters in the alphabet. Theoretically, the use of joker characters can reduce the library size tremendously, but it should be limited as the introduced degeneracy lowers the statistical robustness of measurements. In this work, we consider the problem of generating a minimum-length sequence that covers a given set of k-mers using joker characters. The number and positions of the joker characters are provided as input. We first prove that the problem is NP-hard. We then present the first solution to the problem, which is based on two algorithmic innovations: (1) a greedy heuristic and (2) an integer linear programming (ILP) formulation. We first run the heuristic to find a good feasible solution, and then run an ILP solver to improve it. We ran our algorithm on DNA and amino acid alphabets to cover all k-mers for different values of k and k-mer multiplicity. Results demonstrate that it produces sequences that are very close to the theoretical lower bound.
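The de Bruijn length quoted in this abstract can be checked with a short sketch. The code below uses the standard Lyndon-word construction of a cyclic de Bruijn sequence and then appends the first k − 1 characters to make it linear; it does not implement joker characters or the paper's heuristic and ILP, and the function name is hypothetical.

```python
def de_bruijn(alphabet: str, k: int) -> str:
    """Return a linear sequence that contains every k-mer over `alphabet` exactly once."""
    n = len(alphabet)
    a = [0] * (n * k)
    cycle: list[int] = []

    def db(t: int, p: int) -> None:
        # Concatenate Lyndon words whose length divides k, in lexicographic order.
        if t > k:
            if k % p == 0:
                cycle.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    cyclic = "".join(alphabet[i] for i in cycle)
    # Unroll the cyclic sequence so that wrap-around k-mers also appear linearly.
    return cyclic + cyclic[:k - 1]


seq = de_bruijn("ACGT", 3)
print(len(seq))  # equals 4**3 + 3 - 1
```

For the DNA alphabet and k = 3 this gives 64 + 2 = 66 characters, and the exponential term is what motivates the library-shrinking strategies the abstract describes.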
Affiliation(s)
- Yaron Orenstein
- Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Yun William Yu
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts; Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts
8
Abstract
Current microarray technologies to determine RNA structure or measure protein–RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad-hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online.
Affiliation(s)
- Yaron Orenstein
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA; Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA
9
The complex task of choosing a de novo assembly: Lessons from fungal genomes. Comput Biol Chem 2014; 53 Pt A:97-107. [DOI: 10.1016/j.compbiolchem.2014.08.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Accepted: 07/11/2014] [Indexed: 12/21/2022]