1
|
Abstract
The computational reconstruction of genome sequences from shotgun sequencing data has been greatly simplified by the advent of sequencing technologies that generate long reads. In the case of relatively small genomes (e.g., bacterial or viral), complete genome sequences can frequently be reconstructed computationally without the need for further experiments. However, large and complex genomes, such as those of most animals and plants, continue to pose significant challenges. In such genomes, assembly software produces incomplete and fragmented reconstructions that require additional experimentally derived information and manual intervention in order to reconstruct individual chromosome arms. Recent technologies originally designed to capture chromatin structure have been shown to effectively complement sequencing data, leading to much more contiguous reconstructions of genomes than previously possible. Here, we survey these technologies and the algorithms used to assemble and analyze large eukaryotic genomes, placed within the historical context of genome scaffolding technologies that have been in existence since the dawn of the genomic era.
Affiliation(s)
- Jay Ghurye
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Mihai Pop
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
|
2
|
|
4
|
Wang W, Feng B, Xiao J, Xia Z, Zhou X, Li P, Zhang W, Wang Y, Møller BL, Zhang P, Luo MC, Xiao G, Liu J, Yang J, Chen S, Rabinowicz PD, Chen X, Zhang HB, Ceballos H, Lou Q, Zou M, Carvalho LJCB, Zeng C, Xia J, Sun S, Fu Y, Wang H, Lu C, Ruan M, Zhou S, Wu Z, Liu H, Kannangara RM, Jørgensen K, Neale RL, Bonde M, Heinz N, Zhu W, Wang S, Zhang Y, Pan K, Wen M, Ma PA, Li Z, Hu M, Liao W, Hu W, Zhang S, Pei J, Guo A, Guo J, Zhang J, Zhang Z, Ye J, Ou W, Ma Y, Liu X, Tallon LJ, Galens K, Ott S, Huang J, Xue J, An F, Yao Q, Lu X, Fregene M, López-Lavalle LAB, Wu J, You FM, Chen M, Hu S, Wu G, Zhong S, Ling P, Chen Y, Wang Q, Liu G, Liu B, Li K, Peng M. Cassava genome from a wild ancestor to cultivated varieties. Nat Commun 2014; 5:5110. [PMID: 25300236 PMCID: PMC4214410 DOI: 10.1038/ncomms6110] [Citation(s) in RCA: 155] [Impact Index Per Article: 15.5] [Received: 03/20/2014] [Accepted: 08/27/2014] [Indexed: 11/10/2022]
Abstract
Cassava is a major tropical food crop in the Euphorbiaceae family that has high carbohydrate production potential and adaptability to diverse environments. Here we present the draft genome sequences of a wild ancestor and a domesticated variety of cassava and comparative analyses with a partial inbred line. We identify 1,584 and 1,678 gene models specific to the wild and domesticated varieties, respectively, and discover high heterozygosity and millions of single-nucleotide variations. Our analyses reveal that genes involved in photosynthesis, starch accumulation and abiotic stresses have been positively selected, whereas those involved in cell wall biosynthesis and secondary metabolism, including cyanogenic glucoside formation, have been negatively selected in the cultivated varieties, reflecting the result of natural selection and domestication. Differences in microRNA genes and retrotransposon regulation could partly explain an increased carbon flux towards starch accumulation and reduced cyanogenic glucoside accumulation in domesticated cassava. These results may contribute to genetic improvement of cassava through better understanding of its biology.
Affiliation(s)
- Wenquan Wang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Binxiao Feng
- [1] Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China [2] Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Jingfa Xiao
- Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Zhiqiang Xia
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Xincheng Zhou
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Pinghua Li
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Weixiong Zhang
- [1] Department of Computer Science and Engineering and Department of Genetics, Washington University, Saint Louis, Missouri 63130, USA [2] Institute for Systems Biology, Jianghan University, Wuhan 430056, China
- Ying Wang
- South China Botanical Garden, CAS, Guangzhou 510650, China
- Birger Lindberg Møller
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Peng Zhang
- Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences of CAS, Shanghai 200032, China
- Ming-Cheng Luo
- Department of Plant Sciences, University of California, Davis, California 95616, USA
- Gong Xiao
- South China Botanical Garden, CAS, Guangzhou 510650, China
- Jingxing Liu
- Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Jun Yang
- Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences of CAS, Shanghai 200032, China
- Songbi Chen
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Pablo D Rabinowicz
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Xin Chen
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas 77843, USA
- Henan Ceballos
- International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Qunfeng Lou
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Meiling Zou
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Luiz J C B Carvalho
- Brazilian Enterprise for Agricultural Research (EMBRAPA), Genetic Resources and Biotechnology, Brasilia 70770, Brazil
- Changying Zeng
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Jing Xia
- [1] Department of Computer Science and Engineering and Department of Genetics, Washington University, Saint Louis, Missouri 63130, USA [2] Institute for Systems Biology, Jianghan University, Wuhan 430056, China
- Shixiang Sun
- Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Yuhua Fu
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Haiyan Wang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Cheng Lu
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Mengbin Ruan
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China
- Zhicheng Wu
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China
- Hui Liu
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China
- Rubini Maya Kannangara
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Kirsten Jørgensen
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Rebecca Louise Neale
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Maya Bonde
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Nanna Heinz
- Plant Biochemistry Laboratory, Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen 1165, Denmark
- Wenli Zhu
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Shujuan Wang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Yang Zhang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Kun Pan
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Mingfu Wen
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Ping-An Ma
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Zhengxu Li
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Meizhen Hu
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Wenbin Liao
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Wenbin Hu
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Shengkui Zhang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Jinli Pei
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Anping Guo
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Jianchun Guo
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Jiaming Zhang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Zhengwen Zhang
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Jianqiu Ye
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Wenjun Ou
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Yaqin Ma
- Department of Plant Sciences, University of California, Davis, California 95616, USA
- Xinyue Liu
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Luke J Tallon
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Kevin Galens
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Sandra Ott
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Jie Huang
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Jingjing Xue
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Feifei An
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Qingqun Yao
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Xiaojing Lu
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Martin Fregene
- International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Jiajie Wu
- Department of Plant Sciences, University of California, Davis, California 95616, USA
- Frank M You
- Department of Plant Sciences, University of California, Davis, California 95616, USA
- Meili Chen
- Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Songnian Hu
- Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing 100101, China
- Guojiang Wu
- South China Botanical Garden, CAS, Guangzhou 510650, China
- Silin Zhong
- State Key Laboratory of Agrobiotechnology, School of Life Sciences, Chinese University of Hong Kong, Hong Kong, China
- Peng Ling
- Citrus Research and Education Center (CREC), University of Florida, Gainesville, Florida 32611, USA
- Yeyuan Chen
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Qinghuang Wang
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
- Guodao Liu
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Bin Liu
- State Key Laboratory of Desert and Oasis Ecology, Key Laboratory of Biogeography and Bioresources in Arid Land, Center of Systematic Genomics, Xinjiang Institute of Ecology and Geography, Urumqi 830011, China
- Kaimian Li
- Tropical Crop Genetic Resources Institute, CATAS, Danzhou 571700, China
- Ming Peng
- Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences (CATAS), Haikou 571101, China
|
5
|
Abstract
Next-generation sequencing techniques produce millions of short reads from a genomic sequence in a single run. Some regions of the sequence are very likely to receive low read coverage, and erroneous base calling can introduce errors into the reads. As a consequence, sequence assemblers often fail to reconstruct an entire DNA molecule and instead output a set of overlapping segments that together represent a consensus region of the DNA. These segments are collectively called contigs in the literature. The final step of the sequencing process, called scaffolding, is to arrange the contigs in the correct order. Scaffolding techniques typically exploit additional information such as mate pairs, paired ends, or optical restriction maps. In this paper we introduce a series of novel algorithms for scaffolding that exploit optical restriction maps (ORMs). Simulation results show that our algorithms are reliable, scalable, and efficient compared to the best known algorithms in the literature.
|
6
|
Cheng S, van den Bergh E, Zeng P, Zhong X, Xu J, Liu X, Hofberger J, de Bruijn S, Bhide AS, Kuelahoglu C, Bian C, Chen J, Fan G, Kaufmann K, Hall JC, Becker A, Bräutigam A, Weber AP, Shi C, Zheng Z, Li W, Lv M, Tao Y, Wang J, Zou H, Quan Z, Hibberd JM, Zhang G, Zhu XG, Xu X, Schranz ME. The Tarenaya hassleriana genome provides insight into reproductive trait and genome evolution of crucifers. Plant Cell 2013; 25:2813-30. [PMID: 23983221 PMCID: PMC3784582 DOI: 10.1105/tpc.113.113480] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Received: 05/07/2013] [Revised: 07/06/2013] [Accepted: 08/06/2013] [Indexed: 05/18/2023]
Abstract
The Brassicaceae, including Arabidopsis thaliana and Brassica crops, is unmatched among plants in its wealth of genomic and functional molecular data and has long served as a model for understanding gene, genome, and trait evolution. However, genome information from a phylogenetic outgroup that is essential for inferring directionality of evolutionary change has been lacking. We therefore sequenced the genome of the spider flower (Tarenaya hassleriana) from the Brassicaceae sister family, the Cleomaceae. By comparative analysis of the two lineages, we show that genome evolution following ancient polyploidy and gene duplication events affects reproductively important traits. We found an ancient genome triplication in Tarenaya (Th-α) that is independent of the Brassicaceae-specific duplication (At-α) and nested Brassica (Br-α) triplication. To showcase the potential of sister lineage genome analysis, we investigated the state of floral developmental genes and show that Brassica retains twice as many floral MADS (for minichromosome maintenance1, AGAMOUS, DEFICIENS and serum response factor) genes as Tarenaya, which likely contributes to morphological diversity in Brassica. We also performed synteny analysis of gene families that confer self-incompatibility in Brassicaceae and found that the critical serine receptor kinase receptor gene is derived from a lineage-specific tandem duplication. The T. hassleriana genome will facilitate future research toward elucidating the evolutionary history of Brassicaceae genomes.
Affiliation(s)
| | - Erik van den Bergh
- Biosystematics Group, Wageningen University, 6708 PB Wageningen, The Netherlands
| | - Peng Zeng
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Xiao Zhong
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Jiajia Xu
- Plant Systems Biology Group, Partner Institute of Computational Biology, Chinese Academy of Sciences/Max Planck Society, Shanghai 200031, China
| | - Xin Liu
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Johannes Hofberger
- Biosystematics Group, Wageningen University, 6708 PB Wageningen, The Netherlands
| | - Suzanne de Bruijn
- Molecular Biology Group, Wageningen University, 6708 PB Wageningen, The Netherlands
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany
| | - Amey S. Bhide
- Plant Developmental Biology Group, Institute of Botany, Justus-Liebig-University, 35392 Giessen, Germany
| | - Canan Kuelahoglu
- Institute of Plant Biochemistry, Center of Excellence on Plant Sciences, Heinrich-Heine-University, D-40225 Duesseldorf, Germany
| | - Chao Bian
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Jing Chen
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Guangyi Fan
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Kerstin Kaufmann
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany
| | - Jocelyn C. Hall
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2E9
| | - Annette Becker
- Plant Developmental Biology Group, Institute of Botany, Justus-Liebig-University, 35392 Giessen, Germany
| | - Andrea Bräutigam
- Institute of Plant Biochemistry, Center of Excellence on Plant Sciences, Heinrich-Heine-University, D-40225 Duesseldorf, Germany
| | - Andreas P.M. Weber
- Institute of Plant Biochemistry, Center of Excellence on Plant Sciences, Heinrich-Heine-University, D-40225 Duesseldorf, Germany
| | - Zhijun Zheng
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Wujiao Li
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Mingju Lv
- Plant Systems Biology Group, Partner Institute of Computational Biology, Chinese Academy of Sciences/Max Planck Society, Shanghai 200031, China
| | - Yimin Tao
- Plant Systems Biology Group, Partner Institute of Computational Biology, Chinese Academy of Sciences/Max Planck Society, Shanghai 200031, China
| | - Junyi Wang
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - Hongfeng Zou
- Beijing Genomics Institute, 518083 Shenzhen, China
- State Key Laboratory of Agricultural Genomics, Beijing Genomics Institute, 518083 Shenzhen, China
- Key Laboratory of Genomics, Ministry of Agriculture, Beijing Genomics Institute, 518083 Shenzhen, China
| | - Zhiwu Quan
- Beijing Genomics Institute, 518083 Shenzhen, China
- State Key Laboratory of Agricultural Genomics, Beijing Genomics Institute, 518083 Shenzhen, China
- Key Laboratory of Genomics, Ministry of Agriculture, Beijing Genomics Institute, 518083 Shenzhen, China
| | - Julian M. Hibberd
- Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
| | - Gengyun Zhang
- Beijing Genomics Institute, 518083 Shenzhen, China
- State Key Laboratory of Agricultural Genomics, Beijing Genomics Institute, 518083 Shenzhen, China
- Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
| | - Xin-Guang Zhu
- Plant Systems Biology Group, Partner Institute of Computational Biology, Chinese Academy of Sciences/Max Planck Society, Shanghai 200031, China
| | - Xun Xu
- Beijing Genomics Institute, 518083 Shenzhen, China
| | - M. Eric Schranz
- Biosystematics Group, Wageningen University, 6708 PB Wageningen, The Netherlands
- Address correspondence to
|
7
|
Lonardi S, Duma D, Alpert M, Cordero F, Beccuti M, Bhat PR, Wu Y, Ciardo G, Alsaihati B, Ma Y, Wanamaker S, Resnik J, Bozdag S, Luo MC, Close TJ. Combinatorial pooling enables selective sequencing of the barley gene space. PLoS Comput Biol 2013; 9:e1003010. [PMID: 23592960 PMCID: PMC3617026 DOI: 10.1371/journal.pcbi.1003010] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2012] [Accepted: 02/05/2013] [Indexed: 11/23/2022] Open
Abstract
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and are now in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
The problem of obtaining the full genomic sequence of an organism has been solved either via a global brute-force approach (called whole-genome shotgun) or by a divide-and-conquer strategy (called clone-by-clone). Both approaches have advantages and disadvantages in terms of cost, manual labor, and the ability to deal with sequencing errors and highly repetitive regions of the genome. With the advent of second-generation sequencing instruments, the whole-genome shotgun approach has been the preferred choice. The clone-by-clone strategy is, however, still very relevant for large complex genomes. In fact, several research groups and international consortia have produced clone libraries and physical maps for many economically or ecologically important organisms and now are in a position to proceed with sequencing. In this manuscript, we demonstrate the feasibility of this approach on the gene-space of a large, very repetitive plant genome. The novelty of our approach is that, in order to take advantage of the throughput of the current generation of sequencing instruments, we pool hundreds of clones using a special type of “smart” pooling design that allows one to establish with high accuracy the source clone from the sequenced reads in a pool. Extensive simulations and experimental results support our claims.
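The deconvolution idea can be illustrated with a toy sketch. It assumes that each clone is placed in a distinct combination of pools (its signature), so that the set of pools in which a read or k-mer is observed points back to a single clone. The pool layout, clone names, and function names below are hypothetical; the abstract does not specify the actual pooling design or decoding procedure, and real data would also require tolerance for sequencing errors and repeats.

```python
from itertools import combinations

def assign_signatures(clones, n_pools, pools_per_clone):
    """Give each clone a distinct combination of pools (toy design, an assumption)."""
    combos = combinations(range(n_pools), pools_per_clone)
    signatures = {}
    for clone, combo in zip(clones, combos):
        signatures[clone] = frozenset(combo)
    if len(signatures) < len(clones):
        raise ValueError("not enough pool combinations for all clones")
    return signatures

def deconvolve(observed_pools, signatures):
    """Return the clones whose signature equals the set of pools observed."""
    observed = frozenset(observed_pools)
    return [clone for clone, sig in signatures.items() if sig == observed]

# Hypothetical example: four clones spread over four pools, two pools per clone.
clones = ["BAC_A", "BAC_B", "BAC_C", "BAC_D"]
signatures = assign_signatures(clones, n_pools=4, pools_per_clone=2)

# A k-mer seen only in pools 0 and 2 decodes to the clone with that signature.
hits = deconvolve({0, 2}, signatures)
```

With such a design, the number of pools grows far more slowly than the number of clones, which is the reason pooling can stand in for per-clone barcoding.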
Affiliation(s)
- Stefano Lonardi
- Department of Computer Science and Engineering, University of California, Riverside, California, USA.
|
8
|
Bozdag S, Close TJ, Lonardi S. A graph-theoretical approach to the selection of the minimum tiling path from a physical map. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:352-360. [PMID: 23929859 DOI: 10.1109/tcbb.2013.26] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
The problem of computing the minimum tiling path (MTP) from a set of clones arranged in a physical map is a cornerstone of hierarchical (clone-by-clone) genome sequencing projects. We formulate this problem in a graph theoretical framework, and then solve it by a combination of minimum hitting set and minimum spanning tree algorithms. The tool implementing this strategy, called FMTP, shows improved performance compared to the widely used software FPC. When we execute FMTP and FPC on the same physical map, the MTP produced by FMTP covers a higher portion of the genome, and uses a smaller number of clones. For instance, on the rice genome the MTP produced by our tool would reduce by about 11 percent the cost of a clone-by-clone sequencing project. Source code, benchmark data sets, and documentation of FMTP are freely available at http://code.google.com/p/fingerprint-based-minimal-tiling-path/ under MIT license.
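In its simplest form, the tiling-path problem can be pictured as choosing as few clones as possible so that their intervals cover a region. The sketch below uses the classic greedy interval-cover rule, which is not the hitting-set and spanning-tree formulation used by FMTP. It also assumes that clone coordinates are known, whereas a real physical map only provides overlaps inferred from fingerprints. The clone names and coordinates are hypothetical.

```python
def greedy_tiling_path(clones, start, end):
    """Pick few clones covering [start, end] using the greedy interval-cover rule.

    clones: dict mapping clone name -> (left, right) coordinates (assumed known).
    Returns clone names in path order, or raises if a gap cannot be covered.
    """
    path = []
    covered_to = start
    while covered_to < end:
        # Among clones that begin at or before the covered point and extend
        # past it, take the one that reaches farthest to the right.
        best = None
        for name, (left, right) in clones.items():
            if left <= covered_to and right > covered_to:
                if best is None or right > clones[best][1]:
                    best = name
        if best is None:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best)
        covered_to = clones[best][1]
    return path

# Hypothetical clone layout on a 300-unit region.
clones = {"c1": (0, 100), "c2": (50, 180), "c3": (90, 160), "c4": (150, 300)}
path = greedy_tiling_path(clones, 0, 300)
```

Here clone c3 is skipped because c2 overlaps c1 and reaches farther, so three clones suffice instead of four.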
Affiliation(s)
- Serdar Bozdag
- Department of Mathematics, Statistics and Computer Science, Marquette University, PO Box 1881, Milwaukee, WI 53201-1881, USA.
|
9
|
Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing. LECTURE NOTES IN COMPUTER SCIENCE 2013. [DOI: 10.1007/978-3-642-40453-5_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
|
10
|
Schook LB, Beever JE, Rogers J, Humphray S, Archibald A, Chardon P, Milan D, Rohrer G, Eversole K. Swine Genome Sequencing Consortium (SGSC): a strategic roadmap for sequencing the pig genome. Comp Funct Genomics 2010; 6:251-5. [PMID: 18629187 PMCID: PMC2447480 DOI: 10.1002/cfg.479] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2005] [Revised: 03/17/2005] [Accepted: 03/18/2005] [Indexed: 11/08/2022] Open
Abstract
The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC's mission is to advance biomedical research for animal production and health by the development of DNA-based tools and products resulting from the sequencing of the swine genome. During the past 2 years, the SGSC has met bi-annually to develop a strategic roadmap for creating the required scientific resources, to integrate existing physical maps, and to create a sequencing strategy that captures international participation and a broad funding base. During the past year, SGSC members have integrated their respective physical mapping data with the goal of creating a minimal tiling path (MTP) that will be used as the sequencing template. During the recent Plant and Animal Genome meeting (January 16, 2005, San Diego, CA), presentations demonstrated that a human-pig comparative map has been completed, that BAC fingerprint contigs (FPC) for each of the autosomes and the X chromosome have been constructed, and that BAC end-sequencing has permitted, through BLAST analysis and RH-mapping, anchoring of the contigs. Thus, significant progress has been made towards the creation of an MTP. In addition, whole-genome (WG) shotgun libraries have been constructed and are currently being sequenced in various laboratories around the globe. Thus, a hybrid sequencing approach will be used, in which 3x coverage of the BACs comprising the MTP and 3x coverage of the WG-shotgun libraries are combined to develop a draft 6x coverage of the pig genome.
Affiliation(s)
- Lawrence B Schook
- Institute for Genomic Biology, University of Illinois, Urbana, IL, USA.
|
11
|
Bozdag S, Close TJ, Lonardi S. A compartmentalized approach to the assembly of physical maps. BMC Bioinformatics 2009; 10:217. [PMID: 19604400 PMCID: PMC2717093 DOI: 10.1186/1471-2105-10-217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2008] [Accepted: 07/15/2009] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Physical maps have historically been one of the cornerstones of genome sequencing and map-based cloning strategies. They also support marker-assisted breeding and EST mapping. The problem of building a high quality physical map is computationally challenging due to unavoidable noise in the input fingerprint data. RESULTS We propose a novel compartmentalized method for the assembly of high quality physical maps from fingerprinted clones. The knowledge of genetic markers enables us to group clones into clusters so that clones in the same cluster are more likely to overlap. For each cluster of clones, a local physical map is first constructed using FingerPrinted Contigs (FPC). Then, all the individual maps are carefully merged into the final physical map. Experimental results on the genomes of rice and barley demonstrate that the compartmentalized assembly produces significantly more accurate maps, and that it can detect and isolate clones that would induce "chimeric" contigs if used in the final assembly. CONCLUSION The software is available for download at http://www.cs.ucr.edu/~sbozdag/assembler/
Affiliation(s)
- Serdar Bozdag
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
12
|
Abstract
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation, as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
Affiliation(s)
- William Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, AZ, USA
|
13
|
Mun JH, Kwon SJ, Yang TJ, Kim HS, Choi BS, Baek S, Kim JS, Jin M, Kim JA, Lim MH, Lee SI, Kim HI, Kim H, Lim YP, Park BS. The first generation of a BAC-based physical map of Brassica rapa. BMC Genomics 2008; 9:280. [PMID: 18549474 PMCID: PMC2432078 DOI: 10.1186/1471-2164-9-280] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2007] [Accepted: 06/12/2008] [Indexed: 11/30/2022] Open
Abstract
Background The genus Brassica includes the most extensively cultivated vegetable crops worldwide. Investigation of the Brassica genome presents excellent challenges to study plant genome evolution and divergence of gene function associated with polyploidy and genome hybridization. A physical map of the B. rapa genome is a fundamental tool for analysis of Brassica "A" genome structure. Integration of a physical map with an existing genetic map by linking genetic markers and BAC clones in the sequencing pipeline provides a crucial resource for the ongoing genome sequencing effort and assembly of whole genome sequences. Results A genome-wide physical map of the B. rapa genome was constructed by the capillary electrophoresis-based fingerprinting of 67,468 Bacterial Artificial Chromosome (BAC) clones using the five restriction enzyme SNaPshot technique. The clones were assembled into contigs by means of FPC v8.5.3. After contig validation and manual editing, the resulting contig assembly consists of 1,428 contigs and is estimated to span 717 Mb in physical length. This map provides 242 anchored contigs on 10 linkage groups to serve as seed points from which to continue bidirectional chromosome extension for genome sequencing. Conclusion The map reported here is the first physical map for the Brassica "A" genome based on the High Information Content Fingerprinting (HICF) technique. This physical map will serve as a fundamental genomic resource for accelerating genome sequencing, assembly of BAC sequences, and comparative genomics between Brassica genomes. The current build of the B. rapa physical map is available at the B. rapa Genome Project website for the user community.
Affiliation(s)
- Jeong-Hwan Mun
- Brassica Genomics Team, National Institute of Agricultural Biotechnology, Rural Development Administration, 225 Seodun-dong, Gwonseon-gu, Suwon 441-707, South Korea.
|
14
|
Wei F, Coe E, Nelson W, Bharti AK, Engler F, Butler E, Kim H, Goicoechea JL, Chen M, Lee S, Fuks G, Sanchez-Villeda H, Schroeder S, Fang Z, McMullen M, Davis G, Bowers JE, Paterson AH, Schaeffer M, Gardiner J, Cone K, Messing J, Soderlund C, Wing RA. Physical and genetic structure of the maize genome reflects its complex evolutionary history. PLoS Genet 2008; 3:e123. [PMID: 17658954 PMCID: PMC1934398 DOI: 10.1371/journal.pgen.0030123] [Citation(s) in RCA: 228] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2007] [Accepted: 06/11/2007] [Indexed: 11/21/2022] Open
Abstract
Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes. As a cash crop and a model biological system, maize is of great public interest. To facilitate maize molecular breeding and its basic biology research, we built a high-resolution physical map with two different fingerprinting methods on the same set of bacterial artificial chromosome clones. The physical map was integrated to a high-density genetic map and further serves as a framework for the maize genome-sequencing project. Comparative genomics showed that the euchromatic regions between rice and maize are very conserved. Physically we delimited these conserved regions and thus detected many genome rearrangements. We defined extensively the duplication blocks within the maize genome. 
These blocks allowed us to reconstruct the chromosomes of the maize progenitor. We detected that the maize genome has experienced two rounds of genome duplications, an ancient one before maize–rice divergence and a recent one after tetraploidization.
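The relationship between the local 3.2-fold expansion and the roughly 6-fold genome size difference can be checked with a back-of-the-envelope calculation. This is only one reading of the figures quoted in the abstract, not the authors' own computation, and the variable names are illustrative.

```python
# Figures taken from the abstract above.
local_expansion = 3.2              # kb of maize per 1 kb of rice in collinear regions
maize_copies_per_rice_region = 2   # most rice regions correspond to two maize regions

# Combined expansion implied by the two factors together.
implied_fold = local_expansion * maize_copies_per_rice_region
```

The product, about 6.4, is in line with the reported 6-fold expansion.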
Affiliation(s)
- Fusheng Wei
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Ed Coe
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
| | - William Nelson
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
| | - Arvind K Bharti
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
| | - Fred Engler
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
| | - Ed Butler
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Jose Luis Goicoechea
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Mingsheng Chen
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Seunghee Lee
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Galina Fuks
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
| | - Hector Sanchez-Villeda
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Steven Schroeder
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Zhiwei Fang
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Michael McMullen
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
| | - Georgia Davis
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - John E Bowers
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
| | - Andrew H Paterson
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
| | - Mary Schaeffer
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
| | - Jack Gardiner
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Karen Cone
- Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
| | - Joachim Messing
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
| | - Carol Soderlund
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
| | - Rod A Wing
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
|
15
|
Nagarajan N, Read TD, Pop M. Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics 2008; 24:1229-35. [PMID: 18356192 PMCID: PMC2373919 DOI: 10.1093/bioinformatics/btn102] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2007] [Revised: 03/05/2008] [Accepted: 03/16/2008] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete sequence of a genome is complicated by specific features of these new sequencing technologies, such as the short length of the sequencing reads and the absence of mate-pair information. In this article we propose methods to overcome such limitations by incorporating information from optical restriction maps. RESULTS We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes, Yersinia aldovae and Yersinia kristensenii. The resulting assemblies contain a single scaffold covering a large fraction of the respective genomes, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes. AVAILABILITY The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma
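An optical restriction map records the ordered sizes of the fragments produced when a genome is cut by a restriction enzyme. To place a contig on such a map, a natural first step is to digest the contig in silico and compare the resulting fragment sizes with the map. The sketch below shows only that digest step, under simplifying assumptions: the recognition site is illustrative, the cut is placed at the start of the site, and the function name is hypothetical. It does not reproduce the matching procedure of the tools described in the abstract.

```python
def in_silico_digest(sequence, site):
    """Return fragment lengths from cutting `sequence` at each occurrence of `site`.

    Simplification: the cut is placed at the start of the recognition site.
    """
    cut_positions = []
    pos = sequence.find(site)
    while pos != -1:
        cut_positions.append(pos)
        pos = sequence.find(site, pos + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    # Skip zero-length fragments (e.g. a site at the very start of the sequence).
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]

# Hypothetical contig containing two copies of the site GAATTC.
contig = "AAAGAATTCCCCCGAATTCTT"
fragments = in_silico_digest(contig, "GAATTC")
```

The ordered list of fragment sizes is what would then be aligned against the optical map, allowing for sizing error and missed or false cuts.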
|
16
|
Ané JM, Zhu H, Frugoli J. Recent Advances in Medicago truncatula Genomics. INTERNATIONAL JOURNAL OF PLANT GENOMICS 2008; 2008:256597. [PMID: 18288239 PMCID: PMC2216067 DOI: 10.1155/2008/256597] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/02/2007] [Accepted: 09/14/2007] [Indexed: 05/23/2023]
Abstract
Legume rotation has allowed a consistent increase in crop yield, and consequently in human population, since antiquity. Legumes will also be instrumental in our ability to maintain the sustainability of our agriculture while facing the challenges of increasing food and biofuel demand. Medicago truncatula and Lotus japonicus have emerged during the last decade as two major model systems for legume biology. Initially developed to dissect plant-microbe symbiotic interactions and especially legume nodulation, these two models are now widely used in a variety of biological fields from plant physiology and development to population genetics and structural genomics. This review highlights the genetic and genomic tools available to the M. truncatula community. Comparative genomic approaches to transfer biological information between model systems and legume crops are also discussed.
Affiliation(s)
- Jean-Michel Ané
- Department of Agronomy, University of Wisconsin, Madison, WI 53706, USA
| | - Hongyan Zhu
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, KY 40546, USA
| | - Julia Frugoli
- Department of Genetics and Biochemistry, Clemson University, 100 Jordan Hall, Clemson, SC 29634, USA
|
17
|
Shultz JL, Ali S, Ballard L, Lightfoot DA. Development of a physical map of the soybean pathogen Fusarium virguliforme based on synteny with Fusarium graminearum genomic DNA. BMC Genomics 2007; 8:262. [PMID: 17683537 PMCID: PMC1978504 DOI: 10.1186/1471-2164-8-262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2007] [Accepted: 08/03/2007] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Reference genome sequences within the major taxa can be used to assist the development of genomic tools for related organisms. A major constraint in the use of these sequenced and annotated genomes is divergent evolution. Divergence of organisms from a common ancestor may have occurred millions of years ago, leading to apparently unrelated and non-syntenic genomes when sequence alignment is attempted. RESULTS A series of programs were written to prepare 36 Mbp of Fusarium graminearum sequence in 19 scaffolds as a reference genome. Exactly 4,152 bacterial artificial chromosome (BAC) end sequences from 2,178 large-insert Fusarium virguliforme clones were tested against this sequence. A total of 94 maps of F. graminearum sequence scaffolds, annotated exonic fragments and associated F. virguliforme sequences resulted. CONCLUSION Developed here was a technique that allowed the comparison of genomes based on small, 15 bp regions of shared identity. The main power of this method lay in its ability to align diverged sequences. This work is unique in that discontinuous sequences were used for the analysis and information not readily apparent, such as match direction, is presented. The 94 maps and JAVA programs are freely available on the Web and by request.
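Comparison based on short regions of shared identity can be sketched as exact k-mer matching with k = 15, reporting the strand of each match so that match direction is visible. This is a minimal illustration written in Python rather than the authors' JAVA programs; the function names and example sequences are hypothetical, and the code assumes sequences contain only A, C, G and T.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (assumes only A, C, G, T)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def shared_kmer_hits(reference, query, k=15):
    """List (ref_pos, query_pos, strand) for exact k-mer matches.

    For strand "-", query_pos is measured on the reverse-complemented query.
    """
    # Index every k-mer of the reference by its start positions.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    hits = []
    for strand, seq in (("+", query), ("-", reverse_complement(query))):
        for j in range(len(seq) - k + 1):
            for i in index.get(seq[j:j + k], []):
                hits.append((i, j, strand))
    return hits

# Hypothetical example: a 15-base segment present on opposite strands.
core = "ACGATCGTAGCTAGC"
reference = "TTTT" + core + "GGGG"
query = reverse_complement(core)
hits = shared_kmer_hits(reference, query)
```

In this example the only match is found on the "-" strand, showing how a clone end sequenced in the opposite orientation would still be placed on the reference.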
Affiliation(s)
- Jeffry L Shultz
- USDA-ARS, Crop Genetics and Production Research Unit, PO Box 345, Stoneville, MS 38776, USA.
|
18
|
Affiliation(s)
- Pablo D Rabinowicz
- J. C. Venter Institute, 9712 Medical Center Drive, Rockville, Maryland 20850, USA.
|
19
|
Kim H, San Miguel P, Nelson W, Collura K, Wissotski M, Walling JG, Kim JP, Jackson SA, Soderlund C, Wing RA. Comparative physical mapping between Oryza sativa (AA genome type) and O. punctata (BB genome type). Genetics 2007; 176:379-90. [PMID: 17339227 PMCID: PMC1893071 DOI: 10.1534/genetics.106.068783] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2006] [Accepted: 02/09/2007] [Indexed: 11/18/2022] Open
Abstract
A comparative physical map of the AA genome (Oryza sativa) and the BB genome (O. punctata) was constructed by aligning a physical map of O. punctata, deduced from 63,942 BAC end sequences (BESs) and 34,224 fingerprints, onto the O. sativa genome sequence. The level of conservation of each chromosome between the two species was determined by calculating a ratio of BES alignments. The alignment result suggests more divergence of intergenic and repeat regions in comparison to gene-rich regions. Further, this characteristic enabled localization of heterochromatic and euchromatic regions for each chromosome of both species. The alignment identified 16 locations containing expansions, contractions, inversions, and transpositions. By aligning 40% of the punctata BES on the map, 87% of the punctata FPC map covered 98% of the O. sativa genome sequence. The genome size of O. punctata was estimated to be 8% larger than that of O. sativa with individual chromosome differences of 1.5-16.5%. The sum of expansions and contractions observed in regions >500 kb were similar, suggesting that most of the contractions/expansions contributing to the genome size difference between the two species are small, thus preserving the macro-collinearity between these species, which diverged approximately 2 million years ago.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA
20
Quiniou SMA, Waldbieser GC, Duke MV. A first generation BAC-based physical map of the channel catfish genome. BMC Genomics 2007; 8:40. [PMID: 17284319 PMCID: PMC1800894 DOI: 10.1186/1471-2164-8-40] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.5] [Received: 10/27/2006] [Accepted: 02/06/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Channel catfish, Ictalurus punctatus, is the leading species in North American aquaculture. Genetic improvement of catfish is performed through selective breeding, and genomic tools will help improve selection efficiency. A physical map is needed to integrate the genetic map with the karyotype and to support fine mapping of phenotypic trait alleles such as Quantitative Trait Loci (QTL) and the effective positional cloning of genes. RESULTS: A genome-wide physical map of the channel catfish was constructed by High-Information-Content Fingerprinting (HICF) of 46,548 Bacterial Artificial Chromosomes (BAC) clones using the SNaPshot technique. The clones were assembled into contigs with FPC software. The resulting assembly contained 1,782 contigs and covered an estimated physical length of 0.93 Gb. The validity of the assembly was demonstrated by 1) anchoring 19 of the largest contigs to the microsatellite linkage map 2) comparing the assembly of a multi-gene family to Restriction Fragment Length Polymorphism (RFLP) patterns seen in Southern blots, and 3) contig sequencing. CONCLUSION: This is the first physical map for channel catfish. The HICF technique allowed the project to be finished with a limited amount of human resource in a high throughput manner. This physical map will greatly facilitate the detailed study of many different genomic regions in channel catfish, and the positional cloning of genes controlling economically important production traits.
Affiliation(s)
- Mary V Duke
- USDA-ARS/CGRU, 141 Experiment Station Rd, Stoneville, MS 38776, USA
21
Shopinski KL, Iqbal MJ, Shultz JL, Jayaraman D, Lightfoot DA. Development of a pooled probe method for locating small gene families in a physical map of soybean using stress related paralogues and a BAC minimum tile path. Plant Methods 2006; 2:20. [PMID: 17156445 PMCID: PMC1716159 DOI: 10.1186/1746-4811-2-20] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 10/31/2006] [Accepted: 12/08/2006] [Indexed: 05/12/2023]
Abstract
BACKGROUND: Genome analysis of soybean (Glycine max L.) has been complicated by its paleo-autopolyploid nature and conserved homeologous regions. Landmarks of expressed sequence tags (ESTs) located within a minimum tile path (MTP) of contiguous (contig) bacterial artificial chromosome (BAC) clones or radiation hybrid set can identify stress and defense related gene rich regions in the genome. A physical map of about 2,800 contigs and MTPs of 8,064 BAC clones encompass the soybean genome. That genome is being sequenced by whole genome shotgun methods so that reliable estimates of gene family size and gene locations will provide a useful tool for finishing. The aims here were to develop methods to anchor plant defense- and stress-related gene paralogues on the MTP derived from the soybean physical map, to identify gene rich regions and to correlate those with QTL for disease resistance. RESULTS: The probes included 143 ESTs from a root library selected by subtractive hybridization from a multiply disease resistant soybean cultivar 'Forrest' 14 days after inoculation with Fusarium solani f. sp. glycines (F. virguliforme). Another 166 probes were chosen from a root EST library (Gm-r1021) prepared from a non-inoculated soybean cultivar 'Williams 82' based on their homology to the known defense and stress related genes. Twelve and thirteen pooled EST probes were hybridized to high-density colony arrays of MTP BAC clones from the cv. 'Forrest' genome. The EST pools located 613 paralogues for 201 of the 309 probes used (range 1-13 per functional probe). One hundred BAC clones contained more than one kind of paralogue. Many more BACs (246) contained a single paralogue of one of the 201 probes detectable gene families. ESTs were anchored on soybean linkage groups A1, B1, C2, E, D1a+Q, G, I, M, H, and O. CONCLUSION: Estimates of gene family sizes were more similar to those made by Southern hybridization than by bioinformatics inferences from EST collections. When compared to Arabidopsis thaliana there were more 2 and 4 member paralogue families reflecting the diploidized-tetraploid nature of the soybean genome. However there were fewer families with 5 or more genes and the same number of single genes. Therefore the method can identify evolutionary patterns such as massively extensive selective gene loss or rapid divergence to regenerate the unique genes in some families.
Affiliation(s)
- Kay L Shopinski
- Department of Plant, Soil and Agriculture Systems, Room 176, Agriculture Building, MC 4415, Southern Illinois University, Carbondale, IL 62901, USA
- Dept of Plant Molecular Biology, United States Department of Agriculture, Peoria, IL, USA
- Muhammad J Iqbal
- Institute for Sustainable and Renewable Resources (ISRR), Institute for Advanced Learning and Research (IALR), Danville, VA 24540, USA
- Jeffry L Shultz
- Department of Plant, Soil and Agriculture Systems, Room 176, Agriculture Building, MC 4415, Southern Illinois University, Carbondale, IL 62901, USA
- Dept of Soybean Genetics, United States Department of Agriculture, Stoneville, MS 38776, USA
- Dheepakkumaran Jayaraman
- Department of Plant, Soil and Agriculture Systems, Room 176, Agriculture Building, MC 4415, Southern Illinois University, Carbondale, IL 62901, USA
- David A Lightfoot
- Department of Plant, Soil and Agriculture Systems, Room 176, Agriculture Building, MC 4415, Southern Illinois University, Carbondale, IL 62901, USA
22
Zhang X, Scheuring C, Tripathy S, Xu Z, Wu C, Ko A, Tian SK, Arredondo F, Lee MK, Santos FA, Jiang RHY, Zhang HB, Tyler BM. An integrated BAC and genome sequence physical map of Phytophthora sojae. Mol Plant Microbe Interact 2006; 19:1302-10. [PMID: 17153914 DOI: 10.1094/mpmi-19-1302] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 05/12/2023]
Abstract
Phytophthora spp. are serious pathogens that threaten numerous cultivated crops, trees, and natural vegetation worldwide. The soybean pathogen P. sojae has been developed as a model oomycete. Here, we report a bacterial artificial chromosome (BAC)-based, integrated physical map of the P. sojae genome. We constructed two BAC libraries, digested 8,681 BACs with seven restriction enzymes, end labeled the digested fragments with four dyes, and analyzed them with capillary electrophoresis. Fifteen data sets were constructed from the fingerprints, using individual dyes and all possible combinations, and were evaluated for contig assembly. In all, 257 contigs were assembled from the XhoI data set, collectively spanning approximately 132 Mb in physical length. The BAC contigs were integrated with the draft genome sequence of P. sojae by end sequencing a total of 1,440 BACs that formed a minimal tiling path. This enabled the 257 contigs of the BAC map to be merged with 207 sequence scaffolds to form an integrated map consisting of 79 superscaffolds. The map represents the first genome-wide physical map of a Phytophthora sp. and provides a valuable resource for genomics and molecular biology research in P. sojae and other Phytophthora spp. In one illustration of this value, we have placed the 350 members of a superfamily of putative pathogenicity effector genes onto the map, revealing extensive clustering of these genes.
Affiliation(s)
- Xuemin Zhang
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0477, USA
23
Cannon SB, Sterck L, Rombauts S, Sato S, Cheung F, Gouzy J, Wang X, Mudge J, Vasdewani J, Schiex T, Spannagl M, Monaghan E, Nicholson C, Humphray SJ, Schoof H, Mayer KFX, Rogers J, Quétier F, Oldroyd GE, Debellé F, Cook DR, Retzel EF, Roe BA, Town CD, Tabata S, Van de Peer Y, Young ND. Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes. Proc Natl Acad Sci U S A 2006; 103:14959-64. [PMID: 17003129 PMCID: PMC1578499 DOI: 10.1073/pnas.0603228103] [Citation(s) in RCA: 237] [Impact Index Per Article: 13.2] [Indexed: 01/06/2023] Open
Abstract
Genome sequencing of the model legumes, Medicago truncatula and Lotus japonicus, provides an opportunity for large-scale sequence-based comparison of two genomes in the same plant family. Here we report synteny comparisons between these species, including details about chromosome relationships, large-scale synteny blocks, microsynteny within blocks, and genome regions lacking clear correspondence. The Lotus and Medicago genomes share a minimum of 10 large-scale synteny blocks, each with substantial collinearity and frequently extending the length of whole chromosome arms. The proportion of genes syntenic and collinear within each synteny block is relatively homogeneous. Medicago-Lotus comparisons also indicate similar and largely homogeneous gene densities, although gene-containing regions in Mt occupy 20-30% more space than Lj counterparts, primarily because of larger numbers of Mt retrotransposons. Because the interpretation of genome comparisons is complicated by large-scale genome duplications, we describe synteny, synonymous substitutions and phylogenetic analyses to identify and date a probable whole-genome duplication event. There is no direct evidence for any recent large-scale genome duplication in either Medicago or Lotus but instead a duplication predating speciation. Phylogenetic comparisons place this duplication within the Rosid I clade, clearly after the split between legumes and Salicaceae (poplar).
Affiliation(s)
- Steven B. Cannon
- Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108
- U.S. Department of Agriculture–Agricultural Research Service and Department of Agronomy, Iowa State University, Ames, IA 50010
- Lieven Sterck
- Department of Plant Systems Biology (VIB), Ghent University, B-9052 Ghent, Belgium
- Stephane Rombauts
- Department of Plant Systems Biology (VIB), Ghent University, B-9052 Ghent, Belgium
- Shusei Sato
- Kazusa DNA Research Institute, Kisarazu, Chiba 292-0818, Japan
- Foo Cheung
- Institute for Genomic Research, Rockville, MD 20850
- Jérôme Gouzy
- Laboratoire des Interactions Plantes–Microorganismes, Institut National de la Recherche Agronomique–Centre National de la Recherche Scientifique, 31326 Castanet-Tolosan, France
- Xiaohong Wang
- Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108
- Joann Mudge
- Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108
- Thomas Schiex
- Unité de Biométrie et Intelligence Artificielle, B.P. 52627, Institut National de la Recherche Agronomique, 31326 Castanet-Tolosan, France
- Manuel Spannagl
- Munich Information Center for Protein Sequences Institute for Bioinformatics, Gesellschaft für Strahlung und Umweltforschung, Research Center for Environment and Health, 85764 Neuherberg, Germany
- Christine Nicholson
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Sean J. Humphray
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Heiko Schoof
- Max Planck Institute for Plant Breeding Research, 50829 Köln, Germany
- Klaus F. X. Mayer
- Munich Information Center for Protein Sequences Institute for Bioinformatics, Gesellschaft für Strahlung und Umweltforschung, Research Center for Environment and Health, 85764 Neuherberg, Germany
- Jane Rogers
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Frédéric Debellé
- Laboratoire des Interactions Plantes–Microorganismes, Institut National de la Recherche Agronomique–Centre National de la Recherche Scientifique, 31326 Castanet-Tolosan, France
- Douglas R. Cook
- Department of Plant Pathology, University of California, One Shields Avenue, Davis, CA 95616
- Ernest F. Retzel
- Center for Computational Genomics and Bioinformatics, Minneapolis, MN 55455
- Bruce A. Roe
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019
- Satoshi Tabata
- Kazusa DNA Research Institute, Kisarazu, Chiba 292-0818, Japan
- Yves Van de Peer
- Department of Plant Systems Biology (VIB), Ghent University, B-9052 Ghent, Belgium
- Nevin D. Young
- Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108
- To whom correspondence should be addressed. E-mail:
24
Shultz JL, Yesudas C, Yaegashi S, Afzal AJ, Kazi S, Lightfoot DA. Three minimum tile paths from bacterial artificial chromosome libraries of the soybean (Glycine max cv. 'Forrest'): tools for structural and functional genomics. Plant Methods 2006; 2:9. [PMID: 16725032 PMCID: PMC1524761 DOI: 10.1186/1746-4811-2-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/20/2006] [Accepted: 05/25/2006] [Indexed: 05/04/2023]
Abstract
BACKGROUND: The creation of minimally redundant tile paths (hereafter MTP) from contiguous sets of overlapping clones (hereafter contigs) in physical maps is a critical step for structural and functional genomics. Build 4 of the physical map of soybean (Glycine max L. Merr. cv. 'Forrest') showed the 1 Gbp haploid genome was composed of 0.7 Gbp diploid, 0.1 Gbp tetraploid and 0.2 Gbp octoploid regions. Therefore, the size of the unique genome was about 0.8 Gbp. The aim here was to create MTP sub-libraries from the soybean cv. Forrest physical map builds 2 to 4. RESULTS: The first MTP, named MTP2, was 14,208 clones (of mean insert size 140 kbp) picked from the 5,597 contigs of build 2. MTP2 was constructed from three BAC libraries (BamHI (B), HindIII (H) and EcoRI (E) inserts). MTP2 encompassed the contigs of build 3 that derived from build 2 by a series of contig merges. MTP2 encompassed 2 Gbp compared to the soybean haploid genome of 1 Gbp and does not distinguish regions by ploidy. The second and third MTPs, called MTP4BH and MTP4E, were each based on build 4. Each was semi-automatically selected from 2,854 contigs. MTP4BH was 4,608 B and H insert clones of mean size 173 kbp in the large (27.6 kbp) T-DNA vector pCLD04541. MTP4BH was suitable for plant transformation and functional genomics. MTP4E was 4,608 BAC clones with large inserts (mean 175 kbp) in the small (7.5 kbp) pECBAC1 vector. MTP4E was suitable for DNA sequencing. MTP4BH and MTP4E clones each encompassed about 0.8 Gbp, the 0.7 Gbp diploid regions and 0.05 Gbp each from the tetraploid and octoploid regions. MTP2 and MTP4BH were used for BAC-end sequencing, EST integration, micro-satellite integration into the physical map and high information content fingerprinting. MTP4E will be used for genome sequence by pooled genomic clone index. CONCLUSION: Each MTP and associated BES will be useful to deconvolute and ultimately finish the whole genome shotgun sequence of soybean.
Affiliation(s)
- JL Shultz
- Dept of Soybean Genetics, United States Department of Agriculture, Stoneville, MS 38776, USA
- Dept. of Plant Soil and Agricultural Systems, Genomics and Biotechnology Facility, Center for Excellence in Soybean Research, Southern Illinois University, Carbondale, IL 62901, USA
- C Yesudas
- Dept. of Plant Soil and Agricultural Systems, Genomics and Biotechnology Facility, Center for Excellence in Soybean Research, Southern Illinois University, Carbondale, IL 62901, USA
- S Yaegashi
- Dept of Soybean Genetics, United States Department of Agriculture, Stoneville, MS 38776, USA
- Dept of Bioinformatics, University of Tokyo, Tokyo, Japan
- AJ Afzal
- Dept. of Plant Soil and Agricultural Systems, Genomics and Biotechnology Facility, Center for Excellence in Soybean Research, Southern Illinois University, Carbondale, IL 62901, USA
- DA Lightfoot
- Dept. of Plant Soil and Agricultural Systems, Genomics and Biotechnology Facility, Center for Excellence in Soybean Research, Southern Illinois University, Carbondale, IL 62901, USA
25
Nelson WM, Bharti AK, Butler E, Wei F, Fuks G, Kim H, Wing RA, Messing J, Soderlund C. Whole-genome validation of high-information-content fingerprinting. Plant Physiol 2005; 139:27-38. [PMID: 16166258 PMCID: PMC1203355 DOI: 10.1104/pp.105.061978] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 05/04/2023]
Abstract
Fluorescent-based high-information-content fingerprinting (HICF) techniques have recently been developed for physical mapping. These techniques make use of automated capillary DNA sequencing instruments to enable both high-resolution and high-throughput fingerprinting. In this article, we report the construction of a whole-genome HICF FPC map for maize (Zea mays subsp. mays cv B73), using a variant of HICF in which a type IIS restriction enzyme is used to generate the fluorescently labeled fragments. The HICF maize map was constructed from the same three maize bacterial artificial chromosome libraries as previously used for the whole-genome agarose FPC map, providing a unique opportunity for direct comparison of the agarose and HICF methods; as a result, it was found that HICF has substantially greater sensitivity in forming contigs. An improved assembly procedure is also described that uses automatic end-merging of contigs to reduce the effects of contamination and repetitive bands. Several new features in FPC v7.2 are presented, including shared-memory multiprocessing, which allows dramatically faster assemblies, and automatic end-merging, which permits more accurate assemblies. It is further shown that sequenced clones may be digested in silico and located accurately on the HICF assembly, despite size deviations that prevent the precise prediction of experimental fingerprints. Finally, repetitive bands are isolated, and their effect on the assembly is studied.
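As a rough illustration of what "digested in silico and located on the assembly" involves, the sketch below cuts a sequence at every occurrence of a recognition site and then counts how many predicted fragment sizes fall within a tolerance of observed bands. It is not the FPC implementation: the function names, the default EcoRI-style site, the cut offset, and the tolerance are all assumptions made for the example, and the type IIS enzyme chemistry used in the HICF variant is not modeled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths from cutting a linear sequence at each site.

    `site` and `cut_offset` are illustrative defaults (an EcoRI-like cut
    one base into the recognition site), not values taken from the paper.
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def shared_bands(predicted, observed, tolerance=2):
    """Greedily count predicted sizes matched by a distinct observed band.

    A match means the two sizes differ by at most `tolerance`; each
    observed band can be used only once.
    """
    remaining = sorted(observed)
    matches = 0
    for size in sorted(predicted):
        for i, band in enumerate(remaining):
            if abs(band - size) <= tolerance:
                matches += 1
                del remaining[i]
                break
    return matches


fragments = digest("AAAAGAATTCCCCCCGAATTCTT")
print(fragments)
print(shared_bands(fragments, [6, 12, 30]))
```

The tolerance parameter stands in for the size deviations the abstract mentions: predicted and measured fragment sizes rarely agree exactly, so placement relies on approximate band sharing rather than exact equality.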
Affiliation(s)
- William M Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, 85721, USA
26
Pampanwar V, Engler F, Hatfield J, Blundy S, Gupta G, Soderlund C. FPC Web tools for rice, maize, and distribution. Plant Physiol 2005; 138:116-26. [PMID: 15888684 PMCID: PMC1104167 DOI: 10.1104/pp.104.056291] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 05/02/2023]
Abstract
Many clone-based physical maps have been built with the FingerPrinted Contig (FPC) software, which is written in C and runs locally for fast and flexible analysis. If the maps were viewable only from FPC, they would not be as useful to the whole community since FPC must be installed on the user machine and the database downloaded. Hence, we have created a set of Web tools so users can easily view the FPC data and perform salient queries with standard browsers. This set includes the following four programs: WebFPC, a view of the contigs; WebChrom, the location of the contigs and genetic markers along the chromosome; WebBSS, locating user-supplied sequence on the map; and WebFCmp, comparing fingerprints. For additional FPC support, we have developed an FPC module for BioPerl and an FPC browser using the Generic Model Organism Project (GMOD) genome browser (GBrowse), where the FPC BioPerl module generates the data files for input into GBrowse. This provides an alternative to the WebChrom/WebFPC view. These tools are available to download along with documentation. The tools have been implemented for both the rice (Oryza sativa) and maize (Zea mays) FPC maps, which both contain the locations of clones, markers, genetic markers, and sequenced clone (along with links to sites that contain additional information).
Affiliation(s)
- Vishal Pampanwar
- Arizona Genomic Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, Arizona 85721, USA
27
Lai J, Ma J, Swigonová Z, Ramakrishna W, Linton E, Llaca V, Tanyolac B, Park YJ, Jeong OY, Bennetzen JL, Messing J. Gene loss and movement in the maize genome. Genome Res 2004; 14:1924-31. [PMID: 15466290 PMCID: PMC524416 DOI: 10.1101/gr.2701104] [Citation(s) in RCA: 166] [Impact Index Per Article: 8.3] [Received: 04/21/2004] [Accepted: 07/15/2004] [Indexed: 01/20/2023]
Abstract
Maize (Zea mays L. ssp. mays), one of the most important agricultural crops in the world, originated by hybridization of two closely related progenitors. To investigate the fate of its genes after tetraploidization, we analyzed the sequence of five duplicated regions from different chromosomal locations. We also compared corresponding regions from sorghum and rice, two important crops that have largely collinear maps with maize. The split of sorghum and maize progenitors was recently estimated to be 11.9 Mya, whereas rice diverged from the common ancestor of maize and sorghum approximately 50 Mya. A data set of roughly 4 Mb yielded 206 predicted genes from the three species, excluding any transposon-related genes, but including eight gene remnants. On average, 14% of the genes within the aligned regions are noncollinear between any two species. However, scoring each maize region separately, the set of noncollinear genes between all four regions jumps to 68%. This is largely because at least 50% of the duplicated genes from the two progenitors of maize have been lost over a very short period of time, possibly as short as 5 million years. Using the nearly completed rice sequence, we found noncollinear genes in other chromosomal positions, frequently in more than one. This demonstrates that many genes in these species have moved to new chromosomal locations in the last 50 million years or less, most as single gene events that did not dramatically alter gene structure.
Affiliation(s)
- Jinsheng Lai
- Waksman Institute of Microbiology, Rutgers University, Piscataway, New Jersey 08854-8020, USA
28
Chen R, Sodergren E, Weinstock GM, Gibbs RA. Dynamic building of a BAC clone tiling path for the Rat Genome Sequencing Project. Genome Res 2004; 14:679-84. [PMID: 15060010 PMCID: PMC383313 DOI: 10.1101/gr.2171704] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
Abstract
CLONEPICKER is a software pipeline that integrates sequence data with BAC clone fingerprints to dynamically select a minimal overlapping clone set covering the whole genome. In the Rat Genome Sequencing Project (RGSP), a hybrid strategy of "clone by clone" and "whole genome shotgun" approaches was used to maximize the merits of both approaches. Like the "clone by clone" method, one key challenge for this strategy was to select a low-redundancy clone set that covered the whole genome while the sequencing is in progress. The CLONEPICKER pipeline met this challenge using restriction enzyme fingerprint data, BAC end sequence data, and sequences generated from individual BAC clones as well as WGS reads. In the RGSP, an average of 7.5 clones was identified from each side of a seed clone, and the minimal overlapping clones were reliably selected. Combined with the assembled BAC fingerprint map, a set of BAC clones that covered >97% of the genome was identified and used in the RGSP.
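The idea of a minimal overlapping clone set can be illustrated with the classic greedy interval-cover approach: starting from the left end, repeatedly pick the clone that overlaps the covered region and reaches furthest right. This is only a sketch under strong assumptions, not the CLONEPICKER algorithm; it presumes each clone already has known start and end coordinates on a single contig, whereas the real pipeline infers overlaps from fingerprints, BAC end sequences, and sequence data while sequencing is in progress. The function name and the clone tuples are hypothetical.

```python
def minimal_tiling_path(clones):
    """Greedy minimal tiling path over (name, start, end) clone intervals.

    Returns the names of a smallest set of clones covering the span from
    the leftmost start to the rightmost end. Raises ValueError when the
    clones leave a gap that no clone bridges.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    if not ordered:
        return []
    target_end = max(end for _, _, end in ordered)
    covered = ordered[0][1]
    path, i = [], 0
    while covered < target_end:
        best = None
        # Among clones starting inside the covered region, keep the one
        # that extends coverage furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical clone coordinates in kb, for illustration only.
clones = [("a", 0, 100), ("b", 50, 180), ("c", 90, 200),
          ("d", 150, 300), ("e", 190, 310)]
print(minimal_tiling_path(clones))
```

Choosing the furthest-reaching overlapping clone at each step is what keeps redundancy low: any clone skipped in one round ends no further right than the clone that was picked, so it can never be needed later.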
Affiliation(s)
- Rui Chen
- Department of Molecular and Human Genetics, Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA.