1
Sivabharathi RC, Rajagopalan VR, Suresh R, Sudha M, Karthikeyan G, Jayakanthan M, Raveendran M. Haplotype-based breeding: A new insight in crop improvement. Plant Science: An International Journal of Experimental Plant Biology 2024; 346:112129. [PMID: 38763472 DOI: 10.1016/j.plantsci.2024.112129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 05/09/2024] [Accepted: 05/15/2024] [Indexed: 05/21/2024]
Abstract
Haplotype-based breeding (HBB) is one of the cutting-edge technologies in crop improvement, made possible by the increasing availability of single nucleotide polymorphisms (SNPs) identified by next-generation sequencing technologies. Combining thousands of SNPs into a few hundred haplotype blocks reduces the complexity of the data, requiring fewer statistical tests and lowering the probability of spurious associations. The presence of strong genomic regions in breeding lines of most crop species facilitates the use of haplotypes to improve the efficiency of genomic and marker-assisted selection. As a Genomic Assisted Breeding (GAB) approach, haplotype-based breeding harnesses genome sequence data to pinpoint the allelic variation used to hasten the breeding cycle and circumvent the challenges associated with linkage drag. This review article describes approaches to candidate gene identification, superior haplotype identification, haplo-pheno analysis, and haplotype-based marker-assisted selection. Crop improvement strategies that utilize superior haplotypes will hasten breeding progress to safeguard global food security.
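The reduction in test count described in the abstract above can be illustrated with a minimal sketch. It assumes phased data given as one allele string per chromosome and uses fixed-size windows as a stand-in for haplotype blocks; in practice blocks are usually delimited by linkage disequilibrium, and the function name, toy data, and Bonferroni correction are illustrative assumptions, not the review's method.

```python
from collections import Counter

def haplotypes_per_block(genotypes, block_size):
    """Group consecutive SNP positions into fixed-size windows (an assumed,
    simplified block definition) and count the haplotype strings in each."""
    n_snps = len(genotypes[0])
    blocks = []
    for start in range(0, n_snps, block_size):
        counts = Counter(row[start:start + block_size] for row in genotypes)
        blocks.append(counts)
    return blocks

# Hypothetical phased data: one string per chromosome, one character per SNP.
phased = [
    "ACGTTA",
    "ACGTTA",
    "ATGCTA",
    "ATGCAA",
]

blocks = haplotypes_per_block(phased, block_size=3)

alpha = 0.05
n_snps = len(phased[0])
snp_threshold = alpha / n_snps          # Bonferroni over single-SNP tests
block_threshold = alpha / len(blocks)   # Bonferroni over block-level tests
```

With six SNPs collapsed into two blocks, the per-test threshold becomes less stringent because fewer tests are performed, which is the effect the abstract attributes to haplotype blocks.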
Affiliation(s)
- R C Sivabharathi
- Department of Genetics and Plant Breeding, CPBG, Tamil Nadu Agricultural University, Coimbatore 641003, India
- Veera Ranjani Rajagopalan
- Department of Plant Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore 641003, India
- R Suresh
- Department of Rice, CPBG, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Sudha
- Department of Plant Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore 641003, India
- G Karthikeyan
- Department of Plant Pathology, CPPS, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Jayakanthan
- Department of Plant Molecular Biology and Bioinformatics, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Raveendran
- Directorate of Research, Tamil Nadu Agricultural University, Coimbatore 641003, India
2
Liu W, Jiao X, Thutkawkorapin J, Mahdessian H, Lindblom A. Cancer risk susceptibility loci in a Swedish population. Oncotarget 2017; 8:110300-110310. [PMID: 29299148 PMCID: PMC5746383 DOI: 10.18632/oncotarget.22687] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/04/2017] [Accepted: 10/27/2017] [Indexed: 12/13/2022] Open
Abstract
A germline mutation in cancer predisposing genes is known to increase the risk of more than one tumor type. In order to find loci associated with many types of cancer, a genome-wide association study (GWAS) was conducted, and 3,555 Swedish cancer cases and 15,581 controls were analyzed for 226,883 SNPs. The study used haplotype analysis instead of single SNP analysis in order to find putative founder effects. Haplotype association studies identified seven risk loci associated with cancer risk, on chromosomes 1, 7, 11, 14, 16, 17 and 21. Four of the haplotypes, on chromosomes 7, 14, 16 and 17, were confirmed in Swedish familial cancer cases. It was possible to perform exome sequencing in one patient for each of those four loci. No clear disease-causing exonic mutation was found in any of the four loci. Some of the candidate loci hold several cancer genes, suggesting that the risk associated with one locus could involve more than one gene associated with cancer risk. In summary, this study identified seven novel candidate loci associated with cancer risk. It was also suggested that cancer risk at one locus could depend on multiple contributing risk mutations/genes.
Affiliation(s)
- Wen Liu
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Xiang Jiao
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Hovsep Mahdessian
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Annika Lindblom
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
3
Lombardi A, Menconi F, Greenberg D, Concepcion E, Leo M, Rocchi R, Marinó M, Keddache M, Tomer Y. Dissecting the Genetic Susceptibility to Graves' Disease in a Cohort of Patients of Italian Origin. Front Endocrinol (Lausanne) 2016; 7:21. [PMID: 27014188 PMCID: PMC4781855 DOI: 10.3389/fendo.2016.00021] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/10/2015] [Accepted: 02/22/2016] [Indexed: 11/13/2022] Open
Abstract
Graves' disease (GD) is an autoimmune oligogenic disorder with a strong hereditary component. Several GD susceptibility genes have been identified and confirmed during the last two decades. However, very few studies have evaluated susceptibility genes for GD in specific geographic subsets. Previously, we mapped a new locus on chromosome 3q that was unique to GD families of Italian origin. In the present study, we used association analysis of single-nucleotide polymorphisms (SNPs) at the 3q locus in a cohort of GD patients of Italian origin in order to prioritize the candidates among the known genes in this locus and choose the one(s) best supported by the association. DNA samples were genotyped using the Illumina GoldenGate genotyping assay, analyzing 690 SNPs in the linked 3q locus and covering all 124 linkage disequilibrium blocks in this locus. Candidate non-HLA (human leukocyte antigen) genes previously reported to be associated with GD and/or other autoimmune disorders were analyzed separately. Three SNPs in the 3q locus showed a nominal association (p < 0.05): rs13097181, rs763313, and rs6792646. Although these could not be further validated by multiple comparison correction, our aim was to prioritize candidate genes at a locus already known to harbor a GD-related gene, not to test hypotheses. Moreover, we found significant associations with the thyroid-stimulating hormone receptor (TSHR) gene, the cytotoxic T-lymphocyte antigen-4 (CTLA-4) gene, and the thyroglobulin (TG) gene. In conclusion, we identified three SNPs on chromosome 3q that may map a new GD susceptibility gene in this region which is unique to the Italian population. Furthermore, we confirmed that the TSHR, the CTLA-4, and the TG genes are associated with GD in Italians. Our findings highlight the influence of ethnicity and geographic variations on the genetic susceptibility to GD.
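To make the multiple-comparison point in the abstract above concrete, here is a small worked calculation. It assumes a Bonferroni correction over the 690 genotyped SNPs; the abstract does not say which correction method was used, so the method and the variable names are assumptions for illustration only.

```python
n_tests = 690   # SNPs genotyped in the 3q locus (figure taken from the abstract)
alpha = 0.05    # nominal significance level quoted in the abstract

# Assumed correction: Bonferroni divides alpha by the number of tests.
bonferroni_threshold = alpha / n_tests

# Expected number of SNPs reaching p < alpha if no SNP were truly associated.
expected_nominal_hits_under_null = alpha * n_tests

print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
print(f"Expected nominal hits under the null: {expected_nominal_hits_under_null:.1f}")
```

Under this assumption a SNP would need a p-value below roughly 7.2e-5 to survive correction, which shows why a nominal association at p < 0.05 is treated only as a way to prioritize candidates.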
Affiliation(s)
- Angela Lombardi
- Division of Endocrinology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- *Correspondence: Angela Lombardi; Yaron Tomer
- David Greenberg
- Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, OH, USA
- Erlinda Concepcion
- Division of Endocrinology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Marenza Leo
- Endocrinology, University Hospital of Pisa, Pisa, Italy
- Mehdi Keddache
- Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Yaron Tomer
- Division of Endocrinology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bronx VA Medical Center, Bronx, NY, USA
- *Correspondence: Angela Lombardi; Yaron Tomer
4
Mishima H, Sasaki K, Tanaka M, Tatebe O, Yoshiura KI. Agile parallel bioinformatics workflow management using Pwrake. BMC Res Notes 2011; 4:331. [PMID: 21899774 PMCID: PMC3180464 DOI: 10.1186/1756-0500-4-331] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 05/27/2011] [Accepted: 09/08/2011] [Indexed: 12/20/2022] Open
Abstract
Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be too heavyweight for actual bioinformatics practices. We recognize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment, are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method, which proceeds through iterative development phases after trial and error. Here, we show the application of a scientific workflow system, Pwrake, to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genome Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
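The separation of the workflow definition phase from the parameter adjustment phase can be sketched in a few lines. This is not Pwrake or Rake syntax: it is a Python analogy using the standard library, and the step names, parameters, and file names are hypothetical placeholders rather than anything taken from the published workflows.

```python
from concurrent.futures import ThreadPoolExecutor

# Parameter adjustment phase: values expected to change between runs (hypothetical).
PARAMS = {"samples": ["s1", "s2", "s3"], "min_quality": 30}

# Workflow definition phase: the steps and how they connect (hypothetical steps).
def call_variants(sample, min_quality):
    # Placeholder for invoking an external tool on one sample.
    return f"{sample}.q{min_quality}.vcf"

def merge(outputs):
    # Placeholder for a step that depends on all per-sample outputs.
    return "merged(" + ",".join(sorted(outputs)) + ")"

def run_workflow(params, max_workers=2):
    # Independent per-sample tasks run concurrently; merge waits for all of them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(
            lambda sample: call_variants(sample, params["min_quality"]),
            params["samples"],
        ))
    return merge(outputs)

result = run_workflow(PARAMS)
```

Keeping `PARAMS` apart from the step functions means a parameter can be tuned and the workflow rerun without touching the definition, which is the iteration pattern the abstract describes.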
Affiliation(s)
- Hiroyuki Mishima
- Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, 1-12-4 Sakamoto, Nagasaki, Nagasaki, Japan.
5
Sangket U, Mahasirimongkol S, Chantratita W, Tandayya P, Aulchenko YS. ParallABEL: an R library for generalized parallelization of genome-wide association studies. BMC Bioinformatics 2010; 11:217. [PMID: 20429914 PMCID: PMC2879286 DOI: 10.1186/1471-2105-11-217] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/21/2009] [Accepted: 04/29/2010] [Indexed: 11/24/2022] Open
Abstract
Background: Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the programming skills needed to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Results: Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of individuals. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of SNPs/traits. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions: Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
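The partition, distribute, and merge pattern for the first group (per-SNP statistics) can be sketched as follows. This is a Python analogy built on the standard library, not ParallABEL or Rmpi code; the function names, the 0/1/2 genotype coding, and the toy matrix are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_indices(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def allele_frequency(genotypes, snp):
    """Per-SNP statistic: frequency of the counted allele (genotypes coded 0/1/2)."""
    column = [row[snp] for row in genotypes]
    return sum(column) / (2 * len(column))

def per_snp_parallel(genotypes, n_workers):
    """Partition SNPs across workers, compute each chunk, merge results in order."""
    n_snps = len(genotypes[0])

    def work(chunk):
        return [allele_frequency(genotypes, snp) for snp in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(work, split_indices(n_snps, n_workers))
        return [value for part in parts for value in part]

# Hypothetical genotype matrix: rows are individuals, columns are SNPs.
genotypes = [
    [0, 1, 2, 0, 1],
    [1, 1, 2, 0, 0],
    [2, 0, 2, 0, 1],
]
freqs = per_snp_parallel(genotypes, n_workers=2)
```

Threads are used here only to keep the sketch self-contained; they show the data partitioning and merging, not the near-linear speed-up reported in the abstract, which relies on separate processors in a cluster.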
Affiliation(s)
- Unitsa Sangket
- Center for Genomics and Bioinformatics Research, Faculty of Science, Prince of Songkla University, Songkhla, 90112, Thailand.
6
Lu G, Ni J. Highlighting computations in bioscience and bioinformatics: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB07). BMC Bioinformatics 2008; 9 Suppl 6:S1. [PMID: 18541044 PMCID: PMC2423432 DOI: 10.1186/1471-2105-9-s6-s1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
The Second Symposium on Computations in Bioinformatics and Bioscience (SCBB07) was held in Iowa City, Iowa, USA, on August 13–15, 2007. This annual event attracted dozens of bioinformatics professionals and students, who are interested in solving emerging computational problems in bioscience, from China, Japan, Taiwan and the United States. The Scientific Committee of the symposium selected 18 peer-reviewed papers for publication in this supplemental issue of BMC Bioinformatics. These papers cover a broad spectrum of topics in computational biology and bioinformatics, including DNA, protein and genome sequence analysis, gene expression and microarray analysis, computational proteomics and protein structure classification, systems biology and machine learning.
Affiliation(s)
- Guoqing Lu
- Department of Biology, University of Nebraska, Omaha, NE 68182, USA.