Marano LA, Marcorin L, Castelli EDC, Mendes-Junior CT. Evaluation of MC1R high-throughput nucleotide sequencing data generated by the 1000 Genomes Project.
Genet Mol Biol 2017; 40:530-539. [PMID: 28486572 PMCID: PMC5488459 DOI: 10.1590/1678-4685-gmb-2016-0180]
[Received: 07/01/2016] [Accepted: 12/23/2016] [Indexed: 02/06/2023]
Abstract
The advent of next-generation sequencing allows simultaneous processing of several
genomic regions/individuals, increasing the availability and accuracy of whole-genome
data. However, these new approaches may introduce errors and biases arising from
alignment, genotype calling, and imputation methods. Despite these flaws, data
obtained by next-generation sequencing can be valuable for population and
evolutionary studies of specific genes, such as genes related to how pigmentation
evolved among populations, one of the main topics in human evolutionary biology.
Melanocortin-1 receptor (MC1R) is one of the most studied genes
involved in pigmentation variation. As MC1R has already been
suggested to affect melanogenesis and increase risk of developing melanoma, it
constitutes one of the best models to understand how natural selection acts on
pigmentation. Here we employed a locally developed pipeline to obtain genotype and
haplotype data for MC1R from the raw sequencing data provided by the
1000 Genomes FTP site. We also compared these genotype data with the Phase
3 VCF to evaluate its quality and discover any polymorphic sites that may have been
overlooked. In conclusion, either the VCF file or one of the presently described
pipelines could be used to obtain reliable and accurate genotype calling from the
1000 Genomes Phase 3 data.