1. Park A, Koslicki D. Prokrustean Graph: A substring index for rapid k-mer size analysis. bioRxiv 2024:2023.11.21.568151. [PMID: 38853857 PMCID: PMC11160577 DOI: 10.1101/2023.11.21.568151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Despite the widespread adoption of k-mer-based methods in bioinformatics, understanding the influence of k-mer sizes remains a persistent challenge. Selecting an optimal k-mer size or employing multiple k-mer sizes is often arbitrary, application-specific, and fraught with computational complexities. Typically, the influence of k-mer size is obscured by the outputs of complex bioinformatics tasks, such as genome analysis, comparison, assembly, alignment, and error correction. However, it is frequently overlooked that every method is built upon a well-defined k-mer-based object such as Jaccard Similarity, de Bruijn graphs, k-mer spectra, or Bray-Curtis Dissimilarity. Although these objects offer a clearer perspective on the role of k-mer sizes, the dynamics of k-mer-based objects with respect to k-mer sizes remain surprisingly elusive. This paper introduces a computational framework that generalizes the transition of k-mer-based objects across k-mer sizes, utilizing a novel substring index, the Prokrustean graph. The primary contribution of this framework is to compute quantities associated with k-mer-based objects for all k-mer sizes, where the computational complexity depends solely on the number of maximal repeats and is independent of the range of k-mer sizes. For example, counting vertices of compacted de Bruijn graphs for k = 1, …, 100 can be accomplished in mere seconds with our substring index constructed on a gigabase-sized read set. Additionally, we derive a space-efficient algorithm to extract the Prokrustean graph from the Burrows-Wheeler Transform. It becomes evident that modern substring indices, mostly based on longest common prefixes of suffix arrays, inherently face difficulties in exploring varying k-mer sizes due to their limitations in grouping co-occurring substrings. We have implemented four applications that utilize quantities critical in modern pangenomics and metagenomics.
The code for these applications and the construction algorithm is available at https://github.com/KoslickiLab/prokrustean.
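As a point of orientation, the k-mer-based objects named in the abstract can be illustrated with a naive sketch. This is not the Prokrustean graph and not the authors' code: the function names `kmers`, `jaccard`, and `jaccard_by_k` are hypothetical, and the approach simply rebuilds the k-mer sets for each k, so its cost grows with the range of k-mer sizes, which is the dependence the abstract says the proposed index avoids.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in seq (empty if k > len(seq))."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int) -> float:
    """Jaccard similarity of the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def jaccard_by_k(a: str, b: str, k_values) -> dict[int, float]:
    """Naive baseline: recompute the similarity independently for every k."""
    return {k: jaccard(a, b, k) for k in k_values}


if __name__ == "__main__":
    # Toy sequences chosen only for illustration.
    for k, value in jaccard_by_k("ACGTACGT", "ACGTTCGT", range(1, 5)).items():
        print(k, round(value, 3))
```

Even on toy input the similarity changes with k, which is the sensitivity to k-mer size that the abstract describes.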
Affiliation(s)
- Adam Park
  - Computer Science and Engineering in Pennsylvania State University, PA, USA
- David Koslicki
  - Computer Science and Engineering in Pennsylvania State University, PA, USA
  - Biology in Pennsylvania State University, PA, USA
  - Huck Institutes of the Life Sciences in Pennsylvania State University, PA, USA
2. Mustafa H, Karasikov M, Mansouri Ghiasi N, Rätsch G, Kahles A. Label-guided seed-chain-extend alignment on annotated De Bruijn graphs. Bioinformatics 2024; 40:i337-i346. [PMID: 38940164 PMCID: PMC11211850 DOI: 10.1093/bioinformatics/btae226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
MOTIVATION Exponential growth in sequencing databases has motivated scalable De Bruijn graph-based (DBG) indexing for searching these data, using annotations to label nodes with sample IDs. Low-depth sequencing samples correspond to fragmented subgraphs, complicating finding the long contiguous walks required for alignment queries. Aligners that target single-labelled subgraphs reduce alignment lengths due to fragmentation, leading to low recall for long reads. While some (e.g. label-free) aligners partially overcome fragmentation by combining information from multiple samples, biologically irrelevant combinations in such approaches can inflate the search space or reduce accuracy.
RESULTS We introduce a new scoring model, 'multi-label alignment' (MLA), for annotated DBGs. MLA leverages two new operations. To promote biologically relevant sample combinations, 'Label Change' incorporates more informative global sample similarity into local scores. To improve connectivity, 'Node Length Change' dynamically adjusts the DBG node length during traversal. Our fast, approximate, yet accurate MLA implementation has two key steps: a single-label seed-chain-extend aligner (SCA) and a multi-label chainer (MLC). SCA uses a traditional scoring model adapting recent chaining improvements to assembly graphs and provides a curated pool of alignments. MLC extracts seed anchors from SCA's alignments, produces multi-label chains using MLA scoring, then finally forms multi-label alignments. We show via substantial improvements in taxonomic classification accuracy that MLA produces biologically relevant alignments, decreasing average weighted UniFrac errors by 63.1%-66.8% and covering 45.5%-47.4% (median) more long-read query characters than state-of-the-art aligners. MLA's runtimes are competitive with label-combining alignment and substantially faster than single-label alignment.
AVAILABILITY AND IMPLEMENTATION The data, scripts, and instructions for generating our results are available at https://github.com/ratschlab/mla.
Affiliation(s)
- Harun Mustafa
  - Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
  - Biomedical Informatics Group, University Hospital Zurich, Zurich, 8091, Switzerland
  - Biomedical Informatics, Swiss Institute of Bioinformatics, Zurich, 8092, Switzerland
- Mikhail Karasikov
  - Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
  - Biomedical Informatics Group, University Hospital Zurich, Zurich, 8091, Switzerland
  - Biomedical Informatics, Swiss Institute of Bioinformatics, Zurich, 8092, Switzerland
- Nika Mansouri Ghiasi
  - Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8092, Switzerland
- Gunnar Rätsch
  - Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
  - Biomedical Informatics Group, University Hospital Zurich, Zurich, 8091, Switzerland
  - Biomedical Informatics, Swiss Institute of Bioinformatics, Zurich, 8092, Switzerland
  - ETH AI Center, Zurich, 8092, Switzerland
  - Department of Biology, ETH Zurich, Zurich, 8093, Switzerland
  - The LOOP Zurich—Medical Research Center, Zurich, 8044, Switzerland
- André Kahles
  - Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
  - Biomedical Informatics Group, University Hospital Zurich, Zurich, 8091, Switzerland
  - Biomedical Informatics, Swiss Institute of Bioinformatics, Zurich, 8092, Switzerland
  - The LOOP Zurich—Medical Research Center, Zurich, 8044, Switzerland
3. Yang C, Zhang Z, Huang Y, Xie X, Liao H, Xiao J, Veldsman WP, Yin K, Fang X, Zhang L. LRTK: a platform agnostic toolkit for linked-read analysis of both human genome and metagenome. Gigascience 2024; 13:giae028. [PMID: 38869148 PMCID: PMC11170215 DOI: 10.1093/gigascience/giae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 03/15/2024] [Accepted: 05/09/2024] [Indexed: 06/14/2024]
Abstract
BACKGROUND Linked-read sequencing technologies generate high-base quality short reads that contain extrapolative information on long-range DNA connectedness. These advantages of linked-read technologies are well known and have been demonstrated in many human genomic and metagenomic studies. However, existing linked-read analysis pipelines (e.g., Long Ranger) were primarily developed to process sequencing data from the human genome and are not suited for analyzing metagenomic sequencing data. Moreover, linked-read analysis pipelines are typically limited to 1 specific sequencing platform.
FINDINGS To address these limitations, we present the Linked-Read ToolKit (LRTK), a unified and versatile toolkit for platform agnostic processing of linked-read sequencing data from both human genome and metagenome. LRTK provides functions to perform linked-read simulation, barcode sequencing error correction, barcode-aware read alignment and metagenome assembly, reconstruction of long DNA fragments, taxonomic classification and quantification, and barcode-assisted genomic variant calling and phasing. LRTK has the ability to process multiple samples automatically and provides users with the option to generate reproducible reports during processing of raw sequencing data and at multiple checkpoints throughout downstream analysis. We applied LRTK on linked reads from simulation, mock community, and real datasets for both human genome and metagenome. We showcased LRTK's ability to generate comparative performance results from preceding benchmark studies and to report these results in publication-ready HTML document plots.
CONCLUSIONS LRTK provides comprehensive and flexible modules along with an easy-to-use Python-based workflow for processing linked-read sequencing datasets, thereby filling the current gap in the field caused by platform-centric genome-specific linked-read data analysis tools.
Affiliation(s)
- Chao Yang
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
- Zhenmiao Zhang
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
- Yufen Huang
  - BGI Research, Shenzhen 518083, China
  - BGI Genomics, Shenzhen 518083, China
- Herui Liao
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR 999077, Hong Kong
- Jin Xiao
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
- Werner Pieter Veldsman
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
- Kejing Yin
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
- Xiaodong Fang
  - BGI Genomics, Shenzhen 518083, China
  - BGI Research, Sanya 572025, China
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
  - Institute for Research and Continuing Education, Hong Kong Baptist University, Hong Kong SAR 999077, Hong Kong
4. Su R, Zhou H, Yang W, Moqir S, Ritu X, Liu L, Shi Y, Dong A, Bayier M, Letu Y, Manxi X, Chulu H, Nasenochir N, Meng H, Herrid M. Near telomere-to-telomere genome assembly of Mongolian cattle: implications for population genetic variation and beef quality. Gigascience 2024; 13:giae099. [PMID: 39693631 DOI: 10.1093/gigascience/giae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 09/29/2024] [Accepted: 11/10/2024] [Indexed: 12/20/2024]
Abstract
BACKGROUND Mongolian cattle, a unique breed indigenous to China, represent valuable genetic resources and serve as important sources of meat and milk. However, there is a lack of high-quality genomes in cattle, which limits biological research and breeding improvement.
FINDINGS In this study, we conducted whole-genome sequencing on a Mongolian bull. This effort yielded a 3.1 Gb Mongolian cattle genome sequence, with a BUSCO integrity assessment of 95.9%. The assembly achieved both contig N50 and scaffold N50 values of 110.9 Mb, with only 3 gaps identified across the entire genome. Additionally, we successfully assembled the Y chromosome among the 31 chromosomes. Notably, 3 chromosomes were identified as having telomeres at both ends. The annotation data include 54.31% repetitive sequences and 29,794 coding genes. Furthermore, a population genetic variation analysis was conducted on 332 individuals from 56 breeds, through which we identified variant loci and potentially discovered genes associated with the formation of marbling patterns in beef, predominantly located on chromosome 12.
CONCLUSIONS This study produced a genome with high continuity, completeness, and accuracy, marking the first assembly and annotation of a near telomere-to-telomere genome in cattle. Based on this, we generated a variant database comprising 332 individuals. The assembly of the genome and the analysis of population variants provide significant insights into cattle evolution and enhance our understanding of breeding selection.
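The contig and scaffold N50 values quoted above follow the standard N50 definition: the length L such that sequences of length L or longer account for at least half of the total assembly length. A minimal sketch of that computation is given below; the function name `n50` and the toy lengths are hypothetical and are not taken from the paper's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 for empty input)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Toy contig lengths, for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```

Equal contig and scaffold N50 values, as reported in this abstract, are what one would expect when scaffolds contain very few gaps.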
Affiliation(s)
- Rina Su
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
- Hao Zhou
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Wenhao Yang
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Sorgog Moqir
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
- Xiji Ritu
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
- Lei Liu
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
- Ying Shi
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
- Ai Dong
  - Bureau of Agriculture and Animal Husbandry, Alxa League, Bayanhot 750306, Inner Mongolia, China
- Menghe Bayier
  - Centre for Animal Husbandry and Veterinary Technology, Alxa League, Bayanhot 750306, Inner Mongolia
- Yibu Letu
  - Station for Animal Husbandry, Xilingol League, Xilinhot 026000, Inner Mongolia
- Xin Manxi
  - Station for Animal Husbandry, Xilingol League, Xilinhot 026000, Inner Mongolia
- Hasi Chulu
  - Station for Animal Husbandry, Sunit Left Banner, Xilingol League, Xilinhot 026000, Inner Mongolia
- Narenhua Nasenochir
  - College of Animal Science, Inner Mongolia Agriculture University, Hohhot 010000, Inner Mongolia, China
- He Meng
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Muren Herrid
  - Grassland & Cattle Investment Co., Ltd., R&D Center, Hohhot 010000, Inner Mongolia
  - International Livestock Research Centre, Gold Coast 4211, Queensland, Australia
5. Rajaby R, Liu DX, Au CH, Cheung YT, Lau AYT, Yang QY, Sung WK. INSurVeyor: improving insertion calling from short read sequencing data. Nat Commun 2023; 14:3243. [PMID: 37277343 DOI: 10.1038/s41467-023-38870-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/29/2022] [Accepted: 05/18/2023] [Indexed: 06/07/2023]
Abstract
Insertions are one of the major types of structural variations and are defined as the addition of 50 nucleotides or more into a DNA sequence. Several methods exist to detect insertions from next-generation sequencing short read data, but they generally have low sensitivity. Our contribution is two-fold. First, we introduce INSurVeyor, a fast, sensitive and precise method that detects insertions from next-generation sequencing paired-end data. Using publicly available benchmark datasets (both human and non-human), we show that INSurVeyor is not only more sensitive than any individual caller we tested, but also more sensitive than all of them combined. Furthermore, for most types of insertions, INSurVeyor is almost as sensitive as long-read callers. Second, we provide state-of-the-art catalogues of insertions for 1047 Arabidopsis thaliana genomes from the 1001 Genomes Project and 3202 human genomes from the 1000 Genomes Project, both generated with INSurVeyor. We show that they are more complete and precise than existing resources, and that important insertions are missed by existing methods.
Affiliation(s)
- Ramesh Rajaby
  - Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
  - A*STAR Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672, Singapore
- Dong-Xu Liu
  - National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Chun Hang Au
  - Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Yuen-Ting Cheung
  - Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Amy Yuet Ting Lau
  - Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
- Qing-Yong Yang
  - National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Wing-Kin Sung
  - Hong Kong Genome Institute, Hong Kong Science Park, Shatin, Hong Kong, China
  - A*STAR Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672, Singapore
  - National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
  - Department of Chemical Pathology, The Chinese University of Hong Kong, Hong Kong, China
  - Laboratory of Computational Genomics, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, China
  - School of Computing, National University of Singapore, 13 Computing Drive, Singapore, 117417, Singapore
6. Zhou Y, Yang L, Han X, Han J, Hu Y, Li F, Xia H, Peng L, Boschiero C, Rosen BD, Bickhart DM, Zhang S, Guo A, Van Tassell CP, Smith TPL, Yang L, Liu GE. Assembly of a pangenome for global cattle reveals missing sequences and novel structural variations, providing new insights into their diversity and evolutionary history. Genome Res 2022; 32:1585-1601. [PMID: 35977842 PMCID: PMC9435747 DOI: 10.1101/gr.276550.122] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/02/2022] [Accepted: 07/21/2022] [Indexed: 02/03/2023]
Abstract
A cattle pangenome representation was created based on the genome sequences of 898 cattle representing 57 breeds. The pangenome identified 83 Mb of sequence not found in the cattle reference genome, representing 3.1% novel sequence compared with the 2.71-Gb reference. A catalog of structural variants developed from this cattle population identified 3.3 million deletions, 0.12 million inversions, and 0.18 million duplications. Estimates of breed ancestry and hybridization between cattle breeds using insertion/deletions as markers were similar to those produced by single nucleotide polymorphism-based analysis. Hundreds of deletions were observed to have stratification based on subspecies and breed. For example, an insertion of a Bov-tA1 repeat element was identified in the first intron of the APPL2 gene and correlated with cattle breed geographic distribution. This insertion falls within a segment overlapping predicted enhancer and promoter regions of the gene, and could affect important traits such as immune response, olfactory functions, cell proliferation, and glucose metabolism in muscle. The results indicate that pangenomes are a valuable resource for studying diversity and evolutionary history, and help to delineate how domestication, trait-based breeding, and adaptive introgression have shaped the cattle genome.
Affiliation(s)
- Yang Zhou
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Lv Yang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Xiaotao Han
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Jiazheng Han
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Yan Hu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Fan Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Han Xia
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Lingwei Peng
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Clarissa Boschiero
  - Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Benjamin D Rosen
  - Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Derek M Bickhart
  - Dairy Forage Research Center, ARS USDA, Madison, Wisconsin 53706, USA
- Shujun Zhang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Aizhen Guo
  - The State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan 430070, China
- Curtis P Van Tassell
  - Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
- Timothy P L Smith
  - U.S. Meat Animal Research Center, ARS USDA, Clay Center, Nebraska 68933, USA
- Liguo Yang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- George E Liu
  - Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Maryland 20705, USA
7. Meleshko D, Yang R, Marks P, Williams S, Hajirasouliha I. Efficient detection and assembly of non-reference DNA sequences with synthetic long reads. Nucleic Acids Res 2022; 50:e108. [PMID: 35924489 PMCID: PMC9561269 DOI: 10.1093/nar/gkac653] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/27/2022] [Revised: 06/10/2022] [Accepted: 08/01/2022] [Indexed: 11/14/2022]
Abstract
Recent pan-genome studies have revealed an abundance of DNA sequences in human genomes that are not present in the reference genome. A lion's share of these non-reference sequences (NRSs) cannot be reliably assembled or placed on the reference genome. Improvements in long-read and synthetic long-read (aka linked-read) technologies have great potential for the characterization of NRSs. While synthetic long reads require less input DNA than long-read datasets, they are algorithmically more challenging to use. Except for computationally expensive whole-genome assembly methods, there is no synthetic long-read method for NRS detection. We propose a novel integrated alignment-based and local assembly-based algorithm, Novel-X, that uses the barcode information encoded in synthetic long reads to improve the detection of such events without a whole-genome de novo assembly. Our evaluations demonstrate that Novel-X finds many non-reference sequences that cannot be found by state-of-the-art short-read methods. We applied Novel-X to a diverse set of 68 samples from the Polaris HiSeq 4000 PGx cohort. Novel-X discovered 16 691 NRS insertions of size > 300 bp (total length 18.2 Mb). Many of them are population specific or may have a functional impact.
Affiliation(s)
- Dmitry Meleshko
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, NY 10021, USA
  - Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, NY 10021, USA
- Rui Yang
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, NY 10021, USA
  - Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, NY 10021, USA
- Patrick Marks
  - 10x Genomics Inc., Stoneridge Mall Road, Pleasanton, CA 94566, USA
- Stephen Williams
  - 10x Genomics Inc., Stoneridge Mall Road, Pleasanton, CA 94566, USA
- Iman Hajirasouliha
  - Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, NY 10021, USA
  - Englander Institute for Precision Medicine, The Meyer Cancer Center, Weill Cornell Medicine, NY 10021, USA