1
Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832 PMCID: PMC11152613 DOI: 10.1016/j.csbj.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024]
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
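As a rough illustration of the k-mer and absent-sequence ideas in this abstract, the sketch below counts overlapping k-mers in a sequence and lists the k-mers that never occur in it. The function names `count_kmers` and `absent_kmers` are hypothetical, and checking a single short string is a toy stand-in for nullomer discovery, which in practice is done against whole genomes or proteomes.

```python
from collections import Counter
from itertools import product


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in the sequence."""
    sequence = sequence.upper()
    # range() is empty when k exceeds the sequence length, giving an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def absent_kmers(sequence: str, k: int, alphabet: str = "ACGT") -> list[str]:
    """Return, in sorted order, all k-mers over the alphabet that do not occur."""
    observed = count_kmers(sequence, k)
    candidates = ("".join(letters) for letters in product(alphabet, repeat=k))
    return sorted(kmer for kmer in candidates if kmer not in observed)


if __name__ == "__main__":
    print(count_kmers("ACGTACGT", 3))
    print(len(absent_kmers("ACGTACGT", 3)), "of", 4 ** 3, "possible 3-mers are absent")
```

The number of candidate k-mers grows as 4^k for DNA, so exhaustive enumeration like this is only practical for small k.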
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
2
Brooke G, Wendel S, Banerjee A, Wallace N. Opportunities to Advance Cervical Cancer Prevention and Care. Tumour Virus Res 2024; 18:200292. [PMID: 39490532 DOI: 10.1016/j.tvr.2024.200292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 10/21/2024] [Accepted: 10/22/2024] [Indexed: 11/05/2024]
Abstract
Cervical cancer (CaCx) is a major public health issue, with over 600,000 women diagnosed annually. CaCx kills someone every 90 seconds, mostly in low- and middle-income countries. There are effective yet imperfect mechanisms to prevent CaCx. Since human papillomavirus (HPV) infections cause most CaCx, they can be prevented by vaccination. Screening methodologies can identify premalignant lesions and allow interventions before a CaCx develops. However, these tools are less feasible in resource-poor environments. Additionally, current screening modalities cannot triage lesions based on their relative risk of progression, which results in overtreatment. CaCx care relies heavily on genotoxic agents that cause severe side effects. This review discusses ways that recent technological advancements could be leveraged to improve CaCx care and prevention.
Affiliation(s)
- Grant Brooke
  - Division of Biology, Kansas State University, Manhattan, KS 66506, USA
- Sebastian Wendel
  - Department of Kinesiology, Kansas State University, Manhattan, KS 66506, USA
- Abhineet Banerjee
  - Division of Biology, Kansas State University, Manhattan, KS 66506, USA
- Nicholas Wallace
  - Department of Kinesiology, Kansas State University, Manhattan, KS 66506, USA
3
Jogi HR, Smaraki N, Nayak SS, Rajawat D, Kamothi DJ, Panigrahi M. Single cell RNA-seq: a novel tool to unravel virus-host interplay. Virusdisease 2024; 35:41-54. [PMID: 38817399 PMCID: PMC11133279 DOI: 10.1007/s13337-024-00859-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 02/12/2024] [Indexed: 06/01/2024]
Abstract
Over the last decade, single-cell RNA sequencing (scRNA-seq) has gained momentum as a vital tool for resolving cellular heterogeneity at high resolution. It overcomes the limitation of conventional sequencing, which could detect only the average expression level across a cell population. Several epidemic and pandemic viruses have emerged in the twenty-first century. As intracellular entities, viruses rely entirely on the host. Complex virus-host dynamics arise as the virus obtains from the host cell the factors required for its replication and the establishment of infection. scRNA-seq can characterize this virus-host interplay through comprehensive transcriptome profiling. Owing to technological and methodological advances, the technology can detect viral genomes and heterogeneity in the host cell response. Further development of analytical methods with multiomics approaches, together with increased availability of accessible scRNA-seq datasets, will improve the understanding of viral pathogenesis, which can aid the development of novel antiviral therapeutic strategies.
Affiliation(s)
- Harsh Rajeshbhai Jogi
  - Division of Veterinary Microbiology, Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
- Nabaneeta Smaraki
  - Division of Veterinary Microbiology, Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
- Sonali Sonejita Nayak
  - Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
- Divya Rajawat
  - Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
- Dhaval J. Kamothi
  - Division of Pharmacology and Toxicology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
- Manjit Panigrahi
  - Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, UP 243122 India
4
Jia TZ, Nishikawa S, Fujishima K. Sequencing the Origins of Life. BBA ADVANCES 2022; 2:100049. [PMID: 37082609 PMCID: PMC10074849 DOI: 10.1016/j.bbadva.2022.100049] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/29/2021] [Revised: 02/27/2022] [Accepted: 03/02/2022] [Indexed: 01/10/2023]
Abstract
One goal of origins of life research is to understand how primitive informational and catalytic biopolymers emerged and evolved. Recently, a number of sequencing techniques have been applied to analysis of replicating and evolving primitive biopolymer systems, providing a sequence-specific and high-resolution view of primitive chemical processes. Here, we review application of sequencing techniques to analysis of synthetic and primitive nucleic acids and polypeptides. This includes next-generation sequencing of primitive polymerization and evolution processes, followed by discussion of other novel biochemical techniques that could contribute to sequence analysis of primitive biopolymer driven chemical systems. Further application of sequencing to origins of life research, perhaps as a life detection technology, could provide insight into the origin and evolution of informational and catalytic biopolymers on early Earth or elsewhere.
Affiliation(s)
- Tony Z. Jia
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1-IE-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
  - Blue Marble Space Institute of Science, 600 1st Ave, Floor 1, Seattle, WA 98104, USA
  - Corresponding author
- Shota Nishikawa
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1-IE-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
  - School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8501, Japan
- Kosuke Fujishima
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1-IE-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
  - Graduate School of Media and Governance, Keio University, 5322 Endo, Fujisawa-shi, Kanagawa 252-0882, Japan
5
Zhu D, Gao J, Tang C, Xu Z, Sun T. Single-Cell RNA Sequencing of Bone Marrow Mesenchymal Stem Cells from the Elderly People. Int J Stem Cells 2021; 15:173-182. [PMID: 34711696 PMCID: PMC9148839 DOI: 10.15283/ijsc21042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/06/2021] [Revised: 07/26/2021] [Accepted: 08/24/2021] [Indexed: 11/09/2022]
Abstract
Background and Objectives: Bone marrow mesenchymal stem cells (BMSCs) show considerable promise in regenerative medicine. Many studies demonstrated that BMSCs cultured in vitro were highly heterogeneous and composed of diverse cell subpopulations, which may be the basis of their multiple biological characteristics. However, the exact cell subpopulations that make up BMSCs are still unknown. Methods and Results: In this study, we used single-cell RNA sequencing (scRNA-Seq) to divide 6,514 BMSCs into three clusters. The number and corresponding proportion of cells in clusters 1 to 3 were 3,766 (57.81%), 1,720 (26.40%), and 1,028 (15.78%). The gene expression profile and function of the cells in the same cluster were similar. The vast majority of cells expressed the markers defining BMSCs by flow cytometry and gene expression analysis. Each cluster had at least 20 differentially expressed genes (DEGs). We conducted Gene Ontology enrichment analysis on the top 20 DEGs of each cluster and found that the three clusters had different functions, which were related to self-renewal, multilineage differentiation and cytokine secretion, respectively. In addition, the function of the top 20 DEGs of each cluster was checked by the National Center for Biotechnology Information gene database to further verify our hypothesis. Conclusions: This study indicated that scRNA-Seq can be used to divide BMSCs into different subpopulations, demonstrating the heterogeneity of BMSCs.
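The cluster proportions reported in this abstract follow directly from the cell counts. A minimal check, using only the three counts given above (the variable and function names are illustrative):

```python
# Cell counts per cluster as reported in the abstract.
cluster_sizes = {1: 3766, 2: 1720, 3: 1028}


def cluster_percentages(sizes: dict[int, int]) -> dict[int, float]:
    """Return each cluster's share of all cells, in percent, rounded to 2 decimals."""
    total = sum(sizes.values())
    return {cluster: round(100 * n / total, 2) for cluster, n in sizes.items()}


if __name__ == "__main__":
    print(sum(cluster_sizes.values()))         # total number of cells
    print(cluster_percentages(cluster_sizes))  # shares of clusters 1 to 3
```

The counts sum to the 6,514 cells stated in the abstract. The third share rounds to 15.78%, so the three rounded percentages add up to 99.99% rather than exactly 100%.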
Affiliation(s)
- Dezhou Zhu
  - Department of Orthopaedics, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
  - The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China
- Jie Gao
  - Department of Orthopaedics, The Seventh Medical Center of Chinese PLA General Hospital, Beijing, China
- Chengxuan Tang
  - Department of Orthopaedics, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Zheng Xu
  - Department of Outpatient, The First Retired Cadre Sanitarium of Beijing Garrison in Fengtai District, Beijing, China
  - School of Clinical Medicine, The Second Military Medical University, Shanghai, China
- Tiansheng Sun
  - Department of Orthopaedics, The Seventh Medical Center of Chinese PLA General Hospital, Beijing, China
6
Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering. Life (Basel) 2021; 11:life11070716. [PMID: 34357088 PMCID: PMC8304014 DOI: 10.3390/life11070716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2021] [Revised: 07/09/2021] [Accepted: 07/15/2021] [Indexed: 11/16/2022]
Abstract
Although many scRNA-seq analysis algorithms have been developed, their cell-clustering performance cannot be quantified because the "true" clusters are unknown. Referencing the transcriptomic heterogeneity of cell clusters, a "true" mRNA number matrix of individual cells was defined as ground truth. Based on this matrix and the actual data generation procedure, a simulation program (SSCRNA) for raw data was developed. Subsequently, the consistency between simulated data and real data was evaluated. Furthermore, the impact of sequencing depth and analysis algorithms on cluster accuracy was quantified. The simulation result was highly consistent with the actual data. Among the normalization algorithms, the Gaussian normalization method was the most recommended. As for the clustering algorithms, the K-means clustering method was more stable than K-means plus Louvain clustering. In conclusion, the scRNA simulation algorithm developed reproduces the actual data generation process, reveals the impact of parameters on classification, compares the normalization/clustering algorithms, and provides novel insight into scRNA analyses.
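Once simulated data come with known cluster labels, a clustering result can be scored against that ground truth. The abstract does not say which accuracy measure was used, so the sketch below assumes the plain Rand index as one common choice; `rand_index` is a hypothetical helper written for illustration and is not taken from SSCRNA.

```python
from itertools import combinations


def rand_index(true_labels: list[int], predicted_labels: list[int]) -> float:
    """Fraction of cell pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two cells
    together, or both put them apart. Label names themselves do not matter.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if len(true_labels) < 2:
        raise ValueError("need at least two cells to form a pair")
    pairs = list(combinations(range(len(true_labels)), 2))
    agreements = sum(
        (true_labels[i] == true_labels[j]) == (predicted_labels[i] == predicted_labels[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(rand_index(truth, [1, 1, 0, 0]))  # same partition, labels swapped
    print(rand_index(truth, [0, 1, 0, 1]))  # a partition that cuts across the truth
```

The pairwise loop is quadratic in the number of cells, so for real datasets a library implementation (for example an adjusted Rand index) would be the more practical choice.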
7
Kim HK, Ha TW, Lee MR. Single-Cell Transcriptome Analysis as a Promising Tool to Study Pluripotent Stem Cell Reprogramming. Int J Mol Sci 2021; 22:ijms22115988. [PMID: 34206025 PMCID: PMC8198005 DOI: 10.3390/ijms22115988] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/08/2021] [Revised: 05/26/2021] [Accepted: 05/31/2021] [Indexed: 12/15/2022]
Abstract
Cells are the basic units of all organisms and are involved in all vital activities, such as proliferation, differentiation, senescence, and apoptosis. A human body consists of more than 30 trillion cells generated through repeated division and differentiation from a single-cell fertilized egg in a highly organized programmatic fashion. Since the recent formation of the Human Cell Atlas consortium, establishing the Human Cell Atlas at the single-cell level has been an ongoing activity with the goal of understanding the mechanisms underlying diseases and vital cellular activities at the level of the single cell. In particular, transcriptome analysis of embryonic stem cells at the single-cell level is of great importance, as these cells are responsible for determining cell fate. Here, we review single-cell analysis techniques that have been actively used in recent years, introduce the single-cell analysis studies currently in progress in pluripotent stem cells and reprogramming, and forecast future studies.
8
Ramadan M, Alariqi M, Ma Y, Li Y, Liu Z, Zhang R, Jin S, Min L, Zhang X. Efficient CRISPR/Cas9 mediated Pooled-sgRNAs assembly accelerates targeting multiple genes related to male sterility in cotton. PLANT METHODS 2021; 17:16. [PMID: 33557889 PMCID: PMC7869495 DOI: 10.1186/s13007-021-00712-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/10/2020] [Accepted: 01/19/2021] [Indexed: 05/04/2023]
Abstract
BACKGROUND: Upland cotton (Gossypium hirsutum) harbors a complex allotetraploid genome consisting of A and D sub-genomes. Every gene has multiple copies with high sequence similarity, which makes genetic, genomic and functional analyses extremely challenging. The recent availability of the CRISPR/Cas9 tool makes it possible to modify targeted loci efficiently in various complicated plant genomes. However, the current cotton transformation method, which targets one gene, requires a complicated, long and laborious regeneration process. Hence, optimizing a strategy that targets multiple genes is of great value in cotton functional genomics and genetic engineering. RESULTS: To target multiple genes in a single experiment, 112 plant development-related genes were knocked out via an optimized CRISPR/Cas9 system. We optimized the key steps of a pooled sgRNA assembly method in which 116 sgRNAs were pooled into 4 groups (each group consisting of 29 sgRNAs). Each group of sgRNAs was assembled in one PCR reaction, which subsequently went through one round of vector construction, transformation and sgRNA identification, and one round of genetic transformation. Through Agrobacterium-mediated genetic transformation, we successfully generated more than 800 plants. For mutant identification, next-generation sequencing was used, and the results showed that all generated plants were positive and all targeted genes were covered. Interestingly, among all the transgenic plants, 85% harbored a single sgRNA insertion, 9% two insertions, 3% three different sgRNA insertions, and 2.5% mutated sgRNAs. These plants with different targeted sgRNAs exhibited numerous combinations of phenotypes in plant flowering tissues. CONCLUSION: All targeted genes were successfully edited with high specificity. Our pooled sgRNA assembly offers a simple, fast and efficient strategy to target multiple genes at one time and accelerates the study of gene function in cotton.
Affiliation(s)
- Mohamed Ramadan
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
  - Department of Plant Genetic Resources, Division of Ecology and Dry Land Agriculture, Desert Research Center, Cairo, Egypt
- Muna Alariqi
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Yizan Ma
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Yanlong Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Zhenping Liu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Rui Zhang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Shuangxia Jin
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Ling Min
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Xianlong Zhang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
9
Low-complexity and highly robust barcodes for error-rich single molecular sequencing. 3 Biotech 2021; 11:78. [PMID: 33505833 DOI: 10.1007/s13205-020-02607-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/23/2020] [Accepted: 12/23/2020] [Indexed: 12/28/2022]
Abstract
DNA barcodes are frequently corrupted due to insertion, deletion, and substitution errors during DNA synthesis, amplification and sequencing, resulting in index hopping. In this paper, we propose a new DNA barcode construction scheme that combines a cyclic block code with a predetermined pseudo-random sequence bit by bit to form bit pairs, and then converts the bit pairs to bases, i.e., the DNA barcodes. Then, we present a barcode identification scheme for noisy sequencing reads, which uses a combination of cyclic shifting and traditional dynamic programming to mark the insertion and deletion positions, and then performs erasure-and-error-correction decoding on the corrupted codewords. Furthermore, we verify the identification error rate of barcodes for multiple errors and evaluate the reliability of the barcodes in DNA context. This method can be easily generalized for constructing long barcodes, which may be used in scenarios with serious errors. Simulation results show that the bit error rate after identifying insertions/deletions is greatly reduced using the combination of cyclic shift and dynamic programming compared to using dynamic programming only. This indicates that the proposed method can effectively improve the accuracy of estimating insertion/deletion errors. The overall identification error rate of the proposed method is lower than 10^-5 when the probability of each base mutation is less than 0.1, which is the typical scenario in third-generation sequencing.
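The construction step described in this abstract (pair each codeword bit with a pseudo-random bit, then map each bit pair to a base) can be sketched as follows. Several details are assumptions made for illustration: the abstract does not give the pair-to-base mapping, so `PAIR_TO_BASE` is hypothetical; the pseudo-random sequence comes from a seeded `random.Random`; and the codeword is passed in as a plain bit list rather than generated by an actual cyclic block code. The insertion/deletion identification and error-correction decoding are not modelled.

```python
import random

# Assumed mapping from (code bit, pseudo-random bit) to a base; not from the paper.
PAIR_TO_BASE = {(0, 0): "A", (0, 1): "C", (1, 0): "G", (1, 1): "T"}
BASE_TO_PAIR = {base: pair for pair, base in PAIR_TO_BASE.items()}


def pseudo_random_bits(length: int, seed: int = 0) -> list[int]:
    """Return a reproducible 0/1 sequence shared by encoder and decoder."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]


def encode_barcode(codeword: list[int], prs: list[int]) -> str:
    """Pair codeword bits with pseudo-random bits and convert each pair to a base."""
    if len(codeword) != len(prs):
        raise ValueError("codeword and pseudo-random sequence must have equal length")
    return "".join(PAIR_TO_BASE[(c, r)] for c, r in zip(codeword, prs))


def decode_barcode(barcode: str) -> list[int]:
    """Recover the codeword bits from an error-free barcode."""
    return [BASE_TO_PAIR[base][0] for base in barcode]


if __name__ == "__main__":
    codeword = [1, 1, 0, 0, 1, 0, 1]  # placeholder bits, not a real cyclic codeword
    prs = pseudo_random_bits(len(codeword), seed=42)
    barcode = encode_barcode(codeword, prs)
    print(barcode, decode_barcode(barcode) == codeword)
```

Because each base carries one code bit and one known pseudo-random bit, a decoder that knows the pseudo-random sequence can compare it with the second bit of every received base, which gives one way to notice where a read has drifted out of alignment.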
10
Zhang Z, Cui F, Wang C, Zhao L, Zou Q. Goals and approaches for each processing step for single-cell RNA sequencing data. Brief Bioinform 2020; 22:6034054. [PMID: 33316046 DOI: 10.1093/bib/bbaa314] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/15/2020] [Revised: 10/10/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at the cellular level. However, due to the extremely low levels of transcripts in a single cell and technical losses during reverse transcription, gene expression at a single-cell resolution is usually noisy and highly dimensional; thus, statistical analyses of single-cell data are a challenge. Although many scRNA-seq data analysis tools are currently available, a gold standard pipeline is not available for all datasets. Therefore, a general understanding of bioinformatics and associated computational issues would facilitate the selection of appropriate tools for a given set of data. In this review, we provide an overview of the goals and most popular computational analysis tools for the quality control, normalization, imputation, feature selection and dimension reduction of scRNA-seq data.
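Two of the processing steps named in this abstract, quality control and normalization, can be illustrated with a small pure-Python sketch. The specific choices here are assumptions and are not taken from the review: cells are filtered by a minimum number of detected genes, and counts are scaled to a fixed library size of 10,000 before a log1p transform, which is one widely used convention. The function names are hypothetical.

```python
import math


def passes_qc(cell_counts: list[int], min_genes: int) -> bool:
    """Keep a cell only if at least min_genes genes have a nonzero count."""
    detected = sum(1 for count in cell_counts if count > 0)
    return detected >= min_genes


def log_normalize(matrix: list[list[int]], scale: float = 10_000) -> list[list[float]]:
    """Scale each cell (row) to a common library size, then apply log(1 + x)."""
    normalized = []
    for cell_counts in matrix:
        total = sum(cell_counts)
        if total == 0:
            # An empty cell has nothing to scale; keep it as zeros.
            normalized.append([0.0] * len(cell_counts))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell_counts])
    return normalized


if __name__ == "__main__":
    counts = [[5, 0, 3], [0, 0, 1], [2, 2, 2]]  # rows are cells, columns are genes
    kept = [cell for cell in counts if passes_qc(cell, min_genes=2)]
    print(len(kept), "cells kept")
    print(log_normalize(kept))
```

Real pipelines operate on sparse matrices with thousands of cells and add further steps such as imputation, feature selection and dimension reduction, which this sketch leaves out.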
Affiliation(s)
- Zilong Zhang
  - University of Electronic Science and Technology of China
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology
- Lingling Zhao
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China
11
Kumar A, Mali P. Mapping regulators of cell fate determination: Approaches and challenges. APL Bioeng 2020; 4:031501. [PMID: 32637855 PMCID: PMC7332300 DOI: 10.1063/5.0004611] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/13/2020] [Accepted: 06/01/2020] [Indexed: 12/25/2022]
Abstract
Given the limited regenerative capacities of most organs, strategies are needed to efficiently generate large numbers of parenchymal cells capable of integration into the diseased organ. Although it was initially thought that terminally differentiated cells lacked the ability to transdifferentiate, it has since been shown that cellular reprogramming of stromal cells to parenchymal cells through direct lineage conversion holds great potential for the replacement of post-mitotic parenchymal cells lost to disease. To this end, an assortment of genetic, chemical, and mechanical cues have been identified to reprogram cells to different lineages both in vitro and in vivo. However, some key challenges persist that limit broader applications of reprogramming technologies. These include: (1) low reprogramming efficiencies; (2) incomplete functional maturation of derived cells; and (3) difficulty in determining the typically multi-factor combinatorial recipes required for successful transdifferentiation. To improve efficiency by comprehensively identifying factors that regulate cell fate, large scale genetic and chemical screening methods have thus been utilized. Here, we provide an overview of the underlying concept of cell reprogramming as well as the rationale, considerations, and limitations of high throughput screening methods. We next follow with a summary of unique hits that have been identified by high throughput screens to induce reprogramming to various parenchymal lineages. Finally, we discuss future directions of applying this technology toward human disease biology via disease modeling, drug screening, and regenerative medicine.
Affiliation(s)
- Aditya Kumar
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, USA
- Prashant Mali
  - Department of Bioengineering, University of California, San Diego, La Jolla, California 92093, USA
12
Baker M, Hong SI, Kang S, Choi DS. Rodent models for psychiatric disorders: problems and promises. Lab Anim Res 2020; 36:9. [PMID: 32322555 PMCID: PMC7161141 DOI: 10.1186/s42826-020-00039-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/04/2020] [Accepted: 03/03/2020] [Indexed: 01/19/2023]
Abstract
Psychiatric disorders are a prevalent global health problem, with over 900 million individuals affected by a continuum of mental and substance use disorders. Due to this high prevalence, and the substantial direct and indirect societal costs, it is essential to understand the underlying mechanisms of these disorders to facilitate development of new and more effective treatments. Since the advent of recombinant DNA technologies in the early 1980s, genetically modified rodent models have contributed significantly to understanding the genetic and molecular basis of psychiatric disorders. Despite significant advancements, many challenges remain after unsuccessful drug development based on rodent models. Recent human genetic studies show the polygenic nature of mental disorders, identifying hundreds of allelic variants that confer increased risk. However, given the complexity of the brain, with many unique cell types, gene expression profiles, and developmental trajectories, proper animal models are needed more than ever to dissect genes and circuits in a cell type-specific manner to advance our understanding and treatment of psychiatric disorders. In this mini-review, we highlight current challenges and promises of using rodent models in advancing science and drug development, focusing on advanced techniques and their applications to rodent models of psychiatric disorders.
Affiliation(s)
- Matthew Baker
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905 USA
- Sa-Ik Hong
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905 USA
- Seungwoo Kang
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905 USA
- Doo-Sup Choi
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905 USA
  - Neuroscience Program, Rochester, MN USA
  - Department of Psychiatry and Psychology, Mayo Clinic College of Medicine, Rochester, MN USA
13
Abstract
Diversity indices are useful single-number metrics for characterizing a complex distribution of a set of attributes across a population of interest. The utility of these different metrics or sets of metrics depends on the context and application, and whether a predictive mechanistic model exists. In this topical review, we first summarize the relevant mathematical principles underlying heterogeneity in a large population, before outlining the various definitions of 'diversity' and providing examples of scientific topics in which its quantification plays an important role. We then review how diversity has been a ubiquitous concept across multiple fields, including ecology, immunology, cellular barcoding experiments, and socioeconomic studies. Since many of these applications involve sampling of populations, we also review how diversity in small samples is related to the diversity in the entire population. Features that arise in each of these applications are highlighted.
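To make the idea of a single-number diversity metric concrete, the sketch below computes three standard quantities from a list of abundance counts: richness, Shannon entropy, and the Simpson concentration index. The abstract does not single out these particular indices, so the selection and the function names are illustrative assumptions; natural logarithms are used for the entropy.

```python
import math


def richness(counts: list[int]) -> int:
    """Number of categories (species, clones, barcodes) with nonzero abundance."""
    return sum(1 for c in counts if c > 0)


def shannon_entropy(counts: list[int]) -> float:
    """Shannon entropy H = -sum(p_i * ln p_i) over nonzero relative abundances."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def simpson_index(counts: list[int]) -> float:
    """Simpson concentration sum(p_i^2): the chance two random draws match."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    even = [25, 25, 25, 25]
    skewed = [97, 1, 1, 1]
    for name, sample in (("even", even), ("skewed", skewed)):
        print(name, richness(sample), shannon_entropy(sample), simpson_index(sample))
```

Both samples have the same richness, but the even one has higher entropy and lower Simpson concentration, which shows why the choice of metric depends on the application.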
Affiliation(s)
- Song Xu
  - Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, United States of America
14
Srivastava A, Malik L, Smith T, Sudbery I, Patro R. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biol 2019; 20:65. [PMID: 30917859 PMCID: PMC6437997 DOI: 10.1186/s13059-019-1670-y] [Citation(s) in RCA: 122] [Impact Index Per Article: 24.4] [Received: 11/28/2018] [Accepted: 03/05/2019] [Indexed: 12/15/2022]
Abstract
We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin's approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads and improves the accuracy of gene abundance estimates. Alevin is considerably faster, typically eight times, than existing gene quantification approaches, while also using less memory.
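For readers unfamiliar with UMI deduplication, the sketch below shows the simplest possible version: reads that share a cell barcode, gene and UMI are collapsed into one molecule. This is explicitly not alevin's method. It ignores sequencing errors in UMIs, transcript-level constraints and gene-ambiguous reads, which are the cases the abstract says alevin handles. The read tuples and the function name are hypothetical.

```python
from collections import defaultdict


def count_unique_umis(reads: list[tuple[str, str, str]]) -> dict[tuple[str, str], int]:
    """Naive UMI deduplication.

    Each read is (cell_barcode, gene, umi). Reads with an identical triple
    are treated as PCR duplicates of one molecule, so the count for a
    (cell, gene) pair is the number of distinct UMIs observed.
    """
    umis_seen: dict[tuple[str, str], set[str]] = defaultdict(set)
    for cell_barcode, gene, umi in reads:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


if __name__ == "__main__":
    reads = [
        ("AAAC", "geneA", "TTT"),
        ("AAAC", "geneA", "TTT"),  # duplicate of the read above
        ("AAAC", "geneA", "TTA"),
        ("CCCG", "geneB", "GGG"),
    ]
    print(count_unique_umis(reads))
```

A read that maps to several genes has no single `gene` value to use as a key here, which is exactly the kind of read a naive scheme ends up discarding.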
Affiliation(s)
- Avi Srivastava
  - Department of Computer Science, Stony Brook University, Stony Brook, USA
- Laraib Malik
  - Department of Computer Science, Stony Brook University, Stony Brook, USA
- Tom Smith
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA UK
- Ian Sudbery
  - Sheffield Institute for Nucleic Acids, Department of Molecular Biology and Biotechnology, The University of Sheffield, Sheffield, S10 2TN UK
- Rob Patro
  - Department of Computer Science, Stony Brook University, Stony Brook, USA