Title: Single Cell Data Analysis: Computational methods for
characterizing cell types
I will describe two techniques for the analysis of single-cell
sequencing data. (1) Forest Fire Clustering. This is an efficient and
interpretable method for cell-type discovery from single-cell data. It
makes minimal prior assumptions and, unlike current
approaches, calculates a non-parametric posterior probability that
each cell is assigned a given cell-type label. These posterior
distributions allow a label confidence to be evaluated for each cell and
enable the computation of "label entropies," highlighting transitions
along developmental trajectories. (2) SCAN-ATAC-Sim. It is difficult
to benchmark the performance of various scATAC-seq analysis techniques
(such as clustering and deconvolution) without an a priori known
set of gold-standard cell types. To simulate scATAC-seq experiments
with known cell-type labels, we introduce an efficient and scalable
scATAC-seq simulation method that down-samples bulk ATAC-seq data
(e.g., from representative cell lines or tissues). Our protocol uses a
consistent but tunable signal-to-noise ratio across cell types in a
References:
SCAN-ATAC-Sim: a scalable and efficient method for simulating
single-cell ATAC-seq data from bulk-tissue experiments.
Z Chen, J Zhang, J Liu, Z Zhang, J Zhu, D Lee, M Xu, M Gerstein
(2021). Bioinformatics 37: 1756-8.
Forest Fire Clustering for single-cell sequencing combines iterative
label propagation with parallelized Monte Carlo simulations.
Z Chen, J Goldwasser, P Tuckman, J Liu, J Zhang, M Gerstein (2022).
Nat Commun 13: 3538.
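
To make the "label entropy" idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each cell's posterior is available as a row of per-label probabilities and takes the label entropy to be the Shannon entropy of that row; the function name label_entropy and the toy matrix are hypothetical.

```python
import numpy as np

def label_entropy(posteriors, eps=1e-12):
    """Shannon entropy (in bits) of each cell's posterior over labels.

    posteriors: array of shape (n_cells, n_labels), non-negative rows.
    Assumption: "label entropy" means the Shannon entropy of this row.
    """
    p = np.asarray(posteriors, dtype=float)
    # Normalize each row so that it sums to 1.
    p = p / p.sum(axis=1, keepdims=True)
    # Clip inside the log so that zero-probability labels contribute 0.
    return -(p * np.log2(np.clip(p, eps, 1.0))).sum(axis=1)

# Toy posteriors for three cells over three candidate labels.
posteriors = np.array([
    [1.0, 0.0, 0.0],   # confidently labeled cell
    [0.5, 0.5, 0.0],   # cell split between two labels
    [1/3, 1/3, 1/3],   # maximally ambiguous cell
])
entropies = label_entropy(posteriors)
```

Under this assumption, a confidently labeled cell has entropy near zero, while a cell whose posterior is spread over several labels (as one might expect along a developmental transition) has higher entropy.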