• Conos: wiring together scRNA-seq dataset collections

    Conos (Clustering on Network of Samples) is a tool for joint analysis of heterogeneous collections of scRNA-seq datasets, such as collections combining multiple individuals, conditions, tissues, or technological platforms. Please see the pre-print for a detailed description and analysis examples, and the GitHub page for hands-on tutorials.

  • RNA velocity estimation

    The velocyto framework predicts the movement of cells in transcriptional space by estimating the first derivative of the transcriptional state, known as RNA velocity. This provides a basis for quantitative modeling of dynamic biological processes, such as cell differentiation or perturbation response.
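
    As a rough illustration of what estimating this derivative can look like, the sketch below uses a simplified steady-state model in which the rate of change of spliced mRNA is ds/dt = u - gamma * s, with u and s the unspliced and spliced abundances of one gene and the splicing rate normalized to 1. The function name, the toy counts, and the choice to fit gamma by least squares over all cells are assumptions made for this example; this is not the velocyto API, and the published method is more elaborate.

```python
import numpy as np


def steady_state_velocity(unspliced, spliced):
    """Hypothetical helper: per-cell velocity estimate for a single gene.

    Assumes the simplified model ds/dt = u - gamma * s, where gamma is the
    degradation rate relative to a splicing rate of 1.
    """
    u = np.asarray(unspliced, dtype=float)
    s = np.asarray(spliced, dtype=float)

    # Least-squares slope of u against s for a line through the origin.
    denom = np.dot(s, s)
    gamma = np.dot(u, s) / denom if denom > 0 else 0.0

    # Positive values suggest the gene is being induced in that cell,
    # negative values suggest it is being repressed.
    velocity = u - gamma * s
    return gamma, velocity


# Made-up counts for one gene across five cells (illustration only).
spliced = np.array([1.0, 2.0, 4.0, 6.0, 8.0])
unspliced = np.array([0.5, 1.0, 3.0, 3.0, 4.0])

gamma, velocity = steady_state_velocity(unspliced, spliced)
print("gamma:", gamma)
print("velocity:", velocity)
```

    In this toy example the third cell has more unspliced mRNA than the fitted steady-state line predicts, so its velocity for this gene is positive.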

  • Demultiplexing of single-cell RNA-seq data

    dropEst is a pipeline for demultiplexing single-cell RNA-seq data, implementing additional corrections for accurate estimation of the molecular count matrices. Please refer to the original publication for details.

  • Single Cell Transcriptional Analysis

    The SCDE package provides routines for analysis of single-cell RNA-seq data. It is based on a probabilistic mixture error model, which is used to implement differential expression, subpopulation analysis, and other tasks on the data.

    We are now developing the Pagoda2 package for analysis of high-throughput (droplet- or microwell-based) single-cell RNA-seq datasets.

  Developed with the Park Lab during the PI's postdoctoral fellowship and shortly after:

  • Transposable Element Analyzer

    The Tea pipeline is designed to identify insertions of repetitive elements (such as LINE1 repeats or endogenous retroviral elements) in human genomes. Its primary aim is to detect novel repeat insertions occurring in somatic tissues (e.g. cancerous tumors); however, it can also detect repeat insertions that are polymorphic among individuals.

    For more details, please refer to the Tea manuscript and the pipeline download page on the Park Lab site.

  • Repeat Enrichment Estimator

    The software, developed during the PI's fellowship in the Park Lab, provides a means to estimate the enrichment of repetitive elements in short-read sequencing data. For details, please refer to the manuscript. The implementation and custom sequence assemblies are available from the Park Lab server, which also provides a web interface for running the analysis.

  • ChIP-seq processing pipeline (SPP)

    The spp R package provides routines for processing ChIP-seq data. It supports the output of many short-read aligners and can be used to determine a statistically significant set of binding peaks or broad regions of enrichment. The peak-calling algorithms are described in detail in the initial manuscript. Updated versions of the package and a brief tutorial are available from the Park Lab site.