Data Analysis

The NTC offers a selection of standard data analyses for a flat rate per project. Custom data analyses of PRO-seq or TT-seq data generated by the lab, previously invoiced on an hourly basis, are no longer offered (see below).
 

Standard Data Analyses

PRO-seq

QC analysis: $200 per project (this fee is required for all projects)
This is our standard charge for quality control and data handling. It is required for all projects and returns to you:
  • QC of sequencing data: mapping of reads to the reference genome (e.g., hg38, mm10) and reporting of mapping statistics for the reference and spike-in genomes
  • PCA plot to display relationship between replicates
  • Analysis of correlation between samples, using PRO-seq reads around annotated TSSs
  • Transfer of fastq files of raw sequencing data, with a minimum of 20 million uniquely mappable reads per sample
You may also select one of the following options:
 
A. Basic Data Analysis: $300 per project (in addition to QC analysis fee)
  • All outputs of QC analysis, plus:
  • Normalization of data sets, with input from users. Depth normalization is standard unless spike-in reads are significantly different between samples.
  • If sample replicates are well correlated, merging of replicates as bigwig/bedgraph files for display and analysis purposes
  • Bigwig files (with normalized and combined data, if appropriate), binned as user requests
  • UCSC browser session with data for manual inspection
  • Custom annotation table indicating the genomic coordinates of the "dominant" annotated TSS for each active gene in your samples, determined using PRO-seq data and proTSS (https://github.com/NascentTranscriptionCore/proTSScall)
  • Text file containing read counts for each sample across the active Ensembl gene models determined above
  • Determination of genes with significantly affected PRO-seq signal (as compared to a control data set) using DESeq2
B. High Resolution PRO-seq Analysis: $500 per project (in addition to QC analysis fee)
  • All outputs of QC and Basic analyses, plus:
  • Custom annotation table indicating dominant TSS and TES locations for each active Ensembl gene. These coordinates are determined using PRO-seq 5' end data and RNA-seq reads (requires user-provided RNA-seq). This analysis lets you focus on the major sites of initiation and termination used in a given cell type and condition. It can also refine TSS locations that are inaccurate in the annotation, allowing higher-resolution analyses of initiation and early elongation. For the highest quality results, users should provide raw fastq files for RNA-seq performed under conditions matching their PRO-seq samples, as this improves determination of transcription termination sites and thus of whole gene body windows.
  • RNA polymerase pausing analysis
    • Table with read counts around active, dominant TSSs, as compared to reads within gene bodies. These promoter/gene body read counts can be used for calculation of pausing indexes.
    • If desired, determination of transcripts with significantly affected PRO-seq signal specifically in the promoter-proximal or gene body window (as compared to a control data set) using DESeq2.
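
The promoter/gene body read counts described above can be turned into a pausing index along the lines of the following minimal Python sketch. It assumes one common definition, the ratio of promoter-proximal read density to gene body read density. The function name, the window lengths, and the example counts are hypothetical illustrations and do not represent the Core's own pipeline or window definitions.

```python
def pausing_index(promoter_reads, promoter_len, body_reads, body_len):
    """Return promoter read density divided by gene body read density.

    Assumed definition (one of several in use); window lengths are in bp.
    Returns None when the gene body has no reads, since the ratio is undefined.
    """
    if promoter_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    if body_reads == 0:
        return None
    promoter_density = promoter_reads / promoter_len
    body_density = body_reads / body_len
    return promoter_density / body_density


# Hypothetical counts: (promoter reads, promoter bp, gene body reads, gene body bp)
example_counts = {
    "geneA": (300, 150, 1000, 10000),
    "geneB": (30, 150, 3000, 10000),
    "geneC": (12, 150, 0, 10000),
}

for gene, counts in example_counts.items():
    print(gene, pausing_index(*counts))
```

Dividing by window length keeps the index comparable between genes of different lengths. Genes with no gene body reads are reported as None rather than being given an arbitrary value.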

TT-seq

QC analysis: $200 per project (this fee is required for all projects)
This is our standard charge for quality control and data handling. It is required for all projects and returns to you:
  • QC of sequencing data: mapping of reads to the reference genome (e.g., hg38, mm10) and reporting of mapping statistics for the reference and spike-in genomes
  • PCA plot to display relationship between replicates
  • Transfer of fastq files of raw sequencing data, with a minimum of 25 million uniquely mappable reads per sample
You may also choose:
 
Basic Data Analysis: $300 per project (in addition to QC analysis fee)
  • All outputs of QC analysis, plus:
  • Normalization of data sets, with input from users. Depth normalization is standard unless spike-in reads are significantly different between samples
  • If sample replicates are well correlated, merging of replicates as bigwig/bedgraph files for display and analysis purposes
  • Bigwig files (with normalized and combined data, if appropriate), binned as user requests
  • UCSC browser session with data for manual inspection
  • Text file containing read counts for each sample across Ensembl gene models
  • Determination of genes with significantly affected RNA-seq signal (as compared to a control data set) using DESeq2
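
To illustrate the difference between depth normalization and spike-in normalization mentioned in the Basic Data Analysis packages, here is a minimal Python sketch. It assumes each sample is scaled down to match the sample with the fewest reads, which is only one possible convention (scaling to reads per million is another). The function name and all read counts are hypothetical, and this is not the Core's actual normalization procedure.

```python
def scale_factors(read_counts):
    """Map each sample to a factor that scales it to the smallest sample.

    Pass reference-mapped read counts for depth normalization, or
    spike-in-mapped read counts for spike-in normalization.
    """
    if not read_counts:
        raise ValueError("no samples given")
    if any(n <= 0 for n in read_counts.values()):
        raise ValueError("read counts must be positive")
    target = min(read_counts.values())
    return {sample: target / n for sample, n in read_counts.items()}


# Hypothetical read counts for two samples
reference_reads = {"control": 20_000_000, "treated": 25_000_000}
spike_in_reads = {"control": 500_000, "treated": 1_000_000}

depth = scale_factors(reference_reads)  # depth normalization factors
spike = scale_factors(spike_in_reads)   # spike-in normalization factors
print(depth)
print(spike)
```

Multiplying a sample's signal by its factor puts the samples on a common scale. In this made-up example the two approaches give different factors for the treated sample, which is the kind of situation in which the choice of normalization matters.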

(discontinued) Custom Data Analyses

To focus our resources on tool development, the Nascent Transcriptomics Core will no longer offer custom data analysis services. If you need assistance with analyses beyond the packages listed above, we highly recommend contacting our colleagues at the Harvard Chan Bioinformatics Core.
https://bioinformatics.sph.harvard.edu/

We look forward to expanding our available analysis packages, so please check back periodically.