Bioinformatics Support

Contact: Jeff Delrow
Location: Thomas Bldg DE-740
Contact e-mail: bioinformatics@fredhutch.org

The Bioinformatics Shared Resource is staffed by three dedicated bioinformatics specialists and a database developer/programmer. To assist researchers with exploring and understanding genomics data, bioinformatics support is offered through two service models:

Standard Workflows and Deliverables: Basic analytical support is provided for data generated in the Fred Hutch Genomics Shared Resource. For common workflows, generation of standard deliverables is included in the service fee. The supported workflows and standard deliverables are detailed below.

Modified Workflows: We are happy to work on projects beyond the standard workflows described, and often collaborate with researchers to bring new techniques to the Center. However, adding new tools, working with non-model organisms, or integrating substantial datasets from external sources may require effort beyond what can be included in our standard service fees. In such cases, support may require an hourly rate fee-for-service structure and be subject to staff availability and project demands. Please email bioinformatics@fredhutch.org to discuss the scope of your project and get an estimate of potential service fees.

Standard Analysis Deliverables

We have established standard processes and deliverables to support the most common protocols requested by our users, as described in detail below. For these standardized workflows, we can provide descriptive text suitable for inclusion in the Methods section of a manuscript or grant.

 

Whole Exome Sequencing Analysis

Analysis and variant calling of Whole Exome Sequencing data gathered with standard capture reagents.

  • Sequencing data QC results (tool: fastqc)
  • Original and analysis-ready alignment files (tools: bwa, samtools, GATK)
  • Depth coverage of sample and intervals, and overall stats (tools: samtools, bedtools, GATK)
  • Variant calling and variant annotation in vcf format (GATK, annovar)
  • Somatic variant calling and variant annotation if paired samples are provided (GATK, annovar)
  • Somatic small indel calling and annotation if paired samples are provided (strelka, annovar)

Additional deliverables (potentially fee-for-service):

  • CNV analysis
  • Additional data processing and customized figure generation

Targeted Sequencing Analysis

Analysis of sequence variants or copy number alterations in targeted genes or intergenic regions.

  • Sequencing data QC results (tool: fastqc)
  • Original and analysis-ready alignment files (tools: bwa, samtools, GATK)
  • Depth coverage of sample and intervals, and overall stats (tools: samtools, bedtools, GATK)
  • Variant calling and variant annotation in vcf format (GATK, annovar)
  • CNV analysis (if libraries were prepared with the Ovation Custom DNA Target Enrichment System from NuGen)

Additional deliverables (potentially fee-for-service):

  • Additional data processing and custom figure generation

RNA-seq - Differential Expression Analysis

Processing of expression data from common well-annotated organisms (human, mouse, yeast, etc.)

  • Aligned reads in bam format (tools: TopHat, Picard, SAMtools)
  • QC reports: aligned reads, duplication rate, ribosomal content, 3’ bias, exonic mapping rate, base call quality by position, PCA, MA plots (tools: FastQC, RNA-SeQC, R)
  • Table of raw per-gene fragment counts across all samples (tool: htseq-count)
  • Table of normalized expression values in CPM (counts per million) units and differential expression significance testing results (tool: edgeR)
  • Depending on experimental design, gene ontology enrichment analysis, clustering, visualization (tools: goseq, MeV)
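
As a rough illustration of the CPM units mentioned above, the sketch below scales raw per-gene counts for one sample by that sample's total count. The function name and the example counts are hypothetical, and the sketch assumes plain library-size scaling; edgeR can additionally apply normalization factors to the library sizes, which is omitted here.

```python
def counts_per_million(counts):
    """Scale raw per-gene counts for one sample to counts per million (CPM).

    Assumes simple library-size scaling with no extra normalization factors.
    """
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no counts")
    return {gene: count * 1_000_000 / library_size
            for gene, count in counts.items()}


# Hypothetical raw fragment counts for a single sample.
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(raw_counts)
```

Because every sample is rescaled to the same nominal total of one million, CPM values can be compared across samples that were sequenced to different depths.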

Non-model organisms, complex designs/analyses, and non-standard library prep may require more analyst time and incur additional charges. Researchers are encouraged to contact the Genomics & Bioinformatics Shared Resource during experiment design.

ChIP-seq Analysis

Analysis of DNA “tags” bound by proteins of interest over long regions (e.g. histones) or shorter regions (e.g. transcription factors). Our ChIP-seq workflow is based on https://www.encodeproject.org/chip-seq/

  • Sequencing data QC results (tool: fastqc)
  • Aligned reads in standard bam format (tools: bwa or bowtie)
  • Control-normalized tag density in bigwig format for visualization (tools: Homer, UCSC tools)
  • Called peaks in BED format (tools: MACS2, bedtools)
  • Differential peak calling (tool: DiffBind)

DNA accessibility - ATAC-seq Analysis

Assessment of chromatin accessibility. Our ATAC-seq workflow is based on https://www.encodeproject.org/atac-seq/

  • Sequencing data QC results
  • Alignment files
  • Peak calling files
  • BigWig files for visualization
  • Detailed reports

Single cell RNA-seq

Processing of expression data from single cells captured and prepared with 10x Genomics Chromium system or similar method, from common well-annotated organisms (human, mouse, yeast, etc.)

  • Quality control reports: numbers of read pairs gathered, percent aligned, percent duplication, 3’ bias and mitochondrial/ribosomal content (tools: FastQC, RSeQC, STAR)
  • Recommended cells to exclude from analysis based on above (high mitochondrial content, extreme 3’ bias, apparent doublets, etc.)
  • Table of normalized expression values in standard FPKM units (tools: RSEM, Kallisto, SCDE, Monocle)
  • Depending on experimental design, differential expression or exploration of cell types (tools: Monocle, SCDE)
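
To make the cell-exclusion recommendation more concrete, here is a minimal sketch of flagging cells by mitochondrial content. It is not the Shared Resource's actual procedure: the function name, the "MT-" gene prefix (a human gene-symbol convention), and the 20% cutoff are all assumptions chosen for illustration.

```python
def flag_high_mito_cells(cell_counts, mito_prefix="MT-", max_mito_fraction=0.2):
    """Return the cells whose mitochondrial read fraction exceeds a cutoff.

    cell_counts maps cell barcode -> {gene symbol -> count}.
    The prefix and cutoff are illustrative assumptions, not fixed standards.
    """
    flagged = []
    for cell, gene_counts in cell_counts.items():
        total = sum(gene_counts.values())
        mito = sum(count for gene, count in gene_counts.items()
                   if gene.startswith(mito_prefix))
        # Cells with no counts at all are flagged as well.
        if total == 0 or mito / total > max_mito_fraction:
            flagged.append(cell)
    return flagged


# Hypothetical per-cell counts.
cells = {
    "cell_1": {"MT-CO1": 5, "ACTB": 95},
    "cell_2": {"MT-CO1": 60, "ACTB": 40},
    "cell_3": {"MT-ND1": 10, "GAPDH": 90},
}
flagged = flag_high_mito_cells(cells)
```

In practice the cutoff depends on tissue and protocol, so it is exposed as a parameter rather than fixed.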

Non-model organisms, complex designs/analyses, and non-standard library prep may require more analyst time and incur additional charges. Researchers are encouraged to talk with the Genomics & Bioinformatics Shared Resource during experiment design.

CRISPR Screens

  • Table of sgRNA counts (tools: FASTX-Toolkit, bowtie)
  • QC reports: summary of total counts and number of detected guides in each sample, PCA, MA plots (tool: R)
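
The per-sample QC summary described above (total counts and number of detected guides) can be sketched as follows. The table layout, the function name, and the rule that a guide counts as detected at one or more reads are assumptions made for this example, and the reports themselves are produced in R.

```python
def summarize_guide_counts(count_table, min_count=1):
    """Return total reads and number of detected guides for each sample.

    count_table maps sgRNA id -> {sample name -> read count}.
    A guide is treated as detected when its count is >= min_count
    (an assumed threshold for this sketch).
    """
    summary = {}
    for guide, per_sample in count_table.items():
        for sample, count in per_sample.items():
            stats = summary.setdefault(
                sample, {"total_counts": 0, "detected_guides": 0})
            stats["total_counts"] += count
            if count >= min_count:
                stats["detected_guides"] += 1
    return summary


# Hypothetical sgRNA count table with two samples.
counts = {
    "sgRNA_1": {"plasmid": 120, "day14": 0},
    "sgRNA_2": {"plasmid": 95, "day14": 210},
    "sgRNA_3": {"plasmid": 101, "day14": 33},
}
summary = summarize_guide_counts(counts)
```

A drop in detected guides between an early and a late sample, as in this toy table, is the kind of pattern such a summary is meant to surface.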