library(DESeq2)
library(bcbioRNASeq)

# Shared RMarkdown settings
prepareRNASeqTemplate()
if (file.exists("setup.R")) {
    source("setup.R")
}

# Load bcbioRNADataSet object
bcbName <- load(params$bcb)
bcb <- mget(bcbName, inherits = FALSE)[[1]]
# rm(list = bcbName) # Only when the object is not equal to bcb to save memory space.
if (!is(bcb, "bcbioRNADataSet")) {
    stop("bcb must bcbioRNADataSet class object")
}
if (nrow(bcb) == 505)
    print("Loaded object is the same dimension than test data in package.")

# Design formula
# help("design", "DESeq2")
design <- params$design

# Contrast
# help("results", "DESeq2")
#   1. Design matrix parameter.
#   2. Numerator for LFC (expt).
#   3. Denominator for LFC (control).
contrast <- params$contrast

# Significance cutoffs
alpha <- params$alpha
lfc <- params$lfc

# Directory paths
outputDir <- params$outputDir
countsDir <- file.path(outputDir, "results", "counts")
deDir <- file.path(outputDir, "results", "differential_expression")

Overview


> dds <- DESeqDataSetFromTximport(txi = txi(bcb), colData = colData(bcb), design = design) %>% 
+     DESeq
> rld <- rlog(dds)

Alpha level (FDR) cutoffs

Let’s take a look at the number of genes we get with different false discovery rate (FDR) cutoffs. These tests subset P values that have been multiple test corrected using the Benjamini Hochberg (BH) method (Benjamini and Hochberg 1995).

> alphaSummary(dds)
0.1 0.05 0.01 0.001 1e-06
LFC > 0 (up) 179, 0.61% 142, 0.48% 93, 0.32% 58, 0.2% 25, 0.085%
LFC < 0 (down) 140, 0.48% 104, 0.35% 58, 0.2% 38, 0.13% 15, 0.051%
outliers 0, 0% 0, 0% 0, 0% 0, 0% 0, 0%
low counts 11599, 39% 11046, 38% 11046, 38% 10494, 36% 16570, 56%
cutoff (mean count < 7) (mean count < 6) (mean count < 6) (mean count < 5) (mean count < 74)

Results

> resUnshrunken <- results(
+     dds,
+     contrast = contrast,
+     alpha = alpha)
> 
> # DESeqResults with shrunken log2 fold changes (LFC)
> # help("lfcShrink", "DESeq2")
> # Only `coef` or `contrast` can be specified, not both
> # Use the correct `coef` number to modify from `resultsNames(dds)`
> resShrunken <- lfcShrink(
+     dds = dds,
+     # coef = 2,
+     contrast = contrast,
+     res = resUnshrunken)
> 
> # Use shrunken LFC values by default
> res <- resShrunken

We performed the analysis using a BH adjusted P value cutoff of 0.01 and a log fold-change (LFC) ratio cutoff of 0.

Plots

Mean average (MA)

An MA plot compares transformed counts on M (log ratio) and A (mean average) scales (Y. H. Yang et al. 2002).

> plotMA(res)

Volcano

A volcano plot compares significance (BH-adjusted P value) against fold change (log2) (Cui and Churchill 2003; Li et al. 2014). Genes in the green box with text labels have an adjusted P value are likely to be the top candidate genes of interest.

> plotVolcano(res, lfc = lfc)