By Brad Chapman


  • Germline heterozygous SNPs, informative for purity/ploidy/clone estimation; these need to be estimated for tumor-only samples
  • Copy number calls – GC corrected and normalized (to normal or process-matched normal)
  • Split copy number calls into major/minor alleles, potentially with multiple states
  • Somatic variant calls with allele frequencies, for tumor subclones
  • Estimate subclones from somatic calls + major/minor CNVs
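As a rough illustration of how these inputs combine, the sketch below converts a somatic variant allele frequency into a cancer cell fraction using purity and the local total copy number. It relies on the standard expected-VAF relation and is not taken from any of the tools listed in these notes. The function name and the `multiplicity` parameter (number of mutated copies per tumor cell) are assumptions made for illustration.

```python
def cancer_cell_fraction(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Estimate the fraction of tumor cells carrying a somatic variant.

    Based on the expected allele frequency of a somatic variant:
        vaf = purity * multiplicity * ccf
              / (purity * tumor_cn + (1 - purity) * normal_cn)
    solved for ccf. All parameter names are hypothetical.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the mixed sample
    average_copies = purity * tumor_cn + (1 - purity) * normal_cn
    return vaf * average_copies / (purity * multiplicity)


# A variant at 25% VAF in a 50% pure sample, diploid region, one mutated copy
ccf = cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_cn=2)
```

Sampling noise can push the estimate above 1.0, so real tools model read counts rather than using a point estimate like this.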


  • Heterogeneous input samples ranging from WGS tumor/normal to panel/capture tumor-only; would like a similar workflow to handle most cases
  • Lack of good truth sets, so hard to determine if methods work well
  • Most tools not fully automated and require some decision making during the process

Example figures

  • Overview of problem: Figure 1

  • Sequencing levels required for reconstruction depending on clonal complexity: Figure 5, Figure 6


Purity, CNV allelic copy number

PureCN – purity/ploidy, classify variants as germline/somatic clonal/subclonal, panels/exome > 100x, support for no matching normals but needs process-matched normal BAM

BubbleTree – purity, LOH and subclonality; WGS/exome, from heterozygous variants and CNVs

TitanCNA – WGS/exome; heterozygous variants => purity, CNVs into major/minor subclones, LOH

Battenberg – WGS tumor/normal; BAMs => purity, CNV caller into major/minor subclones
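The tools above all rest on the same basic signal: purity and allele-specific copy number shift both the B-allele frequency at germline heterozygous SNPs and the tumor/normal depth ratio. The sketch below shows those expected values for a single clonal segment. It is a simplified textbook model, not the implementation of any tool listed here, and the function and parameter names are assumptions.

```python
import math


def expected_baf(purity, major_cn, minor_cn):
    """Expected minor-allele frequency at a germline heterozygous SNP.

    Normal cells contribute one copy of each allele; tumor cells contribute
    minor_cn copies of the tracked allele out of major_cn + minor_cn.
    """
    total = purity * (major_cn + minor_cn) + (1 - purity) * 2
    return (purity * minor_cn + (1 - purity) * 1) / total


def expected_log2_ratio(purity, major_cn, minor_cn, ploidy=2.0):
    """Expected tumor/normal log2 depth ratio for a segment.

    ploidy is the average tumor copy number across the genome (assumed input).
    """
    segment = purity * (major_cn + minor_cn) + (1 - purity) * 2
    genome = purity * ploidy + (1 - purity) * 2
    if segment == 0:
        # Homozygous deletion in a 100% pure sample: no reads expected
        return float("-inf")
    return math.log2(segment / genome)


# Copy-neutral LOH (2 + 0) in a 50% pure sample
baf = expected_baf(purity=0.5, major_cn=2, minor_cn=0)
```

Fitting purity and ploidy amounts to searching for values under which the observed BAF and depth ratios of all segments land near these expectations for integer copy number states.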

Subclonal reconstruction

PhyloWGS – subclone and tumor evolution from Battenberg or TITAN output (major/minor allele CN) + VCFs

SciClone – exome/WGS: somatic CNV calls + variants. Uses only variants in CN=2 regions, so requires a relatively stable genome to have enough events.
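To make the copy-neutral restriction concrete, here is a minimal sketch of filtering somatic variants down to those inside CN=2 segments. It is not SciClone's code: the `Segment` and `Variant` records, the 0-based half-open coordinates, and the function name are all assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical minimal records; real inputs would come from CNV and VCF files
Segment = namedtuple("Segment", "chrom start end total_cn")
Variant = namedtuple("Variant", "chrom pos vaf")


def copy_neutral_variants(variants, segments, neutral_cn=2):
    """Keep variants that fall inside segments with total copy number neutral_cn.

    Coordinates are treated as 0-based, half-open [start, end).
    Variants not covered by any segment are dropped.
    """
    neutral = [s for s in segments if s.total_cn == neutral_cn]
    kept = []
    for v in variants:
        if any(s.chrom == v.chrom and s.start <= v.pos < s.end for s in neutral):
            kept.append(v)
    return kept


segments = [Segment("chr1", 0, 100, 2), Segment("chr1", 100, 200, 3)]
variants = [
    Variant("chr1", 50, 0.40),   # inside a CN=2 segment, kept
    Variant("chr1", 150, 0.30),  # inside a CN=3 segment, dropped
    Variant("chr2", 10, 0.20),   # no segment coverage, dropped
]
kept = copy_neutral_variants(variants, segments)
```

In a heavily rearranged genome most variants fall outside CN=2 segments, which is why this approach needs a relatively stable genome to retain enough events for clustering.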

Canopy – exome/WGS: allele specific CNVs + variants. Recommend using Sequenza as input segmented CNV calls.

Guan Lab, U of M – Somatic variants and CNVs, SMC-Het winner but demonstration implementation only! Synapse: syn6087005/wiki/398911

THetA – integrated, CNVs only, academic only in latest version

PyClone – academic only

CNV callers

  • CNVkit
  • Seq2C
  • Canvas


tHapMix simulation – WGS tumor/normal

ICGC-TCGA DREAM Tumor Heterogeneity Challenge (VCF + Battenberg stratified output), no truth sets available! Synapse: syn2813581/wiki/303137

Remove artifacts

  • small variants – Damage assessment
  • germline CNVs

CNV benchmarking

HCC2218 truth set from Canvas

GiaB NA24385 CNVs
