sigDriver is a driver discovery tool base on mutational signature exposures. Currently, single base substitution signatures are supported.
sigDriver works in R (>= 3.4.0) and depends on the following packages: data.table, optparse, dplyr, GenomicRanges, reshape2, SKAT, matrixStats, rtracklayer, trackViewer, qqman, VariantAnnotation, DDoutlier, R.utils
sigDriver package can run on a standard computer. The memory requirment is dependent on the number of the input somatic variants and number of requested threads.
The runtimes below are generated using a computer with 16 GB RAM, 4 cores@2.3 GHz.
This package is supported for macOS and Linux. The package has been tested on the following systems:
- Linux: openSUSE 42.3
Installation in R using devtools:
- This should take less than 1 minute if all prerequisites were meet
library(devtools)
install_github("wkljohn/sigDriver",subdir="R_library")Installing helper Rscripts and the example data:
git clone https://github.com/wkljohn/sigDriver.gitRun sigDriver
- The approximate run time for this demo is 9 minutes
Rscript run_sigdriver.R -s SBS9 -t 4 -v ./example/example_SBS9.snv.simple.gz -e ./example/example_signatures.tsv -m ./example/example_SBS9_metadata.tsv -o ./output/Run results annotation
- The approximate run time for this demo is 1 hour 43 minutes
Rscript run_sigdriver_annotate.R -g gencode.v19.gtf -t 4 -l ./output/SBS9_results.tsv -d ./output/SBS9_var_meta.rds -s SBS9 -v ./example/example_SBS9.snv.simple.gz -e ./example/example_signatures.tsv -m ./example/example_SBS9_metadata.tsv -o ./output/
- Running
run_sigdriver.Rrequire the following inputs:
| Parameter | Description | Note |
|---|---|---|
| -v | Input variants in simple format | Tab delimited without column names {Entity,ID,Cohort,Genome_Build,Variant_type(SNV/INDEL),Chr,Start,End,Ref,Alt} |
| -m | Sample metadata | Tab delimited with column names {ID,entity,gender} |
| -e | Signature exposures | 'signatures' x 'ID' matrix column names=ID, row names=Signature |
| -o | Output folder | |
| -s | Signature to test | signature(s) listed in -e |
| -t | Threads |
- Running
run_sigdriver_annotate.Rrequire inputs in-addition to runningrun_sigdriver.R:
| Parameter | Description | Note |
|---|---|---|
| -g | Reference GTF | |
| -l | Results from sigdriver | {Signature}_results.tsv in output folder |
| -d | RDS object from sigdriver | {Signature}_var_meta.rds in output folder |
- Running
run_sigdriver.Rproduce the following outputs:
| File name | Description |
|---|---|
{signature}_results.tsv |
the final output of the script on associated hotspots and the details |
{signature}_results_FULL.tsv |
the hotspots output before filtering and p-values correction |
{signature}_qq.png |
the quantile-quantile plot for the distribution of p-values |
{signature}_var_meta.rds |
The corrected variants table for downstream analysis(note: this R object is incompatible across different versions of GenomicRanges) |
- Running
run_sigdriver_annotate.Rproduce the following outputs:
| File name | Description |
|---|---|
{signature}_perturb_sites_annotated_results.tsv |
the perturbed p-values and gene annotations of each hotspot point mutation |
./plots/{signature}/[hotspot].png |
the graphical presentation of the significant hotspot |
./plots/{signature}/[hotspot].png |
the graphical presentation of the significant hotspot |
For questions and suggestions, please contact:
John Wong: wkljohn@gmail.com