10 Variant Stratification
Tim Dunn edited this page Feb 6, 2024 · 1 revision
We currently output an intermediate VCF in a GA4GH-compatible format, so the results can be stratified and analyzed with hap.py's quantification helper script, qfy.py.
To use qfy.py, please install hap.py.
tabix and bgzip should already be included as part of HTSlib.
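Before running the steps below, it can help to confirm that both tools are actually reachable. The following is a minimal sketch, not part of vcfdist or hap.py; the helper name missing_tools is hypothetical, and it only assumes a POSIX shell.

```shell
# Hypothetical helper: print each given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check for the two HTSlib tools used later on this page.
missing=$(missing_tools tabix bgzip)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "tabix and bgzip are available"
fi
```

If either tool is reported as missing, installing HTSlib (or adding its install directory to PATH) should resolve it.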
# run vcfdist
> ./vcfdist \
    query.vcf.gz \
    truth.vcf.gz \
    reference.fasta \
    -b analysis-regions.bed \
    -p output-prefix/
# compress summary VCF
> bgzip output-prefix/summary.vcf
# index summary VCF
> tabix -p vcf output-prefix/summary.vcf.gz
# set reference path
> export HGREF=/path/to/reference.fasta
# activate hap.py virtualenv
> source /path/to/happy/venv2/bin/activate
# run quantification script
> python /path/to/happy/install/bin/qfy.py \
    -t ga4gh \
    --write-vcf \
    --write-counts \
    --stratification strat.tsv \
    --roc QUAL \
    -o results/qfy-output-prefix \
    output-prefix/summary.vcf.gz
Ensure that strat.tsv contains one stratification region per line; each line consists of a region name and a BED file name, separated by a tab.
GIAB stratification regions for GRCh38 can be found here.
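As an illustration of the strat.tsv layout described above, the sketch below writes a two-line file and checks that every line has exactly two tab-separated fields. The region names and BED file names are hypothetical placeholders, not actual GIAB file names; substitute the stratification files you downloaded.

```shell
# Hypothetical region names and BED file names; replace with your own.
# printf reuses the format for each pair, giving one "name<TAB>file" line per region.
printf '%s\t%s\n' \
    "lowmap" "lowmap.bed" \
    "tandem_repeats" "tandem_repeats.bed" \
    > strat.tsv

# Report any line that does not have exactly two non-empty, tab-separated fields.
bad_lines=$(awk -F'\t' 'NF != 2 || $1 == "" || $2 == "" { print NR }' strat.tsv)
if [ -n "$bad_lines" ]; then
    echo "Malformed lines in strat.tsv:" $bad_lines
else
    echo "strat.tsv looks well-formed"
fi
```

A common mistake is separating the two columns with spaces instead of a tab; the awk check above would flag such lines.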