Quick Start¶
Get up and running in minutes. All examples below assume gbcms is installed.
Many samples on HPC?
Use the Nextflow pipeline instead of the CLI for parallel processing on a cluster.
Basic Usage¶
Output Format¶
Multiple Samples¶
Quality Filters¶
Complete Example¶
gbcms dna \
    --variants variants.vcf \
    --bam TumorSample:tumor.bam \
    --fasta hg38.fa \
    --output-dir genotyped/ \
    --format maf \
    --suffix .genotyped \
    --threads 8 \
    --min-mapq 30 \
    --min-baseq 20 \
    --filter-duplicates \
    --filter-secondary \
    --filter-supplementary \
    --mfsd \
    --mfsd-parquet
Output:
- genotyped/TumorSample.genotyped.maf — allele counts + 34 mFSD columns
- genotyped/TumorSample.genotyped.fsd.parquet — raw fragment size arrays
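The output names above appear to follow the pattern `<output-dir>/<sample><suffix>.<format>`, where the sample ID is the text before the colon in `--bam`. The sketch below illustrates that pattern only. `expected_output` is a hypothetical helper, not part of gbcms, and the naming rule is inferred from the example rather than taken from a specification. It handles only the `name:path` form of `--bam`.

```shell
# Hypothetical helper (not part of gbcms): predict the main output path
# for one --bam entry of the form name:path.
# Assumes the pattern <output-dir>/<sample><suffix>.<format> inferred from the example.
expected_output() {
    bam_arg=$1
    out_dir=$2
    suffix=$3
    format=$4
    sample=${bam_arg%%:*}   # text before the first colon is taken as the sample ID
    # Strip one trailing slash from the directory so the path has a single separator.
    printf '%s/%s%s.%s\n' "${out_dir%/}" "$sample" "$suffix" "$format"
}

expected_output TumorSample:tumor.bam genotyped/ .genotyped maf
```

With the arguments from the command above, this prints `genotyped/TumorSample.genotyped.maf`, which can be handy when a wrapper script needs to check that the expected file was written.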
gbcms rna \
    --variants mutations.maf \
    --bam tumor_rna:aligned.bam \
    --fasta hg38.fa \
    --rna-editing-db TABLE1_hg38.txt.gz \
    --format maf \
    --threads 8 \
    --output-dir results/
Output: results/tumor_rna.maf, with the standard counts plus 5 RNA columns:
- rna_sense_depth
- rna_antisense_depth
- rna_sense_strand_alt_count
- rna_editing_site_overlap
- rna_splice_spanning_count
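To inspect the RNA columns without opening the whole file, the columns can be selected by header name. The sketch below assumes the MAF output is tab-delimited with a single header row, optionally preceded by `#` comment lines. `select_cols` is a hypothetical helper, and the mock file contents (including the value formats) are placeholders, not real gbcms output.

```shell
# Hypothetical helper: print selected columns from a tab-delimited file by header name.
# Lines starting with '#' are skipped; missing columns are reported as NA.
select_cols() {
    awk -F'\t' -v want="$1" '
        /^#/ { next }
        !seen_header {
            # First non-comment line is the header: remember each column position.
            n = split(want, names, ",")
            for (i = 1; i <= NF; i++) pos[$i] = i
            seen_header = 1
        }
        {
            line = ""
            for (j = 1; j <= n; j++) {
                val = (names[j] in pos) ? $(pos[names[j]]) : "NA"
                line = line (j > 1 ? "\t" : "") val
            }
            print line
        }
    ' "$2"
}

# Mock MAF with placeholder values, for illustration only.
maf=$(mktemp)
printf '#version 2.4\n' > "$maf"
printf 'Hugo_Symbol\trna_sense_depth\trna_antisense_depth\trna_editing_site_overlap\n' >> "$maf"
printf 'TP53\t120\t3\tfalse\n' >> "$maf"

select_cols Hugo_Symbol,rna_sense_depth,rna_editing_site_overlap "$maf"
```

On a real run, the second argument would be the output file, for example `results/tumor_rna.maf`.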
Docker¶
Common CLI Options¶
| Option | Default | Description |
|---|---|---|
| `--variants` | Required | VCF or MAF file |
| `--bam` | Required | BAM file(s). Prefix with `name:` to set sample ID |
| `--bam-list` | - | File with BAM paths (one per line) |
| `--fasta` | Required | Reference FASTA |
| `--output-dir` | Required | Output directory |
| `--format` | `vcf` | Output format (`vcf` or `maf`) |
| `--min-mapq` | 20 (DNA) / 1 (RNA) | Minimum mapping quality |
| `--min-baseq` | 20 | Minimum base quality |
| `--threads` | 1 | Number of threads |
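A file for `--bam-list` needs one BAM path per line, which is easy to generate from a directory. The sketch below shows one way to build such a file. The directory, the file names, and the empty stand-in BAMs are placeholders. Whether `--bam-list` can be used in place of `--bam`, and whether its lines accept the `name:` prefix, are not stated here, so the commented command is an assumption to check against the CLI reference.

```shell
# Build a BAM list file (one path per line) for --bam-list.
# The directory and file names are placeholders for illustration.
bam_dir=$(mktemp -d)
touch "$bam_dir/sample_a.bam" "$bam_dir/sample_b.bam"   # empty stand-ins for real BAMs

bam_list="$bam_dir/bams.txt"
find "$bam_dir" -name '*.bam' | sort > "$bam_list"       # sorted for a stable order
cat "$bam_list"

# Assumed usage (not run here):
# gbcms dna --variants variants.vcf --bam-list "$bam_list" --fasta hg38.fa --output-dir genotyped/
```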
Related¶
- DNA CLI Reference — mFSD, UMI, alignment backend options
- RNA CLI Reference — Strandedness, editing DB, splice junction options
- Nextflow Pipeline — Process many samples in parallel on HPC
- Allele Classification — How the counting engine works
- Troubleshooting — Common issues and solutions