
gbcms

Get Base Counts Multi-Sample — High-performance variant counting from BAM files

What It Does

GBCMS extracts allele counts and variant metrics at specified positions in BAM files:

block-beta
    columns 3
    VCF["📄 VCF/MAF\nVariant positions"]:1
    Engine["⚡ gbcms\nPython + Rust"]:1
    Counts["📊 Allele Counts\nDP · RD · AD · VAF"]:1
    BAM["🗂️ BAM Files\n(1 to N samples)"]:1
    space:1
    Metrics["🧬 Fragment Counts\nStrand bias · mFSD"]:1

    VCF --> Engine
    BAM --> Engine
    Engine --> Counts
    Engine --> Metrics

Visual Overview

Key Metrics

| Metric | Formula | Description |
| --- | --- | --- |
| VAF | AD / (RD + AD) | Variant Allele Frequency |
| Strand Bias | Fisher's exact test | Detect sequencing artifacts |
| Fragment Counts | Deduplicated pairs | PCR-aware counting |
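
To make the first two metrics concrete, here is a minimal Python sketch of the VAF formula and a two-sided Fisher's exact test on a 2x2 strand table. It is an illustration only, not gbcms's implementation: the function names, the table layout (rows are ref/alt, columns are forward/reverse), and returning NaN at zero depth are all assumptions.

```python
from math import comb


def vaf(rd: int, ad: int) -> float:
    """Variant allele frequency: AD / (RD + AD)."""
    total = rd + ad
    # Assumption: report NaN when there is no coverage at the site.
    return ad / total if total > 0 else float("nan")


def fisher_exact_two_sided(ref_fwd: int, ref_rev: int,
                           alt_fwd: int, alt_rev: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table
    [[ref_fwd, ref_rev], [alt_fwd, alt_rev]] (assumed layout)."""
    row1 = ref_fwd + ref_rev
    row2 = alt_fwd + alt_rev
    col1 = ref_fwd + alt_fwd
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x ref reads on the forward strand,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(ref_fwd)
    # Sum every table that is at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    print(vaf(rd=90, ad=10))
    print(fisher_exact_two_sided(3, 0, 0, 3))
```

A small p-value means the alternate allele is distributed across strands differently from the reference allele, which is the pattern a strand-specific artifact would produce.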

Quick Start

# Install
pip install gbcms

# DNA/cfDNA counting
gbcms dna --variants variants.vcf --bam sample.bam --fasta ref.fa --output-dir results/

# RNA-seq counting
gbcms rna --variants variants.vcf --bam rna:aligned.bam --fasta ref.fa --output-dir results/

Full Installation Guide | CLI Examples
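
For a handful of BAM files, the commands above can be scripted. The sketch below builds one `gbcms` invocation per BAM using only the flags shown in the Quick Start. The helper name `build_gbcms_command` and the sample file names are hypothetical, and the actual `subprocess.run` call is left commented out so the script runs even where gbcms is not installed.

```python
import shlex


def build_gbcms_command(mode, variants, bam, fasta, output_dir):
    """Return the argument list for one gbcms run (hypothetical helper).

    Only the subcommands and flags that appear in the Quick Start are used.
    """
    if mode not in ("dna", "rna"):
        raise ValueError(f"mode must be 'dna' or 'rna', got {mode!r}")
    return [
        "gbcms", mode,
        "--variants", str(variants),
        "--bam", str(bam),
        "--fasta", str(fasta),
        "--output-dir", str(output_dir),
    ]


if __name__ == "__main__":
    # Hypothetical input files; replace with real paths.
    for bam in ["sample1.bam", "sample2.bam"]:
        cmd = build_gbcms_command("dna", "variants.vcf", bam, "ref.fa", "results/")
        print(shlex.join(cmd))
        # To execute for real (requires gbcms on PATH):
        # import subprocess
        # subprocess.run(cmd, check=True)
```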


Choose Your Workflow

flowchart TD
    Start(["What data?"]):::start
    Start -->|"DNA / cfDNA\nWGS / WES / Panel"| DNA(["gbcms dna"]):::dna
    Start -->|"RNA-seq\n(STAR-aligned, dUTP)"| RNA(["gbcms rna"]):::rna

    DNA --> NsamD{"Many samples?\n≥10 BAMs"}
    RNA --> NsamR{"Many samples?\n≥10 BAMs"}

    NsamD -->|"No"| CLI(["🖥️ CLI"]):::cli
    NsamD -->|"Yes"| HPC{"HPC / SLURM?"}
    NsamR -->|"No"| CLI
    NsamR -->|"Yes"| HPC

    HPC -->|"Yes"| NF(["🔷 Nextflow"]):::nf
    HPC -->|"No"| CLI

    classDef start fill:#9b59b6,color:#fff,stroke:#7d3c98,stroke-width:2px;
    classDef dna fill:#27ae60,color:#fff,stroke:#1e8449,stroke-width:2px;
    classDef rna fill:#3498db,color:#fff,stroke:#2471a3,stroke-width:2px;
    classDef cli fill:#e67e22,color:#fff,stroke:#bf6516,stroke-width:2px;
    classDef nf fill:#2c3e50,color:#fff,stroke:#1a2530,stroke-width:2px;

| Workflow | Best For | Guide |
| --- | --- | --- |
| CLI | Fewer than 10 samples, or a local/single server | Quick Start |
| Nextflow | 10+ samples, HPC/SLURM | Nextflow Guide |
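
The decision flow above can also be written as a small function. This is only a restatement of the flowchart in Python; `choose_workflow` and its parameter names are hypothetical and not part of gbcms.

```python
def choose_workflow(data_type: str, n_bams: int, has_hpc: bool):
    """Return (subcommand, runner) following the flowchart above."""
    subcommands = {"dna": "gbcms dna", "rna": "gbcms rna"}
    if data_type not in subcommands:
        raise ValueError("data_type must be 'dna' or 'rna'")
    if n_bams < 1:
        raise ValueError("at least one BAM file is required")

    # "Many samples" in the flowchart means 10 or more BAMs.
    many_samples = n_bams >= 10
    # Nextflow is suggested only when there are many samples AND an
    # HPC/SLURM environment is available; every other path ends at the CLI.
    runner = "Nextflow" if (many_samples and has_hpc) else "CLI"
    return subcommands[data_type], runner


if __name__ == "__main__":
    print(choose_workflow("dna", 3, has_hpc=True))
    print(choose_workflow("rna", 24, has_hpc=True))
    print(choose_workflow("dna", 24, has_hpc=False))
```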

Architecture

gbcms is a Python/Rust hybrid built for performance.

See the Architecture Reference for full diagrams covering system layers, data flow, genomic binning, the coordinate system, the config hierarchy, and the end-to-end sequence.

Technical Details | How It Works


Documentation

| Section | Description |
| --- | --- |
| Getting Started | Installation and first run |
| CLI Reference | Command-line usage |
| Nextflow Pipeline | HPC workflow |
| How It Works | Architecture, algorithms, and formats |
| Development | Contributing guide |