Installation¶
Quick Install¶
System Requirements
PyPI wheels require glibc 2.34+ (Ubuntu 22.04+, RHEL 9+, Debian 12+). For older systems, see Legacy Linux.
Requirements¶
| Component | Requirement |
|---|---|
| Python | 3.10+ |
| OS | Linux (glibc 2.34+), macOS, Windows (WSL2) |
| Memory | 4 GB+ (8 GB for large BAMs) |
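The requirements in the table can be checked from a terminal before installing. The following is a minimal sketch, not part of gbcms: the `version_ge` helper is a hypothetical name introduced here, and it assumes a `sort` that supports `-V` (GNU coreutils and recent macOS do).

```shell
# Succeeds when version $1 is greater than or equal to version $2
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# On glibc-based Linux, getconf prints something like "glibc 2.35"
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}' || true)
if [ -z "$glibc_version" ]; then
  echo "glibc not detected; this check only applies to glibc-based Linux"
elif version_ge "$glibc_version" 2.34; then
  echo "glibc $glibc_version: meets the 2.34+ requirement for PyPI wheels"
else
  echo "glibc $glibc_version: older than 2.34, follow the Legacy Linux steps"
fi

# Python 3.10+ is required
if command -v python3 >/dev/null 2>&1; then
  python_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_ge "$python_version" 3.10; then
    echo "Python $python_version: OK"
  else
    echo "Python $python_version: older than 3.10"
  fi
else
  echo "python3 not found on PATH"
fi
```

Version strings are compared with `sort -V` rather than as decimals, so `3.9` correctly sorts before `3.10`.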
For Nextflow Workflow¶
- Nextflow 21.10.3+
- Docker or Singularity
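A quick way to see whether the workflow prerequisites are present is to look for each tool on `PATH`. This is an illustrative sketch only: the `have` helper is a hypothetical name, and the reported Nextflow version still needs to be compared with 21.10.3 by eye.

```shell
# Succeeds when the named command is on PATH
have() {
  command -v "$1" >/dev/null 2>&1
}

if have nextflow; then
  nextflow -version
else
  echo "nextflow: not found"
fi

# Either container runtime is enough
if have docker; then
  docker --version
elif have singularity; then
  singularity --version
else
  echo "neither docker nor singularity found"
fi
```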
Legacy Linux (RHEL 8 / HPC)¶
For RHEL 8, CentOS 8, or HPC systems with glibc < 2.34:
# Create environment with build dependencies
# Note: clangdev (not clang) provides headers needed by bindgen
micromamba create -n gbcms_env python=3.13 clangdev rust -c conda-forge
micromamba activate gbcms_env
# Set libclang path for the Rust build
export LIBCLANG_PATH="$CONDA_PREFIX/lib"
# Install from source
git clone https://github.com/msk-access/gbcms.git
cd gbcms
pip install .
Why not a plain pip install from PyPI?
PyPI wheels require glibc 2.34+. On RHEL 8 (glibc 2.28), pip cannot use the wheel and falls back to compiling from source, which requires Rust and the clang headers. The conda environment above provides both.
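If the source build fails, it can help to confirm that the build dependencies are visible in the active environment before rerunning the install from source. The sketch below is an assumption-laden check, not part of the gbcms build: `check_libclang` is a hypothetical helper, and it assumes the library file is named `libclang.so*` (Linux) or `libclang*.dylib` (macOS).

```shell
# Succeeds if a libclang shared library exists in the given directory
check_libclang() {
  [ -n "$1" ] || return 1
  for lib in "$1"/libclang.so* "$1"/libclang*.dylib; do
    # An unmatched glob stays literal, so test that the path really exists
    [ -e "$lib" ] && return 0
  done
  return 1
}

# The Rust toolchain must be on PATH for the source build
for tool in rustc cargo; do
  if command -v "$tool" >/dev/null 2>&1; then
    "$tool" --version
  else
    echo "missing: $tool"
  fi
done

if check_libclang "${LIBCLANG_PATH:-}"; then
  echo "libclang found in $LIBCLANG_PATH"
else
  echo "no libclang found; check that LIBCLANG_PATH is set and clangdev is installed"
fi
```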
Verification¶
# Check installation
gbcms --version
# Expected: X.Y.Z (your installed version)
# Test help
gbcms --help
Docker Usage¶
docker run --rm \
-v "$(pwd)":/data \
ghcr.io/msk-access/gbcms:X.Y.Z \
gbcms dna \
--variants /data/variants.vcf \
--bam /data/sample.bam \
--fasta /data/reference.fa \
--output-dir /data/results/
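Docker Volume
Use -v to mount your data directory into the container. The example above mounts the current directory at /data, so every input and output path passed to gbcms must point inside /data.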
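Because the container only sees files under the mounted directory, a missing input on the host is a common cause of "file not found" errors inside the container. A minimal pre-flight sketch, assuming the file names from the example above (`check_inputs` is a hypothetical helper, not a gbcms command):

```shell
# Succeeds only if every named file exists in the directory that will be mounted
check_inputs() {
  mount_dir=$1
  shift
  inputs_ok=0
  for name in "$@"; do
    if [ ! -f "$mount_dir/$name" ]; then
      echo "missing in $mount_dir: $name" >&2
      inputs_ok=1
    fi
  done
  return $inputs_ok
}

# The current directory is what the example mounts at /data
check_inputs "$(pwd)" variants.vcf sample.bam reference.fa \
  || echo "fix the paths above before running docker"
```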
Troubleshooting¶
See the full Troubleshooting Guide for detailed solutions.
Quick checks:
# BAM index missing
samtools index sample.bam
# FASTA index missing
samtools faidx reference.fa
# Chromosome mismatch — compare names between FASTA, BAM, VCF/MAF
grep "^>" reference.fa | head -5
samtools view -H sample.bam | grep "^@SQ" | head -5
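Comparing names by eye works for a handful of contigs. For a fuller check, the names can be compared as sets. The sketch below lists names used in the VCF that are absent from the FASTA index. It assumes `reference.fa.fai` already exists (created by `samtools faidx`) and that the variants file is an uncompressed VCF. `missing_contigs` and the two `*_names.txt` files are hypothetical names introduced here.

```shell
# Print names that appear in file $2 but not in file $1 (one name per line)
missing_contigs() {
  ref_sorted=$(mktemp) || return 1
  query_sorted=$(mktemp) || return 1
  sort -u "$1" > "$ref_sorted"
  sort -u "$2" > "$query_sorted"
  # comm -13 keeps only the lines unique to the second file
  comm -13 "$ref_sorted" "$query_sorted"
  rm -f "$ref_sorted" "$query_sorted"
}

if [ -f reference.fa.fai ] && [ -f variants.vcf ]; then
  # Column 1 of the .fai index and of each VCF record is the contig name
  cut -f1 reference.fa.fai > fasta_names.txt
  grep -v '^#' variants.vcf | cut -f1 > vcf_names.txt
  missing_contigs fasta_names.txt vcf_names.txt
fi
```

Any name printed, for example `1` when the FASTA uses `chr1`, points to a naming mismatch between the two files.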
For glibc errors, installation failures, or Docker permission issues, see Troubleshooting → Installation Issues.
Upgrade¶
Related¶
- Quick Start — Common usage patterns with DNA and RNA examples
- CLI Reference — DNA — Full option reference for gbcms dna
- CLI Reference — RNA — Full option reference for gbcms rna
- Nextflow Pipeline — Running many samples in parallel on HPC
- Troubleshooting — Installation issues and common errors