Skip to content

Installation Guide

This guide covers all methods for installing py-gbcms.

PyPI Installation

Install the latest stable release:

pip install py-gbcms

Verify installation:

gbcms --version

Docker

Pull the official Docker image:

docker pull ghcr.io/msk-access/py-gbcms:2.1.0

Run via Docker:

docker run --rm ghcr.io/msk-access/py-gbcms:2.1.0 gbcms --help

Requirements

System Requirements

  • Python: 3.10 or higher
  • Operating Systems: Linux, macOS, Windows (with WSL2)
  • Memory: Minimum 4GB RAM (8GB+ recommended for large BAM files)
  • Disk: Varies by BAM file size

For Source Installation

  • Rust toolchain: 1.70 or higher (for building from source)
  • Build tools: gcc, make (Linux/macOS)

For Nextflow Workflow

  • Nextflow: 21.10.3 or higher
  • Container runtime: Docker or Singularity
  • HPC scheduler: SLURM (optional, for cluster execution)

Installation Methods

1. PyPI (Stable Release)

Best for: Most users, production use

pip install py-gbcms

Benefits: - Pre-compiled wheels for major platforms - No Rust toolchain required - Fastest installation

2. From Source (Development)

Best for: Developers, contributing code, bleeding-edge features

Prerequisites

Install Rust toolchain:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

Installation Steps

# Clone repository
git clone https://github.com/msk-access/py-gbcms.git
cd py-gbcms

# Install with uv (recommended)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install -e ".[dev]"

# OR install with pip
pip install -e ".[dev]"

Verify:

gbcms --version

3. Docker

Best for: Reproducibility, containerized workflows, no local dependencies

Pull Image

# Latest stable release
docker pull ghcr.io/msk-access/py-gbcms:2.1.0

# Or use 'latest' tag
docker pull ghcr.io/msk-access/py-gbcms:latest

Usage Example

docker run --rm \
  -v $(pwd):/data \
  ghcr.io/msk-access/py-gbcms:2.1.0 \
  gbcms run \
    --variants /data/variants.vcf \
    --bam /data/sample.bam \
    --fasta /data/reference.fa \
    --output-dir /data/results/

Note: Use -v to mount your data directory into the container.

4. Singularity (HPC)

Best for: HPC clusters without Docker

# Pull from Docker registry
singularity pull docker://ghcr.io/msk-access/py-gbcms:2.1.0

# Run
singularity exec py-gbcms_2.1.0.sif gbcms --help

Nextflow Workflow Setup

For processing multiple samples in parallel:

1. Install Nextflow

curl -s https://get.nextflow.io | bash
mv nextflow ~/bin/  # or any directory in your PATH

Verify:

nextflow -version

2. Install Container Runtime

Docker (local):

Follow instructions at https://docs.docker.com/engine/installation/

Singularity (HPC):

Singularity is usually pre-installed on HPC systems. Check with:

singularity --version

If not available, contact your system administrator.

3. Clone Repository (for Nextflow workflow)

git clone https://github.com/msk-access/py-gbcms.git
cd py-gbcms

The Nextflow workflow is in the nextflow/ directory.


Verification

Test Installation

Run a simple command:

gbcms --help

Expected output:

Usage: gbcms [OPTIONS] COMMAND [ARGS]...

  py-gbcms: Get Base Counts Multi-Sample

Options:
  --version  Show version and exit
  --help     Show this message and exit

Commands:
  run  Run variant counting

Check Version

gbcms --version

Should show: 2.1.0 (or current version)


Troubleshooting

Python Version Issues

Error: Python 3.10+ required

Solution:

# Check Python version
python --version

# Install Python 3.10+ (Ubuntu/Debian)
sudo apt update
sudo apt install python3.10 python3.10-pip

# Use pyenv (any OS)
pyenv install 3.10.0
pyenv global 3.10.0

Rust Toolchain Not Found

Error: cargo not found or rustc not found

Solution:

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Reload shell
source $HOME/.cargo/env

# Verify
rustc --version

Docker Permission Denied

Error: permission denied while trying to connect to the Docker daemon

Solution:

# Add user to docker group (Linux)
sudo usermod -aG docker $USER
newgrp docker

# Or use sudo
sudo docker pull ghcr.io/msk-access/py-gbcms:2.1.0

Module Import Errors

Error: ModuleNotFoundError: No module named 'gbcms'

Solution:

# Reinstall
pip uninstall py-gbcms
pip install py-gbcms

# Or force reinstall
pip install --force-reinstall py-gbcms

BAM Index Not Found

Error: BAI index not found

Solution:

# Create BAM index
samtools index sample.bam

This creates sample.bam.bai in the same directory.


Upgrading

Upgrade from PyPI

pip install --upgrade py-gbcms

Upgrade Docker Image

docker pull ghcr.io/msk-access/py-gbcms:latest

Upgrade from Source

cd py-gbcms
git pull origin main
pip install -e ".[dev]"

Uninstallation

PyPI

pip uninstall py-gbcms

Docker

docker rmi ghcr.io/msk-access/py-gbcms:2.1.0

Singularity

rm py-gbcms_2.1.0.sif

Next Steps

After installation: