Testing Guide
This guide covers running tests, adding new tests, and accuracy validation for py-gbcms.
Running Tests
Quick Test
# Run all tests
pytest -v
# Run with coverage
pytest --cov=gbcms --cov-report=html
# Run specific test file
pytest tests/test_accuracy.py -v
Test Categories
| Category |
Files |
Purpose |
| Accuracy |
test_accuracy.py |
SNP, indel, complex variant counting |
| CLI |
test_cli_sample_id.py |
Command-line parsing |
| Filters |
test_filters.py |
Read filtering logic |
| MAF |
test_maf_*.py |
MAF column preservation |
| Pipeline |
test_pipeline_v2.py |
End-to-end workflow |
| Strand |
test_strand_counts.py |
Strand-specific counts |
Test Structure
tests/
├── test_accuracy.py # Variant type accuracy
├── test_cli_sample_id.py # CLI argument parsing
├── test_filters.py # Read filtering
├── test_maf_preservation.py
├── test_maf_reader.py
├── test_pipeline_v2.py
└── test_strand_counts.py
Writing Tests
Basic Test Template
import pytest
from pathlib import Path
def test_my_feature(tmp_path):
"""Test description."""
# Arrange
input_file = tmp_path / "input.txt"
input_file.write_text("test data")
# Act
result = my_function(input_file)
# Assert
assert result.success
assert result.count == 42
Accuracy Test Template
def test_snp_accuracy():
"""Verify SNP counting against known BAM."""
# Create variant
variant = Variant("chr1", 100, "A", "T", "SNP")
# Run counting
results = count_bam(bam_path, [variant], fasta_path)
# Validate
assert results[0].ref_count == 50
assert results[0].alt_count == 10
Manual Validation
# Check counts at specific position
samtools mpileup -r chr1:100-100 -q 20 \
-f ref.fa sample.bam 2>/dev/null | \
awk '{print "DP="$4}'
Comparing with gbcms Output
# Run gbcms
gbcms run -v variants.maf -b sample.bam -f ref.fa -o output/
# Check output
awk -F'\t' 'NR==2 {print "REF="$41, "ALT="$42}' output/*.maf
Accuracy Validation
Variant Types Tested
| Type |
Test |
Status |
| SNP |
test_snp_accuracy |
✅ |
| Insertion |
test_insertion_accuracy |
✅ |
| Deletion |
test_deletion_accuracy |
✅ |
| Complex |
test_complex_accuracy |
✅ |
| MNP |
test_mnp_accuracy |
✅ |
Real-World Validation
# Compare gbcms vs samtools for a SNP
# Position: chr1:11168293 G>A
# gbcms output
awk -F'\t' '$5=="1" && $6=="11168293" {print "REF="$41, "ALT="$42}' output.maf
# samtools output
samtools mpileup -r 1:11168293-11168293 -q 20 -f ref.fa sample.bam | \
awk '{gsub(/\^.|\$/,"",$5); print "DP="$4, "Pileup="$5}'
Coverage Targets
| Module |
Target |
Current |
| cli.py |
80% |
79% |
| pipeline.py |
70% |
29% |
| io/input.py |
85% |
82% |
| io/output.py |
90% |
96% |
| models/core.py |
90% |
90% |
Run coverage report:
pytest --cov=gbcms --cov-report=html
open htmlcov/index.html