Nextflow Pipeline
Run Krewlyzer at scale with the Nextflow pipeline.
Quick Start
nextflow run msk-access/krewlyzer \
--samplesheet samples.csv \
--ref /path/to/hg19.fa \
--outdir results/
Workflow Architecture
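The pipeline uses a Nextflow-native parallel pattern: KREWLYZER_EXTRACT converts each BAM to a BED file once, and the feature steps then run independently from that file. KREWLYZER_UXM and KREWLYZER_MFSD take their own inputs and run on separate paths: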
flowchart TB
BAM["sample.bam"] --> EXTRACT["KREWLYZER_EXTRACT"]
EXTRACT --> BED["sample.bed.gz"]
BED --> MOTIF["KREWLYZER_MOTIF"]
BED --> FSC["KREWLYZER_FSC"]
BED --> FSD["KREWLYZER_FSD"]
BED --> WPS["KREWLYZER_WPS"]
BED --> OCF["KREWLYZER_OCF"]
BED --> ENTROPY["KREWLYZER_REGION_ENTROPY"]
BED --> RMDS["KREWLYZER_REGION_MDS"]
FSC --> FSR["KREWLYZER_FSR"]
subgraph "Parallel Paths"
METH_BAM["meth.bam"] --> UXM["KREWLYZER_UXM"]
BAM2["BAM + MAF"] --> MFSD["KREWLYZER_MFSD"]
end
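The fan-out in the diagram can be made concrete with a small lookup. The sketch below is only an illustration: the edge list is transcribed by hand from the flowchart, and `downstream_of` is a hypothetical helper, not part of Krewlyzer or Nextflow. It lists the steps that consume a given input, which, going by the diagram, is also the set of steps Nextflow can schedule side by side once that input exists.

```shell
# Edges copied from the flowchart, one "upstream downstream" pair per line.
# Node names are the diagram's labels; "BAM+MAF" stands for the "BAM + MAF" node.
edges='sample.bam KREWLYZER_EXTRACT
KREWLYZER_EXTRACT sample.bed.gz
sample.bed.gz KREWLYZER_MOTIF
sample.bed.gz KREWLYZER_FSC
sample.bed.gz KREWLYZER_FSD
sample.bed.gz KREWLYZER_WPS
sample.bed.gz KREWLYZER_OCF
sample.bed.gz KREWLYZER_REGION_ENTROPY
sample.bed.gz KREWLYZER_REGION_MDS
KREWLYZER_FSC KREWLYZER_FSR
meth.bam KREWLYZER_UXM
BAM+MAF KREWLYZER_MFSD'

# Hypothetical helper: print every node directly downstream of "$1".
downstream_of() {
  printf '%s\n' "$edges" | awk -v up="$1" '$1 == up { print $2 }'
}

# The seven steps that read sample.bed.gz; the diagram shows no edges between them.
downstream_of sample.bed.gz

# KREWLYZER_FSR is the one step in the diagram that waits on another feature step.
downstream_of KREWLYZER_FSC
```

Keeping the edges as plain text makes it easy to check the list against the diagram line by line.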
Documentation
| Page | Description |
|---|---|
| Samplesheet | Input samplesheet format |
| Parameters | All pipeline parameters |
| Outputs | Output channels and files |
| Examples | Workflow examples |
Features
- Parallel processing - Process multiple samples simultaneously
- Resume support - Resume failed runs
- Container support - Docker/Singularity
- Cloud ready - AWS, Google Cloud, Azure
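As a sketch of the resume feature: Nextflow's core `-resume` option (single dash, so it is a Nextflow option rather than a pipeline parameter) reuses cached task results from an earlier run where it can. The example below repeats the Quick Start command with that option added. `maybe_run` and the `RUN` variable are hypothetical, introduced here only so the command can be previewed before it is launched.

```shell
# Hypothetical dry-run wrapper for this sketch:
# prints the command unless RUN=1 is set, in which case it executes it.
maybe_run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Quick Start command with Nextflow's -resume option appended.
maybe_run nextflow run msk-access/krewlyzer \
  --samplesheet samples.csv \
  --ref /path/to/hg19.fa \
  --outdir results/ \
  -resume
```

Run as written, this only prints the command. Setting `RUN=1` in the environment launches it.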
Performance Benchmarks
Real-world performance from MSK-ACCESS v1/v2 duplex plasma samples:
| Sample Type | Duration | CPU Usage | Peak Memory |
|---|---|---|---|
| Healthy control | 2-5 min | 90-140% | 1.7-1.9 GB |
| ctDNA plasma | 4-6 min | 190-300% | 2.8-3.2 GB |
Tested Configuration
- Docker with amd64 emulation on Apple Silicon
- 8 CPUs, 32 GB memory
- Panel mode with `--skip-pon` and `--duplex` enabled
See Also
- CLI Reference - Command-line usage
- Panel Mode - MSK-ACCESS workflows