Pipeline Integration
Run All Features
The run-all command runs every feature extraction module on a single BAM file in one invocation.
Usage
krewlyzer run-all sample.bam --reference hg19.fa --output all_features_out \
[--variants variants.maf] \
[--bin-input targets.bed] \
[--threads N]
Arguments
sample.bam: Input BAM file (sorted, indexed).
--reference, -g: Reference genome FASTA.
--output, -o: Output directory.
--variants, -v: (Optional) VCF or MAF file for mfsd analysis.
--bin-input, -b: (Optional) Custom bins for FSC/FSR (e.g., targeted panel regions).
--threads, -t: Number of threads (default: 0 = all cores).
--mapq, -q: Minimum mapping quality (default: 20).
--minlen, --maxlen: Fragment length range (default: 65-400).
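When run-all is driven from a script, it can help to assemble the argument list programmatically so that optional flags are only added when they are set. The sketch below is one way to do that in Python. It uses only the long flags listed above; the helper name build_run_all_command is hypothetical and not part of Krewlyzer.

```python
from typing import List, Optional


def build_run_all_command(
    bam: str,
    reference: str,
    output: str,
    variants: Optional[str] = None,
    bin_input: Optional[str] = None,
    threads: Optional[int] = None,
) -> List[str]:
    """Assemble a `krewlyzer run-all` argument list (hypothetical helper)."""
    # Required arguments, in the same order as the usage line above.
    cmd = ["krewlyzer", "run-all", bam, "--reference", reference, "--output", output]
    # Optional flags are appended only when a value was supplied.
    if variants is not None:
        cmd += ["--variants", variants]
    if bin_input is not None:
        cmd += ["--bin-input", bin_input]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Building a list instead of a single string avoids shell quoting problems with paths that contain spaces.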
Nextflow Pipeline
Krewlyzer includes a Nextflow pipeline (krewlyzer.nf) for processing multiple samples in parallel on HPC clusters or local machines.
Usage
nextflow run krewlyzer.nf --samplesheet samplesheet.csv --ref /path/to/reference.fa --outdir results/
Samplesheet Format (CSV)
A CSV file with the following columns:
sample,bam,meth_bam,vcf,bed,maf,single_sample_maf
sample1,/path/to/sample1.bam,,/path/to/sample1.vcf,,,
sample2,/path/to/sample2.bam,,,,/path/to/sample2.maf,true
sample3,/path/to/sample3.bam,,,,/path/to/cohort.maf,false
sample4,,,,/path/to/pre_extracted.bed.gz,,
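For larger cohorts the samplesheet is usually generated, not typed by hand. The following is a minimal sketch using Python's standard csv module. The function name render_samplesheet is an assumption for illustration; only the column names come from the format shown above.

```python
import csv
import io

# Column order taken from the samplesheet header shown above.
COLUMNS = ["sample", "bam", "meth_bam", "vcf", "bed", "maf", "single_sample_maf"]


def render_samplesheet(rows):
    """Render a list of dicts as samplesheet CSV text (hypothetical helper).

    Keys missing from a row are written as empty fields, so every line
    keeps all seven comma-separated positions.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, restval="", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)  # raises ValueError on an unknown column name
    return buf.getvalue()
```

Because DictWriter rejects keys that are not in COLUMNS, a misspelled column name fails early instead of producing a sheet that the pipeline cannot read.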
Pipeline Logic
The pipeline selects the modules to run for each sample based on which samplesheet columns contain a valid value:
bam: Triggers the main run-all workflow (Extraction -> Motif -> Features). This is the standard path for WGS BAMs.
meth_bam: Triggers uxm (Methylation) analysis. Can be run alongside bam or independently.
bed: Triggers fragment-only features (FSC, FSR, WPS, OCF, FSD). Skips extraction/motif steps. Use this for re-running analysis on already extracted fragments.
vcf: Used with bam for mFSD (Mutant Fragment Size Distribution) analysis.
maf: Multi-sample MAF file for mFSD analysis. The pipeline filters by Tumor_Sample_Barcode matching the sample ID (regex: .*sample_id.*).
single_sample_maf: Set to true if the MAF contains only this sample's variants (skips filtering). Set to false or leave empty for multi-sample MAFs.
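The column rules above can be read as a small decision function. The sketch below is an illustration in Python, not the code in krewlyzer.nf. The branch labels it returns are made-up names, and it assumes that mFSD needs a bam together with either a vcf or a maf, which the examples suggest but the text only states explicitly for vcf.

```python
def planned_modules(row):
    """Return the branches a samplesheet row would trigger (illustrative only)."""

    def has(column):
        # A column counts as set when it holds a non-blank value.
        return bool((row.get(column) or "").strip())

    modules = []
    if has("bam"):
        modules.append("run-all")
        # Assumption: mFSD runs only when variants accompany a BAM.
        if has("vcf") or has("maf"):
            modules.append("mfsd")
    if has("meth_bam"):
        modules.append("uxm")
    if has("bed"):
        modules.append("fragment-features")
    return modules
```

For the example sheet, sample1 (bam and vcf) would yield the run-all and mfsd branches, while sample4 (bed only) would yield just the fragment-only branch.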
[!NOTE] If the filtered MAF has zero variants for a sample, mFSD is skipped with a warning.
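To make the MAF filtering and the zero-variant case concrete, here is a rough Python sketch of the behaviour described above. It is not the pipeline's implementation: the function names are hypothetical, MAF records are modelled as plain dicts, and escaping the sample ID with re.escape is an assumption about how the pattern is built.

```python
import re
import warnings


def filter_maf_rows(rows, sample_id, single_sample_maf=False):
    """Keep MAF rows whose Tumor_Sample_Barcode matches .*sample_id.*"""
    if single_sample_maf:
        # Single-sample MAF: every row belongs to this sample, no filtering.
        return list(rows)
    pattern = re.compile(".*" + re.escape(sample_id) + ".*")
    return [r for r in rows if pattern.fullmatch(r.get("Tumor_Sample_Barcode", ""))]


def mfsd_input(rows, sample_id, single_sample_maf=False):
    """Return the filtered rows, or None (with a warning) when none remain."""
    kept = filter_maf_rows(rows, sample_id, single_sample_maf)
    if not kept:
        warnings.warn(f"No variants left for {sample_id}; skipping mFSD")
        return None
    return kept
```

Since the pattern is a substring match, a sample ID such as sample1 also matches a barcode like sample10_T. Sample IDs that are prefixes of one another are therefore worth avoiding in a shared cohort MAF.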
Note on Column Headers: The samplesheet MUST use these exact headers: sample,bam,meth_bam,vcf,bed,maf,single_sample_maf. Leave optional fields empty, but keep the separating commas so that every row has seven fields.
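A quick pre-flight check can catch header typos and dropped commas before a run is submitted. The sketch below is one possible check written with Python's csv module. The function name check_samplesheet is hypothetical, and the pipeline itself may validate differently.

```python
import csv
import io

EXPECTED_HEADER = ["sample", "bam", "meth_bam", "vcf", "bed", "maf", "single_sample_maf"]


def check_samplesheet(text):
    """Return a list of problems found in samplesheet text; empty means OK."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        return ["header must be exactly: " + ",".join(EXPECTED_HEADER)]
    problems = []
    for line_no, fields in enumerate(reader, start=2):
        if not fields:
            continue  # ignore completely blank lines
        if len(fields) != len(EXPECTED_HEADER):
            problems.append(
                f"line {line_no}: expected {len(EXPECTED_HEADER)} fields, found {len(fields)}"
            )
    return problems
```

A row such as sample1,/path/to/sample1.bam would be reported because its trailing commas are missing, whereas sample1,/path/to/sample1.bam,,,,, passes.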
Profiles
-profile lsf: For LSF clusters.
-profile slurm: For SLURM clusters.
-profile docker: Run using Docker container.