
Glossary

Quick reference for terminology used throughout Krewlyzer documentation.


Biological Terms

DNA & Chromatin

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| cfDNA | DNA floating in blood | Cell-free DNA - short DNA fragments released from dying cells into the bloodstream |
| ctDNA | Tumor DNA in blood | Circulating tumor DNA - the fraction of cfDNA originating from cancer cells |
| Nucleosome | DNA packaging unit | Histone octamer with ~147bp of DNA wrapped around it; the fundamental unit of chromatin |
| Chromatin | DNA + proteins | The complex of DNA and histone proteins that makes up chromosomes |
| Linker DNA | DNA between spools | The ~20bp of DNA connecting adjacent nucleosomes |
| Mono-nucleosomal | One wrapped loop | Fragment derived from a single nucleosome (~145-180bp) |
| Di-nucleosomal | Two wrapped loops | Fragment spanning two nucleosomes (~300-340bp) |

Fragment Characteristics

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| Fragment | One piece of DNA | A single cfDNA molecule with defined start and end positions |
| Fragment length | DNA piece size | The number of base pairs from the 5' end to the 3' end of a fragment |
| End motif | Cutting pattern | The 4-nucleotide sequence at the 5' end of a fragment |
| Breakpoint motif | Internal cut | The sequence context where a fragment was cut |
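
As a minimal sketch of the two simplest definitions above, the snippet below computes a fragment length and a 4-nucleotide end motif. The function names and the half-open, BED-style coordinate convention are assumptions made for illustration; they are not taken from Krewlyzer's code.

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp of a fragment given as a half-open [start, end) interval.

    The half-open (BED-style) convention is an assumption of this sketch.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    return end - start


def end_motif(fragment_seq: str, k: int = 4) -> str:
    """Return the first k bases at the 5' end of a fragment sequence."""
    if len(fragment_seq) < k:
        raise ValueError("fragment is shorter than the motif length")
    return fragment_seq[:k].upper()


length = fragment_length(1000, 1167)   # 167 bp
motif = end_motif("ccagTTGACA")        # "CCAG"
```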

Genomic Context

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| Chromosome arm | Half of a chromosome | The p (short) or q (long) arm of a chromosome, separated by the centromere |
| GC content | Letter composition | The percentage of G (guanine) and C (cytosine) bases in a sequence |
| Open chromatin | Accessible DNA | Genomic regions not tightly wrapped, allowing transcription factor access |
| TSS | Gene start site | Transcription Start Site - where RNA polymerase begins transcribing |
| CTCF site | DNA organizer | CCCTC-binding factor sites that organize 3D chromatin structure |
| Alu element | Repeated sequence | A ~300bp repetitive element found ~1 million times in the human genome |
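
GC content is a simple percentage, and a short sketch makes the definition concrete. The choice to ignore ambiguous bases such as N is an assumption of this example, not a documented Krewlyzer behavior.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return float("nan")  # no unambiguous bases to measure
    gc = sum(base in "GC" for base in called)
    return 100.0 * gc / len(called)


balanced = gc_content("GGCCAATT")  # half of the bases are G or C
```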

Sequencing & Alignment Terms

Read Processing

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| BAM file | Aligned sequences | Binary Alignment Map - compressed file containing sequencing reads aligned to a reference genome |
| Read | Sequenced chunk | A single sequence generated by the sequencer (R1 = forward, R2 = reverse) |
| Read pair | Two matching reads | R1 and R2 reads from the same DNA fragment (paired-end sequencing) |
| Proper pair | Good alignment | Read pair where both reads align correctly in the expected orientation and distance |
| MAPQ | Alignment confidence | Mapping quality - Phred-scaled probability that the alignment position is wrong |
| Duplicate | PCR copy | Multiple reads from the same original molecule (amplification artifact) |
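
Because MAPQ is Phred-scaled, it converts to an error probability as P(wrong) = 10^(-MAPQ/10). The helper names below are illustrative only.

```python
def mapq_error_probability(mapq: float) -> float:
    """Probability that the reported alignment position is wrong."""
    return 10 ** (-mapq / 10)


def mapq_confidence(mapq: float) -> float:
    """Probability that the reported alignment position is correct."""
    return 1.0 - mapq_error_probability(mapq)


for q in (10, 20, 30):
    print(f"MAPQ {q}: P(wrong) = {mapq_error_probability(q):.3f}")
```

On this scale, MAPQ 20 corresponds to a 1% chance of a wrong position, and every additional 10 points reduces that chance tenfold.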

Quality Thresholds

| Default Setting | Plain English | Effect |
| --- | --- | --- |
| MAPQ ≥ 20 | High confidence alignment | At least 99% probability that the alignment is correct |
| Min length 65bp | Not too short | Excludes fragments shorter than most cfDNA |
| Max length 400bp | Not too long | Excludes fragments longer than di-nucleosomal size |
| Skip duplicates | Remove PCR copies | Ensures each molecule is counted once |
| Require proper pair | Good read pairs only | May need to be disabled for duplex/consensus BAMs |
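
The thresholds above can be read as one pass/fail filter per fragment. The sketch below is illustrative only: `FragmentRecord` and `passes_filters` are hypothetical names, and treating both length bounds as inclusive is an assumption rather than documented Krewlyzer behavior.

```python
from dataclasses import dataclass


@dataclass
class FragmentRecord:
    """Hypothetical container for the per-fragment fields the filters need."""
    length: int
    mapq: int
    is_duplicate: bool
    is_proper_pair: bool


def passes_filters(
    frag: FragmentRecord,
    min_mapq: int = 20,
    min_len: int = 65,
    max_len: int = 400,          # inclusive upper bound is an assumption
    skip_duplicates: bool = True,
    require_proper_pair: bool = True,
) -> bool:
    """Return True if the fragment survives every enabled filter."""
    if frag.mapq < min_mapq:
        return False
    if not (min_len <= frag.length <= max_len):
        return False
    if skip_duplicates and frag.is_duplicate:
        return False
    if require_proper_pair and not frag.is_proper_pair:
        return False
    return True


good = FragmentRecord(length=167, mapq=60, is_duplicate=False, is_proper_pair=True)
unpaired = FragmentRecord(length=167, mapq=60, is_duplicate=False, is_proper_pair=False)

print(passes_filters(good))
# Relaxing the proper-pair requirement, as may be needed for duplex/consensus BAMs:
print(passes_filters(unpaired, require_proper_pair=False))
```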

Krewlyzer Feature Terms

Fragment Size Features

| Feature | Measures | Higher Value Means |
| --- | --- | --- |
| FSC (Coverage) | Fragment count per genomic bin | More fragments in that region |
| FSR (Ratio) | Short ÷ Long fragment ratio | More tumor-derived DNA |
| FSD (Distribution) | Size histogram per arm | (Shape matters, not value) |

Size Bin Definitions (Rust Backend)

| Bin Name | Size Range | Biological Meaning |
| --- | --- | --- |
| ultra_short | 65-99bp | Sub-nucleosomal, TF footprints |
| core_short | 100-149bp | Tumor-enriched (primary biomarker) |
| mono_nucl | 150-259bp | Standard mono-nucleosomal cfDNA |
| di_nucl | 260-399bp | Di-nucleosomal and larger |
| long | 400+bp | Very long fragments (rare) |
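
The bin boundaries above translate directly into a lookup. The following Python sketch mirrors the table; it is not the Rust backend itself. The `short_long_ratio` helper is purely illustrative: this glossary does not say which bins count as "short" and "long" for FSR, so the `core_short` versus `mono_nucl` default is an assumption.

```python
from collections import Counter
from typing import Iterable, Optional

# (name, inclusive lower bound, inclusive upper bound), copied from the table above
SIZE_BINS = [
    ("ultra_short", 65, 99),
    ("core_short", 100, 149),
    ("mono_nucl", 150, 259),
    ("di_nucl", 260, 399),
]


def size_bin(length: int) -> Optional[str]:
    """Return the bin name for a fragment length, or None if below 65bp."""
    for name, low, high in SIZE_BINS:
        if low <= length <= high:
            return name
    return "long" if length >= 400 else None


def count_bins(lengths: Iterable[int]) -> Counter:
    """Count fragments per size bin, skipping lengths that fall in no bin."""
    named = (size_bin(n) for n in lengths)
    return Counter(name for name in named if name is not None)


def short_long_ratio(counts: Counter, short: str = "core_short",
                     long: str = "mono_nucl") -> float:
    """Illustrative short/long ratio; the choice of bins is an assumption."""
    if counts[long] == 0:
        return float("nan")
    return counts[short] / counts[long]


counts = count_bins([40, 80, 120, 130, 167, 170, 300, 450])
ratio = short_long_ratio(counts)
```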

Nucleosome Features

| Feature | Measures | Higher Value Means |
| --- | --- | --- |
| WPS | Protection score at each position | Nucleosome present (positive) or absent (negative) |
| NRL | Nucleosome Repeat Length | Expected ~190bp; deviation suggests abnormality |
| nrl_quality | Periodicity strength (0-1) | Clearer nucleosome spacing pattern |
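
To illustrate why a protection score can be positive or negative, the sketch below follows the commonly published definition of a windowed protection score: fragments that span the whole window around a position count +1, and fragments with an endpoint inside the window count -1. The 120bp window, the inclusive coordinates, and the function name are assumptions of this example; Krewlyzer's actual parameters may differ.

```python
from typing import Iterable, Tuple


def windowed_protection_score(
    fragments: Iterable[Tuple[int, int]],
    position: int,
    window: int = 120,  # assumed window size
) -> int:
    """Spanning fragments minus fragments ending inside the window.

    Each fragment is an inclusive (start, end) pair; this convention is an
    assumption of the sketch.
    """
    half = window // 2
    win_start, win_end = position - half, position + half
    score = 0
    for start, end in fragments:
        if start <= win_start and end >= win_end:
            score += 1  # fragment fully covers the window: protected DNA
        elif win_start <= start <= win_end or win_start <= end <= win_end:
            score -= 1  # a cut falls inside the window: unprotected DNA
    return score


frags = [(100, 266), (150, 316), (300, 400)]
score = windowed_protection_score(frags, position=183)
```

In this toy input the first fragment spans the window (+1), the second starts inside it (-1), and the third does not touch it, so the contributions cancel.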

Other Features

| Feature | Measures | Higher Value Means |
| --- | --- | --- |
| MDS | Motif Diversity Score | More diverse (potentially tumor-related) end motifs |
| OCF | Orientation asymmetry | Tissue-specific fragmentation pattern |
| mFSD | Mutant vs wild-type sizes | ALT shorter than REF = ctDNA present |
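
A motif diversity score is commonly computed as the Shannon entropy of the 4-mer end-motif frequencies, normalized by its maximum (the log of 256) so that it lies between 0 and 1. The sketch below assumes that definition; this glossary does not state the exact formula Krewlyzer uses, and the function name is illustrative.

```python
import math
from collections import Counter
from itertools import product
from typing import Iterable


def motif_diversity_score(motifs: Iterable[str], k: int = 4) -> float:
    """Normalized Shannon entropy of k-mer end-motif frequencies (0 to 1)."""
    valid = set("ACGT")
    counts = Counter(
        m.upper() for m in motifs
        if len(m) == k and set(m.upper()) <= valid  # skip motifs containing N etc.
    )
    total = sum(counts.values())
    if total == 0:
        return float("nan")
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(4 ** k)


# Every possible 4-mer seen exactly once gives maximal diversity.
all_motifs = ["".join(p) for p in product("ACGT", repeat=4)]
uniform = motif_diversity_score(all_motifs)
# A single repeated motif gives no diversity.
single = motif_diversity_score(["CCCA"] * 100)
```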

Normalization Terms

GC Correction

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| GC bias | Uneven amplification | PCR/capture preferentially amplifies certain GC contents |
| LOESS | Smoothing algorithm | Locally Estimated Scatterplot Smoothing - fits local regression curves |
| Correction factor | Adjustment weight | Multiplier to remove GC-related count biases |
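
The idea of a correction factor can be shown with a deliberately simplified stand-in for LOESS: group genomic bins by GC percentage and use the overall median count divided by each group's median count as the multiplier. This is a sketch of the concept only, not a LOESS fit and not Krewlyzer's implementation; the function names and the 5-point GC group width are assumptions.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Sequence


def gc_correction_factors(
    gc_percents: Sequence[float],
    counts: Sequence[float],
    group_width: int = 5,  # assumed width of each GC group, in percentage points
) -> Dict[int, float]:
    """Map each GC group to the multiplier overall_median / group_median."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for gc, count in zip(gc_percents, counts):
        groups[int(gc // group_width)].append(count)
    overall = median(counts)
    return {
        group: overall / median(values)
        for group, values in groups.items()
        if median(values) > 0
    }


def apply_gc_correction(
    gc_percents: Sequence[float],
    counts: Sequence[float],
    group_width: int = 5,
) -> List[float]:
    """Multiply each count by the correction factor of its GC group."""
    factors = gc_correction_factors(gc_percents, counts, group_width)
    return [
        count * factors.get(int(gc // group_width), 1.0)
        for gc, count in zip(gc_percents, counts)
    ]


# Toy data: high-GC bins are over-represented by a factor of two.
corrected = apply_gc_correction([30, 32, 50, 52], [50, 50, 100, 100])
```

After correction, all four toy bins carry the same count, which is the behavior a GC correction aims for when the only difference between bins is GC-driven.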

Panel of Normals (PON)

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| PON | Healthy baseline | Panel of Normals - reference statistics from healthy samples |
| Z-score | Deviation from normal | (Sample - PON_mean) / PON_std; measures abnormality |
| Log-ratio | Relative change | log₂(Sample / PON_expected); positive = elevated |
| PON stability | Reliability weight | 1 / (variance + k); higher = more trustworthy comparison |
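
The three PON formulas above are small enough to write out directly. The function names are illustrative, and the default `k = 1.0` is a placeholder: this glossary does not give the value of k.

```python
import math


def z_score(sample: float, pon_mean: float, pon_std: float) -> float:
    """(Sample - PON_mean) / PON_std."""
    if pon_std == 0:
        return float("nan")  # undefined when the PON shows no variation
    return (sample - pon_mean) / pon_std


def log_ratio(sample: float, pon_expected: float) -> float:
    """log2(Sample / PON_expected); positive means elevated relative to the PON."""
    return math.log2(sample / pon_expected)


def pon_stability(variance: float, k: float = 1.0) -> float:
    """1 / (variance + k); k = 1.0 is a placeholder, not a documented default."""
    return 1.0 / (variance + k)


z = z_score(sample=12.0, pon_mean=10.0, pon_std=2.0)    # one standard deviation above
lr = log_ratio(sample=20.0, pon_expected=10.0)          # twice the expected value
w = pon_stability(variance=1.0, k=1.0)
```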

Panel Sequencing Terms

Target Capture

| Term | Plain English | Technical Definition |
| --- | --- | --- |
| Panel | Targeted genes | Capture probe set designed to sequence specific genomic regions |
| On-target | Captured regions | Fragments overlapping panel target regions |
| Off-target | Background DNA | Fragments not overlapping targets (unbiased background) |
| Bait | Capture probe | Oligonucleotide used to pull down target DNA in hybridization capture |
| Bait padding | Edge buffer | Number of base pairs trimmed from each bait edge to avoid capture artifacts |
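
On-target versus off-target assignment, together with bait padding, comes down to interval arithmetic. The sketch below assumes half-open coordinates on a single chromosome and uses hypothetical function names; it is not Krewlyzer's implementation.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end); an assumption of this sketch


def trim_bait(start: int, end: int, padding: int) -> Optional[Interval]:
    """Trim `padding` bp from both bait edges; None if nothing remains."""
    new_start, new_end = start + padding, end - padding
    return (new_start, new_end) if new_start < new_end else None


def is_on_target(frag_start: int, frag_end: int,
                 targets: Iterable[Interval]) -> bool:
    """True if the fragment overlaps any target interval by at least 1bp."""
    return any(
        frag_start < t_end and t_start < frag_end
        for t_start, t_end in targets
    )


trimmed = trim_bait(100, 220, padding=10)        # (110, 210)
targets = [trimmed] if trimmed else []
on_target = is_on_target(200, 300, targets)      # overlaps 200-209
off_target = not is_on_target(210, 300, targets) # starts where the target ends
```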

MSK-ACCESS Assay Codes

| Code | Description |
| --- | --- |
| XS1 | MSK-ACCESS v1.0 panel |
| XS2 | MSK-ACCESS v2.0 panel |
| WGS | Whole Genome Sequencing (no targets) |

File Format Terms

| Extension | Description | Generated By |
| --- | --- | --- |
| .bam | Aligned reads | External aligner (BWA, etc.) |
| .bed.gz | Fragment coordinates | krewlyzer extract |
| .tsv | Tab-separated features | All feature commands |
| .parquet | Columnar data format | WPS (efficient for large arrays) |
| .pon.parquet | PON model | krewlyzer pon build |

See Also