Glossary
Quick reference for terminology used throughout the Krewlyzer documentation.
Biological Terms
DNA & Chromatin
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| cfDNA | DNA floating in blood | Cell-free DNA - short DNA fragments released from dying cells into the bloodstream |
| ctDNA | Tumor DNA in blood | Circulating tumor DNA - the fraction of cfDNA originating from cancer cells |
| Nucleosome | DNA packaging unit | Histone octamer with ~147bp of DNA wrapped around it; the fundamental unit of chromatin |
| Chromatin | DNA + proteins | The complex of DNA and histone proteins that makes up chromosomes |
| Linker DNA | DNA between spools | The ~20bp of DNA connecting adjacent nucleosomes |
| Mono-nucleosomal | One wrapped loop | Fragment derived from a single nucleosome (~145-180bp) |
| Di-nucleosomal | Two wrapped loops | Fragment spanning two nucleosomes (~300-340bp) |
Fragment Characteristics
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| Fragment | One piece of DNA | A single cfDNA molecule with defined start and end positions |
| Fragment length | DNA piece size | The number of base pairs from the 5' end to 3' end of a fragment |
| End motif | Cutting pattern | The 4-nucleotide sequence at the 5' end of a fragment |
| Breakpoint motif | Internal cut | The sequence context where a fragment was cut |
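The fragment length and end motif definitions above can be sketched in a few lines. This is a minimal illustration, assuming 0-based half-open coordinates (the BED convention) and a fragment sequence written 5' to 3'; the function names are hypothetical and not part of Krewlyzer.

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp for 0-based, half-open coordinates (assumed BED convention)."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return end - start


def end_motif(fragment_seq: str, k: int = 4) -> str:
    """First k bases at the 5' end of a fragment sequence written 5'->3'."""
    if len(fragment_seq) < k:
        raise ValueError("fragment is shorter than the motif length")
    return fragment_seq[:k].upper()
```

For example, a fragment covering positions 1000 to 1167 has a length of 167bp, and a fragment whose sequence starts with `CCAGTT...` has the end motif `CCAG`.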
Genomic Context
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| Chromosome arm | Half of a chromosome | The p (short) or q (long) arm of a chromosome, separated by the centromere |
| GC content | Letter composition | The percentage of G (guanine) and C (cytosine) bases in a sequence |
| Open chromatin | Accessible DNA | Genomic regions not tightly wrapped, allowing transcription factor access |
| TSS | Gene start site | Transcription Start Site - where RNA polymerase begins transcribing |
| CTCF site | DNA organizer | CCCTC-binding factor sites that organize 3D chromatin structure |
| Alu element | Repeated sequence | A ~300bp repetitive element found ~1 million times in the human genome |
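GC content as defined above is a simple base count. The sketch below is illustrative only; excluding ambiguous bases such as `N` from the denominator is an assumption, and the function name is hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous (A/C/G/T) bases.

    Assumption: ambiguous bases such as N are ignored entirely.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)
```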
Sequencing & Alignment Terms
Read Processing
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| BAM file | Aligned sequences | Binary Alignment Map - compressed file containing sequencing reads aligned to a reference genome |
| Read | Sequenced chunk | A single sequence generated by the sequencer (R1 = forward, R2 = reverse) |
| Read pair | Two matching reads | R1 and R2 reads from the same DNA fragment (paired-end sequencing) |
| Proper pair | Good alignment | Read pair where both reads align correctly in expected orientation and distance |
| MAPQ | Alignment confidence | Mapping quality - Phred-scaled probability that alignment position is wrong |
| Duplicate | PCR copy | Multiple reads from the same original molecule (amplification artifact) |
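Because MAPQ is Phred-scaled, it converts to an error probability as P(wrong) = 10^(-MAPQ/10). A minimal sketch of that conversion follows; the function names are illustrative.

```python
import math


def mapq_to_error_prob(mapq: float) -> float:
    """Probability that the reported alignment position is wrong."""
    return 10 ** (-mapq / 10)


def error_prob_to_mapq(p_wrong: float) -> float:
    """Inverse conversion: Phred-scale an error probability."""
    if not 0 < p_wrong <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -10 * math.log10(p_wrong)
```

Under this scale, MAPQ 20 corresponds to a 1% chance that the position is wrong, and MAPQ 30 to 0.1%.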
Quality Thresholds
| Krewlyzer Default | Plain English | Notes |
|-------------------|---------------|-------|
| MAPQ ≥ 20 | High confidence alignment | At least 99% probability that the alignment is correct |
| Min length 65bp | Not too short | Excludes fragments smaller than most cfDNA |
| Max length 400bp | Not too long | Excludes fragments larger than di-nucleosomal |
| Skip duplicates | Remove PCR copies | Ensures each molecule is counted once |
| Require proper pair | Good read pairs only | May need to be disabled for duplex/consensus BAMs |
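The thresholds above can be read as a single pass/fail check per fragment. The following is a sketch under stated assumptions, not Krewlyzer's implementation: the `FragmentRecord` type and `passes_filters` function are hypothetical, and treating both length bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentRecord:
    """Hypothetical container for the fields the filters need."""
    mapq: int
    length: int
    is_duplicate: bool
    is_proper_pair: bool


def passes_filters(
    frag: FragmentRecord,
    min_mapq: int = 20,
    min_len: int = 65,
    max_len: int = 400,
    skip_duplicates: bool = True,
    require_proper_pair: bool = True,
) -> bool:
    """Apply the documented default thresholds (bounds assumed inclusive)."""
    if frag.mapq < min_mapq:
        return False
    if not (min_len <= frag.length <= max_len):
        return False
    if skip_duplicates and frag.is_duplicate:
        return False
    if require_proper_pair and not frag.is_proper_pair:
        return False
    return True
```

Passing `require_proper_pair=False` mirrors the note about duplex/consensus BAMs, where the proper-pair flag may not be set.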
Krewlyzer Feature Terms
Fragment Size Features
| Feature | Measures | Higher Value Means |
|---------|----------|--------------------|
| FSC (Coverage) | Fragment count per genomic bin | More fragments in that region |
| FSR (Ratio) | Short ÷ Long fragment ratio | More tumor-derived DNA |
| FSD (Distribution) | Size histogram per arm | (Shape matters, not value) |
Size Bin Definitions (Rust Backend)
| Bin Name | Size Range | Biological Meaning |
|----------|------------|--------------------|
| ultra_short | 65-99bp | Sub-nucleosomal, TF footprints |
| core_short | 100-149bp | Tumor-enriched (primary biomarker) |
| mono_nucl | 150-259bp | Standard mono-nucleosomal cfDNA |
| di_nucl | 260-399bp | Di-nucleosomal and larger |
| long | 400+bp | Very long fragments (rare) |
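The bin boundaries in the table can be expressed as a small lookup. This is an illustrative Python sketch of the ranges only (the actual binning lives in the Rust backend); the `size_bin` name and the choice to return `None` below 65bp are assumptions.

```python
from typing import Optional

# (name, inclusive lower bound, inclusive upper bound), taken from the table.
SIZE_BINS = [
    ("ultra_short", 65, 99),
    ("core_short", 100, 149),
    ("mono_nucl", 150, 259),
    ("di_nucl", 260, 399),
]


def size_bin(length: int) -> Optional[str]:
    """Return the bin name for a fragment length, or None if below 65bp."""
    if length < 65:
        return None
    for name, low, high in SIZE_BINS:
        if low <= length <= high:
            return name
    return "long"  # 400bp and above
```

A typical 167bp fragment falls in `mono_nucl`, while a 120bp fragment falls in `core_short`.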
Nucleosome Features
| Feature | Measures | Higher Value Means |
|---------|----------|--------------------|
| WPS | Protection score at each position | Nucleosome present (positive) or absent (negative) |
| NRL | Nucleosome Repeat Length | Expected ~190bp; deviation suggests abnormality |
| nrl_quality | Periodicity strength (0-1) | Clearer nucleosome spacing pattern |
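In the cfDNA literature, a windowed protection score is commonly computed as the number of fragments that completely span a window centered on a position, minus the number of fragments with an endpoint inside that window. The sketch below follows that common definition; the 120bp window, inclusive coordinates, and function name are assumptions, and Krewlyzer's own parameters may differ.

```python
from typing import Iterable, Tuple


def wps_at(pos: int, fragments: Iterable[Tuple[int, int]], window: int = 120) -> int:
    """Windowed protection score at one position.

    fragments: (start, end) pairs with inclusive coordinates (assumed).
    """
    half = window // 2
    w_start, w_end = pos - half, pos + half
    score = 0
    for start, end in fragments:
        if start <= w_start and end >= w_end:
            score += 1  # fragment spans the whole window: protection
        elif w_start <= start <= w_end or w_start <= end <= w_end:
            score -= 1  # fragment ends inside the window: cut site
    return score
```

A positive score suggests the position was protected (for example, by a nucleosome), and a negative score suggests it was frequently cut.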
Other Features
| Feature | Measures | Higher Value Means |
|---------|----------|--------------------|
| MDS | Motif Diversity Score | More diverse (potentially tumor-related) end motifs |
| OCF | Orientation asymmetry | Tissue-specific fragmentation pattern |
| mFSD | Mutant vs wild-type sizes | ALT shorter than REF = ctDNA present |
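A motif diversity score is commonly defined as the Shannon entropy of the end-motif frequency distribution, normalized by its maximum value so the result lies between 0 and 1. The sketch below assumes that definition with 4-mer motifs (256 possibilities); the function name is hypothetical, and Krewlyzer's exact calculation may differ.

```python
import math
from collections import Counter
from typing import Iterable


def motif_diversity_score(motifs: Iterable[str], k: int = 4) -> float:
    """Normalized Shannon entropy of k-mer end-motif frequencies (0 to 1)."""
    counts = Counter(
        m.upper()
        for m in motifs
        if len(m) == k and set(m.upper()) <= set("ACGT")  # skip ambiguous motifs
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid motifs supplied")
    entropy = sum((c / total) * math.log(total / c) for c in counts.values())
    return entropy / math.log(4 ** k)  # maximum entropy is log(4^k)
```

A sample in which every fragment shares one end motif scores 0, and a sample with all 256 motifs equally represented scores 1.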
Normalization Terms
GC Correction
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| GC bias | Uneven amplification | PCR/capture preferentially amplifies certain GC contents |
| LOESS | Smoothing algorithm | Locally Estimated Scatterplot Smoothing - fits local regression curves |
| Correction factor | Adjustment weight | Multiplier to remove GC-related count biases |
Panel of Normals (PON)
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| PON | Healthy baseline | Panel of Normals - reference statistics from healthy samples |
| Z-score | Deviation from normal | (Sample - PON_mean) / PON_std; measures abnormality |
| Log-ratio | Relative change | log₂(Sample / PON_expected); positive = elevated |
| PON stability | Reliability weight | 1 / (variance + k); higher = more trustworthy comparison |
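The three PON formulas above translate directly into code. This is a minimal sketch; the function names are illustrative, and the default `k = 0.01` is a placeholder assumption because the glossary does not state the constant's value.

```python
import math


def z_score(sample: float, pon_mean: float, pon_std: float) -> float:
    """(Sample - PON_mean) / PON_std."""
    if pon_std <= 0:
        raise ValueError("PON standard deviation must be positive")
    return (sample - pon_mean) / pon_std


def log_ratio(sample: float, pon_expected: float) -> float:
    """log2(Sample / PON_expected); positive means elevated."""
    if sample <= 0 or pon_expected <= 0:
        raise ValueError("both values must be positive")
    return math.log2(sample / pon_expected)


def pon_stability(variance: float, k: float = 0.01) -> float:
    """1 / (variance + k); k is a placeholder value, not Krewlyzer's."""
    return 1.0 / (variance + k)
```

For instance, a sample value of 12 against a PON mean of 10 and standard deviation of 2 gives a Z-score of 1.0, and a sample at twice the PON expectation gives a log-ratio of 1.0.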
Panel Sequencing Terms
Target Capture
| Term | Plain English | Technical Definition |
|------|---------------|----------------------|
| Panel | Targeted genes | Capture probe set designed to sequence specific genomic regions |
| On-target | Captured regions | Fragments overlapping panel target regions |
| Off-target | Background DNA | Fragments not overlapping targets (unbiased background) |
| Bait | Capture probe | Oligonucleotide used to pull down target DNA in hybridization capture |
| Bait padding | Edge buffer | Number of base pairs trimmed from bait edges to avoid capture artifacts |
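The on-target, off-target, and bait padding definitions reduce to interval arithmetic. The sketch below is illustrative only: it assumes 0-based half-open intervals, a linear scan over targets, and hypothetical function names, none of which are taken from Krewlyzer.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based half-open (assumed)


def trim_bait(start: int, end: int, padding: int) -> Optional[Interval]:
    """Shrink a bait by `padding` bp on each edge; None if nothing remains."""
    new_start, new_end = start + padding, end - padding
    return (new_start, new_end) if new_start < new_end else None


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def is_on_target(fragment: Interval, targets: List[Interval]) -> bool:
    """A fragment is on-target if it overlaps any target region."""
    return any(overlaps(fragment, target) for target in targets)
```

A linear scan is fine for illustration; real panels with many targets would typically use a sorted index or interval tree.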
MSK-ACCESS Assay Codes
| Code | Description |
|------|-------------|
| XS1 | MSK-ACCESS v1.0 panel |
| XS2 | MSK-ACCESS v2.0 panel |
| WGS | Whole Genome Sequencing (no targets) |
File Formats
| Extension | Description | Generated By |
|-----------|-------------|--------------|
| .bam | Aligned reads | External aligner (BWA, etc.) |
| .bed.gz | Fragment coordinates | krewlyzer extract |
| .tsv | Tab-separated features | All feature commands |
| .parquet | Columnar data format | WPS (efficient for large arrays) |
| .pon.parquet | PON model | krewlyzer pon build |
See Also