Command Line Interface (CLI)

kreview exposes each primary pipeline stage as a terminal command. The CLI is built with Typer.

kreview

ctDNA fragmentomics feature evaluation

Usage:

kreview [OPTIONS] COMMAND [ARGS]...

Options:

  --version
  --install-completion  Install completion for the current shell.
  --show-completion     Show completion for the current shell, to copy it or
                        customize the installation.

features-list

List all registered feature evaluators.

Usage:

kreview features-list [OPTIONS]

label

Generate ctDNA labels without feature evaluation.

Usage:

kreview label [OPTIONS]

Options:

  --cancer-samplesheet PATH       Cancer samplesheet CSV  [required]
  --healthy-xs1-samplesheet PATH  Healthy XS1 samplesheet CSV  [required]
  --healthy-xs2-samplesheet PATH  Healthy XS2 samplesheet CSV  [required]
  --cbioportal-dir PATH           Directory with cBioPortal files  [required]
  --output PATH                   Output parquet file  [default:
                                  labels.parquet]
  --min-vaf FLOAT                 Min VAF for Possible ctDNA+ (default 1%)
                                  [default: 0.01]
  --min-variants INTEGER          Min # variants passing VAF for Possible
                                  ctDNA+  [default: 1]
  --chunk-size INTEGER            Batch size for DuckDB file loading over SFTP
                                  network mounts  [default: 500]
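
The --min-vaf and --min-variants options together describe a threshold rule for the "Possible ctDNA+" label. A minimal sketch of that rule, as read from the help text above, is shown below. The function name is hypothetical, and the inclusive comparison (>=) is an assumption; kreview's actual labelling code may differ.

```python
from typing import Iterable


def is_possible_ctdna_positive(
    vafs: Iterable[float],
    min_vaf: float = 0.01,    # mirrors the --min-vaf default (1%)
    min_variants: int = 1,    # mirrors the --min-variants default
) -> bool:
    """Hypothetical helper: True when enough variants reach the VAF threshold."""
    # Assumption: a variant "passes" when its VAF is at or above the threshold.
    passing = sum(1 for vaf in vafs if vaf >= min_vaf)
    return passing >= min_variants


# One of the two variants reaches 1% VAF, so the default rule is satisfied.
print(is_possible_ctdna_positive([0.004, 0.02]))
# Requiring two passing variants rejects the same sample.
print(is_possible_ctdna_positive([0.004, 0.02], min_variants=2))
```

Raising either value makes the label stricter: fewer samples qualify as Possible ctDNA+.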

report

Regenerate HTML dashboards from existing matrix parquet files.

Usage:

kreview report [OPTIONS]

Options:

  --input-dir PATH         Directory with *_matrix.parquet files  [required]
  --out-dir PATH           Directory to deposit Quarto reports  [default:
                           reports/]
  --cvd-safe               Render dashboards and plots using an Okabe-Ito
                           Colorblind-Safe palette instead of default neon.
  --shap-samples INTEGER   Max samples for SHAP explainability computation in
                           dashboards.  [default: 500]
  --shap-features INTEGER  Max features to visualize in SHAP plots.  [default:
                           10]
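
As an illustration of the options above, the following sketch assembles a kreview report invocation from Python and prints it for review. Only flags documented in this section are used. The directory names are hypothetical placeholders, and the EXECUTE switch is an addition of this sketch, not part of kreview.

```python
import shlex
import shutil
import subprocess

# Hypothetical paths: point --input-dir at a directory holding *_matrix.parquet files.
argv = [
    "kreview", "report",
    "--input-dir", "path/to/matrices",
    "--out-dir", "reports_cvd/",
    "--cvd-safe",               # Okabe-Ito palette instead of the default
    "--shap-samples", "200",    # lower than the default of 500
    "--shap-features", "15",    # higher than the default of 10
]

# Print the command so it can be reviewed or pasted into a shell.
print(shlex.join(argv))

# Left False so the sketch is a dry run; set to True to actually run kreview.
EXECUTE = False
if EXECUTE and shutil.which("kreview") is not None:
    subprocess.run(argv, check=True)
```

Because report reads existing matrices, a call like this can restyle dashboards without repeating labelling or feature extraction.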

run

Run full pipeline: label → extract → evaluate → report.

Usage:

kreview run [OPTIONS]

Options:

  --cancer-samplesheet PATH       [required]
  --healthy-xs1-samplesheet PATH  [required]
  --healthy-xs2-samplesheet PATH  [required]
  --cbioportal-dir PATH           [required]
  --krewlyzer-dir TEXT            krewlyzer output directory  [required]
  --output PATH                   Output directory  [default: output/]
  --min-vaf FLOAT                 [default: 0.01]
  --min-fragments INTEGER         [default: 2000]
  --min-variants INTEGER          [default: 1]
  --features TEXT                 Comma-separated features to run
  --tier INTEGER                  Run features of this tier only
  --workers INTEGER               Total processes  [default: 4]
  -v, --verbose                   Enable verbose logging
  --cvd-safe                      Render dashboards and plots using an Okabe-
                                  Ito Colorblind-Safe palette instead of
                                  default neon.
  --skip-report / --no-skip-report
                                  Skip HTML report generation  [default: no-
                                  skip-report]
  --cv-folds INTEGER              Number of cross-validation folds (3-20,
                                  default 5)  [default: 5]
  --impute-strategy TEXT          Imputation strategy for missing values:
                                  median, mean, or zero  [default: median]
  --export-duckdb                 Export a persistent duckdb data lake
                                  containing all feature matrices
  --chunk-size INTEGER            Batch size for DuckDB file loading over SFTP
                                  network mounts  [default: 500]
  --top-n INTEGER                 Max sub-metrics to feed into ML models per
                                  evaluator. All sub-metrics are included up
                                  to this cap; the model's feature importance
                                  ranks them.  [default: 50]
  --shap-samples INTEGER          Max samples for SHAP explainability
                                  computation in dashboards. Lower values
                                  reduce memory usage.  [default: 500]
  --shap-features INTEGER         Max features to visualize in SHAP
                                  beeswarm/waterfall plots. The model still
                                  trains on --top-n features.  [default: 10]
  --resume                        Skip evaluators whose model results already
                                  exist in the output directory.
  --compute-univariate-auc        Compute per-feature univariate LR AUC (adds
                                  ~10s per evaluator).
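
Some run options carry documented constraints: --cv-folds accepts 3 to 20, and --impute-strategy accepts median, mean, or zero. The sketch below checks those constraints before assembling the command. The helper name and the file paths are hypothetical, and whether kreview itself rejects out-of-range values is not stated in this reference, so treat the checks as a convenience rather than a description of the tool's behaviour.

```python
import shlex
from typing import List

# Values listed in the --impute-strategy help text.
ALLOWED_IMPUTE = {"median", "mean", "zero"}


def build_run_command(
    cancer: str,
    healthy_xs1: str,
    healthy_xs2: str,
    cbioportal_dir: str,
    krewlyzer_dir: str,
    *,
    cv_folds: int = 5,
    impute_strategy: str = "median",
    resume: bool = False,
) -> List[str]:
    """Hypothetical helper: validate documented ranges, then build the argv list."""
    if not 3 <= cv_folds <= 20:
        raise ValueError(f"--cv-folds must be between 3 and 20, got {cv_folds}")
    if impute_strategy not in ALLOWED_IMPUTE:
        raise ValueError(f"--impute-strategy must be one of {sorted(ALLOWED_IMPUTE)}")
    argv = [
        "kreview", "run",
        "--cancer-samplesheet", cancer,
        "--healthy-xs1-samplesheet", healthy_xs1,
        "--healthy-xs2-samplesheet", healthy_xs2,
        "--cbioportal-dir", cbioportal_dir,
        "--krewlyzer-dir", krewlyzer_dir,
        "--cv-folds", str(cv_folds),
        "--impute-strategy", impute_strategy,
    ]
    if resume:
        argv.append("--resume")  # skip evaluators with existing model results
    return argv


# Placeholder file names; substitute real samplesheets and directories.
cmd = build_run_command(
    "cancer.csv", "healthy_xs1.csv", "healthy_xs2.csv",
    "cbioportal/", "krewlyzer_out/",
    cv_folds=10, resume=True,
)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell or passed to subprocess.run once the placeholder paths are replaced.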