# Configuration
Snowpack is configured through environment variables. Each component reads its own set of variables at startup. Variables without a default are required.
## API
The API server connects to Spark Thrift Server for query execution and to Postgres for job state and cached health data.
| Variable | Default | Description |
|---|---|---|
| SNOWPACK_SPARK_HOST | localhost | Spark Thrift Server hostname. |
| SNOWPACK_SPARK_PORT | 10000 | Spark Thrift Server port. |
| SNOWPACK_CATALOG | glue_catalog | Iceberg catalog name used in Spark SQL statements. |
| SNOWPACK_POSTGRES_HOST | localhost | PostgreSQL hostname. |
| SNOWPACK_POSTGRES_PORT | 5432 | PostgreSQL port. |
| SNOWPACK_POSTGRES_DATABASE | snowpack | PostgreSQL database name. |
| SNOWPACK_POSTGRES_USER | snowpack | PostgreSQL username. |
| SNOWPACK_POSTGRES_PASSWORD | — | PostgreSQL password. No default; must be provided. |
| SNOWPACK_DRAIN_MODE | off | Set to on to reject new maintenance submissions. Existing running jobs are unaffected. Useful during planned Spark downtime. |
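To illustrate the "variables without a default are required" rule, a component can resolve its settings at startup along these lines. This is a minimal sketch, not Snowpack's actual code; the `ApiConfig` dataclass and `load_api_config` helper are illustrative names.

```python
import os
from dataclasses import dataclass

@dataclass
class ApiConfig:
    spark_host: str
    spark_port: int
    postgres_password: str
    drain_mode: bool

def load_api_config(env=os.environ) -> ApiConfig:
    # Variables without a default are required: fail fast at startup
    # instead of failing on the first Postgres connection attempt.
    password = env.get("SNOWPACK_POSTGRES_PASSWORD")
    if not password:
        raise RuntimeError("SNOWPACK_POSTGRES_PASSWORD must be set")
    return ApiConfig(
        spark_host=env.get("SNOWPACK_SPARK_HOST", "localhost"),
        spark_port=int(env.get("SNOWPACK_SPARK_PORT", "10000")),
        postgres_password=password,
        drain_mode=env.get("SNOWPACK_DRAIN_MODE", "off") == "on",
    )
```

Passing a plain dict for `env` makes the loader easy to unit-test without touching the process environment.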
## Health Sync Worker
The health sync worker periodically loads table metadata from the PyIceberg catalog and writes health snapshots to Postgres. It also optionally pushes metrics to Mimir via OTLP.
| Variable | Default | Description |
|---|---|---|
| SNOWPACK_HEALTH_SYNC_INTERVAL_SECONDS | 900 | Health sync cadence in seconds (15 min). Set to 0 to disable the sync loop entirely. |
| SNOWPACK_HEALTH_SYNC_DATABASES | (all) | Comma-separated list of databases to sync. When unset, all databases in the catalog are synced. |
| SNOWPACK_HEALTH_SYNC_CONCURRENCY | 10 | Max concurrent PyIceberg table loads. Use ~2 on memory-constrained pods to avoid OOM kills. |
| SNOWPACK_MIMIR_ENDPOINT | (unset) | OTLP gRPC endpoint for Mimir metrics push. Leave empty to disable metrics push. |
| SNOWPACK_GLUE_CATALOG | lakehouse_dev | Glue catalog name used by PyIceberg for direct metadata access. |
| AWS_REGION | us-east-1 | AWS region for Glue and S3 API calls. |
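The worker's knobs compose as follows: an interval of 0 disables the loop, an unset database list means "everything in the catalog", and an empty Mimir endpoint disables the metrics push. A sketch of that resolution logic, with an illustrative `health_sync_settings` helper (not Snowpack's real function):

```python
import os

def health_sync_settings(env=os.environ):
    interval = int(env.get("SNOWPACK_HEALTH_SYNC_INTERVAL_SECONDS", "900"))
    raw = env.get("SNOWPACK_HEALTH_SYNC_DATABASES", "")
    # Unset or empty means "sync every database in the catalog".
    databases = [d.strip() for d in raw.split(",") if d.strip()] or None
    return {
        "enabled": interval > 0,  # 0 disables the sync loop entirely
        "interval_seconds": interval,
        "databases": databases,   # None -> all databases
        "concurrency": int(env.get("SNOWPACK_HEALTH_SYNC_CONCURRENCY", "10")),
        # Empty endpoint -> metrics push disabled.
        "mimir_endpoint": env.get("SNOWPACK_MIMIR_ENDPOINT") or None,
    }
```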
## Orchestrator
The orchestrator is a CronJob that queries the API for table health, decides which tables need maintenance, and submits jobs. It does not connect to Spark directly.
| Variable | Default | Description |
|---|---|---|
| SNOWPACK_API_URL | http://snowpack-api.snowpack.svc.cluster.local | Snowpack API base URL. The orchestrator calls this for health checks and job submissions. |
| SNOWPACK_MAINTENANCE_CADENCE_HOURS | 6 | Global minimum hours between maintenance runs for a given table. Individual tables can override this via the snowpack.maintenance_cadence_hours table property. |
| SNOWPACK_HEALTH_CONCURRENCY | 10 | Max concurrent health check requests to the API during the discovery phase. |
| SNOWPACK_MAX_SUBMIT | 3 | Max jobs the orchestrator will queue in a single run. Prevents overloading Spark when many tables need maintenance simultaneously. |
| SNOWPACK_POLL_INTERVAL | 30 | Seconds between job status polls while waiting for submitted jobs to complete. |
| SNOWPACK_OPT_IN_MODE | true | When true, only tables with snowpack.maintenance_enabled = true are considered. When false, all tables are eligible unless explicitly excluded via compaction_skip. |
| SNOWPACK_INCLUDE_DATABASES | (unset) | Comma-separated database allowlist. When set, only tables in these databases are considered. Corresponds to the Helm value orchestrator.allowedDatabases. |
| SNOWPACK_DRY_RUN | false | When true, the orchestrator logs all decisions but does not submit any maintenance jobs. Useful for validating configuration changes. |
| SNOWPACK_SLACK_WEBHOOK_URL | (unset) | Slack incoming webhook URL. When set, the orchestrator posts a summary after each run. Optional. |
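The eligibility settings above interact: the database allowlist narrows candidates first, the opt-in flag then decides which table properties gate inclusion, and SNOWPACK_MAX_SUBMIT caps how many jobs one run may queue. A hypothetical sketch of that filtering order (the `select_tables` helper and table dict shape are illustrative; the real orchestrator's decision logic also factors in cadence and health):

```python
def select_tables(tables, env):
    # tables: dicts with a "database" name and a "properties" dict,
    # mirroring Iceberg table properties. Illustrative shape only.
    allow = env.get("SNOWPACK_INCLUDE_DATABASES")
    allowed_dbs = set(d.strip() for d in allow.split(",")) if allow else None
    opt_in = env.get("SNOWPACK_OPT_IN_MODE", "true") == "true"
    max_submit = int(env.get("SNOWPACK_MAX_SUBMIT", "3"))

    eligible = []
    for t in tables:
        props = t.get("properties", {})
        # Allowlist narrows the candidate set first.
        if allowed_dbs is not None and t["database"] not in allowed_dbs:
            continue
        if opt_in:
            # Opt-in mode: only explicitly enabled tables qualify.
            if props.get("snowpack.maintenance_enabled") != "true":
                continue
        elif props.get("compaction_skip") == "true":
            # Opt-out mode: everything qualifies unless explicitly excluded.
            continue
        eligible.append(t)
    # Cap submissions per run to avoid overloading Spark.
    return eligible[:max_submit]
```

SNOWPACK_DRY_RUN would then govern whether the selected tables are actually submitted or only logged.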