Cascade features an automatic DAG loading system that improves on Airflow's approach in several ways:
| Feature | Airflow | Cascade |
|---|---|---|
| Multiple Sources | Local directory only | Local directory + S3 bucket (simultaneously) |
| Hot-Reloading | Restart required | Automatic detection & reload |
| Change Detection | Full rescan | Checksum-based (only reloads changed DAGs) |
| Validation | Basic | Comprehensive (nodes, edges, cycles, types) |
| Error Handling | Parse errors can crash | Graceful degradation (bad DAGs logged, not loaded) |
| File Formats | Python only | JSON + Elixir (.exs) |
| Deletion Handling | Manual cleanup | Automatic DAG disabling |
| Scan Interval | Fixed 30s | Configurable per deployment |
The DagLoader GenServer:
- Scans sources (local directory and/or S3 bucket) at configurable intervals
- Detects changes using MD5 checksums (only reloads changed files)
- Validates DAGs before loading (prevents bad DAGs from breaking the system)
- Upserts DAGs (creates new or updates existing)
- Handles deletions by disabling DAGs when source files are removed
- Logs everything for debugging and auditing
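The change-detection step can be sketched roughly as follows. This is an illustrative sketch only; `ChecksumSketch` is a hypothetical module name, not Cascade's actual implementation:

```elixir
# Illustrative sketch of checksum-based change detection.
# `ChecksumSketch` is a hypothetical name, not Cascade's real module.
defmodule ChecksumSketch do
  # MD5 checksum of a file's contents, hex-encoded.
  def checksum(content) do
    :crypto.hash(:md5, content) |> Base.encode16(case: :lower)
  end

  # A DAG needs reloading only when its checksum differs from the
  # last one recorded for that file.
  def changed?(content, known_checksums, dag_name) do
    checksum(content) != Map.get(known_checksums, dag_name)
  end
end
```

Only files for which `changed?/3` returns `true` need to be re-validated and upserted, which keeps each scan cheap.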
```
┌─────────────────┐     ┌─────────────────┐
│  Local Files    │     │   S3 Bucket     │
│ (./dags/*.json) │     │  (dags/*.json)  │
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     │
                     ▼
           ┌─────────────────┐
           │    DagLoader    │
           │  (scan every    │
           │   30 seconds)   │
           └────────┬────────┘
                    │
         ┌──────────┼──────────┐
         │          │          │
         ▼          ▼          ▼
     Validate   Calculate    Upsert
       DAG      Checksum     to DB
```
Configure via environment variables:

```shell
# Local directory to scan (default: ./dags)
export DAGS_DIR="./dags"

# Scan interval in seconds (default: 30)
export DAGS_SCAN_INTERVAL=30

# Enable/disable auto-loading (default: true)
export DAGS_ENABLED=true

# Optional: S3 bucket for remote DAGs
export DAGS_S3_BUCKET="my-company-dags"
export DAGS_S3_PREFIX="production/dags/"
```

In `docker-compose.yml`:
```yaml
services:
  cascade:
    environment:
      - DAGS_DIR=/app/dags
      - DAGS_SCAN_INTERVAL=60
      - DAGS_S3_BUCKET=my-dags-bucket
      - DAGS_S3_PREFIX=dags/
```

In an ECS task definition:
```json
{
  "environment": [
    {"name": "DAGS_DIR", "value": "/app/dags"},
    {"name": "DAGS_SCAN_INTERVAL", "value": "60"},
    {"name": "DAGS_S3_BUCKET", "value": "my-dags-bucket"}
  ]
}
```

Simple, declarative DAG definitions:
```json
{
  "nodes": [
    {
      "id": "extract",
      "type": "local",
      "config": {
        "module": "MyApp.Tasks.Extract",
        "timeout": 300,
        "retry": 3
      }
    },
    {
      "id": "transform",
      "type": "local",
      "config": {
        "module": "MyApp.Tasks.Transform",
        "timeout": 300
      },
      "depends_on": ["extract"]
    }
  ],
  "edges": [
    {"from": "extract", "to": "transform"}
  ],
  "description": "ETL Pipeline",
  "enabled": true
}
```

For dynamic DAG generation:
```elixir
# Dynamic configuration
num_parallel_tasks = System.get_env("PARALLEL_TASKS", "5") |> String.to_integer()

# Generate tasks programmatically
tasks =
  for i <- 1..num_parallel_tasks do
    %{
      "id" => "parallel_#{i}",
      "type" => "local",
      "config" => %{"module" => "MyApp.Task"}
    }
  end

# Return DAG definition
%{
  "nodes" => tasks,
  "edges" => [],
  "description" => "Dynamically generated with #{num_parallel_tasks} tasks"
}
```

Each node requires an `id`, a `type`, and a `config`:

```elixir
%{
  "nodes" => [
    %{
      "id" => "unique_task_id",       # Required: unique identifier
      "type" => "local" | "lambda",   # Required: task type
      "config" => %{...}              # Required: task configuration
    }
  ]
}
```

Other top-level fields:

```elixir
%{
  "edges" => [                        # Dependencies between tasks
    %{"from" => "task1", "to" => "task2"}
  ],
  "description" => "...",             # Human-readable description
  "enabled" => true                   # Enable/disable DAG
}
```

Two ways to specify dependencies:
- Via `depends_on` in the node:

  ```json
  {
    "id": "task2",
    "type": "local",
    "depends_on": ["task1"],
    "config": {...}
  }
  ```

- Via the `edges` array (preferred):

  ```json
  {
    "nodes": [...],
    "edges": [
      {"from": "task1", "to": "task2"}
    ]
  }
  ```

DAGs are validated before loading. The validator checks:
- ✅ `nodes` array must exist and not be empty
- ✅ Each node must have `id`, `type`, and `config`
- ✅ Node IDs must be unique
- ✅ No duplicate node definitions
- ✅ All edges must reference existing nodes
- ✅ All `depends_on` entries must reference existing nodes
- ✅ No circular dependencies allowed (uses topological sort)
- ✅ Task types must be valid (`local`, `lambda`)
Example validation errors:
```
❌ Missing required fields: nodes
❌ Duplicate node IDs: task_1
❌ DAG contains a cycle (circular dependency)
❌ All edges must have valid 'from' and 'to' node IDs
```
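The cycle check can be implemented with Kahn's algorithm: repeatedly remove nodes with no incoming edges; any nodes left over lie on a cycle. A minimal sketch (hypothetical module name, not Cascade's actual validator):

```elixir
# Illustrative cycle detection via Kahn's algorithm.
# `CycleCheckSketch` is a hypothetical name, not Cascade's real module.
defmodule CycleCheckSketch do
  # Returns true if the edge list contains a cycle over the given node ids.
  def cycle?(node_ids, edges) do
    # Count incoming edges for every node.
    in_degree =
      Enum.reduce(edges, Map.new(node_ids, &{&1, 0}), fn %{"to" => t}, acc ->
        Map.update(acc, t, 1, &(&1 + 1))
      end)

    prune(in_degree, edges)
  end

  defp prune(in_degree, edges) do
    # Nodes with no remaining incoming edges can be removed.
    ready = for {id, 0} <- in_degree, do: id

    case ready do
      [] ->
        # Nothing removable: any leftover nodes form a cycle.
        map_size(in_degree) > 0

      _ ->
        remaining = Map.drop(in_degree, ready)

        # Decrement in-degrees of targets of the removed nodes.
        remaining =
          Enum.reduce(edges, remaining, fn %{"from" => f, "to" => t}, acc ->
            if f in ready and Map.has_key?(acc, t) do
              Map.update!(acc, t, &(&1 - 1))
            else
              acc
            end
          end)

        prune(remaining, edges)
    end
  end
end
```

For the ETL example above, `cycle?(["extract", "transform"], [%{"from" => "extract", "to" => "transform"}])` returns `false`; adding the reverse edge would make it `true`.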
- Configure AWS credentials (via IAM role or environment variables)
- Set the S3 environment variables:

  ```shell
  export DAGS_S3_BUCKET="my-dags-bucket"
  export DAGS_S3_PREFIX="production/dags/"
  ```

- Upload DAG files to S3:

  ```shell
  aws s3 cp my_dag.json s3://my-dags-bucket/production/dags/
  ```
The IAM role or user needs read access to the DAG bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-dags-bucket",
        "arn:aws:s3:::my-dags-bucket/production/dags/*"
      ]
    }
  ]
}
```

Benefits of S3-backed DAGs:

- Central storage: Share DAGs across multiple Cascade instances
- Version control: Use S3 versioning for DAG history
- Access control: Fine-grained IAM permissions
- Scalability: No file system dependencies
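For the version-control benefit, S3 bucket versioning can be enabled with the standard AWS CLI (requires AWS credentials; the bucket name and key below are examples):

```shell
# Keep a history of every DAG file revision in S3
aws s3api put-bucket-versioning \
  --bucket my-dags-bucket \
  --versioning-configuration Status=Enabled

# List old revisions of a DAG file
aws s3api list-object-versions \
  --bucket my-dags-bucket \
  --prefix production/dags/daily_etl.json
```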
```elixir
# In IEx console or remote shell
iex> Cascade.DagLoader.get_status()
%{
  enabled: true,
  scan_interval: 30000,
  loaded_dags: ["example_etl", "lambda_pipeline"],
  dag_count: 2,
  local_source: %{type: :local, path: "./dags"},
  s3_source: nil
}

iex> Cascade.DagLoader.scan_now()
:ok
```

To disable automatic loading:

```shell
# Temporarily disable
export DAGS_ENABLED=false

# Or restart with loading disabled
docker restart cascade-app
```

The DAG loader produces structured logs:
```
🔄 [DAG_LOADER] Starting DAG loader (scan_interval=30000ms)
📂 [LOCAL_SOURCE] Scanning directory: ./dags
☁️ [S3_SOURCE] Scanning S3: s3://my-bucket/dags/
📥 [DAG_LOADER] Loading DAG: example_etl from local:./dags/example_etl.json
✅ [DAG_LOADER] Successfully loaded DAG: example_etl
❌ [DAG_LOADER] Failed to load DAG example_etl: JSON parse error
🗑️ [DAG_LOADER] DAG deleted: old_dag
```
DAG not loading:
- Check file permissions
- Verify JSON/Elixir syntax
- Check validation errors in logs
- Ensure `DAGS_ENABLED=true`

Changes not detected:
- Wait for the next scan interval
- Force a scan with `DagLoader.scan_now()`
- Verify the file actually changed (detection is checksum-based)

S3 DAGs not loading:
- Verify AWS credentials
- Check S3 bucket permissions
- Verify the bucket name and prefix
- Check S3 logs for access-denied errors
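The first S3 checks above can be run directly with the AWS CLI (requires AWS credentials; bucket and prefix below are examples):

```shell
# Which identity is Cascade running as?
aws sts get-caller-identity

# Can that identity list the DAG prefix?
aws s3 ls s3://my-dags-bucket/production/dags/
```

If `aws s3 ls` fails with `AccessDenied`, the IAM policy is missing `s3:ListBucket` or `s3:GetObject` on the bucket.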
```
dags/
├── README.md
├── production/
│   ├── daily_etl.json
│   └── hourly_sync.json
├── staging/
│   ├── test_pipeline.json
│   └── debug_flow.exs
└── templates/
    └── example_template.exs
```
- Use descriptive, kebab-case names: `daily-etl-pipeline.json`
- The file name becomes the DAG name (without extension)
- Avoid special characters
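The second rule can be checked from the shell: the loader derives the DAG name from the file name with the extension stripped, equivalent to:

```shell
# File name minus extension becomes the DAG name
basename daily-etl-pipeline.json .json
# → daily-etl-pipeline
```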
```shell
# Keep DAGs in version control
git add dags/*.json
git commit -m "Add new ETL pipeline"
git push

# Deploy to S3 from CI/CD
aws s3 sync ./dags/ s3://my-bucket/production/dags/
```

```shell
# Test DAG locally before deploying
mix test apps/cascade/test/cascade/dag_loader_test.exs

# Validate JSON syntax
cat dags/my_dag.json | jq .

# Test .exs files
elixir dags/my_dag.exs
```

Always include descriptions in DAG files:
```json
{
  "description": "Daily ETL: Extracts from API, transforms data, loads to warehouse",
  "nodes": [...]
}
```

Previously, DAGs had to be loaded manually:

```elixir
# In application startup or manual command
Mix.Task.run("cascade.load_dag", ["daily_etl", "dags/daily_etl.json"])
```

To migrate:

- Move DAG files to the `./dags` directory: `mv my_dag.json ./dags/`
- Remove the manual loading code
- The DAG loads automatically on application start and every 30s
- ✅ No manual intervention required
- ✅ Works in production (no Mix dependency)
- ✅ Hot-reloading without restart
- ✅ Centralized DAG management
- Local directory: < 10ms for 100 files
- S3 bucket: < 500ms for 100 files (varies by network)
- Validation: < 1ms per DAG
- Checksum calculation: < 1ms per DAG
- Adjust the scan interval for large DAG sets:

  ```shell
  export DAGS_SCAN_INTERVAL=60  # Reduce scan frequency
  ```

- Use an S3 prefix to limit scope:

  ```shell
  export DAGS_S3_PREFIX="production/active-dags/"
  ```

- Disable the loader if it is not needed:

  ```shell
  export DAGS_ENABLED=false
  ```
```shell
# Production
export DAGS_S3_PREFIX="production/dags/"

# Staging
export DAGS_S3_PREFIX="staging/dags/"

# Development
export DAGS_DIR="./dev-dags"
export DAGS_S3_BUCKET=""
```

Use `.exs` files for feature flags:

```elixir
enabled = System.get_env("NEW_PIPELINE_ENABLED", "false") == "true"

%{
  "nodes" => [...],
  "enabled" => enabled,
  "description" => "New pipeline (controlled by NEW_PIPELINE_ENABLED)"
}
```

Reusable templates:

```elixir
# Template in dags/templates/etl_template.exs
defmodule ETLTemplate do
  def generate(source, destination) do
    %{
      "nodes" => [
        %{"id" => "extract_#{source}", "type" => "local", ...},
        %{"id" => "load_#{destination}", "type" => "local", ...}
      ],
      "description" => "ETL from #{source} to #{destination}"
    }
  end
end

# Generate actual DAG
ETLTemplate.generate("api", "warehouse")
```

- S3 bucket encryption: Use server-side encryption (SSE-S3 or SSE-KMS)
- IAM roles: Use least-privilege IAM roles (read-only S3 access)
- File permissions: Restrict write access to DAG files
- Code review: Review `.exs` files carefully (they execute Elixir code)
- Validation: DAG validation prevents malformed workflows
Potential improvements under consideration:
- DAG versioning with rollback capability
- Webhook notifications on DAG changes
- UI for DAG management
- Git integration (pull from repository)
- Schema validation for task configs
- DAG templates library