Cascade's Phase 1 implementation is complete! We now have a working single-node workflow orchestrator with:
### Database Layer
- PostgreSQL schemas for DAGs, Jobs, and TaskExecutions
- Ecto models with proper relationships and validations
- Migrations for all tables with proper indexes
### DSL (Domain-Specific Language)
- Elixir macros for defining DAGs (`dag`, `task`, `depends_on`)
- Compiler that converts DSL to language-agnostic JSON
- Validator with cycle detection using libgraph
- Topological sorting for dependency resolution
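The validation step boils down to a topological sort that fails on cycles. Cascade uses libgraph for this; the same idea can be sketched with Erlang's built-in `:digraph` (module and task names below are illustrative, not the project's API):

```elixir
# Cycle detection + topological sort sketch using Erlang's stdlib :digraph.
# (The project itself uses libgraph; this only illustrates the technique.)
defmodule TopoSketch do
  def sort(tasks, deps) do
    g = :digraph.new()
    Enum.each(tasks, &:digraph.add_vertex(g, &1))
    # Each {from, to} edge means `from` must run before `to`
    Enum.each(deps, fn {from, to} -> :digraph.add_edge(g, from, to) end)

    case :digraph_utils.topsort(g) do
      false -> {:error, :cycle_detected}
      order -> {:ok, order}
    end
  end
end

TopoSketch.sort([:extract, :transform, :load], [
  {:extract, :transform},
  {:transform, :load}
])
# {:ok, [:extract, :transform, :load]}
```

`:digraph_utils.topsort/1` returns `false` whenever the graph contains a cycle, which is exactly the invalid-DAG case the validator must reject.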
### Runtime Engine
- StateManager: ETS-based in-memory state + async Postgres persistence
- Scheduler: Job lifecycle orchestration, dependency resolution
- Executor: Task dispatch and worker selection
- TaskRunner: Worker GenServer pool for task execution
- LocalExecutor: Executes Elixir module functions
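The ETS half of the StateManager pattern can be sketched in a few lines (table name and function names here are hypothetical, not the project's actual API):

```elixir
# Minimal sketch of ETS-based in-memory state tracking, the pattern the
# StateManager builds on. Names are illustrative only.
defmodule StateSketch do
  @table :cascade_job_state

  def init do
    # Named public table so any process can read/write task state
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
  end

  def put_task_status(job_id, task_id, status) do
    :ets.insert(@table, {{job_id, task_id}, status})
  end

  def get_task_status(job_id, task_id) do
    case :ets.lookup(@table, {job_id, task_id}) do
      [{_key, status}] -> {:ok, status}
      [] -> {:error, :not_found}
    end
  end
end

StateSketch.init()
StateSketch.put_task_status("job-1", :extract, :running)
StateSketch.get_task_status("job-1", :extract)
# {:ok, :running}
```

Reads and writes hit memory only; the real StateManager additionally persists to Postgres asynchronously so the durable record lags the hot path instead of blocking it.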
### Event System
- PubSub-based event publishing for job and task updates
- Standardized event structs (JobEvent, TaskEvent, WorkerEvent)
- Topic-based subscriptions for real-time updates
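The event-struct + topic pattern might look like the following sketch (struct fields and topic formats are assumptions, not the project's exact API):

```elixir
# Sketch of standardized event structs and topic naming for PubSub.
# Field names and "job:<id>" / "worker:<node>" topic formats are assumed.
defmodule EventsSketch do
  defmodule JobEvent do
    defstruct [:job_id, :type, :payload, :at]
  end

  def job_topic(job_id), do: "job:#{job_id}"
  def worker_topic(node), do: "worker:#{node}"

  # With Phoenix.PubSub, the publish side would be roughly:
  #   Phoenix.PubSub.broadcast(Cascade.PubSub, job_topic(job_id), event)
  def new_job_event(job_id, type, payload \\ %{}) do
    %JobEvent{job_id: job_id, type: type, payload: payload, at: DateTime.utc_now()}
  end
end
```

Subscribers interested in a single job subscribe to that job's topic and receive only its events, which is what makes per-job real-time updates cheap.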
### Example DAG
- ETL pipeline with Extract → Transform → Load → Notify
- Sample task modules demonstrating the task interface
```shell
# From project root
cd /Users/tim/cascade_project/cascade

# Start IEx with the application
iex -S mix
```

```elixir
# Load the ETL DAG into the database
{:ok, dag} = Cascade.Examples.DAGLoader.load_etl_dag()

# Manually trigger the DAG
alias Cascade.Runtime.Scheduler
{:ok, job} = Scheduler.trigger_job(dag.id, "manual", %{environment: "dev"})

# You should see logs showing:
# - Job created
# - Tasks executing in dependency order: extract → transform → load → notify
# - Each task completing successfully

# Get job details
alias Cascade.Workflows
job_with_details = Workflows.get_job_with_details!(job.id)

# Check task executions
task_executions = Workflows.list_task_executions_for_job(job.id)
Enum.each(task_executions, fn te ->
  IO.puts("#{te.task_id}: #{te.status}")
end)

# Get in-memory state (for active jobs)
alias Cascade.Runtime.StateManager
{:ok, job_state} = StateManager.get_job_state(job.id)
IO.inspect(job_state, label: "Job State")

# List all DAGs in the system
dags = Workflows.list_dags()
Enum.each(dags, fn dag ->
  IO.puts("#{dag.name}: #{dag.description}")
end)
```

```
┌─────────────┐
│  User/API   │
└──────┬──────┘
       │ trigger_job(dag_id)
       ▼
┌─────────────────────┐
│      Scheduler      │ ← Orchestrates job lifecycle
│ - Create Job (PG)   │
│ - Init State (ETS)  │
│ - Find ready tasks  │
└──────┬──────────────┘
       │ dispatch_task
       ▼
┌─────────────────────┐
│      Executor       │ ← Dispatches tasks to workers
│ - Select worker     │
│ - Assign task       │
└──────┬──────────────┘
       │ PubSub: worker:node1
       ▼
┌─────────────────────┐
│     TaskRunner      │ ← Executes tasks
│ - Run local code    │
│ - Report results    │
└──────┬──────────────┘
       │ task_completed
       ▼
┌─────────────────────┐
│    StateManager     │ ← Tracks state
│ - Update ETS        │
│ - Publish events    │
│ - Persist to PG     │
└─────────────────────┘
```
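The Scheduler's "find ready tasks" step in the flow above reduces to a simple rule: a task is ready when it is neither completed nor running and every upstream dependency has completed. A sketch (names hypothetical, not the project's API):

```elixir
# Sketch of dependency resolution: which tasks can be dispatched right now?
# `deps` maps each task to the list of tasks it depends on.
defmodule ReadySketch do
  def ready_tasks(deps, completed, running) do
    deps
    |> Enum.filter(fn {task, upstream} ->
      task not in completed and task not in running and
        Enum.all?(upstream, &(&1 in completed))
    end)
    |> Enum.map(fn {task, _upstream} -> task end)
  end
end

deps = [extract: [], transform: [:extract], load: [:transform], notify: [:load]]
ReadySketch.ready_tasks(deps, [:extract], [])
# [:transform]
```

Each time a task completes, the Scheduler can re-run this check to dispatch whatever just became unblocked, which is how the chain advances one stage at a time.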
```elixir
# 1. Define task modules
defmodule MyApp.Tasks.MyTask do
  require Logger

  def run(context) do
    Logger.info("Running task for job #{context.job_id}")
    # Your task logic here
    result = %{status: "success", data: "..."}
    {:ok, result}
  end
end

# 2. Define the DAG
defmodule MyApp.MyDAG do
  use Cascade.DSL

  dag "my_workflow" do
    description "My custom workflow"

    task :step1 do
      type :local
      module "MyApp.Tasks.MyTask"
      timeout 300
    end

    task :step2 do
      type :local
      module "MyApp.Tasks.AnotherTask"
      depends_on [:step1]
      timeout 300
    end
  end
end

# 3. Load it into the database
definition = MyApp.MyDAG.get_dag_definition()
{:ok, validated} = Cascade.DSL.Validator.validate(definition)

{:ok, dag} = Cascade.Workflows.create_dag(%{
  name: definition["name"],
  description: definition["metadata"]["description"],
  definition: validated,
  compiled_at: DateTime.utc_now()
})

# 4. Trigger it!
{:ok, job} = Cascade.Runtime.Scheduler.trigger_job(dag.id, "manual", %{})
```

Planned for future phases:

- Multi-node Erlang clustering
- Worker health monitoring and heartbeats
- Task reassignment on worker failure
- Load balancing across worker pool
- Lambda task execution
- S3 artifact storage
- Hybrid local/remote execution
- Real-time DAG visualization
- Job monitoring dashboard
- Manual job triggering UI
- Worker cluster status
- Cron-based scheduling
- Retry logic with exponential backoff
- Task timeouts
- Error callbacks
- Job cancellation
- Authentication/authorization
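Of the planned items, retry with exponential backoff is self-contained enough to sketch now. This is not Cascade code; the base delay, cap, and attempt count are hypothetical defaults:

```elixir
# Sketch of retry with exponential backoff (roadmap item, not implemented yet).
# Delay doubles per attempt: 100ms, 200ms, 400ms, ... capped at 30s.
defmodule BackoffSketch do
  @base_ms 100
  @max_ms 30_000

  def delay_for(attempt) when attempt >= 1 do
    min(@base_ms * Integer.pow(2, attempt - 1), @max_ms)
  end

  def retry(fun, attempts \\ 5) do
    Enum.reduce_while(1..attempts, {:error, :never_ran}, fn attempt, _acc ->
      case fun.() do
        {:ok, _} = ok ->
          {:halt, ok}

        {:error, _} = err ->
          if attempt < attempts, do: Process.sleep(delay_for(attempt))
          {:cont, err}
      end
    end)
  end
end

BackoffSketch.delay_for(1) # 100
BackoffSketch.delay_for(5) # 1600
```

In a real TaskRunner this would also want jitter and per-task retry limits pulled from the DAG definition rather than module attributes.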
- `lib/cascade/runtime/state_manager.ex` - In-memory job state (ETS)
- `lib/cascade/runtime/scheduler.ex` - Job lifecycle orchestration
- `lib/cascade/runtime/executor.ex` - Task dispatch
- `lib/cascade/runtime/task_runner.ex` - Task execution workers
- `lib/cascade/dsl.ex` - DSL macros
- `lib/cascade/dsl/compiler.ex` - DSL → JSON compiler
- `lib/cascade/dsl/validator.ex` - DAG validation & topological sort
- `lib/cascade/workflows/dag.ex` - DAG schema
- `lib/cascade/workflows/job.ex` - Job schema
- `lib/cascade/workflows/task_execution.ex` - TaskExecution schema
- `lib/cascade/workflows.ex` - Context functions (CRUD)
- `lib/cascade/events.ex` - PubSub events & topics
- `lib/cascade/examples/etl_dag.ex` - Example ETL DAG
- `lib/cascade/examples/tasks.ex` - Example task modules
- `lib/cascade/examples/dag_loader.ex` - Helper to load DAGs
```shell
# Run the application
iex -S mix
```

```elixir
# In IEx:
# 1. Load example DAG
{:ok, dag} = Cascade.Examples.DAGLoader.load_etl_dag()

# 2. Trigger a job
alias Cascade.Runtime.Scheduler
{:ok, job} = Scheduler.trigger_job(dag.id, "manual", %{})

# 3. Watch the logs - you should see:
# - Job created
# - ExtractData task starts and completes
# - TransformData task starts and completes
# - LoadData task starts and completes
# - SendNotification task starts and completes
# - Job marked as complete

# 4. Check the results
alias Cascade.Workflows
job_final = Workflows.get_job_with_details!(job.id)
IO.inspect(job_final.status) # Should be :success
```

Notes:

- Current implementation is single-node only
- All tasks execute on the same node
- Worker pool size defaults to `2 * CPU cores`; set the `CASCADE_WORKERS` env var to override
- State is stored in ETS (lost on restart) and Postgres (durable)
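The worker-pool default described above could be derived like this (a sketch; the project's actual config code may differ):

```elixir
# Sketch: resolve worker pool size from CASCADE_WORKERS, falling back to
# twice the number of online BEAM schedulers (~= 2 * CPU cores).
pool_size =
  case System.get_env("CASCADE_WORKERS") do
    nil -> 2 * System.schedulers_online()
    n -> String.to_integer(n)
  end

IO.puts("worker pool size: #{pool_size}")
```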
Phase 1 is complete! You can now:
- ✅ Define DAGs in Elixir DSL
- ✅ Load DAGs to Postgres
- ✅ Trigger jobs manually via IEx
- ✅ See tasks execute in dependency order
- ✅ Query job/task status from database
- ✅ View in-memory state for active jobs
The foundation is solid and ready for Phase 2: Distributed Workers!