npow/awesome-metaflow

Awesome Metaflow

A curated list of Metaflow extensions, plugins, integrations, and resources.

Metaflow is a human-friendly Python/R framework for real-life ML, AI, and data science. It was originally built at Netflix and open-sourced in 2019.


Contents


Core Infrastructure

  • metaflow - Core framework with built-in @kubernetes, @batch, Argo, Airflow, and Step Functions support.
  • metaflow-local-service - Track Metaflow runs anywhere without a database — starts on demand, stops when idle.
  • metaflow-serverless - Serverless Metaflow metadata service — free-tier Postgres, zero setup.
  • metaflow-service - Metadata tracking REST API and UI backend.
  • metaflow-ui - React web UI for real-time run monitoring with a plugin system.

Infrastructure & IaC


Dependency Management

  • metaflow-nflx-extensions - Netflix's enhanced @conda/@pypi: named environments, mixed conda+pip, and faster resolving via micromamba. Requires Metaflow ≥ 2.8.3.
  • metaflow_extensions - Adds @pip and a preinstall shell-hook for system-level deps on remote nodes. ⚠️ Built against Metaflow 2.7.x; verify compatibility.

Extension Mechanism & Templates


Orchestration & Scheduling

  • metaflow-dagster - Dagster scheduling, observability, and UI for Metaflow pipelines.
  • metaflow-flyte - Schedule and monitor Metaflow pipelines through Flyte without rewriting them.
  • metaflow-kestra - Kestra scheduling, triggers, and UI for Metaflow pipelines.
  • metaflow-kubeflow - Deploy Metaflow flows to Kubeflow Pipelines with no infra changes required.
  • metaflow-mage - Mage pipeline orchestration and UI for Metaflow flows.
  • metaflow-prefect - Prefect scheduling, deployments, and UI for Metaflow pipelines.
  • metaflow-slurm - Run steps on HPC Slurm clusters via SSH + sbatch. Beta.
  • metaflow-temporal - Temporal scheduling and durable workflows for Metaflow pipelines.
  • metaflow-windmill - Windmill workflow automation and UI for Metaflow pipelines.

Distributed Compute & Training

  • metaflow-deepspeed - @deepspeed for multi-node DeepSpeed training with S3/Azure checkpoint uploads. Experimental.
  • metaflow-mpi - @mpi for multi-node MPI programs (C/Fortran/mpi4py). Experimental.
  • metaflow-pyspark - PySpark decorator for Metaflow steps. Experimental, low adoption.
  • metaflow-ray - Ephemeral Ray clusters on AWS Batch or Kubernetes. Supports Ray Core, Train, Tune, and Data.
  • metaflow-sandbox - Metaflow steps in millisecond-start sandboxes with cloud-scale fanout and consistent deps.
  • metaflow-tensorflow - @tensorflow that auto-configures TF_CONFIG for tf.distribute.Strategy. Experimental.
  • metaflow-torchrun - Run tasks as nodes in a torchrun DDP job on Batch or Kubernetes.

Model & Artifact Management


Observability & Monitoring

  • metaflow-gpu-profile - @gpu_profile decorator that renders GPU utilization as a Metaflow card.
  • metaflow-measure - Emit step metrics to Datadog and other backends via a measure API.
  • metaflow-profiler - Flamegraph profiling card for Metaflow steps.
  • metaflow-sentry-logger - Sentry logging via @sentry. ⚠️ Relies on an unsupported extension API; broken as of Metaflow 2.7.20.
  • metaflowbot - Slack bot for real-time flow monitoring with CloudFormation deploy. ⚠️ Older; verify against current Metaflow.
  • resource-tracker - Zero-dependency CPU/memory/GPU tracker with Metaflow card output and cloud cost recommendations.

Cards & Visualization


Third-Party Integrations

  • airflow-metaflow-demo - Metaflow + Airflow in Docker Compose with KubernetesPodOperator steps.
  • comet_ml - @comet_flow and @comet_step ship in the comet_ml package, including automatic Card export.
  • hamilton-metaflow - Hamilton as a feature engineering layer inside Metaflow steps.
  • sap-ai-core-metaflow - Generates Argo Workflow Templates from Metaflow flows for SAP AI Core.
  • wandb - @wandb_log ships in the wandb package for experiment tracking.
  • zdatasets - Zillow's Dataset SDK with a DatasetParameter integration for Metaflow flows.

Developer Tooling

  • gha-metaflow - GitHub Actions workflows that trigger Metaflow runs on push/PR.
  • metaflow-contracts - Catch bad data between Metaflow steps before it corrupts your pipeline.
  • metaflow-dev-vscode - VS Code extension with shortcuts for running flows and steps from the editor.
  • metaflow-diff - Diff your working directory against a past Metaflow run.
  • metaflow-mcp-server - MCP server for inspecting runs, logs, and artifacts from any AI agent.
  • metaflow-optuna - Parallel hyperparameter tuning with true adaptive TPE — no sequential bottleneck.
  • metaflow-stubs - Type stubs for IDE autocompletion. Install with: pip install metaflow-stubs

Examples & Tutorials


Contributing

  1. Project must be Metaflow-related with a clear install path
  2. One-line description only
  3. Flag experimental/broken items with ⚠️
  4. No item without a working link
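
The rules above lend themselves to a mechanical pre-check before opening a pull request. The following is a minimal sketch, not part of this repository: it assumes entries are written as a single Markdown line of the form "- [name](url) - description", and the names check_entry, ENTRY_RE, and RISK_WORDS are hypothetical. It checks only that a link is present and looks like an http(s) URL; it does not verify that the link actually resolves.

```python
import re

# Assumed entry shape (hypothetical): "- [name](url) - description" on one line.
ENTRY_RE = re.compile(
    r"\s*[-*•]\s+\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)\s+-\s+(?P<desc>.+)"
)
# Words that, per rule 3, should be accompanied by the warning flag.
RISK_WORDS = ("experimental", "broken")
WARNING = "\u26a0"  # warning sign; the emoji form appends U+FE0F


def check_entry(line: str) -> list[str]:
    """Return the rule violations for one list entry (empty list if none)."""
    # fullmatch fails on multi-line text, which enforces the one-line rule.
    match = ENTRY_RE.fullmatch(line.rstrip("\n"))
    if match is None:
        return ["entry is not of the form '- [name](url) - description'"]
    problems = []
    if not match["url"].startswith(("http://", "https://")):
        problems.append("link is not an http(s) URL")
    desc = match["desc"]
    if any(word in desc.lower() for word in RISK_WORDS) and WARNING not in desc:
        problems.append("experimental/broken item lacks the warning flag")
    return problems


if __name__ == "__main__":
    # Placeholder entry with a placeholder URL, for illustration only.
    sample = "- [some-extension](https://example.com/some-extension) - Experimental decorator."
    for problem in check_entry(sample):
        print(problem)
```

A link-liveness check (rule 4) would need a network request and is deliberately left out of this sketch.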

License

CC0
