feat: introduce component hash history by jordanrfrazier · Pull Request #11311 · langflow-ai/langflow

jordanrfrazier · 2026-01-15T02:12:55Z

Adds component hash history files for stable and nightly versions. This will allow us to track Core Components across versions of Langflow, allowing users to disable Custom Component execution.

Uses a simple version -> hash mapping. I decided against dealing with the complexity of allowed ranges for now -- the growth of these files (even for the nightly) will not be significant in the next year (~10 MB).
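
For illustration, a minimal sketch of what a version -> hash mapping could look like and how it might be queried. The component name, versions, and hash strings are made up, and `known_hash` / `is_known_core` are hypothetical helpers, not functions from this PR; only the `history[component_name]["versions"]` nesting is taken from the review discussion.

```python
# Hypothetical history contents: one entry per component, each holding a
# flat version -> hash mapping (no version ranges).
history = {
    "MyComponent": {
        "versions": {
            "0.1.0": "hash_v1",
            "0.2.0": "hash_v2",
        }
    }
}


def known_hash(history, component_name, version):
    """Return the hash recorded for a component at a version, or None."""
    entry = history.get(component_name, {})
    return entry.get("versions", {}).get(version)


def is_known_core(history, component_name, version, code_hash):
    """True when code_hash matches the hash recorded for that version."""
    recorded = known_hash(history, component_name, version)
    return recorded is not None and recorded == code_hash
```

A lookup such as `is_known_core(history, "MyComponent", "0.2.0", "hash_v2")` is the kind of check that could back a "core components only" mode, under the assumptions above.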

Note this also removes the previous work done to add the hash history to the existing component index -- it makes more sense to keep them separated.

coderabbitai · 2026-01-15T02:13:02Z

Walkthrough

This pull request refactors component indexing and hash history handling. The index builder is simplified to create deterministic indexes from scratch without preserving history, while a new separate script manages version-based hash histories. Component metadata now includes a unique component_id attribute that propagates through the system.
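
As a rough illustration of hashing an index deterministically, the sketch below serializes a dict with sorted keys before hashing. The actual normalization in `scripts/build_component_index.py` is not shown in this review, so `index_hash`, the use of SHA-256, and the JSON settings are all assumptions.

```python
import hashlib
import json


def index_hash(index):
    """Hash a JSON-serializable index independently of key order.

    Assumed approach: canonical JSON (sorted keys, fixed separators)
    followed by SHA-256. The real script may normalize differently.
    """
    canonical = json.dumps(index, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the keys are sorted before hashing, two runs that build the same index in a different insertion order produce the same digest.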

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Index building refactoring `scripts/build_component_index.py` | Removed hash history loading, merging, and prior index loading machinery; packaging dependency guard removed. Build now deterministically normalizes and hashes index fresh each run, without preserving prior state. New public constant `COMPONENT_INDEX_PATH` designates output location. |
| New hash history builder `scripts/build_hash_history.py` | New script providing version-aware hash history management with functions for version retrieval, JSON loading/saving, component import, and history updates. Supports both stable and nightly release tracks with version conflict detection. |
| Test updates `src/backend/tests/unit/test_build_hash_history.py` | New test suite covering `update_history` versioning scenarios and `main` function behavior with mocked helpers. |
| Test cleanup `src/backend/tests/unit/test_component_index_hash_history.py` | Removed 300+ lines of tests covering old hash history merging and index loading machinery. |
| Telemetry schema `src/backend/base/langflow/services/telemetry/schema.py` | Removed `component_id` field from `ComponentInputsPayload` example. |
| Hash history asset `src/lfx/src/lfx/_assets/stable_hash_history.json` | New static JSON registry mapping 2000+ components to version and hash metadata for stable releases. |
| Component ID attribute `src/lfx/src/lfx/custom/custom_component/custom_component.py` | Added `component_id: str \| None = None` attribute documenting unique static identifier. |
| Metadata propagation `src/lfx/src/lfx/custom/utils.py` | Extended `build_component_metadata` to propagate `component_id` into frontend node metadata when available. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 4

❌ Failed checks (1 error, 3 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Coverage For New Implementations | ❌ Error | Tests call update_history with 5 arguments but function signature takes 4; tests expect UUID-keyed history with 'name' field, but implementation uses component names without creating 'name' field for new entries; test validates component_id uniqueness contradicting PR summary stating component_id was removed. | Update test to match implementation: correct function calls to 4 arguments, use component names as keys, remove 'name' field assertions, and validate component name uniqueness instead of component_id. Fix implementation to extract component_id from metadata and include 'name' field in new entries. |
| Test Quality And Coverage | ⚠️ Warning | Tests call update_history() with 5 arguments but implementation signature accepts only 4; data structure mismatch on keying and 'name' field. | Fix test calls to match actual signature: update_history(history, component_name, code_hash, version); update assertions to check history[component_name]['versions'] structure. |
| Test File Naming And Structure | ⚠️ Warning | Test file has critical structural mismatches with actual implementation regarding function parameters and data structure keys. | Remove component_id parameter from update_history() calls; update assertions to use component_name as keys instead of component_id; remove test_all_real_component_ids_are_unique(). |
| Excessive Mock Usage Warning | ⚠️ Warning | Test file exhibits excessive mocking of internal logic functions (_import_components, load_hash_history, save_hash_history) in integration tests, obscuring actual behavior verification and causing test assertions to mismatch implementation details. | Remove mocks of internal logic functions from integration tests; keep only external dependency mocks; split into unit tests (pure logic without mocks) and integration tests (real internal calls); add tests verifying complete real flow. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main objective of the PR: introducing a component hash history tracking system across Langflow versions, which is the primary focus across multiple files and the stated PR objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

src/lfx/src/lfx/custom/utils.py

coderabbitai

Actionable comments posted: 6

🤖 Fix all issues with AI agents

In `@scripts/build_hash_history.py`:
- Around line 62-66: The new-component branch that creates
history[component_name] currently only sets "versions"; update the block inside
the if component_name not in history check to also set a "name" key with the
human-readable component_name so new entries match existing
stable_hash_history.json shape (i.e., ensure history[component_name]["name"] =
component_name alongside history[component_name]["versions"] = {version_key:
code_hash}); locate and modify the creation logic referencing history,
component_name, version_key, and code_hash.
- Around line 108-120: The loop is using comp_name as the key when calling
update_history, but the history uses component UUIDs; change the call in the
loop to pass comp_details["metadata"]["component_id"] (instead of comp_name)
along with code_hash and current_version to update_history, and update the
update_history function signature and its docstring to indicate that it expects
a component UUID (component_id) as the unique identifier rather than a
component name, so existing lookups and writes use the UUID keys.

In `@src/backend/tests/unit/test_build_hash_history.py`:
- Around line 111-121: The test test_all_real_component_ids_are_unique is still
asserting uniqueness of metadata["component_id"] while the code now uses
comp_name (the dict key) as the identifier; either remove this test if
component_id is no longer used, or update it to collect comp_name keys from the
modules_dict returned by _import_components() (iterate modules_dict.items() and
collect the per-module dict keys) and assert the total list length equals the
set length to ensure comp_name uniqueness across categories; reference the
functions _import_components and main() and the variable comp_name when making
the change.
- Around line 79-109: The test_main_function assertions expect saved_history to
be keyed by component_id with a "name" field, but main() actually keys history
by component name and stores versions under that key; update the assertions in
test_main_function to check saved_history uses component names as keys (e.g.,
"MyComponent", "AnotherComponent", "ThirdComponent") and verify
saved_history["MyComponent"]["versions"]["0.1.0"] == "hash_v1" (and similarly
for the others) instead of checking for component_id keys and a separate "name"
field; also remove or reduce unnecessary patches (for example don't patch Path
if you can use tmp_path directly) so the test exercises more real behavior and
only mock true external dependencies like _import_components, load_hash_history,
save_hash_history, and get_lfx_version.
- Around line 43-76: The test must be updated to match the actual update_history
signature and storage key: call update_history with 4 args (history,
component_name, code_hash, current_version) instead of 5, replace all uses of
component_id as the lookup key with component_name (e.g. assert
history[component_name]["versions"]["0.3.0"] == code_hash_v1), remove assertions
expecting a history[...]["name"] field (implementation does not store it), and
ensure the ValueError check still uses the same version semantics but calls
update_history(history, component_name, code_hash_v1, "0.4.0") so checks target
the component_name-based history structure.
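
The name-uniqueness check suggested above could be sketched as follows. The shape of `modules_dict` (category -> {comp_name: details}) is inferred from the comment, and `duplicate_component_names` is a hypothetical helper rather than something in the test file.

```python
from collections import Counter


def duplicate_component_names(modules_dict):
    """Return component names that appear under more than one category.

    Assumes modules_dict maps a category name to a dict keyed by comp_name.
    """
    names = [
        comp_name
        for components in modules_dict.values()
        for comp_name in components
    ]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

A test could then assert that `duplicate_component_names(modules_dict)` returns an empty list, which also reports which names collide when it fails.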

In `@src/lfx/src/lfx/custom/utils.py`:
- Around line 525-528: The hasattr(custom_component, "component_id") check is
ineffective because component_id is a class attribute and will always be
present; change the condition to verify the value is non-empty (e.g., if
custom_component.component_id is not None and custom_component.component_id !=
"") before assigning frontend_node.metadata["component_id"], and replace
logger.error with logger.warning to reflect that a missing component_id may be
expected for user-defined components; keep the same context variables
(custom_component, component_id, frontend_node.metadata, ctype_name, logger)
when implementing this change.
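
To illustrate why the hasattr check cannot distinguish the two cases, here is a small self-contained sketch. The stub classes and `propagate_component_id` are hypothetical stand-ins for the real `CustomComponent` and the metadata-building code; only the attribute name and the "warn instead of error" idea come from the comment above.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class CustomComponentStub:
    """Stand-in for the real class; only the class attribute matters here."""

    component_id: Optional[str] = None


class CoreComponentStub(CustomComponentStub):
    component_id = "hypothetical-id-123"


def propagate_component_id(component, metadata, ctype_name="Unknown"):
    """Copy component_id into metadata only when it is a non-empty value."""
    component_id = getattr(component, "component_id", None)
    if component_id:  # truthiness rules out both None and ""
        metadata["component_id"] = component_id
    else:
        # A missing id may be expected for user-defined components.
        logger.warning("No component_id set for %s", ctype_name)
    return metadata
```

With these stubs, `hasattr(CustomComponentStub(), "component_id")` is true even though the value is None, so only the value check separates components that define an id from those that do not.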

🧹 Nitpick comments (3)

scripts/build_hash_history.py (2)
12-16: Variable shadowing: version shadows the imported function.

The variable version on line 16 shadows the version function imported on line 14, which could cause confusion.
Proposed fix
 def get_lfx_version():
     """Get the installed lfx version."""
     from importlib.metadata import version
 
-    return version("lfx")
+    lfx_version = version("lfx")
+    return lfx_version
8-9: Consider using Path for consistency with build_component_index.py.

The history file paths are defined as strings, while build_component_index.py uses Path objects for COMPONENT_INDEX_PATH. Using Path consistently would improve maintainability.
Proposed fix
-STABLE_HISTORY_FILE = "src/lfx/src/lfx/_assets/stable_hash_history.json"
-NIGHTLY_HISTORY_FILE = "src/lfx/src/lfx/_assets/nightly_hash_history.json"
+STABLE_HISTORY_FILE = Path(__file__).parent.parent / "src" / "lfx" / "src" / "lfx" / "_assets" / "stable_hash_history.json"
+NIGHTLY_HISTORY_FILE = Path(__file__).parent.parent / "src" / "lfx" / "src" / "lfx" / "_assets" / "nightly_hash_history.json"
This would also eliminate the need for Path() wrappers on lines 106 and 122.
src/backend/tests/unit/test_build_hash_history.py (1)

6-10: Consider avoiding sys.path manipulation for imports.

Modifying sys.path at runtime is fragile and can break if the directory structure changes. Consider one of the following:

- Making scripts/ a proper installable package with a pyproject.toml or setup.py
- Using a conftest.py fixture to handle the path setup
- Adding the scripts directory to PYTHONPATH in the test configuration

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a673cf and 9cec63c.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

scripts/build_component_index.py
scripts/build_hash_history.py
src/backend/base/langflow/services/telemetry/schema.py
src/backend/tests/unit/test_build_hash_history.py
src/backend/tests/unit/test_component_index_hash_history.py
src/lfx/src/lfx/_assets/component_index.json
src/lfx/src/lfx/_assets/stable_hash_history.json
src/lfx/src/lfx/custom/custom_component/custom_component.py
src/lfx/src/lfx/custom/utils.py

💤 Files with no reviewable changes (2)

src/backend/tests/unit/test_component_index_hash_history.py
src/backend/base/langflow/services/telemetry/schema.py

🧰 Additional context used

📓 Path-based instructions (3)

src/backend/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

src/backend/**/*.py: Use FastAPI async patterns with await for async operations in component execution methods
Use asyncio.create_task() for background tasks and implement proper cleanup with try/except for asyncio.CancelledError
Use queue.put_nowait() for non-blocking queue operations and asyncio.wait_for() with timeouts for controlled get operations

Files:

src/backend/tests/unit/test_build_hash_history.py

src/backend/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)

src/backend/tests/**/*.py: Place backend unit tests in src/backend/tests/ directory, component tests in src/backend/tests/unit/components/ organized by component subdirectory, and integration tests accessible via make integration_tests
Use same filename as component with appropriate test prefix/suffix (e.g., my_component.py → test_my_component.py)
Use the client fixture (FastAPI Test Client) defined in src/backend/tests/conftest.py for API tests; it provides an async httpx.AsyncClient with automatic in-memory SQLite database and mocked environment variables. Skip client creation by marking test with @pytest.mark.noclient
Inherit from the correct ComponentTestBase family class located in src/backend/tests/base.py based on API access needs: ComponentTestBase (no API), ComponentTestBaseWithClient (needs API), or ComponentTestBaseWithoutClient (pure logic). Provide three required fixtures: component_class, default_kwargs, and file_names_mapping
Create comprehensive unit tests for all new backend components. If unit tests are incomplete, create a corresponding Markdown file documenting manual testing steps and expected outcomes
Test both sync and async code paths, mock external dependencies appropriately, test error handling and edge cases, validate input/output behavior, and test component initialization and configuration
Use @pytest.mark.asyncio decorator for async component tests and ensure async methods are properly awaited
Test background tasks using asyncio.create_task() and verify completion with asyncio.wait_for() with appropriate timeout constraints
Test queue operations using non-blocking queue.put_nowait() and asyncio.wait_for(queue.get(), timeout=...) to verify queue processing without blocking
Use @pytest.mark.no_blockbuster marker to skip the blockbuster plugin in specific tests
For database tests that may fail in batch runs, run them sequentially using uv run pytest src/backend/tests/unit/test_database.py r...

Files:

src/backend/tests/unit/test_build_hash_history.py

**/test_*.py

📄 CodeRabbit inference engine (Custom checks)

**/test_*.py: Review test files for excessive use of mocks that may indicate poor test design - check if tests have too many mock objects that obscure what's actually being tested
Warn when mocks are used instead of testing real behavior and interactions, and suggest using real objects or test doubles when mocks become excessive
Ensure mocks are used appropriately for external dependencies only, not for core logic
Backend test files should follow the naming convention test_*.py with proper pytest structure
Test files should have descriptive test function names that explain what is being tested
Tests should be organized logically with proper setup and teardown
Consider including edge cases and error conditions for comprehensive test coverage
Verify tests cover both positive and negative scenarios where appropriate
For async functions in backend tests, ensure proper async testing patterns are used with pytest
For API endpoints, verify both success and error response testing

Files:

src/backend/tests/unit/test_build_hash_history.py

🧠 Learnings (10)

📚 Learning: 2025-06-26T19:43:18.260Z

Learnt from: ogabrielluiz
Repo: langflow-ai/langflow PR: 0
File: :0-0
Timestamp: 2025-06-26T19:43:18.260Z
Learning: In langflow custom components, the `module_name` parameter is now propagated through template building functions to add module metadata and code hashes to frontend nodes for better component tracking and debugging.

Applied to files:

src/lfx/src/lfx/custom/utils.py