feat: Support the new ModelInput in Astra DB#11203
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThe changes refactor embedding model configuration across AstraDB components from generic HandleInput to dedicated ModelInput with dynamic option refresh capabilities. A new parameter Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Pre-merge checks and finishing touchesImportant Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
❌ Your project status has failed because the head coverage (42.37%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #11203 +/- ##
=======================================
Coverage 37.45% 37.45%
=======================================
Files 1616 1616
Lines 79041 79041
Branches 11945 11945
=======================================
Hits 29606 29606
- Misses 47777 47778 +1
+ Partials 1658 1657 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
…i/langflow into feat-astra-modelinput
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json (1)
3729-3898: Alignembedding_modeltemplate with newModelInputusageThe embedded
AstraDBVectorStoreComponentPython now correctly definesembedding_modelas aModelInput(model_type="embedding",input_types=["Embeddings"], real‑time refresh and unified‑model wiring). However, the corresponding JSON template forembedding_modelin this node is still aHandleInput(type: "other",_input_type: "HandleInput"). That mismatch can prevent the unifiedModelInputUI/logic from working as intended in this starter flow and may confuse the frontend about the field’s shape.Please regenerate or adjust the
template.embedding_modelblock here to be a properModelInput(similar toLanguageModelComponent.template.model), includingtype: "model",model_type: "embedding",real_time_refresh, and the correct options structure, so the starter project actually surfaces the new embedding model selector.On the positive side, the component’s
icon = "AstraDB"is consistent with the shared icon naming guideline. Based on learnings, this matches the expected cross‑frontend/backend icon naming scheme.Also applies to: 4169-4187
♻️ Duplicate comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json (1)
4542-4711: Second AstraDB node duplicates the updated component definitionThis AstraDB node embeds the same updated
AstraDBVectorStoreComponentcode (includingModelInputforembedding_model) and retains the sameHandleInput‑basedtemplate.embedding_modeldefinition as the first node. The same alignment concern and suggested fix from the earlier comment apply here as well.Also applies to: 4982-5000
🧹 Nitpick comments (4)
src/lfx/src/lfx/base/models/unified_models.py (1)
1039-1039: Consider defensive check for missing field in build_config.If
model_field_namedoesn't exist inbuild_config, this will raise aKeyError. While callers are expected to provide valid build configs, a defensive check or clearer error message could improve debuggability.🔎 Optional defensive check
+ if model_field_name not in build_config: + msg = f"Field '{model_field_name}' not found in build_config" + raise KeyError(msg) + # Use cached results cached = component.cache.get(cache_key, {"options": []}) build_config[model_field_name]["options"] = cached["options"]src/backend/tests/unit/test_unified_models.py (1)
83-125: Good test coverage for custom field name scenario.The test properly validates:
- Options are populated into the custom
"embedding_model"field- Default value is set to the first option
- Option structure is preserved
Consider adding a test for the
input_typesvisibility logic (whenfield_value="connect_other_models").🔎 Optional: Add test for connection mode visibility
def test_update_model_options_connect_other_models_mode(): """Test that input_types is set correctly when connecting other models.""" mock_component = MagicMock() mock_component.user_id = "test-user-789" mock_component.cache = {} mock_component.log = MagicMock() build_config = { "embedding_model": { "options": [], "value": "", "input_types": [], } } def mock_get_options(user_id): return [{"name": "text-embedding-ada-002", "provider": "OpenAI"}] result = update_model_options_in_build_config( component=mock_component, build_config=build_config, cache_key_prefix="embedding_model_options", get_options_func=mock_get_options, field_name="embedding_model", field_value="connect_other_models", model_field_name="embedding_model", ) # Verify input_types is set for Embeddings connection assert result["embedding_model"]["input_types"] == ["Embeddings"]src/lfx/src/lfx/components/datastax/astradb_vectorstore.py (1)
148-148: Clarify the conditional field_value logic.The expression
field_value if field_name == "embedding_model" else Nonepasses the field value only when the embedding model field changes. This is correct for determining visibility mode, but the logic could be clearer with a comment explaining whyNoneis passed for other field changes.🔎 Optional: Add clarifying comment
field_name=field_name, + # Pass field_value only for embedding_model changes to enable connection mode detection + # For api_key changes, we just refresh options without changing visibility field_value=field_value if field_name == "embedding_model" else None, model_field_name="embedding_model",src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json (1)
3897-3898:update_build_configand hybrid search wiring look coherent; consider minor robustness tweakThe new
AstraDBVectorStoreComponentimplementation:
- Uses
ModelInputforembedding_modeland correctly routes option loading throughupdate_model_options_in_build_config+get_embedding_model_optionswithmodel_field_name="embedding_model".- Sensibly limits option refresh to
field_name in (None, "embedding_model", "api_key"), avoiding unnecessary calls.- Dynamically hides
embedding_modelwhen a non–“Bring your own” collection‑level embedding provider is selected, which matches the intended Vectorize vs BYO behavior.- Configures hybrid search / reranker options via
_configure_search_options, and safely degrades when_detect_hybrid_capabilitiesraises.One small robustness improvement: in
_configure_search_options, the line that doesindex = build_config["collection_name"]["options"].index( build_config["collection_name"]["value"] )assumes the current
valueis always present inoptions. If the options list changes asynchronously (e.g., DB-side changes) and the value becomes stale, this will raiseValueErrorand break the UI refresh. Guarding with a membership check (or catchingValueErrorand logging/early‑returning) would make this code more resilient without changing behavior in the common case.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.jsonsrc/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.jsonsrc/backend/tests/unit/test_unified_models.pysrc/lfx/src/lfx/base/models/unified_models.pysrc/lfx/src/lfx/components/datastax/astradb_vectorstore.py
🧰 Additional context used
📓 Path-based instructions (3)
src/backend/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)
src/backend/**/*.py: Use FastAPI async patterns withawaitfor async operations in component execution methods
Useasyncio.create_task()for background tasks and implement proper cleanup with try/except forasyncio.CancelledError
Usequeue.put_nowait()for non-blocking queue operations andasyncio.wait_for()with timeouts for controlled get operations
Files:
src/backend/tests/unit/test_unified_models.py
src/backend/tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)
src/backend/tests/**/*.py: Place backend unit tests insrc/backend/tests/directory, component tests insrc/backend/tests/unit/components/organized by component subdirectory, and integration tests accessible viamake integration_tests
Use same filename as component with appropriate test prefix/suffix (e.g.,my_component.py→test_my_component.py)
Use theclientfixture (FastAPI Test Client) defined insrc/backend/tests/conftest.pyfor API tests; it provides an asynchttpx.AsyncClientwith automatic in-memory SQLite database and mocked environment variables. Skip client creation by marking test with@pytest.mark.noclient
Inherit from the correctComponentTestBasefamily class located insrc/backend/tests/base.pybased on API access needs:ComponentTestBase(no API),ComponentTestBaseWithClient(needs API), orComponentTestBaseWithoutClient(pure logic). Provide three required fixtures:component_class,default_kwargs, andfile_names_mapping
Create comprehensive unit tests for all new backend components. If unit tests are incomplete, create a corresponding Markdown file documenting manual testing steps and expected outcomes
Test both sync and async code paths, mock external dependencies appropriately, test error handling and edge cases, validate input/output behavior, and test component initialization and configuration
Use@pytest.mark.asynciodecorator for async component tests and ensure async methods are properly awaited
Test background tasks usingasyncio.create_task()and verify completion withasyncio.wait_for()with appropriate timeout constraints
Test queue operations using non-blockingqueue.put_nowait()andasyncio.wait_for(queue.get(), timeout=...)to verify queue processing without blocking
Use@pytest.mark.no_blockbustermarker to skip the blockbuster plugin in specific tests
For database tests that may fail in batch runs, run them sequentially usinguv run pytest src/backend/tests/unit/test_database.pyr...
Files:
src/backend/tests/unit/test_unified_models.py
**/test_*.py
📄 CodeRabbit inference engine (Custom checks)
**/test_*.py: Review test files for excessive use of mocks that may indicate poor test design - check if tests have too many mock objects that obscure what's actually being tested
Warn when mocks are used instead of testing real behavior and interactions, and suggest using real objects or test doubles when mocks become excessive
Ensure mocks are used appropriately for external dependencies only, not for core logic
Backend test files should follow the naming convention test_*.py with proper pytest structure
Test files should have descriptive test function names that explain what is being tested
Tests should be organized logically with proper setup and teardown
Consider including edge cases and error conditions for comprehensive test coverage
Verify tests cover both positive and negative scenarios where appropriate
For async functions in backend tests, ensure proper async testing patterns are used with pytest
For API endpoints, verify both success and error response testing
Files:
src/backend/tests/unit/test_unified_models.py
🧠 Learnings (8)
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Test component build config updates by calling `to_frontend_node()` to get the node template, then calling `update_build_config()` to apply configuration changes
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Use `pytest.mark.api_key_required` and `pytest.mark.no_blockbuster` markers for components that need external APIs; use `MockLanguageModel` from `tests.unit.mock_language_model` for testing without external API keys
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Test both sync and async code paths, mock external dependencies appropriately, test error handling and edge cases, validate input/output behavior, and test component initialization and configuration
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Inherit from the correct `ComponentTestBase` family class located in `src/backend/tests/base.py` based on API access needs: `ComponentTestBase` (no API), `ComponentTestBaseWithClient` (needs API), or `ComponentTestBaseWithoutClient` (pure logic). Provide three required fixtures: `component_class`, `default_kwargs`, and `file_names_mapping`
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Use `monkeypatch` fixture to mock internal functions for testing error handling scenarios; validate error status codes and error message content in responses
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-12-19T18:04:08.938Z
Learnt from: Jkavia
Repo: langflow-ai/langflow PR: 11111
File: src/backend/tests/unit/api/v2/test_workflow.py:10-11
Timestamp: 2025-12-19T18:04:08.938Z
Learning: In the langflow-ai/langflow repository, pytest-asyncio is configured with asyncio_mode = 'auto' in pyproject.toml. This means you do not need to decorate test functions or classes with pytest.mark.asyncio; async tests are auto-detected and run by pytest-asyncio. When reviewing tests, ensure they rely on this configuration (i.e., avoid unnecessary pytest.mark.asyncio decorators) and that tests living under any tests/ path (e.g., src/.../tests/**/*.py) follow this convention. If a test explicitly requires a different asyncio policy, document it and adjust the config accordingly.
Applied to files:
src/backend/tests/unit/test_unified_models.py
📚 Learning: 2025-11-24T19:46:09.104Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-11-24T19:46:09.104Z
Learning: Backend components should be structured with clear separation of concerns: agents, data processing, embeddings, input/output, models, text processing, prompts, tools, and vector stores
Applied to files:
src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json
📚 Learning: 2025-11-24T19:46:57.920Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-11-24T19:46:57.920Z
Learning: Use clear, recognizable icon names (e.g., `"AstraDB"`, `"Postgres"`, `"OpenAI"`). Always use the same icon name for the same service across backend and frontend.
Applied to files:
src/backend/base/langflow/initial_setup/starter_projects/Vector Store RAG.json
🧬 Code graph analysis (2)
src/backend/tests/unit/test_unified_models.py (1)
src/lfx/src/lfx/base/models/unified_models.py (2)
get_unified_models_detailed(114-221)update_model_options_in_build_config(972-1125)
src/lfx/src/lfx/components/datastax/astradb_vectorstore.py (2)
src/lfx/src/lfx/base/models/unified_models.py (2)
get_embedding_model_options(545-738)update_model_options_in_build_config(972-1125)src/lfx/src/lfx/inputs/inputs.py (2)
DropdownInput(585-611)ModelInput(127-237)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
- GitHub Check: Lint Backend / Run Mypy (3.13)
- GitHub Check: Lint Backend / Run Mypy (3.11)
- GitHub Check: Lint Backend / Run Mypy (3.12)
- GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
- GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
- GitHub Check: Test Starter Templates
- GitHub Check: test-starter-projects
- GitHub Check: Update Component Index
- GitHub Check: Optimize new Python code in this PR
🔇 Additional comments (9)
src/lfx/src/lfx/base/models/unified_models.py (3)
979-980: LGTM! Backward-compatible API extension.The new
model_field_nameparameter with default"model"enables flexible field naming while maintaining backward compatibility with existing callers.
1016-1022: LGTM! Refresh logic correctly uses dynamic field name.The
should_refreshcondition now properly comparesfield_name == model_field_nameinstead of hardcoded"model", ensuring refresh triggers correctly for custom field names like"embedding_model".
1110-1123: LGTM! Visibility logic properly parameterized.The
input_typesassignment and default value setting now correctly target the dynamicmodel_field_namefield, enabling the Astra DB component to use"embedding_model"while other components continue using"model".src/backend/tests/unit/test_unified_models.py (1)
128-162: LGTM! Default field name behavior validated.Test confirms backward compatibility when
model_field_nameis not specified, verifying the default"model"field is used correctly.src/lfx/src/lfx/components/datastax/astradb_vectorstore.py (3)
5-5: LGTM! Correct imports for embedding model options integration.The imports align with the unified models API for embedding model option handling.
139-150: LGTM! Proper integration with update_model_options_in_build_config.The conditional refresh logic correctly limits expensive option fetching to relevant scenarios:
- Initial load (
field_name is None)- Embedding model field changes
- API key changes (may affect available providers)
The
model_field_name="embedding_model"correctly targets the custom field name.
35-44: The parametermodel_typeis valid and properly defined. TheModelInputclass inherits fromModelInputMixin, which explicitly definesmodel_type: str | None = "language"with documentation stating it accepts'language'or'embedding'as valid values. The code usage is correct.Likely an incorrect or invalid review comment.
src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json (2)
1181-1181: LGTM!The
code_hashupdate is expected when the underlying component code changes to support the newModelInputtype for embedding model selection.
1631-1649: Verify template configuration matches the new ModelInput type.The embedded component code (line 1348) defines
embedding_modelas aModelInputwithmodel_type="embedding", but the template configuration here still shows:
"_input_type": "HandleInput""type": "other"This may cause inconsistencies when the starter project is loaded. The template should likely reflect:
"_input_type": "ModelInput""type": "model"or equivalent"model_type": "embedding""real_time_refresh": truePlease verify if the template is dynamically regenerated at runtime from the component code, or if this needs manual alignment.
* feat: Support the new ModelInput in Astra DB * [autofix.ci] apply automated fixes * Update templates * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * [autofix.ci] apply automated fixes (attempt 3/3) * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) * Update starter templates * Update unified_models.py * [autofix.ci] apply automated fixes * Update component_index.json * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
This pull requests adds support for the ModelInput in the Embedding Model selection, so the Unified Models feature can be directly used within the component.
Summary by CodeRabbit
Release Notes
New Features
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.