feat: Add not contains filter operator in DataFrame Operations Component #9415
Cristhianzl merged 13 commits into main
…where "not contains" filter option was missing, causing incorrect filtering behavior.
Walkthrough
Added a new "not contains" filter operator to DataFrameOperationsComponent and implemented its logic in filter_rows_by_value by negating the string contains check. No other operators or public signatures were changed.
Sequence Diagram(s)
sequenceDiagram
participant User
participant Component as DataFrameOperationsComponent
participant Pandas as pandas.DataFrame
User->>Component: Select operator ("not contains") and value
Component->>Pandas: filter_rows_by_value(col.astype(str).str.contains(value, na=False))
Note right of Component: Negate mask for "not contains"
Component->>Pandas: Apply ~mask to DataFrame
Pandas-->>Component: Filtered DataFrame
Component-->>User: Return result
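The flow in the diagram can be sketched in plain pandas. This is a minimal illustration rather than the component's actual code: the function name `filter_rows` and the sample data are hypothetical, and only the "contains" and "not contains" branches are shown.

```python
import pandas as pd


def filter_rows(df: pd.DataFrame, column_name: str, operator: str, filter_value: str) -> pd.DataFrame:
    """Hypothetical stand-in for filter_rows_by_value, covering two operators."""
    column = df[column_name]
    # Cast to str so numeric and mixed-type columns can be searched as text.
    contains_mask = column.astype(str).str.contains(str(filter_value), na=False)
    if operator == "contains":
        mask = contains_mask
    elif operator == "not contains":
        # "not contains" is the element-wise negation of the "contains" mask.
        mask = ~contains_mask
    else:
        raise ValueError(f"Unsupported operator: {operator}")
    return df[mask]


df = pd.DataFrame({"name": ["apple pie", "banana", "pineapple"]})
kept = filter_rows(df, "name", "not contains", "apple")  # only "banana" remains
```

Because the two masks are exact complements, every row lands in exactly one of the "contains" and "not contains" results.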
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/backend/base/langflow/components/processing/dataframe_operations.py (2)
257-258: Use regex=False to avoid unintended regex behavior and improve robustness
The implementation is correct and consistent with "contains". However, pandas treats the pattern as a regex by default. This can:
- Surprise users when filter_value has special regex chars (e.g., . * + ?)
- Raise errors for invalid patterns (e.g., unmatched brackets)
- Open the door to performance pitfalls with pathological patterns
Recommend using a literal substring match with regex=False. Apply to "not contains" (below) and, for consistency, also to the existing "contains" case above.
Diff for this block:
- elif operator == "not contains":
-     mask = ~column.astype(str).str.contains(str(filter_value), na=False)
+ elif operator == "not contains":
+     mask = ~column.astype(str).str.contains(str(filter_value), na=False, regex=False)
Also update the "contains" branch for consistency (outside the selected lines):
elif operator == "contains":
    mask = column.astype(str).str.contains(str(filter_value), na=False, regex=False)
If existing users rely on regex semantics, consider an advanced toggle (e.g., filter_regex: bool) instead of changing defaults.
I can add a small test matrix covering contains/not contains with literals, regex metacharacters, and NaNs. Want me to draft it?
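The difference between the default regex mode and regex=False can be seen on a small series. This is only a sketch with made-up sample values; it exercises pandas' `Series.str.contains` directly and does not touch the component.

```python
import re

import pandas as pd

values = pd.Series(["a.b", "axb", "a[b"])

# Default regex=True: "." matches any single character, so all three values match.
regex_mask = values.str.contains("a.b", na=False)

# regex=False: plain substring match, so only the literal "a.b" matches.
literal_mask = values.str.contains("a.b", na=False, regex=False)

# An unterminated character set is an invalid pattern in regex mode...
try:
    values.str.contains("a[b", na=False)
    invalid_pattern_raised = False
except re.error:
    invalid_pattern_raised = True

# ...but it is an ordinary substring when regex=False.
bracket_mask = values.str.contains("a[b", na=False, regex=False)
```

The same contrast applies to the negated mask used by "not contains", since it is built from the same call.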
312-312: Gracefully handle empty column_name in Drop Duplicates
If column_name is missing/empty, this will raise a KeyError. Defaulting subset=None uses all columns, which is a safer fallback.
- return DataFrame(df.drop_duplicates(subset=self.column_name))
+ subset = self.column_name if getattr(self, "column_name", None) else None
+ return DataFrame(df.drop_duplicates(subset=subset))
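The suggested fallback can be tried outside the component with a small helper. The name `drop_duplicates_safe` and the sample frame are hypothetical; the sketch assumes only that an empty string is not a real column name.

```python
from typing import Optional

import pandas as pd


def drop_duplicates_safe(df: pd.DataFrame, column_name: Optional[str]) -> pd.DataFrame:
    """Hypothetical helper mirroring the suggested fallback."""
    # An empty or missing column name falls back to subset=None,
    # which compares entire rows instead of looking up a nonexistent column.
    subset = column_name if column_name else None
    return df.drop_duplicates(subset=subset)


df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "y", "y"]})
by_column = drop_duplicates_safe(df, "a")    # keeps the first row for each value of "a"
all_columns = drop_duplicates_safe(df, "")   # no fully duplicated rows, so nothing is dropped
```

Whether silently falling back to all columns is preferable to surfacing an error is a design choice; the fallback trades an explicit failure for a result the user may not have intended.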
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/backend/base/langflow/components/processing/dataframe_operations.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
src/backend/base/langflow/components/**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/backend_development.mdc)
src/backend/base/langflow/components/**/*.py: Add new backend components to the appropriate subdirectory under src/backend/base/langflow/components/
Implement async component methods using async def and await for asynchronous operations
Use asyncio.create_task for background work in async components and ensure proper cleanup on cancellation
Use asyncio.Queue for non-blocking queue operations in async components and handle timeouts appropriately
Files:
src/backend/base/langflow/components/processing/dataframe_operations.py
{src/backend/**/*.py,tests/**/*.py,Makefile}
📄 CodeRabbit Inference Engine (.cursor/rules/backend_development.mdc)
{src/backend/**/*.py,tests/**/*.py,Makefile}: Run make format_backend to format Python code before linting or committing changes
Run make lint to perform linting checks on backend Python code
Files:
src/backend/base/langflow/components/processing/dataframe_operations.py
src/backend/**/components/**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/icons.mdc)
In your Python component class, set the icon attribute to a string matching the frontend icon mapping exactly (case-sensitive).
Files:
src/backend/base/langflow/components/processing/dataframe_operations.py
🔇 Additional comments (1)
src/backend/base/langflow/components/processing/dataframe_operations.py (1)
82-82: LGTM: "not contains" option added to the dropdown matches backend operator string
The UI option string exactly matches the backend check ("not contains"), so the wiring is correct.
* fix: Avoid namespace collision for Astra
* [autofix.ci] apply automated fixes
* Update Vector Store RAG.json
* [autofix.ci] apply automated fixes
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>

fix: revert to stable composio version

* fix: Knowledge base component refactor
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Update styleUtils.ts
* Update ingestion.py
* [autofix.ci] apply automated fixes
* Fix ingestion of df
* [autofix.ci] apply automated fixes
* Update Knowledge Ingestion.json
* Fix one failing test
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes
* Revert composio versions for CI
* Revert "Revert composio versions for CI" This reverts commit 9bcb694.
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Edwin Jose <edwin.jose@datastax.com>
Co-authored-by: Carlos Coelho <80289056+carlosrcoelho@users.noreply.github.com>

fix .env load on windows script
Co-authored-by: Ítalo Johnny <italojohnnydosanjos@gmail.com>

…ent (#9564)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
edwinjosechittilappilly left a comment
LGTM
@jordanrfrazier we might need to add tests for this PR later. Can we add them in a follow-up PR after the release?
Codecov Report
❌ Patch coverage is
❌ Your project status has failed because the head coverage (5.81%) is below the target coverage (10.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #9415 +/- ##
==========================================
- Coverage 34.69% 34.59% -0.11%
==========================================
Files 1209 1209
Lines 57115 57115
Branches 5419 5419
==========================================
- Hits 19818 19757 -61
- Misses 37153 37214 +61
Partials 144 144
…rate limiting requests to avoid false test failures
This pull request adds support for a new "not contains" filter operator in the DataFrameOperationsComponent, allowing users to filter rows where a column does not contain a specified value. The change updates both the UI options and the filtering logic.
Enhancements to DataFrame filtering:
- Added the "not contains" option to the filter_operator dropdown options in the DataFrameOperationsComponent, expanding the available filter types for users.
- Implemented the "not contains" logic in the filter_rows_by_value method, enabling filtering of rows where the column value does not contain the specified filter value.