feat: add tenacity retry in opensearch #10917
Conversation
Important: Review skipped. Auto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.

Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Refactored embedding generation in the OpenSearch multimodal component to replace the concurrent ThreadPoolExecutor with a rate-limit-aware, multi-tier retry mechanism using tenacity. Introduces sequential processing for IBM/watsonx models with inter-request delays and parallel processing for others, with distinct retry policies for rate-limit versus generic errors. Existing ingestion, indexing, and mapping logic remains unchanged.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches

Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 warning, 3 inconclusive)
✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Pull request overview
This PR refactors the embedding generation logic in the OpenSearch multimodal component to improve reliability and handle rate limits more effectively. The changes replace manual retry/threading logic with the tenacity library for robust retry behavior and implement model-specific concurrency strategies.
Key Changes:
- Introduced tenacity-based retry decorators with separate strategies for rate limit errors (5 attempts, exponential backoff 2-30s) and other errors (3 attempts, exponential backoff 1-8s)
- Implemented sequential embedding with 0.6s delays for IBM/Watsonx models and parallel embedding for other models
- Enhanced error logging with detailed retry information and failure tracking
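As a rough, stdlib-only sketch of the two-tier retry policy described in the key changes (the PR itself uses tenacity decorators; `embed_fn`, `is_rate_limit_error`, and the backoff arithmetic below are illustrative stand-ins, not the PR's code):

```python
import time


def is_rate_limit_error(exc: Exception) -> bool:
    # Stand-in for the PR's detection: match HTTP 429 / "rate limit" markers.
    msg = str(exc).lower()
    return "429" in msg or "rate limit" in msg


def embed_with_retry(embed_fn, text, sleep=time.sleep):
    """Two-tier retry: up to 5 attempts with 2-30s exponential backoff for
    rate-limit errors, up to 3 attempts with 1-8s backoff for other errors."""
    rate_limit_attempts = 0
    other_attempts = 0
    while True:
        try:
            return embed_fn(text)
        except Exception as exc:
            if is_rate_limit_error(exc):
                rate_limit_attempts += 1
                if rate_limit_attempts >= 5:
                    raise
                # 2, 4, 8, 16 seconds, capped at 30
                delay = min(2 * 2 ** (rate_limit_attempts - 1), 30)
            else:
                other_attempts += 1
                if other_attempts >= 3:
                    raise
                # 1, 2 seconds, capped at 8
                delay = min(2 ** (other_attempts - 1), 8)
            sleep(delay)
```

The injectable `sleep` makes the pacing testable without real delays; tenacity's `wait_exponential`/`stop_after_attempt` play the equivalent role in the actual change.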
```python
        if is_ibm:
            # Sequential processing with inter-request delay for IBM models
            inter_request_delay = 0.6  # ~1.67 req/s, safely under 2 req/s limit
```
The comment mentions a '2 req/s limit' for IBM models, but this constraint is not documented elsewhere in the code or PR description. Consider adding a reference to the IBM/Watsonx API documentation or rate limit specification to help future maintainers understand the basis for this value.
Suggested change:

```diff
-            inter_request_delay = 0.6  # ~1.67 req/s, safely under 2 req/s limit
+            inter_request_delay = 0.6  # ~1.67 req/s, safely under 2 req/s limit (see IBM/Watsonx rate limits: https://cloud.ibm.com/docs/watsonx?topic=watsonx-llm-api-reference#rate-limits)
```
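Sketched as a stand-alone loop, the pacing behind this delay looks roughly like the following (a hypothetical illustration; `embed_chunk` is a placeholder for the real embedding call, and the 2 req/s IBM limit is the assumption the comment states, not a verified figure):

```python
import time

INTER_REQUEST_DELAY = 0.6  # ~1.67 req/s, below the assumed 2 req/s limit


def embed_sequentially(texts, embed_chunk, sleep=time.sleep):
    """Embed chunks one at a time, sleeping between requests
    so the request rate stays under the provider's limit."""
    vectors = []
    for i, text in enumerate(texts):
        vectors.append(embed_chunk(text))
        if i < len(texts) - 1:  # no need to sleep after the last request
            sleep(INTER_REQUEST_DELAY)
    return vectors
```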
| """Check if exception is retryable but not a rate limit error.""" | ||
| # Retry on most exceptions except for specific non-retryable ones | ||
| # Add other non-retryable exceptions here if needed | ||
| return not is_rate_limit_error(exception) | ||
|
|
The function is_other_retryable_error returns True for all non-rate-limit exceptions, including those that should not be retried (e.g., authentication errors, validation errors, or permanent failures). This could lead to unnecessary retry attempts on non-recoverable errors. Consider explicitly checking for retryable error types or patterns, such as timeout errors or temporary service unavailability (5xx status codes), and returning False for known non-retryable errors.
| """Check if exception is retryable but not a rate limit error.""" | |
| # Retry on most exceptions except for specific non-retryable ones | |
| # Add other non-retryable exceptions here if needed | |
| return not is_rate_limit_error(exception) | |
| """Check if exception is retryable but not a rate limit error. | |
| Retry only on transient errors (timeouts, connection errors, 5xx except 429). | |
| Do not retry on authentication, validation, or other permanent errors (4xx except 429). | |
| """ | |
| # If it's a rate limit error, handled separately | |
| if is_rate_limit_error(exception): | |
| return False | |
| # Check for OpenSearch RequestError with 4xx status codes (except 429) | |
| if isinstance(exception, RequestError): | |
| status_code = getattr(exception, "status_code", None) | |
| if status_code is not None: | |
| # 400, 401, 403, 404, etc. are not retryable | |
| if status_code in {400, 401, 403, 404, 422}: | |
| return False | |
| # 429 is handled above, 5xx are retryable | |
| if 500 <= status_code < 600: | |
| return True | |
| # If status_code is not set, fall back to message | |
| error_str = str(exception).lower() | |
| if any(code in error_str for code in ["400", "401", "403", "404", "422"]): | |
| return False | |
| if any(code in error_str for code in ["500", "502", "503", "504"]): | |
| return True | |
| # Check for common transient error types | |
| if isinstance(exception, (TimeoutError, ConnectionError)): | |
| return True | |
| # Check for 5xx in exception message | |
| error_str = str(exception).lower() | |
| if any(code in error_str for code in ["500", "502", "503", "504"]): | |
| return True | |
| # Authentication, permission, validation errors (not retryable) | |
| if any(term in error_str for term in ["authentication", "unauthorized", "forbidden", "permission", "invalid", "validation"]): | |
| return False | |
| # Default: do not retry | |
| return False |
```python
        # For IBM models, use sequential processing with rate limiting
        # For other models, use parallel processing
        vectors: list[list[float]] = [None] * len(texts)
```
The vectors list is initialized with None values, but the code doesn't validate that all None values are replaced with actual embeddings before proceeding. If any chunk fails to embed (despite retries) and raises an exception that's caught elsewhere, the resulting list could contain None values. Consider adding validation after the embedding loop to ensure all elements are populated, or handle the case where vectors[idx] might remain None.
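A defensive post-loop check along these lines could close the gap the comment describes (a hypothetical helper, not code from the PR; the names mirror the snippet above):

```python
def validate_embeddings(vectors, texts):
    """Raise if any chunk failed to produce an embedding, rather than
    silently passing None values downstream to the bulk indexer."""
    missing = [i for i, v in enumerate(vectors) if v is None]
    if missing:
        raise RuntimeError(
            f"Embedding failed for {len(missing)}/{len(texts)} chunks "
            f"(first missing indices: {missing[:10]})"
        )
```

Called right after the embedding loop, this turns a latent `None` in the index payload into an immediate, descriptive error.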
```python
def is_rate_limit_error(exception: Exception) -> bool:
    """Check if exception is a rate limit error (429)."""
```
The string-based error detection is fragile and may miss rate limit errors that use different formatting or phrasing. Consider checking for specific exception types (e.g., HTTPError with status code 429) or exception attributes instead of relying solely on string matching. This would make the error detection more reliable and maintainable.
| """Check if exception is a rate limit error (429).""" | |
| """Check if exception is a rate limit error (HTTP 429).""" | |
| # Check for OpenSearch RequestError with status_code 429 | |
| if isinstance(exception, RequestError): | |
| # Some RequestError instances have a status_code attribute | |
| status_code = getattr(exception, "status_code", None) | |
| if status_code == 429: | |
| return True | |
| # Fallback: string matching for other cases |
```python
        self.log(metadatas)
```
```diff
-        # Generate embeddings (threaded for concurrency) with retries
-        def embed_chunk(chunk_text: str) -> list[float]:
-            return selected_embedding.embed_documents([chunk_text])[0]
-
-        vectors: list[list[float]] | None = None
-        last_exception: Exception | None = None
-        delay = 1.0
-        attempts = 0
-        max_attempts = 3
-
-        while attempts < max_attempts:
-            attempts += 1
+        # Generate embeddings with rate-limit-aware retry logic using tenacity
+        from tenacity import (
+            retry,
+            retry_if_exception,
+            stop_after_attempt,
+            wait_exponential,
+        )
```
The tenacity import is placed within the method body rather than at the module level. This violates Python's PEP 8 style guide, which recommends placing imports at the top of the file. Move this import to the module-level imports section to improve code organization and reduce import overhead on repeated method calls.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/lfx/src/lfx/components/elastic/opensearch_multimodal.py (3)
868-910: Tighten tenacity retry predicates and verify API assumptions

The retry predicates are very broad right now:

- `is_other_retryable_error` returns `True` for any exception that isn't detected as a rate-limit error, so you'll retry on `ValueError`, `TypeError`, etc., which are usually permanent/logic errors rather than transient issues. That means 3 unnecessary attempts on many hard failures.
- The predicates also implicitly include non-application exceptions if they bubble up (e.g., `KeyboardInterrupt`/`SystemExit`), which you typically don't want to retry.

Consider constraining retries to known transient classes (network/HTTP/timeouts from the embedding provider), for example via `retry_if_exception_type` or a predicate that checks `isinstance(e, (ConnectionError, TimeoutError, OpenAIError, ...))` and excludes obvious programming errors. You can still keep a separate predicate for 429s.

Also, `before_sleep` assumes `retry_state.next_action.sleep` and `retry_state.outcome.exception()` are always present/valid for your tenacity version. Please double-check these attributes in the tenacity version used in this project and adjust or guard against `None` if needed.

Finally, you may want to import tenacity at module level rather than inside `_add_documents_to_vector_store` to avoid repeated local imports and to make dependency usage more discoverable.
911-927: Nested retry decorators work but are non-obvious; consider clarifying or simplifying

Stacking two `@retry` decorators like:

```python
@retry_on_rate_limit
@retry_on_other_errors
def _embed(...):
    ...
```

does achieve the intended behavior (rate-limit errors handled by the outer policy, all other exceptions by the inner one), but it's subtle and non-obvious to future readers.

Two lightweight options:

- Add a brief comment above `_embed` explaining how the two retry layers interact (outer handles 429/rate-limit with 5 attempts and long backoff; inner handles non-429 with 3 attempts and short backoff).
- Or wrap this into a small helper (e.g., `_embed_with_retry(text: str)`) with a single `try/except` that routes exceptions through the appropriate tenacity `Retrying` instance, making the control flow clearer.

The current implementation is logically sound; this is mostly about maintainability and reducing cognitive load for the next person reading this.
936-960: Concurrency block is reasonable; add minor robustness for vectors/max_workers

The concurrency logic for IBM vs non-IBM models looks good overall, but there are a couple of small robustness nits:

- `vectors: list[list[float]] = [None] * len(texts)` conflicts with the type hint (it's actually `list[None | list[float]]` until filled). If you run static typing, this will be flagged; you could initialize with a more accurate type (e.g., `vectors: list[list[float] | None] = [None] * len(texts)` and narrow later) or build `vectors` by appending in order.
- `max_workers = min(max(len(texts), 1), 8)` works, but reads a bit oddly. Something like `max_workers = max(1, min(len(texts), 8))` is equivalent and clearer about the invariant `1 <= max_workers <= 8`.
- Given you already short-circuit on `if not docs: return` earlier, `texts` should never be empty here; adding a quick `if not texts: return` just before initializing `vectors` would make this block safer against any future refactor that might desync `docs` and `texts`.

These are minor readability/defensiveness tweaks; the core concurrency behavior looks fine.
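The clamping equivalence that nitpick claims is easy to sanity-check (hypothetical helper names; both forms bound the worker count to 1..8):

```python
def max_workers_original(num_texts: int, cap: int = 8) -> int:
    # The PR's form: clamp from below first, then from above.
    return min(max(num_texts, 1), cap)


def max_workers_clearer(num_texts: int, cap: int = 8) -> int:
    # Suggested form: reads as "at least 1, at most cap, else one per chunk".
    return max(1, min(num_texts, cap))
```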
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/lfx/src/lfx/components/elastic/opensearch_multimodal.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/elastic/opensearch_multimodal.py (3)
src/backend/tests/unit/components/embeddings/test_embeddings_with_models.py (2)
- embed_documents (25-27)
- embed_documents (240-241)

src/lfx/src/lfx/base/embeddings/embeddings_class.py (1)
- embed_documents (36-45)

src/backend/tests/unit/components/vectorstores/test_opensearch_multimodal.py (1)
- embed_documents (34-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: CodeQL analysis (python)
- GitHub Check: Agent
- GitHub Check: Update Starter Projects
- GitHub Check: Update Component Index
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@ Coverage Diff @@
##             main   #10917      +/-   ##
==========================================
- Coverage   32.56%   32.55%   -0.01%
==========================================
  Files        1371     1371
  Lines       63493    63542      +49
  Branches     9383     9397      +14
==========================================
+ Hits        20675    20686      +11
- Misses      41778    41816      +38
  Partials     1040     1040
```
Flags with carried forward coverage won't be shown.
This pull request refactors the embedding generation logic in the `_add_documents_to_vector_store` method of `opensearch_multimodal.py` to improve reliability and efficiency, especially when handling rate limits and different embedding model providers. The main changes include switching to the `tenacity` library for robust, rate-limit-aware retries and optimizing concurrency based on the embedding model type.

Embedding and Retry Logic Improvements:

- Replaced the manual retry loop with `tenacity`-based decorators, providing separate retry strategies for rate limit errors (longer backoff, more attempts) and other retryable errors (shorter backoff, fewer attempts).

Concurrency and Model-Specific Handling:

- Sized the thread pool (`max_workers`) based on the number of text chunks and model type, improving performance while respecting provider constraints.

Summary by CodeRabbit
Release Notes