Fix: Add retry logic for HTTP 400 errors in OpenAI Compatible API #9189
base: main
Conversation
… in OpenAI Compatible API

- Implement automatic retry with progressive conversation history truncation when HTTP 400 errors occur
- Add retry logic to both streaming and non-streaming methods
- Truncate older messages while keeping at least the 10 most recent for context
- Add comprehensive test coverage for retry scenarios
- Fixes issue #9188, where the Qwen3-Coder-30B-A3B model was prone to HTTP 400 errors after multiple conversation rounds
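For orientation before the review comments below, here is a minimal sketch of the retry-with-truncation flow the PR describes. The names (`ChatMessage`, `sendRequest`, `requestWithRetry`) are illustrative stand-ins, not the PR's actual API; the ratio formula, the 3-retry cap, and the 10-message floor mirror the code excerpts quoted in the review.

```typescript
// Illustrative sketch only - names are hypothetical, not the PR's actual API.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string }

const MAX_RETRIES = 3
const MIN_KEPT_MESSAGES = 10

async function requestWithRetry(
	messages: ChatMessage[],
	sendRequest: (messages: ChatMessage[]) => Promise<string>,
	retryCount = 0,
): Promise<string> {
	let current = messages
	// On retries, drop a growing share of the oldest messages,
	// always keeping at least the MIN_KEPT_MESSAGES most recent.
	if (retryCount > 0 && current.length > MIN_KEPT_MESSAGES) {
		const ratio = Math.min(0.5 + retryCount * 0.1, 0.8)
		const drop = Math.min(Math.floor(current.length * ratio), current.length - MIN_KEPT_MESSAGES)
		current = current.slice(drop)
	}
	try {
		return await sendRequest(current)
	} catch (error) {
		if ((error as { status?: number })?.status === 400 && retryCount < MAX_RETRIES) {
			// Pass the truncated list forward so each retry builds on the last -
			// exactly the point raised in the first review comment below.
			return requestWithRetry(current, sendRequest, retryCount + 1)
		}
		throw error
	}
}
```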
Review complete. Found 2 issues that should be addressed:
Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.
```typescript
console.warn(
	`[${this.providerName}] Received HTTP 400 error, retrying with truncated conversation history (attempt ${retryCount + 1}/3)`,
)
return this.createStream(systemPrompt, messages, metadata, requestOptions, retryCount + 1)
```
The retry logic passes the original messages parameter instead of the truncated convertedMessages, causing the truncation logic to re-execute on the same original messages each retry. This means previous truncations are lost and the same truncation ratio is applied to the full original message list each time, rather than progressively truncating. This could lead to inefficient retries where the message count doesn't decrease as expected across multiple attempts.
Suggested change:
```diff
- return this.createStream(systemPrompt, messages, metadata, requestOptions, retryCount + 1)
+ return this.createStream(systemPrompt, convertedMessages, metadata, requestOptions, retryCount + 1)
```
```typescript
// If this is a retry and we have many messages, try truncating older conversation history
// Keep at least the last 10 messages to maintain context
if (retryCount > 0 && convertedMessages.length > 10) {
	const truncationRatio = Math.min(0.5 + retryCount * 0.1, 0.8) // Truncate 50%, 60%, 70%, up to 80%
```
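For concreteness: with `retryCount` = 1, 2, 3 this formula yields ratios of 0.6, 0.7, and 0.8. Since the branch only runs when `retryCount > 0`, the 50% case named in the inline comment never applies in practice.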
The truncation ratio is computed inline using magic numbers (0.5, 0.1, 0.8). Consider extracting these values as named constants for improved readability and easier tuning in the future.
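A minimal sketch of that suggestion, with hypothetical constant names (the values are taken from the excerpt above):

```typescript
// Hypothetical constant names; values come from the quoted code.
const BASE_TRUNCATION_RATIO = 0.5 // baseline share of history to drop
const TRUNCATION_RATIO_STEP = 0.1 // additional share dropped per retry
const MAX_TRUNCATION_RATIO = 0.8 // upper bound on how much history may be dropped

function getTruncationRatio(retryCount: number): number {
	return Math.min(BASE_TRUNCATION_RATIO + retryCount * TRUNCATION_RATIO_STEP, MAX_TRUNCATION_RATIO)
}
```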
```typescript
// Retry with truncated prompt if we haven't exceeded max retries
if (is400Error && retryCount < 3 && prompt.length > 1000) {
	const truncationRatio = Math.min(0.5 + retryCount * 0.1, 0.8)
	const truncatedPrompt = prompt.substring(0, Math.floor(prompt.length * (1 - truncationRatio)))
```
The truncation keeps the beginning of the prompt and discards the end, which loses the most recent and relevant context. For prompts containing conversation history or instructions at the end, this approach removes critical information needed for the model to respond appropriately. This is inconsistent with the createStream method which keeps the most recent messages. Consider keeping the end of the prompt instead to preserve recent context.
Suggested change:
```diff
- const truncatedPrompt = prompt.substring(0, Math.floor(prompt.length * (1 - truncationRatio)))
+ const truncatedPrompt = prompt.substring(Math.floor(prompt.length * truncationRatio))
```
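As a worked example: with a 2,000-character prompt and a truncation ratio of 0.6, the original code keeps characters 0-799 (the oldest text), while the suggested version keeps characters 1200-1999 (the most recent text).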
Description
This PR attempts to address Issue #9188, where the Qwen3-Coder-30B-A3B model used via the OpenAI Compatible API was experiencing HTTP 400 errors, particularly after multiple rounds of conversation.
Problem
Users were experiencing HTTP 400 errors when using models such as Qwen3-Coder-30B-A3B through the OpenAI Compatible API, particularly after several rounds of conversation had accumulated a long message history.
Solution
Implemented automatic retry logic with progressive conversation history truncation:

- On an HTTP 400 error, the request is retried up to 3 times
- Each retry truncates a larger share of the older conversation history, always keeping at least the 10 most recent messages for context
- The same logic applies to both streaming (`createStream()`) and non-streaming (`completePrompt()`) calls
Changes
- Updated `BaseOpenAiCompatibleProvider` to add retry logic in the `createStream()` method
- Added retry logic to the `completePrompt()` method for non-streaming calls

Testing
- Added `base-openai-compatible-provider-retry.spec.ts` with 7 test cases (an illustrative sketch of one such test follows)
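The spec file itself isn't reproduced on this page. As an illustration only, one of its retry scenarios might look like the vitest-style sketch below, written against the hypothetical `requestWithRetry` helper sketched earlier on this page (assumed to be in scope); it is not the PR's actual test code.

```typescript
import { describe, expect, it, vi } from "vitest"
// Assumes the requestWithRetry sketch from earlier on this page is in scope.

describe("HTTP 400 retry logic", () => {
	it("retries with a truncated history after a 400 error", async () => {
		const messages = Array.from({ length: 20 }, (_, i) => ({
			role: "user" as const,
			content: `message ${i}`,
		}))
		// Fail once with a 400-shaped error, then succeed.
		const sendRequest = vi
			.fn()
			.mockRejectedValueOnce(Object.assign(new Error("Bad Request"), { status: 400 }))
			.mockResolvedValueOnce("ok")

		await expect(requestWithRetry(messages, sendRequest)).resolves.toBe("ok")
		expect(sendRequest).toHaveBeenCalledTimes(2)
		// The retry should have been issued with fewer messages than the original call.
		expect(sendRequest.mock.calls[1][0].length).toBeLessThan(messages.length)
	})
})
```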
Impact
This fix should help users experiencing HTTP 400 errors with OpenAI Compatible APIs, especially when using models with limited context windows or when conversation histories grow long.
Feedback and guidance are welcome!
Important
Adds retry logic with progressive truncation for HTTP 400 errors in OpenAI Compatible API, with tests for streaming and non-streaming calls.
- Adds retry logic for HTTP 400 errors to `createStream()` and `completePrompt()` in `base-openai-compatible-provider.ts`.
- Adds `base-openai-compatible-provider-retry.spec.ts` with 7 test cases.
- Changes are contained in `base-openai-compatible-provider.ts`.

This description was created automatically for 669b812.