fix: omit max_tokens from OpenAI requests when None (Azure compatibility) #396
Open
thuantan2060 wants to merge 4 commits into NevaMind-AI:main
Conversation
Azure OpenAI rejects `max_tokens=null` in chat completion requests. Only include `max_tokens` in the request payload when it has a value.

Affected methods:
- `OpenAISDKClient.chat()`
- `OpenAISDKClient.summarize()`
- `OpenAISDKClient.vision()`
- `OpenAILLMBackend.build_summary_payload()`
- `OpenAILLMBackend.build_vision_payload()`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Builds linux-x86_64 wheel on every push to main and uploads as 'memu-wheel' artifact with 7-day retention for cross-repo consumption by memu-server Docker CI pipeline.
- actions/checkout v4 → v6
- astral-sh/setup-uv v7 → v8
- actions/upload-artifact v6 → v7
📝 Pull Request Summary
Fix Azure OpenAI compatibility by omitting `max_tokens` from API requests when the value is `None`. Azure OpenAI rejects `null` for this field, while standard OpenAI silently accepts it.

✅ What does this PR do?

- Previously, the code passed `max_tokens=None` directly to the OpenAI API, which serializes to `max_tokens: null` in the JSON payload. Changed from always including `max_tokens` in the request dict to only including it when the value is not `None`.
- Affected code: `src/memu/llm/openai_sdk.py` (`chat()`, `summarize()`, `vision()`) and `src/memu/llm/backends/openai.py` (`build_summary_payload()`, `build_vision_payload()`).
- No behavior change when `max_tokens` is explicitly set; only the `None`/default case is fixed.

🤔 Why is this change needed?

- Azure OpenAI returns `400 Bad Request` when `max_tokens: null` is included in chat completion requests:

  > Invalid type for 'max_tokens': expected an unsupported value, but got null instead.

- Standard OpenAI silently accepts `null` and uses its default, masking the issue. This affects users pointing the SDK at Azure endpoints (via an `OPENAI_BASE_URL` override), which is a common enterprise setup.
- The other backends (`doubao`, `openrouter`) and the `http_client` already handle this correctly with `if max_tokens is not None` guards; this PR aligns the OpenAI SDK client and OpenAI HTTP backend with that same pattern.
🔍 Type of Change
Please check what applies:
✅ PR Quality Checklist
📌 Optional
- When `max_tokens` is explicitly set to an integer: behavior unchanged (included in payload)
- When `max_tokens` is `None` (default): now omitted from payload instead of sending `null`
- Other backends (`doubao`, `openrouter`, `http_client`): already use this pattern, no changes needed
- The `lazyllm_client.py` backend may have the same issue but was not modified since it uses a different SDK interface; worth auditing separately
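The two behavioral cases above could be covered by a small regression test along these lines. This is a hedged sketch: the builder below is a stand-in that mirrors the patched logic rather than importing the real `memu` functions, and all names are illustrative:

```python
def build_summary_payload(model, messages, max_tokens=None):
    # Stand-in for the patched builder: include max_tokens only when set.
    payload = {"model": model, "messages": messages}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


def test_max_tokens_omitted_when_none():
    # Default case: the key must be absent, not present with null/None.
    payload = build_summary_payload("gpt-4o", [], max_tokens=None)
    assert "max_tokens" not in payload


def test_max_tokens_included_when_set():
    # Explicit case: behavior unchanged, the value passes through.
    payload = build_summary_payload("gpt-4o", [], max_tokens=256)
    assert payload["max_tokens"] == 256
```

Run with pytest, or call the two functions directly; either way the assertions pin down the omitted-vs-included distinction this PR introduces.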