Description
Describe the bug
When an agent response streamed through Agent.InvokeStreamingAsync includes a function tool call, the assistant entry that Semantic Kernel adds to chat history concatenates the pre-tool-call text with the post-tool-call reply. The persisted chat message therefore matches neither of the two responses emitted by the OpenAI API.
To Reproduce
Steps to reproduce the behavior:
- Configure an OpenAIAssistantAgent with a function tool (e.g., get_profiles) and call InvokeStreamingAsync against OpenAI (tested with gpt-5).
- Ask a question that causes the assistant to invoke the tool (for example: “what profiles do we have?”).
- Let the tool function return successfully and allow the streaming response to finish.
- Examine the assistant message that Semantic Kernel stores in chat history; it now contains the initial “I’m going to pull…” text merged with the final answer instead of keeping them separate.
Expected behavior
Each assistant message recorded in chat history should mirror the model responses. The pre-tool-call streaming chunk should appear once, and the final answer should stand on its own without the earlier text prefixed.
Platform
- Language: C#
- Source: NuGet package Microsoft.SemanticKernel.Agents.OpenAI 1.65.0-preview
- AI model: gpt-5
- IDE: VS Code
- OS: Linux container (Debian-based)
Additional context
OpenAI streams two distinct assistant messages:
{
"role": "assistant",
"content": "I’m going to pull the current list of profiles configured in this workspace so I can give you an accurate, up-to-date answer.",
"tool_calls": [...]
}
followed by
{
"role": "assistant",
"content": "<p>Here’s what we’ve got right now — ...</p>"
}
After InvokeStreamingAsync completes, Semantic Kernel produces this chat history entry for the second assistant message:
{
"role": "assistant",
"content": "I’m going to pull the current list of profiles configured in this workspace so I can give you an accurate, up-to-date answer.<p>Here’s what we’ve got right now — ...</p>"
}
The pre-tool-call sentence from the first assistant message should not be repeated at the start of the final assistant entry.
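To make the symptom concrete, here is a minimal, self-contained sketch of one way this kind of merge could arise: a single text buffer that accumulates streamed chunks and is not cleared at the tool-call boundary. This is only an illustration of the observed versus expected history, not Semantic Kernel's actual implementation. The event tuples, the shortened message text, and the BuildHistory helper are all hypothetical and use no Semantic Kernel types.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a streamed run (NOT a Semantic Kernel type):
// "text" events carry content chunks, "tool" marks the tool-call boundary.
var events = new (string Kind, string Text)[]
{
    ("text", "I'm going to pull "),
    ("text", "the list."),
    ("tool", "get_profiles"),
    ("text", "<p>Here's what "),
    ("text", "we've got.</p>"),
};

// Hypothetical helper: builds the assistant messages that would be persisted.
// resetAtToolCall = false models the merged output described in this issue;
// resetAtToolCall = true models the expected one-entry-per-message behavior.
List<string> BuildHistory(IEnumerable<(string Kind, string Text)> stream, bool resetAtToolCall)
{
    var history = new List<string>();
    var buffer = new StringBuilder();

    foreach (var (kind, text) in stream)
    {
        if (kind == "text")
        {
            buffer.Append(text);
            continue;
        }

        // Tool-call boundary: the first assistant message is complete.
        history.Add(buffer.ToString());
        if (resetAtToolCall)
        {
            buffer.Clear();
        }
    }

    // End of stream: persist whatever the buffer holds as the final message.
    history.Add(buffer.ToString());
    return history;
}

var merged = BuildHistory(events, resetAtToolCall: false);
var separate = BuildHistory(events, resetAtToolCall: true);

Console.WriteLine(merged[1]);   // I'm going to pull the list.<p>Here's what we've got.</p>
Console.WriteLine(separate[1]); // <p>Here's what we've got.</p>
```

With the reset in place, the second entry contains only the post-tool-call reply, which is the shape described under Expected behavior.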