Description
Self Checks
- I have read the Contributing Guide and Language Policy.
- This is only for bug reports; if you would like to ask a question, please head to Discussions.
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report, otherwise it will be closed.
- [Chinese & non-English users] Please submit in English, otherwise the issue will be closed :)
- Please do not modify this template :) and fill in all the required fields.
Dify version
1.9.1
Cloud or Self Hosted
Self Hosted (Source)
Steps to reproduce
Create a workflow with an HTTP Request node.
Configure the node to send a POST request with a binary body type (e.g., upload a 100 MB+ MP3 file; a sketch for generating a dummy file of that size follows these steps).
Trigger the workflow by uploading the large file via chat or API.
Observe the execution result.
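To reproduce without a real recording, a dummy file of roughly the same size is enough, since only the payload size matters. A minimal sketch (the file name and size are illustrative, and the random bytes are not a valid MP3):

```python
# Generate a ~141 MB dummy "MP3" for reproduction; the bytes are random, which is
# enough to exercise the binary-body logging path even though it is not a valid MP3.
import os

SIZE_MB = 141  # roughly matches the file size in this report
with open("test.mp3", "wb") as f:
    for _ in range(SIZE_MB):
        f.write(os.urandom(1024 * 1024))  # write in 1 MB chunks to keep memory flat
print(f"wrote test.mp3 ({SIZE_MB} MB)")
```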
📸 Screenshot attached: the workflow appears to complete and the UI shows "Task completed" (任务已完成), but the logs show a database storage error and the backend fails silently.
[Workflow Execution Log]
- HTTP Request node starts...
- File uploaded: test.mp3 (141.09 MB)
- Node status changes to "SUCCEEDED" visually
- But database throws: "storage limit exceeded" or "value too long for type jsonb"
✔️ Expected Behavior
The HTTP node should execute successfully and return the response.
Binary file content should NOT be stored in process_data.
Only metadata (file size, filename, transfer method) should be logged (see the sketch after this list).
Workflow should not fail due to database constraints.
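A hypothetical shape for such a metadata-only process_data entry (the field names are illustrative, not Dify's actual schema):

```python
# Illustrative only: a log-safe record of a binary request body, with no raw bytes.
binary_body_log_entry = {
    "type": "binary",
    "filename": "test.mp3",
    "size_bytes": 141 * 1024 * 1024,
    "mime_type": "audio/mpeg",
    "transfer_method": "local_file",  # or "remote_url", depending on how the file was supplied
}
```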
❌ Actual Behavior
When uploading large binary files (>50 MB), the HTTP node appears to succeed in the UI ("Task completed"), but:
The actual database insert fails due to an oversized JSONB field.
No clear error is shown to user — workflow continues silently.
Subsequent nodes may receive incomplete or corrupted context.
System performance degrades due to massive log entries.
💡 Root Cause: In core/workflow/nodes/http_request/executor.py, the to_log() method decodes binary content into a UTF-8 string using .decode("utf-8", errors="replace"), which can inflate a 100 MB binary payload into a text string several times that size (gigabyte-scale once JSON-escaped) before storing it in process_data.
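A minimal sketch of a possible guard (not the actual Dify code; the function name, threshold, and content-type check are assumptions) that would keep binary payloads out of process_data and log only metadata:

```python
# Hypothetical helper for the HTTP node's log path: summarize binary or oversized
# bodies by metadata instead of decoding them with .decode("utf-8", errors="replace").
MAX_LOGGED_BODY_BYTES = 8 * 1024  # assumed cap on how much body text to keep in logs


def body_for_log(content: bytes, content_type: str) -> dict:
    """Return a log-safe representation of an HTTP body."""
    looks_textual = (
        content_type.startswith("text/") or "json" in content_type or "xml" in content_type
    )
    if not looks_textual or len(content) > MAX_LOGGED_BODY_BYTES:
        # Large or binary payload: record metadata only, never the raw bytes.
        return {"omitted": True, "size_bytes": len(content), "content_type": content_type}
    # Small textual payload: safe to decode and store verbatim.
    return {"omitted": False, "body": content.decode("utf-8", errors="replace")}
```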
Additional Context
This issue is critical for production use cases involving:
Audio/video transcription
Document processing (PDFs, images)
File uploads via API or chat
The current behavior leads to:
Database bloat
Silent failures
Security risk (storing raw binary data in logs)