Releases: espressif/esp-webrtc-solution

ESP WebRTC Solution Release v1.2

21 Nov 11:29

Overview

ESP WebRTC Solution v1.2 is a significant update that brings enhanced stability, new features, expanded hardware support, and an improved developer experience. This release includes major component updates and critical bug fixes.

What's New

1️⃣ Peer Connection Enhancements

esp_peer updated to v1.2.7

New Features:

  • ✔️ Support for multiple data channels
  • ✔️ Support for Forward-TSN (Transmission Sequence Number)
  • ✔️ Support for ESP32-C5 microcontroller
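
To illustrate what Forward-TSN support involves on the sending side, here is a minimal sketch of the ack-point advance described in RFC 3758. It is not taken from esp_peer: the type `outstanding_chunk_t` and the function `forward_tsn_advance` are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one outstanding DATA chunk (not an esp_peer type). */
typedef struct {
    uint32_t tsn;        /* Transmission Sequence Number of the chunk */
    bool     abandoned;  /* true once the sender gives up retransmitting it */
} outstanding_chunk_t;

/* Serial-number comparison so that TSN wrap-around at 2^32 is handled. */
static bool tsn_gt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/*
 * Given the peer's cumulative ack point and the outstanding chunks sorted by
 * ascending TSN, return the advanced ack point: the highest TSN reachable by
 * stepping only over consecutive abandoned chunks. If the result differs from
 * cum_ack, the sender would report it to the peer in a FORWARD-TSN chunk.
 */
uint32_t forward_tsn_advance(uint32_t cum_ack,
                             const outstanding_chunk_t *chunks, size_t count)
{
    uint32_t advanced = cum_ack;
    for (size_t i = 0; i < count; i++) {
        if (!tsn_gt(chunks[i].tsn, advanced)) {
            continue;  /* already covered by the ack point */
        }
        if (chunks[i].tsn != advanced + 1 || !chunks[i].abandoned) {
            break;     /* gap, or a chunk the sender still wants delivered */
        }
        advanced = chunks[i].tsn;
    }
    return advanced;
}
```

With outstanding TSNs 11 and 12 abandoned and 13 still pending, a cumulative ack of 10 advances to 12, so the receiver can stop waiting for the two abandoned chunks.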

Bug Fixes:

  • ✔️ Fixed stability issues under poor network conditions
  • ✔️ Fixed potential crash caused by race conditions
  • ✔️ Fixed DTLS role mismatch issue
  • ✔️ Added msid attribute in SDP for better stream identification
  • ✔️ Fixed TURN relay connectivity issues

2️⃣ New Capture System – GMF-based esp_capture

The legacy capture system has been replaced with a GMF (Generic Media Framework) implementation, providing significant extensibility improvements:

  • Support for multiple capture paths
  • Automatic capability negotiation to simplify configuration
  • Enhanced pipeline flexibility for multimedia processing
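
The idea of capability negotiation can be sketched as follows. This is an illustrative assumption about what such a step might look like, not the algorithm esp_capture uses: `video_mode_t` and `negotiate_video_mode` are hypothetical names, and the selection policy (smallest source mode that covers the request, otherwise the largest available) is chosen only for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one video mode (not an esp_capture type). */
typedef struct {
    uint16_t width;
    uint16_t height;
    uint8_t  fps;
} video_mode_t;

static uint32_t mode_area(const video_mode_t *m)
{
    return (uint32_t)m->width * m->height;
}

/* A source mode covers a request if it is at least as large and as fast. */
static bool mode_covers(const video_mode_t *m, const video_mode_t *want)
{
    return m->width >= want->width && m->height >= want->height &&
           m->fps >= want->fps;
}

/*
 * Return the index of the source mode to use for the requested sink mode:
 * the smallest-area mode that covers the request, or the largest-area mode
 * when nothing covers it. Returns -1 if the source offers no modes.
 */
int negotiate_video_mode(const video_mode_t *modes, size_t count,
                         const video_mode_t *want)
{
    int best_cover = -1;
    int largest = -1;
    for (size_t i = 0; i < count; i++) {
        if (largest < 0 || mode_area(&modes[i]) > mode_area(&modes[largest])) {
            largest = (int)i;
        }
        if (mode_covers(&modes[i], want) &&
            (best_cover < 0 ||
             mode_area(&modes[i]) < mode_area(&modes[best_cover]))) {
            best_cover = (int)i;
        }
    }
    return best_cover >= 0 ? best_cover : largest;
}
```

The point of negotiating inside the framework is that application code states only what each sink wants, and the source configuration follows from that.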

3️⃣ Published Components to ESP-IDF Registry

Now available as independent, reusable components.

4️⃣ Solution Updates

New Solution:

  • Added kms_demo for Kurento Media Server Publisher

Improvements:

  • Added AI processing (Pedestrian Detection) to doorbell_local demo
  • Fixed OpenAI demo function call build error

5️⃣ Other Features and Improvements

  • HTTP client refined: supports redirect and OPTIONS requests
  • Added SEI injection support via video send hook – thanks to Todd Sharp
  • Fixed WHIP signaling issue when ICE server not configured
  • Added bitrate setting control for esp_webrtc
  • Fixed crash issue when resetting while renderer is created
  • Fixed data queue read/write dead-loop issue
  • Fixed codec board pin configuration issue
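
For readers planning to use SEI injection, the sketch below shows one way to build an H.264 SEI NAL unit carrying a user_data_unregistered payload (payload type 5), which a send hook could then place ahead of a frame's slice NAL units. It follows the H.264 NAL and SEI syntax but is not taken from this repository: `nal_writer_t` and `h264_build_sei_user_data` are hypothetical names, and the hook registration API itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte writer that tracks state for emulation prevention. */
typedef struct {
    uint8_t *buf;
    size_t   cap;
    size_t   len;
    int      zeros;     /* consecutive 0x00 bytes just written */
    int      overflow;  /* set when the output buffer is too small */
} nal_writer_t;

static void put_raw(nal_writer_t *w, uint8_t b)
{
    if (w->len < w->cap) {
        w->buf[w->len++] = b;
    } else {
        w->overflow = 1;
    }
}

/* Write one RBSP byte, inserting 0x03 after 00 00 when the next byte is <= 3. */
static void put_rbsp(nal_writer_t *w, uint8_t b)
{
    if (w->zeros >= 2 && b <= 0x03) {
        put_raw(w, 0x03);
        w->zeros = 0;
    }
    put_raw(w, b);
    w->zeros = (b == 0x00) ? w->zeros + 1 : 0;
}

/*
 * Build an SEI NAL unit (without start code or length prefix).
 * Returns the number of bytes written, or 0 if out_cap is too small.
 */
size_t h264_build_sei_user_data(const uint8_t uuid[16],
                                const uint8_t *data, size_t data_len,
                                uint8_t *out, size_t out_cap)
{
    nal_writer_t w = { out, out_cap, 0, 0, 0 };
    size_t payload_size = 16 + data_len;

    put_raw(&w, 0x06);               /* NAL header: nal_unit_type 6 (SEI) */
    put_rbsp(&w, 0x05);              /* payload type 5: user_data_unregistered */
    while (payload_size >= 255) {    /* size is coded as 0xFF bytes + remainder */
        put_rbsp(&w, 0xFF);
        payload_size -= 255;
    }
    put_rbsp(&w, (uint8_t)payload_size);
    for (size_t i = 0; i < 16; i++) {
        put_rbsp(&w, uuid[i]);
    }
    for (size_t i = 0; i < data_len; i++) {
        put_rbsp(&w, data[i]);
    }
    put_rbsp(&w, 0x80);              /* rbsp_trailing_bits */
    return w.overflow ? 0 : w.len;
}
```

The emulation-prevention step matters because arbitrary user data may contain byte patterns that a decoder would otherwise mistake for a start code.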

🆕 Newly supported boards:

  • ESP32P4-EYE
  • XIAO ESP32S3 Sense

Migration Guide — Upgrading to v1.2.0

This release introduces the GMF-based capture system, which replaces the old esp_capture APIs. In most projects only media_sys.c needs updating: replace the affected APIs and configurations as listed in the table below.

For the detailed changes, please refer to commit 0ad48d.

Capture System API Changes

| Legacy API (Simple Capture) | New API (GMF Capture) | Notes |
|---|---|---|
| esp_capture_audio_codec_src_cfg_t | esp_capture_audio_dev_src_cfg_t | Renamed |
| esp_capture_new_audio_codec_src | esp_capture_new_audio_dev_src | Renamed |
| ESP_CAPTURE_CODEC_TYPE_* | ESP_CAPTURE_FMT_ID_* | Renamed |
| esp_capture_path_handle_t | esp_capture_sink_handle_t | Renamed |
| esp_capture_setup_path | esp_capture_sink_setup | Renamed |
| esp_capture_enable_path | esp_capture_sink_enable | Renamed |
| esp_capture_acquire_path_frame | esp_capture_sink_acquire_frame | Renamed |
| esp_capture_release_path_frame | esp_capture_sink_release_frame | Renamed |
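
If you have many call sites to update, the renames above can be encoded as data and applied by a small helper, for example inside a host-side migration script. The sketch below is only an illustration: `rename_t` and `capture_migrate_name` are hypothetical names, the identifiers in the table come from this release note, and the sketch assumes that a "Renamed" entry means the name alone changed. Check each call against the new headers before relying on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One rename rule; is_prefix marks the wildcard ("*") row of the table. */
typedef struct {
    const char *legacy;
    const char *gmf;
    bool        is_prefix;
} rename_t;

static const rename_t k_renames[] = {
    { "esp_capture_audio_codec_src_cfg_t", "esp_capture_audio_dev_src_cfg_t", false },
    { "esp_capture_new_audio_codec_src",   "esp_capture_new_audio_dev_src",   false },
    { "ESP_CAPTURE_CODEC_TYPE_",           "ESP_CAPTURE_FMT_ID_",             true  },
    { "esp_capture_path_handle_t",         "esp_capture_sink_handle_t",       false },
    { "esp_capture_setup_path",            "esp_capture_sink_setup",          false },
    { "esp_capture_enable_path",           "esp_capture_sink_enable",         false },
    { "esp_capture_acquire_path_frame",    "esp_capture_sink_acquire_frame",  false },
    { "esp_capture_release_path_frame",    "esp_capture_sink_release_frame",  false },
};

/*
 * Write the v1.2 name for an identifier into out.
 * Returns 1 if it was renamed, 0 if it was copied unchanged,
 * and -1 if out_cap is too small.
 */
int capture_migrate_name(const char *name, char *out, size_t out_cap)
{
    for (size_t i = 0; i < sizeof(k_renames) / sizeof(k_renames[0]); i++) {
        const rename_t *r = &k_renames[i];
        size_t legacy_len = strlen(r->legacy);
        const char *suffix = NULL;

        if (r->is_prefix) {
            if (strncmp(name, r->legacy, legacy_len) == 0) {
                suffix = name + legacy_len;  /* keep whatever follows the prefix */
            }
        } else if (strcmp(name, r->legacy) == 0) {
            suffix = "";
        }
        if (suffix != NULL) {
            int n = snprintf(out, out_cap, "%s%s", r->gmf, suffix);
            return (n >= 0 && (size_t)n < out_cap) ? 1 : -1;
        }
    }
    int n = snprintf(out, out_cap, "%s", name);
    return (n >= 0 && (size_t)n < out_cap) ? 0 : -1;
}
```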

Obtaining v1.2.0

Users can obtain the release code using either of the following methods:

Method 1: Using Git (Recommended)

git clone -b v1.2.0 https://github.com/espressif/esp-webrtc-solution.git esp-webrtc-solution-v1.2.0
cd esp-webrtc-solution-v1.2.0/

This is the recommended method for obtaining v1.2.0 of ESP WebRTC Solution.

Method 2: Download Archive

Alternatively, you can download the release archive directly from GitHub:
esp-webrtc-solution-v1.2.0.zip

Support

For issues and feature requests, please use the GitHub issue tracker.

Contributors

We thank all contributors who helped improve this release.

ESP WebRTC Solution Release v1.0

14 May 11:26

Overview

ESP WebRTC Solution v1.0 is the first stable release of Espressif’s WebRTC implementation designed specifically for lightweight embedded devices. This version delivers a comprehensive protocol stack for building real-time communication applications on ESP32 series chips, supporting audio/video streaming, data channel communication, and customizable signaling mechanisms.

🚀 Highlights

  • High-level esp_webrtc API for easy WebRTC application development
  • Support for peer-to-peer media and data communication via RTP and SCTP
  • TURN support, NACK handling, SCTP SACK support
  • Flexible signaling abstraction with built-in support for AppRTC, WHIP, OpenAI Realtime, and local HTTP SSE
  • Audio/video capture and rendering modules with codec abstraction
  • Out-of-the-box support for key audio/video codecs: H.264, MJPEG, OPUS, G.711, AAC
  • Demo projects for doorbell, OpenAI chatbot, WHIP publishing, and more
  • Fewer dependencies (only depends on libSRTP; all other modules are included in ESP-IDF)
  • Lightweight and low memory consumption — designed specifically for embedded devices

1. Core WebRTC Components

1.1 High-level esp_webrtc API

The esp_webrtc API internally manages PeerConnection state and signaling flow, making it easy to build WebRTC applications. In most cases, users only need to adapt it to their custom board and signaling — everything else is handled by esp_webrtc.

1.2 WebRTC Peer Connection (esp_peer)

esp_peer abstracts WebRTC PeerConnection logic on ESP32 devices and includes a default implementation (peer_default) derived from libpeer, with the following features:

  • Full TURN support (RFC5766 and RFC8656)
  • Optimized connection speed with multiple ICE candidates
  • Supports both Controlling and Controlled roles
  • RTP NACK support for retransmission
  • SCTP SACK support for large data transmission
  • Separate tasks for sending/receiving to avoid blocking
  • Codec support:
    • Video: H.264 (baseline), MJPEG (data channel only)
    • Audio: G.711 (PCMA/PCMU), OPUS
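
As background for the RTP NACK item above, the following sketch packs a list of lost RTP sequence numbers into the PID/BLP pairs used by the generic NACK feedback message defined in RFC 4585. It is not esp_peer code: `nack_fci_t` and `nack_pack` are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One generic NACK entry: a packet ID plus a bitmask of following losses. */
typedef struct {
    uint16_t pid;  /* sequence number of the first lost packet */
    uint16_t blp;  /* bit i set means packet pid + i + 1 is also lost */
} nack_fci_t;

/*
 * Pack lost sequence numbers (in ascending RTP order) into NACK entries.
 * Returns the number of entries written, at most out_cap.
 */
size_t nack_pack(const uint16_t *lost, size_t lost_count,
                 nack_fci_t *out, size_t out_cap)
{
    size_t n = 0;
    size_t i = 0;
    while (i < lost_count && n < out_cap) {
        nack_fci_t fci = { lost[i], 0 };
        i++;
        while (i < lost_count) {
            /* 16-bit subtraction handles wrap-around from 65535 to 0. */
            uint16_t delta = (uint16_t)(lost[i] - fci.pid);
            if (delta == 0) {
                i++;       /* duplicate report of the same packet */
                continue;
            }
            if (delta > 16) {
                break;     /* too far away: needs its own entry */
            }
            fci.blp |= (uint16_t)(1u << (delta - 1));
            i++;
        }
        out[n++] = fci;
    }
    return n;
}
```

Each entry covers up to 17 consecutive sequence numbers, so a short burst of losses usually fits in a single 4-byte feedback item.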

1.3 WebRTC Signaling (esp_peer_signaling)

Signaling is used to discover peers and exchange SDP and control commands. esp_peer_signaling abstracts the signaling logic, so custom signaling can be integrated without modifying the core WebRTC stack. Built-in implementations:

  • esp_signaling_get_apprtc_impl: AppRTC signaling via WebSocket
  • esp_signaling_get_whip_impl: WHIP protocol for publishing to WebRTC servers
  • esp_signaling_get_openai_signaling: OpenAI Realtime API for chatbot integration
  • esp_signaling_get_http_impl: Local signaling via HTTP SSE for testing
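
The general shape of such a signaling abstraction can be sketched as a table of function pointers that each implementation fills in. Everything below is a hypothetical illustration and does not reproduce the real esp_peer_signaling types or signatures: `sig_impl_t`, `sig_msg_t`, `sig_get_loopback_impl`, and `count_msgs` are names made up for this example, and the loopback implementation simply hands every sent message back to the registered callback.

```c
#include <stddef.h>

/* Hypothetical message kinds exchanged over signaling. */
typedef enum {
    SIG_MSG_SDP_OFFER,
    SIG_MSG_SDP_ANSWER,
    SIG_MSG_CANDIDATE,
    SIG_MSG_BYE
} sig_msg_type_t;

typedef struct {
    sig_msg_type_t type;
    const char    *data;
    size_t         size;
} sig_msg_t;

typedef void (*sig_on_msg_cb)(const sig_msg_t *msg, void *ctx);

/* Hypothetical interface that every signaling implementation provides. */
typedef struct {
    int (*start)(void *impl, sig_on_msg_cb on_msg, void *ctx);
    int (*send_msg)(void *impl, const sig_msg_t *msg);
    int (*stop)(void *impl);
} sig_impl_t;

/* Loopback implementation state: remembers where to deliver messages. */
typedef struct {
    sig_on_msg_cb on_msg;
    void         *ctx;
} loopback_sig_t;

static int loopback_start(void *impl, sig_on_msg_cb on_msg, void *ctx)
{
    loopback_sig_t *s = (loopback_sig_t *)impl;
    s->on_msg = on_msg;
    s->ctx = ctx;
    return 0;
}

static int loopback_send(void *impl, const sig_msg_t *msg)
{
    loopback_sig_t *s = (loopback_sig_t *)impl;
    if (s->on_msg == NULL) {
        return -1;             /* not started, or already stopped */
    }
    s->on_msg(msg, s->ctx);    /* deliver straight back to the caller */
    return 0;
}

static int loopback_stop(void *impl)
{
    loopback_sig_t *s = (loopback_sig_t *)impl;
    s->on_msg = NULL;
    return 0;
}

const sig_impl_t *sig_get_loopback_impl(void)
{
    static const sig_impl_t impl = { loopback_start, loopback_send, loopback_stop };
    return &impl;
}

/* Example callback: counts received messages in the int that ctx points to. */
void count_msgs(const sig_msg_t *msg, void *ctx)
{
    (void)msg;
    (*(int *)ctx)++;
}
```

Because the core stack only sees the function table, swapping AppRTC for WHIP or a custom protocol means selecting a different getter rather than changing peer connection code.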

2. Media Provider

2.1 Media Capture (esp_capture)

Integrated audio/video capture framework supporting various devices and codecs:

  • Video: H.264 (baseline), MJPEG
  • Audio: G.711 (PCMA/PCMU), AAC, OPUS

Capture device abstraction allows plug-and-play development:

  • esp_capture_new_audio_codec_src: I2S audio (via esp_codec_dev)
  • esp_capture_new_audio_aec_src: I2S audio with AEC
  • esp_capture_new_video_v4l2_src: V4L2 (MIPI CSI/DVP, ESP32-P4 only)
  • esp_capture_new_video_dvp_src: DVP camera (ESP32-S3 and others)

2.2 Media Player (av_render)

A lightweight media player supporting a "push" model for playback:

  • Video: H.264 (baseline), MJPEG
  • Audio: G.711 (PCMA/PCMU), AAC, OPUS

Rendering device abstraction:

  • av_render_alloc_i2s_render: Audio playback via I2S
  • av_render_alloc_lcd_render: Video output via esp_lcd

3. Board Configuration (codec_board)

Provides default configuration to simplify adapting audio boards for quick testing and verification.


4. Demo Solutions

  • Peer Demo: Peer-to-peer demo between two ESP32 devices
    • Audio and data channel communication
  • OpenAI Demo: Real-time chatbot using OpenAI Realtime API
    • Audio capture, AEC, and AI response
    • Supports function calls for device control
    • Supports OPUS for better audio quality
    • Supports changing voice type
  • Doorbell Demo: Smart video doorbell solution
    • AppRTC signaling, video/audio stream, two-way talk, MAC-based auto connect
  • Doorbell Local: Local version of Doorbell Demo
    • Uses HTTP SSE signaling for LAN tests
  • Video Call Demo: Full-featured two-way audio/video calling
    • MJPEG transmission over data channel
  • WHIP Demo: Publish media to WHIP-compatible servers
    • Uses WHIP signaling and media provider APIs

5. Compatibility

  • Supported Chips: ESP32, ESP32-S2, ESP32-S3, ESP32-P4
  • ESP-IDF Version: v5.4 or later recommended
  • Requirements:
    • PSRAM for video/audio processing
    • Compatible camera/audio drivers depending on your board

6. Obtaining v1.0.0

Users can obtain the release code using either of the following two methods.

Using Git

git clone -b v1.0.0 https://github.com/espressif/esp-webrtc-solution.git esp-webrtc-solution-v1.0.0
cd esp-webrtc-solution-v1.0.0/

This is the recommended way of obtaining v1.0.0 of ESP WebRTC Solution.

Download an Archive

Attached to the release is an esp-webrtc-solution-v1.0.0.zip archive.
You can also download it directly from GitHub:
esp-webrtc-solution-v1.0.0.zip


7. Getting Started

After obtaining the v1.0.0 code, try the Peer Demo or Doorbell Local for initial testing by following the README.


8. Contributing / Feedback

Your feedback is highly appreciated and helps us improve the solution!
Feel free to open an issue or submit a pull request on GitHub.