Commit 3788f1b

Merge pull request #522 from espressif/blog/H264_use_tips
Sync Merge: blog/H264_use_tips
2 parents 56460fc + 4a2347c commit 3788f1b

6 files changed

+321
-0
lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
---
title: Hou Haiyan
---
Lines changed: 310 additions & 0 deletions
@@ -0,0 +1,310 @@
---
title: ESP H.264 Practical Usage Guide
date: 2025-07-18
showAuthor: false
authors:
  - hou-haiyan
tags:
  - Multimedia
  - H.264
  - Performance Tuning
  - ESP32-P4
  - ESP32-S3
summary: "This article introduces Espressif's esp_h264 component, a lightweight H.264 codec optimized for embedded devices. It shows how to leverage hardware acceleration, implement efficient video processing, and optimize performance for various applications."
---

## Overview

### What is ESP H.264?

Espressif has recently released the `esp_h264` component for ESP32-series microcontrollers. It combines hardware acceleration, dynamic scheduling, and lightweight algorithms to balance video codec performance against power consumption.

![esp_h264](./esp-h264.webp)

### Key Features

- **Hardware Acceleration**: Leverages ESP32-P4 for hardware encoding and high-speed decoding, with single-instruction, multiple-data ([SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data)) acceleration on ESP32-S3 for enhanced efficiency
- **Memory Optimization**: Implements advanced algorithms to minimize memory usage, ensuring stable operation on resource-constrained devices
- **Dynamic Configuration**: Flexible parameter adjustment for real-time optimization of performance, resource allocation, and video quality
- **Advanced Encoding**: Supports Baseline profile, high-quality I/P frame generation, ROI encoding, and bitrate control
- **Efficient Decoding**: Software-based parsing of standard H.264 streams for smooth video playback

### Target Applications

The main applications of `esp_h264` are:

- Video surveillance systems
- Remote meetings and communication
- Mobile streaming applications
- IoT video processing

## Codec Specifications

### Encoding

| Platform     | Type     | Max Resolution | Max Performance | Advanced Features                                     |
|--------------|----------|----------------|-----------------|-------------------------------------------------------|
| **ESP32-S3** | Software | Any            | 320×240@11fps   | Basic encoding                                        |
| **ESP32-P4** | Hardware | ≤1080P         | 1920×1080@30fps | Dual encoding, ROI optimization, motion vector output |

### Decoding

| Platform     | Type     | Max Resolution | Max Performance |
|--------------|----------|----------------|-----------------|
| **ESP32-S3** | Software | Any            | 320×240@19fps   |
| **ESP32-P4** | Software | Any            | 1280×720@10fps  |

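To compare the rows above on a common scale, resolution and frame rate can be converted to macroblocks per second, since H.264 codes each frame in 16×16 macroblocks. The following is a minimal sketch; `mb_per_frame` and `mb_per_second` are hypothetical helper names and are not part of `esp_h264`.

```c
#include <stdint.h>

/* Number of 16x16 macroblocks needed to cover one frame (rounded up). */
static uint32_t mb_per_frame(uint32_t width, uint32_t height)
{
    return ((width + 15) / 16) * ((height + 15) / 16);
}

/* Macroblocks that must be processed per second at a given frame rate. */
static uint32_t mb_per_second(uint32_t width, uint32_t height, uint32_t fps)
{
    return mb_per_frame(width, height) * fps;
}
```

With these helpers, 320×240@11fps corresponds to 3,300 macroblocks per second, while 1920×1080@30fps corresponds to 244,800.
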
## Getting Started

### Basic Workflow

The standard hardware encoding workflow consists of four core operations:

![Single Hardware encoder use flow](./hw-single-enc-use-flow.webp)

1. **Initialize**: Create the encoder with configuration parameters
2. **Start**: Open the encoder for processing
3. **Process**: Execute frame-by-frame encoding in a loop
4. **Cleanup**: Release resources and destroy the encoder object

### Quick Start Example

```c
// Hardware single-stream encoding configuration example
esp_h264_enc_cfg_hw_t cfg = {
    .gop = 30,
    .fps = 30,
    .res = {.width = 640, .height = 480},
    .rc = {
        .bitrate = (640 * 480 * 30) / 100,
        .qp_min = 26,
        .qp_max = 30
    },
    .pic_type = ESP_H264_RAW_FMT_O_UYY_E_VYY
};

// Initialize encoder
esp_h264_enc_t *enc = NULL;
esp_h264_enc_hw_new(&cfg, &enc);

// Allocate input/output buffers (this raw format uses 1.5 bytes per pixel)
esp_h264_enc_in_frame_t in_frame = {.raw_data.len = 640 * 480 * 3 / 2};
in_frame.raw_data.buffer = esp_h264_aligned_calloc(128, 1,
                                                   in_frame.raw_data.len,
                                                   &in_frame.raw_data.len,
                                                   ESP_H264_MEM_INTERNAL);
esp_h264_enc_out_frame_t out_frame = {.raw_data.len = in_frame.raw_data.len};
out_frame.raw_data.buffer = esp_h264_aligned_calloc(128, 1,
                                                    out_frame.raw_data.len,
                                                    &out_frame.raw_data.len,
                                                    ESP_H264_MEM_INTERNAL);

// Start encoding
esp_h264_enc_open(enc);

// Encoding loop; capture_frame() and send_packet() stand for application code
while (capture_frame(in_frame.raw_data.buffer)) {
    esp_h264_enc_process(enc, &in_frame, &out_frame);
    send_packet(out_frame.raw_data.buffer);
}

// Resource release
esp_h264_enc_close(enc);
esp_h264_enc_del(enc);
esp_h264_free(in_frame.raw_data.buffer);
esp_h264_free(out_frame.raw_data.buffer);
```

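The buffer length and bitrate in the example above are hard-coded for 640×480 at 30 fps. As a rough sketch of how the same values could be derived for other resolutions, the helpers below reproduce the two formulas used there: 1.5 bytes per pixel for the raw frame, and one hundredth of the pixel rate for the bitrate. The function names are hypothetical, and the bitrate rule is only the heuristic from this example, not a recommendation from the component.

```c
#include <stdint.h>

/* Bytes in one raw frame at 1.5 bytes per pixel (YUV 4:2:0 style layout). */
static uint32_t yuv420_frame_size(uint32_t width, uint32_t height)
{
    return width * height * 3 / 2;
}

/* Quick-start heuristic: bitrate = (width * height * fps) / 100, in bit/s. */
static uint32_t default_bitrate(uint32_t width, uint32_t height, uint32_t fps)
{
    return (uint32_t)(((uint64_t)width * height * fps) / 100);
}
```

For 640×480 at 30 fps this yields a 460,800-byte frame buffer and a bitrate of 92,160 bit/s, matching the expressions in the example.
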
## API Reference

The following section provides a brief overview of the available functions.

{{< alert icon="lightbulb" iconColor="#179299" cardColor="#9cccce">}}
These functions are thread-safe and can be called at any time during the encoder lifecycle.
{{< /alert >}}

### Encoding Functions

| Function                   | Description                                                    | Platform Support   |
|----------------------------|----------------------------------------------------------------|--------------------|
| `esp_h264_enc_sw_new`      | Create single-stream software encoder                          | ESP32-S3, ESP32-P4 |
| `esp_h264_enc_hw_new`      | Create single-stream hardware encoder                          | ESP32-P4 only      |
| `esp_h264_enc_dual_hw_new` | Create dual-stream hardware encoder                            | ESP32-P4 only      |
| `esp_h264_enc_open`        | Start encoder                                                  | All platforms      |
| `esp_h264_enc_process`     | Execute encoding for a single frame and output compressed data | All platforms      |
| `esp_h264_enc_close`       | Stop encoder                                                   | All platforms      |
| `esp_h264_enc_del`         | Release encoder resources                                      | All platforms      |

### Decoding Functions

| Function               | Description                                             | Platform Support   |
|------------------------|---------------------------------------------------------|--------------------|
| `esp_h264_dec_sw_new`  | Create software decoder                                 | ESP32-S3, ESP32-P4 |
| `esp_h264_dec_open`    | Start decoder                                           | All platforms      |
| `esp_h264_dec_process` | Execute decoding for a single frame and output raw data | All platforms      |
| `esp_h264_dec_close`   | Stop decoder                                            | All platforms      |
| `esp_h264_dec_del`     | Release decoder resources                               | All platforms      |

### Dynamic Parameter Control

| Function                       | Description                   | Typical Use Cases             |
|--------------------------------|-------------------------------|-------------------------------|
| `esp_h264_enc_get_resolution`  | Get resolution information    | Display configuration         |
| `esp_h264_enc_get/set_fps`     | Dynamically adjust frame rate | Network bandwidth adaptation  |
| `esp_h264_enc_get/set_gop`     | Dynamically adjust GOP size   | Quality vs. bandwidth balance |
| `esp_h264_enc_get/set_bitrate` | Dynamically adjust bitrate    | Network bandwidth adaptation  |

## Advanced Features

This section highlights advanced capabilities of the H.264 encoder that offer greater control and flexibility for specialized use cases. These features include region-based quality adjustments, motion vector extraction for video analysis, and dual-stream encoding support on the ESP32-P4.

### Region of Interest (ROI) Encoding

ROI encoding allows you to allocate more bits to important areas of the frame while reducing quality in less critical regions.

**ROI Configuration**

```c
// param_hd (hardware parameter handle), width, and height are assumed to be in scope

// Use delta-QP mode and lower the quality outside the ROI
esp_h264_enc_roi_cfg_t roi_cfg = {
    .roi_mode = ESP_H264_ROI_MODE_DELTA_QP,
    .none_roi_delta_qp = 10 // Increase QP by 10 for non-ROI region
};
ESP_H264_CHECK(esp_h264_enc_cfg_roi(param_hd, roi_cfg));

// Define the centered region (half the width and height, 1/4 of the area) as ROI
esp_h264_enc_roi_reg_t roi_reg = {
    .x = width / 4, .y = height / 4,
    .len_x = width / 2, .len_y = height / 2
};
ESP_H264_CHECK(esp_h264_enc_set_roi_region(param_hd, roi_reg));
```

175+
**ROI API Functions**
176+
177+
| Function | Description | Use Cases |
178+
|--------------------------------|--------------------------------|----------------------------------|
179+
| `esp_h264_enc_cfg_roi` | Configure ROI parameters | Key encoding for faces, license plates |
180+
| `esp_h264_enc_get_roi_cfg_info`| Get current ROI configuration | Status monitoring |
181+
| `esp_h264_enc_set_roi_region` | Define ROI regions | Specific area enhancement |
182+
| `esp_h264_enc_get_roi_region` | Get ROI region information | Configuration verification |
183+
### Motion Vector Extraction

Extract [motion vector](https://en.wikipedia.org/wiki/Motion_estimation) data for video analysis and post-processing applications.

**Motion Vector API Functions**

| Function                       | Description                     | Use Cases                  |
|--------------------------------|---------------------------------|----------------------------|
| `esp_h264_enc_cfg_mv`          | Configure motion vector output  | Video analysis setup       |
| `esp_h264_enc_get_mv_cfg_info` | Get motion vector configuration | Configuration verification |
| `esp_h264_enc_set_mv_pkt`      | Set motion vector packet buffer | Data collection            |
| `esp_h264_enc_get_mv_data_len` | Get motion vector data length   | Buffer management          |

### Dual-Stream Encoding (ESP32-P4 Only)

ESP32-P4 supports simultaneous encoding of two independent video streams with different parameters.

```c
// Main stream 1080P storage, sub-stream 480P transmission
// enc is the dual-encoder handle, assumed to be declared elsewhere;
// the remaining fields (gop, fps, QP range, pic_type) are omitted for brevity
esp_h264_enc_cfg_dual_hw_t dual_cfg = {
    .cfg0 = {.res = {1920, 1080}, .rc = {.bitrate = 4000000}}, // Main stream
    .cfg1 = {.res = {640, 480}, .rc = {.bitrate = 1000000}}    // Sub-stream
};
ESP_H264_CHECK(esp_h264_enc_dual_hw_new(&dual_cfg, &enc));
```

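When one stream is stored and the other is transmitted, it helps to translate the configured bitrates into storage and bandwidth budgets. The following sketch does this arithmetic; `bytes_per_hour` and `total_bitrate` are hypothetical helper names, and the figures ignore container and protocol overhead.

```c
#include <stdint.h>

/* Bytes written per hour by a stream with the given bitrate (bit/s). */
static uint64_t bytes_per_hour(uint32_t bitrate_bps)
{
    return (uint64_t)bitrate_bps * 3600 / 8;
}

/* Combined encoder output rate of both streams (bit/s). */
static uint32_t total_bitrate(uint32_t main_bps, uint32_t sub_bps)
{
    return main_bps + sub_bps;
}
```

With the values above, the 4 Mbps main stream needs 1.8 GB of storage per hour, and the two streams together produce 5 Mbps of encoded data.
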
## Application Scenarios & Best Practices

The following examples demonstrate how to apply advanced encoding features to meet specific use-case requirements. Each scenario outlines a recommended configuration strategy, showing how ROI, bitrate control, and motion vectors can be tailored for performance, privacy, or adaptability.

### 1. Video Surveillance

In video surveillance applications, it's critical to maintain high visual fidelity in regions that contain important details, such as faces, license plates, or motion-detected areas, while conserving bandwidth and storage elsewhere. ROI (Region of Interest) encoding allows the encoder to prioritize such regions by allocating more bits, thereby enhancing clarity where it matters most.

**Optimal Configuration:**

* **Enable ROI encoding** to enhance key visual areas.
* **GOP = 30** places a keyframe every 30 frames (once per second at 30 fps, every 1.2 seconds at the 25 fps configured below), balancing video seekability and compression.
* **QP range: [20, 35]** provides a controlled balance between compression efficiency and perceptual quality, especially in bandwidth-constrained environments.

```c
// Surveillance optimized configuration
esp_h264_enc_cfg_hw_t surveillance_cfg = {
    .gop = 30,
    .fps = 25,
    .res = {1280, 720},
    .rc = {
        .bitrate = 2000000,
        .qp_min = 20,
        .qp_max = 35
    }
};
```

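The relationship between GOP size, frame rate, and keyframe spacing is simple division, and it is worth checking whenever either value changes at runtime. A minimal sketch, using a hypothetical helper name:

```c
#include <stdint.h>

/* Time between keyframes in milliseconds for a given GOP size and frame rate.
 * Returns 0 when fps is 0 to avoid dividing by zero. */
static uint32_t keyframe_interval_ms(uint32_t gop, uint32_t fps)
{
    if (fps == 0) {
        return 0;
    }
    return gop * 1000 / fps;
}
```

A GOP of 30 gives a 1000 ms keyframe interval at 30 fps and a 1200 ms interval at 25 fps.
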
**ROI Setup for Key Areas:**

To further refine quality, specific regions (such as the center of the frame or areas flagged by motion detection) can be configured with lower quantization parameters (QPs), resulting in better detail preservation.

* **Reduce QP** in key regions by up to **25%**, improving clarity for facial recognition or license plate reading.
* **Leverage motion vector data** to dynamically track and adapt ROI regions for intelligent, resource-efficient surveillance.

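A percentage change in QP has to be turned into a concrete delta before it can be applied in delta-QP mode. The sketch below shows one way to do that conversion while keeping the result inside the H.264 QP range of 0 to 51. The helper name and the idea of deriving the delta from a base QP are assumptions made for illustration; the delta range actually accepted by the encoder should be checked in the component headers.

```c
#define H264_QP_MIN 0
#define H264_QP_MAX 51

/* Signed QP delta for a percentage change of base_qp.
 * Negative percent lowers QP (sharper), positive percent raises it (blurrier).
 * The delta is limited so that base_qp + delta stays within 0..51. */
static int qp_delta_for_percent(int base_qp, int percent)
{
    int target = base_qp + base_qp * percent / 100;
    if (target < H264_QP_MIN) {
        target = H264_QP_MIN;
    }
    if (target > H264_QP_MAX) {
        target = H264_QP_MAX;
    }
    return target - base_qp;
}
```

For a base QP of 28, a 25% reduction corresponds to a delta of -7, giving an effective QP of 21 in the key region.
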
### 2. Privacy Protection

In scenarios where privacy is a concern, such as public-facing cameras or indoor monitoring, specific regions of the video may need to be intentionally blurred. This can be achieved by increasing the quantization parameter (QP) in those regions, which reduces detail without additional processing overhead.

**Implementation Strategy:**

- Increase QP by 25% in ROI areas to achieve a blur effect
- Use a fixed GOP to prevent the mosaic area from spreading

```c
// Privacy protection ROI configuration (param_hd is assumed to be in scope)
esp_h264_enc_roi_cfg_t privacy_cfg = {
    .roi_mode = ESP_H264_ROI_MODE_DELTA_QP,
    .none_roi_delta_qp = -5 // Better quality for non-sensitive areas
};
ESP_H264_CHECK(esp_h264_enc_cfg_roi(param_hd, privacy_cfg));

// Blur sensitive area
esp_h264_enc_roi_reg_t blur_region = {
    .x = sensitive_x, .y = sensitive_y,
    .len_x = sensitive_width, .len_y = sensitive_height,
    .qp = 15 // In delta-QP mode, a positive value raises QP for a blur effect
};
ESP_H264_CHECK(esp_h264_enc_set_roi_region(param_hd, blur_region));
```

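If the sensitive area comes from a detector or from user input, its rectangle may extend past the frame edge. A defensive step, sketched below with a hypothetical `region_t` type and `clamp_region` helper, is to clip the rectangle to the frame before handing it to the encoder. Whether the encoder itself rejects or clips out-of-range regions is not covered here.

```c
#include <stdint.h>

/* Hypothetical rectangle type mirroring the ROI region fields used above. */
typedef struct {
    uint32_t x, y, len_x, len_y;
} region_t;

/* Clip a rectangle so that it lies entirely inside a frame_w x frame_h frame.
 * A rectangle that starts outside the frame collapses to zero size. */
static region_t clamp_region(region_t r, uint32_t frame_w, uint32_t frame_h)
{
    if (r.x > frame_w) {
        r.x = frame_w;
    }
    if (r.y > frame_h) {
        r.y = frame_h;
    }
    if (r.len_x > frame_w - r.x) {
        r.len_x = frame_w - r.x;
    }
    if (r.len_y > frame_h - r.y) {
        r.len_y = frame_h - r.y;
    }
    return r;
}
```

For example, a 100×100 region starting at (600, 400) in a 640×480 frame is clipped to 40×80.
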
### 3. Network Adaptive Streaming

For real-time video applications operating over variable or constrained networks, maintaining a stable and responsive stream is essential. By dynamically adjusting encoding parameters such as bitrate and frame rate based on current bandwidth conditions, the encoder can optimize video quality while minimizing buffering and transmission failures.

**Strategy:**

- Enable dynamic bitrate control (CBR/VBR)
- Adjust parameters based on network conditions

```c
// Network adaptation function
void adapt_to_network_conditions(esp_h264_enc_handle_t enc, uint32_t available_bandwidth) {
    esp_h264_enc_param_hw_handle_t param_hd;
    esp_h264_enc_hw_get_param_hd(enc, &param_hd);

    if (available_bandwidth < 1000000) { // < 1 Mbps
        esp_h264_enc_set_bitrate(&param_hd->base, 800000);
        esp_h264_enc_set_fps(&param_hd->base, 15);
    } else if (available_bandwidth < 3000000) { // < 3 Mbps
        esp_h264_enc_set_bitrate(&param_hd->base, 2000000);
        esp_h264_enc_set_fps(&param_hd->base, 25);
    } else { // >= 3 Mbps
        esp_h264_enc_set_bitrate(&param_hd->base, 4000000);
        esp_h264_enc_set_fps(&param_hd->base, 30);
    }
}
```

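With fixed tiers, the selected bitrate can exceed the measured bandwidth near the lower edge of a tier (for example, 2 Mbps is chosen when only 1.2 Mbps is available). One possible variant, sketched below, derives the bitrate from the measurement with some headroom and keeps the tiers only for the frame rate. The 80% headroom factor, the 4 Mbps cap, and the `stream_params_t` and `pick_stream_params` names are assumptions for illustration; the returned values would then be passed to `esp_h264_enc_set_bitrate` and `esp_h264_enc_set_fps` as in the function above.

```c
#include <stdint.h>

/* Hypothetical container for the parameters to apply to the encoder. */
typedef struct {
    uint32_t bitrate; /* bit/s */
    uint32_t fps;
} stream_params_t;

/* Choose a bitrate at 80% of the available bandwidth (capped at 4 Mbps)
 * and a frame rate using the same bandwidth tiers as the example above. */
static stream_params_t pick_stream_params(uint32_t available_bandwidth)
{
    stream_params_t p;
    uint64_t target = (uint64_t)available_bandwidth * 80 / 100;
    if (target > 4000000) {
        target = 4000000;
    }
    p.bitrate = (uint32_t)target;

    if (available_bandwidth < 1000000) {        /* < 1 Mbps */
        p.fps = 15;
    } else if (available_bandwidth < 3000000) { /* < 3 Mbps */
        p.fps = 25;
    } else {                                    /* >= 3 Mbps */
        p.fps = 30;
    }
    return p;
}
```

With 1.2 Mbps available, this picks 960 kbps at 25 fps, which leaves room for protocol overhead and short-term bandwidth dips.
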
## Resources and Support

### Development Resources

- **Sample Projects**: [ESP H.264 Sample Projects](https://github.com/espressif/esp-idf/tree/master/examples/peripherals/h264)
- **Component Registry**: [ESP H.264 Component](https://components.espressif.com/components/espressif/esp_h264/)
- **Release Notes**: [Latest updates and compatibility information](https://components.espressif.com/components/espressif/esp_h264/versions/1.1.2/changelog?language=en)

### Technical Support

- **Official Forum**: [Espressif Technical Support](https://docs.espressif.com/projects/esp-faq/en/latest/)
- **GitHub Issue Tracker**: [ESP-ADF Issues](https://github.com/espressif/esp-adf/issues)

## Conclusion

Espressif's lightweight H.264 codec component, `esp_h264`, is designed for efficient video processing on resource-constrained devices. This guide covered its technical features, API, advanced features, and application scenarios to help developers get the most out of embedded video coding.

Whether you're building a surveillance system, implementing video streaming, or developing innovative multimedia applications, ESP H.264 offers the tools and performance needed to succeed in resource-constrained environments.

data/authors/hou-haiyan.json

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
{
    "name": "Hou Haiyan",
    "image": "img/authors/espressif.png",
    "bio": "Embedded Software Engineer at Espressif",
    "social": [
        { "github": "https://github.com/houhaiyan" }
    ]
}
