Issue or Suggestion Description
Hello everyone — I’m building a lightweight facial landmark pipeline for ESP32-S3 using ESP-WHO + ESP-DL. Goal: a compact model that predicts ~22 2D facial keypoints (eyes, pupils, mouth corners, nose, chin) for blink, yawn and coarse gaze detection.
Current status:
- ESP-WHO example (human_face_detect) runs on my ESP32-S3 with GC2145 camera.
- I can flash and run .espdl models on the board.
Questions / requests:
- Recommended training → quantization flow for a 22-point regressor: is PyTorch → ONNX → esp-ppq (PTQ) enough, or is QAT preferable? Is there an example repo?
- Best input resolution for ESP32-S3 (fps vs accuracy tradeoff): 128×128, 160×160, or 224×224?
- Operator support checklist: which ops commonly trip up conversion to .espdl? (conv2d, depthwise, relu6, upsample, transpose conv, gather/index ops?)
- Example quantization script or minimal esp-ppq template for converting ONNX→.espdl with a representative calibration dataset.
- Any tips to design a regression head (heatmap vs direct coordinate regression) that works best with ESP-DL and quantization?
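
For context on the resolution and regression-head questions, this is the rough size arithmetic I am working from. It is only a sketch: the int8 RGB input and the heatmap stride of 4 are my own assumptions, not ESP-DL requirements, and the function names are hypothetical helpers, not part of any library.

```python
NUM_KEYPOINTS = 22  # keypoint count from the goal described above


def input_bytes(side, channels=3, bytes_per_elem=1):
    """Bytes for one square input tensor (assumes int8 RGB)."""
    return side * side * channels * bytes_per_elem


def direct_head_outputs(num_keypoints=NUM_KEYPOINTS):
    """Direct coordinate regression: one (x, y) pair per keypoint."""
    return num_keypoints * 2


def heatmap_head_outputs(side, stride=4, num_keypoints=NUM_KEYPOINTS):
    """Heatmap head: one (side // stride)^2 map per keypoint (stride is assumed)."""
    map_side = side // stride
    return num_keypoints * map_side * map_side


if __name__ == "__main__":
    # Compare the three candidate resolutions from the question.
    for side in (128, 160, 224):
        print(
            f"{side}x{side}: input={input_bytes(side)} B, "
            f"direct={direct_head_outputs()} values, "
            f"heatmap={heatmap_head_outputs(side)} values"
        )
```

Under these assumptions the direct head always emits 44 values, while the heatmap head grows with the square of the input side, which is part of why I am unsure which one suits the ESP32-S3 better.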
Any other tips are welcome.
Thanks. I'm looking for a practical, minimal working example and gotchas to avoid.