bug: VLM input plugins crash on dropped camera frames (ret not checked) #2207

@Ridwannurudeen

Description

Three VLM input plugins call cap.read() but never check the ret return value before passing the frame to downstream processing. When the camera drops a frame (returns ret=False, frame=None), the plugins crash instead of gracefully skipping.

Affected files

  1. src/inputs/plugins/vlm_local_yolo.py:221 - ret is ignored; frame (potentially None) is passed directly to self.model.predict(), which raises a runtime error.

  2. src/inputs/plugins/vlm_coco_local.py:131 - ret is ignored; frame (potentially None) is returned and passed to PyTorch model inference.

  3. src/inputs/plugins/webcam_to_face_emotion.py:89 - Same pattern, ret ignored.

When this happens

  • Virtual camera devices in simulation (Gazebo/Isaac Sim) may not have frames ready immediately at startup
  • Physical cameras that temporarily lose connection (USB disconnect, power fluctuation)
  • High system load causing the camera driver to drop frames

Expected behavior

The plugins should check ret and return None (skip the frame) when cap.read() fails, matching the pattern already used in unitree_realsense_dev_vlm_provider.py:98-109.
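A minimal sketch of the guard described above, assuming only that the capture object exposes an OpenCV-style read() that returns (ret, frame). The helper name read_frame_or_none and the FakeCapture stand-in are hypothetical and not taken from the OM1 codebase; the existing check in unitree_realsense_dev_vlm_provider.py may be structured differently.

```python
from typing import Any, List, Optional, Tuple


def read_frame_or_none(cap: Any) -> Optional[Any]:
    """Return the next frame, or None when the read failed.

    `cap` is assumed to behave like cv2.VideoCapture, i.e. read()
    returns a (ret, frame) tuple.
    """
    ret, frame = cap.read()
    # Skip the frame if the driver reported failure or handed back nothing.
    if not ret or frame is None:
        return None
    return frame


class FakeCapture:
    """Hypothetical stand-in for a camera that replays scripted read() results."""

    def __init__(self, results: List[Tuple[bool, Any]]) -> None:
        self._results = list(results)

    def read(self) -> Tuple[bool, Any]:
        # Once the script is exhausted, behave like a camera that stopped streaming.
        if self._results:
            return self._results.pop(0)
        return (False, None)
```

With this shape, a caller such as _poll() could return early whenever the helper yields None, so a dropped frame never reaches model inference.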

Steps to reproduce

  1. Configure VLM_Local_YOLO with a camera index that points to a virtual device not yet streaming
  2. Start the OM1 runtime
  3. The plugin crashes on the first _poll() call when frame is None
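The failure mode can probably also be illustrated without camera hardware. The sketch below is an assumption-laden stand-in, not the plugin code: NotStreamingCapture, fake_predict, poll_unguarded, and poll_guarded are all hypothetical names, and fake_predict only mimics inference by touching frame.shape, so the exception type seen in the real plugins may differ.

```python
from typing import Any, Optional, Tuple


class NotStreamingCapture:
    """Hypothetical capture whose read() always fails, like a device not yet streaming."""

    def read(self) -> Tuple[bool, Any]:
        return (False, None)


def fake_predict(frame: Any) -> Tuple[int, int]:
    """Stand-in for model inference; reads the frame's shape as array code would."""
    height, width = frame.shape[:2]
    return (height, width)


def poll_unguarded(cap: Any) -> Tuple[int, int]:
    # Mirrors the reported pattern: ret is read but never checked.
    ret, frame = cap.read()
    return fake_predict(frame)


def poll_guarded(cap: Any) -> Optional[Tuple[int, int]]:
    # Expected behavior: skip the frame when the read fails.
    ret, frame = cap.read()
    if not ret or frame is None:
        return None
    return fake_predict(frame)
```

In this sketch, poll_unguarded raises an AttributeError on the None frame, while poll_guarded returns None and leaves the caller free to try again on the next poll.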
