[Feature Request] Browser visual Interaction based on pixels #3680

@nitpicker55555

Description

Required prerequisites

Motivation

A growing number of model providers are releasing computer-use models that click and interact with pages purely at the pixel level. We should add support for this mode of operation, in which the model relies only on visual input and interacts with the page through pixel coordinates and on-screen UI elements.

Solution

No response

Alternatives

No response

Additional context

No response

Metadata

Assignees

No one assigned

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests
