Open
Labels: type:bug (Something isn't working)
Description of the bug:
The current example script for the Gemini Live API (`quickstarts/Get_started_LiveAPI.py`) does not appear to perceive visual input: when asked about the shared screen or camera feed, the model hallucinates wildly.
Actual vs expected behavior:
Setting:
- Run the script as `python Get_started_LiveAPI.py --mode screen` or `python Get_started_LiveAPI.py --mode camera`, and
- query the model with "What do you see on the screen that I am sharing"
This will result in responses like "There is a man shown with dark hair..." or "A chess game...".
Expected response: "The screen shows a terminal with various commands..."
Does this mean that vision currently does not work with the Live API?
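To narrow this down, it may help to first rule out a malformed frame on the client side. The sketch below is a standalone sanity check, not part of the quickstart script. It assumes each frame is sent as a dict with a `mime_type` key and a base64-encoded `data` key; that payload shape and the helper name `check_frame_payload` are assumptions for illustration, not something confirmed in this report. It only validates the encoding and says nothing about whether the model actually receives or uses the frame.

```python
import base64

JPEG_SOI = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def check_frame_payload(frame: dict) -> list:
    """Return a list of problems found in a frame payload (empty if none).

    Hypothetical helper: the {"mime_type", "data"} shape is an assumption.
    """
    problems = []
    if frame.get("mime_type") != "image/jpeg":
        problems.append(f"unexpected mime_type: {frame.get('mime_type')!r}")
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(frame.get("data", ""), validate=True)
    except (ValueError, TypeError) as exc:
        problems.append(f"data is not valid base64: {exc}")
        return problems
    if not raw:
        problems.append("decoded image is empty")
    elif not (raw.startswith(JPEG_SOI) and raw.endswith(JPEG_EOI)):
        problems.append("decoded bytes do not look like a complete JPEG")
    return problems


# Usage with a synthetic payload (marker bytes only, not a real image)
fake_jpeg = JPEG_SOI + b"\x00" * 16 + JPEG_EOI
good = {"mime_type": "image/jpeg", "data": base64.b64encode(fake_jpeg).decode()}
print(check_frame_payload(good))  # prints []
```

If a real captured frame passes this check and the model still describes unrelated scenes, the problem is more likely in how the frames are sent or consumed than in how they are encoded.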
Any other information you'd like to share?
No response