A Meta Quest application for the Vitol challenge that tracks and recognizes objects and patterns in Mixed Reality.
Report bug · Request feature
- Quick start
- About this project
- Status
- What's included
- Bugs and feature requests
- Creators
- Copyright and license
Our EfficientNet-B7 model, trained on the TU-Berlin Sketch dataset and exported to ONNX, is available on Google Drive.
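Once downloaded, the model can be loaded with ONNX Runtime. The snippet below is a minimal sketch, not the code from this repository. It assumes that `label_mapping.pkl` maps class indices to class names, that the model returns raw logits for a batch of one image, and that `image` is a float32 NumPy array already resized and normalized the way the exported model expects. The function names `top_k` and `classify_sketch` are hypothetical.

```python
import math
import pickle


def top_k(logits, labels, k=3):
    """Return the k most likely (label, probability) pairs from raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]


def classify_sketch(image, model_path, labels_path, k=3):
    """Run the ONNX model on one preprocessed image batch.

    Assumption: `image` is a float32 NumPy array with the shape and
    normalization used when the model was exported.
    """
    import onnxruntime as ort  # imported lazily; only needed for inference

    with open(labels_path, "rb") as f:
        labels = pickle.load(f)  # assumed: class index -> class name

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    # Assumption: first output holds logits of shape (1, num_classes).
    logits = session.run(None, {input_name: image})[0][0]
    return top_k(logits.tolist(), labels, k)
```

A call such as `classify_sketch(image, "efficientnet_b7.onnx", "label_mapping.pkl")` would return the three most likely sketch classes under those assumptions. Only unpickle label files from a source you trust, since `pickle.load` can execute arbitrary code.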
This project combines several challenges from LauzHack at EPFL, Switzerland, proposed by the following companies:
- AXA Group: an Artificial Intelligence model that can run on a laptop, mobile device, or immersive device.
- Logitech: a Mixed Reality (MR) application built with the MX Ink and the Meta Quest 3/3S.
- Vitol (the primary challenge): an AI service that recognizes static and moving objects and/or a chatbot capable of interacting with the user.
As shown in the image below, this project is a multi-agent AI system. It uses OpenAI Whisper for Speech-to-Text, and the resulting transcript drives multi-agent routing. When a written response is needed, Qwen2.5-0.5B generates it. The system also uses YOLO11 for object detection in images, an EfficientNet-B7 (arXiv) to recognize patterns or drawings made with the MX Ink, and OpenAI TTS for Text-to-Speech.
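The routing step described above can be sketched as follows. This is an illustrative stand-in, not the logic in `router.py`: it uses plain keyword matching on the transcript, and the agent names, trigger words, and the `route` function are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical agent names and trigger words, for illustration only.
ROUTES = {
    "object_detection": {"see", "object", "objects", "detect"},
    "sketch_recognition": {"draw", "drew", "drawing", "sketch", "pattern"},
}
DEFAULT_AGENT = "chat"  # falls back to the text-generation agent


@dataclass
class Routed:
    agent: str
    transcript: str


def route(transcript: str) -> Routed:
    """Pick an agent for a transcribed utterance by keyword overlap."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    for agent, keywords in ROUTES.items():
        if words & keywords:  # first agent with a matching keyword wins
            return Routed(agent, transcript)
    return Routed(DEFAULT_AGENT, transcript)
```

For example, `route("What did I draw?").agent` evaluates to `"sketch_recognition"`, while an utterance with no trigger word goes to `"chat"`. A real router could replace the keyword sets with a classifier or a language model prompt without changing the interface.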
The Meta Quest implementation uses WebXR. For more information, see WebXR.
This project was started during LauzHack and is still in development.
.
├── agent-src/
│   ├── agent/
│   │   ├── image_prepos.py
│   │   ├── router.py
│   │   └── main.py
│   └── data-models/
│       ├── label_mapping.pkl
│       └── efficientnet_b7.onnx
├── models/
│   ├── efficient_net_b7.ipynb
│   ├── mobile_net.ipynb
│   └── yolov8.ipynb
├── assets/
├── examples/
├── .env.example
└── requirements.txt
Have a bug or a feature request? Please first read the issue guidelines and search for existing and closed issues. If your problem or idea is not addressed yet, please open a new issue.
Gabriel Juan
- GitHub: @GabrielJuan349
- LinkedIn: in/gabi-juan
Jan Gras
- GitHub: @JG03dev
- LinkedIn: in/jangras
Yeray Cordero
- GitHub: @yeray142
- LinkedIn: in/yeray142
Nikalas Boyanov
- GitHub: @finnithegamer
- LinkedIn: in/nikalas-boyanov-nunev
Code and documentation copyright 2024-2036 the authors. Code released under the MIT License.
Enjoy 🤘
