a11y-vision

A proof of concept for a multimodal approach to generating accessible webpages for screen readers when accessibility markers are not present.

The idea

Segment a webpage screenshot into components, map them to DOM elements, and generate an annotated, accessible view.
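
The mapping step can be illustrated with a minimal sketch. It assumes that each segment and each DOM element is reduced to a bounding box in screenshot pixel coordinates, and that a segment is paired with the DOM element it overlaps most (intersection over union). The names Box, iou, and match_segments_to_dom, as well as the 0.5 threshold, are hypothetical and are not taken from segment_webpage.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in screenshot pixel coordinates (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two boxes, in the range [0, 1]."""
    overlap_w = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.width * a.height + b.width * b.height - intersection
    return intersection / union if union > 0 else 0.0


def match_segments_to_dom(
    segments: List[Box],
    dom_boxes: Dict[str, Box],
    threshold: float = 0.5,  # assumed cut-off, not a value from the project
) -> Dict[int, Optional[str]]:
    """Pair each segment index with the best-overlapping DOM selector, or None."""
    matches: Dict[int, Optional[str]] = {}
    for index, segment in enumerate(segments):
        best_selector, best_score = None, threshold
        for selector, box in dom_boxes.items():
            score = iou(segment, box)
            if score >= best_score:
                best_selector, best_score = selector, score
        matches[index] = best_selector
    return matches
```

Segments that map to None have no DOM counterpart above the threshold; in this sketch those would be the candidates for generated annotations.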

Current status

The project is at the proof-of-concept stage. Contributions are welcome to help turn it into a viable software package!

TODO

Generate annotations for segmented content
Link existing webpage elements to an accessibility tree
Create a web extension to automate the screen capture and UI

Setup

Clone the repository:

git clone https://github.com/rawaha-e/a11y-vision

Install the dependencies:

pip install -r requirements.txt

Download the SAM2 checkpoint:

mkdir checkpoints
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt -O checkpoints/sam2.1_hiera_large.pt

Place screenshot.png in the project directory.
Run the inference:

python3 segment_webpage.py
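
If the run fails, a quick check that the two inputs from the steps above are in place can help. The following is a minimal sketch, assuming the script is run from the project directory and reads the checkpoint and screenshot from the paths used in this setup; the helper missing_inputs is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# Paths taken from the setup steps; that the script reads exactly these
# locations is an assumption.
REQUIRED_FILES = (
    Path("checkpoints/sam2.1_hiera_large.pt"),
    Path("screenshot.png"),
)


def missing_inputs(root: Path = Path(".")) -> List[Path]:
    """Return the required files that are not present under root."""
    return [path for path in REQUIRED_FILES if not (root / path).is_file()]


if __name__ == "__main__":
    for path in missing_inputs():
        print(f"Missing: {path}")
```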

