a11y-vision

A proof of concept for a multimodal approach to generating accessible webpages for use with screen readers when accessibility markers are not present.

The idea

Segment a webpage screenshot into components, map them to DOM elements, and generate an annotated, accessible view.
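
One way to approach the mapping step is to compare each segment's bounding box with the bounding boxes of DOM elements and keep the element with the best overlap. The sketch below is not taken from segment_webpage.py: the Box type, the iou and match_segments_to_dom functions, and the 0.5 threshold are all hypothetical, and it assumes both sets of boxes are already expressed in the same screenshot pixel coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in screenshot pixels (hypothetical helper type)."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def match_segments_to_dom(
    segments: List[Box],
    dom_boxes: Dict[str, Box],
    threshold: float = 0.5,  # assumed cut-off, not a value from the project
) -> List[Optional[str]]:
    """For each segment, return the id of the best-overlapping DOM element,
    or None when no element reaches the threshold."""
    matches: List[Optional[str]] = []
    for seg in segments:
        best_id, best_score = None, threshold
        for elem_id, box in dom_boxes.items():
            score = iou(seg, box)
            if score >= best_score:
                best_id, best_score = elem_id, score
        matches.append(best_id)
    return matches
```

Segments that come back as None have no DOM counterpart under this rule, which makes them natural candidates for generated annotations.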

Current status

The project is at the PoC stage. Contributions are welcome to help turn it into a viable software package!

TODO

  • Generate annotations for segmented content
  • Link existing webpage elements to an accessibility tree
  • Create a web extension to automate the screen capture and UI

Setup

  1. Clone the repository and enter it:
git clone https://github.com/rawaha-e/a11y-vision
cd a11y-vision
  2. Install the dependencies:
pip install -r requirements.txt
  3. Download the SAM2 checkpoint:
mkdir checkpoints
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt -O checkpoints/sam2.1_hiera_large.pt
  4. Place screenshot.png in the project directory.
  5. Run the inference:
python3 segment_webpage.py
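
Before running the inference, it can help to confirm that the files from the setup steps are in place. The following is a minimal sketch of such a check and is not part of the repository: the missing_files helper and the REQUIRED_FILES tuple are hypothetical names, and the two paths are assumed from the steps above.

```python
from pathlib import Path
from typing import List

# Paths assumed from the setup steps, relative to the project directory.
REQUIRED_FILES = (
    "checkpoints/sam2.1_hiera_large.pt",
    "screenshot.png",
)


def missing_files(project_dir: str = ".") -> List[str]:
    """Return the required files that are not present under project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        raise SystemExit("Missing: " + ", ".join(missing))
    print("All setup files found.")
```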
