There hasn't been much work on the web scraping part.
I am interested in working on this.
Since this is meant to be generic, here is what I have in mind so far:
- A generic web scraper that extracts all images, links, and text from a page.
- Possibly use Scrapy for this.
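
As a rough sketch of the first point, the extraction step could look something like this. It uses Python's standard-library `html.parser` rather than Scrapy, only to keep the example dependency-free. The names `GenericExtractor` and `scrape_html` are made up for illustration, and fetching pages, crawling, and error handling are all left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class GenericExtractor(HTMLParser):
    """Collects image URLs, link URLs and visible text from one HTML page.

    Hypothetical helper for illustration; not part of any existing code.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []
        self.links = []
        self.text = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img" and attrs.get("src"):
            # Resolve relative paths against the page URL.
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        # Keep only non-empty text that is not script or style content.
        if stripped and not self._skip_depth:
            self.text.append(stripped)


def scrape_html(html, base_url):
    """Return the images, links and text found in an HTML string."""
    parser = GenericExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {"images": parser.images, "links": parser.links, "text": parser.text}
```

Calling `scrape_html(page_source, "https://example.com/")` returns a dict with `images`, `links` and `text` lists. If Scrapy is used instead, the same three extractions would presumably go inside a spider's parse callback using its selectors.
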
I'm still a beginner, so any tips or corrections are welcome.