Extending the web scraper.

There hasn't been much work on the web scraping part yet.
I would be interested in working on it.
Since the scraper is meant to be generic, this is what I have in mind so far:
1) A generic web scraper that extracts all images, links and text from a page.
2) Possibly build it on Scrapy.
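
To make point 1 concrete, here is a rough sketch of the extraction step. It deliberately uses only the Python standard library (html.parser) rather than Scrapy, so it shows the idea and not a Scrapy spider. The names PageExtractor and extract are made up for illustration, and the sketch assumes the page HTML has already been downloaded as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Hypothetical helper: collects image URLs, link URLs and visible text."""

    # Text inside these tags is code or styling, not page content.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # used to resolve relative URLs
        self.images = []
        self.links = []
        self.text_parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url=""):
    """Return the images, links and text found in one HTML document."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {
        "images": parser.images,
        "links": parser.links,
        "text": " ".join(parser.text_parts),
    }
```

Calling extract(html, "https://example.com/index.html") returns a dict with the keys "images", "links" and "text". If Scrapy is used, its selectors would probably replace this parser class, and crawling, retries and politeness settings would come from the framework.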

I'm still a beginner, so any tips or corrections are welcome.