Web scraping in R or Python

A work in progress!

A workshop presented as part of Kent State University Libraries' 2023 Digital Scholarship Series Showcase. Written by Kristin Yeager (Head) and Moira O'Neill (GA) from the Statistical Consulting Office.

Motivation

The web is a rich source of data for many kinds of researchers, including applied mathematicians, natural and social scientists, literary scholars, historians, and artists. These data can be found in online newspapers and journals, on social media sites, in government databases, and in website metadata. Often these data are bound up in a website’s HTML structure and are not easily accessible or downloadable, so we need an alternative, systematic way to retrieve them.

Defining web scraping

Web scraping is an automated process in which your computer:

1. contacts a website or web resource,
2. makes a copy of that website’s HTML or XML, and
3. extracts the content of that HTML or XML as data.
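The three parts of this definition (contact, copy, extract) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workshop's own materials: the names `fetch_html`, `extract_links`, and `LinkTextParser` are hypothetical, and the sample HTML string is made up so the example runs without a network connection.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkTextParser(HTMLParser):
    """Collect (text, href) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # extracted (text, href) tuples
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def fetch_html(url):
    """Steps 1 and 2: contact the web resource and copy its HTML as text."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Step 3: extract content from the HTML as data (here, link text and targets)."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page, standing in for what fetch_html(url) would return.
sample_html = '<ul><li><a href="/data.csv">Data</a></li><li><a href="/about">About us</a></li></ul>'
print(extract_links(sample_html))
```

For a live page, `extract_links(fetch_html(url))` chains the steps together. In practice, dedicated packages such as `rvest` in R or `requests` with Beautiful Soup in Python handle the same three steps with less code.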
