Streaming Service Comparison

Overview

This is my capstone project for CODE:You. It analyzes four streaming services and their price histories to compare their content (overall catalog size, content types, genres, and more) and their price points. The goal of the project is to demonstrate a general knowledge of Python.

Data Sources:

Four of the datasets used in this project contain content information for the streaming services listed below, including title, content type, genre, release year, IMDb ID, and IMDb average rating. All four came from kaggle.com and are updated daily via an API used by the datasets' owner. The files used below were last updated on 03/23/2025. A fifth dataset contains pricing information for each service for a specific timeframe, noted as month-year. This dataset was manually derived; see details below.

Netflix
Hulu
Prime
AppleTV
Streaming_Service_Pricing_Histories
- Manually Derived From: Price_History_Reference.docx
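
The pricing file labels each period as month-year. Below is a minimal sketch of turning such labels into sortable dates with the standard library. It assumes labels shaped like `Mar-2025`; the actual format in the file is not stated here, so the format string and the helper name `parse_month_year` are hypothetical.

```python
from datetime import date, datetime


def parse_month_year(label: str, fmt: str = "%b-%Y") -> date:
    """Convert a month-year label such as 'Mar-2025' to the first day of that month.

    The default format is an assumption; adjust `fmt` to match the real file.
    """
    return datetime.strptime(label.strip(), fmt).date()


labels = ["Mar-2025", "Jan-2024", "Oct-2024"]
# Sort chronologically rather than alphabetically.
ordered = sorted(labels, key=parse_month_year)
print(ordered)
```

Sorting on real dates rather than on the raw strings keeps a price-history line graph in chronological order.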

Project Structure

This project is organized as follows:

Preliminary Data Exploration: Jupyter notebooks or scripts to explore a dataset.
Preliminary Data Manipulation: Using Python for feature creation to differentiate the streaming service datasets.
Data Cleaning & Preparation: Using Python and other packages to clean and prepare data for analysis.
Analysis: Using Python with the Pandas package to analyze the data.
Visualizations: Using Matplotlib and Seaborn to visualize my findings.
Summary: Summary of analysis/findings.
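
The feature-creation step above can be illustrated with a small sketch. The notebook does this work with pandas; to stay dependency-free, this example uses the standard library `csv` module instead. The column name `service`, the helper `load_with_service`, and the sample rows are all hypothetical.

```python
import csv
import io

# Tiny in-memory stand-ins for two of the service files (hypothetical rows).
NETFLIX_CSV = "title,type\nExample Show,tv\n"
HULU_CSV = "title,type\nExample Movie,movie\n"


def load_with_service(csv_text: str, service: str) -> list[dict]:
    """Read CSV text and add a 'service' field to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{**row, "service": service} for row in reader]


# Once every row carries its service name, the files can be combined safely.
combined = load_with_service(NETFLIX_CSV, "Netflix") + load_with_service(HULU_CSV, "Hulu")
for row in combined:
    print(row["service"], row["title"], row["type"])
```

Tagging each row before combining is what lets later analysis group or compare by service.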

Features Utilized for the Project

Feature	Description
Read FIVE data files	Used 4 CSV files from Kaggle & created one of my own.
Created several seaborn & matplotlib plots, 4 Stacked Bar Charts, 3 WordClouds, & 1 Line Graph	Made various plots to visually show my findings
Utilized a virtual environment	Created a venv for this project to keep my computer clean
Utilized Markdown & Commenting in my Jupyter Notebook	Included Markdown cells and code comments to describe each section of the project and each code block.
Best practices	Created a function to wrap text on the x-axis of several graphs
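
As an illustration of the label-wrapping idea in the last row, here is a minimal sketch built on the standard library `textwrap` module. The function name `wrap_labels` and the default width are assumptions; the actual function in my_functions.py may differ.

```python
import textwrap


def wrap_labels(labels, width=12):
    """Return labels with newlines inserted so no line exceeds `width` characters."""
    return [textwrap.fill(str(label), width=width) for label in labels]


print(wrap_labels(["Action & Adventure", "Documentary", "Science Fiction & Fantasy"]))
```

The wrapped strings can then be passed to a plotting call that sets tick labels, so long genre names stack vertically instead of overlapping.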

Getting Started

The following is a guide to running the project files locally:

If you want to save a copy to your GitHub account, fork the repository (https://github.com/rkynhoff/Streaming_Service_Comparisons); otherwise, move to step 2
In your command line or in the terminal of VS Code, clone the repository to your local machine: `git clone https://github.com/rkynhoff/Streaming_Service_Comparisons.git`
- Ensure your command line is opened to the folder in which you wish to save this repository
Follow the first three steps in the "Virtual Environment Instructions" to create and activate a virtual environment, depending on your operating system (OS)
- The third step installs the packages listed in requirements.txt
Explore the Jupyter notebooks and contents in the respective folders.
Open the "my_functions.py" file then run it
Open the "STRM_SERV_COMP_V2.ipynb" file
In the toolbar, select "Run All" to run the program
Investigate the code blocks, comments, and markdown areas for insight into the program
Refer to the data dictionaries within the Jupyter Notebook, located after the initial DataFrames load and after the final cleaned DataFrame, or to their respective files if needed
Helpful Hint: You may want to turn on Word Wrap as some of the cells contain comments/notes that would require scrolling without Word Wrap enabled
- To do this in VS Code:
  - Select File > Preferences > Settings
  - Type in Word Wrap in the search
  - Toggle Word Wrap to "on" if not already on
- To do this in Jupyter Notebooks online (JupyterLab, JupyterLite, etc.):
  - Select File > Wrap Words
  - Choose to turn it on
If running an editor which requires the ipykernel extension, proceed with the install when prompted
When you are finished perusing the repository, run the final command (Deactivate) for your OS from the Virtual Environment Instructions below

Virtual Environment Instructions

Depending upon your OS, enter the commands below into your terminal to create and activate a virtual environment and install the requirements on your machine. Only use Deactivate when you are finished with the program.

Command	Linux/Mac	Git Bash (Windows)
Create	`python3 -m venv venv`	`python -m venv venv`
Activate	`source venv/bin/activate`	`source venv/Scripts/activate`
Install	`pip install -r requirements.txt`	`pip install -r requirements.txt`
Deactivate	`deactivate`	`deactivate`
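
After activating, it can help to confirm that Python is actually running inside the virtual environment before installing anything. This is a small sketch using only the standard library; the function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtual_environment())
```

If this prints `False` after the Activate step, the install command would place packages in the base Python rather than in the project's venv.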

Dependencies

pandas and numpy for data manipulation and analysis
matplotlib and seaborn for data visualization
wordcloud for generating word cloud visuals
PIL (Python Imaging Library, installed as the Pillow package) for image processing
textwrap (part of the Python standard library) for wrapping text on graph axes
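
To check that the packages listed above are available before selecting "Run All", a short sketch like the following may help. It relies on `importlib.util.find_spec` from the standard library; the helper name `missing_modules` and the `REQUIRED` list are assumptions based on the list above, not part of the repository.

```python
from importlib.util import find_spec

# Import names for the packages listed above; textwrap ships with Python.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "wordcloud", "PIL", "textwrap"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
print("Missing:", missing if missing else "none")
```

Any name reported as missing suggests the Install step was skipped or was run outside the virtual environment.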
