
rmkeeler/stackoverflow-us-india


Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

Python version used: 3.8.7

Packages used:

  1. Pandas 1.2.3
  2. Matplotlib 3.4.1

Two folders must exist inside your local copy of this repo before you run the analysis:

  1. output: the notebook saves its plots to this folder.
  2. datasets: the notebook looks for the survey datasets in this folder when it imports them at the start.
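
If the folders are missing, a small helper along these lines could create them. This is only a sketch and not part of the notebook: the function name `ensure_folders` and the `repo_root` parameter are hypothetical, and it assumes you pass the path of your local copy of the repo.

```python
from pathlib import Path

# Folder names come from the list above; everything else here is illustrative.
REQUIRED_FOLDERS = ("output", "datasets")


def ensure_folders(repo_root="."):
    """Create any missing required folder under repo_root.

    Returns the names of the folders that had to be created.
    """
    root = Path(repo_root)
    created = []
    for name in REQUIRED_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            folder.mkdir(parents=True)
            created.append(name)
    return created
```

Calling `ensure_folders()` from the repo's folder before opening the notebook should be enough. Running it a second time changes nothing and returns an empty list.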

Project Motivation

After examining Stackoverflow's annual survey data, I found that the 2018, 2019 and 2020 surveys asked developers which languages they knew and which they planned to learn. These responses can serve as indicators of the language repertoires developers feel they need to succeed in their fields.

And that, in turn, can help companies understand where talent in certain languages tends to reside, making recruiting decisions smarter.

With this project, I took a step into that space by answering three questions about developer talent in the US and India:

  1. Which languages gained or lost popularity between 2018 and 2020? (That is, which became easier or harder to find in each market.)
  2. How multilingual are programmers in the different programming fields covered by the survey? (How many languages do devs tend to know in each field?)
  3. If many responses in a given year indicate a desire to learn a particular language, is that a meaningful sign that the language will become significantly more popular in the region in the following years?
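
To make the first two questions concrete, here is a plain-Python sketch of the kind of calculation they involve. It is not taken from the notebook, which uses Pandas. It assumes each respondent's languages arrive as one semicolon-separated string, and all function names and the toy responses below are made up for illustration.

```python
from collections import Counter


def split_languages(cell):
    """Split one semicolon-separated answer into a list of language names."""
    if not cell:
        return []
    return [lang.strip() for lang in cell.split(";") if lang.strip()]


def language_share(responses):
    """Fraction of respondents who listed each language."""
    counts = Counter()
    for cell in responses:
        # set() so a language repeated in one answer is counted once
        counts.update(set(split_languages(cell)))
    total = len(responses)
    return {lang: n / total for lang, n in counts.items()} if total else {}


def share_change(earlier, later):
    """Question 1: later share minus earlier share, per language."""
    old, new = language_share(earlier), language_share(later)
    return {
        lang: new.get(lang, 0.0) - old.get(lang, 0.0)
        for lang in sorted(set(old) | set(new))
    }


def mean_languages_known(responses):
    """Question 2: average number of languages listed per respondent."""
    known = [len(set(split_languages(cell))) for cell in responses]
    return sum(known) / len(known) if known else 0.0


# Made-up toy answers, NOT real survey data.
toy_2018 = ["Python;SQL", "Java", "Java;SQL", "C#"]
toy_2020 = ["Python;SQL", "Python;Java", "Python", "Go;SQL"]

changes = share_change(toy_2018, toy_2020)
```

In this toy data, `changes["Python"]` is 0.5, because Python goes from one of four respondents to three of four. To compare fields or countries, the same functions could be applied to each subset of responses separately.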

File Descriptions

stackoverflowsurvey_2018_2020.ipynb is the only file included. All analyses take place in that notebook. All necessary custom functions are defined in that notebook, as well.

The analyses use two datasets created by Stackoverflow via surveys they conducted in 2018 and 2020.

  1. Master list of survey datasets at Stackoverflow
  2. Direct link to 2018 dataset
  3. Direct link to 2020 dataset

Links to Stackoverflow Survey source datasets also appear above the import statements in that notebook.

Results

The main findings of the analysis can be found in this Medium post I wrote in April 2021.

Licensing, Authors, and Acknowledgements

Data collected and made available by StackOverflow. Info regarding the datasets and their licensing can be found on Kaggle. Feel free to use the code provided in this repository at your own discretion.
