QLS-MiCM/DataProcessingInPython

Data Processing in Python

Click one of these:

  • Open student version in Colab
  • Open compact student version in Colab
  • Open solutions version in Colab

Overview

In this 4-hour workshop, students will learn basic data processing skills using Python. Attendees will learn how to import code from other modules and packages to take advantage of the existing Python ecosystem. After seeing how to access packages, we will explore popular data analysis packages: NumPy for performing operations on large data arrays, and Matplotlib for generating clear data visualisations. We will also scratch the surface of pandas, which stores data in tables. Along the way, we will discuss how to approach a new, unfamiliar package and learn how to use it.
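As a rough taste of the kind of code covered, here is a minimal sketch that uses NumPy for array arithmetic and pandas for a labelled table. It is not taken from the workshop notebooks; the `readings` values and the row and column labels are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements: 3 samples (rows) x 4 time points (columns).
readings = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.5, 1.5, 2.5, 3.5],
])

# Vectorised arithmetic applies to every element without an explicit loop.
doubled = readings * 2

# Reductions work along an axis: axis=1 averages across the columns of each row.
row_means = readings.mean(axis=1)

# The same data as a labelled table, with the row means added as a new column.
table = pd.DataFrame(
    readings,
    index=["a", "b", "c"],
    columns=["t0", "t1", "t2", "t3"],
)
table["mean"] = row_means

print(row_means)
print(table)
```

The array operations replace what would otherwise be nested loops over rows and columns, which is the main reason NumPy is used for large arrays.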

Learning Objectives

By the end of this workshop, you should be able to:

  1. Import code from existing modules and packages.
  2. Use NumPy to easily process multidimensional data.
  3. Use Matplotlib to generate different types of plots to visualise data.
  4. Use pandas to represent data stored in tables.
  5. Approach a new package and explore its documentation and examples.
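Objectives 1 and 5 can be illustrated with the standard library alone. The sketch below shows three common import styles and a few ways to inspect an unfamiliar module from within Python. The modules chosen (`math`, `statistics`, `collections`) are only examples, not necessarily the ones used in the workshop.

```python
import inspect
import math                       # import a whole module
import statistics as stats       # import a module under a shorter alias
from collections import Counter  # import a single name from a module

values = [1, 2, 2, 3]

root = math.sqrt(16)          # access a function through the module name
average = stats.mean(values)  # access a function through the alias
counts = Counter(values)      # use the imported name directly

# Exploring an unfamiliar module: list its public names,
# then read the signature and docstring of one function.
public_names = [name for name in dir(stats) if not name.startswith("_")]
signature = inspect.signature(stats.median)
summary = inspect.getdoc(stats.median).splitlines()[0]

print(root, average, counts[2])
print(public_names[:5])
print(f"median{signature}: {summary}")
```

In an interactive session, `help(stats.median)` prints the same docstring, and a package's online documentation usually adds longer tutorials and examples.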

Requirements

  • Basic knowledge of Python is required.
  • Attendees must be comfortable using variables for simple data types, as well as collections. Attendees should also be comfortable with loops and control flow and be familiar with the basics of using functions in Python.
  • To be able to participate in the exercises, participants must either:
    • (Preferred) Have a Google Account to run in-browser as a Colab notebook
    • Have a local installation of Python and software to edit Jupyter notebooks (e.g., Jupyter Lab, Microsoft Visual Studio Code, PyCharm)

Software

This workshop is intended to be interactive. Before the workshop, please download the materials from this repository. You can download the materials as a ZIP file using the green button near the top of this page, or you can clone this repository by typing the following in a terminal:

git clone https://github.com/QLS-MiCM/DataProcessingInPython.git

In your Python environment, you must have the following packages installed:

  • NumPy
  • Matplotlib
  • pandas
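One way to confirm that these packages are available before the workshop is to ask Python whether it can locate them. This is a minimal sketch that uses only the standard library. The helper name `missing_packages` is made up for this example and is not part of the workshop materials.

```python
import importlib.util

# Import names of the packages listed above.
required = ["numpy", "matplotlib", "pandas"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Install them with pip or conda before the workshop.")
else:
    print("All required packages are installed.")
```

Run this in the same environment or notebook kernel that you plan to use during the workshop, since each environment has its own set of installed packages.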

Links to Colab

If you don't want to install anything locally, you can open the workshop materials in Google Colab using the links at the top of this page.

Warning: Make sure that using_colab = True in the first code cell and run that cell to download all the data files required for this workshop.
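If you are unsure whether the flag matches your environment, a quick sanity check along these lines may help. It is a sketch only, not the code in the notebook's first cell. It assumes that the `google.colab` module is already loaded in a Colab session and absent elsewhere, and the helper `flag_matches_environment` is hypothetical.

```python
import sys

# The flag named in the warning above; set it to match where you are running.
using_colab = True

# Assumption: Colab preloads the google.colab module, local Python does not.
running_in_colab = "google.colab" in sys.modules


def flag_matches_environment(flag, in_colab):
    """Return True when the flag agrees with the detected environment."""
    return flag == in_colab


if not flag_matches_environment(using_colab, running_in_colab):
    print("using_colab does not match the detected environment; check the first cell.")
```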

References

This workshop material relies heavily on the documentation of the various projects discussed, including NumPy, Matplotlib, pandas, conda and pip, as well as the official Python documentation. Links to relevant documentation pages are provided throughout the Jupyter notebook. There are also references to a few other useful tutorials.

This workshop is based on previous iterations of this workshop (as Intermediate Python) and the Intro to Python workshop, which can be found at the following repositories:

Colab badge created using https://shields.io.

Some cool Markdown tricks can be found at https://www.markdownguide.org/hacks/.


Workshop created as part of the McGill Initiative in Computational Medicine.

For more information about the QLS-MiCM, visit: https://www.mcgill.ca/micm/.

The contents of this repository are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
