Pandas DataFrames in Jupyter Notebooks

  • Pre-workshop activities: 10 min
  • Introductory presentation: 15 min
  • Hands-on activities: 30-60 min

Why Pandas? Why Jupyter Notebook?

Pandas is an open-source Python library for structuring and analyzing data. It is flexible, integrates with other Python libraries, and is well suited to automating repetitive processes.

Jupyter Notebook is a web-based environment for interactive computing that helps researchers and scientists easily view programming output. In contrast to traditional programming environments, users can run code and view its output one section at a time, which helps with breaking down tasks and debugging. No command-line knowledge is needed.

This workshop uses the syzygy.ca Jupyter Notebook software because it is free for students and staff at several Canadian institutions and avoids the hassle of installing anything new. Jupyter notebooks are also available on other platforms.

Learning objectives:

At the end of this workshop, you will be able to:

  1. Input and output code in Jupyter Notebook
  2. Understand the use cases of the Pandas library
  3. Load a CSV file as a dataframe in a Python notebook
  4. Output properties of a dataframe
  5. Manipulate the rows and columns of a dataframe
  6. Create a visualization using the data from the dataframe
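
As a rough preview of objectives 3 to 6, the sketch below loads a small CSV into a dataframe, prints a few of its properties, adds and filters data, and draws a bar chart. The data and the column names (name, year, count) are made up for illustration and are not part of the workshop materials; in a notebook you would normally pass a file path such as "data.csv" to read_csv instead of the in-memory text used here.

```python
import io

import pandas as pd

# Hypothetical data standing in for an uploaded CSV file.
csv_text = """name,year,count
A,2020,10
B,2021,15
C,2022,12
"""

# Objective 3: read CSV data into a dataframe.
# (With a real file, use pd.read_csv("data.csv") instead.)
df = pd.read_csv(io.StringIO(csv_text))

# Objective 4: output some properties of the dataframe.
print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # column names
print(df.dtypes)            # data type of each column

# Objective 5: manipulate columns and rows.
df["double_count"] = df["count"] * 2   # add a new column
recent = df[df["year"] >= 2021]        # keep only rows from 2021 onward
print(recent)

# Objective 6: create a visualization.
# Plotting relies on matplotlib; skip it if that library is not installed.
try:
    df.plot(x="year", y="count", kind="bar")
except ImportError:
    print("matplotlib is not installed, so the chart was skipped")
```

In a Jupyter notebook, each commented step could go in its own cell so that its output appears directly beneath it.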

NEXT STEP: Pre-Workshop Activities