Data Visualization

At the heart of any data science workflow is data exploration. Most commonly, we explore data by using the following:

  • Statistical methods (measuring averages, measuring variability, etc.)

  • Data visualization (transforming data into a visual form)

This means that one of the central tasks of data visualization is to help us explore data.

The other central task is to help us communicate and explain the results we've found through exploring data. Accordingly, we distinguish two kinds of data visualization:

  • Exploratory data visualization: we build graphs for ourselves to explore data and find patterns.

  • Explanatory data visualization: we build graphs for others to communicate and explain the patterns we've found through exploring data.

In this first course, we're going to focus on exploratory data visualization. The main visualization library we're going to use is Matplotlib. We're going to learn the following:

  • How to visualize time series data with line plots.

  • What correlations are and how to visualize them with scatter plots.

  • How to visualize frequency distributions with bar plots and histograms.

  • How to speed up our exploratory data visualization workflow with the pandas library.

  • How to visualize multiple variables using Seaborn's relational plots.

In the second course, our focus will be explanatory data visualization. We'll learn about graph aesthetics, information design principles, storytelling data visualization, customizing graphs with Matplotlib, and more.

When we use Matplotlib inside Jupyter, we also need to add the %matplotlib inline magic, which tells Jupyter to render the graphs directly in the notebook, below the cell that produces them.

We only need to run %matplotlib inline once inside a notebook. If we have a notebook with ten cells, and we plot a graph in each cell, it's enough to add %matplotlib inline in the first cell.
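As a rough illustration of this setup, the first cell of a notebook might look like the sketch below. The data, variable names, and labels are made up for the example and don't come from the course. The magic line is shown as a comment because it is IPython syntax rather than plain Python; inside Jupyter, we would uncomment it.

```python
# First cell of the notebook: run the magic once (uncomment it inside Jupyter).
# %matplotlib inline

import matplotlib.pyplot as plt

# Hypothetical values, used only to have something to plot.
months = [1, 2, 3, 4, 5, 6]
values = [10, 12, 9, 15, 18, 17]

# Build a simple line plot of the values over time.
fig, ax = plt.subplots()
ax.plot(months, values)
ax.set_xlabel("Month")
ax.set_ylabel("Value")
ax.set_title("Example line plot")
# With the inline backend active, Jupyter renders the figure below the cell
# once the cell finishes running.
```

Later cells in the same notebook can create more plots without repeating the magic.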
