Matplotlib

According to the Matplotlib website: "Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits."

(Visual) examples can be found here and here for inspiration. In addition, we will show some of Matplotlib's functionality in this tutorial.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Matplotlib provides many plotting functions. For example,

  • plot — plotting x and y data points
  • errorbar — plotting x and y data points with errorbars
  • hist — plotting histograms
  • hist2d — plotting 2D histograms
  • matshow — display a matrix
  • etc...
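
The cells below only use plot, so here is a minimal sketch of four other functions from the list, drawn side by side. All data are made up for illustration (random samples with a fixed seed), and the panel layout and bin counts are arbitrary choices rather than recommendations.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0)  # seeded so the figure is reproducible

# Made-up data, for illustration only
x = np.linspace(0, 2 * np.pi, 15)
y = np.sin(x)
y_err = np.full_like(x, 0.2)         # same error bar on every point
samples = rng.normal(size=1000)
samples_2 = rng.normal(size=1000)
matrix = rng.random((5, 5))

fig, axes = plt.subplots(2, 2, figsize=(8, 8))

# x and y data points with vertical error bars
axes[0, 0].errorbar(x, y, yerr=y_err, fmt='o')
axes[0, 0].set_title('errorbar')

# 1D histogram; returns the bin counts, the bin edges and the drawn bars
counts, bin_edges, patches = axes[0, 1].hist(samples, bins=20)
axes[0, 1].set_title('hist')

# 2D histogram of two samples
axes[1, 0].hist2d(samples, samples_2, bins=20)
axes[1, 0].set_title('hist2d')

# Display a matrix as a colored grid
axes[1, 1].matshow(matrix)
axes[1, 1].set_title('matshow')

fig.tight_layout()
plt.show()
```

Each function is called on an Axes object here, the same style the later cells in this tutorial use.
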
In [2]:
x = np.linspace(0, 4*np.pi, 100)
y = np.sin(x)
In [3]:
plt.plot(x, y);
In [4]:
x = np.linspace(0, 4*np.pi, 100)
y1 = np.sin(x)
y2 = np.cos(x)
In [5]:
fig, ax = plt.subplots()
ax.plot(x, y1)
ax.plot(x, y2)
plt.show()
In [6]:
fig, ax = plt.subplots()
ax.plot(x, y1, label='Sine')
ax.plot(x, y2, label='Cosine')
ax.legend()
plt.show()
In [7]:
fig, ax = plt.subplots()
ax.plot(x, y1, label='Sine')
ax.plot(x, y2, label='Cosine')
ax.set_xlabel('x')
ax.set_ylabel('f(x)')
ax.grid()
ax.legend(title='Functions')
plt.show()
In [8]:
fig, ax = plt.subplots()
ax.plot(x, y1, label='Sine')
ax.plot(x, y2, label='Cosine')
ax.fill_between(x, y2, y1, color='C2', alpha=0.25)
ax.set_xlabel('x')
ax.set_ylabel('f(x)')
ax.grid()
ax.legend(title='Functions')
plt.show()

Let's try to make some plots based on the text file people.txt that we created in the first part of these lectures. (If you lost it, you can download it here).

In [9]:
#!/usr/bin/env python

# Reserve two Python lists for the names and ages in the txt file
names = []
ages = []

# Open the file read-only; the with-block closes it automatically
with open("people.txt", 'r') as f:
    # Loop over the lines in the file and fill the lists
    for line in f:
        name, age = line.split()[:2]
        names.append(name)
        ages.append(int(age))

# Make a plot
fig, ax = plt.subplots(figsize = (8,5))

bar_width = 0.75
opacity = 0.6

bars = ax.bar(names, ages, bar_width,
                alpha=opacity, color='r')

ax.set_xlabel('Name')
ax.set_ylabel('Age')
ax.set_title('Ages of people in people.txt')
plt.show()

We could also make two data series for the men and women separately. We could (and should) have done that by adding another column 'gender' in the txt file, but let's take another approach in the example below where we use a dictionary instead.

In [10]:
#!/usr/bin/env python

people = {'James':[24,'m'],'Jane':[40,'f'],'Sam':[12,'m'],'Ben':[1,'m'],'Debbie':[20,'f'],'Peggy':[30,'f'],\
          'Chuck':[67,'m'],'Mary':[8,'f'],'Buck':[30,'m'],'Burt':[100,'m']}
# Loop over the dictionary and split the entries into separate lists for men and women
male = []
male_ages = []
female = []
female_ages = []

for key, value in people.items():
    if value[1] == 'm':
        male.append(key)
        male_ages.append(value[0])
    if value[1] == 'f':
        female.append(key)
        female_ages.append(value[0])

# Make a plot
fig, ax = plt.subplots(figsize = (8,5))

bar_width = 0.75
opacity = 0.6

male_bars = ax.bar(male, male_ages, bar_width,
                alpha=opacity, color='b')

female_bars = ax.bar(female, female_ages, bar_width,
                alpha=opacity, color='r')

ax.set_xlabel('Name')
ax.set_ylabel('Age')
ax.set_title('Ages of people in people.txt')
plt.show()
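
The alternative mentioned above, a third 'gender' column in the text file, could be parsed roughly as follows. This is only a sketch: the three-column layout is an assumption (the people.txt used in this tutorial has just a name and an age per line), and an in-memory file with a few entries from the dictionary stands in for a real file.

```python
import io

# Hypothetical three-column layout: name, age, gender.
# io.StringIO stands in for an actual file on disk.
sample_file = io.StringIO(
    "James 24 m\n"
    "Jane 40 f\n"
    "Sam 12 m\n"
    "Debbie 20 f\n"
)

# One dictionary of name -> age per gender
ages_by_gender = {'m': {}, 'f': {}}
for line in sample_file:
    name, age, gender = line.split()
    ages_by_gender[gender][name] = int(age)

print(ages_by_gender['m'])
print(ages_by_gender['f'])
```

The keys and values of ages_by_gender['m'] and ages_by_gender['f'] can then be passed to ax.bar in the same way as the male and female lists above.
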

Finally, we can make a histogram of the ages, print out some statistics, and let one of the bins stand out by changing the color.

In [11]:
#!/usr/bin/env python

# Make an array of all the ages in the dataset using a list comprehension
with open("people.txt", 'r') as f:
    ages = np.array([int(line.split()[1]) for line in f])
mu = ages.mean()
sigma = ages.std()

print('Average age is {:.0f}'.format(mu))
print('Standard deviation is {:.0f}'.format(sigma))

# Make a plot
fig, ax = plt.subplots(figsize = (8,5))

# Bin the data using the ax.hist command
n, bins, patches = ax.hist(ages, 10,color='b',alpha=0.6)
 
ax.set_xlabel('Age')
ax.set_ylabel('Number of people')
ax.set_title(r'Histogram of ages: $\mu={:.0f}$, $\sigma={:.0f}$'.format(mu,sigma))
ax.set_yticks([1,2,3,4])
plt.setp(patches[-1], 'facecolor', 'g')
plt.show()
Average age is 32
Standard deviation is 27
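
To see what the n and bins values returned by ax.hist contain, the binning can be reproduced without drawing anything: ax.hist uses np.histogram to bin the data. The sketch below uses the ages from the dictionary in the earlier cell, which are not necessarily the same as the contents of people.txt, so the numbers need not match the output above.

```python
import numpy as np

# Ages copied from the dictionary in the earlier cell (not read from people.txt)
ages = np.array([24, 40, 12, 1, 20, 30, 67, 8, 30, 100])

# Ten equal-width bins between the smallest and the largest age
counts, bin_edges = np.histogram(ages, bins=10)

# Ten bins are described by eleven edges
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print('{:5.1f} to {:5.1f}: {}'.format(left, right, count))
```

The last bin includes its right edge, so it always contains the largest value; that is the bar that patches[-1] refers to in the cell above.
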

Other Python packages to check out

  • SciPy: The SciPy (Scientific Python) package extends NumPy with a substantial collection of algorithms for minimization, Fourier transforms, regression, and other applied mathematics. You can import scipy in any of the ways described earlier, but the functionality you need most likely lives in a subpackage that has to be imported explicitly (for example, from scipy import stats). I won't go into detail on individual SciPy functions because there are simply too many; you can find them on the SciPy documentation page, and some commonly used subpackages are listed below. SciPy and NumPy work hand in hand, so use SciPy to perform operations on your arrays rather than writing them by hand.
    • scipy.stats - Everything statistics.
    • scipy.constants - Physical and mathematical constants
    • scipy.fftpack - Fourier transforms (legacy interface; new code should prefer scipy.fft)
    • scipy.integrate - Integration routines
    • scipy.interpolate - Interpolation
    • scipy.linalg - Linear algebra routines
    • scipy.ndimage - n-dimensional image package
    • scipy.special - Any special mathematical functions
  • Pandas — High-performance data analysis toolkit
  • Seaborn — Statistical data visualization using Matplotlib
  • Scikit-learn — Machine learning library
  • Jupyter — Notebook documents that combine code, Markdown documentation, images, etc. Ideal for documenting an analysis.
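
As a small illustration of importing SciPy subpackages explicitly, the sketch below uses scipy.integrate and scipy.constants from the list above. The integrand and the constant were picked arbitrarily for the example.

```python
import numpy as np
from scipy import constants, integrate

# Integrate sin(x) from 0 to pi; the exact answer is 2.
# quad returns the result and an estimate of the absolute error.
value, abs_error = integrate.quad(np.sin, 0, np.pi)
print('Integral of sin(x) on [0, pi]: {:.6f}'.format(value))

# Physical constants are plain floats in SI units
print('Speed of light: {} m/s'.format(constants.c))
```
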

Additional exercises

If you're still up for it, here are some additional exercises :)

  1. Use the ipython terminal to prototype a function that prints the first 100 terms of the Fibonacci sequence. Save your lines of code to a file named fibscript.py so you can use it later.
  2. Write a script called mcpi.py that calculates pi by drawing random numbers uniformly on the space x=(-1,1) y=(-1,1).
    • Hint: Write out the area of a circle and a square and take the ratio.
    • Super Hint: The number of points falling randomly in a given region is proportional to the area of that region.
  3. Write a script called peopletodict.py to make a dictionary from people.txt where the keys are the names and the values are the ages.