If we want to use libraries and modules that are not part of Python's built-in functionality, we have to import them. There are a number of ways to do this.
import numpy,scipy
Imports the module numpy and the module scipy, and creates references to those modules in the current namespace. In other words, after you’ve run this statement, you can use numpy.name and scipy.name to refer to things defined in the modules numpy and scipy, respectively.
from numpy import *
Imports the module numpy, and creates references in the current namespace to all public objects defined by that module (that is, everything that doesn’t have a name starting with “_”). Or in other words, after you’ve run this statement, you can simply use a plain name to refer to things defined in module numpy. Here, numpy itself is not defined, so numpy.name doesn’t work. If name was already defined, it is replaced by the new version. Also, if name in numpy is changed to point to some other object, your module won’t notice.
from numpy import array, mean, std
Imports the module numpy, and creates references in the current namespace to the given objects. Or in other words, you can now use array and mean and std in your program.
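The import forms above can be tried side by side. The following is a minimal sketch that uses the standard-library statistics module as a stand-in for numpy, so it runs without any extra packages; the data list and the local mean function are made up for illustration, and the alias form (import ... as ...) is included as an assumption about what you will meet in practice.

```python
# Form 1: import the module and reach names through it
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.mean(data))


# A name we define ourselves before importing (hypothetical)
def mean(values):
    return "my own mean"


# Form 2: import selected names into the current namespace.
# The imported mean replaces the function defined above.
from statistics import mean, median

print(mean(data))
print(median(data))

# Form 3: import the module under a shorter alias
import statistics as st

print(st.median(data))
```

Note how the locally defined mean is silently replaced by the imported one, which is the same name-clash risk described for the star import.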
More information on the numpy package will follow later in the tutorials.
Thus far we have been entering lines into the IPython terminal or in a Jupyter notebook one or a few lines at a time. This is a great way to prototype an idea you have and want to implement, but no way to write a full program! For writing programs we turn to writing scripts, and for that we need to exit IPython and open our favorite text editor. To exit IPython you can enter the command exit() into the terminal and enter y if prompted.
When writing scripts, one should always keep coding style and coding conventions in mind. More information on coding style and conventions for Python (PEP) can be found in the tutorial section on Coding Style.
Let's move on to writing our first Python script and then talk more about things we can use in Python. For our first program let's keep it simple and use some ideas we have already worked with. How about we find all the prime numbers up to 100? To do this we need to test whether each number between 2 and 100 is divisible by any number smaller than itself (other than 1). We can do this pretty easily with a list, a few loops, and an if statement.
#!/usr/bin/env python
numbers0to100=range(2,101)
for num in numbers0to100:
    prime=True
    for chknum in range(2,num):
        if num%chknum==0:
            prime=False
    if prime:
        print(num,"is prime")
We can copy this script into our text editor and then save it with a descriptive name in a directory we will use for these script examples. Let's name it primenumbers.py, where the .py extension identifies what kind of file it is. Now we have a script and need to run it.
There are two main ways to run a python script. Firstly, we can give the path to our script as the argument of the python command on the command line in a shell by typing
python primenumbers.py
Secondly, we can include the so-called shebang line
#!/usr/bin/env python
as the first line of our script. If the file is executable (for example after running chmod +x primenumbers.py), the shebang determines which python on your system to use, and we can run the script by typing on the command line
./primenumbers.py
Both methods produce the same result, so it's really just a matter of preference. Let's run our program by the first method for now by making sure our shell is in the same directory as our script and then running
python primenumbers.py
You should see the same result as when running it in the Jupyter notebook.
One thing that we do not want to do with a script is frequently edit and re-save the file to alter parameters or the files it reads from and writes to. For things like this we want to specify arguments after the script name on the command line. Command-line arguments are not available without an import, so we need a module from the standard library to access them. For simple things we will use the sys module; for more complex things we will use the argparse module. Let's start with a simple example and discuss what it does and how to run it. Save the following code as simpleargv.py in your running directory.
#!/usr/bin/env python
import sys
print('Number of arguments:', len(sys.argv), 'arguments.')
print('Argument List:', sys.argv)
Python takes every word in your shell command after python (if you're using that invocation) and places it in a list called sys.argv, whose elements are the words that were separated by spaces in the command. Thus the length of sys.argv is the number of arguments you have plus 1 (the name of the script is stored as the first element), and you can get the arguments you are feeding into the program by indexing into the list. Try running the program in the shell as follows and see if the output makes sense.
python simpleargv.py one two three
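To connect this back to the prime number script, here is a minimal sketch of how the upper limit could be read from the argument list instead of being edited in the file. The function names limit_from_argv and primes_up_to and the script name primesupto.py are hypothetical, and the argument list is simulated so the example runs anywhere; in a real script you would pass sys.argv.

```python
def limit_from_argv(argv, default=100):
    """Return the first command-line argument as an int, or the default."""
    if len(argv) > 1:
        # Entries of sys.argv are always strings, so convert explicitly
        return int(argv[1])
    return default


def primes_up_to(limit):
    """Collect all primes from 2 up to and including limit."""
    primes = []
    for num in range(2, limit + 1):
        # num is prime if no smaller number (from 2 up) divides it
        if all(num % chknum != 0 for chknum in range(2, num)):
            primes.append(num)
    return primes


# Simulated command line; in a real script, pass sys.argv instead
example_argv = ["primesupto.py", "20"]
print(primes_up_to(limit_from_argv(example_argv)))
# prints [2, 3, 5, 7, 11, 13, 17, 19]
```

Keeping the parsing in its own small function makes the default value explicit and keeps the prime search independent of how the limit was supplied.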
For more advanced input we might want the program to be more robust than just taking a list of words from the command line. This is where argparse comes in. With argparse we can designate specific options to be valid and set variables in our script based on the options which are specified. Let's say that we would like our Python script to support a --name option. We could define this alone, but we could also allow --name to have a shorter sibling -n. --name and -n should have the same meaning, but one is shorter and the other longer and more verbose. To do this we will again look at a simple example and discuss how to run it. You can save the following code to your running directory and name it simpleargparse.py.
#!/usr/bin/env python
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-n', '--name')
parser.add_argument('-v', '--verbose')
args = parser.parse_args()
print(args.name,args.verbose)
After the shebang we import the module argparse and then create an ArgumentParser instance in our variable parser. Then we add two options and parse the arguments. Finally, we print the stored values by accessing args.name and args.verbose. To run this program we could use
python simpleargparse.py -n James -v True
which should just result in James and True being printed.
Let's say we want our variable to be stored to a name we choose ourselves, define a default value, and also define beforehand what kind of variable we want the incoming variables to be. To do that, we have to add the dest, type, and default options respectively to our add_argument lines.
#!/usr/bin/env python
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-n','--name',dest='my_name',type=str,default='KillRoy')
parser.add_argument('-v','--verbose',dest='v',type=bool,default=False)
args = parser.parse_args()
print(args.my_name,args.v)
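If we run the program with the same options as before, the results will be the same. However, we no longer have to specify them; any option that is left out falls back to its default. One caveat: type=bool does not interpret the text it is given. It simply calls bool() on the string, so any non-empty value, including False, yields True. For a real on/off switch, argparse offers action='store_true'.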
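As a hedged illustration of how a boolean option can be handled, the sketch below swaps type=bool for action='store_true', which is a different technique from the one used in the script above: the flag takes no value and is simply present or absent. The argument lists passed to parse_args stand in for the command line so the example runs without a shell.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-n', '--name', dest='my_name', type=str, default='KillRoy')
# store_true: the option takes no value; it is True if given, else False
parser.add_argument('-v', '--verbose', dest='v', action='store_true')

# Equivalent to running: python simpleargparse.py -n James -v
args = parser.parse_args(['-n', 'James', '-v'])
print(args.my_name, args.v)   # James True

# Equivalent to running the script with no options at all
args = parser.parse_args([])
print(args.my_name, args.v)   # KillRoy False

# Why type=bool is a trap: any non-empty string converts to True
print(bool('False'))          # True
```

With this variant the script would be invoked with -v alone rather than -v True.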
Another thing that argparse is great for is documentation. By running the program and specifying the -h option we can see the help information specified in the add_argument lines and a program banner specified in the ArgumentParser.
#!/usr/bin/env python
import argparse
desc="Just a test program to demonstrate some argparse basics"
parser = argparse.ArgumentParser(description=desc)
parser.add_argument('-n','--name',dest='my_name',
                    type=str,default='KillRoy',
                    help="Give your name so I can print it.")
parser.add_argument('-v','--verbose',dest='v',
                    type=bool,default=False,
                    help="Do you want me to be verbose?")
args = parser.parse_args()
print(args.my_name,args.v)
Now running the program with the -h option gives you a description of the program and the options it uses.
The ArgumentParser adds a lot of great functionality and all the options can be found at the Python documentation pages online. In addition, have a look here for a tutorial on argparse in Python 2.
Before we get too far along for anyone to develop bad habits, let's stop and talk about some style guidelines you should follow, as they are laid out in the Python Enhancement Proposal on coding style (PEP 8).
Python differs from many other languages in that it uses white space (indentation) to determine which control statement a line of code belongs to. Use 4 spaces for indentation. This is enough space to give your code some visual structure, while leaving room for multiple indentation levels. In most text editors you can change the settings so that a tab inserts 4 spaces.
Use up to 79 characters per line of code, and 72 characters for comments. This is a style guideline that some people adhere to and others completely ignore. It used to relate to a limit on the display size of most monitors. Now almost every monitor is capable of showing much more than 80 characters per line. But we often work in terminals, which are not always high-resolution. We also like to have multiple code files open next to each other. It turns out this is still a useful guideline to follow in most cases. When a function call's arguments get long, you can simply break the line after an argument; Python will keep reading arguments until it finds the matching ')'. If a string or equation runs too long, a backslash ('\') can be used to break it into more than one line.
Use single blank lines to break up your code into meaningful blocks.
Start each comment line with a pound sign (#) followed by a single space. If you are writing more than one paragraph, use an empty line with a pound sign between paragraphs.
Name variables and program files using only lowercase letters, underscores, and numbers. Python won't complain or throw errors if you use capitalization, but you will mislead other programmers if you use capital letters in variables at this point.
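The guidelines above are easier to remember when seen together. The following short sketch applies them in one place; the function mean_of_squares and its argument names are invented purely for illustration.

```python
# A comment starts with a pound sign followed by a single space.
#
# A second paragraph is separated from the first by a line that
# holds only the pound sign.


def mean_of_squares(first_value, second_value, third_value,
                    fourth_value):
    # Arguments may continue on the next line inside the parentheses.
    # A long expression can be split with a backslash.
    total = first_value**2 + second_value**2 \
        + third_value**2 + fourth_value**2

    # A single blank line separates meaningful blocks.
    return total / 4


result = mean_of_squares(1, 2, 3, 4)
print(result)  # 7.5
```

All names are lowercase with underscores, indentation is 4 spaces, and no line exceeds 79 characters.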
The NumPy (Numerical Python) package provides efficient routines for manipulating large arrays and matrices of numeric data. It contains, among other things, the N-dimensional array object (numpy.ndarray).
By convention, NumPy is usually imported via
import numpy as np
The fundamental data structure that NumPy gives us is the ndarray (usually just called "array"). According to the NumPy documentation
An array object represents a multidimensional, homogeneous array of fixed-size items. An associated data-type object describes the format of each element in the array (its byte-order, how many bytes it occupies in memory, whether it is an integer, a floating point number, or something else, etc.)
Generally, think of an array as an (efficient) Python list with additional functionality. BUT keep in mind that there are a few important differences to be aware of. For instance, array objects are homogeneous: the values in an array must all be of the same type. Because arrays are stored in an unbroken block of memory, they need to be of fixed size. While NumPy does support appending to arrays, each append creates a new array and copies the data, which can become problematic for very large arrays.
array = np.array([1, 2, 3, 4, 5, 6])
print(array)
print(type(array))
The datatype of an array can be found using the dtype array attribute:
print(array.dtype)
If not specified, NumPy will try to determine what dtype you wanted, based on the context. However, you can also manually specify the dtype yourself.
array = np.array([1, 2, 3, 4, 5, 6], dtype=float)
print(array)
print(array.dtype)
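The homogeneity and fixed-size points made earlier can be checked with dtype and size. This is a small sketch under the assumption that NumPy is installed; the variable names mixed, original, and extended are chosen only for this example.

```python
import numpy as np

# Mixing integers and a float: NumPy picks one dtype for all elements
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Appending does not grow the array in place; np.append returns a copy
original = np.array([1, 2, 3])
extended = np.append(original, 4)
print(original.size, extended.size)  # 3 4
```

Because every append allocates and copies a new array, building a large array element by element is usually slower than creating it at its final size.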
Array attributes reflect information that is intrinsic to the array itself: for example, its shape, the number of items in the array, or (as we've already seen) the data type of the items.
array = np.array([[1, 2, 3],[4, 5, 6]], dtype=float)
print(array)
print(array.shape)
print(array.size)
print(array.dtype)
In addition to array attributes, ndarrays also have many methods that can be used to operate on an array.
print(array.sum()) # Sum of the values in the array
print(array.min()) # Minimum value in the array
print(array.max()) # Maximum value in the array
print(array.mean()) # Mean of the values in the array
print(array.cumsum()) # Cumulative sum at each index in the array
print(array.std()) # Standard deviation of the values in array
NumPy naturally supports various matrix operations:
M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(M)
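print(M.T)               # Transpose of the matrix
print(M.diagonal())      # Elements on the main diagonal
print(M.dot([1, 2, 3]))  # Matrix-vector product
print(M.trace())         # Sum of the diagonal elements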
To learn more about the motivation and need for something like NumPy, check out this great blog post Why Python is Slow: Looking Under the Hood.
array1 = np.array([1, 2, 3, 4])
array2 = np.array([5, 6, 7, 8])
print(array1 + array2)
print(2*array1)   # Each element multiplied by 2
print(array1**2)  # Each element squared
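The element-wise behaviour of these operators is one of the places where arrays differ most visibly from lists. A minimal comparison, assuming NumPy is installed and using a made-up list called values:

```python
import numpy as np

values = [1, 2, 3, 4]
array1 = np.array(values)

# For a list, * repeats and + concatenates
print(2 * values)        # [1, 2, 3, 4, 1, 2, 3, 4]
print(values + values)   # [1, 2, 3, 4, 1, 2, 3, 4]

# For an array, the same operators act on each element
print(2 * array1)        # [2 4 6 8]
print(array1 + array1)   # [2 4 6 8]
```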
Let's get an idea of how much NumPy speeds things up.
from IPython.display import Image
Image('img/squares-list-creation.png')
Image('img/sum-range.png')
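If the images are not available, a rough comparison can be reproduced with the standard-library timeit module. This is only a sketch: the array size n, the repeat count, and the helper function names are arbitrary choices, and the measured times depend on your machine, so no specific numbers are claimed here.

```python
import timeit

import numpy as np

n = 10_000  # number of values to square (arbitrary choice)


def squares_with_list():
    # Pure Python: one multiplication per loop iteration
    return [i**2 for i in range(n)]


def squares_with_numpy():
    # NumPy: the whole array is squared in one vectorized operation
    return np.arange(n, dtype=np.int64)**2


# Run each function 100 times and report the total time
list_time = timeit.timeit(squares_with_list, number=100)
numpy_time = timeit.timeit(squares_with_numpy, number=100)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy array:        {numpy_time:.4f} s")
```

Both functions compute the same values, so any difference in the printed times comes from how the work is carried out rather than from what is computed.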