2. Problem statement
Let's say we have a table of data like this:
name      country  apples  pears
Giovanni  Italy    31      13
Mario     Italy    23      33
Luigi     Italy    0       5
Margaret  England  22      13
Albert    Germany  15      6
How do we read it in python?
How do we do some basic plotting?
3. Alternatives for plotting data in python
Pylab (enthought) → Matlab/Octave approach
Enthought → extended version of Pylab (free for academic use)
rpy/rpy2 → allows running R commands from within python
Sage → interfaces python with Matlab, R, Octave, Mathematica, ...
4. The Pylab system
pylab is a system of three libraries and an interactive interpreter, which together
turn python into a Matlab-like environment
It is composed of:
Numpy (arrays, matrices, complex numbers, etc. in python)
Scipy (extended scientific/statistics functions)
Matplotlib (plotting library)
iPython (extended interactive interpreter)
5. How to install pylab
There are several ways to install PyLab:
use the package manager of your linux distro
use enthought's distribution (http://www.enthought.com/products/epd.php) (free for academic use)
compile it yourself, and google for help!
Numpy and scipy contain some Fortran libraries, so easy_install doesn't work well with them
6. ipython -pylab
IPython is an extended version of the standard python interpreter
It has a mode designed especially for pylab
The standard python interpreter doesn't support plotting very well (no multithreading)
So if you want an interactive interpreter, use ipython with the pylab option:
$: alias pylab="ipython -pylab"
$: pylab
In [1]:
7. Why the python interpreter is not the best for plotting
It gets blocked when you create a plot
8. How to read a CSV file with python
To read a file like this in pylab:
name      country  apples  pears
Giovanni  Italy    31      13
Mario     Italy    23      33
Luigi     Italy    0       5
Margaret  England  22      13
Albert    Germany  15      6
→ Use the function 'matplotlib.mlab.csv2rec', with a tab ('\t') as the delimiter
>>> data = csv2rec('exampledata.txt', delimiter='\t')
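csv2rec was removed from Matplotlib in version 3.1, so on a current installation the same tab-separated table can be read with the standard library instead. The following is a minimal sketch, not the method used on the slide: the helper name read_table is hypothetical, and the table is held in a string that stands in for exampledata.txt.

```python
import csv
import io

# The sample table from the slide, tab-separated.
# This string stands in for the contents of exampledata.txt.
SAMPLE = (
    "name\tcountry\tapples\tpears\n"
    "Giovanni\tItaly\t31\t13\n"
    "Mario\tItaly\t23\t33\n"
    "Luigi\tItaly\t0\t5\n"
    "Margaret\tEngland\t22\t13\n"
    "Albert\tGermany\t15\t6\n"
)


def read_table(handle):
    """Return one dict per row, with the two numeric columns cast to int.

    read_table is a hypothetical helper, not part of any library.
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["apples"] = int(row["apples"])
        row["pears"] = int(row["pears"])
        rows.append(row)
    return rows


rows = read_table(io.StringIO(SAMPLE))
print(rows[1]["name"], rows[1]["apples"], rows[1]["pears"])
```

To read a real file, pass an open file object instead of the StringIO, for example open('exampledata.txt', newline='').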
9. Numpy - record arrays
csv2rec stores data in a numpy recarray object, where you can access columns and rows easily:
>>> print data['name']
['Giovanni' 'Mario' 'Luigi' 'Margaret' 'Albert']
>>> data['apples']
array([31, 23, 0, 22, 15])
>>> data[1]
('Mario', 'Italy', 23, 33)
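Record arrays also support attribute access and boolean masks. The sketch below builds the same table by hand (so it runs without any input file) and shows both; the field widths and integer types in the dtype are assumptions chosen for this example.

```python
import numpy as np

# Build the table from the slide as a structured array, then view it as a
# recarray so that columns are also available as attributes.
data = np.array(
    [
        ("Giovanni", "Italy", 31, 13),
        ("Mario", "Italy", 23, 33),
        ("Luigi", "Italy", 0, 5),
        ("Margaret", "England", 22, 13),
        ("Albert", "Germany", 15, 6),
    ],
    dtype=[("name", "U10"), ("country", "U10"), ("apples", "i4"), ("pears", "i4")],
).view(np.recarray)

# Column access by key and by attribute are equivalent on a recarray.
total_apples = data.apples.sum()
total_pears = data["pears"].sum()

# A boolean mask selects rows: here, everyone from Italy.
italians = data[data["country"] == "Italy"]

# Masks can compare two columns: people with more apples than pears.
more_apples = data[data.apples > data.pears]

print(total_apples, total_pears)
print(italians["name"])
print(more_apples["name"])
```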
10. Alternative to csv2rec
numpy.genfromtxt (new in 2009)
More options than csv2rec, included in numpy
Tricky default parameters: need to specify dtype=None
>>> data = numpy.genfromtxt('datafile.txt', dtype=None)
>>> data
array....
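For a table with a header row and tab separators, like the one used in these slides, genfromtxt needs a few more arguments than dtype=None. A possible call is sketched below, reading from an in-memory string that stands in for the data file; passing encoding='utf-8' is an assumption made here so that text columns come back as ordinary strings.

```python
import io

import numpy as np

# Tab-separated sample table; stands in for the contents of the data file.
SAMPLE = (
    "name\tcountry\tapples\tpears\n"
    "Giovanni\tItaly\t31\t13\n"
    "Mario\tItaly\t23\t33\n"
    "Luigi\tItaly\t0\t5\n"
    "Margaret\tEngland\t22\t13\n"
    "Albert\tGermany\t15\t6\n"
)

data = np.genfromtxt(
    io.StringIO(SAMPLE),
    delimiter="\t",    # columns are separated by tabs
    names=True,        # take the field names from the first line
    dtype=None,        # guess a type for each column separately
    encoding="utf-8",  # decode text columns to str rather than bytes
)

print(data.dtype.names)
print(data["apples"])
print(data[1])
```

The result is a structured array, so columns are accessed by name exactly as with the recarray returned by csv2rec.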
12. Barchart
>>> data = csv2rec('exampledata.txt', delimiter='\t')
>>> figure()
>>> clf()
Read a CSV file and store it in a record array object
Use figure() and clf() to reset the graphic device
13. Barchart
>>> data = csv2rec('exampledata.txt', delimiter='\t')
>>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
The bar function creates a barchart
14. Barchart
>>> data = csv2rec('exampledata.txt', delimiter='\t')
>>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
>>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
The second call draws the pears next to the apples, shifted right by one bar width
15. Barchart
>>> data = csv2rec('exampledata.txt', delimiter='\t')
>>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
>>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
>>> xticks(range(len(data)), data['name'])
Redefining the labels on the X axis (xticks)
16. Barchart
>>> data = csv2rec('exampledata.txt', delimiter='\t')
>>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
>>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
>>> xticks(range(len(data)), data['name'])
>>> legend()
>>> grid(True)
>>> title('apples and pears by person')
Adding legend, grid, title
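The whole barchart can also be written as a standalone script with explicit imports instead of the pylab namespace. This is a sketch under a few assumptions: the data are typed in directly rather than read from the file, the Agg backend is selected so that the script runs without a display, and the output file name barchart.png is hypothetical.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # assumption: render to a file, no window needed

import matplotlib.pyplot as plt
import numpy as np

# Data from the example table, typed in directly for this sketch.
names = ["Giovanni", "Mario", "Luigi", "Margaret", "Albert"]
apples = np.array([31, 23, 0, 22, 15])
pears = np.array([13, 33, 5, 13, 6])

positions = np.arange(len(names))
width = 0.1

fig, ax = plt.subplots()

# One bar series per fruit; the second is shifted right by one bar width.
apple_bars = ax.bar(positions, apples, width=width, color="red", label="apples")
pear_bars = ax.bar(positions + width, pears, width=width, color="blue", label="pears")

# Replace the numeric tick labels on the X axis with the people's names.
ax.set_xticks(positions)
ax.set_xticklabels(names)

ax.legend()
ax.grid(True)
ax.set_title("apples and pears by person")

# Hypothetical output location: a PNG in the system temporary directory.
output_path = os.path.join(tempfile.gettempdir(), "barchart.png")
fig.savefig(output_path)
```

In an interactive ipython session the savefig call can be replaced by plt.show().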