Introduction to Python for Image Processing And Computer Vision =============================================================== The combination of Python (the language), Numpy (the numerical array lib), SciPy (scientific libs) and Matplotlib (the graphical plot lib) will serve as our computational basis to learn image processing and computer vision. Where possible and needed we will use other libraries as well. We will use the name 'Python' to refer to this collection of tools. The best way to explore Python is by using it. We assume that you have ``ipython`` installed on your computer. All interaction shown is from actual ``ipython`` sessions. Start ``ipython`` with the command ``ipython``. You can start right away typing commands into the command line window: .. ipython:: python @suppress %reset -f x = 2 The interpreter executes this statement and shows nothing (meaning that there were no errors). We may look at the value of variable ``x`` by just typing its name: .. ipython:: python x In Python there is no need to declare variables before using them. If you start losing the overview over which variables you have already used and which data they contain, the command ``whos`` can help you out. .. ipython:: In [1]: whos Out[1]: In case you are not familiar with Python we suggest that you work your way through one of the tutorials that are available on the web (see some suggestions :doc:`here `). In Python the list and dictionary are probably the most important datastructures. A list is a heterogeneous sequence of values (numbers, strings, lists, dictionaries, functions, really anything). .. ipython:: In [1]: a = [ 1, 2, 3, 4, 5 ] In [2]: a[0] In [3]: a[3] In [4]: a[-1] Indexing is done using square brackets and indexing starts at 0. Negative indices start counting from the back to the front. Using nested lists you may construct something that we would call a multidimensional list (or array) in C. .. ipython:: In [1]: a = [ [1,2,3], [4,5], [6] ] In [1]: a[1] In [1]: len(a) In [1]: len(a[1]) In [1]: a[0][1] But because of the list being heterogeneous and nested lists all can have their own length the list is not the preferred way to encode multidimensional arrays as the often are encountered in numerical computing. The best way of learning Numpy is by reading the official tutorial: `Tentative Numpy Tutorial`_. .. _`Tentative Numpy Tutorial`: http://www.scipy.org/Tentative_NumPy_Tutorial Python for Image Processing --------------------------- To demonstrate the use of Python for image processing we will look at two algorithms in some depth here: contrast stretching and linear filtering. Contrast Stretching ................... Contrast stretching is described in :ref:`histogramBasedOperations`. Let $f$ be a scalar image within the range $[0,1]\in\setR$. Then the contrast stretched image $g$ is defined as: .. math:: g(\v x) = \frac{f(\v x)-f_\mbox{min}}{f_\mbox{max}-f_\mbox{min}} Observe that this is a point-operator and we therefore do not need an explicit loop over all pixels (see :ref:`pointOperators`). .. code-block:: python def cst( f ): fmin = amin(f) # the min value in an array fmax = amax(f) # the max value in an array return (f-fmin)/(fmax-fmin) The image :download:`lowcontrast.npy ` uses only part of the range from 0 to 1. Make an histogram of this image and of the contrast stretched version of it. The cst operator is so common when displaying images that it is even built into the matplotlib imshow function. For floating point images the values are always stretched to comprise the whole range from 0 to 1. To undo this automatic stretching in the `imshow` function you have to use something like: .. ipython:: In [1]: from pylab import *; In [1]: a = imread("images/cermet.png"); @savefig lowcontrast.png width=8cm In [1]: clf(); imshow(a,vmin=0,vmax=1,cmap=cm.gray); In [1]: h, be = histogram(a.flatten(), bins=40) @savefig lowcontrasthistogram.png width=8cm In [1]: clf(); bar( be[0:-1], h, width=diff(be)[0] ); xlim( (0,1) ); .. admonition:: Exercise: Run your version of the ``cst`` algorithm on the lowcontrast.npy image and plot its histogram. Linear Filtering ................ With little doubt the linear filter is the most used filter in the image processing toolbox. It simply calculates a weighted summation of all pixel values in a small neighborhood (small compared to the image size). It does so for all shifted neighborhoods in an image and all weighted sums thus form a new image. In signal processing this would be called a moving (weighted) average filter. Let $F$ be the discrete function with all samples of the original image and let $W$ be the weight function, the result image $G$ then is called the correlation of $F$ and $W$: .. math:: G(i,j) = \sum_{k=-\infty}^{+\infty}\, \sum_{l=-\infty}^{+\infty} F(i+k, j+l)\,W(k,l) Although there are built-in functions that calculate the linear filter efficiently, here we are showing a pure Python implementation. In the above mathematical expression we assumed the image (and weight function) to be infinitely large (hence the indices that run from $-\infty$ to $+\infty$). In practice we have to deal with the fact that the image is defined only on a finite domain. Then the indices $(i+l, j+l)$ can point *outside* the image (see the border problem) and we have to decide what value to use for $F(i+k, j+l)$. In practice the weight function has non zero values only in small neighborhood of the origin. Assume that $W(k,l)$ is non zero only in the neighborhood for $k=-K,..,+K$ and $l=-L,..,+L$. .. literalinclude:: /../python/linfilters.py :pyobject: linfilter1 Note that the correlation function ``linfilter1`` has 4 nested loops. For large images and large kernels it explains why even the simplest image processing function are so processor intensive. Because of the interpreted nature of the Python language writing a 4 levels nested loop is bound to result in a very slow program. Test this function on a small image. .. ipython:: In [1]: from linfilters import *; In [1]: from scipy.ndimage.interpolation import zoom; In [1]: from pylab import *; rcParams['savefig.bbox'] = 'tight' In [1]: a = imread('images/cameraman.png' ); In [1]: f = zoom(a, 0.25); In [1]: g = linfilter1(f, ones((5,5))/25); In [1]: subplot(1,2,1); In [1]: imshow(f); gray(); In [1]: subplot(1,2,2); @savefig cameraman_linfilter1.png In [1]: imshow(g); Python doesn't like expliciteloops (i.e. they are slow). In many cases we can come up with other ways to implement the same function and circumvent most explicit loops. Obviously the loops over all pixels and all neighbors must be there somewhere but 'hidden' in Python functions (written in C) and therefore we can achieve considerable speedups. .. literalinclude:: /../python/linfilters.py :pyobject: linfilter2 This version circumvents the double loop over all pixels in the neighborhood. It extracts a detail image of the correct size. The correct multi-indices are build with the `ix_`_ function. .. _`ix_`: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ix_.html#numpy.ix_ Instead of 'hiding' the loops over all pixels in the neighborhood we can hide the loops over all pixels in the image. Observe that a correlation is equal to the addition of shifted and weighted images. .. literalinclude:: /../python/linfilters.py :pyobject: linfilter3 And the easiest alternative is the builtin function correlate: .. literalinclude:: /../python/linfilters.py :pyobject: linfilter4 .. admonition:: Exercise: #. Time these alternatives using an image of size 64x64 for several sizes of the neighborhood (from 3x3 to 11x11 eg). Make a plot with the timings as function of the neighborhood size for all filters.