================ Image Arithmetic ================ .. ipython:: python :suppress: from pylab import * rcParams['savefig.bbox']='tight' a = imread('images/cermet.png') b = imread('images/trui.png') figure(1) # just comment subplot(1,3,1) title('a',size='xx-small') imshow(a,cmap='gray') axis('off') subplot(1,3,2) title('b',size='xx-small') imshow(b,cmap='gray') axis('off') subplot(1,3,3) title('(a+2*b)/3',size='xx-small') imshow((a+2*b)/3,cmap='gray') axis('off') @savefig imageadding.png width=99% subplots_adjust(hspace=0.3, wspace=0.05) Consider two 2-dimensional images $f$ and $g$ both with signature $\setR^2\rightarrow\setR$ and both sampled on the same grid resulting in discrete representations $F$ and $G$ respectively. The point wise addition of the two images $h=f+g$ is defined by: .. math:: \forall \v x\in\setR^2: h(\v x) = (f+g)(\v x) = f(\v x) + g(\v x) When we sample the addition of the two images on the same grid $B$ we obtain the discrete image $H$: .. math:: \forall \v k \in\setZ^2: H(\v k) = F(\v k) + G(\v k) The Python code for the addition of two images is: .. code-block:: python # Addition of two images def addImage(f, g): h = f.copy() for p in domainIterator(f): h[p] = f[p] + g[p] return h .. sidebar:: Python Programming The copy of ``f`` to make the output image ``h`` is an easy way to be sure that the output image is of the same size and type as the input image. Please note that there is no checking on what size and type image ``g`` is (in production code this is needed of course). Also note that a multi dimensional array in Numpy may be indexed with ``a[p]`` in case ``p`` is a tuple of integers. The function ``domainIterator`` returns an iterator that returns the tuples of all multi-indices in the image domain. In the above example we have *lifted* a well defined operator (the scalar addition $+$) to work on images by applying it pairwise to all pixels of the two images. Consider again the definition of the image addition operator $(f+g)(\v x) = f(\v x) + g(\v x)$. The operator $+$ in the left hand side working on images is *defined* in terms of the operator working on scalars in the right hand side. The addition of two images in terms of a pixel wise addition of the pixel values is *not* needed in Python. Python automatically lifts many of the standard arithmetic operators to work on images (i.e. arrays). The addition of the two images ``f`` and ``g`` could have been simply written as ``f+g``. Definition ========== More formally we can repeat this lifting procedure for any scalar function $\gamma$. Consider two images $f:\set D\rightarrow \set R$ and $g:\set D\rightarrow \set R'$. Let $\gamma:\set R\times \set R'\rightarrow \set R''$ be an operator that takes a value from $\set R$ and a value from $\set R'$ and produces a value from $\set R''$. Such a binary operator $\gamma$ can be *lifted* to work on images: .. math:: \forall\v x\in \set D: \gamma(f,g)(\v x) = \gamma( f(\v x), g(\v x) ) Note that the $\gamma$ function on the right hand side working on the image values is assumed to exist and that the $\gamma$ working on images is *defined* through the above equation. An image operator constructed by point wise lifting a value operator to the image domain is called a *point operator*. We will implicitly assume such a lifting construction when we talk about the addition, subtraction, multiplication etc. of images. We will write $\gamma$ when operating on image values and images alike. Effectively we are *overloading* the $\gamma$-function to work on both values and images. Many of the common binary operators on real numbers are written in an *infix notation* (like $f+g$, $f-g$, etc). We will follow that convention and write $f+g$ to denote the point wise addition of two images. Discretization ============== The lifting construction of a point operator is easily discretized. Let $F$ and $G$ be the sampled discretizations of the images $f$ and $g$ respectively. The discretization $H$ of the the image $h=\gamma(f,g)$ is: .. math:: H(k) = \gamma( F(k), G(k) ) This shows that we can define the discrete image operator $\Gamma$ working on discrete image representations $H = \Gamma(F,G)$. For lifted point operators we never will make the distinction between $\gamma$ and its discretized version $\Gamma$ again. .. \begin{figure}[htbf] .. \centering .. \includegraphics[width=3cm]{timeseq} .. \caption{{\bf Point wise transition between two images.} .. From top to bottom the images $(1-\alpha) f + \alpha g$ .. for $\alpha = 0$, $0.25$, $0.5$, $0,75$ and $1$ are shown.} .. \label{fig:timeseq} .. \end{figure} Python/Numpy ============ In Python/Numpy the idea of lifting operators from their definition working on scalars to work on images (``ndarrays`` in Numpy) is elegantly implemented using the idea of *universal functions* (as quoted from the documentation): "A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features." The term "type casting" refers to the mechanism for automatical type conversion in case the operator requires so. The term "array broadcasting" refers to the mechanism to adapt and set the dimensions of the operands. Array broadcasting is the mechanism that allows us to write ``f+1`` where ``f`` is an image (ndarray). The result is that 1 is added to all elements of ``f``. For an overview of all point operators (or universal functions in Numpy speak) we refer to the `Numpy documentation`_ .. _Numpy documentation: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs Practical Use ============= :math:`\alpha`-Blending ----------------------- .. ipython:: python :suppress: from pylab import * #rcParams['savefig.bbox'] = 'tight' f = imread('images/cermet.png') g = imread('images/trui.png') N = 5 alph = linspace(0,1,N) # yes this should be a loop # but then only the last image shows up i=0 a = alph[i] subplot(1,N,i+1) title( "%2.2f" % a, size='xx-small' ) imshow((1-a)*f+a*g, cmap='gray') axis('off') i=1 a = alph[i] subplot(1,N,i+1) title( "%2.2f" % a, size='xx-small' ) imshow((1-a)*f+a*g, cmap='gray') axis('off') i=2 a = alph[i] subplot(1,N,i+1) title( "%2.2f" % a, size='xx-small' ) imshow((1-a)*f+a*g, cmap='gray') axis('off') i=3 a = alph[i] subplot(1,N,i+1) title( "%2.2f" % a, size='xx-small' ) imshow((1-a)*f+a*g, cmap='gray') axis('off') i=4 a = alph[i] subplot(1,N,i+1) title( "%2.2f" % a, size='xx-small' ) #plt.tight_layout() imshow((1-a)*f+a*g, cmap='gray') axis('off') @savefig imageblending.png width=99% subplots_adjust(hspace=0.3, wspace=0.05) Point operators, although very simple, are often used in practical applications of image processing. A simple example is the weighted average of two images. Let $f$ and $g$ be two color images defined on the same spatial domain. A sequence of images that shows a smooth transition from $f$ to $g$ (if rendered as a movie) is obtained by: .. math:: h_\alpha = (1-\alpha) f + \alpha g for $\alpha$ values increasing from 0 to 1. Such a sequence is sketched above. The Python code to generate the $\alpha$-blend of two images is: .. code-block:: python # Alpha blending of two images def alphaBlend(f, g, alpha): return (1 - alpha) * f + alpha * g .. \begin{figure} .. \centering .. \includegraphics[width=\textwidth]{unsharpmask} .. \caption{{\bf Unsharp image masking.} By subtracting a blurred .. version(middle) from the original image (left) a perceptually .. sharper image (right) is obtained.} .. \label{fig:unsharpmask} .. \end{figure} Unsharp Masking --------------- .. ipython:: python :suppress: from pylab import * rcParams['savefig.bbox'] = 'tight' from scipy.ndimage.filters import gaussian_filter f = imread('images/trui.png') g = gaussian_filter(f,3) h = f + 3*(f-g) figure(1) subplot(1,3,1) imshow(f,cmap='gray',vmin=0,vmax=1) title('f',size='xx-small') axis('off') subplot(1,3,2) imshow(g,cmap='gray',vmin=0,vmax=1) title('g',size='xx-small') axis('off') subplot(1,3,3) imshow(h,cmap='gray',vmin=0,vmax=1) title('f+3*(f-g)',size='xx-small') plt.tight_layout() axis('off') @savefig unsharpmasking.png subplots_adjust(hspace=0.3, wspace=0.05) A second example using alpha blending of two images is *unsharp masking*. This technique takes an image $f$ and a blurred version of that image (we call it $g$). The result image then is obtained by adding $\beta$ times the difference $f-g$ to the original image. For some more background on the origin of unsharp masking we refer to the article on `Wikipedia `_. In a later chapter we will learn how to blur an image through *Gaussian convolution*, only then we can understand why this seemingly simple operation is capable of sharpening the image. Thresholding ------------ An example of the use of a relational operator (resulting in a boolean or binary image) is *image thresholding*. Let $f$ be a scalar image, then $f>t$ (for constant $t$) results in a binary image. .. ipython:: python :suppress: from pylab import * f = imread('images/rice.png')[:,:,1] # only the green component b = f>0.45 figure(1) subplot(1,2,1) imshow(f,cmap='gray') axis('off') title('f') subplot(1,2,2) axis('off') imshow(b,cmap='gray') @savefig threshold.png width=99% title('f>0.45') The traditional interpretation of such a binary image is of white *objects* (the 1 pixels) against a black *background* (the 0 pixels). .. % image arithmetic for indicator images .. % boolean logic for binary images .. % ?? for label images Exercises ========= #. **Blending Color Images.** In case we have an operator $\gamma$ defined on two color values we can obviously lift this operator to work on color images as well. In case the operator $\gamma$ itself is the 'lifted' version of an operator working on scalars then Python/Numpy is of great help. E.g. the addition of two color images is the components wise addition (i.e. adding red, green and blue components independently) can be simply done with ``f+g``. Implement the $\alpha$-blend of two color images. #. **Unsharp Masking of Color Images.** Implement an algorithm for the unsharp masking of color images. There is no unique way to do this. You could first transform the image to a color model that explicitly encodes the luminance (brightness or value) and two color coordinates. Then operate on the luminance channel only. Or you could just work on the red, green and blue image independently. Comment on your findings comparing both versions. .. **Image Processing and Array Semantics.** Numpy defines the .. ``ndarray`` datatype as its basic data type. Each element in the .. array often is a scalar (float or int or ...). For images this .. sometimes gets us in trouble. For instance reading a color image .. returns in an array with shape $(M,N,3)$ where $M\times N$ are the .. image dimensions. The last index in the array refers to the color .. component. But it could be that we have collected three slices out .. of a three dimensional MRI image. The ``ndarray`` data type cannot .. make a distinction anymore. So what to do when making a histogram? .. For a three dimensional scalar image the histogram will be a one .. dimensional array, for the color image it should be a three .. dimensional array. .. Actually Numpy can help us here (but as far as i know this is not .. used for image processing and computer vision). In Numpy it is .. possible to define *structured arrays*. E.g. the statement ``a = .. zeros( (M,N), dtype='int8,int8,int8')`` defines a array with shape $(M,N)$ .. where each element is a 3 element vector. .. Can you define a class ``ImageSamples`` that inherits the .. ``ndarray`` and makes the different semantics explicit. Are you .. going to store the "semantic type" (color versus 3D) explicitly in .. the ``ImageSamples`` class or are you subclassing the .. ``ImageSamples`` class?