1.5. Image Representation

1.5.1. Coordinates and Indices

Accustomed to the choice of coordinate axes in mathematics, you might think that the axes for an image \(f(x,y)\) are the same: the x-axis running from left to right and the y-axis from bottom to top, with the origin in the lower left corner. This is not true for digital images in general (the one exception I know of is Microsoft's Device Independent Bitmap).

To understand the many different formats for digital images, we have to distinguish:

  • the axes with the coordinates along those axes, and
  • the indices into the array representing all samples.

We use numpy and matplotlib throughout these lecture notes and adopt their conventions as our definition.

We will use this image as our leading example:

\[\begin{split}\begin{array}{cccc} 0& 1& 2& 3\\ 4& 5& 6& 7\\ 8& 9&10&11 \end{array}\end{split}\]

In numpy (imported as np, with matplotlib.pyplot imported as plt) this is easy to generate and plot with matplotlib:

In [1]: f = np.arange(12).reshape(3,4)

In [2]: plt.imshow(f, cmap='gray', interpolation='nearest');

In [3]: plt.show()
../../../_images/imageaxes.png

The sample points are in the middle of the squares (the squares themselves are NOT the pixels), and we see that the origin, with coordinates \((0,0)\), is chosen to be at the top left. The x-coordinate axis runs from left to right and the y-coordinate axis runs from top to bottom.

In this coordinate system we can draw a line from \((0,2)\) to \((3,0)\):

In [4]: f = np.arange(12).reshape(3,4)

In [5]: plt.imshow(f, cmap='gray', interpolation='nearest');

In [6]: plt.plot([0,3], [2,0], lw=3);

In [7]: plt.show()
../../../_images/imageaxesplot.png

Carefully note that this coordinate system is a left-handed one: you have to turn the x-axis clockwise to reach the y-axis. The classical choice, with the x-axis from left to right and the y-axis from bottom to top, results in a right-handed coordinate system. Be aware that in a left-handed system things are a bit different than you might expect. A rotation over a positive angle \(\phi\), characterized by the rotation matrix

\[\begin{split}R = \begin{pmatrix}\cos\phi & -\sin\phi\\ \sin\phi & \cos\phi\end{pmatrix}\end{split}\]

turns a vector counterclockwise in a right-handed coordinate system, but clockwise in a left-handed one.
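As a small numerical check of this statement, the sketch below applies the rotation matrix for \(\phi = \pi/2\) to the unit vector along the x-axis. The variable names (phi, R, v, w) are chosen here for illustration only. The result points along the positive y-axis, which in the image coordinate system points down, so on screen the vector has turned clockwise.

```python
import numpy as np

phi = np.pi / 2  # rotate over a positive angle of 90 degrees

# Rotation matrix R as defined in the text
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])

v = np.array([1.0, 0.0])  # unit vector along the x-axis (pointing right)
w = R @ v                 # rotated vector, numerically close to (0, 1)

# In image coordinates the positive y-axis points DOWN, so a vector that
# pointed right now points down: a clockwise turn as seen on the screen.
print(np.round(w, 6))
```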

If you are familiar with numpy arrays, you have probably understood that to index the array with the coordinates \((x,y)\) (with \(x\) and \(y\) integer valued, of course) we need to write f[y,x].

So, the first index in the image array is the y-coordinate, and the second index is the x-coordinate.
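The index order can be checked on the example image. The following is a minimal sketch; the helper sample_at is a hypothetical name introduced here, not part of numpy. It looks up the samples at the two end points of the line drawn earlier, \((0,2)\) and \((3,0)\).

```python
import numpy as np

f = np.arange(12).reshape(3, 4)  # 3 rows (y direction), 4 columns (x direction)

def sample_at(image, x, y):
    """Return the sample at image coordinates (x, y).

    Hypothetical helper for illustration: note the swapped order,
    the array is indexed as image[y, x].
    """
    return image[y, x]

start = sample_at(f, 0, 2)  # x=0, y=2: first column, bottom row
end = sample_at(f, 3, 0)    # x=3, y=0: last column, top row

# The shape is (number of rows, number of columns) = (height, width)
height, width = f.shape
print(start, end, height, width)
```

Writing f[x,y] instead would raise an IndexError for \((3,0)\) here, because the first axis of the array has only three entries.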

1.5.2. Color Images

Let’s read a color image from disk and display it:

In [8]: f = plt.imread('python/data/images/peppers.png')

In [9]: plt.imshow(f);

In [10]: plt.show();

In [11]: print(f.shape)
(256, 256, 3)
../../../_images/peppersrgb.png

The shape of a color image array is thus \(M\times N \times 3\). A fourth channel may also be present; it encodes the transparency mask used when displaying the image. Be sure to check the shape rather than assume things that are not necessarily true.

For c equal to 0, 1 or 2, the slice f[:,:,c] is the red, green or blue color plane, respectively. Each color plane is a scalar image and can be displayed as such:
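One way to make such a check explicit is sketched below. The function describe_channels is a hypothetical helper, and the arrays are synthetic stand-ins so that the example does not depend on a file on disk.

```python
import numpy as np

def describe_channels(img):
    """Classify an image array by its shape (hypothetical helper)."""
    if img.ndim == 2:
        return "scalar"
    if img.ndim == 3 and img.shape[2] == 3:
        return "rgb"
    if img.ndim == 3 and img.shape[2] == 4:
        return "rgba"
    raise ValueError(f"unexpected image shape {img.shape}")

# Synthetic 4x5 image with a fourth (transparency) channel
rgba = np.zeros((4, 5, 4), dtype=np.float32)

# Dropping the fourth channel leaves an M x N x 3 color image
rgb = rgba[:, :, :3]

print(describe_channels(rgba), describe_channels(rgb))
```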

In [12]: plt.imshow(f[:,:,1], cmap='gray'); # the green channel is a good approximation of the luminance

In [13]: plt.show()
../../../_images/peppersgreenchannel.png
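To see why the green channel can stand in for the luminance, one can compare it with a weighted sum of the three channels. The sketch below assumes the Rec. 601 luma weights (0.299, 0.587, 0.114), which are not taken from these notes, and uses a tiny synthetic image instead of the peppers image.

```python
import numpy as np

# Assumed weights (Rec. 601 luma); green carries the largest weight
weights = np.array([0.299, 0.587, 0.114])

# Synthetic 1x3 color image: one pure red, one pure green, one pure blue sample
img = np.array([[[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]])

# Weighted sum over the last (channel) axis gives an M x N scalar image
luma = img @ weights

green = img[:, :, 1]  # the green color plane, as in the text
print(luma, green)
```

Because more than half of the weight is on the green channel, the green plane alone already resembles the weighted sum for many natural images, although it ignores red and blue entirely.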

1.5.3. Domain Iterators