4. Geometrical Operators
An often used image operator is a geometrical operator, these are the operators that keep the colors the same but put the pixels in a different spot.
These operators can be used to create some artistical effect, but they are also useful in correcting for some unwanted distortion, which were introduced, when the picture was taken (like camera rotation, perspective distortions etc.)
The input for these methods (at least in this Exercise) is always a quadrilateral region of a discrete image \(F\). That means that you only know the intensity values at discrete integral pixel positions. The output consists of a rectangular image \(G\) with axes aligned with the Cartesian coordinate axes \(x\) and \(y\).
In an arbitrary image transformation, there is no guarantee, that an “input”-pixel will be positioned at a pixel in the “output” image as well. Rather, most of the time your output image pixels will “look at” image positions in the input image, which are “between” pixels.
So you need access to intensity values, which are not on the sampling grid of the original image, e.g. the intensitiy value at position \((6.4, 7.3)\). In this section we assume that you know about interpolation. Here we focus on the basic structure of an image warping algorithm.
Let \(f\) be an image with domain \(E\) and range \(V\). Let \(\phi\) be a transform that takes a point \(\v x\in E\) and returns a point \(\v y=\phi(\v x)\) in the domain \(E'\) of image \(g\), we then define the geometrical transform \(\Phi\) as:
or equivalently:
in case the inverse transform exists.
Note that the last two transforms are equivalent from a mathematical point of view. We could start an algorithm for a geometrical operator with either equation. We start with the first one:
4.1. Forward Transform
1import numpy as np
2import matplotlib.pyplot as plt
3from ipcv.ip.pixels import pixel, domainIterator;
4
5def geoOp_fwd(f, phi):
6 fdomain = f.shape[:2]
7 g = np.zeros_like(f)
8 for p in domainIterator(fdomain):
9 q = phi(p).rint()
10 if q.isin(fdomain):
11 g[q] = f[p]
12 return g
The function geoOp2d_fwd()
is based on the first equation: we loop
over all pixels in the domain of image \(f\), calculate where that pixel
is transformed to and then set this pixel in the output image with the
same color as the original pixel in the original image.
Below we define a function that rotates a pixel
1def rotator(angle):
2 ca = np.cos(angle)
3 sa = np.sin(angle)
4 R = np.array([[ca, -sa], [sa, ca]])
5 def rotate(t):
6 return pixel(R @ np.array(t))
7 return rotate
and we use it to do an image rotation
Show code for figure
1plt.figure(figsize=(10,5))
2
3from ipcv.utils.files import ipcv_image_path
4a = plt.imread(ipcv_image_path('peppers.png'));
5plt.subplot(131); plt.imshow(a); plt.axis('off');
6plt.title('Original');
7
8b = geoOp_fwd(a, rotator(np.pi/6));
9plt.subplot(132); plt.imshow(b); plt.axis('off');
10plt.title('Rotated');
11
12plt.subplot(133);
13plt.imshow(b[:64,:64]);
14plt.axis('off');
15plt.title('Rotated (detail)');
16
Evidently the forward transform is not the way to implement a geometrical transform. In the next sectio we will look at the backward transform that is based on the second formulation of the geometric operator.
4.2. Backward Transform
Instead of enumerating all pixels (sample positions) in the original image we now enumerate all pixels in the output image and use the inverse transform to calculate where that pixel came from in the original image.
1def geoOp(f, invphi):
2 g = np.zeros_like(f)
3 gdomain = g.shape[:2]
4 for q in domainIterator(gdomain):
5 p = invphi(q).rint()
6 if p.isin(gdomain):
7 g[q] = f[p]
8 return g
9
10plt.subplot(131); plt.imshow(a);
11
12b = geoOp(a, rotator(-np.pi/6))
13plt.subplot(132); plt.imshow(b);
14plt.subplot(133); plt.imshow(b[:64,:64]);
As can be clearly seen in this example the backward algorithm is the simplest one that really works. No more holes. Please note that we use a simple nearest neighbor interpolation. In practice it is much better to use a better interpolation scheme. Most image processing packages allow you to select one of several interpolation techniques.
4.3. Geometrical Transform in Scikit-Image
The most generic form of a geometrical transform is called
skimage.transform.warp
in skimage. As a second argument you have
to pass it the inverse transform. It can be a function that takes an
array of shape (M,2) of points in the result image and it should
return an array of the shape giving the positions of the corresponding
points in the input image. It is much like our geoOp
function
(only much faster). So let’s do the rotation again:
1from skimage.transform import warp
2
3def rotator_skimage(angle):
4 ca = np.cos(angle)
5 sa = np.sin(angle)
6 RT = np.array([[ca, -sa], [sa, ca]]).T
7 def rotate(x):
8 return x @ RT
9 return rotate
10
11plt.subplot(131); plt.imshow(a); plt.title('Original')
12c = warp(a, rotator_skimage(np.pi/6))
13plt.subplot(132); plt.imshow(c); plt.title('Rotated')
14plt.subplot(133); plt.imshow(c[:64,:64]); plt.title('Rotated (detail)')
Scikit-Image has several functions for specialized geometrical
transforms. In the example below we first set up the homogenuous
matrix that rotates around the center of an image and then instead of
using a callable as argument in warp
we give it this matrix.
1def rotation_matrix(f, angle):
2 M,N = f.shape[:2]
3 T = np.array([[1, 0, M/2],
4 [0, 1, N/2],
5 [0, 0, 1 ]])
6 ca = np.cos(angle)
7 sa = np.sin(angle)
8 R = np.array([[ca, -sa, 0],
9 [sa, ca, 0],
10 [ 0, 0, 1]])
11 return T @ R @ np.linalg.inv(T)
12
13plt.subplot(131); plt.imshow(a);
14d = warp(a, rotation_matrix(a, np.pi/6))
15plt.subplot(132); plt.imshow(d);
16plt.subplot(133); plt.imshow(d[:64,:64]);