Image Interpolation
===================

Given the samples $F$ of an image $f$, the task of interpolation is to
calculate a value for $f(x,y)$ even for those $(x,y)$ that are not sample
points. In most cases it is not realistic to ask for *the* value $f(x,y)$: a
unique value only exists in case the sampling process is without loss of
information. For band-limited images such a unique interpolated function
indeed exists, because ideal band-limited images (functions) can be sampled
without loss and thus can be reconstructed without error. [#fn_bandlimited]_

In these lecture notes we start from a rather pragmatic point of view. We
assume that the digital images we start with are *oversampled*, i.e. the
samples are close enough to each other to faithfully represent the image $f$.
We do have to take care not to undersample images in our computational
analysis without presmoothing them first, as that would computationally
introduce aliasing.

In this section we start with interpolation for 1D functions, i.e.
interpolation of sampled univariate functions. We discuss only a few
techniques that are commonly used. Then we show how to generalize these 1D
interpolation methods to 2D (and nD if needed). We end this section with a
somewhat more theoretical view on interpolation that encompasses all the
methods described and some more.

Interpolating 1D functions
--------------------------

Consider the samples $F(k)$ of a function $f(x)$. We assume that the sampling
interval is 1, i.e. $F(k)=f(k)$. The goal of interpolation is to find an
estimate of $f(x)$ for arbitrary $x$ given only the sampled (discrete)
function $F$.

Nearest neighbor interpolation
..............................

Nearest neighbor interpolation is the simplest interpolation method. Given
the samples $F(k)$, the value of the interpolated function $\hat f$ at
coordinate $x$ is given as:

.. math:: \hat f(x) = F( \lfloor x+\half\rfloor )

i.e. we simply select the sample point that is nearest to $x$. Here
$\lfloor x \rfloor$ is the floor function, i.e. the largest integer that is
less than or equal to $x$. This leads to a staircase type of interpolated
function. The function $\hat f$ is indeed defined on the entire real axis,
but it is neither continuous (as it contains jumps) nor differentiable.

As an example we look at the function $f(x)=\sin(x)$ and its nearest neighbor
interpolated version.

.. ipython:: python

   from scipy.interpolate import interp1d
   x = np.linspace(0, 6, 200);  # fine sampling to represent the continuous function
   xs = np.array([0, 1, 2, 3, 4, 5, 6]);  # the sample points
   f = np.sin(x); F = np.sin(xs);  # the 'continuous' function and its sampled version
   ifunc = interp1d(xs, F, kind='nearest');
   hatf_nn = ifunc(x);
   @savefig nninterp.png width=10cm align=center
   plt.clf(); plt.plot(x, f, 'g-', xs, F, 'ro', x, hatf_nn, 'b-');

In green the continuous function $f$ (note that we sampled this function
too... but with a very small sample width, so to our eye the plot looks like
a continuous function), in red the sampled function and in blue the
interpolated function.

Linear Interpolation
....................

Linear interpolation is the scientific equivalent of what you already learned
in kindergarten: connect the dots. Between two adjacent sample points $k$ and
$k+1$ we assume the function is linear, and thus in the interval $[k,k+1]$ we
have:

.. math:: k\leq x \leq k+1: \quad\hat f(x) = (1-(x-k))F(k) + (x-k)F(k+1)
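Before turning to scipy again, here is a minimal sketch of this formula in
code (our own helper, assuming unit-spaced samples at positions $0,1,\dots$;
the name ``linear_interp`` is not part of any library):

.. ipython:: python

   def linear_interp(F, x):
       k = np.clip(np.floor(x).astype(int), 0, len(F) - 2)  # index of left sample
       t = x - k  # fractional position in [0, 1]
       return (1 - t) * F[k] + t * F[k + 1]  # weighted average of the two neighbors
   linear_interp(np.sin(np.arange(7.0)), np.array([0.5, 2.25]))

The ``interp1d`` function from scipy gives the same values with
``kind='linear'``: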
.. ipython:: python

   ifunc = interp1d(xs, F, kind='linear');
   hatf_lin = ifunc(x);
   @savefig lininterp.png width=10cm align=center
   plt.clf(); plt.plot(x, f, 'g-', xs, F, 'ro', x, hatf_lin, 'b-');

The linearly interpolated function is continuous, but it is not everywhere
differentiable (at the sample points it is not differentiable). Furthermore,
the second and higher order derivatives of the interpolated function are
equal to zero (except at the sample points, where they are not defined).
Although a simple interpolation method, linear interpolation is very often
used in practice. It is simple to implement and very fast.

Cubic Interpolation
...................

In nearest neighbor interpolation only one sample (the nearest) is used to
set the interpolated value. In linear interpolation we look at the 2 closest
sample points (one on the left and one on the right). For cubic interpolation
we look at two samples on the left and two on the right. To interpolate
between $x=k$ and $x=k+1$ we fit a cubic polynomial in the local coordinate
$x-k$:

.. math:: k\leq x \leq k+1: \quad \hat{f}(x) = a (x-k)^3 + b (x-k)^2 + c (x-k) + d

to the 4 sample points $k-1, k, k+1$ and $k+2$. For these 4 points we have:

.. math::
   F(k-1) &= a (-1)^3 + b (-1)^2 + c(-1) + d\\
   F(k) &= d\\
   F(k+1) &= a (1)^3 + b (1)^2 + c(1) + d\\
   F(k+2) &= a (2)^3 + b (2)^2 + c(2) + d

Solving these equations for $a,b,c$ and $d$ we get:

.. math::
   a &= \frac{1}{6}\left( -F(k-1)+3F(k)-3F(k+1)+F(k+2)\right)\\
   b &= \frac{1}{2}\left( F(k-1)-2F(k)+F(k+1)\right)\\
   c &= \frac{1}{6}\left( -2F(k-1)-3F(k)+6F(k+1)-F(k+2)\right) \\
   d &= F(k)

.. ipython:: python

   ifunc = interp1d(xs, F, kind='cubic');
   hatf_cubic = ifunc(x);
   @savefig cubicinterp.png width=10cm align=center
   plt.clf(); plt.plot(x, f, 'g-', xs, F, 'ro', x, hatf_cubic, 'b-');

Even Better Interpolation
.........................

There are even better interpolation methods to be used. A straightforward
generalization would be to consider even more samples to estimate the
function value of $f$. Taking $n$ samples to the left of $x$ and $n$ samples
to the right, we need a polynomial of order $2n-1$. A big disadvantage of
higher order polynomials is overfitting of the original function: higher
order polynomials tend to fluctuate wildly in between the sample points.

.. ipython:: python

   xs = np.arange(-7, 8)
   F = 1.0 * (np.abs(xs) < 3)
   plt.plot(xs, F, 'o');
   x = np.linspace(-7, 7, 1000)
   ifunc = interp1d(xs, F, kind='cubic');
   hat_f = ifunc(x)
   plt.plot(x, hat_f);
   @savefig interp_block.png
   plt.show()

Another disadvantage of the interpolation schemes discussed so far is that
the piecewise interpolated function $\hat f$ shows discontinuities, in the
function itself or in its derivatives, at the sample point positions. This
disadvantage can be tackled with **spline** interpolation.

Interpolating 2D functions
--------------------------

Given an interpolation method for 1D (scalar) functions, it is always
possible to generalize the method to work with 2D functions (and even higher
dimensional ones). As an example consider linear interpolation. To estimate a
value $f(x,y)$ with $i\leq x\leq i+1$ and $j\leq y\leq j+1$, we first
interpolate in the '$x$-direction' to find values $f(x,j)$ and $f(x,j+1)$ and
then interpolate in the '$y$-direction' to find $f(x,y)$. Assume that $x=i+a$
and $y=j+b$ (with $a$ and $b$ in the interval $[0,1]$), then:

.. math::
   f(x,y) = &(1-a)(1-b) F(i,j) + \\
   &(1-a) b F(i,j+1) + \\
   &a (1-b) F(i+1,j) + \\
   &a b F(i+1,j+1)

When generalizing cubic interpolation to two dimensions we need 16 samples of
the image (a $4\times4$ subgrid). For 2D functions we can also use spline
interpolation.
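This bilinear scheme translates directly into code. Below is a minimal sketch
(our own helper ``bilinear_interp``, assuming unit-spaced samples; for real
work scipy's ``scipy.ndimage.map_coordinates`` is the more general tool):

.. ipython:: python

   def bilinear_interp(F, x, y):
       i = np.clip(np.floor(x).astype(int), 0, F.shape[0] - 2)  # left index in x
       j = np.clip(np.floor(y).astype(int), 0, F.shape[1] - 2)  # left index in y
       a, b = x - i, y - j  # fractional positions in [0, 1]
       return ((1 - a) * (1 - b) * F[i, j] + (1 - a) * b * F[i, j + 1]
               + a * (1 - b) * F[i + 1, j] + a * b * F[i + 1, j + 1])
   F2 = np.arange(16.0).reshape(4, 4)  # small test 'image': F2[i, j] = 4 i + j
   bilinear_interp(F2, 1.5, 2.25)

Because this test image is linear in both coordinates, bilinear interpolation
reproduces it exactly: the result is $4\cdot 1.5 + 2.25 = 8.25$.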
Interpolation Kernels
---------------------

A linear interpolator is defined with a kernel $\phi$ such that:

.. math:: \hat{f}(\v x) = \sum_{\v k} F(\v k) \phi(\v x - B \v k)

where $B$ is the sampling grid matrix (assuming a square unit grid we have
$B=I$, but we prefer to write $B\v k$ anyway to stress that $B\v k$ turns
indices into coordinates). The function $\phi$ is the *interpolation kernel*
and it has to satisfy some requirements to make the above equation a true
interpolation.

The equation can be interpreted in two ways:

* as a weighted sum of the samples $F(\v k)$, or
* as a sum of shifted and scaled kernels placed at the sample points.

For a true interpolation the kernel has to satisfy:

* $\sum_{\v k} \phi(\v x - B\v k) = 1$ for all $\v x$, meaning that we are
  taking a weighted average of the samples, and
* $\phi(B\v k) = 1$ for $\v k = \v 0$ and $\phi(B\v k) = 0$ otherwise, so
  that the interpolated function passes through the samples:
  $\hat f(B\v k) = F(\v k)$.

The nearest neighbor, linear and cubic interpolations discussed in this
section all fit in this framework, each with its own kernel $\phi$. Spline
kernels can be used as well. A Gaussian kernel does not satisfy the second
requirement exactly, so it leads to an approximation of the samples rather
than a true interpolation.
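As a concrete 1D illustration (our own sketch, assuming a unit grid $B=I$):
the triangle kernel $\phi(x) = \max(0, 1-|x|)$ satisfies both requirements
and reproduces the piecewise linear interpolation seen earlier.

.. ipython:: python

   def triangle(x):
       return np.maximum(0.0, 1.0 - np.abs(x))  # kernel for linear interpolation
   def kernel_interp(F, x, phi):
       ks = np.arange(len(F))  # sample indices (unit grid, B = I)
       return sum(F[k] * phi(x - k) for k in ks)  # sum of shifted, scaled kernels
   xs = np.arange(7); F = np.sin(xs)
   x = np.linspace(0, 6, 200)
   @savefig kernelinterp.png width=10cm align=center
   plt.clf(); plt.plot(x, np.sin(x), 'g-', xs, F, 'ro', x, kernel_interp(F, x, triangle), 'b-');

Swapping in another kernel, e.g. a cubic one, changes the interpolation
method while the summation stays exactly the same.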