2. Images in Motion

Imagine a horse walking, i.e. moving, in front of a (pinhole) camera. Also imagine that you could look at the back plane of the camera where the image is projected. Then you would see the moving horse. At every moment in time \(t\) there is an image \(f(x,y)\) projected on the back plane. To indicate the time dependence we can represent the ‘image in motion’ as a function of three arguments: \(f(x,y,t)\).


Conceptually we think of an image as a 2D function with the continuous plane \(\setR^2\) as its domain. Analogously, an ‘image in motion’ is a 3D function defined on the domain \(\setR^2\times\setR\): both the spatial coordinates \(x,y\) and the time ‘coordinate’ \(t\) are continuous.

Just as we need sampling to represent images with a finite amount of data, we also need to sample the time coordinate. A sampled ‘image in motion’ is most often called a video sequence. Each image in the sequence is called a frame.
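As an illustration of sampling both space and time, the sketch below evaluates a continuous function \(f(x,y,t)\) on a regular grid and stores the result as a 3D array of frames. The function `f` (a Gaussian blob drifting to the right), the helper `sample_video`, the unit-square image domain and all parameter values are assumptions made up for this example; they are not taken from these notes.

```python
import numpy as np

def f(x, y, t):
    """Hypothetical continuous 'image in motion': a Gaussian blob moving right."""
    cx = 0.2 + 0.1 * t   # blob centre moves 0.1 image widths per unit time
    cy = 0.5             # vertical position stays fixed
    sigma = 0.05         # blob size
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

def sample_video(f, n_frames, n_rows, n_cols, frame_rate):
    """Sample f on a regular grid; result has shape (n_frames, n_rows, n_cols)."""
    t = np.arange(n_frames) / frame_rate   # sample times, 1/frame_rate apart
    y = np.linspace(0.0, 1.0, n_rows)      # spatial samples on the unit square
    x = np.linspace(0.0, 1.0, n_cols)
    T, Y, X = np.meshgrid(t, y, x, indexing="ij")
    return f(X, Y, T)

video = sample_video(f, n_frames=25, n_rows=48, n_cols=64, frame_rate=5.0)
print(video.shape)  # (25, 48, 64)
```

Here `video[k]` is the frame at time `k / frame_rate`. Storing time as the first axis is only one possible convention.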

Sampling images works because the human eye cannot resolve small details: when looking at a sampled image from a sufficient distance we no longer see the individual pixels. The same is true for images in motion: if a sequence of images is presented rapidly enough, we can no longer distinguish the individual frames.

A video is thus nothing more than a sequence of images displayed on the screen in rapid succession.


The pictures and animations of the horse in motion shown here were made in 1878 by Eadweard Muybridge (see his Wikipedia page).

In these lecture notes we distinguish two points of view when considering images in motion.

  • In case there is a relative motion between the observer (the camera or the human eye) and the objects in the scene, all points in the image seem to move in time. Calculating the velocity vectors for all points in the image results in the optic flow field.
  • Instead of considering the movement of all points in the image we are often interested only in the movement of an object depicted in the images in the sequence. This leads to motion tracking algorithms. For motion tracking we often assume that given the position and shape of an object at time \(t_0\) the task is to find the object in the images for \(t>t_0\).
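To make the tracking task in the second point concrete, here is a minimal sketch that follows an object through a sequence by exhaustive template matching with the sum of squared differences (SSD). This is only one simple technique chosen for illustration, not necessarily the method developed in these notes. The function `track_template`, the synthetic sequence and the assumption that the object keeps its appearance between frames are all hypothetical.

```python
import numpy as np

def track_template(frames, top_left, size):
    """Follow a patch through a sequence using exhaustive SSD matching.

    frames   : array of shape (n_frames, n_rows, n_cols)
    top_left : (row, col) of the object patch in frames[0], i.e. at time t_0
    size     : (height, width) of the patch
    Returns the list of (row, col) patch positions, one per frame.
    """
    h, w = size
    r0, c0 = top_left
    template = frames[0, r0:r0 + h, c0:c0 + w]   # object appearance at t_0
    positions = [(r0, c0)]
    for frame in frames[1:]:                     # frames for t > t_0
        best, best_pos = np.inf, None
        for r in range(frame.shape[0] - h + 1):
            for c in range(frame.shape[1] - w + 1):
                ssd = np.sum((frame[r:r + h, c:c + w] - template) ** 2)
                if ssd < best:                   # keep the best match so far
                    best, best_pos = ssd, (r, c)
        positions.append(best_pos)
    return positions

# Synthetic sequence: a 4x4 bright square moving one pixel right per frame.
frames = np.zeros((5, 20, 30))
for k in range(5):
    frames[k, 8:12, 3 + k:7 + k] = 1.0

# The 6x6 patch contains the square plus a one-pixel dark border.
positions = track_template(frames, top_left=(7, 2), size=(6, 6))
print(positions)  # [(7, 2), (7, 3), (7, 4), (7, 5), (7, 6)]
```

The exhaustive search is slow and breaks down when the object changes shape, scale or brightness; it is meant only to show what "finding the object in the images for \(t>t_0\)" amounts to in the simplest case.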