2.2. Epipolar Geometry and the Fundamental Matrix

../../../_images/epipolar_geometry.svg

Fig. 2.10 Epipolar geometry.

Consider two cameras looking at the same scene. Both are calibrated with respect to the same world frame, with projection matrices \(P\) and \(P'\) respectively. Let \(\v X\) be a fixed point in world space seen by both cameras. Let \(\v x\) be its image coordinates in the first (left) camera and \(\v x'\) its coordinates in the second (right) camera:

\[\begin{split}\hv x &\sim P \hv X\\ \hv x' &\sim P' \hv X\end{split}\]

Note that if we only know the projected point \(\v x\) in the left image, the exact location of the 3D point is unknown; all we know is that it must lie on the line through the origin \(\v O\) of the left camera and the point \(\v x\) on the left retina. Because the unknown point \(\v X\) lies on a straight line in the 3D world, its projection is restricted to a straight line in the right image. This line is called the epipolar line associated with point \(\v x\) in the left image.

Equivalently, if the point \(\v x'\) in the right image is known (but not the 3D point \(\v X\)), then \(\v X\) is projected somewhere on a line in the left image. So the point \(\v x'\) in the right image corresponds with an epipolar line in the left image.

Also note that a point anywhere on an epipolar line in the left image, when visible to the right camera as well, is projected onto the corresponding epipolar line in the right image. This observation forms the basis of most (if not all) stereo vision algorithms: given a point \(\v x\) in the left image, we only have to search for the corresponding point \(\v x'\) along the epipolar line in the right image.
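The construction of an epipolar line can be checked numerically. The sketch below is only an illustration under assumed values: the calibration matrix `K`, the rotation `R`, the position `t` of the right camera and the helper `project` are all hypothetical and not taken from the text. It projects two points of the ray through \(\v O\) and \(\v x\) into the right image, joins them into a line, and verifies that \(\hv x'\) lies on that line.

```python
import numpy as np

# Hypothetical example cameras (all numbers are made up for illustration).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)  # right camera rotated about the y-axis
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])  # position of the right camera origin

# Left camera frame is taken as the world frame.
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([R.T, (-R.T @ t).reshape(3, 1)])


def project(P, X):
    """Project a 3D point X to homogeneous image coordinates."""
    return P @ np.append(X, 1.0)


X = np.array([0.3, -0.2, 5.0])   # the (normally unknown) 3D point
x = project(P_left, X)           # what the left camera sees
x_prime = project(P_right, X)    # what the right camera sees

# Knowing only x, the 3D point is somewhere on the ray through O and x.
direction = np.linalg.inv(K) @ x
direction /= np.linalg.norm(direction)
A = 1.0 * direction              # two arbitrary points on that ray
B = 10.0 * direction

# The epipolar line in the right image is the join of their projections.
line_right = np.cross(project(P_right, A), project(P_right, B))

# Normalised incidence residual; zero (up to rounding) means x' is on the line.
residual = abs(line_right @ x_prime) / (
    np.linalg.norm(line_right) * np.linalg.norm(x_prime))
print("residual:", residual)
```

A stereo matcher would use such a line to restrict its search for \(\v x'\) to one dimension instead of the whole right image.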

We have drawn the two cameras in the above figure such that the origin \(\v O'\) of the right camera is projected onto a point \(\v e\) in the left image. This point \(\v e\) is called the epipolar point (or epipole) in the left image. Equivalently there is an epipolar point \(\v e'\) in the right image, being the projection of the origin \(\v O\) of the left camera onto the retina of the right camera. The line segment from \(\v O\) to \(\v O'\) is called the baseline of the stereo camera setup.
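As a small numerical illustration, the epipolar points can be computed by projecting each camera origin with the other camera. The cameras below are hypothetical (the values of `K`, `R` and `O_right` are assumptions made for this sketch only). The last lines check that an epipolar line in the right image, built from an arbitrary left pixel, passes through \(\hv e'\).

```python
import numpy as np

# Hypothetical example values, chosen only for illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
O_left = np.zeros(3)                    # left camera origin (world origin)
O_right = np.array([1.0, 0.0, 0.2])     # right camera origin

P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([R.T, (-R.T @ O_right).reshape(3, 1)])

# Each epipolar point is the image of the other camera's origin.
e = P_left @ np.append(O_right, 1.0)
e_prime = P_right @ np.append(O_left, 1.0)

# Epipolar line in the right image for an arbitrary left pixel:
# project two points of its viewing ray and join them.
x = np.array([100.0, 50.0, 1.0])
ray = np.linalg.inv(K) @ x
line_right = np.cross(P_right @ np.append(1.0 * ray, 1.0),
                      P_right @ np.append(10.0 * ray, 1.0))

# The viewing ray starts in O, so the line must contain e'.
residual = abs(line_right @ e_prime) / (
    np.linalg.norm(line_right) * np.linalg.norm(e_prime))

print("e  =", e[:2] / e[2])
print("e' =", e_prime[:2] / e_prime[2])
print("residual:", residual)
```

Because every viewing ray of the left camera starts in \(\v O\), every epipolar line in the right image passes through \(\v e'\); the residual printed above tests this for one such line.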

The corresponding points \(\hv x \sim P\hv X\) and \(\hv x'\sim P'\hv X\) are related through the fundamental matrix \(F\).

Theorem 2.1 (Fundamental Matrix)

Let \(\v X\) be a 3D point observed by two cameras resulting in projections \(\hv x\sim P\hv X\) and \(\hv x'\sim P'\hv X\). These 2D points are related through the fundamental matrix \(F\):

\[\hv x\T\, F \, \hv x' = 0\]

Note that if we write \(\hv l = F\hv x'\) we have \(\hv x\T \hv l = 0\), indicating that \(\hv x\) is on the line \(\hv l\). Indeed \(\hv l\) is the epipolar line in the left image corresponding to the point \(\hv x'\) in the right image. Equivalently \(\hv l' = F\T \hv x\) is the epipolar line in the right image.
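The expression for the line in the right image follows by transposing the scalar constraint of the theorem. Spelled out as a short worked step:

\[\hv x\T F\, \hv x' = 0 \quad\Longleftrightarrow\quad \left(\hv x\T F\, \hv x'\right)\T = \hv x'\T F\T \hv x = 0 \quad\Longleftrightarrow\quad \hv x'\T\, \hv l' = 0 \;\text{ with }\; \hv l' = F\T \hv x\]

So \(F\) maps points in the right image to lines in the left image, and \(F\T\) maps points in the left image to lines in the right image.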

We give two proofs of the fundamental matrix theorem: an existence proof, which only proves the existence of a fundamental matrix \(F\) without giving an explicit expression for it in terms of the projection matrices \(P\) and \(P'\), and a constructive proof that provides such an explicit expression.

Proof (Existence proof Fundamental Matrix)

The existence proof is based on the same construction as used to derive the homography (i.e. projective transform) \(H\) that relates the points \(\hv x\sim P\hv X\) and \(\hv x'\sim P'\hv X\) when \(\v X\) is known to lie on a plane that is visible from both cameras (see the section on projectivities considering the two setups to make image stitching possible).

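The epipolar line \(\hv l'\) in the right image passes through the epipolar point \(\hv e'\) and the point \(\hv x'\), so it can be written as:

\[\hv l' = \hv e' \times \hv x' = [\hv e']_\times\hv x'\]

Any other point on this line can take the place of \(\hv x'\). In particular we may use \(H\hv x\), the projection in the right image of the point where the ray through \(\v O\) and \(\hv x\) intersects the plane, which gives

\[\hv l' \sim [\hv e']_\times H \hv x\]

Because \(\hv x'\) lies on \(\hv l'\) we have \(\hv x'\T [\hv e']_\times H \hv x = 0\). Transposing this scalar gives \(\hv x\T H\T [\hv e']_\times\T \hv x' = 0\), such that, in the form used in Theorem 2.1,

\[F = H\T [\hv e']_\times\T\]

Proof (Constructive proof Fundamental Matrix)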

Without loss of generality we assume that the left camera frame coincides with the world frame:

\[P = K( I\; \v 0)\]

With respect to the world frame (i.e. the left camera frame) the right camera is rotated and translated:

\[P' = K'( R\T \; -R\T \v t)\]

where \(\v t\) is the translation of \(\v O\) to \(\v O'\).
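The form of \(P'\) can be sanity-checked with a few lines of NumPy. This is a minimal sketch with assumed numbers: the helper `camera_matrix` and all values of `K_left`, `K_right`, `R` and `t` are hypothetical. It verifies that the right camera origin \(\v O' = \v t\) is mapped to the zero vector by \(P'\), and that \(P'\) first expresses a world point in the right camera frame as \(R\T(\v X - \v t)\) before applying \(K'\).

```python
import numpy as np


def camera_matrix(K, R, t):
    """Return K (R^T | -R^T t) for a camera with orientation R at position t."""
    return K @ np.hstack([R.T, (-R.T @ t).reshape(3, 1)])


# Hypothetical calibration matrices and pose (illustration only).
K_left = np.array([[800.0, 0.0, 320.0],
                   [0.0, 800.0, 240.0],
                   [0.0, 0.0, 1.0]])
K_right = np.array([[750.0, 0.0, 300.0],
                    [0.0, 750.0, 250.0],
                    [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])   # translation from O to O'

# Left camera coincides with the world frame: P = K (I | 0).
P_left = camera_matrix(K_left, np.eye(3), np.zeros(3))
P_right = camera_matrix(K_right, R, t)

# The camera origin is the one point that has no image: P' (t, 1)^T = 0.
centre_image = P_right @ np.append(t, 1.0)

# A world point in right camera coordinates is R^T (X - t).
X = np.array([0.3, -0.2, 5.0])
X_in_right = R.T @ (X - t)
x_prime = P_right @ np.append(X, 1.0)

print("P' applied to O':", centre_image)
print("same projection:", np.allclose(x_prime, K_right @ X_in_right))
```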