Geometric Transformations, Pixel Interpolation, Image Warping

Geometric Transformation

A geometric transformation is a vector function T that maps a pixel (x,y) to a new position (x',y').

$x' = T_x(x,y), \, y'=T_y(x,y)$

The main use of geometric transformations is to correct distortions in images.

There are two main steps in any geometric transformation:

Pixel Coordinate Transformation: mapping a pixel of the input image to a point in the output image
Brightness Interpolation: the coordinates computed in the previous step are usually real-valued and do not fall exactly on the digital grid. Brightness interpolation estimates the brightness at such a non-integer position from the neighboring grid samples.

Pixel Coordinate Transformation

Common geometric transformations include translation, rotation, scaling and skewing. These are all examples of affine transformations.

In general, an affine transformation has the form

$x'=a_0+a_1x+a_2y$

$y'=b_0+b_1x+b_2y$

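When we apply a geometric transformation to the entire image, the coordinate system might change. The Jacobian determinant J describes this change: |J| is the factor by which areas are scaled, J = 0 means the transformation is singular and has no inverse, and |J| = 1 means areas are preserved. For the affine transformation above,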

$J=a_1b_2-a_2b_1$
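As a quick numerical check of this formula, here is a minimal Python sketch. The function name `affine_jacobian` is only illustrative; the parameters follow the coefficient names used in the equations above.

```python
def affine_jacobian(a1, a2, b1, b2):
    """Jacobian determinant of x' = a0 + a1*x + a2*y, y' = b0 + b1*x + b2*y.

    The translation terms a0 and b0 do not appear because they vanish
    when the partial derivatives are taken.
    """
    return a1 * b2 - a2 * b1


# Uniform scaling by 2 in both directions multiplies areas by 4.
print(affine_jacobian(2.0, 0.0, 0.0, 2.0))  # 4.0

# A mapping whose rows are proportional collapses the plane onto a line.
print(affine_jacobian(1.0, 2.0, 2.0, 4.0))  # 0.0
```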

To perform a geometric transformation, we use matrix multiplication. We represent the pixel and the transformation matrices using homogeneous coordinates to allow any transformation (including translation) to be performed using a simple matrix multiplication.

The above affine transformation equations can be expressed in matrix notation as follows:

$\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}a_1 & a_2 & a_0\\b_1 & b_2 & b_0\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$
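The matrix form can be sketched in a few lines of plain Python. The helper names `affine_matrix` and `apply_transform` are assumptions made for this illustration, and matrices are stored as nested lists rather than with any particular library.

```python
def affine_matrix(a0, a1, a2, b0, b1, b2):
    """Build the 3x3 homogeneous matrix for the general affine equations."""
    return [[a1, a2, a0],
            [b1, b2, b0],
            [0.0, 0.0, 1.0]]


def apply_transform(matrix, x, y):
    """Multiply the matrix by the homogeneous column vector (x, y, 1)."""
    point = (x, y, 1.0)
    xp = sum(matrix[0][i] * point[i] for i in range(3))
    yp = sum(matrix[1][i] * point[i] for i in range(3))
    return xp, yp


# A pure translation by (5, -2): a1 = b2 = 1, a2 = b1 = 0.
m = affine_matrix(a0=5, a1=1, a2=0, b0=-2, b1=0, b2=1)
print(apply_transform(m, 3, 4))  # (8.0, 2.0)
```

Because the last row is always (0, 0, 1) for an affine transformation, the third output component stays 1 and is not computed here.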

Translation

$\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}1 & 0 & t_x\\0 & 1 & t_y\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$

$x' = x+t_x$

$y'=y+t_y$

So, J=1

Rotation (by $\theta$ in counter-clockwise direction)

$\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}\cos\theta & -\sin\theta & 0\\\sin\theta & \cos\theta & 0\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$

$x'= x\cos\theta-y\sin\theta$

$y'=x\sin\theta + y\cos\theta$

So, J=1

Scaling

$\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}s_x & 0 & 0\\0 & s_y & 0\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$

$x'= s_xx$

$y'=s_yy$

So, J= $s_xs_y$

Skewing

$x' = x+y\tan\phi$

$y'=y$

So, J=1
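
For consistency with the other transformations, the same skew can also be written in homogeneous form. This matrix is not part of the original notes; it follows directly from the two skew equations:

$\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}1 & \tan\phi & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$

Here $a_1=1$, $a_2=\tan\phi$, $b_1=0$, $b_2=1$, so $J=1\cdot 1-\tan\phi\cdot 0=1$.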

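Note that the order of transformations is important, because matrix multiplication is not commutative. Translation followed by rotation is NOT the same as rotation followed by translation: the rotation above is about the origin, so translating first changes where the image sits relative to the center of rotation.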
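The effect of ordering can be checked with a small Python sketch. The helper names (`matmul3`, `translation`, `rotation`, `apply`) are illustrative, and the column-vector convention of the matrices above is assumed, so the rightmost matrix in a product acts first.

```python
import math


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(m, x, y):
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


T = translation(10.0, 0.0)
R = rotation(math.pi / 2)  # 90 degrees counter-clockwise

translate_then_rotate = matmul3(R, T)  # T acts first, then R
rotate_then_translate = matmul3(T, R)  # R acts first, then T

print(apply(translate_then_rotate, 1.0, 0.0))  # approximately (0, 11)
print(apply(rotate_then_translate, 1.0, 0.0))  # approximately (10, 1)
```

The same point ends up in two different places, which is why a chain of transformations has to be composed in the intended order.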

Brightness Interpolation

As discussed, the transformed coordinates will be real numbers and will most likely not fall on the discrete digital raster/grid. Brightness interpolation estimates the brightness at such a non-grid point from the brightness values of neighboring grid samples.

There are 3 commonly used interpolation techniques:

Nearest-Neighbor Interpolation
Linear Interpolation
Bicubic Interpolation

We aim to determine the brightness at (x',y') in the output image by determining the brightness value of the corresponding point (x,y) in the original image. To do so, we first compute the inverse transform:

$(x,y)=T^{-1}(x',y')$

Unfortunately, the (x,y) values returned by the above computation generally will not fall on the digital raster either, so the brightness there is not directly known. To obtain it, the input image is resampled by interpolating the surrounding samples.
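The inverse-mapping step can be sketched as follows for the affine case. This is a minimal illustration under stated assumptions: the function names `invert_affine` and `source_coordinates` are hypothetical, and the matrix layout is the homogeneous form used earlier in these notes.

```python
def invert_affine(m):
    """Invert a 3x3 affine matrix [[a1, a2, a0], [b1, b2, b0], [0, 0, 1]]."""
    a1, a2, a0 = m[0]
    b1, b2, b0 = m[1]
    det = a1 * b2 - a2 * b1  # the Jacobian determinant J
    if det == 0:
        raise ValueError("transformation is not invertible (J = 0)")
    return [[b2 / det, -a2 / det, (a2 * b0 - b2 * a0) / det],
            [-b1 / det, a1 / det, (b1 * a0 - a1 * b0) / det],
            [0.0, 0.0, 1.0]]


def source_coordinates(m, width, height):
    """Yield ((x', y'), (x, y)) for every pixel of a width x height output."""
    inv = invert_affine(m)
    for yp in range(height):
        for xp in range(width):
            x = inv[0][0] * xp + inv[0][1] * yp + inv[0][2]
            y = inv[1][0] * xp + inv[1][1] * yp + inv[1][2]
            yield (xp, yp), (x, y)


# Scaling by 2: output pixel (3, 1) comes from source position (1.5, 0.5),
# which lies between grid points and therefore needs interpolation.
scale2 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
mapping = dict(source_coordinates(scale2, 4, 2))
print(mapping[(3, 1)])  # (1.5, 0.5)
```

Looping over output pixels and looking backward into the input guarantees that every output pixel receives a value, with no holes.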

The brightness can be expressed by the following equation:

$f_n(x,y)=\sum_{l=-\infty}^{\infty}\sum_{k=-\infty}^{\infty} g_s(l,k)h_n(x-l,y-k)$

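where $g_s$ is the sampled image, $h_n$ is the interpolation kernel, and the index n distinguishes the different interpolation methods. For the methods below, $h_n$ is nonzero only in a small neighborhood, so only a few terms of the sum contribute.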

Nearest-Neighbor Interpolation

This method assigns to (x,y) the brightness value of the nearest grid point. This interpolation method is denoted by $f_1(x,y)$.

$f_1(x,y)=g_s(\mathrm{round}(x), \mathrm{round}(y))$

Because each output value is copied from a single input pixel, this method can produce step-like (jagged) boundaries.
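A minimal Python sketch of this rule is shown below. It assumes the image is a list of rows indexed as `image[y][x]`, clamps out-of-range coordinates to the border (an assumption, not something the notes specify), and rounds halves upward with `floor(x + 0.5)` because Python's built-in `round` rounds halves to the nearest even integer.

```python
import math


def nearest_neighbor(image, x, y):
    """Return the brightness of the grid point closest to (x, y)."""
    height, width = len(image), len(image[0])
    col = min(max(math.floor(x + 0.5), 0), width - 1)
    row = min(max(math.floor(y + 0.5), 0), height - 1)
    return image[row][col]


image = [[10, 20],
         [30, 40]]
print(nearest_neighbor(image, 0.8, 0.2))  # 20
```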

Linear Interpolation

Linear interpolation uses the 4 grid points surrounding (x,y) and combines them linearly. The closer a point is to (x,y), the larger its weight in the combination.

Linear Interpolation is computed as follows:

$f_2(x, y) = (1-a)(1-b) g_s(l, k)+ a (1-b) g_s(l+1, k) + b (1-a) g_s(l, k+1) + abg_s(l+1, k+1)$

where $l=\lfloor x\rfloor$, $k=\lfloor y\rfloor$, $a=x-l$, $b=y-k$

Because it averages neighboring pixels, this method can cause slight blurring.
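The formula for $f_2$ translates almost directly into Python. In this sketch the image is assumed to be a list of rows indexed as `image[y][x]`, at least 2x2 in size, and (x,y) is assumed to lie inside the image; the clamping of `l` and `k` only keeps the four samples in bounds at the right and bottom edges.

```python
import math


def bilinear(image, x, y):
    """Linear interpolation of the 4 grid points surrounding (x, y)."""
    height, width = len(image), len(image[0])
    l = min(max(math.floor(x), 0), width - 2)
    k = min(max(math.floor(y), 0), height - 2)
    a, b = x - l, y - k
    return ((1 - a) * (1 - b) * image[k][l]
            + a * (1 - b) * image[k][l + 1]
            + b * (1 - a) * image[k + 1][l]
            + a * b * image[k + 1][l + 1])


image = [[10, 20],
         [30, 40]]
print(bilinear(image, 0.5, 0.5))  # 25.0
```

At the center of the four pixels every weight is 0.25, so the result is their plain average, which illustrates the smoothing behavior described above.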

Sidenote: Bilinear interpolation is the successive application of linear interpolation along each axis.
The result is NOT linear in (x,y) jointly, because it contains a product term (the $ab$ term above).

Bicubic Interpolation

Bicubic interpolation uses a bicubic polynomial surface determined from the 16 points in the 4×4 neighborhood of (x,y) to compute the brightness there.

It reduces the drawbacks of both nearest-neighbor interpolation (i.e. step-like boundaries) and linear interpolation (i.e. blurring). It also preserves fine details in the image well, at a higher computational cost than the other two methods.
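The notes do not specify a particular bicubic kernel, so the sketch below assumes the widely used cubic convolution kernel with parameter a = -0.5. The image is assumed to be a list of rows indexed as `image[y][x]`, and samples outside the image are replaced by the nearest border pixel, which is also an assumption.

```python
import math


def cubic_kernel(t, a=-0.5):
    """Cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    t = abs(t)
    if t <= 1:
        return (a + 2) * t**3 - (a + 3) * t**2 + 1
    if t < 2:
        return a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a
    return 0.0


def bicubic(image, x, y):
    """Weighted sum over the 4x4 neighborhood of (x, y)."""
    height, width = len(image), len(image[0])
    l, k = math.floor(x), math.floor(y)
    value = 0.0
    for j in range(k - 1, k + 3):
        for i in range(l - 1, l + 3):
            row = min(max(j, 0), height - 1)  # clamp to the image border
            col = min(max(i, 0), width - 1)
            value += image[row][col] * cubic_kernel(x - i) * cubic_kernel(y - j)
    return value


# A horizontal ramp: brightness equals the column index.
ramp = [[0, 1, 2, 3] for _ in range(4)]
print(bicubic(ramp, 1.5, 1.5))  # 1.5
```

Note that the kernel takes small negative values for distances between 1 and 2, so results near sharp edges can overshoot the original brightness range and may need clipping.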

Image Warping

Image Warping refers to the digital manipulation of an image so as to distort shapes in the image. It can be used to fix distortions or for creative purposes such as image morphing.
We make use of landmarks (also called correspondences/fiducials) as the control points for warping. We have a set of source and target anchor points that determine how the image must be warped.
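Radial Basis Function (RBF) interpolation is commonly used to perform image warping: it extends the displacements known at the landmarks to a smooth mapping over the whole image.
If a Gaussian kernel is used, $\sigma$ controls the smoothness of the warp, i.e. how far the influence of each landmark extends.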
Different types of warping include thin-plate warping, Gaussian warping and affine least-square warping.
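As a rough illustration of landmark-based Gaussian warping, the sketch below interpolates the landmark displacements with Gaussian radial basis functions. It is one possible formulation under stated assumptions, not the specific method of these notes: the names `fit_rbf_warp`, `solve` and `gaussian` are hypothetical, the landmarks are assumed to be distinct, and the small linear system is solved with plain Gaussian elimination to keep the example self-contained.

```python
import math


def gaussian(r, sigma):
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_rbf_warp(sources, targets, sigma):
    """Return a function that maps each source landmark onto its target."""
    kernel = [[gaussian(math.dist(p, q), sigma) for q in sources]
              for p in sources]
    wx = solve(kernel, [t[0] - s[0] for s, t in zip(sources, targets)])
    wy = solve(kernel, [t[1] - s[1] for s, t in zip(sources, targets)])

    def warp(x, y):
        phi = [gaussian(math.dist((x, y), p), sigma) for p in sources]
        dx = sum(w * f for w, f in zip(wx, phi))
        dy = sum(w * f for w, f in zip(wy, phi))
        return x + dx, y + dy

    return warp


sources = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
targets = [(1.0, 0.0), (10.0, 2.0), (0.0, 10.0)]
warp = fit_rbf_warp(sources, targets, sigma=5.0)
print(warp(0.0, 0.0))  # close to the first target, (1, 0)
```

The returned function only maps points. To resample an actual image, one would typically fit the mapping from target to source landmarks instead and combine it with one of the brightness interpolation methods described earlier.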

Applications of Image Warping

Image Morphing: It is the blending of two warped images.
Image Mosaicing: It is the piecing together of several images to form one cohesive image (e.g. image stitching for panoramas). Some image mosaicing issues include:
- the need for a canvas
- blending at the edges of images
- adjusting brightness
- cascading transformations
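
For the canvas issue, one simple approach (an assumption for illustration, not a method given in these notes) is to transform the four corners of each image and take their bounding box. For an affine transformation the transformed image always lies within the bounding box of its transformed corners. The function name `canvas_bounds` is hypothetical.

```python
import math


def canvas_bounds(matrix, width, height):
    """Integer bounding box (min_x, min_y, max_x, max_y) of a warped image."""
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    xs, ys = [], []
    for x, y in corners:
        xs.append(matrix[0][0] * x + matrix[0][1] * y + matrix[0][2])
        ys.append(matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


# A 4x3 image translated by (5, -3).
shift = [[1, 0, 5], [0, 1, -3], [0, 0, 1]]
print(canvas_bounds(shift, 4, 3))  # (5, -3, 8, -1)
```

Taking the union of these boxes over all images gives a canvas large enough for the whole mosaic.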
Other applications of warping include template creation and lens distortion correction.
