Project 3: Antialiasing, Subsampling, Supersampling and the Frequency Domain

Carlos Eduardo Scheidegger
cscheid@cs.utah.edu
UID: 00454219

Abstract:

In this project, I investigated techniques for antialiasing for image subsampling. I also analyzed their behavior, especially in the frequency domain. Finally, I also analyzed and investigated interpolating supersampling filters for increasing the spatial resolution of an image.

Preliminaries

We denote the Fourier Transform pair by


$\displaystyle \mathscr{F}\left\{ f(x) \right \}$ $\textstyle =$ $\displaystyle \int_{-\infty}^{\infty} f(x) e^{-2\pi j s x} \; dx$ (1)
  $\textstyle =$ $\displaystyle F(s)$ (2)
$\displaystyle \mathscr{F}^{-1}\left\{ F(s) \right \}$ $\textstyle =$ $\displaystyle \int_{-\infty}^{\infty} F(S) e^{2\pi j s x} \; ds$ (3)
  $\textstyle =$ $\displaystyle f(x)$ (4)

and the power spectrum by


$\displaystyle \mathscr{P}(f)$ $\textstyle =$ $\displaystyle \vert F(s)\vert^2$ (5)
  $\textstyle =$ $\displaystyle Re(F(s))^2 + Im(F(s))^2$ (6)
  $\textstyle =$ $\displaystyle F(s) \times F^*(s)$ (7)

We will use the power spectrum to visualize the frequency domain of both images and filters. In typical scenarios, the power spectrum has much higher dynamic range than their spatial domain transforms. We don't usually care about the specific values in each point for visualization purposes. So we will use the logarithm of the power spectrum to compress the dynamic range. Since most typical images tend to concentrate most of the energy in low frequencies, we have many values in the frequency domain practically zero. Since $\lim_{n \to 0} \log n = -\infty$, the logarithm might not compress the dynamic range as we intended to. To compensate for this, we add a constant value of 1 to the power spectrum, because $\lim_{n \to 0}
\log (n+1) = 0$, but $\lim_{n \to \infty} \log (n+1) = \log n$. In this way, the smallest value is the visualization is zero, and the large values are mostly unaffected.

Background

Computers are eminently discrete computing devices, and so sampling a continuous function is an inevitable step in any digital signal processing application. WE wish to study the effects of sampling in signals.

Sampling

We model sampling as a multiplication of the original sequence by an infinite sequence of impulses. The relative strength of each impulse will represent the original value of the function at that point. This pulse sequence is known as the Shah functional:


$\displaystyle \textrm{III}(x)$ $\textstyle =$ $\displaystyle \sum_{-\infty \leq n \leq \infty} \delta(x-n)$ (8)

The Shah function has several interesting properties. First of all, its Fourier Transform is itself, and it has a nice similarity transformation:


$\displaystyle \mathscr{F}\left\{ \textrm{III}(x) \right \}$ $\textstyle =$ $\displaystyle \textrm{III}(s)$ (9)
$\displaystyle \mathscr{F}\left\{ \textrm{III}(ax) \right \}$ $\textstyle =$ $\displaystyle \vert a\vert^{-1} \textrm{III}(\frac{s}{a})$ (10)

(These equations can be proven by taking the limit of a sequence of progressively more compact gaussians [2]) Equation 10 will be particularly useful to understand the consequences of sampling.

This model allows us to accurately talk about the impacts of sampling in the original image. In the discussion, we will use the frequency domain extensively. Specifically, we will investigate what happens to the power spectrum of the original function under sampling. The power spectrum measures the relative strength of the signal for a given frequency.

Subsampling

I implemented three different low-pass filters to reduce the effects of aliasing. All three of them are radially symmetric, to ensure that the entire operation was radially invariant (I wanted to suppress frequencies in the same way across all directions). One of the filters is separable (the Gaussian), while the two others aren't (Butterworth and ideal). I'll explain and analyze each in a separate section.

Description of experiments

My experiments included synthetic images. To investigate ringing, I tested the filters on images with large discontinuities in color. For that, I used an image with a white square on a black background.

Figure 1: A synthetic circle with radius half the width of the domain.
Image imgs/circle.png

Since the circle is nothing but a radially symmetric rectangle function, its power spectrum is clearly a radially symmetric sinc function. Because the circle is fairly low-frequency, the sinc function is comparably high-frequency. The ringing in Figure 2 is more evident than would be expected because we're visualizing in log space.

Figure 2: Power spectrum of Figure 1.
Image imgs/spectrum_circle.png

To investigate in-spectrum loss and foldback, I used a synthetic image with artificially high frequency content. Since I wanted a radially symmetric image, I decided to use $f(r) = \cos
(\pi r^2)$ (later I found out a reference in the web for this function under the name ``zone-plate''). This function is interesting. First, its frequencies are increasing linearly as a function of distance from the origin. This allows us to deterministically compute the highest frequencies present in the image. Second, it is almost its own fourier transform (I used Mathematica for this, but I suspect it can be done by some form of series expansion):


$\displaystyle \mathscr{F}\left\{ \cos(\pi x^2) \right \} = \frac{\cos(\pi u^2) + \sin(\pi u^2)}{\sqrt{2}}$     (11)

Since the original (before the sampling) images are not really continuous, we have to be careful not to have any sampling artifacts in them. I wanted to have as high frequency components as possible in the original images, so I scaled the radius appropriately. Here, I use both 512x512 and 1024x1024 zone plates. The sampling theorem states that the highest possible sampled frequency is half the sampling rate. Since we're sampling at integral numbers of $x$ and $y$, the highest frequency we can sample is 1/2. This implies a scaling factor of $(512 \sqrt{2})^{-1}$ and $(1024 \sqrt{2})^{-1}$. With this scaling factor, the frequency of the sampled function will be exactly the maximum sampling frequency at the corners of the image. There will be (in theory) no aliasing. The power spectrum will be very similar, but scaled, because we're scaling the definition to get the desired frequencies. In practice, we can see some very mild aliasing. I believe this to be an issue of displays rather than anything else: I've experienced worse aliasing in CRTs than in LCDs, and the aliasing becomes really noticeable if you drag the window around (try it, it's kind of cool). The 512x512 zone plate and its power spectrum are shown in Figures 3 and 4. The power spectrum is aliased because of the scaling.

Figure 3: The zone plate.
Image imgs/zone_plate.png

Figure 4: The zone plate power spectrum.
Image imgs/spectrum_zone_plate.png

My real-life images were taken from the USC-SIPI database [1]. I chose to use these images because they were specifically collected for the purpose of testing algorithms, and their examples span a big range of image ``features'', such as color, texture, contrast, etc. I tested my filters on images 1.3.05  5, 1.4.01 6, 1.4.07 7, 1.5.02 8, 4.2.03 (the ``baboon'') 9, 4.2.04 (``Lenna'' -- with two `n's [3]), 5.3.01 11 and elaine.512 12. I chose these figure subjectively to span a variety of features such as noise (both white and frequency-specific), resolution, texture, perspective and contrast. I will not refer to all the results in the main text, but I include them as an appendix.

Figure 5: Image 1.3.05 from the USC-SIPI database and its power spectrum.
Image imgs/1_3_05.png Image imgs/spectrum_1_3_05.png

Figure 6: Image 1.4.01 from the USC-SIPI database and its power spectrum.
Image imgs/1_4_01.png Image imgs/spectrum_1_4_01.png

Figure 7: Image 1.4.07 from the USC-SIPI database and its power spectrum.
Image imgs/1_4_07.png Image imgs/spectrum_1_4_07.png

Figure 8: Image 1.5.02 from the USC-SIPI database and its power spectrum.
Image imgs/1_5_02.png Image imgs/spectrum_1_5_02.png

Figure 9: Image 4.2.03 from the USC-SIPI database and its power spectrum.
Image imgs/4_2_03.png Image imgs/spectrum_4_2_03.png

Figure 10: Image 4.2.04 from the USC-SIPI database and its power spectrum.
Image imgs/4_2_04.png Image imgs/spectrum_4_2_04.png

Figure 11: Image 5.3.01 from the USC-SIPI database and its power spectrum.
Image imgs/man.png Image imgs/spectrum_man.png

Figure 12: Image elaine.512 from the USC-SIPI database and its power spectrum.
Image imgs/elaine_512.png Image imgs/spectrum_elaine_512.png

Sampling Artifacts

The main sampling artifact is known as aliasing. Since we're modeling sampling as multiplication by a sequence of pulses, the effects of aliasing in the frequency domain are a sequence of convolutions of the image frequencies, displaced by an amount dependendent on the sampling rate: the higher the sampling rate is, the farther apart the pulses are in the frequency domain.

Aliasing is so called because since if the signal we're sampling has energy in frequencies higher than half the spacing between the pulses, the convolution will cause these to overlap. So when we examine the frequency domain of the original function after sampling, some of its higher frequencies will have been ``disguised'' as lower frequencies: they will fold back to the low-frequency spectrum.

To prevent aliasing, we have to make sure that the signal is band-limited before we do the sampling. We then use appropriate low-pass filters for that. In theory, the best possible filter is the one where all frequencies above the sampling rate are suppressed. It turns out this strategy exhibits severe visual artifacts such as ringing, due to what is called Gibbs phenomenon [2]. The solution for that is to use filters with a higher degree of continuity. The consequence is called in-spectrum energy loss: we will reduce the energy of frequencies allowed by the sampling rate we use so that there will be less visual artifacts.

The final effect of aliasing is known as truncation, and happens when filtering a continuous signal with a non-band-limited filter (such as the sinc filter). Since we can only physically construct band-limited signals, I modeled these by multiplying the original filter with a rectangular box function the width of the filter being used. The truncation artifacts then show up as a distortion of the filter in the spatial domain. This distortion is given by a convolution of a sinc function with the original filter. We can then expect some (possibly mild) ringing.

I used 2x and 4x subsampling here. For the case of 2x subsampling, the resulting spectrum is half the size in each dimension. All energies above that will show up as aliasing. We then integrate the filter energies on these regions to give a bound of the expected aliasing. The in-spectrum loss is modeled by subtracting the integral of the in-spectrum filter energy from 1. I model the truncation error as the difference in total energy between the truncated filter and the original one.

Filters

Ideal Low-Pass Filter

The first filter I implemented was the ILPF. The ideal low-pass filter has a single parameter $D_0$ that indicates the highest frequency that is to be present in the filtered images. Frequencies higher than $D_0$ are eliminated entirely from the filtered image, while all other frequencies are untouched. The following functions describe the filter in the spatial and frequency domains:


$\displaystyle F(u,v)$ $\textstyle =$ $\displaystyle 1, \textrm{if $u^2 + v^2 \leq D_0$}$ (12)
  $\textstyle =$ $\displaystyle 0, \textrm{if $u^2 + v^2 > D_0$}$ (13)

I tested this filter with cutoff frequencies of one-half, 1/4th, 1/8th, 1/16th, 1/32th and 1/64th of the domain width. These should result in progressively increased blurring. Also, since this filter is equivalent to convolving with a sinc function, we should expect noticeable ringing in the filtering. The following figures show the effects of filtering on the synthetic circle and 512x512 zone plate:

Image imgs/circle_ilpf_16.png Image imgs/spectrum_circle_ilpf_16.png

Image imgs/circle_ilpf_32.png Image imgs/spectrum_circle_ilpf_32.png

Image imgs/circle_ilpf_64.png Image imgs/spectrum_circle_ilpf_64.png

Image imgs/circle_ilpf_128.png Image imgs/spectrum_circle_ilpf_128.png

Figure 13: Synthetic circle filtered with increasingly high-frequency ILPFs ($D_0$ = 1/32, 1/16, 1/8, 1/4 and 1/2 of the domain width).
Image imgs/circle_ilpf_256.png Image imgs/spectrum_circle_ilpf_256.png

Image imgs/zp_ilpf_16.png Image imgs/spectrum_zp_ilpf_16.png

Image imgs/zp_ilpf_32.png Image imgs/spectrum_zp_ilpf_32.png

Image imgs/zp_ilpf_64.png Image imgs/spectrum_zp_ilpf_64.png

Image imgs/zp_ilpf_128.png Image imgs/spectrum_zp_ilpf_128.png

Figure 14: 512x512 zone plate filtered with increasingly high-frequency ILPFs ($D_0$ = 1/32, 1/16, 1/8, 1/4 and 1/2 of the domain width).
Image imgs/zp_ilpf_256.png Image imgs/spectrum_zp_ilpf_256.png

The power spectrum became hard to visualize because the filter simply cut off all the frequencies above the threshold, and so the log scale brought up whatever tiny energies there were present. There are some leftover frequencies in the synthetic circle images. Since I first filtered the image and then recomputed its power spectrum, something in that process may have introduced these lower energies. The circle exhibits, expectedly, a very large amount of ringing. The only reason we cannot notice more ringing in the zone plate is that the original image already rings a lot. Surprisingly, the images are not radially symmetric. I think this can be attributed to the truncation in the edge of the filters, or maybe my circles and zone-plates are not perfectly centered.

The subsampled images are shown here.

Figure 15: Synthetic circle sampled at half of the original resolution.
Image imgs/small_circle_ilpf_16.png Image imgs/small_circle_ilpf_32.png Image imgs/small_circle_ilpf_64.png Image imgs/small_circle_ilpf_128.png Image imgs/small_circle_ilpf_256.png

Figure 16: Synthetic circle sampled at 1/4th of the original resolution.
Image imgs/tiny_circle_ilpf_16.png Image imgs/tiny_circle_ilpf_32.png Image imgs/tiny_circle_ilpf_64.png Image imgs/tiny_circle_ilpf_128.png Image imgs/tiny_circle_ilpf_256.png

Figure 17: Zone plate sampled at half of the original resolution.
Image imgs/small_zp_ilpf_16.png Image imgs/small_zp_ilpf_32.png Image imgs/small_zp_ilpf_64.png Image imgs/small_zp_ilpf_128.png Image imgs/small_zp_ilpf_256.png

Figure 18: Zone plate sampled at 1/4th of the original resolution.
Image imgs/tiny_zp_ilpf_16.png Image imgs/tiny_zp_ilpf_32.png Image imgs/tiny_zp_ilpf_64.png Image imgs/tiny_zp_ilpf_128.png Image imgs/tiny_zp_ilpf_256.png


Table 1: Aliasing analysis of images in terms of percentage, with 2x subsampling. Foldback error refers to percentage of original power spectrum, and in-spectrum loss refers to percentage of in-spectrum energy of the subsampled image.
Image $D_0 = 1/32 w$ $D_0 = 1/16 w$ $D_0 = 1/8 w$ $D_0 = 1/4 w$ $D_0 = 1/2 w$
  In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback
Circle 1 2.16% 0% 0.92% 0% 0.31% 0% 0.02% 0% 0% 0.09%
Zone Plate 3 98.52% 0% 94.86% 0% 80.65% 0% 21.77% 0.003% 0% 49.8%
1.3.05 5 15.16% 0% 4.79% 0% 1.49% 0% 0.10% 0% 0% 0.52%
1.4.01 6 3.64% 0 % 1.96% 0% 0.61% 0% 0.03% 0% 0% 0.23%
1.4.07 7 1.02% 0 % 0.45% 0% 0.13% 0% 0.01% 0% 0% 0.11%
1.5.02 8 26.65% 0% 26.55% 0% 2.43% 0% 0.15% 0.001% 0% 0.60%
4.2.03 9 2.98% 0% 2.23% 0% 1.43% 0% 0.18% 0% 0% 1.10%
4.2.04 10 1.79% 0% 0.82% 0% 0.32% 0% 0.02% 0% 0% 0.08%
5.3.01 11 3.71% 0% 1.95% 0% 0.77% 0% 0.05% 0% 0% 0.22%
elaine.512 12 1.50% 0% 0.50% 0% 0.14% 0% 0.013% 0% 0% 0.09%


Results

Here are some figures to illustrate the results. I'll repeat the same idea in the following sections. I chose some overly-blurred images, some good results and some aliased ones. These are already the 2x subsampled images.

Figure 19: In-spectrum loss: 1.3.05 1/32w and 1.5.02 1/16w
Image imgs/small_1_3_05_ilpf_32.png Image imgs/small_1_5_02_ilpf_64.png

Figure 20: Little loss or aliasing: 4.2.03 1/8w and 4.2.04 1/4w
Image imgs/small_4_2_03_ilpf_64.png Image imgs/small_4_2_04_ilpf_128.png

Figure 21: Aliasing: 4.2.03 1/2w and zone plate 1/2w
Image imgs/small_4_2_03_ilpf_256.png Image imgs/small_zp_ilpf_256.png

It is important to notice that although the some of the images portrayed above exhibit little to no in-spectrum loss and foldback, they still show severe ringing artifacts due to the discontinuity of the filter in the frequency domain (specially the circle, where there's a specific discontinuity). The ILPF performs well on numbers, but the filtering results may not be satisfactory.

Gaussian Filter

The gaussian filter is a classic low-pass filter that has many useful properties. It is defined by the following frequency response:


$\displaystyle G(r,\theta)$ $\textstyle =$ $\displaystyle e^{\frac{r^2}{2 \sigma^2}}$ (14)

For example, it is its own Fourier Transform (up to scaling). This makes analysis much simpler. Also, since it is $C^\infty$, there will be no Gibbs phenomenon associated with the filtered images back on the time domain (ie. there will be no ringing). Finally, Gaussian filters provide a single parameter $\sigma$ to adjust, which can be important if there is user involvement in the process. The main drawback of the Gaussian filter family is that there can be significant in-spectrum loss, resulting in blurring in the subsampled images. In practice, I only experienced this problem in the pathological zone-plate example. Usually, the Gaussian filter is implemented as a separable filter. I have not done so here because of simplicity.


Table 2: Aliasing analysis of gaussian-filtered images in terms of percentage, with 2x subsampling.
Image $\sigma = 1/32 w$ $\sigma = 1/16 w$ $\sigma = 1/8 w$ $\sigma = 1/4 w$ $\sigma = 1/2 w$
  In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback
Circle 1 4.16% 0.00% 1.91% 0.00% 0.80% 0.00% 0.28% 0.01% 0.08% 0.06%
Zone Plate 3 98.83% 0.00% 95.11% 0.00% 80.57% 0.92% 44.24% 21.99% 14.89% 41.76%
1.3.05 5 14.32% 0.00% 8.91% 0.00% 4.00% 0.00% 1.38% 0.08% 0.39% 0.34%
1.4.01 6 4.06% 0.00% 2.63% 0.00% 1.30% 0.00% 0.47% 0.03% 0.13% 0.15%
1.4.07 7 1.18% 0.00% 0.69% 0.00% 0.32% 0.00% 0.12% 0.01% 0.03% 0.07%
1.5.02 8 26.64% 0.00% 22.59% 0.00% 10.82% 0.00% 3.56% 0.11% 0.98% 0.40%
4.2.03 9 3.43% 0.00% 2.58% 0.00% 1.72% 0.00% 0.79% 0.16% 0.25% 0.69%
4.2.04 10 2.63% 0.00% 1.40% 0.00% 0.65% 0.00% 0.24% 0.01% 0.07% 0.06%
5.3.01 11 5.13% 0.00% 2.93% 0.00% 1.44% 0.00% 0.54% 0.04% 0.16% 0.14%
elaine.512 12 2.38% 0.00% 1.12% 0.00% 0.45% 0.00% 0.15% 0.01% 0.04% 0.07%


Figure 22: Zone plate sampled at half of the original resolution.
Image imgs/small_zp_gaussian_16.png Image imgs/small_zp_gaussian_32.png Image imgs/small_zp_gaussian_64.png Image imgs/small_zp_gaussian_128.png Image imgs/small_zp_gaussian_256.png

Again, the zone plate was the one function that gave the filters troubles, giving either significant in-spectrum loss or severe leaking. We can definitely see the aliasing effects on the forth and fifth images of the above figure, while the fourth still has 20% in-spectrum loss. The other images, in general, did not exhibit severe problems.

results

Figure 23: In-spectrum loss: 1.5.02 1/32w and 1.5.02 1/16w
Image imgs/small_1_5_02_gaussian_16.png Image imgs/small_1_5_02_gaussian_32.png

Figure 24: Little loss or aliasing: 5.3.01 1/8w and 4.2.04 1/4w
Image imgs/small_man_gaussian_128.png Image imgs/small_4_2_04_gaussian_128.png

Figure 25: Aliasing: 4.2.03 1/2w and 1.5.02e 1/2w
Image imgs/small_4_2_03_gaussian_256.png Image imgs/small_1_5_02_gaussian_256.png

While the gaussian filter can overblur some images, most images that are not overblurred do not exhibit serious aliasing effects. The only image that was not properly filtered was, again, the zone plate. The interesting effect observed in the first overblurred image is due to the peculiar power spectrum of that image 8.

Butterworth Low-Pass Filter

So far, we've seen that the ideal low-pass filter provides good results in terms of foldback and inspectrum loss, but gives bad ringing artifacts. The gaussian filter family exhibits no ringing, but has noticeable inspectrum loss problems. To try and alleviate this, the butterworth filter family has an additional parameter called the order of the filter, that provides a way to tune between smooth and sharp frequency cutoff transitions. The frequency response of the $n$th-order Butterworth filter I implemented is as follows:


$\displaystyle B(r,\theta)$ $\textstyle =$ $\displaystyle \frac{1}{1 + \left(\frac{r}{D_0}\right)^{2n}}$ (15)

The $D_0$ indicates the radius of the frequency response in which the filter should dampen the energies by 50%.

The following table summarizes the results for several different orders (1, 4, 16) of the butterworth filter:

Table 3: Aliasing analysis of 1st-order butterworth-filtered images in terms of percentage, with 2x subsampling.
Image $D_0 = 1/32 w$ $D_0 = 1/16 w$ $D_0 = 1/8 w$ $D_0 = 1/4 w$ $D_0 = 1/2 w$
  In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback
circle 1 5.638% 7.586e-06% 2.655% 0.0001108% 1.171% 0.001401% 0.453% 0.01137% 0.1447% 0.04251%
Zone Plate 3 98.88% 0.6778% 95.36% 2.436% 83.7% 8.406% 56.49% 22.09% 24.78% 37.68%
1.3.05 5 14.68% 4.508e-05% 10.67% 0.0006501% 5.634% 0.007957% 2.263% 0.06438% 0.7216% 0.2507%
1.4.01 6 4.373% 1.784e-05% 3.084% 0.0002656% 1.743% 0.00338% 0.7504% 0.02785% 0.247% 0.1082%
1.4.07 7 1.352% 7.113e-06% 0.8508% 0.0001073% 0.4466% 0.001404% 0.1856% 0.01225% 0.06078% 0.0518%
1.5.02 8 26.3% 7.24e-05% 23.49% 0.00105% 14.81% 0.01207% 5.993% 0.08809% 1.839% 0.3047%
4.2.03 9 3.759% 8.567e-05% 2.834% 0.00128% 1.981% 0.01638% 1.105% 0.1345% 0.4302% 0.5091%
4.2.04 10 3.219% 7.379e-06% 1.803% 0.0001095% 0.8965% 0.001391% 0.3744% 0.01126% 0.1239% 0.0419%
5.3.01 11 6.161% 1.929e-05% 3.639% 0.0002829% 1.93% 0.003563% 0.8439% 0.02876% 0.2851% 0.1077%
elaine.512 12 2.952% 7.142e-06% 1.541% 0.0001069% 0.6751% 0.001397% 0.2518% 0.01229% 0.07851% 0.05291%



Table 4: Aliasing analysis of 4th-order butterworth-filtered images in terms of percentage, with 2x subsampling.
Image $D_0 = 1/32 w$ $D_0 = 1/16 w$ $D_0 = 1/8 w$ $D_0 = 1/4 w$ $D_0 = 1/2 w$
  In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback
circle 1 2.591% 0% v1.13% 6.364e-13% 0.4092% 8.909e-08% 0.0707% 0.002535% 0.0008008% 0.07398%
Zone plate 3 98.93% 6.501e-13% 95.91% 2.652e-08% 83.66% 0.0004331% 36.83% 3.372% 0.6431% 45.94%
1.3.05 5 15.38% 0% 7.371% 3.622e-12% 2.085% 4.714e-07% 0.3216% 0.01304% 0.003451% 0.4114%
1.4.01 6 3.965% 0% 2.284% 1.723e-12% 0.8392% 2.165e-07% 0.1223% 0.005953% 0.001288% 0.1814%
1.4.07 7 1.052% 0% 0.5556% 5.049e-13% 0.1875% 6.497e-08% 0.03201% 0.001909% 0.000394% 0.08328%
1.5.02 8 26.66% 0% 26.38% 9.748e-12% 4.314% 5.541e-07% 0.4847% 0.01734% 0.005957% 0.5201%
4.2.03 9 3.128% 0% 2.366% 9.047e-12% 1.579% 9.497e-07% 0.4601% 0.02709% 0.005833% 0.8843%
4.2.04 10 2.04% 0% 0.9779% 6.441e-13% 0.3923% 9.408e-08% 0.06914% 0.002608% 0.0007761% 0.07292%
5.3.01 11 4.117% 0% 2.235% 1.753e-12% 0.9655% 2.414e-07% 0.1727% 0.006675% 0.001869% 0.1838%
elaine.512 12 1.83% 0% 0.6495% 4.249e-13% 0.2008% 5.825e-08% 0.03399% 0.00173% 0.000377% 0.08077%



Table 5: Aliasing analysis of 16th-order butterworth-filtered images in terms of percentage, with 2x subsampling.
Image $D_0 = 1/32 w$ $D_0 = 1/16 w$ $D_0 = 1/8 w$ $D_0 = 1/4 w$ $D_0 = 1/2 w$
  In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback In-spectrum Loss Foldback
circle 1 2.279% 0% 0.9761% 0% 0.3341% 0% 0.03407% 0.0003726% 7.346e-09% 0.0873%
Zone Plate 3 98.62% 0% 95.28% 0% 81.49% 0% 26.16% 0.3121% 1.128e-05% 49.52%
1.3.05 5 15.26% 0% 5.665% 0% 1.621% 0% 0.1442% 0.002026% 3.029e-08% 0.4946%
1.4.01 6 3.73% 0% 2.032% 0% 0.6608% 0% 0.05307% 0.001017% 1.109e-08% 0.2183%
1.4.07 7 1.04% 0% 0.4808% 0% 0.1491% 0% 0.01631% 0.0002591% 4.93e-09% 0.1034%
1.5.02 8 26.65% 0% 26.56% 0% 2.452% 0% 0.1756% 0.002544% 2.577e-08% 0.5902%
4.2.03 9 3.019% 0% 2.268% 0% 1.47% 0% 0.2581% 0.003906% 6.143e-08% 1.058%
4.2.04 10 1.843% 0% 0.8541% 0% 0.3289% 0% 0.03168% 0.0004441% 7.89e-09% 0.08581%
5.3.01 11 3.795% 0% 2.017% 0% 0.816% 0% 0.07989% 0.001106% 1.457e-08% 0.2164%
elaine.512 12 1.565% 0% 0.5366% 0% 0.1581% 0% 0.01564% 0.0002567% 3.263e-09% 0.09244%


The 1st-order butterworth filter seems too smooth, exhibiting significant in-spectrum loss even for real images, which wasn't the case for the gaussian or ideal filters. It also exhibits foldback, although it's not noticeable from the images. The 4th order butterworth filters seemed to perform adequately, without significant in-spectrum loss or foldback for wider kernels. The 16th order, on the other hand, did suppress foldback entirely for most cases. On the other hand, the 16th order filter is more ``discontinuous'' than the other ones, so the images exhibit more ringing.

Results

Figure 26: In-spectrum loss: 1st order BW 1/32w
Image imgs/small_elaine_512_butterworth_16_1.png Image imgs/small_man_butterworth_32_1.png Image imgs/small_1_5_02_butterworth_16_1.png

Figure 27: Small inspectrum loss and aliasing: 4th order BW 1/4w
Image imgs/small_zp_butterworth_128_4.png Image imgs/small_1_4_01_butterworth_256_4.png Image imgs/small_1_4_07_butterworth_256_4.png

Figure 28: Ringing and aliasing: 16th order BW 1/16w, 1/2w, 1/2w
Image imgs/small_circle_butterworth_32_16.png Image imgs/small_elaine_512_256_16.png Image imgs/small_zp_butterworth_256_16.png

Notice that even the zoneplate image exhibits little aliasing. The butterworth filter seems more tunable than the other ones, but in general the gaussian filter is more than adequate and much simpler to tune.

Analysis

For the theoretical analysis of the performance of the filters, we have to assume something about the distribution of the image energy on the frequency domain. I'll analyze two assumptions: uniform distribution, which is in a sense the worst-possible case in terms of in-spectrum loss and aliasing possibilities, and exponential distribution from the center, which seems to be much better for the performance of filters, and also closer to real-life images. As a matter of fact, power spectra of real photographs do resemble exponential distributions 9 10 11 12. Remember that the visualizations of power spectra are on the log domain, so linear tendencies in the visualization indicate exponential growth. A near- uniform distribution of frequencies was only experienced in the zone-plate image.

To evaluate the performance, we compute the integrals of the power spectra of the original image (in this case, the expected distribution) and filtered ones. We also compute the integrals for the in-spectrum (for both 2x and 4x subsampling) energies. The difference between in-spectrum energies of original and filtered gives the in-spectrum loss, while the difference between energies outside of the spectrum and the total filtered ones give the foldback. The exponential distribution gives an energy of $2^{20-20r}$, and the domain $[-1,1]^2$. $r$ is the distance from the center of the domain.

Figure 29: Average distributions for constant and exponential power spectra.
Image imgs/constant.png Image imgs/exponential_spectrum.png


Table 7: Summary of analysis of all low-pass filters implemented, in terms of expected in-spectrum loss and foldback for images with constant and exponential power spectra.
Table 6: Average amount of in-spectrum loss and foldback, evaluated numerically, for the filters implemented here.
Filter $D_0 = 1/32 w$ $D_0 = 1/16 w$ $D_0 = 1/8 w$ $D_0 = 1/4 w$ $D_0 = 1/2 w$
  Cnst. Exp. Cnst. Exp. Cnst. Exp. Cnst. Exp. Cnst. Exp.
  ISL FB ISL FB ISL FB ISL FB ISL FB ISL FB ISL FB ISL FB ISL FB ISL FB
ILPF 98.78% 0.00% 78.51% 0.00% 95.10% 0.00% 48.14% 0.00% 80.38% 0.00% 13.58% 0.00% 21.52% 0.00% 0.32% 0.00% 0.00% 68.16% 0.00% 0.43%
Gaussian 98.77% 0.00% 81.16% 0.00% 95.09% 0.00% 57.35% 0.00% 80.55% 0.93% 28.94% 0.00% 44.23% 28.32% 10.15% 0.12% 14.89% 61.85% 2.84% 0.32%
1st BW 98.79% 0.94% 84.21% 0.00% 95.33% 3.64% 64.59% 0.00% 83.68% 12.65% 38.39% 0.01% 56.48% 33.32% 16.48% 0.09% 24.77% 56.79% 5.284% 0.25%
4th BW 98.98% 0.00% 81.48% 0.00% 95.91% 0.00% 54.17% 0.00% 83.64% 0.00% 19.43% 0.00% 36.81% 3.41% 1.80% 0.03% 0.64% 60.69% 0.01% 0.42%
16th BW 98.84% 0% 79.31% 0% 95.37% 0% 49.72% 0% 81.47% 0% 14.83% 0% 26.13% 0.32% 0.51% 0.01% 0.00% 66.16% 0.00% 0.45%


The interesting thing to notice is that the zone plate image is performing close to its theoretically worse. This supports the argument that it is the ideal image for testing filtering techniques at the worst possible conditions. At the same time, we can compare it to real-life images, and even with relatively pathological images such as Figure 8, the in-spectrum loss and foldback are not too bad. This seems to indicate that our test is overly harsh.

It seems my exponentially decreasing power spectra does not match exactly the images. For one, the in-spectrum loss expected is still larger than the ones experienced. Still, it does seem to predict the small amount of leaked energies in real images as the experiments showed.

I think it is possible to design optimal filters, similar to the Wiener sense that will minimize both foldback and in-spectrum noise, assuming we know the spectra of the images. Since it seems that an exponentially decreasing power spectrum is at least somewhat appropriate for real-life images, this could give optimal filters for a certain subsampling.

Interpolating Filters for Supersampling

In this next section, I investigated interpolating filters for supersampling. Specifically, I implemented sinc reconstruction (by padding in the Fourier domain), nearest-neighbor and bilinear (on the spatial domain).

Sinc reconstruction

The first interpolating filter works in the fourier domain. By padding the fourier domain with extra zeros, we are effectively multiplying the original image by a rectangle function on a larger domain. The inverse fourier transform of this is a convolution of the original function by a sinc function of period equal to the original resolution. The pixels that are between the original samples are the reconstructed with the sinc interpolation.

In theory, sinc interpolation is the best possible interpolation scheme, because no new frequencies are introduced (by definition). The problem that arises in practice is that similar artifacts to the ideal low-pass filters appear: sinc reconstruction near sharp edges cause ringing.

Figure 30: Examples of ringing from sinc reconstruction.
Image imgs/circle_big.png Image imgs/baboon_big.png

The following filters work in the spatial domain and they give visually better results. On the other hand, they have definite effects on the Fourier domain, as we'll see.

Nearest-Neighbor Interpolation

The next filter I investigated is the nearest-neighbor filter. With 2x supersampling, this is equivalent to convolving the image with a box filter two pixels thick. On the frequency domain, this is equivalent to multiplying with a sinc filter of period 1/2. But since we're increasing the domain to twice the size, we're going to see the frequency domain repeated. So there will be multiple copies of noise artifacts on the resulting domain, and they will be damped only slightly, because the sinc function decreases slowly.

Figure 31: Images supersampled to 2x by nearest-neighbor. Notice that the frequency domain is ``tiled'' (compare to the original spectra), then dampened as a sinc function.
Image imgs/nn_elaine_512.png Image imgs/spectrum_nn_elaine_512.png Image imgs/nn_1_5_02.png Image imgs/spectrum_nn_1_5_02.png Image imgs/nn_zp.png Image imgs/spectrum_nn_zp.png

Linear Interpolation

The final filter I analyzed was a simple linear-interpolation filter. This is equivalent to convolving the original filter with a triangle filter. We know that the FT of a triangle function is the square of the sinc function. This means that we should expect the same sort of artifacts that were present in the nearest-neighbor interpolation, only significantly more dampened, because of the power of two in the reconstruction filter.

Figure 32: Images supersampled to 2x by nearest-neighbor. Notice that the frequency domain is ``tiled'' (compare to the original spectra), then dampened as the square of sinc function, giving much less artifacts than the nearest-neighbor interpolant.
Image imgs/l_elaine_512.png Image imgs/spectrum_l_elaine_512.png Image imgs/l_1_5_02.png Image imgs/spectrum_l_1_5_02.png Image imgs/l_zp.png Image imgs/spectrum_l_zp.png

Code

The code I used for this assignment is included in the directory code. It is basically a mish-mash of VISPack code for the filtering peppered with many bash scripts to automate the job. I used ImageMagick to convert FITS images to PNGs.

Conclusion

Whew! This was a lot of work. But at the same time, it was really cool to see the artifacts of sampling and how periodic noise shows up on the Fourier Domain. I feel I learned a lot. I apologize for the extremely late submission.

Bibliography

1
The USC-SIPI Image Database. http://sipi.usc.edu/database/.

2
R. Bracewell. The Fourier Transform and its applications. McGraw-Hill. 3rd edition, 1996.

3
``The rest of the Lenna story'' http://www.lenna.org.

About this document ...

Project 3: Antialiasing, Subsampling, Supersampling and the Frequency Domain

This document was generated using the LaTeX2HTML translator Version 2002 (1.62)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html index.tex -split 0 -nonavigation

The translation was initiated by Carlos Scheidegger on 2005-05-04


Carlos Scheidegger 2005-05-04