COMP SCI 180

Project 1

Goal: Produce a color image from each digitized Prokudin-Gorskii glass plate image using image processing techniques.

We start by splitting the image into its three color channels: R, G, and B.
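As a minimal sketch of this step, assuming the scanned plate stores the three exposures stacked vertically in B, G, R order (an assumption about the input layout; `split_channels` is a hypothetical helper name), the split could look like this:

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate into (b, g, r) thirds.

    Assumes the plate is a 2-D array with B on top, then G, then R.
    """
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

# Tiny synthetic plate: 9 rows x 4 columns, so each channel is 3 x 4.
plate = np.arange(36, dtype=np.float64).reshape(9, 4)
b, g, r = split_channels(plate)
```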

Then, we have to align the three channel images so they can be combined into a color image. Unfortunately, the channels are slightly offset from one another, so simply stacking them produces a misaligned, blurry image. For example:

So, we must find the optimal displacement for each color channel in order to best align them on top of each other. I started with the low res images and wrote a function that performs an exhaustive search over a range of possible displacements. At first, I defined my search window to be [-20, 20] around x=0, y=0. For each candidate displacement, I computed the sum of squared differences (SSD) between the shifted color channel and a chosen reference channel (I chose blue as my reference), and kept the displacement with the lowest SSD. After doing this, I displayed my image, and it turned out to still be noticeably unaligned.
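A minimal sketch of this exhaustive SSD search follows. The function names are hypothetical, and using `np.roll` (which wraps pixels around the image edge) to apply a displacement is an assumption; the original shifting method is not stated.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return float(np.sum((a - b) ** 2))

def align_exhaustive(channel, reference, radius=20):
    """Try every (dx, dy) in [-radius, radius] and keep the lowest-SSD one."""
    best_score = float("inf")
    best_shift = (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Shift rows by dy and columns by dx (circular shift, an assumption).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:
                best_score = score
                best_shift = (dx, dy)
    return best_shift

# Synthetic check: displace a random "blue" channel and recover the offset.
rng = np.random.default_rng(0)
blue = rng.random((64, 64))
green = np.roll(blue, (-3, 5), axis=(0, 1))  # moved up 3 rows, right 5 columns
shift = align_exhaustive(green, blue, radius=8)
```

The returned shift is the displacement that moves the channel back onto the reference, so it is the negative of the offset that was applied.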

So, I decided to implement border reduction so that the edges of the image would be ignored during the search. I had noticed that the edges of the images were quite “noisy”, which could have been affecting the alignment result. I computed the border size as int(height * 5 / 100) and int(width * 5 / 100), that is, 5 percent of each dimension. When I calculated the SSD of two channels, I used only the interior pixels inside those borders. I also implemented a dynamic search window, again sized as a percentage of the height and width, and searched around x=0, y=0 within that window. These changes improved the alignment of my low res images by a lot.
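The border reduction and dynamic window can be sketched as below. The 5 percent border matches the text; the 5 percent window size, the helper names, and the wrap-around shift via `np.roll` are assumptions made for illustration.

```python
import numpy as np

def interior(image, border_percent=5):
    """Return the image with border_percent of each dimension cut off every side."""
    height, width = image.shape
    by = int(height * border_percent / 100)
    bx = int(width * border_percent / 100)
    return image[by:height - by, bx:width - bx]

def ssd_interior(a, b, border_percent=5):
    """SSD computed on interior pixels only, so noisy edges are ignored."""
    diff = interior(a, border_percent) - interior(b, border_percent)
    return float(np.sum(diff ** 2))

def search_radius(image, window_percent=5):
    """Search radius (radius_y, radius_x) as a percentage of the image size."""
    height, width = image.shape
    return int(height * window_percent / 100), int(width * window_percent / 100)

def align_interior(channel, reference, border_percent=5, window_percent=5):
    """Exhaustive search around (0, 0), scored on interior pixels only."""
    radius_y, radius_x = search_radius(reference, window_percent)
    best_score, best_shift = float("inf"), (0, 0)
    for dy in range(-radius_y, radius_y + 1):
        for dx in range(-radius_x, radius_x + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd_interior(shifted, reference, border_percent)
            if score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

Because the score only looks at interior pixels, a corrupted one-pixel frame around the channel does not change which displacement wins.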

monastery.jpg
tobolsk.jpg
cathedral.jpg

Then, it was time to move on to the high res images. For these, I implemented pyramid alignment by rescaling my image down by a factor of 2, num_levels times, using a for loop. Then, starting from the smallest layer of the pyramid, I found the best displacement values for x and y and used them as the center of the search in the next layer. At first, my image was taking forever to display, which I figured out was because I was dynamically calculating the window size for each and every layer. So, I decided to use a fixed window size of [-10, 10]. (At first, I was using a window size of [-15, 15], but my images were taking a very long time to display with this window size.)

Now, my high res images were being displayed in reasonable time, but they were still unaligned. At first, I thought this could be due to the “noisy borders”, so I tried cropping 15 percent of the image border before beginning the pyramid search. Even after this, my images were still unaligned, so I decided to try different metrics for comparing the color channels, including NCC (Normalized Cross-Correlation) and SSIM (Structural Similarity Index Measure). In the end, I used SSIM, which produced some well aligned images!
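A coarse-to-fine search along these lines might look like the sketch below. It is not the exact implementation described: the 2x2 block averaging stands in for whatever rescale call was used, NCC stands in for SSIM so the example needs only NumPy, and all function names are hypothetical.

```python
import numpy as np

def downscale_by_2(image):
    """Halve each dimension by averaging 2x2 blocks (stand-in for a rescale call)."""
    h, w = image.shape
    h2, w2 = h - h % 2, w - w % 2  # drop an odd trailing row or column
    return image[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def ncc(a, b):
    """Normalized cross-correlation; higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Best (dx, dy) within a fixed window around center, scored by NCC."""
    cx, cy = center
    best_score, best_shift = -float("inf"), center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def align_pyramid(channel, reference, num_levels=4, radius=10):
    """Align coarse to fine, doubling the estimate at each finer layer."""
    pyramid = [(channel, reference)]
    for _ in range(num_levels):
        c, r = pyramid[-1]
        pyramid.append((downscale_by_2(c), downscale_by_2(r)))
    shift = (0, 0)
    for c, r in reversed(pyramid):  # smallest layer first
        # One pixel at the coarser layer is two pixels at this layer.
        shift = search(c, r, (2 * shift[0], 2 * shift[1]), radius)
    return shift
```

The fixed `radius` keeps the cost of each layer bounded, since the coarser layers have already absorbed most of the displacement. To use SSIM instead, only the scoring call inside `search` would need to change.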

church.tif
emir.tif
harvesters.tif
Khan.tif

icon.tif
lady.tif
melons.tif
sculpture.tif
onion_church.tif
self_portrait.tif
three_generations.tif
Trees.tif
train.tif

IMAGE                    G            R
church.tif               (4, 25)      (-4, 58)
emir.tif                 (23, 49)     (41, 106)
harvesters.tif           (16, 59)     (14, 123)
icon.tif                 (17, 40)     (23, 89)
lady.tif                 (9, 55)      (13, 119)
melons.tif               (10, 81)     (13, 177)
onion_church.tif         (28, 51)     (36, 108)
sculpture.tif            (-11, 33)    (-27, 140)
self_portrait.tif        (29, 78)     (37, 175)
three_generations.tif    (17, 54)     (11, 113)
tobolsk.jpg              (3, 3)       (3, 7)
train.tif                (7, 41)      (32, 85)
cathedral.jpg            (2, 5)       (3, 12)
monastery.jpg            (2, -3)      (2, 3)
Trees.tif                (-40, 75)    (-66, 113)
Khan.tif                 (50, 64)     (87, 134)
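As a rough sketch of how displacements like the ones listed above could be applied to build the final color image: this assumes each pair is (x, y) for the G and R channels relative to blue, and that shifting wraps around the image edge via `np.roll`. Both are assumptions, and `compose_rgb` is a hypothetical helper name.

```python
import numpy as np

def compose_rgb(b, g, r, g_shift, r_shift):
    """Shift G and R by their (dx, dy) displacements and stack as an RGB image."""
    g_aligned = np.roll(g, (g_shift[1], g_shift[0]), axis=(0, 1))
    r_aligned = np.roll(r, (r_shift[1], r_shift[0]), axis=(0, 1))
    # Channel order along the last axis is R, G, B.
    return np.dstack([r_aligned, g_aligned, b])

# Tiny example: one bright pixel per channel, each at a different position.
b = np.zeros((4, 4)); b[2, 3] = 1.0
g = np.zeros((4, 4)); g[1, 1] = 1.0
r = np.zeros((4, 4)); r[3, 0] = 1.0
rgb = compose_rgb(b, g, r, g_shift=(2, 1), r_shift=(3, -1))
```

After the shifts, all three bright pixels land on the same location, which is what alignment is meant to achieve for real image features.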