Image Combination


Problem and Motivation

Capturing a single high-quality, high-resolution image of a given object or scene can be difficult or impossible due to limitations of the imaging system. However, it may be possible to capture multiple low-quality images and then combine them into a single high-quality image.


The human visual system does this kind of image combination so seamlessly that it is not even noticeable. At any moment, only about two degrees of the visual field are in focus on the fovea of the retina. The eye darts around, collecting high resolution samples of its environment. These samples are combined in the visual cortex to produce the perception of a continuous high resolution image of the world.


Combining images digitally is useful for many imaging scenarios. We were interested in two specific applications. First, in the process of scanning an object with a laser to produce a 3D mesh, scans from multiple angles are often needed to capture the object’s entire geometry. Because the speed of the laser scan line can be controlled but the resolution of the scanner’s detector cannot, the scanner has an anisotropic point spread function (PSF). We were interested in image combination techniques that could be generalized to 3D meshes and that were robust to anisotropic PSFs, spatially-varying PSFs, and noise.


We were also interested in the related problem of reconstructing images of objects from reflections off of shiny 3D surfaces. Consider “Tempest in a Teapot: Compromising Reflections Revisited” by Backes et al., presented at the 2009 IEEE Symposium on Security and Privacy. In this work, the authors showed that reflections off of stationary, moving, and even some diffuse objects around an office could be used to reconstruct what is being displayed on an LCD monitor. The authors, however, only consider a reflection off of a single object when reconstructing the image on the LCD display. Merging multiple reflected images could produce reconstructions of higher quality than those obtainable from a single reflection.



Exploration of Existing Approaches

We explored three approaches for image combination.


Super-resolution (SR): SR algorithms combine multiple low-resolution images to produce a high-resolution image larger than the inputs. Most SR algorithms model each input as a blurred, resampled (resolution-decimated), and noisy copy of the scene, and assume knowledge of the type of blur that was applied. These assumptions are often restrictive; for example, some SR algorithms assume that all input images have been degraded with the same blur kernel. SR techniques rely on sub-pixel shifts between images, which cause aliasing that the algorithms can exploit.
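
To make this degradation model concrete, here is a minimal Python sketch (a hedged illustration, not any particular SR method); the shift values, blur sigma, decimation factor, and noise level are all assumptions chosen for demonstration:

    import numpy as np
    from scipy import ndimage

    def degrade(hi_res, shift, sigma=1.0, factor=4, noise=0.01, rng=None):
        # Shift, blur, decimate, and add noise: the standard SR forward model.
        rng = rng or np.random.default_rng()
        shifted = ndimage.shift(hi_res, shift)       # translate in hi-res pixels
        blurred = ndimage.gaussian_filter(shifted, sigma)
        decimated = blurred[::factor, ::factor]      # resolution decimation
        return decimated + noise * rng.standard_normal(decimated.shape)

    yy, xx = np.indices((128, 128))
    hi_res = np.sin(0.3 * xx) * np.sin(0.2 * yy)     # synthetic sharp scene

    # A 1.5-pixel hi-res shift is 0.375 low-res pixels at factor 4, so the
    # two observations alias differently: the information SR exploits.
    lo_res = [degrade(hi_res, shift=(0.0, 0.0)),
              degrade(hi_res, shift=(1.5, 2.5))]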


Multi-channel blind deconvolution (MCBD): MCBD algorithms combine multiple low-quality images to produce a high-quality image of the same size as the inputs. MCBD algorithms model the inputs as multiple images of the same scene that have undergone blur and the addition of noise. They treat the blur functions as unknown; hence they are ‘blind’. Like SR algorithms, however, they often make restrictive assumptions about the blur functions of the input images.
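
As a concrete illustration, the following sketch numerically checks the cross-relation identity that some MCBD methods build on: because convolution commutes, filtering the first observation with the second (unknown) kernel must equal filtering the second observation with the first kernel in the noise-free case. The scene and kernels below are random stand-ins:

    import numpy as np
    from scipy.signal import fftconvolve

    rng = np.random.default_rng(1)
    scene = rng.standard_normal((64, 64))        # unknown sharp image x
    h1 = rng.random((5, 5)); h1 /= h1.sum()      # unknown blur kernel 1
    h2 = rng.random((5, 5)); h2 /= h2.sum()      # unknown blur kernel 2

    y1 = fftconvolve(scene, h1)                  # observed channel 1: h1 * x
    y2 = fftconvolve(scene, h2)                  # observed channel 2: h2 * x

    # h2 * y1 = h2 * h1 * x = h1 * y2, so both sides should agree.
    print(np.allclose(fftconvolve(y1, h2), fftconvolve(y2, h1)))  # True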


Wavelet-based image fusion: Unlike SR and MCBD, wavelet-based image fusion can deal with spatially-varying blur. Wavelet transforms analyze local frequency information: they answer the question of which frequencies are present at each spatial location of the image. Fusion exploits this property to keep the strongest high-frequency content in the combined image. First, the discrete wavelet transforms of the input images are computed. Then, in the wavelet domain, decision rules are applied; these rules choose the wavelet coefficients of the output image based on the wavelet coefficients of the inputs. The simplest decision rule is the maximum absolute value rule, which, for each location in the wavelet domain, selects the coefficient from the input with the greatest absolute value. Finally, the inverse wavelet transform of the result is computed.
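
The sketch below implements this pipeline with NumPy and PyWavelets. The wavelet (‘db2’), the decomposition depth, and the choice to average the coarse approximation band are illustrative assumptions; published methods vary in how they fuse the approximation band:

    import numpy as np
    import pywt

    def fuse_max_abs(images, wavelet='db2', level=3):
        # Decompose each input: [approx, (cH, cV, cD), ..., (cH, cV, cD)].
        decomps = [pywt.wavedec2(img, wavelet, level=level) for img in images]

        # Average the coarse approximation band (overall brightness).
        fused = [np.mean([d[0] for d in decomps], axis=0)]

        # For each detail band, keep the coefficient with max absolute value.
        for lvl in range(1, level + 1):
            bands = []
            for b in range(3):   # horizontal, vertical, diagonal details
                stack = np.stack([d[lvl][b] for d in decomps])
                idx = np.argmax(np.abs(stack), axis=0)
                bands.append(np.take_along_axis(stack, idx[None], 0)[0])
            fused.append(tuple(bands))

        # Invert the transform to get the fused image.
        return pywt.waverec2(fused, wavelet)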


Curvelet-based image fusion: Curvelet-based image fusion is an extension of wavelet-based image fusion that uses a curvelet transform instead of a wavelet transform. While wavelets describe position and spatial frequency, curvelets describe position, scale, and orientation. Curvelet techniques show promise for image fusion, producing better results with fewer artifacts than simple wavelet-based fusion.


Our Approach


We decided to focus on wavelet-based image fusion because, of all the techniques we studied, it made the fewest assumptions that our applications would violate. We tested wavelet-based fusion against combining images by simple averaging on images degraded by anisotropic blur filters, anisotropic blur filters plus noise, and spatially-varying blur filters. We also compared wavelet-based image fusion and simple averaging for object reconstruction from reflections. We found that wavelet-based image fusion performs better than averaging when images have been degraded by blur, but that the technique is not very robust to noise.
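
As a rough sketch of the anisotropic-blur comparison (illustrative values, not our actual experimental setup), the snippet below degrades a synthetic checkerboard with two complementary anisotropic Gaussian blurs and scores simple averaging against the fuse_max_abs() helper from the earlier sketch:

    import numpy as np
    from scipy import ndimage

    yy, xx = np.indices((256, 256))
    truth = ((yy // 16 + xx // 16) % 2).astype(float)   # synthetic scene

    # Complementary anisotropic blurs: one smears columns, one smears rows,
    # so each input preserves detail that the other destroys.
    blurred = [ndimage.gaussian_filter(truth, sigma=(0.5, 4.0)),
               ndimage.gaussian_filter(truth, sigma=(4.0, 0.5))]

    averaged = np.mean(blurred, axis=0)
    fused = fuse_max_abs(blurred)[:256, :256]   # crop guards odd-size padding

    rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
    print('averaging RMSE:', rmse(averaged, truth))
    print('fusion RMSE:   ', rmse(fused, truth))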


Possible Applications and Future Work


Wavelet-based image fusion shows promise for many applications where combining images is necessary. It may be possible to generalize the algorithm to operate on 3D meshes. Wavelet-based fusion may also prove useful for applications such as image compositing. We would like to find a way to make wavelet-based image fusion more robust to noise. We would also like to do more research into curvelet-based image fusion and compare it to wavelet-based techniques.