As a research assistant to Ross Whitaker, and later as a staff
software developer at SCI working with Tolga Tasdizen I've implemented
a number of applications aimed to fullfill the image processing
requirements of the Robert Marc Laboratory at the Moran Eye Center at
the University of Utah.
Autocrop is an automatic image preprocessing application specific to the Marc
Lab data. The electron microscope images produced at the Marc Lab have
to be corrected for rotation and cropped to remove the extraneous
border data. The images are processed as follows.
Original Marc Lab data downsampled by a factor of 8 | The detected edges highlighted with dashed lines | The rotated image | The cropped image |
Stephen M. Pizer , E. Philip Amburn , John D. Austin , Robert Cromartie , Ari Geselowitz , Trey Greer , Bart Ter Haar Romeny , John B. Zimmerman, Adaptive histogram equalization and its variations, Computer Vision, Graphics, and Image Processing, v.39 n.3, p.355-368, Sept. 1987
The original cropped image | CLAHE processed image |
The basic prerequisite to accurate image registration is the ability to produce an initial mosaic layout. In order to ease the burden on the technicians during data acquisition phase the datasets generated by the Marc Lab are not required to be ordered. Instead, the image ordering is deduced automatically. The algorithm relies on finding neighbors among a given set of tiles. The neighbors are found using phase correlation for any given pair of tiles. This method is based on work described by Girod and Kuo. There is also a very concise description of the method available from wikipedia.
Girod, B. and Kuo, D. 1989. Direct estimation of displacement histograms. In Proceedings of the Optical Society of America Meeting on Understanding and Machine Vision, 73--76.
Once neighboring tiles are established (often right, sometimes wrong), it is possible to use transform cascading to generate an initial tile layout of the mosaic. Since multiple redundant cascades are possible, we can select the best one -- the one with the smallest tile mismatch error. The beneficial side effect of this redundancy is that it allows us to correct the mismatched neighors.
The mosaic is laid out by selecting one of the tiles as the anchor. The rest of the tiles are mapped into the space of the anchor tile using either direct or cascaded transforms. The mosaic layout is built up in the flood fill fashion -- the images are added incrementally to the perimeter of the mosaic. The algorithm is seeded with the perimiter being the anchor tile. As each new perimeter is added to the layout the associated transforms are refined using the gradient descent optimizer.
The details of the implementation are described in greater detail in the following technical reports: Implementation of an Automatic Image Registration Tool, Automatic Assembly of TEM Mosaics and Mosaic Stacks Using Phase Correlation.
slice 9 initial layout by ir-fft |
The above image was generated from the initial mosaic layout created using ir-fft. The images were scaled down by a factor of 8 and a 3 level multi-resolution image pyramid was used for brute force refinement of the dispalcement vectors yielded by tile phase correlation. This layout took 1 minute 25 seconds to compute on a dual Opteron 252 with 8GB RAM. This mosaic has a maximum pixel variance of 13900 and mean pixel variance of 460.
Once a rough estimate of the mosaic is known, the mosaic can be refined by correcting the distortions present in the tiles. The refinement is carried out simultaneously for all tiles. Each tile is unwarped by a unique instance of a centered normalized bi-variate 4th order Legendre polynomial transform. The metric used to guide the optimizer is the mean intensity variance within the overlapping tile regions of the mosaic. Again, the details of the implementation are discussed in the above technical report: Implementation of an Automatic Image Registration Tool.
initial | refined | |
mosaic image | ||
mean pixel variance image |
The images above demonstrate the improvement achieved with mosaic refinement. The following movie clips capture the mosaic state at each iteration of the refinement:
ir-refine-grid is the replacement for ir-refine. ir-refine-grid implements mosaic refinement via local neighborhood matching. The tile transforms are sampled onto a mesh, and a small neighborhood centered at each mesh vertex is matched with a corresponding region from the overlapping tiles. The matching is carried out using phase correlation, same as in ir-fft. This yields a translation vector that is used to refine the tile mesh. The implementation details are discussed in the following technical report: Automatic Assembly of TEM Mosaics and Mosaic Stacks Using Phase Correlation. Images below demonstrate the superior refinement achieved by mesh warping in comparison to polynomial warping.
slice 9 initial layout. | slice 9 refined using cubic Legendre polynomial transforms. | slice 9 refined using the grid transform. |
Polynomial warping was carried out in a multi-resolution fashion with source images scaled down by a factor of 8, 2 pyramid levels, 20 iterations per level. Polynomial refinement took 6 minutes 1 second and reduced maximum variance to 1190, and mean variance to 272.
Grid transform warping on source images scaled down by a factor of 8, with mesh vertex neighborhood size of 96x96 pixels (13x16 vertex mesh per tile) and just 2 iterations, took 7 minutes and 33 seconds. Mesh warping reduced maximum variance to 901, and mean variance to 188.
The ir-fft and ir-refine programs both do not generate images as their output -- that would be wasteful given the size of the input images. Instead, both applications generate small text files which reference the image tiles and their transform parameters. From this a mosaic image can be reconstructed. I've written several applications for assembling mosaic images out of individual mosaic files or a sequence of files. The ir-assemble-rgb-movie was used to generate the individual frames for the movies referenced above.
The assembled mosaics have to be stacked on top of each other (automatically) into a volume. The datasets produced by the Marc Lab for each slice are arbitrarily oriented in the imaging plane, and sometimes even flipped with respect to the previous slice.
We have attempted to correct for rotation using feature matching with the Scale Invariant Feature Transform (SIFT), however our success rate was not much better than 50%. Therefore, Ross has suggested that we should try a simpler solution -- use a brute force search on severely downsampled images (approximately 100x100 pixels). The problem with this approach is that severe downsampling destroyed pretty much all features in the mosaics preprocessed with CLAHE.
So, instead of abandoning the brute force approach I abandoned CLAHE and implemented my own preprocessing algorithm that would enhance features in the image at both local and global scales such that they would not get washed out when downsampling the image. The algorithm is as follows:
The result of this algorithm is that it enhances regions with greater than median variance -- the edges, which become black in the output image. The algorithm also enhances regions with lesser than median variance -- the flat spots (blobs) which become white in the output image.
The original cropped image | ir-blob processed image |
ir-stos is an abandoned project for slice to slice image registration using feature matching with Scale Invariant Feature Transform (SIFT) keys. The project was abandoned due to poor results. Since the slice to slice registration was expected to be an automatic process, I required the success rate of 90% or better, however all I could ever achieve with SIFT on the full dataset was about 50%. I have to mention that this took place prior to the development of ir-blob image preprocessing, so it may be possible that we could get better results with SIFT if we used ir-blob preprocessed images. This project is described in greater detail in the following technical report: Implementation of an Automatic Slice-to-Slice Registration Tool.
Lowe, D.G. 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision.
SIFT keys on the fixed image | SIFT keys on the moving image | Registration results |
Feature matching was abandoned in favour of brute force rotation/translation parameter search. This application takes two images (mosaics) and downsamples them into tiny thumbnails (no smaller than 128x128 pixels). The thumbnails are matched to each other at different rotation/translation parameters and the best result is used to initialize the registration transform.
The registration can be optionally refined in a multi resolution fashion up to the highest resolution, using the centered normalized bi-variate Legendre polynomial transforms.
brute force registration results, blob enhanced slices | refined registration results, CLAHE processed slices |
The slice to slice registration results can be refined using the grid transform. The moving slice is treated like a texture map on a 2D mesh, where each vertex in the mesh can be displaced to warp the slice. Small image neighborhoods at each mesh vertex are matched using phase correlation with a corresponding image neighborhood in the fixed slice. The displacement vector image yielded by the phase correlation is filtered using a median filter and smoothed with a Gaussian filter to remove outliers. The displacement vectors are applied to the moving slice mesh, and the process is repeated 1 or more times. Typically, 2-3 iterations are enough to get an accurate registration. ir-stos-grid is the replacement for ir-stos registration and ir-stos-brute with polynomial transform refinement. The implementation details ir-stos-grid are discussed in the following technical report: Automatic Assembly of TEM Mosaics and Mosaic Stacks Using Phase Correlation.
ir-stom-approx is a utility for cascading generating the cascaded transforms for slices registered using the Legendre polynomial transform. Since direct cascading of N transforms is impossible due to exponentially growing degree of the cascaded transform, the full N transform cascade is approximated by a cascade of 2 transforms:
ir-slice is similar to ir-assemble in that it takes the output file of ir-stom-approx and generates an image. This can be used to generate the individual slice images to be assembled into a volume. ir-slice was used to generate the individual frames (slices) in the following movies: full view of a 45 slice stack, cropped view of a 45 slice stack.
ir-stom stacks registered slices into a volume. ir-stom takes a set of .stos files and outputs an image for each slice. ir-stom cascades the slice to slice transforms directly (without any regularization), therefore the output images may contain stretching/squeezing artifacts. Each slice is cropped such that only pixels with a valid mapping into every slice in the volume are kept. ir-stom was used to generate the individual frames (slices) in the following movies. Each movie is a fly through the slice stack. The slices were registered using ir-stos-brute without refinement, and refined using ir-stos-grid.
ir-tweak is an interactive multi threaded cross platform application for manual correction/bootstrapping of the slice to slice registration, implemented using OpenGL, Qt, ITK, GLEW, Cg. The most recent version of ir-tweak implements fragment shading in order to reduce the memory footprint of the image textures. The registration is implemented via a Thin Plate Spline Transform. Here is a short movie clip demonstrating ir-tweak capabilities.
ir-tweak main window | ir-tweak mosaic window |
ir-track is a stalled work in progress. It is an interactive multi threaded cross platform application for slice to slice feature tracking.
ir-track screenshot |
ir-mosaic is an interactive multi threaded cross platform application for computer aided slice assembly. Here is a short movie clip demonstrating ir-mosaic capabilities.
mosaic tile view | mosaic layout | mosaic refinement |
Over the years I have accumulated around 60 KLOC of reusable code. I have recently packaged this code into a set of libraries. the_core library provides basic math, geometry, and utility classes. the_itk library contains convenience wrapper functions for working with the ITK toolkit. the_ui library provides an OpenGL abstraction layer for viewing setup and display list handling. This library also includes a model/view framework, a dependency graph regeneration framework, and toolkit independent input device event handling. the_ui_qt and the_ui_fltk are the glue libraries that allow the use of the_ui functionality with the Qt 4 and FLTK 1.1.x toolkits.
The command line tools, iv, ir-mosaic, ir-tweak and ir-track depend on the libraries. The command line tools need the_core and the_itk libraries only. The libraries source code can be configured and compiled using cmake. The libraries depend on Boost, ITK, FFTW, GLEW, Cg, Qt 4 and FLTK 1.1.x (optional).