Wei Liu (u0614581)
weiliu@sci.utah.edu
where the indicator vector has one element per data point, indicating whether that point belongs to the cluster. By definition the elements of the indicator vector can take only discrete values, but for the purpose of optimization this constraint can be relaxed. The solution is then given by the eigenvectors of the affinity matrix associated with its largest eigenvalues.
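As a minimal sketch of this relaxation (in Python/NumPy for illustration, rather than the Matlab used in the project; the 4-point affinity matrix below is made up), the relaxed indicator vectors are the leading eigenvectors of the affinity matrix:

```python
import numpy as np

# Toy affinity matrix for 4 points: two tight pairs, (0,1) and (2,3).
W = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])

# Relaxed solution: eigenvectors of W with the largest eigenvalues.
vals, vecs = np.linalg.eigh(W)              # eigh returns ascending order
top = vecs[:, np.argsort(vals)[::-1][:2]]   # two leading eigenvectors

# The sign of the second leading eigenvector separates the two pairs.
labels = (top[:, 1] > 0).astype(int)
```

Thresholding the second leading eigenvector at zero recovers the two pairs, which is exactly the "relax, then threshold" recipe described above.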
Normalized graph cuts: Conventional graph cuts tends to split off small numbers of isolated data points as clusters, which is often not correct. Normalized graph cuts (NGC) defines a new objective function by normalizing the original graph-cut criterion, penalizing each cut by the total association of the two sides:
Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V).
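A sketch of the NGC relaxation (again Python/NumPy for illustration, with a made-up toy affinity): the relaxed normalized cut is found from the second-smallest eigenvector of the generalized problem (D - W) y = lambda D y, solved here through the symmetric matrix D^{-1/2} (D - W) D^{-1/2}:

```python
import numpy as np

# Toy affinity matrix: two tight pairs, (0,1) and (2,3).
W = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])
d = W.sum(axis=1)                 # degree of each vertex
D = np.diag(d)
d_isqrt = np.diag(1.0 / np.sqrt(d))

# Symmetric normalized Laplacian: D^{-1/2} (D - W) D^{-1/2}.
L_sym = d_isqrt @ (D - W) @ d_isqrt
vals, vecs = np.linalg.eigh(L_sym)   # ascending: smallest eigenvalue first

# Second-smallest eigenvector (mapped back by D^{-1/2}) gives the cut.
y = d_isqrt @ vecs[:, 1]
labels = (y > 0).astype(int)
```

The normalization by the degree matrix D is what discourages the tiny isolated-point clusters that plain graph cuts produces.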
Pattern recognition viewpoint: The eigenvectors (for either conventional or normalized graph cuts) can be seen as `features' of the data points, from a pattern recognition viewpoint. Hence, for clustering, we can use more eigenvectors than the number of clusters and apply a general clustering method like K-means, with the eigenvectors as input features. This strategy is better than manually thresholding the eigenvectors (as the book chapter does), since it needs no human intervention.
Affinity matrix: The affinity matrix has one element for each pair of data points, giving the affinity between them. In this project we use an intensity-based affinity and a distance-based affinity. Choosing the kernel size for these two affinities is tricky. For the intensity-based affinity, the kernel size should be small, because two data points with a large intensity difference should never be in the same cluster. For the distance-based affinity, the kernel size should not be too small, because two data points can belong to the same cluster even when they are far apart, so the distance constraint should be looser.
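A sketch of the two affinities on a made-up 1-D "image" (Python/NumPy for illustration; the sigma values are hypothetical, not the ones tuned in the report), with a strict kernel for intensity and a looser one for distance:

```python
import numpy as np

# Toy 1-D "image": pixel positions and intensities (illustrative values).
pos = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
inten = np.array([0.10, 0.10, 0.12, 0.90, 0.88])

# Pairwise squared differences in position and intensity.
d_pos = (pos[:, None] - pos[None, :]) ** 2
d_int = (inten[:, None] - inten[None, :]) ** 2

# Small kernel for intensity (strict), larger kernel for distance (loose).
sigma_int, sigma_pos = 0.1, 5.0
W_int = np.exp(-d_int / (2 * sigma_int ** 2))
W_pos = np.exp(-d_pos / (2 * sigma_pos ** 2))
W = W_int * W_pos   # combined affinity, as used later in the experiments
```

With these kernels, nearby pixels of similar intensity get affinity near 1, while distant pixels of different intensity get affinity near 0.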
Graph structure: There are different methods to generate the graph, such as the epsilon-neighborhood or k-nearest-neighbor method. In this project we simply generate a fully connected graph: each data point is connected to all other points, with different weights. For a large data set, we may need a sparse representation of the affinity matrix, using the epsilon-neighborhood or nearest-neighbor method above to generate a sparse affinity matrix.
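A sketch of the epsilon-neighborhood idea (Python/NumPy for illustration; the point set, sigma, and epsilon threshold are made up): build the fully connected affinity, then zero out negligible edges to obtain a sparse graph:

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((60, 2))          # 60 random 2-D points in the unit square

# Fully connected distance-based affinity.
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 0.1 ** 2))   # sigma = 0.1, illustrative only

# Epsilon-neighborhood pruning: drop edges with negligible weight.
W[W < 1e-3] = 0.0
density = np.count_nonzero(W) / W.size   # fraction of surviving edges
```

The surviving nonzeros can then be stored in a sparse matrix format, which is what makes the large-data experiment at the end of this report feasible.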
Normalization of distance affinity and intensity: Different data sets have different scales of distance between data points; for images, the largest distance between two pixels depends on the image size. I think it is good to normalize the distances to a common scale, so that one parameter setting (i.e. the kernel size) applies to most images. Analogously, the intensities also need normalization.
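The rescaling can be sketched as follows (Python/NumPy for illustration; the helper name `normalize01` is mine, not from the report):

```python
import numpy as np

def normalize01(x):
    """Rescale values to [0, 1] so one kernel size works across images."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Example: distances from images of different sizes map to the same range.
scaled = normalize01([2.0, 4.0, 6.0])
```

Applying the same rescaling to both pairwise distances and intensities puts the two affinity kernels on comparable footing.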
Compare affinity matrices: In this experiment we compare the distance-based affinity, the intensity-based affinity, and the combination of both. As in the previous test, both the distance differences and the intensity differences are normalized to the same scale. (This normalization should not be confused with the normalization in `normalized graph cuts'.)
First we use only the distance-based affinity. In this case the affinity uses no information from the image, so I expected to see some big, coherent regions in the result, but not regions resembling those in the image. The result in figure 3 confirms this. In figure 3, it is hard to find a good threshold for either eigenvector. This is because, without image information, the eigenvectors obtained from the distance-based affinity matrix only represent the internal spatial coherence of the image, and are hence smooth. After thresholding at an arbitrary value as shown in the figure, the result is one circular region and two rectangular regions.
Next we use only the intensity-based affinity. I expected this scenario to give a much better result, as it uses information from the image. Because two pixels with a big intensity difference are unlikely to belong to the same cluster, intensity is a more accurate and trustworthy measure of affinity than spatial distance, so I chose a much smaller kernel size for intensity. The result in figure 4 shows that, by carefully thresholding the eigenvectors, I am able to segment this simple image quite well. There are still small holes in the bigger regions, because there is no constraint on the data points' spatial neighborhoods.
Then we use both the intensity- and distance-based affinities by multiplying them together. The result in figure 5 shows that the affinity matrix looks like a combination of the two individual cases, and so do the eigenvectors. The segmentation result is better than with the intensity-based affinity alone, as there are fewer holes in the big regions. This is because the distance constraint forces isolated pixels to merge with their neighbors. By choosing a larger distance kernel size, we could probably remove all the holes in this particular data set. However, a bigger kernel size may not work for other images in general.
Linear combination of eigenvectors: For the examples in figures 3, 4 and 5, I did not observe that a single eigenvector is insufficient for clustering. However, a combination of eigenvectors should have more discriminative strength than a single eigenvector. This is like feature extraction in pattern recognition: more features give better results, as long as they are genuinely good features and independent of each other.
Compare graph cuts with normalized graph cuts: To compare the two methods, I use K-means for clustering instead of manually thresholding the eigenvectors. This guarantees a fair comparison between GC and NGC. Also note that K-means sometimes performs worse than manual thresholding, but as long as both GC and NGC use K-means, the comparison is fair.
Another motivation for using K-means is that it clusters the data automatically based on the eigenvectors, without human intervention.
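The eigenvectors-as-features plus K-means pipeline can be sketched as follows (Python/NumPy for illustration; the minimal 2-means routine and the toy affinity are mine, not the report's Matlab code):

```python
import numpy as np

def kmeans2(X, iters=20):
    """Minimal 2-means (Lloyd's algorithm) with farthest-point init."""
    c0 = X[0]
    c1 = X[((X - c0) ** 2).sum(1).argmax()]   # farthest point from X[0]
    centers = np.stack([c0, c1])
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        centers = np.stack([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(2)])
    return labels

# Toy affinity with two tight pairs; its leading eigenvectors serve as
# per-point feature vectors for K-means.
W = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])
vals, vecs = np.linalg.eigh(W)
feats = vecs[:, np.argsort(vals)[::-1][:2]]  # top-2 eigenvectors as features
labels = kmeans2(feats)
```

The same pipeline works for either GC or NGC: only the matrix whose eigenvectors are taken as features changes.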
NGC on a real image: To verify normalized graph cuts on a real image, I chose Matlab's built-in spine image. The results in figure 7 show that the method works. To human vision this image has three regions. If we set a strong constraint on distance, we get the first row in the figure. We see that, when given two clusters, NGC is able to find the foreground and background. When given three clusters, it finds one region of the third cluster (on the top left of the image), but not the other spatially separated region of the same cluster (top right of the image). When given four clusters, NGC even tries to split the background. This is understandable: the spatial distance constraint is so strong that even the background is classified as two clusters.
This made me wonder whether NGC would work better if I loosened the spatial constraint. The bottom row shows the result with a looser distance kernel. We see that, when given three clusters, NGC does find the other region of the third cluster (top right of the third image). When given 4 clusters, it tries to put some edge points into a fourth cluster. This is understandable, because to human vision there is no fourth cluster at all.
On large data: The affinity matrix has N^2 elements (N is the number of pixels in the image), and its eigendecomposition has O(N^3) complexity, which makes the computational cost prohibitive. I also tried to build a sparse affinity matrix and solve the eigenproblem on it at much smaller cost. The way to build a sparse affinity matrix is to redefine the connectivity between the vertices of the graph, so that any affinity below a certain value is set to zero. This epsilon-neighborhood definition generates a sparse matrix.
Unfortunately, I got stuck on the usage of Matlab's eigs function and could not finish this experiment, but in principle it is doable.
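For reference, the sparse eigen-solve that eigs would perform can be sketched with SciPy's equivalent, eigsh (the chain-graph affinity below is made up for illustration; in Matlab, eigs(W, k) plays the same role):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh

# Sparse affinity for a chain of 100 points: each point is connected only
# to its two immediate neighbors, plus unit self-affinity.
n = 100
off = 0.9 * np.ones(n - 1)
W = sparse.diags([off, np.ones(n), off], [-1, 0, 1], format='csr')

# Leading k eigenpairs, computed without ever densifying W;
# this is the counterpart of Matlab's eigs(W, k).
vals, vecs = eigsh(W, k=3, which='LA')
```

An iterative solver like this only needs matrix-vector products with the sparse W, so the cost scales with the number of nonzero affinities rather than with N^2.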
I am also concerned about how to combine two affinity matrices (say, from two modalities of an image). A simple multiplication may not be the best method: we have to manually tune the kernel sizes of the two affinities to express our relative belief in them, and there seems to be no general method available for this combination.