The options for creating histogram volumes are seen by running: "gkms hvol":One result of my Master's thesis research is that the chances are very high that you don't need to worry about any of the optional command-line parameters! The default parameters generally work quite well. So, for the impatient, this may be all you need:usage: gkms hvol [-s v,g,h] [-p v,g,h] [-d v,g,h] [-h lapl|hess|gmg] <input> <hvolOut> Default values are in {}s -s v,g,h : Strategies for determining inclusion of values in the histogram volume. Each of v,g,h can be ... "f": fraction of maximum range "p": percentile of highest values to exclude "a": absolute specification "s": multiple of standard deviation {f,p,p} -p v,g,h : Parameters to control chosen inclusion strategies. Each of v,g,h should be a scalar value. {f: 1.0; p: 0.15; a: (no default); s: 70} -d v,g,h : Dimensions of output histogram volume. Each of v,g,h should be an integer {256,256,256} -h lapl|hess|gmg: how to measure second derivative "lapl": Laplacian "hess": Hessian-based "gmg": gradient of gradient magnitude {hess} input : 3-D scalar nrrd being analyzed hvolOut : 3-D histogram volume to be createdgkms hvol engine-crop.nrrd engine-hvol.nrrdThe main subtlety in histogram volume creation is deciding what range of derivative values should be represented by the index space of the histogram volume. This aspect of histogram volume creation is called the "inclusion strategy". I have found the most reliable way to determine inclusion is by looking at linear histograms of the derivative value, and then determining what range would exclude some percentile of the hits. This is why the default inclusion strategy for the first and second derivative axes (of the histogram volume) is "p". Any time that the data is something other than super-clean synthetic data, there are going to be a few very high derivative values. These are the voxels which should not be included in the histogram volume because it would decrease the precision with which we can represent the meaningful variations in the derivative values.
One the other hand, the range of data values should probably be fully represented, so the inclusion strategy is "f": some fraction of the full range. In gkms, the default fraction is 1.0, that is, the whole value range. In other cases, you know exactly what range of values to include, in which case it makes the most sense to use inclusion "a": absolute specification of the derivative values to be represented by the histogram. Some multiple of the ("s") standard deviation may be another interesting inclusion strategy.
The next option available in gkms is the dimensions ("-d") of the output histogram volume. I have found 2563 to work fine. If you want to skimp on how big the histogram volume is, it makes more sense to skimp on the derivative axes, than on the data value axis. For example, to make a histogram volume with only 128 bits for the two derivative axes, you'd run:
gkms hvol engine-crop.nrrd -d 256,128,128 tmp.hvolFinally, there is the matter of just how to measure the second derivative. I know of three different ways of approximating the second directional derivative along the gradient direction: the Laplician, something based on the Hessian, and something based on the gradient of the gradient magnitude. As explained in Appendix C of my thesis, I feel the Hessian-based measure is probably the best, so it is the default.