General Description of the NRRD format
The NRRD format was primarily designed to be easy to write, rather
than easy to read. The NRRD header is simple ASCII text, one field
per line. The fields in the header do not have a strict ordering, and
most of them are optional. Most strings are case insensitive, and
alternate forms of many of the identifiers and descriptors are
allowed. Writing NRRD headers by hand, from scratch, is entirely
feasible (although the Utah Nrrd
Utilities program "unu make -h" is probably a better
solution). When writing non-ASCII data, the byte ordering is
recorded, but not altered to match one particular endianness. The format
flexibility greatly increases the complexity and responsibilities of a
NRRD reader, but the compelling benefit is having a simple portable
format that is general with respect to dimension and type.
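As a rough illustration of how little is needed to write a NRRD file, here is a minimal Python sketch that writes float data with an attached header. The function name write_nrrd is hypothetical. The magic and the field names (type, dimension, sizes, encoding, endian) reflect my understanding of the NRRD0001 header, so check them against the format definition before relying on them.

```python
import struct
import sys

def write_nrrd(path, values, sizes):
    """Write a list of floats as a minimal NRRD file with an attached header."""
    count = 1
    for n in sizes:
        count *= n
    if count != len(values):
        raise ValueError("sizes do not match the number of values")
    header = [
        "NRRD0001",                            # magic, on a line by itself
        "# written by a hand-rolled sketch",   # comment line
        "type: float",
        "dimension: %d" % len(sizes),
        "sizes: " + " ".join(str(n) for n in sizes),
        "encoding: raw",
        "endian: " + sys.byteorder,            # record the native order as-is
    ]
    with open(path, "wb") as f:
        # A single blank line separates the header from the data.
        f.write(("\n".join(header) + "\n\n").encode("ascii"))
        # "=" packs in native byte order, matching the endian field above.
        f.write(struct.pack("=%df" % len(values), *values))
```

The byte order is only recorded, never converted, which is the behavior described above: the writer stays simple and the reader takes on the job of swapping bytes if needed.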
The NRRD file format was also conceived as being somewhat analogous to
the PPM format for color images: straightforward, friendly to
programmers, and descriptive of a sufficiently large class of data to
be useful in research. Time and experience with the NRRD format have
gradually increased its complexity (such as with the introduction of
node- versus cell-centered samples), but the feature set has very
nearly converged. As a general representation of raster data, NRRD is
intended to occupy the very large but sparsely populated niche between
- Raw, headerless data, hopefully with some nearby README file
explaining the type and dimensions.
- Very sophisticated, powerful (complicated) formats such as HDF
(http://hdf.ncsa.uiuc.edu/).
Various aspects of the NRRD format borrow heavily from the PCGV volume
dataset format developed by James Durkin at the Cornell Program of
Computer Graphics.
Optional encodings
NRRD has two basic encodings: ascii and raw. It has other optional
encodings which are useful in different situations:
- hex: If you know enough PostScript to learn the image
dimensions, this allows you to snarf image data out of a PostScript file.
- gzip: Allows you to read and write data with the zlib
compression library, in a way that is compatible with the gzip/gunzip
command-line tools.
- bzip2: Allows you to read and write data with
bzip2 compression, compatible with the bzip2/bunzip2 command-line tools.
Having an optional encoding means that the nrrd library can be
compiled without these turned on, so that no external libraries are
needed. Builds of the nrrd library which are missing the
compression encodings will fail with a warning message when asked to
read or write compressed data.
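The following Python sketch shows one way a non-teem reader might dispatch on the encoding name and fail cleanly for anything it does not support. It is an assumption-laden illustration, not the nrrd library's logic: decode_data is a hypothetical helper, it handles only the encoding names mentioned above, and it leaves out ascii because decoding that also requires the sample type.

```python
import bz2
import gzip

def decode_data(encoding, payload):
    """Turn the encoded data portion of a NRRD file into raw bytes."""
    name = encoding.strip().lower()      # most NRRD strings are case insensitive
    if name == "raw":
        return payload
    if name == "hex":
        # Drop all whitespace between hex digits before decoding.
        digits = b"".join(payload.split())
        return bytes.fromhex(digits.decode("ascii"))
    if name == "gzip":
        return gzip.decompress(payload)  # same stream format as gzip/gunzip
    if name == "bzip2":
        return bz2.decompress(payload)   # same stream format as bzip2/bunzip2
    raise ValueError("encoding %r not supported by this reader" % encoding)
```

A reader built without compression support would simply omit the gzip and bzip2 branches and fall through to the error.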
Other optional encodings may be added in the future. However, there
is no risk that NRRD will turn into another TIFF, a format so flexible
that few readers actually support all of the 121-page
specification. The only optional encodings which may be
added to NRRD in the future will be ones for which there exist freely
available command-line tools to convert the encoded data (in
isolation) to raw data.
If you have a NRRD file volume.nrrd, with an attached header,
using a data encoding not supported by the available nrrd
implementation, you can always use the unix/linux/cygwin command
"tail +N volume.nrrd" (where N is two plus the
number of lines in the header; newer versions of tail spell this
"tail -n +N volume.nrrd") to get at all the data, so as to pass
it on to a stand-alone converter. Or, the Utah Nrrd Utilities command
"unu data" is a much easier way of doing the exact same
thing. Data in a separate file, detached from the NRRD header, is
obviously trivial to pass to a converter.
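The "tail +N" trick works because the header ends at the first blank line. A minimal Python sketch of the same split, assuming Unix line endings and no "byte skip" or "line skip" fields in play (split_nrrd is a hypothetical helper, not part of teem):

```python
def split_nrrd(blob):
    """Split an attached-header NRRD file into (header_lines, data_bytes)."""
    end = blob.find(b"\n\n")     # the first blank line terminates the header
    if end < 0:
        raise ValueError("no blank line found after the header")
    header_lines = blob[:end].decode("ascii").split("\n")
    data = blob[end + 2:]        # everything after the blank line is data
    return header_lines, data
```

With H header lines, the data starts on line H + 2 of the file, which is the N used in the tail command.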
I couldn't find stand-alone converters for hex data, so I wrote them:
Because unu data will always be able to spit out the
data portion of a NRRD file, even if the nrrd library on which
it was built wasn't compiled with the optional encodings enabled,
other non-teem NRRD readers should feel no obligation to support the
optional encodings.
NRRD compared to VTK format
The VTK
(http://public.kitware.com/VTK/pdf/file-formats.pdf) file
format is more general than NRRD in the types of information
represented, and slightly less general than NRRD when it comes to
raster data.
Unlike NRRD, VTK can represent:
- Point sets, polygonal data, structured and unstructured meshes
of various types.
- Multiple pieces of data in one file, allowing samples to have
many various attributes.
- Vector and tensor types explicitly. In NRRD, these are
represented implicitly, by using a short non-spatial axis prior to
the spatial axes.
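To make the implicit representation concrete: a 640x480 RGB image would be a three-dimensional nrrd with sizes 3, 640 and 480, the short color axis coming first. The Python sketch below computes where a sample lives in the flat data, assuming that the first axis varies fastest in memory (my understanding of the NRRD convention). flat_index is a hypothetical helper.

```python
def flat_index(coords, sizes):
    """Linear offset of a sample, with axis 0 varying fastest."""
    index = 0
    stride = 1
    for c, n in zip(coords, sizes):
        if not 0 <= c < n:
            raise IndexError("coordinate out of range")
        index += c * stride
        stride *= n
    return index

sizes = [3, 640, 480]                      # color axis, then x, then y
green_at = flat_index([1, 10, 2], sizes)   # channel 1 of pixel x=10, y=2
# green_at == 1 + 3*10 + 3*640*2 == 3871
```

The three components of each pixel end up adjacent in the data, which is what makes the short non-spatial axis behave like a vector type.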
But with raster data, unlike VTK, NRRD can:
- Read and write data in either byte-ordering (VTK is always
big-endian).
- Have the data in a separate file from the header.
- Represent the difference between cell- and node-centered samples.
- Store data of any dimension, and any C scalar type.
- Encode data in more than just ascii and binary, including gzip
and bzip2 compression.
- Store more peripheral information, such as axis labels and units, and
old min and max (the range of values pre-quantization).
NRRD compared to MetaImage format
This format (specs available from http://tolkien.rad.unc.edu/technologies/MetaImage/)
was developed at the Computer
Aided Diagnosis and Display Lab at UNC. It is extremely similar
to NRRD in terms of representational capabilities, in that it
represents arrays of general type and dimension.
Here are some differences in representational capability or
aspects of the file format. In favor of MetaImage:
- Having a detached header file give a list of image (slice) filenames,
while describing those images as a volume, is quite powerful, and
MetaIO has two different ways of doing this.
- The ElementNumberOfChannels field allows a nominally
3-D data header to describe what is logically a 4-D array. NRRD
suffers from a slight weirdness in this regard (a color image is a
three-dimensional nrrd), a consequence of its "everything is a scalar"
mentality.
- The distinction between element size and element spacing is
fundamental to properly representing MRI data when the slice thickness
differs from (and is usually less than) the spacing between slices. NRRD
doesn't know anything about this difference, and perhaps it should.
In favor of NRRD:
- Has a simple "magic" on a line by itself at the beginning of the file,
to unambiguously identify the type of the file to multi-format readers.
- Has more than just raw and ascii encodings: gzip and bzip2 compression,
as well as hex.
- A more conservative approach to representing optional information.
If you don't know information like sample spacing, you don't include
that field. The NRRD reader remembers that you didn't know spacing,
instead of inventing some default value.
There are many one-to-one parallels between the header fields in
the two formats:
| NRRD | MetaImage |
|---|---|
| # | Comment |
| dimension | NDims |
| sizes | DimSize |
| type | ElementType |
| endian | ElementByteOrderMSB |
| byte skip | HeaderSize |
| min | ElementMin |
| max | ElementMax |
| content | Name |
| axis mins | Position |
| data file | ElementDataFile |
| (not using a detached header) | ElementDataFile LOCAL |
Some MetaImage fields that NRRD has no good analog for:
- ObjectType, ObjectSubType, TransformType,
Modality: descriptive strings
- ID, ParentID: integers
- Color, Orientation, AnatomicalOrientation:
arrays
- SequenceID: 4-tuple of integers specific to DICOM format,
which uniquely identifies the dataset.
Some NRRD fields that MetaImage doesn't seem to have analogs for:
- centers: cell-vs-node centering
- axis maxs: good with histograms, scatterplots, fields of view
- old min, old max:
remembering value range pre-quantization
- line skip: very handy for snarfing data from other formats,
such as VTK, VisPack, and PostScript
- labels: arbitrary string per axis
- units: arbitrary string per axis giving units that spacing
is measured in.
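The centers, axis mins and axis maxs fields interact: the same min, max and number of samples place the samples differently depending on centering. The Python sketch below uses the usual interpretation, which is an assumption on my part and not a quotation of the nrrd library: node-centered samples sit on the interval endpoints, while cell-centered samples sit in the middle of equal-sized cells. Both helper names are hypothetical.

```python
def axis_spacing(size, axis_min, axis_max, center):
    """Distance between adjacent samples along one axis."""
    if center == "cell":
        return (axis_max - axis_min) / size
    if center == "node":
        if size < 2:
            raise ValueError("node-centered spacing needs at least 2 samples")
        return (axis_max - axis_min) / (size - 1)
    raise ValueError("center must be 'cell' or 'node'")

def sample_position(index, size, axis_min, axis_max, center):
    """World-space position of sample number `index` along one axis."""
    step = axis_spacing(size, axis_min, axis_max, center)
    if center == "cell":
        return axis_min + (index + 0.5) * step   # middle of cell `index`
    return axis_min + index * step               # nodes start at axis_min
```

For a histogram with 4 bins over [0, 1], cell centering gives bin centers at 0.125, 0.375, 0.625 and 0.875, which is why axis maxs is useful there.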
Future Extensions
There are at least two things which will be added to the next
version of the NRRD format. These changes will probably be implemented
within a year. The magic for this new and improved
format has not yet been established.
- New field: "gamma: ". This is to have parity with
PNG's gamma chunk. Like "min: ", "max: ", and
"axis mins: ", this will be an optional field, which is
sensible to use for only some nrrds.
- General facility for key/value pairs. This would be a more
structured way of storing non-nrrd optional information than comments
provide.
I have some other ideas on how the NRRD file format may be extended in
the future, but these are not likely to happen within the next year.
- More than one array per NRRD file: There are many situations
where it is good to logically associate multiple NRRD files together
as one "data set". Examples include a large volume and its
pre-computed univariate histogram (to help in isovalue navigation), or
a collection of different one-, two-, and three-dimensional lookup
tables which are good transfer functions for a given volume. I am
leaning towards implementing this multi-NRRD association with XML on
top of regular NRRD0001 files. If you need this, though, you should
probably be using HDF
(http://hdf.ncsa.uiuc.edu/).
- Bricking: Whenever I do get around to implementing bricking in
nrrd, the results will be saved in NRRD files. One level of
bricking will turn a 3-D array into a 6-D array. For various subtle
reasons, the representation of the axis mins and maxs becomes
ambiguous in the case of cell-centered data, and this information is
meaningless for all the bricked axes. Brick overlap should be
represented too, but this means a new field.
- Other compression methods: It would be really nice to have a
compression method that worked well on floating point data.
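As a sketch of why one level of bricking turns a 3-D array into a 6-D array: each original coordinate splits into a position within a brick and the index of the brick itself. The Python below assumes cubic bricks whose edge length divides every axis size, and it assumes the within-brick axes come before the brick-index axes; neither choice is specified above, and both helper names are hypothetical.

```python
def brick_coords(x, y, z, brick):
    """Map a 3-D sample coordinate to a 6-D bricked coordinate."""
    bx, xi = divmod(x, brick)   # brick index and offset inside the brick
    by, yi = divmod(y, brick)
    bz, zi = divmod(z, brick)
    return (xi, yi, zi, bx, by, bz)

def bricked_sizes(sizes, brick):
    """Axis sizes of the 6-D array for a 3-D array with the given sizes."""
    if any(n % brick for n in sizes):
        raise ValueError("brick edge must divide every axis size")
    return [brick] * 3 + [n // brick for n in sizes]
```

The three brick-index axes have no natural min or max in world space, which is one way to see why that information becomes meaningless for the bricked axes.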