tad

April 22, 2021

Synopsis

tad command [options] input-file output-file

Description

Tad is a tool that works on multi-dimensional arrays with metadata (Tagged Array Data). It provides a set of commands to manipulate arrays in various ways or print information about them.

Arrays consist of array elements, and each element consists of a fixed number of components of a certain data type. For example, an image of size 640x480 is typically stored in a 2D array with dimensions 640 and 480, and each array element consists of 3 data values of type uint8 for R, G, and B.

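Array metadata consists of tags (name-value pairs). Each tag is associated with one of the following: the whole array (global tags), one of its dimensions (dimension tags), or one of its element components (component tags).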

For example, an image might have the global tag “DESCRIPTION=Sunset at a beach” and element component tags INTERPRETATION=SRGB/R, INTERPRETATION=SRGB/G, and INTERPRETATION=SRGB/B for the three element components.
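
The tag levels can be pictured with a small in-memory model. This is only an illustrative sketch: the class TaggedArray and its field names are hypothetical and are not part of tad, and the per-dimension tag level is taken from the file format specification later in this manual.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedArray:
    """Hypothetical model of an array with tags; not tad's actual API."""
    dimensions: list[int]
    components: int
    data_type: str
    global_tags: dict[str, str] = field(default_factory=dict)
    dimension_tags: list[dict[str, str]] = field(default_factory=list)
    component_tags: list[dict[str, str]] = field(default_factory=list)

    def __post_init__(self):
        # Provide one (initially empty) tag list per dimension and per component.
        if not self.dimension_tags:
            self.dimension_tags = [{} for _ in self.dimensions]
        if not self.component_tags:
            self.component_tags = [{} for _ in range(self.components)]


# The image example from the text: 640x480, three uint8 components.
image = TaggedArray(dimensions=[640, 480], components=3, data_type="uint8")
image.global_tags["DESCRIPTION"] = "Sunset at a beach"
for tags, channel in zip(image.component_tags, "RGB"):
    tags["INTERPRETATION"] = f"SRGB/{channel}"
```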

Tad Commands

All commands require one or more input-files, and all except the info command require an output-file. Input and output files cannot be omitted, but you can use - to specify standard input or output.

Common options accepted by all commands are

create

Create arrays. All data will be zero, but the array(s) can be piped to the calc command to fill them with meaningful data.

Examples:

convert

Convert data between different formats.

Examples:

calc

Calculate array data values.

Examples:

diff

Compute the absolute difference between two arrays.

Example:

info

Print information about arrays, their contents, and their metadata. This command does not take an output-file argument. The default output per array consists of an overview, all tags, and optionally statistics (with -s). All options described after option -b will disable the default output, and instead print their own output in the order in which they are given.

Examples:

File Formats

The tad utility supports many file formats. Some are builtin and some require an external library. Some file formats are supported for reading and writing (rw), some only for reading (r). Different file formats can store different types of arrays. Here’s an overview:

Name | File Format(s) | Library | Read/Write | Arrays per file | Dimensions | Components | Data Types | Comment
tad | .tad | builtin | rw | unlimited | unlimited | unlimited | all | Native format, very fast.
raw | (raw) | builtin | rw | unlimited | unlimited | unlimited | all | Requires the following input tags for reading: DIMENSIONS, COMPONENTS, TYPES.
csv | .csv | builtin | rw | unlimited | unlimited | unlimited | all, interpreted as float32 when reading and simplified to int8, uint8, int16 or uint16 if the values fit | Simple text format, easy to edit.
pnm | .pgm, .ppm, .pam, .pfm | builtin | rw | unlimited | 2 | 1-4, with float32 only 1 or 3 | uint8, uint16, float32 | Simple image file formats.
rgbe | .pic, .hdr | builtin | rw | unlimited | 2 | 3 | float32 | Simple format for HDR images.
dcmtk | .dcm, .dicom | DCMTK | r | 1 | 2 | 1 or 3 | uint8, uint16, uint32, uint64 | Used for medical image data.
exr | .exr | OpenEXR | rw | 1 | 2 | unlimited | float32 | Used for HDR images.
fits | .fits, .fit | CFITSIO | r | unlimited | unlimited | 1 | all | Used for astronomy data.
ffmpeg | Many video and image formats | FFmpeg | r | unlimited | 2 | 1-4 | uint8, uint16 | Can import all kinds of video and image data.
gdal | Many remote sensing file formats | GDAL | r | 1 | 2 | unlimited | uint8, int16, uint16, int32, uint32, float32, float64 | Used for remote sensing image data.
gta | .gta | libgta | rw | unlimited | unlimited | unlimited | all | Obsoleted by tad.
hdf5 | .h5, .he5, .hdf5 | HDF5 | rw | unlimited | unlimited | unlimited | all | Universal, but slow and awful.
jpeg | .jpg, .jpeg | libjpeg | rw | 1 | 2 | 1 or 3 | uint8 | Lossy image format.
matio | .mat | libmatio | rw | unlimited | unlimited | unlimited | all | Old Matlab file format.
pdf | .pdf | libpoppler | r | unlimited (one per page) | 2 | 3 | uint8 | Rasterized PDF documents. Supports input tag DPI to set resolution.
pfs | .pfs | libpfs | rw | unlimited | 2 | 1-1024 | float32 | Simple format for 2D floating point data.
png | .png | libpng | rw | 1 | 2 | 1-4 | uint8, uint16 | Lossless image file format.
tiff | .tiff | libtiff | rw | unlimited | 2 | unlimited | all | Versatile image file format.

TAD File Format Specification

TAD files (.tad) start with the four bytes T, A, D, and 0.

The fifth byte defines the data type: 0 for int8, 1 for uint8, 2 for int16, 3 for uint16, 4 for int32, 5 for uint32, 6 for int64, 7 for uint64, 8 for float32, and 9 for float64. These correspond to the common representation of data types on all relevant platforms (two’s complement for signed integers, IEEE 754 single and double precision for float32 and float64, little-endian).
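
The type codes map naturally onto the format characters of Python's struct module. The following is a minimal sketch of that mapping; the names TAD_TYPES, type_size, and decode_values are hypothetical helpers chosen for illustration, not part of tad.

```python
import struct

# Type code (the fifth byte) -> (type name, struct format character)
TAD_TYPES = {
    0: ("int8", "b"), 1: ("uint8", "B"),
    2: ("int16", "h"), 3: ("uint16", "H"),
    4: ("int32", "i"), 5: ("uint32", "I"),
    6: ("int64", "q"), 7: ("uint64", "Q"),
    8: ("float32", "f"), 9: ("float64", "d"),
}


def type_size(code: int) -> int:
    """Size in bytes of one value of the given type code."""
    _, fmt = TAD_TYPES[code]
    # "<" selects little-endian byte order with standard sizes.
    return struct.calcsize("<" + fmt)


def decode_values(code: int, raw: bytes) -> tuple:
    """Decode as many complete little-endian values as raw contains."""
    _, fmt = TAD_TYPES[code]
    count = len(raw) // type_size(code)
    return struct.unpack(f"<{count}{fmt}", raw[:count * type_size(code)])
```

For instance, decode_values(3, b"\x01\x00\xff\xff") interprets four bytes as two uint16 values.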

In the following, numbers are always stored as little-endian 64 bit unsigned integers.

Counting bytes from 1, bytes 6-13 store the number of array element components C, and bytes 14-21 store the number of dimensions D (each number occupies 8 bytes). The following D*8 bytes store the array size in each dimension.
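
A header reader might look like the sketch below. It rests on two assumptions that the text does not spell out: the fourth magic byte is the numeric value 0 rather than the character "0", and each count occupies exactly 8 bytes (64 bits), so that C begins at zero-based offset 5, D at offset 13, and the dimension sizes at offset 21. The function name parse_header is hypothetical.

```python
import struct

MAGIC = b"TAD\x00"  # assumption: the fourth byte is the value 0, not "0"


def parse_header(buf: bytes):
    """Return (type_code, components, dimensions, offset_after_header)."""
    if buf[:4] != MAGIC:
        raise ValueError("not a TAD header")
    type_code = buf[4]
    # Two 8-byte little-endian unsigned integers follow the type byte.
    components, dim_count = struct.unpack_from("<QQ", buf, 5)
    # Then one 8-byte size per dimension.
    dimensions = list(struct.unpack_from(f"<{dim_count}Q", buf, 21))
    return type_code, components, dimensions, 21 + 8 * dim_count


# Header of a 640x480 array with three uint8 components.
header = MAGIC + bytes([1]) + struct.pack("<QQ", 3, 2) + struct.pack("<QQ", 640, 480)
print(parse_header(header))  # (1, 3, [640, 480], 37)
```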

The tag lists follow: first the global tag list, then D tag lists for the dimensions, and then C tag lists for the components. Each tag list starts with 8 bytes storing the number of bytes it occupies in the file (excluding these first 8 bytes). The NAME and VALUE pairs in the tag list follow, each pair as two zero-terminated UTF-8 strings.
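
Reading a single tag list can be sketched as follows. The function name parse_tag_list and its error handling are illustrative choices, not taken from tad; the sketch only relies on the layout described above (an 8-byte little-endian length followed by zero-terminated NAME and VALUE strings).

```python
import struct


def parse_tag_list(buf: bytes, offset: int = 0):
    """Return (tags, offset_after_list) for the tag list starting at offset."""
    (size,) = struct.unpack_from("<Q", buf, offset)
    start = offset + 8
    payload = buf[start:start + size]
    if len(payload) != size:
        raise ValueError("truncated tag list")
    if payload and not payload.endswith(b"\x00"):
        raise ValueError("tag list is not zero-terminated")
    # Every string ends with a zero byte, so the final split item is empty.
    parts = payload.split(b"\x00")[:-1]
    if len(parts) % 2:
        raise ValueError("tag name without value")
    tags = {}
    for i in range(0, len(parts), 2):
        tags[parts[i].decode("utf-8")] = parts[i + 1].decode("utf-8")
    return tags, start + size


payload = b"DESCRIPTION\x00Sunset at a beach\x00"
encoded = struct.pack("<Q", len(payload)) + payload
tags, next_offset = parse_tag_list(encoded)
```

Calling the function 1 + D + C times in a row, each time passing the returned offset, would walk through the global, dimension, and component tag lists in file order.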

After that, the array data follows using the data type described above. The size of the data can be calculated as the number of array elements (the product over all dimensions) times the number of element components times the size of the data type.
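
As a worked instance of this size calculation, here is a short sketch; data_size_bytes is a hypothetical helper name.

```python
import math


def data_size_bytes(dimensions, components, type_size):
    """Bytes occupied by the array data of one TAD."""
    element_count = math.prod(dimensions)  # product over all dimensions
    return element_count * components * type_size


# 640x480 image with 3 components of type uint8 (1 byte each)
print(data_size_bytes([640, 480], 3, 1))  # 921600
```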

Directly after the array data, another TAD may follow, again starting with the four bytes T, A, D, and 0. A file can store any number of TADs.
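
To make the overall layout concrete, the sketch below serializes a TAD and concatenates two of them into one file image. It is an illustration under assumptions, not tad's own writer: the fourth magic byte is assumed to be the value 0, each count is assumed to occupy 8 bytes, and the helper names encode_tag_list and encode_tad are hypothetical. For brevity it writes empty dimension and component tag lists.

```python
import struct

MAGIC = b"TAD\x00"  # assumption: the fourth byte is the value 0, not "0"


def encode_tag_list(tags: dict) -> bytes:
    """8-byte length followed by zero-terminated NAME and VALUE strings."""
    payload = b"".join(
        name.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
        for name, value in tags.items()
    )
    return struct.pack("<Q", len(payload)) + payload


def encode_tad(type_code, components, dimensions, data, global_tags=None):
    """Serialize one TAD; dimension and component tag lists are left empty."""
    out = MAGIC + bytes([type_code])
    out += struct.pack("<QQ", components, len(dimensions))
    out += struct.pack(f"<{len(dimensions)}Q", *dimensions)
    out += encode_tag_list(global_tags or {})
    # One empty tag list per dimension, then one per component.
    out += encode_tag_list({}) * (len(dimensions) + components)
    return out + data


# Two 2x2 single-component uint8 arrays stored back to back.
first = encode_tad(1, 1, [2, 2], bytes([0, 1, 2, 3]), {"DESCRIPTION": "first"})
second = encode_tad(1, 1, [2, 2], bytes([4, 5, 6, 7]))
file_bytes = first + second
```

A reader finds the second TAD simply by continuing at the byte after the first array's data, where the magic bytes appear again.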

This format is simple and direct and does not involve data conversion of any kind, so it is very efficient. If and when a future platform becomes relevant that does not use today's conventions (two’s complement, IEEE 754, little-endian), then data conversion will be necessary on that platform.

Common Tags

Tags are simply NAME=VALUE pairs. Both NAME and VALUE must be valid UTF-8 strings without control characters, and NAME must not contain =. Other than that, there are no restrictions, but there are some conventions and a few common tags that should be used to enable interoperability.
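
These rules can be checked mechanically. The sketch below is one possible reading, with two assumptions: "control characters" is taken to mean Unicode general category Cc, and a NAME=VALUE string is split at the first =, which follows from NAME not being allowed to contain one. The function names are hypothetical.

```python
import unicodedata


def is_valid_tag(name: str, value: str) -> bool:
    """Check the stated rules: no control characters, and no '=' in NAME."""
    def has_control(text: str) -> bool:
        # Assumption: "control characters" means Unicode category Cc.
        return any(unicodedata.category(ch) == "Cc" for ch in text)

    if "=" in name:
        return False
    return not (has_control(name) or has_control(value))


def split_tag(text: str):
    """Split 'NAME=VALUE' at the first '=', so VALUE may itself contain '='."""
    name, sep, value = text.partition("=")
    if not sep:
        raise ValueError("missing '='")
    return name, value
```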

Tag names can use / to denote tag directories, thereby creating tag name spaces. For example, an application FooBar might use tag names such as FOOBAR/AUDIO/BASS and FOOBAR/COLOR/BACKGROUND.
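
An application could select the tags of one directory with a simple prefix match, as in this sketch. The helper names and the tag values "7" and "black" are made up for illustration.

```python
def tag_path(name: str) -> list[str]:
    """Split a tag name into its directory components."""
    return name.split("/")


def tags_in_directory(tags: dict, directory: str) -> dict:
    """Return the tags whose names lie inside the given tag directory."""
    prefix = directory.rstrip("/") + "/"
    return {n: v for n, v in tags.items() if n.startswith(prefix)}


tags = {
    "FOOBAR/AUDIO/BASS": "7",            # hypothetical value
    "FOOBAR/COLOR/BACKGROUND": "black",  # hypothetical value
    "DESCRIPTION": "Sunset at a beach",
}
foobar_audio = tags_in_directory(tags, "FOOBAR/AUDIO")
```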

The following global tags are common:

The following dimension tags are common:

The following component tags are common: