[ Beneath the Waves ]

TMSB XML Schema Part 6: Transformation Profiles

article by Ben Lincoln

 

The transformation profile list file is a global configuration file. It applies either to the entire system on which TMSB is being executed or, if a user has created a copy of the file in their personal directory, to any instance launched by that user. It is where transformation profiles (repeatable operations, such as normalization, to be performed on images of various types) are defined. This file must be named Global-TransformationProfiles.xml.

When you begin to examine the schema, you may think that I have "XMLized" the configuration to the point of absurdity. However, after you reach the more complex profile types, I think you will agree that this sort of thing belongs in a configuration of this type, and therefore the simpler operations do too by extension of the concept.

Note: the FileType value in the metadata section of the file must be Transformation Profile List.

Schema

<TransformationProfiles>

<AbsoluteValueProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

</AbsoluteValueProfile>

<AdditionProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<Amount>

The amount to add to the values to which the profile is applied

</Amount>

</AdditionProfile>

<MultiplicationProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<Amount>

The amount to multiply the values in the input data by

</Amount>

</MultiplicationProfile>

<ExponentProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<Amount>

The power to raise the input values to

</Amount>

</ExponentProfile>

<ClippingProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<LowerBound>

The value below which input data should be truncated

</LowerBound>

<UpperBound>

The value above which input data should be truncated

</UpperBound>

</ClippingProfile>

<DataRepositionAndScaleProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<OutputLowerBound>

The value to which the minimum value present in the input data should be remapped

</OutputLowerBound>

<OutputUpperBound>

The value to which the maximum value present in the input data should be remapped

</OutputUpperBound>

</DataRepositionAndScaleProfile>

<NormalizationProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<Stance>

See below

</Stance>

<QuantizationStepCount>

A positive integer (see below)

</QuantizationStepCount>

<Threshold>

A decimal/floating-point value (see below)

</Threshold>

<ThresholdRepetitions>

An integer of zero or greater (see below)

</ThresholdRepetitions>

<SearchWidth>

A positive integer (see below)

</SearchWidth>

<Backlash>

An integer of zero or greater (see below)

</Backlash>

</NormalizationProfile>

<NormalizationAboutArbitraryValueProfile>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<InputCenterPoint>

A decimal/floating-point value (see below)

</InputCenterPoint>

<NormalizeUpperAndLowerIndependently>

A boolean (true/false) value (see below)

</NormalizeUpperAndLowerIndependently>

<OutputLowerBound>

The value to which the minimum value present in the output data should be remapped

</OutputLowerBound>

<OutputUpperBound>

The value to which the maximum value present in the output data should be remapped

</OutputUpperBound>

<Stance>

See below

</Stance>

<QuantizationStepCount>

A positive integer (see below)

</QuantizationStepCount>

<Threshold>

A decimal/floating-point value (see below)

</Threshold>

<ThresholdRepetitions>

An integer of zero or greater (see below)

</ThresholdRepetitions>

<SearchWidth>

A positive integer (see below)

</SearchWidth>

<Backlash>

An integer of zero or greater (see below)

</Backlash>

</NormalizationAboutArbitraryValueProfile>

<TransformationProfileCollection>

<Name>

A string to reference the transformation profile by

</Name>

<Description>

A description of the transformation profile

</Description>

<SubProfileName>

The name of another transformation profile

</SubProfileName>

</TransformationProfileCollection>

</TransformationProfiles>

Notes

Each major block type in this file has two common tags: Name and Description. The Name value is used to reference the transformation profile in other locations in this and other configuration files. Description is currently for informational purposes only, for the benefit of users who actually read the XML configuration files. The GUI configuration editor I have planned for a future release will display the description as well.

AbsoluteValueProfile defines a transformation in which the absolute value is returned. This profile type accepts no options, so you should never have to define one unless you delete the entry that's included in the default file for some reason.

AdditionProfile, MultiplicationProfile, and ExponentProfile are all simple mathematical operations which add a fixed amount to the input data, multiply it by a fixed amount, or raise it to a fixed power, respectively. The fixed value is defined for all three by the Amount tag. Note: to subtract, add a negative value; to divide, multiply by the reciprocal of the value you wish to divide by.

ClippingProfile is used in cases where a set of data should be truncated at a lower or upper bound (values below the lower bound or above the upper bound are replaced by that boundary value). This is used frequently in the configuration included with TMSB as a precursor to normalization (discussed below).

Within a ClippingProfile block:

LowerBound is an optional value which specifies the minimum value to which the input data should be clipped. If this value is not specified, no clipping of the lower bound will occur.

UpperBound is an optional value which specifies the maximum value to which the input data should be clipped. If this value is not specified, no clipping of the upper bound will occur.
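As a rough illustration of the optional bounds (a sketch in Python, not TMSB's actual code; the function and variable names are hypothetical), a missing bound can be modelled as None, in which case that side of the data is left alone:

```python
from typing import Optional


def clip(value: float,
         lower_bound: Optional[float] = None,
         upper_bound: Optional[float] = None) -> float:
    """Clip one value. A bound of None means that side is not clipped."""
    if lower_bound is not None and value < lower_bound:
        return lower_bound
    if upper_bound is not None and value > upper_bound:
        return upper_bound
    return value


data = [-0.4, 0.25, 1.7]

# Only LowerBound specified: negative values are raised to 0.0.
positive_only = [clip(v, lower_bound=0.0) for v in data]

# Both bounds specified: everything ends up inside 0.0 - 1.0.
zero_to_one = [clip(v, lower_bound=0.0, upper_bound=1.0) for v in data]
```

The first call mirrors a ClippingProfile with only a LowerBound tag, and the second mirrors one with both tags.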

DataRepositionAndScaleProfile is an experimental, simplified normalization method that I added early in the development of TMSB. It has not been tested extensively, and the use of the full normalization profile types (below) is highly recommended instead.

Within a DataRepositionAndScaleProfile block:

OutputLowerBound is the value to which the lowest value present in the input data should be remapped.

OutputUpperBound is the value to which the highest value present in the input data should be remapped.

For example, if the input data lies in the range 0.0 - 10.0, OutputLowerBound is specified as -1.0, and OutputUpperBound is specified as 1.0, then after processing by the transformation profile, input values of 2.0, 5.0, and 8.0 should be converted to -0.6, 0.0, and 0.6 respectively. I stress should. Again, please use the formal normalization profile types, as they have been tested fairly rigorously.
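Assuming the remapping is a plain linear one (the text does not spell out the exact formula, so this is an assumption), the arithmetic for an input range of 0.0 - 10.0 and output bounds of -1.0 and 1.0 can be sketched in Python as follows. The function name is hypothetical.

```python
def reposition_and_scale(values, output_lower_bound, output_upper_bound):
    """Linearly remap values so that min(values) lands on the lower
    bound and max(values) lands on the upper bound."""
    in_min, in_max = min(values), max(values)
    if in_max == in_min:
        # Degenerate input: every value is identical.
        return [output_lower_bound for _ in values]
    scale = (output_upper_bound - output_lower_bound) / (in_max - in_min)
    return [output_lower_bound + (v - in_min) * scale for v in values]


data = [0.0, 2.0, 5.0, 8.0, 10.0]
remapped = reposition_and_scale(data, -1.0, 1.0)
# 2.0, 5.0 and 8.0 come out at approximately -0.6, 0.0 and 0.6.
```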

NormalizationProfile defines a normalization operation (the mapping of a particular range of input values to a particular range of output values). Unlike most normalization implementations, the one in TMSB supports an "aggressive" mode (to help eliminate outlying values which can spoil the results of traditional normalization).

Within a NormalizationProfile block:

Stance is one of:

Standard
Aggressive

Standard is the type of normalization that most people familiar with the term will immediately grasp: when used in this mode, the normalization profile accepts no additional options, and normalizes input values to the range 0.0 - 1.0 (which are then mapped to whatever range is appropriate for the output file format, such as 0 - 65535 in the case of 16-bit-per-channel TIFFs, or 0 - 255 in the case of 8-bit-per-channel JPEGs). Aggressive allows additional options to be specified which cause outliers to be truncated in various ways.
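The Standard stance can be pictured as a two-step process. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the rounding used when converting to integer codes is a guess rather than documented TMSB behaviour.

```python
def normalize_standard(values):
    """Map min(values) to 0.0 and max(values) to 1.0, linearly."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_integer_range(normalized, max_code):
    """Scale 0.0 - 1.0 values to 0 - max_code (assumed: simple rounding)."""
    return [round(n * max_code) for n in normalized]


data = [12.0, 15.0, 18.0]
normalized = normalize_standard(data)             # 0.0, 0.5, 1.0
as_16_bit = to_integer_range(normalized, 65535)   # 16-bit-per-channel TIFF
as_8_bit = to_integer_range(normalized, 255)      # 8-bit-per-channel JPEG
```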

As you may notice after perusing the examples, I have made extensive use of the "standard"/"aggressive" duality in terms of naming specific profiles[1].

The following options become available when Stance is set to Aggressive, and are based upon the concept of a histogram. If you are not familiar with histograms, this section may be confusing.

QuantizationStepCount is the number of discrete steps that the input data is mapped to in order to perform the detection of the desired minimum and maximum truncation points. For example, if this setting is 10 (a value far too low to actually use with real images), then input values of 0.114, 0.145, and 0.176 will all be counted towards the 0.1 "bar" (or "bucket") on the histogram. If this value is not explicitly specified, the default of 100 is used.
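To make the bucketing concrete, here is a small Python sketch. It assumes the values have already been scaled into the range 0.0 - 1.0 and that each value is assigned to a bucket by rounding down; both are assumptions, and the function names are hypothetical.

```python
import math


def bucket_index(value, step_count):
    """Return the histogram bucket for a value in 0.0 - 1.0."""
    index = math.floor(value * step_count)
    # Keep 1.0 (and any stray out-of-range value) inside the histogram.
    return min(max(index, 0), step_count - 1)


def build_histogram(values, step_count=100):
    counts = [0] * step_count
    for v in values:
        counts[bucket_index(v, step_count)] += 1
    return counts


# With only 10 steps, all three values land in bucket 1 (the 0.1 "bar").
histogram = build_histogram([0.114, 0.145, 0.176], step_count=10)
```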

Threshold represents (indirectly) the number of occurrences of a particular histogram value that must exist to trigger a "match". This should be a floating-point value greater than 0.0, but less than 1.0, and represents the "height" of a histogram "bar" that is great enough to be considered "enough", relative to the maximum height of any one bar in the histogram. That is, if the Threshold value is 0.02, and the first histogram value with a count of 2% or greater (relative to the highest bar in the histogram) iterating right from the left side is at 0.3, and the first histogram value with a count of 2% or greater iterating left from the right side is at 0.9, then (assuming no other factors), the aggressive normalization algorithm will cause the data to be rescaled so that input values of 0.3 are mapped to 0.0, input values of 0.9 are mapped to 1.0, and values outside this range are truncated. If this value is not explicitly specified, the default of 0.05 is used.
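The search described above can be sketched in Python as follows. This is an illustration only: it ignores ThresholdRepetitions, SearchWidth, and Backlash, uses a 10-bucket histogram purely for readability, and treats the lower edge of each matching bucket as the bound. All names are hypothetical.

```python
def find_threshold_bounds(histogram, threshold=0.05):
    """Return (first, last) bucket indices whose count is at least
    threshold * (height of the tallest bar)."""
    cutoff = threshold * max(histogram)
    matching = [i for i, count in enumerate(histogram) if count >= cutoff]
    return matching[0], matching[-1]


def rescale(value, lower, upper):
    """Map lower..upper onto 0.0..1.0, truncating values outside it."""
    scaled = (value - lower) / (upper - lower)
    return min(max(scaled, 0.0), 1.0)


# Tallest bar is 100, so a Threshold of 0.02 means "a count of 2 or more".
histogram = [1, 0, 1, 40, 80, 100, 90, 60, 30, 5]
first, last = find_threshold_bounds(histogram, threshold=0.02)
lower, upper = first / len(histogram), last / len(histogram)  # 0.3 and 0.9
```

With these bounds, rescale(0.3, lower, upper) gives 0.0, rescale(0.9, lower, upper) gives 1.0, and anything outside 0.3 - 0.9 is truncated to one end or the other.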

ThresholdRepetitions is the number of times that the Threshold value must be exceeded in order for a match to occur. This can be used to eliminate the influence of outlying histogram "spikes" as opposed to the main "body" of the histogram. If this value is not explicitly specified, the default of 3 is used.

SearchWidth is the number of adjacent histogram values that are searched for an additional "match", when ThresholdRepetitions is set to a value greater than 0. It is used in conjunction with the ThresholdRepetitions value. If this value is not explicitly specified, the default of 5 is used.
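One possible reading of how ThresholdRepetitions and SearchWidth interact is sketched below in Python. This interpretation is an assumption, not a description of TMSB's actual algorithm: a bar is accepted as the bound only if at least ThresholdRepetitions of the next SearchWidth bars (moving inward) also meet the threshold. The function names are hypothetical.

```python
def find_lower_index(histogram, threshold=0.05, repetitions=3, search_width=5):
    """Scan from the left for the first bar that meets the threshold and
    is backed up by enough further matches within the search window."""
    cutoff = threshold * max(histogram)
    for i, count in enumerate(histogram):
        if count < cutoff:
            continue
        window = histogram[i + 1:i + 1 + search_width]
        extra_matches = sum(1 for c in window if c >= cutoff)
        if extra_matches >= repetitions:
            return i
    return None


def find_upper_index(histogram, **options):
    """Same search, scanning from the right."""
    mirrored = find_lower_index(histogram[::-1], **options)
    return None if mirrored is None else len(histogram) - 1 - mirrored


# An isolated spike at each end, with the main body in the middle.
histogram = [0, 9, 0, 0, 0, 0, 0, 0, 20, 60, 100, 70, 30, 0, 0, 0, 0, 0, 8, 0]

body = (find_lower_index(histogram), find_upper_index(histogram))
spikes = (find_lower_index(histogram, repetitions=0),
          find_upper_index(histogram, repetitions=0))
```

Under this reading, the default settings skip the two isolated spikes and settle on the main body of the histogram, whereas a ThresholdRepetitions of 0 stops at the spikes.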

Backlash is the number of histogram values that the algorithm "backs off" after it has found the lower or upper bound (based on the other criteria specified). For example, if the QuantizationStepCount is 100, Backlash is 5, and the raw lower and upper bounds were determined to be 0.2 and 0.8, then the actual lower and upper bounds which will be used would be 0.15 and 0.85. This can be used to prevent data near the cutoff threshold from being truncated even though it doesn't actually meet the criteria for the aggressive normalization. For example, if the histogram curve is extremely steep (but has a number of unwanted outlying values), a Backlash value greater than 0 can preserve the entire area under the curve instead of clipping the left and right sides. If this value is not explicitly specified, the default of 1 is used.
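The Backlash arithmetic is simple enough to sketch in a few lines of Python. The function name is hypothetical, and clamping the result to 0.0 - 1.0 is an assumption rather than documented behaviour.

```python
def apply_backlash(lower, upper, step_count=100, backlash=1):
    """Widen the detected bounds by `backlash` histogram steps."""
    step = 1.0 / step_count
    widened_lower = max(0.0, lower - backlash * step)
    widened_upper = min(1.0, upper + backlash * step)
    return widened_lower, widened_upper


# QuantizationStepCount 100, Backlash 5, raw bounds 0.2 and 0.8:
lower, upper = apply_backlash(0.2, 0.8, step_count=100, backlash=5)
# lower is approximately 0.15 and upper approximately 0.85.
```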

NormalizationAboutArbitraryValueProfile is an extension of the standard NormalizationProfile type. It is used for cases where e.g. the input data contains both positive and negative values, and the distinction between those values should be preserved in the output data (whereas a standard normalization would rescale the set as a whole). This type contains all of the options of NormalizationProfile, with the following additional tags:

InputCenterPoint represents the point about which the data should be normalized. In the case of the example mentioned above, this would be 0.0. If this value is not explicitly specified, the default of 0.0 is used.

NormalizeUpperAndLowerIndependently controls whether the values above and below the center point should be scaled by the same amount or not. For example, if InputCenterPoint is 0.0, the input data lies in the range -0.1 to 0.2, and NormalizeUpperAndLowerIndependently is false, then the output data will lie in the range -0.5 to 1.0. If, instead, NormalizeUpperAndLowerIndependently is true, then in that same scenario, the output data will lie in the range -1.0 to 1.0, with negative values being scaled by twice the amount of positive values.
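The difference between the two settings can be sketched in Python. This is an illustration of the arithmetic in the example above, using a Standard stance and hypothetical names; it is not TMSB's implementation.

```python
def normalize_about(values, center=0.0, independently=False):
    """Scale values so the center maps to 0.0 and the furthest value
    maps to -1.0 or 1.0."""
    below = max(0.0, max(center - v for v in values))  # extent below center
    above = max(0.0, max(v - center for v in values))  # extent above center
    if not independently:
        # One shared scale factor, set by whichever side reaches further.
        below = above = max(below, above)
    result = []
    for v in values:
        extent = above if v >= center else below
        result.append((v - center) / extent if extent else 0.0)
    return result


data = [-0.1, 0.0, 0.2]
shared = normalize_about(data, independently=False)   # -0.5, 0.0, 1.0
separate = normalize_about(data, independently=True)  # -1.0, 0.0, 1.0
```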

OutputLowerBound is the value to which the minimum value present in the output data should be mapped. If this value is not explicitly specified, the default of 0.0 is used.

OutputUpperBound is the value to which the maximum value present in the output data should be mapped. If this value is not explicitly specified, the default of 1.0 is used.

For example, if InputCenterPoint is 0.0, OutputLowerBound is 0.0, and OutputUpperBound is 1.0, then the final output will lie within the range 0.0 to 1.0, with values below 0.5 representing numbers which were negative in the input data.
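Assuming the data is first normalized about the center point into the range -1.0 to 1.0 and then mapped linearly onto the output bounds (an assumption consistent with the example above, but not confirmed in detail), the final step could look like this Python sketch. The function name is hypothetical.

```python
def to_output_range(normalized, output_lower_bound=0.0, output_upper_bound=1.0):
    """Map a value in -1.0..1.0 linearly onto lower..upper."""
    fraction = (normalized + 1.0) / 2.0
    return output_lower_bound + fraction * (output_upper_bound - output_lower_bound)


center = to_output_range(0.0)      # the center point lands on 0.5
negative = to_output_range(-0.5)   # formerly negative values land below 0.5
positive = to_output_range(1.0)    # the largest positive value lands on 1.0
```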

Note: when Stance is set to Aggressive and NormalizeUpperAndLowerIndependently is set to true, the same aggressive search options are applied to both the upper and lower halves of the data.

TransformationProfileCollection represents a set of other transformation profiles to execute sequentially against the input data. Blocks of this type should appear later in the file than the blocks which define the profiles they reference.

This type only contains one option:

SubProfileName is the name of the transformation profile which should be used. This may be the name of another TransformationProfileCollection block (as long as that referenced block is defined earlier in the file than the reference occurs).
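The ordering rule can be checked mechanically. The Python sketch below uses the standard library's xml.etree.ElementTree on a trimmed, made-up sample. The sample omits the metadata section, whose layout is not shown on this page, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<TransformationProfiles>
  <AdditionProfile>
    <Name>Add: 0.5</Name>
    <Description>Adds 0.5 to the input data.</Description>
    <Amount>0.5</Amount>
  </AdditionProfile>
  <TransformationProfileCollection>
    <Name>Example Chain</Name>
    <Description>References one profile too early.</Description>
    <SubProfileName>Add: 0.5</SubProfileName>
    <SubProfileName>Multiply: 2.0</SubProfileName>
  </TransformationProfileCollection>
  <MultiplicationProfile>
    <Name>Multiply: 2.0</Name>
    <Description>Doubles the values of the input data.</Description>
    <Amount>2.0</Amount>
  </MultiplicationProfile>
</TransformationProfiles>
"""


def find_unresolved_references(xml_text):
    """Return (collection name, sub-profile name) pairs for references
    to profiles that have not been defined earlier in the file."""
    root = ET.fromstring(xml_text)
    defined = set()
    problems = []
    for block in root:
        name = (block.findtext("Name") or "").strip()
        for ref in block.findall("SubProfileName"):
            target = (ref.text or "").strip()
            if target not in defined:
                problems.append((name, target))
        defined.add(name)
    return problems


problems = find_unresolved_references(SAMPLE)
```

Here the only problem reported is the reference to "Multiply: 2.0", because that profile is defined after the collection that uses it.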

The sub-profiles are applied in the same order they occur in the file. For example, using an AdditionProfile with an Amount of 0.5 followed by a MultiplicationProfile with an Amount of 2.0 is equivalent to the mathematical formula output = (input + 0.5) * 2.0, whereas if the same MultiplicationProfile is specified before the AdditionProfile, then the result would be equivalent to output = (input * 2.0) + 0.5.
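The effect of ordering can be demonstrated with a short Python sketch in which each sub-profile is modelled as a function and the collection applies them one after another. The helper names are hypothetical.

```python
from functools import reduce


def add(amount):
    return lambda value: value + amount


def multiply(amount):
    return lambda value: value * amount


def run_collection(sub_profiles, value):
    """Apply each sub-profile in turn, in the order listed."""
    return reduce(lambda acc, profile: profile(acc), sub_profiles, value)


add_then_multiply = run_collection([add(0.5), multiply(2.0)], 1.0)  # (1.0 + 0.5) * 2.0
multiply_then_add = run_collection([multiply(2.0), add(0.5)], 1.0)  # (1.0 * 2.0) + 0.5
```

For an input of 1.0, the first ordering yields 3.0 and the second yields 2.5.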

Example

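This configuration defines at least one profile of every type described above except DataRepositionAndScaleProfile, including two collections that chain other profiles together: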

<TransformationProfiles>

<AbsoluteValueProfile>

<Name>

Absolute Value

</Name>

<Description>

Converts the input data to its absolute value (that is, if any of the data is negative, the negative data will be effectively multiplied by -1 to convert it to a positive value).

</Description>

</AbsoluteValueProfile>

<AdditionProfile>

<Name>

Add: 0.5

</Name>

<Description>

Adds 0.5 to the input data.

</Description>

<Amount>

0.5

</Amount>

</AdditionProfile>

<AdditionProfile>

<Name>

Subtract: 0.5

</Name>

<Description>

Subtracts 0.5 from the input data.

</Description>

<Amount>

-0.5

</Amount>

</AdditionProfile>

<MultiplicationProfile>

<Name>

Multiply: 2.0

</Name>

<Description>

Doubles the values of the input data.

</Description>

<Amount>

2.0

</Amount>

</MultiplicationProfile>

<MultiplicationProfile>

<Name>

Divide: 2.0

</Name>

<Description>

Halves the values of the input data.

</Description>

<Amount>

0.5

</Amount>

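</MultiplicationProfile>

<MultiplicationProfile>

<Name>

Multiply: -1.0

</Name>

<Description>

Negates the values of the input data.

</Description>

<Amount>

-1.0

</Amount>

</MultiplicationProfile>

<AdditionProfile>

<Name>

Add: 1.0

</Name>

<Description>

Adds 1.0 to the input data.

</Description>

<Amount>

1.0

</Amount>

</AdditionProfile>

<AdditionProfile>

<Name>

Subtract: 1.0

</Name>

<Description>

Subtracts 1.0 from the input data.

</Description>

<Amount>

-1.0

</Amount>

</AdditionProfile>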

<ExponentProfile>

<Name>

Exponent: 2

</Name>

<Description>

Raises the input data to the second power ("squares it").

</Description>

<Amount>

2.0

</Amount>

</ExponentProfile>

<ClippingProfile>

<Name>

Clip: Positive Values Only

</Name>

<Description>

Clips input values below 0.0 to 0.0. Does not otherwise alter the data.

</Description>

<LowerBound>

0.0

</LowerBound>

</ClippingProfile>

<ClippingProfile>

<Name>

Clip: Negative Values Only

</Name>

<Description>

Clips input values above 0.0 to 0.0. Does not otherwise alter the data.

</Description>

<UpperBound>

0.0

</UpperBound>

</ClippingProfile>

<ClippingProfile>

<Name>

Clip: 0.0 - 1.0

</Name>

<Description>

Clips input values below 0.0 to 0.0, and values above 1.0 to 1.0.

</Description>

<LowerBound>

0.0

</LowerBound>

<UpperBound>

1.0

</UpperBound>

</ClippingProfile>

<NormalizationProfile>

<Name>

Normalize: Standard

</Name>

<Description>

Applies a conventional normalization to the specified data. The result will range from 0.0 to 1.0.

</Description>

<Stance>

Standard

</Stance>

</NormalizationProfile>

<NormalizationProfile>

<Name>

Normalize: Militant

</Name>

<Description>

Applies a highly-aggressive and detail-oriented normalization to the specified data.

</Description>

<Stance>

Aggressive

</Stance>

<QuantizationStepCount>

500

</QuantizationStepCount>

<Threshold>

0.02

</Threshold>

<ThresholdRepetitions>

3

</ThresholdRepetitions>

<SearchWidth>

25

</SearchWidth>

<Backlash>

0

</Backlash>

</NormalizationProfile>

<NormalizationProfile>

<Name>

Normalize: No Mercy

</Name>

<Description>

Applies an extremely fine-grained, sensitive, aggressive normalization to the specified data.
This profile is intended to be used as a first pass against datasets in which the data of interest occupies a very small band (for example, 0.0001 - 0.0002). It should be paired with a second, more general-purpose profile (such as "Normalize: Militant") for maximum effectiveness and minimal regret regarding the consequences of its actions (see "Normalize: No Remorse").

</Description>

<Stance>

Aggressive

</Stance>

<QuantizationStepCount>

100000

</QuantizationStepCount>

<Threshold>

0.01

</Threshold>

<ThresholdRepetitions>

3

</ThresholdRepetitions>

<SearchWidth>

5000

</SearchWidth>

<Backlash>

1

</Backlash>

</NormalizationProfile>

<NormalizationAboutArbitraryValueProfile>

<Name>

Normalize About Zero (Confident - Independent)

</Name>

<Description>

Applies the "Confident" normalization to the specified data.
Positive values will be normalized to the range 0.0 - 1.0, and negative values will be normalized to the range -1.0 - 0.0.
Positive and negative values will be normalized independently from each other.

</Description>

<InputCenterPoint>

0.0

</InputCenterPoint>

<NormalizeUpperAndLowerIndependently>

true

</NormalizeUpperAndLowerIndependently>

<OutputLowerBound>

0.0

</OutputLowerBound>

<OutputUpperBound>

1.0

</OutputUpperBound>

<Stance>

Aggressive

</Stance>

<QuantizationStepCount>

100

</QuantizationStepCount>

<Threshold>

0.001

</Threshold>

<ThresholdRepetitions>

3

</ThresholdRepetitions>

<SearchWidth>

5

</SearchWidth>

<Backlash>

1

</Backlash>

</NormalizationAboutArbitraryValueProfile>

<TransformationProfileCollection>

<Name>

Normalize: No Remorse

</Name>

<Description>

Applies the "No Mercy" and "Militant" normalization profiles to the input data, in order to effect a scorched-earth policy in which all conventional outliers are removed.

</Description>

<SubProfileName>

Normalize: No Mercy

</SubProfileName>

<SubProfileName>

Normalize: Militant

</SubProfileName>

</TransformationProfileCollection>

<TransformationProfileCollection>

<Name>

Reverse Exponent: 2

</Name>

<Description>

For values normalized to the range 0.0 - 1.0, inverts those values, raises the result to the second power, then inverts the values again.

</Description>

<SubProfileName>

Multiply: -1.0

</SubProfileName>

<SubProfileName>

Add: 1.0

</SubProfileName>

<SubProfileName>

Exponent: 2

</SubProfileName>

<SubProfileName>

Subtract: 1.0

</SubProfileName>

<SubProfileName>

Multiply: -1.0

</SubProfileName>

</TransformationProfileCollection>

</TransformationProfiles>

 
Footnotes
1. The transformation profile names Normalize: Milquetoast, Normalize: Assertive, Normalize: Sabre-Rattler, Normalize: Resistance is Futile, and Normalize: The Weak Shall Perish are reserved by the developer for use in a future release of TMSB.
 