First thought: WTF?
Seriously? What's supposed to be so problematic about TIFF's defaults? After all, the specification says it all. The rules are:
- If a TIFF does not contain a tag for which a default value is defined, the default value applies.
- If a TIFF does contain a tag, that tag's value applies.
- In all other cases, the value is undefined and hence nonexistent.
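The three rules above can be sketched as a small lookup. This is only an illustration, not code from any of the tools discussed here: `DEFAULTS`, `Resolved` and `resolve` are hypothetical names, and the table lists just a handful of the defaults from the TIFF 6.0 specification.

```python
from typing import NamedTuple, Optional

# A few default values from the TIFF 6.0 specification (not exhaustive).
DEFAULTS = {
    259: 1,  # Compression: no compression
    263: 1,  # Threshholding: no dithering or halftoning
    274: 1,  # Orientation
    277: 1,  # SamplesPerPixel
}


class Resolved(NamedTuple):
    value: Optional[int]
    source: str  # "explicit", "default" or "undefined"


def resolve(explicit_tags: dict, tag: int) -> Resolved:
    """Apply the three rules and remember where the value came from."""
    if tag in explicit_tags:   # the tag is encoded in the file
        return Resolved(explicit_tags[tag], "explicit")
    if tag in DEFAULTS:        # the tag is missing, but a default is defined
        return Resolved(DEFAULTS[tag], "default")
    return Resolved(None, "undefined")  # neither encoded nor defaulted


# Hypothetical set of explicitly encoded tags (tag number -> value).
explicit = {256: 20, 257: 10, 259: 1}
for tag in (256, 263, 700):
    print(tag, resolve(explicit, tag))
```

Keeping the `source` next to the value is the point of the sketch: a reader of the output can tell an encoded value from an assumed one.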
Second glance
Unfortunately, the real world is a little more complicated. I was asked why jhove reports a value of "1" for the Thresholding tag 263 when validating the sample TIFFs shipped with checkit_tiff:
$> jhove tiffs_should_pass/minimal_valid_baseline.tiff
Jhove (Rel. 1.6, 2011-01-04)
Date: 2018-04-16 12:41:25 MESZ
RepresentationInformation: tiffs_should_pass/minimal_valid_baseline.tiff
ReportingModule: TIFF-hul, Rel. 1.5 (2007-10-02)
LastModified: 2017-07-14 11:28:57 MESZ
Size: 323
Format: TIFF
Version: 5.0
Status: Well-Formed and valid
SignatureMatches:
TIFF-hul
MIMEtype: image/tiff
Profile: Baseline bilevel (Class B), TIFF/IT-BP (ISO 12639:1998), TIFF/IT-BP/P1 (ISO 12639:1998), TIFF/IT-BP/P2 (ISO 12639:1998), TIFF/IT-MP (ISO 12639:1998)
TIFFMetadata:
ByteOrder: little-endian
IFDs:
Number: 1
IFD:
Offset: 38
Type: TIFF
Entries:
NisoImageMetadata:
ByteOrder: little_endian
CompressionScheme: uncompressed
ImageWidth: 20
ImageHeight: 10
ColorSpace: white is zero
Orientation: normal
SamplingFrequencyUnit: inch
XSamplingFrequency: 376,193
YSamplingFrequency: 376,193
BitsPerSample: 1
BitsPerSampleUnit: integer
SamplesPerPixel: 1
NewSubfileType: 0
SampleFormat: 1
MinSampleValue: 0
MaxSampleValue: 1
Threshholding: 1
TIFFITProperties:
BackgroundColorIndicator: background not defined
ImageColorIndicator: image not defined
TransparencyIndicator: no transparency
PixelIntensityRange: 0, 1
RasterPadding: 1 byte
BitsPerRunLength: 8
BitsPerExtendedRunLength: 16
However, checkit_tiff does not report an error when validating the same sample file, even though the config file contains no whitelist rule for that tag:
$> checkit_tiff example_configs/cit_tiff6_baseline_SLUB.cfg tiffs_should_pass/minimal_valid_baseline.tiff
'./build/checkit_tiff' version: development_v0.4.0
revision: 408
licensed under conditions of libtiff (see http://libtiff.maptools.org/misc.html)
cfg_file=example_configs/cit_tiff6_baseline_SLUB.cfg
tiff file/dir=tiffs_should_pass/minimal_valid_baseline.tiff
file: tiffs_should_pass/minimal_valid_baseline.tiff
(./) general --> TIFF should have just one IFD, (lineno: 12)
(./) general --> All tag offsets should be word aligned, (lineno: 14)
(./) general --> All offsets may only be used once, (lineno: 14)
(./) general --> All tag offsets should be greater than zero, (lineno: 14)
(./) general --> All IFDs should be word aligned, (lineno: 15)
(./) general --> Tags should be sorted in ascending order, (lineno: 15)
(./) tag 256 (ImageWidth) --> Tag should have a value in a range of (lineno: 23)
(./) tag 257 (ImageLength) --> Tag should have a value in a range of (lineno: 25)
(./) tag 258 (BitsPerSample) --> One or more conditions needs to be combined in a logical_or operation (open) (lineno: 30)
(./) tag 259 (Compression) --> Tag should have one exact value. (lineno: 36)
(./) tag 262 (Photometric) --> Tag should have a value in a range of (lineno: 40)
(./) tag 273 (StripOffsets) --> TIFF should contain this tag. (lineno: 45)
(./) tag 277 (SamplesPerPixel) --> Tag should have one exact value. (lineno: 52)
(./) tag 278 (RowsPerStrip) --> Tag should have a value in a range of (lineno: 55)
(./) tag 279 (StripByteCounts) --> TIFF should contain this tag. (lineno: 60)
(./) tag 282 (XResolution) --> Tag should have a value in a range of (lineno: 63)
(./) tag 283 (YResolution) --> Tag should have a value in a range of (lineno: 66)
(./) tag 296 (ResolutionUnit) --> Tag should have one exact value. (lineno: 69)
(./) tag 254 (SubFileType) --> One or more conditions needs to be combined in a logical_or operation (open) (lineno: 77)
(./) tag 274 (Orientation) --> Tag should have one exact value. (lineno: 113)
(./) tag 284 (PlanarConfig) --> Tag should have one exact value. (lineno: 122)
(./)
(./)Yes, the given tif is valid :)
I was shocked at first, since I was sure that checkit_tiff worked as expected and that I had checked everything carefully. To be on the safe side, I cross-checked checkit_tiff's output against the output of the tiffdump tool from libtiff:
$> tiffdump tiffs_should_pass/minimal_valid_baseline.tiff
tiffs_should_pass/minimal_valid_baseline.tiff:
Magic: 0x4949 <little-endian> Version: 0x2a <ClassicTIFF>
Directory 0: offset 38 (0x26) next 0 (0)
SubFileType (254) LONG (4) 1<0>
ImageWidth (256) SHORT (3) 1<20>
ImageLength (257) SHORT (3) 1<10>
BitsPerSample (258) SHORT (3) 1<1>
Compression (259) SHORT (3) 1<1>
Photometric (262) SHORT (3) 1<0>
StripOffsets (273) LONG (4) 1<8>
Orientation (274) SHORT (3) 1<1>
SamplesPerPixel (277) SHORT (3) 1<1>
RowsPerStrip (278) SHORT (3) 1<64>
StripByteCounts (279) LONG (4) 1<30>
XResolution (282) RATIONAL (5) 1<376.193>
YResolution (283) RATIONAL (5) 1<376.193>
PlanarConfig (284) SHORT (3) 1<1>
ResolutionUnit (296) SHORT (3) 1<2>
Good, tiffdump was on my side. So what is the reason for the discrepancy? First, let's have a look at the TIFF 6.0 specification. On page 41, it states:
For black and white TIFF files that represent shades of gray, the technique used to convert from gray to black and white pixels.
Tag = 263 (107.H)
Type = SHORT
N = 1
1 = No dithering or halftoning has been applied to the image data.
2 = An ordered dither or halftone technique has been applied to the image data.
3 = A randomized process such as error diffusion has been applied to the image data.
Default is Threshholding = 1. See also CellWidth, CellLength.
Okay. The sample TIFF used above is black-and-white and does not contain tag 263. Hence, the default of 1 is assumed.
So jhove presents the metadata of a TIFF file the way a TIFF reader would interpret it. checkit_tiff and tiffdump, by contrast, show which TIFF tags are actually explicitly encoded in the file and what values they have.
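To make "explicitly encoded" concrete, here is a minimal sketch that lists the tags physically present in the first IFD of a classic TIFF, using only Python's standard library. It is not how checkit_tiff or tiffdump are implemented. The function name `explicit_tags` and the in-memory test data are assumptions made for illustration, and the test data holds only a header and an IFD, not a displayable image.

```python
import struct


def explicit_tags(data: bytes) -> dict:
    """Return {tag: (field_type, count)} for the first IFD of a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(n_entries):
        # Each IFD entry is 12 bytes: tag, type, count, value or offset.
        entry_offset = ifd_offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry_offset)
        tags[tag] = (field_type, count)
    return tags


# Hypothetical test data: an 8-byte header followed by an IFD with two
# SHORT entries (ImageWidth = 20, ImageLength = 10) and no further IFD.
header = struct.pack("<2sHI", b"II", 42, 8)
entries = [(256, 3, 1, 20), (257, 3, 1, 10)]
ifd = struct.pack("<H", len(entries))
for tag, field_type, count, value in entries:
    ifd += struct.pack("<HHI", tag, field_type, count) + struct.pack("<H2x", value)
ifd += struct.pack("<I", 0)
data = header + ifd

tags = explicit_tags(data)
print(sorted(tags))
print("tag 263 encoded:", 263 in tags)
```

A listing like this never contains tag 263 unless the file really encodes it. A value for that tag can then only come from applying the specification's default on top.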
Wrap-up
Know your tools! Instead of silently interpreting default values, tools should clearly mark such assumptions. Otherwise, the average user cannot see how the results came about. As a lesson for checkit_tiff, I will add this question to its FAQ.
