Fix invalid encoding on level 9 with single value input #1115
With single value input and a full block write (>=64K), the indexing function would overflow a uint16 to 0. This made it impossible to generate a valid Huffman table for the literal size prediction. In turn, the entire block would be output as literals, since the cost of each value would be 0 bits. That meant EOB could not be encoded for the bit writer, since there were no matches. This was previously being satisfied with "filling".

Fixes:

1. Never encode more than `maxFlateBlockTokens` (32K) for the literal estimate table.
2. Always include EOB explicitly, in case literals should somehow slip through.
3. Add a regression test that writes a big single-value input. The other tests were using copy, which does smaller writes.

Fixes #1114
Fuzzed for 13 hours.
Hi @klauspost, thank you for your work!

We use the library to compress and decompress OCI layout artifacts and noticed a size change (1 Byte :D) when upgrading from

Could you confirm whether this change is intentional, and if so, recommend a configuration or approach to preserve byte-for-byte stability across versions?
@frewilhelm Your base assumption is wrong. Never rely on compressed output to remain the same. Encoding will continue to change. That is true here, as with the standard library. In some cases (though not currently in deflate) the encoding may also differ by platform. You can pin your dependency and defer the pain or just not design yourself into this corner.