Crate fast_srgb8


Small crate implementing fast conversion between linear float and 8-bit sRGB.

f32_to_srgb8: Convert f32 to an sRGB u8. Meets all the requirements of the most relevant public spec, which include:
- Maximum error of 0.6 ULP (on the integer side). Note that in practice this is a higher maximum error than the naive implementation gives, so it may be less acceptable for applications like scientific or medical imaging. For normal graphics work, it should be fine.
- Monotonic across the 0.0..=1.0 range. (If f32_to_srgb8(a) > f32_to_srgb8(b), then a > b)
- All possible outputs are achievable (round-trips with srgb8_to_f32).
f32x4_to_srgb8: Produces results identical to calling f32_to_srgb8 4 times in a row. On targets where we have a SIMD implementation (currently SSE2-enabled x86 and x86_64), this will use that. Otherwise, it will just call f32_to_srgb8 four times in a row, and return the results.
srgb8_to_f32: Inverse operation of f32_to_srgb8. Uses the standard technique of a 256-item lookup table.
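For comparison, the "naive implementation" mentioned above can be sketched with the standard piecewise sRGB transfer function and `powf`. This is a reference sketch only, not this crate's algorithm: the names `naive_f32_to_srgb8`, `naive_srgb8_to_f32`, and `build_decode_table` are hypothetical, and the handling of NaN and out-of-range input (NaN and negatives to 0, values above 1.0 to 255) is an assumption made for the sketch. The last function shows the general idea behind a 256-item decode table.

```rust
/// Naive linear -> sRGB encode via the standard piecewise transfer function.
/// Hypothetical reference code, not the crate's implementation.
pub fn naive_f32_to_srgb8(linear: f32) -> u8 {
    // Assumption for this sketch: NaN and negative inputs map to 0,
    // inputs above 1.0 (including +Inf) map to 255.
    let x = if linear.is_nan() { 0.0 } else { linear.clamp(0.0, 1.0) };
    let encoded = if x <= 0.003_130_8 {
        12.92 * x
    } else {
        1.055 * x.powf(1.0 / 2.4) - 0.055
    };
    // Round to nearest; `as u8` truncates toward zero and saturates.
    (encoded * 255.0 + 0.5) as u8
}

/// Naive sRGB -> linear decode for a single 8-bit component.
pub fn naive_srgb8_to_f32(srgb: u8) -> f32 {
    let c = srgb as f32 / 255.0;
    if c <= 0.040_45 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// Precomputes every possible decode result, so that decoding becomes
/// a single array index (`table[srgb as usize]`).
pub fn build_decode_table() -> [f32; 256] {
    let mut table = [0.0f32; 256];
    for (i, slot) in table.iter_mut().enumerate() {
        *slot = naive_srgb8_to_f32(i as u8);
    }
    table
}
```

Because there are only 256 possible inputs on the sRGB side, the table trades 1 KiB of memory for the removal of every `powf` call at decode time. Note that `powf` comes from `std` here, so this sketch does not work under no_std.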

Large performance improvements over the naive implementation (see README.md for benchmarks).
Supports no_std (normally this is tricky, as these operations require powf naively, which is not available to libcore)
No dependencies.
SIMD support for conversion to sRGB (conversion from sRGB is already ~20x faster than naive impl, and would probably be slower in SIMD, so for now it’s not implemented).
Consistent and correct (according to at least one relevant spec) handling of edge cases, such as NaN/Inf/etc.
Exhaustive checking of all inputs for correctness (in tests).
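As an illustration of what exhaustive checking can look like, the sketch below walks the bit patterns of every non-negative f32 from 0.0 up to 1.0 and checks two of the properties listed earlier: monotonicity and reachability of all outputs. It is a hypothetical harness, not the crate's actual test code; `ScanReport`, `scan_encoder`, and the `stride` parameter are names invented for the example. It relies on the fact that, for non-negative floats, increasing bit patterns correspond to increasing values.

```rust
/// Result of scanning an encoder over 0.0..=1.0 (hypothetical helper type).
#[derive(Debug)]
pub struct ScanReport {
    /// First input at which the output decreased, if any.
    pub first_violation: Option<f32>,
    /// Number of distinct u8 outputs observed during the scan.
    pub distinct_outputs: usize,
}

/// Visits f32 bit patterns from 0.0 to 1.0 in steps of `stride`.
/// A stride of 1 visits every float in the range (roughly 1.07 billion inputs).
pub fn scan_encoder<F>(encode: F, stride: u32) -> ScanReport
where
    F: Fn(f32) -> u8,
{
    assert!(stride > 0, "stride must be non-zero");
    let one_bits = 1.0f32.to_bits();
    let mut seen = [false; 256];
    let mut first_violation = None;
    let mut prev = encode(0.0);
    let mut bits = 0u32;
    while bits <= one_bits {
        let x = f32::from_bits(bits);
        let y = encode(x);
        seen[y as usize] = true;
        if y < prev && first_violation.is_none() {
            first_violation = Some(x);
        }
        prev = y;
        bits = match bits.checked_add(stride) {
            Some(next) => next,
            None => break,
        };
    }
    // Always include the upper endpoint, even if the stride skipped it.
    let top = encode(1.0);
    seen[top as usize] = true;
    if top < prev && first_violation.is_none() {
        first_violation = Some(1.0);
    }
    ScanReport {
        first_violation,
        distinct_outputs: seen.iter().filter(|&&s| s).count(),
    }
}
```

Assuming `f32_to_srgb8` takes an `f32` and returns a `u8`, passing it with a stride of 1 would be expected to report no violation and 256 distinct outputs. A larger stride gives a quick, non-exhaustive smoke test.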

Functions

f32_to_srgb8: Converts linear f32 RGB component to an 8-bit sRGB value.
f32x4_to_srgb8: Performs 4 simultaneous calls to f32_to_srgb8, and returns 4 results.
srgb8_to_f32: Converts an 8-bit sRGB component to a linear f32.