core.compress — compression codecs
A single Codec protocol with six concrete implementations. Used by
every Verum layer that touches external byte streams:
- HTTP Content-Encoding negotiation and round-tripping
- WebSocket permessage-deflate extension
- TLS 1.3 certificate compression (RFC 8879)
- QUIC payload paths (datagram and stream frames)
- Log-pipeline code for archival compression
- RDMA zero-copy payload paths (lz4 speed-priority)
Spec alignment
| Codec | RFC / spec |
|---|---|
| Gzip | RFC 1952 |
| Deflate (raw) | RFC 1951 |
| Zlib | RFC 1950 |
| Brotli | RFC 7932 |
| Zstd | RFC 8878 |
| Lz4 | lz4 frame format v1.6.3 |
Module layout
| Submodule | Purpose |
|---|---|
| core.compress (mod) | Algorithm, CompressError, Codec protocol, dispatch entry points |
| core.compress.mod_gzip | Gzip, Deflate, Zlib: the three RFC-195x formats |
| core.compress.mod_brotli | Brotli with configurable window |
| core.compress.mod_zstd | Zstd with optional dictionary support |
| core.compress.mod_lz4 | Lz4 speed-priority codec |
Algorithm — runtime-dispatchable identifier
public type Algorithm is
| Gzip // RFC 1952 (deflate + header/footer)
| Deflate // RFC 1951 raw deflate (headerless)
| Zlib // RFC 1950 container around deflate
| Brotli // RFC 7932 — dense, text-optimised
| Zstd // RFC 8878 — tunable; negative levels = fast mode
| Lz4 // pure speed, modest ratio
| Identity; // no-op fallback for negotiation
HTTP Content-Encoding token helpers
implement Algorithm {
public fn content_encoding(self) -> &'static Text; // "gzip" | "deflate" | "br" | "zstd" | "lz4" | "identity"
public fn from_content_encoding(token: &Text) -> Maybe<Algorithm>;
}
Zlib shares the HTTP token "deflate" with raw Deflate (both appear as deflate on the wire; the distinguishing byte-level framing is settled by the zlib header 0x78 0x9C …). from_content_encoding is case-insensitive and returns None for unknown tokens (the server must respond with 415 Unsupported Media Type or fall back to Identity).
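The token lookup and the zlib-versus-raw distinction described above can be sketched in Python. This is an illustration built on Python's standard zlib module, not the Verum implementation; the TOKENS table and the helper names looks_like_zlib and decode_deflate_token are hypothetical. The header test follows RFC 1950: compression method 8, window-size field at most 7, and the two header bytes forming a multiple of 31.

```python
import zlib

# Hypothetical token table mirroring the documented tokens (not the Verum API).
TOKENS = {
    "gzip": "Gzip", "deflate": "Deflate", "br": "Brotli",
    "zstd": "Zstd", "lz4": "Lz4", "identity": "Identity",
}

def from_content_encoding(token: str):
    """Case-insensitive lookup; returns None for unknown tokens."""
    return TOKENS.get(token.strip().lower())

def looks_like_zlib(data: bytes) -> bool:
    """RFC 1950 header check: CM == 8, CINFO <= 7, (CMF*256 + FLG) % 31 == 0."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    return (cmf & 0x0F) == 8 and (cmf >> 4) <= 7 and ((cmf << 8) | flg) % 31 == 0

def decode_deflate_token(data: bytes) -> bytes:
    """Decode a body labelled "deflate": try zlib framing first, then raw deflate."""
    if looks_like_zlib(data):
        try:
            return zlib.decompress(data, wbits=15)
        except zlib.error:
            pass  # header matched by coincidence; fall through to raw deflate
    return zlib.decompress(data, wbits=-15)
```

A production decoder would also pass an output cap; it is omitted here to keep the framing check in focus.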
Codec protocol
Every algorithm-specific wrapper implements:
public type Codec is protocol {
const ALGORITHM: Algorithm;
fn encode(input: &[Byte], level: Int, out: &mut List<Byte>)
-> Result<Int, CompressError>;
fn decode(input: &[Byte], max_output_bytes: Int, out: &mut List<Byte>)
-> Result<Int, CompressError>;
};
- encode appends the compressed payload to out, returning the number of bytes appended.
- decode appends the decompressed payload, bounded by max_output_bytes; exceeding the bound surfaces as CompressError.OutputTooLarge to defend against zip-bomb inputs.
- The ALGORITHM associated const pins each implementation to its Algorithm variant; dispatch tables use it as the lookup key.
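As a rough Python analogue of the decode contract (append to out, return the count, fail once the cap is exceeded), the sketch below uses the standard zlib module. The OutputTooLarge class and decode_zlib function are hypothetical stand-ins for the Verum names, and the behaviour of leaving out untouched on failure is an assumption of this sketch, not something the text above guarantees.

```python
import zlib

class OutputTooLarge(Exception):
    """Stand-in for CompressError.OutputTooLarge { limit }."""
    def __init__(self, limit: int):
        super().__init__(f"decoded output exceeds {limit} bytes")
        self.limit = limit

def decode_zlib(data: bytes, max_output_bytes: int, out: bytearray) -> int:
    """Append the decoded payload to `out` and return the number of bytes appended."""
    d = zlib.decompressobj()
    # Request one byte more than the cap so that an overrun is detectable
    # without ever materialising the full expansion.
    chunk = d.decompress(data, max_output_bytes + 1)
    if len(chunk) > max_output_bytes:
        raise OutputTooLarge(max_output_bytes)
    if not d.eof:
        raise ValueError("truncated or corrupt stream")
    out.extend(chunk)
    return len(chunk)
```

Because the decompressor is asked for at most max_output_bytes + 1 bytes, a highly compressible input is rejected after producing only slightly more than the cap.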
Dispatch entry points
public fn encode(algo: Algorithm, input: &[Byte], level: Int,
out: &mut List<Byte>) -> Result<Int, CompressError>;
public fn decode(algo: Algorithm, input: &[Byte], max_output_bytes: Int,
out: &mut List<Byte>) -> Result<Int, CompressError>;
These two functions are the recommended surface for HTTP middleware
and proxy layers. They branch on the Algorithm variant to call the
correct concrete codec:
mount core.compress.{Algorithm, CompressError, encode, decode};
fn encode_with_negotiated(algo: Algorithm, body: &[Byte])
-> Result<List<Byte>, CompressError>
{
let mut out: List<Byte> = [];
encode(algo, body, 6 /* balanced level */, &mut out)?;
Ok(out)
}
Error model
public type CompressError is
| UnsupportedAlgorithm(Algorithm) // backend feature-flag off
| CorruptInput(Text) // bad header, premature EOF, etc.
| BufferTooSmall { need: Int, have: Int } // fixed-output encoder was short
| InvalidLevel { algo: Algorithm, level: Int } // level out of range
| DictionaryMismatch // zstd dict ID/hash mismatch
| OutputTooLarge { limit: Int } // zip-bomb defence
| IoError(Text); // I/O propagation from inner reader/writer
All variants are purely data; no I/O happens inside the error path other than the caller's own propagation.
Level ranges
| Algorithm | Valid level range | Typical default |
|---|---|---|
| Gzip / Deflate / Zlib | 0–9 | 6 |
| Brotli | 0–11 | 4 |
| Zstd | −(2²²)..22 | 3 |
| Lz4 | 0–16 | 0 (fast) |
| Identity | any (ignored) | — |
InvalidLevel { algo, level } surfaces when the range check fails.
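The range check can be pictured as a table lookup. The Python sketch below copies the bounds from the table above; LEVEL_RANGES, InvalidLevel and check_level are hypothetical names for illustration, and algorithms are represented as plain strings rather than Algorithm variants.

```python
# Inclusive (low, high) bounds copied from the level table.
LEVEL_RANGES = {
    "Gzip": (0, 9), "Deflate": (0, 9), "Zlib": (0, 9),
    "Brotli": (0, 11),
    "Zstd": (-(2 ** 22), 22),
    "Lz4": (0, 16),
}

class InvalidLevel(Exception):
    """Stand-in for CompressError.InvalidLevel { algo, level }."""
    def __init__(self, algo: str, level: int):
        super().__init__(f"level {level} is out of range for {algo}")
        self.algo = algo
        self.level = level

def check_level(algo: str, level: int) -> None:
    """Raise InvalidLevel if `level` is outside the documented range for `algo`."""
    if algo == "Identity":
        return  # the level is ignored for the no-op codec
    low, high = LEVEL_RANGES[algo]
    if not low <= level <= high:
        raise InvalidLevel(algo, level)
```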
Per-codec examples
Gzip (HTTP bodies)
mount core.compress.mod_gzip.{Gzip};
fn compress_body(raw: &[Byte]) -> Result<List<Byte>, CompressError> {
let mut out: List<Byte> = [];
Gzip.encode(raw, 6 /* balanced */, &mut out)?;
Ok(out)
}
fn decompress_body(wire: &[Byte]) -> Result<List<Byte>, CompressError> {
let mut out: List<Byte> = [];
// 10 MiB output cap is a typical HTTP-body defence.
Gzip.decode(wire, 10 * 1024 * 1024, &mut out)?;
Ok(out)
}
Brotli (text-heavy payloads)
mount core.compress.mod_brotli.{Brotli};
let mut out: List<Byte> = [];
Brotli.encode(html.as_bytes(), 11 /* max quality */, &mut out)?;
Brotli at quality 11 is slow on the encode side but produces notably smaller output for HTML / JSON / CSS than gzip-9; use brotli for static assets and gzip / zstd for dynamic traffic.
Zstd with dictionary
mount core.compress.mod_zstd.{Zstd};
fn compress_log_line(dict: &[Byte], line: &[Byte])
-> Result<List<Byte>, CompressError>
{
let mut out: List<Byte> = [];
Zstd.encode_with_dict(line, dict, 3, &mut out)?;
Ok(out)
}
Zstd dictionaries produce dramatic savings on log / metrics streams where lines share a common vocabulary (e.g., repeated field names, service tags, timestamps) — a 4-KiB trained dictionary typically halves the per-line compressed size at zstd level 3.
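To see why a shared dictionary helps on short, repetitive records, the sketch below uses zlib preset dictionaries from Python's standard library as a stand-in for zstd dictionaries. The mechanism is analogous but not identical (zstd dictionaries are usually trained, whereas this one is written by hand), and the function names and sample data are illustrative only.

```python
import zlib

def compress_with_dict(line: bytes, dictionary: bytes, level: int = 6) -> bytes:
    """Compress `line`, letting the encoder reference bytes from `dictionary`."""
    c = zlib.compressobj(level, zdict=dictionary)
    return c.compress(line) + c.flush()

def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress; the same dictionary must be supplied on the decode side."""
    d = zlib.decompressobj(zdict=dictionary)
    return d.decompress(blob) + d.flush()

# Hand-written dictionary holding the vocabulary that every log line shares.
dictionary = b'{"service":"checkout","level":"info","region":"eu-west-1","msg":"'
line = b'{"service":"checkout","level":"info","region":"eu-west-1","msg":"order placed"}'

with_dict = compress_with_dict(line, dictionary)
without_dict = zlib.compress(line)
```

Here with_dict comes out shorter than without_dict because most of the line is encoded as a single back-reference into the dictionary.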
Lz4 (speed-priority)
mount core.compress.mod_lz4.{Lz4};
let mut out: List<Byte> = [];
Lz4.encode(payload, 0 /* fast */, &mut out)?;
Lz4 at level 0 is the fastest codec here: published lz4 benchmarks
report single-thread compression in the high hundreds of MB/s and
decompression at several GB/s on modern CPUs, which makes it the right
choice for zero-copy RDMA payloads and in-process cache tiers.
DoS / security discipline
Two defences every decoder respects:
- max_output_bytes cap: a small malicious input can expand to gigabytes if the decoder has no bound. Every decode call takes an explicit cap; exceeding it returns OutputTooLarge { limit } rather than a best-effort OOM.
- Header validation: all decoders reject malformed headers / magic bytes immediately, before allocating the output buffer. A corrupt gzip header surfaces as CorruptInput(reason) within the first 10 bytes of input.
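The header check for gzip can be sketched from RFC 1952 alone: the fixed header is 10 bytes, starting with the magic bytes 0x1F 0x8B, followed by compression method 8 and a flag byte whose top three bits are reserved. The Python below is an illustration with hypothetical names (CorruptInput, validate_gzip_header), not the Verum decoder.

```python
import gzip

class CorruptInput(Exception):
    """Stand-in for CompressError.CorruptInput(reason)."""

def validate_gzip_header(data: bytes) -> None:
    """Reject malformed input using only the fixed 10-byte gzip header."""
    if len(data) < 10:
        raise CorruptInput("truncated gzip header")
    if data[0:2] != b"\x1f\x8b":
        raise CorruptInput("bad magic bytes")
    if data[2] != 8:
        raise CorruptInput("unknown compression method")
    if data[3] & 0xE0:
        raise CorruptInput("reserved flag bits set")

# A well-formed stream passes silently; nothing is decompressed or allocated.
validate_gzip_header(gzip.compress(b"example"))
```

Running this check before constructing a decompressor keeps the cost of rejecting garbage input constant, independent of the claimed payload size.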
Backend plugability
All six codecs are backed by @intrinsic calls — the runtime wires
them to the chosen native backend (zlib-ng, brotli-go, libzstd, lz4)
at build time via a feature flag. The Verum-level surface never
changes when swapping backends; UnsupportedAlgorithm(a) indicates
the compile-time feature flag for that codec is off.
See also
- stdlib/encoding: JSON / CBOR / MessagePack / Base64; codec-shaped but for data format, not payload compression.
- stdlib/net: HTTP Content-Encoding negotiation sits above this layer.
- stdlib/net/tls: RFC 8879 certificate compression uses Brotli / Zstd backends through this API.