Skip to main content

Tensor types

TL;DR. A tensor type Tensor<Float32, [3, 224, 224]> describes a three-dimensional array of 32-bit floats with the given shape. Shape mismatches — mismatched matmul, wrong axis for reduction, broken broadcasting — are caught by the type checker before the program runs. At runtime, a tensor is just a contiguous buffer plus strides known at compile time.

// Static shape — the type system enforces it.
fn image_channels(img: Tensor<Float32, [3, H, W]>) -> Int { 3 }

// Shape-polymorphic — H and W can vary, compiler keeps track.
fn resize<H, W>(img: Tensor<Float32, [3, H, W]>)
-> Tensor<Float32, [3, 224, 224]> { ... }

// Matmul: the inner dimensions must agree, or the type checker rejects.
fn matmul<M, K, N>(a: Tensor<Float32, [M, K]>,
b: Tensor<Float32, [K, N]>)
-> Tensor<Float32, [M, N]> { ... }
Status

Wired end-to-end through the type system: Tensor<Element, [Shape]> parses, type-checks, and validates compile-time shape arithmetic across 40+ VBC end-to-end tests. Compile-time shape arithmetic is stable; broadcasting inference is maturing; shape-refined indexing (t[i] with i < shape[0]) is experimental.

Why shape-typed tensors?

The vast majority of machine-learning bugs are shape bugs: batch vs. no-batch, transposed matmul, wrong axis for sum or mean, incompatible broadcast. NumPy and PyTorch turn these into runtime ValueErrors, often inside a training loop after an hour of compute.

Bug classNumPy/PyTorchVerum
a @ b with mismatched inner dimsRuntime ValueErrorCompile error, shape visible
sum(x, axis=5) on rank-3 xRuntime errorCompile error
Broadcast between [4, 3, 1] and [4, 5]Runtime errorCompile error
Off-by-one in batch slicingSilent wrong outputCompile error or refined-index error
Forgotten .unsqueezeShape drift at the next opCompile error

Shape typing moves all of this to compile time. The runtime cost is zero — shapes exist only in the type system.

The type

tensor_type_expr = 'tensor' , '<' , shape_params , '>' , type_expr ;
shape_params = dimension , { ',' , dimension } ;
dimension = integer_lit | identifier | meta_arith_expr ;

Two equivalent surface forms:

// generic form — uses a type-level list
Tensor<Float32, [3, 224, 224]>

// explicit form — mirrors tensor literal syntax
tensor<3, 224, 224> Float32

Both elaborate to the same Type.Tensor node. Use the generic form in general code; use the explicit form in numerical contexts where the resemblance to the literal aids reading.

First example: shape-checked dot product

fn dot<N>(a: Tensor<Float32, [N]>, b: Tensor<Float32, [N]>) -> Float32 {
let mut acc = 0.0;
for i in 0..N { acc += a[i] * b[i]; }
acc
}

let x: Tensor<Float32, [3]> = tensor<3> Float32 { 1.0, 2.0, 3.0 };
let y: Tensor<Float32, [3]> = tensor<3> Float32 { 4.0, 5.0, 6.0 };
let z = dot(x, y); // ok, N = 3 inferred

let w: Tensor<Float32, [4]> = tensor<4> Float32 { 1, 2, 3, 4 };
// dot(x, w); // ERROR: expected Tensor<_, [3]>, found Tensor<_, [4]>

Shape arithmetic

Shapes are compile-time expressions over natural numbers and bound identifiers. Verum supports the usual operators inside shape positions:

fn concat<A, B>(
a: Tensor<Float32, [A]>,
b: Tensor<Float32, [B]>,
) -> Tensor<Float32, [A + B]> { ... }

fn reshape_flat<H, W>(
img: Tensor<Float32, [3, H, W]>,
) -> Tensor<Float32, [3 * H * W]> { ... }

fn kron<A, B, C, D>(
x: Tensor<Float32, [A, B]>,
y: Tensor<Float32, [C, D]>,
) -> Tensor<Float32, [A * C, B * D]> { ... }

The operators available in shape positions are the same as in meta arithmetic: +, -, *, / (integer division), and bracketed grouping. Equality between shape expressions uses SMT so [A + B] and [B + A] unify.

Broadcasting

Broadcasting is checked by tensor_shape_checker. Two rules:

  1. Shapes are right-aligned. Missing leading dims are implicitly 1.
  2. Dimensions align if they are equal, or either is 1.

The checker computes the broadcast shape symbolically:

// Input types
let x: Tensor<Float32, [4, 3, 1]>;
let y: Tensor<Float32, [ 3, 5]>;

// Output shape computed at compile time
let z: Tensor<Float32, [4, 3, 5]> = x + y;

A failing broadcast is a compile error with the offending axis called out:

error[E1311]: cannot broadcast shapes [4, 3, 1] and [4, 5]
--> src/net.vr:18:13
|
18 | let z = x + y;
| ^^^^^ dimension 1: 3 vs 5 (neither is 1)
= note: broadcasting requires each axis to be equal or one of
the operands to be 1 at that axis.

Matrix multiplication

fn matmul<M, K, N>(
a: Tensor<Float32, [M, K]>,
b: Tensor<Float32, [K, N]>,
) -> Tensor<Float32, [M, N]> { ... }

let a: Tensor<Float32, [3, 4]> = ...;
let b: Tensor<Float32, [4, 5]> = ...;
let c = matmul(a, b); // Tensor<Float32, [3, 5]>, inferred

let d: Tensor<Float32, [6, 5]> = ...;
// matmul(a, d); // ERROR: inner dims mismatch: 4 vs 6

Batched matmul follows the same pattern with leading broadcast dimensions:

fn bmm<B, M, K, N>(
a: Tensor<Float32, [B, M, K]>,
b: Tensor<Float32, [B, K, N]>,
) -> Tensor<Float32, [B, M, N]> { ... }

Reductions

Reductions shrink a chosen axis. The axis is a compile-time integer; the checker verifies it is in range:

fn sum_axis<Rank, Axis, S: Shape<Rank>>(
t: Tensor<Float32, S>,
) -> Tensor<Float32, S.without(Axis)>
where Axis < Rank;

S.without(Axis) is a type-level function on shapes: remove the element at Axis. Specified at compile time, verified by the checker.

Reshape

fn reshape<OldShape, NewShape>(
t: Tensor<Float32, OldShape>,
) -> Tensor<Float32, NewShape>
where product(OldShape) == product(NewShape);

The product(...) predicate ensures element-count preservation — checked by SMT at compile time.

Tensor literals

Tensor literals pair an explicit shape with nested elements:

let eye3: Tensor<Float32, [3, 3]> = tensor<3, 3> Float32 {
1.0, 0.0, 0.0,
0.0, 1.0, 0.0,
0.0, 0.0, 1.0,
};

let batch: Tensor<Int, [2, 3]> = tensor<2, 3> Int {
[1, 2, 3],
[4, 5, 6],
};

The number of elements must equal ∏ shape_i, verified at parse time. Literals can be nested arrays (PyTorch style) or flat (BLAS style); both are accepted.

Interaction with dependent types

Shape dimensions are (usually) type-level; when you need a runtime-valued dimension you use dependent types:

fn concat_dyn(a: Tensor<Float32, [n]>, b: Tensor<Float32, [m]>)
-> Tensor<Float32, [n + m]>
where n: Nat, m: Nat;

n and m are dependent-typed indices; the return shape is computed from them. At runtime, the dimensions become ordinary integers carried alongside the buffer.

Interaction with autodiff

Tensor types interact with @grad and @jit:

@grad
fn loss(x: Tensor<Float32, [Batch, 10]>, w: Tensor<Float32, [10, 1]>)
-> Tensor<Float32, [Batch, 1]>
{
matmul(x, w)
}

Gradient shapes are computed from the forward shapes, giving you statically-typed backward passes.

Cookbook

A fully-connected layer

type Linear<In, Out> is {
weight: Tensor<Float32, [Out, In]>,
bias: Tensor<Float32, [Out]>,
};

implement Linear<In, Out> {
fn forward<Batch>(&self, x: Tensor<Float32, [Batch, In]>)
-> Tensor<Float32, [Batch, Out]>
{
matmul(x, self.weight.transpose()) + self.bias
}
}

Every shape in forward is checked against the layer's declaration. A shape mismatch in the training loop is a compile error in the layer definition.

A multi-head attention head

fn attention<B, H, T, D>(
q: Tensor<Float32, [B, H, T, D]>,
k: Tensor<Float32, [B, H, T, D]>,
v: Tensor<Float32, [B, H, T, D]>,
) -> Tensor<Float32, [B, H, T, D]> {
let scores = matmul(q, k.transpose(-1, -2)) / sqrt(D as Float32);
let weights = softmax(scores, axis=-1);
matmul(weights, v)
}

All head dimensions are polymorphic — one implementation, infinite sizes.

Convolution (simplified)

fn conv2d<B, C_in, C_out, Kh, Kw, H, W>(
input: Tensor<Float32, [B, C_in, H, W]>,
weight: Tensor<Float32, [C_out, C_in, Kh, Kw]>,
) -> Tensor<Float32, [B, C_out, H - Kh + 1, W - Kw + 1]>
where Kh <= H, Kw <= W;

The output spatial shape is computed from the input. The compiler rejects a 3×3 kernel on a 2×2 input with a helpful message.

Interaction with other features

With...Behaviour
Refinement typesTensor element types can be refined: Tensor<Float32 { self >= 0 }, [N]> — enforces nonnegativity entry-wise.
LinearityGPU-resident tensors can be declared linear so that .unmap() is mandatory.
ProtocolsAdd, Mul, Matmul etc. are protocols parameterised by shape relations.
Meta stagingShape arithmetic is compile-time; @stage can pre-compute shapes across a pipeline.
VerificationReductions and reshape emit SMT obligations automatically.

Common pitfalls

"The compiler says [3, 4] and [A, B] don't unify"

You're calling a shape-polymorphic function with concrete tensors but the compiler can't deduce A = 3, B = 4. Add type arguments: f<3, 4>(t) — or wait for inference to see both arguments and the return type.

"My reshape fails even though the products match"

The SMT constraint is product(OldShape) == product(NewShape), and nonlinear arithmetic is undecidable in general. Provide a hint: reshape<[A*B, C]>(t) or break the reshape into steps with known intermediate shapes.

"Broadcasting says 'dimension 2: 7 vs 7' and still fails"

Check the leading dims; broadcasting is right-aligned, so [7, 3, 1] vs [3, 1] has equal trailing dims but the second operand is missing a leading 7. Use unsqueeze or let broadcasting insert 1s explicitly.

"shape[0] compiled, but shape[i] complains"

Shape subscripts with symbolic indices are refinement-typed (i < Rank). Either bind i with a refined type or use a compile-time constant.

Shape verifier configuration

Tooling that drives the shape verifier directly (custom static analysis, IDE integrations, advanced lint passes) controls its behaviour via ShapeConfig. Every documented field is honoured by the verifier (no inert defenses):

FieldDefaultWhat it controls
enable_broadcasttrueAllow broadcasting in element-wise ops. When false, verify_broadcast rejects any non-identical shape pair. Bounds the verifier to strict-equality semantics for tooling that wants to flag implicit broadcasts.
enable_polymorphismtruePermit polymorphic shape variables (e.g. Tensor<T, [N, M]>) in unification. When false, only fully-resolved static shapes are accepted.
max_rank8Per-tensor rank ceiling, enforced at the entry of every verify_* operation: matmul (both operands), elementwise (both), broadcast (both + result rank check), reduction, transpose, reshape (input + new_shape), and concat (every input). A shape whose rank() exceeds max_rank surfaces ShapeError::InvalidOperation { operation, requirement: "rank ≤ max_rank (N)", actual: "rank K" }, naming the offending operation. Tighten this on memory-constrained targets to surface model architectures that would blow the static-analysis budget; relax it for research workloads with very high-rank tensors.

A typical override keeps every default except the rank ceiling — e.g., to tighten max_rank to 4 for a memory-constrained target, construct a ShapeConfig whose max_rank is 4 and pass it to ShapeVerifier.

The check runs in O(1) per call (rank comparison is a single field read), so tightening the cap costs nothing at verification time but gains an early-fail diagnostic for shape pathologies that would otherwise surface much later in codegen.

Implementation status

FeatureStatusNotes
Tensor<E, [Shape]> typeStablefirst-class in the type system
Tensor literal parser / checkerStableshape inferred from literal structure
Shape arithmetic in typesStablecompile-time *, +, Min, Max over dims
Broadcasting inferenceMaturingNumPy-style alignment rules
Matmul / batched matmulStableshape checker enforces inner-dim equality
Element-wise op overloading via protocolsExperimentalunified across CPU / GPU backends
Shape-refined indexing (t[i] with i < shape[0])Experimentalrefinement integration in progress
Dependent (runtime-valued) dimensionsExperimentalpattern established in stdlib
Autodiff typing (@grad)Planned

FAQ

Are shapes erased at runtime? Yes. The runtime representation is just a buffer plus a tuple of integer strides, computed at compile time for static shapes.

What is Float32 vs Float64 — is element type erased too? Element types are not erased — they control layout. Only the shape indices are erased.

Can I have Tensor<T, S> generic over T? Yes, and you often should. Numeric protocols (Num, Float, Int) constrain T appropriately.

Where does the name tensor come from in the grammar? The lower-case tensor<...> is a literal head that matches the type head. Capital Tensor<T, S> is the generic form used in types. Both reach the same internal representation.

Is there first-class support for sparse tensors? Not at the type level — sparse layouts live behind protocol-based abstractions (SparseTensor<T, S> with its own methods).

What about ragged (variable-length) tensors? Use a row of dependent-typed tensors (List<Tensor<T, [n_i]>>) with the lengths in a companion shape tensor.

See also