Tensor types
TL;DR. A tensor type
Tensor<Float32, [3, 224, 224]>describes a three-dimensional array of 32-bit floats with the given shape. Shape mismatches — mismatched matmul, wrong axis for reduction, broken broadcasting — are caught by the type checker before the program runs. At runtime, a tensor is just a contiguous buffer plus strides known at compile time.
// Static shape — the type system enforces it.
fn image_channels(img: Tensor<Float32, [3, H, W]>) -> Int { 3 }
// Shape-polymorphic — H and W can vary, compiler keeps track.
fn resize<H, W>(img: Tensor<Float32, [3, H, W]>)
-> Tensor<Float32, [3, 224, 224]> { ... }
// Matmul: the inner dimensions must agree, or the type checker rejects.
fn matmul<M, K, N>(a: Tensor<Float32, [M, K]>,
b: Tensor<Float32, [K, N]>)
-> Tensor<Float32, [M, N]> { ... }
Wired end-to-end through the type system: Tensor<Element, [Shape]>
parses, type-checks, and validates compile-time shape arithmetic
across 40+ VBC end-to-end tests. Compile-time shape arithmetic is
stable; broadcasting inference is maturing; shape-refined
indexing (t[i] with i < shape[0]) is experimental.
Why shape-typed tensors?
The vast majority of machine-learning bugs are shape bugs:
batch vs. no-batch, transposed matmul, wrong axis for sum or
mean, incompatible broadcast. NumPy and PyTorch turn these into
runtime ValueErrors, often inside a training loop after an
hour of compute.
| Bug class | NumPy/PyTorch | Verum |
|---|---|---|
a @ b with mismatched inner dims | Runtime ValueError | Compile error, shape visible |
sum(x, axis=5) on rank-3 x | Runtime error | Compile error |
Broadcast between [4, 3, 1] and [4, 5] | Runtime error | Compile error |
| Off-by-one in batch slicing | Silent wrong output | Compile error or refined-index error |
Forgotten .unsqueeze | Shape drift at the next op | Compile error |
Shape typing moves all of this to compile time. The runtime cost is zero — shapes exist only in the type system.
The type
tensor_type_expr = 'tensor' , '<' , shape_params , '>' , type_expr ;
shape_params = dimension , { ',' , dimension } ;
dimension = integer_lit | identifier | meta_arith_expr ;
Two equivalent surface forms:
// generic form — uses a type-level list
Tensor<Float32, [3, 224, 224]>
// explicit form — mirrors tensor literal syntax
tensor<3, 224, 224> Float32
Both elaborate to the same Type.Tensor node. Use the generic
form in general code; use the explicit form in numerical contexts
where the resemblance to the literal aids reading.
First example: shape-checked dot product
fn dot<N>(a: Tensor<Float32, [N]>, b: Tensor<Float32, [N]>) -> Float32 {
let mut acc = 0.0;
for i in 0..N { acc += a[i] * b[i]; }
acc
}
let x: Tensor<Float32, [3]> = tensor<3> Float32 { 1.0, 2.0, 3.0 };
let y: Tensor<Float32, [3]> = tensor<3> Float32 { 4.0, 5.0, 6.0 };
let z = dot(x, y); // ok, N = 3 inferred
let w: Tensor<Float32, [4]> = tensor<4> Float32 { 1, 2, 3, 4 };
// dot(x, w); // ERROR: expected Tensor<_, [3]>, found Tensor<_, [4]>
Shape arithmetic
Shapes are compile-time expressions over natural numbers and bound identifiers. Verum supports the usual operators inside shape positions:
fn concat<A, B>(
a: Tensor<Float32, [A]>,
b: Tensor<Float32, [B]>,
) -> Tensor<Float32, [A + B]> { ... }
fn reshape_flat<H, W>(
img: Tensor<Float32, [3, H, W]>,
) -> Tensor<Float32, [3 * H * W]> { ... }
fn kron<A, B, C, D>(
x: Tensor<Float32, [A, B]>,
y: Tensor<Float32, [C, D]>,
) -> Tensor<Float32, [A * C, B * D]> { ... }
The operators available in shape positions are the same as in
meta arithmetic: +, -, *, / (integer division), and
bracketed grouping. Equality between shape expressions uses SMT
so [A + B] and [B + A] unify.
Broadcasting
Broadcasting is checked by tensor_shape_checker. Two rules:
- Shapes are right-aligned. Missing leading dims are implicitly
1. - Dimensions align if they are equal, or either is
1.
The checker computes the broadcast shape symbolically:
// Input types
let x: Tensor<Float32, [4, 3, 1]>;
let y: Tensor<Float32, [ 3, 5]>;
// Output shape computed at compile time
let z: Tensor<Float32, [4, 3, 5]> = x + y;
A failing broadcast is a compile error with the offending axis called out:
error[E1311]: cannot broadcast shapes [4, 3, 1] and [4, 5]
--> src/net.vr:18:13
|
18 | let z = x + y;
| ^^^^^ dimension 1: 3 vs 5 (neither is 1)
= note: broadcasting requires each axis to be equal or one of
the operands to be 1 at that axis.
Matrix multiplication
fn matmul<M, K, N>(
a: Tensor<Float32, [M, K]>,
b: Tensor<Float32, [K, N]>,
) -> Tensor<Float32, [M, N]> { ... }
let a: Tensor<Float32, [3, 4]> = ...;
let b: Tensor<Float32, [4, 5]> = ...;
let c = matmul(a, b); // Tensor<Float32, [3, 5]>, inferred
let d: Tensor<Float32, [6, 5]> = ...;
// matmul(a, d); // ERROR: inner dims mismatch: 4 vs 6
Batched matmul follows the same pattern with leading broadcast dimensions:
fn bmm<B, M, K, N>(
a: Tensor<Float32, [B, M, K]>,
b: Tensor<Float32, [B, K, N]>,
) -> Tensor<Float32, [B, M, N]> { ... }
Reductions
Reductions shrink a chosen axis. The axis is a compile-time integer; the checker verifies it is in range:
fn sum_axis<Rank, Axis, S: Shape<Rank>>(
t: Tensor<Float32, S>,
) -> Tensor<Float32, S.without(Axis)>
where Axis < Rank;
S.without(Axis) is a type-level function on shapes: remove the
element at Axis. Specified at compile time, verified by the
checker.
Reshape
fn reshape<OldShape, NewShape>(
t: Tensor<Float32, OldShape>,
) -> Tensor<Float32, NewShape>
where product(OldShape) == product(NewShape);
The product(...) predicate ensures element-count preservation —
checked by SMT at compile time.
Tensor literals
Tensor literals pair an explicit shape with nested elements:
let eye3: Tensor<Float32, [3, 3]> = tensor<3, 3> Float32 {
1.0, 0.0, 0.0,
0.0, 1.0, 0.0,
0.0, 0.0, 1.0,
};
let batch: Tensor<Int, [2, 3]> = tensor<2, 3> Int {
[1, 2, 3],
[4, 5, 6],
};
The number of elements must equal ∏ shape_i, verified at parse
time. Literals can be nested arrays (PyTorch style) or flat
(BLAS style); both are accepted.
Interaction with dependent types
Shape dimensions are (usually) type-level; when you need a runtime-valued dimension you use dependent types:
fn concat_dyn(a: Tensor<Float32, [n]>, b: Tensor<Float32, [m]>)
-> Tensor<Float32, [n + m]>
where n: Nat, m: Nat;
n and m are dependent-typed indices; the return shape is
computed from them. At runtime, the dimensions become ordinary
integers carried alongside the buffer.
Interaction with autodiff
Tensor types interact with @grad and @jit:
@grad
fn loss(x: Tensor<Float32, [Batch, 10]>, w: Tensor<Float32, [10, 1]>)
-> Tensor<Float32, [Batch, 1]>
{
matmul(x, w)
}
Gradient shapes are computed from the forward shapes, giving you statically-typed backward passes.
Cookbook
A fully-connected layer
type Linear<In, Out> is {
weight: Tensor<Float32, [Out, In]>,
bias: Tensor<Float32, [Out]>,
};
implement Linear<In, Out> {
fn forward<Batch>(&self, x: Tensor<Float32, [Batch, In]>)
-> Tensor<Float32, [Batch, Out]>
{
matmul(x, self.weight.transpose()) + self.bias
}
}
Every shape in forward is checked against the layer's
declaration. A shape mismatch in the training loop is a compile
error in the layer definition.
A multi-head attention head
fn attention<B, H, T, D>(
q: Tensor<Float32, [B, H, T, D]>,
k: Tensor<Float32, [B, H, T, D]>,
v: Tensor<Float32, [B, H, T, D]>,
) -> Tensor<Float32, [B, H, T, D]> {
let scores = matmul(q, k.transpose(-1, -2)) / sqrt(D as Float32);
let weights = softmax(scores, axis=-1);
matmul(weights, v)
}
All head dimensions are polymorphic — one implementation, infinite sizes.
Convolution (simplified)
fn conv2d<B, C_in, C_out, Kh, Kw, H, W>(
input: Tensor<Float32, [B, C_in, H, W]>,
weight: Tensor<Float32, [C_out, C_in, Kh, Kw]>,
) -> Tensor<Float32, [B, C_out, H - Kh + 1, W - Kw + 1]>
where Kh <= H, Kw <= W;
The output spatial shape is computed from the input. The compiler rejects a 3×3 kernel on a 2×2 input with a helpful message.
Interaction with other features
| With... | Behaviour |
|---|---|
| Refinement types | Tensor element types can be refined: Tensor<Float32 { self >= 0 }, [N]> — enforces nonnegativity entry-wise. |
| Linearity | GPU-resident tensors can be declared linear so that .unmap() is mandatory. |
| Protocols | Add, Mul, Matmul etc. are protocols parameterised by shape relations. |
| Meta staging | Shape arithmetic is compile-time; @stage can pre-compute shapes across a pipeline. |
| Verification | Reductions and reshape emit SMT obligations automatically. |
Common pitfalls
"The compiler says [3, 4] and [A, B] don't unify"
You're calling a shape-polymorphic function with concrete tensors
but the compiler can't deduce A = 3, B = 4. Add type arguments:
f<3, 4>(t) — or wait for inference to see both arguments and
the return type.
"My reshape fails even though the products match"
The SMT constraint is product(OldShape) == product(NewShape),
and nonlinear arithmetic is undecidable in general. Provide a
hint: reshape<[A*B, C]>(t) or break the reshape into steps
with known intermediate shapes.
"Broadcasting says 'dimension 2: 7 vs 7' and still fails"
Check the leading dims; broadcasting is right-aligned, so
[7, 3, 1] vs [3, 1] has equal trailing dims but the second
operand is missing a leading 7. Use unsqueeze or let
broadcasting insert 1s explicitly.
"shape[0] compiled, but shape[i] complains"
Shape subscripts with symbolic indices are refinement-typed
(i < Rank). Either bind i with a refined type or use a
compile-time constant.
Shape verifier configuration
Tooling that drives the shape verifier directly (custom static
analysis, IDE integrations, advanced lint passes) controls its
behaviour via ShapeConfig. Every documented field is honoured
by the verifier (no inert defenses):
| Field | Default | What it controls |
|---|---|---|
enable_broadcast | true | Allow broadcasting in element-wise ops. When false, verify_broadcast rejects any non-identical shape pair. Bounds the verifier to strict-equality semantics for tooling that wants to flag implicit broadcasts. |
enable_polymorphism | true | Permit polymorphic shape variables (e.g. Tensor<T, [N, M]>) in unification. When false, only fully-resolved static shapes are accepted. |
max_rank | 8 | Per-tensor rank ceiling, enforced at the entry of every verify_* operation: matmul (both operands), elementwise (both), broadcast (both + result rank check), reduction, transpose, reshape (input + new_shape), and concat (every input). A shape whose rank() exceeds max_rank surfaces ShapeError::InvalidOperation { operation, requirement: "rank ≤ max_rank (N)", actual: "rank K" }, naming the offending operation. Tighten this on memory-constrained targets to surface model architectures that would blow the static-analysis budget; relax it for research workloads with very high-rank tensors. |
A typical override keeps every default except the rank
ceiling — e.g., to tighten max_rank to 4 for a
memory-constrained target, construct a ShapeConfig whose
max_rank is 4 and pass it to ShapeVerifier.
The check runs in O(1) per call (rank comparison is a single
field read), so tightening the cap costs nothing at verification
time but gains an early-fail diagnostic for shape pathologies
that would otherwise surface much later in codegen.
Implementation status
| Feature | Status | Notes |
|---|---|---|
Tensor<E, [Shape]> type | Stable | first-class in the type system |
| Tensor literal parser / checker | Stable | shape inferred from literal structure |
| Shape arithmetic in types | Stable | compile-time *, +, Min, Max over dims |
| Broadcasting inference | Maturing | NumPy-style alignment rules |
| Matmul / batched matmul | Stable | shape checker enforces inner-dim equality |
| Element-wise op overloading via protocols | Experimental | unified across CPU / GPU backends |
Shape-refined indexing (t[i] with i < shape[0]) | Experimental | refinement integration in progress |
| Dependent (runtime-valued) dimensions | Experimental | pattern established in stdlib |
Autodiff typing (@grad) | Planned | — |
FAQ
Are shapes erased at runtime? Yes. The runtime representation is just a buffer plus a tuple of integer strides, computed at compile time for static shapes.
What is Float32 vs Float64 — is element type erased too?
Element types are not erased — they control layout. Only the
shape indices are erased.
Can I have Tensor<T, S> generic over T? Yes, and you often
should. Numeric protocols (Num, Float, Int) constrain T
appropriately.
Where does the name tensor come from in the grammar? The
lower-case tensor<...> is a literal head that matches the type
head. Capital Tensor<T, S> is the generic form used in types.
Both reach the same internal representation.
Is there first-class support for sparse tensors? Not at the
type level — sparse layouts live behind protocol-based
abstractions (SparseTensor<T, S> with its own methods).
What about ragged (variable-length) tensors? Use a row of
dependent-typed tensors (List<Tensor<T, [n_i]>>) with the lengths
in a companion shape tensor.
See also
- Types — base type forms.
- Refinement types — element-level predicates.
- Dependent types — dynamic dimensions.
- Small neural net tutorial — a real model built on tensor types.
- Grammar reference — Types —
the formal
Tensor<E, [Shape]>production.