Enigma provides tracing-compatible control flow that lowers to MLIR scf ops. Plain Python if/for executes at trace time (useful for compile-time unrolling); use these constructs for runtime GPU control flow.

enigma.for_range

with enigma.for_range(lo, hi, step=1, dtype="int", init=None) as iv:
    # loop body
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lo | int or IRValue | required | Lower bound (inclusive) |
| hi | int or IRValue | required | Upper bound (exclusive) |
| step | int or IRValue | 1 | Loop step |
| dtype | str | "int" | Induction variable type |
| init | list or None | None | Initial values for loop-carried accumulators |
Without init, the context yields the induction variable:
with enigma.for_range(0, N) as i:
    C[i] = A[i] + B[i]
With init, the context yields (iv, carry), where carry is a mutable Carry object that holds the loop-carried values, one slot per entry in init:
zero = enigma.metal_cast(0, "float")
with enigma.for_range(0, N, init=[zero]) as (i, c):
    c[0] = c[0] + A[i]
total = c[0]  # final value after the loop
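As a mental model of the loop-carry behavior described above, the following plain-Python sketch mimics what the carried version computes. It is not Enigma code and not how the library lowers the loop: `for_range_reference` and `accumulate` are hypothetical names introduced only for illustration, and a Python list stands in for the Carry object.

```python
def for_range_reference(lo, hi, step=1, init=None, body=None):
    """Plain-Python model of a loop with carried values (illustration only)."""
    # Each entry of `init` becomes one mutable carry slot.
    carry = list(init) if init is not None else []
    for iv in range(lo, hi, step):
        # The body reads and overwrites carry slots; the updated values
        # are what the next iteration sees.
        body(iv, carry)
    # After the loop, the slots hold the final values.
    return carry


A = [1.0, 2.0, 3.0, 4.0]


def accumulate(i, c):
    # Mirrors `c[0] = c[0] + A[i]` from the example above.
    c[0] = c[0] + A[i]


total = for_range_reference(0, len(A), init=[0.0], body=accumulate)[0]
```

Here `total` plays the role of `c[0]` read after the loop: the value left in the carry slot by the last iteration.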

enigma.if_

with enigma.if_(condition) as (then_b, else_b):
    with then_b:
        # then branch
    with else_b:
        # else branch
Simple if (no else):
with enigma.if_(enigma.cmp_lt(tid, enigma.metal_cast(N, "uint"))):
    C[tid] = A[tid]

enigma.while_

with enigma.while_(lambda: enigma.cmp_lt(counter, limit)):
    # body — must eventually make the condition false
The condition is a zero-argument callable returning an IRValue of type i1.
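One way to see why the condition is passed as a callable rather than a value: it has to be re-evaluated before every iteration. The sketch below models that in plain Python. It is an illustration under assumptions, not Enigma's implementation; `while_reference` and `count_up` are hypothetical helpers, and a dict stands in for mutable loop state.

```python
def while_reference(cond, body):
    """Plain-Python model: re-evaluate `cond()` before each iteration."""
    while cond():
        body()


def count_up(limit):
    # Mutable state that both the condition and the body can see.
    state = {"counter": 0}

    def step():
        # The body must eventually make the condition false.
        state["counter"] += 1

    while_reference(lambda: state["counter"] < limit, step)
    return state["counter"]
```

Passing a precomputed boolean instead of a callable would freeze the condition at its initial value, so the loop would either never run or never stop.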

enigma.range (AST preprocessor)

Inside @enigma.kernel, Python-style for loops with enigma.range are rewritten by the AST preprocessor into for_range calls with automatic carry tracking:
@enigma.kernel
def k(A: enigma.f32, Out: enigma.f32):
    acc = enigma.metal_cast(0, "float")
    for i in enigma.range(0, 64):
        acc = acc + A[i]
    Out[0] = acc
The preprocessor detects modified variables and threads them as loop carries automatically.
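The detection step can be sketched with Python's standard `ast` module. This is a simplified illustration of the idea, not Enigma's actual preprocessor: `find_loop_carries` and `_stored_names` are hypothetical helpers, and the sketch only handles top-level `for` loops in a single function.

```python
import ast


def _stored_names(stmts):
    """Names assigned to (Store context) anywhere inside `stmts`, in order."""
    names = []
    for stmt in stmts:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                if node.id not in names:
                    names.append(node.id)
    return names


def find_loop_carries(source):
    """Names bound before a top-level `for` loop and rebound inside its body."""
    func = ast.parse(source).body[0]
    defined = {arg.arg for arg in func.args.args}
    carries = []
    for stmt in func.body:
        if isinstance(stmt, ast.For):
            for name in _stored_names(stmt.body):
                # Only variables that already exist before the loop need to
                # be threaded through it; loop-local temporaries do not.
                if name in defined and name not in carries:
                    carries.append(name)
        else:
            defined.update(_stored_names([stmt]))
    return carries


# The source is only parsed, never executed, so `enigma` need not be importable.
KERNEL_SOURCE = '''
@enigma.kernel
def k(A: enigma.f32, Out: enigma.f32):
    acc = enigma.metal_cast(0, "float")
    for i in enigma.range(0, 64):
        acc = acc + A[i]
    Out[0] = acc
'''

carries = find_loop_carries(KERNEL_SOURCE)
```

For the kernel above, the sketch reports `acc` as the only carry: `i` is the induction variable, and `Out[0] = acc` is a subscript store rather than a rebinding of `Out`.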

enigma.range_constexpr

Unrolls the loop at trace time: the body is traced once per iteration, so N iterations produce N separate copies of its ops:
for j in enigma.range_constexpr(4):
    acc[j] = enigma.fma(a[j], b[j], acc[j])
Equivalent to writing the body 4 times. Use for small fixed-count inner loops where you want full unrolling.
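To make "traced N times as separate ops" concrete, here is a toy tracer in plain Python. It is a sketch under assumptions and not Enigma's tracer: `ToyTracer` and `trace_unrolled` are hypothetical, ops are recorded as tuples, and results are named `%0`, `%1`, and so on purely for readability.

```python
class ToyTracer:
    """Records each call as one op instead of computing a value."""

    def __init__(self):
        self.ops = []

    def fma(self, a, b, c):
        # Every call appends a new op and returns a symbolic result name.
        self.ops.append(("fma", a, b, c))
        return f"%{len(self.ops) - 1}"


def trace_unrolled(n):
    tracer = ToyTracer()
    acc = [f"acc{j}" for j in range(n)]
    # A plain Python loop runs while tracing, so the body is recorded
    # once per iteration: n iterations yield n independent ops.
    for j in range(n):
        acc[j] = tracer.fma(f"a{j}", f"b{j}", acc[j])
    return tracer.ops, acc


ops, acc = trace_unrolled(4)
```

The recorded op list has four `fma` entries and no loop construct, which is the sense in which the unrolled form is equivalent to writing the body four times.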

Predicated memory access

For boundary checks without full if_ branches:
mask = enigma.cmp_ult(tid, enigma.metal_cast(N, "uint"))
val = enigma.load_if(A, tid, mask, default=0.0)
enigma.store_if(C, tid, val, mask)
| Function | Description |
| --- | --- |
| enigma.load_if(buf, index, mask, default=0) | Load if mask is true, else return default |
| enigma.store_if(buf, index, value, mask) | Store only when mask is true |
load_if emits an unconditional load + select. store_if wraps the store in an scf.if.
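The observable result of the two functions can be modeled in plain Python as follows. This sketch captures only the values produced and stored, not the lowering described above; `load_if_reference`, `store_if_reference`, and `guarded_copy` are hypothetical names, and a sequential loop over `tid` stands in for GPU threads.

```python
def load_if_reference(buf, index, mask, default=0):
    """Value seen by the caller: the element if mask is true, else default."""
    return buf[index] if mask else default


def store_if_reference(buf, index, value, mask):
    """Write the element only when mask is true."""
    if mask:
        buf[index] = value


def guarded_copy(A, n_threads):
    # Launch more "threads" than elements, as a padded GPU grid would.
    N = len(A)
    C = [0.0] * N
    for tid in range(n_threads):
        mask = tid < N
        val = load_if_reference(A, tid, mask, default=0.0)
        store_if_reference(C, tid, val, mask)
    return C


result = guarded_copy([1.0, 2.0, 3.0], n_threads=8)
```

Threads with `tid >= N` receive the default from the load and their store is skipped, so `C` ends up as a copy of `A` regardless of how far the grid overshoots.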