Skip to main content
Enigma compiles Python kernel functions to native Metal GPU code through a multi-stage pipeline. Understanding this pipeline helps you interpret error messages, inspect intermediate outputs, and reason about performance.

Compilation pipeline

Two entry points

  • @enigma.kernel — defines the GPU compute body. Launch geometry is provided at dispatch time.
  • @enigma.jit — defines a host-side function that performs layout algebra at compile time and launches one or more @enigma.kernel functions. Produces a CompiledKernel with pre-computed grid and block.

Stage details

Tracer (_tracing.py)

When enigma.compile(my_kernel) runs, the Python function body executes once with symbolic IRValue placeholders. Every operation — index arithmetic, loads, stores, math, thread queries — is recorded as an IR node. The tracer never runs the function at GPU dispatch time. This is why native Python control flow (if, for) is not supported in kernel bodies. Use enigma.for_range, enigma.if_, and enigma.while_ instead — these emit region-based IR nodes.

IR

The internal IR is a flat SSA list of typed operations. Each node has an opcode, typed operands, and a result type. You can inspect it at compile time:
compiled = enigma.compile(my_kernel, dump_ir=True)

MLIR dialect

mlir_emitter.py lowers the IR to the Enigma MLIR dialect (Enigma-Dialect repo). The dialect defines 32+ operations in TableGen:
compiled = enigma.compile(my_kernel, dump_mlir=True)

MSL emitter (MSLEmitter.cpp)

The MLIR visitor emits Metal Shading Language. It handles:
  • Metal address spaces (device, threadgroup, constant, thread)
  • Type mapping (enigma.f32float, enigma.bf16bfloat, etc.)
  • Kernel entry signature: buffer parameters become device float* A [[buffer(0)]]

xcrun pipeline

xcrun -sdk macosx metal -c kernel.metal -o kernel.air
xcrun -sdk macosx metallib kernel.air -o kernel.metallib
Enigma invokes these automatically. Set work_dir to inspect the artifacts:
compiled = enigma.compile(my_kernel, keep_metal_source=True, work_dir="./build")

Swift runtime bridge

A compiled Swift dylib wraps the Metal API. MetalRuntime calls it via ctypes. It manages device lifecycle, buffer allocation, kernel dispatch, and timed benchmarking.

Compilation artifacts

AttributeContents
compiled.kernel_nameMetal function name, e.g. enigma_kernel_add
compiled.metallib_pathPath to .metallib
compiled.metallib_bytesRaw .metallib bytes
compiled.metal_sourceGenerated MSL source string
compiled.mlir_sourceEnigma dialect MLIR (when available)

See also