# Inspection and Tracing

> How to inspect every stage of the Enigma pipeline to diagnose compilation and correctness issues.

When something is wrong, inspect the pipeline stage by stage rather than guessing from runtime output. The full pipeline is:

```
Python source → traced IR → MLIR → Metal source → .metallib → GPU output
```

## Enable debug output at compile time

```python
import enigma

# `my_kernel` is the kernel function you want to inspect.
compiled = enigma.compile(
    my_kernel,
    dump_ir=True,           # print traced IR
    dump_mlir=True,         # print MLIR
    keep_metal_source=True, # save .metal source
    work_dir="./build/debug",
)
```

### What to look for in the IR (`dump_ir=True`)

The IR lists every operation the tracer recorded. Verify:

* Loads and stores appear in the right order
* Thread index ops (`thread_position_in_grid`, etc.) are present
* Arithmetic ops match your kernel logic
* No unexpected extra ops (these can indicate a scoping issue in `for_range`)

### What to look for in MLIR (`dump_mlir=True`)

The MLIR representation shows the dialect ops before Metal emission. Verify:

* Buffer arguments have the correct types (`!enigma.buffer<f32>`)
* Control-flow regions (`scf.for`, `scf.if`) have the correct structure
* Thread index ops map to the correct `Dimension` attribute (x/y/z)

### Inspecting the Metal source

```python
print(compiled.metal_source)
```

Key things to check:

* Kernel function signature: buffer count and types match kernel parameters
* Index arithmetic: confirm the generated expressions match your intent
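* `[[thread_position_in_grid]]` vs `[[thread_position_in_threadgroup]]`: the attribute matches the index you intended (position in the whole grid vs. position within the threadgroup)
* Barrier placement: a `threadgroup_barrier` appears between each write to threadgroup (shared) memory and the reads that depend on it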
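Some of these checks can be scripted as a first pass before reading the source by hand. The following is a minimal sketch, not part of Enigma: `summarize_metal_source` is a hypothetical helper, and the sample string is hand-written Metal standing in for `compiled.metal_source`. It assumes buffers are bound with `[[buffer(N)]]` attributes in the kernel signature, which may not hold for every codegen strategy.

```python
import re

def summarize_metal_source(source: str) -> dict:
    """Collect a few facts from Metal source text for a quick review."""
    # Buffer parameters bound with [[buffer(N)]] in the kernel signature.
    buffer_slots = [int(n) for n in re.findall(r"\[\[\s*buffer\((\d+)\)\s*\]\]", source)]
    # Thread-index attributes such as [[thread_position_in_grid]].
    index_attrs = re.findall(r"\[\[\s*(thread\w+)\s*\]\]", source)
    return {
        "buffer_slots": buffer_slots,
        "index_attrs": index_attrs,
        "barrier_count": source.count("threadgroup_barrier("),
    }

# Hand-written sample (an assumption, not real Enigma output).
sample = """
kernel void add(device const float* a [[buffer(0)]],
                device const float* b [[buffer(1)]],
                device float* out [[buffer(2)]],
                uint gid [[thread_position_in_grid]]) {
    out[gid] = a[gid] + b[gid];
}
"""
summary = summarize_metal_source(sample)
print(summary)
```

A summary like this only tells you what is present. Whether a barrier sits between the right write and read still has to be confirmed by reading the source.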

## Saving Metal source for diffing

```python
compiled_before = enigma.compile(kernel_v1, keep_metal_source=True, work_dir="./build/v1")
compiled_after  = enigma.compile(kernel_v2, keep_metal_source=True, work_dir="./build/v2")
compiled_before.export_metal("./build/v1/kernel.metal")
compiled_after.export_metal("./build/v2/kernel.metal")
```

Then diff:

```bash
diff ./build/v1/kernel.metal ./build/v2/kernel.metal
```

Diffing the generated source is a quick way to catch codegen regressions: any change in the emitted Metal shows up directly, before you run anything.
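If you would rather compare from Python, for example inside a test script, the standard-library `difflib` module can produce the same kind of unified diff from two source strings. This is a sketch under assumptions: `diff_metal` is a hypothetical helper, and the two one-line strings stand in for `compiled_before.metal_source` and `compiled_after.metal_source`.

```python
import difflib

def diff_metal(before: str, after: str) -> str:
    """Return a unified diff of two Metal source strings ('' if identical)."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="v1/kernel.metal",
        tofile="v2/kernel.metal",
        lineterm="",
    )
    return "\n".join(lines)

# Stand-ins for the two generated sources (hand-written for illustration).
v1 = "out[gid] = a[gid] + b[gid];"
v2 = "out[gid] = a[gid] * b[gid];"
report = diff_metal(v1, v2)
print(report)
```

An empty result means the two versions generated identical source, which makes this easy to assert on in a regression test.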

## Minimal reproducible inputs

When diagnosing a correctness issue:

1. Use the smallest input that reproduces the problem (e.g. `n=32`)
2. Use `np.arange(n).astype(np.float32)` as input so values are predictable
3. Check each element individually with `out[i]` rather than aggregate assertions

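```python
import numpy as np

# `rt` (the runtime) and `compiled` come from your earlier setup.
n = 32
a = np.arange(n, dtype=np.float32)
b = np.ones(n, dtype=np.float32)

raw = rt.execute(compiled, [a, b], n * 4, grid=(n, 1, 1), threads=(32, 1, 1))
out = np.frombuffer(raw, dtype=np.float32)

# Report every element that is off, including NaN results
# (a plain `abs(diff) > tol` test is False for NaN and would hide them).
for i in range(n):
    expected = a[i] + b[i]
    if not abs(out[i] - expected) <= 1e-5:
        print(f"[{i}] got {out[i]:.4f}, expected {expected:.4f}")
```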
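The per-element check can be wrapped in a small reusable function so that every kernel test reports failing indices the same way. This is a minimal sketch: `find_mismatches` is a hypothetical helper, not an Enigma API, and the data is synthetic, with one wrong element planted on purpose. It uses `math.isclose`, which treats NaN as a mismatch.

```python
import math

def find_mismatches(out, expected, atol=1e-5):
    """Return (index, got, expected) for every element outside the tolerance."""
    if len(out) != len(expected):
        raise ValueError(f"length mismatch: {len(out)} vs {len(expected)}")
    return [
        (i, float(got), float(want))
        for i, (got, want) in enumerate(zip(out, expected))
        # isclose is False for NaN, so NaN outputs are always reported.
        if not math.isclose(got, want, rel_tol=0.0, abs_tol=atol)
    ]

# Synthetic data: expected is arange(32) + 1, with one wrong element planted.
expected = [float(i) + 1.0 for i in range(32)]
out = list(expected)
out[5] = 0.0

for i, got, want in find_mismatches(out, expected):
    print(f"[{i}] got {got:.4f}, expected {want:.4f}")
```

The function iterates element by element, so it accepts plain lists as well as one-dimensional NumPy arrays such as the `out` produced by `np.frombuffer`.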

## Debugging layout algebra

When a tiling produces unexpected output, print shape and stride at each transform:

```python
import enigma

thr = enigma.make_ordered_layout((4, 64), order=(1, 0))
val = enigma.make_ordered_layout((4, 4), order=(1, 0))
tiler_mn, tv_layout = enigma.make_layout_tv(thr, val)

print("thr layout:", thr)
print("val layout:", val)
print("tiler_mn:  ", tiler_mn)
print("tv_layout: ", tv_layout)
print("threads:   ", enigma.size(tv_layout, mode=[0]))
print("vals/thr:  ", enigma.size(tv_layout, mode=[1]))
```

Confirm:

* `threads` × `vals/thr` = total elements in one tile
* Tiler dimensions fit within the tensor dimensions
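Both conditions are plain arithmetic, so they can be asserted rather than checked by eye. The sketch below uses a hypothetical `check_tile` helper and hand-entered numbers. In practice `threads` and `vals_per_thread` would come from the `enigma.size(...)` calls above, and the tiler shape from `tiler_mn`. How to turn `tiler_mn` into a tuple of ints is not covered here and is left as an assumption.

```python
from math import prod

def check_tile(threads, vals_per_thread, tiler_shape, tensor_shape):
    """Return a list of human-readable problems (empty if the tiling is consistent)."""
    problems = []
    tile_elems = prod(tiler_shape)
    if threads * vals_per_thread != tile_elems:
        problems.append(
            f"threads*vals = {threads * vals_per_thread}, but tile has {tile_elems} elements"
        )
    for mode, (tile, dim) in enumerate(zip(tiler_shape, tensor_shape)):
        if tile > dim:
            problems.append(f"mode {mode}: tile extent {tile} exceeds tensor extent {dim}")
    return problems

# Illustrative values: 256 threads x 16 values each covering a 16 x 256 tile.
print(check_tile(256, 16, (16, 256), (1024, 1024)))
# Same tile against a tensor that is too small along mode 0.
print(check_tile(256, 16, (16, 256), (8, 1024)))
```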

## Fast debugging loop

1. Lock a minimal reproducible input (small `n`, predictable values)
2. Compile with `dump_ir=True, dump_mlir=True, keep_metal_source=True`
3. Read the Metal source; the bug is usually visible there
4. Make one change at a time and re-check the Metal source before dispatching
5. Dispatch only once the Metal source looks correct
