Atomic Operations

Enigma provides a full set of atomic operations for safe concurrent access to shared device memory. All atomic ops map directly to Metal’s atomic_* functions.

Memory ordering

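Every atomic operation accepts a memory_order argument, except compare-and-swap, which takes separate success_order and failure_order arguments. The supported orders are: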

Order	Description
`"relaxed"`	No ordering guarantees beyond atomicity
`"acquire"`	Once this operation reads a value written by a release, all stores made before that release are visible to subsequent loads
`"release"`	All prior stores become visible to any thread that acquires the value written by this operation
`"acq_rel"`	Combines acquire and release semantics

The default is "relaxed" unless otherwise specified.

Atomic load and store

val = enigma.atomic_load(ptr, index, memory_order="acquire")
enigma.atomic_store(ptr, index, value, memory_order="release")

Atomic exchange

Atomically stores value at ptr[index] and returns the previous value:

old = enigma.atomic_exchange(ptr, index, value, memory_order="acq_rel")
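
The return-the-old-value contract can be illustrated with a minimal host-side model in plain Python. This is only a sketch: the AtomicFlag class and try_claim function are hypothetical names, not part of Enigma, and a lock stands in for hardware atomicity. Because exchange returns the previous value, exactly one of several competing threads observes 0 and wins the claim.

```python
import threading


class AtomicFlag:
    """Hypothetical host-side model of one atomic cell (not an Enigma API)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def exchange(self, value):
        # Store the new value and return the previous one as a single step.
        with self._lock:
            old = self._value
            self._value = value
            return old


flag = AtomicFlag(0)
winners = []
winners_lock = threading.Lock()


def try_claim(tid):
    # Only the thread that sees the old value 0 has claimed the flag.
    if flag.exchange(1) == 0:
        with winners_lock:
            winners.append(tid)


threads = [threading.Thread(target=try_claim, args=(i,)) for i in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads finish, winners holds exactly one thread id, whichever thread reached the exchange first.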

Atomic fetch-and-modify (RMW)

Each operation atomically reads the current value, applies the operation, stores the result, and returns the old value.

Function	Operation
`enigma.atomic_fetch_add(ptr, idx, val, order)`	`*ptr += val`
`enigma.atomic_fetch_sub(ptr, idx, val, order)`	`*ptr -= val`
`enigma.atomic_fetch_min(ptr, idx, val, order)`	`*ptr = min(*ptr, val)`
`enigma.atomic_fetch_max(ptr, idx, val, order)`	`*ptr = max(*ptr, val)`
`enigma.atomic_fetch_and(ptr, idx, val, order)`	`*ptr &= val`
`enigma.atomic_fetch_or(ptr, idx, val, order)`	`*ptr |= val`
`enigma.atomic_fetch_xor(ptr, idx, val, order)`	`*ptr ^= val`
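
The read, modify, store, return-old sequence shared by every function in the table can be sketched as a host-side reference model in plain Python. The AtomicArray class and its fetch_op method are hypothetical and exist only for illustration; they are not Enigma APIs, and a lock stands in for hardware atomicity.

```python
import operator
import threading


class AtomicArray:
    """Hypothetical host-side model of an atomic buffer (not an Enigma API)."""

    def __init__(self, size, init=0):
        self._data = [init] * size
        self._lock = threading.Lock()

    def fetch_op(self, index, value, op):
        # Read, apply op, and write back under one lock; return the OLD value.
        with self._lock:
            old = self._data[index]
            self._data[index] = op(old, value)
            return old

    def load(self, index):
        with self._lock:
            return self._data[index]


buf = AtomicArray(1, init=5)
old_add = buf.fetch_op(0, 3, operator.add)        # buffer becomes 8, returns 5
old_max = buf.fetch_op(0, 2, max)                 # buffer stays 8, returns 8
old_and = buf.fetch_op(0, 0b1100, operator.and_)  # 8 & 12 == 8, returns 8
```

Note that each call returns the value from before the update, which is why old_add is 5 even though the buffer holds 8 afterwards.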

Example: global counter

@enigma.kernel
def count_positives(A: enigma.f32, Counter: enigma.i32):
    tid = enigma.thread_position_in_grid
    val = A[tid]
    with enigma.if_(enigma.cmp_gt(val, enigma.f32(0.0))):
        enigma.atomic_fetch_add(Counter, 0, 1, memory_order="relaxed")

Example: parallel histogram

@enigma.kernel
def histogram(A: enigma.u32, Hist: enigma.u32):
    tid = enigma.thread_position_in_grid
    bin_idx = A[tid] % 256
    enigma.atomic_fetch_add(Hist, bin_idx, 1, memory_order="relaxed")
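
When validating the kernel's output, a serial host-side reference is useful. The following is a minimal sketch in plain Python; histogram_reference is a hypothetical helper, not part of Enigma, and it assumes Hist was zero-initialized before launch.

```python
def histogram_reference(values, num_bins=256):
    """Expected contents of Hist after the kernel, computed serially."""
    hist = [0] * num_bins
    for v in values:
        # Same binning rule as the kernel: value modulo the number of bins.
        hist[v % num_bins] += 1
    return hist


values = [0, 1, 257, 511, 255, 256]
hist = histogram_reference(values)
```

Every input lands in exactly one bin, so the bin counts always sum to the number of inputs, which is a cheap sanity check on the device result as well.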

Compare-and-swap (CAS)

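atomic_compare_exchange_weak compares the value at ptr[index] with expected and, if they are equal, replaces it with desired. It returns True if the swap succeeded. Because this is the weak form, it may fail spuriously even when the values match, so it is normally called in a retry loop: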

swapped = enigma.atomic_compare_exchange_weak(
    ptr, index,
    expected,
    desired,
    success_order="acq_rel",
    failure_order="acquire",
)
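
The usual way to use CAS is a retry loop: load the current value, compute the replacement, and attempt the swap until it succeeds. The sketch below models this on the host in plain Python to build an atomic multiply, an operation the fetch-and-modify table does not provide. AtomicCell and atomic_multiply are hypothetical names, not Enigma APIs, and the lock-based model never fails spuriously, although the loop would tolerate that.

```python
import threading


class AtomicCell:
    """Hypothetical host-side model of one atomic cell (not an Enigma API)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        # Swap only if the cell still holds the expected value.
        with self._lock:
            if self._value == expected:
                self._value = desired
                return True
            return False


def atomic_multiply(cell, factor):
    # Retry until no other thread changed the cell between load and swap.
    while True:
        old = cell.load()
        if cell.compare_exchange(old, old * factor):
            return old


cell = AtomicCell(1)
threads = [threading.Thread(target=atomic_multiply, args=(cell, 2)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = cell.load()
```

Eight threads each double the cell once, so the final value is 2 to the power 8 regardless of how the threads interleave.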

Example: lock-free maximum

@enigma.kernel
def atomic_max_f32(A: enigma.f32, Max: enigma.u32):
    tid = enigma.thread_position_in_grid
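    # Reinterpret the float as u32. Bit patterns of non-negative IEEE 754 floats
    # sort in the same order as the floats, so this is valid only for inputs >= 0.
    val_bits = enigma.as_type(A[tid], enigma.u32)
    # CAS loop to update the running maximum using the bit representation.
    # A failed swap means another thread changed Max, so reload and retry.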
    with enigma.while_(lambda: ...):
        old_bits = enigma.atomic_load(Max, 0, memory_order="relaxed")
        with enigma.if_(enigma.cmp_le(val_bits, old_bits)):
            enigma.break_()
        swapped = enigma.atomic_compare_exchange_weak(
            Max, 0, old_bits, val_bits,
            success_order="release", failure_order="relaxed",
        )
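
The kernel compares u32 bit patterns rather than floats. A quick host-side check in plain Python shows why that works for non-negative inputs and why it fails for negative ones. The f32_bits helper is a hypothetical name introduced here for illustration; it uses only the standard struct module.

```python
import struct


def f32_bits(x):
    """Reinterpret a float32 value as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


# For non-negative floats, integer order of the bits matches float order.
non_negative = [0.0, 0.5, 1.0, 3.25, 1e10]
bits = [f32_bits(x) for x in non_negative]
order_preserved = bits == sorted(bits)

# The sign bit is the top bit, so any negative float looks larger than
# every positive float when compared as an unsigned integer.
negative_breaks = f32_bits(-1.0) > f32_bits(1.0)
```

If the input can contain negative values, the bit patterns would need to be remapped before comparison, or the data shifted into the non-negative range first.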

Ordering guidelines

Use case	Recommended order
Independent counter increments	`"relaxed"`
Producer side of a flag	`"release"`
Consumer side of a flag	`"acquire"`
RMW used as a mutex	`"acq_rel"`

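Using "relaxed" where ordering matters is a common source of subtle bugs. When in doubt, choose the stronger ordering: "acquire" for loads, "release" for stores, and "acq_rel" for read-modify-write operations.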
