Enigma provides a full set of atomic operations for safe concurrent access to shared device memory. All atomic ops map directly to Metal’s atomic_* functions.

Memory ordering

All atomic operations accept a memory_order argument:
Order        Description
"relaxed"    No ordering guarantees beyond atomicity
"acquire"    All subsequent loads observe stores released before this acquire
"release"    All prior stores become visible to threads that acquire this
"acq_rel"    Combines acquire and release semantics
The default is "relaxed" unless otherwise specified.

Atomic load and store

val = enigma.atomic_load(ptr, index, memory_order="acquire")
enigma.atomic_store(ptr, index, value, memory_order="release")

Atomic exchange

Atomically stores value and returns the value that was there before:
old = enigma.atomic_exchange(ptr, index, value, memory_order="acq_rel")
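
One common use of exchange is a one-shot claim: every thread swaps 1 into a flag, and only the thread that gets 0 back was first. The sketch below models that idea in plain Python on the host. AtomicFlag and claim_once are hypothetical names for illustration, not Enigma APIs, and a lock stands in for hardware atomicity.

```python
import threading


class AtomicFlag:
    """Hypothetical host-side model of one atomic integer (not an Enigma type)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def exchange(self, value):
        # Store the new value and hand back the previous one in a single step.
        with self._lock:
            old = self._value
            self._value = value
            return old


def claim_once(num_threads=8):
    """Return the ids of the threads that saw the flag still unset."""
    flag = AtomicFlag()
    winners = []

    def worker(thread_id):
        if flag.exchange(1) == 0:
            winners.append(thread_id)  # only the first thread to swap gets here

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


winners = claim_once()
```

Whichever thread runs first, exactly one of them observes the old value 0, so winners always holds a single id.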

Atomic fetch-and-modify (RMW)

Each operation atomically reads the current value, applies the operation, stores the result, and returns the old value.
Function                                         Operation
enigma.atomic_fetch_add(ptr, idx, val, order)    ptr[idx] += val
enigma.atomic_fetch_sub(ptr, idx, val, order)    ptr[idx] -= val
enigma.atomic_fetch_min(ptr, idx, val, order)    ptr[idx] = min(ptr[idx], val)
enigma.atomic_fetch_max(ptr, idx, val, order)    ptr[idx] = max(ptr[idx], val)
enigma.atomic_fetch_and(ptr, idx, val, order)    ptr[idx] &= val
enigma.atomic_fetch_or(ptr, idx, val, order)     ptr[idx] |= val
enigma.atomic_fetch_xor(ptr, idx, val, order)    ptr[idx] ^= val
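
The detail that is easiest to get wrong is the return value: every function in the table above hands back the value from before the update, not after. A small host-side model in plain Python makes this concrete. AtomicArray is a hypothetical helper for illustration only, it is not part of Enigma, and a single lock stands in for per-element hardware atomicity.

```python
import threading


class AtomicArray:
    """Hypothetical model of a buffer with atomic read-modify-write ops."""

    def __init__(self, size, fill=0):
        self._data = [fill] * size
        self._lock = threading.Lock()

    def _rmw(self, idx, fn):
        # Read, modify, and write as one indivisible step; return the OLD value.
        with self._lock:
            old = self._data[idx]
            self._data[idx] = fn(old)
            return old

    def fetch_add(self, idx, val):
        return self._rmw(idx, lambda cur: cur + val)

    def fetch_max(self, idx, val):
        return self._rmw(idx, lambda cur: max(cur, val))

    def fetch_or(self, idx, val):
        return self._rmw(idx, lambda cur: cur | val)

    def load(self, idx):
        with self._lock:
            return self._data[idx]


buf = AtomicArray(1)
first = buf.fetch_add(0, 5)   # returns 0, element becomes 5
second = buf.fetch_max(0, 3)  # returns 5, element stays 5
third = buf.fetch_or(0, 2)    # returns 5, element becomes 7
final = buf.load(0)
```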

Example: global counter

@enigma.kernel
def count_positives(A: enigma.f32, Counter: enigma.i32):
    tid = enigma.thread_position_in_grid
    val = A[tid]
    with enigma.if_(enigma.cmp_gt(val, enigma.f32(0.0))):
        enigma.atomic_fetch_add(Counter, 0, 1, memory_order="relaxed")

Example: parallel histogram

@enigma.kernel
def histogram(A: enigma.u32, Hist: enigma.u32):
    tid = enigma.thread_position_in_grid
    bin_idx = A[tid] % 256
    enigma.atomic_fetch_add(Hist, bin_idx, 1, memory_order="relaxed")
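
Relaxed ordering is enough for this kernel because increments commute: the final bin counts do not depend on the order in which threads land their updates. A sequential host-side reference shows that property and gives something to compare kernel output against. reference_histogram is a hypothetical helper written in plain Python, and the input data is generated only for the demonstration.

```python
import random

NUM_BINS = 256  # matches the `% 256` in the kernel above


def reference_histogram(values, num_bins=NUM_BINS):
    """Sequential reference: one increment per input element."""
    hist = [0] * num_bins
    for v in values:
        hist[v % num_bins] += 1
    return hist


rng = random.Random(0)  # fixed seed so the run is reproducible
data = [rng.randrange(0, 10_000) for _ in range(1_000)]

# Visit the same elements in a different order, as GPU threads might.
shuffled = data[:]
rng.shuffle(shuffled)

in_order = reference_histogram(data)
out_of_order = reference_histogram(shuffled)
```

Both runs produce the same histogram, which is why no ordering between the individual increments is needed.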

Compare-and-swap (CAS)

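atomic_compare_exchange_weak compares the value at ptr[index] with expected and, if they are equal, atomically replaces it with desired. It returns True if the swap succeeded. As with any weak compare-exchange, it may fail spuriously even when the values match, so it is normally called in a retry loop. A single call looks like this: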
swapped = enigma.atomic_compare_exchange_weak(
    ptr, index,
    expected,
    desired,
    success_order="acq_rel",
    failure_order="acquire",
)
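
CAS is most useful for building read-modify-write operations that have no dedicated function. The sketch below models the usual retry loop in plain Python with an atomic multiply. AtomicCell and fetch_multiply are hypothetical names, not Enigma APIs. The lock-based model never fails spuriously, but the loop is written so that it would also tolerate the spurious failures a weak CAS allows.

```python
import threading


class AtomicCell:
    """Hypothetical host-side model of one atomic value (not an Enigma type)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        # Store `desired` only if the current value equals `expected`.
        with self._lock:
            if self._value != expected:
                return False
            self._value = desired
            return True


def fetch_multiply(cell, factor):
    """Atomically multiply the cell by `factor`; return the old value."""
    while True:
        old = cell.load()
        # If another thread changed the cell since the load, retry with a fresh value.
        if cell.compare_exchange(old, old * factor):
            return old


cell = AtomicCell(1)
threads = [threading.Thread(target=fetch_multiply, args=(cell, f)) for f in (2, 3, 5, 7)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = cell.load()
```

Every factor is applied exactly once regardless of how the threads interleave, so the cell ends at 2 * 3 * 5 * 7.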

Example: lock-free maximum

@enigma.kernel
def atomic_max_f32(A: enigma.f32, Max: enigma.u32):
    tid = enigma.thread_position_in_grid
    val_bits = enigma.as_type(A[tid], enigma.u32)
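    # CAS loop to update the running maximum using the bit representation.
    # Unsigned comparison of the bits matches float ordering only for
    # non-negative, non-NaN values, so A must not contain negatives or NaNs.
    with enigma.while_(lambda: ...):
        old_bits = enigma.atomic_load(Max, 0, memory_order="relaxed")
        # Stop once the stored maximum is already at least as large.
        with enigma.if_(enigma.cmp_le(val_bits, old_bits)):
            enigma.break_()
        # On failure, the next iteration reloads and retries. After a success,
        # the next load observes a value >= val_bits and the break above
        # ends the loop.
        swapped = enigma.atomic_compare_exchange_weak(
            Max, 0, old_bits, val_bits,
            success_order="release", failure_order="relaxed",
        )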
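
The kernel above compares float bit patterns as unsigned integers. That agrees with float ordering for non-negative values but not for negative ones, because the sign bit makes every negative float look larger than every positive one. The host-side Python check below demonstrates this with the standard struct module. ordered_key is a commonly used remapping that is assumed here for illustration and is not described in the source; using it in the kernel would mean Max stores keys rather than raw bits.

```python
import struct


def f32_bits(x):
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def ordered_key(x):
    """Map a float to a u32 whose unsigned order matches the float order.

    Negative values have every bit flipped; non-negative values get the
    sign bit set. NaNs are not handled.
    """
    bits = f32_bits(x)
    if bits & 0x80000000:
        return bits ^ 0xFFFFFFFF
    return bits | 0x80000000


samples = [1.0, -2.0, 0.25, 2.5, -0.5, 0.0]

raw_ok = f32_bits(1.0) < f32_bits(2.5)       # non-negative: bit order matches
raw_broken = f32_bits(-1.0) > f32_bits(2.5)  # negative sorts above positives
by_key = sorted(samples, key=ordered_key)    # remapped keys restore the order
```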

Ordering guidelines

Use case                          Recommended order
Independent counter increments    "relaxed"
Producer side of a flag           "release"
Consumer side of a flag           "acquire"
RMW used as a mutex               "acq_rel"
Using "relaxed" where ordering matters is a common source of subtle bugs. When in doubt, use "acq_rel".
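
The producer and consumer rows describe a handoff: write the data, publish a flag with "release", and on the other side wait for the flag with "acquire" before reading the data. The sketch below models only the shape of that protocol in plain Python. Python threads do not expose memory orders, so a lock-guarded AtomicCell (a hypothetical helper, not an Enigma type) stands in for the release store and acquire load.

```python
import threading
import time


class AtomicCell:
    """Hypothetical host-side model of one atomic flag."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def store(self, value):  # stands in for a "release" store
        with self._lock:
            self._value = value

    def load(self):  # stands in for an "acquire" load
        with self._lock:
            return self._value


def run_handoff(payload_value=42):
    data = [0]            # ordinary, non-atomic payload
    ready = AtomicCell(0)
    received = []

    def producer():
        data[0] = payload_value  # 1. write the payload
        ready.store(1)           # 2. publish it ("release")

    def consumer():
        while ready.load() == 0:  # 3. wait for the flag ("acquire")
            time.sleep(0)         # yield so the producer can run
        received.append(data[0])  # 4. the payload is now safe to read

    c = threading.Thread(target=consumer)
    p = threading.Thread(target=producer)
    c.start()
    p.start()
    p.join()
    c.join()
    return received[0]


handoff_result = run_handoff()
```

If both sides used "relaxed" in a real kernel, step 4 could read the payload before step 1 became visible, which is the kind of subtle bug the note above warns about.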