v0.1.0 — Initial Release
First public release of Enigma DSL for Apple Metal.
New features
Layout Algebra
CuTe-style primitives: Layout, coalesce, complement, zipped_divide, recast_layout, make_layout_tv, tensor_zipped_divide, tensor_composition.
Compiler Pipeline
Trace pipeline for @enigma.kernel and @enigma.jit. MLIR + Metal emission via xcrun metal / xcrun metallib. Vectorized codegen with vec_width (float4 paths).
Metal Runtime
MetalRuntime dispatch via Swift dylib bridge. PreparedKernel for pre-allocated repeated dispatch. dispatch_timed() for benchmarking.
Device Capabilities
DeviceCapabilities reporting, including M3+ feature detection. Scalar parameter packing via enigma.Scalar(...).
DSL surface
- Full set of thread, threadgroup, and SIMD index query ops
- Extended dtypes: bf16, i64, u64, i8, u8, i16, u16, b1
- Arithmetic, float math, integer math, and bit manipulation ops
- SIMD group and quad group reductions, prefix scans, and shuffle ops
- Atomic operations with full memory ordering: relaxed, acquire, release, acq_rel
- Threadgroup shared memory via threadgroup_alloc and barriers
- Simdgroup matrix ops: simdgroup_matrix_load, simdgroup_multiply_accumulate, simdgroup_matrix_store
- Control-flow tracing: for_range, if_, while_
- Predicated load/store: load_if, store_if
- Register/pipeline primitives: register_tensor, copy, pipeline
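For readers new to CuTe-style layouts, the sketch below illustrates what the Layout and coalesce primitives listed under Layout Algebra are about: a layout maps a logical coordinate to a linear offset through a shape/stride pair, and coalescing merges adjacent modes without changing that mapping. This is plain Python written for illustration only. FlatLayout and this coalesce function are hypothetical stand-ins, not Enigma's API, and nested (hierarchical) shapes are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatLayout:
    """Hypothetical stand-in (not Enigma's Layout): a flat shape/stride pair."""
    shape: tuple
    stride: tuple

    def size(self):
        # Number of logical elements covered by the layout.
        n = 1
        for s in self.shape:
            n *= s
        return n

    def __call__(self, *coord):
        # Coordinate -> offset: the dot product of coordinate and stride.
        return sum(c * d for c, d in zip(coord, self.stride))

    def index(self, i):
        # 1-D index -> offset, unpacking the index with the leftmost
        # mode varying fastest (the CuTe convention).
        offset = 0
        for s, d in zip(self.shape, self.stride):
            offset += (i % s) * d
            i //= s
        return offset


def coalesce(layout):
    """Merge adjacent modes when the 1-D index -> offset map is unchanged."""
    shape, stride = [], []
    for s, d in zip(layout.shape, layout.stride):
        if s == 1:
            continue  # size-1 modes never contribute to the offset
        if shape and shape[-1] * stride[-1] == d:
            shape[-1] *= s  # previous mode runs contiguously into this one
        else:
            shape.append(s)
            stride.append(d)
    if not shape:
        shape, stride = [1], [0]
    return FlatLayout(tuple(shape), tuple(stride))


# A 4x8 column-major layout is one contiguous run of 32 elements.
col_major = FlatLayout((4, 8), (1, 4))
assert col_major(2, 3) == 14
assert coalesce(col_major) == FlatLayout((32,), (1,))

# A 4x8 row-major layout cannot be merged under this ordering.
row_major = FlatLayout((4, 8), (8, 1))
assert row_major(2, 3) == 19
assert coalesce(row_major) == row_major
```

The merge condition (previous shape times previous stride equals the next stride) is what keeps the coalesced layout equivalent to the original when both are indexed with a single 1-D index.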
Export
- CompiledKernel.export_metal(): export generated MSL source
- keep_metal_source=True: retain .metal artifact after compilation
Testing
- Portable CI: tracing, layout algebra, and compile surfaces under tests/portable
- Metal runtime execution tests under tests/metal
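One way such a split can be enforced inside a single test run is to skip device-dependent tests on hosts without Metal. The sketch below uses only the standard-library unittest module and is an assumption about how gating could look, not a description of this project's test harness. The names metal_available, PortableLayoutTest, MetalExecutionTest, and run_suite are hypothetical, and "running on macOS" is used as a rough proxy for "a Metal device is usable".

```python
import sys
import unittest


def metal_available() -> bool:
    # Assumption: macOS is treated as a proxy for a usable Metal device.
    return sys.platform == "darwin"


class PortableLayoutTest(unittest.TestCase):
    """Pure-Python check that can run on any CI host."""

    def test_row_major_offset(self):
        stride = (8, 1)
        # Offset of coordinate (2, 3) in a 4x8 row-major layout.
        self.assertEqual(2 * stride[0] + 3 * stride[1], 19)


@unittest.skipUnless(metal_available(), "requires macOS with Metal")
class MetalExecutionTest(unittest.TestCase):
    """Placeholder for tests that would dispatch a kernel on the GPU."""

    def test_placeholder(self):
        # A real test would compile and run a kernel here.
        self.assertTrue(metal_available())


def run_suite():
    # Build and run both test classes; skipped tests do not count as failures.
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(PortableLayoutTest),
        loader.loadTestsFromTestCase(MetalExecutionTest),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() on a non-macOS host reports the Metal test as skipped while the portable test still runs.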
