zig source #1
Source code
const Float4 = @Vector(4, f32);

/// Scalar dot product, as a baseline
export fn naiveScalarDot(vecA: [4]f32, vecB: [4]f32) f32 {
    return vecA[0] * vecB[0] + vecA[1] * vecB[1] + vecA[2] * vecB[2] + vecA[3] * vecB[3];
}

/// Naive SSE Dot function. Returns a scalar, which would need to
/// be `@splat`ted to be usable with other vector math. Named
/// accessors may encourage this sort of code.
export fn naiveSSEDot(vecA: Float4, vecB: Float4) f32 {
    const ab = vecA * vecB;
    return ab[0] + ab[1] + ab[2] + ab[3];
}

/// A SSE dot function that is a little better. Returns a vector
/// with all four channels set to the dot product of the two inputs.
/// Avoids mixing scalar and vector code.
export fn tolerableSSEDot(vecA: Float4, vecB: Float4) Float4 {
    const Shuf = @Vector(4, i32);
    const ab = vecA * vecB;
    const abRot90 = @shuffle(f32, ab, ab, Shuf{ 1, 2, 3, 0 });
    const abRot180 = @shuffle(f32, ab, ab, Shuf{ 2, 3, 0, 1 });
    const abRot270 = @shuffle(f32, ab, ab, Shuf{ 3, 0, 1, 2 });
    return ab + abRot90 + abRot180 + abRot270;
}

/// A SSE dot function that will actually give the advertized 4x speedup
/// Requires x, y, z, w channels to be laid out in parallel, as is
/// usually recommended for SIMD. Note that the individual channels
/// of a SIMD vector do not have individual meanings, and thus do not
/// need names.
export fn goodSSEDot(
    aX: Float4,
    aY: Float4,
    aZ: Float4,
    aW: Float4,
    bX: Float4,
    bY: Float4,
    bZ: Float4,
    bW: Float4,
) Float4 {
    return aX * bX + aY * bY + aZ * bZ + aW * bW;
}

// Misc. vector operations follow to show generated code

/// prevent optimization of data pointed to by p
inline fn escape(p: var) void {
    asm volatile("" : : [g]""(p) : "memory");
}

export fn miscChannelAccess() Float4 {
    var x: Float4 = @splat(4, @as(f32, 1));
    escape(&x); // force x to be saved and loaded from memory
    x = x; // godbolt will attribute the load to this statement
    x[3] = 1;
    const y = x[1];
    escape(&y); // force y to be stored to memory and not optimized
    return x;
}

export fn getElementDynamic(vec: Float4, index: u32) f32 {
    return vec[index];
}

export fn getElement0(vec: Float4) f32 {
    return vec[0];
}

export fn getElement1(vec: Float4) f32 {
    return vec[1];
}

export fn getElement2(vec: Float4) f32 {
    return vec[2];
}

export fn getElement3(vec: Float4) f32 {
    return vec[3];
}

export fn setElementDynamic(vec: Float4, index: u32, value: f32) Float4 {
    var mutVec = vec;
    mutVec[index] = value;
    return mutVec;
}

export fn setElement0(vec: Float4, value: f32) Float4 {
    var mutVec = vec;
    mutVec[0] = value;
    return mutVec;
}

export fn setElement1(vec: Float4, value: f32) Float4 {
    var mutVec = vec;
    mutVec[1] = value;
    return mutVec;
}

export fn setElement2(vec: Float4, value: f32) Float4 {
    var mutVec = vec;
    mutVec[2] = value;
    return mutVec;
}

export fn setElement3(vec: Float4, value: f32) Float4 {
    var mutVec = vec;
    mutVec[3] = value;
    return mutVec;
}