llvm source #1
Source code
declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

; Basic scenario: we *really* want SROA to remove the allocas
define i28 @test1(i28 %v) {
  %tmp = alloca i28
  %ret = alloca i28
  store i28 %v, ptr %tmp
  ; LLVM-18 no longer replaces this 4-byte memcpy by an i28 load/store pair, blocking SROA
  call void @llvm.memcpy.p0.p0.i64(ptr %ret, ptr %tmp, i64 4, i1 false)
  %r = load i28, ptr %ret
  ret i28 %r
}

define i28 @test2(i28 %v) {
  ; suggestion: SROA could promote the alloca access from i28 to i32
  %tmp = alloca i32
  %ret = alloca i32
  %v.prom = zext i28 %v to i32
  store i32 %v.prom, ptr %tmp
  ; always ok to replace this 4-byte memcpy by an i32 load/store pair
  call void @llvm.memcpy.p0.p0.i64(ptr %ret, ptr %tmp, i64 4, i1 false)
  %r.prom = load i32, ptr %ret
  %r = trunc i32 %r.prom to i28
  ; after SROA removes the allocas, the remaining zext/trunc cancel
  ret i28 %r
}

; Bug scenario with union access: SROA should not produce wrong code
define i32 @test3(i1 %c, ptr %p) {
  %tmp = alloca i28
  ; LLVM-17 miscompiles this by replacing the 4-byte memcpy by an i28 load/store pair
  call void @llvm.memcpy.p0.p0.i64(ptr %tmp, ptr %p, i64 4, i1 false)
  %load1 = load i28, ptr %tmp
  %v1 = zext i28 %load1 to i32
  ; we access the 4th byte from the region, so this needs the complete 32 bits
  %p1 = getelementptr i8, ptr %tmp, i64 3
  %load2 = load i8, ptr %p1
  %v2 = zext i8 %load2 to i32
  %v3 = select i1 %c, i32 %v1, i32 %v2
  ret i32 %v3
}

define i32 @test4(i1 %c, ptr %p) {
  ; suggestion: again, SROA could promote the alloca access from i28 to i32
  ; the i8 access triggers SROA to replace the i32 access by an i24 complemented by the i8?
  ; in any case, it does not seem wrong
  %tmp = alloca i32
  call void @llvm.memcpy.p0.p0.i64(ptr %tmp, ptr %p, i64 4, i1 false)
  %load1.prom = load i32, ptr %tmp
  %load1 = trunc i32 %load1.prom to i28
  %v1 = zext i28 %load1 to i32
  %p1 = getelementptr i8, ptr %tmp, i64 3
  %load2 = load i8, ptr %p1
  %v2 = zext i8 %load2 to i32
  %v3 = select i1 %c, i32 %v1, i32 %v2
  ret i32 %v3
}