#ifndef AWS_CHECKSUMS_PRIVATE_CPUID_H
#define AWS_CHECKSUMS_PRIVATE_CPUID_H
/*
 * Copyright 2010-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */

#include <stdint.h>

/***
 * Runs cpuid and fills in capabilities for the current cpu architecture.
 * Returns non-zero on success, zero on failure. If the operation was successful,
 * cpuid will be set with the bits from the cpuid call; otherwise they will be untouched.
 **/
int aws_checksums_do_cpu_id(int32_t *cpuid);

/** Returns non-zero if the CPU supports the PCLMULQDQ instruction. */
int aws_checksums_is_clmul_present(void);

/** Returns non-zero if the CPU supports SSE4.1 instructions. */
int aws_checksums_is_sse41_present(void);

/** Returns non-zero if the CPU supports SSE4.2 instructions (i.e. CRC32). */
int aws_checksums_is_sse42_present(void);

#endif /* AWS_CHECKSUMS_PRIVATE_CPUID_H */

#ifndef AWS_CHECKSUMS_PRIVATE_CRC_PRIV_H
#define AWS_CHECKSUMS_PRIVATE_CRC_PRIV_H

#define AWS_CRC32_SIZE_BYTES 4

#ifndef AWS_CHECKSUMS_EXPORTS_H
#define AWS_CHECKSUMS_EXPORTS_H

#if defined(USE_WINDOWS_DLL_SEMANTICS) || defined(WIN32)
#    ifdef USE_IMPORT_EXPORT
#        ifdef AWS_CHECKSUMS_EXPORTS
#            define AWS_CHECKSUMS_API __declspec(dllexport)
#        else
#            define AWS_CHECKSUMS_API __declspec(dllimport)
#        endif /* AWS_CHECKSUMS_EXPORTS */
#    else
#        define AWS_CHECKSUMS_API
#    endif /* USE_IMPORT_EXPORT */
#else /* defined (USE_WINDOWS_DLL_SEMANTICS) || defined (WIN32) */
#    define AWS_CHECKSUMS_API
#endif /* defined (USE_WINDOWS_DLL_SEMANTICS) || defined (WIN32) */

#endif /* AWS_CHECKSUMS_EXPORTS_H */

#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Computes CRC32 (Ethernet, gzip, et al.) using a (slow) reference implementation. */
AWS_CHECKSUMS_API uint32_t aws_checksums_crc32_sw(const uint8_t *input, int length, uint32_t previousCrc32);

/* Computes the Castagnoli CRC32c (iSCSI) using a (slow) reference implementation. */
AWS_CHECKSUMS_API uint32_t aws_checksums_crc32c_sw(const uint8_t *input, int length, uint32_t previousCrc32c);

/* Computes the Castagnoli CRC32c (iSCSI). */
AWS_CHECKSUMS_API uint32_t aws_checksums_crc32c_hw(const uint8_t *data, int length, uint32_t previousCrc32);

#ifdef __cplusplus
}
#endif

#endif /* AWS_CHECKSUMS_PRIVATE_CRC_PRIV_H */

/* This implementation is only for 64 bit arch and (if on GCC, release mode).
 * If using clang, this will run for both debug and release. */
#if defined(__x86_64__) && !(defined(__GNUC__) && defined(DEBUG_BUILD))

#    define LIKELY(x) __builtin_expect((x), 1)
#    define UNLIKELY(x) __builtin_expect((x), 0)

/*
 * Factored out common inline asm for folding crc0,crc1,crc2 stripes in rcx, r11, r10 using
 * the specified Magic Constants K1 and K2.
 * Assumes rcx, r11, r10 contain crc0, crc1, crc2 that need folding
 * Utilizes xmm1, xmm2, xmm3, xmm4 as well as clobbering r8, r9, r11
 * Result is placed in ecx
 */
#    define FOLD_K1K2(NAME, K1, K2)                                        \
        "fold_k1k2_" #NAME "_%=: \n"                                       \
        "mov " #K1 ", %%r8d # Magic K1 constant \n"                        \
        "mov " #K2 ", %%r9d # Magic K2 constant \n"                        \
        "movq %%rcx, %%xmm1 # crc0 into lower dword of xmm1 \n"            \
        "movq %%r8, %%xmm3 # K1 into lower dword of xmm3 \n"               \
        "movq %%r11, %%xmm2 # crc1 into lower dword of xmm2 \n"            \
        "movq %%r9, %%xmm4 # K2 into lower dword of xmm4 \n"               \
        "pclmulqdq $0x00, %%xmm3, %%xmm1 # Multiply crc0 by K1 \n"         \
        "pclmulqdq $0x00, %%xmm4, %%xmm2 # Multiply crc1 by K2 \n"         \
        "xor %%rcx, %%rcx # \n"                                            \
        "xor %%r11, %%r11 # \n"                                            \
        "movq %%xmm1, %%r8 # \n"                                           \
        "movq %%xmm2, %%r9 # \n"                                           \
        "crc32q %%r8, %%rcx # folding crc0 \n"                             \
        "crc32q %%r9, %%r11 # folding crc1 \n"                             \
        "xor %%r10d, %%ecx # combine crc2 and crc0 \n"                     \
        "xor %%r11d, %%ecx # combine crc1 and crc0 \n"

/**
 * Private (static) function.
 * Computes the Castagnoli CRC32c (iSCSI) of the specified data buffer using the Intel CRC32Q (quad word) machine
 * instruction by operating on 24-byte stripes in parallel. The results are folded together using CLMUL. This function
 * is optimized for exactly 256 byte blocks that are best aligned on 8-byte memory addresses. It MUST be passed a
 * pointer to input data that is exactly 256 bytes in length. Note: this function does NOT invert bits of the input crc
 * or return value.
 */
static inline uint32_t s_crc32c_sse42_clmul_256(const uint8_t *input, uint32_t crc) {
    asm volatile("enter_256_%=:"

                 "xor %%r11, %%r11 # zero all 64 bits in r11, will track crc1 \n"
                 "xor %%r10, %%r10 # zero all 64 bits in r10, will track crc2 \n"

                 "crc32q 0(%[in]), %%rcx # crc0 \n"
                 "crc32q 88(%[in]), %%r11 # crc1 \n"
                 "crc32q 176(%[in]), %%r10 # crc2 \n"

                 "crc32q 8(%[in]), %%rcx # crc0 \n"
                 "crc32q 96(%[in]), %%r11 # crc1 \n"
                 "crc32q 184(%[in]), %%r10 # crc2 \n"

                 "crc32q 16(%[in]), %%rcx # crc0 \n"
                 "crc32q 104(%[in]), %%r11 # crc1 \n"
                 "crc32q 192(%[in]), %%r10 # crc2 \n"

                 "crc32q 24(%[in]), %%rcx # crc0 \n"
                 "crc32q 112(%[in]), %%r11 # crc1 \n"
                 "crc32q 200(%[in]), %%r10 # crc2 \n"

                 "crc32q 32(%[in]), %%rcx # crc0 \n"
                 "crc32q 120(%[in]), %%r11 # crc1 \n"
                 "crc32q 208(%[in]), %%r10 # crc2 \n"

                 "crc32q 40(%[in]), %%rcx # crc0 \n"
                 "crc32q 128(%[in]), %%r11 # crc1 \n"
                 "crc32q 216(%[in]), %%r10 # crc2 \n"

                 "crc32q 48(%[in]), %%rcx # crc0 \n"
                 "crc32q 136(%[in]), %%r11 # crc1 \n"
                 "crc32q 224(%[in]), %%r10 # crc2 \n"

                 "crc32q 56(%[in]), %%rcx # crc0 \n"
                 "crc32q 144(%[in]), %%r11 # crc1 \n"
                 "crc32q 232(%[in]), %%r10 # crc2 \n"

                 "crc32q 64(%[in]), %%rcx # crc0 \n"
                 "crc32q 152(%[in]), %%r11 # crc1 \n"
                 "crc32q 240(%[in]), %%r10 # crc2 \n"

                 "crc32q 72(%[in]), %%rcx # crc0 \n"
                 "crc32q 160(%[in]), %%r11 # crc1 \n"
                 "crc32q 248(%[in]), %%r10 # crc2 \n"

                 "crc32q 80(%[in]), %%rcx # crc0 \n"
                 "crc32q 168(%[in]), %%r11 # crc1 \n"

                 FOLD_K1K2(256, $0x1b3d8f29, $0x39d3b296) /* Magic Constants used to fold crc stripes into ecx */

                 /* output registers
                    [crc] is an input and an output so it is marked read/write (i.e. "+c") */
                 : "+c"(crc)

                 /* input registers */
                 : [crc] "c"(crc), [in] "d"(input)

                 /* additional clobbered registers */
                 : "%r8", "%r9", "%r11", "%r10", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "cc");
    return crc;
}

/**
 * Private (static) function.
 * Computes the Castagnoli CRC32c (iSCSI) of the specified data buffer using the Intel CRC32Q (quad word) machine
 * instruction by operating on 3 24-byte stripes in parallel. The results are folded together using CLMUL. This function
 * is optimized for exactly 1024 byte blocks that are best aligned on 8-byte memory addresses. It MUST be passed a
 * pointer to input data that is exactly 1024 bytes in length. Note: this function does NOT invert bits of the input crc
 * or return value.
 */
static inline uint32_t s_crc32c_sse42_clmul_1024(const uint8_t *input, uint32_t crc) {
    asm volatile(
        "enter_1024_%=:"

        "xor %%r11, %%r11 # zero all 64 bits in r11, will track crc1 \n"
        "xor %%r10, %%r10 # zero all 64 bits in r10, will track crc2 \n"

        "mov $5, %%r8d # Loop 5 times through 64 byte chunks in 3 parallel stripes \n"

        "loop_1024_%=:"

        "prefetcht0 128(%[in]) # \n"
        "prefetcht0 472(%[in]) # \n"
        "prefetcht0 808(%[in]) # \n"

        "crc32q 0(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 344(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 680(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 8(%[in]), %%rcx # crc0 \n"
        "crc32q 352(%[in]), %%r11 # crc1 \n"
        "crc32q 688(%[in]), %%r10 # crc2 \n"

        "crc32q 16(%[in]), %%rcx # crc0 \n"
        "crc32q 360(%[in]), %%r11 # crc1 \n"
        "crc32q 696(%[in]), %%r10 # crc2 \n"

        "crc32q 24(%[in]), %%rcx # crc0 \n"
        "crc32q 368(%[in]), %%r11 # crc1 \n"
        "crc32q 704(%[in]), %%r10 # crc2 \n"

        "crc32q 32(%[in]), %%rcx # crc0 \n"
        "crc32q 376(%[in]), %%r11 # crc1 \n"
        "crc32q 712(%[in]), %%r10 # crc2 \n"

        "crc32q 40(%[in]), %%rcx # crc0 \n"
        "crc32q 384(%[in]), %%r11 # crc1 \n"
        "crc32q 720(%[in]), %%r10 # crc2 \n"

        "crc32q 48(%[in]), %%rcx # crc0 \n"
        "crc32q 392(%[in]), %%r11 # crc1 \n"
        "crc32q 728(%[in]), %%r10 # crc2 \n"

        "crc32q 56(%[in]), %%rcx # crc0 \n"
        "crc32q 400(%[in]), %%r11 # crc1 \n"
        "crc32q 736(%[in]), %%r10 # crc2 \n"

        "add $64, %[in] # \n"
        "sub $1, %%r8d # \n"
        "jnz loop_1024_%= # \n"

        "crc32q 0(%[in]), %%rcx # crc0 \n"
        "crc32q 344(%[in]), %%r11 # crc1 \n"
        "crc32q 680(%[in]), %%r10 # crc2 \n"

        "crc32q 8(%[in]), %%rcx # crc0 \n"
        "crc32q 352(%[in]), %%r11 # crc1 \n"
        "crc32q 688(%[in]), %%r10 # crc2 \n"

        "crc32q 16(%[in]), %%rcx # crc0 \n"
        "crc32q 696(%[in]), %%r10 # crc2 \n"

        FOLD_K1K2(1024, $0xe417f38a, $0x8f158014) /* Magic Constants used to fold crc stripes into ecx */

        /* output registers
           [crc] is an input and an output so it is marked read/write (i.e. "+c")
           we clobber the register for [input] (via add instruction) so we must also
           tag it read/write (i.e. "+d") in the list of outputs to tell gcc about the clobber */
        : "+c"(crc), "+d"(input)

        /* input registers */
        /* the numeric values match the position of the output registers */
        : [crc] "c"(crc), [in] "d"(input)

        /* additional clobbered registers */
        /* "cc" is the flags - we add and sub, so the flags are also clobbered */
        : "%r8", "%r9", "%r11", "%r10", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "cc");
    return crc;
}

/**
 * Private (static) function.
 * Computes the Castagnoli CRC32c (iSCSI) of the specified data buffer using the Intel CRC32Q (quad word) machine
 * instruction by operating on 24-byte stripes in parallel. The results are folded together using CLMUL. This function
 * is optimized for exactly 3072 byte blocks that are best aligned on 8-byte memory addresses. It MUST be passed a
 * pointer to input data that is exactly 3072 bytes in length. Note: this function does NOT invert bits of the input crc
 * or return value.
 */
static inline uint32_t s_crc32c_sse42_clmul_3072(const uint8_t *input, uint32_t crc) {
    asm volatile(
        "enter_3072_%=:"

        "xor %%r11, %%r11 # zero all 64 bits in r11, will track crc1 \n"
        "xor %%r10, %%r10 # zero all 64 bits in r10, will track crc2 \n"

        "mov $16, %%r8d # Loop 16 times through 64 byte chunks in 3 parallel stripes \n"

        "loop_3072_%=:"

        "prefetcht0 128(%[in]) # \n"
        "prefetcht0 1152(%[in]) # \n"
        "prefetcht0 2176(%[in]) # \n"

        "crc32q 0(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1024(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2048(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 8(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1032(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2056(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 16(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1040(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2064(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 24(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1048(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2072(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 32(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1056(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2080(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 40(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1064(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2088(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 48(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1072(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2096(%[in]), %%r10 # crc2: stripe2 \n"

        "crc32q 56(%[in]), %%rcx # crc0: stripe0 \n"
        "crc32q 1080(%[in]), %%r11 # crc1: stripe1 \n"
        "crc32q 2104(%[in]), %%r10 # crc2: stripe2 \n"

        "add $64, %[in] # \n"
        "sub $1, %%r8d # \n"
        "jnz loop_3072_%= # \n"

        FOLD_K1K2(3072, $0xa51b6135, $0x170076fa) /* Magic Constants used to fold crc stripes into ecx */

        /* output registers
           [crc] is an input and an output so it is marked read/write (i.e. "+c")
           we clobber the register for [input] (via add instruction) so we must also
           tag it read/write (i.e. "+d") in the list of outputs to tell gcc about the clobber */
        : "+c"(crc), "+d"(input)

        /* input registers
           the numeric values match the position of the output registers */
        : [crc] "c"(crc), [in] "d"(input)

        /* additional clobbered registers
           "cc" is the flags - we add and sub, so the flags are also clobbered */
        : "%r8", "%r9", "%r11", "%r10", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "cc");
    return crc;
}

static int detection_performed = 0;
static int detected_clmul = 0;

/*
 * Computes the Castagnoli CRC32c (iSCSI) of the specified data buffer using the Intel CRC32Q (64-bit quad word) and
 * PCLMULQDQ machine instructions (if present).
 * Handles data that isn't 8-byte aligned as well as any trailing data with the CRC32B (byte) instruction.
 * Pass 0 in the previousCrc32 parameter as an initial value unless continuing to update a running CRC in a subsequent
 * call.
 */
uint32_t aws_checksums_crc32c_hw(const uint8_t *input, int length, uint32_t previousCrc32) {

    if (UNLIKELY(!detection_performed)) {
        detected_clmul = aws_checksums_is_clmul_present();
        /* Simply setting the flag true to skip HW detection next time
           Not using memory barriers since the worst that can happen is a fallback to the non HW accelerated code. */
        detection_performed = 1;
    }

    uint32_t crc = ~previousCrc32;

    /* For small input, forget about alignment checks - simply compute the CRC32c one byte at a time */
    if (UNLIKELY(length < 8)) {
        while (length-- > 0) {
            asm("loop_small_%=: CRC32B (%[in]), %[crc]" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));
            input++;
        }
        return ~crc;
    }

    /* Get the 8-byte memory alignment of our input buffer by looking at the least significant 3 bits */
    int input_alignment = (unsigned long int)input & 0x7;

    /* Compute the number of unaligned bytes before the first aligned 8-byte chunk (will be in the range 0-7) */
    int leading = (8 - input_alignment) & 0x7;

    /* reduce the length by the leading unaligned bytes we are about to process */
    length -= leading;

    /* spin through the leading unaligned input bytes (if any) one-by-one */
    while (leading-- > 0) {
        asm("loop_leading_%=: CRC32B (%[in]), %[crc]" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));
        input++;
    }

    /* Using likely to keep this code inlined */
    if (LIKELY(detected_clmul)) {

        while (LIKELY(length >= 3072)) {
            /* Compute crc32c on each block, chaining each crc result */
            crc = s_crc32c_sse42_clmul_3072(input, crc);
            input += 3072;
            length -= 3072;
        }
        while (LIKELY(length >= 1024)) {
            /* Compute crc32c on each block, chaining each crc result */
            crc = s_crc32c_sse42_clmul_1024(input, crc);
            input += 1024;
            length -= 1024;
        }
        while (LIKELY(length >= 256)) {
            /* Compute crc32c on each block, chaining each crc result */
            crc = s_crc32c_sse42_clmul_256(input, crc);
            input += 256;
            length -= 256;
        }
    }

    /* Spin through remaining (aligned) 8-byte chunks using the CRC32Q quad word instruction */
    while (LIKELY(length >= 8)) {
        /* Hardcoding %rcx register (i.e. "+c") to allow use of qword instruction */
        asm volatile("loop_8_%=: CRC32Q (%[in]), %%rcx" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));
        input += 8;
        length -= 8;
    }

    /* Finish up with any trailing bytes using the CRC32B single byte instruction one-by-one */
    while (length-- > 0) {
        asm volatile("loop_trailing_%=: CRC32B (%[in]), %[crc]" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));
        input++;
    }

    return ~crc;
}

#elif !defined(_MSC_VER)

/* don't call this without first checking that it is supported. */
uint32_t aws_checksums_crc32c_hw(const uint8_t *input, int length, uint32_t previousCrc32) {
    return 0;
}

#endif