Source code
/* { dg-do compile } */ /* { dg-options "-march=rv64gcv_zvl4096b --param riscv-autovec-preference=scalable -mabi=lp64d -O3" } */ #include <stdint-gcc.h> typedef int8_t v1qi __attribute__ ((vector_size (1))); typedef int8_t v2qi __attribute__ ((vector_size (2))); typedef int8_t v4qi __attribute__ ((vector_size (4))); typedef int8_t v8qi __attribute__ ((vector_size (8))); typedef int8_t v16qi __attribute__ ((vector_size (16))); typedef int8_t v32qi __attribute__ ((vector_size (32))); typedef int8_t v64qi __attribute__ ((vector_size (64))); typedef int8_t v128qi __attribute__ ((vector_size (128))); typedef int8_t v256qi __attribute__ ((vector_size (256))); typedef int8_t v512qi __attribute__ ((vector_size (512))); typedef int8_t v1024qi __attribute__ ((vector_size (1024))); typedef int8_t v2048qi __attribute__ ((vector_size (2048))); typedef int8_t v4096qi __attribute__ ((vector_size (4096))); typedef int16_t v1hi __attribute__ ((vector_size (2))); typedef int16_t v2hi __attribute__ ((vector_size (4))); typedef int16_t v4hi __attribute__ ((vector_size (8))); typedef int16_t v8hi __attribute__ ((vector_size (16))); typedef int16_t v16hi __attribute__ ((vector_size (32))); typedef int16_t v32hi __attribute__ ((vector_size (64))); typedef int16_t v64hi __attribute__ ((vector_size (128))); typedef int16_t v128hi __attribute__ ((vector_size (256))); typedef int16_t v256hi __attribute__ ((vector_size (512))); typedef int16_t v512hi __attribute__ ((vector_size (1024))); typedef int16_t v1024hi __attribute__ ((vector_size (2048))); typedef int16_t v2048hi __attribute__ ((vector_size (4096))); typedef int32_t v1si __attribute__ ((vector_size (4))); typedef int32_t v2si __attribute__ ((vector_size (8))); typedef int32_t v4si __attribute__ ((vector_size (16))); typedef int32_t v8si __attribute__ ((vector_size (32))); typedef int32_t v16si __attribute__ ((vector_size (64))); typedef int32_t v32si __attribute__ ((vector_size (128))); typedef int32_t v64si __attribute__ 
((vector_size (256))); typedef int32_t v128si __attribute__ ((vector_size (512))); typedef int32_t v256si __attribute__ ((vector_size (1024))); typedef int32_t v512si __attribute__ ((vector_size (2048))); typedef int32_t v1024si __attribute__ ((vector_size (4096))); typedef int64_t v1di __attribute__ ((vector_size (8))); typedef int64_t v2di __attribute__ ((vector_size (16))); typedef int64_t v4di __attribute__ ((vector_size (32))); typedef int64_t v8di __attribute__ ((vector_size (64))); typedef int64_t v16di __attribute__ ((vector_size (128))); typedef int64_t v32di __attribute__ ((vector_size (256))); typedef int64_t v64di __attribute__ ((vector_size (512))); typedef int64_t v128di __attribute__ ((vector_size (1024))); typedef int64_t v256di __attribute__ ((vector_size (2048))); typedef uint64_t v512di __attribute__ ((vector_size (4096))); typedef uint8_t v1uqi __attribute__ ((vector_size (1))); typedef uint8_t v2uqi __attribute__ ((vector_size (2))); typedef uint8_t v4uqi __attribute__ ((vector_size (4))); typedef uint8_t v8uqi __attribute__ ((vector_size (8))); typedef uint8_t v16uqi __attribute__ ((vector_size (16))); typedef uint8_t v32uqi __attribute__ ((vector_size (32))); typedef uint8_t v64uqi __attribute__ ((vector_size (64))); typedef uint8_t v128uqi __attribute__ ((vector_size (128))); typedef uint8_t v256uqi __attribute__ ((vector_size (256))); typedef uint8_t v512uqi __attribute__ ((vector_size (512))); typedef uint8_t v1024uqi __attribute__ ((vector_size (1024))); typedef uint8_t v2048uqi __attribute__ ((vector_size (2048))); typedef uint8_t v4096uqi __attribute__ ((vector_size (4096))); typedef uint16_t v1uhi __attribute__ ((vector_size (2))); typedef uint16_t v2uhi __attribute__ ((vector_size (4))); typedef uint16_t v4uhi __attribute__ ((vector_size (8))); typedef uint16_t v8uhi __attribute__ ((vector_size (16))); typedef uint16_t v16uhi __attribute__ ((vector_size (32))); typedef uint16_t v32uhi __attribute__ ((vector_size (64))); typedef 
uint16_t v64uhi __attribute__ ((vector_size (128))); typedef uint16_t v128uhi __attribute__ ((vector_size (256))); typedef uint16_t v256uhi __attribute__ ((vector_size (512))); typedef uint16_t v512uhi __attribute__ ((vector_size (1024))); typedef uint16_t v1024uhi __attribute__ ((vector_size (2048))); typedef uint16_t v2048uhi __attribute__ ((vector_size (4096))); typedef uint32_t v1usi __attribute__ ((vector_size (4))); typedef uint32_t v2usi __attribute__ ((vector_size (8))); typedef uint32_t v4usi __attribute__ ((vector_size (16))); typedef uint32_t v8usi __attribute__ ((vector_size (32))); typedef uint32_t v16usi __attribute__ ((vector_size (64))); typedef uint32_t v32usi __attribute__ ((vector_size (128))); typedef uint32_t v64usi __attribute__ ((vector_size (256))); typedef uint32_t v128usi __attribute__ ((vector_size (512))); typedef uint32_t v256usi __attribute__ ((vector_size (1024))); typedef uint32_t v512usi __attribute__ ((vector_size (2048))); typedef uint32_t v1024usi __attribute__ ((vector_size (4096))); typedef uint64_t v1udi __attribute__ ((vector_size (8))); typedef uint64_t v2udi __attribute__ ((vector_size (16))); typedef uint64_t v4udi __attribute__ ((vector_size (32))); typedef uint64_t v8udi __attribute__ ((vector_size (64))); typedef uint64_t v16udi __attribute__ ((vector_size (128))); typedef uint64_t v32udi __attribute__ ((vector_size (256))); typedef uint64_t v64udi __attribute__ ((vector_size (512))); typedef uint64_t v128udi __attribute__ ((vector_size (1024))); typedef uint64_t v256udi __attribute__ ((vector_size (2048))); typedef uint64_t v512udi __attribute__ ((vector_size (4096))); typedef _Float16 v1hf __attribute__ ((vector_size (2))); typedef _Float16 v2hf __attribute__ ((vector_size (4))); typedef _Float16 v4hf __attribute__ ((vector_size (8))); typedef _Float16 v8hf __attribute__ ((vector_size (16))); typedef _Float16 v16hf __attribute__ ((vector_size (32))); typedef _Float16 v32hf __attribute__ ((vector_size (64))); typedef 
_Float16 v64hf __attribute__ ((vector_size (128))); typedef _Float16 v128hf __attribute__ ((vector_size (256))); typedef _Float16 v256hf __attribute__ ((vector_size (512))); typedef _Float16 v512hf __attribute__ ((vector_size (1024))); typedef _Float16 v1024hf __attribute__ ((vector_size (2048))); typedef _Float16 v2048hf __attribute__ ((vector_size (4096))); typedef float v1sf __attribute__ ((vector_size (4))); typedef float v2sf __attribute__ ((vector_size (8))); typedef float v4sf __attribute__ ((vector_size (16))); typedef float v8sf __attribute__ ((vector_size (32))); typedef float v16sf __attribute__ ((vector_size (64))); typedef float v32sf __attribute__ ((vector_size (128))); typedef float v64sf __attribute__ ((vector_size (256))); typedef float v128sf __attribute__ ((vector_size (512))); typedef float v256sf __attribute__ ((vector_size (1024))); typedef float v512sf __attribute__ ((vector_size (2048))); typedef float v1024sf __attribute__ ((vector_size (4096))); typedef double v1df __attribute__ ((vector_size (8))); typedef double v2df __attribute__ ((vector_size (16))); typedef double v4df __attribute__ ((vector_size (32))); typedef double v8df __attribute__ ((vector_size (64))); typedef double v16df __attribute__ ((vector_size (128))); typedef double v32df __attribute__ ((vector_size (256))); typedef double v64df __attribute__ ((vector_size (512))); typedef double v128df __attribute__ ((vector_size (1024))); typedef double v256df __attribute__ ((vector_size (2048))); typedef double v512df __attribute__ ((vector_size (4096))); #define exhaust_vector_regs() \ asm volatile("#" :: \ : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", \ "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", \ "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", \ "v26", "v27", "v28", "v29", "v30", "v31"); #define DEF_OP_VV(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict 
c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP c[i]; \ } #define DEF_OP_VX(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP c; \ } #define DEF_OP_VI_M16(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP - 16; \ } #define DEF_OP_VI_15(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP 15; \ } #define DEF_OP_IV_M16(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = -16 OP b[i]; \ } #define DEF_OP_IV_15(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = 15 OP b[i]; \ } #define DEF_MINMAX_VV(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP c[i] ? b[i] : c[i]; \ } #define DEF_MINMAX_VX(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP c ? 
b[i] : c; \ } #define DEF_OP_VI_7(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] OP 7; \ } #define DEF_OP_V(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = OP (b[i]); \ } #define DEF_OP_V_CVT(PREFIX, NUM, TYPE_IN, TYPE_OUT, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE_IN##_##TYPE_OUT##_##NUM (TYPE_OUT *restrict a, \ TYPE_IN *restrict b) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = OP (b[i]); \ } #define DEF_CALL_VV(PREFIX, NUM, TYPE, CALL) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = CALL (b[i], c[i]); \ } #define DEF_CALL_VX(PREFIX, NUM, TYPE, CALL) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = CALL (b[i], c); \ } #define DEF_CONST(TYPE, VAL, NUM) \ void const_##TYPE##_##NUM (TYPE *restrict a) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = VAL; \ } #define DEF_SERIES(TYPE, BASE, STEP, NUM, SUFFIX) \ void series_##TYPE##_##SUFFIX (TYPE *restrict a) \ { \ for (TYPE i = 0; i < NUM; ++i) \ a[i] = (BASE) + i * (STEP); \ } #define DEF_EXTRACT(SCALAR, VECTOR, INDEX) \ SCALAR \ extract_##SCALAR##VECTOR (VECTOR v) \ { \ return v[INDEX]; \ } #define DEF_MASK_LOGIC(PREFIX, NUM, TYPE, OP) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c, \ TYPE *restrict d, TYPE *restrict e) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = (b[i] > c[i]) OP (d[i] < e[i]); \ } #define DEF_SGNJX_VV(PREFIX, NUM, TYPE, CALL) \ void __attribute__ ((noinline, noclone)) \ PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict 
b, TYPE *restrict c) \ { \ for (int i = 0; i < NUM; ++i) \ a[i] = b[i] * CALL (1.0, c[i]); \ } #define DEF_REDUC_PLUS(TYPE, NUM) \ TYPE __attribute__ ((noinline, noclone)) \ reduc_plus_##TYPE##NUM (TYPE *__restrict a) \ { \ TYPE r = 0; \ for (int i = 0; i < NUM; ++i) \ r += a[i]; \ return r; \ } #define DEF_REDUC_MAXMIN(TYPE, NAME, CMP_OP, NUM) \ TYPE __attribute__ ((noinline, noclone)) \ reduc_##NAME##_##TYPE##_##NUM (TYPE *a) \ { \ TYPE r = 13; \ for (int i = 0; i < NUM; ++i) \ r = a[i] CMP_OP r ? a[i] : r; \ return r; \ } #define DEF_REDUC_BITWISE(TYPE, NAME, BIT_OP, NUM) \ TYPE __attribute__ ((noinline, noclone)) \ reduc_##NAME##_##TYPE##_##NUM (TYPE *a) \ { \ TYPE r = 13; \ for (int i = 0; i < NUM; ++i) \ r BIT_OP a[i]; \ return r; \ } #define VARS2(TYPE, NUM1, NUM2) TYPE var##NUM1, TYPE var##NUM2 #define VARS4(TYPE, NUM1, NUM2, NUM3, NUM4) \ VARS2 (TYPE, NUM1, NUM2), VARS2 (TYPE, NUM3, NUM4) #define VARS8(TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8) \ VARS4 (TYPE, NUM1, NUM2, NUM3, NUM4), VARS4 (TYPE, NUM5, NUM6, NUM7, NUM8) #define VARS16(TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, \ NUM10, NUM11, NUM12, NUM13, NUM14, NUM15, NUM16) \ VARS8 (TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8), \ VARS8 (TYPE, NUM9, NUM10, NUM11, NUM12, NUM13, NUM14, NUM15, NUM16) #define VARS32(TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, \ NUM10, NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, \ NUM19, NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, \ NUM28, NUM29, NUM30, NUM31, NUM32) \ VARS16 (TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16), \ VARS16 (TYPE, NUM17, NUM18, NUM19, NUM20, NUM21, NUM22, NUM23, NUM24, \ NUM25, NUM26, NUM27, NUM28, NUM29, NUM30, NUM31, NUM32) #define VARS64(TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, \ NUM10, NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, \ NUM19, NUM20, NUM21, NUM22, NUM23, NUM24, 
NUM25, NUM26, NUM27, \ NUM28, NUM29, NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, NUM36, \ NUM37, NUM38, NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, \ NUM46, NUM47, NUM48, NUM49, NUM50, NUM51, NUM52, NUM53, NUM54, \ NUM55, NUM56, NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, \ NUM64) \ VARS32 (TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, \ NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, \ NUM29, NUM30, NUM31, NUM32), \ VARS32 (TYPE, NUM33, NUM34, NUM35, NUM36, NUM37, NUM38, NUM39, NUM40, \ NUM41, NUM42, NUM43, NUM44, NUM45, NUM46, NUM47, NUM48, NUM49, \ NUM50, NUM51, NUM52, NUM53, NUM54, NUM55, NUM56, NUM57, NUM58, \ NUM59, NUM60, NUM61, NUM62, NUM63, NUM64) #define VARS128(TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, \ NUM10, NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, \ NUM19, NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, \ NUM28, NUM29, NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, NUM36, \ NUM37, NUM38, NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, \ NUM46, NUM47, NUM48, NUM49, NUM50, NUM51, NUM52, NUM53, NUM54, \ NUM55, NUM56, NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, \ NUM64, NUM65, NUM66, NUM67, NUM68, NUM69, NUM70, NUM71, NUM72, \ NUM73, NUM74, NUM75, NUM76, NUM77, NUM78, NUM79, NUM80, NUM81, \ NUM82, NUM83, NUM84, NUM85, NUM86, NUM87, NUM88, NUM89, NUM90, \ NUM91, NUM92, NUM93, NUM94, NUM95, NUM96, NUM97, NUM98, NUM99, \ NUM100, NUM101, NUM102, NUM103, NUM104, NUM105, NUM106, \ NUM107, NUM108, NUM109, NUM110, NUM111, NUM112, NUM113, \ NUM114, NUM115, NUM116, NUM117, NUM118, NUM119, NUM120, \ NUM121, NUM122, NUM123, NUM124, NUM125, NUM126, NUM127, \ NUM128) \ VARS64 (TYPE, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, \ NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, \ NUM29, NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, 
NUM36, NUM37, \ NUM38, NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, NUM46, \ NUM47, NUM48, NUM49, NUM50, NUM51, NUM52, NUM53, NUM54, NUM55, \ NUM56, NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, NUM64), \ VARS64 (TYPE, NUM65, NUM66, NUM67, NUM68, NUM69, NUM70, NUM71, NUM72, \ NUM73, NUM74, NUM75, NUM76, NUM77, NUM78, NUM79, NUM80, NUM81, \ NUM82, NUM83, NUM84, NUM85, NUM86, NUM87, NUM88, NUM89, NUM90, \ NUM91, NUM92, NUM93, NUM94, NUM95, NUM96, NUM97, NUM98, NUM99, \ NUM100, NUM101, NUM102, NUM103, NUM104, NUM105, NUM106, NUM107, \ NUM108, NUM109, NUM110, NUM111, NUM112, NUM113, NUM114, NUM115, \ NUM116, NUM117, NUM118, NUM119, NUM120, NUM121, NUM122, NUM123, \ NUM124, NUM125, NUM126, NUM127, NUM128) #define INIT2(NUM1, NUM2) var##NUM1, var##NUM2 #define INIT4(NUM1, NUM2, NUM3, NUM4) INIT2 (NUM1, NUM2), INIT2 (NUM3, NUM4) #define INIT8(NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8) \ INIT4 (NUM1, NUM2, NUM3, NUM4), INIT4 (NUM5, NUM6, NUM7, NUM8) #define INIT16(NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16) \ INIT4 (NUM1, NUM2, NUM3, NUM4), INIT4 (NUM5, NUM6, NUM7, NUM8) #define INIT32(NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, \ NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, \ NUM29, NUM30, NUM31, NUM32) \ INIT16 (NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, NUM11, \ NUM12, NUM13, NUM14, NUM15, NUM16), \ INIT16 (NUM17, NUM18, NUM19, NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, \ NUM26, NUM27, NUM28, NUM29, NUM30, NUM31, NUM32) #define INIT64(NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \ NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, \ NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, \ NUM29, NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, NUM36, NUM37, \ NUM38, NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, NUM46, \ NUM47, NUM48, NUM49, NUM50, 
               NUM51, NUM52, NUM53, NUM54, NUM55, \
               NUM56, NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, NUM64) \
  INIT32 (NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, NUM11, \
          NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, NUM20, \
          NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, NUM29, \
          NUM30, NUM31, NUM32), \
    INIT32 (NUM33, NUM34, NUM35, NUM36, NUM37, NUM38, NUM39, NUM40, NUM41, \
            NUM42, NUM43, NUM44, NUM45, NUM46, NUM47, NUM48, NUM49, NUM50, \
            NUM51, NUM52, NUM53, NUM54, NUM55, NUM56, NUM57, NUM58, NUM59, \
            NUM60, NUM61, NUM62, NUM63, NUM64)

#define INIT128(NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, \
                NUM11, NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, \
                NUM20, NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, \
                NUM29, NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, NUM36, NUM37, \
                NUM38, NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, NUM46, \
                NUM47, NUM48, NUM49, NUM50, NUM51, NUM52, NUM53, NUM54, NUM55, \
                NUM56, NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, NUM64, \
                NUM65, NUM66, NUM67, NUM68, NUM69, NUM70, NUM71, NUM72, NUM73, \
                NUM74, NUM75, NUM76, NUM77, NUM78, NUM79, NUM80, NUM81, NUM82, \
                NUM83, NUM84, NUM85, NUM86, NUM87, NUM88, NUM89, NUM90, NUM91, \
                NUM92, NUM93, NUM94, NUM95, NUM96, NUM97, NUM98, NUM99, \
                NUM100, NUM101, NUM102, NUM103, NUM104, NUM105, NUM106, \
                NUM107, NUM108, NUM109, NUM110, NUM111, NUM112, NUM113, \
                NUM114, NUM115, NUM116, NUM117, NUM118, NUM119, NUM120, \
                NUM121, NUM122, NUM123, NUM124, NUM125, NUM126, NUM127, \
                NUM128) \
  INIT64 (NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9, NUM10, NUM11, \
          NUM12, NUM13, NUM14, NUM15, NUM16, NUM17, NUM18, NUM19, NUM20, \
          NUM21, NUM22, NUM23, NUM24, NUM25, NUM26, NUM27, NUM28, NUM29, \
          NUM30, NUM31, NUM32, NUM33, NUM34, NUM35, NUM36, NUM37, NUM38, \
          NUM39, NUM40, NUM41, NUM42, NUM43, NUM44, NUM45, NUM46, NUM47, \
          NUM48, NUM49, NUM50, NUM51, NUM52, NUM53, NUM54, NUM55, NUM56, \
          NUM57, NUM58, NUM59, NUM60, NUM61, NUM62, NUM63, NUM64), \
    INIT64 (NUM65, NUM66, NUM67, NUM68, NUM69, NUM70, NUM71, NUM72, NUM73, \
            NUM74, NUM75, NUM76, NUM77, NUM78, NUM79, NUM80, NUM81, NUM82, \
            NUM83, NUM84, NUM85, NUM86, NUM87, NUM88, NUM89, NUM90, NUM91, \
            NUM92, NUM93, NUM94, NUM95, NUM96, NUM97, NUM98, NUM99, NUM100, \
            NUM101, NUM102, NUM103, NUM104, NUM105, NUM106, NUM107, NUM108, \
            NUM109, NUM110, NUM111, NUM112, NUM113, NUM114, NUM115, NUM116, \
            NUM117, NUM118, NUM119, NUM120, NUM121, NUM122, NUM123, NUM124, \
            NUM125, NUM126, NUM127, NUM128)

#define DEF_INIT(TYPE1, TYPE2, NUM, ...) \
  void init_##TYPE1##_##TYPE2##_##NUM (VARS##NUM (TYPE2, __VA_ARGS__), \
                                       TYPE2 *__restrict out) \
  { \
    TYPE1 v = {__VA_ARGS__}; \
    *(TYPE1 *) out = v; \
  }

#define DEF_OP_VV_VA(OP, TYPE1, ...) \
  TYPE1 test_##OP##_##TYPE1 (TYPE1 a, TYPE1 b) \
  { \
    return OP (a, b, __VA_ARGS__); \
  }

#define DEF_REPEAT(TYPE1, TYPE2, NUM, ...) \
  void init_##TYPE1##_##TYPE2##_##NUM (TYPE2 var0, TYPE2 var1, \
                                       TYPE2 *__restrict out) \
  { \
    TYPE1 v = {__VA_ARGS__}; \
    *(TYPE1 *) out = v; \
  }

#define DEF_VEC_SET_IMM_INDEX(PREFIX, VECTOR, TYPE, INDEX) \
  VECTOR __attribute__ ((noinline, noclone)) \
  PREFIX##_##VECTOR##_##INDEX (VECTOR v, TYPE a) \
  { \
    v[INDEX] = a; \
    \
    return v; \
  }

#define DEF_VEC_SET_SCALAR_INDEX(PREFIX, VECTOR, TYPE) \
  VECTOR __attribute__ ((noinline, noclone)) \
  PREFIX##_##VECTOR##_##TYPE (VECTOR v, TYPE a, unsigned index) \
  { \
    v[index] = a; \
    \
    return v; \
  }

#define DEF_FMA_VV(PREFIX, NUM, TYPE) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c, \
                        TYPE *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = b[i] * c[i] + d[i]; \
  }

#define DEF_FNMA_VV(PREFIX, NUM, TYPE) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c, \
                        TYPE *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = d[i] - b[i] * c[i]; \
  }

#define DEF_FMS_VV(PREFIX, NUM, TYPE) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM \
  (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c, \
   TYPE *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = b[i] * c[i] - d[i]; \
  }

#define DEF_FNMS_VV(PREFIX, NUM, TYPE) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c, \
                        TYPE *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = -(b[i] * c[i]) - d[i]; \
  }

#define DEF_CONVERT(PREFIX, TYPE1, TYPE2, NUM) \
  __attribute__ (( \
    noipa)) void PREFIX##_##TYPE1##TYPE2##_##NUM (TYPE2 *__restrict dst, \
                                                  TYPE1 *__restrict a) \
  { \
    for (int i = 0; i < NUM; i++) \
      dst[i] = (TYPE2) a[i]; \
  }

#define DEF_AVG_FLOOR(TYPE, TYPE2, NUM) \
  __attribute__ ((noipa)) void vavg_##TYPE##_##TYPE2##NUM ( \
    TYPE *__restrict dst, TYPE *__restrict a, TYPE *__restrict b, int n) \
  { \
    for (int i = 0; i < NUM; i++) \
      dst[i] = ((TYPE2) a[i] + b[i]) >> 1; \
  }

#define DEF_AVG_CEIL(TYPE, TYPE2, NUM) \
  __attribute__ ((noipa)) void vavg2_##TYPE##_##TYPE2##NUM ( \
    TYPE *__restrict dst, TYPE *__restrict a, TYPE *__restrict b, int n) \
  { \
    for (int i = 0; i < NUM; i++) \
      dst[i] = ((TYPE2) a[i] + b[i] + 1) >> 1; \
  }

#define DEF_MULH(TYPE, NUM) \
  void __attribute__ ((noipa)) \
  mod_##TYPE##_##NUM (TYPE *__restrict dst, TYPE *__restrict src) \
  { \
    for (int i = 0; i < NUM; ++i) \
      dst[i] = src[i] % 19; \
  }

#define DEF_COND_UNOP(PREFIX, NUM, TYPE, OP) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? OP (a[i]) : b[i]; \
    return v; \
  }

#define DEF_COND_BINOP(PREFIX, NUM, TYPE, OP) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? \
        a[i] OP b[i] : c[i]; \
    return v; \
  }

#define DEF_COND_MINMAX(PREFIX, NUM, TYPE, OP) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? ((a[i]) OP (b[i]) ? (a[i]) : (b[i])) : c[i]; \
    return v; \
  }

#define DEF_COND_FMA_VV(PREFIX, NUM, TYPE) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? a[i] * b[i] + c[i] : b[i]; \
    return v; \
  }

#define DEF_COND_FNMA_VV(PREFIX, NUM, TYPE) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? a[i] - b[i] * c[i] : b[i]; \
    return v; \
  }

#define DEF_COND_FMS_VV(PREFIX, NUM, TYPE) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? a[i] * b[i] - c[i] : b[i]; \
    return v; \
  }

#define DEF_COND_FNMS_VV(PREFIX, NUM, TYPE) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? \
        -(a[i] * b[i]) - c[i] : b[i]; \
    return v; \
  }

#define DEF_OP_WVV(PREFIX, NUM, TYPE, TYPE2, OP) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (TYPE2 *restrict a, TYPE *restrict b, \
                                  TYPE *restrict c) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = (TYPE2) b[i] OP (TYPE2) c[i]; \
  }

#define DEF_OP_WWV(PREFIX, NUM, TYPE, TYPE2, OP) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (TYPE2 *restrict a, TYPE2 *restrict b, \
                                  TYPE *restrict c) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = b[i] OP (TYPE2) c[i]; \
  }

#define DEF_OP_WVV_SU(PREFIX, NUM, TYPE1, TYPE2, TYPE3, OP) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (TYPE3 *restrict a, TYPE1 *restrict b, \
                                  TYPE2 *restrict c) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = (TYPE3) b[i] OP (TYPE3) c[i]; \
  }

#define DEF_FMA_WVV(PREFIX, NUM, TYPE1, TYPE2) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE1##_##TYPE2##NUM (TYPE2 *restrict a, TYPE1 *restrict b, \
                                   TYPE1 *restrict c, TYPE2 *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = (TYPE2) b[i] * (TYPE2) c[i] + d[i]; \
  }

#define DEF_FMA_WVV_SU(PREFIX, NUM, TYPE1, TYPE2, TYPE3) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE1##_##TYPE2##_##TYPE3##NUM (TYPE3 *restrict a, \
                                             TYPE1 *restrict b, \
                                             TYPE2 *restrict c, \
                                             TYPE3 *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = (TYPE3) b[i] * (TYPE3) c[i] + d[i]; \
  }

#define DEF_FNMA_WVV(PREFIX, NUM, TYPE1, TYPE2) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE1##_##TYPE2##NUM (TYPE2 *restrict a, TYPE1 *restrict b, \
                                   TYPE1 *restrict c, TYPE2 *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = d[i] - (TYPE2) b[i] * (TYPE2) c[i]; \
  }

#define DEF_FMS_WVV(PREFIX, NUM, TYPE1, TYPE2) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE1##_##TYPE2##NUM (TYPE2 *restrict a, TYPE1 *restrict b, \
                                   TYPE1 *restrict c, TYPE2 *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = (TYPE2) b[i] * (TYPE2) \
             c[i] - d[i]; \
  }

#define DEF_FNMS_WVV(PREFIX, NUM, TYPE1, TYPE2) \
  void __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE1##_##TYPE2##NUM (TYPE2 *restrict a, TYPE1 *restrict b, \
                                   TYPE1 *restrict c, TYPE2 *restrict d) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = -((TYPE2) b[i] * (TYPE2) c[i]) - d[i]; \
  }

#define DEF_WIDEN_REDUC_PLUS(TYPE, TYPE2, NUM) \
  TYPE2 __attribute__ ((noinline, noclone)) \
  reduc_plus_##TYPE##_##TYPE2##NUM (TYPE *__restrict a) \
  { \
    TYPE2 r = 0; \
    for (int i = 0; i < NUM; ++i) \
      r += a[i]; \
    return r; \
  }

#define DEF_NARROW_TRUNC_IMM(TYPE1, TYPE2, NUM) \
  void narrow_##TYPE1##_##TYPE2##_##NUM (TYPE1 *restrict a, TYPE2 *restrict b) \
  { \
    for (int i = 0; i < NUM; i += 1) \
      a[i] = (TYPE1) (b[i] >> 7); \
  }

#define DEF_NARROW_TRUNC_XREG(TYPE1, TYPE2, NUM) \
  void narrow_##TYPE1##_##TYPE2##_##NUM (TYPE1 *restrict a, TYPE2 *restrict b, \
                                         int shift) \
  { \
    for (int i = 0; i < NUM; i += 1) \
      a[i] = (TYPE1) (b[i] >> shift); \
  }

#define DEF_NARROW_TRUNC_VREG(TYPE1, TYPE2, NUM) \
  void narrow_##TYPE1##_##TYPE2##_##NUM (TYPE1 *restrict a, TYPE2 *restrict b, \
                                         int *restrict shift) \
  { \
    for (int i = 0; i < NUM; i += 1) \
      a[i] = (TYPE1) (b[i] >> shift[i]); \
  }

#define DEF_COND_CONVERT(PREFIX, TYPE1, TYPE2, NUM) \
  __attribute__ ((noipa)) \
  TYPE2 PREFIX##_##TYPE1##TYPE2##_##NUM (TYPE2 dst, TYPE1 a, int *cond) \
  { \
    for (int i = 0; i < NUM; i++) \
      dst[i] = cond[i] ? a[i] : dst[i]; \
    return dst; \
  }

#define DEF_COND_FP_CONVERT(PREFIX, TYPE1, TYPE2, TYPE3, NUM) \
  __attribute__ ((noipa)) \
  v##NUM##TYPE2 PREFIX##_##TYPE1##TYPE2##TYPE3##_##NUM (v##NUM##TYPE2 dst, \
                                                        v##NUM##TYPE1 a, \
                                                        int *cond) \
  { \
    for (int i = 0; i < NUM; i++) \
      dst[i] = cond[i] ? (TYPE3) a[i] : dst[i]; \
    return dst; \
  }

#define DEF_COND_CALL(PREFIX, NUM, TYPE, CALL) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] = cond[i] ? \
        CALL (a[i], b[i]) : c[i]; \
    return v; \
  }

#define DEF_COND_MULH(PREFIX, NUM, TYPE, TYPE2, TYPE3, SHIFT) \
  TYPE __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##NUM (TYPE a, TYPE b, TYPE c, TYPE cond) \
  { \
    TYPE v; \
    for (int i = 0; i < NUM; ++i) \
      v[i] \
        = cond[i] ? (TYPE3) (((TYPE2) a[i] * (TYPE2) b[i]) >> SHIFT) : c[i]; \
    return v; \
  }

#define DEF_COND_OP_WVV(PREFIX, NUM, TYPE, TYPE2, TYPE3, OP) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (v##NUM##TYPE2 a, v##NUM##TYPE b, \
                                  v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? (TYPE3) b[i] OP (TYPE3) c[i] : a[i]; \
    return a; \
  }

#define DEF_COND_OP_WVV_SU(PREFIX, NUM, TYPE, TYPE2, TYPE3, OP) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (v##NUM##TYPE2 a, v##NUM##u##TYPE b, \
                                  v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? (TYPE3) b[i] OP (TYPE3) c[i] : a[i]; \
    return a; \
  }

#define DEF_COND_OP_WWV(PREFIX, NUM, TYPE, TYPE2, TYPE3, OP) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##_##TYPE2##NUM (v##NUM##TYPE2 a, v##NUM##TYPE2 b, \
                                  v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? (TYPE3) b[i] OP (TYPE3) c[i] : a[i]; \
    return a; \
  }

#define DEF_WFMA_VV(PREFIX, NUM, TYPE, TYPE2, TYPE3) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##TYPE2##TYPE3##NUM (v##NUM##TYPE2 a, v##NUM##TYPE b, \
                                      v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? (TYPE3) b[i] * (TYPE3) c[i] + a[i] : a[i]; \
    return a; \
  }

#define DEF_WFNMA_VV(PREFIX, NUM, TYPE, TYPE2, TYPE3) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##TYPE2##TYPE3##NUM (v##NUM##TYPE2 a, v##NUM##TYPE b, \
                                      v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? \
        a[i] - (TYPE3) b[i] * (TYPE3) c[i] : a[i]; \
    return a; \
  }

#define DEF_WFMS_VV(PREFIX, NUM, TYPE, TYPE2, TYPE3) \
  v##NUM##TYPE2 __attribute__ ((noinline, noclone)) \
  PREFIX##_##TYPE##TYPE2##TYPE3##NUM (v##NUM##TYPE2 a, v##NUM##TYPE b, \
                                      v##NUM##TYPE c, int *restrict cond) \
  { \
    for (int i = 0; i < NUM; ++i) \
      a[i] = cond[i] ? (TYPE3) b[i] * (TYPE3) c[i] - a[i] : a[i]; \
    return a; \
  }

#define DEF_COND_NARROW_TRUNC_IMM(TYPE1, TYPE2, TYPE3, NUM) \
  v##NUM##TYPE1 narrow_##TYPE1##_##TYPE2##_##NUM (v##NUM##TYPE1 a, \
                                                  v##NUM##TYPE2 b, \
                                                  int *__restrict cond) \
  { \
    for (int i = 0; i < NUM; i += 1) \
      a[i] = cond[i] ? (TYPE3) (b[i] >> 7) : a[i]; \
    return a; \
  }

#define DEF_COND_NARROW_TRUNC_XREG(TYPE1, TYPE2, TYPE3, NUM) \
  v##NUM##TYPE1 narrow_##TYPE1##_##TYPE2##_##NUM (v##NUM##TYPE1 a, \
                                                  v##NUM##TYPE2 b, int shift, \
                                                  int *__restrict cond) \
  { \
    for (int i = 0; i < NUM; i += 1) \
      a[i] = cond[i] ? (TYPE3) (b[i] >> shift) : a[i]; \
    return a; \
  }

#define DEF_CONSECUTIVE(TYPE, NUM) \
  TYPE f##TYPE (TYPE a, TYPE b) \
  { \
    return __builtin_shufflevector (a, b, MASK_##NUM); \
  }

#define DEF_COMBINE(TYPE1, TYPE2, NUM, ...) \
  void combine_##TYPE1##_##TYPE2##_##NUM (TYPE2 *out, TYPE2 x, TYPE2 y) \
  { \
    v##NUM##TYPE1 v = {__VA_ARGS__}; \
    *(v##NUM##TYPE1 *) out = v; \
  }

#define DEF_TRAILING(TYPE1, TYPE2, NUM, ...) \
  void init_##TYPE1##_##TYPE2##_##NUM (TYPE2 var0, TYPE2 var1, TYPE2 var2, \
                                       TYPE2 var3, TYPE2 *__restrict out) \
  { \
    TYPE1 v = {__VA_ARGS__}; \
    *(TYPE1 *) out = v; \
  }

#define DEF_RET1_ARG0(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG0 () \
  { \
    TYPE r = {}; \
    return r; \
  }

#define DEF_RET1_ARG1(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG1 (TYPE a1) \
  { \
    return a1; \
  }

#define DEF_RET1_ARG2(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG2 (TYPE a1, TYPE a2) \
  { \
    return a1 + a2; \
  }

#define DEF_RET1_ARG3(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG3 (TYPE a1, TYPE a2, TYPE a3) \
  { \
    return a1 + a2 + a3; \
  }

#define DEF_RET1_ARG4(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG4 (TYPE a1, TYPE a2, TYPE a3, TYPE a4) \
  { \
    return a1 + a2 + a3 + a4; \
  }

#define DEF_RET1_ARG5(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG5 (TYPE a1, TYPE a2, TYPE a3, TYPE a4, TYPE a5) \
  { \
    return a1 + a2 + a3 + a4 + a5; \
  }

#define DEF_RET1_ARG6(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG6 (TYPE a1, TYPE a2, TYPE a3, TYPE a4, TYPE a5, TYPE a6) \
  { \
    return a1 + a2 + a3 + a4 + a5 + a6; \
  }

#define DEF_RET1_ARG7(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG7 (TYPE a1, TYPE a2, TYPE a3, TYPE a4, TYPE a5, TYPE a6, \
                    TYPE a7) \
  { \
    return a1 + a2 + a3 + a4 + a5 + a6 + a7; \
  }

#define DEF_RET1_ARG8(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG8 (TYPE a1, TYPE a2, TYPE a3, TYPE a4, TYPE a5, TYPE a6, \
                    TYPE a7, TYPE a8) \
  { \
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8; \
  }

#define DEF_RET1_ARG9(TYPE) \
  TYPE __attribute__((noinline)) \
  TYPE##_RET1_ARG9 (TYPE a1, TYPE a2, TYPE a3, TYPE a4, TYPE a5, TYPE a6, \
                    TYPE a7, TYPE a8, TYPE a9) \
  { \
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9; \
  }

DEF_RET1_ARG0 (v1qi)
DEF_RET1_ARG0 (v2qi)
DEF_RET1_ARG0 (v4qi)
DEF_RET1_ARG0 (v8qi)
DEF_RET1_ARG0 (v16qi)
DEF_RET1_ARG0 (v32qi)
DEF_RET1_ARG0 (v64qi)
DEF_RET1_ARG0 (v128qi)
DEF_RET1_ARG0 (v256qi)
DEF_RET1_ARG0 (v512qi)
DEF_RET1_ARG0 (v1024qi)
DEF_RET1_ARG0 (v2048qi)
DEF_RET1_ARG0 (v4096qi)

DEF_RET1_ARG1 (v1qi)
DEF_RET1_ARG1 (v2qi)
DEF_RET1_ARG1 (v4qi)
DEF_RET1_ARG1 (v8qi)
DEF_RET1_ARG1 (v16qi)
DEF_RET1_ARG1 (v32qi)
DEF_RET1_ARG1 (v64qi)
DEF_RET1_ARG1 (v128qi)
DEF_RET1_ARG1 (v256qi)
DEF_RET1_ARG1 (v512qi)
DEF_RET1_ARG1 (v1024qi)
DEF_RET1_ARG1 (v2048qi)
DEF_RET1_ARG1 (v4096qi)

DEF_RET1_ARG2 (v1qi)
DEF_RET1_ARG2 (v2qi)
DEF_RET1_ARG2 (v4qi)
DEF_RET1_ARG2 (v8qi)
DEF_RET1_ARG2 (v16qi)
DEF_RET1_ARG2 (v32qi)
DEF_RET1_ARG2 (v64qi)
DEF_RET1_ARG2 (v128qi)
DEF_RET1_ARG2 (v256qi)
DEF_RET1_ARG2 (v512qi)
DEF_RET1_ARG2 (v1024qi)
DEF_RET1_ARG2 (v2048qi)
DEF_RET1_ARG2 (v4096qi)

DEF_RET1_ARG3 (v1qi)
DEF_RET1_ARG3 (v2qi)
DEF_RET1_ARG3 (v4qi)
DEF_RET1_ARG3 (v8qi)
DEF_RET1_ARG3 (v16qi)
DEF_RET1_ARG3 (v32qi)
DEF_RET1_ARG3 (v64qi)
DEF_RET1_ARG3 (v128qi)
DEF_RET1_ARG3 (v256qi)
DEF_RET1_ARG3 (v512qi)
DEF_RET1_ARG3 (v1024qi)
DEF_RET1_ARG3 (v2048qi)
DEF_RET1_ARG3 (v4096qi)

DEF_RET1_ARG4 (v1qi)
DEF_RET1_ARG4 (v2qi)
DEF_RET1_ARG4 (v4qi)
DEF_RET1_ARG4 (v8qi)
DEF_RET1_ARG4 (v16qi)
DEF_RET1_ARG4 (v32qi)
DEF_RET1_ARG4 (v64qi)
DEF_RET1_ARG4 (v128qi)
DEF_RET1_ARG4 (v256qi)
DEF_RET1_ARG4 (v512qi)
DEF_RET1_ARG4 (v1024qi)
DEF_RET1_ARG4 (v2048qi)
DEF_RET1_ARG4 (v4096qi)

DEF_RET1_ARG5 (v1qi)
DEF_RET1_ARG5 (v2qi)
DEF_RET1_ARG5 (v4qi)
DEF_RET1_ARG5 (v8qi)
DEF_RET1_ARG5 (v16qi)
DEF_RET1_ARG5 (v32qi)
DEF_RET1_ARG5 (v64qi)
DEF_RET1_ARG5 (v128qi)
DEF_RET1_ARG5 (v256qi)
DEF_RET1_ARG5 (v512qi)
DEF_RET1_ARG5 (v1024qi)
DEF_RET1_ARG5 (v2048qi)
DEF_RET1_ARG5 (v4096qi)

DEF_RET1_ARG6 (v1qi)
DEF_RET1_ARG6 (v2qi)
DEF_RET1_ARG6 (v4qi)
DEF_RET1_ARG6 (v8qi)
DEF_RET1_ARG6 (v16qi)
DEF_RET1_ARG6 (v32qi)
DEF_RET1_ARG6 (v64qi)
DEF_RET1_ARG6 (v128qi)
DEF_RET1_ARG6 (v256qi)
DEF_RET1_ARG6 (v512qi)
DEF_RET1_ARG6 (v1024qi)
DEF_RET1_ARG6 (v2048qi)
DEF_RET1_ARG6 (v4096qi)

DEF_RET1_ARG7 (v1qi)
DEF_RET1_ARG7 (v2qi)
DEF_RET1_ARG7 (v4qi)
DEF_RET1_ARG7 (v8qi)
DEF_RET1_ARG7 (v16qi)
DEF_RET1_ARG7 (v32qi)
DEF_RET1_ARG7 (v64qi)
DEF_RET1_ARG7 (v128qi)
DEF_RET1_ARG7 (v256qi)
DEF_RET1_ARG7 (v512qi)
DEF_RET1_ARG7 (v1024qi)
DEF_RET1_ARG7 (v2048qi)
DEF_RET1_ARG7 (v4096qi)

DEF_RET1_ARG8 (v1qi)
DEF_RET1_ARG8 (v2qi)
DEF_RET1_ARG8 (v4qi)
DEF_RET1_ARG8 (v8qi)
DEF_RET1_ARG8 (v16qi)
DEF_RET1_ARG8 (v32qi)
DEF_RET1_ARG8 (v64qi)
DEF_RET1_ARG8 (v128qi)
DEF_RET1_ARG8 (v256qi)
DEF_RET1_ARG8 (v512qi)
DEF_RET1_ARG8 (v1024qi)
DEF_RET1_ARG8 (v2048qi)
DEF_RET1_ARG8 (v4096qi)

DEF_RET1_ARG9 (v1qi)
DEF_RET1_ARG9 (v2qi)
DEF_RET1_ARG9 (v4qi)
DEF_RET1_ARG9 (v8qi)
DEF_RET1_ARG9 (v16qi)
DEF_RET1_ARG9 (v32qi)
DEF_RET1_ARG9 (v64qi)
DEF_RET1_ARG9 (v128qi)
DEF_RET1_ARG9 (v256qi)
DEF_RET1_ARG9 (v512qi)
DEF_RET1_ARG9 (v1024qi)
DEF_RET1_ARG9 (v2048qi)
DEF_RET1_ARG9 (v4096qi)

/* { dg-final { scan-assembler-times {li\s+a[0-1],\s*0} 9 } } */
/* { dg-final { scan-assembler-times {lbu\s+a0,\s*[0-9]+\(sp\)} 8 } } */
/* { dg-final { scan-assembler-times {lhu\s+a0,\s*[0-9]+\(sp\)} 8 } } */
/* { dg-final { scan-assembler-times {lw\s+a0,\s*[0-9]+\(sp\)} 8 } } */
/* { dg-final { scan-assembler-times {ld\s+a[0-1],\s*[0-9]+\(sp\)} 35 } } */
/* { dg-final { scan-assembler-times {sb\s+a[0-7],\s*[0-9]+\(sp\)} 43 } } */
/* { dg-final { scan-assembler-times {sh\s+a[0-7],\s*[0-9]+\(sp\)} 43 } } */
/* { dg-final { scan-assembler-times {sw\s+a[0-7],\s*[0-9]+\(sp\)} 43 } } */
/* { dg-final { scan-assembler-times {sd\s+a[0-7],\s*[0-9]+\(sp\)} 103 } } */