Source code
; test with
; sudo mount -t tmpfs -o size=1G,huge=always tmpfs /tmp/testd
; asm-link -nd yes.asm &&
; ./yes > testd/yesout && perf stat -d ./yes > testd/yesout    # warmup and run

global _start
_start:
;; All regs zeroed at the top of a static executable (except RSP)

SIZEPOW equ 16
;    mov    ecx, 8192
;    bts    ecx, 16          ; 1<<16 = 65536 x 4 bytes to fill. 4 bytes, any power of 2 vs. mov cx, imm16 limited to 65535
    bts    ecx, SIZEPOW
    mov    edx, ecx          ; size arg for sys_write in bytes; 1/4 of the actual buffer size
    mov    eax, `y\ny\n`
;    sub    rsp, rdx
;    and    rsp, -65536      ; With: 631,4M cycles 357.6M insns. Without: 651,3M cycles 359.4M instructions. (times are somewhat noisy but there's a real difference)
    ; static buffer is smaller code size, if we don't count the BSS as anything
    ; and gives us cache-line / page alignment for kernel memcpy speed

;    mov    rdi, rsp
;    push   rsp
;    pop    rdi
    mov    edi, buf
    mov    esi, edi
    rep stosd                ; wmemset(rdi, eax, rcx) 4*rcx bytes

    lea    edi, [rcx + 1]    ; mov edi, 1
.loop:
    mov    eax, edi          ; __NR_write = stdout fileno
    syscall
    test   eax,eax           ;;;;;;;;;;; For benchmarking purposes only: abort on write fail
    jge    .loop
;    jmp    .loop

    mov    eax, 231
    xor    edi, edi
    syscall

section .bss
align 4096
buf:    resd 1<<SIZEPOW