Thanks for using Compiler Explorer
Sponsors
Jakt
C++
Ada
Analysis
Android Java
Android Kotlin
Assembly
C
C3
Carbon
C++ (Circle)
CIRCT
Clean
CMake
CMakeScript
COBOL
C++ for OpenCL
MLIR
Cppx
Cppx-Blue
Cppx-Gold
Cpp2-cppfront
Crystal
C#
CUDA C++
D
Dart
Elixir
Erlang
Fortran
F#
GLSL
Go
Haskell
HLSL
Hook
Hylo
IL
ispc
Java
Julia
Kotlin
LLVM IR
LLVM MIR
Modula-2
Nim
Objective-C
Objective-C++
OCaml
OpenCL C
Pascal
Pony
Python
Racket
Ruby
Rust
Snowball
Scala
Solidity
Spice
SPIR-V
Swift
LLVM TableGen
Toit
TypeScript Native
V
Vala
Visual Basic
WASM
Zig
Javascript
GIMPLE
Ygen
fortran source #1
Output
Compile to binary object
Link to binary
Execute the code
Intel asm syntax
Demangle identifiers
Verbose demangling
Filters
Unused labels
Library functions
Directives
Comments
Horizontal whitespace
Debug intrinsics
Compiler
AARCH64 gfortran 10.5.0
AARCH64 gfortran 11.4.0
AARCH64 gfortran 12.1.0
AARCH64 gfortran 12.2.0
AARCH64 gfortran 12.3.0
AARCH64 gfortran 12.4.0
AARCH64 gfortran 13.1.0
AARCH64 gfortran 13.2.0
AARCH64 gfortran 13.3.0
AARCH64 gfortran 14.1.0
AARCH64 gfortran 14.2.0
AARCH64 gfortran 4.9.4
AARCH64 gfortran 5.5.0
AARCH64 gfortran 6.4
AARCH64 gfortran 7.3
AARCH64 gfortran 8.2
ARM (32bit) gfortran 10.5.0
ARM (32bit) gfortran 11.4.0
ARM (32bit) gfortran 12.1.0
ARM (32bit) gfortran 12.2.0
ARM (32bit) gfortran 12.3.0
ARM (32bit) gfortran 12.4.0
ARM (32bit) gfortran 13.1.0
ARM (32bit) gfortran 13.2.0
ARM (32bit) gfortran 13.3.0
ARM (32bit) gfortran 14.1.0
ARM (32bit) gfortran 14.2.0
ARM (32bit) gfortran 6.4
ARM (32bit) gfortran 7.3
ARM (32bit) gfortran 8.2
HPPA gfortran 14.2.0
LOONGARCH64 gfortran 12.2.0
LOONGARCH64 gfortran 12.3.0
LOONGARCH64 gfortran 12.4.0
LOONGARCH64 gfortran 13.1.0
LOONGARCH64 gfortran 13.2.0
LOONGARCH64 gfortran 13.3.0
LOONGARCH64 gfortran 14.1.0
LOONGARCH64 gfortran 14.2.0
MIPS gfortran 12.1.0
MIPS gfortran 12.2.0
MIPS gfortran 12.3.0
MIPS gfortran 12.4.0
MIPS gfortran 13.1.0
MIPS gfortran 13.2.0
MIPS gfortran 13.3.0
MIPS gfortran 14.1.0
MIPS gfortran 14.2.0
MIPS gfortran 4.9.4
MIPS gfortran 5.5.0
MIPS gfortran 9.5.0
MIPS64 gfortran 12.1.0
MIPS64 gfortran 12.2.0
MIPS64 gfortran 12.3.0
MIPS64 gfortran 12.4.0
MIPS64 gfortran 13.1.0
MIPS64 gfortran 13.2.0
MIPS64 gfortran 13.3.0
MIPS64 gfortran 14.1.0
MIPS64 gfortran 14.2.0
MIPS64 gfortran 4.9.4
MIPS64 gfortran 5.5.0
MIPS64 gfortran 9.5.0
MIPS64el gfortran 12.1.0
MIPS64el gfortran 12.2.0
MIPS64el gfortran 12.3.0
MIPS64el gfortran 12.4.0
MIPS64el gfortran 13.1.0
MIPS64el gfortran 13.2.0
MIPS64el gfortran 13.3.0
MIPS64el gfortran 14.1.0
MIPS64el gfortran 14.2.0
MIPS64el gfortran 4.9.4
MIPS64el gfortran 5.5.0
MIPS64el gfortran 9.5.0
MIPSel gfortran 12.1.0
MIPSel gfortran 12.2.0
MIPSel gfortran 12.3.0
MIPSel gfortran 12.4.0
MIPSel gfortran 13.1.0
MIPSel gfortran 13.2.0
MIPSel gfortran 13.3.0
MIPSel gfortran 14.1.0
MIPSel gfortran 14.2.0
MIPSel gfortran 4.9.4
MIPSel gfortran 5.5.0
MIPSel gfortran 9.5.0
POWER gfortran 12.1.0
POWER gfortran 12.2.0
POWER gfortran 12.3.0
POWER gfortran 12.4.0
POWER gfortran 13.1.0
POWER gfortran 13.2.0
POWER gfortran 13.3.0
POWER gfortran 14.1.0
POWER gfortran 14.2.0
POWER64 gfortran 12.1.0
POWER64 gfortran 12.2.0
POWER64 gfortran 12.3.0
POWER64 gfortran 12.4.0
POWER64 gfortran 13.1.0
POWER64 gfortran 13.2.0
POWER64 gfortran 13.3.0
POWER64 gfortran 14.1.0
POWER64 gfortran 14.2.0
POWER64 gfortran trunk
POWER64le gfortran 12.1.0
POWER64le gfortran 12.2.0
POWER64le gfortran 12.3.0
POWER64le gfortran 12.4.0
POWER64le gfortran 13.1.0
POWER64le gfortran 13.2.0
POWER64le gfortran 13.3.0
POWER64le gfortran 14.1.0
POWER64le gfortran 14.2.0
POWER64le gfortran trunk
RISC-V 32-bits gfortran (trunk)
RISC-V 32-bits gfortran 12.1.0
RISC-V 64-bits gfortran (trunk)
RISC-V 64-bits gfortran 12.1.0
RISCV (32bit) gfortran 11.4.0
RISCV (32bit) gfortran 12.2.0
RISCV (32bit) gfortran 12.3.0
RISCV (32bit) gfortran 12.4.0
RISCV (32bit) gfortran 13.1.0
RISCV (32bit) gfortran 13.2.0
RISCV (32bit) gfortran 13.3.0
RISCV (32bit) gfortran 14.1.0
RISCV (32bit) gfortran 14.2.0
RISCV64 gfortran 11.4.0
RISCV64 gfortran 12.2.0
RISCV64 gfortran 12.3.0
RISCV64 gfortran 12.4.0
RISCV64 gfortran 13.1.0
RISCV64 gfortran 13.2.0
RISCV64 gfortran 13.3.0
RISCV64 gfortran 14.1.0
RISCV64 gfortran 14.2.0
SPARC LEON gfortran 12.2.0
SPARC LEON gfortran 12.3.0
SPARC LEON gfortran 12.4.0
SPARC LEON gfortran 13.1.0
SPARC LEON gfortran 13.2.0
SPARC LEON gfortran 13.3.0
SPARC LEON gfortran 14.1.0
SPARC LEON gfortran 14.2.0
SPARC gfortran 12.2.0
SPARC gfortran 12.3.0
SPARC gfortran 12.4.0
SPARC gfortran 13.1.0
SPARC gfortran 13.2.0
SPARC gfortran 13.3.0
SPARC gfortran 14.1.0
SPARC gfortran 14.2.0
SPARC64 gfortran 12.2.0
SPARC64 gfortran 12.3.0
SPARC64 gfortran 12.4.0
SPARC64 gfortran 13.1.0
SPARC64 gfortran 13.2.0
SPARC64 gfortran 13.3.0
SPARC64 gfortran 14.1.0
SPARC64 gfortran 14.2.0
flang-trunk
flang-trunk-fc1
power64 AT12.0
power64 AT13.0
power64le AT12.0
power64le AT13.0
s390x gfortran 12.1.0
s390x gfortran 12.2.0
s390x gfortran 12.3.0
s390x gfortran 12.4.0
s390x gfortran 13.1.0
s390x gfortran 13.2.0
s390x gfortran 13.3.0
s390x gfortran 14.1.0
s390x gfortran 14.2.0
x86 nvfortran 24.9
x86-64 gfortran (trunk)
x86-64 gfortran 10.1
x86-64 gfortran 10.2
x86-64 gfortran 10.3
x86-64 gfortran 10.4
x86-64 gfortran 10.5
x86-64 gfortran 11.1
x86-64 gfortran 11.2
x86-64 gfortran 11.3
x86-64 gfortran 11.4
x86-64 gfortran 12.1
x86-64 gfortran 12.2
x86-64 gfortran 12.3
x86-64 gfortran 12.4
x86-64 gfortran 13.1
x86-64 gfortran 13.2
x86-64 gfortran 13.3
x86-64 gfortran 14.1
x86-64 gfortran 14.2
x86-64 gfortran 4.9.4
x86-64 gfortran 5.5
x86-64 gfortran 6.3
x86-64 gfortran 7.1
x86-64 gfortran 7.2
x86-64 gfortran 7.3
x86-64 gfortran 8.1
x86-64 gfortran 8.2
x86-64 gfortran 8.3
x86-64 gfortran 8.4
x86-64 gfortran 8.5
x86-64 gfortran 9.1
x86-64 gfortran 9.2
x86-64 gfortran 9.3
x86-64 gfortran 9.4
x86-64 ifort 19.0.0
x86-64 ifort 2021.1.2
x86-64 ifort 2021.10.0
x86-64 ifort 2021.11.0
x86-64 ifort 2021.2.0
x86-64 ifort 2021.3.0
x86-64 ifort 2021.4.0
x86-64 ifort 2021.5.0
x86-64 ifort 2021.6.0
x86-64 ifort 2021.7.0
x86-64 ifort 2021.7.1
x86-64 ifort 2021.8.0
x86-64 ifort 2021.9.0
x86-64 ifx (latest)
x86-64 ifx 2021.1.2
x86-64 ifx 2021.2.0
x86-64 ifx 2021.3.0
x86-64 ifx 2021.4.0
x86-64 ifx 2022.0.0
x86-64 ifx 2022.1.0
x86-64 ifx 2022.2.0
x86-64 ifx 2022.2.1
x86-64 ifx 2023.0.0
x86-64 ifx 2023.1.0
x86-64 ifx 2023.2.1
x86-64 ifx 2024.0.0
Options
Source code
module mymod use iso_fortran_env contains subroutine sub_logsum(LogSum,Prob, V,sigma) ! DESCRIPTION ! Calculates the log-sum and choice-probabilities. The computation avoids ! overflows. !See for more info: ! https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/ ! https://nhigham.com/2021/01/05/what-is-the-log-sum-exp-function/ ! INPUTS ! V: Vector with values of the different choices ! sigma: Standard deviation of the taste shock ! OUTPUTS ! LogSum: Log sum of exponentials ! Prob: Choice probabilities real(8), intent(in) :: V(:) real(8), intent(in) :: sigma real(8), intent(out) :: LogSum real(8), intent(out) :: Prob(:) ! Local variables: integer :: n, id real(8) :: mxm, sum_prob real, save :: times(4) = 0 integer, save :: ntest = 0 real :: t1,t2,t3,t4,t5 ! Body of sub_logsum call cpu_time(t1) n = size(V) ! Check inputs if (n==1) then stop "sub_logsum: Need at least two elements" endif if (n/=size(Prob)) then stop "sub_logsum: V and Prob are not compatible" endif if (sigma<=0d0) then stop "sub_logsum: sigma must be zero or positive" endif ! Maximum over the discrete choice id = maxloc(V,dim=1) mxm = V(id) call cpu_time(t2) ! Logsum and probabilities ! a. numerically robust log-sum LogSum = mxm + sigma*log(sum(exp((V-mxm)/sigma))) call cpu_time(t3) ! b. numerically robust probability Prob = exp((V-LogSum)/sigma) call cpu_time(t4) !-----------------------------------------------------------------! ! Check if outputs are correct. Comment out this section for speed if (LogSum/=LogSum) then stop "sub_logsum: LogSum is either NaN or Inf" endif if (any(Prob<0d0)) then stop "sub_logsum: Some elements of Prob are negative" endif sum_prob = sum(Prob) if (abs(sum_prob-1d0)>1d-7) then write(*,*) "sum_prob = ", sum_prob stop "sub_logsum: Prob does not sum to one" endif Prob = Prob/sum_prob call cpu_time(t5) ntest = ntest+1 times(1) = times(1)+t2-t1 times(2) = times(2)+t3-t2 times(3) = times(3)+t4-t3 times(4) = times(4)+t5-t4 if (ntest>=20000) print *, 'intrinsic: times=',times !-----------------------------------------------------------------! end subroutine sub_logsum subroutine sub_logsum_f(LogSum,Prob, V,sigma) ! DESCRIPTION ! Calculates the log-sum and choice-probabilities. The computation avoids ! overflows. !See for more info: ! https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/ ! https://nhigham.com/2021/01/05/what-is-the-log-sum-exp-function/ ! INPUTS ! V: Vector with values of the different choices ! sigma: Standard deviation of the taste shock ! OUTPUTS ! LogSum: Log sum of exponentials ! Prob: Choice probabilities real(8), intent(in) :: V(:) real(8), intent(in) :: sigma real(8), intent(out) :: LogSum real(8), intent(out) :: Prob(:) ! Local variables: integer :: n, id real(8) :: mxm, sum_prob real, save :: times(4) = 0 real :: t1,t2,t3,t4,t5 integer, save :: ntest = 0 ! Body of sub_logsum call cpu_time(t1) n = size(V) ! Check inputs if (n==1) then stop "sub_logsum: Need at least two elements" endif if (n/=size(Prob)) then stop "sub_logsum: V and Prob are not compatible" endif if (sigma<=0d0) then stop "sub_logsum: sigma must be zero or positive" endif ! Maximum over the discrete choice id = maxloc(V,dim=1) mxm = V(id) call cpu_time(t2) ! Logsum and probabilities ! a. numerically robust log-sum LogSum = mxm + sigma*flog_spline(sum(fexp_spline((V-mxm)/sigma))) call cpu_time(t3) ! b. numerically robust probability Prob = fexp_spline((V-LogSum)/sigma) call cpu_time(t4) !-----------------------------------------------------------------! ! Check if outputs are correct. Comment out this section for speed if (LogSum/=LogSum) then stop "sub_logsum: LogSum is either NaN or Inf" endif if (any(Prob<0d0)) then stop "sub_logsum: Some elements of Prob are negative" endif sum_prob = sum(Prob) if (abs(sum_prob-1d0)>1d-7) then write(*,*) "sum_prob = ", sum_prob stop "sub_logsum: Prob does not sum to one" endif Prob = Prob/sum_prob call cpu_time(t5) ntest = ntest+1 times(1) = times(1)+t2-t1 times(2) = times(2)+t3-t2 times(3) = times(3)+t4-t3 times(4) = times(4)+t5-t4 if (ntest>=20000) print *, 'fast: times=',times !-----------------------------------------------------------------! end subroutine sub_logsum_f subroutine sub_logsum_perini(LogSum,Prob, V,sigma) ! DESCRIPTION ! Calculates the log-sum and choice-probabilities. The computation avoids ! overflows. !See for more info: ! https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/ ! https://nhigham.com/2021/01/05/what-is-the-log-sum-exp-function/ ! INPUTS ! V: Vector with values of the different choices ! sigma: Standard deviation of the taste shock ! OUTPUTS ! LogSum: Log sum of exponentials ! Prob: Choice probabilities real(8), intent(in) :: V(4) real(8), intent(in) :: sigma real(8), intent(out) :: LogSum real(8), intent(out) :: Prob(4) ! Local variables: integer :: n, i real(8) :: mxm, sum_prob, rsig, Vrel(4) real, save :: times(4) = 0 real :: t1,t2,t3,t4,t5 integer, save :: ntest = 0 ! Body of sub_logsum call cpu_time(t1) n = size(V) ! Maximum over the discrete choice mxm = maxval(V) Vrel = V-mxm call cpu_time(t2) ! Logsum and probabilities ! a. numerically robust log-sum rsig = 1.0_real64/sigma LogSum = +sigma*flog_spline(sum(fexp_spline(Vrel*rsig))) call cpu_time(t3) ! b. numerically robust probability Prob = fexp_spline((Vrel-LogSum)*rsig) call cpu_time(t4) sum_prob = sum(Prob) Prob = Prob/sum_prob call cpu_time(t5) ntest = ntest+1 times(1) = times(1)+t2-t1 times(2) = times(2)+t3-t2 times(3) = times(3)+t4-t3 times(4) = times(4)+t5-t4 if (ntest>=20000) print *, 'fast: times=',times !-----------------------------------------------------------------! end subroutine sub_logsum_perini ! Use a 3rd-degree spline, differentiable at the extremes elemental real(real64) function fexp_spline(x) result(exp_x) real(real64), intent(in) :: x real(real64) :: xr,xf integer(int64) :: i8 integer :: ideg integer, parameter :: maxDegree = 16 real(real64), parameter :: half = 0.5_real64 real(real64), parameter :: one = 1.0_real64 real(real64), parameter :: two = 2.0_real64 real(real64), parameter :: log2 = log(two) real(real64), parameter :: rlog2 = one/log(two) real(real64), parameter :: sqrt2 = sqrt(two) ! Series centered in x=1/2 real(real64), parameter :: x0 = half real(real64), parameter :: f0 = 1.5_real64-sqrt2 real(real64), parameter :: k(maxDegree) = [one-sqrt2*log2,& (-sqrt2*log2**ideg/gamma(ideg+one),ideg=2,maxDegree)] integer(int64), parameter :: mantissa = 2_int64**52 integer(int64), parameter :: bias = 1023_int64 integer(int64), parameter :: ishift = mantissa*bias ! Cubic spline expansion coefficients real(real64), parameter :: s(3)= [-7.6167541742324804e-2_real64,& -0.23141283591588344_real64 ,& 0.30758037765820823_real64 ] xr = x*rlog2 ! Change of base: e^x -> 2^y xf = xr-floor(xr) ! Compute fractional part xr = xr-((s(1)*xf+s(2))*xf+s(3))*xf ! Subtract Delta i8 = int(mantissa*xr + ishift,int64) ! exp_x = transfer(i8,exp_x) end function fexp_spline elemental real(real64) function flog_spline(x) result(log_x) real(real64), intent(in) :: x integer, parameter :: maxDegree = 16 real(real64), parameter :: half = 0.5_real64 real(real64), parameter :: one = 1.0_real64 real(real64), parameter :: two = 2.0_real64 real(real64), parameter :: log2 = log(two) real(real64), parameter :: rlog2 = one/log(two) real(real64), parameter :: sqrt2 = sqrt(two) integer(int64), parameter :: mantissa_left = 2_int64**52 integer(int64), parameter :: mantissa = not(shiftl(2047_int64,52)) integer(int64), parameter :: bias = 1023_int64 integer(int64), parameter :: ishift = mantissa_left*bias real(real64) :: xi,xf integer(int64) :: i8 real(real64), parameter :: s(3)= [rlog2,3.0_real64-2.5_real64*rlog2,1.5_real64*rlog2-2.0_real64] i8 = transfer(x,i8) xi = shiftr(i8,52)-bias ! Take mantissa part only xf = transfer(iand(i8,mantissa)+ishift,xf)-one ! Apply cubic polynomial xf = xf*(s(1)+xf*(s(2)+xf*s(3))) ! Compute log and Change of basis: log_2(x) -> log_e(x) = log2*log_2(x) log_x = (xf+xi)*log2 end function flog_spline end module program test_logsum_f use mymod, only: sub_logsum,sub_logsum_f,sub_logsum_perini implicit none real(8), allocatable :: V(:), Prob(:) real(8) :: LogSum, sigma, time, time_f, time_perini, c_start, c_end integer, parameter :: ntest = 20000 integer :: i,ii ! Assign values sigma = 0.1d0 V = [1d0, 2d0, 3d0, 4d0] ! Call sub_logsum allocate(Prob(size(V))) write(*,*) "Call sub_logsum" call cpu_time(c_start) do i=1,ntest call sub_logsum(LogSum,Prob, V,sigma) enddo call cpu_time(c_end) time = c_end-c_start write(*,*) "time = ", time write(*,'(X,A,F13.10)') "LogSum = ", LogSum write(*,*) "Prob = " do ii = 1,size(Prob) write(*,'(F13.10)') Prob(ii) enddo deallocate(Prob) write(*,*) "------------------------------------------" write(*,*) "Call sub_logsum_f" allocate(Prob(size(V))) call cpu_time(c_start) do i=1,ntest call sub_logsum_f(LogSum,Prob, V,sigma) enddo call cpu_time(c_end) time_f = c_end-c_start write(*,*) "time_f = ", time_f write(*,'(X,A,F13.10)') "LogSum = ", LogSum write(*,*) "Prob = " do ii = 1,size(Prob) write(*,'(F13.10)') Prob(ii) enddo write(*,*) "------------------------------------------" write(*,*) "Call sub_logsum_perini" call cpu_time(c_start) do i=1,ntest call sub_logsum_perini(LogSum,Prob, V,sigma) enddo call cpu_time(c_end) time_perini = c_end-c_start write(*,*) "time_perini = ", time_perini write(*,'(X,A,F13.10)') "LogSum = ", LogSum write(*,*) "Prob = " do ii = 1,size(Prob) write(*,'(F13.10)') Prob(ii) enddo write(*,*) "------------------------------------------" write(*,'(X,A)') "logsum perini math vs logsum:" write(*,'(X,A,F13.10)') "time_p/time = ", time_perini/time,' per evaluation = ',time*1e6/ntest,' us' write(*,*) "------------------------------------------" write(*,'(X,A)') "logsum with fast math vs logsum:" write(*,'(X,A,F13.10)') "time_f/time = ", time_f/time,' per evaluation = ',time*1e6/ntest,' us' end program test_logsum_f
Become a Patron
Sponsor on GitHub
Donate via PayPal
Source on GitHub
Mailing list
Installed libraries
Wiki
Report an issue
How it works
Contact the author
CE on Mastodon
About the author
Statistics
Changelog
Version tree