Stochastic rounding for floating-point arithmetic.
This package exports Float64sr
, Float32sr
,Float16sr
, and BFloat16sr
, four number formats that behave
like their deterministic counterparts but with stochastic rounding that is proportional to the
distance of the next representable numbers and therefore
exact in expectation
(see also example below in Usage.
The only known hardware implementation available is
Graphcore's IPU with stochastic rounding,
but other vendors are likely working on stochastic rounding for their next-generation
GPUs (and maybe CPUs too).
The software emulation in StochasticRounding.jl makes the number format slower, but e.g. Float32 with stochastic rounding is only about 2x slower than the default deterministically rounded Float64. Xoroshiro128Plus, a random number generator from the Xorshift family, is used from the RandomNumbers.jl package, due to its speed and statistical properties.
Every format of Float64sr
, Float32sr
,Float16sr
, and BFloat16sr
uses a higher precision format
to obtain the "exact" arithmetic result which is then stochastically rounded to the respective
lower precision format. Float16sr
and BFloat16sr
use Float32
for this,
Float32sr
uses Float64
, and Float64sr
uses Double64
from
DoubleFloats.jl
You are welcome to raise issues, ask questions or suggest any changes or new features.
BFloat16sr
is based on BFloat16s.jl
Float16sr
is slow in Julia <1.6, but much faster in Julia >=1.6 due to LLVM's half
support.
StochasticRounding.jl defines stochastic_round
as a general interface to round stochastically to
a floating-point number, e.g.
julia> stochastic_round(Float32,π)
3.1415927f0
julia> stochastic_round(Float32,π)
3.1415925f0
Here we round π (which is first converted to Float64) to Float32 where it is not exactly representable, hence we have results that differ in the last bit. The chance here is
julia> chance_roundup(Float32,π)
0.6333222836256027
about 63% to obtain the first (which therefore would be round to nearest, the default rounding mode),
and hence 37% to obtain the latter (round away from nearest). The stochastic_round
function is wrapped
into the floating-point formats Float64sr
, Float32sr
, Float16sr
and BFloat16sr
are supposed to be drop-in
replacements for their deterministically rounded counterparts. You can create data of those types as expected
(which is bitwise identical to the deterministic formats respectively) and the type
will trigger stochastic rounding on every arithmetic operation.
julia> a = BFloat16sr(1.0)
BFloat16sr(1.0)
julia> a/3
BFloat16sr(0.33398438)
julia> a/3
BFloat16sr(0.33203125)
As 1/3
is not exactly representable the rounding will be at 66.6% chance towards 0.33398438
and at 33.3% towards 0.33203125 such that in expectation the result is 0.33333... and therefore exact.
Solving a linear equation system with LU decomposition and stochastic rounding:
A = randn(Float32sr,3,3)
b = randn(Float32sr,3)
Now execute the \
several times and the results will differ slightly due to stochastic rounding
julia> A\b
3-element Vector{Float32sr}:
3.3321106f0
2.0391452f0
-0.59199476f0
julia> A\b
3-element Vector{Float32sr}:
3.3321111f0
2.0391457f0
-0.5919949f0
The random number generator is randomly seeded on every import
or using
such that running
the same calculations twice, will not yield bit-reproducible results. However, you can seed
the random number generator at any time with any integer larger than zero as follows
julia> StochasticRounding.seed(2156712)
the state of the random number generator for StochasticRounding.jl is independent from Julia's default,
which is used for rand()
, randn()
etc.
Round-to-nearest (tie to even) is the standard rounding mode for IEEE floats. Stochastic rounding is explained in the following schematic
The exact result x of an arithmetic operation (located at one fifth between x₂ and x₃ in this example) is always rounded down to x₂ for round-to-nearest. For stochastic rounding, only at 80% chance x is round down. At 20% chance it is round up to x₃, proportional to the distance of x between x₂ and x₃.
StochasticRounding.jl is registered in the Julia registry. Hence, simply do
julia>] add StochasticRounding
where ]
opens the package manager.
StochasticRounding.jl is among the fastest software implementation of stochastic rounding for floating-point arithmetic. Define a few random 1000000-element arrays
julia> using StochasticRounding, BenchmarkTools, BFloat16s
julia> A = rand(Float64,1000000);
julia> B = rand(Float64,1000000); # A, B shouldn't be identical as a+a=2a is not round
And similarly for the other number types. Then with Julia 1.6 on an Intel(R) Core(R) i5 (Ice Lake) @ 1.1GHz timings via
@btime +($A,$B)
are
rounding mode | Float64 | Float32 | Float16 | BFloat16 |
---|---|---|---|---|
round to nearest | 1132 μs | 452 μs | 1588 μs | 315 μs |
stochastic rounding | 11,368 μs | 2650 μs | 3310 μs | 1850 μs |
Stochastic rounding imposes an about x5 performance decrease for Float32 and BFloat16, but only x2 for Float16, however, 10x for Float64 due to the use of Double64. For more complicated benchmarks the performance decrease is usually within x10. About 50% of the time is spend on the random number generation with Xoroshiro128+.
This package was written for the following publications. If you use this package in your publications please cite us
Klöwer, M, PV Coveney, EA Paxton, TN Palmer, 2023. Periodic orbits in chaotic dynamical systems simulated at low precision, Scientific Reports, 10.1038/s41598-023-37004-4.
Paxton EA, M Chantry, M Klöwer, L Saffin, TN Palmer, 2022. Climate Modelling in Low-Precision: Effects of both Deterministic & Stochastic Rounding, Journal of Climate, 10.1175/JCLI-D-21-0343.1.