Recently I needed to implement a CircularBuffer (no, not the CircularBuffer in DataStructures.jl). Setting the details aside, I've run into two puzzles so far:
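- When assigning to a slice of an Array, indexing with a linear range (where the length is known up front) seems to be faster than the `(:)` syntax? I benchmarked a few sizes and the ratio is consistently about 2x.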
julia> xs = Array{Float64}(undef, (100, 100, 100, 100));
julia> using BenchmarkTools
julia> @benchmark xs[100*100*100*99 + 1 : 100*100*100*100] = $(randn(100, 100, 100))
BenchmarkTools.Trial:
memory estimate: 32 bytes
allocs estimate: 1
--------------
minimum time: 478.000 μs (0.00% GC)
median time: 562.000 μs (0.00% GC)
mean time: 601.356 μs (0.00% GC)
maximum time: 1.958 ms (0.00% GC)
--------------
samples: 8203
evals/sample: 1
julia> @benchmark xs[:, :, :, 100]= $(randn(100, 100, 100))
BenchmarkTools.Trial:
memory estimate: 864 bytes
allocs estimate: 31
--------------
minimum time: 1.224 ms (0.00% GC)
median time: 1.389 ms (0.00% GC)
mean time: 1.416 ms (0.00% GC)
maximum time: 3.056 ms (0.00% GC)
--------------
samples: 3494
evals/sample: 1
Judging from the source, is this because the Colon-based path ends up doing a few extra passes?
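For anyone who wants to reproduce this on equal footing, here is a minimal sketch of the two assignments side by side, plus a third option, `copyto!` with explicit offsets. The array sizes are shrunk so it runs instantly, and the names `src`, `n` and `offset` are mine, not from the code above. I have not verified which variant is fastest on a given Julia version; the sketch only checks that all three write the same data.

```julia
# Illustrative sizes (much smaller than in the post); all names are hypothetical.
xs  = Array{Float64}(undef, (10, 10, 10, 10))
src = randn(10, 10, 10)

n      = length(src)                 # number of elements in one slab
offset = n * (size(xs, 4) - 1)       # linear offset of the last slab (column-major)

# (1) Colon-based slice assignment
fill!(xs, 0.0)
xs[:, :, :, end] = src
a = xs[:, :, :, end]

# (2) Linear-range assignment, as in the first benchmark
fill!(xs, 0.0)
xs[offset+1 : offset+n] = src
b = xs[:, :, :, end]

# (3) copyto! with explicit destination/source offsets and element count
fill!(xs, 0.0)
copyto!(xs, offset + 1, src, 1, n)
c = xs[:, :, :, end]

println(a == src && b == src && c == src)
```

When timing these with BenchmarkTools, it is probably worth interpolating `xs` as well (`$xs`), since a non-constant global in the benchmarked expression can add its own overhead and blur the comparison.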
using BenchmarkTools
import Base: getindex, setindex!, size
struct A{T, N} <: AbstractArray{T, N}
    data::Array{T, N}
end
size(x::A) = size(x.data)
getindex(x::A{T, N}, I::Vararg{Int, N}) where {T,N} = getindex(x.data, I[1:N-1]..., 1)
a = A(Array{Float64}(undef, (100, 100, 100, 100)));
@benchmark a[:, :, :, 1]
BenchmarkTools.Trial:
memory estimate: 221.25 MiB
allocs estimate: 5000005
--------------
minimum time: 661.683 ms (2.35% GC)
median time: 721.564 ms (2.20% GC)
mean time: 728.419 ms (3.18% GC)
maximum time: 814.892 ms (8.24% GC)
--------------
samples: 7
evals/sample: 1
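What I don't understand here is why there are so many allocations. The `getindex(x.data, I[1:N-1]..., 1)` above is only an example to illustrate the problem; in the real code the `1` is computed. For comparison: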
julia> @benchmark a.data[:, :, :, 1]
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 5
--------------
minimum time: 1.225 ms (0.00% GC)
median time: 1.472 ms (0.00% GC)
mean time: 1.559 ms (6.20% GC)
maximum time: 3.675 ms (0.00% GC)
--------------
samples: 3168
evals/sample: 1
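One possible explanation, offered as a guess and not a confirmed diagnosis: `I[1:N-1]` indexes a tuple with a range, and if the compiler cannot infer the length of the resulting tuple, the splat turns into a dynamic call that allocates on every scalar access. Roughly 5 allocations for each of the 10^6 elements would match the 5000005 reported above. A sketch of a variant that uses `Base.front`, which drops the last element of a tuple in a way the compiler can follow, is below. The type name `B` and the small test array are mine, and newer Julia versions may already infer the original form through constant propagation.

```julia
import Base: getindex, size

# Hypothetical variant of the wrapper from the post; `B` is a name introduced here.
struct B{T, N} <: AbstractArray{T, N}
    data::Array{T, N}
end

size(x::B) = size(x.data)

# Base.front(I) returns all but the last element of the tuple I.
# Its result type is known from the type of I, so the splat can stay static.
# As in the post, the last index is replaced by a fixed 1 for illustration.
getindex(x::B{T, N}, I::Vararg{Int, N}) where {T, N} =
    getindex(x.data, Base.front(I)..., 1)

b = B(rand(4, 4, 4, 4))
s = b[:, :, :, 2]        # the requested last index is ignored; slab 1 is read

println(s == b.data[:, :, :, 1])
```

If the allocations disappear with this variant, the tuple slicing was the culprit; if not, checking the method with `@code_warntype` would be the next thing I would try.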