库函数的正确用法?

隔壁贴看见一个简单的距离计算函数

就是求欧氏距离,然后我就想用自带的函数写,不用循环

# 通用的距离函数
function Minkowski(
        x::Array{T,1}, 
        y::Array{T,1},
        p::Int
    ) where T <: Number
    
    return sum(abs.(x - y).^p)^(1/p)
end

# 欧式距离-1
Euclidean(x::Array{T,1}, y::Array{T,1}) where T <: Number = Minkowski(x, y, 2)

# 欧式距离-2
function Euclidean2(
        x::Array{T,1}, 
        y::Array{T,1}
    ) where T <: Number
    
    return sqrt(sum(abs2.(x - y)))
end

测试代码

using BenchmarkTools

rng = 10^5
x = rand(-rng:rng, rng)
y = rand(-rng:rng, rng)

输出

julia> @btime distance(x, y)
  81.715 μs (1 allocation: 16 bytes)
2.5760024702397432e7

julia> @btime Euclidean(x, y)
  448.839 μs (5 allocations: 1.53 MiB)
2.5760024702397432e7

julia> @btime Euclidean2(x, y)
  269.225 μs (5 allocations: 1.53 MiB)
2.5760024702397432e7

julia> @btime Minkowski(x, y, 2)
  434.628 μs (5 allocations: 1.53 MiB)
2.5760024702397432e7

去掉 abs 依旧差了不少,

function Euclidean3(
        x::Array{T,1}, 
        y::Array{T,1}
    ) where T <: Number
    
    return sqrt(sum((x - y).^2))
end

# julia> @btime Euclidean3(x, y)
#   252.644 μs (6 allocations: 1.53 MiB)
# 2.5760024702397432e7

Julia 的循环就这么给力吗?那什么时候该用库函数呢?
还是我的测试姿势不对。

1 个赞

差别在allocation上,x - y 会有内存分配的

2 个赞

然后呢?我也想知道

:wink:

这里主要是涉及到了内存分配,如果代码逻辑里不需要这段中间变量,就不应该采用x-y这种计算方式。
抽象出来看,这里计算距离就是一个map + reduce的过程,然后,Julia中刚好提供了一个mapreduce函数:

help?> mapreduce
search: mapreduce

  mapreduce(f, op, itr; [init])

  Apply function f to each element in itr, and then reduce the result using the binary function op. If provided, init must be a neutral element for op that will be returne
  for empty collections. It is unspecified whether init is used for non-empty collections. In general, it will be necessary to provide init to work with empty collections.

  mapreduce is functionally equivalent to calling reduce(op, map(f, itr); init=init), but will in general execute faster since no intermediate collection needs to be
  created. See documentation for reduce and map.

划重点:

but will in general execute faster since no intermediate collection needs to be created.

试一下:

julia> using BenchmarkTools

julia> x = randn(10^5);

julia> y = randn(10^5);

julia> function f(x,y)
       dist = 0
           for i in 1:length(x)
               dist += (x[i]-y[i])^2
           end
           dist = sqrt(dist)
           return dist
       end

julia> @btime f(x,y)
  121.136 μs (1 allocation: 16 bytes)
448.6707620555013

julia> @btime sqrt(reduce(+, map((a, b) -> abs2(a-b), x, y)))
  225.889 μs (6 allocations: 781.41 KiB)
448.6707620554991

julia> @btime sqrt(mapreduce(((a, b),) -> abs2(a-b), +, zip(x, y)))
  105.932 μs (10 allocations: 208 bytes)
448.6707620555013
2 个赞
julia> @btime sqrt(sum(abs2.($x - $y)))
  273.926 μs (4 allocations: 1.53 MiB)
2.570658911172968e7

julia> @btime sqrt(sum(abs2.($x .- $y)))
  135.029 μs (2 allocations: 781.33 KiB)
2.570658911172968e7

julia> @btime sqrt(sum(x->abs2(x[1] - x[2]), zip($x, $y)))
  103.114 μs (1 allocation: 32 bytes)
2.570658911172968e7
1 个赞

1 个帖子被分离到了新主题:加权Jacobi的迭代优化

今天我也遇到了这个性能差异,
@Scheme 最后这个sum写法应该是推荐的写法sum(f, itr),我看源码底层其实是转换成了mapreduce
之前在Python中,总是习惯写sum(a+b for a,b in zip(as, bs)),一不小心这个性能差了有两三倍,花了一晚上才找出是这个导致的 :joy::joy::joy:

然后,发现用mapreduce也是有一些overhead的,性能敏感的地方,还是手写for比较快

https://github.com/JuliaLang/julia/pull/30471

1 个赞