How can I write the following code in broadcast form?

yzqiu0 · October 1, 2018, 12:41

Can the following code be written in broadcast form, along the lines of a = trues(100); a .= (isodd.(1:100))?

n |= n >> 1
n |= n >> 2
n |= n >> 4
n |= n >> 8
n |= n >> 16
n |= n >> 32

Thank you for your replies.
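For readers unfamiliar with the idiom, here is a minimal sketch of what the six lines compute: they copy the highest set bit of n into every lower bit position. The names `smear` and `low8` are illustrative and do not appear in the original post. Each step reads the n produced by the previous step, so the computation is a sequential chain rather than an elementwise operation.

```julia
# Hypothetical helper wrapping the six lines from the question.
function smear(n::Int64)
    n |= n >> 1
    n |= n >> 2
    n |= n >> 4
    n |= n >> 8
    n |= n >> 16
    n |= n >> 32
    return n
end

# Illustrative helper: the lowest 8 bits of x as text.
low8(x) = bitstring(x)[end-7:end]

println(low8(100))         # 01100100
println(low8(smear(100)))  # 01111111
```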

I rewrote it with foreach, but it became several hundred times slower!

foreach(i->n |= n >> i, (1, 2, 4, 8, 16, 32))

for i in (1, 2, 4, 8, 16, 32)
    n |= n >> i
end

Writing it as a for loop like this has no effect on speed.
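A likely explanation for the slowdown, assuming the foreach call sat inside a function where n is a local variable: a closure that assigns to a captured variable forces Julia to box that variable, which makes it type-unstable. The sketch below compares the three spellings. The function names are illustrative. The `foldl` version is an alternative not mentioned in the thread; it threads the accumulator through explicitly, so nothing is captured.

```julia
# foreach with a closure: n is captured AND reassigned, so Julia boxes it.
function smear_foreach(n::Int64)
    foreach(i -> (n |= n >> i), (1, 2, 4, 8, 16, 32))
    return n
end

# Plain for loop: n stays an ordinary local Int64.
function smear_for(n::Int64)
    for i in (1, 2, 4, 8, 16, 32)
        n |= n >> i
    end
    return n
end

# Closure-free functional form: the accumulator is passed as an argument.
smear_foldl(n::Int64) =
    foldl((acc, i) -> acc | (acc >> i), (1, 2, 4, 8, 16, 32); init = n)

println(smear_foreach(100), " ", smear_for(100), " ", smear_foldl(100))
```

All three return the same value. Running `@code_warntype smear_foreach(100)` is one way to check whether n is being boxed on a given Julia version.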

Scheme · October 1, 2018, 23:19

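Why not just write a for loop? I don't think this can be written as a broadcast, and broadcasting is slower than a for loop anyway.
If speed is all you care about, you can unroll it completely by hand, or use @nexprs: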

julia> using Base.Cartesian: @nexprs

julia> n = 100
       @nexprs 6 j -> begin
           i = 1 << (j-1)
           n |= n >> i
       end
127
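The REPL session above runs at global scope. As a minimal sketch, the same macro can be used inside a function, where n is a local variable. The function name `smear_unrolled` is illustrative. `@nexprs 6 j -> ...` emits the body six times at macro-expansion time with j replaced by 1 through 6, so no loop remains in the generated code.

```julia
using Base.Cartesian: @nexprs

# Illustrative wrapper: the six shift-or steps are generated by the macro.
function smear_unrolled(n::Int64)
    @nexprs 6 j -> (n |= n >> (1 << (j - 1)))  # shifts by 1, 2, 4, 8, 16, 32
    return n
end

println(smear_unrolled(100))
```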

Scheme · October 1, 2018, 23:40

If you look at how Julia optimizes the for loop, you will see why foreach is slow. (Also, foreach returns nothing. How were you using it?)

julia> function g(n)
           for i in (1, 2, 4, 8, 16, 32)
               n |= n >> i
           end
           n
       end
g (generic function with 1 method)

julia> @code_llvm g(100)

Read the LLVM IR and you will see that LLVM has completely unrolled this for loop. So it is equivalent to

julia> function f(n)
           n |= n >> 1
           n |= n >> 2
           n |= n >> 4
           n |= n >> 8
           n |= n >> 16
           n |= n >> 32
       end
f (generic function with 1 method)

Let's benchmark them.

julia> using BenchmarkTools

julia> @btime f(n) setup=(n=rand(Int))
  2.182 ns (0 allocations: 0 bytes)
-1

julia> @btime g(n) setup=(n=rand(Int))
  2.181 ns (0 allocations: 0 bytes)
9223372036854775807

Note that you cannot benchmark with a literal argument here:

julia> @btime f(100)
  0.026 ns (0 allocations: 0 bytes)
127

julia> @btime g(100)
  1.882 ns (0 allocations: 0 bytes)
127

because when Julia sees the literal 100, it may constant-fold the entire function call away. For example:

julia> k() = f(100)
k (generic function with 1 method)

julia> @code_typed k()
CodeInfo(
1 1 ─     return 127                │
) => Int64

Hope this helps.

yzqiu0 · October 2, 2018, 01:11

The functions look like this. In the foreach version, I used x -> n |= n >> x to update n repeatedly and returned it at the end:

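# Largest power of 2 less than or equal to n (for n > 0).
function flp2(n::Int64)
    for i in (1, 2, 4, 8, 16, 32)
        n |= n >> i
    end
    n - (n >> 1)
end

# Smallest power of 2 strictly greater than n (for n >= 0); e.g. clp2(8) == 16.
function clp2(n::Int64)
    for i in (1, 2, 4, 8, 16, 32)
        n |= n >> i
    end
    n + 1
end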

In that case, I can stick with the for loop and stop worrying.
Thank you!
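As a sanity check, here is a minimal sketch comparing these functions with `prevpow` and `nextpow` from Base. One thing worth noting: clp2 as posted maps an exact power of 2 to the next one (8 becomes 16). If "round up to a power of 2" is the intent, the usual formulation subtracts 1 before smearing. The names `smear`, `clp2_strict`, and `clp2_ceil` are illustrative, and whether the ceiling variant matches the poster's intent is an assumption.

```julia
# Shared bit-smearing step (illustrative helper name).
function smear(n::Int64)
    for i in (1, 2, 4, 8, 16, 32)
        n |= n >> i
    end
    return n
end

flp2(n::Int64) = (m = smear(n); m - (m >> 1))
clp2_strict(n::Int64) = smear(n) + 1      # same behavior as clp2 in the post
clp2_ceil(n::Int64) = smear(n - 1) + 1    # hypothetical variant: subtract 1 first

for n in (1, 5, 8, 100)
    @assert flp2(n) == prevpow(2, n)       # largest power of 2 <= n
    @assert clp2_ceil(n) == nextpow(2, n)  # smallest power of 2 >= n
end

println(clp2_strict(8), " ", clp2_ceil(8))  # 16 8
```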

yzqiu0 · October 2, 2018, 01:27

I tried it too: apart from the location info, the assembly for the two is identical.