Julia公众号文章投稿示例

thautwarm · 2019 年8 月 19 日 14:02

您好, 这里是thautwarm.

之前 @Jun 让我在meetup前给公众号投稿一篇, 我本来打算讲怎么在generated function里支持闭包的,
今天写出来, 发现只介绍staging和generated function就已经有相当的篇幅, 所以先只写了原先想法的一部分.

流程上, 文章先发布在我的博客上, Staging和Julia生成函数, 然后请 @Jun review.

review之后, 在微信公众号里发布文章并链接我的博客.

关于文章具体的结构, 这次我的提纲就是

引出一种优化技术: staging
staging的例子(泛化的Fibonacci数列)
julia中使用staging可能遇到的问题(eval, world age problem)
julia又提供了什么办法解决以上问题(generated function)
julia的解决方案好在哪里

@Jun 让我分享一下经验, 就是这个post. 希望能够帮到大家.

Jun · 2019 年8 月 19 日 14:21

给力啊

另外，没有独立博客的话，大家也可以直接在这个子版块下发帖编辑，目前支持基本的markdown语法和基于mathjax的公式编辑语法。

已发布！

链接

thautwarm · 2019 年8 月 19 日 15:30

然后我们俩遇到了排版问题, 弄了半天, 靠Jun找到的这个 https://www.mdnice.com 搞定了…

Scheme · 2019 年8 月 20 日 03:06

其实可以直接用Julia的type inference来做编译时展开

julia> foo(::Union{Val{0},Val{1}}) = 1
foo (generic function with 1 method)

julia> foo(x::Val{N}) where N = foo(Val(N-1)) + foo(Val(N-2))
foo (generic function with 2 methods)

julia> @code_typed foo(Val(20))
CodeInfo(
1 ─     return 10946
) => Int64

Scheme · 2019 年8 月 20 日 03:33

很多时候可以通过Julia的type system写递归程序，让Julia编译器做编译时展开

julia> findfirstfloat(x::Tuple, i=1) = length(x) < 1 ? nothing : first(x) isa AbstractFloat ? i : findfirstfloat(Base.tail(x), i+1)
findfirstfloat (generic function with 2 methods)

julia> findfirstfloat((1, 2, 3, 4.0, 5))
4

julia> findfirstfloat((1, 2, 3, 4, 5))

julia> @code_typed findfirstfloat((1, 2, 3, 4, 5))
CodeInfo(
1 ─     return
) => Nothing

julia> @code_typed findfirstfloat((1, 2, 3, 4.0, 5))
CodeInfo(
1 ─ %1 = Base.add_int(3, 1)::Int64
└──      return %1
) => Int64

julia> @code_llvm findfirstfloat((1, 2, 3, 4.0, 5))

;  @ REPL[1]:1 within `findfirstfloat'
define i64 @julia_findfirstfloat_16075({ i64, i64, i64, double, i64 } addrspace(11)* nocapture nonnull readonly dereferenceable(40)) {
top:
  ret i64 4
}

thautwarm · 2019 年8 月 20 日 03:44

你和我说的不是一个东西。

Scheme · 2019 年8 月 20 日 03:48

只是想说大多数用staging的时候没有必要。如果能写可以infer的函数，只用type就能生成高效程序。

Scheme · 2019 年8 月 20 日 04:14

emmm，而且这么写LLVM更容易优化它啊

julia> foo(::Val{N}, x, y) where N = foo(Val(N-1), x, y) + foo(Val(N-2), x, y)
foo (generic function with 1 method)

julia> foo(::Val{0}, x, y) = x
foo (generic function with 2 methods)

julia> foo(::Val{1}, x, y) = y
foo (generic function with 3 methods)

julia> @code_llvm foo(Val(10), 1, 2)

;  @ REPL[1]:1 within `foo'
define i64 @julia_foo_16107(i64, i64) {
top:
  %factor = mul i64 %1, 55
  %factor1 = mul i64 %0, 34
; ┌ @ int.jl:53 within `+'
   %2 = add i64 %factor, %factor1
; └
  ret i64 %2
}

直接算出closed form了

thautwarm · 2019 年8 月 20 日 07:31

你这里虽然生成的llvmir和我一样, 但我不依赖编译器.
staging不是大多数时候都用不到, 只是没有相关知识发现不了. 不然oleg kiselyov也不会这么牛逼.
什么叫llvm更容易优化它? 我认为这个和llvm没有关系, 多重分派是julia类型系统带来的.
我当然认为

x0 = x + y
x1 = x0 + ...

这样的形式更好优化.

thautwarm · 2019 年8 月 20 日 07:33

多重分派是Julia特有的. 这篇文章之后要讲的就是把代码直接encode到type上然后利用分派去使用generated function: A method to remove the use of runtime eval and invokelatest, as well as support closures in generated functions! Come with an implemented proptype - Package Announcements - Julia Programming Language

你这里讲的很off topic.

Roger · 2019 年8 月 20 日 07:39

我觉得你的例子并没有很好的阐述你要表达的内容。我觉得有一些更好的例子比如cassette，一些AD技术。tracing技术或者ptx架构的transpile（CUDAnative）这些依赖一定动态信息，或者无法使用类型系统很好的构建的场景是更好的用例。而实际上生成函数起初很大的一个use case就是支持原生的到ptx架构的编译。所以生成函数我觉得更强大的地方在类似Cassette这样的custom compiler pass上，这一点没有提。

你这里所谓的记忆化感觉是用了一个类似SSA（或者说就是？）的IR，但是本身在Julia里生成函数就是可以用来修改IR的（直接返回IR而不是AST）实际上你可以直接利用IR完成这个优化（假如你不想使用llvm的优化）。

thautwarm · 2019 年8 月 20 日 07:55

我的例子是论文原例, 只是去掉了cps notation.

thautwarm · 2019 年8 月 20 日 08:03

这个例子在n是运行时值的时候, 无法由静态派发直接编译优化.

要注意到像把Val(n)作为参数, caller使用Val{n} ... where n去dispatch的时候, 是一个动态的派发.

我们考虑接收Val{n} ... where n类型的函数的执行和优化过程, 当它拿到一个运行时才能确定的Val(10)时, 要使用多重分派就需要一个运行时类型的检查, 而此时会触发了对参数为Val(10)以及递归时会涉及到的其他Val(n)参数的编译.

这个在运行时里又插有编译期的过程, 就是staging.

thautwarm · 2019 年8 月 20 日 08:19

而使用原生的staging有什么好处?

一. staging结果透明.


julia> @code_lowered fibnr_staged_n(+, 0, 1, Val(10))
CodeInfo(
1 ─     $(Expr(:meta, :noinline))
│       + = f
│       x0 = (+)(y, x)
│       x1 = (+)(x0, y)
│       x2 = (+)(x1, x0)
│       x3 = (+)(x2, x1)
│       x4 = (+)(x3, x2)
│       x5 = (+)(x4, x3)
│       x6 = (+)(x5, x4)
│       x7 = (+)(x6, x5)
│       x8 = (+)(x7, x6)
└──     return x8
)


julia> @code_lowered foo(Val(10), 0, 1)
CodeInfo(
1 ─ %1 = $(Expr(:static_parameter, 1)) - 1
│   %2 = (Main.Val)(%1)
│   %3 = (Main.foo)(%2, x, y)
│   %4 = $(Expr(:static_parameter, 1)) - 2
│   %5 = (Main.Val)(%4)
│   %6 = (Main.foo)(%5, x, y)
│   %7 = %3 + %6
└──      return %7
)

二. 生成代码可以被重用, 可以用于bootstrap. 没有依赖, 直接拷贝, 或者生成代码到源文件.

julia> fibnr_with_n_and_memo(10)
quote
    function (f::Plus, x::Int64, y::Int64) where Plus <: Function
        (+) = f
        x0 = y + x
        x1 = x0 + y
        x2 = x1 + x0
        x3 = x2 + x1
        x4 = x3 + x2
        x5 = x4 + x3
        x6 = x5 + x4
        x7 = x6 + x5
        x8 = x7 + x6
        x8
    end
end

如果你看那个系列的论文, 你还能发现, staging可以把诸如递归, mutual递归, 多态递归, 根据参数数量的多态全部消除, 这也是有助于bootstrap的.