Could someone show me how to use parallel computing?


#1

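I've recently been working on a simulation of molecular motion. Roughly: start with the coordinates of many points, roll a die for each point to decide where it moves next, and repeat this many times. The program looks roughly like the one below.

In the real program the coordinates are three-dimensional, with tens of thousands of molecules each taking tens of thousands of steps, so it is very slow. I wondered whether parallel computing could speed it up and tried writing it as shown below, but it doesn't seem quite right.

Could someone take a look and tell me how to fix it?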

using Distributed
@everywhere using SharedArrays

xy = SharedArray{Float64}(200,2)  # create 200 points at (0,0)
@everywhere function randomWalk(xy)  # move randomly in one of four directions
    @distributed for i in 1:1000
        for j in 1:200
            direction = rand(["up","down","left","right"])
            if direction == "up"
                xy[j,1] = xy[j,1] + 1
            end
            if direction == "down"
                xy[j,1] = xy[j,1] - 1
            end
            if direction == "left"
                xy[j,2] = xy[j,2] - 1
            end
            if direction == "right"
                xy[j,2] = xy[j,2] + 1
            end
        end
    end
end

#2

First make sure you understand the SharedArrays section of the documentation:
https://docs.juliacn.com/latest/manual/parallel-computing/#man-shared-arrays-1

The basic idea when using SharedArrays is:

  1. Wrap your computation in @everywhere so it is defined on every process
  2. Call remotecall to run it on each process
using Distributed
@everywhere using SharedArrays

xy = SharedArray{Float64}(10000, 2)

@everywhere movements = [
    [0., 1.], # up
    [0., -1.], # down
    [1., 0.], # left
    [-1., 0.] # right
]

@everywhere function customized_points(data::SharedArray)
    idx = indexpids(data)
    if idx == 0 # This worker is not assigned a piece
        return 1:0 # empty range, same shape as the normal return value
    end
    nchunks = length(procs(data))
    splits = [round(Int, s) for s in range(0, stop=size(data, 1), length=nchunks+1)]  # preallocating this isn't really necessary; it follows the docs to make it easier to understand
    splits[idx]+1:splits[idx+1]
end

@everywhere function random_walk!(data)
    for _ in 1:10000
        for i in customized_points(data)
            data[i, :] .+= rand(movements)
        end
    end
end

function random_walk_sync!(data)
    @sync begin
        for p in procs(data)
            @async remotecall_wait(random_walk!, p, data)
        end
    end
end
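
To make the row splitting easier to see, here is a minimal sketch of just the arithmetic used in customized_points, with no worker processes involved. The helper name chunk_rows and its arguments are made up for illustration; it assumes the same rounding-based split as the code above.

```julia
# Hypothetical helper: the same splitting arithmetic as customized_points,
# but with the row count, chunk count, and chunk index passed in directly.
function chunk_rows(nrows::Integer, nchunks::Integer, idx::Integer)
    # nchunks+1 evenly spaced boundaries from 0 to nrows, rounded to integers
    splits = [round(Int, s) for s in range(0, stop=nrows, length=nchunks + 1)]
    # chunk idx covers the rows just after boundary idx up to boundary idx+1
    return splits[idx]+1:splits[idx+1]
end

# Split 10 rows among 3 chunks
chunks = [chunk_rows(10, 3, k) for k in 1:3]
println(chunks)  # → UnitRange{Int64}[1:3, 4:7, 8:10]
```

Every row lands in exactly one chunk, so no two processes write to the same row.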

#3

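I don't know parallel computing either. At first I modified your code directly, but it kept running out of memory 🤔, so I went back to the book and the reply above, and I think I understand it now. I reworked your program and made it three-dimensional.
First, my setup: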

Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

Here is the code:

using Distributed
addprocs(4) 
@everywhere using SharedArrays
using BenchmarkTools

@everywhere function randomWalk3d!(xy::SharedArray{Int64},time::Int64=1000)
    axis=Dict("x"=>1,"y"=>2,"z"=>3) # used to pick a random coordinate axis
    for i in 1:time
        for j in 1:size(xy)[1] # loop over rows
            xy[j,rand(axis).second]+=rand([-1,1]) # randomly add or subtract one
        end
    end
end

function randomWalk3d_sync!(xy::SharedArray{Int64},time::Int64=1000)
    @sync begin
        for p in procs(xy)
            @async remotecall_wait(randomWalk3d!,p,xy,time) # pass the step count through
        end
    end
end

xy=SharedArray{Int64}(200,3)
@benchmark randomWalk3d_sync!(xy,10000)

Results:

BenchmarkTools.Trial: 
  memory estimate:  18.69 KiB
  allocs estimate:  369
  --------------
  minimum time:     272.681 ms (0.00% GC)
  median time:      282.398 ms (0.00% GC)
  mean time:        286.255 ms (0.00% GC)
  maximum time:     313.312 ms (0.00% GC)
  --------------
  samples:          18
  evals/sample:     1

Since all of this data is integer-valued, I used an Int type.


#4

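With 200 molecules each taking a thousand steps, and every molecule independent of the others, the parallel loop should run over the molecules, not over the steps.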
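
A minimal sketch of what that could look like, assuming a SharedArray with one row per molecule and two local workers. The function name walk_molecules! and the 200×3 size are only for illustration, not anyone's actual program.

```julia
using Distributed
addprocs(2)                     # assumption: two local worker processes
@everywhere using SharedArrays

# Hypothetical helper: each row of xyz is one molecule. The @distributed loop
# splits the rows (molecules) among the workers, and each worker then runs
# all nsteps steps for its own rows, so no two workers touch the same row.
function walk_molecules!(xyz::SharedArray{Int,2}, nsteps::Integer)
    @sync @distributed for j in 1:size(xyz, 1)
        for _ in 1:nsteps
            # pick a random axis and move one unit forwards or backwards
            xyz[j, rand(1:size(xyz, 2))] += rand((-1, 1))
        end
    end
    return xyz
end

xyz = SharedArray{Int}(200, 3)
fill!(xyz, 0)                   # start every molecule at the origin
walk_molecules!(xyz, 1000)
```

The step loop stays serial inside each molecule, because step k+1 depends on the position after step k. Only the outer loop over molecules is distributed.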