How should multiple modules in the same Project be organized for distributed computing?

Suppose a Project contains two modules, ModuleA and ModuleB, and ModuleB depends on ModuleA. ModuleA itself contains no distributed code:

module ModuleA
    export func_a;
    function func_a()
        println("hello");
    end
end # end of module

ModuleB uses distributed computing:

module ModuleB

using Distributed;

@everywhere include("./ModuleA.jl");
@everywhere using .ModuleA;

function func_b()
  @sync @distributed for i = 1:100
    func_a();
  end
end

end

Finally, a main script runs everything:

include("ModuleB.jl");
using .ModuleB;

func_b();

Then I run julia -p 2 main.jl, and it reports:

ERROR: LoadError: LoadError: UndefVarError: ModuleA not defined
Stacktrace:
 [1] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:200
 [2] include(::String) at ./client.jl:457
 [3] top-level scope at /Users/ionizing/Documents/tests/julia/module_distributed/main.jl:1
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include(::Module, ::String) at ./Base.jl:368
 [6] exec_options(::Base.JLOptions) at ./client.jl:296
 [7] _start() at ./client.jl:506
in expression starting at /Users/ionizing/Documents/tests/julia/module_distributed/ModuleB.jl:6
in expression starting at /Users/ionizing/Documents/tests/julia/module_distributed/main.jl:1

The problem seems to come from the using .ModuleA line in ModuleB. How can I fix this?
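
A note on the likely cause, offered as a sketch rather than a confirmed diagnosis: the Distributed documentation describes @everywhere as executing its expression under Main on all processes. The include in ModuleB would therefore define ModuleA in Main rather than inside ModuleB, and a relative using .ModuleA resolved from within ModuleB would find nothing. The snippet below uses a hypothetical module Demo and a hypothetical variable marker (neither is from the original post) to show where an @everywhere definition lands.

```julia
using Distributed

module Demo              # hypothetical module, for illustration only
using Distributed
# @everywhere evaluates its expression under Main on every process,
# not in the module that contains the macro call.
@everywhere marker = 1   # `marker` is a hypothetical name
end

println(isdefined(Main, :marker))   # was the binding created in Main?
println(isdefined(Demo, :marker))   # was it created in Demo?
```

If the documented behaviour holds, the first check reports that Main has the binding and the second reports that Demo does not.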

You could try this.
ModuleB:

module ModuleB

export func_b

using Distributed
using ModuleA

function func_b()
  @sync @distributed for i = 1:100
    func_a();
  end
end

end

main:

push!(LOAD_PATH, YOUR_PATH_TO_A_AND_B)
using ModuleB;
@everywhere using ModuleB
func_b();

Sorry, I got a bit excited yesterday and did not verify the julia -p x case. Even with @everywhere push!(...); @everywhere using ...; it still does not work, and shows:

ERROR: LoadError: On worker 2:
ArgumentError: Package ModuleB [top-level] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

_require at ./loading.jl:999
require at ./loading.jl:928
#1 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/Distributed.jl:78
#103 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356

...and 1 more exception(s).

Stacktrace:
 [1] sync_end(::Channel{Any}) at ./task.jl:314
 [2] macro expansion at ./task.jl:333 [inlined]
 [3] _require_callback(::Base.PkgId) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Distributed/src/Distributed.jl:75
 [4] #invokelatest#1 at ./essentials.jl:710 [inlined]
 [5] invokelatest at ./essentials.jl:709 [inlined]
 [6] require(::Base.PkgId) at ./loading.jl:931
 [7] require(::Module, ::Symbol) at ./loading.jl:923
 [8] include(::Function, ::Module, ::String) at ./Base.jl:380
 [9] include(::Module, ::String) at ./Base.jl:368
 [10] exec_options(::Base.JLOptions) at ./client.jl:296
 [11] _start() at ./client.jl:506
in expression starting at /Users/ionizing/Documents/tests/julia/main.jl:2

The main and ModuleB files are currently organized like this:

# ModuleB
module ModuleB

export func_b;

using Distributed;
using ModuleA;

function func_b()
  @sync @distributed for i = 1:100
    func_a();
  end
end

end

# main.jl
push!(LOAD_PATH, @__DIR__);
using ModuleB;

using Distributed;
@everywhere begin
  push!(LOAD_PATH, @__DIR__);
  using ModuleB;
end

func_b();

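Of course, if I drop the modules and simply include() every file, the parallel run works, but I do not want to expose the implementation details inside the modules (for example, helper functions).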
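
To make concrete what the module boundary buys here, a minimal sketch: after using, only exported names enter the caller's namespace, while helpers stay reachable only through an explicit module prefix. The helper name _greeting is hypothetical and not part of the original ModuleA.

```julia
module ModuleA
export func_a

# Hypothetical internal helper; deliberately not exported.
_greeting() = "hello"

func_a() = println(_greeting())
end

using .ModuleA

func_a()                               # exported, so callable directly
println(isdefined(Main, :_greeting))   # checks whether the helper leaked into Main
println(ModuleA._greeting())           # still reachable when qualified
```

With plain include() of a file that has no module wrapper, _greeting would instead be defined directly in Main alongside func_a.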

There is no Project.toml in the same directory, right? I tried this and it works:

using Distributed
@everywhere push!(LOAD_PATH, @__DIR__)
@everywhere using ModuleB
func_b();
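
Why the ordering seems to matter, as a reading of the earlier stack trace rather than a confirmed diagnosis: the _require_callback frame suggests that once Distributed is loaded, a plain using ModuleB on the master process also asks every worker to load ModuleB. In the failing main.jl that happened before the workers had the directory in their LOAD_PATH. The sketch below reproduces the working layout in a scratch directory so that it runs as a single file. The use of mktempdir and of addprocs(2) in place of julia -p 2 are assumptions made only for self-containment.

```julia
using Distributed
addprocs(2)   # assumption: stands in for launching with `julia -p 2`

# Assumption: both modules live in a scratch directory without a Project.toml.
dir = mktempdir()
write(joinpath(dir, "ModuleA.jl"), """
module ModuleA
export func_a
func_a() = println("hello")
end
""")
write(joinpath(dir, "ModuleB.jl"), """
module ModuleB
export func_b
using Distributed
using ModuleA
function func_b()
    @sync @distributed for i = 1:100
        func_a()
    end
end
end
""")

# Extend LOAD_PATH on every process before any process loads ModuleB.
@everywhere push!(LOAD_PATH, $dir)
@everywhere using ModuleB

func_b()
```

The key point is that no process runs using ModuleB until the push! has completed everywhere.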

Thank you very much!

I have verified that it works!