A GPU computing problem with CUDA.jl


# code
function kernel_Density(m,h,Pa,Nei,Ndist)
    i = threadIdx().x + (blockIdx().x-1)*blockDim().x
    Pa[i,7] = 4*m/(pi*h^2)
    for n = 1 : length(Nei[i,:])
        if Nei[i,n] != 0
            neighborRho = (h^2-(Ndist[i,n])^2)^3
        else
            neighborRho = 0
        end
        Pa[i,7] = Pa[i,7] + (4*m/(pi*h^8))*neighborRho
    end
    return nothing
end
@cuda blocks=5 threads=500 kernel_Density(m,h,d_Pa,d_Nei,d_Ndist)
When I run this GPU computation with the CUDA.jl package, I get the following error. What could be causing it?

InvalidIRError: compiling kernel kernel_Density(Int64, Float64, CuDeviceArray{Float64,2,1}, CuDeviceArray{Int16,2,1}, CuDeviceArray{Float64,2,1}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to jl_alloc_string)

This is a fairly common error.

Could you also post the full error message?

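Hello, and thank you very much! The full error message is below.
Is it because the thread index cannot be used as an array subscript? How exactly should I fix it?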

Error evaluating main.jl

LoadError: InvalidIRError: compiling kernel kernel_Density(Int64, Float64, CuDeviceArray{Float64,2,1}, CuDeviceArray{Int16,2,1}, CuDeviceArray{Float64,2,1}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to jl_alloc_string)
Stacktrace:
[1] _string_n at strings/string.jl:60
[2] StringVector at iobuffer.jl:31
[3] #IOBuffer#331 at iobuffer.jl:114
[4] print_to_string at strings/io.jl:133
[5] string at strings/io.jl:174
[6] throw_checksize_error at multidimensional.jl:779
[7] _unsafe_getindex at multidimensional.jl:756
[8] _getindex at multidimensional.jl:743
[9] getindex at abstractarray.jl:1060
[10] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to jl_string_to_array)
Stacktrace:
[1] unsafe_wrap at strings/string.jl:71
[2] StringVector at iobuffer.jl:31
[3] #IOBuffer#331 at iobuffer.jl:114
[4] print_to_string at strings/io.jl:133
[5] string at strings/io.jl:174
[6] throw_checksize_error at multidimensional.jl:779
[7] _unsafe_getindex at multidimensional.jl:756
[8] _getindex at multidimensional.jl:743
[9] getindex at abstractarray.jl:1060
[10] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to memset)
Stacktrace:
[1] fill! at array.jl:428
[2] #IOBuffer#331 at iobuffer.jl:121
[3] print_to_string at strings/io.jl:133
[4] string at strings/io.jl:174
[5] throw_checksize_error at multidimensional.jl:779
[6] _unsafe_getindex at multidimensional.jl:756
[7] _getindex at multidimensional.jl:743
[8] getindex at abstractarray.jl:1060
[9] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported dynamic function invocation (call to print)
Stacktrace:
[1] print_to_string at strings/io.jl:135
[2] string at strings/io.jl:174
[3] throw_checksize_error at multidimensional.jl:779
[4] _unsafe_getindex at multidimensional.jl:756
[5] _getindex at multidimensional.jl:743
[6] getindex at abstractarray.jl:1060
[7] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to jl_array_grow_end)
Stacktrace:
[1] _growend! at array.jl:892
[2] resize! at array.jl:1085
[3] print_to_string at strings/io.jl:137
[4] string at strings/io.jl:174
[5] throw_checksize_error at multidimensional.jl:779
[6] _unsafe_getindex at multidimensional.jl:756
[7] _getindex at multidimensional.jl:743
[8] getindex at abstractarray.jl:1060
[9] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to jl_array_del_end)
Stacktrace:
[1] _deleteend! at array.jl:901
[2] resize! at array.jl:1090
[3] print_to_string at strings/io.jl:137
[4] string at strings/io.jl:174
[5] throw_checksize_error at multidimensional.jl:779
[6] _unsafe_getindex at multidimensional.jl:756
[7] _getindex at multidimensional.jl:743
[8] getindex at abstractarray.jl:1060
[9] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to jl_array_to_string)
Stacktrace:
[1] String at strings/string.jl:39
[2] print_to_string at strings/io.jl:137
[3] string at strings/io.jl:174
[4] throw_checksize_error at multidimensional.jl:779
[5] _unsafe_getindex at multidimensional.jl:756
[6] _getindex at multidimensional.jl:743
[7] getindex at abstractarray.jl:1060
[8] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
Reason: unsupported call through a literal pointer (call to jl_alloc_array_1d)
Stacktrace:
[1] Array at boot.jl:406
[2] Array at boot.jl:415
[3] similar at abstractarray.jl:640
[4] similar at abstractarray.jl:630
[5] _unsafe_getindex at multidimensional.jl:755
[6] _getindex at multidimensional.jl:743
[7] getindex at abstractarray.jl:1060
[8] kernel_Density at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:111
in expression starting at C:\Users\Administrator\Desktop\Myjulia\dambreak\GPU2500-(2021-03-03)\main.jl:12
check_ir(::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}, ::LLVM.Module) at validation.jl:123
macro expansion at driver.jl:239 [inlined]
macro expansion at TimerOutput.jl:206 [inlined]
codegen(::Symbol, ::GPUCompiler.CompilerJob; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool, only_entry::Bool) at driver.jl:237
codegen at driver.jl:63 [inlined]
compile(::Symbol, ::GPUCompiler.CompilerJob; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool, only_entry::Bool) at driver.jl:39
compile at driver.jl:35 [inlined]
cufunction_compile(::GPUCompiler.FunctionSpec; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at execution.jl:302
cufunction_compile(::GPUCompiler.FunctionSpec) at execution.jl:297
check_cache(::Dict{UInt64,Any}, ::Any, ::Any, ::GPUCompiler.FunctionSpec{var"#kernel_Density#12",Tuple{Int64,Float64,CuDeviceArray{Float64,2,1},CuDeviceArray{Int16,2,1},CuDeviceArray{Float64,2,1}}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at cache.jl:40
(::GPUCompiler.var"#check_cache##kw")(::NamedTuple{(),Tuple{}}, ::typeof(GPUCompiler.check_cache), ::Dict{UInt64,Any}, ::Function, ::Function, ::GPUCompiler.FunctionSpec{var"#kernel_Density#12",Tuple{Int64,Float64,CuDeviceArray{Float64,2,1},CuDeviceArray{Int16,2,1},CuDeviceArray{Float64,2,1}}}, ::UInt64) at cache.jl:15
kernel_Density at main.jl:109 [inlined]
cached_compilation at cache.jl:65 [inlined]
cufunction(::var"#kernel_Density#12", ::Type{Tuple{Int64,Float64,CuDeviceArray{Float64,2,1},CuDeviceArray{Int16,2,1},CuDeviceArray{Float64,2,1}}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at execution.jl:289
cufunction(::var"#kernel_Density#12", ::Type{Tuple{Int64,Float64,CuDeviceArray{Float64,2,1},CuDeviceArray{Int16,2,1},CuDeviceArray{Float64,2,1}}}) at execution.jl:286
macro expansion at execution.jl:100 [inlined]
macro expansion at main.jl:188 [inlined]
top-level scope at timing.jl:174
include_string(::Function, ::Module, ::String, ::String) at loading.jl:1088

for n = 1 : length(Nei[i,:])

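The problem is probably on this line. Try reading the length of the second dimension directly with size(Nei, 2) instead. Indexing with Nei[i,:] creates a new array for the row, and the stack trace agrees: getindex ends up in similar and jl_alloc_array_1d, and that kind of allocation is not supported inside a kernel function.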
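A minimal sketch of that suggestion, assuming the arrays have the shapes implied by the kernel signature in the error message (Pa with at least 7 columns, Nei and Ndist with one row per particle). The helper name particle_density, the bounds guard, and the toy data at the end are illustrative assumptions, not part of the original code.

```julia
using CUDA

# Density of particle i: the self term plus one term per listed neighbor.
# It only uses scalar indexing, so it runs on a plain Array as well as on
# the device arrays that a kernel receives.
# (particle_density is a hypothetical helper name.)
function particle_density(m, h, Nei, Ndist, i)
    rho = 4m / (pi * h^2)
    coeff = 4m / (pi * h^8)
    for n in 1:size(Nei, 2)      # column count; no slice, so no allocation
        if Nei[i, n] != 0        # 0 marks an empty neighbor slot
            rho += coeff * (h^2 - Ndist[i, n]^2)^3
        end
    end
    return rho
end

function kernel_Density(m, h, Pa, Nei, Ndist)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= size(Pa, 1)          # assumption: skip threads past the last row
        Pa[i, 7] = particle_density(m, h, Nei, Ndist, i)
    end
    return nothing
end

# Hypothetical toy data: 4 particles, at most 2 neighbors each.
m, h = 1, 0.5
Nei = Int16[2 0; 1 3; 2 0; 0 0]
Ndist = [0.1 0.0; 0.1 0.2; 0.2 0.0; 0.0 0.0]
Pa = zeros(4, 7)

if CUDA.functional()             # only launch when a usable GPU is present
    d_Pa, d_Nei, d_Ndist = CuArray(Pa), CuArray(Nei), CuArray(Ndist)
    @cuda blocks=1 threads=4 kernel_Density(m, h, d_Pa, d_Nei, d_Ndist)
    Pa = Array(d_Pa)             # copying back waits for the kernel to finish
end
```

Accumulating into a local Float64 and writing Pa[i, 7] once also avoids mixing the integer 0 with a Float64 in neighborRho, and it means the same helper can be checked on the CPU against small arrays before launching the kernel.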

Got it, thank you!