字典的键不能取数组这样的可变类型(mutable),为何这句可以运行?
Dict([1] => “Li Ming”, [2] => 18)
怎么回事,请帮忙解答
好像,大概,没有说键不能取可变类型吧
https://docs.julialang.org/en/v1/
https://docs.julialang.org/en/v1/base/collections/#Base.Dict
顺带一提
julia> ismutabletype(String)
true
Dict
相关源代码里没有出现类型可变性检测,所以我猜是那个文档错了(大概)
这里的 [1] 是不可变的。
何以见得–[1] 是不可变的
官方文档Collections and Data Structures · The Julia Language
字典是通过哈希表实现的,key应该是不可变的,否则变后的哈希值就改变了,就找不到对应的value了
我觉得这个说的是,键本身作为已有的字典的结构,是不允许改变的,但这不妨碍它是可变类型的
例子中讨论的就是向量作为键竟然可以允许,向量是可变型的
Julia does not ban mutable objects as dictionary keys. Using them is at your own risk. Julia acknowledges the potential complexities and risks involved but chooses to provide programmers with freedom.
When you use an array as a key in a Julia dictionary, Julia calculates the hash value of the array at the time it is inserted and uses this hash value to index the array within the dictionary. This means that even though arrays are mutable, their use as dictionary keys is based on their state at the time of insertion. However, if you mutate the array after inserting it into the dictionary, you could run into issues with accessing the associated value because the dictionary is still indexing the array based on its original hash value, not the new one it would have after being mutated.
This capability comes with a significant caveat: it is your responsibility to ensure that you do not mutate the keys after they have been used in a dictionary. Mutating an array used as a key can lead to unpredictable behavior, including the inability to retrieve the value associated with that key, because the dictionary’s mechanism for finding keys relies on their hash values remaining constant.
Here’s a brief explanation of what’s happening under the hood:
- When you insert an array as a key, Julia computes its hash value based on its contents at that moment.
- Julia uses this hash value to store the key-value pair in the dictionary’s underlying hash table.
- If the array is mutated after insertion, its hash value would change if recalculated. However, the dictionary does not update the hash value for existing keys. This means the dictionary continues to use the original hash value for any operations involving this key.
- Consequently, mutating an array used as a key can break the association in the dictionary, leading to potential errors when attempting to access the data with the mutated array.
Consider the following example:
julia> a = [1]; b = [2];
julia> d = Dict(a => 1, b => 2)
Dict{Vector{Int64}, Int64} with 2 entries:
[1] => 1
[2] => 2
julia> hash(a)
0xc7253bdde85cb9f5
julia> c = deepcopy(a)
1-element Vector{Int64}:
1
julia> hash(c)
0xc7253bdde85cb9f5
julia> push!(a, 3)
2-element Vector{Int64}:
1
3
julia> hash(a)
0xf80ee7e061e708d4
julia> d
Dict{Vector{Int64}, Int64} with 2 entries:
[1, 3] => 1
[2] => 2
julia> d[a]
ERROR: KeyError: key [1, 3] not found
Stacktrace:
[1] getindex(h::Dict{Vector{Int64}, Int64}, key::Vector{Int64})
@ Base ./dict.jl:498
[2] top-level scope
@ REPL[9]:1
julia> d[[1]]
ERROR: KeyError: key [1] not found
Stacktrace:
[1] getindex(h::Dict{Vector{Int64}, Int64}, key::Vector{Int64})
@ Base ./dict.jl:498
[2] top-level scope
@ REPL[10]:1
julia> d[[1, 3]]
ERROR: KeyError: key [1, 3] not found
Stacktrace:
[1] getindex(h::Dict{Vector{Int64}, Int64}, key::Vector{Int64})
@ Base ./dict.jl:498
[2] top-level scope
@ REPL[11]:1
julia> d[c]
ERROR: KeyError: key [1] not found
Stacktrace:
[1] getindex(h::Dict{Vector{Int64}, Int64}, key::Vector{Int64})
@ Base ./dict.jl:498
[2] top-level scope
@ REPL[12]:1
字典是一个典型的 hash 表数据结构,所有的 d[key]
都会以 hash(key)
后的值去索引对应的数据。这意味着:如果 hash(key)
的结果不是稳定的,那么你很有可能会索引不到你的结果。可变数据结构是一种典型的导致 hash(key)
结果不稳定的场景。
举一个生造的例子来说明 hash(key)
不稳定导致的问题:
struct UnstableKey
v::Int
end
# 引入一个随机值来保证它的 hash 是不稳定的
Base.hash(u::UnstableKey, h::UInt) = hash((u.v, rand()), h)
x = UnstableKey(1)
d = Dict(x=>1)
此时 ·d[x]
会失败,因为 x
的 hash 结果并不稳定。
反过来,可变类型结构体也并不一定会导致 hash(key)
不稳定,例如:
mutable struct StableKey
v::Int
_z::Int
end
# 计算 hash 时忽略可能不稳定的内容
Base.hash(u::StableKey, h::UInt) = hash(u.v, h)
x = StableKey(1, 2)
d = Dict(x=>1)
julia> d[x]
1
julia> x._z = rand(Int)
2599232730616789374
julia> d[x] # 即使值变了,也依然可以取到
1
我想上面两个例子其实就解释了为什么 Julia 没有禁止可变类型作为字典的键,因为:
- 可变类型并不一定会让字典的索引变得不稳定,不可变类型并不一定会让字典的索引变得稳定
- 即使禁止可变类型,也没办法让事情变得靠谱,因为这背后只取决于
hash
是否稳定
更进一步的,这其实也说明了,为什么绝大多数时候我们都使用简单的数据结构来作为 key 的原因:因为我们知道它们具有稳定的 hash 计算。