Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashes with constructor blocks for default values don't work out-of-the box with serializiation #15313

Open
wolfgang371 opened this issue Dec 26, 2024 · 4 comments
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:serialization

Comments

@wolfgang371
Copy link

require "yaml"

class X
    include YAML::Serializable
    property hash : Hash(Int32,String)
    @hash = Hash(Int32,String).new {|hash,key| hash[key] = "42"}
    def initialize # explicitly needed (0 args)
    end
    # private def after_initialize # called by Crystal after #from_yaml; needed for getting default block handling right
    #     hash = Hash(Int32,String).new {|hash,key| hash[key] = "42"}
    #     @hash.each do |k,v|
    #         hash[k] = v
    #     end
    #     @hash = hash
    # end
end

x = X.new
p x.hash[12] # "42"
y = X.from_yaml(x.to_yaml)
p y.hash[24] # Unhandled exception: Missing hash key: 24 (KeyError)

In this case it's rather easy to fix it (see comments); it case of hashes of hashes (etc.), all having such default blocks, it gets a bit awkward.
Best solution would be serialization honoring the default instance types.

I'm running Crystal 1.14.0 [dacd97b] (2024-10-09), LLVM: 18.1.6, Default target: x86_64-unknown-linux-gnu

@wolfgang371 wolfgang371 added the kind:bug A bug in the code. Does not apply to documentation, specs, etc. label Dec 26, 2024
@Blacksmoke16
Copy link
Member

I'm not really sure there would be a good way for YAML::Serialization to handle this. Like it only knows that hash is of type Hash(Int32, String). What you're essentially wanting is a way to merge new YAML data into an already existing instance of X, of which YAML::Serializable does not currently support. It might be possible, but have you considered other approaches that wouldn't require this?

@wolfgang371
Copy link
Author

I was already happy finding the after_initialize hook, I didn't find anything in the attribution section that could be used instead.
What might also be helpful is making the Hash block a property, so that it can be read and set afterwards (from the outside) - but again, this leaves the issue of "nested" Hashes, so sort of a "partial clone" of a Hash would be helpful.

I really like the feature of this Hash block, especially for more complex nested structures, because you can just read without having to check every second line in the code. But since the block isn't part of the Hash type, it easy to lose it.

@straight-shoota
Copy link
Member

I don't think this is possible to implement in a reasonable, generic way. It would require serializing a function, but most serialization formats do not even support that. So I don't think there's anything we could do.

Of course it's possible to define custom serialization for some specific cases in use code. But I wouldn't see that in stdlib.

Also serializating hashes with default blocks seems like a bit of an odd feature to me. What do you even need that for?

@wolfgang371
Copy link
Author

I agree that serialization of the default block is not worth the effort (for a compiled language).

What I need Hash default blocks for? Well, as I tried to say in my previous comment it's really handy to reduce code bloat:

kkk2v = Hash(Int32,Hash(Int32,Hash(Int32,String))).new
kkk2v[1] ||= Hash(Int32,Hash(Int32,String)).new # awkward
kkk2v[1][2] ||= Hash(Int32,String).new # awkward
kkk2v[1][2][3] = "foo"
p kkk2v

kkk2v = Hash(Int32,Hash(Int32,Hash(Int32,String))).new do |h,k|
    h[k] = Hash(Int32,Hash(Int32,String)).new do |h,k|
        h[k] = Hash(Int32,String).new
    end
end
kkk2v[1][2][3] = "foo"
p kkk2v

Or do you mean what do I need default block serialization for? Well, I need at least some mechanism to to get the code run right after deserialization, i.e. need some mechanism to install the default block - and in case of nested Hashes as above it's not just copying / monkey patching the outer default block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:serialization
Projects
None yet
Development

No branches or pull requests

3 participants