Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eltype inference for Iterators.map could be more precise #56891

Open
ehehela opened this issue Dec 23, 2024 · 6 comments
Open

eltype inference for Iterators.map could be more precise #56891

ehehela opened this issue Dec 23, 2024 · 6 comments
Labels
collections Data structures holding multiple items, e.g. sets invalid Indicates that an issue or pull request is no longer relevant iteration Involves iteration or the iteration protocol

Comments

@ehehela
Copy link

ehehela commented Dec 23, 2024

I am puzzled on why eltype does not infer more precisely for Iterators.map(f, iterators...), especially when both the eltype of the input iterators and the return type of f seem to be inferable.
Would it be possible to improve the eltype inference for Iterators.

julia> fun(x) = x^2
fun (generic function with 1 method)

julia> a = Iterators.map(fun, 1:3)
Base.Generator{UnitRange{Int64}, typeof(fun)}(fun, 1:3)

julia> eltype(a)
Any    # Expected: Int

This enhancement could help improve type stability and make Iterators.map more consistent with the inferred types of other Julia constructs.

@jakobnissen
Copy link
Contributor

jakobnissen commented Dec 23, 2024

The issue may be that the purpose of eltype is not clearly explained. The purpose of eltype is to provide a way to get the element type at compile time, in a manner which does not rely on Julia's own type inference.
This is needed sometimes, because Julia's type inference is considered an optimization, and should not affect the result of any computation (only its efficiency). This, in turn, allows Julia to tweak inference for different use cases without it causing breakage.
Therefore, most code isn't actually allowed to "ask inference" for the known result of some function.

On the other hand, this also implies that users should, where possible, not rely on eltype, unless they know that eltype will provide a correct, concrete type.
Instead, you should use the runtime type. For example, a primitive version of collect could be implemented like this:

function simple_collect(it)
    itval = iterate(it)
    isnothing(itval) && return Union{}[]
    (x, state) = itval
    v = [x]
    T = typeof(x)
    itval = iterate(it, state)
    while itval !== nothing
        (x, state) = itval
        if !isa(x, T)
            T = typejoin(T, typeof(x))
            v = T[i for i in v]
        end
        push!(v, x)
        itval = iterate(it, state)
    end
    v
end

Note that, if inference works well, all the "type stuff" done in this function will be compiled away.

@ehehela
Copy link
Author

ehehela commented Dec 23, 2024

Thanks for your detailed explanation.
According to your answer, it seems that overloading eltype for user-defined iterators is not essential for performance. Is that correct?
Additionally, I noticed there is an issue (#48249) discussing the need for accurate eltype on Iterators.flatten.

@jakobnissen
Copy link
Contributor

Unfortunately, it's not entirely correct. There are some situations where using runtime types as I demonstrated above, is impractical. I can't really think of any, but I remember I've encountered them.
Julia is in an awkward position here precisely because inference is not guaranteed in the same way as in static languages, but it's still needed for optimal performance in some cases. E.g. the actual implementation of collect does use inference.

Let's leave this issue open. Crossref: #56683

@nsajko
Copy link
Contributor

nsajko commented Dec 23, 2024

why eltype does not infer more precisely

Your premise is false. Type inference is not supposed to be involved in the implementation of eltype at all. Type inference is a compiler optimization, as such it's not supposed to affect semantics. Letting type inference observably affect the output of anything user-facing, such as eltype, would most definitely be a bug.

@nsajko nsajko added collections Data structures holding multiple items, e.g. sets iteration Involves iteration or the iteration protocol invalid Indicates that an issue or pull request is no longer relevant labels Dec 23, 2024
@nsajko
Copy link
Contributor

nsajko commented Dec 24, 2024

That said, it's possible to make eltype more precise without reaching for compiler internals: #56899

Doesn't help with your example above, but it could be useful in limited cases.

@ehehela
Copy link
Author

ehehela commented Dec 24, 2024

Thank you all for your detailed explanations and time. I now understand the purpose of eltype in providing the element type at compile time. eltype is no in charge of type inference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets invalid Indicates that an issue or pull request is no longer relevant iteration Involves iteration or the iteration protocol
Projects
None yet
Development

No branches or pull requests

3 participants