-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check for type recursion without boxing #3407
Conversation
8b38c59
to
eb42fd2
Compare
83ae0ff
to
2247db3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite convinced here. This mostly works. But it misses some cases (so it's not fully sound) and catches some cases it shouldn't (so it's not fully complete). The old check -- reporting when e.g. value & (value & value)
doesn't subjkind with value & value
is still in place, so the lack of soundness doesn't matter all that much. And maybe the lack of completeness doesn't bite in practice? But given these drawbacks, and the fact that the old check worked (but with a somewhat confusing error message), I think I lean mildly against merging.
Actually, maybe I'm wrong about "mildly against merging". I agree that this improves the error message in the majority case, and that's pretty cool. And the implementation is simple enough. But at the least I'd leave an internal ticket about revisiting this in the future. |
c5fc9cf
to
19f511f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goldfirere Thanks for the helpful review.
Here are my goals for this check:
- Improve the programming experience for unboxed records
- Prohibit all infinite-size types (types that would have a cyclic layout if we had layout variables without an occurs check)
It's because we are sound but not complete w.r.t. "no infinite-size types" that the error message complains about a stronger property, "no recursion without boxing," (it would be confusing to see a "no infinite-size types" error on type t = #{ t : t }
).
So I'd agree that w.r.t. the stronger property of "no recursion without boxing," we are both unsound (your example with type recursion hidden behind an abstract signature, and your recursive module example) and incomplete (int good good
). But I think that this is okay: the stronger property is only mentioned as it's a better error message.
I agree we'll eventually want to be more complete (e.g. allow int good good
) - I'll create an internal ticket corresponding to the CR above check_unboxed_recursion
.
It is a little unclear from the description what the definition of recursion is. Does the new check still allow more useful things like:
What about if |
@lpw25 The new check would allow that, including if For each group of mutually recursive type declarations, we define the following "type contains" relation on type expressions:
( If a path starting from the type expression on the LHS of a declaration contains two types with the same head type constructor, and that repeated type is an unboxed record or variant, then we give an error. Examples (@goldfirere - the third is newly allowed with a bug fix): type t = #{ t : t list }
(* Allowed, [t list] is not in this recursive group, and
has layout [value] so it doesn't contain anything *)
type t = #{ x : #(int * u) }
and u = T of t [@@unboxed]
(* Banned, t -> #(int * u) -> u -> t *)
type 'a id = #{ a : 'a }
type t = int id id
(* Allowed, [int id] is not in this recursive group, and
has layout [value] so it doesn't contain anything *)
type 'a id = #{ a : 'a }
and t = int id id
(* Sadly, banned. t -> int id id -> int id; id appears twice *) |
d3459e6
to
3b9c7e7
Compare
Presumably this is not the case for something like: type ('a : any) t : any is an |
Currently, an We might be concerned about programs like the following once we both drop this restriction and add module Id : sig
type ('a : any) t : any
end = struct
type ('a : any) t = 'a
end
type t = #{ t : t Id.t } But I think it will be okay to accept this, since the type also won't be visibly cyclic to Another example worth thinking about is the following, once we have type 'a id : any = 'a
type t = #{ t : t id } Speculatively: I think the check will reject this (when updated appropriately for |
What about if your |
If the functor parameter abstracts module type S = sig
type ('a : any) t : any
end
module F(Id : S) = struct
type t = #{ t : t Id.t }
end
module Id : S = struct
type ('a : any) t = 'a
end
module M = F(Id) If the functor parameter exposes module type S = sig
type ('a : any) t = 'a (* changed line *)
end
module F(Id : S) = struct
type t = #{ t : t Id.t }
end
(* Error: the definition of t contains recursion without boxing
t contains t Id.t
t Id.t contains t
*) Does this sound right? |
My concern was with your |
Actually, your |
I see the problem now, thanks. Current best guess for the fix: we consider a type not in the recursive group whose layout contains |
I made this change to future-proof against unboxed records with unrepresentable layouts, and refactored so we can more easily support |
I thought about it briefly and didn't come up with any examples where this heuristic misses anything or is overly conservative in a way that made me very sad (though it is certainly conservative). I propose we (1) have Ryan check that this isn't so conservative that it rejects anything in our code base and then (2) merge this PR, unless @lpw25 can come up with another counter example. It will be much easier to figure out whether a better solution exists once we start working on allowing non-values in records and on |
The check doesn't seem to reject anything: I rebased this branch onto |
5d2fea9
to
277b1d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks all good from my perspective. I still think we'll eventually switch to use layout variables, but there's no reason not to merge this in the meantime.
277b1d8
to
426888b
Compare
Summary
We only allow recursion in unboxed types to occur through boxes, otherwise the type is uninhabitable and usually also infinite-size.
Because
check_well_founded
already ruled out recursion through structural types (in our case, we care about unboxed tuples and aliases), we just look for a cycle in nominal unboxed types ([@@unboxed]
types and unboxed records), tracking the set of seen paths. This gives an algorithm that is simpler thancheck_well_founded
(whose correctness relies on some subtle properties ofTconstr
'sabbrev_memo
).We newly disallow recursive singleton
[@@unboxed]
types (type t = { t : t } [@@unboxed]
), which are uninhabitable anyway.Specification
For each group of mutually recursive type declarations, we define the following "type contains" transitive relation on type expressions:
type 'a t = #{ ...; lbl : u; ... }
,type 'a t = { lbl : u } [@@unboxed]
, ortype 'a t = U of u [@@unboxed]
is in the recursive group, then'a t
containsu
type 'a t = u
is in the recursive group then'a t
containsu
'a t
contains'a
iflayout_of 'a
orany
occurs in'a t
's layoutlayout_of
, so currently only considerany
#(u_1 * u_2 * ...)
contains allu_i
(
'a
is a metavariable for zero or more type parameters;t
is a metavariable for type constructors;u
is a metavariable for type expressions)If a path starting from the type expression on the LHS of a declaration contains two types with the same head type constructor, and that repeated type is an unboxed record or variant, then we give an error.
Examples
See
typing-layouts-unboxed-records/recursive.ml
for more.Alternatives
Layout variables. We could introduce layout variables (for this check only), finding infinite-size types via the occurs check. Such an implementation might be more principled, accepting finite-size recursive types like
type t = #{ t : t }
. The current implementation gives more informative error traces, and leaves the flexiblity to switch to a less restrictive check in the future.The algorithm from the Unboxed data constructors paper. This would accept types like
type 'a id = #{ a : 'a } and t = int id id
. I think this could be an appealing final solution, but some details are a bit subtle. See discussion on ocaml/ocaml#10479.Generalize
check_well_founded
. My first implementation, which roughly generalizescheck_well_founded
over the "type contains" relation, acceptstype 'a id = #{ a : 'a } and t = int id id
. (Onrtjoa.check-well-founded-unboxed-recursion
, unpolished.) I'd hesitate to increase our reliance oncheck_well_founded
, which I believe roughly implements the same algorithm as the paper above in a tricky way.Review: @goldfirere