Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement "leading hole hugging" behavior #82

Merged
merged 5 commits into from
Dec 5, 2024
Merged

Conversation

DavisVaughan
Copy link
Collaborator

@DavisVaughan DavisVaughan commented Dec 5, 2024

Closes #32

Two main changes:

  • Implemented hugging for what I call "leading holes", which greatly improves data.table formatting where its common to have a leading , hole for an empty i expression (explained below)
  • Was forced to sit down and confront comment handling for holes, something I had some TODO comments in there for and knew we'd have to do eventually

Leading holes

Imagine you have this data.table expression, and you want to expand it over multiple lines

dt2 = dt[, .(total_value_received = sum(price * count)), by = "product_id"]

Before this PR, your only option was to hit enter between the [ and the first , and you'd get this:

dt2 = dt[
  , 
  .(total_value_received = sum(price * count)), 
  by = "product_id"
]

Not great!

After this PR, the "hole" [<here>, is considered a "leading hole". Leading holes "hug" tightly to the opening [ token no matter what, so you get this:

dt2 = dt[,
  .(total_value_received = sum(price * count)),
  by = "product_id"
]

So much nicer! This is pretty common with data.table users, and even chains nicely:

dt2 = ProductReceived[
  Products,
  on = c("product_id" = "id"),
][,
  .(total_value_received = sum(price * count)),
  by = "product_id"
]

Note that you can have any number of repeated leading ,s, they all get aggregated together as a set of leading holes.

If there is a comment within a hole, it is no longer a candidate to be a "leading hole" and is instead considered as a normal argument, subject to typical "fully expanded" rules

dt2 = dt[
	# no longer a leading hole
	,
	.(total_value_received = sum(price * count)),
	by = "product_id"
]

To be clear, only holes up front can be leading holes. Trailing and intermediate holes still fully break, like this

dt2 = dt[,
	.(total_value_received = sum(price * count)),
	,
	by = "product_id"
]

I think this is good behavior because these holes are relatively "weird" and the extra spacing makes them very visible.

Leading holes and user requested line breaks

We treat leading holes as roughly invisible when detecting if the user requested a line break. With the example from before:

dt2 = dt[, .(total_value_received = sum(price * count)), by = "product_id"]

You can actually add a newline before or after that , and it will be treated like a user line break. The , will snap back to hugging the [ tightly but the arguments will get expanded. The principle for user requested expansion here is:

The user added a line break anywhere before the first non-hole argument in a call

This seems to result in the least surprising behavior.

Comment on lines +395 to +401
/// ```r
/// fn(
/// a,<hole> # comment
/// ,
/// b
/// )
/// ```
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holes and comments are quite tricky to think about. In the following example:

fn(
  a, # comment
  ,
  b
)

you might think the formatter assumes the hole is here

fn(
  a, # comment
  <hole>,
  b
)

That would make # comment trailing on a automatically with the Default comment placement behavior.

But that isn't what happens! Instead, holes are placed "eagerly" like this:

fn(
  a,<hole># comment
  ,
  b
)

This causes # comment to be a trailing comment on the hole by default. This trailing comment behavior ALWAYS happens with holes, and is something we exploit as it simplifies our hole comment handling. It is also how the biome group handles array hole comments, so I feel good about it:
https://github.com/biomejs/biome/blob/66458a12a048b0e686b242d9d4dd0f0e90d1f92a/crates/biome_js_formatter/src/comments.rs#L241-L242

Comment on lines +129 to +140
let mut iter_elements = items.elements();

// Split leading holes out from the remainder of the arguments.
// Leading holes tightly hug the `l_token` no matter what.
// Because they are intended to tightly hug, a node is only considered
// a leading hole if there isn't a comment attached.
let leading_holes: Vec<_> = iter_elements
.take_while_ref(|element| {
element.node().map_or(false, |node| {
node.is_hole() && !comments.has_comments(node.syntax())
})
})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take_while_ref() is a neat feature from itertools that allows you to, say, take the first n elements out of an iterator without consuming the iterator, which allows you to then do two separate collect() calls to split the iterator results into two vectors.

@DavisVaughan DavisVaughan merged commit 11fa74d into main Dec 5, 2024
4 checks passed
@DavisVaughan DavisVaughan deleted the feature/hole-hugging branch December 5, 2024 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle user requested expansion when the first argument is a hole
1 participant