Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize all line endings to Unix on the way in #78

Merged
merged 4 commits into from
Dec 6, 2024

Conversation

DavisVaughan
Copy link
Collaborator

Closes #74

Comment on lines 1 to +4
* text=auto eol=lf

# Windows specific test files where we need CRLF endings
crates/air_r_formatter/tests/specs/r/crlf/*.R text eol=crlf
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets it so that all files are checked out with LF endings except this one directory, which is forced to have CRLF endings (even on Mac, and particularly on CI)

Comment on lines 32 to +56
fn format_file(path: &PathBuf) -> anyhow::Result<ExitStatus> {
let text = std::fs::read_to_string(path)?;
let contents = std::fs::read_to_string(path)?;

let line_ending = line_ending::infer(&contents);

// Normalize to Unix line endings
let contents = match line_ending {
LineEnding::Lf => contents,
LineEnding::Crlf => line_ending::normalize(contents),
};

let parser_options = RParserOptions::default();
let parsed = air_r_parser::parse(text.as_str(), parser_options);
let parsed = air_r_parser::parse(contents.as_str(), parser_options);

if parsed.has_errors() {
return Ok(ExitStatus::Error);
}

let formatter_options = RFormatOptions::default();
// TODO: Respect user specified `LineEnding` option too, not just inferred line endings
let line_ending = match line_ending {
LineEnding::Lf => biome_formatter::LineEnding::Lf,
LineEnding::Crlf => biome_formatter::LineEnding::Crlf,
};

let formatter_options = RFormatOptions::default().with_line_ending(line_ending);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that the cli should

  • infer the line endings from the contents
  • normalize to unix before parsing/formatting
  • convert back to inferred line endings on the way out (eventually respecting a forced LineEnding option if the user set one)

Comment on lines 29 to +33
let input_code = std::fs::read_to_string(input_file).unwrap();

// Normalize to Unix line endings
let input_code = line_ending::normalize(input_code);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a way to set options on a per test basis right now, but eventually it will be nice to have a test where we force the format output to be CRLF to ensure it is working right


use memchr::memmem;

static FINDER: LazyLock<memmem::Finder> = LazyLock::new(|| memmem::Finder::new(b"\r\n"));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the benefits of Finder is that you can construct one up front and reuse it, like with regexes
https://github.com/astral-sh/uv/blob/81569c47bfa91b24ff0712baf1c001ef9604676e/crates/uv-scripts/src/lib.rs#L17

Comment on lines +14 to +17
Lf,

/// Carriage Return + Line Feed characters (\r\n), common on Windows
Crlf,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched to the convention that biome_formatter::LineEnding uses so it maps over nicely

Comment on lines +36 to +43
/// # Source
///
/// ---
/// authors = ["rust-analyzer team"]
/// license = "MIT OR Apache-2.0"
/// origin = "https://github.com/rust-lang/rust-analyzer/blob/master/crates/rust-analyzer/src/line_index.rs"
/// ---
pub fn normalize(x: String) -> String {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it is more flexible to move towards providing attribution in doc comments

  • It lets us structure the folders however we want, without needing a rust-analyzer folder
  • It allows us to change function names and whatnot while still pointing to the original source

@DavisVaughan DavisVaughan requested a review from lionel- December 2, 2024 21:57
@DavisVaughan DavisVaughan force-pushed the feature/normalize-line-endings branch from 9e89a60 to 12cb862 Compare December 6, 2024 18:48
@DavisVaughan DavisVaughan merged commit 5d5a2db into main Dec 6, 2024
4 checks passed
@DavisVaughan DavisVaughan deleted the feature/normalize-line-endings branch December 6, 2024 18:52
@DavisVaughan
Copy link
Collaborator Author

I feel pretty good about this one, we can iterate as needed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Normalize line endings in multiline strings
1 participant