-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normalize all line endings to Unix on the way in #78
Conversation
* text=auto eol=lf | ||
|
||
# Windows specific test files where we need CRLF endings | ||
crates/air_r_formatter/tests/specs/r/crlf/*.R text eol=crlf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sets it so that all files are checked out with LF endings except this one directory, which is forced to have CRLF endings (even on Mac, and particularly on CI)
fn format_file(path: &PathBuf) -> anyhow::Result<ExitStatus> { | ||
let text = std::fs::read_to_string(path)?; | ||
let contents = std::fs::read_to_string(path)?; | ||
|
||
let line_ending = line_ending::infer(&contents); | ||
|
||
// Normalize to Unix line endings | ||
let contents = match line_ending { | ||
LineEnding::Lf => contents, | ||
LineEnding::Crlf => line_ending::normalize(contents), | ||
}; | ||
|
||
let parser_options = RParserOptions::default(); | ||
let parsed = air_r_parser::parse(text.as_str(), parser_options); | ||
let parsed = air_r_parser::parse(contents.as_str(), parser_options); | ||
|
||
if parsed.has_errors() { | ||
return Ok(ExitStatus::Error); | ||
} | ||
|
||
let formatter_options = RFormatOptions::default(); | ||
// TODO: Respect user specified `LineEnding` option too, not just inferred line endings | ||
let line_ending = match line_ending { | ||
LineEnding::Lf => biome_formatter::LineEnding::Lf, | ||
LineEnding::Crlf => biome_formatter::LineEnding::Crlf, | ||
}; | ||
|
||
let formatter_options = RFormatOptions::default().with_line_ending(line_ending); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that the cli should
- infer the line endings from the contents
- normalize to unix before parsing/formatting
- convert back to inferred line endings on the way out (eventually respecting a forced
LineEnding
option if the user set one)
let input_code = std::fs::read_to_string(input_file).unwrap(); | ||
|
||
// Normalize to Unix line endings | ||
let input_code = line_ending::normalize(input_code); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a way to set options on a per test basis right now, but eventually it will be nice to have a test where we force the format output to be CRLF to ensure it is working right
|
||
use memchr::memmem; | ||
|
||
static FINDER: LazyLock<memmem::Finder> = LazyLock::new(|| memmem::Finder::new(b"\r\n")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the benefits of Finder
is that you can construct one up front and reuse it, like with regexes
https://github.com/astral-sh/uv/blob/81569c47bfa91b24ff0712baf1c001ef9604676e/crates/uv-scripts/src/lib.rs#L17
Lf, | ||
|
||
/// Carriage Return + Line Feed characters (\r\n), common on Windows | ||
Crlf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to the convention that biome_formatter::LineEnding
uses so it maps over nicely
/// # Source | ||
/// | ||
/// --- | ||
/// authors = ["rust-analyzer team"] | ||
/// license = "MIT OR Apache-2.0" | ||
/// origin = "https://github.com/rust-lang/rust-analyzer/blob/master/crates/rust-analyzer/src/line_index.rs" | ||
/// --- | ||
pub fn normalize(x: String) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it is more flexible to move towards providing attribution in doc comments
- It lets us structure the folders however we want, without needing a
rust-analyzer
folder - It allows us to change function names and whatnot while still pointing to the original source
9e89a60
to
12cb862
Compare
I feel pretty good about this one, we can iterate as needed! |
Closes #74