Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: diagnostics in lines with multi-byte chars #35

Merged
merged 1 commit into from
Nov 21, 2023

Conversation

davidmh
Copy link
Owner

@davidmh davidmh commented Nov 21, 2023

There's a conflict between the way Lua interprets strings with multi-byte characters and the way we pass the col field through the patterns.

For example, the length for the string: · example typox in every other language would be 15, but Lua counts the bytes in the string, not the number of printable characters. This means that for the same string, lua returns 16 as the length of the string.

The report coming from CSpell also counts only printable characters, so for a file like this:

test.md

* example typox
· example typox

The report will be:

npx cspell -c cspell.json lint --language-id markdown test.md

1/1 ./test.md 163.45ms X
./test.md:1:11 - Unknown word (typox)
./test.md:2:11 - Unknown word (typox)

Both lines have the same column as the start of the unknown word, because CSpell doesn't count bytes when reporting the position of the error.

So when we read the column from the report we just forward whatever we got from the CSpell report.

The end_col ends up with the correct position because we calculate it with the custom from_quote adapter, which finds the end column programmatically.

To counter that discrepancy, I'm using the column reported by CSpell only as an index to start looking for the word reported as an error in the end_col function, and mutating the entries table to define the col property in the same function.

I have a proof of concept that seems to work as expected, I'll test a few scenarios before I push anything.

IMO, that feels a bit too hacky to keep as a long-term solution, we should look into validating the col property in none-ls.

There's a conflict between the way Lua interprets strings with
multi-byte characters and the way we pass the `col` field through the
patterns.

For example, the length for the string: `* example typox` in every other
language would be `15`, but Lua counts the bytes in the string, not the
number of printable characters. This means that for the same string, lua
returns `16` as the length of the string.

The report coming from CSpell also counts only printable characters, so
for a file like this:

`test.md`
```markdown
* example typox
· example typox
```

The report will be:

`npx cspell -c cspell.json lint --language-id markdown test.md`
```
1/1 ./test.md 163.45ms X
./test.md:1:11 - Unknown word (typox)
./test.md:2:11 - Unknown word (typox)
```

Both lines have the same column as the start of the unknown word,
because CSpell doesn't count bytes when reporting the position of the
error.

So when we read the column from the report we just forward whatever we
got from the CSpell report.

The `end_col` ends up with the correct position because we calculate it
with the custom `from_quote` adapter, which finds the end column
programmatically.

To counter that discrepancy, I'm using the column reported by CSpell
only as an index to start looking for the word reported as an error in
the `end_col` function, and mutating the entries table to define the
`col` property in the same function.

I have a proof of concept that seems to work as expected, I'll test a
few scenarios before I push anything.

IMO, that feels a bit too hacky to keep as a long-term solution, we
should look into validating the `col` property in none-ls.
@davidmh davidmh merged commit d21276d into main Nov 21, 2023
5 checks passed
@davidmh davidmh deleted the fix/misplaced-diagnostics branch November 21, 2023 03:38
@davidmh
Copy link
Owner Author

davidmh commented Nov 21, 2023

@eshepelyuk this is a temporary workaround for nvimtools/none-ls.nvim#19 the long-term solution should be implemented in none-ls, so we don't have to patch it in the sources.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant