fix: diagnostics in lines with multi-byte chars #35
There's a conflict between the way Lua interprets strings with multi-byte characters and the way we pass the `col` field through the patterns.
For example, the string `· example typox` is 15 characters long, but Lua counts the bytes in the string, not the number of printable characters. This means that for the same string, Lua returns 16 as the length.
The report coming from CSpell also counts only printable characters, so for a file like this:
test.md
* example typox
· example typox
The report will be:
npx cspell -c cspell.json lint --language-id markdown test.md
Both lines have the same column as the start of the unknown word, because CSpell doesn't count bytes when reporting the position of the error.
So when we read the column from the report we just forward whatever we got from the CSpell report.
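The mismatch can be reproduced in plain Lua. This is a minimal sketch, not code from the plugin: the `·` is written as its UTF-8 bytes (`\194\183`) so the example does not depend on the file encoding, and the character count is obtained by counting every byte that is not a UTF-8 continuation byte.

```lua
-- "· example typox", with the middle dot spelled out as its two UTF-8 bytes.
local line = "\194\183 example typox"

-- The length operator counts bytes.
local byte_len = #line

-- Count characters by counting all bytes that are NOT continuation bytes
-- (continuation bytes fall in the range 128-191).
local _, char_len = string.gsub(line, "[^\128-\191]", "")

print(byte_len, char_len)
```

Every column after the `·` is therefore off by one when a character-based column is used as a byte-based one.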
The `end_col` ends up with the correct position because we calculate it with the custom `from_quote` adapter, which finds the end column programmatically.
To counter that discrepancy, I'm using the column reported by CSpell only as an index to start looking for the word reported as an error in the `end_col` function, and mutating the entries table to define the `col` property in the same function.
I have a proof of concept that seems to work as expected. I'll test a few scenarios before I push anything.
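A rough sketch of that workaround, under stated assumptions, could look like the following. The function signature and the `entries.word` field are hypothetical and chosen for illustration only; the real adapter may differ. It also assumes CSpell reports a 1-based, character-based column.

```lua
-- Hypothetical adapter: `entries.col` is the column reported by CSpell
-- (assumed 1-based, counted in characters) and `entries.word` is the
-- flagged word. Both field names are illustrative.
local function end_col(entries, line)
  local reported = tonumber(entries.col)
  -- A character index never exceeds the matching byte index, so a plain
  -- (non-pattern) search can safely start at the reported column.
  local start_byte, end_byte = string.find(line, entries.word, reported, true)
  if not start_byte then
    return nil -- word not found; leave entries untouched
  end
  entries.col = start_byte -- overwrite with the byte-based column
  return end_byte -- index of the last byte of the word
end

local ascii = { col = 11, word = "typox" }
local ascii_end = end_col(ascii, "* example typox")

local multibyte = { col = 11, word = "typox" }
local multibyte_end = end_col(multibyte, "\194\183 example typox")
```

With the two-line `test.md` above, both entries start with the same reported column, but only the second one gets shifted to the byte position. One known weakness of this sketch: if the same word also appears between the reported column and the real position, the search will match the earlier occurrence.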
IMO, that feels a bit too hacky to keep as a long-term solution. We should look into validating the `col` property in none-ls.