NaN integers can be returned by Char.toCode #1150

Open
Alch-Emi opened this issue Dec 11, 2024 · 2 comments

Alch-Emi commented Dec 11, 2024

It seems like (as of version 0.19.1) it's possible to construct a NaN : Int through the expression

Char.fromCode 0xd800 |> Char.toCode

After some research, this seems unintended (and I think NaN ints are unintended in general?). The expected return value would seem to be 0xFFFD (aka �), which is what you get when you feed most other invalid Unicode code points to this expression.
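A possible explanation, sketched in TypeScript because Elm compiles to JavaScript. This is a model under assumptions, not the Elm source: `fromCode` and `toCode` are hypothetical stand-ins for the compiled `Char.fromCode` and `Char.toCode`, and a `Char` is assumed to be a JavaScript string of one or two UTF-16 code units. Under those assumptions, 0xd800 becomes a lone high surrogate, and decoding it reads a second code unit that does not exist.

```typescript
// Hypothetical stand-in for Char.fromCode (an assumption, not the Elm source).
function fromCode(code: number): string {
  if (code < 0 || code > 0x10ffff) {
    return "\uFFFD"; // out of range: replacement character
  }
  if (code <= 0xffff) {
    // No surrogate check here, so 0xD800 yields a lone high surrogate.
    return String.fromCharCode(code);
  }
  const offset = code - 0x10000;
  return String.fromCharCode(
    Math.floor(offset / 0x400) + 0xd800,
    (offset % 0x400) + 0xdc00
  );
}

// Hypothetical stand-in for Char.toCode (an assumption, not the Elm source).
function toCode(char: string): number {
  const first = char.charCodeAt(0);
  if (first >= 0xd800 && first <= 0xdbff) {
    // Expects a low surrogate at index 1; charCodeAt(1) is NaN when it is absent.
    return (first - 0xd800) * 0x400 + char.charCodeAt(1) - 0xdc00 + 0x10000;
  }
  return first;
}

const lone = toCode(fromCode(0xd800));         // NaN
const outOfRange = toCode(fromCode(0x110000)); // 0xFFFD
const astral = toCode(fromCode(0x1f600));      // 0x1F600 (round-trips)
```

In this model only the out-of-range branch produces 0xFFFD, which would match the observation that most other invalid inputs give the replacement character while 0xd800 does not.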

I did a cursory search of other issues in this repo and it doesn't seem like anyone else has opened an issue for this, but please excuse me if I have missed something. Oh, I just read the duplicates policy! That's lovely!!

Thank you for your time!

OS: NixOS 24.11 on Linux 6.6.63

Occurs in the REPL and Firefox 133.0

Thanks for reporting this! To set expectations:

  • Issues are reviewed in batches, so it can take some time to get a response.
  • Ask questions on a community forum. You will get an answer more quickly that way!
  • If you experience something similar, open a new issue. We like duplicates.

Finally, please be patient with the core team. They are trying their best with limited resources.

lue-bird commented Dec 11, 2024

We found this behavior a while ago and wrote a small explainer
https://github.com/stil4m/elm-syntax/blob/master/src/Char/Extra.elm#L370-L384

Funnily enough, without some form of this behavior, the elm-syntax parser would be many times slower because there is no other way to check for UTF-16 surrogates currently.
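For illustration, a TypeScript sketch of how a NaN result could double as a surrogate check. This is not the elm-syntax code: `decodeChar` and `isLoneHighSurrogate` are hypothetical names, and `decodeChar` is only assumed to behave like the compiled `Char.toCode`.

```typescript
// Hypothetical model of Char.toCode on a JS string holding one Char.
function decodeChar(char: string): number {
  const first = char.charCodeAt(0);
  if (first >= 0xd800 && first <= 0xdbff) {
    // A missing second code unit makes charCodeAt(1) return NaN.
    return (first - 0xd800) * 0x400 + char.charCodeAt(1) - 0xdc00 + 0x10000;
  }
  return first;
}

// A one-unit slice that cuts a surrogate pair in half decodes to NaN,
// so a NaN test flags it without an explicit range comparison.
function isLoneHighSurrogate(char: string): boolean {
  return isNaN(decodeChar(char));
}

const source = "a\uD83D\uDE00";        // "a" followed by one astral character
const firstUnit = source.slice(0, 1);  // "a"
const secondUnit = source.slice(1, 2); // high half of the surrogate pair
const wholePair = source.slice(1, 3);  // complete surrogate pair
```

Here `isLoneHighSurrogate(secondUnit)` is true while `firstUnit` and `wholePair` decode to ordinary numbers. Note that in this model a lone low surrogate (0xDC00 to 0xDFFF) decodes to its own code rather than NaN, so the check only catches the high half.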
