Treat empty strings as generic for column type deduction, allow thousands separators #243

Merged
astanin merged 1 commit into astanin:master on Oct 8, 2024

Conversation

pjkundert
Contributor

@pjkundert pjkundert commented Jan 27, 2023

Allow missing (None) and empty ("") cells to be treated the same, for the purposes of deducing the column type.

This lets a column of mostly numeric data contain both empty cells (rendered without the missingval) and missing cells (rendered with the missingval), while the column's type and formatting are still deduced correctly.
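The deduction rule described above can be sketched roughly as follows. This is a minimal illustration, not tabulate's actual implementation: `cell_type`, `column_type` and the `_RANK` table are hypothetical names, and the sketch assumes the column type is simply the "widest" type among the non-empty cells.

```python
NoneType = type(None)

# Hypothetical ordering: a column takes the widest type seen among its cells.
_RANK = {NoneType: 0, int: 1, float: 2, str: 3}


def cell_type(value):
    """Classify one cell; None and "" are both treated as type-neutral."""
    if value is None or value == "":
        return NoneType
    if isinstance(value, bool):
        return str  # assumption: booleans are rendered as text in this sketch
    if isinstance(value, int):
        return int
    if isinstance(value, float):
        return float
    text = str(value)
    for candidate in (int, float):
        try:
            candidate(text)
            return candidate
        except ValueError:
            pass
    return str


def column_type(values):
    """Deduce a column's type, ignoring empty and missing cells."""
    return max((cell_type(v) for v in values), key=_RANK.__getitem__, default=NoneType)


print(column_type([1, "", None, 2]))   # empty/missing cells do not force str
print(column_type(["1.5", "", 2]))
print(column_type(["n/a", 1]))
```

With a rule like this, a column such as `[1, "", None, 2]` stays an int column, so its numeric alignment and formatting are preserved even though one cell is blank and one is missing.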

Resolves #242

@pjkundert pjkundert changed the title Treat empty strings as generic, for column type deduction Treat empty strings as generic for column type deduction, fix SEPARATING_LINE for padded formats Feb 4, 2023
@eliegoudout
Contributor

I think you should split your PR in two:

  • One for the SEPARATING_LINE fix
  • One for your proposed modification.

Indeed, merging the fix should not be conditional on accepting/merging your proposed modification.

@pjkundert pjkundert force-pushed the feature-empties branch 2 times, most recently from 6c310a0 to 887861d Compare July 20, 2023 21:03
@pjkundert pjkundert changed the title Treat empty strings as generic for column type deduction, fix SEPARATING_LINE for padded formats Treat empty strings as generic for column type deduction, allow thousands separators Jul 20, 2023
@pjkundert
Contributor Author

OK, removed SEPARATING_LINE fix.

This PR now deals only with the deduction of int and float columns, and supports thousands separators in ints and floats.

o Empty/None values are ignored when deducing the type of a column
o Comma thousands separators are allowed in int and float values
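As a rough illustration of the second point, a parser that accepts comma-grouped numbers might look like the sketch below. It is an assumption-laden example rather than the code in this PR: `parse_number` and `_GROUPED` are hypothetical names, and the sketch only accepts commas that sit at proper three-digit group boundaries.

```python
import re

# Hypothetical pattern: optional sign, 1-3 leading digits, one or more
# ",ddd" groups, and an optional fractional part.
_GROUPED = re.compile(r"[+-]?\d{1,3}(?:,\d{3})+(?:\.\d+)?")


def parse_number(text):
    """Parse an int or float, allowing commas as thousands separators.

    Raises ValueError if the text is not a number.
    """
    stripped = text.strip()
    if _GROUPED.fullmatch(stripped):
        stripped = stripped.replace(",", "")
    try:
        return int(stripped)
    except ValueError:
        return float(stripped)  # raises ValueError for non-numeric text


print(parse_number("1,234,567"))
print(parse_number("-1,234.50"))
```

Restricting the commas to three-digit groups is a design choice for this sketch: it keeps strings such as `"12,34"` from being silently read as numbers.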
@pjkundert
Contributor Author

Please consider merging this one, @astanin. It correctly deduces the types of int and float values with comma separators (important!), and also correctly deduces the types of columns with empty cells (very important for bigger tables with complex data and missing values!)

Thanks for all your work on this!

@astanin
Owner

astanin commented Oct 8, 2024

@pjkundert I think this can be merged. Thank you.

@astanin astanin self-requested a review October 8, 2024 08:55
@astanin astanin merged commit f0437c1 into astanin:master Oct 8, 2024
5 of 7 checks passed
Development

Successfully merging this pull request may close these issues.

Ignore empty strings when deducing column types
3 participants