Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added Format trait to support different BNF syntaxes, and a basic implementation of ABNF to demonstrate #154

Merged
merged 10 commits into from
Jan 5, 2025

Conversation

Carlyle-Foster
Copy link
Contributor

@Carlyle-Foster Carlyle-Foster commented Jan 2, 2025

i'm curious too hear peoples thoughts about this approach, other approaches i considered are:

functional programming
by this i mean making all the basic parsing functions higher-order functions
so you'd pass the format as a normal parameter and it 'binds' the parameter, returning a closure that has it
built in and thus can be composed with nom combinators

i rejected it because
it seems massively complicated to me and the use of closures would change the performance characteristics(EDIT: sic, we already use closures but they're monomorphized at compile-time, that's the whole idea behind nom)

global variables
pretty self-explanatory, you have a static format with a setter and getter to hide how it's implemented,
could be through a RwLock, i even tried an unsafe static mut because i wasn't sure that generating
multiple BNF's of different formats in parallel was really desirable to support

i rejected it because
aside from my obvious unease about globals, it turns out that it breaks the whole test suite in
unpredictable ways because reason tests all "see" the same global variable, having the tests that use
ABNF set it back to the default afterwards and using the RwLock implementation helped a bit,
but i think i'd have to rewrite all the tests to make sure this new "environment" is set correctly and i' don't see
why i should when other options set the bar much higher

code duplication
the idea is i just copy and paste the entirety of parsers.rs into a new module and just change the low-level
prod_lhs() and nonterminal, this was what i tried at first, you can see it working in the second commit
here, this option is already pretty good because there's not much code to duplicate and it's not very complex

i only rejected code duplication because i think the Format trait is a better way of generating the same code
but with the duplication done automatically so it's always in sync, at the minor cost of always requiring a type parameter,
even up to the interface, i wanted to have it be an argument in the API at first but now that would be misleading as to how
it works, no data is actually transferred at runtime

@coveralls
Copy link

coveralls commented Jan 2, 2025

Coverage Status

coverage: 98.263% (-0.2%) from 98.471%
when pulling 64bea7e on Carlyle-Foster:main
into c6fb531 on shnewto:main.

@Carlyle-Foster
Copy link
Contributor Author

i added support for comments! there's no dedicated tests but i added some random comments to the example example grammars in tests/fixtures so it is still tested, it's just not obvious

this closes #37 i think

@Carlyle-Foster
Copy link
Contributor Author

added automatic detection of grammar in Grammar::FromStr(), as suggested in #17

@shnewto shnewto requested a review from CrockAgile January 2, 2025 15:25
@CrockAgile
Copy link
Collaborator

wow this looks great! I will be sure to give it a solid review soon 👍

@Carlyle-Foster
Copy link
Contributor Author

Carlyle-Foster commented Jan 3, 2025

@CrockAgile somewhat off-topic, but are the docs for benching out-of-date? it seems you've moved from criterion to divan but the docs don't even mention divan and criterion.rs is still there, when you run cargo bench like it says both get run but they seem to have the same tests so it takes twice as long as it should(more than that actually because criterion seems to be slower)

@Carlyle-Foster
Copy link
Contributor Author

Carlyle-Foster commented Jan 3, 2025

ideally every format should have it's own feature flag, but i'll have to genericise the test suite before i can do that

@Carlyle-Foster
Copy link
Contributor Author

do we support unicode? the types imply it but i'm not certain that we aren't assuming it's ASCII somewhere, i would hope not since we're using &str

@CrockAgile
Copy link
Collaborator

CrockAgile commented Jan 5, 2025

First, thanks for the contribution! I don't see anything immediately blocking merging this. I particularly liked your explanation of the alternatives considered, and why the Format trait would be preferred.

I've gone thru the PR and I agree with your comments that there is work left to do, but I don't want to let perfection stand in the way of progress. So I am going to merge this, and we can iterate to address:

  • ideally every format should have it's own feature flag, but i'll have to genericise the test suite before i can do that

  • more property tests to flex the ABNF parser
  • anything else you think would be a good next step

@CrockAgile somewhat off-topic, but are the docs for benching out-of-date? it seems you've moved from criterion to divan but the docs don't even mention divan and criterion.rs is still there, when you run cargo bench like it says both get run but they seem to have the same tests so it takes twice as long as it should(more than that actually because criterion seems to be slower)

yes the benchmarking docs are a bit out of date! when divan was added, it was still relatively new. it may be time to reevaluate which benchmark suite best matches the crate's needs

do we support unicode? the types imply it but i'm not certain that we aren't assuming it's ASCII somewhere, i would hope not since we're using &str

yes unicode should be supported! I will make an issue to add track adding more tests and explicit documentation for this

@CrockAgile CrockAgile merged commit 2479db7 into shnewto:main Jan 5, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants