-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A few tests #7
base: main
Are you sure you want to change the base?
A few tests #7
Conversation
Thanks, this looks very useful to me.
|
Agree with your preference, because that seems most useful (but not sure I agree that it matches what happens in most languages -- in languages such as C++/Java this would be an error because the loop variable is not alive after the loop, and in Python a loop such as
Do you know a good clear reference of what exactly that includes? For instance, when I search for it, one of the top hits is the boost documentation (https://www.boost.org/doc/libs/1_50_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html) -- that one claims to describe ERE's and does include stuff like
With the current(*) version the limit seems to be somewhere around 50k: (*) throughout this thread when I say "current", I really mean the 2018-04-22 release version. |
Thanks indeed, let me add my 2¢ on those points too:
In summary, I would discard testprogfloatproc.in, I'm not sure about testdatarecharclass.in and I would add testprogducktype.in as a fail case. |
I agree with Tobias on |
Loop counter should be incremented after the last iteration.
Correction on the regex size, re2 docs (https://github.com/google/re2/wiki/Syntax) state: |
Single test case taken from #7. Thanks Per.
…tions. As pointed out by Per Austrin, the value of the loop variable after the loop was undefined. Define it to be the number of iterations, i.e. one beyond the last iteration value. Also add a modified test case provided by Per in #7.
Looking up basic/extended regex at http://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap09.html and https://www.regular-expressions.info/posix.html, the main difference seems to be that in BRE some characters |
The first commit of this PR contains two tests where the antlr rewrite (#3) currently appears wrong, failing to run scripts with e.g.
\(
escapes in regexes, and scripts using string comparisons. These two you should want.The second commit contains a few more interesting tests that you might not necessarily want, but they exhibit some discrepancies between past, current, and future versions of checktestdata, so think of them as highlighting things you might want to specify better:
testprogducktype.in
: this sets variablea
first to an integer, and then starts assigninga[i]
as if it was an array. In current checktestdata this works fine, but in the antlr-rewrite this gives an error. Would be good to specify expected behaviour for thistestdataloopvar.in
: this uses a loop iteration variable after the loop, in particular aWHILEI
variable. In the current checktestdata, after the loop the value of the loop variable equals the number of iterations that were run, but in the antlr-rewrite it is one less. Would be good to specify expected behaviour for this.testdatarecharclass.in
: uses the common\s
,\d
, etc shorthands for character classes in regexes. This used to work in some older version of checktestdata, but does not work in the current version or in the antlr-rewrite. Would be good to specify exact regex dialect that is allowed in checktestdata.testdatarerepetitions.in
: uses an RE with a somewhat large number of repetitions[a-z]{1,1500}
. In current checktestdata this works, in antl-rewrite this throws an exception. It is fair that max number of repetitions depends on implementation I guess, but a bit disappointing that the bound is so low in the rewrite.testprogfloatproc.in
: ok this is not particularly interesting, but it seems the antlr-rewrite uses a different floating point type than the current version -- with current checktestdata adding0.004
to itself 250 times results in a value<= 1.0
, but in the antlr-rewrite the floating point errors go the other way and one gets a value slightly larger than 1.0. This is probably not actionable in any way, just wanted to point it out.These were found by running the two checktestdata versions on all checktestdata scripts for problems on Kattis (there were around 700 or so of them), so while some of them may sound a bit contrived, all of them did in some sense occur in practice.
Let me know if you think any of these are interesting enough to keep, and then I can drop the rest from the PR.