You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi,
In the grammar I am writing,
there are a few places where I need to match against one of a large selection of constants.
Which I will call a_long_list, it might have 20 elements, it might have 200.
In theory, I'm sure there are use cases for matching against one of thousands, or tens of thousands.
Alt(Equals.(a_long_list)...) works.
But I understand that it will be O(n*m)
for n the length of the list, and m the maximum length of any element of that list
Which honestly isn't too bad, I think.
But I figure it can be done better.
If a Trie is used, I think this becomes just O(m).
I might be screwing up my math here, but I think that in the process of finding the longest match, one automatically finds all the shorter matches, which can be saved for if backoff is required.
So there is no need to re-step through the source, if one fails.
I've started working on this, I have the Trie stuff working to return what strings match, I just need to like it into a matcher, with the trampoline stuffs, Success/Fail/Execute.
What I currently propose is a matcher:
EqualsOneOf(values; greedy::Bool=true)
Where values are the values that it could be equal to.
If greedy is false then this matches shortest-first.
And if it has to back-off, then gives the second shortest string in values that matches,
etc.
This is the natural order for a Trie.
If greedy is true, then it matches longest-first (the longest string that is in values first), and then if that needs to be backedoff from, match's the second longest, and so forth.
The is accomplished by collecting, all the values that match as Trie keys,
and then reverseing the order. (Which i guess does make this O(n+m))
This is a bit less expressive than Alt(Equals.(a_long_list)...) since that lets you choose a priority for the matchers, not just longestfirst or shortestfirst
What do you think?
When I have it working, should I make a PR?
hi. i think that makes sense. if tries can only support strings then the matcher would need to have a more restricted type, but i don't see why it wouldn't still work for strings (but honestly it's been ages since i last used this library, or even julia). if you submit a pr i will certainly try to include it (but can't promise a fast response).
Finally came back to this yesterday, and finished my prototype.
It took me a long time to get my head around your code.
(Things you call iter I would call iter_state)
I want to make it lazy in the case of shortest first matching.
But that means not using Task based generator expressions.
I probably won't poke this code again for another month or so.
Right now it is very tied to strings.
And even onces the requirement on Tries is relaxed, it is going to take some work to make if functional for things that are not index/view-able.
Hi,
In the grammar I am writing,
there are a few places where I need to match against one of a large selection of constants.
Which I will call
a_long_list
, it might have 20 elements, it might have 200.In theory, I'm sure there are use cases for matching against one of thousands, or tens of thousands.
Alt(Equals.(a_long_list)...)
works.But I understand that it will be
O(n*m)
for
n
the length of the list, andm
the maximum length of any element of that listWhich honestly isn't too bad, I think.
But I figure it can be done better.
If a Trie is used, I think this becomes just
O(m)
.I might be screwing up my math here, but I think that in the process of finding the longest match, one automatically finds all the shorter matches, which can be saved for if backoff is required.
So there is no need to re-step through the source, if one fails.
I've started working on this, I have the Trie stuff working to return what strings match, I just need to like it into a
matcher
, with the trampoline stuffs,Success/Fail/Execute
.What I currently propose is a matcher:
EqualsOneOf(values; greedy::Bool=true)
Where
values
are the values that it could be equal to.If
greedy
isfalse
then this matches shortest-first.And if it has to back-off, then gives the second shortest string in values that matches,
etc.
This is the natural order for a Trie.
If
greedy
istrue
, then it matches longest-first (the longest string that is in values first), and then if that needs to be backedoff from, match's the second longest, and so forth.The is accomplished by
collect
ing, all the values that match as Trie keys,and then
reverse
ing the order. (Which i guess does make thisO(n+m)
)This is a bit less expressive than
Alt(Equals.(a_long_list)...)
since that lets you choose a priority for the matchers, not just longestfirst or shortestfirstWhat do you think?
When I have it working, should I make a PR?
One issue is that right now, Tries only support strings.
JuliaCollections/DataStructures.jl#220
The text was updated successfully, but these errors were encountered: