Restrict Motif discovery to subsequences starting at specific locations #49
Comments
@peterdhansen I think that'd be a great contribution! We're looking to grow out more of our utility functions that go beyond the core algorithms. Feel free to make a PR and we can collaborate. |
I think it would be good to have this functionality. As you mention, it seems fairly trivial to implement. The "harder" thing to do is to write a blog post explaining when the approach is useful. Are you interested in adding the code, unit tests, and a blog post? @peterdhansen |
Sounds good. I'll give it a shot. |
So, I just had another idea. |
Or is the snippets algorithm doing this (for regularly spaced index selection) (Sorry for triple post) |
I think that'd be useful. In my original Hacker News post for matrixprofile-ts I proposed doing something similar, and folks seemed really interested.
|
We could use an approach similar to how missing data is handled. The stomp implementation handles this right now, and we are working on adding similar functionality to mpx. Essentially, the user provides a boolean array marking which indices to process or skip. I envision it working like an annotation vector. All other distances in the profile can simply return nan. Another approach could be to require users to have valid time domains, using a Pandas time series or something similar. That way users can specify intervals of interest. |
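As a rough illustration of the boolean-array idea above, here is a minimal NumPy sketch. It post-processes an already computed profile and is not how stomp or mpx handle missing data internally; the function name `mask_profile` and the example values are hypothetical.

```python
import numpy as np

def mask_profile(profile, keep):
    """Return a copy of `profile` with entries outside `keep` set to NaN.

    `keep` is a boolean array: True means "process this start index",
    False means "skip it". Both names are hypothetical.
    """
    profile = np.asarray(profile, dtype=float)
    keep = np.asarray(keep, dtype=bool)
    if profile.shape != keep.shape:
        raise ValueError("profile and keep must have the same shape")
    return np.where(keep, profile, np.nan)

# Toy profile with five subsequence start positions.
mp = np.array([0.5, 1.2, 0.1, 2.0, 0.3])
keep = np.array([True, False, False, True, True])

masked = mask_profile(mp, keep)
# NaN-aware argmin ignores the skipped positions.
best_start = int(np.nanargmin(masked))
```

Here the globally smallest value (index 2) is skipped, so the best allowed start index is 4.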
Snippets does not do this. It identifies k representative snippets and n neighbors. It helps to answer what is common in the series of interest. |
@peterdhansen Any updates on this? |
Sorry, not yet. I'll take a look in the next week or so. |
Got my environment set up 😄 |
@peterdhansen just wanted to circle back and see if you're still interested in contributing. Happy Holidays! |
When I am analyzing data that has daily fluctuations, I create an annotation vector that is 1 at midnight of each day and 0 everywhere else. This helps prioritize subsequences that start at midnight, so each set of motifs has the same 24-hour structure.
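A minimal sketch of such an annotation vector, assuming a regularly sampled series whose first sample falls on midnight (or at a known offset from it). The function and parameter names are hypothetical and not part of any library.

```python
import numpy as np

def midnight_annotation_vector(n_samples, window, samples_per_day, first_midnight=0):
    """Annotation vector that is 1 at subsequences starting at midnight, else 0.

    Assumes regular sampling; `first_midnight` is the index of the first
    sample that falls on midnight.
    """
    # One entry per subsequence, same length as the matrix profile.
    n_subsequences = n_samples - window + 1
    av = np.zeros(n_subsequences)
    av[first_midnight::samples_per_day] = 1.0
    return av

# One week of hourly data with a 24-hour window.
av = midnight_annotation_vector(n_samples=24 * 7, window=24, samples_per_day=24)
midnight_starts = np.flatnonzero(av)
```

For a week of hourly data this marks seven start positions, one per day.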
The issue is that applying an annotation vector does not prevent the motif algorithm from picking a motif pair where one starts at midnight and the other does not. A new mechanism would have to be defined to restrict these.
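One way to sketch such a restriction is a brute-force search that only ever compares subsequences starting at allowed locations, so both members of the pair satisfy the constraint by construction. This is an illustration in plain NumPy, not the library's motif algorithm; `restricted_motif_pair`, `starts`, and `exclusion` are hypothetical names, and the default exclusion of half a window is an assumption.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to zeros."""
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std

def restricted_motif_pair(ts, window, starts, exclusion=None):
    """Closest pair of subsequences that BOTH begin at an index in `starts`.

    Returns (distance, i, j); (inf, None, None) if no valid pair exists.
    """
    ts = np.asarray(ts, dtype=float)
    if exclusion is None:
        exclusion = window // 2  # assumed trivial-match zone
    starts = [s for s in starts if 0 <= s <= len(ts) - window]
    subs = [znorm(ts[s:s + window]) for s in starts]
    best = (np.inf, None, None)
    for a in range(len(starts)):
        for b in range(a + 1, len(starts)):
            if abs(starts[a] - starts[b]) < exclusion:
                continue  # skip overlapping, trivially similar pairs
            d = float(np.linalg.norm(subs[a] - subs[b]))
            if d < best[0]:
                best = (d, starts[a], starts[b])
    return best

# Four synthetic "days" of hourly data; days 0 and 2 share the same shape.
t = np.arange(24)
day_a = np.sin(2 * np.pi * t / 24)
day_b = t / 24.0
day_c = (t % 6).astype(float)
ts = np.concatenate([day_a, day_b, day_a, day_c])

dist, i, j = restricted_motif_pair(ts, window=24, starts=[0, 24, 48, 72])
```

The cost is quadratic in the number of allowed start positions, which is usually small for daily constraints (one per day).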
Also, distance profiles that are calculated inside the motif algorithm do not apply the annotation vector. This could be added and triggered when `use_cmp = True`, without any new mechanisms.
I can write custom motif-finding code that does this, but if others would like the functionality I'd be happy to contribute.
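To illustrate what applying the annotation vector to a distance profile could look like, here is a small sketch. It assumes the same correction form commonly used for the corrected matrix profile, `CMP = MP + (1 - AV) * max(MP)`, and applies it to a single distance profile. The function name is hypothetical, and this is not the library's implementation.

```python
import numpy as np

def corrected_distance_profile(distance_profile, av):
    """Penalize positions with a low annotation value.

    Assumed form: dp + (1 - av) * max(dp), where the max is taken over
    finite entries only (exclusion zones are often set to inf).
    """
    dp = np.asarray(distance_profile, dtype=float)
    av = np.asarray(av, dtype=float)
    if dp.shape != av.shape:
        raise ValueError("distance_profile and av must have the same shape")
    finite = dp[np.isfinite(dp)]
    penalty = finite.max() if finite.size else 0.0
    return dp + (1.0 - av) * penalty

dp = np.array([0.2, 0.1, 0.9, 0.4])
av = np.array([1.0, 0.0, 0.0, 1.0])  # only positions 0 and 3 are preferred

corrected = corrected_distance_profile(dp, av)
nearest = int(np.argmin(corrected))
```

Without the correction the nearest neighbor would be position 1; with it, the preferred position 0 wins. Note that this only deprioritizes positions. It does not strictly forbid them, which is why a hard restriction would still need a separate mechanism.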