Parallel Grid loop creation for gemms #885
Force-pushed from b259c15 to 81ae55d
Awesome! I'm seeing 38% improvement on |
BTW: I cannot build this PR: /opt/rh/gcc-toolset-13/root/usr/lib/gcc/x86_64-redhat-linux/13/../../../../bin/ld: lib/libTPPPipeline.a(DefaultPipeline.cpp.o): in function |
I could make it compile, after making it a default TPP pass |
This looks like the right fix. Adding it to the transform dialect is the wrong direction. I'm not sure why the CI can build it and you can't. Perhaps a clean build would work? Regardless, your patch is the right direction anyway. |
I did a fully clean build, following the instructions from our README (I have them in a script, where I just set the branch name and origin). It seems something went sideways and that pass was in the LinAlgXTransform library?! |
It was, wrong place. Not sure how the CI built it either. I get this error with the PR:
But that's unrelated to your error / fix, I think. |
Force-pushed from 84d5ee7 to c69ce46
Alex has fixed the return value of the pass to fix this error. |
The code itself is very confusing, with manual copying of the basic block, #if 0 blocks, and dead variables, but as discussed, it should be fine until we gather more examples and understand all the behaviour we want from it, including the dynamic case.
See if you can add some comments to the confusing parts, explaining what you were trying to do or didn't do because of another problem (for example, the basic block copy), so that we can more easily read it later and fix it.
I have added a couple of comments for now and gotten rid of the #if 0 block and the dead code. |
…and making it a simple TPP pass
Force-pushed from bcc320d to 1beaf3d