-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move lapack_info_check
inside of onemkl_cusolver_host_task
#238
base: develop
Are you sure you want to change the base?
Conversation
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@@ -571,7 +569,6 @@ inline sycl::event getrf_batch(const char *func_name, Func func, sycl::queue &qu | |||
// Allocate memory with 32-bit ints then copy over results | |||
std::uint64_t ipiv_size = stride_ipiv * batch_size; | |||
int *ipiv32 = (int *)malloc_device(sizeof(int) * ipiv_size, queue); | |||
int *devInfo = (int *)malloc_device(sizeof(int) * batch_size, queue); | |||
|
|||
auto done = queue.submit([&](sycl::handler &cgh) { | |||
int64_t num_events = dependencies.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dependencies loop can be simplified in the batch with depends_on_events(cgh, dependencies);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one. I've made these changes.
@@ -581,16 +578,17 @@ inline sycl::event getrf_batch(const char *func_name, Func func, sycl::queue &qu | |||
onemkl_cusolver_host_task(cgh, queue, [=](CusolverScopedContextHandler &sc) { | |||
auto handle = sc.get_handle(queue); | |||
auto a_ = reinterpret_cast<cuDataType *>(a); | |||
auto devInfo_ = reinterpret_cast<int *>(devInfo); | |||
auto scratchpad_ = reinterpret_cast<cuDataType *>(scratchpad); | |||
auto ipiv_ = reinterpret_cast<int *>(ipiv32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also remove this unnecessary reinterpret cast as ipiv32
is already an int *
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
LGTM! |
@ericlars Could we get a review of this from someone? Thanks |
Description
In some cases cuSolver operations can return a successful error code while failing. The previous implementation of this check is done via SYCL and requires the CPU to wait until the cuSolver function completes. This is not great for performance as it prevents the user for issuing more work to the queue until it has received a response from oneMKL.
This PR moves the check inside the host task.
@AidanBeltonS I've updated the cusolver_batch.cpp cases too, can you check it is OK?