Merge pull request #2512 from hvdijk/handle-local-args
[NativeCPU] Handle local args.
kbenzie authored Jan 8, 2025
2 parents b2ac58f + 6972fbb commit 9d2a711
Showing 2 changed files with 7 additions and 5 deletions.
8 changes: 4 additions & 4 deletions source/adapters/native_cpu/enqueue.cpp
@@ -138,12 +138,12 @@ UR_APIEXPORT ur_result_t UR_APICALL urEnqueueKernelLaunch(
 #else
   bool isLocalSizeOne =
       ndr.LocalSize[0] == 1 && ndr.LocalSize[1] == 1 && ndr.LocalSize[2] == 1;
-  if (isLocalSizeOne && ndr.GlobalSize[0] > numParallelThreads) {
+  if (isLocalSizeOne && ndr.GlobalSize[0] > numParallelThreads &&
+      !hKernel->hasLocalArgs()) {
     // If the local size is one, we make the assumption that we are running a
     // parallel_for over a sycl::range.
-    // Todo: we could add compiler checks and
-    // kernel properties for this (e.g. check that no barriers are called, no
-    // local memory args).
+    // Todo: we could add more compiler checks and
+    // kernel properties for this (e.g. check that no barriers are called).

     // Todo: this assumes that dim 0 is the best dimension over which we want to
     // parallelize
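
The guard changed in this hunk can be read in isolation. Below is a minimal sketch, not the adapter's actual code: `NDRange`, `Kernel`, `numLocalArgs`, and `useRangeFastPath` are hypothetical stand-ins introduced for illustration, and only the condition itself mirrors the diff.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the adapter's ND-range description.
struct NDRange {
  std::array<std::size_t, 3> GlobalSize;
  std::array<std::size_t, 3> LocalSize;
};

// Hypothetical stand-in for the kernel handle. The real handle reports
// local args via hasLocalArgs(); the counter here is an assumption.
struct Kernel {
  std::size_t numLocalArgs;
  bool hasLocalArgs() const { return numLocalArgs != 0; }
};

// Returns true when the launch may be treated as a parallel_for over a
// sycl::range: local size is 1 in every dimension, there is more work in
// dimension 0 than there are threads, and no local-memory args are present.
bool useRangeFastPath(const NDRange &ndr, const Kernel &kernel,
                      std::size_t numParallelThreads) {
  const bool isLocalSizeOne =
      ndr.LocalSize[0] == 1 && ndr.LocalSize[1] == 1 && ndr.LocalSize[2] == 1;
  return isLocalSizeOne && ndr.GlobalSize[0] > numParallelThreads &&
         !kernel.hasLocalArgs();
}
```

With this condition, a kernel that takes local arguments falls through to the regular per-work-group path even when its local size is one, which is the behaviour the commit title describes.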
4 changes: 3 additions & 1 deletion source/adapters/native_cpu/kernel.hpp
@@ -142,7 +142,9 @@ struct ur_kernel_handle_t_ : RefCounted {
     _localMemPoolSize = reqSize;
   }

-  // To be called before executing a work group
+  bool hasLocalArgs() const { return !_localArgInfo.empty(); }
+
+  // To be called before executing a work group if local args are present
   void handleLocalArgs(size_t numParallelThread, size_t threadId) {
     // For each local argument we have size*numthreads
     size_t offset = 0;
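
The comment "For each local argument we have size*numthreads" suggests a pool layout in which every local argument reserves one slice per thread. The rest of `handleLocalArgs` is not shown in this diff, so the following is only a sketch of that layout under this assumption. `localArgOffsets` and its parameters are hypothetical names, not part of the adapter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for one thread, compute the byte offset of its slice
// of each local argument inside a shared pool. Assumes each local argument
// reserves size * numParallelThreads bytes, laid out back to back.
std::vector<std::size_t>
localArgOffsets(const std::vector<std::size_t> &argSizes,
                std::size_t numParallelThreads, std::size_t threadId) {
  std::vector<std::size_t> offsets;
  offsets.reserve(argSizes.size());
  std::size_t offset = 0; // start of the current argument's chunk
  for (std::size_t size : argSizes) {
    offsets.push_back(offset + size * threadId); // this thread's slice
    offset += size * numParallelThreads;         // skip the whole chunk
  }
  return offsets;
}
```

For example, with two local arguments of 16 and 32 bytes and 4 threads, thread 2 would get offsets 32 and 128 under this layout.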