Implemented allreduce & Allreduce for Communicator #92
Conversation
For release 2024.09.00
ndsl/comm/communicator.py (Outdated)
        )
        return all_reduce_quantity

    def all_reduce_sum(self, quantity: Quantity):
Do we want a new quantity here? The MPI spec is to have send & recv buffers, so mpi4py is already stepping outside of that. In-place would be weird, but temporary memory floating around like this means we can't control it either (buffer it or otherwise).
I see a few alternatives:
1/ Pass an optional recv buffer that can be controlled from outside
2/ Have an option to be in-place
3/ Mark this for a future NDSL-wide memory pool system (which we are going to get at some point)
@twicki thoughts?
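A minimal sketch of alternative 1/, written against plain mpi4py and numpy rather than the NDSL API; the function and parameter names here are illustrative assumptions, not the final interface:

    from typing import Optional

    import numpy as np
    from mpi4py import MPI


    def all_reduce_sum(
        comm: MPI.Comm,
        send_data: np.ndarray,
        recv_buffer: Optional[np.ndarray] = None,
    ) -> np.ndarray:
        # The caller may hand in a pre-allocated receive buffer so the
        # temporary memory is owned and reused outside this function.
        if recv_buffer is None:
            recv_buffer = np.empty_like(send_data)
        comm.Allreduce(send_data, recv_buffer, op=MPI.SUM)
        return recv_buffer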
For context, since I saw that gather and scatter created their own Quantities in their operation, I figured that was the way to go about it. My sense is that an in-place approach, where the allreduce_sum data replaces the data of the input Quantity, is more elegant in that it doesn't require a memory allocation. It looks like a setter for Quantity is needed for this to happen, since Python complains about needing a setter if quantity.data = reduced_quantity_data is run, and I'm not sure whether that will cause other issues.
Yes, I think the gather and scatter are missing this too. I'll let Tobias weigh in, but we should probably at least offer the opportunity to pass a temporary buffer by hand so it can be managed by the user.
As for the in-place option, when accessing .data you should be in numpy/cupy land and it should be good to go, so that is odd.
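A minimal sketch of the in-place route, assuming quantity.data hands back the underlying numpy/cupy array (a plain numpy array stands in for it here):

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    data = np.ones((3, 3))  # stand-in for quantity.data

    # Rebinding the attribute (quantity.data = result) requires a property
    # setter, but writing into the existing array does not:
    comm.Allreduce(MPI.IN_PLACE, data, op=MPI.SUM)
    # equivalently, copy a separately reduced array in: data[:] = reduced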
I modified the Quantity object to allow setters for the various Quantity metadata and data properties. all_reduce_sum now allows an output quantity to be passed that will contain the result of the all_reduce being performed.
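A rough sketch of what a data setter on Quantity can look like; this stand-in class is only for illustration and is not the actual NDSL Quantity:

    class Quantity:
        # Minimal stand-in: the real Quantity also carries dims, units,
        # origin, extent, and more.
        def __init__(self, data):
            self._data = data

        @property
        def data(self):
            return self._data

        @data.setter
        def data(self, value):
            # Rebinds the backing array; shape/dtype validation could go here.
            self._data = value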
Instead of having setters in Quantity modifying properties of QuantityMetadata, I created a routine duplicate_metadata that performs the copying of QuantityMetadata properties from one to the other.
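A rough sketch of such a routine; the dataclass and its fields below are placeholders, not the actual QuantityMetadata definition:

    import dataclasses


    @dataclasses.dataclass
    class Metadata:
        origin: tuple
        extent: tuple
        units: str


    def duplicate_metadata(source: Metadata, target: Metadata) -> None:
        # Copy every metadata field from source onto target, so the owning
        # Quantity does not need to expose a setter for each property.
        for field in dataclasses.fields(source):
            setattr(target, field.name, getattr(source, field.name))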
Looking good overall. We need to make sure we are consistent about whether we return a new quantity, and we need to debate memory management.
…antity metadata and data properties.
…ata properties from one class to another. Subsequent Quantity setters that performed the copying of QuantityMetadata properties were removed
allreduce & Allreduce for Communicator
Description
- Implemented the allreduce and Allreduce functions under the NDSL Communicator class. They enable the summation of one-, two-, and three-dimensional Quantities across ranks into a single Quantity that is also synced across all ranks, with an operation applied during the sync (SUM, MAX, etc.).
- Added an ndsl.communicator.comm_abc.Comm object to be given to a Communicator to enforce a layer between the actual MPI implementation and the run code.
- Numerous fixes to undo passing mpi4py directly to Communicator throughout the code base.
Fixes # (issue)
Adds MPI Allreduce sum functionality to Communicator class.
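A sketch of the kind of layer the Comm abstraction mentioned above provides, with an assumed interface; the actual ndsl.communicator.comm_abc.Comm methods may differ:

    import abc

    from mpi4py import MPI


    class Comm(abc.ABC):
        # MPI-agnostic interface handed to Communicator, so run code never
        # touches mpi4py directly.
        @abc.abstractmethod
        def Get_rank(self) -> int:
            ...

        @abc.abstractmethod
        def Allreduce(self, sendbuf, recvbuf, op):
            ...


    class MPIComm(Comm):
        # Thin adapter forwarding to mpi4py.
        def __init__(self, comm=None):
            self._comm = comm if comm is not None else MPI.COMM_WORLD

        def Get_rank(self) -> int:
            return self._comm.Get_rank()

        def Allreduce(self, sendbuf, recvbuf, op):
            return self._comm.Allreduce(sendbuf, recvbuf, op)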
How Has This Been Tested?
A unit test, test_mpi_allreduce_sum.py, has been included under tests/mpi that creates one-, two-, and three-dimensional quantities of integers on all ranks, passes each quantity to the allreduce sum function, and checks whether the resulting quantity equals the original quantity multiplied by the number of ranks.
Checklist: