Release v1.13.2-aws
Signed-off-by: Arun Karthik <akkart@amazon.com>
arunkarthik-akkart committed Dec 11, 2024
1 parent 80d1cb0 commit 5d0c075
Showing 2 changed files with 30 additions and 1 deletion.
29 changes: 29 additions & 0 deletions RELEASENOTES.md
@@ -12,6 +12,35 @@ have unified the code into a single branch, and made the AWS-specific parts a
compile-time option. When a feature (or entire release) only supports one of
the two variants, we note that in the release notes.

# v1.13.2-aws (2024-12-06)

This release is intended only for use on AWS P* instances. A general release
that supports other libfabric networks may be made in the near future.

With this release, building with platform-aws requires
[1.22.0amzn4.0](https://github.com/aws/libfabric/commits/1.22.0amzn4.0/)
or greater. AWS customers are generally recommended to track
[the latest-available EFA Installer](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-verify.html)
for performance improvements and bug fixes.

The 1.13.x release series supports
[NCCL 2.23.4-1](https://github.com/NVIDIA/nccl/releases/tag/v2.23.4-1)
while maintaining backward compatibility with older NCCL versions
([NCCL v2.17.1](https://github.com/NVIDIA/nccl/releases/tag/v2.17.1-1) and later).

Bug Fixes:

- Tuner Improvements:
    - Fixed algorithm selection for larger ranks and message sizes.
    - Re-calibrated the tuner for AllGather and ReduceScatter regions for 0x7 bitmask on P5en,
      optimizing performance for larger messages.
    - Added tuner support for AllGather and ReduceScatter regions for 0x0 bitmask on P5en.

- Resolved a performance issue by preventing the eager protocol when RDMA writes are in flight,
  improving small AllReduce collective performance.

Note: dmabuf support is now turned off by default. Users can enable it explicitly using OFI_NCCL_DISABLE_DMABUF=0 if needed.

# v1.13.1-aws (2024-11-25)

This release is intended only for use on AWS P\* instances. A general release
2 changes: 1 addition & 1 deletion configure.ac
@@ -6,7 +6,7 @@
#

# Initialization
-AC_INIT([aws-ofi-nccl], [1.13.2a1-aws], [al-ofi-nccl-team@amazon.com], , [http://github.com/aws/aws-ofi-nccl])
+AC_INIT([aws-ofi-nccl], [1.13.2-aws], [al-ofi-nccl-team@amazon.com], , [http://github.com/aws/aws-ofi-nccl])
AC_PREREQ([2.69])
AC_CONFIG_SRCDIR([src/nccl_ofi_net.c])
AC_CONFIG_AUX_DIR([build-aux])
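
The release notes in this commit state that dmabuf support is now off by default and can be re-enabled with OFI_NCCL_DISABLE_DMABUF=0. The following is a minimal sketch of opting back in from a shell. It assumes the plugin reads the variable from the environment of each process that loads it. How the variable reaches remote ranks depends on the job launcher and is not described in this commit.

```shell
# Re-enable dmabuf support, which v1.13.2-aws turns off by default.
# Assumption: the variable must be present in the environment of every
# process that loads the plugin, so export it before launching the job.
export OFI_NCCL_DISABLE_DMABUF=0

# Confirm the value that child processes will inherit.
echo "OFI_NCCL_DISABLE_DMABUF=${OFI_NCCL_DISABLE_DMABUF}"
```

Leaving the variable unset keeps the new default behavior from this release, with dmabuf disabled.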