chore(deps): update dependency cutlass_archive to v3.6.0 (#948) · secretflow/spu@01504cc

chore(deps): update dependency cutlass_archive to v3.6.0 (#948)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [cutlass_archive](https://redirect.github.com/NVIDIA/cutlass) | http_archive | minor | `v3.5.1` -> `v3.6.0` |

---

### Release Notes

<details>
<summary>NVIDIA/cutlass (cutlass_archive)</summary>

### [`v3.6.0`](https://redirect.github.com/NVIDIA/cutlass/releases/tag/v3.6.0): CUTLASS 3.6.0

[Compare Source](https://redirect.github.com/NVIDIA/cutlass/compare/v3.5.1...v3.6.0)

- [Hopper structured sparse GEMM](./examples/62\_hopper_sparse_gemm/62\_hopper_sparse_gemm.cu).
  - [FP16](./test/unit/gemm/device/sm90\_sparse_gemm_f16\_f16\_f32\_tensor_op_f32.cu)
  - [FP8](./test/unit/gemm/device/sm90\_sparse_gemm_f8\_f8\_f32\_tensor_op_f32.cu)
  - [INT8](./test/unit/gemm/device/sm90\_sparse_gemm_s8\_s8\_s32\_tensor_op_s32.cu)
  - [TF32](./test/unit/gemm/device/sm90\_sparse_gemm_tf32\_tf32\_f32\_tensor_op_f32.cu)
- A refactor to the CUTLASS 3.x convolution `kernel::ConvUniversal`
[API](./include/cutlass/conv/kernel/sm90\_implicit_gemm_tma_warpspecialized.hpp)
to bring it in line with `gemm::GemmUniversal`. The 3.x convolution
API is no longer considered a beta API.
- [An improved mixed input
GEMM](./examples/55\_hopper_mixed_dtype_gemm/README.md) and a [lookup
table
implementation](./examples/55\_hopper_mixed_dtype_gemm/55\_hopper_int4\_fp8\_gemm.cu)
for `INT4`x`FP8` scale-only mode.
- [EVT nodes for Top-K selection and
softmax](./include/cutlass/epilogue/fusion/sm90\_visitor_topk_softmax.hpp)
and [GEMM example using
those](./examples/61\_hopper_gemm_with_topk_and_softmax/61\_hopper_gemm_with_topk_and_softmax.cu).
- [Programmatic Dependent
Launch](./include/cutlass/arch/grid_dependency_control.h) (PDL), which
leverages a new Hopper feature to speed up two back-to-back kernels, and
its corresponding
[documentation](./media/docs/dependent_kernel_launch.md).
- [A new debugging tool, synclog](./include/cutlass/arch/synclog.hpp),
for dumping out all synchronization events from within a kernel to a
file. Please see [synclog
documentation](./media/docs/utilities.md#debugging-asynchronous-kernels-with-cutlasss-built-in-synclog-tool)
for details.
- A new TMA-enabled
[epilogue](./include/cutlass/epilogue/collective/sm90\_epilogue_array_tma_warpspecialized.hpp)
for grouped GEMM that brings significant performance improvement, as
well as its EVT support.
- A SIMT-enabled pointer-array
[epilogue](./include/cutlass/epilogue/collective/sm70\_epilogue_vectorized_array.hpp).
- A new [Ping-Pong kernel schedule for Grouped
GEMM](./include/cutlass/gemm/kernel/sm90\_gemm_array_tma_warpspecialized_pingpong.hpp)
and some other optimizations.
- [A new instantiation strategy for CUTLASS profiler
kernels](./python/cutlass_library/sm90\_shapes.py) along with [improved
documentation for instantiation level in CUTLASS
profiler](./media/docs/profiler.md#instantiating-more-kernels-with-hopper).
- New hardware support for comparisons and computations of
[`cutlass::bfloat16_t`](./include/cutlass/bfloat16.h).
- Fixed use of isnan on Windows for
[`half_t`](./test/unit/core/functional.cu).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/secretflow/spu).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOS44MC4wIiwidXBkYXRlZEluVmVyIjoiMzkuODAuMCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiZGVwZW5kZW5jaWVzIl19-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

renovate[bot] authored Jan 2, 2025

1 parent d3d0f85 commit 01504cc

bazel/repositories.bzl

            
                      Original file line number
                      Diff line number
                      Diff line change
                  
    @@ -242,10 +242,10 @@ def _com_github_nvidia_cutlass():
  
        maybe(

            http_archive,

            name = "cutlass_archive",

            strip_prefix = "cutlass-3.5.1",

            strip_prefix = "cutlass-3.6.0",

            urls = [

                "https://github.com/NVIDIA/cutlass/archive/refs/tags/v3.5.1.tar.gz",

                "https://github.com/NVIDIA/cutlass/archive/refs/tags/v3.6.0.tar.gz",

            ],

            sha256 = "20b7247cda2d257cbf8ba59ba3ca40a9211c4da61a9c9913e32b33a2c5883a36",

            sha256 = "7576f3437b90d0de5923560ccecebaa1357e5d72f36c0a59ad77c959c9790010",

            build_file = "@spulib//bazel:nvidia_cutlass.BUILD",

        )
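The `sha256` attribute pins the exact archive contents, so a reviewer may want to confirm the new value independently before merging. The following is a minimal Python sketch of such a check, not part of this commit: it assumes the v3.6.0 tarball has already been downloaded to a local path (the path in the usage note is hypothetical), and the helper names `sha256_of_file` and `matches_pin` are illustrative only.

```python
import hashlib

# Pinned digest copied from the updated bazel/repositories.bzl entry.
CUTLASS_V360_SHA256 = (
    "7576f3437b90d0de5923560ccecebaa1357e5d72f36c0a59ad77c959c9790010"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, expected_hex):
    """Check a local file against a pinned digest (case-insensitive hex)."""
    return sha256_of_file(path) == expected_hex.lower()
```

Usage, assuming the archive was saved as `v3.6.0.tar.gz` in the current directory: `matches_pin("v3.6.0.tar.gz", CUTLASS_V360_SHA256)` should return `True` if the download matches the pinned value. Bazel performs the same comparison itself when it fetches the archive, so this is only a manual cross-check.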
