[STF] jacobi example based on parallel_for #3187

Merged: 4 commits, Dec 19, 2024

@caugonnet caugonnet commented Dec 18, 2024

Description

Add Jacobi example using parallel_for and a reduce access mode
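
For readers unfamiliar with the pattern, here is a minimal host-side sketch of what a Jacobi sweep combined with a max-reduction computes. It is plain sequential C++, not the CUDASTF API and not the code added by this PR; the function names `jacobi_sweep` and `jacobi_solve`, the row-major layout, and the convergence test are assumptions made for illustration. In the STF version, the doubly nested loop would presumably become the body of a `parallel_for`, and `residual` would be the value accumulated through the reduce access mode.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One Jacobi sweep over the interior points of an n x m row-major grid.
// Each interior point of a_new becomes the average of its four neighbours in a.
// Returns the largest absolute change, i.e. what a max-reduction would yield.
double jacobi_sweep(const std::vector<double>& a, std::vector<double>& a_new,
                    std::size_t n, std::size_t m)
{
  double residual = 0.0;
  for (std::size_t i = 1; i + 1 < n; ++i)
  {
    for (std::size_t j = 1; j + 1 < m; ++j)
    {
      const std::size_t idx = i * m + j;
      a_new[idx] = 0.25 * (a[idx - 1] + a[idx + 1] + a[idx - m] + a[idx + m]);
      residual   = std::max(residual, std::fabs(a_new[idx] - a[idx]));
    }
  }
  return residual;
}

// Repeats sweeps until the residual drops to tol or max_iter is reached.
// The result is left in a; the return value is the number of sweeps performed.
std::size_t jacobi_solve(std::vector<double>& a, std::size_t n, std::size_t m,
                         double tol, std::size_t max_iter)
{
  std::vector<double> a_new = a; // copy so that boundary values carry over
  std::size_t iter          = 0;
  double residual           = tol + 1.0;
  while (residual > tol && iter < max_iter)
  {
    residual = jacobi_sweep(a, a_new, n, m);
    std::swap(a, a_new); // the freshly written grid becomes the next input
    ++iter;
  }
  return iter;
}
```

Keeping two buffers and swapping them after each sweep means every update reads only values from the previous iteration, which is what makes the sweep safe to run in parallel over all interior points.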

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@caugonnet caugonnet requested review from a team as code owners December 18, 2024 08:21

copy-pr-bot bot commented Dec 18, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@caugonnet

/ok to test

@caugonnet caugonnet changed the title from "Stf jacobi example" to "[STF] jacobi example based on parallel_for" Dec 18, 2024
@caugonnet caugonnet added the "stf" (Sequential Task Flow programming model) label Dec 18, 2024
@caugonnet

/ok to test

🟩 CI finished in 27m 34s: Pass: 100%/26 | Total: 2h 14m | Avg: 5m 10s | Max: 22m 48s | Hits: 92%/312
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 14m | Avg: 5m 10s | Max: 22m 48s | Hits: 92%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 02m | Avg:  5m 32s | Max: 22m 48s | Hits:  92%/312   
      🟩 arm64              Pass: 100%/4   | Total: 12m 27s | Avg:  3m 06s | Max:  3m 16s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 15m 37s | Avg:  5m 12s | Max:  9m 11s | Hits:  92%/156   
      🟩 12.5               Pass: 100%/2   | Total: 10m 07s | Avg:  5m 03s | Max:  5m 14s
      🟩 12.6               Pass: 100%/21  | Total:  1h 48m | Avg:  5m 10s | Max: 22m 48s | Hits:  92%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 15m 37s | Avg:  5m 12s | Max:  9m 11s | Hits:  92%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 07s | Avg:  5m 03s | Max:  5m 14s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 48m | Avg:  5m 10s | Max: 22m 48s | Hits:  92%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 14m | Avg:  5m 10s | Max: 22m 48s | Hits:  92%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s
      🟩 Clang10            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 39s | Avg:  3m 39s | Max:  3m 39s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 22s | Avg:  3m 22s | Max:  3m 22s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 41s | Avg:  3m 41s | Max:  3m 41s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 34s | Avg:  3m 34s | Max:  3m 34s
      🟩 Clang18            Pass: 100%/4   | Total: 32m 49s | Avg:  8m 12s | Max: 22m 48s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 25s | Avg:  3m 25s | Max:  3m 25s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 20s | Avg:  9m 40s | Max: 15m 32s
      🟩 GCC13              Pass: 100%/4   | Total: 12m 10s | Avg:  3m 02s | Max:  3m 07s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 11s | Avg:  9m 11s | Max:  9m 11s | Hits:  92%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 03s | Avg:  9m 03s | Max:  9m 03s | Hits:  92%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 07s | Avg:  5m 03s | Max:  5m 14s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 04m | Avg:  4m 58s | Max: 22m 48s
      🟩 GCC                Pass: 100%/9   | Total: 41m 20s | Avg:  4m 35s | Max: 15m 32s
      🟩 MSVC               Pass: 100%/2   | Total: 18m 14s | Avg:  9m 07s | Max:  9m 11s | Hits:  92%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 07s | Avg:  5m 03s | Max:  5m 14s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 14m | Avg:  5m 10s | Max: 22m 48s | Hits:  92%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 36m | Avg:  4m 00s | Max:  9m 11s | Hits:  92%/312   
      🟩 Test               Pass: 100%/2   | Total: 38m 20s | Avg: 19m 10s | Max: 22m 48s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 00s | Avg:  3m 00s | Max:  3m 00s
      🟩 90a                Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 20m 33s | Avg:  3m 25s | Max:  4m 53s
      🟩 20                 Pass: 100%/20  | Total:  1h 53m | Avg:  5m 41s | Max: 22m 48s | Hits:  92%/312   
    

👃 Inspect Changes

Modifications in project?

        Project
        CCCL Infrastructure
        libcu++
        CUB
        Thrust
    +/- CUDA Experimental
        python
        CCCL C Parallel Library
        Catch2Helper

Modifications in project or dependencies?

        Project
        CCCL Infrastructure
        libcu++
        CUB
        Thrust
    +/- CUDA Experimental
        python
        CCCL C Parallel Library
        Catch2Helper

🏃‍ Runner counts (total jobs: 26)

# Runner
18 linux-amd64-cpu16
4 linux-arm64-cpu16
2 windows-amd64-cpu16
2 linux-amd64-gpu-v100-latest-1

@caugonnet

/ok to test

🟩 CI finished in 29m 47s: Pass: 100%/26 | Total: 2h 00m | Avg: 4m 38s | Max: 16m 04s | Hits: 92%/312
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 00m | Avg: 4m 38s | Max: 16m 04s | Hits: 92%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 49m | Avg:  4m 59s | Max: 16m 04s | Hits:  92%/312   
      🟩 arm64              Pass: 100%/4   | Total: 11m 04s | Avg:  2m 46s | Max:  3m 06s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 15m 15s | Avg:  5m 05s | Max:  9m 13s | Hits:  92%/156   
      🟩 12.5               Pass: 100%/2   | Total: 10m 03s | Avg:  5m 01s | Max:  5m 16s
      🟩 12.6               Pass: 100%/21  | Total:  1h 35m | Avg:  4m 33s | Max: 16m 04s | Hits:  92%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 15m 15s | Avg:  5m 05s | Max:  9m 13s | Hits:  92%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 03s | Avg:  5m 01s | Max:  5m 16s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 35m | Avg:  4m 33s | Max: 16m 04s | Hits:  92%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 00m | Avg:  4m 38s | Max: 16m 04s | Hits:  92%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  2m 59s | Avg:  2m 59s | Max:  2m 59s
      🟩 Clang10            Pass: 100%/1   | Total:  3m 28s | Avg:  3m 28s | Max:  3m 28s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 09s | Avg:  3m 09s | Max:  3m 09s
      🟩 Clang12            Pass: 100%/1   | Total:  2m 59s | Avg:  2m 59s | Max:  2m 59s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 10s | Avg:  3m 10s | Max:  3m 10s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 25s | Avg:  3m 25s | Max:  3m 25s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 22s | Avg:  3m 22s | Max:  3m 22s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 14s | Avg:  3m 14s | Max:  3m 14s
      🟩 Clang18            Pass: 100%/4   | Total: 23m 30s | Avg:  5m 52s | Max: 14m 31s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 03s | Avg:  3m 03s | Max:  3m 03s
      🟩 GCC10              Pass: 100%/1   | Total:  2m 59s | Avg:  2m 59s | Max:  2m 59s
      🟩 GCC11              Pass: 100%/1   | Total:  2m 57s | Avg:  2m 57s | Max:  2m 57s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 21s | Avg:  9m 40s | Max: 16m 04s
      🟩 GCC13              Pass: 100%/4   | Total: 11m 03s | Avg:  2m 45s | Max:  2m 55s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 13s | Avg:  9m 13s | Max:  9m 13s | Hits:  92%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 51s | Avg:  9m 51s | Max:  9m 51s | Hits:  92%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 03s | Avg:  5m 01s | Max:  5m 16s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total: 52m 21s | Avg:  4m 01s | Max: 14m 31s
      🟩 GCC                Pass: 100%/9   | Total: 39m 23s | Avg:  4m 22s | Max: 16m 04s
      🟩 MSVC               Pass: 100%/2   | Total: 19m 04s | Avg:  9m 32s | Max:  9m 51s | Hits:  92%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 03s | Avg:  5m 01s | Max:  5m 16s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 00m | Avg:  4m 38s | Max: 16m 04s | Hits:  92%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 30m | Avg:  3m 45s | Max:  9m 51s | Hits:  92%/312   
      🟩 Test               Pass: 100%/2   | Total: 30m 35s | Avg: 15m 17s | Max: 16m 04s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 48s | Avg:  2m 48s | Max:  2m 48s
      🟩 90a                Pass: 100%/1   | Total:  2m 55s | Avg:  2m 55s | Max:  2m 55s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 19m 22s | Avg:  3m 13s | Max:  5m 16s
      🟩 20                 Pass: 100%/20  | Total:  1h 41m | Avg:  5m 04s | Max: 16m 04s | Hits:  92%/312   
    

@caugonnet caugonnet merged commit c0c6ce9 into NVIDIA:main Dec 19, 2024
42 checks passed
shwina pushed a commit to shwina/cccl that referenced this pull request Dec 19, 2024
* Simple jacobi example with parallel for and reductions

* clang-format

* remove useless capture list