Loops can be readily condensed via fypp #657

sbryngelson · 2024-10-19T20:59:11Z

do concurrent is usually used to invoke standard, language-level parallelism, including GPU offloading. But if the compiler flag that enables it is not set, it does little beyond, perhaps, some multithreading.

It does not seem to clash with OpenACC in my experimentation so far (https://fortran-lang.discourse.group/t/how-does-openacc-collapse-interact-with-do-concurrent/6887).

With it, we can do this:

!$acc parallel loop collapse(4) gang vector default(present)
do concurrent (j = 1:sys_size, q = 0:p, l = 0:n, k = 0:m)
    rhs_vf(j)%sf(k, l, q) = 1d0/dx(k)* &
        (flux_n(1)%vf(j)%sf(k - 1, l, q) &
         - flux_n(1)%vf(j)%sf(k, l, q))
end do

instead of this:

!$acc parallel loop collapse(4) gang vector default(present)
do j = 1, sys_size
    do q = 0, p
        do l = 0, n
            do k = 0, m
                rhs_vf(j)%sf(k, l, q) = 1d0/dx(k)* &
                                        (flux_n(1)%vf(j)%sf(k - 1, l, q) &
                                         - flux_n(1)%vf(j)%sf(k, l, q))
            end do
        end do
    end do
end do

I think we can still pull out a sequential loop as needed, like this:

!$acc parallel loop collapse(3) gang vector default(present)
do concurrent (j = 1:sys_size, q = 0:p, l = 0:n)
    !$acc loop seq
    do k = 0, m
        rhs_vf(j)%sf(k, l, q) = 1d0/dx(k)* &
            (flux_n(1)%vf(j)%sf(k - 1, l, q) &
             - flux_n(1)%vf(j)%sf(k, l, q))
    end do
end do

While not an actual code improvement per se, it does seem quite helpful for readability. The loop scaffolding goes from 8 lines (four do and four end do statements) to 2.


sbryngelson · 2024-10-20T02:59:40Z

The do concurrent approach works with NVHPC but not with the CCE compiler in the GPU case (the error is something like "collapse requires perfectly nested do loops") [FYI @abbotts].

I reproduced it on a minimal example.
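For anyone who wants to check a given compiler, a reproducer along the following lines could be a starting point. This is an illustrative sketch, not the exact test case mentioned above: the program name, array names, and sizes are made up, and the data clauses are an assumption chosen so the example needs no surrounding data region. Built without OpenACC enabled, the directive is an ordinary comment and the program only compares the two loop forms on the host.

```fortran
! Hypothetical reproducer sketch (not the test case used in this thread).
program do_concurrent_collapse
    implicit none
    integer, parameter :: n = 4, m = 3
    integer :: k, l
    real(kind(0d0)) :: a(0:m, 0:n), b(0:m, 0:n)

    b = 1d0

    ! OpenACC collapse applied to a multi-index do concurrent.
    ! This is the construct under test; some compilers may reject it.
    !$acc parallel loop collapse(2) gang vector copyin(b) copyout(a)
    do concurrent (l = 0:n, k = 0:m)
        a(k, l) = 2d0*b(k, l)
    end do

    ! Reference result from explicitly nested loops on the host.
    do l = 0, n
        do k = 0, m
            b(k, l) = 2d0*b(k, l)
        end do
    end do

    if (all(a == b)) then
        print *, 'match'
    else
        print *, 'mismatch'
    end if
end program do_concurrent_collapse
```

Swapping the do concurrent header for two nested do loops gives the perfectly nested form that the quoted error message asks for.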

sbryngelson · 2024-10-22T20:33:55Z

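@henryleberre created this fypp macro, which does the trick. It takes the loop headers as arguments, treats the body of the call as the last argument, and emits the nested do and end do statements: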

#:def forall(*args)
#:for loop in args[:-1]
do ${loop}$
#:endfor
$:args[-1]
#:for _ in range(len(args)-1)
end do
#:endfor
#:enddef

program forall_example
  implicit none
  integer :: n = 2
  integer :: m = 3
  integer :: i, j
  integer , dimension(1:2,1:2) :: x

  x(1,1) = 0
  x(1,2) = n

  x(2,1) = 1
  x(2,2) = m

  #:call forall('i=x(1,1),x(1,2)', 'j=x(2,1),x(2,2)')
    print*, i, j 
  #:endcall

end program forall_example

The generated code is:

program forall_example
  implicit none
  integer :: n = 2
  integer :: m = 3
  integer :: i, j
  integer , dimension(1:2,1:2) :: x

  x(1,1) = 0
  x(1,2) = n

  x(2,1) = 1
  x(2,2) = m

do i=x(1,1),x(1,2)
do j=x(2,1),x(2,2)
    print*, i, j 
end do
end do

end program forall_example
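Applied to the four-level loop nest from the first comment, the macro might be used as in the sketch below. The arrays rhs and flux, their sizes, and the data clauses are placeholders chosen to keep the example self-contained; they are not names from the code base, which uses rhs_vf and flux_n. The sketch also assumes that an OpenACC directive placed directly before the #:call line ends up immediately above the outermost generated do, which is worth confirming in the fypp output before relying on it.

```fortran
#:def forall(*args)
#:for loop in args[:-1]
do ${loop}$
#:endfor
$:args[-1]
#:for _ in range(len(args)-1)
end do
#:endfor
#:enddef

program forall_rhs_sketch
    implicit none
    ! Placeholder sizes and arrays (hypothetical, for illustration only)
    integer, parameter :: sys_size = 2, p = 1, n = 2, m = 3
    integer :: j, q, l, k
    real(kind(0d0)) :: dx(0:m)
    real(kind(0d0)) :: flux(-1:m, 0:n, 0:p, sys_size)
    real(kind(0d0)) :: rhs(0:m, 0:n, 0:p, sys_size)

    dx = 1d0
    flux = 1d0

    ! Arguments are listed outermost first: j is the outer loop, k the inner
    !$acc parallel loop collapse(4) gang vector copyin(dx, flux) copyout(rhs)
    #:call forall('j = 1, sys_size', 'q = 0, p', 'l = 0, n', 'k = 0, m')
        rhs(k, l, q, j) = 1d0/dx(k)* &
            (flux(k - 1, l, q, j) - flux(k, l, q, j))
    #:endcall

    print *, maxval(abs(rhs))
end program forall_rhs_sketch
```

Because the macro expands to ordinary nested do loops before the compiler sees the file, the collapse clause is applied to a perfectly nested loop, which is the form the CCE error message asks for.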

sbryngelson added the enhancement (New feature or request) and good first issue (Good for newcomers) labels Oct 19, 2024

sbryngelson closed this as completed Oct 20, 2024

sbryngelson changed the title ~~Loops can be readily condensed via do concurrent~~ Loops can be readily condensed via fypp Oct 22, 2024

sbryngelson reopened this Oct 22, 2024
