[INSTALL]: install ESMF 8.6.1 and MAPL 2.46.2 in spack-stack 1.6.0 #1168

Open
1 of 14 tasks
junwang-noaa opened this issue Jun 28, 2024 · 51 comments
Assignees
Labels
INFRA JEDI Infrastructure NAVY United States Naval Research Lab NOAA-EMC OAR-EPIC NOAA Oceanic and Atmospheric Research and Earth Prediction Innovation Center

Comments

@junwang-noaa

Package name

ESMF and MAPL

Package version/tag

ESMF/8.6.1 and MAPL/2.46.2

Build options

Current

Installation timeframe

The two libraries will be installed under current spack-stack 1.6.0.

Other information

No response

WCOSS2

  • Check this box if and only if your package should be installed on WCOSS2 Cactus and Dogwood (all spack-stack packages will be installed on Acorn). If not, you may disregard the rest of the items below and submit this request.

WCOSS2: General questions

No response

WCOSS2: Installation and testing

No response

WCOSS2: Technical & security review list

  • The code is mature, stable, and production ready
  • The code does not and cannot use the internet, and does not contain URLs (http, https, ftp, etc.) except in comments
  • The package does not contain prebuilt binary files that have not been approved by NCO security review
  • The code has no publicly disclosed cybersecurity vulnerabilities and exposures (search https://cve.mitre.org/cve/)
  • The code is not prohibited by DHS, DOC, NOAA, or NWS
  • The code comes from a trusted source. Trusted sources include other NWS, NOAA, or DOC, agencies, or other Federal agencies that operate at a FISMA high or equivalent level. Additionally, trusted sources could be third-party agencies through which there is an existing SLA on file (such as RedHat).
  • The code is actively maintained and supported (it continues to get updates, patches, etc.)
  • The code is not maintained by a private entity operating in a foreign country (if it is, make a note below)
  • There is sufficient documentation to support maintenance
  • There are no known security vulnerabilities or weaknesses
  • Installing and running the code does not require privileged processes/users
  • There are no software dependencies that are unapproved or have security concerns (if there are, make a note below)
  • There are no concerns related to SA, SI, and SC NIST control families

WCOSS2: Additional comments

No response

@climbfuji
Collaborator

We need to do #1157 first, then this.

@climbfuji climbfuji added INFRA JEDI Infrastructure NOAA-EMC OAR-EPIC NOAA Oceanic and Atmospheric Research and Earth Prediction Innovation Center NAVY United States Naval Research Lab labels Jun 28, 2024
@climbfuji
Collaborator

#1157 was merged, so we can go ahead with this. It is unlikely to happen before the 4th of July weekend, though, since many people will be on leave.

@climbfuji
Collaborator

climbfuji commented Jul 8, 2024

Below are the instructions and a list of platforms / assigned spack-stack installers:

Instructions

  1. Go to spack-stack-1.6.0 installation and run the basic steps for building spack-stack environments on this system (see https://spack-stack.readthedocs.io/en/1.6.0/PreConfiguredSites.html)

  2. Make sure git remotes are configured correctly to point to JCSDA for both spack-stack and spack, do a git remote update, then git checkout jcsda/release/1.6.0 (replace jcsda with origin or however the remote is named); a subsequent git status should show

-bash-4.2$ git status


# On branch release/1.6.0
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#   (commit or discard the untracked or modified content in submodules)
#
#       modified:   spack (new commits, modified content)
  3. git submodule update should check out the correct hash for the spack submodule; if not, go to spack, do a git remote update && git checkout jcsda/release/1.6.0.

  4. Back to the spack-stack top-level directory: source setup.sh

  5. For each unified environment in envs, do (please use a name that works for your setup, may include compiler suffix etc):

spack stack create env --name=ue-esmf-8.6.1-mapl-2.46.2 --site=s4 --template=unified-dev \
    --upstream=/data/prod/jedi/spack-stack/spack-stack-1.6.0/envs/unified-env/install \
    2>&1 | tee log.create.ue-esmf-8.6.1-mapl-2.46.2.001
  6. Update envs/ue-esmf-8.6.1-mapl-2.46.2/spack.yaml and set the correct compiler in the compiler matrix line (match upstream!) and set correct esmf/mapl versions:
sed -i "s/'%aocc', '%apple-clang', '%gcc', '%intel'/'%intel'/g" envs/ue-esmf-8.6.1-mapl-2.46.2/spack.yaml
sed -i "s/mapl@2.40.3 ^esmf@8.5.0/mapl@=2.46.2 ^esmf@=8.6.1/g" envs/ue-esmf-8.6.1-mapl-2.46.2/spack.yaml
sed -i "s/- mapl@2.40.3 ^esmf@8.6.0//g" envs/ue-esmf-8.6.1-mapl-2.46.2/spack.yaml
spack env activate -p envs/ue-esmf-8.6.1-mapl-2.46.2
  7. Concretize: spack concretize 2>&1 | tee log.concretize.ue-esmf-8.6.1-mapl-2.46.2.001, check output:
$ cat log.concretize.ue-esmf-8.6.1-mapl-2.46.2.001 | grep -vE '\[\+\]|\[e\]|\[\^\]'
==> Concretized crtm@v2.4.1-jedi%intel

==> Concretized crtm@2.4.0.1%intel

==> Concretized ewok-env%intel+cylc+ecflow

==> Concretized fms@release-jcsda%intel

==> Concretized fms@2023.04%intel

==> Concretized global-workflow-env%intel
 -   lumhmam  global-workflow-env@1.0.0%intel@2021.5.0+python build_system=bundle arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512

==> Concretized gmao-swell-env%intel

==> Concretized gsi-env%intel

==> Concretized jedi-fv3-env%intel

==> Concretized jedi-mpas-env%intel

==> Concretized jedi-neptune-env%intel
 -   dwzunza  jedi-neptune-env@1.0.0%intel@2021.5.0 build_system=bundle arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512

==> Concretized jedi-ufs-env%intel ^esmf@=8.6.1 ^mapl@=2.46.2
 -   fikj3on  jedi-ufs-env@1.0.0%intel@2021.5.0 build_system=bundle arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512
 -   7okttfo      ^mapl@2.46.2%intel@2021.5.0~debug~extdata2g~f2py+fargparse~ipo~pflogger~pfunit~shared build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   r2lhznx          ^fargparse@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   gf6rzli          ^gftl@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   h7wqbpx          ^gftl-shared@main%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512

==> Concretized jedi-um-env%intel

==> Concretized madis@4.5%intel

==> Concretized mapl@=2.46.2%intel ^esmf@=8.6.1
 -   7okttfo  mapl@2.46.2%intel@2021.5.0~debug~extdata2g~f2py+fargparse~ipo~pflogger~pfunit~shared build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512
 -   r2lhznx      ^fargparse@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   gf6rzli      ^gftl@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   h7wqbpx      ^gftl-shared@main%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512

==> Concretized soca-env%intel

==> Concretized ufs-srw-app-env%intel ^esmf@=8.6.1 ^mapl@=2.46.2
 -   v4cb5fx  ufs-srw-app-env@1.0.0%intel@2021.5.0+python build_system=bundle arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512
 -   7okttfo      ^mapl@2.46.2%intel@2021.5.0~debug~extdata2g~f2py+fargparse~ipo~pflogger~pfunit~shared build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   r2lhznx          ^fargparse@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   gf6rzli          ^gftl@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   h7wqbpx          ^gftl-shared@main%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512

==> Concretized ufs-weather-model-env%intel ^esmf@=8.6.1 ^mapl@=2.46.2
 -   6wx4cyl  ufs-weather-model-env@1.0.0%intel@2021.5.0~debug+python build_system=bundle arch=linux-centos7-skylake_avx512
 -   lyeuaa6      ^esmf@8.6.1%intel@2021.5.0 cxxflags="-fp-model precise" fflags="-fp-model precise" ~debug~external-lapack+external-parallelio+mpi+netcdf~pnetcdf~shared~xerces build_system=makefile esmf_comm=auto esmf_os=auto esmf_pio=auto patches=f63d405 snapshot=none arch=linux-centos7-skylake_avx512
 -   7okttfo      ^mapl@2.46.2%intel@2021.5.0~debug~extdata2g~f2py+fargparse~ipo~pflogger~pfunit~shared build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   r2lhznx          ^fargparse@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   gf6rzli          ^gftl@develop%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
 -   h7wqbpx          ^gftl-shared@main%intel@2021.5.0~ipo build_system=cmake build_type=Release generator=make arch=linux-centos7-skylake_avx512
  8. Fix grib-utils modulefile for wgrib (compare against upstream environment if you are not sure what to do): Replace
<             'WGRIB2': '{prefix}/bin/wgrib2'

with

>             'WGRIB': '{prefix}/bin/wgrib'

in the grib-utils section. I don't know why this was never committed to the release/1.6.0 branch.
  9. Install via spack install --verbose 2>&1 | tee log.install.ue-esmf-8.6.1-mapl-2.46.2.001, then spack module lmod refresh --upstream-modules (use tcl instead of lmod where necessary), then spack stack setup-meta-modules
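
The check of the concretize log can also be scripted. The following is a minimal sketch, not part of the official instructions: new_builds is a hypothetical helper name, and it assumes the log format shown above, where packages that still have to be built are marked with " - " followed by a 7-character hash. Ideally only esmf, mapl, their GFE dependencies and the affected bundle environments show up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the packages that a concretize log marks as
# not yet installed (lines starting with " - "), one name per line,
# without duplicates.
new_builds() {
    local log="$1"
    grep -E '^ +- +[a-z0-9]{7} ' "$log" \
        | sed -E 's/^ +- +[a-z0-9]{7} +\^?([A-Za-z0-9_-]+)@.*$/\1/' \
        | LC_ALL=C sort -u
}

# Example (log name taken from the instructions above):
#   new_builds log.concretize.ue-esmf-8.6.1-mapl-2.46.2.001
```

Any unexpected name in the output suggests that the environment is not reusing the upstream installation for that package.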

List of platforms / installers

ONLY TICK IF YOU'VE ALSO FIXED THE GRIB-UTILS MODULE FOR WGRIB

ONLY DO THIS FOR THE BASE UNIFIED-ENV - IGNORE THE ADDON ENVS

  • MSU Orion (EPIC/EMC)
  • MSU Hercules (EPIC/EMC) (@ulmononian Intel; @climbfuji GNU with mvapich2)
  • NASA Discover SCU16 - NOT NEEDED. Only used by NRL and NRL has already moved to spack-stack-1.7.0
  • NASA Discover SCU17 - NOT NEEDED. Only used by NRL and NRL has already moved to spack-stack-1.7.0
  • NAVY HPCMP Narwhal - NOT NEEDED. Only used by NRL and NRL has already moved to spack-stack-1.7.0
  • NAVY HPCMP Nautilus - NOT NEEDED. Only used by NRL and NRL has already moved to spack-stack-1.7.0
  • NCAR-Wyoming Derecho (EPIC)
  • NOAA Acorn (WCOSS2 test system; @AlexanderRichert-NOAA)
  • NOAA Parallel Works EPIC (AWS, Azure, Gcloud) (@natalie-perlin)
  • NOAA Parallel Works JCSDA (Gcloud) - NOT NEEDED. Only used by JCSDA and JCSDA has already moved to spack-stack-1.7.0
  • NOAA RDHPCS Gaea (EPIC/EMC)
  • NOAA RDHPCS Hera (EPIC/EMC)
  • NOAA RDHPCS Jet (EPIC/EMC)
  • UW (Univ. of Wisconsin) S4 (@climbfuji)
  • Amazon Web Services Parallelcluster Ubuntu 22.04 - NOT NEEDED. Only used by JCSDA and JCSDA has already moved to spack-stack-1.7.0
  • Ubuntu CI runner for skylab, simobs, ... - NOT NEEDED. Only used by JCSDA and JCSDA has already moved to spack-stack-1.7.0
  • Amazon Web Services Red Hat 8 - NOT NEEDED. Only used by JCSDA and JCSDA has already moved to spack-stack-1.7.0

@ulmononian
Collaborator

@climbfuji @AlexanderRichert-NOAA @jkbk2004 @junwang-noaa i installed a chained env based on 1.6.0 but with esmf/8.6.1 and mapl/2.46.2 here /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core. it is intel only for now. please give a try and let us know how it works with the ufs-wm.

@climbfuji
Collaborator

I am doing the gcc part now. I had to comment out jedi-tools-env in the chained environment, but that doesn't matter. Fortunately, 1.6.0 was the last release that had more than one compiler in one environment - this just causes trouble. Unfortunately, though, we always need to go back and make updates to 1.6.0!

@climbfuji
Collaborator

@mathomp4 mapl 2.46.2 refuses to build on Hercules with gcc, because the 1.6.0 stack uses mvapich2:

==> Ran patch() for mapl
==> mapl: Executing phase: 'cmake'
==> Error: InstallError: Unsupported MPI stack

/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/spack/var/spack/repos/builtin/packages/mapl/package.py:363, in cmake_args:
        360        elif self.spec.satisfies("^cray-mpich"):
        361            args.append(self.define("MPI_STACK", "mpich"))
        362        else:
  >>    363            raise InstallError("Unsupported MPI stack")
        364
        365        return args

See build log for details:

Any quick fix for this (locally if needed - we've moved away from mvapich2 since spack-stack-1.7.0)?
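
To see which MPI providers a given package.py recognizes before starting a build, one option is to list the dependency names it tests with self.spec.satisfies("^..."). This is only a sketch: satisfies_deps is a hypothetical helper name, and it assumes the checks are written on a single line as in the traceback above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the dependency names that a Spack package.py
# tests with self.spec.satisfies("^<name>"), one per line.
satisfies_deps() {
    local package_py="$1"
    grep -oE 'satisfies\("\^[A-Za-z0-9_-]+' "$package_py" \
        | sed -E 's/^satisfies\("\^//' \
        | LC_ALL=C sort -u
}

# Example (path from the error message above):
#   satisfies_deps /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/spack/var/spack/repos/builtin/packages/mapl/package.py
```

If mvapich2 is absent from the output, the cmake phase would presumably fall through to the "Unsupported MPI stack" branch.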

@climbfuji
Collaborator

JCSDA/spack#449 and #1189 fix this for release/1.6.0, spack/spack#45164 for spack develop (it will come back to spack-stack-dev with the next pull).

#1189 also fixes the missing grib-utils module file change for wgrib.

@climbfuji
Collaborator

@ulmononian Hercules is done for gcc, and I also fixed the grib-utils module and regenerated all module files.

@ulmononian
Collaborator

@climbfuji thanks for taking on the hercules gcc issue. it looks to me like you did the gcc install in /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install; am i correct? if so, i can let the ufs-wm devs know.

@climbfuji
Collaborator

Correct.

@climbfuji
Collaborator

It looks like the missing platforms are all EMC and EPIC systems - everything else is either done or not needed.

@zach1221

In case it has not been reported here yet, I wanted to flag this issue seen on Hercules when testing with the esmf/8.6.1 spack-stack 1.6.0 installation. @jkbk2004 @BrianCurtis-NOAA @FernandoAndrade-NOAA

CMake Error at /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/intel/2021.9.0/mapl-2.46.2-uiwt3at/lib64/cmake/MAPL/MAPL-targets.cmake:73 (set_target_properties):
The link interface of target "MAPL_cfio_r4" contains:

ESMF::ESMF

but the target was not found. Possible reasons include:

* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.

@mathomp4
Collaborator

That is so odd. I mean, ESMF 8.6.1 and MAPL 2.46 were essentially created to allow for the ESMF::ESMF target.

Hmm. My only next thought is that a FindESMF.cmake file is out of date? The one we have in MAPL and the one we have in ESMA_cmake are identical to the one in ESMF.

Could you somehow be picking up another one? It was noted by @danrosen25 in the ESMF PR that, at the time, these CMake files were out of date:

And a look at them shows them still quite old.

Perhaps some package is still referring to an old FindESMF.cmake in some way?
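
One way to test that suspicion is to list every FindESMF.cmake in the source trees involved and check whether it mentions the ESMF::ESMF target at all. This is a rough sketch under assumptions: check_findesmf is a hypothetical helper name, and the absence of the string ESMF::ESMF is only a hint that a copy is out of date, not proof.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every FindESMF.cmake below a directory and
# whether it mentions the ESMF::ESMF target at all.
check_findesmf() {
    local root="$1" f
    find "$root" -type f -iname 'FindESMF.cmake' | LC_ALL=C sort |
    while IFS= read -r f; do
        if grep -q 'ESMF::ESMF' "$f"; then
            printf 'has-target %s\n' "$f"
        else
            printf 'no-target  %s\n' "$f"
        fi
    done
}

# Example: run from the top of a checked-out application, including submodules.
#   check_findesmf .
```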

@BrianCurtis-NOAA
Contributor

I'm running with the MAPL FindESMF.cmake on WCOSS2 right now, but if this is the case, we should look into coordinating a place for one FindESMF.cmake to exist and other groups pull from that location.

@climbfuji
Collaborator

There's already an issue in the cmakemodules repo that talks about using ESMF's own findESMF.cmake: NOAA-EMC/CMakeModules#70 - there are also issues in fv3-jedi and spack for this if I remember correctly.

@junwang-noaa
Author

@climbfuji is the findESMF.cmake issue caused by the new ESMF 8.6.1? I am curious why it was not an issue with the previous ESMF 8.6.0.

@danrosen25

@climbfuji @junwang-noaa
See this pull request, which was merged into 8.6.1
https://github.com/esmf-org/esmf/pull/226

When I tested this change in UFS, I had trouble IF I updated the UFS FindESMF.cmake files. If I left them alone, then the UFS system built.

@mathomp4
Collaborator

mathomp4 commented Aug 2, 2024

Note: GEOS still has a few esmf target refs from the olden days when we were linking to libesmf.a and, well, in CMake-land that is esmf. But of course now we have a real FindESMF.cmake and we should follow that.

But until I can fix up all of GEOS, we have:

    if (NOT TARGET esmf)
      add_library(esmf ALIAS ESMF::ESMF)
    endif ()

in our code to still support the old style. I hope to remove it soon.

@danrosen25

Similar code can be added to UFS after this line:
https://github.com/ufs-community/ufs-weather-model/blob/develop/CMakeLists.txt#L150

@climbfuji
Collaborator

Where are we with this issue? Have esmf@8.6.1 and mapl@2.46.2 been installed on all NOAA RDHPCS systems in spack-stack-1.6.0? Or is this moot given that spack-stack-1.8.0 has esmf@8.6.1 with mapl@2.46.3?

@junwang-noaa
Author

MAPL 2.46.2 has a bug, so we have to move to esmf 8.6.1 and mapl 2.46.3 to debug the issue. We suggest having a test version of spack-stack 1.6.0 with esmf 8.6.1 and mapl 2.46.3 to continue the debugging work, while you move forward with the spack-stack 1.8.0 release with esmf 8.6.1 and mapl 2.46.3.

@climbfuji
Collaborator

@RatkoVasic-NOAA FYI

@jkbk2004

@mathomp4 we can continue to test on orion and hercules for the new versions of mapl and esmf. we can follow up at ufs-community/ufs-weather-model#2346.

@ulmononian
Collaborator

@climbfuji @junwang-noaa:

@RatkoVasic-NOAA installed a test env on orion/hercules w/ mapl@2.46.3 and esmf@8.6.1 in the following locations:

Hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
Orion: /work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core

thank you @RatkoVasic-NOAA!!!

@mathomp4
Collaborator

mathomp4 commented Nov 8, 2024

Per a telecon today between the MAPL team (cc @tclune) and UFS team (e.g., @junwang-noaa and others), there was a request to create a MAPL tag that was based on MAPL v2.40.3 (which was the last "working" version) but with support for ESMF 8.6.1 as that version was needed. This could then be installed on Hercules for testing.

I've created a "preliminary" tag, v2.40.3.1 where the changes compared to v2.40.3 are:

  1. All esmf target references in CMake are now ESMF::ESMF
  2. Update FindESMF.cmake to the ESMF 8.6.1 version
  3. Updated an ESMF_ConfigNextLine call in get_vec_from_config to use tableEnd per @danrosen25

The tag is on MAPL now, though no release as I'm not sure yet if all the needed CMake, etc. changes have been brought over (MAPL 2.40 was a while ago).

Now, MAPL 2.40.3.1 is not in the spack package.py for MAPL, so I'm guessing it'll need to be installed as:

spack install mapl@git.v2.40.3.1

My laptop seems to be able to resolve that. That said, we might need to iterate on this a few times if I missed something. There have been further "fixes for Spack/UFS" on later tags and perhaps those might need backporting. If so, I can update and push the tag.

@danrosen25

Hi @mathomp4
The 8.6.1-compliant version should fix the method for looping over ESMF_Config tables. It's erroneous to call ESMF_ConfigNextLine when you're at the end of the table because there aren't more items. Previously you could call ESMF_ConfigNextLine at the end of the table and it would return the end of table marker ::. Here's my exchange with Ben.
reproducer_490.tgz

I removed a statically sized string buffer that held data for the "current line" of a configuration file. This work was done to eliminate the fixed maximum line length of 1024. Paired with the work and out of necessity, I also cleaned up all the calls that returned the next line. I'm not sure why the code above doesn't use tableEnd in the call to ESMF_ConfigNextLine?
https://earthsystemmodeling.org/docs/release/latest/ESMF_refdoc/node6.html#SECTION060931800000000000000

Alternatively the code could use ESMF_ConfigGetDim to get the line count of a table and use this in a do loop.
https://earthsystemmodeling.org/docs/release/latest/ESMF_refdoc/node6.html#SECTION060931300000000000000

Technically "::" isn't the next line, similar to how the label itself is not the next line. I can discuss this further with the ESMF Core team if this needs to be added back but my recommendation is to use the tableEnd argument in ESMF_ConfigNextLine.

I've attached a reproducer of this issue that includes the two working examples mentioned above.
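
A quick way to locate candidate call sites is to search the Fortran sources for ESMF_ConfigNextLine calls that do not pass tableEnd. This is only a sketch: nextline_without_tableend is a hypothetical helper name, and it inspects one source line at a time, so a call that passes tableEnd on a continuation line would be reported as a false positive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list ESMF_ConfigNextLine calls that do not pass
# tableEnd on the same source line (file:line:text, as printed by grep -n).
nextline_without_tableend() {
    local root="$1"
    grep -rniE --include='*.F90' --include='*.f90' \
        'call[[:space:]]+ESMF_ConfigNextLine' "$root" \
        | grep -vi 'tableEnd' \
        | LC_ALL=C sort
}

# Example: run from the top of a MAPL checkout.
#   nextline_without_tableend .
```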

@mathomp4
Collaborator

mathomp4 commented Nov 8, 2024

@danrosen25 Ohhhh. Okay. Let me consult with @bena-nasa on that. I might need to push the tag...

ETA: I talked with @bena-nasa about this and we found one table call that needed updating. I've pushed v2.40.3.1 with the update.

@junwang-noaa
Author

@mathomp4 We will test ESMF 8.6.1 with this MAPL 2.40.3.1 in ufs-weather model then.

@AlexanderRichert-NOAA would you please install spack-stack with these new libraries on Hercules. The current Hercules module file is at: https://github.com/ufs-community/ufs-weather-model/blob/develop/modulefiles/ufs_hercules.intel.lua. Thanks

@AlexanderRichert-NOAA
Collaborator

AlexanderRichert-NOAA commented Nov 8, 2024

@mathomp4 the 2.40.3.1 build on hercules is failing because c_ptr & c_loc are undefined in geom/FieldPointerUtilities.F90. Adding use iso_c_binding at the top of the file fixes it, in which case it builds okay.
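
Other files with the same problem could be found with a simple scan. The sketch below is an assumption-laden heuristic: missing_iso_c_binding is a hypothetical helper name, and it only checks whether a file that mentions c_ptr or c_loc also contains a use statement for iso_c_binding somewhere. It cannot tell whether the names are legitimately re-exported by another module.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print Fortran sources that mention c_ptr or c_loc
# but contain no "use ... iso_c_binding" statement (case-insensitive).
missing_iso_c_binding() {
    local root="$1" f
    find "$root" -type f \( -name '*.F90' -o -name '*.f90' \) | LC_ALL=C sort |
    while IFS= read -r f; do
        if grep -qiwE 'c_ptr|c_loc' "$f" &&
           ! grep -qiE '^[[:space:]]*use[[:space:],].*iso_c_binding' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: run from the top of a MAPL checkout.
#   missing_iso_c_binding .
```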

@AlexanderRichert-NOAA
Collaborator

@junwang-noaa if/when @mathomp4 updates the tag to fix the issue in my previous comment, I'll reinstall, but if you want to go ahead and test: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/mapl-2.40.3.1-intel-2021.9.0/install/modulefiles/Core

@mathomp4
Collaborator

mathomp4 commented Nov 8, 2024

@mathomp4 the 2.40.3.1 build on hercules is failing because c_ptr & c_loc are undefined in geom/FieldPointerUtilities.F90. Adding use iso_c_binding at the top of the file fixes it, in which case it builds okay.

Ahhh. Yeah. That was a file where we could compile it because the iso_c_binding was bleeding in via ESMF, but they fixed that on their end. It was for 8.7 but I guess it got into 8.6.1. Good find @AlexanderRichert-NOAA

I've pushed the v2.40.3.1 tag.

@AlexanderRichert-NOAA
Collaborator

Thanks @mathomp4. I just reinstalled using the updated tag.

@junwang-noaa
Author

@AlexanderRichert-NOAA may I ask if you can install ESMF beta snapshot 8.8.0b04(https://github.com/esmf-org/esmf/releases/tag/v8.8.0b04) with MAPL 2.40.3.1 on Hercules for people to test the grid imprint issue in UFS coupled test? Thanks

@AlexanderRichert-NOAA
Collaborator

Yes, will do

@AlexanderRichert-NOAA
Collaborator

/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/esmf-8.8.0b04-intel-2021.9.0/install/modulefiles/Core

@mathomp4
Collaborator

mathomp4 commented Nov 15, 2024

@AlexanderRichert-NOAA et al, I've released a formal MAPL v2.40.3.1 release [1]:

https://github.com/GEOS-ESM/MAPL/releases/tag/v2.40.3.1

and I've made a PR to spack mainline for it:

spack/spack#47627

If any changes are needed now, we'll up the tweak number to 2.

Footnotes

  1. The release doesn't have a Zenodo badge yet because, well, it doesn't seem to be appearing on Zenodo. Not sure why 🤷🏼. I'll keep monitoring. Never mind. It appeared!

@junwang-noaa
Author

@AlexanderRichert-NOAA would you please install ESMF beta snapshot 8.8.0b06(https://github.com/esmf-org/esmf/tree/v8.8.0b06) with MAPL 2.40.3.1 on Hercules for us to test the HR4 hanging issue? Thank you very much!

@AlexanderRichert-NOAA
Collaborator

Yes will do

@AlexanderRichert-NOAA
Collaborator

Done. /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/esmf-8.8.0b06-intel-2021.9.0/install/modulefiles/Core

@junwang-noaa
Author

@AlexanderRichert-NOAA with a one-line change to use the new module:

-- prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/fms-2024.01/install/modulefiles/Core")
prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/esmf-8.8.0b06-intel-2021.9.0/install/modulefiles/Core")

we now get this error message:

$ module load ufs_hercules.intel
Lmod has detected the following error:  These module(s) or extension(s) exist but cannot be loaded as requested: "jasper/2.0.32", "python/3.10.13", "netcdf-fortran/4.6.1",
"cmake/3.23.1", "hdf5/1.14.0"
   Try: "module spider jasper/2.0.32 python/3.10.13 netcdf-fortran/4.6.1 cmake/3.23.1 hdf5/1.14.0" to see how to load the module(s).

Did we miss anything?

@AlexanderRichert-NOAA
Collaborator

Sorry, I forgot to rebuild the modules. Please try again.
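
A sanity check along these lines can catch a missing module refresh before users hit it. This is a sketch under assumptions: check_modulefiles is a hypothetical helper name, it assumes Lmod modulefiles named <name>/<version>.lua somewhere below the environment's modulefiles directory, and it only tests that the files exist, not that Lmod can load them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each name/version argument, report whether a
# matching Lmod modulefile (<name>/<version>.lua) exists below a module tree.
check_modulefiles() {
    local tree="$1" m
    shift
    for m in "$@"; do
        if find "$tree" -path "*/${m}.lua" | grep -q .; then
            printf 'found   %s\n' "$m"
        else
            printf 'missing %s\n' "$m"
        fi
    done
}

# Example (environment path and module names from the comments above):
#   check_modulefiles /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/esmf-8.8.0b06-intel-2021.9.0/install/modulefiles \
#       jasper/2.0.32 python/3.10.13 netcdf-fortran/4.6.1 cmake/3.23.1 hdf5/1.14.0
```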

@junwang-noaa
Author

@AlexanderRichert-NOAA ESMF released a new beta snapshot, 8.8.0b09, for us to test ESMF-managed threading. Would you please install it with the current compiler version and MAPL 2.40.3.1 in spack-stack 1.6.0?

Just to clarify, this does not require any Intel compiler version change. It is a separate request from the Intel version 2021.12 for MAPL/GOCART testing. Thank you!

@AlexanderRichert-NOAA
Collaborator

AlexanderRichert-NOAA commented Dec 11, 2024

hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/esmf-8.8.0b09-intel-2021.9.0/install/modulefiles/Core
