
Towards Large Meshes #59

Merged (8 commits into master from davschneller/mesh64 on Dec 27, 2023)

Conversation

@davschneller (Contributor) commented Sep 26, 2023

As stated, this PR adds large mesh support, or rather, it corrects the few spots where that wasn't the case yet (that is, the SimModeler path should have been able to do that already). For large meshes to work, this PR requires a SCOREC/PUMI version newer than v2.2.7; seemingly, only building directly from master works for that right now (i.e. a version which would not have worked before #58). Older PUMI versions are still supported, and PUMgen should work as usual (cf. #58), but they probably won't support larger meshes, due to the element data type in APF.

Some additional features:

  • We add some compression options for the output, if that is wanted or needed. (Note that most compression options can also be applied, changed, or removed afterwards using h5repack, and the resulting size is roughly the same as storing a PUML mesh in a ZIP file.)
  • Furthermore, we add chunked output writing for large meshes.
  • Also, we integrate 64-bit boundaries natively into PUMgen (groups still stay 32-bit at the moment).

Some of the older mesh formats may break with this update and may be removed in the near future.

@Thomas-Ulrich (Contributor) left a comment

LGTM, I will test it by trying to generate large meshes for the Texascale.

src/meshreader/FidapReader.h (review comment, resolved)
src/pumgen.cpp (review comment, outdated, resolved)
src/pumgen.cpp (review comment, outdated, resolved)
@davschneller (Contributor, Author) commented Oct 4, 2023

Thanks, that sounds great!

About the compression, a few more notes: using any of the deflate compression options requires HDF5 to be compiled with deflate (zlib) support enabled, both for writing and reading. I'd guess that many of the clusters have it enabled, but if something throws an error, that could be the reason why.
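
To check whether a given HDF5 installation can encode and decode deflate-compressed data, a small standalone probe like the following could be used (a minimal sketch, not part of this PR; it only relies on the stock HDF5 filter-introspection calls):

#include <hdf5.h>
#include <cstdio>

// Probe whether the deflate (gzip) filter is available in this HDF5 build,
// and whether it can both encode (needed for writing) and decode (reading).
int main() {
  if (H5Zfilter_avail(H5Z_FILTER_DEFLATE) <= 0) {
    std::fprintf(stderr, "HDF5 was built without deflate support\n");
    return 1;
  }
  unsigned int config = 0;
  if (H5Zget_filter_info(H5Z_FILTER_DEFLATE, &config) < 0) {
    return 1;
  }
  std::printf("deflate encode: %s, decode: %s\n",
              (config & H5Z_FILTER_CONFIG_ENCODE_ENABLED) ? "yes" : "no",
              (config & H5Z_FILTER_CONFIG_DECODE_ENABLED) ? "yes" : "no");
  return 0;
}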

Other than that, a small note regarding the compression: I had tried h5repack with deflate at strength 1 some time ago, and, as mentioned, it basically reduced the mesh (together with the other meshes) to the size of the Zenodo file for the SC21 paper. In particular, the groups and the boundaries could be compressed exceptionally well. Here's the command I used:

h5repack -l connect:CHUNK=8192x4 -l geometry:CHUNK=8192x3 -l group:CHUNK=8192 -l boundary:CHUNK=8192 -f boundary:GZIP=1 -f group:GZIP=1 -f geometry:GZIP=1  -f connect:GZIP=1 infile.h5 compressed-outfile.h5

One more thing to add: the scaleoffset compression does not go well with the compactify-datatypes option, since it only accepts 1-, 2-, 4-, or 8-byte-sized data types. But the scaleoffset compression is also the weakest one.

compactify-datatypes itself is not a compression option per se (meaning that it won't need HDF5 to be compiled with any such option, nor is it recorded as a filter in the file itself). It merely cuts off unused bytes in integers. For example: if we have a mesh with, say, 7 million nodes/cells, then the connectivity indices all fit into a 3-byte integer (which can hold up to roughly 16 million values, or a bit over 8 million if signed). The compactify-datatypes option then stores exactly 3-byte integers instead of the usual 8 bytes.
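
For illustration, a minimal sketch of how such a narrow integer file datatype can be built with plain HDF5 calls (this is not the code from this PR; compactIntegerType and its logic are only an example):

#include <hdf5.h>
#include <cstddef>

// Build a little-endian unsigned integer file datatype that is just wide
// enough to hold values up to maxValue (e.g. 3 bytes for ~7 million indices).
// The in-memory data can stay 64-bit; HDF5 converts during the write.
hid_t compactIntegerType(unsigned long long maxValue) {
  std::size_t bytes = 1;
  while (bytes < 8 && (maxValue >> (8 * bytes)) != 0) {
    ++bytes;
  }
  hid_t fileType = H5Tcopy(H5T_STD_U64LE); // start from an 8-byte integer
  H5Tset_precision(fileType, 8 * bytes);   // shrink the significant bits first
  H5Tset_size(fileType, bytes);            // ...then the stored size, in bytes
  return fileType;                         // caller releases it with H5Tclose
}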

Also, the compactify-datatypes option cannot be applied after the fact by h5repack (whereas deflate and scaleoffset can be added or removed using h5repack).

@krenzland (Contributor) left a comment

How would the compression work then, when integrating it into SeisSol? Does it work well with the parallel I/O? Naively, I would assume that it chunks the data and then compresses it?

We might also consider this then for the output writing?

@davschneller (Contributor, Author) commented Oct 4, 2023

> How would the compression work then, when integrating it into SeisSol? Does it work well with the parallel I/O? Naively, I would assume that it chunks the data and then compresses it?

Yup, that's exactly how it works. But the chunking needs to be done explicitly; cf. the given h5repack command: we need to change the data layout from contiguous to chunked. As I understand it, each chunk is then compressed individually. Apparently it's supported with parallel IO, as long as we don't also use additional HDF5 data transforms (which we shouldn't... I think).
(we can still run PUMgen with MPI, and it doesn't throw any errors)

For an example in code, see PUMGen/src/pumgen.cpp, lines 335 to 346 (at commit 7748c25):

// Dataset creation property list for the connectivity dataset; the filters
// are only set up if the user requested them.
hid_t connectFilter = H5P_DEFAULT;
if (applyFilters) {
  connectFilter = checkH5Err(H5Pcreate(H5P_DATASET_CREATE));
  // Compression requires a chunked layout: each chunk covers up to
  // filterChunksize cells times the 4 vertices of a tetrahedron.
  hsize_t chunk[2] = {std::min(filterChunksize, sizes[0]), 4};
  checkH5Err(H5Pset_chunk(connectFilter, 2, chunk));
  if (filterEnable == 1) {
    // Filter setting 1: lossless scale-offset filter for integers.
    checkH5Err(H5Pset_scaleoffset(connectFilter, H5Z_SO_INT, H5Z_SO_INT_MINBITS_DEFAULT));
  } else if (filterEnable < 11) {
    // Filter settings 2..10: deflate (gzip) with strength 1..9.
    int deflateStrength = filterEnable - 1;
    checkH5Err(H5Pset_deflate(connectFilter, deflateStrength));
  }
}

(and yes, there is still a lot of duplicated code in there at the moment; maybe it can be generalized a bit, as in PUMLcube, SeisSol/Meshing#49)
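
To make that a bit more concrete: the creation property list is then passed when creating the dataset, and the actual write can stay collective. A minimal sketch, assuming a file handle h5file, dataspaces connectSpace/memSpace/fileSpace, and a local connectivity buffer localConnect (these names are illustrative, not the ones used in pumgen.cpp):

// Create the dataset with the (possibly filtered) creation property list.
hid_t connectDataset = checkH5Err(H5Dcreate2(h5file, "/connect", H5T_STD_U64LE,
                                             connectSpace, H5P_DEFAULT,
                                             connectFilter, H5P_DEFAULT));

// With filters enabled, parallel HDF5 needs collective data transfers.
hid_t xfer = checkH5Err(H5Pcreate(H5P_DATASET_XFER));
checkH5Err(H5Pset_dxpl_mpio(xfer, H5FD_MPIO_COLLECTIVE));

// Each rank writes its own hyperslab of the connectivity.
checkH5Err(H5Dwrite(connectDataset, H5T_NATIVE_UINT64, memSpace, fileSpace,
                    xfer, localConnect));

checkH5Err(H5Pclose(xfer));
checkH5Err(H5Dclose(connectDataset));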

> We might also consider this then for the output writing?

That's a good idea; if we work on IO nodes which support that, then why not?

@davschneller davschneller marked this pull request as ready for review December 27, 2023 19:22
@davschneller davschneller merged commit 9d8dd3c into master Dec 27, 2023
7 checks passed
@davschneller davschneller deleted the davschneller/mesh64 branch December 27, 2023 19:22