
Add tests for grib idx & reinflate #528

Draft: emfdavid wants to merge 7 commits into main from grib_idx_tests
Conversation

@emfdavid (Contributor) commented Nov 19, 2024

Add chunky test files and copy over unit test approach.

Builds on #523

TODO:

  • Get tests working by editing fixtures/code
  • Modify to use pytest instead of unittest
  • Squash commits to avoid extra git blobs

@martindurant (Member) commented

Well that's a lot .... :)

@emfdavid (Contributor, Author) commented

> Well that's a lot .... :)

All part of my plan to become top contributor...

@emfdavid changed the title from "Add tests for discussion" to "Add tests for grib idx & reinflate" on Nov 27, 2024
@emfdavid force-pushed the grib_idx_tests branch 2 times, most recently from 65ff274 to 261ee2f on Dec 16, 2024
@@ -284,6 +291,10 @@ def test_build_idx_grib_mapping(self):
         )
         expected = pd.read_parquet(kindex_test_path)

+        expected = expected.assign(
+            step=lambda x: x.step.astype("timedelta64[ns]")
@emfdavid (Contributor, Author) commented on this diff:

@martindurant These errors in the CI system are really strange... they typically only happen when using an old version of numpy/pandas that has a different default time type.
The last one, the sync error, seems like maybe I need to convert from unittest subTest to pytest?

@martindurant (Member) commented

It claims to have numpy 2.2.0, but there is this:

RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject

which suggests maybe there is a conda-forge/defaults/pip crossover? We do have nodefaults in the environment spec.

@emfdavid (Contributor, Author) commented

> It claims to have numpy 2.2.0, but there is this:
>
> RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
>
> which suggests maybe there is a conda-forge/defaults/pip crossover? We do have nodefaults in the environment spec.

Welp - I dumped as much version info as I could think of and I don't see any smoking gun:
https://github.com/fsspec/kerchunk/actions/runs/12385105383/job/34570789000?pr=528#step:5:532
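For reference, a minimal sketch of the kind of environment dump that can expose a conda/pip crossover (the module list here is just the obvious suspects, not the exact set dumped in CI):

```python
# Print version and install location for the likely suspects; a mix of
# conda-managed and pip-managed paths would point at the crossover.
import importlib

for name in ("numpy", "pandas", "fastparquet", "fsspec"):
    mod = importlib.import_module(name)
    print(name, getattr(mod, "__version__", "?"), mod.__file__)
```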

@emfdavid (Contributor, Author) commented

Okay - I can repro locally now after installing anaconda python.
I will debug tomorrow.

@emfdavid (Contributor, Author) commented

Okay - fixed the dtype on the step column.
It is an issue with reading the parquet files - I had to set engine='fastparquet' on the pd.read_parquet calls in my test.
Otherwise the timedelta64 step column does not decode properly.
This only happens in the Anaconda environment; I can't repro the issue when I use a Python 3.12 virtual env.
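A minimal sketch of the workaround (the fixture path below is a hypothetical stand-in for the real one in tests/test__grib_idx.py):

```python
import pandas as pd

# Hypothetical fixture path, for illustration only.
kindex_test_path = "tests/fixtures/kindex.parquet"

# Forcing the fastparquet engine avoids the mis-decoded timedelta64
# step column seen under the Anaconda environment.
expected = pd.read_parquet(kindex_test_path, engine="fastparquet")

# Normalize the dtype explicitly, as in the diff above.
expected = expected.assign(step=lambda x: x.step.astype("timedelta64[ns]"))
```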

I am not sure what the remaining sync error is. I can't reproduce that one locally yet. Could it be a version skew problem?

@emfdavid (Contributor, Author) commented

Looks like installing the head of fsspec was breaking this test.
Thoughts @martindurant ?

The pd.read_parquet requiring engine='fastparquet' when using Anaconda Python is also quite strange. Is there some global state being set that would break timedelta64 types?
I will try to make a repro test for the latter on the main branch.

@emfdavid mentioned this pull request on Dec 19, 2024
@emfdavid (Contributor, Author) commented

Okay - tests are green, but I don't see an easy way to convert the heavy use of unittest subTest to the pytest parametrize mark.

The current behavior, when run with subTest, gives really nice error messages when things go wrong for a particular set of subtests. I forced a failure for some subtests by adding:

diff --git a/tests/test__grib_idx.py b/tests/test__grib_idx.py
index 1e83d2f..4651710 100644
--- a/tests/test__grib_idx.py
+++ b/tests/test__grib_idx.py
@@ -630,6 +630,9 @@ class DataExtractorTests(unittest.TestCase):
                             ]
                         )

+                        if var.name == "dswrf":
+                            self.fail("oh no!")
+
                         # # To update test grib_idx_fixtures

Now, when I run python -m unittest tests/test__grib_idx.py, I get:

======================================================================
FAIL: test_reinflate_grib_store (tests.test__grib_idx.DataExtractorTests.test_reinflate_grib_store) (var_name='dswrf', node_path='/dswrf/avg/surface', dataset='hrrr.wrfsubhf', aggregation=<AggregationType.BEST_AVAILABLE: 'best_available'>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/davidstuebe/projects/kerchunk/tests/test__grib_idx.py", line 634, in _reinflate_grib_store
    self.fail("oh no!")
AssertionError: oh no!

======================================================================
FAIL: test_reinflate_grib_store (tests.test__grib_idx.DataExtractorTests.test_reinflate_grib_store) (var_name='dswrf', node_path='/dswrf/instant/surface', dataset='hrrr.wrfsubhf', aggregation=<AggregationType.BEST_AVAILABLE: 'best_available'>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/davidstuebe/projects/kerchunk/tests/test__grib_idx.py", line 634, in _reinflate_grib_store
    self.fail("oh no!")
AssertionError: oh no!

Pytest will run all these cases... but it doesn't give any of the subtest context on which part failed:

______________________________________________________________________________ DataExtractorTests.test_reinflate_grib_store _______________________________________________________________________________

self = <test__grib_idx.DataExtractorTests testMethod=test_reinflate_grib_store>

    def test_reinflate_grib_store(self):
        for dataset in self._reinflate_grib_store_dataset():
            for aggregation, axes in self._reinflate_grib_store_aggregation():
                with self.subTest(dataset=dataset, aggregation=aggregation):
>                   self._reinflate_grib_store(dataset, aggregation, axes)

tests/test__grib_idx.py:658:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test__grib_idx.py:634: in _reinflate_grib_store
    self.fail("oh no!")
E   AssertionError: oh no!

I found a subtest package on PyPI, but I am not sure you want the extra dependency. Any clever ideas on how to restructure the tests without a total rewrite?
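For what it's worth, the pytest-subtests plugin would keep the per-case context with minimal churn. A sketch, with hypothetical helper names standing in for the real internals of tests/test__grib_idx.py:

```python
# Requires the pytest-subtests plugin (pip install pytest-subtests),
# which provides the `subtests` fixture. datasets(), aggregations()
# and check_reinflate() are hypothetical stand-ins.
def test_reinflate_grib_store(subtests):
    for dataset in datasets():
        for aggregation, axes in aggregations():
            # Each failing combination is reported separately with its
            # parameters, much like unittest's subTest output above.
            with subtests.test(dataset=dataset, aggregation=aggregation):
                check_reinflate(dataset, aggregation, axes)
```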

@martindurant (Member) commented

> Any clever ideas on how to restructure the tests without a total rewrite?

Sorry, I have never had to do any complex unittest->pytest refactoring.

When you use parametrize and -v for the command, you do see parameter names (often just numbered, depending on input type) in the PASS/FAIL list.

I think adding helper packages for the sake of saner test run output is totally fine. Test-time dependencies are easier to justify than runtime.
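A sketch of the parametrize route (the parameter values are illustrative; only hrrr.wrfsubhf and best_available appear in the failure output above):

```python
import pytest

# Stacked parametrize marks generate one test per combination, and
# `pytest -v` shows ids like test_reinflate[best_available-hrrr.wrfsubhf].
@pytest.mark.parametrize("dataset", ["hrrr.wrfsubhf"])  # illustrative values
@pytest.mark.parametrize("aggregation", ["best_available"])  # illustrative values
def test_reinflate(dataset, aggregation):
    ...  # body would call the existing _reinflate_grib_store helper
```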
