
Error when saving TensorFlowModelDataset as partition #759

Open
anabelchuinard opened this issue Aug 21, 2023 · 8 comments · May be fixed by #978
Labels
bug Something isn't working

Comments

anabelchuinard commented Aug 21, 2023

Description

Can't save TensorFlowModelDataset objects as partition.

Context

I am working on a project where I have to train several models concurrently. I started writing my code using PartitionedDataset, where each partition corresponds to the data for one training run. When I try to save the resulting TensorFlow models as a partition, I get an error. I wonder if this has to do with the fact that those inherit from AbstractVersionedDataset instead of AbstractDataset, and if so, I would be interested to know whether there is any workaround for batch-saving them.

This is the instance of my catalog corresponding to the models I want to save:

tensorflow_models:
  type: PartitionedDataset
  path: data/derived/ML/models
  filename_suffix: ".hdf5"
  dataset:
    type: kedro.extras.datasets.tensorflow.TensorFlowModelDataset

Note: Saving one model (not as partition) works.
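For context, the node feeding such a dataset returns a dict keyed by partition name, and PartitionedDataset saves one file per key. A minimal sketch of that pattern (fit_model, the keys, and the data are illustrative stand-ins, since the behaviour doesn't depend on TensorFlow itself):

```python
# Sketch of a node that trains one model per partition. PartitionedDataset
# passes the inputs as a dict of {partition_name: load_function}.
def train_models(partitioned_data):
    models = {}
    for name, load_func in partitioned_data.items():
        data = load_func()              # PartitionedDataset loads lazily
        models[name] = fit_model(data)  # stand-in for real model training
    return models

def fit_model(data):
    # Stand-in: a real node would return a trained tf.keras model here.
    return {"trained_on": len(data)}

result = train_models({"a": lambda: [1, 2, 3], "b": lambda: [4, 5]})
print(result)  # {'a': {'trained_on': 3}, 'b': {'trained_on': 2}}
```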

Steps to Reproduce

  1. Generate a bunch of trained models
  2. Try to save them in a partition as TensorFlowModelDataset objects

Expected Result

Should save one .hdf5 file per partition, with the name of the file being the associated dictionary key.

Actual Result

Getting this error:

DatasetError: Failed while saving data to data set PartitionedDataset(dataset_config={}, dataset_type=TensorFlowModelDataset,
path=...).
The first argument to `Layer.call` must always be passed.

Your Environment

  • Kedro version used (pip show kedro or kedro -V): kedro, version 0.18.12
  • Python version used (python -V): 3.9.16
  • Operating system and version: Mac M2
astrojuanlu (Member) commented:

Hi @anabelchuinard, thanks for opening this issue and sorry for the delay. It will take us some time but I'm labeling this issue so we don't lose track of it.

@astrojuanlu astrojuanlu added the Community Issue/PR opened by the open-source community label Sep 5, 2023
merelcht (Member) commented Jul 8, 2024

Hi @anabelchuinard, do you still need help fixing this issue?

anabelchuinard (Author) commented:

@merelcht I found a non-kedronic workaround for this but would love to know if there is now a kedronic way for batch-saving those models.

merelcht (Member) commented Jul 9, 2024

Using the PartitionedDataset is definitely the recommended Kedro way for batch saving. I've done some digging, and it seems the following lines are what cause issues when using TensorFlowModelDataset with PartitionedDataset:

if callable(partition_data):
    partition_data = partition_data()  # noqa: PLW2901

@merelcht merelcht removed the Community Issue/PR opened by the open-source community label Jul 9, 2024
@merelcht merelcht changed the title Saving TensorFlowModelDataset as partition Error when saving TensorFlowModelDataset as partition Jul 9, 2024
@merelcht merelcht transferred this issue from kedro-org/kedro Jul 9, 2024
@merelcht merelcht added the bug Something isn't working label Jul 9, 2024
@merelcht merelcht moved this to To Do in Kedro Framework Aug 5, 2024
@ElenaKhaustova ElenaKhaustova self-assigned this Jan 6, 2025
@ElenaKhaustova ElenaKhaustova moved this from To Do to In Progress in Kedro Framework Jan 6, 2025
ElenaKhaustova (Contributor) commented Jan 7, 2025

Cause of the issue

The issue is in how we implement lazy saving for PartitionedDataset. To postpone materialising the data, we expect Callable values in the dictionary fed to PartitionedDataset instead of the actual objects:

if callable(partition_data):
    partition_data = partition_data()  # noqa: PLW2901

When saving the data, we check whether a Callable was passed and call it to get the actual object. Since a TensorFlow model is itself callable, we invoke it when saving, which causes the above error, even though the user didn't mean to apply lazy saving.

So PartitionedDataset currently cannot save Callable objects unless they're wrapped in another Callable, for example a lambda.
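The pitfall can be shown without TensorFlow at all, since it only depends on the object being callable. A minimal illustration (FakeModel is a stand-in for a tf.keras model, which defines __call__ for the forward pass):

```python
class FakeModel:
    """Stand-in for a tf.keras model: callable, forward pass needs inputs."""
    def __call__(self, *inputs):
        if not inputs:
            raise TypeError("The first argument to `Layer.call` must always be passed.")
        return inputs[0]

model = FakeModel()
print(callable(model))  # True: indistinguishable from a lazy-save wrapper

partition_data = model
try:
    if callable(partition_data):           # PartitionedDataset's lazy-save check
        partition_data = partition_data()  # unintentionally invokes the model
except TypeError as err:
    print(err)  # The first argument to `Layer.call` must always be passed.
```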

Current workaround

@anabelchuinard - To make PartitionedDataset save a callable object (such as a model) in the current Kedro version, you need to wrap it as if you wanted lazy saving:

# Without wrapping, PartitionedDataset's callable check invokes each model
# with no arguments, which raises the `Layer.call` error above:
save_dict = {
	"tensorflow_model_32": models["tensorflow_model_32"],
	"tensorflow_model_64": models["tensorflow_model_64"],
}

# Wrapping each model in a lambda prevents it being called when saving:
# PartitionedDataset calls the lambda, which returns the model intact.
save_dict = {
	"tensorflow_model_32": lambda: models["tensorflow_model_32"],
	"tensorflow_model_64": lambda: models["tensorflow_model_64"],
}

Suggested fix

Make PartitionedDataset accept only lambda functions for lazy saving and ignore other callable objects - #978
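A sketch of the narrower check (illustrative only, not necessarily the implementation in #978): since named functions and lambdas share the same runtime type in Python, the lambda case can be distinguished by its compiler-assigned name.

```python
import types

def is_lambda(obj):
    # types.LambdaType is the same type as FunctionType, so the
    # "<lambda>" name is what actually narrows the check.
    return isinstance(obj, types.LambdaType) and obj.__name__ == "<lambda>"

print(is_lambda(lambda: 42))       # True: treated as a lazy-save wrapper

def named():
    return 42

print(is_lambda(named))            # False: named functions are ignored

class CallableModel:
    def __call__(self):
        return 42

print(is_lambda(CallableModel()))  # False: callable objects are ignored
```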

Following PR to update docs

kedro-org/kedro#4402

noklam (Contributor) commented Jan 7, 2025

Suggested fix
Make PartitionedDataset accept only lambda functions for lazy saving and ignore other callable objects - #978

To me this seems like a niche case, and changing PartitionedDataset to accept only lambdas is a bigger breaking change. Any useful callable will likely be more complicated than a simple lambda. Maybe we can let users disable lazy loading/saving (enabled by default) when specified?

ElenaKhaustova (Contributor) commented:

Suggested fix
Make PartitionedDataset accept only lambda functions for lazy saving and ignore other callable objects - #978

To me this seems to be a niche case, and changing PartitionedDataset to only accept lambda is a bigger breaking change. Any useful callable will likely be more complicated than a simple lambda. Maybe we can disable lazy loading/saving (default enable) when specified?

I see the point, but I think the issue is a bit broader than this case. In particular, I don't think it's right to call an arbitrary callable object and use that check to decide whether to apply lazy saving. This affects all ML-model cases (TensorFlow, PyTorch, scikit-learn, etc.) and can potentially execute unwanted code implemented in __call__. Moreover, it's not intuitive for users to have to wrap their objects to avoid such behaviour.

In the suggested solution I tried to narrow these cases down from any callable to lambdas only, so there's less chance of hitting them.

As an alternative, we can consider making lazy saving the default behaviour, so we internally wrap and unwrap objects automatically. But then the question is whether to make it the only option (as it is for lazy loading) or provide some interface to disable it.
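The alternative above can be sketched in a few lines (illustrative only, not Kedro's actual code): if the dataset wraps every value internally, lazy saving becomes the default and callables such as models are never invoked by the callable check.

```python
class CallableStub:
    """Stand-in for a callable model (e.g. a tf.keras model)."""
    def __call__(self, x):
        return x

def save_partitions(partitions, save_one):
    # Hypothetical internal loop: wrap every value, then unwrap just
    # before writing, so user-supplied callables pass through untouched.
    for key, value in partitions.items():
        deferred = lambda v=value: v  # default arg binds the value now
        save_one(key, deferred())     # internal unwrap just before writing

saved = {}
model = CallableStub()
save_partitions({"m32": model}, lambda k, v: saved.setdefault(k, v))
print(saved["m32"] is model)  # True: the model reaches the writer intact
```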

DimedS (Member) commented Jan 8, 2025

Thanks for the investigation and PR, @ElenaKhaustova! I agree with @noklam that relying solely on lambda functions for lazy saving doesn't seem like a generic solution. While it is a breaking change, it's hard to determine how much it will impact users. In my opinion, it would be better to avoid treating all Callables as participants in lazy saving by default. However, this would also be a breaking change. As a simpler alternative, we could provide an option to disable lazy saving, as you suggested.

Projects
Status: In Review