Can Kedro Pipeline be stateless? #4400

duy-rhombus · 2025-01-06T04:06:09Z

Description

Can Kedro support a stateless design in which nodes are not recomputed when a new pipeline is initialized, by reusing the database model and computation state from a previous run?

Context

In a stateless setup, the pipeline is rebuilt from scratch on every run. This leads to all nodes being recomputed, which is inefficient for costly operations (e.g., LLM calls) and large datasets.

Possible Implementation

Could Kedro allow recovering pipeline computation states (e.g., node outputs) stored externally, such as in S3, to avoid recomputation with a new pipeline instance?
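A minimal sketch of the idea, in plain Python rather than Kedro API: node outputs are written to an external store under a key derived from the node name and its inputs, so a freshly constructed runner can find them and skip the computation. Every name here (`NodeOutputStore`, `cache_key`, `run_node`, `expensive_summary`) is hypothetical, a local directory stands in for S3, and inputs are assumed to be JSON-serialisable.

```python
from __future__ import annotations

import hashlib
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


class NodeOutputStore:
    """Hypothetical external store; a local directory stands in for S3."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        return self.root / f"{key}.pkl"

    def has(self, key: str) -> bool:
        return self._path(key).exists()

    def load(self, key: str) -> Any:
        with self._path(key).open("rb") as fh:
            return pickle.load(fh)

    def save(self, key: str, value: Any) -> None:
        with self._path(key).open("wb") as fh:
            pickle.dump(value, fh)


def cache_key(node_name: str, inputs: dict[str, Any]) -> str:
    """Hash the node name together with its (JSON-serialisable) inputs."""
    payload = json.dumps({"node": node_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_node(
    store: NodeOutputStore,
    node_name: str,
    func: Callable[..., Any],
    inputs: dict[str, Any],
) -> Any:
    """Return the stored output if one exists; otherwise compute and store it."""
    key = cache_key(node_name, inputs)
    if store.has(key):
        return store.load(key)
    result = func(**inputs)
    store.save(key, result)
    return result


# Usage: two independent store instances share only the external location.
calls: list[str] = []


def expensive_summary(text: str) -> str:
    calls.append(text)  # record each real invocation
    return text.upper()


store_dir = Path(tempfile.mkdtemp())
first = run_node(NodeOutputStore(store_dir), "summarise", expensive_summary, {"text": "hello"})
second = run_node(NodeOutputStore(store_dir), "summarise", expensive_summary, {"text": "hello"})
```

The second call builds a new store object, as a new pipeline instance would, yet `expensive_summary` runs only once because the output is already present at the shared location. Keying on inputs rather than on node name alone means a changed input triggers recomputation instead of returning a stale result.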

duy-rhombus added the Issue: Feature Request New feature or improvement to existing feature label Jan 6, 2025

duy-rhombus changed the title ~~<Title>Can Kedro Pipeline be stateless?~~ Can Kedro Pipeline be stateless? Jan 6, 2025

merelcht added the Community Issue/PR opened by the open-source community label Jan 6, 2025
