Can Kedro Pipeline be stateless? #4400

Open
duy-rhombus opened this issue Jan 6, 2025 · 0 comments
Labels
Community (Issue/PR opened by the open-source community)
Issue: Feature Request (New feature or improvement to existing feature)

Comments

@duy-rhombus

Description

Can Kedro support a stateless design in which nodes are not recomputed when a new pipeline is initialized, by reusing the database model and computation state persisted from a previous run?

Context

In a stateless setup, the pipeline is rebuilt from scratch on every run. As a result, every node is recomputed, which is wasteful for costly operations (e.g., LLM calls) and large datasets.

Possible Implementation

Could Kedro allow recovering pipeline computation states (e.g., node outputs) stored externally, such as in S3, to avoid recomputation with a new pipeline instance?
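
To make the request concrete, here is a rough sketch of the behaviour in plain Python. It does not use Kedro's API: `ExternalStore`, `cache_key`, and `run_pipeline` are hypothetical names invented for illustration, the in-memory store only stands in for something like an S3 bucket, and the cache key assumes node inputs are JSON-serializable.

```python
import hashlib
import json
import pickle
from typing import Any, Callable, Dict, List, Tuple


class ExternalStore:
    """Hypothetical stand-in for an external store such as an S3 bucket."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def exists(self, key: str) -> bool:
        return key in self._blobs

    def save(self, key: str, value: Any) -> None:
        self._blobs[key] = pickle.dumps(value)

    def load(self, key: str) -> Any:
        return pickle.loads(self._blobs[key])


def cache_key(node_name: str, inputs: Dict[str, Any]) -> str:
    """Derive a key from the node name and its inputs (assumed JSON-serializable)."""
    payload = json.dumps({"node": node_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A node is (name, function, input names, output name); nodes are listed in run order.
Node = Tuple[str, Callable[..., Any], List[str], str]


def run_pipeline(
    nodes: List[Node], feed: Dict[str, Any], store: ExternalStore
) -> Tuple[Dict[str, Any], List[str]]:
    """Run nodes in order, reusing any output already present in the store."""
    data = dict(feed)
    executed: List[str] = []
    for name, func, input_names, output_name in nodes:
        inputs = {k: data[k] for k in input_names}
        key = cache_key(name, inputs)
        if store.exists(key):
            # Recover the output of a previous run instead of recomputing it.
            data[output_name] = store.load(key)
        else:
            data[output_name] = func(**inputs)
            store.save(key, data[output_name])
            executed.append(name)
    return data, executed


def clean(raw: List[str]) -> List[str]:
    return [s.strip().lower() for s in raw]


def summarise(cleaned: List[str]) -> str:
    # Placeholder for an expensive step such as an LLM call.
    return " | ".join(cleaned)


NODES: List[Node] = [
    ("clean", clean, ["raw"], "cleaned"),
    ("summarise", summarise, ["cleaned"], "summary"),
]

store = ExternalStore()
first, ran_first = run_pipeline(NODES, {"raw": [" A ", "b"]}, store)
# A fresh call models a newly initialized pipeline that shares only the store.
second, ran_second = run_pipeline(NODES, {"raw": [" A ", "b"]}, store)
third, ran_third = run_pipeline(NODES, {"raw": [" C "]}, store)
```

In this sketch the second run executes no nodes because every output is recovered from the store, while the third run recomputes both nodes because the input changed. The key here ignores changes to a node's code, so a real implementation would likely need to account for that as well.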

@duy-rhombus duy-rhombus added the Issue: Feature Request New feature or improvement to existing feature label Jan 6, 2025
@duy-rhombus duy-rhombus changed the title from "<Title>Can Kedro Pipeline be stateless?" to "Can Kedro Pipeline be stateless?" Jan 6, 2025
@merelcht merelcht added the Community Issue/PR opened by the open-source community label Jan 6, 2025