[Task]: Update the minor version of cloudpickle library prior to Beam release. #23119

Open
tvalentyn opened this issue Sep 9, 2022 · 19 comments

Comments

@tvalentyn
Contributor

tvalentyn commented Sep 9, 2022

What needs to happen?

If a Beam dependency has a flexible upper bound, users will download the most recent compatible version of that dependency at SDK installation time. Over time, the version used at job submission may become newer than the version installed in a released Beam container. Given that forward compatibility of a pickle library is not guaranteed, the pipeline may fail to unpickle at runtime.
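To make the failure mode concrete, here is a minimal sketch of a version-skew check. It is only an illustration, not something Beam does today as far as this issue states: the function names are hypothetical, version strings are assumed to be plain numeric `X.Y.Z`, and how the two versions are obtained is left out.

```python
from typing import Tuple


def minor_line(version: str) -> Tuple[int, int]:
    """Return (major, minor) for a plain numeric 'X.Y.Z' version string."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def runtime_is_older(submit_version: str, runtime_version: str) -> bool:
    """True if the runtime library is on an older minor line than the one
    used at submission, i.e. the case where unpickling may fail."""
    return minor_line(runtime_version) < minor_line(submit_version)


# Hypothetical example: the job was pickled with 3.0.0, the container has 2.2.1.
skewed = runtime_is_older("3.0.0", "2.2.1")
```

Patch-level differences within the same minor line are deliberately ignored here, which matches the per-minor-version pinning proposed in this issue.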

To mitigate this, pickle libraries should be specified in install_requires with tight version bounds that limit them to a particular minor version. This can inconvenience Beam users if we depend on an old version of a library. Therefore, we should periodically update the version we use, at least once per release cycle.
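A rough sketch of what such a bound could look like, using the PEP 440 compatible-release operator (`~=X.Y.Z` allows `>=X.Y.Z` within `X.Y.*`). The helper name and the list below are hypothetical and are not copied from Beam's setup.py; the version number is just the 2.x release mentioned later in this thread.

```python
def minor_pin(name: str, version: str) -> str:
    """Build a requirement string limited to one minor release line.

    Assumes `version` is a plain numeric 'X.Y.Z' string.
    """
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected 'X.Y.Z', got {version!r}")
    # '~=2.2.1' is equivalent to '>=2.2.1, ==2.2.*' under PEP 440.
    return f"{name}~={version}"


# Hypothetical install_requires entry for illustration only.
INSTALL_REQUIRES = [minor_pin("cloudpickle", "2.2.1")]
```

With a bound like this, moving to a new minor version is an explicit edit per release cycle, which is what this issue tracks.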

We can consider closing this issue when any of the following conditions is met:

  • Beam vendors cloudpickle.
  • Beam stages the version of cloudpickle used at pipeline submission time and uses it at runtime.
  • Beam communicates to the users a list of vetted versions of each Beam dependency that users must install to use Beam in a supported configuration. Then, we can open up a version range.

Until then, don't close this issue; instead, move it to the next release milestone after updating the version in https://github.com/apache/beam/blob/master/sdks/python/setup.py

Issue Priority

Priority: 3

Issue Component

Component: sdk-py-core

@kennknowles
Member

Looks like we are on the latest

@johnjcasey
Contributor

updated for 2.45: #25143

@damccorm
Contributor

damccorm commented Feb 8, 2023

Moving to 2.47 since there is not a new release

@tvalentyn tvalentyn added this to the 2.52.0 Release milestone Sep 20, 2023
@damccorm
Contributor

It looks like there's been a major version upgrade (and no minor version upgrade) - https://pypi.org/project/cloudpickle/#history

@tvalentyn I don't think this should be a release blocker and probably represents non-trivial work to investigate/upgrade.

I'm going to move the release blocker and we should think about a good way to fund this going forward

@jrmccluskey
Contributor

Similar status as before for cloudpickle: the major increment to 3.0.0 happened before the 2.52.0 branch cut. @tvalentyn any objections to rolling this up to 2.54?

@tvalentyn
Contributor Author

no objections.

@lostluck
Contributor

There's one week until the 2.54.0 cut, and this issue is tagged for that release. If possible/necessary, please complete the necessary work before then, or move this to the 2.55.0 Release Milestone.

@Abacn
Contributor

Abacn commented Mar 6, 2024

2.2.1 is still the latest of 2.x as of 2.55.0 cut

@Abacn Abacn modified the milestones: 2.55.0 Release, 2.56.0 Release Mar 6, 2024
@damccorm
Contributor

No update here

@tvalentyn tvalentyn removed this from the 2.57.0 Release milestone Apr 10, 2024
@Abacn Abacn added this to the 2.60.0 Release milestone Oct 2, 2024
@Abacn
Contributor

Abacn commented Oct 2, 2024

We should move to the next milestone each time (not just remove it)

@Abacn
Contributor

Abacn commented Oct 2, 2024

Per #32617, cloudpickle 2.2.1 looks fine for Python 3.12; however, we should revisit cloudpickle 3.0 later. Moving to the next milestone.

@Abacn Abacn modified the milestones: 2.60.0 Release, 2.61.0 Release Oct 2, 2024
@liferoad
Contributor

#30528 (comment). Note the latest dask moves to cloudpickle >= 3.0.0 (https://github.com/dask/dask/blob/main/pyproject.toml#L33)

@kennknowles
Member

No new minor version in the 2.x.y line. Will we be able to move to 3.x.y eventually, or are there user-breaking changes and/or serialization changes?

@tvalentyn
Contributor Author

We'll take a look at this as part of https://s.apache.org/beam-cloudpickle-next-steps.
cc: @claudevdm
