Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add pre-commit check for setup.cfg options.extras_require #49261

Merged
merged 31 commits into from
Nov 7, 2022

Conversation

kostyafarber
Copy link
Contributor

@kostyafarber kostyafarber commented Oct 23, 2022

Augmented scripts/validate_min_versions_in_sync.py to ensure that minimum versions are aligned in other areas where minimum versions are specified (setup.cfg).

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the check working correctly as https://github.com/pandas-dev/pandas/actions/runs/3307040746/jobs/5458375266 is failing?

Moreover, it would be great if the diff could be made clearer to state which file states which version e.g.

{package}
{file_1}: {version_1}
{file_2}: {version_2}
{file_3}: {version_3}

@mroeschke mroeschke added the Code Style Code style, linting, code_checks label Oct 24, 2022
@JMBurley
Copy link
Contributor

Bringing over my main issue comment:

Based on experience my key goal here would be clear error messages to help users find the problem, so I'd then compare the diff list against each the pkg list from each source file to identify which file is missing what pkg and print that as part of the failure message.

I agree with mroescke's suggestion for how to format that information.

@kostyafarber
Copy link
Contributor Author

Yes I agree we can make the diff more informative, I will work on making those changes. However, not sure why the check isn't working as intended.

Following from the logic of the previous iteration of scripts/validate_min_versions_in_sync.py which was taking the symmetric difference of the dependency dicts, we decided to go with the approach of subtracting the intersection of all dicts (adding the dependencies dict from setup.cfg) from the union i.e (a | b | c) - (a & b & c). to obtain the diff.

My hunch is maybe it's picking up versions that are defined in one place but not anywhere else?

Maybe I'm missing something but will have a crack and look into it.

@kostyafarber
Copy link
Contributor Author

kostyafarber commented Oct 25, 2022

Question: are we excluding test libraries in setup.cfg in our checks? I saw earlier in the script pytestis being dropped from one of the dependency dicts and didn't know if we are extending the same in this case. However there are more libraries than just pytest in tests are we keeping those or dropping them too?

@mroeschke
Copy link
Member

However there are more libraries than just pytest in tests are we keeping those or dropping them too?

Maybe can ignore the testing group for now. They can be addressed in a follow up PR

@kostyafarber
Copy link
Contributor Author

Update: I nearly have it working but there are a couple of things I wanted to raise:

  • When a library is specified in one file but none of the others OR all of the files but one:
qtpy
/Users/kostyafarber/repos/pandas/ci/deps/actions-38-minimum_versions.yaml: Not specified
/Users/kostyafarber/repos/pandas/pandas/compat/_optional.py: Not specified
/Users/kostyafarber/repos/pandas/setup.cfg: 2.2.0
hypothesis
/Users/kostyafarber/repos/pandas/ci/deps/actions-38-minimum_versions.yaml: 6.13.0
/Users/kostyafarber/repos/pandas/pandas/compat/_optional.py: 6.13.0
/Users/kostyafarber/repos/pandas/setup.cfg: Not specified

Is this something we should be picking up? Or are these false positives.

  • There are a couple of instances when the versions are the "same" but have different formatting of versions (leading zero for example):
fsspec
/Users/kostyafarber/repos/pandas/ci/deps/actions-38-minimum_versions.yaml: 2021.07.0
/Users/kostyafarber/repos/pandas/pandas/compat/_optional.py: 2021.07.0
/Users/kostyafarber/repos/pandas/setup.cfg: 2021.7.0

Not sure how we want to deal with those. Would probably just be easier to make sure these follow the same conventions?

@mroeschke
Copy link
Member

Ideally all files should have the same library with the same version (and format) specified.

@kostyafarber
Copy link
Contributor Author

Okay I'll commit my changes shortly. Looks like there are a few libraries not in sync (if the check is working correctly). The pre-commit hook will expose them and you can let me know if we want to change/update/add them in this PR.

@kostyafarber
Copy link
Contributor Author

kostyafarber commented Oct 27, 2022

@kostyafarber
Copy link
Contributor Author

Is there anything else I need to do here?

@MarcoGorelli
Copy link
Member

The check is failing at the moment, so the versions should be aligned to get the check to pass before this can be merged

It's probably worth also checking that any minimum aligns with with what's been reported as the minimum version in the whatsnew notes. For example, bottleneck is listed as >=1.3.1 in setup.cfg, but >= 1.3.2 in the 1.5.0 whatsnew note:

bottleneck>=1.3.1

| bottleneck |1.3.2 | X |

So, it'd be good to align the bottleneck minimum to 1.3.2 across these files

@kostyafarber
Copy link
Contributor Author

Okay cool. I'll check with what's in the what's new and change accordingly.

@kostyafarber
Copy link
Contributor Author

I've aligned versions across the files (checking what's in whatsnew) and the check should pass now.

@kostyafarber
Copy link
Contributor Author

Some of the packages I've aligned seem to be causing in issue in: https://github.com/pandas-dev/pandas/actions/runs/3362264680/jobs/5573763359

Encountered problems while solving:
    - nothing provides requested pyqt5 5.15.1**
    - package s3fs-2021.8.0-pyhd8ed1ab_0 requires aiobotocore 1.4.0, but none of the providers can be installed

Not sure any of the other fails are to do with this PR.

@mroeschke
Copy link
Member

I think for pyqt5 it will need to be installed with pip like in the conda env yaml like

- pip
   - pyqt5=version

@JMBurley
Copy link
Contributor

JMBurley commented Oct 31, 2022

@mroeschke @kostyafarber Regarding the build error
s3fs-2021.8.0-pyhd8ed1ab_0 requires aiobotocore 1.4.0

Ah, this is a common issue. aiobotocore is a huge packaging problem, and the fact that s3fs>0.4.0 relies on aiobotocore is a short-sighted choice that creates problems across the python ecosystem. (in summary: you can't use aiobotocore alongside anything else that interacts with boto3. Which is a LOT of stuff. Anyone wanting to unit test a repo will be using moto, for instance.)

This was mentioned back in June for Pandas here, and it has been discussed more extensively in other repos (eg. here)

I am not sure if there is a functional reason for pandas to advance the s3fs version pin beyond 0.4.0. My 2 cents are that the s3fs version should to be held at 0.4.0 until the aiobotocore issue is resolved. I'm happy to make a separate issue, adr, or PDEP to have that discussion if it needs to be had (@mroeschke ?)

@kostyafarber
Copy link
Contributor Author

@JMBurley sounds like a mess. I've pinned s3fs at 0.4.0 for now.

@kostyafarber kostyafarber requested review from JMBurley and mroeschke and removed request for JMBurley November 3, 2022 08:18
@kostyafarber
Copy link
Contributor Author

@mroeschke looks to be okay. Just one fail due to something with matplotlib but don't think it's related here.

@kostyafarber
Copy link
Contributor Author

kostyafarber commented Nov 4, 2022

Gonna leave it here. If there's anything else I need to do let me know.

All checks successful @mroeschke @JMBurley @MarcoGorelli

@kostyafarber
Copy link
Contributor Author

Any updates on this one?

Comment on lines -108 to +107
matplotlib>=3.3.2
matplotlib>=3.6.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - I think older versions of matplotlib were failing the CI checks, so it's good you've caught this

Copy link
Contributor Author

@kostyafarber kostyafarber Nov 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Only caught it because of the script 😄

Thanks for reviewing and chasing up.

@@ -45,7 +46,7 @@ dependencies:
- pytables=3.6.1
- python-snappy=0.6.0
- pyxlsb=1.0.8
- s3fs=2021.08.0
- s3fs=0.4.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JMBurley - could you please clarify about

I'm not sure if pandas can be built successfully with s3fs>=0.4.0

? The environment.yml file has a later version of s3fs, and building pandas with the environment.yml file works fine

- s3fs>=2021.08.0

@JMBurley
Copy link
Contributor

JMBurley commented Nov 6, 2022

@MarcoGorelli

The issue is well covered in prior links (here) but in summary when s3fs required aiobotocore (which in turn required boto3) it was incompatible with any other package that used boto3 because aiobotocore has extremely tight OLD pins to boto3 that other libraries generally don't.

However, it looks like this may have been quietly resolved in the past few months as my simple test .cfg file no longer triggers a pip-compile error

[metadata]
name = s3fs-test
version = 1.0.0
author = JMBurley
classifiers =
    Programming Language :: Python :: 3.9
    Operating System :: OS Independent

[options]
zip_safe = False
include_package_data = True
# min python 3.8 for Walrus operator; preclude auto-updates to future python 4.0
python_requires= >=3.9, <4.
# In principle, should make list below accept a broad range of packages, but not new major versions (risk of
# breaking changes). the `~=` and `>=2.1, <3.0` syntax is useful for this
install_requires =
    boto3>=1.20.0
    moto>=3.0, <4.0  # [s3] manages boto3, botocore. Thorny mutual dependency problem w/ aiobotocore via s3fs.
    s3fs>0.4.0  # s3fs<0.5 doesn't need aiobotocore (and its strict botocore pinning).  Removes incompatibility with moto's botocore needs.

run with pip-compile --output-file=requirements_example.txt --resolver=backtracking

In which case I remove my objection if we can get tests working here with a newer s3fs version. Although I am concerned that aiobotocore has historically caused packaging issues frequently so might not be "safe" long-term.

@kostyafarber
Copy link
Contributor Author

@JMBurley bumped s3fs awaiting green tick.

@kostyafarber
Copy link
Contributor Author

@JMBurley bumped s3fs awaiting green tick.

Looks good 👍🏻

@MarcoGorelli MarcoGorelli self-requested a review November 6, 2022 17:27
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - got a really minor comment but not a blocker

scripts/validate_min_versions_in_sync.py Outdated Show resolved Hide resolved
@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Nov 6, 2022
Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Copy link
Contributor

@JMBurley JMBurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for the work here!

@mroeschke mroeschke merged commit 8b6b867 into pandas-dev:main Nov 7, 2022
@mroeschke
Copy link
Member

Awesome work @kostyafarber; glad this caught some issues already :) and thanks @JMBurley for your continued guidance here

@kostyafarber
Copy link
Contributor Author

No worries @mroeschke glad I could help. Thanks for the support everybody! 🚀

@kostyafarber kostyafarber deleted the issue-48949 branch November 7, 2022 19:54
phofl pushed a commit to phofl/pandas that referenced this pull request Nov 9, 2022
…s-dev#49261)

* Add dependency check for setup.cfg

* improve diff alert, add install mapping, remove test deps

* align versions across min version files

* fix version issues for boto3, s3fs and pyqt5. Small change to script

* fix version equality, modify script for CI check

* fix version typo in min version yaml and update min version script

* bump fsspec, gcsfs to match in whats new

* Revert "bump fsspec, gcsfs to match in whats new"

This reverts commit 52efecb.

* align fastparquet, lowercase PyQt5

* ENH: align matplotlib versions

* ENH: align matplotlib in setup.cfg

* bump s3fs across min version files

* Update scripts/validate_min_versions_in_sync.py

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
…s-dev#49261)

* Add dependency check for setup.cfg

* improve diff alert, add install mapping, remove test deps

* align versions across min version files

* fix version issues for boto3, s3fs and pyqt5. Small change to script

* fix version equality, modify script for CI check

* fix version typo in min version yaml and update min version script

* bump fsspec, gcsfs to match in whats new

* Revert "bump fsspec, gcsfs to match in whats new"

This reverts commit 52efecb.

* align fastparquet, lowercase PyQt5

* ENH: align matplotlib versions

* ENH: align matplotlib in setup.cfg

* bump s3fs across min version files

* Update scripts/validate_min_versions_in_sync.py

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: pre-commit check for setup.cfg options.extras_require
4 participants