Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add pre-commit check for setup.cfg options.extras_require #49261

Merged
merged 31 commits into from
Nov 7, 2022
Merged
Show file tree
Hide file tree
Changes from 28 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
542b336
Add dependency check for setup.cfg
kostyafarber Oct 23, 2022
f9592e5
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Oct 25, 2022
6abda3e
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Oct 27, 2022
39e86cc
improve diff alert, add install mapping, remove test deps
kostyafarber Oct 27, 2022
7edca5c
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Oct 29, 2022
97adc2a
align versions across min version files
kostyafarber Oct 31, 2022
08a3e3f
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Oct 31, 2022
4036bfe
fix version issues for boto3, s3fs and pyqt5. Small change to script
kostyafarber Oct 31, 2022
953e885
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Oct 31, 2022
8aed966
Merge branch 'main' into issue-48949
kostyafarber Oct 31, 2022
d079b4f
fix version equality, modify script for CI check
kostyafarber Nov 1, 2022
2d272c9
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Nov 1, 2022
4f95fb2
Merge branch 'pandas-dev:main' into issue-48949
kostyafarber Nov 2, 2022
a256dcc
fix version typo in min version yaml and update min version script
kostyafarber Nov 2, 2022
52efecb
bump fsspec, gcsfs to match in whats new
kostyafarber Nov 2, 2022
309dc97
Merge branch 'main' into issue-48949
kostyafarber Nov 2, 2022
af47477
Revert "bump fsspec, gcsfs to match in whats new"
kostyafarber Nov 2, 2022
6f6b97d
align fastparquet, lowercase PyQt5
kostyafarber Nov 2, 2022
637dc10
Merge branch 'main' into issue-48949
kostyafarber Nov 2, 2022
674dadd
Merge branch 'main' into issue-48949
kostyafarber Nov 2, 2022
3c4a28e
Merge branch 'main' into issue-48949
kostyafarber Nov 2, 2022
0366b53
ENH: align matplotlib versions
kostyafarber Nov 3, 2022
56c7c05
Merge branch 'main' into issue-48949
kostyafarber Nov 3, 2022
e970faa
Merge branch 'main' into issue-48949
kostyafarber Nov 3, 2022
943d5db
Merge branch 'main' into issue-48949
kostyafarber Nov 4, 2022
bace671
ENH: align matplotlib in setup.cfg
kostyafarber Nov 4, 2022
9f414d8
Merge branch 'main' into issue-48949
kostyafarber Nov 4, 2022
e92334d
Merge branch 'main' into issue-48949
kostyafarber Nov 5, 2022
aee587a
bump s3fs across min version files
kostyafarber Nov 6, 2022
e5ea765
Merge branch 'main' into issue-48949
kostyafarber Nov 6, 2022
d2608b1
Update scripts/validate_min_versions_in_sync.py
kostyafarber Nov 6, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion ci/deps/actions-38-minimum_versions.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ dependencies:
- numba=0.53.1
- numexpr=2.7.3
- odfpy=1.4.1
- qtpy=2.2.0
- openpyxl=3.0.7
- pandas-gbq=0.15.0
- psycopg2=2.8.6
Expand All @@ -45,7 +46,7 @@ dependencies:
- pytables=3.6.1
- python-snappy=0.6.0
- pyxlsb=1.0.8
- s3fs=2021.08.0
- s3fs=0.4.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dependency was bumped in 1.5 to a newer version: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.5.0.html#backwards-incompatible-api-changes

So we shouldn't be testing an older version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was causing build issues that's why I pinned it at 0.4.0 #49261 (comment).

What is the verdict on what version we want to put in here for s3fs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JMBurley cc for visibility

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @kostyafarber . I'm not sure if pandas can be built successfully with s3fs>=0.4.0 (and other packages active that require boto3), and I know that pandas cannot be packaged into larger production environments when it enforces s3fs>=0.4.0 because professional codebases need other boto3 imports.

However, I'm not a pandas maintainer so I don't get to make an assertion on what to pin s3fs to. @mroeschke controls the initial take on this. I'd make the pitch for not advancing the s3fs version pin, because I do not know of any benefits we need (could be a lack of awareness on my part) and I do know of major python ecosystem headaches from advancing the pin.

My team pins s3fs<=0.4.0 versions internally for this exact reason. Ditto my colleagues controlling data science at other companies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JMBurley - could you please clarify about

I'm not sure if pandas can be built successfully with s3fs>=0.4.0

? The environment.yml file has a later version of s3fs, and building pandas with the environment.yml file works fine

- s3fs>=2021.08.0

- scipy=1.7.1
- sqlalchemy=1.4.16
- tabulate=0.8.9
Expand All @@ -54,3 +55,6 @@ dependencies:
- xlrd=2.0.1
- xlsxwriter=1.4.3
- zstandard=0.15.2

- pip:
- pyqt5==5.15.1
4 changes: 3 additions & 1 deletion pandas/compat/_optional.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@
"pyreadstat": "1.1.2",
"pytest": "6.0",
"pyxlsb": "1.0.8",
"s3fs": "2021.08.0",
"s3fs": "0.4.0",
"scipy": "1.7.1",
"snappy": "0.6.0",
"sqlalchemy": "1.4.16",
Expand All @@ -46,6 +46,8 @@
"xlsxwriter": "1.4.3",
"zstandard": "0.15.2",
"tzdata": "2022.1",
"qtpy": "2.2.0",
"pyqt5": "5.15.1",
}

# A mapping from import name to package name (on PyPI) for packages where
Expand Down
64 changes: 57 additions & 7 deletions scripts/validate_min_versions_in_sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

ci/deps/actions-.*-minimum_versions.yaml
pandas/compat/_optional.py
setup.cfg

TODO: doc/source/getting_started/install.rst

Expand All @@ -13,6 +14,7 @@
"""
from __future__ import annotations

import configparser
import pathlib
import sys

Expand All @@ -21,6 +23,7 @@
pathlib.Path("ci/deps").absolute().glob("actions-*-minimum_versions.yaml")
)
CODE_PATH = pathlib.Path("pandas/compat/_optional.py").resolve()
SETUP_PATH = pathlib.Path("setup.cfg").resolve()
EXCLUDE_DEPS = {"tzdata"}
# pandas package is not available
# in pre-commit environment
Expand Down Expand Up @@ -57,29 +60,76 @@ def get_versions_from_ci(content: list[str]) -> tuple[dict[str, str], dict[str,
seen_required = True
elif "# optional dependencies" in line:
seen_optional = True
elif "pip" in line:
kostyafarber marked this conversation as resolved.
Show resolved Hide resolved
continue
elif seen_required and line.strip():
package, version = line.strip().split("=")
if "==" in line:
package, version = line.strip().split("==")

else:
package, version = line.strip().split("=")
package = package[2:]
if package in EXCLUDE_DEPS:
continue
if not seen_optional:
required_deps[package] = version
required_deps[package.casefold()] = version
else:
optional_deps[package] = version
optional_deps[package.casefold()] = version
return required_deps, optional_deps


def get_versions_from_setup() -> dict[str, str]:
install_map = _optional.INSTALL_MAPPING
optional_dependencies = {}

parser = configparser.ConfigParser()
parser.read(SETUP_PATH)
setup_optional = parser["options.extras_require"]["all"]
dependencies = setup_optional[1:].split("\n")

# remove test dependencies
test = parser["options.extras_require"]["test"]
test_dependencies = set(test[1:].split("\n"))
dependencies = [
package for package in dependencies if package not in test_dependencies
]

for dependency in dependencies:
package, version = dependency.strip().split(">=")
optional_dependencies[install_map.get(package, package).casefold()] = version

for item in EXCLUDE_DEPS:
optional_dependencies.pop(item)

return optional_dependencies


def main():
with open(CI_PATH, encoding="utf-8") as f:
_, ci_optional = get_versions_from_ci(f.readlines())
code_optional = get_versions_from_code()
diff = set(ci_optional.items()).symmetric_difference(code_optional.items())
setup_optional = get_versions_from_setup()

diff = (ci_optional.items() | code_optional.items() | setup_optional.items()) - (
ci_optional.items() & code_optional.items() & setup_optional.items()
)

if diff:
sys.stdout.write(
packages = {package for package, _ in diff}
out = sys.stdout
out.write(
f"The follow minimum version differences were found between "
f"{CI_PATH} and {CODE_PATH}. Please ensure these are aligned: "
f"{diff}\n"
f"{CI_PATH}, {CODE_PATH} AND {SETUP_PATH}. "
f"Please ensure these are aligned: \n\n"
)

for package in packages:
out.write(
f"{package}\n"
f"{CI_PATH}: {ci_optional.get(package, 'Not specified')}\n"
f"{CODE_PATH}: {code_optional.get(package, 'Not specified')}\n"
f"{SETUP_PATH}: {setup_optional.get(package, 'Not specified')}\n\n"
)
sys.exit(1)
sys.exit(0)

Expand Down
26 changes: 12 additions & 14 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -60,20 +60,19 @@ test =
# see: doc/source/getting_started/install.rst
performance =
bottleneck>=1.3.2
numba>=0.53.0
numba>=0.53.1
numexpr>=2.7.1
timezone =
tzdata>=2022.1
computation =
scipy>=1.7.1
xarray>=0.19.0
fss =
fsspec>=2021.7.0
fsspec>=2021.07.0
aws =
boto3>=1.22.7
s3fs>=0.4.0
gcp =
gcsfs>=2021.05.0
gcsfs>=2021.07.0
pandas-gbq>=0.15.0
excel =
odfpy>=1.4.1
Expand Down Expand Up @@ -105,7 +104,7 @@ html =
xml =
lxml>=4.6.3
plot =
matplotlib>=3.3.2
matplotlib>=3.6.1
Comment on lines -108 to +107
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - I think older versions of matplotlib were failing the CI checks, so it's good you've caught this

Copy link
Contributor Author

@kostyafarber kostyafarber Nov 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Only caught it because of the script 😄

Thanks for reviewing and chasing up.

output_formatting =
jinja2>=3.0.0
tabulate>=0.8.9
Expand All @@ -123,19 +122,18 @@ compression =
all =
beautifulsoup4>=4.9.3
blosc>=1.21.0
bottleneck>=1.3.1
boto3>=1.22.7
bottleneck>=1.3.2
brotlipy>=0.7.0
fastparquet>=0.4.0
fsspec>=2021.7.0
gcsfs>=2021.05.0
fastparquet>=0.6.3
fsspec>=2021.07.0
gcsfs>=2021.07.0
html5lib>=1.1
hypothesis>=5.5.3
hypothesis>=6.13.0
jinja2>=3.0.0
lxml>=4.6.3
matplotlib>=3.3.2
numba>=0.53.0
numexpr>=2.7.1
matplotlib>=3.6.1
numba>=0.53.1
numexpr>=2.7.3
odfpy>=1.4.1
openpyxl>=3.0.7
pandas-gbq>=0.15.0
Expand Down