[SPARK-46020][INFRA] Add Python 3.12 to Infra docker image
### What changes were proposed in this pull request?

This PR aims to add `Python 3.12` to Infra docker images.

Note that `Python 3.12` has a breaking change in the installation.
- The `distutils` module is removed in Python 3.12 via [PEP-632](https://peps.python.org/pep-0632) in favor of the `packaging` package.
- Apache Spark 4.0.0 is ready for Python 3.12 via SPARK-45390, which removed the `distutils` usages.
    - #43192
- However, some third-party packages are not ready for Python 3.12 yet, so this PR skips those packages.
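
As a rough illustration of the `distutils` removal noted above, the following sketch probes whether the module can still be located on the running interpreter. It is not part of this PR, and `has_distutils` is a hypothetical helper name. Note that an installed `setuptools` may provide its own `distutils` shim, so the result depends on the environment and not only on the Python version.

```python
import importlib.util
import sys


def has_distutils() -> bool:
    """Return True if a `distutils` module can be located without importing it."""
    return importlib.util.find_spec("distutils") is not None


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}: distutils available = {has_distutils()}")
```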

### Why are the changes needed?

This PR prepares for adding a daily `Python 3.12` GitHub Action job for Apache Spark 4.0.0 later.

As of today, Apache Spark 4.0.0 has test coverage for Python 3.8 through Python 3.11.
- Python 3.9 (Main)
    - https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml
- PyPy3.8, Python 3.10, Python 3.11 (Daily)
    - https://github.com/apache/spark/actions/workflows/build_python.yml

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
$ docker run -it --rm ghcr.io/dongjoon-hyun/apache-spark-ci-image:master-6939290578 python3.12 --version
Python 3.12.0

$ docker run -it --rm ghcr.io/dongjoon-hyun/apache-spark-ci-image:master-6939290578 python3.12 -m pip freeze
alembic==1.12.1
blinker==1.7.0
certifi==2019.11.28
chardet==3.0.4
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==2.2.1
contourpy==1.2.0
coverage==7.3.2
cycler==0.12.1
databricks-cli==0.18.0
dbus-python==1.2.16
distro-info==0.23+ubuntu1.1
docker==6.1.3
entrypoints==0.4
et-xmlfile==1.1.0
Flask==3.0.0
fonttools==4.45.0
gitdb==4.0.11
GitPython==3.1.40
googleapis-common-protos==1.56.4
greenlet==3.0.1
gunicorn==21.2.0
idna==2.8
importlib-metadata==6.8.0
itsdangerous==2.1.2
Jinja2==3.1.2
joblib==1.3.2
kiwisolver==1.4.5
lxml==4.9.3
Mako==1.3.0
Markdown==3.5.1
MarkupSafe==2.1.3
matplotlib==3.8.2
mlflow==2.8.1
numpy==1.26.2
oauthlib==3.2.2
openpyxl==3.1.2
packaging==23.2
pandas==2.1.3
Pillow==10.1.0
plotly==5.18.0
protobuf==4.25.1
pyarrow==14.0.1
PyGObject==3.36.0
PyJWT==2.8.0
pyparsing==3.1.1
python-apt==2.0.1+ubuntu0.20.4.1
python-dateutil==2.8.2
pytz==2023.3.post1
PyYAML==6.0.1
querystring-parser==1.2.4
requests==2.31.0
requests-unixsocket==0.2.0
scikit-learn==1.3.2
scipy==1.11.4
setuptools==45.2.0
six==1.14.0
smmap==5.0.1
SQLAlchemy==2.0.23
sqlparse==0.4.4
tabulate==0.9.0
tenacity==8.2.3
threadpoolctl==3.2.0
typing_extensions==4.8.0
tzdata==2023.3
unattended-upgrades==0.1
unittest-xml-reporting==3.2.0
urllib3==2.1.0
websocket-client==1.6.4
Werkzeug==3.0.1
wheel==0.34.2
zipp==3.17.0
```
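
The `pip freeze` check above could be complemented by an import smoke test. The sketch below is an assumption, not something this PR runs: `missing_modules` is a hypothetical helper, and the module list simply mirrors the import names of the packages that the Dockerfile change installs for `python3.12`.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the packages installed for python3.12 in this change
    # (scikit-learn is imported as `sklearn`, unittest-xml-reporting as `xmlrunner`).
    expected = [
        "numpy", "pyarrow", "pandas", "scipy", "xmlrunner", "plotly",
        "mlflow", "coverage", "matplotlib", "openpyxl", "sklearn",
    ]
    missing = missing_modules(expected)
    print("missing:", missing or "none")
```

Only top-level names are used here because `find_spec` imports parent packages for dotted names and raises if the parent is absent.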

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43922 from dongjoon-hyun/SPARK-46020.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun committed Nov 21, 2023
1 parent df1280c commit 2293238
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions dev/infra/Dockerfile
@@ -127,3 +127,12 @@ RUN python3.11 -m pip install 'grpcio>=1.48,<1.57' 'grpcio-status>=1.48,<1.57' '
RUN python3.11 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
RUN python3.11 -m pip install torcheval
RUN python3.11 -m pip install deepspeed

# Install Python 3.12 at the last stage to avoid breaking the existing Python installations
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y \
    python3.12 python3.12-distutils \
    && rm -rf /var/lib/apt/lists/*
RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12
RUN python3.12 -m pip install numpy 'pyarrow>=14.0.0' 'pandas<=2.1.3' scipy unittest-xml-reporting 'plotly>=4.8' 'mlflow>=2.8.1' coverage matplotlib openpyxl 'scikit-learn>=1.3.2'
RUN python3.12 -m pip install 'protobuf==4.25.1' 'googleapis-common-protos==1.56.4'
