Make Spark 3.2 the default branch
EnricoMi committed Jan 19, 2022
1 parent d5d18de commit 312ecb3
Showing 6 changed files with 61 additions and 49 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -292,7 +292,7 @@ jobs:
SPARK_HOME: ${{ steps.params.outputs.home }}/spark
ARTIFACT_ID: ${{ steps.params.outputs.artifact-id }}
VERSION: ${{ steps.params.outputs.version }}
# There is no Spark 3.1 release of graphframes yet, so we use Spark 3.0 for now
# There is no Spark 3.2 release of graphframes yet, so we use Spark 3.0 for now
run: |
${SPARK_HOME}/bin/spark-submit --packages uk.co.gresearch.spark:${ARTIFACT_ID}:${VERSION},graphframes:graphframes:${{ steps.params.outputs.graphframes-version }}-spark3.0-s_2.12 --class uk.co.gresearch.spark.dgraph.connector.example.ExampleApp examples/scala/target/spark-dgraph-connector-examples_*.jar
shell: bash
4 changes: 2 additions & 2 deletions README.md
@@ -62,9 +62,9 @@ The connector is under continuous development. It has the following known limita

## Using Spark Dgraph Connector

The Spark Dgraph Connector is available for Spark 2.4, Spark 3.0 and Spark 3.1, all with Scala 2.12.
The Spark Dgraph Connector is available for Spark 2.4, Spark 3.0, Spark 3.1 and Spark 3.2, all with Scala 2.12.
Use Maven artifact ID `spark-dgraph-connector_2.12`. The Spark version is part of the package version,
i.e. 0.7.0-2.4, 0.7.0-3.0 and 0.7.0-3.1, respectively.
i.e. 0.7.0-2.4, 0.7.0-3.0, 0.7.0-3.1 and 0.7.0-3.2, respectively.
Minor versions are kept in sync between those two packages,
such that identical minor versions contain identical feature sets
(where supported by the respective Spark version).
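
As a rough illustration of the versioning scheme described in this README hunk, the following shell sketch composes the Maven coordinate that could be passed to `spark-submit --packages`. The group ID `uk.co.gresearch.spark` is taken from the CI workflow in this commit; the function name `connector_coordinate` is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not part of the repository): compose the Maven
# coordinate of the connector from the connector version and the Spark version.
connector_coordinate() {
  connector_version="$1"  # e.g. 0.7.0
  spark_version="$2"      # e.g. 3.2
  echo "uk.co.gresearch.spark:spark-dgraph-connector_2.12:${connector_version}-${spark_version}"
}

# Example:
#   connector_coordinate 0.7.0 3.2
#   -> uk.co.gresearch.spark:spark-dgraph-connector_2.12:0.7.0-3.2
```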
77 changes: 39 additions & 38 deletions RELEASE.md
@@ -8,69 +8,70 @@ These steps are codified by these release scripts (run in this order):

## Releasing from default branch

Follow this procedure to release a new version from default branch `spark-3.1`:
Follow this procedure to release a new version from default branch `spark-3.2`:

- Add a new entry to `CHANGELOG.md` listing all notable changes of this release.
Use the heading `## [VERSION] - YYYY-MM-dd`, e.g. `## [1.1.0] - 2020-06-09`.
No need to mention the branch as `CHANGELOG.md` is branch-specific.
- Remove the `-SNAPSHOT` suffix from `<version>` in the [`pom.xml`](pom.xml)
and [`examples/scala/pom.xml`](examples/scala/pom.xml) files, e.g. `1.1.0-3.0-SNAPSHOT` → `1.1.0-3.0`.
and [`examples/scala/pom.xml`](examples/scala/pom.xml) files, e.g. `1.1.0-3.2-SNAPSHOT` → `1.1.0-3.2`.
- Update the versions in the `README.md` file to the version of your `pom.xml` to reflect the latest version,
e.g. replace all `1.0.0-3.0` with `1.1.0-3.0`.
e.g. replace all `1.0.0-3.2` with `1.1.0-3.2`.
All these changes should occur in the `Using Spark Dgraph Connector` section and its subsections.
- Commit the change to your local git repository, use a commit message like `Releasing 1.1.0`. Do not push to github yet.
- Tag that commit with a version tag like `v1.1.0_spark-3.0` and message like `Release v1.1.0`. Do not push to github yet.
- Tag that commit with a version tag like `v1.1.0_spark-3.2` and message like `Release v1.1.0`. Do not push to github yet.
- Release the version with `mvn clean deploy -Dsign`. This will be put into a staging repository and not automatically released (due to `<autoReleaseAfterClose>false</autoReleaseAfterClose>` in your [`pom.xml`](pom.xml) file).
- Inspect and test the staged version. Use `examples/scala/` for that. If you are happy with everything:
- Push the commit and tag to origin.
- Release the package with `mvn nexus-staging:release`.
- Bump the version to the next [minor version](https://semver.org/) in `pom.xml` and `examples/scala/pom.xml`,
and append the `-SNAPSHOT` suffix again, e.g. `1.1.0-3.0` → `1.2.0-3.0-SNAPSHOT`.
and append the `-SNAPSHOT` suffix again, e.g. `1.1.0-3.2` → `1.2.0-3.2-SNAPSHOT`.
- Commit this change to your local git repository, use a commit message like `Post-release version bump to 1.2.0`.
- Push all local commits to origin.
- Otherwise drop it with `mvn nexus-staging:drop`. Remove the last two commits from your local history.
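
The version strings used in the steps above follow a regular pattern, which can be sketched with plain shell parameter expansion. The helper names `release_version`, `release_tag` and `next_minor_snapshot` are hypothetical; they are not the release scripts of this repository, only an illustration that assumes versions of the form `MAJOR.MINOR.PATCH-SPARK[-SNAPSHOT]`.

```shell
# Hypothetical helpers (not part of the repository) that mirror the version
# strings in the release steps. Assumed input form: MAJOR.MINOR.PATCH-SPARK[-SNAPSHOT].

release_version() {
  # 1.1.0-3.2-SNAPSHOT -> 1.1.0-3.2
  echo "${1%-SNAPSHOT}"
}

release_tag() {
  # Expects a release version without -SNAPSHOT: 1.1.0-3.2 -> v1.1.0_spark-3.2
  version="${1%%-*}"  # part before the first '-', e.g. 1.1.0
  spark="${1#*-}"     # part after the first '-', e.g. 3.2
  echo "v${version}_spark-${spark}"
}

next_minor_snapshot() {
  # Expects a release version: 1.1.0-3.2 -> 1.2.0-3.2-SNAPSHOT
  version="${1%%-*}"
  spark="${1#*-}"
  major="${version%%.*}"  # e.g. 1
  rest="${version#*.}"    # e.g. 1.0
  minor="${rest%%.*}"     # e.g. 1
  echo "${major}.$((minor + 1)).0-${spark}-SNAPSHOT"
}
```

For example, `release_tag "$(release_version 1.1.0-3.2-SNAPSHOT)"` yields `v1.1.0_spark-3.2`, matching the tag naming used in the procedure.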

After successfully releasing default branch `spark-3.1`, consider releasing backport branches like `spark-2.4` and `spark-3.0`.
After successfully releasing default branch `spark-3.2`, consider releasing backport branches like `spark-3.0` and `spark-3.1`.

## Releasing from backport branches

Compare the backport branch (e.g. `spark-2.4`) against the default branch `spark-3.1`. Use the following script:
Compare the backport branch (e.g. `spark-3.0`) against the default branch `spark-3.2`. Use the following script:

./git-compare-logs-2.4.sh
./git-compare-logs-3.0.sh

This script shows the commits that are in one branch but not the other, together with all commits
since `spark-2.4` branched off `spark-3.1`. It is important not to change the commit message when
since `spark-3.0` branched off `spark-3.2`. It is important not to change the commit message when
merging (cherry-pick) one commit from one branch to the other and not to enhance a commit.

Once the backport branch is in sync (feature wise) with the default branch, repeat above release process
for the backport branch with the same version, e.g. version `1.1.0-2.4` and tag `v1.1.0_spark-2.4`.
for the backport branch with the same version, e.g. version `1.1.0-3.0` and tag `v1.1.0_spark-3.0`.
Minor versions need to be in sync across all release branches (same feature set), whereas patch versions may differ.

### Backporting commits from default branch

For each commit that you want to backport (here to `spark-2.4`) perform the following steps:
For each commit that you want to backport (here to `spark-3.0`) perform the following steps:

Identify the commit that you want to backport:

===== missing commits =====
spark-2.4: Upgrade to Spark 3.1
spark-3.1: Add support for GraphFrames
spark-3.1: Make model parameter implicit to avoid merge conflicts
===== missing commits =====
spark-3.0: Upgrade dgraph4j to v21.12.0 (#158)
spark-3.0: Upgrade Spark to 3.1.2 (#132)
spark-3.1: Upgrade Spark to 3.0.3 (#134)

===== spark-3.1 =====
* 6c23950 - Upgrade to Spark 3.1 (HEAD -> spark-3.1) (5 minutes ago)
* 19b5a8a - Activate Spark 3.1 integration tests (origin/spark-3.0, spark-3.0) (27 minutes ago)
* 6121267 - Test all 20.11.x version and integration test latest 20.11.x as well (#81) (5 hours ago)
===== spark-3.1 =====
* 7cd46e9 - Upgrade dgraph4j to v21.12.0 (#158) (4 weeks ago)
* 0a8feac - Support dgraph v21.12 (#147) (7 weeks ago)
* 065c2bc - Post-release version bump to 0.8.0 (4 months ago)
* b141836 - Releasing 0.7.0 (tag: v0.7.0_spark-3.1) (4 months ago)

===== spark-2.4 =====
* 3c07c51 - Activate Spark 3.1 integration tests (origin/spark-2.4, spark-2.4) (27 minutes ago)
* 5626b37 - Test all 20.11.x version and integration test latest 20.11.x as well (#81) (5 hours ago)
* 090c26a - Upgrade dependencies, add gson dependency (#84) (30 hours ago)
===== spark-3.0 =====
* ea0cb2d - Support dgraph v21.12 (#147) (86 minutes ago)
* ae56e77 - Post-release version bump to 0.8.0 (4 months ago)
* 8d63fe6 - Releasing 0.7.0 (tag: v0.7.0_spark-3.0) (4 months ago)

As an example, we want to backport commit `6c23950` to `spark-2.4`.
As an example, we want to backport commit `7cd46e9` to `spark-3.0`.

git checkout spark-2.4
git cherry-pick 6c23950
git checkout spark-3.0
git cherry-pick 7cd46e9
# resolve any conflicts
# add resolved files with 'git add'
# finish with 'git cherry-pick --continue'
@@ -84,20 +85,20 @@ Finally, push your changes to origin.

## Releasing a bug-fix version

A bug-fix version needs to be released from a [minor-version branch](https://semver.org/), e.g. `spark-3.1_v1.1`.
A bug-fix version needs to be released from a [minor-version branch](https://semver.org/), e.g. `spark-3.2_v1.1`.

### Create a bug-fix branch

If there is no bug-fix branch yet, create it:

- Create such a branch from the respective [minor-version tag](https://semver.org/), e.g. create minor version branch `spark-3.1_v1.1` from tag `v1.1.0_spark-3.1`.
- Bump the version to the next [patch version](https://semver.org/) in `pom.xml` and append the `-SNAPSHOT` suffix again, e.g. `1.1.0-3.1` → `1.1.1-3.1-SNAPSHOT`.
- Create such a branch from the respective [minor-version tag](https://semver.org/), e.g. create minor version branch `spark-3.2_v1.1` from tag `v1.1.0_spark-3.2`.
- Bump the version to the next [patch version](https://semver.org/) in `pom.xml` and append the `-SNAPSHOT` suffix again, e.g. `1.1.0-3.2` → `1.1.1-3.2-SNAPSHOT`.
- Commit this change to your local git repository, use a commit message like `Post-release version bump to 1.1.1`.
- Push this commit to origin.
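
The naming convention for bug-fix branches and patch versions in the steps above can be sketched as follows. The helpers `bugfix_branch` and `next_patch_snapshot` are hypothetical names introduced for illustration, assuming tags of the form `vMAJOR.MINOR.PATCH_spark-X.Y` and release versions of the form `MAJOR.MINOR.PATCH-X.Y`.

```shell
# Hypothetical helpers (not part of the repository).

bugfix_branch() {
  # v1.1.0_spark-3.2 -> spark-3.2_v1.1
  tag="$1"
  version="${tag#v}"          # 1.1.0_spark-3.2
  version="${version%%_*}"    # 1.1.0
  branch="${tag#*_}"          # spark-3.2
  echo "${branch}_v${version%.*}"
}

next_patch_snapshot() {
  # 1.1.0-3.2 -> 1.1.1-3.2-SNAPSHOT
  version="${1%%-*}"          # 1.1.0
  spark="${1#*-}"             # 3.2
  patch="${version##*.}"      # 0
  echo "${version%.*}.$((patch + 1))-${spark}-SNAPSHOT"
}

# Possible use when creating the branch from the minor-version tag:
#   git checkout -b "$(bugfix_branch v1.1.0_spark-3.2)" v1.1.0_spark-3.2
```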

Merge your bug fixes into this branch as you would normally do for the default branch `spark-3.1`, use PRs for that.
Merge your bug fixes into this branch as you would normally do for the default branch `spark-3.2`, use PRs for that.

Remember to also merge the bug-fix into the default branch `spark-3.1`. Also consider [backporting the bug-fix](#backporting-commits-from-default-branch)
Remember to also merge the bug-fix into the default branch `spark-3.2`. Also consider [backporting the bug-fix](#backporting-commits-from-default-branch)
(e.g. into `spark-2.4`) and releasing backport bug-fix releases.

### Release from a bug-fix branch
@@ -109,28 +110,28 @@ but the version increment occurs on [patch level](https://semver.org/):
Use the heading `## [VERSION] - YYYY-MM-dd`, e.g. `## [1.1.1] - 2020-06-09`.
No need to mention the branch as `CHANGELOG.md` is branch-specific.
- Remove the `-SNAPSHOT` suffix from `<version>` in the [`pom.xml`](pom.xml)
and [`examples/scala/pom.xml`](examples/scala/pom.xml) files, e.g. `1.1.1-3.1-SNAPSHOT` → `1.1.1-3.1`.
and [`examples/scala/pom.xml`](examples/scala/pom.xml) files, e.g. `1.1.1-3.2-SNAPSHOT` → `1.1.1-3.2`.
- Update the versions in the `README.md` file to the version of your `pom.xml` to reflect the latest version,
e.g. replace all `1.1.0-3.1` with `1.1.1-3.1`, respectively.
e.g. replace all `1.1.0-3.2` with `1.1.1-3.2`, respectively.
All these changes should occur in the `Using Spark Dgraph Connector` section and its subsections.
- Commit the change to your local git repository, use a commit message like `Releasing 1.1.1`. Do not push to github yet.
- Tag that commit with a version tag like `v1.1.1_spark-3.1` and message like `Release v1.1.1`. Do not push to github yet.
- Tag that commit with a version tag like `v1.1.1_spark-3.2` and message like `Release v1.1.1`. Do not push to github yet.
- Release the version with `mvn clean deploy -Dsign`. This will be put into a staging repository and not automatically released (due to `<autoReleaseAfterClose>false</autoReleaseAfterClose>` in your [`pom.xml`](pom.xml) file).
- Inspect and test the staged version. Use `examples/scala/` for that. If you are happy with everything:
- Push the commit and tag to origin.
- Release the package with `mvn nexus-staging:release`.
- Bump the version to the next [patch version](https://semver.org/) in `pom.xml` and `examples/scala/pom.xml`,
and append the `-SNAPSHOT` suffix again, e.g. `1.1.1-3.1` → `1.1.2-3.1-SNAPSHOT`.
and append the `-SNAPSHOT` suffix again, e.g. `1.1.1-3.2` → `1.1.2-3.2-SNAPSHOT`.
- Commit this change to your local git repository, use a commit message like `Post-release version bump to 1.1.2`.
- Push all local commits to origin.
- Otherwise drop it with `mvn nexus-staging:drop`. Remove the last two commits from your local history.

After successfully releasing from a `spark-3.1` bug-fix branch, merge the bug-fixes into other bug-fix branches like `spark-2.4_v1.1`.
After successfully releasing from a `spark-3.2` bug-fix branch, merge the bug-fixes into other bug-fix branches like `spark-2.4_v1.1`.
Repeat above release process for those branches with the same versions, e.g. version `1.1.1-2.4` and tag `v1.1.1_spark-2.4`.

## Git cheat sheet

git commit -a -m "Releasing 1.1.0"
git tag -a v1.1.0_spark-3.1 -m "Release v1.1.0"
git push origin spark-3.1 v1.1.0_spark-3.1
git tag -a v1.1.0_spark-3.2 -m "Release v1.1.0"
git push origin spark-3.2 v1.1.0_spark-3.2
git commit -a -m "Post-release version bump to 1.2.0"
8 changes: 4 additions & 4 deletions git-compare-logs-2.4.sh
@@ -1,11 +1,11 @@
echo "===== missing commits ====="
diff <(git log --pretty=format:'%s' spark-2.4..spark-3.1; echo) <(git log --pretty=format:'%s' spark-3.1..spark-2.4; echo) | grep "^[<>]" | sed -e "s/^</spark-2.4:/" -e "s/^>/spark-3.1:/"
diff <(git log --pretty=format:'%s' spark-2.4..spark-3.2; echo) <(git log --pretty=format:'%s' spark-3.2..spark-2.4; echo) | grep "^[<>]" | sed -e "s/^</spark-2.4:/" -e "s/^>/spark-3.2:/"

echo
echo "===== spark-3.1 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-2.4..spark-3.1
echo "===== spark-3.2 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-2.4..spark-3.2

echo
echo "===== spark-2.4 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.1..spark-2.4
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.2..spark-2.4

8 changes: 4 additions & 4 deletions git-compare-logs-3.0.sh
@@ -1,11 +1,11 @@
echo "===== missing commits ====="
diff <(git log --pretty=format:'%s' spark-3.0..spark-3.1; echo) <(git log --pretty=format:'%s' spark-3.1..spark-3.0; echo) | grep "^[<>]" | sed -e "s/^</spark-3.0:/" -e "s/^>/spark-3.1:/"
diff <(git log --pretty=format:'%s' spark-3.0..spark-3.2; echo) <(git log --pretty=format:'%s' spark-3.2..spark-3.0; echo) | grep "^[<>]" | sed -e "s/^</spark-3.0:/" -e "s/^>/spark-3.2:/"

echo
echo "===== spark-3.1 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.0..spark-3.1
echo "===== spark-3.2 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.0..spark-3.2

echo
echo "===== spark-3.0 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.1..spark-3.0
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.2..spark-3.0

11 changes: 11 additions & 0 deletions git-compare-logs-3.1.sh
@@ -0,0 +1,11 @@
echo "===== missing commits ====="
diff <(git log --pretty=format:'%s' spark-3.1..spark-3.2; echo) <(git log --pretty=format:'%s' spark-3.2..spark-3.1; echo) | grep "^[<>]" | sed -e "s/^</spark-3.1:/" -e "s/^>/spark-3.2:/"

echo
echo "===== spark-3.2 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.1..spark-3.2

echo
echo "===== spark-3.1 ====="
git log --max-count=10 --graph --pretty=format:'%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset' --abbrev-commit --date=relative spark-3.2..spark-3.1
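
The three `git-compare-logs-*.sh` scripts differ only in the branch names, so the same logic could be expressed once with the branches passed as arguments. The following is a minimal sketch under that assumption; the function `compare_logs` is hypothetical and not part of this commit. It writes the two subject lists to temporary files rather than using process substitution, so that it also runs under a plain POSIX `sh`, and it assumes branch names contain no `/` or other characters special to `sed`.

```shell
# Hypothetical parameterised variant of the git-compare-logs-*.sh scripts.
# Usage: compare_logs BACKPORT_BRANCH DEFAULT_BRANCH
compare_logs() {
  backport="$1"
  default="$2"
  format='%Cred%h%Creset - %s%C(yellow)%d%Creset %Cgreen(%cr)%Creset'

  # Subjects of the commits that are reachable from only one of the two branches.
  in_default=$(mktemp)
  in_backport=$(mktemp)
  { git log --pretty=format:'%s' "${backport}..${default}"; echo; } > "$in_default"
  { git log --pretty=format:'%s' "${default}..${backport}"; echo; } > "$in_backport"

  echo "===== missing commits ====="
  # Lines marked '<' exist only on the default branch and are labelled with the
  # backport branch, which is where they are missing (and vice versa for '>').
  diff "$in_default" "$in_backport" | grep "^[<>]" \
    | sed -e "s/^</${backport}:/" -e "s/^>/${default}:/"
  rm -f "$in_default" "$in_backport"

  echo
  echo "===== ${default} ====="
  git log --max-count=10 --graph --pretty=format:"$format" --abbrev-commit --date=relative "${backport}..${default}"

  echo
  echo "===== ${backport} ====="
  git log --max-count=10 --graph --pretty=format:"$format" --abbrev-commit --date=relative "${default}..${backport}"
}

# Example, run inside a clone of the repository:
#   compare_logs spark-3.0 spark-3.2
```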
