Migrate git-scm.com to a static site, generated via Hugo, served via GitHub Pages #1804

Merged
merged 245 commits into from
Sep 24, 2024

Conversation

dscho
Member

@dscho dscho commented Oct 16, 2023

Changes

This Pull Request adjusts the existing files such that the site is no longer served via a Rails App, but by GitHub Pages instead. A preview can be seen here: https://dscho.github.io/git-scm.com/ (which is generated and deployed from this Pull Request's branch, and will be updated via automation whenever that branch changes).

It is the culmination of a very long, and large, effort that started in February 2017 with the first attempt to migrate the site to Jekyll. Several years, and a substantial effort by @spraints, @vdye and myself, later, here is the result: No longer a Jekyll site but a Hugo site (because of render times: 20 minutes vs 30 seconds), search implemented using Pagefind, links verified by Lychee.

The main themes of the subsequent migration from the Rails App to a Hugo-generated static site are:

  • We move the original Rails App files that contain Rails code mixed into HTML to content/, where the files defining the pages live in the Hugo world, then modify them to drop the Rails code and replace it with Hugo constructs. More often than not, we separate the commits that move the files from the commits that adjust the contents, to help Git realize that there has been a move (as opposed to a delete/add; Git's rename detection does have its shortcomings). This makes it possible to notice upstream changes that need to be reflected in moved & modified files when rebasing onto upstream.

  • In Hugo setups, the files live in the following locations:

    • hugo.yml

      This is the central configuration file that tells Hugo how to render the site.

    • layouts/

      This is where the "boilerplate" is defined that ties the site together, i.e. the header, the footer and the sidebar as well as the main scaffolding in which the pages' content is to be rendered.

      This is the location where most of Hugo's functionality is available and complex stuff can happen such as looping or accessing site parameters.

    • layouts/partials/

      This directory contains recurring templates, i.e. reusable partial layouts that are used to structure the elements of the site. This includes the sidebar, how videos are rendered, etc.

    • layouts/shortcodes/

      This directory contains so-called "shortcodes", i.e. reusable elements similar to partial layouts. The major difference is that shortcodes can be used within content/ while partial layouts can only be used from within layouts/.

      See https://gohugo.io/content-management/shortcodes/ for more information on this topic.

    • content/

      This defines the content of the pages that are served. Only a subset of Hugo's functionality is available here (the idea is to leave the complicated stuff to the layout used to render the pages). These files have the extension .html but need to be processed using Hugo before becoming proper HTML pages. For example, most of these files begin with so-called front matter, i.e. metadata relevant to Hugo, specified using YAML that is enclosed in --- lines.

      To discern clearly between pages maintained in this repository vs HTML pages that are pre-generated using content from other repositories (such as the ProGit book and the manual pages), the pre-generated HTML pages are tracked in external/book/ and external/docs/, mapped via Hugo mounts. These files are not meant to be edited directly, and are clearly marked as such by a comment at the top of the files, inside the front matter. Instead, these files are intended to be updated via GitHub workflows whenever the external repositories change.

    • static/

      These files are not processed by Hugo, but copied as-is. Good for images, for example.

    • assets/

      These files are processed in specific ways. That is where the SASS-based style sheets live, for example.

    • data/

      These files define metadata that can be used in Hugo's functions. For example, it contains the list of documentation categories that are rendered in various ways, and the GUIs that are shown at https://git-scm.com/downloads/guis are defined there.

  • In contrast to most Hugo-managed sites, we will refrain from using a Hugo theme, and instead stick with the existing style sheets.

    Likewise, we refrain from using Markdown at all: The existing site did not use it, therefore it makes little sense to start using it now.

  • In addition to Hugo's directories, we also have these:

    • script/

      This directory contains scripts to perform recurring tasks such as pre-rendering Git's manual pages into HTML that are then stored inside external/docs/.

      For historical reasons, these are Ruby scripts for the most part, as it is easier to follow the development when that functionality is extracted from the current Rails App and turned into Ruby scripts that can be run stand-alone.

    • .github/workflows/ and .github/actions/

      The latter directory contains a file that defines a custom GitHub Action that compensates for the lack of Hugo support in GitHub Pages: By default, only Jekyll pages are supported out of the box, but Hugo sites require a custom GitHub workflow to deploy the site.

      The former directory contains files that define GitHub workflows that are typically run on a schedule, updating the various parts that are generated from external sources: the Git version, the ProGit Book, manual pages, etc. These workflows essentially keep the rendered HTML files in external/ up to date with the respective external repositories.

      These workflows can be seen in action (pun intended) here: https://github.com/dscho/git-scm.com/actions

    • external/book/

      It makes very, very little sense to render the ProGit book from scratch every time the site is deployed (and every time a PR build is run). To avoid that, one of the script/GitHub workflow pairs mentioned earlier populates and updates this directory with the latest version of the ProGit book.

      The subdirectories of external/book/ recapitulate Hugo's standard layout: content/, data/, static/, and Hugo mounts map them into the Hugo project. The only exception to this rule is sync/, which contains .sha files reflecting the tip commits of the ProGit book and its translations.

      Note: An alternative to this layout would have been to use submodules. However, the complexities, in particular in GitHub workflows, have been deemed not worth this approach and I opted for simplicity instead.

      Also note: The files in external/ are not meant to be edited directly, and are therefore clearly marked as such by a comment at the top of the files, inside the front matter. The comment indicates the script that was used to populate/update the content; this will hopefully direct contributors who are tempted to edit these generated files to the right place to make their changes.

    • external/docs/

      Like the book/ subdirectory, the docs/ subdirectory contains pre-rendered versions of Git's manual pages and their translations (which is particularly important here because rendering them from scratch would easily take 20 minutes), and it is populated and updated via scripts that are run in regularly-scheduled GitHub workflows.

      Just like external/book/sync/, the external/docs/sync/ directory contains .sha files whose contents reflect the tip commits of the external repositories.

      In addition, there is the external/docs/asciidoc/ directory which serves as a cache of "expanded AsciiDoc": many of Git's manual pages include content from other files, and therefore it is non-trivial to determine whether or not a manual page has changed and needs to be re-rendered (essentially, the only way is to expand them by inlining the included files and then comparing the contents). Caching this content speeds up updating the manual pages drastically.

  • Most of the core logic lives in layouts/. Hugo discerns between logic that is allowed in layouts/ and logic that is allowed in content/; the latter can only access so-called "shortcodes". These shortcodes are essentially snippets of Hugo pages and are free to use the entire set of Hugo's functionality.

    tl;dr whenever we need to do something complicated that is confined to only a few pages, we have to implement it in layouts/shortcodes/ and insert the corresponding {{< shortcode-name >}} in the page itself. Whenever we need to do something complicated that is used in more places, it is implemented elsewhere in layouts/.

  • Some of the logic that cannot be performed statically (such as telling the user how long ago the latest macOS installer was released, or adjusting the Windows downloads to reflect the CPU architecture indicated by the current user agent) is implemented using JavaScript instead.

  • The site search needs to move to the client side, as there is no longer a server that can perform that service. Luckily, Pagefind matured in the meantime (I have helped, too): it is a very performant client-side search solution implemented in JavaScript that makes use of a pre-computed, fine-grained search index that is loaded incrementally on demand.

  • In contrast to the Rails App, the static pages are easy to check for broken links. We use Lychee for that (which I helped to better support GitHub Pages).
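
To illustrate the "expanded AsciiDoc" cache described for external/docs/asciidoc/ above, here is a minimal sketch in Python (the actual scripts in script/ are Ruby, so this is not the code used by the site). It assumes a hypothetical in-memory mapping `files` from file name to contents and a hypothetical `cache` dictionary; the function names are made up for illustration.

```python
import re

# Matches AsciiDoc include directives such as: include::config.txt[]
INCLUDE_RE = re.compile(r"^include::([^\[\]]+)\[[^\]]*\]\s*$")


def expand(name, files, _seen=()):
    """Return the text of `name` with all include:: directives inlined.

    `files` is a hypothetical mapping of file name -> contents, standing in
    for a checkout of the documentation sources.
    """
    if name in _seen:
        raise ValueError(f"include cycle via {name}")
    out = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Recursively inline the included file in place of the directive.
            out.append(expand(match.group(1), files, _seen + (name,)))
        else:
            out.append(line)
    return "\n".join(out)


def needs_rerender(name, files, cache):
    """Compare the expanded text with the cached copy, then update the cache."""
    expanded = expand(name, files)
    changed = cache.get(name) != expanded
    cache[name] = expanded
    return changed
```

The point of the sketch is that a change in an included file is only detected after expansion, which is why comparing the top-level files alone would not be enough.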

Context

Changes required to finalize the migration in addition to this Pull Request

  • This Pull Request is not actually meant to be merged, not to the main branch at least, but to be pushed to the gh-pages branch which then should be made the default branch.

  • To successfully deploy to GitHub Pages, the Pages configuration was already switched from "Deploy from a branch" to "GitHub Actions":

    image

  • Once everything is golden in this Pull Request and the decision to move to GitHub Pages is final, git-scm.com needs to be pointed to GitHub Pages (read: CNAME needs to be configured to make use of the GitHub Pages-deployed site).

  • The Pull Request branch was actually pushed to gh-pages already, reflected by the preview that can be seen at https://git.github.io/git-scm.com/.

Why make these changes?

  • Heroku stopped their free tier and therefore https://git-scm.com/ has required sponsorship for a while, using funds that could be put to better use elsewhere. In the meantime, Heroku offered to sponsor Git again, but we now know that this can go away at any time without much prior warning.
  • Static sites are much easier to manage, and to develop. With this Pull Request, developing the site locally is as easy as checking out the repository and running hugo serve -w, then editing the files to your heart's content.
  • Easier debugging. For example, the page https://git-scm.com/docs/git-remote/fr has a typo in the synopsis: git remote renom is not the correct Git command. This page is supposedly generated from the git-html-l10n repository but the typo does not exist there. It is quite unclear where the bug is, seeing as https://dscho.github.io/git-scm.com/docs/git-remote/fr does not show the bug. I am still flummoxed how this bug could be fixed, as I haven't found the culprit despite investigating for multiple hours. This type of bug will be much easier to fix in the Hugo site than in the current Rails App, where this bug persists to this day.

@dscho dscho force-pushed the hugo branch 6 times, most recently from 1db01e4 to bd332cc Compare October 16, 2023 21:11
dscho added a commit to dscho/git-scm.com that referenced this pull request Oct 17, 2023
In the current effort to migrate https://git-scm.com/ to a static Hugo
site (see git#1804), we saw a bogus
tag that would confuse Hugo. We also saw a now-unused banner that we
probably do not want to bother migrating to Hugo.

So let's drop both.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the hugo branch 2 times, most recently from 4bd3b3f to 7c5e7c5 Compare October 17, 2023 12:00
@spraints
Contributor

🎉 This is great! Thank you so much for picking this up! The demo site looks great!

@bglw

bglw commented Oct 18, 2023

👋 Sneaking in here with some thoughts from the search side!

On first interactions, the search has some notable issues compared to the production rails search, for a few reasons on both sides of the fence.

  1. All tagged releases are indexed, so a search for rebase returns /docs/git-rebase/ and /docs/git-rebase/2.41.0/ and /docs/git-rebase/2.23.0/ and ...
    • The best fix here would be for you to omit the data-pagefind-body attribute from the numbered release pages, so that only /docs/git-rebase/ is indexed and returned
  2. Titles definitely need stronger affinity here. A search for list on the rails site returns rev-list-description, git-rev-list, and rev-list-options as the top results. Pagefind's search is significantly more varied, with a lot of results for mailing lists and related items.
  3. Typing rebase into the live search and hitting enter does not show the rebasing book result. Typing the query in does.
  4. The rails site live search has a nice Reference / Book split that would be great to recreate with filters, if possible.

(Amazing work migrating this to Hugo! ❤️)

@dscho dscho self-assigned this Oct 18, 2023
@dscho
Member Author

dscho commented Oct 18, 2023

Oh wow, Mr Pagefind himself! I'm honored to meet you, @bglw!

  • The best fix here would be for you to omit the data-pagefind-body attribute from the numbered release pages, so that only /docs/git-rebase/ is indexed and returned

I kind of wanted to be able to find stuff in old versions that is no longer present in current versions. That's why I added dscho@e9fa963.

  • Titles definitely need stronger affinity here. A search for list on the rails site returns rev-list-description, git-rev-list, and rev-list-options as the top results. Pagefind's search is significantly more varied, with a lot of results for mailing lists and related items.

Excellent!

Heh, thank you for that!

  • The rails site live search has a nice Reference / Book split that would be great to recreate with filters, if possible.

Right, I had not worked on that because I hoped that the sorting by relevance would be "good enough"...

@rimrul
Contributor

rimrul commented Oct 20, 2023

About Heroku

That is true, but there has been an update since that 2022 mail.

https://lore.kernel.org/git/ZRHTWaPthX%2FTETJz@nand.local/

Heroku has a new (?)
program for giving credits to open-source projects. The details are
below:

https://www.heroku.com/open-source-credit-program

I applied on behalf of the Git project on 2023-09-25, and will follow-up
on the list if/when we hear back from them.

It does seem like the PLC is still in favor of moving to a static solution, though.

https://lore.kernel.org/git/ZRrfAdX0eNutTSOy@nand.local/

  • Biggest expense is Heroku - Fusion has been covering the bill
  • Dan Moore from FusionAuth has been providing donations
  • Ideally we are able to move away from using Heroku, but in the meantime
    we'll have coverage either from (a) FusionAuth, or (b) Heroku's new
    open-source credit system

About the preview:

Search

All tagged releases are indexed, so a search for rebase returns /docs/git-rebase/ and /docs/git-rebase/2.41.0/ and /docs/git-rebase/2.23.0/ and ...

That is true. And in both the search results page as well as the little preview (<div id="search-results">) it's not visually obvious which result is the current version and which results are older versions. Maybe that could be improved by adding the version number to the page title for non-current versions? Or maybe a filter in the search results to exclude historical documentation?
If we don't want to mangle the titles, pagefind would show the version number below the result if we configured it as metadata.

Minor issues

There are some broken links in the preview on https://dscho.github.io/git-scm.com/docs/ that lead to https://dscho.github.io/docs/ <topic>

There's a broken link on https://dscho.github.io/git-scm.com/about/free-and-open-source/ to https://dscho.github.io/git-scm.com/trademark. On the live site that redirects from https://git-scm.com/trademark to https://git-scm.com/about/trademark (dscho#1)

The "Setup and Config" headline on https://dscho.github.io/git-scm.com/docs/ is blue in the preview, but not in the live site. This is not happening for me in local testing.

There's some redirect that swallows anchors. https://dscho.github.io/git-scm.com/docs/ links to https://dscho.github.io/git-scm.com/docs/git#_git_commands , which redirects to https://dscho.github.io/git-scm.com/docs/git/ instead of https://dscho.github.io/git-scm.com/docs/git/#_git_commands
Looks like the slash-free version isn't possible with the GitHub Pages/Hugo combination (gohugoio/hugo#492). We should update these links to contain the slash from the beginning to avoid the redirect. (dscho#3)
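
As a rough sketch of the link rewrite suggested here (putting the slash in front of the anchor so that no redirect is needed), something along these lines could work. It is written in Python purely for illustration; the function name and the list of file suffixes are assumptions, not part of the site's tooling.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of suffixes that mark a link to a file rather than a page.
FILE_SUFFIXES = (".html", ".css", ".js", ".png", ".ico")


def add_trailing_slash(url):
    """Add a trailing slash to the path while keeping query and fragment."""
    parts = urlsplit(url)
    path = parts.path
    # Leave paths alone that already end in a slash or that point to a file.
    if path and not path.endswith("/") and not path.endswith(FILE_SUFFIXES):
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

With this, /docs/git#_git_commands would become /docs/git/#_git_commands, so the anchor no longer has to survive a redirect.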

https://dscho.github.io/git-scm.com/downloads/mac/ has an odd grammar issue that https://git-scm.com/download/mac doesn't. (dscho#2) It says

which was released about 2 year, on 2021-08-30.

https://git-scm.com/download/mac correctly says

which was released about 2 years ago, on 2021-08-30.

Also note the slight URL change there from download to downloads. There is a redirect for that, though, so that should be fine.
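
For reference, the expected wording (plural unit plus "ago") can be sketched like this. On the site this logic runs client-side in JavaScript; the Python below only illustrates the pluralization, with a made-up function name and a deliberately crude approximation of month and year lengths.

```python
from datetime import date


def released_ago(released, today):
    """Describe how long ago `released` was, relative to `today`."""
    days = (today - released).days
    if days < 1:
        return "today"
    # Crude unit lengths; good enough for an "about N units ago" phrase.
    for unit, length in (("year", 365), ("month", 30), ("day", 1)):
        count = days // length
        if count >= 1:
            plural = "" if count == 1 else "s"
            return f"about {count} {unit}{plural} ago"
```

For the release date quoted above, `released_ago(date(2021, 8, 30), date(2023, 10, 20))` yields "about 2 years ago".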

@rimrul
Contributor

rimrul commented Oct 20, 2023

One additional note: There is a commit about porting the old 404 page, 18a3ac2, but I've only seen the generic GitHub pages 404 page on the preview in my testing.

@rimrul
Contributor

rimrul commented Oct 21, 2023

Switching to pagefind also changed search behaviour in another way.

The rails site always searches the English content. Pagefind defaults to what they call multilingual search, i.e. searching only pages in the same language as the one you're searching from. That's theoretically a usability improvement, but with the partial nature of our non-English content (availability of any given language can vary from man page to man page, the book exists in languages that don't have any man pages, everything else only exists in English), we might need a fallback to English here. Pagefind offers an option to force all pages to be indexed as English, but I think we can slightly abuse mergeIndex with language set to en for a better result.
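
The intended behaviour (results in the current language first, English as a fallback) could look roughly like the following toy model. This is not Pagefind's API; `indexes` is a hypothetical mapping from language code to a term-to-URLs dictionary, used only to make the fallback order concrete.

```python
def search_with_fallback(query, indexes, language, fallback="en"):
    """Return hits for `language` first, then fallback-language hits."""
    results = list(indexes.get(language, {}).get(query, []))
    if language != fallback:
        for url in indexes.get(fallback, {}).get(query, []):
            # Avoid listing the same page twice.
            if url not in results:
                results.append(url)
    return results
```

A language that has no translated manual pages at all would then still return the English results instead of nothing.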

Contributor

@rimrul rimrul left a comment

Partial review. Only looked at the first 47 commits

dscho added a commit to dscho/git-scm.com that referenced this pull request Oct 24, 2023
This addresses that part of
git#1804 (comment):

	There are some broken links in the preview on
	https://dscho.github.io/git-scm.com/docs/ that lead to
	https://dscho.github.io/docs/ <topic>

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Member Author

dscho commented Oct 24, 2023

The "Setup and Config" headline on https://dscho.github.io/git-scm.com/docs/ is blue in the preview, but not in the live site. This is not happening for me in local testing.

I managed to fix it via 2d0f6c8

@dscho
Member Author

dscho commented Oct 24, 2023

All tagged releases are indexed, so a search for rebase returns /docs/git-rebase/ and /docs/git-rebase/2.41.0/ and /docs/git-rebase/2.23.0/ and ...

That is true. And in both the search results page as well as the little preview (<div id="search-results">) it's not visually obvious which result is the current version and which results are older versions.

Hmm. The more I think about it, the more I get convinced that the older versions of the manual pages should be excluded from the search. I thought it was a feature, but it looks as if it incurs more downsides than upsides.
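
One way to decide which pages would keep the data-pagefind-body attribute is to look at the URL path. The sketch below assumes that versioned manual pages always live at /docs/<page>/<version>/ with a dotted numeric version, as in the examples quoted above; the regular expression and function name are illustrative only.

```python
import re

# /docs/<page>/<version>/ where <version> looks like 2.41.0
# (the trailing slash is optional).
VERSIONED_DOC_RE = re.compile(r"^/docs/[^/]+/\d+(?:\.\d+)+/?$")


def should_index(path):
    """Return False for manual pages of a specific (older) Git version."""
    return VERSIONED_DOC_RE.match(path) is None
```

Under that assumption, /docs/git-rebase/ stays searchable while /docs/git-rebase/2.41.0/ does not.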

@pedrorijo91
Member

this was a major effort @dscho, thank you very much! sorry for the silence, but i've been busy with other stuff. in the meanwhile, and to ensure this effort won't be wasted, can you summarize what do you need to make this merge-ready?

what do you still need to tackle? where do you need help from other people? :)

@dscho
Member Author

dscho commented Nov 6, 2023

can you summarize what do you need to make this merge-ready?

@pedrorijo91 Yes.

  • The search needs some love:
    • exclude the manual pages of previous versions from the search instead of trying to demote them; it's just too confusing
    • in the "live search" (i.e. when typing in the search box on any page other than the search results page), we will want to reinstate the "Reference"/"Book" separation of the search results. I'm currently unsure how we can accomplish that.
  • to make the URLs nicer by having no trailing slash (just like the existing Rails App), we will need to uglify the URLs.
  • general QA:
    • ensure that current URLs would work after migration
      • e.g. /about#branching-and-merging, /about#staging-area etc
    • add test -z "$(git grep "\\(href\|src\) *= *[\"']/")" to CI
  • rebase to the latest main

The big blocker is the "live search" one.
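
The CI check in the list above looks for href/src attributes that start with a slash, which break when the site is served from a sub-path such as /git-scm.com/. The same idea can be sketched in Python as follows; `files` is a hypothetical mapping of file name to contents, whereas the real check would simply run git grep over the tracked files.

```python
import re

# Same pattern as the git grep call: href or src, optional spaces around "=",
# then a quote followed directly by a slash.
ABSOLUTE_LINK_RE = re.compile(r"""(?:href|src) *= *["']/""")


def find_absolute_links(files):
    """Return (name, line number, line) for every root-relative href/src."""
    hits = []
    for name, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if ABSOLUTE_LINK_RE.search(line):
                hits.append((name, number, line.strip()))
    return hits
```

An empty result corresponds to the `test -z` condition succeeding.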

@dscho
Member Author

dscho commented Nov 6, 2023

Oh, and there's a ton of work still needed to address @rimrul's excellent feedback.

Updated via the `update-download-data.yml` GitHub workflow.
@ttaylorr
Member

And with this, everything is ready on my side to try again to switch over the DNS entries (this time, no Cloudflare cache eviction should be necessary, even!). @ttaylorr would you kindly give it a try?

Sorry for the delay on my end, I just had a chance to make the cut-over and all appears to be working. I can load git-scm.com correctly on my end after purging caches and my local DNS cache, and it all appears to be working.

book.git-scm.com is currently broken (it just redirects to a GitHub Pages site that says "There's no GitHub Pages site here"), but I'm willing to live with that since I suspect very few folks are using the book.git-scm.com address. Hopefully the fix is straightforward-ish on your end @dscho! If you need anything let me know.

I think all that's left to do is announce the change to the mailing list and merge this branch, both of which @dscho should definitely have the honor of doing! ❤️

ttaylorr added 2 commits September 24, 2024 17:42
Updated via the `update-git-version-and-manual-pages.yml` GitHub workflow.
Updated via the `update-git-version-and-manual-pages.yml` GitHub workflow.
@dscho
Member Author

dscho commented Sep 24, 2024

book.git-scm.com is currently broken (it just redirects to a GitHub Pages site that says "There's no GitHub Pages site here")

@ttaylorr I believe that this needs a CNAME entry for the book subdomain that points to git.github.io (I have something similar for www.gitforwindows.org that points to git-for-windows.github.io).

@dscho dscho merged commit d09d33d into git:main Sep 24, 2024
3 checks passed
@ttaylorr ttaylorr temporarily deployed to git-scm September 24, 2024 21:08 Inactive
@ttaylorr
Member

@ttaylorr I believe that this needs a CNAME entry for the book subdomain that points to git.github.io

@dscho: Thanks for the reference -- I just added a record there, though Cloudflare is taking a little while to propagate it. LMK when you're online tomorrow if you have time whether or not it works for you!

Also I noticed that you merged this into 'main', but we have 'gh-pages' as well. Which should be the default branch?

@dscho
Member Author

dscho commented Sep 24, 2024

Also I noticed that you merged this into 'main', but we have 'gh-pages' as well. Which should be the default branch?

I merged this into main only because that was the target branch of the PR, no other reason. I guess we can delete main now.

I'd like to keep gh-pages as the default branch because that name says "I am being deployed to GitHub Pages".

BTW I just disabled the two Heroku webhooks because we do not want the site to be deployed there anymore.

@ttaylorr
Member

Also I noticed that you merged this into 'main', but we have 'gh-pages' as well. Which should be the default branch?

I merged this into main only because that was the target branch of the PR, no other reason. I guess we can delete main now.

I'd like to keep gh-pages as the default branch because that name says "I am being deployed to GitHub Pages".

That all sounds great to me, thanks!

BTW I just disabled the two Heroku webhooks because we do not want the site to be deployed there anymore.

Thanks again. I'll spin down the account in the next few days, I just wanted to make sure it was still intact in case we had to do an emergency revert back to Heroku, which seems unlikely now. Thanks again for all of this great work 😍

@dscho
Member Author

dscho commented Sep 24, 2024

I'll spin down the account in the next few days, I just wanted to make sure it was still in-tact in case we had to do an emergency revert back to Heroku

I like that idea.

I think all that's left to do is announce the change to the mailing list and merge this branch, both of which @dscho should definitely have the honor of doing! ❤️

Hereby done: https://lore.kernel.org/git/c3e372f6-3035-9e6b-f464-f1feceacaa4b@gmx.de/T/#u

@dscho
Member Author

dscho commented Sep 25, 2024

I believe that this needs a CNAME entry for the book subdomain that points to git.github.io

Thanks for the reference -- I just added a record there, though Cloudflare is taking a little while to propagate it.

@ttaylorr It seems to take quite a bit more time than I would have expected. Looking at https://digwebinterface.com/?hostnames=book.git-scm.com&type=&ns=resolver&useresolver=9.9.9.10&nameservers=, I still see:

book.git-scm.com.	300	IN	A	172.67.12.172
book.git-scm.com.	300	IN	A	104.22.3.43
book.git-scm.com.	300	IN	A	104.22.2.43

For comparison, this is what happens with www.gitforwindows.org:

www.gitforwindows.org.	3600	IN	CNAME	git-for-windows.github.io.
git-for-windows.github.io. 3600	IN	A	185.199.111.153
git-for-windows.github.io. 3600	IN	A	185.199.108.153
git-for-windows.github.io. 3600	IN	A	185.199.110.153
git-for-windows.github.io. 3600	IN	A	185.199.109.153
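
To compare the two answers programmatically, a small parser for this record format is enough. The sketch below assumes the five whitespace-separated fields shown above (name, TTL, class, type, value); the function names are made up for illustration.

```python
def parse_records(answer):
    """Parse lines like 'name TTL IN TYPE value' into tuples."""
    records = []
    for line in answer.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "IN":
            name, ttl, _cls, rtype, value = fields
            records.append((name, int(ttl), rtype, value))
    return records


def has_cname(answer, host):
    """Return True if `host` is answered with a CNAME record."""
    return any(
        name == host and rtype == "CNAME"
        for name, _ttl, rtype, _value in parse_records(answer)
    )
```

Applied to the output above, `has_cname` is true for www.gitforwindows.org. but false for book.git-scm.com., which only shows A records.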

@ttaylorr
Member

@ttaylorr It seems to take quite a bit more time than I would have expected.

I think that there is some misconfiguration on the GitHub Pages side of things. The difference in dig lookups makes sense there, since the two 104.* IPs as well as the 172.* one are all Cloudflare IPs. So seeing book.git-scm.com resolve to that makes sense.

Do we need to tell GitHub Pages that there is a custom domain that it should be responding to when sending traffic via a CNAME record from book.git-scm.com -> git.github.io?

At least that redirect is working, so I think that the configuration issues may be within how we have Pages set up on the GitHub side of things.

@dscho
Member Author

dscho commented Sep 25, 2024

Do we need to tell GitHub Pages that there is a custom domain that it should be responding to when sending traffic via a CNAME record from book.git-scm.com -> git.github.io?

The way I read the documentation, we are supposed to have only a CNAME and no A records for the book subdomain.

But I have to admit that I then fail to see how this could find the correct GitHub Pages site. Maybe book should have a CNAME record that points to git-scm.com instead of git.github.io?

The documentation on GitHub can also be interpreted in the way that any subdomain (with a CNAME pointing to <org>.github.io as suggested) can only resolve to the default GitHub Pages site, but that does not seem to be the case: https://book.git-scm.com/rev_news/rev_news/ should then show the same contents as https://git.github.io/rev_news/rev_news/, but that is not the case, the former URL 404s.

Maybe the best we can do is to add a new repository at https://github.com/git/book.git-scm.com that has a CNAME file containing book.git-scm.com and a single index.html with contents à la:

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>Redirecting&hellip;</title>
  <link rel="canonical" href="https://git-scm.com/book/en/v2">
  <meta http-equiv="refresh" content="0; url=https://git-scm.com/book/en/v2">
  <meta name="robots" content="noindex">
 </head>
 <body>
  <script>window.location.replace(document.querySelector("link[rel='canonical']").href + window.location.hash)</script>
  <h1>Redirecting&hellip;</h1>
  <a href="https://git-scm.com/book/en/v2">Click here if you are not redirected.</a>
 </body>
</html>

Shall we try whether this works?

@ttaylorr
Member

The way I read the documentation, we are supposed to have only a CNAME and no A records for the book subdomain.

I think that's the way it's configured from Cloudflare's perspective (i.e., that there is a CNAME record, but no A record for book that points to the Git project's GitHub Pages domain). But I think the problem is that GitHub Pages can only support one custom domain per Pages site, so we can't add both "git-scm.com" and "book.git-scm.com" as domains belonging to the Pages site from this repository.

But since book.git-scm.com/foo is supposed to return the same content as "git-scm.com/foo", I think the right thing to do here is to configure a CNAME from "book.git-scm.com" to "git-scm.com", which should do what we're looking for here AFAICT.

I went ahead and configured that, but it still looks like it's broken. I'm guessing that's either (a) a caching thing (seems unlikely) or (b) GitHub Pages rejects a request coming from book.git-scm.com, because it doesn't think there should be a GitHub Pages site there.

I tried Googling for things like "multiple custom subdomains GitHub Pages" but couldn't come up with anything definitive, so I'm guessing that this is unsupported. I'm not opposed to the workaround you came up with, and think that that may be our best path forward.

@dscho
Member Author

dscho commented Sep 25, 2024

@ttaylorr I initialized https://github.com/git/book.git-scm.com and it seems to work (it does not work for https://book.git-scm.com/ -- yet: the response headers suggest that cloudflare cached this from its previous state, but if you try, say, https://book.git-scm.com/abc, it redirects as intended).

I guess with this approach, we should actually go for those A and AAAA records, just like for git-scm.com, and then we can even redirect more cleverly in the future!

@ttaylorr
Member

@ttaylorr I initialized https://github.com/git/book.git-scm.com and it seems to work (it does not work for https://book.git-scm.com/ -- yet: the response headers suggest that cloudflare cached this from its previous state, but if you try, say, https://book.git-scm.com/abc, it redirects as intended).

Thanks!

I guess with this approach, we should actually go for those A and AAAA records, just like for git-scm.com, and then we can even redirect more cleverly in the future!

Done.

@dscho
Member Author

dscho commented Sep 25, 2024

it does not work for https://book.git-scm.com/ -- yet

And now it does!

@To1ne
Contributor

To1ne commented Oct 8, 2024

Also nice to see everything works as expected now a new release is out. 👏
