ci: publish the minimal SDK in a continuous release #87
Instead of using a GitHub Actions-specific shell script to initialize the shell variables `MSYSTEM` and `PATH`, we now use the standard `/etc/profile` file. This file is already written by the `please.sh create-sdk-artifact minimal-sdk` command. This is the first step to unlock a more general process where the `minimal-sdk` artifact is made available for wider use by continuously releasing ready-to-consume archives.
Signed-off-by: Johannes Schindelin <[email protected]>
Nowadays, 64-bit Windows no longer necessarily means Intel; it could also mean Windows/ARM64. Let's start here with naming CPU architecture-specific things by the actual CPU architecture, rather than by how many bits its addresses have. More precisely, let's use i686 for 32-bit Intel, x86_64 for 64-bit Intel (excluding Itanium, which is not supported by Git for Windows), and aarch64 for 64-bit ARM.
Signed-off-by: Johannes Schindelin <[email protected]>
Taking a page out of MSYS2's book, where they release their `srcinfo` artifact continuously by replacing the file in the release at https://github.com/msys2/MINGW-packages/releases/tag/srcinfo-cache, we publish the minimal SDK artifact to the `ci-artifacts` GitHub release. It is slightly tricky because GitHub's REST API does not allow replacing an existing asset atomically, so we have to delete the old one first. To keep the window during which the asset is unavailable small, we first upload the new asset to a temporary name, then delete the old one, and finally rename the new one to the old name.
Signed-off-by: Johannes Schindelin <[email protected]>
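The ordering described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `FakeRelease`, `replace_asset`, the `.tmp` suffix and the asset name `minimal-sdk.tar.gz` are all hypothetical and do not come from the workflow in this PR, and no real GitHub API is called. It records, after each step, whether the public asset name is still downloadable.

```python
class FakeRelease:
    """In-memory stand-in for a release's asset list (hypothetical, no network)."""

    def __init__(self):
        self.assets = {}  # asset name -> content

    def upload(self, name, data):
        # Like the real service, refuse to overwrite an existing asset.
        if name in self.assets:
            raise ValueError(f"asset {name!r} already exists")
        self.assets[name] = data

    def delete(self, name):
        del self.assets[name]

    def rename(self, old, new):
        if new in self.assets:
            raise ValueError(f"asset {new!r} already exists")
        self.assets[new] = self.assets.pop(old)


def replace_asset(release, name, data):
    """Replace `name` via upload-to-temporary-name, delete, rename.

    Returns (step, available) pairs, where `available` says whether `name`
    could be downloaded right after that step.
    """
    tmp = name + ".tmp"  # hypothetical temporary asset name
    steps = []

    # 1. The slow part: the old asset stays available during the upload.
    release.upload(tmp, data)
    steps.append(("upload", name in release.assets))

    # 2. Only now remove the old asset (if there is one).
    if name in release.assets:
        release.delete(name)
    steps.append(("delete", name in release.assets))

    # 3. A quick rename closes the gap again.
    release.rename(tmp, name)
    steps.append(("rename", name in release.assets))
    return steps


if __name__ == "__main__":
    release = FakeRelease()
    release.upload("minimal-sdk.tar.gz", b"old")
    for step, available in replace_asset(release, "minimal-sdk.tar.gz", b"new"):
        print(f"{step}: asset available = {available}")
```

In this model the asset is missing only between the delete and the rename, not for the duration of the upload, which is the point of the strategy.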
Whenever we update the rolling release, we want to point to the revision corresponding to the new artifacts, i.e. from which they were built.
Signed-off-by: Johannes Schindelin <[email protected]>
For the record, here is the |
Wow, your PR description is truly outstanding! Thank you so much for picking this up.
This change looks incredibly promising and exciting to me - I wouldn't be surprised if this reduces the runtime of the `setup-git-for-windows-sdk` action from minutes (without cache) to mere seconds. Looking forward to testing it there!!
Since `.tar.gz` has its preferred habitat safely outside the Windows world, let's also provide a `.zip` version of the artifact, and for good measure even a self-extracting 7-Zip archive (which can be run everywhere, without having to rely on PowerShell or a native `tar.exe`).

In my tests, both sizes and extraction times vary greatly depending on the file format used. Here are manual measurements as of the time of writing:

| | .tar.gz | .zip | .7z.exe |
|---|---|---|---|
| Size | 95.7MB | 84.2MB | 41.6MB |
| Time | 4.1s | 4.3s | 11.1s |

Note that I used the native Windows `tar.exe` that is available in the `C:\Windows\system32` directory, which is a BSD tar and comes preinstalled with Windows build 17063 and later; it is vastly faster at extracting both `.zip` and `.tar.gz` files (sadly, it has no support for `.7z` archives) than the corresponding MSYS utilities `unzip` and `tar` distributed with Git for Windows.

In any scenario where all three formats can be extracted (with roughly the same speeds as in my tests), and where Defender does not need to scan the downloaded `.7z.exe` file first (which added up to 8s in my testing), which format is preferable therefore depends on the available bandwidth when fetching from Azure Blobs (where GitHub release assets seem to be stored nowadays). At download speeds above 6MB/s, the `.zip` archive is the fastest to download and extract; below that, the self-extracting 7-Zip archive provides the best overall speed.

Obviously this should be measured in each particular use case instead of basing any decision on the numbers above.
Signed-off-by: Johannes Schindelin <[email protected]>
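The 6MB/s threshold can be reproduced from the measurements above with a little arithmetic. The following is a rough sketch that assumes total time is simply download time (size divided by bandwidth) plus the measured extraction time, and that ignores the Defender scan; the function names are made up for this illustration.

```python
# Measurements quoted above: format -> (size in MB, extraction time in seconds)
MEASUREMENTS = {
    ".tar.gz": (95.7, 4.1),
    ".zip": (84.2, 4.3),
    ".7z.exe": (41.6, 11.1),
}


def total_seconds(fmt, bandwidth_mb_per_s):
    """Estimated download plus extraction time at the given bandwidth."""
    size_mb, extract_s = MEASUREMENTS[fmt]
    return size_mb / bandwidth_mb_per_s + extract_s


def fastest_format(bandwidth_mb_per_s):
    """Format with the smallest estimated total time at the given bandwidth."""
    return min(MEASUREMENTS, key=lambda fmt: total_seconds(fmt, bandwidth_mb_per_s))


def break_even(fmt_a, fmt_b):
    """Bandwidth (MB/s) at which both formats take equally long overall.

    Solves size_a / bw + time_a == size_b / bw + time_b for bw.
    """
    size_a, time_a = MEASUREMENTS[fmt_a]
    size_b, time_b = MEASUREMENTS[fmt_b]
    return (size_a - size_b) / (time_b - time_a)


if __name__ == "__main__":
    print(f"break-even .zip vs .7z.exe: {break_even('.zip', '.7z.exe'):.2f} MB/s")
    for bandwidth in (3, 10):
        print(f"{bandwidth} MB/s -> {fastest_format(bandwidth)}")
```

Under these assumptions the `.zip` and `.7z.exe` variants break even at roughly 6.3MB/s, which matches the threshold stated above. The extraction times of `.zip` and `.tar.gz` differ by only 0.2s, so any comparison between those two is well within measurement noise.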
83689dd to fdb0cea
As a result of recent changes, the Git SDK `ci-artifacts` are published as GitHub release assets now. This eliminates the need for us to clone and cache the artifacts in this GitHub action. Let's simply download the latest `.tar.gz` from the `ci-artifacts` and extract it.
Ref: git-for-windows/git-sdk-64@fdb0cea
Ref: git-for-windows/git-sdk-64#87
Signed-off-by: Dennis Ameling <[email protected]>
As a result of recent changes, the Git SDK `ci-artifacts` are published as GitHub release assets now, including several variants of the minimal flavor of Git for Windows' SDK. This eliminates the need for us to clone the minimal SDK in this GitHub Action. Let's simply download the latest `.tar.gz` from the `ci-artifacts` and extract it.

Note: As per the analysis in git-for-windows/git-sdk-64@fdb0cea37389, we use the native `tar.exe` to unpack the minimal SDK while it is downloaded, for maximal speed. The analysis also suggests using the `.zip` file, as this results in the fastest operation when download speeds are above 6MB/second (which we hope will be reliably the case in GitHub Actions). However, we want to pipe the archive to `tar -xf -` while fetching, and it seems that for some reason `C:\Windows\system32\tar.exe` misses `.sparse/` and `etc/` when extracting `.zip` files from `stdin`, while it handles `.tar.gz` files correctly. So let's use the latter.

See also: git-for-windows/git-sdk-64#87.
Signed-off-by: Dennis Ameling <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
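To illustrate what "unpack while it is downloaded" means, here is a minimal sketch that swaps in Python's `tarfile` module for the native `tar.exe` the Action actually uses. It reads a gzip-compressed tar strictly front to back from any file-like object, so nothing needs to be saved to disk first. The function name `extract_stream` is an assumption made for this example, and symlinks and other special entries are simply skipped.

```python
import os
import shutil
import tarfile


def extract_stream(fileobj, dest):
    """Unpack a gzip-compressed tar from a forward-only stream into `dest`.

    Returns the names of the directories and regular files that were written.
    """
    dest = os.path.realpath(dest)
    written = []
    # Mode "r|gz" treats the input as a non-seekable stream of tar blocks,
    # which is what allows extraction to start before the download finishes.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            target = os.path.realpath(os.path.join(dest, member.name))
            # Refuse entries that would land outside the destination.
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError(f"unsafe path in archive: {member.name!r}")
            if member.isdir():
                os.makedirs(target, exist_ok=True)
            elif member.isfile():
                os.makedirs(os.path.dirname(target), exist_ok=True)
                # In stream mode each member must be read before moving on.
                with archive.extractfile(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
            else:
                continue  # symlinks, devices etc. are skipped in this sketch
            written.append(member.name)
    return written
```

Any object with a `read()` method works as `fileobj`, for example the response returned by `urllib.request.urlopen()`. This forward-only reading is also why the tar format suits piping: its entries are laid out sequentially, whereas a `.zip` file keeps its central directory at the end of the archive.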
Currently, when asking the `setup-git-for-windows-sdk` Action for the minimal SDK, it first looks at the latest successful `ci-artifacts` run to determine the commit, and then performs a shallow, partial & sparse checkout of the `git-sdk-64` repository at that commit.

This is relatively slow, taking typically something around 25 seconds for the actual clone, compared to typically around 3 seconds to restore it from a cached version. Therefore, the Action tries to cache it, even if it sometimes adds quite an overhead (for example, here something spent 2 minutes after the cache was saved before the workflow was allowed to continue).
Not only is this taking quite some time, diminishing the developer experience, but it is also unnecessary, seeing as the `ci-artifacts` run already provides the minimal SDK in a ready-to-use form and uploads it as a workflow artifact.

So why not use that workflow artifact? A couple of reasons:
`setup-for-windows-sdk` was designed, it now seems possible to use `actions/download-artifact` to download from other repositories, but:

I once looked into publishing the minimal SDK continuously, first considering GitHub Packages. But they have rather strict retention rules and we definitely do not need to retain anything but the latest working minimal-sdk artifact. (And I don't want to waste resources unnecessarily, even if someone else pays for them.)
Then I had looked into uploading that artifact to Azure Blobs, but had to pull the plug in one big hurry because it threatened to blow my Azure budget.
So eventually I gave up on that plan and instead fell back to the "partial, shallow, sparse clone & then cache" strategy, which is in place to this day.
However, people who are more creative than I am figured out ways to publish artifacts continuously. I just had not found those solutions yet at the time. In the MSYS2 project, for example, they have a `srcinfo` artifact that is updated continuously, and initially they used `eine/tip` to publish it in a GitHub release that is kept up to date by replacing the assets as new revisions come in. Nowadays they use the same strategy, albeit manually via `gh release upload --clobber`.

This here PR uses a slight variation of the same strategy: Since `--clobber` basically deletes the asset before uploading a new version, there is a time window where the asset is not available. Since the `.tar.gz` file is around 95MB, uploading it can take a while, in particular given the vagaries of any network operation. For that reason, in this PR the upload is done to a temporary name, then the old asset is deleted, and finally the new asset is renamed to the old name. This does not avoid the window of a missing asset altogether, but at least it makes it a lot smaller.

Finally, I took this opportunity to generate and publish `.zip` and `.7z.exe` variants, the latter being a self-extracting 7-Zip archive that does not require any special tools to extract. Granted, since a native BSD tar is available in `C:\Windows\system32\tar.exe` on pretty much any non-ancient Windows version, and that `tar.exe` can extract both `.tar.gz` and `.zip` archives without problems, that seems to be less of an issue. But still, there are things like Windows Nano Server that may not have that BSD tar.

tl;dr with this PR, the `ci-artifacts` workflow runs not only verify that the minimal SDK is able to compile a working `git.exe`, but then also publish the minimal SDK in various variants in a `git-sdk-64` GitHub release called `ci-artifacts` whose tag will point to the corresponding commit.