ci: publish the minimal SDK in a continuous release #87

Merged
merged 5 commits into from
Sep 15, 2024

Conversation

@dscho dscho commented Sep 13, 2024

Currently, when asking the setup-git-for-windows-sdk Action for the minimal SDK, it first looks at the latest successful ci-artifacts run to determine the commit, and then performs a shallow, partial & sparse checkout of the git-sdk-64 repository at that commit.

This is relatively slow: the actual clone typically takes around 25 seconds, compared to typically around 3 seconds to restore it from a cached version. Therefore, the Action tries to cache it, even if caching sometimes adds quite an overhead (for example, here something spent 2 minutes after the cache was saved before the workflow was allowed to continue).

Not only is this taking quite some time, diminishing the developer experience, but it is also unnecessary, seeing as the ci-artifacts run already provides the minimal SDK in a ready-to-use form and uploads it as a workflow artifact.

So why not use that workflow artifact? A couple of reasons:

  • It would appear as if workflow artifacts are somehow compressed as zip archives on the fly when trying to download them manually.
  • While not possible at the time setup-git-for-windows-sdk was designed, it now seems possible to use actions/download-artifact to download from other repositories, but:
  • Workflow run artifacts obey a limited retention policy, which is undesirable here, and finally:
  • You can only download workflow artifacts when authenticated, while clones can be performed anonymously; there are likely use cases where the minimal SDK is needed outside of GitHub Actions (for example in other CI systems).

I once looked into publishing the minimal SDK continuously, first considering GitHub Packages. But they have rather strict retention rules and we definitely do not need to retain anything but the latest working minimal-sdk artifact. (And I don't want to waste resources unnecessarily, even if someone else pays for them.)

Then I had looked into uploading that artifact to Azure Blobs, but had to pull the plug in one big hurry because it threatened to blow my Azure budget.

So eventually I gave up on that plan and instead fell back to the "partial, shallow, sparse clone & then cache" strategy, which is in place to this day.

However, people who are more creative than I am figured out ways to publish artifacts continuously. I just had not found those solutions yet at the time. In the MSYS2 project, for example, they have a srcinfo artifact that is updated continuously, and initially they used eine/tip to publish it in a GitHub release that is kept up to date by replacing the assets as new revisions come in. Nowadays they use the same strategy, albeit manually via gh release upload --clobber.

This here PR uses a slight variation of the same strategy: Since --clobber basically deletes the asset before uploading a new version, there is a time window where the asset is not available. Since the .tar.gz file is around 95MB, uploading it can take a while, in particular given the vagaries of any network operation. For that reason, in this PR the upload is done to a temporary name, then the old asset is deleted, and finally the new asset is renamed to the old name. This does not avoid the window of a missing asset altogether, but at least it makes it a lot smaller.
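The upload, delete, rename order described above can be illustrated with a small simulation. This is only a sketch: `Release` is a hypothetical in-memory stand-in, not GitHub's API or the workflow code from this PR, and the asset name `minimal-sdk.tar.gz` and the `tmp.` prefix are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    """Toy in-memory stand-in for a GitHub release (hypothetical, not a real API)."""

    assets: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def upload(self, name, data):
        if name in self.assets:
            raise ValueError(f"asset {name!r} already exists")
        self.assets[name] = data  # in reality: the slow, network-bound step
        self.log.append(("upload", name))

    def delete(self, name):
        del self.assets[name]
        self.log.append(("delete", name))

    def rename(self, old, new):
        if new in self.assets:
            raise ValueError(f"asset {new!r} already exists")
        self.assets[new] = self.assets.pop(old)
        self.log.append(("rename", old, new))


def replace_asset(release, name, data):
    """Replace `name` so that it is missing only between delete and rename."""
    tmp = f"tmp.{name}"  # hypothetical temporary name
    release.upload(tmp, data)  # the old asset is still downloadable here
    if name in release.assets:
        release.delete(name)  # the window of unavailability opens
    release.rename(tmp, name)  # ... and closes after one quick call


release = Release({"minimal-sdk.tar.gz": b"old"})
replace_asset(release, "minimal-sdk.tar.gz", b"new")
print(release.log)
```

The point of the ordering is that the slow upload happens while the old asset still exists; only the two quick metadata operations fall inside the window where the asset is missing.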

Finally, I took this opportunity to generate and publish .zip and .7z.exe variants, the latter being a self-extracting 7-Zip archive that does not require any special tools to extract. Granted, since a native BSD tar is available in C:\Windows\system32\tar.exe on pretty much any non-ancient Windows version, and that tar.exe can extract both .tar.gz and .zip archives without problems, that seems to be less of an issue. But still, there are things like Windows Nano Server that may not have that BSD tar.

tl;dr with this PR, the ci-artifacts workflow runs not only verify that the minimal SDK is able to compile a working git.exe, but then also publish the minimal SDK in various variants in a git-sdk-64 GitHub release called ci-artifacts whose tag will point to the corresponding commit.

Instead of using a GitHub Actions-specific shell script to initialize
the shell variables `MSYSTEM` and `PATH`, we now use the standard
`/etc/profile` file. This file is already written by the `please.sh
create-sdk-artifact minimal-sdk` command.

This is the first step to unlock a more general process where the
`minimal-sdk` artifact is made available for wider use by continuously
releasing ready-to-consume archives.

Signed-off-by: Johannes Schindelin <[email protected]>
Nowadays, 64-bit Windows no longer means Intel in general; it could also
mean Windows/ARM64.

Let's start here with naming CPU architecture-specific things by the
actual CPU architecture, rather than by how many bits its addresses
have.

More precisely, let's use i686 for 32-bit Intel, x86_64 for 64-bit
Intel (excluding Itanium, which is not supported by Git for Windows),
and aarch64 for 64-bit ARM.

Signed-off-by: Johannes Schindelin <[email protected]>
Taking a page out of MSYS2's book where they release their `srcinfo`
artifact continuously, by replacing the file in the release at
https://github.com/msys2/MINGW-packages/releases/tag/srcinfo-cache, we
publish the minimal SDK artifact to the `ci-artifacts` GitHub release.

It is slightly tricky because GitHub's REST API does not allow replacing
an existing asset atomically, so we have to delete the old one first.

To keep the window small where the asset is not available, we first
upload the new asset to a temporary name, then delete the old one, and
finally rename the new one to the old name.

Signed-off-by: Johannes Schindelin <[email protected]>
Whenever we update the rolling release, we want to point to the revision
corresponding to the new artifacts, i.e. from which they were built.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho requested a review from dennisameling September 13, 2024 19:27
@dscho dscho self-assigned this Sep 13, 2024
dscho commented Sep 13, 2024

For the record, here is the ci-artifacts release in my personal fork that was done with the code presented here (plus a small commit on top to convince the workflow to run and publish in a fork).

@dennisameling dennisameling left a comment

Wow, your PR description is truly outstanding! Thank you so much for picking this up.

This change looks incredibly promising and exciting to me - I wouldn't be surprised if this reduces the runtime of the setup-git-for-windows-sdk action from minutes (without cache) to mere seconds. Looking forward to testing it there!!

Since `.tar.gz` has its preferred habitat safely outside the Windows
world, let's also provide a `.zip` version of the artifact, and for
good measure even a self-extracting 7-Zip archive (which can be run
everywhere, without having to rely on PowerShell or a native `tar.exe`).

In my tests, both sizes and extract times vary greatly depending on the
file format used. Here are manual measurements as of time of writing:

        .tar.gz     .zip        .7z.exe
Size    95.7MB      84.2MB      41.6MB
Time    4.1s        4.3s        11.1s

Note that I used the native Windows `tar.exe` that is available in the
`C:\Windows\system32` directory, which is a BSD tar and comes
preinstalled with Windows build 17063 and later; it is vastly faster
at extracting both `.zip` and `.tar.gz` files (sadly, it has no support
for `.7z` archives) than the corresponding MSYS utilities `unzip` and
`tar` distributed with Git for Windows.

In any scenario where all three formats can be extracted (with roughly
the same speeds as in my tests), and where Defender does not need to
scan the downloaded `.7z.exe` file first (which added up to 8s in my
testing), which format is preferable therefore depends on the available
bandwidth when fetching from Azure Blobs (where GitHub release assets
seem to be stored nowadays). At download speeds above 6MB/s, the
`.zip` archive is the fastest to download and extract; below that, the
self-extracting 7-Zip archive provides the best overall speed. Obviously
this should be measured in each particular use case instead of basing
any decision on the above-mentioned numbers.

Signed-off-by: Johannes Schindelin <[email protected]>
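The break-even reasoning in the commit message above can be reproduced with a few lines of arithmetic. This is a sketch under a simplifying assumption that is not stated in the source: total time is modeled as download time plus extract time, with no overlap between the two, no Defender scan, and a constant bandwidth. The function names are made up for illustration; the sizes and times are the ones from the table.

```python
# Measurements quoted in the commit message: (size in MB, extract time in s).
FORMATS = {
    ".tar.gz": (95.7, 4.1),
    ".zip": (84.2, 4.3),
    ".7z.exe": (41.6, 11.1),
}


def total_seconds(fmt, mb_per_s):
    """Download time plus extract time, assuming the two do not overlap."""
    size_mb, extract_s = FORMATS[fmt]
    return size_mb / mb_per_s + extract_s


def break_even(a, b):
    """Bandwidth (MB/s) at which formats a and b take the same total time."""
    (size_a, time_a), (size_b, time_b) = FORMATS[a], FORMATS[b]
    return (size_a - size_b) / (time_b - time_a)


def fastest(mb_per_s):
    """Format with the smallest modeled total time at the given bandwidth."""
    return min(FORMATS, key=lambda fmt: total_seconds(fmt, mb_per_s))


print(round(break_even(".zip", ".7z.exe"), 2))
for speed in (2, 10):
    print(speed, fastest(speed))
```

Under this model the `.zip` and `.7z.exe` curves cross at roughly 6.3MB/s, which is consistent with the 6MB/s figure above. The extract times of `.tar.gz` and `.zip` differ by only 0.2s, so any ranking between those two is probably within measurement noise.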
@dscho dscho force-pushed the publish-release-asset branch from 83689dd to fdb0cea Compare September 15, 2024 16:59
@dscho dscho merged commit 359bc4d into git-for-windows:main Sep 15, 2024
1 of 2 checks passed
@dscho dscho deleted the publish-release-asset branch September 15, 2024 16:59
dennisameling added a commit to git-for-windows/setup-git-for-windows-sdk that referenced this pull request Sep 22, 2024
As a result of recent changes, the Git SDK `ci-artifacts` are published as GitHub release assets now. This eliminates the need for us to clone and cache the artifacts in this GitHub action.

Let's simply download the latest `.tar.gz` from the `ci-artifacts` and extract it.

Ref: git-for-windows/git-sdk-64@fdb0cea
Ref: git-for-windows/git-sdk-64#87
Signed-off-by: Dennis Ameling <[email protected]>
dscho pushed a commit to git-for-windows/setup-git-for-windows-sdk that referenced this pull request Dec 17, 2024
As a result of recent changes, the Git SDK `ci-artifacts` are published
as GitHub release assets now, including several variants of the minimal
flavor of Git for Windows' SDK. This eliminates the need for us to clone
the minimal SDK in this GitHub Action.

Let's simply download the latest `.tar.gz` from the `ci-artifacts` and
extract it.

Note: As per the analysis in
git-for-windows/git-sdk-64@fdb0cea37389, we
use the native `tar.exe` to unpack the minimal SDK while it is
downloaded, for maximal speed.

The analysis also suggests using the `.zip` file, as this results in
the fastest operation when download speeds are above 6MB/second (which
we hope will be reliably the case in GitHub Actions). However, we want
to pipe the archive to `tar -xf -` while fetching, and it seems that for
some reason `C:\Windows\system32\tar.exe` misses `.sparse/` and `etc/`
when extracting `.zip` files from `stdin`, but the same is not true with
`.tar.gz` files. So let's use the latter.

See also: git-for-windows/git-sdk-64#87.

Signed-off-by: Dennis Ameling <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
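To illustrate what "unpack while it is downloaded" means, here is a minimal sketch that reads a `.tar.gz` from a forward-only stream using Python's `tarfile` module in its streaming mode (`r|gz`). It is not the Action's code, which pipes to the native `tar.exe`; the `PipeLike` class, the helper names, and the file contents are hypothetical, and the archive is built in memory rather than downloaded.

```python
import io
import tarfile


class PipeLike:
    """Forward-only reader, standing in for a pipe or an HTTP response body."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n=-1):
        return self._buf.read(n)


def make_tar_gz(files):
    """Build a small .tar.gz in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def read_streamed(stream):
    """Read every member in archive order, without ever seeking backwards."""
    contents = {}
    # "r|gz" tells tarfile to treat the input as a non-seekable gzip stream.
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        for member in tar:
            contents[member.name] = tar.extractfile(member).read()
    return contents


# Hypothetical file names, loosely mirroring the directories mentioned above.
archive = make_tar_gz({".sparse/minimal-sdk": b"patterns", "etc/profile": b"env"})
print(sorted(read_streamed(PipeLike(archive))))
```

A tar archive stores each member's header directly in front of its data, so it can be consumed strictly front to back. A ZIP file keeps its authoritative index (the central directory) at the end of the file, which generally makes streamed extraction harder; whether that is related to the `tar.exe` behavior described above is not established here.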