Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AN-300 Fall back to large Docker image size if we can't find a real one #7673

Merged
merged 4 commits into from
Jan 6, 2025

Conversation

jgainerdewar
Copy link
Collaborator

@jgainerdewar jgainerdewar commented Jan 3, 2025

Description

See discussion in AN-300. This should fix cases where we can't find the Docker image size, so we don't bump the boot disk size to account for it, and tasks fail with out-of-disk during docker pull.

Batch will be handled in a different ticket.

Includes a revert of #7585, since this is a better-targeted fix for the same issue.

Release Notes Confirmation

CHANGELOG.md

  • I updated CHANGELOG.md in this PR
  • I assert that this change shouldn't be included in CHANGELOG.md because it doesn't impact community users

Terra Release Notes

  • I added a suggested release notes entry in this Jira ticket
  • I assert that this change doesn't need Jira release notes because it doesn't impact Terra users

val userCommandImageSizeRoundedUpInGB = userCommandImageSizeInGB.ceil.toInt
.map(s => MemorySize(s.toDouble, MemoryUnit.Bytes).to(MemoryUnit.GB))
.getOrElse(MemorySize(30, MemoryUnit.GB))
val userCommandImageSizeRoundedUpInGB = userCommandImageSizeInGB.amount.ceil.toInt
Copy link
Collaborator Author

@jgainerdewar jgainerdewar Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight refactor because I didn't want to write .getOrElse(30000000000L)

val userCommandImageSizeInBytes = createPipelineParameters.jobDescriptor.dockerSize

// Compute the decompressed size based on the information available. If we couldn't get the image size,
// default to 30GB. Defaulting to 0 can cause task to run out of disk. (more in AN-300)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 30 GB extra for the image, right? Not a 30 GB disk overall.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

.map(s => MemorySize(s.toDouble, MemoryUnit.Bytes).to(MemoryUnit.GB))
val (userCommandImageSizeInGB, userImageLogString) = maybeUserCommandImageSizeInGB match {
case Some(imageSize) => (imageSize, "user command image")
case None => (MemorySize(30, MemoryUnit.GB), "failed to obtain user command image size, using safe default")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate this specific message.

@jgainerdewar jgainerdewar marked this pull request as ready for review January 6, 2025 16:49
@jgainerdewar jgainerdewar requested a review from a team as a code owner January 6, 2025 16:49
@jgainerdewar jgainerdewar requested a review from LizBaldo January 6, 2025 17:01
Copy link

@LizBaldo LizBaldo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@jgainerdewar jgainerdewar merged commit 95d2d53 into develop Jan 6, 2025
45 of 47 checks passed
@jgainerdewar jgainerdewar deleted the jd_AN-300_fallbackDockerImageSize branch January 6, 2025 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants