pkgs.dockertools.buildLayeredImage: customisable layering strategy #122608
Congrats on the first version!
I've read through the description and scanned parts of the diff. I'll have to take some time to review it thoroughly.
Some first observations:
- It seems that you've accidentally duplicated (part of?) the PR description
- The layering pipeline is a bit hard to follow. This is an area we identified at the start of this project as hard to predict, and one that may call for some experimentation. Some complexity is to be expected for an advanced feature, but I expect that we at least improve the readability. For one, the syntax you've used is essentially a raw interface to python. It will be easier to read with a little "DSL" of Nix functions in front of it. Perhaps we can also identify some usage patterns and exploit those to make it more intuitive.
- Given the size, one could start to consider splitting the python code into its own repo. Such a decision must be weighed against the benefits of having it together with the dockerTools Nix code that it's coupled with. Aesthetic reasons are not sufficient to motivate such a split.
- One comment in the diff. You've already identified in the TODO that Graham's popularity contest hasn't been moved entirely into the new framework yet.
pkgs/build-support/flatten-references-graph/src/flatten_references_graph/popularity_contest.py
cheers!
fixed
Do you mean something like this:

let
layeringPipeline = with pkgs.dockerTools.layeringDSL; compile [
(subcomponent_out [pkgs.python3])
(over "rest" (pipe [
(split_paths [ankisyncd])
(over "main" (pipe [
(subcomponent_in [ankisyncd])
(over "rest" (pipe [
popularity_contest
reverse
(limit_layers 110)
]))
]))
(over "rest" (pipe [
(subcomponent_in [with-env-from-dir])
(over "rest" (subcomponent_in [pkgs.cacert]))
]))
]))
];

instead of

let
layeringPipeline = [
["subcomponent_out" [pkgs.python3]]
["over" "rest" ["pipe" [
["split_paths" [ankisyncd]]
["over" "main" ["pipe" [
["subcomponent_in" [ankisyncd]]
["over" "rest" ["pipe" [
["popularity_contest"]
["reverse"]
["limit_layers" 110]
]]]
]]]
["over" "rest" ["pipe" [
["subcomponent_in" [with-env-from-dir]]
["over" "rest" ["subcomponent_in" [pkgs.cacert]]]
]]]
]]]
];

which is a more tightly formatted version of this? The "DSL" version is only slightly more readable IMHO, but I guess one benefit it could bring is that the pipeline could be validated within nix, and an error could be thrown with a trace pointing exactly at the place which caused it.
I have no preference. Happy to do whatever you (and other maintainers) decide.
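For intuition, the raw [ op args ] lists map naturally onto a tiny interpreter: each stage names a python function plus its leading arguments, and the previous stage's result is passed as the final argument. A hypothetical Python sketch of such an interpreter follows (the operation bodies are illustrative stand-ins working on plain lists of paths, not the actual flatten_references_graph code):

```python
# Hypothetical interpreter for ["op", arg1, ...] pipeline specs.
# Bodies are illustrative stand-ins operating on plain lists of layers
# (lists of paths), not on igraph.Graph objects.

def limit_layers(n, layers):
    # Combine excess layers from the end of the list into one.
    if len(layers) <= n:
        return layers
    return layers[:n - 1] + [sum(layers[n - 1:], [])]

OPS = {
    "reverse": lambda layers: list(reversed(layers)),
    "limit_layers": limit_layers,
}

def run_pipeline(pipeline, value):
    # Each stage is [op_name, *initial_args]; the previous stage's
    # result becomes the final argument, as described in this PR.
    for op_name, *args in pipeline:
        value = OPS[op_name](*args, value)
    return value

layers = [["hello"], ["glibc"], ["figlet"], ["bash"]]
print(run_pipeline([["reverse"], ["limit_layers", 3]], layers))
# → [['bash'], ['figlet'], ['glibc', 'hello']]
```

A Nix-side DSL would then just be sugar that produces these same lists, so validation could happen on either side of the boundary.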
I copied over the code but haven't touched the original package since I wasn't sure if we should keep it around (within nixpkgs only dockerTools depended on it). I guess it doesn't hurt to implement it in terms of
Yes, something like this aligns more with what people expect from a nix file.
Not sure anymore. Maybe try to limit the nesting instead? I would expect this
to work, but it ran into a type error
Let's simplify to a working example:

This outputs a set of layers:

[ [hello]
  [libdn2 libunistring glibc]
  [figlet bash]
]

That is as expected;

[ [hello] [] []
  [] [glibc] [libdn2 libunistring]
  [figlet bash] [] []
]

Here, split_paths has an effect on both. Whether my expectation is optimal is debatable. Perhaps it's better to apply it only to components that contain

Could you try to make it possible, at least for the top-level pipeline, to allow operations to work on all components simultaneously? It seems that at least the splitting functions will behave like identities for components that are entirely unrelated to the argument path(s).
I vote to keep it all here.
I'd consider the old script an internal implementation detail. If someone used it without
Apologies for the delay. Regarding:
I believe what you are trying to achieve could be done with:

pipeline = [
["split_paths" [ hello ] ]
["flatten"]
["map" ["split_paths" [ figlet ] ] ]
]

We could make a version of
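The ["flatten"] / ["map" ...] steps suggested here match how results are collapsed at the end of the pipeline: arbitrarily nested lists/dicts with layers at the leaves get flattened, with each dict replaced by its list of values. A toy Python model of those two operations (illustrative only, not the actual implementation):

```python
# Toy model: a "layer" is a list of store-path strings; a pipeline result
# is an arbitrarily nested structure of dicts/lists with layers as leaves.

def flatten(tree):
    # Dicts are replaced by their list of values, then flattened further.
    if isinstance(tree, dict):
        tree = list(tree.values())
    # A leaf layer: a (possibly empty) list containing only path strings.
    if isinstance(tree, list) and all(isinstance(p, str) for p in tree):
        return [tree]
    return [layer for subtree in tree for layer in flatten(subtree)]

def map_step(fn, layers):
    # Counterpart of the ["map" ...] operation: apply fn to every element.
    return [fn(layer) for layer in layers]

nested = {"main": [["app"]],
          "rest": {"common": ["glibc"], "rest": [["bash"], ["figlet"]]}}
print(flatten(nested))
# → [['app'], ['glibc'], ['bash'], ['figlet']]
```

Under this model, flatten followed by map lets a single step touch every component at once, which is what makes the non-nested pipeline above work.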
I'm still a bit concerned about the need for this amount of configuration. The capability is probably great for power users who will be able to fine-tune their images, and it's very useful for experimentation. I just don't want people to have to become power users.

Perhaps I have given up too soon on optimization (in the math sense) when thinking about the split_paths approach. I've mentioned before that it optimizes for minimizing changes between updates of the same image, as opposed to popularity_contest, which optimizes for cross-image sharing. I didn't make the extra step to formalize that, though. There's no reason it can't be done, so a fully automatic split_paths may still be possible. Having a good algorithm without configuration should alleviate any DSL issues.

A fully automatic split_paths could split layers one at a time, selecting the optimal split in each iteration. If we find a good definition of optimal, we don't need any configuration at all. If it does not work great in some cases, we can let the user adjust the metric instead of making the whole thing manual.

At a high level, for every split we're looking to maximize the expected saved cost of transmission:

    max over path ∈ nodes(graph) of  P(change in path) * cost_of_change[graph, layers](path)

where P(change in path) is the probability that path changes between versions of the image. This seems to be an accurate representation of what we're trying to achieve.

    cost_of_change[graph, layers](path) = size(intersect(split_at[graph](layers, path), layers))

Here, split_at performs the split according to

So far, this seems to be accurate. P(change in path) is by far the most unpredictable one. One approximation is to assume that each byte in the closure is equally likely to change. Compared to an equal probability for all paths, this seems reasonable because big paths have more entropy. It will also have the effect of putting the small config files near the root together in the "rest" layer, not wasting layers.
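The proposed metric can be made concrete with a toy calculation. Everything below is illustrative: the closure and its sizes are invented, and cost_of_change is approximated by a path's own size rather than derived from an actual split of the layering, so this only demonstrates the shape of one greedy iteration.

```python
# Toy greedy step for the proposed automatic split selection.
closure_sizes = {          # hypothetical path sizes in bytes
    "app-config": 4_000,
    "glibc": 30_000_000,
    "python3": 60_000_000,
}
total = sum(closure_sizes.values())

def p_change(path):
    # Approximation from the comment: every byte in the closure is
    # equally likely to change, so bigger paths are proportionally
    # likelier to change between image versions.
    return closure_sizes[path] / total

def cost_of_change(path):
    # Stand-in for size(intersect(split_at(layers, path), layers)):
    # the bytes that would need re-transmission if `path` changed.
    return closure_sizes[path]

# One greedy iteration: split at the path maximising expected saved cost.
best = max(closure_sizes, key=lambda p: p_change(p) * cost_of_change(p))
print(best)  # → python3
```

Iterating this until the configured maximum number of layers is reached would give the "automatic iterated split_paths" described below.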
It also doesn't favor putting small dependencies in layers. However, because they're deeper down, those will be grouped according to the needs of the bigger paths, as a consequence of how
While writing the above, I've assumed the
The necessity of this seems to ultimately stem from the desire to use as many layers as possible without having too much configuration. Iterated split_paths is also automatic and can just stop at the configured maximum. I hope we don't need to combine multiple automatic algorithms.

Just a side note: if you could generate a graphviz graph in a
Yes, I believe in most cases users should be able to select one of the predefined algorithms/presets based on their particular needs. Your proposal sounds interesting; however, I believe it could be addressed in a separate PR. I think the capabilities implemented in this PR are still useful, so it would be great if we could get this merged while we try to figure out the "fully automatic" algorithm which optimises for image rebuilds.
We've just had a call. The plan is to work on an automatic algorithm before working on the DSL and documentation, because those may be affected by this work.
Sorry, I have been busy with a new project; I might be able to get back to it in about a month or so. Alternatively I could set up a bounty and contribute some $ to it if anyone would be interested in taking this up.
@adrian-gierakowski Thanks for the update. It'd be great if anyone could make progress on this. This PR brings good changes, such as tests and a foundation for building smarter layering strategies. It would even fix a bug, #140908, so I think the priority should be to get this merged. The strategies themselves can be marked as experimental with a message during evaluation, to manage expectations and invite collaboration.

Regarding the bounty, that may be quite helpful. It'd be most effective if you could help with the merge-readiness part. That way the bounty can be used for a more specific task that doesn't require as much background.

I should probably also plug https://opencollective.com/nix-deployments, which is a new fund for maintenance and development of Nix-based deployment tech. You can donate for specific projects, such as the NixOps 2 release, or "Anything around Containers", where the ideas are open for discussion.
@roberth I could probably spend 1-2 hours on this next week to complete any work you think necessary to get this merged. This would obviously not be enough to implement the fully automatic strategy. I could probably work on the latter in about a month or so, and submit it as a separate PR, together with documentation etc. In the meantime I’ll look into setting up a bounty in case someone else is keen to do it earlier.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/a-faster-dockertools-buildimage-prototype/16922/2
I've been using this PR in one project and thought it might be useful to post some error tracebacks I've been encountering, in case this PR is ever finished:
I've encountered this quite a few times but I'm not certain what causes it. It happens when specifying stuff for some dependencies and not for others. It's probably related to how many dependencies are contained in the "main" and "rest" parts of splits.

I know that this PR might be obsolete and there now seem to be some other PRs tackling the same problem. In case it isn't obsolete and somebody decides to tackle this again, I hope the stacktraces help.

EDIT: it seems that the error is caused by the "popularity_contest" strategy having a "layer_limit" set that is higher than the actual number of dependencies in the subgraph.
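For what it's worth, that message is the generic CPython error raised when functools.reduce is given an empty sequence and no initial value, which is consistent with the layer_limit explanation: merging an empty tail of excess layers would hit exactly this. A minimal reproduction and the usual guard (this sketch does not reproduce the actual call site in the PR, just the failure mode):

```python
from functools import reduce

layers_to_merge = []  # e.g. the limit exceeds the number of remaining paths

try:
    # No initial value: raises on an empty sequence.
    reduce(lambda a, b: a + b, layers_to_merge)
except TypeError as err:
    caught = str(err)  # "reduce() of empty sequence with no initial value"

# The usual fix: supply an explicit initial value.
merged = reduce(lambda a, b: a + b, layers_to_merge, [])
print(merged)  # → []
```

So a defensive initial value (or an up-front check that layer_limit does not exceed the subgraph size) would turn the crash into well-defined behaviour.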
@rihardsk thanks for the feedback! I definitely intend to push the ball forward on this PR; however, my work situation changed soon after I submitted it and I haven't had much spare time to continue working on it.
I'm not aware of any other PRs which provide a better solution to the problem this PR is trying to address so I hope we can get this PR merged eventually.
Nice, I'll try to replicate this in tests and see if I can catch it early to provide better feedback to the user. @roberth you wrote above that:
what would have to be done (apart from rebasing and resolving conflicts) for this PR to be considered merge-ready?
@adrian-gierakowski I'm definitely enjoying the flexibility your solution provides. I've only had a quick glance at those other docker image layering PRs, so I wasn't sure if anything there attempts to provide a general solution. One slight annoyance I've had with this PR is that the layering strategy that I've created has ended up being deeply nested; here's a sample from my project:

let
# Helper functions for layering strategy creation
popularityHelper = overWhat: layerLimit: [
"over"
overWhat
[
"pipe"
[
["popularity_contest"]
["reverse"]
["limit_layers" layerLimit]
]
]
];
# Note, there seems to be a bug, where the layer creation code fails with
# TypeError: reduce() of empty sequence with no initial value
# that seems to be caused by the popularity_contest strategy having a layerLimit
# that is higher than the actual number of dependencies in the dependency
# subgraph. To work around this, try decreasing the layerLimitForRest argument.
splitFromMain = splitWhat: layerLimitForRest:
let mainProcessingStrategy =
if layerLimitForRest > 0
then
[
["subcomponent_in" splitWhat]
(popularityHelper "rest" layerLimitForRest)
]
else
[
["subcomponent_in" splitWhat]
]
;
in [
"over"
"main"
[
"pipe"
mainProcessingStrategy
]
];
layeringPipeline = [
["subcomponent_out" [models]]
[
"over"
"rest"
[
"pipe"
[
["split_paths" [cudatoolkit]]
(splitFromMain [cudatoolkit] 10)
(popularityHelper "common" 30)
[
"over"
"rest"
[
"pipe"
[
["split_paths" [gateway]]
(splitFromMain [gateway] 1)
# (popularityHelper "common" 30) # cannot use this because of TypeError: reduce() of empty sequence with no initial value
[
"over"
"rest"
[
"pipe"
[
["split_paths" [segmenter]]
(splitFromMain [segmenter] 1)
# (popularityHelper "common" 30) # cannot use this because of KeyError: 'common'
[
"over"
"rest"
[
"pipe"
[
["split_paths" [translator]]
(splitFromMain [translator] 4)
# (popularityHelper "common" 30)
[
"over"
"rest"
[
"pipe"
[
["split_paths" [marian-server]]
(splitFromMain [marian-server] 2)
# (popularityHelper "common" 30)
[
"over"
"rest"
[
"pipe"
[
["split_paths" [pkgs.python38Packages.supervisor]]
# (splitFromMain [pkgs.python38Packages.supervisor] 10)
# (popularityHelper "common" 30)
[
"over"
"rest"
[
"pipe"
[
["subcomponent_out" [gateway_conf translator_conf supervisord_conf]]
# (popularityHelper "rest" 5)
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
]
];
in {} # ...

Maybe I'm using it wrong, but it would be cool if I could avoid all the nesting here and instead have something that looks more linear and somehow automatically pipes the "rest" part to the next step (avoiding "over").
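One way to get that linear shape is a small helper that folds a flat list of stages into the nested over/rest structure. This is a hypothetical sketch (not part of this PR), written in Python but mirroring the Nix lists one-to-one; the same fold could be written as a Nix function:

```python
# Hypothetical helper: build the nested ["over", "rest", ["pipe", [...]]]
# structure from a flat list of stages, so pipelines can be written
# linearly. Each stage is itself a list of steps; every later stage is
# nested under the previous stage's "rest" sub-graph.

def chain_over_rest(stages):
    if len(stages) == 1:
        return stages[0]
    head, *remaining = stages
    return head + [["over", "rest", ["pipe", chain_over_rest(remaining)]]]

pipeline = chain_over_rest([
    [["split_paths", ["cudatoolkit"]]],
    [["split_paths", ["gateway"]]],
    [["subcomponent_out", ["gateway_conf"]]],
])
# pipeline now nests each later stage under the previous stage's "rest",
# reproducing by hand-rolled recursion what the deeply indented sample
# above spells out explicitly.
```

The package names are taken from the sample above purely as placeholders; the point is only that the nesting is mechanical and can be generated.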
I think what this needs is
@rihardsk it seems that you're most knowledgeable about the issue. Do you think you can write a test case? |
@SuperSandro2000 which tool did you use to format |
Just rebased on master, fixing merge conflicts, applying

I've backed up the previous state of this branch at https://github.com/adrian-gierakowski/nixpkgs/tree/ag/docker-customisable-layering-strategy-before-rebase-and-nixfmt-2024-11-09
also ran |
# These derivations are only created as implementation details of docker-tools,
# so they'll be excluded from the created images.
unnecessaryDrvs = [ baseJson overallClosure customisationLayer ];
layersJsonFile = buildPackages.dockerMakeLayers {
For people who would like to experiment with different ways of grouping paths into layers, this could now be overridden (at pkgs level) in order to inject arbitrary code for generating layers. Probably not the most friendly interface, but at least it's an improvement over the status quo of streamLayeredImage being tightly coupled to the referencesByPopularity approach. I'd be happy to iterate on this in order to make the injection of layering logic more user friendly.
@adrian-gierakowski @colonelpanic8 In principle I'm in favor of merging this as soon as 24.11 is branched off, which is planned for Thursday. Originally I wanted to do more for this PR in terms of review and perhaps improving it further, but that has turned out unrealistic, so I am content that this is at the very least an improvement in terms of maintainability and ability to evolve this solution. Thank you @adrian-gierakowski for implementing this (and persevering, oh my), @colonelpanic8 for testing, and @tomberek for offering to help with review.
I will, but if anyone would only look at the numbers, they may disagree. I can trust you because we've talked directly and you've shown professionalism etc, but that won't apply to others, so it's probably best to wait a bit. Until then, ping me so I can check a couple of things and merge for you. These won't necessarily be in-depth reviews, or very large diffs, so that's two reasons why it will be much much quicker, especially compared to this one.
@adrian-gierakowski I messaged you in Matrix. Any other good way to get in touch?
@tomberek I don't see your message, but just sent you one on matrix as well (found you on #users:nixos.org) |
pkgs/by-name/fl/flattenReferencesGraph/src/flatten_references_graph/lib.py
addressed all recent comments
Mentioning a non-existent file caused a CI failure: https://github.com/NixOS/nixpkgs/actions/runs/12091053669/job/33718955875?pr=122608
Merging this as a refactor of the underlying algorithm and reviewed by @roberth and myself. The DSL is still an internal representation. Still needs good documentation, helper wrappers, and perhaps some design iteration.
I am also looking at the maintainability. If this turns out to be too expensive we should consider reverting.
How has this gone so far? Any issues?
As someone who has been waiting for this functionality for a long time, I just tried this out. The pipeline syntax could definitely use more documentation and examples, but now that I've wrapped my head around it, it works very well. A few snippets for common use cases:

Flatten python3 and all its dependencies into a single layer as a common base layer, handle other paths like previously:

[
["subcomponent_out" [pkgs.python3]]
["over" "rest" ["pipe" [
["popularity_contest"]
["limit_layers" 100]
]]]
]

Force the often-changing custom application code into its own layer:

[
["subcomponent_in" [my_app]]
["over" "rest" ["pipe" [
["popularity_contest"]
["limit_layers" 100]
]]]
]

Combining the two:

[
["subcomponent_in" [my_app]]
["over" "rest" ["pipe" [
["subcomponent_out" [pkgs.python3]]
["over" "rest" ["pipe" [
["popularity_contest"]
["limit_layers" 100]
]]]
]]]
]
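For readers wondering what the popularity_contest step in these snippets is doing underneath: the rough idea (sketched here with an invented three-node graph; the real implementation in flatten_references_graph works on igraph objects and differs in detail) is that paths transitively referenced by more of the closure count as more popular and sort first, so deep shared libraries end up in early, stable layers:

```python
# Rough, illustrative sketch of the popularity idea: count how often each
# path is reached while walking every path's references, then sort.
# Shared dependencies are counted once per reference chain, which is what
# pushes deep common libraries like glibc to the front.
references = {              # invented example closure
    "app": ["python3", "glibc"],
    "python3": ["glibc"],
    "glibc": [],
}

def popularity_order(graph):
    scores = {path: 0 for path in graph}

    def visit(path):
        for dep in graph[path]:
            scores[dep] += 1
            visit(dep)

    for path in graph:
        visit(path)
    return sorted(scores, key=lambda p: scores[p], reverse=True)

print(popularity_order(references))  # → ['glibc', 'python3', 'app']
```

Following this with a limit_layers step, as in the snippets above, merges the long unpopular tail into the final layer.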
I was already using it on an older forked version of nixpkgs, but I have now moved to using the version that is in unstable and I'm (pretty unsurprisingly) having no issues!
Allow customisation of the algorithm used to convert a nix references graph (created from docker image contents) to docker layers.

A collection of building blocks (python functions) is provided, which can be assembled into a processing pipeline by specifying a list of operations (and their initial arguments) via a nix-language list. A graph of nix references is first converted into a python igraph.Graph object (with each vertex representing a nix path), and then fed into the user-defined pipeline. Each stage in the pipeline represents a function call, with initial arguments specified in nix, and the last argument being the result of the previous stage in the pipeline (or the initial Graph object). Each step of the pipeline is expected to produce a data structure consisting of arbitrarily nested lists/dicts with Graph objects (representing docker layers) at its leaves. The result of the last stage in the pipeline is recursively flattened (with each dict converted into a list of its values), until a flat list of Graphs remains. This is then output as a json array of arrays (each Graph converted into an array of paths).

This functionality is made available via a new layeringPipeline argument to the streamLayeredImage/buildLayeredImage functions. The default value of the argument has been chosen to preserve current layering behaviour.

Functions available to use in the pipeline:
- popularity_contest: sorts paths by popularity, with each path put in its own layer. This, when followed by limit_layers, results in layers identical to the current implementation (implementation copied and adapted from closure-graph.py).
- subcomponent_out: splits the given nix references graph into 2 graphs (preserving edges within each sub-graph): main includes the specified paths and all of their dependencies (including transitive ones); rest contains all remaining paths.
- subcomponent_in: similar to the above, but the main output includes dependants of the provided paths (instead of dependencies).
- split_paths: implements the idea described by @roberth, by removing edges targeting the given paths and splitting the resulting graph into 3: main is the specified paths plus their dependencies, minus the common paths (see below); common is the set of paths which can be reached from both the specified paths and any of the remaining roots of the input graph; rest contains everything else.
- limit_layers: limits the number of layers to the desired amount by combining excess layers from the end of the list into one.
- reverse: reverses the given list of layers. This, if inserted between popularity_contest and limit_layers, achieves the same result as layeredStrategies.popularityWeightedTop from this PR.
- split_every: splits the given graph into layers with a given number of paths each (apart from the last layer, which might contain fewer paths).
- remove_paths: removes vertices corresponding to the given paths from a Graph (most likely not a good idea to use this apart from removing roots).
- over: maps the given function over the value of a given key in a dict. Similar to over from haskell lenses, or ramda. Useful for applying further transformations to a selected sub-graph in the result of subcomponent_out/in or split_paths.
- map: maps the given function over all elements of a list.

Some images to demonstrate the behaviour of subcomponent_out/in and split_paths. Given the following input graph, and the paths argument set to ["A"]:

Result of subcomponent_out: with the main output consisting of the green vertices and rest of the red.

Result of subcomponent_in: colours as above.

Result of split_paths: main in green, common in purple, rest in red. Note the edge removed from the graph above. The common sub-graph is calculated by taking an intersection of the sub-graph formed by vertices which can be reached from A and the one which can be reached from Root. If A was the root, then both common and rest would be empty.

Example of a custom layeringPipeline
Used to package an application written in python, optimising for sharing layers between rebuilds of the same image (but also with a layer containing the python runtime with all its dependencies, which can be shared between images of different applications built on top of the same version of python).
The pipeline: https://github.com/adrian-gierakowski/nix-docker-custom-layering-strategy-example/blob/6eec668c4ef19b09926d8b53b7003575a115273e/default.nix#L42-L108
The resulting docker layers: https://github.com/adrian-gierakowski/nix-docker-custom-layering-strategy-example/blob/119c15a98a16b9b85d676d9ea86ce231d7306d55/store-layers.json

TODOs:
- The implementations of subcomponent_out/in and split_paths could be improved. The former are based on the corresponding method of the igraph.Graph class, and the latter is the first thing that came to mind when I started working on it.
- split_every: allow customising whether to walk the graph in depth-first or breadth-first mode (currently it simply iterates the vertices in the order in which they are stored in the igraph.Graph object, which is a bit arbitrary and probably not very useful).
- flatten-references-graph (probably by implementing the first in terms of the latter).

Motivation for this change
The current algorithm for grouping nix paths into layers is optimised for a specific use case (sharing layers between unrelated images), but it pessimises others, for example the reuse of layers between consecutive rebuilds of images in which the number of nix paths exceeds maxLayers. This significantly slows down development of applications with a large number of dependencies (where each dep is a separate nix path).
The need for customising the layering algorithm has been highlighted in a couple of issues:
#116446
#48462
Things done
- Tested using sandboxing (sandbox in nix.conf on non-NixOS linux)
- Ran nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
- Tested execution of all binary files (usually in ./result/bin/)
- Determined the impact on package closure size (nix path-info -S before and after)