use `gix-status` for `create_wd_tree()` #5889
base: master
@Byron is attempting to deploy a commit to the GitButler Team on Vercel. A member of the Team first needs to authorize it.
7150bef to 403cdac
403cdac to 2accee5
@Byron should `create_wd_tree` be a `gix` built-in? To me it seems like a very common operation.
I thought about it but concluded that it's not possible at this time. Further, there is no functionality in Git that I am aware of that would automatically pick up all untracked files. @mtsgrd has picked up the task of creating commits after building a …
2accee5 to 2c4e34b
2c4e34b to 95e62c6
95e62c6 to 66e1a0f
66e1a0f to 94fa323
94fa323 to 1af1556
Is it safe to exclude large files from the produced tree? In edit mode where we write out the …
Thanks for reminding me! It's most definitely not safe while GitButler still performs hard resets of the working tree, something I believe it definitely has to stop doing. But until we are there, it should pick up everything to protect it, no matter how large it may be. It's possible that there are places where a limit can be imposed, but for now I have deactivated the limit everywhere.
It contains the latest version of `gix status`.
…tree() tests. Also optimize the tests for reading by removing the .unwrap() calls and reducing the verbosity of variable names.
It's the same idea as it was with `git2`, but it's faster as `gix` uses more threads. Further improvements:
* handle dir-to-file and file-to-dir conversions
* pin current behaviour more with additional tests
* add submodule support
d0d803c to 6a4fb03
It's not safe to do that while performing hard resets, and we shouldn't risk it just yet.
6a4fb03 to 07c9c96
    head_tree_editor.upsert(rela_path, kind, id)?;
    Ok(true)
};
let mut head_tree_editor = repo.edit_tree(repo.head_tree_id()?)?;
This tree editor is a really neat abstraction from the gix side!
My brain had serious trouble aligning what it saw with the side I ought to be seeing, but now I have recovered 😁.
> This tree editor is a really neat abstraction from the gix side!

Thanks! It's based on `git2` and pretty much the same API for the editor itself - the way it's instantiated might be a bit more natural though. As a major difference, besides being faster, it will also just do as it's told without special rules that prevent you from doing things (like turning a directory into a file or vice-versa is forbidden in `git2` for some reason).
Regarding the pictures, I think it's perfectly true and very obvious when looking at the before-after of the whole 'status' loop. The `gix` detail is excruciating, but it did help to make the whole operation very well defined.
There is huge value in the way `git2` represents state as well and I hope to find ways to support both levels of abstraction one day. `gix` easy mode :D?
/// The maximum size of files to automatically start tracking, i.e. untracked files we pick up for tree-creation.
/// **Inactive for now** while it's hard to tell if it's safe *not* to pick up everything.
pub const AUTO_TRACK_LIMIT_BYTES: u64 = 0;
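For readers following the thread, here is a minimal sketch of how a size limit with an "inactive" setting could be checked. It assumes that a value of `0` means "no limit", which the doc comment suggests but does not spell out. The function `should_pick_up` is hypothetical and is not taken from the PR.

```rust
/// Stand-in for the constant in the diff above.
/// Assumption: `0` means the limit is inactive.
pub const AUTO_TRACK_LIMIT_BYTES: u64 = 0;

/// Hypothetical helper: decide whether an untracked file of `size_bytes`
/// should be picked up for tree creation.
/// With a limit of `0` every file is picked up, regardless of size.
pub fn should_pick_up(size_bytes: u64, limit_bytes: u64) -> bool {
    limit_bytes == 0 || size_bytes <= limit_bytes
}
```

With the limit deactivated as in this PR, `should_pick_up(size, AUTO_TRACK_LIMIT_BYTES)` is true for any `size`, which matches the stated goal of protecting every file from hard resets.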
There are cases where not disregarding huge files will lock up the app. For instance, somebody may have a database in the repo root that is not ignored, but that they never intend to stage and commit. Or somebody may accidentally move a large file into the repo.
Perhaps we can teach the edit mode to perform hard resets excluding these files? In terms of frequency of occurrence, it is much more likely to hit the case of accidentally trying to add a huge file to a tree than to reset it via edit mode...
I couldn't agree more.
For this PR, I'd hope that maintaining the current behaviour of `create_wd_tree()` picking up everything independently of size will be acceptable, as it at least won't make anything worse.
Then, for the future, I'd hope GB can be made to…
- …use the `.git/index` when creating commits for the user (but won't use the `.git/index` when creating internal commits, like for the oplog as it's more efficient)
- …never do hard resets
- …to leave untracked files alone just like Git does
- …while barking if there is some clash with an untracked file
It's the "good gitizen" principle so basic expectations that people have built up towards Git will be met.
Maybe there are better ways to do that even, there should be some room for innovation, too.
I'm surprised that the creation of the wd tree is slow when large files are present.
In my mind, the slowness was with both generating and displaying diffs of large files. As such, we only needed to put the limits in there.
> never do hard resets

Is that value compatible with edit mode?
> to leave untracked files alone just like Git does

Could you elaborate on what cases you're concerned about here?
> I'm surprised that the creation of the wd tree is slow when large files are present.

Independently of the UI, creating objects from content on disk will always be limited to 20-30MB/s per core due to ZLIB. This is horribly inefficient especially when said large file is a database that changes all the time while compressed by nature.
> Is that value compatible with edit mode?

Git has one mechanism to deal with this, being `git stash --include-untracked`. But even that I think should only be done once it's clear the untracked files are in the way. The content of the index would need to be stashed anyway to avoid losing it.
> Could you elaborate on what cases you're concerned about here?

It's the expectation that a Git client behaves like Git, and Git won't touch untracked files unless it's explicitly told to (or sometimes less explicitly, but it's opt-in always). `jj` has the advantage of being its own thing so it can redefine what's 'normal' to great effect. To my mind, that's something that GitButler can't do, at least not by default, merely to align with the expectations brought towards it.
Going beyond the 'normal' for Git, of course, is another form of innovation which should be explored to push the envelope, but that can't ever come at the cost of the safety of the user's data.
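To put the 20-30MB/s per-core figure from the reply above into perspective, the following back-of-the-envelope sketch estimates how long compressing a single large file might take. The function name and the choice of 1 MB = 1,000,000 bytes are assumptions made for illustration; real throughput depends on the content and the machine.

```rust
/// Rough single-core time estimate, in seconds, for compressing `size_bytes`
/// at `mb_per_sec` (assumption: 1 MB = 1_000_000 bytes).
pub fn estimated_seconds(size_bytes: u64, mb_per_sec: f64) -> f64 {
    (size_bytes as f64 / 1_000_000.0) / mb_per_sec
}
```

Under these assumptions a 2 GB database file would take between roughly 67 seconds (at 30MB/s) and 100 seconds (at 20MB/s) each time it has to be written as an object.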
Okay I see. I also mixed up the code path that is being changed here with this one:
gitbutler/crates/gitbutler-diff/src/diff.rs
Line 161 in 1e16896
let diff = repo.diff_tree_to_workdir_with_index(Some(&old_tree), Some(&mut diff_opts))?;
Follow-up to #4912.
Tasks
- `gix-status` for `create_wd_tree()`
- `gix` from `main` once status improvements GitoxideLabs/gitoxide#1746 is merged.
- `gitoxide` for `git_status` and `git_metrics` modules starship/starship#6476
Notes for the Reviewer
- `gitoxide` version uses the typesystem to enforce exhaustiveness. This is more verbose, but really helps to cover all the bases.
- `gix::object::tree::Editor` when making arbitrary edits to trees. This is rare though, so probably there aren't anymore bugs like these.
insta-ad.mov
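As an illustration of the exhaustiveness point in the reviewer notes, here is a small sketch of how a Rust `match` without a wildcard arm forces every case to be handled. The `Change` and `TreeAction` enums are hypothetical and much simpler than the actual `gix` status types; they only demonstrate the technique.

```rust
/// Hypothetical status kinds; illustrative names, not the `gix` types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Change {
    Untracked,
    Modified,
    Deleted,
    TypeChange,
}

/// What to do with the tree for a given change (also hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum TreeAction {
    Upsert,
    Remove,
}

/// There is no `_ =>` arm: adding a new `Change` variant makes this
/// function fail to compile until the new case is handled explicitly.
pub fn action_for(change: Change) -> TreeAction {
    match change {
        Change::Untracked | Change::Modified | Change::TypeChange => TreeAction::Upsert,
        Change::Deleted => TreeAction::Remove,
    }
}
```

The trade-off is the one named in the notes: the code is more verbose than a catch-all arm, but the compiler points out every place that needs attention when a new case appears.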
Performance
Measured on `gitlab` repository where `gitbutler-cli --trace branch apply -b main` was executed. `create_wd_tree()` is called then. Unfortunately, there is no way to call it individually unless one creates a new subcommand for the CLI. As the performance improvement is visible enough, I didn't spend the time on that.
With `git2` (Best out of two runs)
With `gix` (Best out of two runs)
`gix` is ~35% faster.
Research
Ordering Issues
The order of entries isn't defined, which makes it hard to decide what the final state actually is, or at least needs extra logic.
Something that doesn't work in our favor is:
By default, the order is reversed which works out, but like this the second event will remove the changes of the first.
Solved by applying untracked files last.
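The ordering problem can be sketched with a toy model. This is not the `gix` tree editor: the tree is a flat map from path to a placeholder value, and the `Edit` type, `apply`, and `untracked_last` are hypothetical names chosen for illustration. The scenario assumed here is a file-to-dir conversion, where a tracked file `a` was deleted and an untracked file `a/b` was created.

```rust
use std::collections::BTreeMap;

/// A toy edit; `untracked` marks upserts that come from untracked files.
#[derive(Debug, Clone)]
pub enum Edit {
    Upsert { path: String, untracked: bool },
    Remove { path: String },
}

/// Apply edits in the given order to a flat path -> placeholder map.
/// Removing `a` also removes everything below `a/`, similar to removing
/// a directory entry from a tree.
pub fn apply(edits: &[Edit]) -> BTreeMap<String, &'static str> {
    let mut tree = BTreeMap::new();
    for edit in edits {
        match edit {
            Edit::Upsert { path, .. } => {
                tree.insert(path.clone(), "blob");
            }
            Edit::Remove { path } => {
                let prefix = format!("{path}/");
                tree.retain(|p: &String, _| p != path && !p.starts_with(&prefix));
            }
        }
    }
    tree
}

/// Stable sort that moves untracked upserts to the end and keeps the
/// relative order of all other edits.
pub fn untracked_last(mut edits: Vec<Edit>) -> Vec<Edit> {
    edits.sort_by_key(|e| matches!(e, Edit::Upsert { untracked: true, .. }));
    edits
}
```

If the upsert of `a/b` is applied before the removal of `a`, the removal wipes out `a/b` again and the resulting tree is empty. After `untracked_last`, the removal runs first and `a/b` survives, which is the effect described above.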