Dynamic LOD switching #324

Draft: wants to merge 7 commits into base: master

sdumetz
Contributor

@sdumetz sdumetz commented Nov 14, 2024

So there is this idea I have worked on for the past few months that makes it possible to open a scene of ~200 models, each with a 4k diffuse (in High quality), on a mobile device.

It's still a very rough prototype but I wanted to share it and begin discussions on what requirements we'd have to meet for this to get merged into voyager.

This branch temporarily includes #266 to allow more camera mobility and a lot of performance patches that should be (or have been) extracted to their own PRs. It also needs to be rebased on Voyager's latest release. I'll do all this down the line as I clean everything up.

A working example is visible there:

https://lodtest.ecorpus.holusion.net/ui/scenes/Lascaux_retake2/view

The same model without LOD (everything is squashed together and downscaled to an 8k diffuse map) is publicly available on the French Ministry of Culture website. It takes longer to load initially, has a lot of blur and doesn't even work on most mobile devices.

I also have a simpler demo example that should clearly show what is happening:

https://lodtest.ecorpus.holusion.net/ui/scenes/simple_room/view

Reasoning

I started with 2 facts:

  • As stated in #254 (Derivative quality "High" is always used), even lower-end devices are fully capable of showing High quality (4k textures) derivatives, and there is no real point in trying to serve larger models even to high-end devices.
  • Quality is selected at the scene level. Voyager has the ability to use heterogeneous quality internally but no way to exploit this.

My objective was to be able to handle spatially large models: things like a large painting, a multi-figure sculpture or a whole room. In such an environment, I want what's right under my nose to have a very high texture resolution (about 4k for something filling the whole screen is a good upper bound), but on the other hand I don't care about the resolution of pretty much everything else.

Having a scene with 10s of objects, each with a 4k diffuse, tends to overload even desktop computers with low-end discrete GPUs. I want to be able to have 100s of them.

How it works

I repurposed the existing Derivative system to dynamically switch each model's quality depending on its "perceived importance".

Pretty much everything happens in the new CVDerivativeController class. The scene is initially loaded with Thumb quality (as it was before). Then the controller regularly sorts every model in the scene by importance (on-screen size and angle to camera are taken into account) and increments the quality of the most important model(s). If/when too many models are upgraded, it downgrades models as needed, starting with the least important ones.
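To make the loop above concrete, here is a minimal sketch of one upgrade/downgrade pass. It is not the actual CVDerivativeController code: the `ModelState` shape, the `rebalance` name, the four-step quality ladder and the "count of High models" budget are all assumptions made for illustration.

```typescript
// Hypothetical quality ladder, ordered from lowest to highest.
const QUALITIES = ["Thumb", "Low", "Medium", "High"] as const;
const HIGH = QUALITIES.length - 1;

// Hypothetical per-model bookkeeping.
interface ModelState {
  id: string;
  importance: number; // higher means more important, recomputed before each pass
  level: number;      // index into QUALITIES
}

/**
 * One pass of the loop: step up the most important in-budget model,
 * then step down the least important High models while over budget.
 */
function rebalance(models: ModelState[], maxHigh: number): void {
  // Most important first. The array is copied, the model objects are shared.
  const ranked = [...models].sort((a, b) => b.importance - a.importance);

  // Upgrade by one step the most important model that may still go higher.
  const toUpgrade = ranked.slice(0, maxHigh).find((m) => m.level < HIGH);
  if (toUpgrade) toUpgrade.level += 1;

  // If too many models sit at High, downgrade from the least important end.
  let highCount = ranked.filter((m) => m.level === HIGH).length;
  for (const m of [...ranked].reverse()) {
    if (highCount <= maxHigh) break;
    if (m.level === HIGH) {
      m.level -= 1;
      highCount -= 1;
    }
  }
}
```

Calling `rebalance` repeatedly converges one step at a time, which mirrors the "regularly sorts ... and increments" behaviour described above. When importances change, the newly important model climbs first and the least important High model is only demoted once the budget is actually exceeded.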

Heuristics

Tradeoffs that had to be made. They can and should be adjusted.

Performance

Estimating a performance budget is hard (see #254). Even if we could reliably determine a device's graphics power (we can't), we can't know whether the user wants us to use 100% of their computer's power.

I settled somewhat arbitrarily on an initial budget of 2 High-resolution models, which is then further decreased if we detect a low-end device (CPU count, RAM size, whether it is a mobile device).
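As a rough sketch of such a heuristic: the thresholds below (4 cores, 4 GiB), the `DeviceHints` shape and the `estimateHighBudget` name are assumptions, not the values used in this branch. In a browser the hints could be filled from `navigator.hardwareConcurrency` and `navigator.deviceMemory` (the latter is not available in every browser), which is why every field except `isMobile` is optional.

```typescript
// Hypothetical device hints, gathered by the caller.
interface DeviceHints {
  cpuCount?: number;  // logical cores, if known
  memoryGiB?: number; // approximate RAM, if known
  isMobile: boolean;
}

/**
 * Start from an initial budget of High-resolution models and take one off
 * for each low-end signal, never going below one model.
 */
function estimateHighBudget(hints: DeviceHints, initial = 2): number {
  let lowEndSignals = 0;
  if (hints.isMobile) lowEndSignals += 1;
  // Thresholds are illustrative assumptions.
  if (hints.cpuCount !== undefined && hints.cpuCount <= 4) lowEndSignals += 1;
  if (hints.memoryGiB !== undefined && hints.memoryGiB <= 4) lowEndSignals += 1;
  return Math.max(1, initial - lowEndSignals);
}
```

Unknown values are treated as "not low-end" so that a browser hiding this information does not get penalised twice.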

Model weight

Models are weighted based on their size in camera space, their distance to the camera and their angle relative to Camera.forward.

The Distance modifier is somewhat redundant with size but we kinda want far away models to be less important even if they are very big. In fact most LOD algorithms are purely distance-based.
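A minimal sketch of a weight combining these three terms could look like the following. The way the factors are shaped and multiplied is an assumption made for illustration, as are the `WeightInput` fields and the `modelWeight` name. The branch may combine them differently.

```typescript
type Vec3 = [number, number, number];

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const norm = (a: Vec3): number => Math.sqrt(dot(a, a));

// Hypothetical inputs for one model.
interface WeightInput {
  visibleSize: number;  // fraction of the screen covered, in [0, 1]
  center: Vec3;         // model center in world space
  cameraPosition: Vec3; // camera position in world space
  cameraForward: Vec3;  // unit vector
  far: number;          // camera far distance
}

function modelWeight(input: WeightInput): number {
  const toModel = sub(input.center, input.cameraPosition);
  const distance = norm(toModel);
  // Cosine of the angle to Camera.forward: 1 when centered, 0 or less when beside or behind.
  const cosAngle = distance > 0 ? dot(toModel, input.cameraForward) / distance : 1;
  const angleFactor = Math.max(0, cosAngle);
  // Linear falloff with distance, reaching 0 at the far plane.
  const distanceFactor = Math.max(0, 1 - distance / input.far);
  return input.visibleSize * angleFactor * distanceFactor;
}
```

Because the angle and distance are measured against the model's center, a very large model whose center is off-axis or behind the camera is penalised even when part of it fills the screen.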

@sdumetz sdumetz mentioned this pull request Nov 14, 2024
@gjcope
Collaborator

gjcope commented Nov 22, 2024

This is exciting, thanks so much! It may take a minute to get to, but I'm looking forward to testing it out.

@gjcope
Collaborator

gjcope commented Dec 16, 2024

@sdumetz Nice work. In terms of what we need to get it merged, I think a step that would be helpful (for me) would be to have a PR without #266 to make things a little clearer. Then I think we will need at least a few different test cases outside of Lascaux (though that one works quite well) which we can help with on our end. I think your general approach is great and clearly well thought out. There are a few corner cases to be considered, but I see you've got many of them already commented.

@sdumetz
Contributor Author

sdumetz commented Dec 17, 2024

Cleaned it up, removed #266 (I'll still include it in my test builds because otherwise the cave scenes are totally unusable)

We pushed it to a ~700 tiles model where it unsurprisingly showed its limits. This is unfortunately about 1/10th of what we need to have a full tour of the Chauvet Cave.

I plan to add a way to totally unload some models from memory once they are farther out than the camera's far distance. This would also speed up initial loading by not bothering with out-of-view models initially.

Aside from caves, we have access to high-quality models of 13th-century stone sculpture fragments, so I plan on creating scenes with them shortly. Any samples you have would be welcome: this definitely needs to be tested in more diverse cases.

@gjcope
Collaborator

gjcope commented Jan 6, 2025

Dug into this some with some simple test cases and found a couple of issues with the visibleSize calculation.

  • Looks like the object pose is not being taken into account when applying transforms to the local bounding box
  • The bounds check comparing the z-component of the projected coordinates with the near plane won't work, because the projected depth is normalized and the near plane is not. So anything with a near plane > 1 (we have a number of those things) will end up with null resulting bounds.
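The second point can be illustrated with plain math, assuming an OpenGL-style perspective projection (the convention used by a three.js perspective camera, where projecting a point yields normalized device coordinates). The helper names `ndcDepth` and `isWithinDepthRange` are hypothetical and this is not the suggested patch itself.

```typescript
/**
 * Normalized device depth of a point located `depth` world units in front of
 * the camera: -1 on the near plane, +1 on the far plane.
 */
function ndcDepth(depth: number, near: number, far: number): number {
  return (far + near) / (far - near) - (2 * far * near) / ((far - near) * depth);
}

/** A projected point lies between the near and far planes when its depth is in [-1, 1]. */
function isWithinDepthRange(depth: number, near: number, far: number): boolean {
  const z = ndcDepth(depth, near, far);
  return z >= -1 && z <= 1;
}

// With near = 5 and far = 1000, a point 50 units away projects to roughly 0.81.
// Comparing that value against `near` (5) rejects it, although it is clearly visible.
const projected = ndcDepth(50, 5, 1000);
const naiveCheckPasses = projected >= 5;                  // false: wrong units
const normalizedCheckPasses = isWithinDepthRange(50, 5, 1000); // true
```

The normalized depth never exceeds 1 for a visible point, so a comparison against any near plane greater than 1 rejects every point.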

Suggested patches here: d0a31 but still needs testing so feel free to address otherwise.

Generally, it would be nice to not have it evaluate every frame and be driven by camera movement (or load), but I think that can be an optimization. Planning to do more testing soon.

@sdumetz
Contributor Author

sdumetz commented Jan 7, 2025

I'll look up the patch, thanks. The rotation looks OK and I can't remember what I was trying to do with the near plane comparison but you're right, it doesn't make sense.

> Generally, it would be nice to not have it evaluate every frame and be driven by camera movement (or load), but I think that can be an optimization. Planning to do more testing soon.

This is the reason I put it in the tock callback, which could probably get called with an additional tickUpdated boolean from CGraph.pulse(). Not really sure how that would work out, but it is probably worth some time once the core features are running properly. However, even if it is optimized away when the scene is static, we should probably debounce it so it does not run every frame when the camera moves, because the tree traversal isn't cheap.

I found another issue today with the slicer tool: when the selected derivative is changed, the slicer isn't applied to the replacement model.
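One possible shape for this, as a sketch only: a rate limiter (strictly speaking a throttle rather than a debounce) that remembers camera movement and runs the expensive pass at most once per interval. The `ThrottledUpdate` class and its `cameraMoved` argument are hypothetical. How that flag would be derived from CGraph.pulse() is left open, as in the comment above.

```typescript
class ThrottledUpdate {
  private lastRun = -Infinity;
  private dirty = false;
  private readonly intervalMs: number;
  private readonly update: () => void;

  constructor(intervalMs: number, update: () => void) {
    this.intervalMs = intervalMs;
    this.update = update;
  }

  /**
   * Call once per frame. Runs `update` only if the camera moved since the
   * last run and at least `intervalMs` has elapsed. Returns true when it ran.
   */
  tock(cameraMoved: boolean, nowMs: number): boolean {
    if (cameraMoved) this.dirty = true;
    if (!this.dirty || nowMs - this.lastRun < this.intervalMs) return false;
    this.dirty = false;
    this.lastRun = nowMs;
    this.update();
    return true;
  }
}
```

Keeping a `dirty` flag means the last camera position is still evaluated once the interval elapses, even if the camera has stopped moving by then. A static scene costs one boolean check per frame.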

@gjcope
Collaborator

gjcope commented Jan 7, 2025

Some more notes below. These can all be seen testing with https://3d-api.si.edu/content/document/d8c63ba6-4ebc-11ea-b77f-2e728ce88125/document.json (note: medium and high derivatives are the same currently because we can't support high for all pieces which makes this a good use case for us to support true high-res)

  • Can you say some more about the intent of angle in the weight computation? I'm seeing some issues that point to this, but I'm unsure what the goal is with that piece. One example is positioning the spacesuit to get a close-up view of the left arm (suit-p1). This would be a common use case where we want to zoom in to get better detail. Because that piece is on the camera-side of the origin, the angle is ~180deg and the weight takes a huge hit, resulting in a counter-intuitive drop of that mesh to 'thumb' quality.

  • This scene also has a couple of objects that only have a 'high' derivative. This can create a condition where they are flagged for quality change every frame but are unable to do so. Not a serious issue, but it does result in "derivative quality 'XXX' not available, using higher quality" message being spammed to the console which can eventually affect browser performance.

  • I'm seeing some 'Violation' warnings like this: https://discourse.threejs.org/t/violation-requestanimationframe-handeler-took-xx-ms/49567
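For the console spam mentioned in the second note, a small guard that reports each distinct message only once is one option. This is a sketch with hypothetical names (`warnOnce`, `seenWarnings`), not something that exists in Voyager.

```typescript
// Messages that have already been reported during this session.
const seenWarnings = new Set<string>();

/**
 * Logs `message` the first time it is seen and stays silent afterwards.
 * Returns true when the message was actually logged.
 */
function warnOnce(
  message: string,
  log: (msg: string) => void = (msg) => console.warn(msg),
): boolean {
  if (seenWarnings.has(message)) return false;
  seenWarnings.add(message);
  log(message);
  return true;
}
```

Keying on the full message text keeps one warning per model and quality pair, provided the message includes both.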

@sdumetz
Contributor Author

sdumetz commented Jan 8, 2025

The angle computation was aimed at prioritizing things that are centered on-screen. It makes a lot of sense in interior spaces when the "tiles" are generally smaller than the camera's frustum. A lot less when zoomed in on models like the suit. I suppose the distance computation will have the same limitation.

Ideally I'd like to compute those against the "best-case-point" from _localBox but not sure how to compute this.
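For the distance term at least, one candidate for such a point is the point of the box closest to the camera, obtained by clamping the camera position to the box extents. The sketch below assumes the camera position has already been transformed into the model's local space and that the box is axis-aligned there. `Point3`, `LocalBox`, `closestPointOnBox` and `distanceToBox` are illustrative names, and this does not address the angle term.

```typescript
type Point3 = [number, number, number];

// Axis-aligned box in the model's local space (hypothetical shape).
interface LocalBox {
  min: Point3;
  max: Point3;
}

const clamp = (v: number, lo: number, hi: number): number => Math.min(Math.max(v, lo), hi);

/** Point of the box nearest to `p`. Returns `p` itself when it is inside the box. */
function closestPointOnBox(p: Point3, box: LocalBox): Point3 {
  return [
    clamp(p[0], box.min[0], box.max[0]),
    clamp(p[1], box.min[1], box.max[1]),
    clamp(p[2], box.min[2], box.max[2]),
  ];
}

/** Distance from `p` to the box, 0 when `p` is inside. */
function distanceToBox(p: Point3, box: LocalBox): number {
  const q = closestPointOnBox(p, box);
  return Math.hypot(p[0] - q[0], p[1] - q[1], p[2] - q[2]);
}
```

With this measure, a camera placed right next to (or inside) a large piece gets a distance of zero regardless of where the piece's center is.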

> I'm seeing some 'Violation' warnings like this: https://discourse.threejs.org/t/violation-requestanimationframe-handeler-took-xx-ms/49567

Curious what contributes meaningfully to this? Is the scene too long to render or does the tree traversal meaningfully contribute to this? Wrapping the tock in a setTimeout to move it outside of requestAnimationFrame's microtask is an option but I think it is more likely to be hiding the issue than fixing it.

If it's not the render or CVDerivativesController.tock() that's causing it, we should probably move the culprit outside of requestAnimationFrame's scope.

> This scene also has a couple of objects that only have a 'high' derivative. This can create a condition where they are flagged for quality change every frame but are unable to do so. Not a serious issue, but it does result in "derivative quality 'XXX' not available, using higher quality" message being spammed to the console which can eventually affect browser performance.

Maybe once everything is settled we move this message to somewhere where it's triggered only when dynamic LOD is disabled and we are actually trying to load a specific quality and are not able to do so.

Pushing an algorithm change alongside your patch and a rebase on master, to more aggressively unload higher-res models when they are no longer needed. Hopefully this is also more readable though I'm still not very proud of my work in this matter.

@gjcope
Collaborator

gjcope commented Jan 8, 2025

> The angle computation was aimed at prioritizing things that are centered on-screen. It makes a lot of sense in interior spaces when the "tiles" are generally smaller than the camera's frustum. A lot less when zoomed in on models like the suit. I suppose the distance computation will have the same limitation.

Got it. I think the updated calculation makes sense for this.

> Curious what contributes meaningfully to this? Is the scene too long to render or does the tree traversal meaningfully contribute to this? Wrapping the tock in a setTimeout to move it outside of requestAnimationFrame's microtask is an option but I think it is more likely to be hiding the issue than fixing it.

Good question. I'll see if I can narrow it down.

With the current update I'm seeing a regression testing the space suit example above. Just zooming in from the default view results in the assets (same quality) being repeatedly reloaded over and over. Looking into it, but tough to debug with the console spamming. EDIT: Looks like they are stuck bouncing back and forth between upgrade/downgrade.
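A common way to damp this kind of upgrade/downgrade flip-flopping is hysteresis: use a higher threshold to upgrade than to downgrade, so a weight hovering around a single cut-off does not toggle the state on every pass. The sketch below is only an illustration of that general technique with made-up names and thresholds. It is not a diagnosis of the regression nor the fix applied in this branch.

```typescript
// Hypothetical thresholds, with downgradeBelow strictly lower than upgradeAbove.
interface Thresholds {
  upgradeAbove: number;
  downgradeBelow: number;
}

/**
 * Decides whether a model should be at High after this pass.
 * Between the two thresholds the current state is kept.
 */
function nextIsHigh(isHigh: boolean, weight: number, t: Thresholds): boolean {
  if (!isHigh && weight > t.upgradeAbove) return true;
  if (isHigh && weight < t.downgradeBelow) return false;
  return isHigh;
}
```

With thresholds of 0.6 and 0.4, a weight oscillating between 0.45 and 0.55 never changes the model's state, whereas a single 0.5 cut-off would reload the asset on every swing.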
