[BUG] Read-Only Vectorstore with GCS persistence goes stale #2612
Comments
Hi @rjrebel10, apologies for the late follow-up on this, and sorry you've run into an issue with Deep Lake. Worry not, I'm looping in @tatevikh, who will advise further.
I'm wondering if it's related to the caching layer storing previously saved versions of files. Does running
@nvoxland I tried the clear_cache() method and it did not work. It still only shows the stale data and does not see the new commit to the dataset.
+1
Seeing the behavior @rjrebel10 describes. Gotta work around it by basically redownloading everything manually, which makes the connector not all that useful.
What you are seeing is the currently expected behavior. When you load a dataset, you connect to it at the current point in time, and the loaded dataset remains consistent with that point. We are working on longer-term changes that will let the data you get back from the dataset remain fixed when you need it fixed and up to date when you need it up to date. In the meantime, we're looking at adding a way to refresh a currently loaded dataset beyond simply calling
Severity
P0 - Critical breaking issue or missing functionality
Current Behavior
When running the Deeplake Vectorstore with a GCS path, changes and commits made by a separate Deeplake instance on the same GCS path are not picked up by the already running Deeplake Vectorstore instance.
Steps to Reproduce
Expected/Desired Behavior
A Deeplake Vectorstore with cloud persistence should periodically pick up and pull any changes made to the persisted data by another vectorstore instance.
Alternatively, provide a refresh method to trigger any Deeplake Vectorstore to refresh its data from cloud persistence.
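Until such a refresh method exists, the two requests above (periodic pickup and an explicit refresh) can be approximated on the caller's side. The following is only a minimal sketch and not part of Deep Lake: `RefreshingHandle` and the `loader` callable are hypothetical names, and it assumes that re-opening the vector store from the GCS path returns the latest committed state, which this thread suggests but does not confirm. How the store is actually opened is left entirely to the `loader` you supply.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingHandle(Generic[T]):
    """Hypothetical wrapper: re-open a point-in-time handle when it gets too old.

    `loader` is any zero-argument callable that opens and returns a fresh
    handle (for example, a function that constructs the vector store from
    its cloud path). Nothing here calls a Deep Lake API directly.
    """

    def __init__(
        self,
        loader: Callable[[], T],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._handle: Optional[T] = None
        self._loaded = False
        self._loaded_at = 0.0

    def refresh(self) -> T:
        """Explicit refresh: discard the current handle and load a new one."""
        self._handle = self._loader()
        self._loaded = True
        self._loaded_at = self._clock()
        return self._handle

    def get(self) -> T:
        """Return the current handle, reloading it first if it has expired."""
        expired = self._clock() - self._loaded_at >= self._max_age
        if not self._loaded or expired:
            return self.refresh()
        return self._handle  # type: ignore[return-value]
```

A reader process would call `get()` before each query and `refresh()` whenever it knows a writer has committed. The trade-off is that every reload pays the full cost of re-opening the dataset, so `max_age_seconds` should be chosen with that cost in mind.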
Python Version
No response
OS
No response
IDE
No response
Packages
No response
Additional Context
No response
Possible Solution
No response
Are you willing to submit a PR?