Question: .obao file seeks for constructing Blake3 Merkle Proofs #40
Comments
I'm afraid I don't know anything about Ethereum proofs or what their requirements are. But one thing to keep in mind is that verifying a Bao slice requires access to some low-level BLAKE3 APIs, because you need to be able to verify "non-root chaining values". If Ethereum gave you, say, the raw compression function with all its parameters, that would be fine, but if it only gave you the regular all-at-once BLAKE3 hash, that wouldn't be enough.
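For reference, here's roughly what those low-level pieces look like using the `guts` module of the blake3 crate (the same primitives bao builds on); treat this as a sketch rather than gospel:

```rust
use blake3::guts::{parent_cv, ChunkState, CHUNK_LEN};

// Compute the *non-root* chaining value of one chunk. The chunk's index
// within the file is an input to the hash, so any proof has to carry it.
fn chunk_cv(chunk: &[u8], chunk_index: u64) -> blake3::Hash {
    assert!(chunk.len() <= CHUNK_LEN);
    let mut state = ChunkState::new(chunk_index);
    state.update(chunk);
    state.finalize(false) // false: this chunk is not the root of the tree
}

// Combine two child chaining values into their parent. Only the single
// topmost combination in the entire tree passes is_root = true.
fn combine(left: &blake3::Hash, right: &blake3::Hash, is_root: bool) -> blake3::Hash {
    parent_cv(left, right, is_root)
}
```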
Does that make sense? I realize the slicing concept is super abstract, so don't hesitate to ask for more clarification.
Would it be possible to use those low-level APIs to build a Merkle proof off-chain, which you then present to a hashing function on-chain in some sort of JSON format? This would look like a classic Merkle proof, where you use a list of annotated sibling hashes to verify that a block hashes up to a Merkle root.
The use case we're aiming for is running the oracles on some sort of low-overhead serverless framework, so we want to cut down on memory usage as much as possible. We also plan on loading these .obao files from remote storage (e.g. we're using S3 for our demo product), so we want to cut down on the amount of data we query at one time. From what I understand, SliceExtractor needs the encoding (.bao or .obao) loaded in memory before it can generate a slice. For our use case, I can imagine two things working:
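For what it's worth, here's a minimal sketch of the slice round trip against an outboard encoding, using bao's documented `outboard`, `SliceExtractor::new_outboard`, and `SliceDecoder` APIs (sizes and offsets here are made up for illustration):

```rust
use std::io::prelude::*;
use std::io::Cursor;

fn main() -> std::io::Result<()> {
    // Pretend this is the multi-gigabyte file; small here for the example.
    let input = vec![0u8; 1 << 20];
    // Produce the outboard encoding (.obao) and the root hash up front.
    let (obao, hash) = bao::encode::outboard(&input);

    // Extract a self-verifying slice covering [slice_start, slice_start + slice_len).
    let (slice_start, slice_len) = (65536u64, 4096u64);
    let mut extractor = bao::encode::SliceExtractor::new_outboard(
        Cursor::new(&input[..]),
        Cursor::new(&obao[..]),
        slice_start,
        slice_len,
    );
    let mut slice = Vec::new();
    extractor.read_to_end(&mut slice)?;

    // A verifier holding only the root hash checks the slice as it reads it.
    let mut decoder = bao::decode::SliceDecoder::new(&slice[..], &hash, slice_start, slice_len);
    let mut out = Vec::new();
    decoder.read_to_end(&mut out)?;
    assert_eq!(&out[..], &input[slice_start as usize..][..slice_len as usize]);
    Ok(())
}
```

Note that the extractor's inputs only need to implement Read + Seek, so "loaded in memory" isn't strictly required; any seekable source will do.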
Alternatively, we could cut down on that sort of overhead by chunking files correctly, but that's a topic for another day. I recognize that this use case is not exactly what your project aims to support. We have other plans for Bao that are more in line with its potential for verifying data streams, so we really want to use it in our design. That being said, I just wanted to get your input on the above, to see whether we can integrate Bao into our oracle ecosystem the way we've been planning, or whether we need to rethink our architecture in order to get to a demo in a reasonable timeframe. Thanks for your help/attention/your very exciting code!
I'm afraid I still don't understand this concept well enough to answer this part of the question.
Actually it doesn't. I don't know how comfortable you are reading Rust code, but if you take a look at
Like, our Solidity code would be called with a list of sibling hashes and a block, and it would hash that block up to a BLAKE3 root hash that it knows about. Could we construct those sibling hashes from a .obao file such that our contract just calls to
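For illustration, here's roughly what that verification loop could look like off-chain in Rust. This is an editor's sketch, not an API bao ships: the `InclusionProof` layout and field names are invented, and it assumes the file spans more than one chunk (so the root is a parent node):

```rust
use blake3::guts::{parent_cv, ChunkState};

// Hypothetical proof layout: the chunk's index in the file, plus the
// sibling chaining value at each level, ordered from leaf up to root.
struct InclusionProof {
    chunk_index: u64,
    siblings: Vec<(blake3::Hash, bool)>, // (sibling CV, true if sibling is the left child)
}

fn verify(chunk: &[u8], proof: &InclusionProof, root: &blake3::Hash) -> bool {
    // Leaf chaining values are never the root in a multi-chunk file.
    let mut state = ChunkState::new(proof.chunk_index);
    state.update(chunk);
    let mut cv = state.finalize(false);
    let last = proof.siblings.len() - 1;
    for (i, (sibling, sibling_is_left)) in proof.siblings.iter().enumerate() {
        let is_root = i == last; // only the topmost combine finalizes as root
        cv = if *sibling_is_left {
            parent_cv(sibling, &cv, is_root)
        } else {
            parent_cv(&cv, sibling, is_root)
        };
    }
    &cv == root
}
```

Doing the same thing on-chain would additionally require the chain to expose the BLAKE3 compression function with all its flags and counters, which is exactly the EIP question discussed above.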
I am completely new to Rust, so this is good to know. Merely streaming that data to Lambda could work for us. I'm not sure how the dummy Read/Seeks would work, since in that process there's always going to be some process that's reading the .obao into memory. Maybe AWS Glue could do something like that, but it sounds complicated/resource-intensive compared to just streaming.
It's unclear whether it'll be possible to hash on-chain. I think the answer is possibly "it'll be prohibitively expensive even if we get the EIP we want", so this is mostly moot. However, re: the second point, I think wrapping the SliceExtractor around an S3 interface shouldn't be so bad. Thanks @oconnor663 for clarifying. @amiller68 I can help with this in a few days.
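A rough sketch of that idea: a Read + Seek adapter over ranged GETs that SliceExtractor can consume, so the .obao never has to be fully resident. `fetch_range` is a hypothetical helper standing in for a real S3/HTTP client, and real code would want buffering so each small read doesn't become its own request:

```rust
use std::io::{self, Read, Seek, SeekFrom};

struct RemoteReader {
    url: String,
    pos: u64,
    len: u64, // total object size, e.g. from a HEAD request
}

impl Read for RemoteReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let end = (self.pos + buf.len() as u64).min(self.len);
        if self.pos >= end {
            return Ok(0); // at or past end of object
        }
        // Assumed to return exactly the bytes [self.pos, end).
        let bytes = fetch_range(&self.url, self.pos, end)?;
        buf[..bytes.len()].copy_from_slice(&bytes);
        self.pos += bytes.len() as u64;
        Ok(bytes.len())
    }
}

impl Seek for RemoteReader {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        self.pos = match from {
            SeekFrom::Start(n) => n,
            SeekFrom::End(n) => (self.len as i64 + n) as u64,
            SeekFrom::Current(n) => (self.pos as i64 + n) as u64,
        };
        Ok(self.pos)
    }
}

// Hypothetical: issue a ranged GET (e.g. S3 GetObject with a Range header).
fn fetch_range(url: &str, start: u64, end: u64) -> io::Result<Vec<u8>> {
    unimplemented!("fetch bytes {start}..{end} of {url}")
}
```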
Yes, thank you so much @oconnor663!
Hi @oconnor663, thanks for all your help, and sorry for all the silly questions... Please feel free to correct me if I'm misunderstanding anything. I wanted to quickly clarify something about the slice overhead.

I see the 6.25% number in the BLAKE3 paper, but that's the constant per-chunk overhead of verified streaming: assuming you're attempting to verify the entire file, it's the extra percentage of bytes you need to pass along in order to verify each particular chunk, on top of the bytes you've already received in the preorder traversal.

On the other hand, we're going to be validating very small, randomly selected slices (probably in the hundreds or thousands of bytes) of multi-gigabyte, possibly petabyte-size files. So we definitely don't need the entire preorder traversal; we just need the sibling hashes for everything on the path up to the root in order to prove. That should be O(log(filesize)), if done in an optimal manner.

So does the O(n) overhead mean the SliceExtractor is grabbing the entire preorder traversal up until it hits the location you're seeking to? In that case, I'm not sure whether this is the right library for our use case. Do you think it'd be faster to modify bao for our use case by writing some kind of optimized SliceExtractor that hops around in the .obao to grab only the sibling hashes of the chunk in question, or would it be quicker to just reimplement this functionality myself?
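As a rough sanity check on those numbers: a full encoding stores one 64-byte parent node for each 1024-byte chunk after the first (n - 1 parents for n chunks), hence roughly 64/1024 = 6.25% overhead when you ship the whole preorder traversal. A slice touching a single chunk only carries the parents on the root path, about 64 * log2(num_chunks) bytes: roughly 1.25 KiB for a 1 GiB file (2^20 chunks), and only about 2.5 KiB even for 1 PiB (2^40 chunks).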
I'm starting to look through all the tree magic in encode/decode.rs... It's been a second since I took algo... My gut says the following statement is true: if you have the two invariants that left subtrees are (1) complete and (2) of greater weight than the right subtree, you can uniquely, and in constant time, compute the index in the preorder traversal where you'll find a given node in the tree. Do you think that intuition is right? I think that means it should be pretty quick to write some kind of ProofExtractor interface that reads in .obao files and spits out a list of sibling nodes on the path from leaf to root, in order to do a proof of inclusion at a given file chunk. If my logic is right and I implemented this, would you prefer me to PR this repo with changes or just fork things?
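If that intuition holds, and it matches bao's layout as I read it (an 8-byte length header, then 64-byte parent nodes in preorder, with each left subtree holding the largest power-of-two number of chunks that still leaves something for the right), the offset walk is a short loop. A sketch, not tested against the crate:

```rust
const HEADER_SIZE: u64 = 8;
const PARENT_SIZE: u64 = 64;

// Byte offsets within a .obao of every parent node on the path from the
// root down to `chunk_index`, for a file of `num_chunks` chunks.
fn path_offsets(num_chunks: u64, chunk_index: u64) -> Vec<u64> {
    assert!(chunk_index < num_chunks);
    let mut offsets = Vec::new();
    let mut offset = HEADER_SIZE; // parents start right after the length header
    let mut first = 0; // index of the first chunk in the current subtree
    let mut count = num_chunks; // number of chunks in the current subtree
    while count > 1 {
        offsets.push(offset); // this subtree's root parent node
        offset += PARENT_SIZE;
        // The left subtree holds the largest power of two strictly below `count`.
        let left = largest_power_of_two_leq(count - 1);
        if chunk_index < first + left {
            count = left; // descend left: its parents come immediately next
        } else {
            // A subtree with c chunks contains c - 1 parents; skip the left's.
            offset += (left - 1) * PARENT_SIZE;
            first += left;
            count -= left;
        }
    }
    offsets
}

fn largest_power_of_two_leq(n: u64) -> u64 {
    debug_assert!(n >= 1);
    1u64 << (63 - n.leading_zeros())
}
```

Note this yields the offsets of the parent nodes themselves (each is a left CV followed by a right CV); a proof wants the sibling half of each pair, chosen by which way the walk descended.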
Ok, it looks like my thought was correct and you already implemented this (yay!)... still trying to figure out where the 7% overhead for an extracted slice is coming from. (See encode.rs, line 777 at commit 08ca9dd.)
Ok, I'm stumped. From the code, it looks like the length of the outboard encoding should be roughly 64 bytes per chunk. Have we just been misunderstanding each other in terms of which concerns we have, because we're working with a very small/constant slice_len? If so, I think we can just use the SliceExtractor as-is.
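For reference, a back-of-the-envelope version of that size estimate, assuming bao's default 1024-byte chunks, one 64-byte parent node per chunk pair, and the 8-byte length header:

```rust
// Approximate .obao size: 8-byte header plus (num_chunks - 1) parent nodes.
fn outboard_len(content_len: u64) -> u64 {
    let num_chunks = ((content_len + 1023) / 1024).max(1);
    8 + (num_chunks - 1) * 64
}
```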
If I understand you correctly, that's exactly what the SliceExtractor does.
No worries at all. I'm thrilled to see Bao get some real use cases.
I have a very similar question, but for a different use case. I'm trying to optimize a random-access reader client. As @laudiacay said:

> you can uniquely and in constant time compute the index in the preorder traversal where you'll find a given node in the tree

Assuming that's true, I was also wondering where the 7% I've seen referenced elsewhere comes into play wrt slices. (I can't seem to find that reference anymore though...) Really, I need to spend some time and dig into the code :)

```rust
let mut extractor = bao::encode::SliceExtractor::new(encoded_cursor, slice_start, slice_len);
let mut slice = Vec::new();
extractor.read_to_end(&mut slice)?;
```

I'm hoping that the computational complexity of this operation is O(slice_len + log N).
Yeah it is, I went and read the code. He's treating the file size N as constant and the slice_len as the interesting variable that things scale with in his computations, because that's what makes sense for this use case :)
Hi,
I am looking to implement a proof-of-storage ecosystem based on .obao files. The idea is that an ecosystem of oracles will query a storage provider for a chunk of a file and then construct some sort of BLAKE3 Merkle proof from an .obao file. I know from a solution proposed to @jonahkaye in this thread that something like this can be done by compiling Bao slices from an entire .obao file and the desired chunks of the file that need to be verified. What I want to know is: