Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented function for computing of old account states based on deltas #563

Open
wants to merge 34 commits into
base: next
Choose a base branch
from

Conversation

polydez
Copy link
Contributor

@polydez polydez commented Dec 4, 2024

Related to #527

This function (compute_old_account_states) computes state for given accounts for the specified block number based on deltas for accounts stored in the database. Since it takes into account all deltas from the creation of account up to the current block, if account has a lot of changes, it might be more expensive than applying deltas backward from the latest account state to the specified block, so it might be a subject for future improvement.

# Conflicts:
#	crates/store/src/db/sql/mod.rs
#	crates/store/src/db/sql/utils.rs
#	crates/store/src/db/tests.rs
Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just style/nit suggestions :) LGTM.

crates/store/src/db/sql/utils.rs Show resolved Hide resolved
@@ -471,6 +474,121 @@ fn test_sql_public_account_details() {
assert_eq!(read_delta, Some(delta2));
}

#[test]
fn test_sql_public_account_details_for_old_block() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more a meta question (probably for @bobbinth) - why do we prefix tests with test?

They're already in a mod tests scope and have #[test]. Sort of feels like Hungarian notation and just stutters when running tests:

tests::test_xxx
tests::test_xxy
tests::test_xxz

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no good reason for this, and we are not always being consistent here (most of the time we do use test_ prefix but not always).

I'm totally fine with removing test_ prefixes if someone wants to rename them.

Comment on lines -133 to +136
/// Select the latest account details by account id from the DB using the given [Connection].
/// Select the latest account details by account ID from the DB using the given [Connection].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we introduce a comment spell-checker into CI? This would also negate the random air-drop-farming PRs we'll get in the future (they love submitting doc fixes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no previous experience with spell-checking on CI. This should work, if it's easy to add new words to a dictionary. And if it won't require disabling it for particular comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me neither; this looks promising? But not really important at this stage :)

Copy link
Collaborator

@igamigo igamigo Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree with this as well (although I don't have prior experience to gauge how effective those tools can be). There have already been quite a few of those types of PRs.

account_ids: &[AccountId],
block_number: BlockNumber,
) -> Result<Vec<Account>> {
fn find_account(
Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short doc comment here would be good if you keep it.

Comment on lines 392 to 401
match find_account(&mut accounts, account_id)?.get_mut().storage.entry(slot) {
Entry::Vacant(entry) => {
entry.insert(StorageSlot::Value(value));
},
Entry::Occupied(_) => {
return Err(DatabaseError::DataCorrupted(format!(
"Duplicate storage slot: {slot}"
)));
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use if <..>.insert(..).is_some() { return Err(..) }. Unclear if its more legible though.

Suggested change
match find_account(&mut accounts, account_id)?.get_mut().storage.entry(slot) {
Entry::Vacant(entry) => {
entry.insert(StorageSlot::Value(value));
},
Entry::Occupied(_) => {
return Err(DatabaseError::DataCorrupted(format!(
"Duplicate storage slot: {slot}"
)));
},
}
let mut account = find_account(&mut accounts, account_id)?;
if account.get_mut().storage.insert(slot, value).is_some() {
return Err(DatabaseError::DataCorrupted(format!(
"Duplicate storage slot: {slot}"
)));
}

Comment on lines 380 to 381
find_account(&mut accounts, account_id)?.get_mut().nonce =
Some(nonce.try_into().map_err(DatabaseError::DataCorrupted)?);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: multi-line chaining from a stand-alone function is sometimes hard to read/grok:

Suggested change
find_account(&mut accounts, account_id)?.get_mut().nonce =
Some(nonce.try_into().map_err(DatabaseError::DataCorrupted)?);
let nonce = row.get::<_, u64>(1)?.try_into().map_err(DatabaseError::DataCorrupted)?;
let mut account = find_account(&mut accounts, account_id)?;
account.get_mut().nonce = Some(nonce);

Comment on lines +345 to +350
struct AccountRecord {
code: AccountCode,
nonce: Option<Felt>,
storage: BTreeMap<u8, StorageSlot>,
assets: Vec<Asset>,
}
Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding an impl to this to separate it from the record iteration?

impl AccountRecord {
    // We should start by reading the `nonce` then it doesn't have to be optional?
    fn new(id: AccountId, nonce: Nonce) -> Self {..}

    fn push_asset(&mut self, ...)

    ...
}

Comment on lines 207 to 218
fn find_account(
accounts: &mut BTreeMap<AccountId, AccountRecord>,
account_id: AccountId,
) -> Result<OccupiedEntry<AccountId, AccountRecord>> {
let Entry::Occupied(found_account) = accounts.entry(account_id) else {
return Err(DatabaseError::DataCorrupted(format!(
"Account not found in DB: {account_id:x}"
)));
};

Ok(found_account)
}
Copy link
Contributor

@Mirko-von-Leipzig Mirko-von-Leipzig Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should new-type accounts? Though I'm bias against free standing functions, so take this with a big pinch of 🧂

#[derive(Default)]
struct Accounts(BTreeMap<AccountId, AccountRecord>);

impl Accounts {
    fn insert(...)

    // or maybe `get_mut_or_err`
    fn try_get_mut(&mut self, id: &AccountId) -> Result<&mut AccountRecord> {...}
}

Comment on lines 496 to 498
record.nonce.ok_or(DatabaseError::DataCorrupted(format!(
"Missing nonce for account: {account_id:x}"
)))?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we start by iterating over the nonces the they don't need to be optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we actually start by getting codes from accounts and crate AccountRecords with only code field filled.

Comment on lines +475 to +477
return Err(DatabaseError::DataCorrupted(format!(
"Missing value for storage slot {expected_slot}, got {slot}"
)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there value in expanding DataCorrupted with variants for each possible corruption? Probably not 🤔

Copy link
Contributor Author

@polydez polydez Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think we will process any data corruption cases differently from any other. 😄

@bobbinth
Copy link
Contributor

bobbinth commented Dec 7, 2024

This function (compute_old_account_states) computes state for given accounts for the specified block number based on deltas for accounts stored in the database. Since it takes into account all deltas from the creation of account up to the current block, if account has a lot of changes, it might be more expensive than applying deltas backward from the latest account state to the specified block, so it might be a subject for future improvement.

Are you thinking of this as a temporary solution? Because I agree that computing account states based on deltas from account creation is probably not a good solution.

And if it is a temporary solution - is it actually worth it? Maybe we should go for a longer-terms solution.

In general, as I mentioned in one other comment, let's look at the whole account data storage/retrieval area more holistically. Let's list the use-cases we are trying to solve for and then see what solution would address them well.

@@ -188,7 +188,7 @@ impl api_server::Api for RpcApi {
self.block_producer.clone().submit_proven_transaction(request).await
}

/// Returns details for public (public) account by id.
/// Returns details for public account by ID.
Copy link
Collaborator

@igamigo igamigo Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit misleading. This method still works for non-public accounts, right? It just doesn't return details for them (ie, it just returns the stored hash). There are some other related functions that have the same problem IIRC (like select_accounts_by_ids which get_account_proofs uses, both for private and public accounts).

Copy link
Collaborator

@igamigo igamigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might have already been discussed (apologies if so), but considering the "usability" of account states decreases as they move further and further from the chain tip, would it make sense (if it is even possible) to retrieve the latest state and subtract deltas instead of building them up from 0?

a.block_num < b.block_num
)
ORDER BY
account_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this work? It looks a bit more readable to me, although there's a good possibility it's not as performant (but with indices and fairly large amount of updates on each account, it might?).

SELECT
    account_id,
    nonce
FROM
    account_deltas
WHERE
    account_id IN rarray(?1)
    AND block_num <= ?2
    AND block_num = (
        SELECT MAX(block_num)
        FROM account_deltas b
        WHERE b.account_id = account_deltas.account_id AND b.block_num <= ?2
    )
ORDER BY
    account_id;

Comment on lines +200 to +204
pub fn compute_old_account_states(
conn: &mut Connection,
account_ids: &[AccountId],
block_number: BlockNumber,
) -> Result<Vec<Account>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check that account IDs are public only?

Comment on lines +382 to +384
}

let mut rows =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could be nice to have some section separators or just comments to neatly divide each segment of the account retrieval (or maybe just make these their own functions directly)

# Conflicts:
#	crates/proto/src/generated/account.rs
#	crates/rpc-proto/proto/account.proto
#	crates/store/src/db/sql/mod.rs
#	crates/store/src/db/tests.rs
#	proto/account.proto
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants