feat: Upgrade Malachite #184

Open
wants to merge 3 commits into base: main

varun-doshi
Contributor

@varun-doshi varun-doshi commented Dec 23, 2024

Ref #181

This PR resulted in quite a few changes, primarily due to breaking changes in Malachite v0.0.1 compared to the commit Snapchain was using previously.
Many implementations are still missing in traits and enums, for example in SnapchainValidatorContext, Effect, and TimeoutKind.
Another function that needs a closer look is handle_msg in src/consensus/consensus.rs.

@varunsrin
Member

thanks for the pr, someone will review it soon

@varunsrin
Member

also, can you comment on the issue so that i can assign it to you @varun-doshi ?

@varunsrin varunsrin requested review from sanjayprabhu and aditiharini and removed request for sanjayprabhu December 24, 2024 14:24
@aditiharini
Contributor

aditiharini commented Dec 24, 2024

Were you able to start up 3 nodes locally and see consensus progressing? This will likely not work until we propagate the gossip messages into consensus (i.e. do what we were previously doing for the Broadcast effect), but it's a good way to test that these changes have roughly worked as expected.
cargo run --bin setup_local_testnet
cargo run -- --config-path nodes/1/snapchain.toml
cargo run -- --config-path nodes/2/snapchain.toml
cargo run -- --config-path nodes/3/snapchain.toml

@aditiharini
Contributor

So far this looks reasonable, let me know if it'd be useful to get some input on the remaining todos and/or figuring out where to plumb gossip messages through consensus.

@sds
Member

sds commented Dec 24, 2024

You'll want to rebase the changes from #187 so that the Docker image builds. Tests are still failing however, so we'll want to make sure those are fixed as well.

@varun-doshi
Contributor Author

> Were you able to start up 3 nodes locally and see consensus progressing? This will likely not work until we propagate the gossip messages into consensus (i.e. do what we were previously doing for the Broadcast effect), but it's a good way to test that these changes have roughly worked as expected.
> cargo run --bin setup_local_testnet
> cargo run -- --config-path nodes/1/snapchain.toml
> cargo run -- --config-path nodes/2/snapchain.toml
> cargo run -- --config-path nodes/3/snapchain.toml

the first command works and creates the config toml files

but running any of the nodes gives an error like this:

2024-12-25T06:35:51.684183Z  INFO snapchain::storage::db::rocksdb: Creating new RocksDB path="nodes/1/.rocks/farcaster"
2024-12-25T06:35:51.690026Z  INFO snapchain: HubService listening addr="/ip4/127.0.0.11/udp/50051/quic-v1" grpc_addr="127.0.0.1:3383"
2024-12-25T06:35:51.690277Z  INFO snapchain: Starting Snapchain node with public key: 63c8f9758525c3c8e57e5654b4c091a3a646f5ed69babc2d1f5cbdcaa2fb881c
2024-12-25T06:35:51.693698Z  INFO libp2p_swarm: local_peer_id=12D3KooWGXtGq1tNmAe1BHzdiaGTbwrDgLU4v5FUmfavypWxrHkf
2024-12-25T06:35:51.693815Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.12/udp/50052/quic-v1"
2024-12-25T06:35:51.693937Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.12/udp/50052/quic-v1 ("/ip4/127.0.0.12/udp/50052/quic-v1")
2024-12-25T06:35:51.694568Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.13/udp/50053/quic-v1"
2024-12-25T06:35:51.694579Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.13/udp/50053/quic-v1 ("/ip4/127.0.0.13/udp/50053/quic-v1")
2024-12-25T06:35:51.695484Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.14/udp/50054/quic-v1"
2024-12-25T06:35:51.695495Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.14/udp/50054/quic-v1 ("/ip4/127.0.0.14/udp/50054/quic-v1")
2024-12-25T06:35:51.696473Z ERROR snapchain: Failed to create SnapchainGossip error=Other(Custom { kind: Other, error: Other(Right(Io(Os { code: 49, kind: AddrNotAvailable, message: "Can't assign requested address" }))) })

@sanjayprabhu
Contributor

Ah, you should update config.toml to use 127.0.0.1 but just different ports. We'll fix the script to do this by default.
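
For context, the `AddrNotAvailable` error in the log above (OS code 49 is `EADDRNOTAVAIL` on macOS) is what a bind attempt returns when the requested address is not configured on a local interface. On some systems only `127.0.0.1` is assigned to the loopback interface by default, so `127.0.0.11` cannot be bound, while several nodes can share `127.0.0.1` as long as each uses its own port. The following is a minimal sketch using only the standard library, not Snapchain's actual gossip setup; the ports are taken from the log and the `try_bind` helper is hypothetical:

```rust
use std::io;
use std::net::{IpAddr, SocketAddr, UdpSocket};

/// Hypothetical helper: try to bind a UDP socket on `ip:port` and return the
/// address the OS actually assigned (port 0 lets the OS pick a free port).
fn try_bind(ip: &str, port: u16) -> io::Result<SocketAddr> {
    let ip: IpAddr = ip
        .parse()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    let socket = UdpSocket::bind(SocketAddr::new(ip, port))?;
    socket.local_addr()
}

fn main() {
    // Same loopback address, one port per node, as suggested above.
    for port in [50051u16, 50052, 50053] {
        match try_bind("127.0.0.1", port) {
            Ok(addr) => println!("bound {addr}"),
            Err(err) => println!("could not bind 127.0.0.1:{port}: {err}"),
        }
    }
    // Whether a second loopback address is bindable depends on the OS setup.
    match try_bind("127.0.0.11", 0) {
        Ok(addr) => println!("bound {addr}"),
        Err(err) => println!("could not bind 127.0.0.11: {err}"),
    }
}
```

Running this on the machine that hosts the local testnet is a quick way to check which addresses the generated config can safely use.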

@varun-doshi
Contributor Author

After updating the config toml, it looks like the nodes are discovering each other and setting up a connection, but the node crashes when trying to handle PersistTimeout, which is an incomplete impl function left over from the Malachite upgrade.

2024-12-26T10:09:44.298564Z  INFO snapchain::storage::db::rocksdb: Creating new RocksDB path="nodes/1/.rocks/farcaster"
2024-12-26T10:09:44.304339Z  INFO snapchain: HubService listening addr="/ip4/127.0.0.1/udp/50051/quic-v1" grpc_addr="127.0.0.1:3383"
2024-12-26T10:09:44.304606Z  INFO snapchain: Starting Snapchain node with public key: 63c8f9758525c3c8e57e5654b4c091a3a646f5ed69babc2d1f5cbdcaa2fb881c
2024-12-26T10:09:44.308209Z  INFO libp2p_swarm: local_peer_id=12D3KooWGXtGq1tNmAe1BHzdiaGTbwrDgLU4v5FUmfavypWxrHkf
2024-12-26T10:09:44.308342Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.1/udp/50052/quic-v1"
2024-12-26T10:09:44.308464Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.1/udp/50052/quic-v1 ("/ip4/127.0.0.1/udp/50052/quic-v1")
2024-12-26T10:09:44.309113Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.1/udp/50053/quic-v1"
2024-12-26T10:09:44.309122Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.1/udp/50053/quic-v1 ("/ip4/127.0.0.1/udp/50053/quic-v1")
2024-12-26T10:09:44.310080Z  INFO snapchain::network::gossip: Processing bootstrap peer: "/ip4/127.0.0.1/udp/50054/quic-v1"
2024-12-26T10:09:44.310088Z  INFO snapchain::network::gossip: Dialing bootstrap peer: /ip4/127.0.0.1/udp/50054/quic-v1 ("/ip4/127.0.0.1/udp/50054/quic-v1")
2024-12-26T10:09:44.311330Z  INFO snapchain: Starting gossip
2024-12-26T10:09:44.312054Z  INFO snapchain::network::gossip: Local node is listening address="/ip4/127.0.0.1/udp/50051/quic-v1"
2024-12-26T10:09:44.312100Z  INFO snapchain::storage::db::rocksdb: Creating new RocksDB path="nodes/1/.rocks/shard1"
2024-12-26T10:09:45.778566Z  INFO snapchain::connectors::fname: found new transfers count=100 position=0
2024-12-26T10:09:46.175188Z  INFO snapchain::connectors::fname: found new transfers count=100 position=100
2024-12-26T10:09:47.478895Z  INFO snapchain::network::gossip: Connection established with peer: 12D3KooWJ2BCYXUjcqF4Vsvme83Ft6rCvhuxjbAZg4VJiHFAwkCE
2024-12-26T10:09:47.480283Z  INFO snapchain::network::gossip: Peer: 12D3KooWJ2BCYXUjcqF4Vsvme83Ft6rCvhuxjbAZg4VJiHFAwkCE subscribed to topic: test-net
2024-12-26T10:09:47.588999Z  INFO snapchain::connectors::fname: found new transfers count=100 position=200
2024-12-26T10:09:47.589060Z  INFO snapchain::connectors::fname: stopped fetching transfers position=200
2024-12-26T10:09:52.322706Z  INFO snapchain: Registering validator with nonce: 5
2024-12-26T10:09:56.321559Z  INFO snapchain::network::gossip: Connection established with peer: 12D3KooWJ2BCYXUjcqF4Vsvme83Ft6rCvhuxjbAZg4VJiHFAwkCE
2024-12-26T10:09:56.322600Z  INFO snapchain::network::gossip: Peer: 12D3KooWJ2BCYXUjcqF4Vsvme83Ft6rCvhuxjbAZg4VJiHFAwkCE subscribed to topic: test-net
2024-12-26T10:09:59.262999Z  INFO snapchain::network::gossip: Connection established with peer: 12D3KooWE4waijrZ37yjMbbx35hZDzkLchj3wiVR4c8iQzN2pszh
2024-12-26T10:09:59.263990Z  INFO snapchain::network::gossip: Peer: 12D3KooWE4waijrZ37yjMbbx35hZDzkLchj3wiVR4c8iQzN2pszh subscribed to topic: test-net
2024-12-26T10:10:02.322488Z  INFO snapchain: Registering validator with nonce: 10
2024-12-26T10:10:04.335096Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: snapchain::consensus::consensus: Connected to peer 79e40ffdf9cc7931c73039c065cf099bed56065ec2f2a66746334ca2e0a4dfd7. Total peers: 2
2024-12-26T10:10:04.335099Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: snapchain::consensus::consensus: Connected to peer 79e40ffdf9cc7931c73039c065cf099bed56065ec2f2a66746334ca2e0a4dfd7. Total peers: 2
2024-12-26T10:10:05.152748Z  INFO snapchain::network::gossip: Connection established with peer: 12D3KooWERy33ZNvfffoTf7eA2p2Y1cSKzdLvAD6itX5RYHgsqC7
2024-12-26T10:10:05.153884Z  INFO snapchain::network::gossip: Peer: 12D3KooWERy33ZNvfffoTf7eA2p2Y1cSKzdLvAD6itX5RYHgsqC7 subscribed to topic: test-net
2024-12-26T10:10:07.276589Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: snapchain::consensus::consensus: Connected to peer 3f2aa8df8a2cfc92425405cecb3db9f994e38058379a19d17c3a33098af8b762. Total peers: 3
2024-12-26T10:10:07.276718Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: snapchain::consensus::consensus: Enough peers (3) connected to start consensus
2024-12-26T10:10:07.277319Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: snapchain::consensus::consensus: Connected to peer 3f2aa8df8a2cfc92425405cecb3db9f994e38058379a19d17c3a33098af8b762. Total peers: 3
2024-12-26T10:10:07.277386Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: snapchain::consensus::consensus: Enough peers (3) connected to start consensus
2024-12-26T10:10:07.357667Z  INFO snapchain::network::gossip: Connection closed with peer: PeerId("12D3KooWJ2BCYXUjcqF4Vsvme83Ft6rCvhuxjbAZg4VJiHFAwkCE") due to: Some(IO(Custom { kind: Other, error: Connection(ConnectionError(TimedOut)) }))
2024-12-26T10:10:12.322268Z  INFO snapchain: Registering validator with nonce: 15
2024-12-26T10:10:13.166541Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: snapchain::consensus::consensus: Connected to peer 448d8023e6a34f56d01576944a36fff58f88047cf2cc0930b10e09cca42a1e44. Total peers: 4
2024-12-26T10:10:13.167042Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: snapchain::consensus::consensus: Connected to peer 448d8023e6a34f56d01576944a36fff58f88047cf2cc0930b10e09cca42a1e44. Total peers: 4
2024-12-26T10:10:17.278031Z  INFO snapchain::consensus::consensus: Starting consensus
2024-12-26T10:10:17.278535Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: informalsystems_malachitebft_core_consensus::handle::start_height: Starting new height height=[1] 1
2024-12-26T10:10:17.278695Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: informalsystems_malachitebft_core_consensus::handle::driver: Starting new round height=[1] 1 round=0 proposer=3f2aa8df8a2cfc92425405cecb3db9f994e38058379a19d17c3a33098af8b762
2024-12-26T10:10:17.279281Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=-1}:node{name=0x63c8 Shard 1}: informalsystems_malachitebft_core_consensus::handle::driver: Scheduling timeout round=0 step=Propose
2024-12-26T10:10:17.279056Z  INFO snapchain::consensus::consensus: Starting consensus
2024-12-26T10:10:17.280050Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: informalsystems_malachitebft_core_consensus::handle::start_height: Starting new height height=[0] 1
2024-12-26T10:10:17.280129Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: informalsystems_malachitebft_core_consensus::handle::driver: Starting new round height=[0] 1 round=0 proposer=3f2aa8df8a2cfc92425405cecb3db9f994e38058379a19d17c3a33098af8b762
2024-12-26T10:10:17.280280Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=-1}:node{name=0x63c8 Block}: informalsystems_malachitebft_core_consensus::handle::driver: Scheduling timeout round=0 step=Propose
2024-12-26T10:10:20.281727Z  INFO Actor{id="0.0"}:consensus{height=[1] 1 round=0}:node{name=0x63c8 Shard 1}: informalsystems_malachitebft_core_consensus::handle::timeout: Timeout elapsed step=Propose timeout.round=0 height=[1] 1 round=0
2024-12-26T10:10:20.281727Z  INFO Actor{id="0.1"}:consensus{height=[0] 1 round=0}:node{name=0x63c8 Block}: informalsystems_malachitebft_core_consensus::handle::timeout: Timeout elapsed step=Propose timeout.round=0 height=[0] 1 round=0
thread 'tokio-runtime-worker' panicked at src/consensus/consensus.rs:512:17:
not yet implemented
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at src/consensus/consensus.rs:512:17:
not yet implemented
2024-12-26T10:10:22.322727Z  INFO snapchain: Registering validator with nonce: 20
2024-12-26T10:10:23.162708Z  WARN snapchain::node::snapchain_node: Failed to forward message to actor: SendErr

@aditiharini
Contributor

Awesome, the next step would be to implement the incomplete function and any others that prevent consensus from progressing.
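
The `not yet implemented` panic in the log above is the message Rust's `todo!()` macro produces when a placeholder is reached at runtime. The sketch below reproduces that behaviour with a toy enum and shows the same handler once the placeholder arm returns normally. `ToyEffect` and both handler functions are hypothetical stand-ins, not Malachite's real `Effect` type or Snapchain's handler, and whether a no-op is acceptable for the real effect is left open:

```rust
use std::panic;

/// Toy stand-in for a consensus effect; NOT Malachite's real `Effect` enum.
enum ToyEffect {
    ScheduleTimeout,
    PersistTimeout,
}

/// Mirrors the state of the branch: one arm is still a placeholder.
fn handle_with_placeholder(effect: &ToyEffect) -> Result<(), String> {
    match effect {
        ToyEffect::ScheduleTimeout => Ok(()),
        // `todo!()` compiles, but panics with "not yet implemented" when reached.
        ToyEffect::PersistTimeout => todo!(),
    }
}

/// Same handler with the placeholder replaced by a stub that returns normally.
fn handle_with_stub(effect: &ToyEffect) -> Result<(), String> {
    match effect {
        ToyEffect::ScheduleTimeout => Ok(()),
        // Hypothetical stub: acknowledge the effect and let the caller continue.
        ToyEffect::PersistTimeout => Ok(()),
    }
}

fn main() {
    // The implemented arm works in both versions.
    println!("{:?}", handle_with_placeholder(&ToyEffect::ScheduleTimeout));

    // Reaching the placeholder arm unwinds; catch it to keep the demo running.
    let hit = panic::catch_unwind(|| handle_with_placeholder(&ToyEffect::PersistTimeout));
    println!("placeholder arm panicked: {}", hit.is_err());

    println!("stub arm: {:?}", handle_with_stub(&ToyEffect::PersistTimeout));
}
```

In an actor-based node, a panic like this takes the task down, which would be consistent with the later `Failed to forward message to actor: SendErr` warning in the log.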

@varun-doshi
Contributor Author

> Awesome, the next step would be to implement the incomplete function and any others that prevent consensus from progressing.

understood...I'll get on this

@varun-doshi
Contributor Author

varun-doshi commented Dec 28, 2024

This is what the Context impl for SnapchainValidatorContext looks like right now:

impl informalsystems_malachitebft_core_types::Context for SnapchainValidatorContext {
    type Address = Address;
    type Height = Height;
    type ProposalPart = ProposalPart;
    type Proposal = Proposal;
    type Validator = SnapchainValidator;
    type ValidatorSet = SnapchainValidatorSet;
    type Value = ShardHash;
    type Vote = Vote;
    type SigningScheme = Ed25519;
    type SigningProvider = Ed25519Provider;

The Ed25519 and Ed25519Provider are just mock structs I had to create to remove errors.
Which types should I use here specifically? And from which crate? (ecdsa/ed25519)

Also libp2p::identity::ed25519::SecretKey doesn't have a sign function
ref:

pub type PublicKey = libp2p::identity::ed25519::PublicKey;
pub type PrivateKey = libp2p::identity::ed25519::SecretKey;

@aditiharini
Contributor

@varun-doshi this issue is actually more complicated than we realized. We'll take your branch over from here and get it merged. Thanks for taking this on!

We'll add some more issues we're open to contributions on over the next few days-- feel free to pick any of those up if you're interested.

@varun-doshi
Contributor Author

Understood... looking forward to working on other issues


Effect::GetValue(height, round, timeout) => {
    let timeout = timeouts.duration_for(timeout.step);
    // Not available in Effect anymore

Effect::Broadcast was renamed to Effect::Publish
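
As a rough illustration of how a publish-style effect might be wired up, the sketch below forwards the effect's message onto a channel standing in for the gossip layer. `ToyEffect`, `ToyMessage`, and `handle_effect` are hypothetical names built on the standard library only; they do not reflect the actual signature of Malachite's `Effect::Publish` or Snapchain's gossip types:

```rust
use std::sync::mpsc::{channel, Sender};

/// Toy consensus message; a stand-in for votes and proposals.
#[derive(Debug, Clone, PartialEq)]
enum ToyMessage {
    Vote(u64),
    Proposal(u64),
}

/// Toy effect enum; NOT Malachite's real `Effect`.
enum ToyEffect {
    Publish(ToyMessage),
    ResetTimeouts,
}

/// Hypothetical handler: forward published messages to the gossip channel.
fn handle_effect(effect: ToyEffect, gossip_tx: &Sender<ToyMessage>) -> Result<(), String> {
    match effect {
        ToyEffect::Publish(msg) => gossip_tx
            .send(msg)
            .map_err(|e| format!("gossip channel closed: {e}")),
        // Nothing to send for this effect in the sketch.
        ToyEffect::ResetTimeouts => Ok(()),
    }
}

fn main() {
    let (gossip_tx, gossip_rx) = channel();

    handle_effect(ToyEffect::Publish(ToyMessage::Vote(1)), &gossip_tx).unwrap();
    handle_effect(ToyEffect::Publish(ToyMessage::Proposal(2)), &gossip_tx).unwrap();
    handle_effect(ToyEffect::ResetTimeouts, &gossip_tx).unwrap();

    // The receiving side plays the role of the gossip task.
    drop(gossip_tx);
    for msg in gossip_rx {
        println!("gossip would broadcast: {msg:?}");
    }
}
```

Returning an error when the channel is closed, instead of unwrapping inside the handler, keeps a stopped gossip task from panicking the consensus side.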

@@ -105,7 +191,7 @@ pub struct Signature(pub Vec<u8>);
pub type PublicKey = libp2p::identity::ed25519::PublicKey;
pub type PrivateKey = libp2p::identity::ed25519::SecretKey;

-impl malachite_common::SigningScheme for Ed25519 {
+impl informalsystems_malachitebft_core_types::SigningScheme for Ed25519 {

We will publish the crate informalsystems-malachitebft-signing-ed25519 on crates.io (not yet done); it can potentially be reused here.

@varun-doshi
Contributor Author

Some other changes to watch out for as well: informalsystems/malachite#723

6 participants