Releases: streamingfast/firehose-core

v1.3.9

03 May 20:05

Substreams

  • Allow stores to write to stores with out-of-order ordinals (they will be reordered at the end of the module execution for each block)
  • Fix an issue in substreams-tier2 that sometimes caused files to be written to the wrong place under load, resulting in some hanging requests

v1.3.8

30 Apr 23:38
  • The tools download-from-firehose command now respects its --help documentation; the correct invocation is now firecore tools download-from-firehose <endpoint> <start>:<end> <output_folder> (see the example after this list).

  • The firecore tools download-from-firehose command has been improved to work with the new Firehose sf.firehose.v2.BlockMetadata field. If the server sends this new field, the tool works on any chain; if the server you are reaching is not recent enough, the tool falls back to the previous logic. All StreamingFast endpoints should be compatible.

  • Firehose responses (both single block and stream) now include the sf.firehose.v2.BlockMetadata field. This new field contains the chain-agnostic fields we hold about any block of any chain.
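
For illustration, a hedged invocation of the corrected command; the endpoint, block range and output folder below are placeholders, and authentication (if your endpoint requires it) is not shown:

    # Download merged blocks 10000000 to 10001000 from a Firehose endpoint into ./blocks.
    # Endpoint and block range are placeholder values, not a recommendation.
    firecore tools download-from-firehose mainnet.eth.streamingfast.io:443 10000000:10001000 ./blocks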

v1.3.7

26 Apr 15:28

Fixed

  • Fixed possible race condition in the blockPoller
  • Fixed the relayer waiting too long to fail when reconnecting to a single source (especially on slow chains). It now fails right away if it receives an unlinkable block and has a single source configured.
  • Fixed skipped block handling and performance issues on blockPoller

Changed

  • The --block-type flag was renamed to --substreams-tier1-block-type. Specifying it makes substreams-tier1 skip block type discovery (from files or live stream) on startup, so it gets ready faster (see the example below).
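
A minimal sketch of the renamed flag, assuming the firecore start launcher (chain-specific binaries may differ) and an illustrative block type value:

    # Skip block type discovery on tier1 startup by declaring the block type explicitly.
    # The value sf.ethereum.type.v2.Block is a placeholder; use your chain's block type.
    firecore start substreams-tier1 --substreams-tier1-block-type=sf.ethereum.type.v2.Block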

Added

  • Logs now print the "x-deployment-id" header on Firehose connections (used to propagate subgraph deployment IDs from graph-node and to help debugging)
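
As an illustration of how such a header can reach the server, it can be attached as gRPC metadata; the sketch below uses grpcurl against a local endpoint and assumes server reflection is enabled (address, port and deployment ID are placeholders):

    # Send a Firehose stream request with an x-deployment-id header attached as gRPC metadata.
    grpcurl -plaintext \
        -H 'x-deployment-id: QmExampleDeploymentId' \
        -d '{"start_block_num": 100, "stop_block_num": 110}' \
        localhost:10015 sf.firehose.v2.Stream/Blocks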

v1.3.6

17 Apr 20:47
  • bump substreams to v1.5.5, with a fix in wazero that prevents the process from freezing on certain substreams
  • bump go-generics to v3.4.0

v1.3.5

12 Apr 18:30

Substreams fixes

  • fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
  • fix a rare possibility of stalling when only some full KV store caches were deleted, but further segments were still present.
  • fix stats counters for store operation times

v1.3.4

11 Apr 10:43
  • add DefaultBlockType to the firehose.Chain struct, enabling a default block type setting for known chains

substreams

  • bumped to v1.5.3
  • add a --block-type flag that can be specified when creating substreams tier1. If not specified, tier1 will auto-detect the block type from its source.
  • fix memory leak on substreams execution (by bumping wazero dependency)
  • prevent substreams-tier1 from stopping if block type auto-detection times out
  • fix missing error handling when writing output data to files. This could result in a tier1 request "hanging" while waiting for a file never produced by tier2.
  • fix handling of dstore errors in the tier1 'execout walker', which caused stalling issues on S3 or on unexpected storage errors
  • increase number of retries on storage when writing states or execouts (5 -> 10)
  • prevent slow squashing when loading each segment from the full KV store (this can happen when a stage contains multiple stores)

v1.3.3

04 Apr 18:24

substreams

  • Fix a context leak causing tier1 responses to slow down progressively

v1.3.2

03 Apr 21:02

substreams

  • fix another panic on substreams-tier2 service
  • fix thread leak in metering affecting substreams
  • revert a substreams scheduler optimisation that caused slow restarts when close to the chain head
  • add substreams_tier2_active_requests and substreams_tier2_request_counter prometheus metrics

v1.3.1

02 Apr 18:21
  • fix panic on substreams-tier2 service

v1.3.0

02 Apr 17:51

Substreams

Chain-agnostic tier2

  • A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
  • This allows you to better use your computing resources by pooling all the networks together.

Important

Since the tier2 services now get the network information from the tier1 request, you must make sure that the file paths and network addresses are the same for both tiers.
For example, if --common-merged-blocks-store-url=/data/merged is set on tier1, make sure the merged blocks are also available from tier2 under the path /data/merged.
The flags --substreams-state-store-url, --substreams-state-store-default-tag and --common-merged-blocks-store-url are now ignored on tier2. The flag --common-first-streamable-block should be set to 0 to accommodate every chain.
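
A minimal sketch of matching configuration on both tiers, assuming the firecore start launcher and local file paths (chain-specific binaries and real deployments will differ):

    # tier1: holds the network-specific storage configuration and forwards it to tier2 per request.
    firecore start substreams-tier1 \
        --common-merged-blocks-store-url=/data/merged \
        --substreams-state-store-url=/data/states \
        --common-first-streamable-block=0

    # tier2: the state/merged-blocks flags are ignored here, but /data/merged and /data/states
    # must resolve to the same data that tier1 sees.
    firecore start substreams-tier2 \
        --common-first-streamable-block=0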

Performance improvements

  • All module outputs are now cached. (Previously, only the last module was cached, along with the "store snapshots", to allow parallel processing.)
  • Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
  • Tier2 will skip processing completely if it is processing the last stage and the output_module is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time).
  • Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached.
  • Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
  • Improved file listing performance for Google Storage backends by 25%!

Tip

Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%!); the very first request does most of the work for the other ones.

Tip

More caches will increase disk usage and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.

Tip

The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files.
The system will delete files under '{modulehash}/state' named in the format {blocknumber}-{blocknumber}.{hexadecimal}.partial.zst when it runs into them.

Metrics

  • Readiness metric for Substreams tier1 app is now named substreams_tier1 (was mistakenly called firehose before).
  • Added back the readiness metric for the Substreams tier2 app (named substreams_tier2).
  • Added metric substreams_tier1_active_worker_requests which gives the number of active Substreams worker requests a tier1 app is currently doing against tier2 nodes.
  • Added metric substreams_tier1_worker_request_counter which gives the total Substreams worker requests a tier1 app made against tier2 nodes.
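
For instance, assuming a locally exposed Prometheus metrics endpoint (the address and port below are placeholders; use whatever metrics listen address you configured):

    # Inspect the new tier1 worker request metrics.
    curl -s localhost:9102/metrics | grep -E 'substreams_tier1_(active_worker_requests|worker_request_counter)'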

Flags

  • Added --merger-delete-threads to customize the number of threads the merger will use to delete files. When using Ceph as the S3 storage provider, it is recommended to increase this to 25 or higher (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
  • Added --substreams-tier2-max-concurrent-requests to limit the number of concurrent requests to the tier2 substreams service.
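
A hedged sketch of these flags in use, assuming the firecore start launcher; the tier2 concurrency value is an arbitrary placeholder, while the delete-thread count follows the Ceph recommendation above:

    # Merger tuned for Ceph-backed S3 deletes, and a cap on concurrent tier2 requests.
    firecore start merger --merger-delete-threads=25
    firecore start substreams-tier2 --substreams-tier2-max-concurrent-requests=100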