Add tunable buffer size for spt3g filtering_istream #194

tskisner · 2024-12-12T06:15:02Z

This adds a small (and perhaps temporary) patch that lets the buffer size of the filtering_istream stack be specified. It is a quick update to fix slow data loading; #192 will be rebased after this is merged. This PR also adds GSL to the wheel build dependencies.

For testing on a compute node on perlmutter.nersc.gov, I used a single process and 8 threads. I then loaded one wafer (ufm_mv12) of satp3 observation obs_1714550584_satp3_1111111.

I used the new SO3G_FILESYSTEM_BUFFER environment variable to try a variety of buffer sizes loading this same data from the CFS and scratch filesystems. Note that the frame files in this case are each ~200MB, so the largest buffer size tested is actually larger than the frame files. The smaller values of buffer size were essentially unusable on CFS.
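For readers who want to script a similar scan, here is a minimal Python sketch of how a buffer size could be resolved from the SO3G_FILESYSTEM_BUFFER variable. The helper name `buffer_size_from_env` and its fallback rules are assumptions made for illustration; the actual parsing happens inside the compiled library and may behave differently. The default value is the "new default" row from the timing tables.

```python
import os

# "New default" from the timing tables in this PR (20 MiB).
DEFAULT_BUFFER_BYTES = 20971520


def buffer_size_from_env(name="SO3G_FILESYSTEM_BUFFER",
                         default=DEFAULT_BUFFER_BYTES):
    """Hypothetical helper: resolve a buffer size in bytes from the environment.

    Falls back to `default` when the variable is unset, empty, not an
    integer, or negative. This is an illustration, not the library's code.
    """
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value >= 0 else default
```

In practice the variable would be exported in the job script, or set through `os.environ` before the data-loading code runs. Whether the library reads it once or on every file open is not stated in this thread, so setting it before the process starts is the conservative choice.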

Here is a plot of the times:

Note that CFS access is faster from a login node than a compute node, and scratch access is faster from a compute node than a login node. Here is the data in the plot:

Scratch Filesystem
-----------------------------------------------
Buffer Bytes        Time
0                    24.7
10                   6.71
100                  3.97
1000                 3.72
10000                3.6
100000               3.6     
1000000              3.6
10000000             3.6
20971520             3.7  (new default)  (13.8 on login)
100000000            3.8
1000000000          11.8

CFS Filesystem
-----------------------------------------------
Buffer Bytes        Time
0                    (no time recorded)
10                   (no time recorded)
100                  (no time recorded)
1000                 (no time recorded)
10000               523.0
100000              212.5
1000000              41.8
10000000             21.5
20971520             18.9  (new default)  (4.9 on login)
100000000            17.5
1000000000           24.8
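
To make the tables easier to compare, the short Python sketch below copies the measured values and computes, for each filesystem, the best buffer size and how the new default compares with it. The dictionary and function names are illustrative only, and the time values are kept unitless because the tables do not state a unit.

```python
# Timings copied from the two tables: buffer size in bytes -> measured time.
SCRATCH = {
    0: 24.7, 10: 6.71, 100: 3.97, 1000: 3.72, 10000: 3.6, 100000: 3.6,
    1000000: 3.6, 10000000: 3.6, 20971520: 3.7, 100000000: 3.8,
    1000000000: 11.8,
}
# CFS rows without a recorded time are omitted.
CFS = {
    10000: 523.0, 100000: 212.5, 1000000: 41.8, 10000000: 21.5,
    20971520: 18.9, 100000000: 17.5, 1000000000: 24.8,
}
NEW_DEFAULT = 20971520


def summarize(times, default=NEW_DEFAULT):
    """Return the best buffer size and ratios relative to the new default."""
    # On ties, min() keeps the first key in insertion order (smallest buffer).
    best_size = min(times, key=times.get)
    smallest = min(times)  # smallest buffer size that has a measurement
    return {
        "best_size": best_size,
        "best_time": times[best_size],
        "default_vs_best": times[default] / times[best_size],
        "smallest_vs_default": times[smallest] / times[default],
    }


if __name__ == "__main__":
    for label, table in (("scratch", SCRATCH), ("CFS", CFS)):
        print(label, summarize(table))
```

With these numbers the new default is within about 3% of the best scratch time and about 8% of the best CFS time, while the smallest measured CFS buffer (10000 bytes) is roughly 28 times slower than the default.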

mhasself

This is a miracle... no notes. This basically resolves the performance concerns, as far as I can tell. Let's get it out there.

arahlin · 2024-12-12T12:21:32Z

We'll just integrate this upstream as a keyword argument to the g3_istream / G3Reader code.

nwhitehorn · 2024-12-12T12:49:57Z

I'm not really sure there is much downside to setting a big global default rather than having a config parameter. No one using this software is reading a couple of KB and then going home, and especially no one is reading a couple of KB from something really slow, where they would notice the extra reads.
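
The trade-off described here can be illustrated with a small Python sketch. It uses Python's `io.BufferedReader` as a stand-in for the C++ filtering_istream buffer, so it is not so3g or spt3g code, and the class and function names are made up for this example. It counts how many reads reach the underlying source, and how many bytes are pulled from it, when a caller consumes data in small pieces.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that records how the buffer layer accesses it."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0          # number of reads that hit the "filesystem"
        self.bytes_pulled = 0   # total bytes fetched from the "filesystem"

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        n = self._src.readinto(b)
        self.bytes_pulled += n
        return n


def read_in_pieces(data, buffer_size, piece, total):
    """Read `total` bytes in `piece`-sized requests through a buffer.

    Returns (underlying read calls, bytes pulled from the raw stream).
    """
    raw = CountingRaw(data)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    got = 0
    while got < total:
        chunk = reader.read(min(piece, total - got))
        if not chunk:
            break  # end of data
        got += len(chunk)
    return raw.calls, raw.bytes_pulled


if __name__ == "__main__":
    data = bytes(64 * 1024)
    print("small buffer:", read_in_pieces(data, 16, 8, 1024))
    print("large buffer:", read_in_pieces(data, 4096, 8, 1024))
```

A small buffer turns many small requests into many underlying reads, which is what hurts on a high-latency filesystem. A large buffer needs far fewer underlying reads, at the cost of pulling more bytes than a very short read actually uses, which is the "extra reads" case mentioned above.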

tskisner added 3 commits December 11, 2024 12:32

Add dynamic file load buffer size

eacfc72

Add GSL to wheel builds

4f6a03c

Silly autotools...

67db578

tskisner assigned mhasself and unassigned mhasself Dec 12, 2024

tskisner requested a review from mhasself December 12, 2024 06:19

mhasself approved these changes Dec 12, 2024

tskisner merged commit e30cca2 into master Dec 12, 2024
11 checks passed

tskisner deleted the buffer_tune branch December 12, 2024 15:29
