-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cache: fix pfree valueBuffer #273
base: main
Are you sure you want to change the base?
cache: fix pfree valueBuffer #273
Conversation
@JerrySievert looks ok to me, wdyt? |
I'll reproduce locally and give you an informed response soon :) |
I also have noticed bug in current implementation. For PostgreSQL 16+ you have function #if PG_VERSION_NUM >= PG_VERSION_16
/* Copied from postgres 15 source, since it was removed from 16. */
static bool
MemoryContextContains(MemoryContext context, void *pointer)
{
MemoryContext ptr_context;
/*
* NB: Can't use GetMemoryChunkContext() here - that performs assertions
* that aren't acceptable here since we might be passed memory not
* allocated by any memory context.
*
* Try to detect bogus pointers handed to us, poorly though we can.
* Presumably, a pointer that isn't MAXALIGNED isn't pointing at an
* allocated chunk.
*/
if (pointer == NULL || pointer != (void *) MAXALIGN(pointer))
return false;
/*
* OK, it's probably safe to look at the context.
*/
ptr_context = *(MemoryContext *) (((char *) pointer) - sizeof(void *));
return ptr_context == context;
}
#endif But this DOES NOT WORK in pg 16 anymore. Starting from this version, theese 8 bytes store bit fields: size + memory context method. This is NOT POINTER. So, this function will always return This function called in /* FreeChunkValueArrayBuffer relase valueBufferArray memory. */
void FreeChunkBufferValueArray(ChunkData *chunkData)
{
uint32 columnIndex = 0;
if (chunkData == NULL)
{
return;
}
for (columnIndex = 0; columnIndex < chunkData->columnCount; columnIndex++)
{
if (chunkData->valueBufferArray[columnIndex] != NULL && !MemoryContextContains(ColumnarCacheMemoryContext(), chunkData->valueBufferArray[columnIndex]))
{
pfree(chunkData->valueBufferArray[columnIndex]->data);
pfree(chunkData->valueBufferArray[columnIndex]);
}
}
} |
Hm, in this case can be that the idea behind |
@JerrySievert @wuputah I have create another PR that fixes this behaviour #274 |
Updated MR with new info about I think this realization is simplier comparing to #274. Here we only NULLify. But in 274 have added a new struct, repalloc... How do you think @JerrySievert @wuputah @ashenBlade ? |
If we nullify this, it will not be available for us anymore. We will have to read and decompress it again. |
@ashenBlade
|
I think a better solution is to make a copy of the data, not mess with trying to synchronize a lower and higher level caching mechanism, nor try to cover all of the possible edge-cases. and add a test replicating the issue to the regression tests. |
@JerrySievert Maybe simply remove I'll think how implement a test. We need to determine that cache was used. The only idea came to mind is to add elog that writes "Columnar: clearing the cache by 10 percent has started". If it will be NOTICE level, it may be spam in daily workload. If it will have a DEBUG level, it will spam in test, all DEBUG messages from postgres will be written to test output.. |
just setting the cache level to fairly low and using the example you provided, but pared down, should trigger the issue. there should be a logical existing test file to put it into, though |
if caching is disabled, the |
2f982fb
to
3a2505f
Compare
3a2505f
to
0b1c73c
Compare
ahh... found where it's read (columnar_write.c) @JerrySievert I've added copying and a test. But from performance view, better to use approach without copying from #274 |
tests were given in hydradatabase#273
@JerrySievert hello, what do you think?
|
I'm just here as an advisor, I have no merge powers :) but, it would be up to @wuputah to determine whether the merge makes sense. this doesn't seem to deal with the multiple copy and cache level synchronization though, which is a much bigger task |
Hello.
This MR fixes an issue with a cache. Currently cache entries contain already freed memory. MR fixes this.
SQL code to reproduce
broken_cache.txt
Steps:
Description
Currently an error
occurs when cache is full and we try to free up space by 10%.
We call in the
EvictCache
but
str->data
contains a garbage, soBogusFree
with anERROR
is executed (see Path to BogusFree chapter below)The reason to a garbage is the following.
A
valueBuffer (StringInfo)
is added to a cache in theColumnarAddCacheEntry
Also it's added tochunkData->valueBufferArray
In the function
FreeChunkBufferValueArray
valueBuffer
will be freedand also
EvictCache
tries to freed it as our cache is fullif (totalAllocationLength >= (columnar_page_cache_size * 1024 * 1024))
And we got an ERROR.
To fix this, we mustn't add
valueBufferArray
tochunkData->valueBufferArray
when previously have added it to a cachePath to BogusFree
Backtrace: