Does cpptrace support setting a cache size? #193

Open
xiewajueji opened this issue Dec 3, 2024 · 7 comments
Labels
enhancement (New feature or request), high priority

Comments

@xiewajueji

Problem:
Some stack traces are very large; a single trace can cost a lot of cache memory (about 2GB) and take a long time to construct (about 4s).
In our system, 2GB of memory consumption is not acceptable, because only 4GB is available in total.

cpptrace has three cache modes, but the two modes other than prioritize_speed can't share lookup tables between trace calls, which could slow the system down and (I'm guessing) use more memory when two "large" stack traces are constructed at the same time.

Is there a way to both share lookup tables between traces and limit the total cache size? Thanks.
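
For context, right now we just pick one of the built-in modes globally, roughly like this (a minimal sketch, not our actual code; if I read the docs right, the mode-setting call lives in cpptrace's experimental namespace):

#include <cpptrace/cpptrace.hpp>

int main() {
    // prioritize_speed keeps lookup tables between trace calls, which is fast
    // but is what blows up our memory; the other modes trade speed for memory.
    cpptrace::experimental::set_cache_mode(cpptrace::cache_mode::prioritize_speed);

    // Later, when an error happens:
    cpptrace::generate_trace().print();
}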

@jeremy-rifkin
Owner

Wow, 2GB of cache is huge. I haven't had to deal with anything of that magnitude yet and this may be something cpptrace isn't currently well-equipped to handle. Thanks for opening this issue.

I have a couple quick questions to help me understand your use-case better: Do you generate stack traces multiple times in a program? Do you need a trace generated reasonably quickly?

Is there any context you could provide about how big your application is? Do you happen to know if it utilizes many third-party libraries?

Some initial thoughts: Currently the caching isn't particularly smart; cpptrace just loads all relevant information from a translation unit. This is almost always fine, except for exceptionally large amounts of debug info. It should definitely be possible to implement something smarter. Attempting to load only relevant sections (and remembering the information needed to continue searches) could prove beneficial, though DWARF doesn't make this as easy to do as I'd like. Something along the lines of an LRU cache might also be useful, though the use-case here is challenging as debug symbols are a hierarchical structure.
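
To sketch the rough shape of that idea (purely an illustration of a byte-budgeted LRU keyed per compilation unit, not anything cpptrace actually implements; CuData and the size accounting are placeholders):

#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Placeholder for parsed per-CU debug info plus an approximate size.
struct CuData { std::size_t approx_bytes = 0; };

class CuCache {
public:
    explicit CuCache(std::size_t budget_bytes) : budget_(budget_bytes) {}

    // Returns cached data for a CU and marks it most-recently-used, or nullptr.
    CuData* get(const std::string& cu_name) {
        auto it = index_.find(cu_name);
        if (it == index_.end()) return nullptr;
        lru_.splice(lru_.begin(), lru_, it->second); // move to front
        return &it->second->second;
    }

    // Inserts data for a CU, evicting least-recently-used entries over budget.
    // (A real version would also handle re-inserting an existing key.)
    void put(const std::string& cu_name, CuData data) {
        used_ += data.approx_bytes;
        lru_.emplace_front(cu_name, std::move(data));
        index_[cu_name] = lru_.begin();
        // Keep at least the entry we just inserted, even if it alone is over budget.
        while (used_ > budget_ && lru_.size() > 1) {
            auto& victim = lru_.back();
            used_ -= victim.second.approx_bytes;
            index_.erase(victim.first);
            lru_.pop_back();
        }
    }

private:
    std::size_t budget_;
    std::size_t used_ = 0;
    std::list<std::pair<std::string, CuData>> lru_;
    std::unordered_map<std::string, std::list<std::pair<std::string, CuData>>::iterator> index_;
};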

I will need to put together a test application for myself that's large enough to reproduce these issues, I'll have to look into that later.

@jeremy-rifkin added the enhancement (New feature or request) and high priority labels on Dec 3, 2024
@xiewajueji
Author

xiewajueji commented Dec 3, 2024

Thanks for your reply. I'll provide that information, though not right away, as I'm also diving into DWARF for more clues.

@jeremy-rifkin
Owner

I ran some tests with a relatively large binary, and massif reported that most allocation was done by libdwarf internals pertaining to loading and parsing DWARF from the ELF. I'll do some more testing.

@xiewajueji
Author

xiewajueji commented Dec 10, 2024

Sorry, I've been really busy recently. I'll profile it this weekend. In my situation, the allocated memory is retained after the stack trace is printed.

@xiewajueji
Author

xiewajueji commented Dec 15, 2024

Do you generate stack traces multiple times in a program?
The problem is caused by a single trace.

Do you need a trace generated reasonably quickly?
No, in most situations the trace is only printed when an error happens.

Is there any context you could provide about how big your application is?
The application binary itself is less than 2GB in a debug build.

Do you happen to know if it utilizes many third-party libraries?
Yes, but the third-party libraries don't show up in the trace that caused the problem.

I can reproduce this problem.
The stack trace looks like the following; the addresses that can't be symbol-resolved are JIT functions generated by LLVM:

[2024-12-15 10:22:06.730] [error] [thread-75884] [Exception.cpp:59] <AssertFail> p_assert(false) (/home/crab/WorkSpace/polars/polars-llvm/cpp/core/operator/window/SortedWindowOperator.cpp:62:27)
Stack trace (most recent call first):
#0  (inlined)          in auto polars::getStackTrace<(polars::StackTraceStrategy)1>(int, int) at /home/crab/WorkSpace/polars/polars-llvm/cpp/common/StackTrace.h:48
#1  0x000056320bfc315d in polars::Exception::Exception(polars::ErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::source_location const&, bool)::$_2::operator()() const at /home/crab/WorkSpace/polars/polars-llvm/cpp/common/Exception.cpp:56:27
#2  0x000056320bfc2e1f in polars::Exception::Exception(polars::ErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::source_location const&, bool) at /home/crab/WorkSpace/polars/polars-llvm/cpp/common/Exception.cpp:70
#3  0x000056320bfc2bab in polars::Exception::Exception(polars::ErrorCode, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&&, std::__1::source_location const&, bool) at /home/crab/WorkSpace/polars/polars-llvm/cpp/common/Exception.cpp:32
#4  0x000056320b064ed3 in polars::AssertFailException::AssertFailException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&&, std::__1::source_location const&) at /home/crab/WorkSpace/polars/polars-llvm/cpp/common/Exception.h:54
#5  0x000056320be300ca in polars::core::window_operator::SortedWindowStateFactory::GlobalSinkStateImpl::nextPartition(unsigned char* const*&, unsigned char* const*&, char const*) at /home/crab/WorkSpace/polars/polars-llvm/cpp/core/operator/window/SortedWindowOperator.cpp:62
#6  0x00007fb67abdd09c
#7  0x000056320b61e062 in std::__1::invoke_result<bool (*)(polars::core::RuntimeQueryResources*, long, long), polars::core::RuntimeQueryResources*, int, int>::type polars::codegen::VM::invoke<bool (*)(polars::core::RuntimeQueryResources*, long, long), polars::core::RuntimeQueryResources*, int, int>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, polars::core::RuntimeQueryResources*&&, int&&, int&&) at /home/crab/WorkSpace/polars/polars-llvm/cpp/codegen/VM.h:82
#8  0x000056320b61b21f in polars::core::PipelineTask::execute() at /home/crab/WorkSpace/polars/polars-llvm/cpp/core/execution/PipelineTask.cpp:139
#9  0x000056320b60921a in polars::core::TaskExecutor::WorkForever(polars::core::TaskExecutor::ThreadState*) at /home/crab/WorkSpace/polars/polars-llvm/cpp/core/execution/TaskExecutor.cpp:65
#10 0x000056320b60a88b in polars::core::TaskExecutor::setThreadNum(int)::$_0::operator()() const at /home/crab/WorkSpace/polars/polars-llvm/cpp/core/execution/TaskExecutor.cpp:142
#11 0x000056320b60a844 in decltype(std::declval<polars::core::TaskExecutor::setThreadNum(int)::$_0>()()) std::__1::__invoke[abi:ne180100]<polars::core::TaskExecutor::setThreadNum(int)::$_0>(polars::core::TaskExecutor::setThreadNum(int)::$_0&&) at /usr/lib/llvm-18/bin/../include/c++/v1/__type_traits/invoke.h:344
#12 0x000056320b60a81c in void std::__1::__thread_execute[abi:ne180100]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, polars::core::TaskExecutor::setThreadNum(int)::$_0>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, polars::core::TaskExecutor::setThreadNum(int)::$_0>&, std::__1::__tuple_indices<...>) at /usr/lib/llvm-18/bin/../include/c++/v1/__thread/thread.h:193
#13 0x000056320b60a641 in void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, polars::core::TaskExecutor::setThreadNum(int)::$_0>>(void*) at /usr/lib/llvm-18/bin/../include/c++/v1/__thread/thread.h:202
#14 0x00007fb67aa41ac2 in start_thread at ./nptl/pthread_create.c:442
#15 0x00007fb67aad384f at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(stack trace cost: 69616ms)

With the prioritize_memory cache mode, 750MB of memory is retained after the stack trace; with prioritize_speed, 2.8GB is retained.
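
For reference, I measured it roughly like this (a simplified sketch rather than our actual code; the rss_kb helper is a Linux-specific stand-in I wrote for illustration, and I'm assuming the experimental set_cache_mode call is the right way to switch modes):

#include <chrono>
#include <fstream>
#include <iostream>
#include <string>
#include <cpptrace/cpptrace.hpp>

// Rough resident-set-size reading from /proc/self/status (Linux only);
// good enough to see memory retained by the symbol cache after a trace.
static long rss_kb() {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmRSS:", 0) == 0) {
            return std::stol(line.substr(6)); // "VmRSS:   12345 kB"
        }
    }
    return -1;
}

int main() {
    cpptrace::experimental::set_cache_mode(cpptrace::cache_mode::prioritize_memory);

    long before = rss_kb();
    auto start = std::chrono::steady_clock::now();
    cpptrace::generate_trace().print();
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    long after = rss_kb();

    std::cout << "stack trace cost: " << elapsed.count() << "ms, "
              << "RSS retained: " << (after - before) / 1024 << "MB\n";
}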

I suspected the JIT code caused the problem (scanning all of the DWARF looking for a symbol that doesn't exist), but it isn't what I thought.

Is this a bug, or is it genuinely this expensive? I'm confused.

@jeremy-rifkin
Owner

Thanks for the additional information. I'll see about testing more this week.

@xiewajueji
Author

xiewajueji commented Dec 17, 2024
