Use PostgreSQL nodes for scanning tables #477
Initial round of review. I really like the direction of the code. One general piece of feedback: this definitely needs a bunch of tests too.
I've been looking into the PR you submitted, and I'm curious about its impact on index-scan functionality. Could you please clarify whether this PR addresses the issue where index scans were not being used when expected? If it does, could you also provide guidance on how to enforce the use of an index scan? Specifically, are there any configuration settings or parameters that can be adjusted to prefer index scans over other scan methods? Thank you.
Hi, I have added a regression test that shows that Bitmap/Index/IndexOnly scans are used when appropriate. Hopefully that will help answer your question.
Thank you very much~
# Conflicts:
#   include/pgduckdb/pgduckdb_guc.h
# Conflicts:
#   src/scan/heap_reader.cpp
#   src/scan/postgres_seq_scan.cpp
Added a WaitLatch that can be used from multiple threads within a process.
At the very least, SlotGetAllAttrs should be called while holding the lock.
If GlobalLock is held for the duration of populating the output vector, we don't need any special WaitLatch wrappers and can call this function directly.
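The locking suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: `global_lock`, `FakeSlot`, and `FillOutputVector` are hypothetical stand-ins, not names from the PR, and a plain `std::mutex` is used in place of whatever lock the extension actually holds.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-in for the process-wide lock discussed above.
static std::mutex global_lock;

// Hypothetical stand-in for a tuple slot whose attributes are already materialized.
struct FakeSlot {
    std::vector<int> values;
};

// Copies every attribute of every slot into `out` while holding the lock for
// the whole operation, so no per-call locking wrapper is needed inside the loop.
void FillOutputVector(const std::vector<FakeSlot> &slots, std::vector<int> &out) {
    std::lock_guard<std::mutex> guard(global_lock); // held until the function returns
    for (const FakeSlot &slot : slots) {
        for (int value : slot.values) {
            out.push_back(value);
        }
    }
}
```

The trade-off in this sketch is coarser locking: other threads wait for the entire fill, but every call made inside the loop is trivially covered by the lock.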
We should now support all types with the native postgres scan.
* Setting this variable to `0` disables parallelization
* Tables with cardinality less than 65536 use only a single parallel process
* Higher cardinality will try to use `max_workers_per_postgres_scan` parallel processes, with an upper limit of `max_parallel_workers`
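As a rough illustration of the rules listed above, the worker count could be chosen along these lines. The function name `PlannedScanWorkers` and its signature are hypothetical, and the sketch assumes that "this variable" refers to `max_workers_per_postgres_scan`; the PR's actual logic may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Threshold taken from the commit message; the constant name is an assumption.
constexpr std::uint64_t kSingleWorkerCardinality = 65536;

// Hypothetical helper: returns how many parallel processes a scan would request.
int PlannedScanWorkers(std::uint64_t cardinality,
                       int max_workers_per_postgres_scan,
                       int max_parallel_workers) {
    // A setting of 0 disables parallelization entirely.
    if (max_workers_per_postgres_scan <= 0) {
        return 0;
    }
    // Never exceed the global limit, and never go below zero.
    const int upper_limit = std::max(0, max_parallel_workers);
    // Small tables use at most a single parallel process.
    if (cardinality < kSingleWorkerCardinality) {
        return std::min(1, upper_limit);
    }
    // Larger tables try to use the per-scan setting, capped by the global limit.
    return std::min(max_workers_per_postgres_scan, upper_limit);
}
```

For example, under these assumptions a table with one million rows, a per-scan setting of 16, and `max_parallel_workers` of 8 would request 8 processes.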
The idea is to reconstruct a query for each table based on DuckDB's filtering information, and to use that query to plan PostgreSQL execution. The resulting plan will potentially execute with parallel workers. If no workers are available, the scan runs locally in the calling thread.
This approach has the advantage that it also supports all other available scan nodes that PostgreSQL considers best (index, index-only, and bitmap scans); partitioned tables should also be possible.
Fixes #243
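To make the query-reconstruction idea in the description concrete, here is a minimal sketch of turning projected columns and pushed-down filters into a per-table SQL string. `ColumnFilter` and `BuildScanQuery` are hypothetical names, not the PR's API, and the sketch deliberately skips identifier quoting and literal escaping, which real code would need.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation of one pushed-down filter, e.g. a > 10.
struct ColumnFilter {
    std::string column;
    std::string op;      // comparison operator such as "=", ">", "<="
    std::string literal; // already rendered as SQL text (assumption)
};

// Builds "SELECT <columns> FROM <table> [WHERE f1 AND f2 ...]".
// No quoting or escaping is performed; this is an illustration only.
std::string BuildScanQuery(const std::string &table,
                           const std::vector<std::string> &columns,
                           const std::vector<ColumnFilter> &filters) {
    std::string sql = "SELECT ";
    if (columns.empty()) {
        sql += "*";
    }
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (i > 0) {
            sql += ", ";
        }
        sql += columns[i];
    }
    sql += " FROM " + table;
    for (std::size_t i = 0; i < filters.size(); ++i) {
        sql += (i == 0) ? " WHERE " : " AND ";
        sql += filters[i].column + " " + filters[i].op + " " + filters[i].literal;
    }
    return sql;
}
```

Handing such a string to the PostgreSQL planner is what would let it pick a sequential, index, index-only, or bitmap scan on its own, which is the advantage the description points to.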