Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy_to(overwrite=T) with in_schema() doesn't remove temp_* table created #279

Open
willshen99 opened this issue Dec 9, 2024 · 3 comments
Assignees
Labels

Comments

@willshen99
Copy link

willshen99 commented Dec 9, 2024

What happened:
When running copy_to with overwrite=T and use in_schema for table name, an extra temp_* table is created.

How to reproduce:
Run below script twice. You will see a table temp_* in the output schema

copy_to(dest = dest, df = mtcars, name = in_schema(schema = "schema_name_different_from_conn_schema", table = "mtcars"), overwrite = TRUE)

Issue:
The approach used for copy_to with overwrite is: 1. if the table already exists, rename the old table to temp_*; 2. create a new empty table with the table name; 3. write data to new table; 4. if succeed, remove the temp_* table
However, the name for the temp table is not schema qualified. When doing removal, it will only look for tables in the default schema specified in connection.

RPresto/R/dbWriteTable.R

Lines 94 to 96 in 3bf0265

rn <- paste0(
"temp_", paste(sample(letters, 10, replace = TRUE), collapse = "")
)

RPresto/R/dbWriteTable.R

Lines 166 to 170 in 3bf0265

if (dbExistsTable(conn, rn)) {
if (dbRemoveTable(conn, rn)) {
message("The table ", name, " is overwritten.")
}
}

@willshen99 willshen99 changed the title copy_to(overwrite=T) with in_schema() doesn't remove tmp_* table created copy_to(overwrite=T) with in_schema() doesn't remove temp_* table created Dec 9, 2024
@jarodmeng
Copy link
Contributor

Thanks for the detailed investigation. I can see how this could be problematic for in_schema() tables. I will find some time to work on it, but it's low priority right now.

@jarodmeng
Copy link
Contributor

I actually can't reproduce this. The temp table is created in the original schema and is removed from the original schema too. Can you provide a reprex in which the temp table is not correctly removed?

@willshen99
Copy link
Author

I can reproduce the error both on Hive and Postgres catalog connecting to Trino.

conn = DBI::dbConnect(
  RPresto::Presto(),
  host = "",
  port = 8443,
  catalog = "hive",
  schema = "default", # Note this is the default schema for connection. It's different from the schema to write
  user = "trino",
  use.trino.headers = TRUE
)

DBI::dbWriteTable(conn, in_schema('reprex', 'iris'), iris, overwrite=T) # first writing to iris table in reprex schema
DBI::dbWriteTable(conn, in_schema('reprex', 'iris'), iris, overwrite=T) # try to overwrite

Above example executed without error. However, it creates an extra temp_ table in the 'reprex' schema.
image

Package Version:

> packageVersion('dbplyr')
[1] ‘2.4.0’
> packageVersion('DBI')
[1] ‘1.1.3’
> packageVersion('RPresto')
[1] ‘1.4.6.9000’

@jarodmeng jarodmeng self-assigned this Dec 18, 2024
@jarodmeng jarodmeng added the bug label Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants