Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepared Statements: Ruby support #105

Open
adrianna-chang-shopify opened this issue Jul 18, 2023 · 5 comments
Open

Prepared Statements: Ruby support #105

adrianna-chang-shopify opened this issue Jul 18, 2023 · 5 comments

Comments

@adrianna-chang-shopify
Copy link
Collaborator

adrianna-chang-shopify commented Jul 18, 2023

We shipped the C API for prepared statements in #3. The next step is to add support to the Ruby bindings.

API

I imagine we'll want something similar to Mysql2's implementation: a Trilogy#prepare method that returns a Trilogy::Statement object, and a Trilogy::Statement#execute method that can be used to execute a PS with given parameters, e.g.:

client = Trilogy.new(host: "127.0.0.1", port: 3306, username: "root", read_timeout: 2, database: "trilogy_test")

statement = client.prepare("SELECT * FROM users WHERE id = ?")
result1 = statement.execute(1)
result2 = statement.execute(2)

statement = client.prepare("SELECT * FROM users WHERE id >= ? AND name LIKE ?")
result = statement.execute(1, "Joe")

Trilogy::Statement should, at a minimum, expose the underlying data values from the C API: the statement id, column count, parameter count, and warning count. Mysql2::Statement exposes things like affected_rows, fields, last_id, etc. that we may eventually want Trilogy::Statement to expose as well.

Blockers

@byroot and I began working on this and ran into some issues while fleshing out the Ruby class that will wrap the C prepared statement struct (trilogy_stmt_t). As Mysql2 does, we'll need to close the prepared statement when freeing the Ruby Trilogy::Statement object:
https://github.com/brianmario/mysql2/blob/6cf5e1d732b4a6ee5485def98637f100cbc86f1b/ext/mysql2/statement.c#L26-L29
https://github.com/brianmario/mysql2/blob/master/ext/mysql2/statement.c#L75
https://github.com/brianmario/mysql2/blob/master/ext/mysql2/statement.c#L65

trilogy_stmt_close needs the stmt and the connection:

trilogy/src/blocking.c

Lines 366 to 368 in 61a8b40

int trilogy_stmt_close(trilogy_conn_t *conn, trilogy_stmt_t *stmt)
{
int rc = trilogy_stmt_close_send(conn, stmt);

We need the connection to send TRILOGY_CMD_STMT_CLOSE -- so we can create an intermediate struct (that will be wrapped into Trilogy::Statement) that encapsulates both trilogy_stmt_t and the connection. The problem arises if the connection ends up being closed / freed before we attempt to close the prepared statement / free its memory. In this case, attempting to access the connection on the statement wrapper will raise an EXC_BAD_ACCESS exception.

The solution is likely to do something similar to what the native SQL client library does -- it stores a linked list of all prepared statements on the client itself, and then when the client is closed, it "detaches" the statement list by clearing the connection pointer of every statement. That way, we can no-op in trilogy_stmt_close if the connection is gone, and avoid accessing memory for a conn that's already been freed.

If anyone else has opinions on this though, would love to hear!


I've started a branch here for the Ruby bindings, but it's still very much a WIP: https://github.com/trilogy-libraries/trilogy/compare/main...adrianna-chang-shopify:trilogy:ac-prepared-statements-ruby?expand=1

@byroot
Copy link
Contributor

byroot commented Jul 19, 2023

The solution is likely to do something similar to what the native SQL client library does -- it stores a linked list of all prepared statements on the client itself, and then when the client is closed, it "detaches" the statement list by clearing the connection pointer of every statement.

So now that I think about it, I'm actually surprised this doesn't cause race conditions.

Another thought I had, is that it would be worth looking at what SQLite and Postgres client libraries do. MySQL may not be using the best solution here.

@adrianna-chang-shopify
Copy link
Collaborator Author

So now that I think about it, I'm actually surprised this doesn't cause race conditions.

Hm, yeah taking a look again: we send a QUIT command to the server and close the connection before we detach the statement list, so seems plausible that freeing the prepared statement and attempting to use the connection could happen after we've closed the connection, but before the statement list has been updated.

Will investigate the other drivers 👍

@adrianna-chang-shopify
Copy link
Collaborator Author

adrianna-chang-shopify commented Jul 19, 2023

Okay, from the SQLite source:

// sqlite3.c
** CAPI3REF: Closing A Database Connection
** DESTRUCTOR: sqlite3
**
** ^The sqlite3_close() and sqlite3_close_v2() routines are destructors
** for the [sqlite3] object.
**
** ^Calls to sqlite3_close() and sqlite3_close_v2() return [SQLITE_OK] if
** the [sqlite3] object is successfully destroyed and all associated
** resources are deallocated.
** Ideally, applications should [sqlite3_finalize | finalize] all
** [prepared statements], [sqlite3_blob_close | close] all [BLOB handles], and
** [sqlite3_backup_finish | finish] all [sqlite3_backup] objects associated
** with the [sqlite3] object prior to attempting to close the object.
** ^If the database connection is associated with unfinalized prepared
** statements, BLOB handlers, and/or unfinished sqlite3_backup objects then
** sqlite3_close() will leave the database connection open and return
** [SQLITE_BUSY].

So rather than MySQL's flow of:

  • We attempt to close the connection
  • It detaches all prepared statement connection pointers
  • When freeing a prepared statement, we avoid ending the command to close the statement server-side if we no longer have the connection

SQLite does the following:

  • We attempt to close the connection
  • If there are prepared statements open, we prevent the conn from being closed and return SQLITE_BUSY (in the case of sqlite3_close, which is what the Ruby SQLite3 gem uses)

Interestingly, the SQLite documentation for the close routine also says this:

The sqlite3_close_v2() interface
^If sqlite3_close_v2() is called with unfinalized prepared
** statements, unclosed BLOB handlers, and/or unfinished sqlite3_backups,
** it returns [SQLITE_OK] regardless, but instead of deallocating the database
** connection immediately, it marks the database connection as an unusable
** "zombie" and makes arrangements to automatically deallocate the database
** connection after all prepared statements are finalized, all BLOB handles
** are closed, and all backups have finished. The sqlite3_close_v2() interface
** is intended for use with host languages that are garbage collected, and
** where the order in which destructors are called is arbitrary.

So although sqlite3_close is used by sqlite3-ruby, seems like sqlite3_close_v2 would be more fitting since Ruby is GC'ed -- and perhaps this is a pattern that would lend itself to Trilogy? Avoid closing the db connection right away, and instead prepare the deallocate it after all of the prepared statements have been closed.

@adrianna-chang-shopify
Copy link
Collaborator Author

adrianna-chang-shopify commented Jul 19, 2023

PostgreSQL, OTOH, doesn't even support closing prepared statements in the client:

Prepared statements for use with PQexecPrepared can also be created by executing SQL PREPARE statements. Also, although there is no libpq function for deleting a prepared statement, the SQL DEALLOCATE statement can be used for that purpose.

(https://www.postgresql.org/docs/current/libpq-exec.html under PQprepare)

casperisfine pushed a commit to casperisfine/trilogy that referenced this issue Jul 25, 2023
Ref: trilogy-libraries#105

To close a prepared statement you need to have access to the connection
that created it.

But in managed languages like Ruby, the obvious thing to do is to
close the prepared statement when the associated object is garbage
collected.

But the order in which objects are garbaged collected is never guaranteed
so when freeing a statement the connection might have been freed
already and it's hard to detect.

We checked how libmysqlclient handles it, and each `MYSQL_STMT` has
a reference to its `MYSQL` (connection), and the connection keeps a
doubly linked list of the statements it created.

When a statement is closed it's removed from the list, when the connection
is closed, all the connection references are set to NULL.

We implemented exactly the same logic here.

Additionally, prepared statement can only be used with the connection
they were created from. As such having all the `trilogy_stmt_*` function
take a connection isn't great for usability. So this change opens the
door to only taking a `trilogy_stmt_t *`.

Co-Authored-By: Adrianna Chang <[email protected]>
casperisfine pushed a commit to casperisfine/trilogy that referenced this issue Jul 25, 2023
Ref: trilogy-libraries#105

To close a prepared statement you need to have access to the connection
that created it.

But in managed languages like Ruby, the obvious thing to do is to
close the prepared statement when the associated object is garbage
collected.

But the order in which objects are garbaged collected is never guaranteed
so when freeing a statement the connection might have been freed
already and it's hard to detect.

We checked how libmysqlclient handles it, and each `MYSQL_STMT` has
a reference to its `MYSQL` (connection), and the connection keeps a
doubly linked list of the statements it created.

When a statement is closed it's removed from the list, when the connection
is closed, all the connection references are set to NULL.

We implemented exactly the same logic here.

Additionally, prepared statement can only be used with the connection
they were created from. As such having all the `trilogy_stmt_*` function
take a connection isn't great for usability. So this change opens the
door to only taking a `trilogy_stmt_t *`.

Co-Authored-By: Adrianna Chang <[email protected]>
@adrianna-chang-shopify
Copy link
Collaborator Author

This PR should handle the blockers described in this issue. I started this branch to work on the Ruby bindings, but unfortunately will have to shelve this work until I have more time to dedicate to it. Happy to chat if anyone wants to take it over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants