-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for scripting engines as Valkey modules #1277
Conversation
I'm opening this PR in draft mode because I still need to fix test failures, and add new tests as well. |
ea6e6b6
to
d3293dd
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #1277 +/- ##
============================================
- Coverage 70.82% 70.79% -0.03%
============================================
Files 119 119
Lines 64691 64884 +193
============================================
+ Hits 45818 45937 +119
- Misses 18873 18947 +74
|
4a2e223
to
1a808b2
Compare
I just realized I still need to implement the |
Core team said we are directionally aligned with, we still want someone to take a deeper look to make sure the details makes sense. |
fe56090
to
9d4b296
Compare
The PR is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like great progress. Is it ready for a proper review or are you going to change much more?
Shall we do EVAL as a follow-up PR?
Each engine has only one context which is used for all calls, right? Would it be possible for an engine to use a context (or sub-context, etc.) for each client? This idea came up in the LuaJIT discussion by @secwall in #1229. A separate Lua context per client would isolate each client from each other, which compensates for the fact that Lua isn't completely sandboxed by design. Maybe the engine can organize this by itself, if the engine just has access to the current client at the time of a FCALL or EVAL?
It's ready. I don't think I'll add more changes apart from the changes asked by the reviewers.
Yes, that's the plan.
I'll take a look if it's possible, and will open a follow-up PR with the changes. |
This commit extends the module API to support the addition of different scripting engines to run user defined functions. The scripting engine can be implemented as a Valkey module, and can be dynamically loaded with the `loadmodule` config directive, or with the `MODULE LOAD` command. The current module API support, only allows to load scripting engines to run functions using `FCALL` command. In a follow up PR, we will move the Lua scripting engine implmentation into its own module. Signed-off-by: Ricardo Dias <[email protected]>
9d4b296
to
fc03df7
Compare
This commit adds a module with a very simple stack based scripting language implementation to test the new module API that allows to implement new scripting engines as modules. Signed-off-by: Ricardo Dias <[email protected]>
fc03df7
to
9c618a3
Compare
Can you avoid force-pushing? It's easier to review what's changed since last time if you just push small commits. You can even merge unstable to it, rather than rebase. We squash-merge PRs anyway in the end and we use the PR title and description as the commit message. |
Ah sorry. I wasn't aware that we were squashing the commits upon merge. I'll stop force-pushing. |
Signed-off-by: Ricardo Dias <[email protected]>
Today I added more test cases to test the error code paths. |
…nction` Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
…allback functions Signed-off-by: Ricardo Dias <[email protected]>
@zuiderkwast @PingXie I've addressed all comments and pushed the commits with the changes.
|
…t callback should be implemented Signed-off-by: Ricardo Dias <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
This is basically a preparation for calling scripts from modules, right? It's a good idea, but it exposes even more API. If it's not usable from the module API yet, then it's perhaps better to hide it from valkeymodule.h so we don't commit to this exact API just yet...? For example, I see you named a field |
Actually, I think I misunderstood how the module API is implemented. I was under the impression that if the scripting engine wanted to call another command using the Therefore, I'm going to revert this change. |
…sult" This reverts commit 0876bff. Signed-off-by: Ricardo Dias <[email protected]>
…xtElement callback should be implemented" This reverts commit 4a2cb12. Signed-off-by: Ricardo Dias <[email protected]>
…function execution" This reverts commit 3c60980. Signed-off-by: Ricardo Dias <[email protected]>
…he module context object Signed-off-by: Ricardo Dias <[email protected]>
d567079
to
3cfea1c
Compare
Revert is done. I think now everything is addressed. |
typedef ValkeyModuleScriptingEngineCompiledFunction **(*ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc)( | ||
ValkeyModuleCtx *module_ctx, | ||
ValkeyModuleScriptingEngineCtx *engine_ctx, | ||
const char *code, | ||
size_t timeout, | ||
size_t *out_num_compiled_functions, | ||
char **err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this some more, I'd like to propose returning an integer error code instead:
typedef ValkeyModuleScriptingEngineCompiledFunction **(*ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc)( | |
ValkeyModuleCtx *module_ctx, | |
ValkeyModuleScriptingEngineCtx *engine_ctx, | |
const char *code, | |
size_t timeout, | |
size_t *out_num_compiled_functions, | |
char **err); | |
typedef int (*ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc)( | |
ValkeyModuleCtx *module_ctx, | |
ValkeyModuleScriptingEngineCtx *engine_ctx, | |
const char *code, | |
size_t timeout, | |
size_t *out_num_compiled_functions, | |
ValkeyModuleScriptingEngineCompiledFunction **compiled_functions); |
This would
- avoid the assumption that a scripting module would allocate these strings using zmalloc, which can't be enforced by compiler
- eliminate the "char * to sds" conversion
- help ensure consistency over the error strings across different scripting engines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the suggestion but I see one problem with 3. How the scripting engine communicates the kind of error that made this function to fail back to the client?
For instance, most of the times this function might fail because of a syntax error in the script, and the scripting engine can give a nice message specifying where the syntax error is located, which in this case will not be sent to the client.
We might log the error in the server log, but I assume the client of valkey might not have access to the server logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Do you have an example? How does the built-in Lua interpreter handle syntax errors? I wonder if we could generalize an outgoing line number and an outgoing column number? Or we'd like to provide more detailed context like a real compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, when there's some syntax, or runtime error we send the error message returned by lua:
127.0.0.1:6379> function load replace "#!lua name=mylib\nlocal function foo()\nretrn 1\nend\nserver.register_function('foo', bar)\n"
(error) ERR Error compiling function: user_function:3: '=' expected near '1'
127.0.0.1:6379> function load replace "#!lua name=mylib\nlocal function foo()\nreturn 1\nend\nserver.register_function('foo', bar)\n"
(error) ERR Error registering functions: ERR user_function:5: Script attempted to access nonexistent global variable 'bar'
I think the more generic approach is to allow the scripting engine to return a C string with the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PingXie How about we get this PR merged and we can still change the API a bit more before 8.1 is released? Ricardo has the follow-up PR for EVAL waiting to be added. It will slightly modify the API already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjd15372 We should document this in some way for the module API docs, for example that the module needs to allocate the err
using ValkeyModule_Alloc
(a wrapper for zmalloc).
The comments before each VM_
function definition gets extracted to produce this page: https://valkey.io/topics/modules-api-ref/
We also put comments for structs in valkeymodule.h, but it's not perfect for documentation. For example, Rust module authors don't directly use this file.
We can consider creating a new documentation topic page for engines later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we get this PR merged and we can still change the API a bit more before 8.1 is released?
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the more generic approach is to allow the scripting engine to return a C string with the error message.
Yeah if we would like to provide detailed compilation errors then we need to keep this capability but I would suggest robj
too over C strings.
Signed-off-by: Ricardo Dias <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job 🎉 Thanks @rjd15372!
|
||
/* This struct is used to return the memory information of the scripting | ||
* engine. */ | ||
typedef struct ValkeyModuleScriptingEngineMemoryInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this info struct needs to be versioned. see ValkeyModuleClientInfo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it requires versioning. The version number in ValkeyModuleScriptingEngineMethodsV1
should be enough. If we need to change the structure of ValkeyModuleClientInfo
, we can bump the version of VALKEYMODULE_SCRIPTING_ENGINE_ABI_VERSION
.
We're doing the same for the ValkeyModuleScriptingEngineCompiledFunction
struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sometimes we might just need a new field to the struct. bumping the ABI version would be more disruptive and might be breaking. If we version the struct, we can just add the new field to the end of the struct. The module who only understands V1 struct would still be able to parse the return info (all the way to the end of the V1 field list).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sometimes we might just need a new field to the struct. bumping the ABI version would be more disruptive and might be breaking. If we version the struct, we can just add the new field to the end of the struct. The module who only understands V1 struct would still be able to parse the return info (all the way to the end of the V1 field list).
Yes, and we already version the struct ValkeyModuleScriptingEngineMethodsV1, which is the entry point for all scripting engine structs. If we add a field in any of these structs, we bump the version in the main struct ValkeyModuleScriptingEngineMethods.
@PingXie is there any benefit is versioning every single struct? And what do you mean by bumping the ABI version would be more disruptive? All we're talking about here is just versioned structs...
ValkeyModuleScriptingEngineCtx *engine_ctx, | ||
void *compiled_function); | ||
|
||
typedef ValkeyModuleScriptingEngineMemoryInfo (*ValkeyModuleScriptingEngineGetMemoryInfoFunc)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the info buffer needs be allocated by the caller with size provided for future-proofing. see ValkeyModule_GetClientInfoById
typedef ValkeyModuleScriptingEngineCompiledFunction **(*ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc)( | ||
ValkeyModuleCtx *module_ctx, | ||
ValkeyModuleScriptingEngineCtx *engine_ctx, | ||
const char *code, | ||
size_t timeout, | ||
size_t *out_num_compiled_functions, | ||
char **err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the more generic approach is to allow the scripting engine to return a C string with the error message.
Yeah if we would like to provide detailed compilation errors then we need to keep this capability but I would suggest robj
too over C strings.
This PR extends the module API to support the addition of different scripting engines to execute user defined functions.
The scripting engine can be implemented as a Valkey module, and can be dynamically loaded with the
loadmodule
config directive, or with theMODULE LOAD
command.This PR also adds an example of a dummy scripting engine module, to show how to use the new module API. The dummy module is implemented in
tests/modules/helloscripting.c
.The current module API support, only allows to load scripting engines to run functions using
FCALL
command.The additions to the module API are the following: