Add function for getting current wallclock time #687
Comments
Thanks for reporting. There are at least four possible definitions for "current time" in arroyo
Currently arroyo just implements the first one (via In Flink, Writing new functions is pretty straightforward so this could be a great first issue for someone.
I'm interested in obtaining the current event time of a row (i.e. 3 above). I took a quick look at how this could be implemented, but couldn't find a similar function. @mwylde - can you give me some guidance on where to look? Can something like
Hey @kpe, happy to help walk through how you'd implement this. row_time() (i.e., 3 in the list above) can't be implemented as a UDF. This is because the expression trees are planned from SQL with DataFusion, which doesn't know about the _timestamp field, as it's not actually part of the SQL schema. We add it in as part of various rewrites of the plan (you can find the various calls to add_timestamp_field throughout arroyo-planner). So instead we would need to define a placeholder UDF, like hop/tumble/session: arroyo/crates/arroyo-planner/src/lib.rs Lines 137 to 139 in 1d95e13
This would then need to be rewritten (by traversing through the logical plan and all expressions) into an expression that gets the _timestamp field.
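To make the rewrite step concrete, here is a rough, self-contained sketch of the idea on a toy expression tree. It does not use DataFusion's or arroyo-planner's actual types: the `Expr` enum and the `rewrite_row_time` function are hypothetical stand-ins, and only the `row_time()` placeholder name and the `_timestamp` field come from the discussion above.

```rust
// Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Call { name: String, args: Vec<Expr> },
}

// Name of the hidden field that the planner adds to each row.
const TIMESTAMP_FIELD: &str = "_timestamp";

// Walks the expression recursively and replaces every zero-argument call to
// the placeholder `row_time()` with a reference to the timestamp column.
// All other nodes are rebuilt unchanged.
fn rewrite_row_time(expr: Expr) -> Expr {
    match expr {
        Expr::Call { name, args } if name == "row_time" && args.is_empty() => {
            Expr::Column(TIMESTAMP_FIELD.to_string())
        }
        Expr::Call { name, args } => Expr::Call {
            name,
            args: args.into_iter().map(rewrite_row_time).collect(),
        },
        other => other,
    }
}
```

In the real planner the same traversal would presumably be applied to every expression of every node in the logical plan, after the `_timestamp` field has been added to the schema.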
We have implemented a Rust UDF for it. It might seem hacky, but it works for us and serves our use cases. This is for arroyo v0.12.0
The corresponding SQL usage:
Hi!
I am trying to add a processing time timestamp to each row with the following query:
However, it seems that the now() function is executed only once, resulting in a static current_ts field that does not update over time.
Are there any keywords or functions that can generate a real-time timestamp for each record in the output?
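The behavior described above is what you would expect if now() is evaluated a single time when the query is planned and then treated as a constant. The following sketch is only an illustration of that difference and is not Arroyo code: `stamp_once`, `stamp_per_row`, and `wallclock_millis` are hypothetical names, and the clock is passed in as a closure so the two strategies can be compared.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Stamps every row with a single clock reading taken up front. This mirrors
// a `now()` call that is replaced by a constant before any rows are processed.
fn stamp_once<T>(rows: Vec<T>, mut clock: impl FnMut() -> u64) -> Vec<(T, u64)> {
    let planned_at = clock();
    rows.into_iter().map(|row| (row, planned_at)).collect()
}

// Stamps each row with its own clock reading, taken when that row is processed.
fn stamp_per_row<T>(rows: Vec<T>, mut clock: impl FnMut() -> u64) -> Vec<(T, u64)> {
    rows.into_iter().map(|row| (row, clock())).collect()
}

// Wall-clock time in milliseconds since the Unix epoch (0 if the system
// clock is set before the epoch).
fn wallclock_millis() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}
```

Calling `stamp_per_row(rows, wallclock_millis)` gives each record its own processing-time value, which is the behavior being asked for here, while `stamp_once` reproduces the static current_ts field.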