-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(sparksql): Speed up sparksql compilation by splitting function registrations #11565
refactor(sparksql): Speed up sparksql compilation by splitting function registrations #11565
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@PHILO-HE @rui-mo @mbasmanova would you help review this PR? it's help speed up sparksql compilation time. |
@FelixYBW @majetideepak @pedroerp would you help review this PR? it's help speed up sparksql complication times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Added several comments.
4d135d3
to
e2b4294
Compare
@rui-mo would you help review again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. Added several comments.
e2b4294
to
d60c91d
Compare
@rui-mo @pedroerp @majetideepak @Yuhta would you help review again? This PR is very helpful for improving the sparksql compilation speed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Added several minors.
@@ -20,7 +20,7 @@ | |||
|
|||
namespace facebook::velox::functions::sparksql { | |||
|
|||
void registerFunctions(const std::string& prefix); | |||
void registerFunctions(const std::string& prefix = ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the default value added, callers with empty string parameter like sparksql::registerFunctions("")
can be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can fix it in another PR if needed, many files will be affected.
fedf4a7
to
4826b23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Looks good to me.
Would you take a look if the workflow failures can be resolved after rebasing the code? |
15c540d
to
2f0a81d
Compare
e3d7952
to
fe0df08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the metrics this is the slowest to compile file in debug and on third place for release! https://facebookincubator.github.io/velox/bm-report/
I can see this improving dx when working on sparksql functions and reducing the need for rebuilds.
CMake looks good.
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe add a comment explaining why this approach was taken.
thanks for clarify the compilation time improvement works, I will try improve sparksql agg/window functions compilation time in following PR. |
@xiaoxmeng Who can help to merge the PR? with the fix, spark functions' compile time can improve ~1.5x. |
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xiaoxmeng merged this pull request in 0bb7e64. |
@rui-mo @FelixYBW @xiaoxmeng thank for merging! |
…on registrations (facebookincubator#11565) Summary: This PR aims to speed up sparksql compilation by splitting function registrations to multiple source files arranged according to function type. Adds 'velox_functions_spark' for registrations and renames previous 'velox_functions_spark' as 'velox_functions_spark_impl'. Tested the compilation time using `velox_functions_spark_test` target to mock the general development process: build -> modify cpp file -> build. The compilation time speeds up 1.5x(165s to 104s) in release mode and more in debug mode. Fixes facebookincubator#11564. Pull Request resolved: facebookincubator#11565 Reviewed By: miaoever, kagamiori Differential Revision: D66688101 Pulled By: xiaoxmeng fbshipit-source-id: 54ba372f08c4ec91062b3d07e8e2b81aabbdef59
This PR aims to speed up sparksql compilation by splitting function
registrations to multiple source files arranged according to function type.
Adds 'velox_functions_spark' for registrations and renames previous
'velox_functions_spark' as 'velox_functions_spark_impl'.
Tested the compilation time using
velox_functions_spark_test
target to mockthe general development process: build -> modify cpp file -> build. The
compilation time speeds up 1.5x(165s to 104s) in release mode and more in debug
mode.
Fixes #11564.