-
Notifications
You must be signed in to change notification settings - Fork 459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to include fields that are equal to the default value in the JSON encoding #1526
base: main
Are you sure you want to change the base?
Conversation
This commit has all of the functional changes 59a6212 |
Do the other languages end up encoding the empty arrays when visiting default values? i.e. - does it make sense to visit on those or just skip the visit calls still? We probably want to make sure our JSON support ends up doing the same things the other upstream languages do. In any case, since this is public api, we should spell out in the comments what the expected behavior is for all cases:
And then use a custom listener in the test to ensure all these behaviors for all the different message types. |
For the comments spelling out the behaviors, I guess we should use the terms from the upstream pages: |
Yes I'm using the python package for comparison and if you set https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html
|
I'm not sure what you mean by that. Is an encoding test that asserts the encoded json is as expected not enough? |
7319183
to
6d44684
Compare
Haven't looked in detail, it might be. I wasn't sure if json had any specific logic of it's own vs. ensure all the methods expected did/didn't get called. |
I finally got a chance to start looking at this, and I'm going to suggest we step back to hopefully won't be a bike shed – As drafted, the PR calls the The looking at the C++ api, their JSON is The Java api has: I don't think we want to eventually try to do the level of the api that Java has (the Set support would really complicate the visitor pattern). So on the naming side, my 2¢ would be: I think default gets a little confusing around the behavior with repeated fields and
The JSON value lines up with the two other names we already have that start Thoughts? @tbkka what makes sense to you? |
I like Thomas' naming ideas here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the test case added is in the regenerate commit, so I almost missed that addition.
@@ -47,6 +47,10 @@ class MessageFieldGenerator: FieldGeneratorBase, FieldGenerator { | |||
return false | |||
} | |||
} | |||
|
|||
var usesDefaultValueFlagForTraversal: Bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, we don't want to have to check syntax any more (working towards Editions). Would just looking at say hasFieldPresence and if the field is repeated/map vs. singular work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a difference in behavior between proto2 and proto3 though. At least in the python lib which I was using for reference. In proto3 unset optional primitives are omitted from json, and in proto2 unset optional primitives are included. I'm not sure if theres a better way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Descriptor
support should already be tracking most of that: https://github.com/apple/swift-protobuf/blob/main/Sources/SwiftProtobufPluginLibrary/Descriptor.swift#L638-L642 What's the python code you're looking at for reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look at python code, just observed how the python lib behaves when serializing a message with every kind of field type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example of what I mean
Proto3 behavior
syntax = "proto3";
message Proto3Message {
int32 singularInt32 = 1;
repeated int32 repeatedInt32 = 2;
optional int32 optionalInt32 = 3;
}
print(MessageToJson(Proto3Message(), including_default_value_fields=True))
{
"singularInt32": 0,
"repeatedInt32": []
}
Proto2 behavior
message Proto2Message {
required int32 singularInt32 = 1;
repeated int32 repeatedInt32 = 2;
optional int32 optionalInt32 = 3;
}
print(MessageToJson(Proto2Message(), including_default_value_fields=True))
{
"singularInt32": 0,
"repeatedInt32": [],
"optionalInt32": 0
}
hasPresence
will be true for optionalInt32
for both proto2 and proto3 but we want to print it in proto2 and omit it in proto3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good I can work on that. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a docs rollout is happening:
https://protobuf.dev/programming-guides/proto2/#json-options
Always emit fields without presence: Fields that don’t support presence and that have their default value are omitted by default in JSON output (for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.
As of v25.x, the C++, Java, and Python implementations are nonconformant, as this flag affects proto2
optional
fields but not proto3optional
fields. A fix is planned for a future release.
https://protobuf.dev/programming-guides/proto3/#json-options
Always emit fields without presence: Fields that don’t support presence and that have their default value are omitted by default in JSON output (for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by those docs for proto2
(for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.
Don't integer and string fields always have presence under proto2? Under what scenario would we actually emit these in proto2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the logic to just use hasPresence
for proto2 the same way as in proto3. But I think those docs could use some clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please feel free to open an issue upstream with the feedback. Some times it helps to hear directly from devs.
@mrabiciu looks like you might need to sync also to resolve some conflicts. |
@tbkka can you take a look, I think given the new docs from the protobuf team, we're likely pretty good to go on this. |
8b366c4
to
49c7175
Compare
Rebased on main and squashed all my commits |
49c7175
to
ba9708f
Compare
@thomasvl It's been 10 months now. Was there a blocker on this? It seemed at the time like it was ready to merge in. Has this been supplanted by another PR or methodology? Any guidance from the team on how we should deal with this situation in the meantime? Edit: Found the answer in another thread from another thread. Have to wait for v2 apparently. |
New apis/additions to the library generally get signoff from someone at Apple since they own the repo. Currently they want to ensure any changes/additions to the repo are non breaking as a major version bump could cause issues for downstream packages that also depend on this repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a sensible change to me. This is only additive in adding new encoding options and it adds new test cases to check that this option is working correctly. The change is non breaking since we provide a default implementation of the new protocol requirement.
@tbkka left a comment here in 2023 for another change that also only adds new APIs, but he mentioned it's breaking because it's breaking between the runtime and the generated code (kind of similarly to this PR, new API was added to the I agree there are only API additions with default implementations on this PR, but if you have mismatching versions for the generated code and SwiftProtobuf (specifically, newer generated code running with an older swift-protobuf), things will break because the generated code now references a new field in the FWIW, I don't think these count as breaking changes (so I agree with @FranzBusch approving this): if someone went to the trouble of updating the generated code with a newer version of the plugin, why would they keep the old version of the runtime (or vice versa)? As long as users are updating both things simultaneously they should be safe, and I don't see why this wouldn't happen, to be honest. @thomasvl In the README we suggest that you should try to keep the versions of |
I don't think we have. Since we've never come up with a better way to do the version mismatch checks compared to what the C based languages can do, it's sorta a space we've always struggled with. |
I guess it isn't a "break", but the potential catch I can see is if someone depends on protos compiled into another package, they would need those also updated for this option to work since the support is captured in the generated code. The only other thing I can think of that has been a concern on additions is code size increase, and this does increase the generated code size (which I believe is what pushed some of the investigation into table based generation to try and factor things like this out). |
/// - repeated fields: Will always be visited, even if they are empty. | ||
/// - map fields: Will always be visited, even if they are empty. | ||
/// - singular message and group fields: Will be visited only if they are set. | ||
/// - singular primitive fields: Will always be visited under proto2. Under proto3 unset optional primitive fields will not be visited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc string likely needs an update since everything is about Editions now upstream.
This adds the ability to include fields that are equal to the default value in the json encoding.
This is done by adding a
traversalOptions
var toVisitor
as well as adding aincludeDefaultValues
toJSONEncodingOptions
which controls theJSONEncodingVisitor
'straversalOptions
I've modified the generation of
traverse
to checktraversalOptions
before callingvisit
on each individual field.