Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RequireNonNullMessageChecker #13

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Add RequireNonNullMessageChecker #13

wants to merge 2 commits into from

Conversation

ksobolew
Copy link
Member

@ksobolew ksobolew commented Nov 7, 2022

This is a check to find some common cases of malformed requireNonNull calls. See the documentation for more details.

Based on trinodb/trino#10473 but is slightly more restrictive than that one (it always disallows the case when the part before the " is null" part is not the required identifier, while the other check only disallows it in constructors).

@ksobolew
Copy link
Member Author

ksobolew commented Nov 8, 2022

Why am I allowed to merge it without any approvals? 🤔

@kokosing
Copy link
Member

kokosing commented Nov 8, 2022

It should not be the case:
image
Can you please check it again?

@ksobolew
Copy link
Member Author

ksobolew commented Nov 8, 2022

Ok, now it's greyed out :)

extends BugChecker
implements MethodInvocationTreeMatcher
{
private static final Pattern MESSAGE_PATTERN = Pattern.compile(" (?:(?:is|are|was|were) (?:null|empty|missing|none)|(?:is|are) required|(?:must not|cannot|can't) be (?:null|empty))");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing multiple patterns with regex makes it hard to reason about what's allowed and how it matches. What about being more opinionated and matching only 1 pattern e.g. "{PARAMETER} is null", and making the pattern a config string?

I can't really think of a case where X is null must be expressed in some other form. Like empty/none/missing/required should mean something different from null, and the subject is always singular (1 parameter, even if the name is plural) and present, i.e. is.

Then we can possibly (but probably overkill) generalize it to other checks like Preconditions.checkArgument with a mapping:

  • requireNonNull(param) -> {PARAM} is null
  • checkArgument(!collection.isEmpty()) -> {PARAM} is empty
  • ...

Although requireNonNull seems to be mostly developer facing with X is null like messages while checkArgument et al are more user facing with far more descriptive messages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these different variants are a result of trying this checker on the Trino codebase. So they do appear in the wild, although I assume that this is mostly in older code, before the "X is null" became the prevailing idiom. I added them as "alternative" pattern just to avoid generating even more churn :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grepping through Trino code base, there're ~11.8K requireNonNull(s, ~11.5K with is null"), the ones that are becoming less relevant.

Among the remaining 3K, only ~140 matches requireNonNull(IDENTIFIER, STRING_LITERAL). Most can be changed to X is null but could trigger the same debate in trinodb/trino#10473.

@ksobolew
Copy link
Member Author

ksobolew commented Nov 8, 2022

is slightly more restrictive than that one

I actually have qualms about this part. It makes sense on first sight, but I'm afraid that it will generate a torrent of irrelevant reports.

Copy link
Contributor

@nevillelyh nevillelyh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little on the fence after reading through trinodb/trino#10473

Seems requireNonNull is used for 2 things

  1. User facing e.g. "Required property X not set"
  2. Internal e.g. "X {is/must no be/cannot be} null"

IMO for 2. everything should be "X is null" or no message. The problem is we can't easily distinguish user vs internal facing checks. Many requireNonNull(config.getFlag(), ...) or requireNonNull(System.getProperty(...), ...) are allowed by chance.
And even for 2. there are a few debatable cases like
list.forEach(x -> requireNonNull(x, "one of list is null"))
vs
list.forEach(x -> requireNonNull(x, "x is null"))
So we'll end up with complex regex or lots of SuppressWarnings either way.

@nevillelyh nevillelyh self-requested a review November 8, 2022 16:50
Copy link
Member

@Laonel Laonel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have similar reservations about this. It looks like there is really no limit to how complicated regexes can become once we decide to support different forms of error messages, and if we decide to require x is null there aren't a lot of places that we could validate, like @nevillelyh pointed out.

But I do see a value in this check, as we are getting a lot of PR comments about requireNonNull.

Maybe we could start by limiting the scope of check only to usages in constructor with a constructor arg as the first param? In that scope I think we can be confident enough to require is null as the suffix and nothing else.

This is a check to find some common cases of malformed `requireNonNull`
calls. See the documentation for more details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants