Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Rubocop DSL RBI compiler #1046

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

sambostock
Copy link
Contributor

@sambostock sambostock commented Jul 8, 2022

Motivation

When authoring a custom cop, one is likely to use the def_node_matcher or def_node_search macros, which generate methods. Sorbet will complain if the generated methods are used, as it does not know they exist, requiring the developer to write a shim.

Let's automate that.

Implementation

I looked at a couple other DSL compilers and based my work off of them.

One tricky thing is that Rubocop expects custom cops to be defined in its namespace (i.e. RuboCop::Cop::<department name>::<cop name>). Therefore, we need a way to figure out which constants to generate DSLs for, and which to skip. The approach I came up with was to check if the const_source_location is in a gem, or within the host app. Although there may be a better way to do this.

I ended up scrapping all this and just generating DSL RBI files for everything, and in tests just ignoring the gathered constants which already existed at the start of the test. I think this is in line with the end result of other compilers, which often end up producing RBI files not just for constants defined by the host app, but also for constants defined by gems (including sometimes the very gem the compiler is for).

Tests

I've added simple tests that check a couple uses of the macros, which result in slightly different method signatures.

⚠️ Before Merging

  • Tidy up commits

lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved

root.create_path(constant) do |cop_klass|
node_matchers.each do |name|
create_method_from_def(cop_klass, constant.instance_method(name))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost implemented this method myself, but found it while looking through the code.

It would be nice if there were docs on how to write DSL compilers to make it more approachable. 📈

Comment on lines 15 to 25
def def_node_matcher(name, *, **defaults)
__tapioca_node_matchers << name
__tapioca_node_matchers << :"without_defaults_#{name}" unless defaults.empty?

super
end
Copy link
Contributor Author

@sambostock sambostock Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike some other gems (e.g. AASM) which store some sort of state we can query to synthesize the list of generated methods, Rubocop simply generates the methods and moves on.

I copied this approach (a __tapioca_* instance method) from the FrozenRecord compiler.

spec/tapioca/dsl/compilers/rubocop_spec.rb Outdated Show resolved Hide resolved
Comment on lines +62 to +109
def_node_matcher :some_matcher, "(...)"
def_node_matcher :some_matcher_with_params, "(%1 %two ...)"
def_node_matcher :some_matcher_with_params_and_defaults, "(%1 %two ...)", two: :default
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these result in different method signatures, and in the last case, in a second method being generated (without_defaults_...).

The implementation of def_node_matcher can be found in ::RuboCop::AST::NodePattern::Macros, for reference.

@sambostock sambostock requested a review from paracycle July 8, 2022 14:59
@sambostock sambostock requested a review from a team as a code owner July 14, 2022 18:33
lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from 3b0d494 to 978f837 Compare October 27, 2022 02:16
Copy link
Contributor Author

@sambostock sambostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vinistock this should be ready for another pass.

If it looks okay, I'll write up documentation and tidy up the commits for final review.

dev.yml Outdated Show resolved Hide resolved
lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
spec/tapioca/dsl/compilers/rubocop_spec.rb Outdated Show resolved Hide resolved
def target_class_file
# Against convention, RuboCop uses "rubocop" in its file names, so we do too.
super.gsub("rubo_cop", "rubocop")
end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I'd renamed the files to rubo_cop.rb too, due to the loading not working, but I like this better, as it allows us to keep both the constant names and file names matching RuboCop's convention (which was the whole point of renaming the constants).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this since we don't rename the file in the DSL command?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to have

  • lib/tapioca/dsl/compilers/rubocop.rb
  • lib/tapioca/dsl/extensions/rubocop.rb
  • spec/tapioca/dsl/compilers/rubocop_spec.rb

instead of

  • lib/tapioca/dsl/compilers/rubo_cop.rb
  • lib/tapioca/dsl/extensions/rubo_cop.rb
  • spec/tapioca/dsl/compilers/rubo_cop_spec.rb

which I think is preferable, especially given it's really just limited to telling the spec file how to require the correct compiler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather rename the constant Rubocop or let the file be named rubo_cop rather than play with this.

It seems all the files under dsl/compilers follow the same convention. class CompilerName -> compiler_name.rb. I'm not sure why we should deviate from this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 I was trying to match Rubocop's own naming convention (e.g. RuboCop::Version is defined in lib/rubocop/version.rb, not lib/rubo_cop/version.rb), but given the can of worms it would be to do this for the RBI files, maybe it's not worth doing for the compiler files either. It just feels weird seeing rubo_cop TBH, but I'm glad to deal with it if you think it's not worth the extra method in the test file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so circling back to this, a80c4ec makes the changes necessary to avoid this workaround, but my gut feeling is that this leads to the code being more confusing, because now we're dealing with a mix of constants that include RuboCop and constants that include Rubocop, and it's not obvious why they differ.

Likewise, we could go for the rubo_cop file naming, but then we'd have a mid of those files and ones like .rubocop.yml, again without an obvious reason why.

Therefore, I think it's overall more straightforward to keep the RuboCop naming to match upstream, and be consistent across our constants and theirs, and have this one workaround method with the comment documenting why it's there. I think this would be in line with the "principle of least surprise", in that this one method makes the rest of the app less surprising because things just work.

Copy link
Member

@vinistock vinistock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good

spec/tapioca/dsl/compilers/rubocop_spec.rb Outdated Show resolved Hide resolved
spec/tapioca/dsl/compilers/rubocop_spec.rb Show resolved Hide resolved
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from 4b6a8d8 to a4e75e0 Compare October 27, 2022 17:05
@sambostock
Copy link
Contributor Author

I've addressed feedback, written docs, and tidied up the commits.

One thing I considered was seeing if we could specify that the first parameter to the methods should be a ::RuboCop::AST::Node. However, the methods both default to using self if no node is passed in, as well as implement a guard clause, so technically anything is allowed to be passed in, so for simplicity I stuck with the generated sig whose first param is T.untyped.

@sambostock sambostock requested a review from vinistock October 27, 2022 17:08
@vinistock vinistock requested a review from a team October 27, 2022 18:40
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from a4e75e0 to 54a3c68 Compare October 27, 2022 18:58
Copy link
Contributor

@KaanOzkan KaanOzkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
lib/tapioca/dsl/compilers/rubocop.rb Outdated Show resolved Hide resolved
lib/tapioca/commands/dsl.rb Outdated Show resolved Hide resolved
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from dfbac89 to 58bf390 Compare October 28, 2022 16:55
@@ -363,7 +363,8 @@ def underscore(class_name)

sig { params(constant: String).returns(String) }
def rbi_filename_for(constant)
underscore(constant) + ".rbi"
# TODO: Figure out a better way than this.
underscore(constant).gsub("rubo_cop", "rubocop") + ".rbi"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for setting you on the wrong path, but let's switch the name of the compiler to Rubocop instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry these are two separate things (I originally erroneously thought they were related):

  • Tapioca::Commands::DSL#rbi_filename_for controls the output file name for the generated RBIs, which are based on the name of the constant they were generated for. This is hard to override for just constants with RuboCop in their name (and we don't control those constant names), and probably not worth doing (as @Morriar suggested).
  • DslSpec#target_class_file controls which compiler file is required based on the name of the spec file. Fortunately, that module is included in the specs, so we can just override the method in the spec to say that we want to require "tapioca/dsl/compilers/rubocop.rb" instead of "tapioca/dsl/compilers/rubo_cop.rb", which makes it trivial for Tapioca to internally match both the RuboCop constant name and rubocop file name conventions the RuboCop gem follows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean. It's not related to the DSL compiler class. That's unfortunate for sure, but not sure if there's a better way of doing this other than treating rubocop as a special case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly, and if we were to allow for that (perhaps through a configurable inflector), it would probably be a change that doesn't belong in this PR.

@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from 4b45bfc to 57670ce Compare October 28, 2022 20:17
Comment on lines 37 to 38
@__tapioca_node_methods = T.let(@__tapioca_node_methods, T.nilable(T::Array[MethodName]))
@__tapioca_node_methods ||= []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Sorbet supports this as a one-liner now:

Suggested change
@__tapioca_node_methods = T.let(@__tapioca_node_methods, T.nilable(T::Array[MethodName]))
@__tapioca_node_methods ||= []
@__tapioca_node_methods ||= T.let([], T.nilable(T::Array[MethodName]))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just wasn't sure if Tapioca requires at least the Sorbet version that introduced that notation. If it does, or if you know the version and we add it as a minimum, I'd be glad to use that notation, as it is certainly clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great call! The feature was introduced in Sorbet v.0.5.10207 and we require only v0.5.9204.

So let's keep this construct 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth asking: do we run CI against v0.5.9204? If we don't, perhaps it's worth adding that (in a separate PR) to ensure that we detect if we ever introduce a dependency on features from a newer version. Had I used the new notation, we might have unknowingly broken compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, according to CI, that version of sorbet-static-and-runtime (apparently) doesn't even exist...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fine to change now, since we force a newer version of Sorbet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this (it's actually a RuboCop auto-correction now – neat), and also took the opportunity to switch from the require & rescue LoadError approach to the return unless defined? approach to lazy loading the compiler.

def target_class_file
# Against convention, RuboCop uses "rubocop" in its file names, so we do too.
super.gsub("rubo_cop", "rubocop")
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this since we don't rename the file in the DSL command?

@andyw8
Copy link
Contributor

andyw8 commented Jan 24, 2024

I haven't read through all the comments, but I see this got an approval, is something preventing it from being merged?

@paracycle paracycle force-pushed the rubocop-rbi-compiler branch from 57670ce to 6600b63 Compare January 24, 2024 23:23
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch 3 times, most recently from 924bcef to dbc15ce Compare January 25, 2024 04:50
@sambostock
Copy link
Contributor Author

I've pushed some updates, mainly consisting of:

  • Adding a extenders_of helper method (not sure about this name).
  • Extending gathered constants to include all modules which extend RuboCop::AST::NodePattern::Macros, rather than descendants of RuboCop::Cop::Base.
  • Updating the lazy compiler definition approach from using rescuing LoadError to checking if constants are defined?.

I think it should be good to go, but given the changes it might be worth another look.

@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from dbc15ce to a3fc3da Compare January 25, 2024 04:58
@vinistock
Copy link
Member

Extending gathered constants to include all modules which extend RuboCop::AST::NodePattern::Macros, rather than descendants of RuboCop::Cop::Base.

What's the advantage of this? Wouldn't all cops inherit from RuboCop::Cop::Base?

Also, are all cops we want to generate RBIs for required when Tapioca loads the application? For example, extensions like rubocop-rails are required through the .rubocop.yml configuration file and you can do the same thing for custom cops defined inside the application.

Just trying to understand if custom cops will need to be manually required in the require.rb file.

This method is similar to `descendants_of`, but returns all modules
which extend the given module. That is, all modules (including classes),
whose singleton class (or parent thereof) includes the given module.
This generates RBI signatures for use of Rubocop's Node Pattern macros
(`def_node_matcher` & `def_node_search`).
Extensions are loaded before the host application, and we need to ensure
we patch the class before the macros are used.
@sambostock sambostock force-pushed the rubocop-rbi-compiler branch from 119a263 to 3c9ee4b Compare January 29, 2024 01:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants