Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yard rake task is missing require #1597

Open
jasonkarns opened this issue Dec 20, 2024 · 3 comments
Open

Yard rake task is missing require #1597

jasonkarns opened this issue Dec 20, 2024 · 3 comments

Comments

@jasonkarns
Copy link

Steps to reproduce

The yard rake task is not self-contained and is missing requires to yard itself.

  1. add yard to rakefile:
    require "yard/rake/yardoc_task"
    YARD::Rake::YardocTask.new
  2. verify yard rake task:
    $ rake -T
    # snip
    rake yard                     # Generate YARD Documentation
    
  3. run rake yard task

Actual Output

```
$ rake yard
rake aborted!
NameError: uninitialized constant YARD::CLI

          yardoc = YARD::CLI::Yardoc.new
                       ^^^^^

Tasks: TOP => yard
(See full trace by running task with --trace)
```

Expected Output

I expected roughly the same output as yard

It seems the yard task is missing requires.

Environment details:

  • OS: macos 15.2
  • Ruby version (ruby -v): ruby 3.1.6p260 (2024-05-29 revision a777087be6) [arm64-darwin23]
  • YARD version (yard -v): yard 0.9.37
  • Relevant software dependency/versions: rake, version 13.2.1

I have read the Contributing Guide.

@lsegal
Copy link
Owner

lsegal commented Dec 20, 2024

This is by design. You should not be reaching into and requiring any files from the YARD package other than require 'yard' itself:

require 'yard'
YARD::Rake::YardocTask.new

as documented in the README.

@jasonkarns
Copy link
Author

jasonkarns commented Dec 23, 2024

Understood. I saw that when reading up on install, but assumed it was the "simple, but malperformant" setup that naive rake users typically fall into.

It's common to see many gems document the "simple" setup that requires their entire lib in the rakefile, only to later discover the performance impact that has on shell completion (rake <tab>) or when gem A negatively impacts gem B's completely unrelated rake task (because gem A is being fully loaded even when not run).

The common convention is to have a rake task entrypoint that only defines the task, without requiring the library. (gems that do this include: rdoc, bundler, rspec, and rubocop)

(I'm overly stating things, just to be clear on expectations and assumptions.)

The way I ended up defining the task was:

require "yard/rake/yardoc_task"
YARD::Rake::YardocTask.new :yard do |t|
  t.before = -> { require "yard" }
end

Comparing the two approaches (the above vs: require "yard"; YARD::Rake::YardocTask.new)

yielded (collected with profile gem):

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.02     0.02      0.00        1     0.01    16.56  Object#full_yard
  0.04     0.02      0.00        2     0.00     0.21  Object#just_task

16ms isn't a lot, but it's a lot more than 0.2ms

Also tested with ruby-prof (Measure Mode: wall_time):

just_task total: 0.000478
full_yard total: 0.018249

Still about 2 orders of magnitude.

Would you consider a request to have an "official" rake entrypoint that didn't set up all of yard? (I realize it's only configuring autoloads and a few things, but it's quite a bit more than the tasklib needs.)

I would find this to also be helpful for consumers who look for patterns in Rakefiles and, seeing require "yard" could mistakenly conclude that require _gem_ is normal or good practice.

@lsegal
Copy link
Owner

lsegal commented Dec 23, 2024

Still about 2 orders of magnitude.

How are you generating this benchmark? This isn't something I can reproduce. Something looks very off about those numbers given that I've never seen a sub-ms require call. Are you sure your benchmark isn't requiring an already-required file? This would be the case if you ran a bench for require "yard"; require "yard/rake/yardoc_task" within the same process.

Here's what I see:

❯ ruby -rrake -e "t=Time.now;require 'yard/rake/yardoc_task';p Time.now - t"
0.0276671

❯ ruby -rrake -e "t=Time.now;require 'yard';p Time.now - t"
0.0639603

Running this a number of times is fairly consistent-- requiring the task file directly takes anywhere between 2.5ms - 4ms, while requiring yard fairly consistently is ~6.5ms. In the "worst case" performance it's about 2-2.5x faster, but keep in mind we're talking about differences of ~2-3ms in a use case that is absolutely not going to be a tight loop.

There is probably a risk of disk cache miss where loading YARD can take a little longer, but usually this only happens once and accounts for a very small total cost of your total runtime. If the concern is that YARD will slow down the running of other tasks, then this one time possible disk cache miss is unlikely to be the primary contributor-- an extra ~20ms every 100 or so runs doesn't seem significant for the benefit of significantly improved backwards compatibility guarantees.

tl;dr you're free to use the "unsupported" path of directly requiring into YARD if you really need to tightly optimize your Rakefile load time. I think the approach you're using is perfectly reasonable if the load time is that stark on your machine. I don't see enough of a performance gap in the general case to justify increasing YARD's maintenance surface area for the whole project, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants