Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mode(Binomial) and modes(Binomial) #1931

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

marcusps
Copy link
Contributor

Fix #1927

@marcusps
Copy link
Contributor Author

The impact on runtime is minimal

julia> d = Binomial(5, 2//3)
Binomial{Rational{Int64}}(n=5, p=2//3)

julia> @benchmark mode(d) # current master
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
 Range (min  max):  23.342 ns  76.490 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     23.791 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.491 ns ±  3.586 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▇▅▅▅▆▃▂▁              ▄▃ ▃▂▁▂▂  ▁                          ▂
  █████████▇▇▇█▄██▇▇█▆█▆▆██▆█████▇██▆▆▆▆▆▆▄▅▇█▄▆▆▅▆▄▅▆▅▆▄▄▄▆▇ █
  23.3 ns      Histogram: log(frequency) by time      39.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark mode(d) # this PR
BenchmarkTools.Trial: 10000 samples with 995 evaluations.
 Range (min  max):  22.868 ns  90.999 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     23.201 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.739 ns ±  4.551 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄▃   ▂▂    ▁▂▂▂▂▅▄▃▃                                       ▁
  ████▇███▇▇▇▇█████████▇█▇▇█▅▇▇▄██▅▄▇▇▄▆█▆▄▇▇▅▅▄▅▅▅▄▅▅▄▁▅▅▅▄▅ █
  22.9 ns      Histogram: log(frequency) by time      44.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

similarly for modes

julia> @benchmark modes(d) # current master
BenchmarkTools.Trial: 10000 samples with 984 evaluations.
 Range (min  max):  43.023 ns    6.794 μs  ┊ GC (min  max): 0.00%  98.72%
 Time  (median):     47.259 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   54.389 ns ± 143.785 ns  ┊ GC (mean ± σ):  8.71% ±  3.40%

     ▅▂▂▁▆█▅▃                                                   
  ▁▂▆█████████▅▄▂▂▂▂▂▂▂▂▃▃▃▄▅▄▄▄▃▃▂▂▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  43 ns           Histogram: frequency by time         69.7 ns <

 Memory estimate: 64 bytes, allocs estimate: 1.

julia> @benchmark modes(d) # this PR
BenchmarkTools.Trial: 10000 samples with 980 evaluations.
 Range (min  max):  47.515 ns    3.625 μs  ┊ GC (min  max): 0.00%  97.66%
 Time  (median):     56.248 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   63.108 ns ± 127.480 ns  ┊ GC (mean ± σ):  7.78% ±  3.79%

     ▁▄█▆▆▄        ▁ ▁▁                                         
  ▁▂▅███████▆▆▆▆▇▇█████▇▅▄▃▂▂▂▂▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▁▁▁▁ ▃
  47.5 ns         Histogram: frequency by time         89.3 ns <

 Memory estimate: 80 bytes, allocs estimate: 1.

Comment on lines 73 to 83
v = (n + 1) * p
quasi_mode = floor(Int, v)
if quasi_mode == v
if p == 1
n
else
quasi_mode-1
end
else
quasi_mode
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT alternatives with fewer branches would be

Suggested change
v = (n + 1) * p
quasi_mode = floor(Int, v)
if quasi_mode == v
if p == 1
n
else
quasi_mode-1
end
else
quasi_mode
end
iszero(p) ? 0 : ceil(Int, (n + 1) * p) - 1

or

Suggested change
v = (n + 1) * p
quasi_mode = floor(Int, v)
if quasi_mode == v
if p == 1
n
else
quasi_mode-1
end
else
quasi_mode
end
max(0, ceil(Int, (n + 1) * p) - 1)

or

Suggested change
v = (n + 1) * p
quasi_mode = floor(Int, v)
if quasi_mode == v
if p == 1
n
else
quasi_mode-1
end
else
quasi_mode
end
clamp(ceil(Int, (n + 1) * p) - 1, 0, n)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is little to no difference in runtime between these options (or the option in the PR). The main difference is in the max runtime, and the PR options and the max(0, ceil(...)) have the shorter max runtime (the two are nearly identical with a max of about 10x the average runtime), while the max runtime for the other two options is about twice as long (about 20x the average runtime). The standard deviations show a similar pattern (but they are smaller than the average runtime).

Given there isn't much of a different in run times, I propose the PR version is preferable because it is easiest to read and reason about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmotion Is there another reason to consider fewer branches other than runtime?

@codecov-commenter
Copy link

codecov-commenter commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 88.23529% with 2 lines in your changes missing coverage. Please review.

Project coverage is 86.02%. Comparing base (ceb6343) to head (6c0abd5).

Files with missing lines Patch % Lines
src/univariate/discrete/binomial.jl 88.23% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1931      +/-   ##
==========================================
+ Coverage   86.01%   86.02%   +0.01%     
==========================================
  Files         144      144              
  Lines        8696     8710      +14     
==========================================
+ Hits         7480     7493      +13     
- Misses       1216     1217       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mode(Binomial) (docs?) wrong when (n+1)*p is integer
3 participants