-
Notifications
You must be signed in to change notification settings - Fork 279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The random number generator is slightly biased toward returning 0 #1434
Comments
Thank you @mrcslws ! Now I understand your motivation, although I still don't think the logic is correct.
|
Nice catch! This means that the random number generator is biased in the same way as simply using So, to summarize the bugs:
Other replies: (For these replies, assume
|
Can you verify my test is correct? I have the "half" RandomImpl.getUInt() and "biased" Random.getUInt and I cannot reproduce the bias in the testcase.
yes, it's statistically 1 (maybe 2) passes in the loop, but for compiler optimization it's an unknown-length loop. (I don't know how performance-critical this codepath is)
Wouldn't all (serious) offer that? https://en.cppreference.com/w/cpp/numeric/random/mersenne_twister_engine
|
Update, in the community repo, we've replaced Random with std:: version, you might like it. |
This code looks wrong.
https://github.com/numenta/nupic.core/blob/69b9c79cf34a9c7ab90c8c0f4616504da40edee7/src/nupic/utils/Random.cpp#L192
I think this line should be
while (sample >= smax);
.Here's a quick explanation of this code. It generates a random number between 0 and n - 1 by generating a random integer
mod n
. The sequence0 mod n
,1 mod n
,2 mod n
,...
will cycle through the ring of integers mod n forever, so each return value is almost equally likely. But this sequence will generally end in the middle of this ring of return values, except when2^32 - 1
happens to be-1 mod n
. So this code correctly finds the highest integer that is divisible by n. At this point, it should subtract 1 from this number, or it should do the>=
check that I suggested above. Instead, it does a>
check.So, for example, with the default max
2^32 - 1
, if the underlying random number generator does return2^32 - 1
, the function will return 0, when it ought to discard this return value and try again. That means that when you callgetUInt32()
with no max, it's twice as likely to return 0 than any other number. The problem is much less severe for other use cases likegetUInt32(1024)
, but the issue is still there.This code would be correct if the underlying
RandomImpl
never returns zero. But I ran a quick test, and it does in fact return zero. For example, with seed 19, it returns 0 on the 1009865459th call.The text was updated successfully, but these errors were encountered: