[HN Gopher] Original source of `(seed * 9301 and 49297) % 233280...
       ___________________________________________________________________
        
       Original source of `(seed * 9301 and 49297) % 233280` random
       algorithm? (2014)
        
       Author : thunderbong
       Score  : 142 points
       Date   : 2022-08-06 14:07 UTC (8 hours ago)
        
 (HTM) web link (softwareengineering.stackexchange.com)
 (TXT) w3m dump (softwareengineering.stackexchange.com)
        
       | montroser wrote:
       | As pointed out in the comments of the post, this is an LCG[1]
       | PRNG. The Wikipedia article has a good discussion on how to
       | choose parameters, with examples of use in the wild from various
       | popular projects over the last many decades.
       | 
       | [1]: https://en.wikipedia.org/wiki/Linear_congruential_generator
        
         | Someone wrote:
         | And its first reference is to what I consider the definitive
         | reference on the subject, at least for the state of the art in
         | 1968: the Art of Computer Programming. Vol. 2: Seminumerical
         | Algorithms.
         | 
         | I also think the only additions that might be needed to that
         | are mentions of newer algorithms that are comparatively simple,
         | but better. The subject itself is pretty well-trodden.
         | 
         | Knuth does mention the extremely simple additive random number
         | generator                 Xn = (Xn-24 + Xn-55) mod m
         | 
         | (And says that this can be used directly with floating point
         | values to produce floats in the range [0,1))
         | 
         | It uses 55 words of state (initialized to not all be even),
         | needs only about ten simple instructions to implement (in
         | particular, no multiplications or divisions), has period of at
         | least 2^55-1, and Knuth said it's very good in practice, isn't
         | known to be bad, and ends with (in the 1980 second edition of
         | volume 2, with the first edition being from 1968):
         | 
         |  _The only reason it is difficult to recommend sequence (7)
         | wholeheartedly is that there is very little theory to prove
         | that it does or does not have desirable randomness properties;
         | essentially all we know for sure is that the period is very
         | long, and this is not enough"_
         | 
         | I can't find anything more recent about this RNG. Has it been
         | forgotten, or was it shown to be bad in some important sense?
        
           | hcs wrote:
           | It's a lagged Fibonacci generator
           | https://en.wikipedia.org/wiki/Lagged_Fibonacci_generator
           | 
           | In the third edition (1998) there are a few more notes, most
           | importantly:
           | 
           | "Lagged Fibonacci generators have been used successfully in
           | many situations since 1958, so it came as a shock to discover
           | in the 1990s that they actually fail an extremely simple,
           | non-contrived test for randomness" with reference to exercise
           | 3.3.2-31 asking "prove that if we generate 79 consecutive
           | random bits ... starting at a random point in the period, the
           | probability is more than 51% that there are more 1s than 0s".
           | 
           | The solution references this paper on the tests:
           | https://arxiv.org/abs/cond-mat/9406054
           | 
           | There's a recommendation to generate X numbers and only use
           | the first Y an order of magnitude smaller, citing Luscher's
           | RANLUX, but I don't know how much that holds up either.
        
             | Someone wrote:
             | Thanks! I focused too much on "additive" as a search term.
             | I even looked at https://en.wikipedia.org/wiki/List_of_rand
             | om_number_generato..., and would have found that better
             | name if I had searched for "Mitchell" or "Moore" there, not
             | "additive".
        
               | hcs wrote:
               | FYI if you only have the 2nd edition of volume 2, Knuth
               | provides the full text of the changes from the 2nd to 3rd
               | edition electronically: https://www-cs-
               | faculty.stanford.edu/~knuth/taocp.html (search for
               | "Errata for Volume 2 (2nd ed.)")
               | 
               | Since it's too late to edit the earlier post, this is the
               | specifics around discarding (p571):
               | 
               | Luscher's discarding technique can be used to avoid the
               | bias towards 1s. For example, with lags 55 and 24, no
               | deviation for randomness is observed for random walks of
               | length 1001 when the numbers are generated in batches of
               | 165, if only the first 55 numbers of each batch are used.
        
       | aaaaaaaaaaab wrote:
       | Probably the NSA...
        
         | russellbeattie wrote:
         | In case people don't know this story, or worse, believe the
         | conspiracy theories, here's what happened with the development
         | of DES:
         | 
         | > _" According to Steven Levy, IBM Watson researchers
         | discovered differential cryptanalytic attacks in 1974 and were
         | asked by the NSA to keep the technique secret. Coppersmith
         | explains IBM's secrecy decision by saying, "that was because
         | [differential cryptanalysis] can be a very powerful tool, used
         | against many schemes, and there was concern that such
         | information in the public domain could adversely affect
         | national security." Levy quotes Walter Tuchman: "[t]hey asked
         | us to stamp all our documents confidential... We actually put a
         | number on each one and locked them up in safes, because they
         | were considered U.S. government classified. They said do it. So
         | I did it"._
         | 
         |  _Bruce Schneier observed that "It took the academic community
         | two decades to figure out that the NSA 'tweaks' actually
         | improved the security of DES."_
         | 
         | https://en.m.wikipedia.org/wiki/Data_Encryption_Standard
        
       | walterbell wrote:
       | John Denker's 2013 paper/code for using sound cards as input to a
       | HRNG, https://www.av8n.com/turbid/
       | 
       |  _> We discuss how to configure and use turbid, which is a
       | Hardware Random Number Generator (HRNG), also called a True
       | Random Generator (TRNG). It is suitable for a wide range of
       | applications, from the simplest benign applications to the most
       | demanding high-stakes adversarial applications, including
       | cryptography and gaming. It relies on a combination of physical
       | process and cryptological algorithms, rather than either of those
       | separately. It harvests randomness from physical processes, and
       | uses that randomness efficiently. The hash saturation principle
       | is used to distill the data, so that the output is virtually 100%
       | random for all practical purposes. This is calculated based on
       | physical properties of the inputs, not merely estimated by
       | looking at the statistics of the outputs. In contrast to a
       | Pseudo-Random Generator, it has no internal state to worry about.
       | In particular, we describe a low-cost high-performance
       | implementation, using the computer's audio I /O system._
        
       | hkgjjgjfjfjfjf wrote:
        
       | sbf501 wrote:
       | This is what bugs me about the stackexchange model. Most of the
       | comments are guesses by people who really don't know. The
       | "correct" answer doesn't explain anything, it just points to some
       | sources. The highest ranked comment that comes closest, attempts
       | to explain the nature of the numbers (the modulous is the period,
       | and the primes stretch out the cycle) and fires off a link to
       | wikipedia.
       | 
       | Then it appears on HN, and more people half guessing, but better
       | content in general.
       | 
       | Based on yesterday's post, BYTE magazine would have actually
       | given an answer with both why and how.
       | 
       | EDIT: Granted, the link DOES go to Wikipedia which has a great
       | explanation, but in general, providing a link and not answer is a
       | poor model because links decay so quickly.
        
         | svat wrote:
         | It took me a while to understand your comment, until I realized
         | there are _two_ interpretations of the question of where these
         | numbers come from:
         | 
         | * One is a _historical_ question: who first used these numbers
         | and when? who copied from where?
         | 
         | * The other is a _conceptual_ question: why might someone
         | choose these numbers, what do they  "mean", how do we interpret
         | these constants?
         | 
         | The former is purely a question of facts and there is nothing
         | to "explain the nature of the numbers", and this is what the
         | answerer on Stack Exchange seems to have interpreted it as.
         | Under this interpretation saying things like "the modulus is
         | the period" is not really germane to the question; what's
         | required is to search sources and try to find the
         | chronologically earliest one. (The answer doesn't seem to have
         | done a definitive job, but it's a reasonable start.)
         | 
         | This is like _synchronic_ vs _diachronic_ in linguistics: in
         | the latter interpretation you mainly just want to _understand_
         | the random-number generator.
        
         | pizza234 wrote:
         | > This is what bugs me about the stackexchange model.
         | 
         | > providing a link and not answer is a poor model because links
         | decay so quickly.
         | 
         | Providing a link and not answer is not the stackexchange model.
         | On the Stack Exchange network, such answers are often (not
         | always, of course) commented with a request to expand the
         | answer.
        
           | nerdponx wrote:
           | And it's not just accepted but encouraged to comment on old
           | questions and answers that are either wrong or not up to
           | current standards.
        
         | phalangion wrote:
         | > links decay quickly
         | 
         | That might be true in many places, and I agree with the general
         | sentiment that the answer should provide an answer and not just
         | a link to an answer. But wikipedia links are generally pretty
         | stable. It's not some random blog post.
        
         | [deleted]
        
         | spullara wrote:
         | I'd be willing to bet that a Wikipedia link will outlast a
         | StackExchange answer.
        
           | matheusmoreira wrote:
           | It absolutely will. Some of my favorite answers on Stack
           | Overflow have been deleted and there seems to be no archive
           | for such lost data.
        
             | nerdponx wrote:
             | Stackoverflow itself retains all deleted questions and
             | answers. If you have enough points you can see them.
        
             | kragen wrote:
             | Which ones were they?
        
             | cxcorp wrote:
             | I usually save my bookmarks on the Wayback Machine just so
             | there's at least one copy out there.
        
         | loldk wrote:
         | Insecure devs always obfuscate and try to hide the whole truth.
         | It is a massive problem in tech.
        
         | bluedino wrote:
         | The title question was the original source. The answer was a
         | list of places it was found, including the earliest.
        
           | sbf501 wrote:
           | Oh, good point. OP was looking for original source, papers
           | referring to, etc. Too late to edit, but minus points for me
           | for not comprehending the original question.
        
             | bluedino wrote:
             | Didn't help that he asked a couple more questions at the
             | end
             | 
             |  _Who invented this algorithm and tested the distribution?
             | Is there a paper or something to cite?_
             | 
             | That would have made another question.
        
             | Nition wrote:
             | I don't think you were far off what the asker was looking
             | for. In a comment, they say "I was kind of hoping to be
             | able to point to some specific research as to why these
             | numbers are special".
        
         | 0xTJ wrote:
         | The question isn't asking about how it works. It asks about
         | where that algorithm comes from.
        
         | a-dub wrote:
         | NR has long been seen as a sort of necronomicon for scientific
         | computer programming.
         | 
         | i don't think it's SE's fault as much as it is a historical
         | rough edge between two fields (computer programming and applied
         | mathematics).
         | 
         | the first constant is klaatu, the second is barada and the
         | third is nikto.
         | 
         | (NR also tends to be fairly rigorous in terms of citations, for
         | the curious)
        
           | ascar wrote:
           | What's NR, what's a necronomicon and what's klaatu, barads
           | and nikto.
           | 
           | I googled all these things and I can't still make any sense
           | of your comment except that it's intentionally confusing with
           | some popular fiction references?
        
             | svnpenn wrote:
             | It's common hacker news trope for someone to do acronym
             | diarrhea, even with rarely or never used abbreviations.
             | It's quite annoying.
        
               | Dylan16807 wrote:
               | NR is in the answer in the link.
               | 
               | I feel like SE being stack exchange is pretty clear here.
               | 
               | The remaining reference is not an acronym or
               | abbreviation, nor do I think it looks like one.
        
             | hinkley wrote:
             | "Klaatu barada nikto" is a code phrase from The Day the
             | Earth Stood Still to stop the homicidal antagonists from
             | killing you.
             | 
             | It's doubly famous because Sam Raimi recycled it in "Army
             | of Darkness" as an incantation to dispel the curse of the
             | Necronomicon (book of the dead, portrayed as the arch book
             | of black magic in any number of stories), which our hapless
             | anti-hero Ash fucks up and thus unleashes the armies of the
             | dead on a roughly medieval and wholly unprepared world.
             | 
             | If you like campy horror movies, there is so much camp in
             | Army of Darkness that there's barely any room for the
             | horror. It is technically a sequel but it works as a
             | standalone movie. I watched it years before I finally sat
             | down and watched The Evil Dead.
        
               | Natsu wrote:
               | Wasn't it also used in Ghost Busters at some point?
        
             | kragen wrote:
             | In addition to what the other answers have said, the
             | Necronomicon is the book of dark magic in the Lovecraft
             | mythos, from which it has been copied into some other
             | mythoi.
        
             | webstrand wrote:
             | NR is apparently Numerical Recipes, a book. I think they
             | mean that the constants are not well understood, like the
             | phrase "Klaatu barada nikto" was not understood, but still
             | had an effect.
        
             | adhesive_wombat wrote:
             | I assume _Numerical Recipes_.
        
       | overshard wrote:
       | And this is when comments in code are important! Any random
       | numbers without a source are immediate suspect to me, especially
       | in something that needs to be secure. It will save your coworkers
       | and peers time trying to figure out why it's there.
        
         | JKCalhoun wrote:
         | Security through obscurity?
         | 
         | Maybe not.
        
         | swatcoder wrote:
         | That's true, but _this_ thing is not code any more.
         | 
         | It's an incantation that's propagated for 50+ years because
         | it's minimal and effective. Over time, it's been fully
         | distilled to those properties.
         | 
         | Since comments aren't essential to being minimal and effective,
         | they don't survive the distillation.
         | 
         | Think of it like a clever gist that got pasted and shared a
         | hundred times. Even if the original source had explained every
         | step in great detail, with inline comments and deep explanatory
         | discourses and citations to prior art and etc, they'd
         | eventually get trimmed away as fat as people repeatedly prune
         | it down to some "important" bits pasted into their own copies
         | and then later share those trimmed copies, ad infinitum.
         | 
         | This is that, but 50 years out.
        
           | AstralStorm wrote:
           | Since it's an LCG, it does not matter how it was derived as
           | long as the randomness properties are known. Such as cycle
           | length, identical initial state set and dispersion
           | properties. Perhaps also performance.
           | 
           | These should be documented.
           | 
           | It's a rather weak PRNG of short cycle, so the suspicion is
           | that it's made for particular dispersion properties, such as
           | for a hash table of particular data and size or other
           | bucketing algorithm.
        
             | ChrisLomont wrote:
             | >These should be documented.
             | 
             | They're trivial to look up, and any modern source would
             | likely outlive the game of telephone of trying to keep such
             | a comment intact correctly.
        
         | krater23 wrote:
         | When you find this algorithm with or without any comments from
         | where the numbers come from in a point that should be secure,
         | you should be more than suspect. In any way, this is not a
         | secure PRNG.
        
         | Waterluvian wrote:
         | "No magic numbers."
        
           | eru wrote:
           | You'd want https://en.wikipedia.org/wiki/Nothing-up-my-
           | sleeve_number
        
             | Waterluvian wrote:
             | That's clever!
        
         | bee_rider wrote:
         | It is interesting -- 5/9 of the books listed there are clearly
         | numerics textbooks, so it isn't surprising they talked about a
         | non-cryptographic PRNG. Curious about the other 4. But maybe
         | this shows up as a "here's why you should be careful what type
         | of RNG you are using" type example.
        
         | pclmulqdq wrote:
         | For PRNGs like this, constants are often chosen by guess and
         | check. This algorithm (an LCG) has a bit more theory, so these
         | might not have been chosen that way, but the author of the code
         | probably didn't have any insight either.
        
       | swayvil wrote:
       | I think it fits the rng period to fit max 32 bit (4 bil).
       | Somebody in the comments said similar so I feel good about this.
        
         | jffry wrote:
         | The person in the comments said the period is only ~233k, which
         | is correct, because you are always taking modulo 233280. But
         | it's also easy to verify experimentally:                 let
         | state = 6, iters = 0;       do {         state = (state * 9301
         | + 49297) % 233280;         iters++;       } while(state !== 6);
         | console.log("period is", iters, "iters");       //period is
         | 233280 iters
         | 
         | If you instead took modulo 2^32, I believe the period would
         | indeed be the maximal period, according to this article:
         | https://en.wikipedia.org/wiki/Linear_congruential_generator#...
        
           | bonzini wrote:
           | He means that the constants are chosen so that the
           | multiplication and addition can be done on a signed 32-bit
           | integer without overflow.
        
             | morelisp wrote:
             | There's no reason to avoid overflow (nor any reason to make
             | it signed since that would also be a concern these days)
             | though.
        
               | bonzini wrote:
               | That depends on the language. For example BASIC used to
               | have only 16-bit or 32-bit signed integers (depending on
               | the processor; plus strings and floating point), so this
               | particular RNG can be useful for its portability.
        
               | morelisp wrote:
               | Such dialects of BASIC also offered defined overflow
               | behavior (it's really only C/C++ and other things
               | targeting MCUs that don't - and even three also only
               | relatively recently that "undefined overflow" meant
               | something more practically dangerous than "implementation
               | defined") so it was still not a concern.
        
               | bonzini wrote:
               | Sure but signed and unsigned division are different.
        
               | [deleted]
        
             | a1369209993 wrote:
             | > so that the multiplication and addition can be done on a
             | signed 32-bit integer without overflow
             | 
             | Except they can't: any seed in the range 230888 thru 233279
             | will produce a signed integer overflow when multiplied by
             | 9301 (eg 9301*230888 = 0x80001608).
        
               | bonzini wrote:
               | You're right, that makes no sense. Also a-1 (9300) should
               | not have factors in common with m (233280) but it has a
               | lot...
        
       | kloch wrote:
       | Here's a good article about this type of PRNG, with other sample
       | seeds:
       | 
       | https://www.pcg-random.org/posts/does-it-beat-the-minimal-st...
        
       | [deleted]
        
       | snet0 wrote:
       | If someone can access it, my university notes where I looked into
       | LCGs such as this discusses that in Numerical Recipes, 1992 C
       | Edition, a clever LCG can be used to generates random numbers
       | between 0 and 1. Using a 32 bit int generated by a fast LCG
       | modulo 2^32. The cleverness is that you treat this value as a
       | float. It uses a bit mask to mask only the mantissa of a
       | float-32, and then is OR'd with a value to set the exponent to
       | 127. Subtracting 1 from this results in a random value between 0
       | and 1, using only simple or bitwise operations!
        
         | 10000truths wrote:
         | The issue with this is approach that the interval [1,2) has
         | much fewer representable values than [0,1) in IEEE-754 floats,
         | so the range of generatable values is only a fraction of the
         | representable values.
         | 
         | On modern hardware, you should instead use a count-leading-
         | zeroes or count-trailing-zeroes instruction on a uniform bit
         | pattern to directly generate the exponent. This is what is done
         | in the Zig standard library:
         | 
         | https://github.com/ziglang/zig/pull/10428
        
           | binarycoffee wrote:
           | You are right of course, but I would contend that generating
           | as well sub-normal floats is in 99.9% of the cases a needless
           | cost. The reason is that, more often than not, all this extra
           | precision will be immediately lost in subsequent operations.
           | 
           | A very common situation is for instance to plug the random
           | number in a log, in which case you need to use log(1-r)
           | rather than log(r) to avoid an infinite at r=0. The problem
           | is, by doing this simple subtraction you have already lost
           | all the subnormal precision.
        
         | ralphb wrote:
         | This is well-known, and almost trivial. But because I am
         | curious I looked this up in NR 3rd edition. Section 7.1.5 says
         | 
         | > The steps above that convert a 64-bit integer to a double
         | precision floating-point value involves both a non-trivial type
         | conversion and a 64-bit floating multiply. They are performance
         | bottlenecks. One can instead directly move the random bits into
         | the right place in the double word with a union structure, a
         | mask, and some 64-bit logical operations; _but in our
         | experience this is not significantly faster_
         | 
         | It goes on to describe a _lagged Fibonacci generator_ which
         | generates values directly as floating point.
        
         | binarycoffee wrote:
         | This is a widely used method to map random integers to floating
         | point numbers, but it has the disadvantage of wasting 1 bit of
         | float mantissa precision.
         | 
         | On modern CPUs, its computational advantage over full-precision
         | mapping methods, such as multiplication by a float, is not
         | always clear [1].
         | 
         | [1] https://github.com/rust-random/rand/issues/416
        
       | humanistbot wrote:
       | Yeah, I'm pretty sure this isn't the same kind of thing as
       | Carmack's (edit: not Carmack's) legendary inverse square root
       | function [1], which has that 0x5F3759DF constant (edit: magic
       | number):
       | 
       | [1] https://en.wikipedia.org/wiki/Fast_inverse_square_root
       | float Q_rsqrt( float number )       {        long i;        float
       | x2, y;        const float threehalfs = 1.5F;               x2 =
       | number \* 0.5F;        y  = number;        i  = \* ( long \* )
       | &y;                           // evil floating point bit level
       | hacking        i  = 0x5f3759df - ( i >> 1 );
       | // what the fuck?         y  = \* ( float \* ) &i;        y  = y
       | \* ( threehalfs - ( x2 \* y \* y ) );      // 1st iteration
       | // y  = y \* ( threehalfs - ( x2 \* y \* y ) );   // 2nd
       | iteration, this can be removed               return y;       }
        
         | silisili wrote:
         | Not Carmack's, it says so in the link you provided.
         | 
         | > The algorithm was originally attributed to John Carmack, but
         | an investigation showed that the code had deeper roots in the
         | hardware and software side of computer graphics. Adjustments
         | and alterations passed through both Silicon Graphics and 3dfx
         | Interactive, with the original constant being derived in a
         | collaboration between Cleve Moler and Gregory Walsh, while
         | Gregory worked for Ardent Computing in the late 1980s.[3] Walsh
         | and Moler adapted their version from an unpublished paper by
         | William Kahan and K.C. Ng circulated in May 1986.
        
         | secondcoming wrote:
         | What has this got to do with seeding a RNG?
        
           | allan_s wrote:
           | I think the comment author meant "a magical constant that has
           | been found by someone by empirical method without a nice
           | paper pointing why it's a good value"
           | 
           | (even though I think now the constant above has a paper on
           | why it's actually so good)
        
           | humanistbot wrote:
           | This was an example of a seemingly random magic number
           | appearing in code for a very clever hack. I think the OP was
           | assuming there was something similar at work with the PRNG
           | magic numbers.
        
       ___________________________________________________________________
       (page generated 2022-08-06 23:00 UTC)