[HN Gopher] Original source of `(seed * 9301 and 49297) % 233280... ___________________________________________________________________ Original source of `(seed * 9301 and 49297) % 233280` random algorithm? (2014) Author : thunderbong Score : 142 points Date : 2022-08-06 14:07 UTC (8 hours ago) (HTM) web link (softwareengineering.stackexchange.com) (TXT) w3m dump (softwareengineering.stackexchange.com) | montroser wrote: | As pointed out in the comments of the post, this is an LCG[1] | PRNG. The Wikipedia article has a good discussion on how to | choose parameters, with examples of use in the wild from various | popular projects over the last many decades. | | [1]: https://en.wikipedia.org/wiki/Linear_congruential_generator | Someone wrote: | And its first reference is to what I consider the definitive | reference on the subject, at least for the state of the art in | 1968: the Art of Computer Programming. Vol. 2: Seminumerical | Algorithms. | | I also think the only additions that might be needed to that | are mentions of newer algorithms that are comparatively simple, | but better. The subject itself is pretty well-trodden. | | Knuth does mention the extremely simple additive random number | generator Xn = (Xn-24 + Xn-55) mod m | | (And says that this can be used directly with floating point | values to produce floats in the range [0,1)) | | It uses 55 words of state (initialized to not all be even), | needs only about ten simple instructions to implement (in | particular, no multiplications or divisions), has period of at | least 2^55-1, and Knuth said it's very good in practice, isn't | known to be bad, and ends with (in the 1980 second edition of | volume 2, with the first edition being from 1968): | | _The only reason it is difficult to recommend sequence (7) | wholeheartedly is that there is very little theory to prove | that it does or does not have desirable randomness properties; | essentially all we know for sure is that the period is very | long, and this is not enough"_ | | I can't find anything more recent about this RNG. Has it been | forgotten, or was it shown to be bad in some important sense? | hcs wrote: | It's a lagged Fibonacci generator | https://en.wikipedia.org/wiki/Lagged_Fibonacci_generator | | In the third edition (1998) there are a few more notes, most | importantly: | | "Lagged Fibonacci generators have been used successfully in | many situations since 1958, so it came as a shock to discover | in the 1990s that they actually fail an extremely simple, | non-contrived test for randomness" with reference to exercise | 3.3.2-31 asking "prove that if we generate 79 consecutive | random bits ... starting at a random point in the period, the | probability is more than 51% that there are more 1s than 0s". | | The solution references this paper on the tests: | https://arxiv.org/abs/cond-mat/9406054 | | There's a recommendation to generate X numbers and only use | the first Y an order of magnitude smaller, citing Luscher's | RANLUX, but I don't know how much that holds up either. | Someone wrote: | Thanks! I focused too much on "additive" as a search term. | I even looked at https://en.wikipedia.org/wiki/List_of_rand | om_number_generato..., and would have found that better | name if I had searched for "Mitchell" or "Moore" there, not | "additive". | hcs wrote: | FYI if you only have the 2nd edition of volume 2, Knuth | provides the full text of the changes from the 2nd to 3rd | edition electronically: https://www-cs- | faculty.stanford.edu/~knuth/taocp.html (search for | "Errata for Volume 2 (2nd ed.)") | | Since it's too late to edit the earlier post, this is the | specifics around discarding (p571): | | Luscher's discarding technique can be used to avoid the | bias towards 1s. For example, with lags 55 and 24, no | deviation for randomness is observed for random walks of | length 1001 when the numbers are generated in batches of | 165, if only the first 55 numbers of each batch are used. | aaaaaaaaaaab wrote: | Probably the NSA... | russellbeattie wrote: | In case people don't know this story, or worse, believe the | conspiracy theories, here's what happened with the development | of DES: | | > _" According to Steven Levy, IBM Watson researchers | discovered differential cryptanalytic attacks in 1974 and were | asked by the NSA to keep the technique secret. Coppersmith | explains IBM's secrecy decision by saying, "that was because | [differential cryptanalysis] can be a very powerful tool, used | against many schemes, and there was concern that such | information in the public domain could adversely affect | national security." Levy quotes Walter Tuchman: "[t]hey asked | us to stamp all our documents confidential... We actually put a | number on each one and locked them up in safes, because they | were considered U.S. government classified. They said do it. So | I did it"._ | | _Bruce Schneier observed that "It took the academic community | two decades to figure out that the NSA 'tweaks' actually | improved the security of DES."_ | | https://en.m.wikipedia.org/wiki/Data_Encryption_Standard | walterbell wrote: | John Denker's 2013 paper/code for using sound cards as input to a | HRNG, https://www.av8n.com/turbid/ | | _> We discuss how to configure and use turbid, which is a | Hardware Random Number Generator (HRNG), also called a True | Random Generator (TRNG). It is suitable for a wide range of | applications, from the simplest benign applications to the most | demanding high-stakes adversarial applications, including | cryptography and gaming. It relies on a combination of physical | process and cryptological algorithms, rather than either of those | separately. It harvests randomness from physical processes, and | uses that randomness efficiently. The hash saturation principle | is used to distill the data, so that the output is virtually 100% | random for all practical purposes. This is calculated based on | physical properties of the inputs, not merely estimated by | looking at the statistics of the outputs. In contrast to a | Pseudo-Random Generator, it has no internal state to worry about. | In particular, we describe a low-cost high-performance | implementation, using the computer's audio I /O system._ | hkgjjgjfjfjfjf wrote: | sbf501 wrote: | This is what bugs me about the stackexchange model. Most of the | comments are guesses by people who really don't know. The | "correct" answer doesn't explain anything, it just points to some | sources. The highest ranked comment that comes closest, attempts | to explain the nature of the numbers (the modulous is the period, | and the primes stretch out the cycle) and fires off a link to | wikipedia. | | Then it appears on HN, and more people half guessing, but better | content in general. | | Based on yesterday's post, BYTE magazine would have actually | given an answer with both why and how. | | EDIT: Granted, the link DOES go to Wikipedia which has a great | explanation, but in general, providing a link and not answer is a | poor model because links decay so quickly. | svat wrote: | It took me a while to understand your comment, until I realized | there are _two_ interpretations of the question of where these | numbers come from: | | * One is a _historical_ question: who first used these numbers | and when? who copied from where? | | * The other is a _conceptual_ question: why might someone | choose these numbers, what do they "mean", how do we interpret | these constants? | | The former is purely a question of facts and there is nothing | to "explain the nature of the numbers", and this is what the | answerer on Stack Exchange seems to have interpreted it as. | Under this interpretation saying things like "the modulus is | the period" is not really germane to the question; what's | required is to search sources and try to find the | chronologically earliest one. (The answer doesn't seem to have | done a definitive job, but it's a reasonable start.) | | This is like _synchronic_ vs _diachronic_ in linguistics: in | the latter interpretation you mainly just want to _understand_ | the random-number generator. | pizza234 wrote: | > This is what bugs me about the stackexchange model. | | > providing a link and not answer is a poor model because links | decay so quickly. | | Providing a link and not answer is not the stackexchange model. | On the Stack Exchange network, such answers are often (not | always, of course) commented with a request to expand the | answer. | nerdponx wrote: | And it's not just accepted but encouraged to comment on old | questions and answers that are either wrong or not up to | current standards. | phalangion wrote: | > links decay quickly | | That might be true in many places, and I agree with the general | sentiment that the answer should provide an answer and not just | a link to an answer. But wikipedia links are generally pretty | stable. It's not some random blog post. | [deleted] | spullara wrote: | I'd be willing to bet that a Wikipedia link will outlast a | StackExchange answer. | matheusmoreira wrote: | It absolutely will. Some of my favorite answers on Stack | Overflow have been deleted and there seems to be no archive | for such lost data. | nerdponx wrote: | Stackoverflow itself retains all deleted questions and | answers. If you have enough points you can see them. | kragen wrote: | Which ones were they? | cxcorp wrote: | I usually save my bookmarks on the Wayback Machine just so | there's at least one copy out there. | loldk wrote: | Insecure devs always obfuscate and try to hide the whole truth. | It is a massive problem in tech. | bluedino wrote: | The title question was the original source. The answer was a | list of places it was found, including the earliest. | sbf501 wrote: | Oh, good point. OP was looking for original source, papers | referring to, etc. Too late to edit, but minus points for me | for not comprehending the original question. | bluedino wrote: | Didn't help that he asked a couple more questions at the | end | | _Who invented this algorithm and tested the distribution? | Is there a paper or something to cite?_ | | That would have made another question. | Nition wrote: | I don't think you were far off what the asker was looking | for. In a comment, they say "I was kind of hoping to be | able to point to some specific research as to why these | numbers are special". | 0xTJ wrote: | The question isn't asking about how it works. It asks about | where that algorithm comes from. | a-dub wrote: | NR has long been seen as a sort of necronomicon for scientific | computer programming. | | i don't think it's SE's fault as much as it is a historical | rough edge between two fields (computer programming and applied | mathematics). | | the first constant is klaatu, the second is barada and the | third is nikto. | | (NR also tends to be fairly rigorous in terms of citations, for | the curious) | ascar wrote: | What's NR, what's a necronomicon and what's klaatu, barads | and nikto. | | I googled all these things and I can't still make any sense | of your comment except that it's intentionally confusing with | some popular fiction references? | svnpenn wrote: | It's common hacker news trope for someone to do acronym | diarrhea, even with rarely or never used abbreviations. | It's quite annoying. | Dylan16807 wrote: | NR is in the answer in the link. | | I feel like SE being stack exchange is pretty clear here. | | The remaining reference is not an acronym or | abbreviation, nor do I think it looks like one. | hinkley wrote: | "Klaatu barada nikto" is a code phrase from The Day the | Earth Stood Still to stop the homicidal antagonists from | killing you. | | It's doubly famous because Sam Raimi recycled it in "Army | of Darkness" as an incantation to dispel the curse of the | Necronomicon (book of the dead, portrayed as the arch book | of black magic in any number of stories), which our hapless | anti-hero Ash fucks up and thus unleashes the armies of the | dead on a roughly medieval and wholly unprepared world. | | If you like campy horror movies, there is so much camp in | Army of Darkness that there's barely any room for the | horror. It is technically a sequel but it works as a | standalone movie. I watched it years before I finally sat | down and watched The Evil Dead. | Natsu wrote: | Wasn't it also used in Ghost Busters at some point? | kragen wrote: | In addition to what the other answers have said, the | Necronomicon is the book of dark magic in the Lovecraft | mythos, from which it has been copied into some other | mythoi. | webstrand wrote: | NR is apparently Numerical Recipes, a book. I think they | mean that the constants are not well understood, like the | phrase "Klaatu barada nikto" was not understood, but still | had an effect. | adhesive_wombat wrote: | I assume _Numerical Recipes_. | overshard wrote: | And this is when comments in code are important! Any random | numbers without a source are immediate suspect to me, especially | in something that needs to be secure. It will save your coworkers | and peers time trying to figure out why it's there. | JKCalhoun wrote: | Security through obscurity? | | Maybe not. | swatcoder wrote: | That's true, but _this_ thing is not code any more. | | It's an incantation that's propagated for 50+ years because | it's minimal and effective. Over time, it's been fully | distilled to those properties. | | Since comments aren't essential to being minimal and effective, | they don't survive the distillation. | | Think of it like a clever gist that got pasted and shared a | hundred times. Even if the original source had explained every | step in great detail, with inline comments and deep explanatory | discourses and citations to prior art and etc, they'd | eventually get trimmed away as fat as people repeatedly prune | it down to some "important" bits pasted into their own copies | and then later share those trimmed copies, ad infinitum. | | This is that, but 50 years out. | AstralStorm wrote: | Since it's an LCG, it does not matter how it was derived as | long as the randomness properties are known. Such as cycle | length, identical initial state set and dispersion | properties. Perhaps also performance. | | These should be documented. | | It's a rather weak PRNG of short cycle, so the suspicion is | that it's made for particular dispersion properties, such as | for a hash table of particular data and size or other | bucketing algorithm. | ChrisLomont wrote: | >These should be documented. | | They're trivial to look up, and any modern source would | likely outlive the game of telephone of trying to keep such | a comment intact correctly. | krater23 wrote: | When you find this algorithm with or without any comments from | where the numbers come from in a point that should be secure, | you should be more than suspect. In any way, this is not a | secure PRNG. | Waterluvian wrote: | "No magic numbers." | eru wrote: | You'd want https://en.wikipedia.org/wiki/Nothing-up-my- | sleeve_number | Waterluvian wrote: | That's clever! | bee_rider wrote: | It is interesting -- 5/9 of the books listed there are clearly | numerics textbooks, so it isn't surprising they talked about a | non-cryptographic PRNG. Curious about the other 4. But maybe | this shows up as a "here's why you should be careful what type | of RNG you are using" type example. | pclmulqdq wrote: | For PRNGs like this, constants are often chosen by guess and | check. This algorithm (an LCG) has a bit more theory, so these | might not have been chosen that way, but the author of the code | probably didn't have any insight either. | swayvil wrote: | I think it fits the rng period to fit max 32 bit (4 bil). | Somebody in the comments said similar so I feel good about this. | jffry wrote: | The person in the comments said the period is only ~233k, which | is correct, because you are always taking modulo 233280. But | it's also easy to verify experimentally: let | state = 6, iters = 0; do { state = (state * 9301 | + 49297) % 233280; iters++; } while(state !== 6); | console.log("period is", iters, "iters"); //period is | 233280 iters | | If you instead took modulo 2^32, I believe the period would | indeed be the maximal period, according to this article: | https://en.wikipedia.org/wiki/Linear_congruential_generator#... | bonzini wrote: | He means that the constants are chosen so that the | multiplication and addition can be done on a signed 32-bit | integer without overflow. | morelisp wrote: | There's no reason to avoid overflow (nor any reason to make | it signed since that would also be a concern these days) | though. | bonzini wrote: | That depends on the language. For example BASIC used to | have only 16-bit or 32-bit signed integers (depending on | the processor; plus strings and floating point), so this | particular RNG can be useful for its portability. | morelisp wrote: | Such dialects of BASIC also offered defined overflow | behavior (it's really only C/C++ and other things | targeting MCUs that don't - and even three also only | relatively recently that "undefined overflow" meant | something more practically dangerous than "implementation | defined") so it was still not a concern. | bonzini wrote: | Sure but signed and unsigned division are different. | [deleted] | a1369209993 wrote: | > so that the multiplication and addition can be done on a | signed 32-bit integer without overflow | | Except they can't: any seed in the range 230888 thru 233279 | will produce a signed integer overflow when multiplied by | 9301 (eg 9301*230888 = 0x80001608). | bonzini wrote: | You're right, that makes no sense. Also a-1 (9300) should | not have factors in common with m (233280) but it has a | lot... | kloch wrote: | Here's a good article about this type of PRNG, with other sample | seeds: | | https://www.pcg-random.org/posts/does-it-beat-the-minimal-st... | [deleted] | snet0 wrote: | If someone can access it, my university notes where I looked into | LCGs such as this discusses that in Numerical Recipes, 1992 C | Edition, a clever LCG can be used to generates random numbers | between 0 and 1. Using a 32 bit int generated by a fast LCG | modulo 2^32. The cleverness is that you treat this value as a | float. It uses a bit mask to mask only the mantissa of a | float-32, and then is OR'd with a value to set the exponent to | 127. Subtracting 1 from this results in a random value between 0 | and 1, using only simple or bitwise operations! | 10000truths wrote: | The issue with this is approach that the interval [1,2) has | much fewer representable values than [0,1) in IEEE-754 floats, | so the range of generatable values is only a fraction of the | representable values. | | On modern hardware, you should instead use a count-leading- | zeroes or count-trailing-zeroes instruction on a uniform bit | pattern to directly generate the exponent. This is what is done | in the Zig standard library: | | https://github.com/ziglang/zig/pull/10428 | binarycoffee wrote: | You are right of course, but I would contend that generating | as well sub-normal floats is in 99.9% of the cases a needless | cost. The reason is that, more often than not, all this extra | precision will be immediately lost in subsequent operations. | | A very common situation is for instance to plug the random | number in a log, in which case you need to use log(1-r) | rather than log(r) to avoid an infinite at r=0. The problem | is, by doing this simple subtraction you have already lost | all the subnormal precision. | ralphb wrote: | This is well-known, and almost trivial. But because I am | curious I looked this up in NR 3rd edition. Section 7.1.5 says | | > The steps above that convert a 64-bit integer to a double | precision floating-point value involves both a non-trivial type | conversion and a 64-bit floating multiply. They are performance | bottlenecks. One can instead directly move the random bits into | the right place in the double word with a union structure, a | mask, and some 64-bit logical operations; _but in our | experience this is not significantly faster_ | | It goes on to describe a _lagged Fibonacci generator_ which | generates values directly as floating point. | binarycoffee wrote: | This is a widely used method to map random integers to floating | point numbers, but it has the disadvantage of wasting 1 bit of | float mantissa precision. | | On modern CPUs, its computational advantage over full-precision | mapping methods, such as multiplication by a float, is not | always clear [1]. | | [1] https://github.com/rust-random/rand/issues/416 | humanistbot wrote: | Yeah, I'm pretty sure this isn't the same kind of thing as | Carmack's (edit: not Carmack's) legendary inverse square root | function [1], which has that 0x5F3759DF constant (edit: magic | number): | | [1] https://en.wikipedia.org/wiki/Fast_inverse_square_root | float Q_rsqrt( float number ) { long i; float | x2, y; const float threehalfs = 1.5F; x2 = | number \* 0.5F; y = number; i = \* ( long \* ) | &y; // evil floating point bit level | hacking i = 0x5f3759df - ( i >> 1 ); | // what the fuck? y = \* ( float \* ) &i; y = y | \* ( threehalfs - ( x2 \* y \* y ) ); // 1st iteration | // y = y \* ( threehalfs - ( x2 \* y \* y ) ); // 2nd | iteration, this can be removed return y; } | silisili wrote: | Not Carmack's, it says so in the link you provided. | | > The algorithm was originally attributed to John Carmack, but | an investigation showed that the code had deeper roots in the | hardware and software side of computer graphics. Adjustments | and alterations passed through both Silicon Graphics and 3dfx | Interactive, with the original constant being derived in a | collaboration between Cleve Moler and Gregory Walsh, while | Gregory worked for Ardent Computing in the late 1980s.[3] Walsh | and Moler adapted their version from an unpublished paper by | William Kahan and K.C. Ng circulated in May 1986. | secondcoming wrote: | What has this got to do with seeding a RNG? | allan_s wrote: | I think the comment author meant "a magical constant that has | been found by someone by empirical method without a nice | paper pointing why it's a good value" | | (even though I think now the constant above has a paper on | why it's actually so good) | humanistbot wrote: | This was an example of a seemingly random magic number | appearing in code for a very clever hack. I think the OP was | assuming there was something similar at work with the PRNG | magic numbers. ___________________________________________________________________ (page generated 2022-08-06 23:00 UTC)