[HN Gopher] Zero one infinity rule ___________________________________________________________________ Zero one infinity rule Author : azhenley Score : 60 points Date : 2023-03-18 18:01 UTC (4 hours ago) (HTM) web link (en.wikipedia.org) (TXT) w3m dump (en.wikipedia.org) | readthenotes1 wrote: | Zero one infinity (many) is a good rubric for test driving, and | for thinking how close to infinity your system can/should | actually support | furyofantares wrote: | > It argues that arbitrary limits on the number of instances of a | particular type of data or structure should not be allowed. | | Either this is a misuse of "arbitrary" or it's not really arguing | for no limits beyond 1. In the Carmack example from another | comment, if you know your solution isn't going to scale well | beyond 1000 items, that's not an arbitrary limit, and it's not 0 | or 1 either. When we choose not to allow users to back up their | hard drive in the username field of their account, that's not an | arbitrary limit. | xlii wrote: | I have a feeling that a lot of commenters mix up structure design | with structure implementation. | | Rule talks about the design (and I find approach interesting) yet | implementation can (and most likely should) be limited. | | E.g. we are designing a world. In this design we say children can | have toys. It doesn't make sense to design universe so that | children are limited to 5 toys or any other arbitrary number. But | no room can fit infinite number of toys so implementation is | limited. For some it's 500 toys but for some it's 500.000 toys. | | Names or parents are good real world examples. Sure, most people | have 2 parents pr at most 2 names but this rule argues that if | there's more than one it shouldn't be limited at all (which also | makes sense in real life even though IT system might limit names | to 256 characters because hardware). | eternalban wrote: | Then change the name of the rule :) | | _Zero, One, or Many_. I think that would satisfy all | concerned. | xlii wrote: | Absolutely agree. However that's for the rule creator. I | think that infinity in this context sounds slightly poetic. | antiquark wrote: | RIP the byte. | layer8 wrote: | It's just one octet. | travisjungroth wrote: | A metarule for using rules like this: instead of thinking of it | as a rule, think of it as an option you should usually consider. | You get lot of the benefits and less of the costs if you just ask | questions like "Would it be easier to solve for N than 3? Easier | to use and maintain?". The answer will never always be yes or no, | but you'll be more likely to get the right answer if you ask the | question. | Kim_Bruning wrote: | Am I the only person who sometimes makes an exception for "two" ? | ChainOfFools wrote: | I seem to recall that there are certain languages (human | languages) that have cases specifically for talking about pairs | of objects, situated semantically between singular and plural. | I think one of these is an archaic form of Greek, maybe ancient | or Hellenistic. | mmphosis wrote: | How about a sign bit? | | https://en.wikipedia.org/wiki/Unum_(number_format) | playingalong wrote: | It has always "bothered" me that AWS allowed for up to two API | keys per IAM user. | jdreaver wrote: | That's probably so you can rotate your keys without downtime. | adhesive_wombat wrote: | A ping-pong buffer certainly seems like generalising to a ping- | pong-pang-peng-pung-p(infinity)ng buffer would be overkill. | hprotagonist wrote: | i fairly regularly want pairwise, yep. | jerf wrote: | No. | | But I have the decency to feel bad and leave an apologetic | comment. | GauntletWizard wrote: | Wherever I need "Two" I usually find that what I actually need | is a one and a many; there's not a master and a replica, | there's one master and many replicas. There's not two places | that the data is when doing a raid restripe, there's many | places the data probably is and one place it should be. | grensley wrote: | Two is quite relevant for relationships | ChickeNES wrote: | Well, if you ignore polyamory and polygamy... | layer8 wrote: | You can have n-ary relationships. Relational databases are | built on that. | alex_sf wrote: | The argument against two would be: if you can do two, why can | you not do three? | GauntletWizard wrote: | I learned this rule through, of all things, the children's series | Animorphs. One book they encounter a colony of ants, who mention | that "You are one, we are many" and count that way; any finite | number is one, and infinite numbers are many. | hsjqllzlfkf wrote: | John Carmack argued the opposite. He said he would hardcore | limits into his data structures. Limits that would never be hit | under normal operating circumstances. He argued that when you | design software you should have an idea under what circumstances | it will run and optimize for those. The fact that people normally | don't do this, is why software often lags - that algo you | implemented worked worked just fine when it's hashmap has 1000 | keys, but not when it has 1m keys. | | I haven't articulated his point very well, but I lost interest in | this comment mid way thru. | cjfd wrote: | Well, John Carmack is a game programmer, at least originally. | In games one has strong upper limits on how long it can take to | calculate the next frame. So, it may very well be that | something can be said for this in this context. I am not sure | in general, though. If one limits the length of the name of a | person in some record one can start waiting until one day a | person arrives that has a name that is one character longer and | then a programmer needs to get involved to increase the limit. | Okay, maybe there exists a ridiculously large maximum name | length such that this problem is unlikely to occur. Then one | may still have the problem that if all people would have names | that approach this limit, the system still might get too slow | as well. So maybe we should impose a limit on the sum of all | the lengths of all the names in the system. Hmmm... Kind of | gets a bit too complicated and arbitrary for my tastes. I think | for many systems Carmarcks advice really is not very good. | layer8 wrote: | There should be limits. What if someone pastes the text of | the bible into the name field? What if you want to send an | email containing the name, and you hit the maximum email size | accepted by the SMTP server (they all have a limit)? What if | you want to send a postal letter and print the name on the | envelope? | | Not setting explicit limits either means that you still have | implicit limits, but you don't know what they are and where | and when you'll run into them, or it means you're opening | yourself up to DoS attacks, because your database or your | backups or your AWS bill will run amok. | GauntletWizard wrote: | I have legitimately used the complete works of William | Shakespeare, unabridged, as a password. Even that is in | megabytes, and not significant load for bcrypt2. | | Not that there should be no limits at all, but the upper | bound should be relatively high. | asynchronous wrote: | Doesn't bcrypt2 essentially truncate every source input | to no longer than 35 characters? | mhh__ wrote: | > 1m keys | | As opposed to not working at all? | quelsolaar wrote: | Ive come to agree with this. Say you have a struct that | contains a name. If you limit the name size to say 64 bytes, | then you can store it in the struct, otherwise you need to have | a separate allocation and an indirection. This makes the code | slower, more error prone and more complex to use. So think hard | of when "infinite" is justified. | meghan_rain wrote: | > more error prone | | not when using Rust ;-) | AlotOfReading wrote: | You should also be careful about incorrectly imposing limits | on human input, as every "falsehoods programmers should know | about x" repeatedly hits on. To continue the name example, | there are people with legal names and titles that take far | more than 64 bytes to store and truncation is not an | appropriate solution. You _should_ store the entire thing, | even if that 's a small bit more difficult technically. The | other limit also applies. Some people don't have any | characters in their legal names at all, like my mother (birth | certificate name was blank). | PKop wrote: | Or you can make a tradeoff and simply exclude these | outliers from using your product, instead of optimizing for | their unique case. | quelsolaar wrote: | Who said anything about the name being the name of a | person? | zeroonetwothree wrote: | I think "should" is a bit strong here. There are always | tradeoffs. Having support for very long names can cause | other problems, like the opportunity for abuse, making it | hard to format UIs, making human inspection of data much | harder, or even limiting algorithmic efficiency. | | For example, US Passports have a relatively small limit for | name length (I think it's around 20 characters each for | first/middle/last). | jiggawatts wrote: | "Who you are is inconvenient for me." | GMoromisato wrote: | Conversely, "The needs of the many outweigh the needs of | the few, or the one." | zeroonetwothree wrote: | This is true, but note that the "Zero one infinity rule" only | applies to the _number of items_ , not the size of an item. | Limits on size are a lot more typical anyway. | | (Yes I know someone will point out that to store a list of | items requires a linear amount of space, so therefore an | unbounded number of items inherently implies an unbounded | amount of space) | MawKKe wrote: | Sure if you know the operating domain, just hard code some | sensible upper limit and move on. Not that this way is "better" | in all cases. YAGNI and all that | adhesive_wombat wrote: | You can still have both: an algorithm that technically has no | formal numerical limit, but have warnings or errors when it | reaches some condition that is clearly outside the design | scope. | | For example, you might have a system that warns somehow if you | start inserting items at the front of a vector with over a | million items (when you originally expected the vector to be | used for tens of items). If the structure is suddenly being | used for something way outside of it's design scope, it's | probably in need of a re-think. | stefan_ wrote: | Nope. "No formal limit" or "infinity" is just another way of | saying "dynamic memory allocation". There is a reason you | usually find this advice in game adjacent places; to punt | everything to the dynamic memory allocator makes it very | difficult to know if you are keeping within the memory- | starved limits of e.g. the game console you are developing | for, or on mobile to keep a garbage collector from stepping | in and ruining your day. | adhesive_wombat wrote: | "Nope." An "algorithm" can be about things other than | memory allocation. Say you have a collection of sensors the | you poll. ZOI is saying that the system that polls them | shouldn't[1] have some kind of hard assumption that there | will only ever be, say, specifically 4. You could still | statically allocate the storage on a given system, for | example in an embedded system where the software is | compile-time configured for a given hardware. | | However, if you pass the "poll_sensors" function a size of | 60 million when the system was designed for "about 4, I | guess", it's likely that you're operating the algorithm in | a regime it wasn't designed for. You may (or may not, this | is just another trade-off) wish to know about it. | | [1]: of course you can always construct exceptions. If you | follow every rule you subscribe to dogmatically, then | you're doing something more akin to religion then | engineering. | shoo wrote: | > when you design software you should have an idea under what | circumstances it will run and optimize for | | This philosophy is sometimes referred to as "data-oriented | design" | | see also: Mike Acton's CppCon 2014 talk: | https://www.youtube.com/watch?v=rX0ItVEVjHc | | Data-oriented design seems appropriate for shipping high- | performance games, and shipping them on schedule and on budget. | The goal is not to design and code the algorithm with the best | theoretical scalability characteristics - and publish it in a | paper, or invent the most mathematically general version of the | algorithm to be factored out into a library for use by 100 | hypothetical future projects. The goal is not to design a | system using hexagonal architecture (or whatever) that can more | flexibly accommodate arbitrary changes of requirements in | future, at the expense of performance. The source code is not | the product that the customers buy, the game is the product. | | The goal is to understand the problem you actually have - | processing data of a certain finite size or distribution of | sizes, on some fixed finite set of target hardware platforms, | and figure out how to do this in a way that hits performance | targets, with a limited budget of engineering time, to keep | moving forward on schedule toward shipping the product. | roflyear wrote: | Only works when fail early is better than run slow which IME is | not often the case | TazeTSchnitzel wrote: | You can see this kind of design in action in the Source engine, | perhaps inherited from Quake (which is partly Carmack's work of | course). There are hard limits on things like how many objects | can exist at once. Occasionally those get hit... usually due to | a memory leak![0] So the artificial limit helps find real | problems. That's also been my experience with PHP, which limits | the amount of memory a request or script execution can consume | by default. | | [0] https://www.youtube.com/watch?v=pw2X1yhrDdE -- check out | shounic's other videos too, there's lots of interesting content | there about how complex game systems fail, including about | these limits | at_a_remove wrote: | I've had a different take on this that goes "Zero, One, Many." | And that when you end up really using your software, the things | that were "one" often become "many," and the unthinkable zeroes | frequently became ones. | zeroonetwothree wrote: | While this has some theoretical merit, IME, limits are quite | useful to catch bugs (or prevent degenerate cases). For example I | was recently working on a permissions system wherein there can be | members of groups. I set a reasonable limit on the size of a | group based on a maximum of actual usage and what I could foresee | being reasonable. A few days later, this limit was triggered, and | I got a bug report. But it turned out there was some bad caller | that had essentially an infinite loop that was adding more and | more members to the group. Without the limit this probably would | only have detected after it cascaded and caused other failures. | | As another example, Facebook famously had (maybe still has? I | have no idea) a limit of 5,000 friends. This was based on the | idea that you were supposed to be friends with people you | actually kind of know at least a little bit, which would probably | fit within 5,000 for most people. After some more popular users | started hitting the limit, it led to the development of public | profiles/pages as an alternative product to handle that use case. | So in this case the limit allowed the company to identify a new | market that they could better serve in an alternative way. | | BTW, the George Gamow book that the name comes from (One, Two, | Three... Infinity) is a great read for anyone interested in | popular descriptions of physics or math. He also lived a very | interesting life, I would recommend his autobiography "My World | Line". | josephg wrote: | Yeah; expected limits are also fantastically useful in | performance engineering. It's very common your code needs to | handle an arbitrarily sized input, but 99% of the time the | input will be bounded. (Or generally simpler). Special casing | the common code path can make a lot of code run much faster. | | For example, in some code I'm writing at the moment I have | lists of integers all over the place. I call them lists - | usually they only have 1 element. Sometimes they have 2 | elements (10%) and very occasionally more than 2 elements or | they're empty (<1%). | | If I used a language like Javascript, I'd use Arrays. But | arrays are quite expensive performance wise - they need to be | allocated and tracked by the GC and the array contents are | stored indirectly. | | Instead, I'm using an array type which stores up to 2 items | inline in the container object (or the stack) without | allocating. It only allocates memory on the heap when there are | 3 or more items. This decreases allocations by 2 orders of | magnitude, which makes a really big difference for performance | in my library. And my code is just as readable. | | I'm using the smallvec crate. There's plenty of libraries in C | and Rust for this sort of thing in arrays and strings. Swift | (like obj-c before it) builds small string optimizations into | the standard library. I think that's a great idea. | | https://crates.io/crates/smallvec | pyuser583 wrote: | The Wiki page says the problem isn't with limits, but with | _arbitrary_ limits. | | When a program limits me to 256 of something, it doesn't seem | arbitrary. | | I've heard stories of programmers setting limits to multiples | of two simply so nobody asks why. | crazygringo wrote: | Totally agreed. I think it's absolutely a best practice to set | limits. | | Code is generally designed to operate within certain | "reasonable" performance boundaries, and when it goes outside | those you need to think whether code should be rewritten to | accomodate it. | | Just a tiny example, but I regularly deal with long (800+ page) | PDF's on my iPad, reading parts of them in the stock Books app. | When I select text to highlight, the context menu unfortunately | puts "select all" directly next to "highlight". Of course, | every so often I accidentally hit "select all", and then I have | to force-close the app because otherwise it just freezes for 10 | minutes as it extracts and selects the text on every single | page. | | When really, it needs a limit to detect that, hey, if the PDF | is over 20 pages long then just don't allow "select all" | anymore, because this isn't the right tool for the job. | psychphysic wrote: | Although that's also ripe for the annoyance of not being able | to select all in border cases. | | I'm much in favour of don't fuck with the basics (i.e. the | core expect feature UI like select, copy and paste.). | solveit wrote: | I think a better way to formulate the same rule of thumb is | that zero, one, and infinity are the only numbers that don't | need to be _justified_. You should always have an answer to the | question "why 5000 and not 6000?" | | Of course, the justification can be as simple as "it has to be | some number and 5000 is as good as any", but that opens the | door to the discussion of whether 5000 really is as good as any | other number, which is often surprisingly enlightening. | pixl97 wrote: | What should the limits be? And are the limits clear to the | users of the system? Are they clear to other components of the | system? | | I agree we should have limits in software because we don't have | unlimited memory and processing time, but I commonly find these | limits are encoded by the imagination of the programmer working | on the software at the time, and it's often you find limits | that were not considered in systems design of the product. | HervalFreire wrote: | >While this has some theoretical merit, | | What theoretical merit? It sounds like a completely made up | rule of thumb based off of a persons anecdotal experience. | heyzk wrote: | Which, I assume, led to one of my favorite idioms: there are only | three numbers in computing: 0, 1, and n. | deergomoo wrote: | I think this is something programmers intuitively learn pretty | quickly (in fact I'm only just learning that there is actually a | name for this idea), but it's an interesting principle to explain | to non-developers. | | Back when I was doing agency work, on more than one occasion I | had clients question why modifying some entity to permit multiple | related entities instead of one would be more work than they were | anticipating. | | The question was usually along the lines of "would it help if we | were to limit related $foobars to three?". But of course, from a | structural perspective supporting three related records is | usually exactly the same amount of work as supporting arbitrary | amounts. Yeah I could probably hack in some fixed amount of extra | records, but that's just going to create an increasingly large | mess when your business grows and you realise | three...twelve...one hundred wasn't enough. Might as well do it | properly from the start. | dragontamer wrote: | This is probably a rule I believed in as a beginner, and | completely don't believe in anymore today. | | Lets take a look at an exceptionally well designed system: | single-error correction double-error detection codes. Also known | as "ECC RAM". The implementation of this only works for *exactly* | 64 bits + 8-parity bits. | | Now sure, we can talk about how we can generalize SECED ECC to | different numbers of bits or whatever. But I'll also point out | that the only implementation in widespread use is ECC Memory, the | 64-bits + 8-parity bits (72-bits total) implementation. | Nullabillity wrote: | It is generalized, you can chain together infinitely many | blocks. | alex_sf wrote: | I would argue your example is a special case of only allowing | one. | dragontamer wrote: | But it detects two-bit errors. | | Which is nonzero, non-one, and non-infinity. | | -------- | | SEC without DED is the classic implementation of Hamming | error correction codes. The double error detection (aka DED) | but is grafted on top of the theory, because it's useful in | practice. | crazygringo wrote: | I've read the article but I'm having trouble understanding the | context of what this was _in response to_. The article quotes: | | > _I formulated it in the early 70s, when I was working on | programming language design and annoyed by all the arbitrary | numbers that appeared in some of the languages of the day._ | | Was it just an issue with languages in the 70's then? Because I'm | struggling to think of any "arbitrary number" limits in modern | languages. | | Obviously we have numerical limits depending on if a variable is | a signed 32-bit integer or whatever. But those aren't arbitrary. | | Like would older languages just arbitrarily say there couldn't be | more than 100 elements in an array, or something? | layer8 wrote: | Lots of examples (e.g. maximum number of nesting levels), see | for example: | | http://www.tendra.org/tdfc2-config/chapter2 | | https://www.ibm.com/docs/en/epfz/5.3?topic=reference-limits | | One aim of finite implementation limits is to define which | programs are guaranteed to compile successfully, so that you | don't run into the situation that a program compiles on one | implementation but not on another implementation, which would | raise the question of whether its the program or the compiler | that is nonconforming. If both are conforming, you want the | program to compile 100% of the time. This isn't possible if | programs can have unlimited complexity. | | The mindset is similar to embedded programming, where you have | limited memory and absolutely don't want to run into the | situation that there's not enough memory left to perform the | operation that needs to be performed. | gdprrrr wrote: | Early C compilers had a limit of 6 characters for an identifier | neilv wrote: | I first heard of "no arbitrary limits" in connection with GNU | software. It was one of ways that GNU software not only had | different licensing than alternatives, but was also usually | technically superior. | | (At the time, there were a lot of different workstation computer | vendors, each with their own flavor of Unix. But where GNU tools | existed, they'd usually be better than all of the Unix | workstation ones. One exception was that the premium vendor C and | C++ compilers generated better code and/or were more trusted than | GCC at the time. GCC didn't take over until Linux, initially on | x86.) ___________________________________________________________________ (page generated 2023-03-18 23:00 UTC)