[HN Gopher] The most copied StackOverflow snippet of all time is...
___________________________________________________________________
The most copied StackOverflow snippet of all time is flawed (2019)

Author : vinnyglennon
Score : 128 points
Date : 2021-06-16 21:27 UTC (1 hours ago)
(HTM) web link (programming.guide)
(TXT) w3m dump (programming.guide)

| bsaul wrote:
| seems that most comments here missed the end of the article,
| where he points to the "production ready" version of the
| solution, that is indeed very close to the original one,
| including a while loop.
| t0astbread wrote:
| It's especially ironic given that this is about a StackOverflow
| code snippet that many people probably also copied without
| reading.
| eutectic wrote:
| This is why it's a good idea to have a real integer type.
| enriquto wrote:
| Isn't it impossible? Integers go arbitrarily large but
| computers don't.
| asdf3243245q wrote:
| Computers also go arbitrarily large. Not infinite, but
| arbitrarily large.
|
| A real number type could be bounded by the amount of RAM you
| have.
| jrockway wrote:
| > Sebastian then reached out to me to straighten it out, which I
| did: I had not yet started at Oracle when that commit was merged,
| and I did not contribute that patch. Jokes on Oracle. Shortly
| after, an issue was filed and the code was removed.
|
| Good thing it wasn't a range check function. I hear those are
| expensive.
| dokem wrote:
| Something about this comes off as amateurish. The obsession with
| minimization. Just use a switch statement. Now where is the bug
| going to hide? The solution doesn't need to generalize, there is
| only a small handful of different solutions. Just break them all
| out. It's more maintainable and readable and requires less
| thinking.
| stefan_ wrote:
| Why are you writing code? The question was for a static method in
| Apache Commons, not your "I'm so clever" implementation. Think
| the reading comprehension is flawed.
|
| (Of course, this static method exists in Apache Commons, going
| back at least 20 years. But the fellow "code golfers" of the
| author voted someone to the first answer who similarly had the
| irresistible urge to _try to be very clever_. It's a scourge on
| StackOverflow.)
| [deleted]
| unwind wrote:
| I must admit I smiled at seeing that I edited the question, back
| in the day. :) Can't say I remember the question, and didn't know
| it has that epic feature of being the most-copied. Cool!
| mweberxyz wrote:
| Say what you want about the stability of the npm ecosystem, but
| if this were JS, a new SemVer patch release could be cut, and it
| would be fixed in thousands of code bases essentially instantly.
| beermonster wrote:
| > I wrote almost a decade ago was found to be the most copied
| snippet on Stack Overflow. Ironically it happens to be buggy.
|
| I don't find it ironic, I find it quite normal that even small
| snippets of code contain bugs (given the daily review requests I
| receive).
|
| I think when copying code literally from StackOverflow what's
| more important is understanding what the code does, and why,
| rather than copying it verbatim by copy & pasting it into your
| production code.
|
| I also often find on StackExchange et al that quite often the
| most upvoted is the one that 'fixes it' for 'most people' yet the
| correct answer is down at number 3 or 4. Again, understanding the
| answer and why it applies, helps give you the context to
| understand if this is _actually_ the solution to _your_ problem
| or just treats the symptom.
| megalodon wrote:
| One of the best tips I have gotten from the internet is to
| never copy and paste code you have not written yourself. Even
| rewriting it verbatim makes you think about what it is you are
| actually copying.
|
| It's a pretty neat rule to have in mind.
| AceJohnny2 wrote:
| > _Key Takeaways:_
|
| > _[...]_
|
| > _Floating-point arithmetic is hard._
|
| I have successfully avoided FP code for most of my career. At
| this point, I consider the domain sophisticated enough to be an
| independent skill on someone's resume.
| user3939382 wrote:
| There are libraries that offer more appropriate ways of dealing
| with it, but last time I ran into a FP-related bug (something
| to do with parsing xlsx into MySQL) I fixed it quickly by
| converting everything to strings and doing some unholy
| procedure on them. It worked but it wasn't my proudest moment
| as a programmer.
| tasty_freeze wrote:
| The thing that jumped out at me, as I've seen the same kind of
| thing on the job, is the assumption that, e.g., log(1000)/log(10)
| is _exactly_ 3. Does the standard guarantee that the rounded
| approximation of one transcendental number divided by the rounded
| approximation of a related transcendental number will give 3.0
| and not 2.999999999?
| remram wrote:
| Yeah that seems like a serious flaw to me too. On my Python:
|
|     >>> import math
|     >>> math.log(1000)/math.log(10)
|     2.9999999999999996
|     >>> int(math.log(1000)/math.log(10))
|     2
|
| But I don't know about the guarantees provided in the
| JavaScript standard (or more importantly those offered by
| actual browsers).
| danellis wrote:
| > almost no branches
|
| I wonder whether the author is suggesting that (potentially) nine
| branches is a small number, or they overlooked ternary
| expressions and function calls and are just counting the if
| statement.
| axiosgunnar wrote:
| So it's not flawed (it does compute the correct result).
|
| The author just thinks a completely unreadable (but supposedly
| faster) variant using logarithms is "better" than the simple loop
| used in the original snippet?
|
| Write your code for junior devs in their first week at your
| company, not for academic journals.
| hardwaregeek wrote:
| I think you might have misread the post.
| His logarithm code became the most used snippet and had the bug.
| [deleted]
| [deleted]
| [deleted]
| ascar wrote:
| His code snippet had rounding errors on the boundaries towards
| the next unit.
|
| However he notes:
|
| > FWIW, all 22 answers posted, including the ones using Apache
| Commons and Android libraries, had this bug (or a variation of
| it) at the time of writing this article.
| phist_mcgee wrote:
| You should almost _always_ focus on code readability and
| simplicity over inventiveness and cleverness.
|
| Very few people I have encountered have complained about code
| being 'too simple' or 'too readable', but the opposite happens
| on a near daily/weekly basis.
|
| Write comments, use a for loop, avoid global state, keep your
| nesting limited to 2-3 levels, be kind to your junior devs.
| jka wrote:
| There might be an opportunity somewhere around this area to
| combine the versioning, continuous improvement, and dependency
| management of package repositories with the Q&A format of
| StackOverflow.
|
| Something like "cherry pick this answer, with attribution, and
| notifications when flaws and/or improvements are found".
|
| Maybe that's a terrible idea (there's definitely risk involved,
| and the potential to spread and create bad software), but equally
| I don't know why it would be significantly worse than
| unattributed code snippets and trends towards single-function
| libraries.
| fennecfoxen wrote:
| NodeJS did something a lot like this by having packages that
| are just short snippets, but half the ecosystem flipped out
| when someone messed up `leftpad`.
| [deleted]
| DylanSp wrote:
| Not sure if it's quite what you had in mind, but SO is starting
| to address the issue of updating old answers with the Outdated
| Answers Project:
| https://meta.stackoverflow.com/questions/405302/introducing-...
| pkaye wrote:
| Now the new code is unreadable.
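The two problems raised in the thread so far, the log(1000)/log(10) quotient landing just under 3 and the rounding error at the boundary to the next unit, can both be avoided by staying in integer arithmetic. What follows is a minimal Python sketch under stated assumptions, not the article's code and not any commenter's: the function name `human_readable_byte_count`, its `si` parameter, the half-up rounding to one decimal place, and the exact output format are all chosen here for illustration.

```python
def human_readable_byte_count(byte_count: int, si: bool = True) -> str:
    """Format a non-negative byte count with one decimal place.

    Hypothetical helper for illustration: only integer arithmetic is
    used, so no floating-point logarithm or division is involved.
    """
    if byte_count < 0:
        raise ValueError("byte_count must be non-negative")

    unit = 1000 if si else 1024
    if byte_count < unit:
        return f"{byte_count} B"

    prefixes = "kMGTPE" if si else "KMGTPE"
    suffix = "B" if si else "iB"

    magnitude = unit
    for prefix in prefixes:
        # Value in tenths of the current unit, rounded half up.
        tenths = (byte_count * 10 + magnitude // 2) // magnitude
        # If rounding pushed the value up to a full `unit` (for example
        # 999,950 bytes would print as "1000.0 kB"), move on to the next
        # prefix. The largest prefix is used for anything bigger.
        if tenths < unit * 10 or prefix == prefixes[-1]:
            return f"{tenths // 10}.{tenths % 10} {prefix}{suffix}"
        magnitude *= unit

    raise AssertionError("unreachable: the last prefix always returns")


if __name__ == "__main__":
    for n in (999, 1000, 999_949, 999_950, 1536):
        print(n, "->", human_readable_byte_count(n))
```

Deciding the unit after rounding, rather than before, is what keeps 999,950 bytes from being shown with the smaller prefix. Python integers do not overflow, so the 64-bit concern that applies to a Java or C port of this loop does not arise here.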
| ape4 wrote:
| It's as easy as "KMGTPE"
| penteract wrote:
| This is a bit of a tangent, but while it may be conventional to
| round to the value with the smallest difference, is that
| convention good? In a case such as this where it's fine for the
| precision to vary with magnitude, then I'd argue it makes sense
| to round to the value with the smallest ratio.
| bla3 wrote:
| > At the very least, the loop based code could be cleaned up
| significantly.
|
| Seems like the loop based code wasn't so bad after all...
| meetups323 wrote:
| Loop code has the same bug.
| bla3 wrote:
| This is Java, not JavaScript. The exponents table was likely
| of integer type. Then it works.
| spkm wrote:
| This! If I had to choose between the two snippets I would have
| taken the loop based one without a second thought, because of
| its simplicity. The second snippet is what usually happens when
| people try to write "clever" code.
| dataflow wrote:
| The loop by itself isn't entirely clear on what it's doing.
| Stuff like the direction of the > comparison and what to do
| vs. >= and the byteCount / magnitudes[i] at the end really do
| require you to pause & do mental analysis to check
| correctness. I think the real solution here is to define an
| integer log (ilog()?) function based on division and use that
| in the same manner as the log(). That way you only do
| the analysis the first time you write that function, and after
| that you just call the function knowing that it's correct.
| twobitshifter wrote:
| Premature optimization strikes again.
| amelius wrote:
| Wouldn't it be cool if you could call stack overflow answers
| directly from your code?
| hardwaregeek wrote:
| Floating point is really really hard to get right, especially if
| you want the numbers to be stable. Which begs the question, why
| the heck does JavaScript, the most used language in the world,
| not have an integer type? Sure, there's BigInt but that's quite
| clunky to use.
| I know it's virtually impossible to add by now,
| but I'd love an integer type for all my bit twiddling, byte
| munching needs.
| ascar wrote:
| I just feel if you have bit twiddling, byte munching needs
| JavaScript shouldn't be the language of choice. Doing that is a
| rather rare edge case and if you're doing it for performance
| reasons, working in JavaScript is the much bigger performance
| problem.
| colejohnson66 wrote:
| What's wrong with a simple loop (like the one near the top)? Why
| does it _have_ to be branchless? Wouldn't the IO take longer than
| missed branches/pipeline flushes?
|
| Not to mention that the fixed version now has branches as well...
| MauranKilom wrote:
| The irony is that a single log computation is going to take
| longer than the loop. (No idea if implementing a log
| approximation involves loops either.)
| [deleted]
| bottled_poe wrote:
| Sounds like a textbook example of when theory is misaligned
| with reality.
| xxpor wrote:
| the original version had branches too, in fact a majority of
| the lines had them! ? is just shorthand for if.
| enedil wrote:
| This isn't true, this form of conditionals can be compiled
| into cmov type of instructions, which is faster than regular
| jump if condition.
| dataflow wrote:
| > This isn't true, this form of conditionals can be
| compiled into cmov type of instructions, which is faster
| than regular jump if condition.
|
| IIRC cmov is actually quite slow. It's just faster than an
| unpredictable branch. Most branches have predictability so
| you generally don't want a cmov.
| ncann wrote:
| If the if/else is simple the compiler should be able to
| optimize that anyway.
| kmote00 wrote:
| Update title: this is from 2019
| mjevans wrote:
| The author's lookup table is incorrect.
|
| The question being answered clearly wanted base2 engineering
| prefix units, rather than the standard base10 engineering prefix
| units.
|
|     suffixes = [ "EB", "PB", "TB", "GB", "MB", "KB", "B" ]
|
|     magnitudes = [ 2^60, 2^50, 2^40, 2^30, 2^20, 2^10, 2^0 ]
|     // Pseudocode, also 64 bit integers required. (Compilers might
|     // assume unsigned 32 for int)
| returningfory2 wrote:
| That code snippet is explicitly introduced in the article as
| _not_ the author's.
| asdf3243245q wrote:
| That is not the author's code. That is pseudocode for one of
| the example answers that he is improving on.
|
| The author's code gives an option for the units:
|
|     int unit = si ? 1000 : 1024;
___________________________________________________________________
(page generated 2021-06-16 23:00 UTC)