[HN Gopher] New Grad vs. Senior Dev ___________________________________________________________________ New Grad vs. Senior Dev Author : kogir Score : 427 points Date : 2020-03-28 00:20 UTC (22 hours ago) (HTM) web link (ericlippert.com) (TXT) w3m dump (ericlippert.com) | csours wrote: | The only nested loop implementation I've ever changed for | performance was due to the fact that there was a freakin' | database call in the inner loop. All I did was refactor the data | structure so that the db call was done outside the outer loop. | Instant 20x improvement in performance. | dimtion wrote: | I find that the biggest misunderstanding happens because "new | grads" (and I happen to be one) confuse _asymptotic complexity_ | with actual complexity. | | I'm not sure why, but CS courses and interview questions mostly | focus on _asymptotic complexity_ and usually forget to take into | consideration the complexity for "little values of n". And | funnily enough, in real life n never goes to infinity! | | In a strict sense big O notation only cares about what happens | when n goes to infinity. The algorithm could behave in any way up | to unimaginably large numbers (like TREE(3)) and its big O still | wouldn't change. | | Maybe what those "new grads" are missing is a feeling for real- | world data, and for how a computer behaves in the real world | (with caches, latencies, optimised instructions, etc.), not just | an ideal computer model in their mind when they design | algorithms. | justinmeiners wrote: | I regularly see people make this mistake and fail to grasp it | even after correction. | | You could make a hash table with a constant time lookup, but the | hash takes 1 hour. Big O only tells you how it scales, not its | performance (runtime). | TeMPOraL wrote: | It's not even that.
You could have a normal hash table with a | decent hashing function, and you'll still get beaten by a flat | array for small n (hundreds, low thousands), because the array is | contiguous in memory - so operations like search or moving stuff | around after addition make extremely good use of CPU's cache. | justinmeiners wrote: | I was using an extreme example to illustrate. I definitely agree. | dmoy wrote: | > the array is contiguous in memory - so operations like search | or moving stuff around after addition make extremely good use of | CPU's cache | | Also - if I see someone try to use a linked list for an enormous | data structure again.... Wow, it does not scale worth crap, | because it turns out that the hardware is actually important, and | contiguous memory is amazing. | alexis_fr wrote: | > if I see someone use a linked list | | Or a hashmap to prepare 3 variables to pass to JSON | serialization. | leetrout wrote: | I think that hits the human factor of software development, where | it is easier to conceptualize that your hashmap will become that | JSON object. | | Curious - what would be your solution? Just creating the JSON | directly as strings / bytes? | dasyatidprime wrote: | Unsorted-array-based maps are sometimes used in the Java world, | and for two or three elements will have much less overhead than | hash tables. For instance, fastutil has | http://fastutil.di.unimi.it/docs/it/unimi/dsi/fastutil/objec.... | The map interface and encapsulation into a single "object" is the | same. | | It occurs to me that I don't know whether any of the major | dynamic language implementations with maps/dicts/hashes as a | central data structure use a similar approach for very small | ones... huh. | TeMPOraL wrote: | Oh god. Don't talk to me about linked lists.
One of the | bigger performance improvements I've made in a certain company | was taking code working with lots of numerical data in linked | lists because _they had easier syntax_ , and rewriting it using | honest-to-god, contiguous-memory arrays of doubles. After that, | we could process three orders of magnitude more numbers per | operation, and one order of magnitude more operations, and we | _still came ahead_. | matwood wrote: | Maybe you knew the scale up front, but if you didn't, the easier | syntax was the right first choice. It may have been the right | first choice because it was easier to code, even with the scale | known up front. Only after measuring and understanding the trade | offs should the easier to reason about code have been removed. | IMO, thinking about and understanding these trade offs is one of | the main differentiators between a junior and senior developer. | TeMPOraL wrote: | > _IMO, thinking about and understanding these trade offs is one | of the main differentiators between a junior and senior | developer._ | | I agree, but in a way opposite to what you intended. An | experienced developer[0] should be able to look at a situation | like this and realize that a few more minutes of focus can yield | a better (array-based vs. list-based) implementation[1]. There | are no downsides to that (arrays were only slightly less | convenient in that case, syntax-wise), and the improvements occur | regardless of scale. The list-based solution was a bad one at the | scale it was originally written to handle. | | I believe a hallmark of an experienced developer is writing | performant code from the get-go; this is accomplished by not | making stupid mistakes like this, and it costs pretty much | nothing in terms of coding time or code complexity. All it takes | is a little knowledge and _caring about the product's | performance_.
| | -- | | [0] - I hesitate to use the word "senior", because to me, | whether it means anything depends on the company one works in. In | many, a "senior" developer is just the one that came before all | the "junior" hires, and it doesn't matter that that developer is | a fresh bootcamp graduate. And once you can put "senior X | developer" on your CV, it's likely your next job will give you | seniorship immediately as well. | | [1] - and an extra few minutes would give an implementation that | doesn't allocate new memory unnecessarily - also a huge | performance win. | codr7 wrote: | The most important lesson I've learned from 34 years of writing | software is to stop pretending I know shit about the problem I'm | trying to solve before I have written an actual working solution. | Which means getting there asap is top priority and nothing else | matters. Sometimes that code runs fast enough; often it turns out | I'm solving the wrong problem, which means performance doesn't | matter at all. | jayd16 wrote: | In this case complexity analysis would tell you that. | deathanatos wrote: | > _by a flat array for small n (hundreds, low thousands)_ | | Some of us are working in, say, Python. A flat array can | outperform at small n, yes, but people overestimate where the | tradeoff point is. It's at <5 items:
        # A list of [0, 1, 2, 3, 4]
        In [10]: linear = list(range(5))
        # A hash set, same thing.
        In [11]: hashing = set(range(5))
        # 44ns / linear search
        In [12]: %timeit 3 in linear
        44.2 ns +- 0.412 ns per loop (mean +- std. dev. of 7 runs, 10000000 loops each)
        # 25ns / hash search!
        In [13]: %timeit 3 in hashing
        25 ns +- 0.6 ns per loop (mean +- std. dev. of 7 runs, 10000000 loops each)
| | The hash set outperforms the linear search by nearly 2x, on a | list of size 5! (The performance is similar for other common | types that end up in hashes, like strings.) | | "It's Python!", you say.
"Too much chasing of pointers to | PyObjects destroys the cache!" And yes, it does; but many people | are working in high-level languages like Python or Ruby. | | But, for those that aren't, if we repeat the above exercise in | Rust, yes, the tradeoff point will move up, but only to ~60 | items, _not_ hundreds or low thousands:
        test tests::bench_hash_int   ... bench: 14 ns/iter (+/- 1)
        test tests::bench_linear_int ... bench: 19 ns/iter (+/- 3)
| | If you're thinking that somehow accessing the middle item each | time bestows an unfair advantage on the hash table, randomizing | the desired item doesn't help, either:
        test tests::bench_rng_hash_int   ... bench: 19 ns/iter (+/- 2)
        test tests::bench_rng_linear_int ... bench: 24 ns/iter (+/- 2)
| | And looking for an item _not_ in the list is definitely not | favorable to the linear search. (It's the worst case.) | | In my experience, it's almost always easiest to pay mild | attention to big O concerns, and just use the appropriate data | structure for the problem at hand. Cache effects mattering is | either rare (you're writing a RESTful microservice to push cat | pictures; a cache isn't going to matter next to a mobile device's | 20-second network latency!) or highly context-dependent (your | line of work is always low-level, these issues crop up more | often, and you're consequently on the lookout for them; I don't | think this applies to most of us, however). | | The code used, in case you wish to find fault with it: | https://github.com/thanatos/hash-vs-linear | XelNika wrote: | You only benchmarked the search itself, but in the real world it | might also take time to set up the data. You can't really | pinpoint a tradeoff point without knowing how many times a data | structure will be used. | | I ran your Python test on my machine and the hash set was faster | in every case: 10x faster at size 50, 2x faster at size 5, 1.3x | faster at size 3.
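[Editor's aside: the measurement deathanatos and XelNika discuss is easy to reproduce with only the standard library. This is a minimal sketch; the nanosecond figures vary by machine and Python version, so treat them as illustrative.]

```python
from timeit import timeit

# Small collections, as in the benchmark above.
linear = list(range(5))   # linear search: O(n), but tiny constant factor
hashing = set(range(5))   # hash search: O(1) average, plus hashing overhead

# Time one million membership tests against each structure.
t_list = timeit("3 in linear", globals=globals(), number=1_000_000)
t_set = timeit("3 in hashing", globals=globals(), number=1_000_000)

print(f"list: {t_list:.3f}s  set: {t_set:.3f}s")
```

On CPython the set usually wins even at n = 5: `in` on a set is a single hash probe, while `in` on a list compares elements one by one.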
| vvanders wrote: | Yup, linear data reads are easily 10-30x faster than random ones, | thanks to the cache miss penalty staying static since the 90s. | | If you want to see real-world DDR speeds, figure out which | algorithms do linear reads. | corysama wrote: | It's because Big O is Computer Science. Cache effects are | Software Engineering. Professors of CS do a fine job of teaching | CS. They even briefly mention that there is an implicit constant | factor k in O(k n log(n)), and then they never mention it again. | They certainly don't mention that k can easily vary by 128x | between algos. AKA: 7 levels of a binary tree. Or that most of | the data they will be dealing with in practice not only won't be | infinite, but will actually be less than 128 bytes. Or that even | with huge data and a proven-ideal O() algo, there is often a 10x | speed-up to be had with a hybrid algo like a b-tree instead of a | binary tree. And another 2-10x with SIMD vs scalar. 100x isn't | infinite, but it still counts. | | So, grads listen to their CS professors and that's what they | know. It's not until they get lectures from greybeard software | engineers that they learn about the reality of algos and not just | the idealized algos. | lonelappde wrote: | That's an arbitrarily restrictive view of computer science. It's | like saying teaching physics while ignoring friction is perfectly | fine. | pretendscholar wrote: | Yes, you should ignore the details of friction for the first few | semesters. | lonelappde wrote: | For how many semesters? All the way through to graduation? | rriepe wrote: | No. You wouldn't want people slipping on stage. | alasdair_ wrote: | > No. You wouldn't want people slipping on stage. | | Since all the students will merely be spheres of equal density, | that shouldn't matter much.
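[Editor's aside: the hidden constant corysama describes can be demonstrated directly. In pure Python, an O(n^2) insertion sort tends to beat an O(n log n) merge sort at small n, because the recursive sort pays for function calls and list slicing at every level. A rough sketch; the crossover point is machine-dependent.]

```python
import random
from timeit import timeit

def insertion_sort(a):
    """O(n^2) comparisons, but almost no per-step overhead."""
    a = a[:]
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x
    return a

def merge_sort(a):
    """O(n log n), but each level pays for slicing and function calls."""
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

small = [random.randrange(100) for _ in range(8)]
t_ins = timeit(lambda: insertion_sort(small), number=100_000)
t_mrg = timeit(lambda: merge_sort(small), number=100_000)
print(f"n=8  insertion: {t_ins:.3f}s  merge: {t_mrg:.3f}s")
```

This is also why production sorts (CPython's Timsort, many C++ `std::sort` implementations) switch to insertion sort for small runs.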
| MaxBarraclough wrote: | > briefly mention that there is an implicit constant factor k in | O(k n log(n)) and then they never mention it again | | A fine concrete example of this is the Coppersmith-Winograd | algorithm (and its derivatives), a matrix multiplication | algorithm with impressive complexity properties, but which in | practice _always_ loses to the Strassen algorithm, despite | Strassen's inferior complexity. [0][1][2] | | (Aside: the Strassen algorithm is pretty mind-bending, but also | easily shown. If you've got 22 minutes to spare, there's a good | explanation of it on YouTube. Perhaps there's a denser source | elsewhere. [3]) | | > It's not until they get lectures from greybeard software | engineers that they learn about the reality of algos and not just | the idealized algos. | | To mirror what some others are saying here, students should also | be taught the realities of cache behaviour, SIMD-friendliness, | branch prediction, multi-threaded programming, real-time | constraints, hardware acceleration, etc. | | [0] https://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_a... | | [1] https://en.wikipedia.org/wiki/Strassen_algorithm | | [2] https://en.wikipedia.org/wiki/Computational_complexity_of_ma... | | [3] https://www.youtube.com/watch?v=ORrM-aSNZUs | dwohnitmok wrote: | Even Strassen can be overly expensive for reasonably sized | matrices. Almost all matrix multiplication implementations have a | size threshold, and only beyond it do they use Strassen. | | Although apparently that threshold can be lowered | (http://jianyuhuang.com/papers/sc16.pdf), even then it's a matrix | that's several hundred columns by several hundred rows. | | Some CS classes explicitly use Strassen to teach the realities of | asymptotic vs wall-clock time complexity, challenging students to | come up with a hybrid matrix multiplication algorithm that | performs the fastest and switches at the best thresholds of | matrix size.
| msla wrote: | > To mirror what some others are saying here, students should | also be taught the realities of cache behaviour, SIMD- | friendliness, branch prediction, multi-threaded programming, | real-time constraints, hardware acceleration, etc. | | Which would have the positive knock-on effect of the textbook | being sufficiently obsolete every year or so that the students | could no longer trade it in for credit, saving the bookstores | money! | | More seriously, that knowledge (at least once you attach numbers | to it) has a shelf life, and not a very long one. Teaching big-O | analysis means the knowledge is timeless, which any good | theoretical knowledge is, and moving more towards practice would | force the professors to keep on top of the state of the art, and | the state of the _mainstream_ , of hardware design, in addition | to everything else they're doing. | herbstein wrote: | I always loved my algorithms and data structures professor | talking about Brodal queues[0] - a data structure named after | him. They're super interesting from a theoretical point of view, | but they're not useful for anything. | | [0] https://en.wikipedia.org/wiki/Brodal_queue | fchu wrote: | As a side note: O(k n log(n)) and O(n log(n)) have exactly the | same meaning for any non-zero constant k, by the definition of | big O notation. The more you know! | dilyevsky wrote: | Every good CS course has a section on cache-aware algorithms. And | I call bullshit that the constant factor is not mentioned too. | eyegor wrote: | I've looked before, but I've never seen a class dedicated to | practical algorithm design. Being able to reason about cache | layout, context switch costs, branch prediction behavior, SIMD | refactoring, and basic compiler-level optimizations will result | in much more performant code. In the real world people often | write complex algorithms which operate on structs/classes instead | of primitives.
This | means there's a huge performance hit just from pointer | traversal in a hot path, especially if someone naively does | math across data inside heap objects. You can easily write | a fancy dynamic algorithm approach which has theoretical | O(k*n) performance which takes forever in the real world | due to abstraction traversal. If you're doing more than one | operation, it's often a massive performance boost to cache | all your object pointer evaluations into primitive arrays, | do simd operations on them, and then populate the objects | at the end. | | Does anyone have a good textbook suggestion for cache/simd | aware algorithm design? I've seen plenty of papers that | cover single examples but never something the scope of a | book. | corysama wrote: | It wasn't taught to me. And, in my previous job I | interviewed many dozen fresh grads. One of my questions was | "How much slower is it to sum integers in a trivial linked | list vs. a trivial array?" 90% answered "Umm... I don't | know. 2x?" When asked why, they all said "1 op to sum the | int +1 op to traverse the pointer." It was amazingly | consistent. | onekorg wrote: | The answer could be 2x. Let's say you're in a 64 bit | platform. Your linked list nodes consist of a next | pointer and a 64 bit integer. | | If your linked list nodes are all allocated sequentially | in memory then it'd only be 2x as slow as an array of 64 | bit integers. | | But maybe it's not fair to call sequentially allocated | linked list a "trivial linked list". | [deleted] | [deleted] | koala_man wrote: | This kind of CS-based rationalization is arguably another | aspect of what the article comments on. I wrote a | benchmark and found the difference in this case to be | 3x-3.5x. | willberman wrote: | I can see how this wouldn't be covered in an undergrad cs | education. I took only a single computer architecture | class which was extremely limited in scope. 
The only | reason I knew about vectorization during undergrad is | because a friend mentioned it to me once. | cjfd wrote: | O dear.... Am I happy that I never studied computer | 'science'..... On the other hand, there must be smart | computer science students and/or smart places of | education where actual learning about processor caches | and the like takes place..... | lghh wrote: | I don't think anyone answering in this way would be "not | smart" like you seem to imply. | joshlemer wrote: | What would be the kind answer you're looking for? | jodrellblank wrote: | At a guess, some understanding of this article[1] - if a | CPU instruction scaled up to human time takes 1 second | then a Level 1 cache lookup takes 2 seconds and a main | memory lookup takes 4 minutes. | | Imagine for the array it's 1 CPU instruction to load a | value, 1 to load the next value, 1 to add them, and one | to store the result, that would be 4 instructions per | sum; ideally the array would stream into the CPU after a | single main memory lookup delay up-front, and then be 4 | instructions per pair, summed as fast as the CPU can | loop. | | The linked list at worst needs an imaginary 1 CPU | instruction to load a value, 1 to load the pointer value, | 1 to reference the pointer, a delay of 2 seconds to get | that value from L1 cache - missed, it's not in cache - | 240 seconds stalled waiting for main memory, 1 to add, 1 | to store the result. Worst case, >240x slower. | | The linked list is not guaranteed to be in contiguous | memory, but it might be, so the cache might have the | right data in it. The linked list is 50% data, 50% | metadata, so the cache is half wasted / can hold half as | much data, and if the linked list is coming in from a big | memory read quickly, half the bandwidth is carrying | pointer addresses not data, so the throughput is halved | for that, too, and the processor cycles were already able | to happen much faster than the main memory bus max speed. 
If it's not contiguous memory, you don't know in advance where it | is, to request all the right memory areas at once - not until you | read sequentially to the last item pointer and find there are no | more. | | Maybe if they are both small, both in contiguous memory, and both | go into Level 1 cache after a single main memory delay, it could | be only ~2x the time, but the more data there is overall, the | more chance the linked list will bust the cache or be | discontinuous in memory. And on the plain array side, it might be | possible to speed up with SIMD/SSE instructions to spend fewer | cycles adding and storing per element, which the linked list | approach might not be amenable to at all[2]; then the best case | might be ~4x slower, worst case ~500x slower. | | [1] https://www.prowesscorp.com/computer-latency-at-a-human-scal... | | [2] https://stackoverflow.com/questions/10930595/sse-instruction... | corysama wrote: | The question had 2 goals: | | 1) Do you think about cache _at all_ , or is it just something | you heard mentioned as important that one time? | | 2) It's a good lead-in to discussing the effects of cache in | algorithms. How that conversation goes helps me to understand how | that person thinks about and discusses complex problems. | | A good answer would be "I'm not sure, but probably way, way | slower, because a linked list can point all over memory but | arrays cache really well." | | An excellent, A+ answer would be "In the best case it might not | be too much slower, if you have an intrusive linked list arranged | sequentially in memory like an array, as onekorg explained. But | in practice most will be 20-200x slower, because they are usually | implemented as pointers to nodes containing pointers to data, and | each piece is allocated piecemeal in an already fragmented heap. | Uncached memory reads can take 100+ cycles, and summing an int is | not enough work to hide that even if the CPU speculatively | prefetches."
| | I mainly expected a surprised reaction that they could be so | slow, and looked forward to the follow-up discussion. | TheOtherHobbes wrote: | I sometimes wonder if CS should be renamed Computer Pseudo- | Science. I blame Knuth and (mostly) Dijkstra for propagating the | falsehood that you can estimate the performance of real code on | real hardware with some elegant academic blackboard math, which | gives you the wrong answer for many practical applications. | | It's not that Big O isn't useful - it's that it's taught as a set | of "proofs" which somehow make it appear objective and "correct", | when in reality performance is at least as dependent on cache | architecture, median size-of-n, memory bandwidth, and other | implementation details. | | Anyone who graduates CS without having been taught this very | forcefully - preferably during a practical project - should be | refunded at least some of their course fees. | herbstein wrote: | But Big O was a lot more directly correlated when the CPU wasn't | doing "magic" optimizations on its own. It was still an estimate | with an invisible constant factor, of course. | willberman wrote: | When I took data structures and algorithms, I believe almost | every single lesson was concluded with the caveat that constants | are important. | dwheeler wrote: | Mentioned, yes, but often not learned. In many situations the | only thing that matters is the constant factor. If the number of | data items is relatively small, the difference between N log N | and N squared may be completely dominated by the constant factor. | In addition, there is the challenge of maintaining the code later | and making sure it's correct. | vvanders wrote: | Yup, so much truth here. | | Take a look at Real-Time Collision Detection[1]. It takes a great | look at both algorithmic complexity _and_ cache awareness. That's | how it should be done.
| | [1] https://www.amazon.com/dp/1558607323 | Jasper_ wrote: | Models are easy when you turn every cow into a sphere. But | physicists never believe their models fully describe the real | world. Computer Science should be about the Science of Computers, | not hypothetical models acting on hypothetical architectures. | duck2 wrote: | Computer Science existed long before there were computers. I | think the Science of Computers you mention is known as Computer | Architecture. | msla wrote: | > Computer Science existed long before there were computers. | | Not in its current form, and not if you define "computers" with a | sufficiently broad net. | | (Or broad loom, tipping a hat to Jacquard...) | gorloth wrote: | Taking the difference between normal complexity and asymptotic | complexity to the extreme, you have | https://en.wikipedia.org/wiki/Galactic_algorithm , which do have | the best asymptotic performance, but only on values of n so large | they never come up in real life. | kevinventullo wrote: | To be fair, in my experience it is often the case that asymptotic | complexity is a good proxy for real-world performance, even for | small values of n. Not always, but often. | | I think it's fine that the academic courses focus a bit more on | what's better in theory than in practice, because there are | always caveats to "in practice"; the person who writes the | special-purpose genomics libraries was also once a new grad. | monocasa wrote: | Eh, bubble sort can be quicker than qsort for small values of N, | for instance. | RookyNumbas wrote: | I like to start by thinking about cache locality and ensuring | linear layout. Next, focus on one-time or minimal memory | allocation. Then there are a bunch of small, systemic things you | need to get right. After that you can start worrying about worst- | case big O scenarios. | | Of course, this depends on your language. A C programmer will | have a different mental model than a Python one.
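[Editor's aside: corysama's linked-list-vs-array question from upthread can be tried directly. In CPython the measured gap mixes interpreter overhead with cache effects, so the ratio is only illustrative, but the cost of pointer chasing is still visible. A minimal sketch:]

```python
from timeit import timeit

class Node:
    # A trivial singly-linked list node.
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def build_linked(values):
    # Build the list back-to-front so the head holds the first value.
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def sum_linked(head):
    # Walk the chain one pointer hop at a time.
    total = 0
    while head is not None:
        total += head.value
        head = head.next
    return total

values = list(range(50_000))
head = build_linked(values)

t_flat = timeit(lambda: sum(values), number=50)
t_linked = timeit(lambda: sum_linked(head), number=50)
print(f"flat list: {t_flat:.3f}s  linked list: {t_linked:.3f}s")
```

In a language like C the gap would be dominated by cache misses alone; here the flat version also benefits from `sum` running in C, which only widens it.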
| lonelappde wrote: | In Python, performance is your last consideration, and that's OK. | Most things computers do don't need to be fast. Only the | innermost loops that run the most do. | SaxonRobber wrote: | This is the philosophy that has led to our software becoming | slower despite improvements in hardware. | | Performance is always important, especially for consumer | applications, where your software will probably need to run | alongside many other processes, each competing for resources. | why_only_15 wrote: | I used to think this was true, but there's a lot of stuff | recently where there really aren't such hotspots everywhere. When | I profile UI stuff, for instance, there isn't some big obvious | hotspot to optimize; the runtime is spread all throughout the app | doing random things. If all the regular code you write is just | ludicrously slow, you're going to end up with something that's | laggy and without any way to fix it other than rewriting it. | saagarjha wrote: | Often, for small values of n, performance matters less anyway; | matching it as n gets larger is often a nice bonus. Sometimes | this makes the code more complicated, yes, but occasionally it | can even make the code simpler, especially in a language with | good data structures and algorithms (C++ is a shining example). | monocasa wrote: | What if you have a large number of small-N lists to be sorted? | loosetypes wrote: | Complexity, yes, but I feel scalability is similarly a word with | a misleadingly narrow connotation in practice. | | Is an approach dependent on swathes of training data truly | scalable if it doesn't work for the first n attempts? | samfisher83 wrote: | When you do big O analysis you get best case, worst case, and | average case. You have to do some thinking about the structure of | your data when doing big O analysis. | TeMPOraL wrote: | It's not that.
Something not properly covered in CS courses | is that very often, performance is dominated by things that are | not evaluated as part of big O analysis: memory allocations, | cache friendliness, and other constant factors. | | For example, according to the theory, a hash table is much better | suited for key lookup and random additions than a vector. In | practice, if you're storing a couple hundred elements, a flat | array (with objects stored directly) will be faster because of | data locality. If your problems are mostly "do a lot of small-N | ops" and not "do some large-N ops", then big O analysis isn't all | that useful anymore. | seanmcdirmid wrote: | This is covered in computer architecture, at least, and sometimes | (but not often) in compilers. | saagarjha wrote: | No, the point here is that big-O analysis means nothing if n is | small. If n < 10, your algorithm could be exponential and still | do better than a linear algorithm with a constant factor a | thousand times larger. | tmpz22 wrote: | And they never factor in training or maintenance-level | complexity, i.e. your ball of mud runs fast, but have fun | teaching ~5 juniors how to use it two years from now. | gok wrote: | It's mostly a shibboleth to make sure you actually did the | coursework on your resume. | fjfaase wrote: | Everyone who has done profiling knows that (almost always) the | performance problem is in the part where you least expect it. But | also that sometimes you wrestle to improve the performance of an | algorithm, hitting a hard wall, and then someone comes up with an | idea, often resulting from a fresh look at the problem, that | results in an improvement of several orders of magnitude. I guess | that performance is at least as counterintuitive as statistics. | The same is true for some other things, like scalability and | reliability. | | Actually, I think that it scares the hell out of most developers | that it is so difficult to get a grip on these things.
It is so easy to think that there is a simple solution, a grand | idea that will fix the problem. I still find myself falling into | the trap, and this after having developed software for over 30 | years. It is the Dunning-Kruger effect over and over again. I | guess it's more that, as a more senior engineer, you have | experienced this a little more often. | jniedrauer wrote: | > I guess that performance is at least as counterintuitive as | statistics. | | I don't think this is really true. After you've optimized enough | code over the years, you start to get a sense for bottlenecks, | and your code is usually "fast enough" even on the first try. | When it isn't, finding the problem with a profiler is usually | pretty straightforward. | pdubs1 wrote: | I have no CS degree or STEM degree. | | Recently I worked on the same type of project as someone with 10 | yrs of experience & a CS degree from Stanford. | | A few months later, I created a project, and had a manager with a | CS degree. However, when I left, that manager was unable to pick | up where I left off, and he ended up leaving soon after. | | I have fewer years of experience, but to me, what matters more is | the time within that experience which was put into a relevant | business model, product built, or past projects. I.e. a Senior | Dev with a CS degree vs. a New Grad with a Business Background & | SWE Experience. It's apples to oranges in many cases. | | Also, I'd echo another comment here: | | > "daxfohl 1 hour ago [-] | | > As a senior dev, I wish that I could say I always knew more | than my interns, and that all the code that's there is because it | was carefully planned to be that way. | | > But more often than not, I don't, and it's not." | downerending wrote: | There are indeed many not-really-programmers with CS degrees.
| | On the other hand, sometimes I'm handed a program written by a | new grad to maintain/fix/improve, and rapidly determine that it's | less work to just chuck it in the bin and start over. | | One of the major differentiators between a newbie and an old hand | is knowing how to create a piece of software that those who work | alongside you or come after you can understand, maintain, and | improve. | aerovistae wrote: | I dislike the mentality that one must "struggle" to be patient | with new devs and that it's "more than they deserve." | | Is it really so hard to help other people learn, and to accept | that the only advantage you have over them is starting earlier? | ericlippert wrote: | I take your point, but let's be fair. My attitude was "this code | is bad and I'm going to demonstrate my skill by improving it" | when it should have been "please teach me what design and | implementation concerns went into the choice of algorithm here". | I was lucky to get a gentle and thoughtful correction for my | presumptions. | nocman wrote: | Kudos to you, first for recognizing your own error and second for | openly admitting it. | | I think there are many things to consider here. For new | developers, I'd encourage you to look for those "old guys" who | really know their stuff. There's a lot of unmined gold you can | discover there if you find the right ones. I think us older | developers would do well to imitate the patience and kindness of | Tim Paterson more often. I guess what I'm saying is both sides | could do with a huge dose of humility. I know at times I've been | the youthful dev out to one-up the "old guys", and I've been the | senior dev thinking "these kids today" to myself when dealing | with those with a lot less experience. And both of those are bad. | | Also, there are many times when a young guy fresh out of college | spots a problem, finds a great solution, and makes things ten | times better by just doing it!
If you're in the | business for a long time, it can become too easy to be | cynical, and lose your enthusiasm. | | I think the best thing we can do is try to let the good stuff | from both sides rub off on each other. | lonelappde wrote: | It's great that you saw that you had a bad attitude and | reflected on it. But ignorant arrogance is a personality | trait some people have sometimes, not a new grad trait. | pjmlp wrote: | Learn on their own they must. | mbrameld wrote: | > Is it really so hard to help other people learn, and to | accept that the only advantage you have on them is starting | earlier? | | If only the world were that black-and-white. It took me a long | time to realize the habits I learned from my father in this | area were toxic. Not everybody got the same upbringing you did, | it sounds like yours was more advantageous than mine in that | respect. | king_magic wrote: | Except it is a struggle. It's often a struggle to get newer | devs to stop wasting time, it's often a struggle to get them to | focus on the problem you're trying to solve instead of the new | library all the cool kids are using, etc. | | I agree with the "more than they deserve" mentality, but let's | be honest here: it's a struggle. | | We've all been through it as new devs, and we'll all help new | devs struggle through it as well. | hhjinks wrote: | I've worked with juniors who always scope creep their very | simple introductory tasks. Something as simple as "add these | two fields to the existing API response" suddenly turns into | "change the method signature of 90% of existing methods | because DTOs make code lines shorter". | lonelappde wrote: | New team members also notice the mess veterans made by | feature creeping "add two more fields" to a 100 field | monstrosity that is a usability and security nightmare, but | no one can improve it because the lead engineer got | promoted for building v1 and refuses constructive | criticism. 
| lonelappde wrote: | I've seen it with senior devs too: "architecture astronauts" | and "principal engineers" interfering with solving problems. | Why drag an unnecessary, distracting stereotype into the | conversation? | saagarjha wrote: | By the way, here's an anecdote for the flip side: at one of my | internships I was working on a tool to process large log files, | and by careful application of Aho-Corasick I was able to make it | about _50 times_ faster on the dataset we were dealing with, | which made using the tool change from "let's go grab lunch while | this finishes" to "let's stream the logs through this live". | Sometimes you do know how to make things faster: just make sure | to test them before you do, and asking why something is a certain | way is always a good thing to do before proclaiming it's broken. | LeoTinnitus wrote: | But sometimes you can get a supervisor who doesn't care what | you do; you're not changing it, because they don't want to learn | something new... | ericlippert wrote: | That's a great example; the most important part of your | anecdote is the end, which says _what was the user impact?_ | There is no prior reason to believe that a 50x speedup is | actually a win; taking an algorithm from 100K nanoseconds to 2K | nanoseconds when hitting the file system takes billions of | nanoseconds is not a win for the user, and taking an algorithm | that takes 5000 years down to 100 years is probably not either. | | But a 50x win that, as you note, goes from "let's have lunch" | to "let's change this one thing ten times and re-do the | analysis to see which gets us the best results" is a huge game | changer; it's not just that it saves time, it's that it makes | possible new ways to work with data. | est31 wrote: | Another anecdote: A few months ago, I was adding a bunch of | features to an open source library. Before changing it, I | studied the library's code to find out where best to put the | change.
During that study, I found an instance where the | code performed a slow operation, but I didn't attempt to | change it because it was out of scope for my change, and I | didn't want to introduce bugs [1]. | | A little while after I made the change, a user filed a | bug [2] against the library about slow behavior when you passed a | large amount of data to the library. They were using the | public release instead of git master, so my code wasn't at | fault, thankfully. Due to the knowledge I gained from making | the change, I could quickly confirm that precisely this slow | part was the cause of the performance slowdown, and I was able to | file a PR to reduce the library's complexity from quadratic | to linear [3]. | | It wasn't very complicated from the algorithmic point of | view; I only had to create a lookup structure, but the lookup | structure had to be created in a certain way so that the | library still behaved the same way as tested by the test | suite. It also had to support duplicates, since the original | implementation supported them as well; that was in fact a very | important feature (TOML arrays). | | [1]: https://github.com/alexcrichton/toml-rs/pull/333 | | [2]: https://github.com/alexcrichton/toml-rs/issues/342 | | [3]: https://github.com/alexcrichton/toml-rs/pull/349 | lalaland1125 wrote: | For interest's sake, did you try simply using a decent regex | engine as an alternative? Any DFA regex engine implicitly | implements Aho-Corasick for you. | saagarjha wrote: | I think the engineer before me put a bit of effort into this, | but it didn't work out all that well in this case. This isn't | surprising considering that the standard regex tool on the | platform wasn't known for being particularly performant. | [deleted] | pfyon wrote: | The open source intrusion detection system Suricata [1] used | Aho-Corasick until Intel released Hyperscan [2] as open source. | Hyperscan is apparently more performant than Aho-Corasick.
If | your language can handle the C libraries, have you considered | trying Hyperscan to see how it compares? | | [1] https://suricata-ids.org/ [2] https://www.hyperscan.io/ | skavi wrote: | Based on their use of past tense, I'd assume they aren't | interning there anymore. | saagarjha wrote: | I am not, and the company is known for having a fairly | strong not-invented-here syndrome. | [deleted] | rkagerer wrote: | Shockingly, InStr(<this page of 100+ comments>, | "docum") = 0 | | I'm a dev with some grey hair who feels it would have been useful | for all that fantastic domain knowledge from Paterson to get | documented in a code comment. | | I'd love to hear whether either of them ever went back and did that. | ericlippert wrote: | In this case I think not. However, I strongly agree with your | position and have tried hard to ensure that my code is | commented such that it explains why the code works as it does. | See https://ericlippert.com/2014/09/08/comment-commentary/ for | more thoughts on this subject. | 0x00000000 wrote: | I was looking for this, thank you. I cannot get over when | people try to keep all this knowledge in their head. Have they | never worked on something written by someone other than | themselves? I know it usually isn't, but it really feels like | some kind of hubris that makes people allergic to writing a | paragraph block comment in the code. | jpochtar wrote: | Can someone explain the bugs in the code samples? The author says | they're there but I honestly can't find them. | asib wrote: | No checks regarding length of source and query - may end up | dereferencing beyond the bounds of the string. | ericlippert wrote: | There is a dereference past the bounds of the query in one | case in the last code sample, but there is no dereference | beyond the bounds of the source string. | | You're probably thinking in C# or Java; remember that in C | the convention is that a zero char ends strings.
If the | source string is shorter than the query string, then the code | will encounter a zero char in the source string at the same | time as it encounters a non-zero char in the query string, | and the inequality will end the loop before the beyond-bounds | dereference. | | There are other defects; can you find them? | jpochtar wrote: | There are checks, though, when it tests for '\0'. | | `starts()` looks like it's not checking if len(source) < | len(query), but if, say, query="foobar" and source="foo", | when i = 3 the line if (source[i] != | query[i]) return false; | | will evaluate to if ('\0' != 'b') | return false; | | so `starts()` will correctly return false. | | In general, if len(source) < len(query), we'll return false when | we get to i=len(source), because source[len(source)] != | query[len(source)], as source[len(source)] == '\0' and | query[len(source)] != '\0' since len(query) > len(source). | derangedHorse wrote: | I don't see any checks preventing an 'out of bounds' exception, | nor general null checks before accessing data. | | Edit: As others have stated, the 'out of bounds' exception | should be taken care of by the '\0' at the end of strings in C. | ericlippert wrote: | Hint: what is the correct behaviour of this method when given | empty strings? Every string contains the empty string as a | substring. | jpochtar wrote: | Oh, I'd assumed disagreement on the behavior of query="" between | the two code samples meant it was UB, and was looking for | crashes/invalid memory accesses. | ericlippert wrote: | A Visual Basic program is not allowed to have undefined | behaviours like a C program; InStr has a specification and | that specification has to be implemented; that spec | includes defining the behaviour for all inputs. | | There's also no null handling here, which was a deliberate | omission for clarity.
In practice, the convention used | inside the VB source code is that null string pointers are | semantically the same as empty strings, which introduces | some complexities. | jpochtar wrote: | ah neat, thanks! | [deleted] | [deleted] | daxfohl wrote: | As a senior dev, I wish that I could say I always knew more than | my interns, and that all the code that's there is because it was | carefully planned to be that way. | | But more often than not, I don't, and it's not. | | My take on it is that for the most part senior devs are a lot | more paranoid about breaking things and don't want to make code | changes unless enough people are asking for it and it'll make a | notable difference. Even making things strictly faster can break | users if they depend on the algorithm being slow (an extreme edge | case that you'd usually say is the user's fault anyway, but could | nonetheless be relevant in things like optimizing away a spin- | wait). | bfung wrote: | Nested for-loops go brrrrrr | | lmao! But #truth. | TeMPOraL wrote: | Oh god. That meme. | | I saw it a day or two ago. Can't find the picture anywhere | now (I saw it in some group chat). Anyway, beyond the words | quoted at the beginning of this article, the meme's "nested loops | go brrr" had a picture of a triple-nested loop using Active | Record to do some simple database operations. | | To which the correct response is: "it's a 'senior developer' in | an industry where you get called a 'senior' after a total of 3 | years of experience doing programming; don't do stupid shit like | this, _just use SQL like it's meant to be used_". | on3 wrote: | Found it: https://i.redd.it/lmrf72ko0ro41.png | TeMPOraL wrote: | That's it. Thanks! | tspop wrote: | Hey, I made that meme. | | It was based on a similar story to the one in OP's blog post. At my | first job I used to work with some really talented fresh grads | that wanted to show off their algorithms skills and ended up | over-engineering stuff.
| | One of them implemented a trie and stored it in SQLite to | implement some string autocomplete where the number of strings | was something like 100. | | The other implemented a 2D segment tree for doing some grid | updates where the size of the grid was small. This inspired the | first part of the meme. Segment trees and sqrt decomposition | are topics that are popular at programming contests and nowhere | else, really. | | Regarding the triple-nested loop, I just wrote the simplest | pseudocode that represents nested loops, not necessarily | something a "senior" developer would write. | est31 wrote: | Wow, it was based on a real-world story. Makes the meme even | better. | lonelappde wrote: | New hires showing up at work and doing the one thing they | were tested on in the interview. How strange of them! | est31 wrote: | Many people can write brute-force implementations. But | while the meme is funny and great, it only shows a part of | reality: sometimes you _do_ have to optimize things. Just | don't do it too early, and only if there is a clear use | case that requires the speed. Then you should be able to | write fast code, or at least know which library to use that | has a good implementation of the algorithm you need. | | Some companies like GAFAM probably put too large a focus on | algorithmic questions, but they can afford to lose | otherwise good engineers who are bad at algorithmic | questions. They need something to filter the masses of | applicants they receive. | Apocryphon wrote: | Goodhart's Law-Driven Development | DonHopkins wrote: | I like to ask interviewees to imitate the sound a computer | would make over an AM radio while executing different | algorithms. | | Nested for loops go brrrrrrrrrrrr, munching squares go | bweep bweep bwweeeep bwweeeep bwweeeep bwweeeep | bwwwweeeeeeep bwwwweeeeeeep bwwwweeeeeeep bwwwweeeeeeep | bweep bweep bweep bweep...
| | https://www.youtube.com/watch?v=V4oRHv-Svwc | | Life goes shlup shlup shlup shlup shlup... | | https://www.youtube.com/watch?v=hB78NXH77s4 | | If they use any Don Martin sound effects, I hire them on | the spot. | | https://www.madcoversite.com/dmd-alphabetical.html | | >CHK CHK CHA-GONK BRBBRBBRING! -- Man's Eyes Being Poked | Like A Cash Registers' Keys And Jaw Popping Open Like A | Till Drawer -- Mad #61, Mar 1961, Page 18 -- Kitzel's | Department Store | TeMPOraL wrote: | Thanks for the context. The example you describe supports the | meme better. | | Sorry for being harsh; I got triggered by that code inside | the printer, because I've dealt with a lot of dumb "I don't | know how SQL joins work, so I'll use my ORM to do it and | filter the data in code" cases early in my career, and I have | sort of an allergy to that now. | karatestomp wrote: | I see this so much in Rails codebases that at this point | the two are nearly synonymous in my mind. But maybe I've | been cursed to work only on bad Rails projects or something | and there's a universe of them out there that aren't full | of that sort of thing. | johncalvinyoung wrote: | There are Rails codebases written by people who love both | ActiveRecord and SQL and try to optimize both performance | AND developer productivity. ;) | ADanFromCanada wrote: | But this is war. Pick a side! | ericlippert wrote: | Well, thanks for inspiring the post! It triggered a pleasant | trip back to a simpler time for me. | empath75 wrote: | Most FAANG job interviews would fail you if you did the brute- | force solution, it seems. | saagarjha wrote: | I feel that most will be sympathetic if you explain why | you did something a certain way and show that you understood the | different approaches available. | maliker wrote: | I'm the senior dev on my team, and whenever a new dev joined my | team they would look at the codebase and go "ew, python2? Just | use python3."
| | That gave me a chance to explain the testing and refactoring cost | that would come with changing python versions, and how the | benefits to users would be almost zero. And then at some point | one of the new juniors said, "hey, there's a lot of filesystem | performance improvements and readability improvements (f-strings) | in 3. I can get the test/refactor done in a month, and I think | it's a net positive." They were right, and now we're on python3. | | So, sometimes we all learn something. | heymijo wrote: | This sounds like the phenomenon dubbed Chesterton's Fence [0]. | | _A core component of making great decisions is understanding | the rationale behind previous decisions. If we don't understand | how we got "here," we run the risk of making things much | worse._ | | So you helped the new dev understand the current lay of the | land. They listened, then suggested an improvement based on | their new understanding. You agreed and together improved the | code. | | [0] https://fs.blog/2020/03/chestertons-fence/ | duxup wrote: | I'm working with some old code and spend a lot of my time | thinking "There's got to be a reason this is as wonky as it | is." | | Sometimes I just can't find out (the guy who wrote it is | gone, nobody knows) then I have to just do it the way I think | it should be, test ... then I find out why, sometimes | spectacularly ;) | dmurray wrote: | This isn't really Chesterton's Fence. It doesn't take any | soul searching or insight to answer "why was this not written | in Python3 in the first place" - the answer is almost | certainly that Python3 or some key libraries didn't exist. | | It's a pure cost/benefit analysis question. Switching has | some cost, and some benefits, and you have to decide which | are likely greater. | heymijo wrote: | I think this seems like a good critique, but as you noticed | this is a nascent concept for me. As I learn more, I'll | have this critique in mind. 
| drewmol wrote: | > the answer is almost certainly that Python3 or some key | libraries didn't exist. | | This has not been my experience, even within the past few | years on occasion. | dmurray wrote: | Really? The only other answer I can imagine is "all our | other projects are in Python 2". And generally that means | you will reuse some code from those projects. What other | reasons have you seen? | 29athrowaway wrote: | Another reason to move to Python 3: Python 2 won't get any more | updates. | | https://www.python.org/doc/sunset-python-2/ | maliker wrote: | We wrapped up our conversion about 1 week after the sunset | date. I think we could have lived with no new features, but | the no-security-updates issue was a big reason why we | upgraded. | mohamedmansour wrote: | There's no excuse not to convert Python 2 to Python 3 slowly, one | commit at a time. We did that in Chromium. Using newer libraries | instead of deprecated, unsupported ones brings a bit better team | morale and enjoyment in what you do. Imagine an intern coming in | and having to use Fortran. | twelfthnight wrote: | This is a nice counterexample. Although it's clear the article | has good intentions, it's promoting unscientific thinking with | an appeal to authority ("yes, they go brrrrrr extremely quickly | much of the time, and senior developers know that!"). I think | there is a point to be made about how we all could benefit from | quelling our outrage at decisions we initially disagree with | before we've heard all the evidence, but that has nothing to do | with seniority. | mgkimsal wrote: | and... if after 3-4 weeks, it was nowhere near completion, or | you'd hit so many snags it was going to take several more | months, I'd hope you'd have the good sense to put a cap on it, | and revisit later.
There's nothing inherently wrong with | trying something like that, especially if there are some tests in | place, a known goal, a reasonable time boundary relative to the | potential benefit, and a willingness to stop if the effort is | growing beyond the benefit. | | Your experience sounds great, but it also sounds like we don't | see that level of middle-ground pragmatism enough - it's either | "no, we have to stay on PHP 4.3.3 because it's what I know" | or "we have to rebuild as cloud microservices to be able to | scale infinitely without being restricted by schedules or | budgets". | maliker wrote: | We mostly stick to a single master with our code, but this | was one case where we branched so we could watch progress via | test passing percentage, to make sure we were moving fast | enough to finish before we got sick of it. | | Definitely have been one or two refactors that have been | shut down. We've been lucky to have clients that give us the | freedom to retire some technical debt/risk instead of just | churning out features. And we're small enough (200 kLoC | approx, 6 devs) that full codebase refactors are still | doable. | hombre_fatal wrote: | I thought part two of their post was just going to be another | example of the classic beginner dev hubris. | | "I'll rewrite it in a week!" | mister_hn wrote: | Not everyone accepts suggestions the way you do. | | I am a senior dev and I always try to push new ideas to my team | lead (a senior dev, and older than me), but all I get is blatant | criticism, because he says "I've tested it and didn't like it". | A clear example: refusing to move to Spring Boot and still | staying on the dead horse Java EE, which has become more complicated | and fragmented than ever since Java 9. | xapata wrote: | To be fair, it's easier to move to 3 now than X years ago. And | there's more benefit. | chrismarlow9 wrote: | I think this is the most forgotten part of this type of | issue. It's not a static issue.
With these types of things, | the problem often gets worse over time, and the answers get | easier. They should be periodically re-evaluated. | | Parts of the problem get worse as time goes on (more code to | convert, more complexity, more test cases, EOL for the | python2 version of some lib, more user data, blah). | | Parts of the solution get easier as time goes on (easier | syntax sugar, better testing frameworks, infrastructure | abstracted more easily, removed dead features, etc.). | | Parts of the desired solution become less relevant as time | goes on ("Why not use golang or node or elixir, php and | python are so dated!")... | | Just because last year was a bad time for the upgrade doesn't | mean today is. Knowing how to get these things done at the | right time by the right people for the business is what | separates a great engineering manager from one that is just | "shipping features and bug fixes". | bogdanu wrote: | The reverse of this also happens: a new team manager joins a team | of 4-5 devs and goes "eww... a monolith, we'll write an MVP in 3 | weeks with microservices, CQRS and all". | | Long story short, a year and a half passes and the MVP is | still not finished, the architect leaves the company, and some | poor guys (from an outsourcing company) are still going at it | with the same architecture. | majormajor wrote: | That's a very "junior senior" type of developer. A rewrite from | scratch needs external reasons to be a good idea, because you | have a _huge_ uncertainty risk of "I don't know enough about | this system to rewrite it, and now I'm spending months | hacking back in edge cases to the new no-longer-beautiful | design." Uncertainty risk doesn't mean never do it, but it | means _make sure you understand it and make sure it's gonna | be worth it._ | cameronfraser wrote: | CQRS and microservices have so much overhead; I'm amazed at | how many companies adopt microservices without having anyone | know a single thing about distributed systems.
I think people | underestimate the glue required to make microservices | useful. I had a similar situation at my last company, where | they spent a year and a half converting their monolith into | half-assed microservices and it still wasn't working properly. | They also stopped releasing new features on their product | during this entire conversion, which obviously led to a drop | in customers. The mass exodus of employees at the end was | something beautiful, though. | Lapsa wrote: | I would argue that the main reason for adoption of | microservices is to actually prolong development time (moar | dollars) and make devops as convoluted as possible to | safeguard data. Or foolishness. | allover wrote: | You're assuming malice over incompetence. | | Having heard mainly "nothing but guff" justifications for | microservices in non-enormous orgs, I think (as usual, | and as some law I forget dictates) the latter is more | likely. | commandlinefan wrote: | On the other other hand, I've worked on systems that stuck | with "the old way" (like ColdFusion) for so long that it was | impossible to even find documentation on the old programming | environment, even if you _could_ find somebody willing to maintain | it. The longer you wait to upgrade, the more it's going to | hurt when you finally do. | bogdanu wrote: | I'm not against replacing old systems, but this has to | happen in incremental steps. | | This way, if an implementation path isn't worth the | effort, you just drop it, and don't spend dozens of months on | a full rewrite only to realize that you are spending a week | to implement a CRUD for a simple entity. It's even worse when | devs are new to the project and have no knowledge of | the domain; I've been burned by this, never again. | | At the end of the day, we are not in the research field; we | are paid to fix business problems, to deliver something | that brings business value.
Yeah, it's nice to write some | really complex software that would make the system a lot | more optimized, but we have to deliver it before our bosses | get tired of excuses and pull the plug. I learned | that the hard way. | Supermancho wrote: | Having been in this situation as well, I find that the | aging out of technology MUST be led by a CTO who is | competent enough to know that tipping point of cost- | benefit. Their job immediately becomes a political choice | (pass the pain to the next guy) when the point has passed, | and it gets worse with every new CTO. Some of these systems are | HUGE (hundreds of millions of lines of ColdFusion, where I | worked). | mkchoi212 wrote: | As a new grad myself, I completely agree with the author's freak- | out when seeing "inefficient"-looking code. But I feel like this | applies to basically all aspects of company life as a new grad | software engineer. For me at least, there seems to be a lot of | inefficient stuff going on in every aspect of the company. Maybe | the author is right that there is a reason behind those | seemingly "inefficient" things. However, sometimes the | inefficiencies are there because people who have been working | there for years have just become used to them. | tyleo wrote: | I see these senior vs. non-senior engineer contrasts pop up a lot. | I'm not a huge fan of them. | | It seems that there is a spectrum of skills an engineer could | excel at: programming, infrastructure, managing, planning, etc. | I've known senior engineers who only excel at a particular skill. | I've also known senior engineers who are moderately good at many | but not particularly good at one. | | In my experience the only difference between a senior and non- | senior is that the senior's distribution of skills makes them | more effective. | | I've also seen senior engineers change roles laterally and become | more or less effective, yet still maintain the senior title.
| ksk wrote: | If you accept the system that promotes an engineer into a | senior position, there is no arguing that a senior engineer | possesses skills that a junior doesn't. The junior is going | to be evaluated in the same system. | ericlippert wrote: | Well then, feel free to write your own blog post on a topic you | enjoy more! | XelNika wrote: | I don't think he dislikes the topic, but rather the way it is | framed as senior vs. junior instead of subject matter expert | vs. non-expert. The skipto example from your post is not | exemplary of the difference between a senior dev and a new | grad; it seems very domain-specific. | wwweston wrote: | As a senior engineer who is proud of certain strengths and | envious of different strengths I see more often in others | than in myself, this seems pretty spot on to me. | Forge36 wrote: | One of my job tasks was improving our custom string library. In | VB! Sure, InStr was slow, but we were trying to do "starts with" | across very long strings. Sure, it worked, but a slightly smarter | solution was 100 times faster. Upon finding it I went on a | search and found it written nearly identically 10 years before me, | in a different version of a string library (hint: that one was | faster). | | The trick to knowing we needed to improve it: profiling! | kokey wrote: | You're telling me that a bigger instance size in AWS isn't | always the solution? | whymauri wrote: | This is the meme referenced by the blog post: | | https://i.redd.it/lmrf72ko0ro41.png | twotwotwo wrote: | Go's strings.Index/bytes.Index also often use the "find first | byte" approach; now on amd64 it's a fancy vector instruction. If | it finds the first byte too many times without a full match, it | falls back to a Rabin-Karp match using a rolling multiplicative | hash. | | The strings.Index code is at | https://golang.org/src/strings/strings.go?s=25956:25988#L101...
| and the internal bytealg package with all the CPU intrinsics is | at https://golang.org/src/internal/bytealg/ with | index_amd64.{go,s} as the relevant files for x64. | | Battle-tested implementations of fundamental things like string | searches, sorts, hashtables, allocators, graphics primitives, | etc. are often interesting 'cause you find out a lot about what | you need (and/or don't need) to work well in practice as well as | protect from exploding worst-case scenarios. | lordleft wrote: | This is a great story. I feel like talented CS students are | particularly prone to this sort of myopic thinking; their | professors introduce them to functional programming, or the | philosophy of UNIX, and they carry that approach everywhere, in a | very blunt way. They also carry this little-examined conviction | that the world consists of mediocrities and mediocre code and | sometimes aren't charitable with the work of others. | gen220 wrote: | Putting it another way, knowledge with the bare minimum | experience required to be effective is very sharp, and that | sharpness is sometimes what's required to cut through old, | retrospectively "wrong" ways of doing things. I don't disagree | with what you've said, I think we all have met plenty of people | fitting your description, I just mean to say there's another | side of the coin. As food for thought, much (not all) of the | open source software we herald as brilliant today was initially | written by people still in or just barely out of grad school. | chiefalchemist wrote: | The New Grad had knowledge. The Senior Dev had Understanding. | | Understanding > Knowledge | | It's that simple. | 9wzYQbTYsAIc wrote: | Being pedantic here, but knowledge is equivalent to | understanding (information being knowledge without | understanding). Wisdom is the word you were looking for | (knowledge being wisdom without experience): | | The New Grad had knowledge. The Senior Dev had Wisdom. | | Wisdom > Knowledge. 
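The "find the first byte quickly, then verify" strategy described in the strings.Index comment above (and in the original post's InStr story) can be sketched in a few lines. This is an illustrative toy, not Go's actual implementation: the fast single-character scan is stood in for by `str.find` (in Go it's a vector instruction; in the VB story, a tight scan loop), and the Rabin-Karp fallback for pathological inputs is omitted:

```python
def index(source, query):
    """Return the index of the first occurrence of query in source, or -1.

    Scanning for one character is much cheaper than comparing the whole
    query at every position, so the full comparison only runs at
    positions where the first character already matched.
    """
    if not query:
        return 0  # every string contains the empty string
    first = query[0]
    i = source.find(first)  # the cheap single-character scan
    while i != -1 and i + len(query) <= len(source):
        if source[i:i + len(query)] == query:
            return i
        i = source.find(first, i + 1)  # next candidate position
    return -1
```

On adversarial inputs like searching for "aaab" in "aaaa...a" this degrades toward the naive quadratic scan, which is exactly why Go adds the Rabin-Karp fallback.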
| chiefalchemist wrote: | I see it differently. Knowledge =/= understanding. There | are plenty of ppl who know a lot but have little | understanding. | | Put another way, knowledge is the nodes. Understanding is | grasping the connections. Understanding is the higher | power. Understanding is where the magic happens. | | Wisdom? Wisdom is next-level understanding. It's the | maturity of developing connections within the connections. | saagarjha wrote: | Not always. | philipov wrote: | More like: The New Grad has Ideals. The Senior Dev has | Deadlines. | | Deadlines > Ideals. | chiefalchemist wrote: | Not if you read the article. It's pretty clear the New | Grad didn't understand the details, the scope of the | problem. He was theoretically correct at 50k ft. In | reality, the details made his theoretical point irrelevant. | jlgaddis wrote: | Not sure where it originated, but I love this saying: | | > _In theory, theory and reality are the same._ | | > _In reality, they're different._ | ptr wrote: | The senior probably had both Knowledge and Understanding, | and decided (correctly) in a split second which solution | was good enough in all applicable dimensions, including the | temporal (is it easy to change this if we have to?). | jakear wrote: | Heh... reminds me of my first proper MS internship, when I too | was responsible for speeding up some code, this time in the VS | Code Go extension. This code was responsible for tokenization, so | it affected pretty much every operation and ran on every edit. | Important shit. | | Day 1: do some basic hoisting. O(n^3) => O(n^2). Tokenization | times for a 10k-line file go from ~15s to 500ms. Sweet. | | Days 2-30 [1]: ideate, develop, bug fix, iterate on, etc., a novel | (to me) data structure to speed up the program even more. O(n^2) | => O(n x log(n)) (expected). Great! Runtime on the 10k-line file went | from 500ms to _maybe_ 300ms. Oooops.
|
 | But hey, all the people working in 500k line files must really
 | love the couple seconds my month of toiling (more importantly, my
 | month of not doing other, more impactful things) saved them.
 |
 | Learned a lot from that experience, and writing this out now I
 | can see how that impacted engineering decisions I make to this
 | day. I suppose that's the _real_ point of an internship, so time
 | well spent, in a way.
 |
 | [1] It probably wasn't actually a month, but certainly a
 | significant chunk of the internship.
 | TeMPOraL wrote:
 | > _But hey, all the people working in 500k line files must
 | really love the couple seconds my month of toiling (more
 | importantly, my month of not doing other, more impactful
 | things) saved them._
 |
 | This stuff matters. These couple seconds per operation may very
 | well be a difference between being able to open a 100k+ LOC
 | file in the same editor you're doing your other work in, vs.
 | giving up in frustration and looking for something else that
 | can handle large files. Large files happen surprisingly often
 | (in particular: log files, data dumps, machine-generated code).
 | A "month of toiling" like this may immediately enable new use
 | cases.
 | jakear wrote:
 | This is true, but my approach was still not a great one. It
 | turns out the data structure was only accessed in a very
 | specific way, which I could have exploited to dramatically
 | simplify the implementation. If I had researched the problem
 | more before diving into my original idea, I could have
 | achieved the same end goal with much less work.
 |
 | Lessons upon lessons :)
 | ericlippert wrote:
 | One of the most important things you can do in perf analysis is
 | to know when to stop looking for incremental improvement.
 |
 | If this subject in particular interests you, we did a lot of
 | work in the C# lexer/parser so that once the file is lexed, it
 | only re-lexes the tokens which changed on every edit.
It also
 | does fun stuff like having the syntax colourizer only run on code
 | that's actually on the screen. Getting every operation that
 | depends on the lex/parse of code in the editor down to running
 | in much less than 30ms so that it would not slow down
 | keystrokes was a huge amount of work.
 | cwzwarich wrote:
 | Does the C# parser implement incremental parsing by just
 | using a recursive descent parser with caching, or does it do
 | parsing with nonterminals as lookahead?
 | ericlippert wrote:
 | It does a full lex and parse. Then if there is an edit, it
 | determines based on the edit which tokens need to be
 | re-lexed; maybe we went from "[10,20]" to "[10.20]" and so now
 | the [ and ] tokens are fine but everything in the middle is
 | changed.
 |
 | So we go from a data structure representing "original text
 | plus edit" to a data structure representing "original lex
 | plus changes". Now we have the information we need to do
 | the same to the parse tree, which has also been stored.
 | Given the set of tokens in the program which changed, and
 | knowledge of where the textual boundaries are of every
 | parse node, we can restrict the re-parse to the affected
 | syntax tree spine. In the example given, we know that we've
 | still got, say, an array index list but the contents of the
 | list need to be re-parsed, so we re-start the parser on the
 | left bracket.
 |
 | The algorithm that does this is called "the blender", and
 | reading that code makes my brain feel like it is in a
 | blender. The code was written by Neal Gafter and based on
 | his PhD thesis in incremental parser theory.
 |
 | The source code is available on github; do a search for
 | "roslyn" and you'll find it.
 | ThePhysicist wrote:
 | Wow that still sounds really long for simply tokenizing a file?
 | I worked on parsers a while ago and for reference I benchmarked
 | e.g. the Python parser at 300,000 LOC/second (for
 | tokenization and building an AST) on my machine (an i7 laptop).
| Also tokenization complexity should not increase quadratically
 | with the length of the file?
 |
 | You probably know what you're doing, just curious why these
 | numbers seem to be off so much from what I would expect. What
 | approach did you use for tokenization if I may ask?
 | jakear wrote:
 | I don't recall the exact numbers to be honest. I know the
 | original was in many seconds, and in the end it was sub 1.
 |
 | As mentioned in another comment:
 |
 | The go extension is a thin wrapper around standard go
 | tooling, we weren't tokenizing ourselves, just converting
 | between their tokens and ones we could process; a large part
 | of that was converting from byte offsets to UTF-8 character
 | offsets.
 |
 | The quadratic behavior was a bug caused by reconverting
 | segments over and over again instead of converting deltas
 | between previously converted subsegments.
 | tmpz22 wrote:
 | I just want to thank you personally for speeding up the VS Code
 | Go extension because I use it every day! Thank you!
 | [deleted]
 | cwzwarich wrote:
 | I'm curious: why can't you do tokenization in linear time? And
 | why not use incremental tokenization for an editor?
 | jakear wrote:
 | I think you're right, the end result was actually linear.
 |
 | And the go extension is a thin wrapper around standard go
 | tooling, we weren't tokenizing ourselves, just converting
 | between their tokens and ones we could process; a large part
 | of that was converting from byte offsets to UTF-8 character
 | offsets.
 | certera wrote:
 | With the world of developers becoming "senior" at faster and
 | faster rates, we're asymptomatically approaching the demise of
 | these questions, no?
 | arendtio wrote:
 | I think it is good when new grads have the knowledge to see that
 | those things don't look right and can articulate how and why it
 | should look different.
|
 | After all, when you come across a problem that does not contain
 | just 100 chars, it is very helpful to know what you can use to
 | create something that still works with reasonable resources.
 | Jabbles wrote:
 | Of course, if this code were running in a server, or if it were
 | part of a library that was widely used, then suddenly you have
 | the potential for a DoS vulnerability.
 |
 | The narrative would then be that the microseconds you saved all
 | those devs over the years were wiped out when hackers took down
 | your system.
 | sabujp wrote:
 | tldr; needle in a haystack, rolling polyhash
 | Ididntdothis wrote:
 | I always envy people who work on this level instead of cobbling
 | systems together that integrate several systems all with their
 | own set of flaws and you can be happy if you can make them work
 | together somehow. The algorithm stuff seems pretty simple in
 | comparison. A very local problem that can be profiled well and
 | you can understand most of the factors at play.
 | TeMPOraL wrote:
 | The "cobbling systems together" stuff is code bureaucracy
 | rather than programming. You're no longer dealing with
 | constraints of physics, mathematics and sound system design -
 | you're building a bit-pushing layer in between Kafkaesque
 | monstrosities that were never truly intended to work together.
 |
 | Sadly, most programming jobs seem to primarily involve code
 | bureaucracy rather than the "algorithmic" level.
 | GordonS wrote:
 | I got the chance at work recently to work solo on a small,
 | greenfield project, where I was fully in control of all the
 | pieces and the problem space was small. I could do it any which
 | way I wanted.
 |
 | Gods, it was wonderful!
 |
 | As I've moved up the ladder and worked on complex enterprise
 | systems, with umpteen integrations, overly-strict SAST systems,
 | enforced 90% test coverage and the like, I seldom feel the "joy
 | of code". It was good to feel it again!
 | Ididntdothis wrote:
 | For a while I worked in video decoding/encoding.
That was
 | true engineering. I loved reading up about cache and assembly
 | and then applying this knowledge to our code. It was ok to
 | work on one function for weeks just to get a 20% speedup. Now
 | I do enterprise and it's just horrible. Between dealing with
 | stakeholders and 3rd party systems that barely work there is
 | no room for systematic engineering.
 | burntoutfire wrote:
 | May I ask why you moved from codecs into enterprise
 | software?
 | Ididntdothis wrote:
 | I was never an expert in video and during 2008 my company
 | went down. The only position I found after a while was
 | doing C#/.NET. Outside Silicon Valley there are not too
 | many hardcore dev roles anyway and in addition I feel a
 | lot of that work is done in Asia now.
 | varjag wrote:
 | Applying algorithms outside classroom-assignment style is rarely
 | trivial. It takes some insight into the nature of the problem to spot
 | the opportunity _and_ good familiarity with the applicable solution
 | in the first place.
 | [deleted]
 | jupp0r wrote:
 | The real lesson here should be: don't make assumptions about
 | performance. Take real world use cases and measure!
 |
 | That's what my professors drilled into me (I specialized in high
 | performance computing) and it's served me well.
 | mettamage wrote:
 | > NO! YOU CAN'T JUST USE BRUTE FORCE HERE! WE NEED TO USE SEGMENT
 | TREES TO GET UPDATE TIME COMPLEXITY DOWN TO O(LOG N)! BREAK THE
 | DATA INTO CHUNKS AT LEAST! OH THE INEFFICIENCY!!!
 |
 | I'm the opposite of this stereotype, and I think there are more
 | like me. Two reasons as to why:
 |
 | (1) Psychological:
 |
 | I never had this. As a junior dev, I don't like to optimize
 | because I feel a bit of pain when I need to moderately focus. I
 | can do it, and I've done it quite a lot. In that sense, I've
 | experienced quite a bit of pain in my life. It's something I've
 | learned to live with. And when I have to focus, I prefer to focus
 | all the way, because whether I focus on 50% or 100%, the pain is
 | roughly similar.
This leaves me in a state of either being
 | lazy(ish) and wanting to program in a simple and understandable
 | way, or being willing to go as deep into a topic as I would
 | need to and focus on all the details step by step.
 |
 | When I'm intense, I also am still sympathetic towards simple code
 | because I know that I understand it in both states. I only
 | understand complicated code when I'm focused.
 |
 | (2) There are enough CS grads that know better:
 |
 | Also, on another note: efficiency analysis is simply not taught
 | at university. Parts of it are taught, but they're never
 | connected to real world cases.
 |
 | For efficiency analysis (note I haven't done any but I've read a
 | thing or two and talk with a friend who is a performance engineer
 | quite regularly about it) I think there need to be a few
 | perspectives in check:
 |
 | 1. What is the user's actual behavior?
 |
 | 2. What is the behavior of malicious attackers (if applicable;
 | Intel should do this more, and to my knowledge they are doing it more
 | now)?
 |
 | 3. How does the compiler translate it to assembly?
 |
 | 4. How does it theoretically look? Specifically: worst-case,
 | best-case, average-case in both the theoretical and empirical sense.
 |
 | Concluding:
 |
 | I only have one year of experience in software development, where
 | no one cared about performance, yet I know better than to look
 | only at the theoretical speed. So I guess I'm a counterexample
 | to the stereotype. I'm not alone. Any CS student who reads HN has
 | a high chance of stumbling upon articles like this and will know
 | that performance analysis is not only about calculating the
 | space-time complexity.
 | ericlippert wrote:
 | The "malice" aspect is a great one and I did not go into that
 | in this post because of course back in the 1990s we were not at
 | all concerned that someone would _maliciously_ craft inputs
 | that would slow down this algorithm.
|
 | In modern code we'd want to do a threat model that considered
 | the consequences of untrusted inputs.
 | lonelappde wrote:
 | In the 1990s Microsoft certainly didn't think about malicious
 | inputs to its software, and the world suffered.
 | mettamage wrote:
 | Neither did Intel :P
 |
 | Come to think of it, neither do beginning game-designers.
 | They think that players will play their game as intended.
 |
 | I like that you're dwelling on the malice part, as it is
 | seemingly becoming more important every day.
 | dunkelheit wrote:
 | I'd say that for widely used library code it makes sense to
 | implement an efficient algorithm. By its very nature the context
 | in which a library routine will be called is unknown and so it
 | must be prepared for everything. Also for widely used code the
 | amount of total saved machine time can be quite substantial.
 |
 | In the story the senior dev makes a judgment call (that this
 | routine will be used in LOB applications where the worst case is
 | unlikely to appear so the implementation is OK) which is probably
 | correct, especially considering other priorities. And of course
 | senior devs are much better equipped to make these kinds of calls
 | than juniors, but they still can and will guess wrong.
 |
 | > Moreover, Tim explained to me, any solution that involves
 | allocating a table, preprocessing strings, and so on, is going to
 | take longer to do all that stuff than the
 | blazingly-fast-99.9999%-of-the-time brute force algorithm takes to just
 | give you the answer.
 |
 | That's why a good implementation will dispatch to the best
 | algorithm at runtime!
 | 04091948 wrote:
 | Imagine that the intern didn't dare to ask such questions.
 |
 | a - he could've gone out thinking that performance doesn't
 | matter. But it certainly does in a piece of code being used daily
 | by thousands of devs.
 |
 | b - he could've thought that the simple implementation is faster
 | but missed the fact that skip is implemented in assembly.
|
 | c - he could've realized both but missed the why.
 |
 | And these failure scenarios are likely to happen because this is
 | an intern we're speaking about.
 |
 | One or two tricks up your sleeve do not matter, but repeat this a
 | hundred times and which one do you think would be the better programmer?
 |
 | I think the willingness to challenge authority figures, to be
 | (often) proven wrong, and to learn from it is an essential part of
 | becoming better.
 |
 | Maybe Tim was understanding because he himself challenged people
 | older than him and in the process learned from them.
 |
 | Advice like "don't reinvent the wheel", "avoid premature
 | optimization", "write boring code" promotes exploitation, which is
 | the optimal strategy for older agents.
 |
 | But for newer agents, a higher level of exploration is needed;
 | otherwise they would converge on a suboptimal strategy.
 | saagarjha wrote:
 | > he could've thought that the simple implementation is faster
 | but missed the fact that skip is implemented in assembly
 |
 | It's not; the compiler is just fairly decent at transforming
 | string manipulation routines.
 | 04091948 wrote:
 | >As Tim explained to me, first off, the reason why VB does
 | this seemingly bizarre "find the first character match, then
 | check if query is a prefix of source" logic is because the
 | skipto method is not written in the naive fashion that I
 | showed here. The skipto method is a single x86 machine
 | instruction.
 |
 | oh my bad then, from this text it sounded like it was written
 | in assembly.
 | m3kw9 wrote:
 | The senior dev knows the system, knows what they can get away
 | with, and can choose when to use that knowledge. A newbie
 | will always follow the programming rules, like never going over
 | the speed limit on a highway, as they can't take many calculated risks
 | downerending wrote:
 | A true master usually goes way _under_ the limit.
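The "find the first character match, then check if the query is a prefix" shape discussed above can be sketched in C++. This is an editorial sketch, not the original VB or x86 code: `find_brute` is a made-up name, and `memchr` merely stands in for the single-instruction `skipto` primitive (libc typically implements `memchr` with wide loads or vector instructions, which is the same spirit).

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// Brute force substring search with a fast first-character skip.
// Returns the index of the first occurrence of needle, or -1.
int find_brute(std::string_view haystack, std::string_view needle) {
    if (needle.empty()) return 0;
    size_t pos = 0;
    while (pos + needle.size() <= haystack.size()) {
        // "skipto": jump to the next occurrence of the first needle char.
        const void* hit = std::memchr(
            haystack.data() + pos, needle[0],
            haystack.size() - needle.size() - pos + 1);
        if (!hit) return -1;
        pos = static_cast<const char*>(hit) - haystack.data();
        // Then a plain prefix comparison -- brute force, but almost
        // always cheap in practice.
        if (haystack.compare(pos, needle.size(), needle) == 0)
            return static_cast<int>(pos);
        ++pos;  // false alarm; resume the skip one past the match
    }
    return -1;
}
```

On typical line-of-business inputs the prefix comparison runs only a handful of times, which is why this "naive" shape is hard to beat for short needles.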
| m3kw9 wrote:
 | Senior has a lot more types of ammo to use to solve problems,
 | let's just say
 | saagarjha wrote:
 | > The skipto method is a single x86 machine instruction.
 |
 | That's not always a good thing, especially on modern hardware.
 | And obviously, the "single instruction" doesn't mean it'll take
 | bounded time to execute...
 | vardump wrote:
 | Very much true.
 |
 | REP SCASB on a 1994 (same year as the incident in the article)
 | Intel Pentium would have taken 9 + 4 * n clock cycles [0][1] to
 | go through a string. So scanning a 4-char string takes 25
 | clock cycles.
 |
 | [0]: Agner's instruction tables page 123:
 | https://www.agner.org/optimize/instruction_tables.pdf
 |
 | [1]: _Plus_ one extra cycle to decode the REP prefix, if the previous
 | instruction took 1 cycle.
 | thedance wrote:
 | It could be faster than some alternatives under some
 | circumstances. A loop would probably be more code (== icache
 | pressure) and would consume at least a register and a BTB
 | entry.
 | vardump wrote:
 | It's certainly been a while since I last optimized for the
 | original Pentium architecture. Still faintly remember U & V
 | pipes, unexplained causes for stalls, etc.
 |
 | Even nowadays, it would likely depend on the particular
 | algorithm and data set. I'd be surprised if you can't do
 | better than 4 cycles per char for sufficiently long
 | strings. Most likely for short strings, REP SCASB wins due
 | to setup costs. (Actually that article's skipto method
 | would have of course used REP CMPSB, but that's just
 | splitting hairs.)
 |
 | Remember that even the original Pentium could execute up to
 | _two instructions per clock_. Unless you messed up with
 | those damn U & V pipes. :-)
 |
 | The hypothetical faster-than-rep solution would need to
 | process data in 32-bit chunks, faux vector style.
 | saagarjha wrote:
 | > The hypothetical faster-than-rep solution would need to
 | process data in 32-bit chunks, faux vector style.
 |
 | Or with real vector style with vectorized instructions?
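The "process data in 32-bit chunks, faux vector style" idea mentioned above can be sketched with the classic SWAR zero-byte trick. This is an illustration, not anyone's production code: `find_byte_swar` is a hypothetical name, and alignment and tail handling are deliberately simplified.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Scan for byte c four chars at a time instead of one at a time
// (or REP SCASB). Returns the index of the first match, or -1.
int find_byte_swar(const char* s, size_t n, unsigned char c) {
    const uint32_t broadcast = 0x01010101u * c;  // c in all four bytes
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t word;
        std::memcpy(&word, s + i, 4);   // unaligned-safe 32-bit load
        uint32_t x = word ^ broadcast;  // matching bytes become 0x00
        // Classic "has a zero byte" test: sets the high bit of any
        // byte of x that is zero.
        if ((x - 0x01010101u) & ~x & 0x80808080u) break;
    }
    // Byte-at-a-time pass to pin down the exact index (and the tail).
    for (; i < n; ++i)
        if (static_cast<unsigned char>(s[i]) == c)
            return static_cast<int>(i);
    return -1;
}
```

Real vectorized implementations (SSE2 and later) do the same thing 16 or more bytes at a time; the word-sized version is what was available on a 1994 Pentium.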
| vardump wrote: | The original 1994 Intel Pentium did not have any | vectorized instructions. | asveikau wrote: | Tangentially to your point, here's something I haven't thought | about much: when these instructions get an interrupt, I imagine | they've updated (r|e)si and (r|e)di, (r|e)cx etc. to reflect | where they are in their copy or scan loop. So if you get a page | fault in the middle, then the kernel does enormous amounts of | work in response to it, then resumes that single instruction, | it resumes in the middle of the loop, not the start. | | So to reiterate one aspect of your point, there might be lots | of work, written in C, that occurs in response to the page | fault in the middle of your "hardware-backed" single | instruction. On top of all the other complexities of cache vs | memory access etc. that make scanning and copying memory | complicated no matter which way you do it. | | But probably in the heyday of Visual Basic, and especially the | DOS-based BASICs that preceded it which wouldn't have had | virtual memory at all, all of this is less of a concern. The | story takes place in a simpler time which serves as a plot | device to better illustrate the point. | vardump wrote: | Pretty pointless as said page fault would occur no matter how | you access the string. Besides, the page in question would | very likely be already present. | mlazos wrote: | This reminds me of how insertion sort is the most efficient | sorting algorithm for small values of n. I remember new-grad me | always dismissing insertion sort in every situation because of | asymptotic complexity. Engineers are supposed to find the most | pragmatic solution without biases. | nostrademons wrote: | This is also why C++ (and hopefully other languages by now) has | the short string optimization. On a 64-bit machine, a pointer to | a heap-allocated buffer is 8 bytes long. 
A stupidly large number
 | of the strings used by actual programs are <= 7 bytes long (it's
 | actually more like 20 in C++, because strings also include a
 | length and capacity). Think about it: that's virtually every file
 | extension, path separator, and field delimiter, and a good number
 | of field names, labels, first & last names, usernames,
 | identifiers, URL path components, parameter names, etc. If you
 | can fit the whole string into an immediate value, then you can
 | avoid the heap allocation and deallocation entirely, you can
 | avoid chasing pointers and incurring a cache miss, and you can
 | very significantly reduce the total amount of memory used.
 |
 | This is also a good reason why, if you're designing a standard
 | library, you really want to have length-prefixed strings instead
 | of null-terminated ones. If you know the length of the strings
 | you're dealing with, you can swap out the algorithm you use, such
 | that you might use brute force for a very small needle, word
 | comparisons if it's exactly 4 bytes long, SSE instructions if
 | it's a word-length multiple, or Boyer-Moore for very long
 | strings.
 | peteri wrote:
 | It isn't always true that simplistic algorithms don't cause
 | issues: Bruce Dawson keeps finding O(n^2) algorithms in Windows
 | that keep biting him. I'm always impressed with his work.
 |
 | https://randomascii.wordpress.com/2019/12/08/on2-again-now-i...
 | Lammy wrote:
 | The true mark of a Senior dev would be commenting the code in
 | question with the thought processes and decisions that led to
 | the chosen approach.
 | joejerryronnie wrote:
 | Sometimes it's not just coding tools/techniques where new grads
 | can benefit from a bit of oldster wisdom, like that one time I
 | convinced a new grad it was wiser to slog through the
 | bureaucratic approval process for a new api than attempt to hack
 | into our company's Oracle Financials db.
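The small-string optimization described above can be illustrated with a toy class. This is a sketch only: real std::string implementations differ in layout and squeeze more capacity out of the same bytes, and both the `SsoString` name and the 15-char inline capacity here are illustrative choices.

```cpp
#include <cassert>
#include <cstring>

// Toy small-string optimization: short strings live inline in the
// object itself; only long strings pay for a heap allocation.
class SsoString {
    static const size_t kInlineCap = 15;  // illustrative capacity
    size_t len_;
    bool on_heap_;
    union {
        char inline_buf_[kInlineCap + 1];  // short strings: no allocation
        char* heap_buf_;                   // long strings: heap pointer
    };
public:
    explicit SsoString(const char* s) : len_(std::strlen(s)) {
        on_heap_ = len_ > kInlineCap;
        char* dst = on_heap_ ? (heap_buf_ = new char[len_ + 1])
                             : inline_buf_;
        std::memcpy(dst, s, len_ + 1);  // copy including the terminator
    }
    ~SsoString() { if (on_heap_) delete[] heap_buf_; }
    SsoString(const SsoString&) = delete;             // keep the sketch small
    SsoString& operator=(const SsoString&) = delete;
    bool uses_heap() const { return on_heap_; }
    const char* c_str() const { return on_heap_ ? heap_buf_ : inline_buf_; }
    size_t size() const { return len_; }
};
```

A file extension like ".txt" never touches the allocator here, which is exactly the class of strings the comment above is counting.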
| bitwize wrote:
 | New Grad: I'd use a linked list here because insertion into the
 | middle is O(1) rather than O(n).
 |
 | Senior Dev: Linked lists have very many more cache misses than do
 | vectors, and the difference between hitting cache and hitting
 | main memory is such a huge constant factor that for most
 | reasonable list sizes it never makes sense to use a linked list.
 | Use a vector. Checkmate, smug Lisp weenies.
 | erik_seaberg wrote:
 | Yeah, the Lisp world figured out cdr-coding in the 1970s and
 | had moved on to chains of vectors by 1985 according to
 | https://cpsc.yale.edu/sites/default/files/files/tr362.pdf.
 | Fortunately they avoided changing the API to make this
 | tradeoff.
 | xenocyon wrote:
 | Admittedly clueless question: what does 'brrr' mean? (I'm not a
 | software dev and idioms that may be obvious to others are
 | unfamiliar to me.)
 | lonelappde wrote:
 | It's a meme, not a software thing. It's a metaphor for the sound
 | of a machine working at high speed.
 | sergiotapia wrote:
 | It originated from a meme of the Federal Reserve printing more
 | monopoly money. "Printing machine goes brrrr". As they turn the
 | crank ever faster making the printing machines smoke.
 | gruez wrote:
 | More info: https://knowyourmeme.com/memes/money-printer-go-brrr
 | T-hawk wrote:
 | It's the whirring sound of a motor. The for loop is
 | metaphorically spinning to loop over everything instead of
 | doing something smarter to find what it needs. The programmer
 | thinks the whirring sound means it's doing a lot of work for
 | him.
 |
 | It's an interesting dichotomy because the most practical
 | solution could go either way. Maybe it's looping over a table
 | of 20 customers and the wasted time is microseconds that
 | wouldn't even justify ten minutes of developer thought, or
 | maybe it's looping over a million customers and causing all
 | sorts of capacity problems.
Maybe the memetic "senior dev" here
 | knows the former is true, or maybe his "senior" experience was
 | all misplaced and he's clueless.
 | [deleted]
 | rovr138 wrote:
 | _brrrrrr_
 |
 | Goes a fast spinning motor
 | wmu wrote:
 | Well, this is a nice example of lack of knowledge flow. The
 | senior developer spent his time explaining things that should
 | have been put in the procedure's comment. Yes, that's what
 | comments are for.
 | malisper wrote:
 | I understand this phenomenon as the new grad and the senior
 | developer optimizing for different things. The new grad is
 | focused solely on the asymptotic complexity of the code. It
 | doesn't matter how slow or how complicated it is in practice,
 | they are solely focused on using the fastest data structure
 | asymptotically.
 |
 | The senior developer optimizes a different set of criteria:
 | 1) How hard is it to understand the code and make sure it's
 | correct? 2) How fast is the algorithm in practice?
 |
 | There are several different reasons why the performance of the
 | algorithm in practice is different from the performance in
 | theory. The most obvious reason is big-O notation does not
 | capture lots of details that matter in practice. An L1 cache read
 | and a disk IOP are both treated the same in theory.
 |
 | A second reason is the implementation of a complex algorithm is
 | more likely to be incorrect. In some cases this leads to bugs
 | which you can find with good testing. In other cases, it leads to
 | a performance degradation that you'll only find if you run a
 | profiler.
 |
 | I once saw a case where a function for finding the right
 | shard for a given id was too slow. The code needed to find, from a
 | list of id ranges, which one a given id fell into. The
 | implementation would sort the id ranges once ahead of time and
 | then run a binary search of the ranges to find the right shard
 | for the id.
One engineer took a look at this, realized that we
 | were doing the shard lookups sequentially, and decided to perform
 | the shard lookups in parallel. This made the code faster, but we
 | still would have needed to double the size of our servers in
 | order to provide enough additional CPU to make the code fast
 | enough.
 |
 | Another engineer hooked the code up to a profiler and made a
 | surprising discovery. It turns out the implementation of the
 | function was subtly incorrect and it was sorting the id ranges
 | _on every call_. This happened because the code sorted the id
 | ranges inside of a Scala mapValues function. It turns out that
 | mapValues does not actually map a function over the values of a
 | hash table. It instead returns an object that, when you look up a
 | key, will look up the value in the original hash table, then
 | apply the function[0]. This results in the function being called
 | on every read.
 |
 | The solution was to replace mapValues with map. This dramatically
 | improved the performance of the system and basically brought the
 | CPU usage of the system down to zero. Notably, it would have been
 | impossible to discover this issue without either knowing the
 | difference between map and mapValues, or by using a profiler.
 |
 | [0] https://blog.bruchez.name/2013/02/mapmap-vs-mapmapvalues.htm...
 | jeromebaek wrote:
 | Sure. But most senior devs are not Tim Paterson.
 | toolz wrote:
 | But it's not uncommon for jr. devs to believe every piece of
 | code deserves the most efficient runtime. Runtime speed causing
 | projects to fail is very uncommon. What does add an incredible
 | amount of work time is combing through a codebase looking for
 | micro optimizations. I've never once seen a jr. dev who claimed
 | to care about efficiency start by writing benchmarks over large
 | parts of the system and using that to find real bottlenecks.
 | No, they always comb through the code trying to impress the sr.
 | dev. with their algorithmic knowledge.
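The mapValues pitfall in the story above is Scala-specific, but the same trap exists in any language with lazy "mapped view" collections. A C++ sketch (hypothetical types and names; `expensive` stands in for the sorting work) makes the repeated calls visible with a counter:

```cpp
#include <cassert>
#include <map>

static int call_count = 0;

// Stand-in for the costly per-value work (the sort in the story).
int expensive(int v) { ++call_count; return v * 10; }

// Eager version: apply the function once per key, up front.
std::map<int, int> eager_map_values(const std::map<int, int>& m) {
    std::map<int, int> out;
    for (const auto& kv : m) out[kv.first] = expensive(kv.second);
    return out;
}

// Lazy "view", the analog of Scala 2.12's mapValues: the function
// is re-applied on EVERY lookup.
struct LazyMappedMap {
    const std::map<int, int>& base;
    int at(int key) const { return expensive(base.at(key)); }
};
```

With the lazy view, three reads of the same key cost three calls to `expensive`; the eager map pays once per key and then reads are free, which is exactly the map-vs-mapValues difference the profiler exposed.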
| lonelappde wrote:
 | This statement works equally well when you swap "jr" and "sr",
 | so just leave it out
 | TeMPOraL wrote:
 | > _Runtime speed causing projects to fail is very uncommon._
 |
 | That's true. What is common, however, is bad runtime
 | performance losing you users and bleeding your money. Not
 | doing dumb things (like using a list where a vector would
 | do), and taking a moment every now and then to go over your
 | product with a profiler and fix the biggest bottlenecks
 | early, can save you a ton of money in cloud bills (it might
 | even turn out that your product actually doesn't need to
 | horizontally scale _at all_, giving you further
 | reduction-of-complexity benefits). Or you might end up delivering
 | features that were impossible to do with bad performance (see
 | e.g. https://news.ycombinator.com/item?id=22712103).
 | reader_1000 wrote:
 | In real life, you will always start with a simple working
 | implementation and go with it. Then if things are slow, you
 | profile your code with a good profiler while running some
 | kind of real life scenario and spot the slow parts (also keep in
 | mind that profiling may affect the program's behaviour). After
 | that you may want to consider alternatives with less asymptotic
 | complexity iff that's the part causing slowness.
 |
 | Once I was asked to look at one project to see if there was
 | any room for improvement to speed up the program. After I
 | profiled the program with the test data, I saw that the program was
 | severely affected by a "size" method call on a lock-free
 | concurrent list. Since the data structure is lock-free, the size
 | method is not a constant-time operation and calling it on a large
 | list takes too much time. It was just there to print some kind of
 | statistics; I changed the code so that it is called only when
 | necessary, not every time some operation occurs. This immediately
 | made the program 2-3 times faster.
|
 | There were also some parts I changed, swapping in algorithms with
 | lower algorithmic complexity to make them faster. Overall, I made
 | the program 6x faster. So sometimes you need to use fancy
 | algorithms, sometimes you just need to change one line of code
 | after profiling.
 | Koshkin wrote:
 | There is also such a thing as "design for performance," and
 | sometimes - just sometimes - any amount of effort spent later
 | on optimization would end up going nowhere, because what you
 | are trying to optimize is in fact a huge steaming POS.
 | yashap wrote:
 | Although I mostly agree, there are certain types of simple,
 | fundamental performance considerations you want to take into
 | account when writing code the first time, otherwise you're just
 | needlessly dealing with tons of performance issues in prod. Like
 | make sure your DB queries are well covered by indexes; if you're
 | repeatedly looking up items from a list, turn it into a map or
 | set first; in FE code try to keep re-renders to areas of the UI
 | that need updating, etc.
 |
 | Correctness and simplicity are almost always things you want to
 | focus on over performance, but there's still SOME level of
 | simple performance best practices that you want to stick to up
 | front. Similar to the tradeoff between YAGNI and software
 | architecture - if you don't start with some sort of coherent
 | architecture, your project is gonna descend into a spaghetti
 | mess. Ignore simple performance best practices and you'll spend
 | way too much time fighting performance issues.
 | andybak wrote:
 | Aside - I wish the examples were written in pseudocode without C
 | pointer and type clutter.
 | ericlippert wrote:
 | Noted. ___________________________________________________________________ (page generated 2020-03-28 23:00 UTC)