[HN Gopher] Tracking developer build times to decide if the M3 M... ___________________________________________________________________ Tracking developer build times to decide if the M3 MacBook is worth upgrading Author : paprikati Score : 165 points Date : 2023-12-28 19:41 UTC (1 day ago) (HTM) web link (incident.io) (TXT) w3m dump (incident.io) | lawrjone wrote: | Author here, thanks for posting! | | Lots of stuff in this: profiling Go compilations, building a | hot-reloader, using AI to analyse the build dataset, etc. | | We concluded that it was worth upgrading the M1s to an M3 Pro | (the Max didn't make much of a difference in our tests) but the | M2s are pretty close to the M3s, so not (for us) worth upgrading. | | Happy to answer any questions if people have them. | BlueToth wrote: | Hi, thanks for the interesting comparison. What I would like to | see added would be a build on an 8GB memory machine (if you have | one available). | aranke wrote: | Hi, | | Thanks for the detailed analysis. I'm wondering if you factored | in the cost of engineering time invested in this analysis, and | how that affects the payback time (if at all). | | Thanks! | LanzVonL wrote: | We've found that distributed building has pretty much eliminated | the need to upgrade developer workstations. Super easy to set | up, too. | packetlost wrote: | Distributed building of _what_? Because for every language the | answer of whether it's easy or not is probably different. | LanzVonL wrote: | We don't use new-fangled meme languages so everything is very | well supported. | lawrjone wrote: | I'm not sure this would work well for our use case. | | The distributed build systems only really benefit from | aggressively caching the modules that are built, right?
But the | majority of the builds we do are almost fully cached, having | changed just one module that needs recompiling before the linker | sticks everything back together, which the machines would then | need to download from the distributed builder, and at 300MB a | binary that's gonna take a while. | | I may have this totally wrong though. Would distributed builds | actually get a new binary to the local machine faster? | | I suspect we wouldn't want this anyway (lots of our company | work on the go, train WiFi wouldn't cut it for this!) but | interested nonetheless. | dist-epoch wrote: | > The distributed build systems only really benefit from | aggressively caching the modules that are built, right | | Not really, you have more cores to build on. Significant | difference for slow-to-compile languages like C++. | | > I may have this totally wrong though. Would distributed | builds actually get us a new binary faster to the local | machine? | | Yes, again, for C++. | closeparen wrote: | A MacBook-equivalent AWS instance prices out to at least the cost | of a MacBook per year. | lawrjone wrote: | Yes, I actually did the maths on this. | | If you want a GCP instance that is comparable to an M3 Pro | 36GB, you're looking at an n2-standard-8 with a 1TB SSD, | which comes out at $400/month. | | Assuming you have it running just 8 hours a day (if your | developers clock in at exact times), you can 1/3 that to | make it $133/month, or $1600/year. | | We expect these MacBooks to have at least a 2 year life, | which means you're comparing the cost of the MacBook to 2 | years of running the VM for 8 hours a day: that's $2800 | vs $3200, so the MacBook still comes in $400 cheaper over | its lifetime. | | And the kicker is you still need to buy people laptops so | they can connect to the build machine, and you can no longer | work if you have a bad internet connection. So for us the | trade-off doesn't work whichever way you cut it. | throwaway892238 wrote: | 1.
With a savings plan or on-demand? 2. Keeping one | instance on per developer indefinitely, or only when | needed? 3. Shared nodes? Node pools? 4. | Compared to what other instance types/sizes? 5. Spot | pricing? | | Shared nodes brought up on-demand with a savings plan and | spot pricing is the same cost if not cheaper than dedicated | high-end laptops. And on top of that, they can actually | scale their resources much higher than a laptop can, and do | distributed compute/test/etc, and match production. And | with a remote dev environment, you can easily fix issues | with onboarding where different people end up with | different setups, miss steps, need their tooling re-installed | or to match versions, etc. | lawrjone wrote: | 1. That was assuming 8 hours of regular usage a day that | has GCP's sustained use discounts applied, though not the | committed usage discounts you can negotiate (but this is | hard if you don't want 24/7 usage). | | 2. The issue with only-when-needed is that the cold-start time | starts hurting you in ways we're trying to pay to avoid | (we want <30s feedback loops if possible), as would | putting several developers on the same machine. | | 3. Shared as in cloud multi-tenant? Sure, we wouldn't be | buying the exclusive rack for this. | | 4. n2-standard-8 felt comparable. | | 5. Not considered. | | If it's interesting, we run a build machine for when | developers push their code into a PR and we build a | binary/container as a deployable artifact. We have one | machine running a c3-highcpu-22, which is 22 CPUs and 44GB | memory. | | Even at the lower frequency of pushes to master, the build | latency spikes a lot on this machine when developers push | separate builds simultaneously, so I'd expect we'd need a | fair bit more capacity in a distributed build system to | make the local builds (probably 5-10x as frequent) behave | nicely. | mgaunard wrote: | Anything cloud is 3 to 10 times the price of just buying | equivalent hardware.
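The back-of-envelope VM-vs-laptop maths above works out like this (a quick sketch; the $400/month, 8 h/day, $2,800 laptop price, and 2-year lifetime figures are the ones quoted in the comment, not independently verified):

```python
# Rough cloud-VM vs. MacBook cost comparison, using the figures quoted
# in the thread (n2-standard-8 + 1TB SSD at ~$400/month running 24/7,
# M3 Pro 36GB MacBook at ~$2,800, assumed 2-year laptop life).
VM_FULL_MONTHLY = 400   # USD/month if the VM runs around the clock
HOURS_PER_DAY = 8       # developers only need it during work hours
LAPTOP_PRICE = 2800     # USD
LIFETIME_YEARS = 2      # assumed useful life of the laptop

vm_monthly = VM_FULL_MONTHLY * HOURS_PER_DAY / 24    # ~= $133/month
vm_lifetime = vm_monthly * 12 * LIFETIME_YEARS       # ~= $3,200 over 2 years

print(f"VM over {LIFETIME_YEARS} years: ${vm_lifetime:,.0f}")
print(f"Laptop: ${LAPTOP_PRICE:,}")
# The VM comes out ~$400 more expensive, before counting the laptop
# you still need to buy so people can connect to it.
```
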
| boricj wrote: | At one of my former jobs, some members of our dev team (myself | included) had manager-spec laptops. They were just good enough | to develop and run the product on, but fairly anemic overall. | | While I had no power over changing the laptops, I was co-administrator | of the dev datacenter located 20 meters away and | we had our own budget for it. Long story short, that dev | datacenter soon had a new, very beefy server dedicated to CI | jobs "and extras". | | One of said extras was providing Docker containers to the team | for running the product during development, which also happened | to be perfectly suitable for remote development. | vessenes wrote: | The upshot from running local LLMs on my Macs: the M3 Pro is | slightly better than the M2 and significantly better than the M1 | Pro; currently M3 memory bandwidth options are lower | than for M2, and that may be hampering the total performance. | | Performance per watt and rendering performance are both better in | the M3, but I ultimately decided to wait for an M3 Ultra with | more memory bandwidth before upgrading my daily driver M1 Max. | lawrjone wrote: | This is pretty much aligned with our findings (am the author of | this post). | | I came away feeling that: | | - M1 is a solid baseline | | - M2 improves performance by about 60% | | - M3 Pro is marginal over | the M2, more like 10% | | - M3 Max (for our use case) didn't seem that much different from | the M3 Pro, though we had less data on this than other models | | I suspect Apple saw the M3 Pro as "maintain performance and | improve efficiency" which is consistent with the reduction in | P-cores from the M2. | | The bit I'm interested in is that you say the M3 Pro is only | a bit better than the M2 at LLM work, as I'd assumed there were | improvements in the AI processing hardware between the M2 and | M3. Not that we tested that, but I would've guessed it. | vessenes wrote: | Yeah, agreed.
I'll say I do use the M3 Max for Baldur's Gate | :). | | On LLMs, the issue is largely the memory bandwidth: M2 Ultra | is 800GB/s, M3 Max is 400GB/s. Inference on larger models is | simple math on what's in memory, so the performance is | roughly double. Probably perf / watt suffers a little, but | when you're trying to chew through 128GB of RAM and do math | on all of it, you're generally maxing your thermal budget. | | Also, note that it's absolutely incredible how cheap it is to | run a model on an M2 Ultra vs an H100 -- Apple's integrated | system memory makes a lot possible at much lower price | points. | lawrjone wrote: | Ahh right, I'd seen a few comments about the memory | bandwidth when it was posted on LinkedIn, specifically that | the M2 was much more powerful. | | This makes a load of sense, thanks for explaining. | Aurornis wrote: | > - M2 improves performance by about 60% | | This is the most shocking part of the article for me since | the difference between M1 and M2 build times has been more | marginal in my experience. | | Are you sure the people with M1 and M2 machines were really | doing similar work (and builds)? Is there a possibility that | the non-random assignment of laptops (employees received M1, | M2, or M3 based on when they were hired) is showing up in the | results because different cohorts aren't working on identical | problems? | lawrjone wrote: | The build events track the files that were changed that | triggered the build, along with a load of other stats such | as free memory, whether Docker was running, etc. | | I took a selection of builds that were triggered by the | same code module (one that frequently changes to provide | enough data) and compared models on just that, finding the | same results. | | This feels as close as you could get to an apples-to-apples | comparison, so I'm quite confident these figures are | (within statistical bounds of the dataset) correct!
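The same-module filtering described above can be sketched in a few lines (a hypothetical illustration; the module names, model labels, and durations are invented, and the real events also carry free memory, Docker state, etc.):

```python
from statistics import median
from collections import defaultdict

# Hypothetical build events: (trigger_module, machine_model, duration_s).
events = [
    ("app/api",  "M1 Pro", 52.0),
    ("app/api",  "M1 Pro", 47.0),
    ("app/api",  "M2 Pro", 31.0),
    ("app/api",  "M3 Pro", 28.0),
    ("pkg/util", "M1 Pro", 12.0),  # different trigger module: excluded
]

# Keep only builds triggered by the same frequently-changed module, so
# each machine model is measured doing identical work, then compare
# median duration per model.
durations = defaultdict(list)
for module, model, secs in events:
    if module == "app/api":
        durations[model].append(secs)

medians = {model: median(v) for model, v in durations.items()}
print(medians)
```
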
| sokoloff wrote: | > apples-to-apples comparison | | No pun intended. :) | Erratic6576 wrote: | Importing a couple thousand RAW pictures into a Capture One | library would take 2 h on my 2017 iMac. | | 5 min on my M3 MBP Pro. | | Geekbench score differences were quite remarkable. | | I am still wondering if I should return it, though | lawrjone wrote: | Go on, I'll bite: why? | ac2u wrote: | They miss the 2 hours procrastination time. It's a version of | "code's compiling" :) | Erratic6576 wrote: | Ha ha ha. You can leave it overnight and importing files is | a one-time process, so not much to gain | teaearlgraycold wrote: | The foam swords are collecting dust. | Erratic6576 wrote: | 2,356 EUR is way over my budget. The machine is amazing but | the specs are stingy. Returning it and getting a cheaper one | would give me a lot of disposable money to spend in | restaurants | tomaskafka wrote: | Get a 10-core M1 Pro then - I got mine for about 1200 eur | used (basically indistinguishable from new), and the | difference (except GPU) is very small. | https://news.ycombinator.com/item?id=38810228 | kingTug wrote: | Does anyone have any anecdotal evidence around the snappiness of | VSCode with Apple Silicon? I very begrudgingly switched over from | SublimeText this year (after using it as my daily driver for | ~10yrs). I have a beefy 2018 MBP but VSCode just drags. This is | the only thing pushing me to upgrade my machine right now but I'd | be bummed if there's still not a significant improvement with an | M3 Pro. | lawrjone wrote: | If you're using an Intel Mac at this point, you should 100% | upgrade. The performance of the MX chips blows away the Intel | chips and there's almost no friction with the arm architecture | at this point. | | I don't use VSCode but most of my team do and I frequently pair | with them. Never noticed it to be anything other than very | snappy.
They all have M1s or up (I am the author of this post, | so the detail about their hardware is in the link). | hsbauauvhabzb wrote: | There can be plenty of friction depending on your use case. | whalesalad wrote: | I have 2x Intel MacBook Pros that are honestly paperweights. | Apple Silicon is infinitely faster. | | It's a bummer because one of them is also a 2018 fully loaded | and I would have a hard time even selling it to someone because | of how much better the M2/M3 is. It's wild when I see people | building hackintoshes on like a ThinkPad T480 ... it's like | riding a penny-farthing bicycle versus a Ducati. | | My M2 Air is my favorite laptop of all time. Keyboard is | finally back to being epic (esp compared to 2018 era, which I | had to replace myself and that was NOT fun). It has no fan so | it never makes noise. I rarely plug it in for AC power. I can | hack almost all day on it (using remote SSH VSCode to my beefy | workstation) without plugging in. The other night I worked for | 4 hours straight refactoring a ton of Vue components and it | went from 100% battery to 91% battery. | ghaff wrote: | That assumes you only use one laptop. I have a couple 2015 | Macs that are very useful for browser tasks. They're not | paperweights and I use them daily. | whalesalad wrote: | I have a rack in my basement with a combined 96 cores and | 192GB of RAM (Proxmox cluster), and a 13900K/64GB desktop | workstation for most dev work. I usually will offload | workloads to those before leveraging one of these old | laptops that usually has a dead battery. If I need something | for "browser tasks" (I am interpreting this as cross-browser | testing?) I have dedicated VMs for that. For just | browsing the web, my M2 is still king as it has zero fan, | makes no noise, and will last for days without charging if | you are just browsing the web or writing documentation.
| | I would rather have a ton of beefy compute that is remotely | accessible and one single lightweight super portable | laptop, personally. | | I should probably donate these Mac laptops to someone who | is less fortunate. I would love to do that, actually. | xp84 wrote: | > should donate | | Indeed. I keep around a 2015 MBP with 16GB (asked my old | job's IT if I could just keep it when I left since it had | already been replaced and wouldn't ever be redeployed) to | supplement my Mac Mini which is my personal main | computer. I sometimes use screen sharing, but mostly when | I use the 2015 it's just a web browsing task. With | adblocking enabled, it's 100% up to the task even with a | bunch of tabs. | | Given that probably 80% of people use webapps for | nearly everything, there's a huge amount of life left in | a late-stage Intel Mac for people who will never engage | in the types of tasks I used to find sluggish on my 2015 | (very large Excel sheet calculations and various kinds of | frontend code transpilation). Heck, even that stuff ran | amazingly better on my 16" 2019 Intel MBP, so I'd assume | for web browsing your old Macs will be amazing for | someone in need, assuming they don't have bad keyboards. | fragmede wrote: | Your 5 year old computer is, well, 5 years old. It was once | beefy but that's technology for you. | orenlindsey wrote: | VSCode works perfectly. | baq wrote: | I've got a 12700K desktop with Windows and an M1 MacBook (not | Pro!) and my pandas notebooks run _noticeably_ faster on the | Mac unless I'm able to max out all cores on the Intel chip | (this is after, ahem, _fixing_ the idiotic scheduler which | would put the background Python on E-cores.) | | I couldn't believe it. | | Absolutely get an Apple Silicon machine, no contest the best | hardware on the market right now.
| kimixa wrote: | The 2018 MacBook Pros weren't even using the best silicon of | the time - they were in the middle of Intel's "14nm Skylake | again" period, paired with an AMD GPU from 2016. | | I suspect one of the reasons why Apple silicon looks _so_ good | is the previous generations were in a performance dip. Maybe | they took the foot off the gas WRT updates as they _knew_ the M | series of chips was coming soon? | doublepg23 wrote: | My theory is Apple bought Intel's timeline as much as anyone | and Intel just didn't deliver. | eyelidlessness wrote: | On my 2019 MBP, I found VSCode performance poor enough to be | annoying on a regular basis, enough so that I would frequently | defer restarting it or my machine to avoid the lengthy | interruption. Doing basically anything significant would have | the fans running full blast pretty much constantly. | | On my M2 Max, all of that is ~fully resolved. There is still | some slight lag, and I have to figure it's just the Electron | tax, but never enough to really bother me, certainly not enough | to defer restarting anything. And I can count the times I've | even heard the fans on one hand... and even so, never for more | than a few seconds (though each time has been a little | alarming, just because it's now so rare). | aragonite wrote: | It depends on what specifically you find slow about VSCode. In | my experience, some aspects of VSCode feel less responsive than | Sublime simply due to intentional design choices. For example, | VSCode's goto-file and project symbol search are definitely not | as snappy as Sublime's. But this difference is due to VSCode's | choice to use debouncing (search is triggered only after typing | has stopped) as opposed to throttling (which limits execution | to at most once per time interval). | tmpfile wrote: | If you find your compiles are slow, I found a bug in VSCode | where builds would compile significantly faster when the status | bar and panel are hidden.
Compiles that took 20s would take 4s | with those panels hidden. | | https://github.com/microsoft/vscode/issues/160118 | mattgreenrocks wrote: | VSCode is noticeably laggy on my 2019 MBP 16in to the point | that I dislike using it. Discrete GPU helps, but it still feels | dog slow. | throwaway892238 wrote: | MacBooks are a waste of money. You can be just as productive with | a machine just as fast for 1/2 the price that doesn't include the | Apple Tax. | | Moreover, if your whole stack (plus your test suite) doesn't fit | in memory, what's the point of buying an extremely expensive | laptop? Not to mention constantly replacing them just because a | newer, shinier model is released? If you're just going to test | one small service, that shouldn't require the fastest MacBook. | | To test an entire product suite - especially one that has high | demands on CPU and RAM, and a large test suite - it's much more | efficient and cost effective to have a small set of remote | servers to run everything on. It's also great for keeping dev and | prod in parity. | | Businesses buy MacBooks not because they're necessary, but | because developers just want shiny toys. They're status symbols. | cedws wrote: | It's OK to just not like Apple. You don't have to justify your | own feelings with pejoratives towards other people's choice of | laptop. | boringuser2 wrote: | You really need to learn what a "pejorative" is before using | the term publicly. | swader999 wrote: | My main metrics are 1) does the fan turn on, 2) does it respond | faster than I can think and move? Can't be any happier with the M2 at | top-end specs. It's an amazing silent beast. | LispSporks22 wrote: | I wish I needed a fast computer. It's the CI/CD that's killing | me. All this cloud stuff we use - can't test anything locally | anymore. Can't use the debugger. I'm back to glorified fmt.Printf | statements that hopefully have enough context that the 40 min | build/deploy time was worth it.
At least it's differential | ¯\\_(ツ)_/¯ All I can say is "It compiles... I think?" The unit | tests are mostly worthless and the setup for sending something to | a lambda feels like JCL boilerplate masturbation from that z/OS | course I took out of curiosity last year. I'm only typing this out | because I just restarted CI/CD to redeploy what I already pushed | because even that's janky. Huh, it's an M3 they gave me. | lawrjone wrote: | Yeah, everything you just said is exactly why we care so much | about a great local environment. I've not seen remote tools | approach the speed/ease/flexibility you can get from a fast | local machine yet, and it makes a huge difference when | developing. | LispSporks22 wrote: | In the back of my mind I'm worried that our competitors have | a faster software development cycle. | orenlindsey wrote: | This is pretty cool, also I love how you can use AI to read the | data. It would have taken minutes if not hours to do even just a | year ago. | lawrjone wrote: | Yeah, I thought it was really cool! (am author) | | It's pretty cool how it works, too: the OpenAI Assistant uses | the LLM to take your human instructions like "how many builds | are in the dataset?" and translate that into Python code which | is run in a sandbox on OpenAI compute with access to the | dataset you've uploaded. | | Under the hood everything is just numpy, pandas and gnuplot, | you're just using a human interface to a Python interpreter. | | We've been building an AI feature into our product recently | that behaves like this and it's crazy how good it can get. I've | done a lot of data analysis in my past and using these tools | blew me away, it's so much easier to jump into complex analysis | without tedious setup. | | And a tip I figured out halfway through: if you want to, you | can ask the chat for an IPython notebook of its calculations. | So you can 'disable autopilot' and jump into manual if you ever | want finer control over the analysis it runs. Pretty wild.
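For a sense of what the code interpreter is generating behind the scenes, here is a hypothetical sketch of the kind of ordinary Python it runs for a question like "how many builds are in the dataset?" (the column names and values are invented; in practice it reads the real CSV you attach):

```python
import csv
import io
from statistics import mean

# Stand-in for the uploaded dataset; the sandbox reads the attached file.
uploaded = io.StringIO(
    "machine_model,duration_s\n"
    "M1 Pro,52.0\n"
    "M2 Pro,31.0\n"
    "M2 Pro,29.0\n"
    "M3 Pro,28.0\n"
)
rows = list(csv.DictReader(uploaded))

# The "AI analysis" bottoms out in plain aggregations like these.
print(f"{len(rows)} builds in the dataset")
print(f"mean duration: {mean(float(r['duration_s']) for r in rows):.1f}s")
```
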
| guax wrote: | I was also surprised by using it for this kind of work. I | don't have access to Copilot and GPT-4 at work but my first | instinct is to ask, did you double-check its numbers? | | Knowing how it works, it now makes sense that it would make | fewer mistakes, but I'm still skeptical :P | tomaskafka wrote: | My personal research for iOS development, taking the cost into | consideration, concluded: | | - M2 Pro is nice, but the improvement over the 10-core (8 perf cores) | M1 Pro is not that large (136 vs 120 s in Xcode benchmark: | https://github.com/devMEremenko/XcodeBenchmark) | | - M3 Pro is nerfed (only 6 perf cores) to better distinguish and | sell M3 Max, basically on par with M2 Pro | | So, in the end, I got a slightly used 10-core M1 Pro and am very | happy, having spent less than half of what the base M3 Pro would | cost, and got 85% of its power (and also, considering that you | generally need to have at least a 33 to 50% faster CPU to even | notice the difference :)). | geniium wrote: | Basically the Pareto effect in choosing the right CPU vs cost | mgrandl wrote: | The M3 Pro being nerfed has been parroted on the Internet since | the announcement. Practically it's a great choice. It's much | more efficient than the M2 Pro at slightly better performance. | That's what I am looking for in a laptop. I don't really have a | use case for the memory bandwidth... | tomaskafka wrote: | Everyone has different needs - for me, even M1 Pro has more | battery life than I use or need, so further efficiency | differences bring little value. | dgdosen wrote: | I picked up an M3Pro/11/14/36GB/1TB to 'test' over the long | holiday return period to see if I need an M3 Max. For my | workflow (similar to the blog post) - I don't! I'm very happy | with this machine. | | Die shots show the CPU cores take up so little space compared | to GPUs on both the Pro and Max... I wonder why.
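The noticeability heuristic above is quick to check with arithmetic (a sketch reading the quoted "136 vs 120 s" Xcode benchmark as M1 Pro vs M2 Pro, which is an assumption about which figure belongs to which chip):

```python
# Relative speedup from the benchmark times quoted in the comment.
m1_pro_s = 136.0  # assumed M1 Pro build time, seconds
m2_pro_s = 120.0  # assumed M2 Pro build time, seconds

speedup = m1_pro_s / m2_pro_s - 1      # higher throughput on the M2 Pro
time_saved = 1 - m2_pro_s / m1_pro_s   # fraction of build time shaved off

print(f"throughput: +{speedup:.0%}, build time: -{time_saved:.0%}")
# Both land well under the 33-50% jump the comment suggests you need
# before a difference is actually felt day to day.
```
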
| wlesieutre wrote: | I don't really have a use case for even more battery life, so | I'd rather have it run faster | lawrjone wrote: | It's interesting that you saw less of an improvement in the M2 | than we saw in this article. | | I guess not that surprising given the different compilation | toolchains though, especially as even with the Go toolchain you | can see how specific specs lend themselves to different parts | of the build process (such as the additional memory helping | linker performance). | | You're not the only one to comment that the M3 is weirdly | capped for performance. Hopefully not something they'll | continue into the M4+ models. | tomaskafka wrote: | That's what Xcode benchmarks seem to say. | | Yep, there appears to be no reason for getting M3 Pro instead | of M2 Pro, but my guess is that after this (unfortunate) | adjustment, they got the separation they wanted (a clear | hierarchy of Max > Pro > base chip for both CPU and GPU | power), and can then improve all three chips by a similar | amount in the future generations. | Reason077 wrote: | > _"Yep, there appears to be no reason for getting M3 Pro | instead of M2 Pro"_ | | There is if you care about efficiency / battery life. | Aurornis wrote: | My experience was similar: in real-world compile times, the M1 | Pro still hangs quite closely to the current laptop M2 and M3 | models. Nothing as significant as the differences in this | article. | | It could depend on the language or project, but in head-to-head | benchmarks of identical compile commands I didn't see any | differences this big. | jim180 wrote: | I love my M1 MacBook Air for iOS development. One thing I'd | like to have from the Pro line is the screen, specifically the | PPI. While 120Hz is a nice thing to have, it won't happen on Air | laptops. | ramijames wrote: | I also made this calculation recently and ended up getting an | M1 Pro with maxed-out memory and disk. It was a solid deal and | it is an amazing computer.
| aschla wrote: | Side note, I like the casual technical writing style used here, | with the main points summarized along the way. Easily digestible | and I can go back and get the details in the main text at any | point if I want. | lawrjone wrote: | Thank you, really appreciate this! | isthisreallife9 wrote: | Is this what software development is like in late 2023? | | Communicating in emojis as much as words? Speaking to an LLM to | do basic data aggregation because you don't know how to do it | yourself? | | If you don't know how to munge data and produce bar charts | yourself then it's just a small step to getting rid of you and | letting the LLM do everything! | lawrjone wrote: | Fwiw I've spent my whole career doing data analysis but the | ease with which I was able to use OpenAI to help me for this post | (am author) blew me away. | | The fact that I can do this type of analysis is why I | appreciate it so much. It's one of the reasons I'm convinced AI | engineering will find its way into the average software engineer's | remit (https://blog.lawrencejones.dev/2023/#ai) because it | makes this analysis far more accessible than it was before. | | I still don't think it'll make devs redundant, though. Things | the model can't help you with (yet, I guess): | | - Providing it with clean data => I had to figure out what data | to collect, write software to collect it, ship it to a data | warehouse, clean it, then upload it into the model. | | - Knowing what you want to achieve => it can help suggest | questions to ask, but people who don't know what they want will | still struggle to get results even from a very helpful | assistant. | | These tools are great though, and one of the main reasons I | wrote this article was to convince other developers to start | experimenting with them like this. | gray_-_wolf wrote: | > it makes this analysis far more accessible than it was | before | | How does the average engineer verify if the result is | correct?
You claim (and I believe you) to be able to do this | "by hand", if required. Great, but that likely means you are | able to catch when the LLM makes a mistake. Any ideas on how | an average engineer, without much experience in this area, | should validate the results? | lawrjone wrote: | I mentioned this in a separate comment but it may be worth | bearing in mind how the AI pipeline works, in that you're | not pushing all this data into an LLM and asking it to | produce graphs, which would be prone to some terrible | errors. | | Instead, you're using the LLM to generate Python code that | runs using normal libraries like Pandas and gnuplot. When | it makes errors it's usually generating totally the wrong | graphs rather than inaccurate data, and you can quickly ask | it "how many X Y Z" and use that to spot-check the graphs | before you proceed. | | My initial version of this began in a spreadsheet so it's | not like you need sophisticated analysis to check this | stuff. Hope that explains it! | PaulHoule wrote: | The medium is the message here, the MacBook is just bait. | | The pure LLM is not effective on tabular data (so many | transcripts of ChatGPT apologizing it got a calculation | wrong). To be working as well as it seems to work they must be | loading results into something like a pandas data frame and | having the agent write and run programs on that data frame, tap | into stats and charting libraries, etc. | | I'd trust it more if they showed more of the steps. | lawrjone wrote: | Author here! | | We're using the new OpenAI assistants with the code | interpreter feature, which allows you to ask questions of the | model and have OpenAI turn those into Python code that they | run on their infra and pipe the output back into the model | chat. | | It's really impressive and removes the need for you to ask it for | code and then run that locally.
This is what powers many of | the data analysis product features that are appearing | recently (we're building one ourselves for our incident data | and it works pretty great!) | gumballindie wrote: | You need to be a little bit more gentle and understanding. A | lot of folks have no idea there are alternatives to Apple's | products that are faster, of higher quality, and upgradeable. | Many seem to be blown away by stuff that has been available | with other brands for a while - fast RAM speeds being one of | them. A few years back when I broke free from Apple I was shocked | at how fast and reliable other products were. Not to mention the | size of my RAM is larger than an entry-level storage option | with Apple's laptops. | Aurornis wrote: | This is a great write-up and I love all the different ways they | collected and analyzed data. | | That said, it would have been much easier and more accurate to | simply put each laptop side by side and run some timed | compilations on the exact same scenarios: a full build, | incremental build of a recent change set, incremental build | impacting a module that must be rebuilt, and a couple more | scenarios. | | Or write a script that steps through the last 100 git commits, | applies them incrementally, and does a timed incremental build to | get a representation of incremental build times for actual code. | It could be done in a day. | | Collecting company-wide stats leaves the door open to significant | biases. The first that comes to mind is that newer employees will | have M3 laptops while the oldest employees will be on M1 laptops. | While not a strict ordering, newer employees (with their new M3 | laptops) are more likely to be working on smaller changes while | the more tenured employees might be deeper in the code or working | in more complicated areas, doing things that require longer build | times. | | This is just one example of how the sampling isn't truly as | random and representative as it may seem.
| | So: cool analysis, and fun to see the way they've used various | tools to analyze the data. But due to inherent biases in the | sample set (older employees have older laptops, notably), I think | anyone looking to answer these questions should start with the | simpler method of benchmarking recent commits on each laptop | before they spend a lot of time architecting company-wide data | collection. | lawrjone wrote: | I totally agree with your suggestion, and we (I am the author | of this post) did spot-check the performance for a few common | tasks first. | | We ended up collecting all this data partly to compare | machine-to-machine, but also because we want historical data on | developer build times and a continual measure of how the builds | are performing so we can catch regressions. We quite frequently | tweak the architecture of our codebase to make builds more | performant when we see the build times go up. | | Glad you enjoyed the post, though! | pjot wrote: | > newer employees will have M3 laptops while the oldest employees | > will be on M1 laptops | | While I read this from my work Intel... | dash2 wrote: | As a scientist, I'm interested in how computer programmers work with | data. | | * They drew beautiful graphs! | | * They used ChatGPT to automate their analysis super-fast! | | * ChatGPT punched out a reasonably sensible t-test! | | But: | | * They had variation across memory and chip type, but they never | thought of using a linear regression. | | * They drew histograms, which are hard to compare. They could | have supplemented them with simple means and error bars. (Or used | cumulative distribution functions, where you can see if they | overlap or one is shifted.) | mnming wrote: | I think it's partly because the audiences are often not | familiar with those statistics details either. | | Most people hate nuances when reading a data report.
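The means-and-error-bars (plus t-test) suggestion above fits in a few lines of stdlib Python (a sketch on invented build-duration samples; the numbers are synthetic, not the article's data):

```python
from math import sqrt
from statistics import mean, variance

# Two hypothetical samples of build durations (seconds) from the two
# laptop fleets being compared.
m1 = [52.1, 49.8, 55.0, 47.3, 51.2, 50.4]
m2 = [31.5, 29.9, 33.2, 30.1, 32.0, 28.8]

def summary(xs):
    """Mean and standard error - far easier to compare than two histograms."""
    return mean(xs), sqrt(variance(xs) / len(xs))

def welch_t(a, b):
    """Welch's t statistic: mean difference scaled by the combined standard error."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

print("M1:", summary(m1))
print("M2:", summary(m2))
print("t =", round(welch_t(m1, m2), 1))
```

A large t here says the gap between the fleet means dwarfs the sampling noise, which is the single number the histograms were trying to convey.
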
| jxcl wrote: | Yeah, I was looking at the histograms too, having trouble | comparing them and thinking they were a strange choice for | showing differences. | Herring wrote: | >They drew histograms, which are hard to compare. | | Note that in some places they used boxplots, which offer | clearer comparisons. It would have been more effective to | present all the data using boxplots. | vaxman wrote: | 1. If, and only if, you are doing ML or multimedia, get a 128GB | system and because of the cost of that RAM, it would be foolish | not to go M3 Max SoC (notwithstanding the 192GB M2 Ultra SoC). | Full Stop. (Note: This is also a good option for people with more | money than brains.) | | 2. If you are doing traditional heavyweight software development, | or are concerned with perception in an interview, promotional | context or just impressing others at a coffee shop, get a 32GB | 16" MBP system with as large a built-in SSD as you can afford (it | gets cheaper per GB as you buy more) and go for an M2 Pro SoC, | which is faster in many respects than an M3 Pro due to core count | and memory bandwidth. Full Stop. (You could instead go 64GB on an | M1 Max if you keep several VMs open, which isn't really a thing | anymore (use VPS), or if you are keeping a 7-15B parameter LLM | open (locally) for some reason, but again, if you are doing much | with local LLMs, as opposed to being always connectable to the | 1.3T+ parameter hosted ChatGPT, then you should have stopped at | #1.) | | 3. If you are nursing mature apps along, maybe even adding ML, | adjusting UX, creating forks to test new features, etc.. 
then | your concern is with INCREMENTAL COMPILATION and the much bigger | systems like M3 Max will be slower (because they need time to ramp up | multiple cores and that's not happening with bursty incremental | builds), so might as well go for a 16GB M1 MBA (add stickers or | whatever if you're ashamed of looking like a school kid) and | maybe invest the savings in a nice monitor like the 28" LG DualUp | (bearing in mind you can only use a single native-speed external | monitor on non-Pro/Max SoCs at a time). You can even get by with | the 8GB M1 MBA because the macOS memory compressor is really good | and the SSD is really fast. Do you want an M2 MBA? No, it has | inferior thermals, is heavier, larger, fingerprints easily, lacks | respect, and the price/performance doesn't make sense given the | other options. Same goes for 13" M1/M2 Pro and all M3 Pro. | | Also, make sure you keep hourly (or better) backups on all Apple | laptops. There is a common failure scenario where the buck | converter that drops voltage for the SSD fails, sending 13VDC | into the SSD for long enough to permanently destroy the data on | it. https://youtu.be/F6d58HIe01A | whatshisface wrote: | Good to know I have commercial options for overcoming my laptop | shame at interviews. /s | fsckboy wrote: | I feel like there is a correlation between fast-twitch | programming muscles and technical debt. Some coding styles that | are rewarded by fast compile times can be more akin to "throw it | at the wall, see if it sticks" style development. Have you ever | been summoned to help a junior colleague who is having a problem, | and you immediately see some grievous errors, errors that give | you pause. You point the first couple out, and the young buck is | ready to send you away and confidently forge ahead, with no sense | of "those errors hint that this thing is really broken".
| | But we were all young once; I remember thinking the only thing | holding me back was 4.77 MHz | wtetzner wrote: | There's a lot of value in a short iteration loop when debugging | unexpected behavior. Often you end up needing to keep trying | different variations until you understand what's going on. | lawrjone wrote: | Yeah there's a large body of research that shows faster | feedback cycles help developers be more productive. | | There's nothing that says you can't have fast feedback loops | _and_ think carefully about your code and next debugging | loop, but you frequently need to run and observe code to | understand the next step. | | In those cases even the best programmer can't overcome a much | slower build. | LASR wrote: | Solid analysis. | | A word of warning from personal experience: | | I am part of a medium-sized software company (2k employees). A | few years ago, we wanted to improve dev productivity. Instead of | going with new laptops, we decided to explore offloading the dev | stack over to AWS boxes. | | This turned out to be a multi-year project with a whole team of | devs (~4) working on it full-time. | | In hindsight, the tradeoff wasn't worth it. It's still way too | difficult to replace a fully-local dev experience with one that's | running in the cloud. | | So yeah, upgrade your laptops instead. | jiggawatts wrote: | https://xkcd.com/1205/ | mdbauman wrote: | This xkcd seems relevant also: https://xkcd.com/303/ | | One thing that jumps out at me is the assumption that compile | time implies wasted time. The linked Martin Fowler article | provides justification for this, saying that longer feedback | loops provide an opportunity to get distracted or leave a | flow state while, e.g., checking email or getting coffee. The | thing is, you don't have to go work on a completely unrelated | task. The code is still in front of you and you can still be | thinking about it, realizing there's yet another corner case | you need to write a test for.
Maybe you're not getting | instant gratification, but surely a 2-minute compile time | doesn't imply 2 whole minutes of wasted time. | chiefalchemist wrote: | Spot on. The mind often needs time and space to breathe, | especially after it's been focused and bearing down on | something. We're humans, not machines. Creativity (i.e., | problem solving) needs to be nurtured. It can't be | force-fed. | | More time working doesn't translate to being more effective | and more productive. If that were the case, then why do a | disproportionate percentage of my "Oh shit! I know what to | do to solve that..." moments happen in the shower, on my morning run, | etc.? | WaxProlix wrote: | I suspect things like GitHub's Codespaces offering will be more | and more popular as time goes on for this kind of thing. Did | you guys try out some of the AWS Cloud9 or other 'canned' dev | env offerings? | hmottestad wrote: | My experience with GitHub Codespaces is mostly limited to | when I forgot my laptop and had to work from my iPad. It was | a horrible experience, mostly because Codespaces didn't | support touch or Safari very well and I also couldn't use | IntelliJ which I'm more familiar with. | | Can't really say anything for performance, but I don't think | it'll beat my laptop unless Maven can magically take decent | advantage of 32 cores (which I unfortunately know it can't). | boringuser2 wrote: | I get a bit of a toxic vibe from a couple of comments in that | article. | | Chiefly, I think the problem is that the CTO solved the wrong | problem: the right problem to solve includes a combination of | assessing why company public opinion is generating mass movements | of people wanting a new MacBook literally every year, if this is | even worth responding to at all (it isn't), and keeping employees | happy. | | Most employees are reasonable enough to not be bothered if they | don't get a new MacBook every year. | | Employers should already be addressing outdated equipment | concerns.
| | Wasting developer time on a problem that is easily solvable in | one minute isn't worthwhile. You upgrade the people 2-3 real | generations behind. That should already have been in the | pipeline, resources notwithstanding. | | I just dislike this whole exercise because it feels like a | perfect storm of technocratic performativity, short-sighted | "metric"-based management, rash consumerism, etc. | BlueToth wrote: | It's really worth the money if it keeps employees happy! | Besides that, the conclusion was to upgrade M1 to M3, but not | every year. | lawrjone wrote: | Sorry you read it like this! | | If it's useful: Pete wasn't really being combative with me on | this. I suggested we should check if the M3 really was faster | so we could upgrade if it was, we agreed and then I did the | analysis. The game aspect of this was more for a bit of fun in | the article than how things actually work. | | And in terms of why we didn't have a process for this: the | company itself is about two years old, so this was the first | hardware refresh we'd ever needed to schedule. So we don't have a | formal process in place yet and probably won't until the next | one either! | joshspankit wrote: | Since RAM was a major metric, there should have been more focus | on IO wait to catch cases where macOS was being hindered by | swapping to disk. (Yes, the drives are fast but you don't know | until you measure) | cced wrote: | This. I've routinely got a 10-15GB page file on an M2 Pro and | need to justify bumping the memory up a notch or two. I'm | consistently in the yellow on memory and in the red while | building. | | How can I tell how much I would benefit from a memory bump? | mixmastamyk wrote: | A lot of the graphs near the end comparing side-to-side had | different scales on the Y axis. Take results with a grain of | salt. | | https://incident.io/_next/image?url=https%3A%2F%2Fcdn.sanity...
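On cced's question a few comments up about justifying a memory bump: one rough signal is macOS swap usage during a build, which `sysctl vm.swapusage` reports. A minimal sketch that parses that output (the example line in the docstring is illustrative); sustained non-trivial "used" swap while building is a reasonable, if imperfect, hint that more RAM would help:

```python
import re
import subprocess


def parse_swapusage(line):
    """Parse macOS `sysctl vm.swapusage` output into MB figures.

    Example line (illustrative):
    vm.swapusage: total = 2048.00M  used = 1024.00M  free = 1024.00M  (encrypted)
    """
    fields = dict(re.findall(r"(total|used|free) = ([\d.]+)M", line))
    return {key: float(value) for key, value in fields.items()}


def swap_used_mb():
    """Query the live figure (works on macOS only)."""
    out = subprocess.run(
        ["sysctl", "vm.swapusage"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_swapusage(out)["used"]
```

Sampling `swap_used_mb()` before and during a build, across a few builds, gives a number to put in the upgrade request rather than "I'm in the yellow".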
| lawrjone wrote: | They're normalised histograms so the y axis is deliberately | adjusted so you can compare the shape of the distribution, as | the absolute number of builds in each bucket means little when | there is a different number of builds for each platform. | hk1337 wrote: | I wonder why they didn't include Linux since the project they're | building is Go? Most CI tools, I believe, are going to be Linux. | Sure, you can explicitly select macOS in GitHub CI but Linux | seems like it would be the better generic option? | | *EDIT* I guess if you needed a macOS-specific build with Go you | would use macOS but I would have thought you'd use Linux too. Can | you build a Go project on Linux and have it run on macOS? I | suppose architecture would be an issue: a build on Linux x86 | would not run on macOS Apple Silicon, but the reverse is true too; | a build on Apple Silicon would not work on Linux x86, maybe not | even Linux ARM. | xp84 wrote: | I know nothing about Go, but if it's like other platforms, | builds intended for production or staging environments are | indeed nearly always for x86_64, but those are done somewhere | besides laptops, as part of the CI process. The builds done on | the laptops are to run each developer's local instance of their | server-side application and its front-end components. That | instance is always being updated to whatever is in-progress at | the time. Then they check that code in, and eventually it gets | built for prod on an Intel, Linux system elsewhere. | SSLy wrote: | > Application error: a client-side exception has occurred (see | the browser console for more information). | | When I open the page. | rendaw wrote: | > People with the M1 laptops are frequently waiting almost 2m for | their builds to complete. | | I don't see this at all... the peak for all 3 is at right under | 20s. The long tail (i.e. infrequently) goes up to 2m, but for all | 3.
M2 looks slightly better than M1, but it's not clear to me | there's an improvement from M2 to M3 at all from this data. ___________________________________________________________________ (page generated 2023-12-29 23:00 UTC)