[HN Gopher] Improving Rust compile times to enable adoption of m...
       ___________________________________________________________________
        
       Improving Rust compile times to enable adoption of memory safety
        
       Author : todsacerdoti
       Score  : 211 points
       Date   : 2023-02-03 09:22 UTC (13 hours ago)
        
 (HTM) web link (www.memorysafety.org)
 (TXT) w3m dump (www.memorysafety.org)
        
       | fidgewidge wrote:
       | I wonder about the framing of the title here. Rust is great but
       | realistically a lot of software with memory safety bugs doesn't
       | need to be written in C in the first place.
       | 
       | For example Java has a perfectly serviceable TLS stack written
       | entirely in a memory safe language. Although you could try to
       | make OpenSSL memory safe by rewriting it in Rust - which
       | realistically means yet another fork not many people use - you
       | could _also_ do the same thing by implementing the OpenSSL API on
       | top of JSSE and Bouncy Castle. The GraalVM native image project
       | allows you to export Java symbols as C APIs and to compile
       | libraries to standalone native code, so this is technically
       | feasible now.
       | 
        | There are also other approaches. GraalVM can also run many
       | C/C++ programs in a way that makes them _automatically_ memory
       | safe, by JIT compiling LLVM bitcode and replacing allocation
       | /free calls with garbage collected allocations. Pointer
       | dereferences are also replaced with safe member accesses. It
       | works as long as the C is fairly strictly C compliant and doesn't
       | rely on undefined behavior. This functionality is unfortunately
       | an enterprise feature but the core LLVM execution engine is open
       | source, so if you're at the level of major upgrades to Rust you
       | could also reimplement the memory safety aspect on top of the
        | open source code. You can then compile the result down to a
        | shared native library that doesn't rely on any external JVM.
       | 
       | Don't get me wrong, I'm not saying don't improve Rust compile
       | times. Faster Rust compiles would be great. I'm just pointing out
       | that, well, it's not the only memory safe language in the world,
       | and actually using a GC isn't a major problem these days for many
       | real world tasks that are still done with C.
        
         | duped wrote:
         | That's not feasible for the millions of devices that don't have
         | the resources for deploying GraalVM or GraalVM compiled native
         | images.
         | 
         | The other thing to consider is that in many applications,
         | nearly every single bit of i/o will flow through a buffer and
         | cryptographic function to encrypt/decrypt/validate it. This is
         | the place where squeezing out every ounce of performance is
         | critical. A JIT + GC might cost a lot more money than memory
         | safety bugs + AOT optimized compilation.
        
           | fidgewidge wrote:
           | Native images are AOT optimized and use way less RAM than a
           | normal Java app on HotSpot does. And you can get competitive
           | performance from them with PGO.
           | 
            | Using a GC doesn't mean never reusing buffers, and Java has
            | had intrinsics for hardware-accelerated cryptography for a
            | long time. There's no reason performance has to be worse,
           | especially if you're willing to fund full time research
           | projects to optimize it.
           | 
           | The belief that performance is more important than everything
           | else is exactly how we ended up with pervasive memory safety
           | vulns to begin with. Rust doesn't make it free, as you pay in
           | developer hours.
        
             | duped wrote:
             | How big are GraalVM native images?
        
         | titzer wrote:
         | There is Go, too.
        
         | nindalf wrote:
         | > you could try to make OpenSSL memory safe by rewriting it in
         | Rust
         | 
         | Or just write a better crypto stack without the many legacy
         | constraints holding OpenSSL back. Rustls
         | (https://github.com/rustls/rustls) does that. It has also been
         | audited and found to be excellent - report (https://github.com/
         | rustls/rustls/blob/main/audit/TLS-01-repo...).
         | 
         | You're suggesting writing this stack in a GC language. That's
         | possible, except most people looking for an OpenSSL solution
         | probably won't be willing to take the hit of slower run time
         | perf and possible GC pauses (even if these might be small in
         | practice). Also, these are hypothetical for now. Rustls exists
         | today.
        
           | fidgewidge wrote:
           | OpenSSL was just an example. You could also use XML parsing
           | or many other tasks.
           | 
           | Point is that the code already exists - it's not hypothetical
           | - and has done for a long time. It is far easier to write
           | bindings from an existing C API to a managed implementation
           | than write, audit and maintain a whole new stack from
           | scratch. There are also many other cases where apps could
           | feasibly be replaced with code written in managed languages
           | and then invoked from C or C++.
           | 
           | Anything written in C/C++ can certainly tolerate pauses when
           | calling into third party libraries because malloc/free can
           | pause for long periods, libraries are allowed to do IO
            | without even documenting that fact, etc.
           | 
           | I think it's fair to be concerned that rewrite-it-in-rust is
           | becoming a myopic obsession for security people. That's one
           | way to improve memory safety but by no means the only one.
           | There are so many cases where you don't need to do that and
           | you'll get results faster by not doing so, but it's not being
           | considered for handwavy reasons.
        
             | josephg wrote:
             | I think the thing you're missing is that opensource people
             | love rewriting libraries in their favourite languages.
             | Especially something well defined, like tls or an xml
              | parser. Rustls is a great example. You won't stop people
             | making things like this. Nor should you - they're doing it
             | for fun!
             | 
             | It's much more fun to rewrite something in a new language
             | than maintain bindings to some external language. You could
             | wrap a Java library with a rust crate, but it would depend
             | on Java and rust both being installed and sane on every
             | operating system. Maintaining something like that would be
             | painful. Users would constantly run into problems with Java
             | not being installed correctly on macos, or an old version
             | of Java on Debian breaking your crate in weird ways. It's
             | much more pleasant to just have a rust crate that runs
             | everywhere rust runs, where all of the dependencies are
             | installed with cargo.
        
             | nindalf wrote:
             | > It is far easier to write bindings from an existing C API
             | to a managed implementation than write, audit and maintain
             | a whole new stack from scratch.
             | 
             | I'd agree, if rustls wasn't already written, audited and
             | maintained. And there are other examples as well. The
             | internationalisation libraries Icu4c and Icu4j exist, but
             | the multi-language, cross-platform library Icu4x is written
             | in Rust. Read the announcement post on the Unicode blog
             | (http://blog.unicode.org/2022/09/announcing-
             | icu4x-10.html?m=1) - security is only one of the reasons
             | they chose to write it in Rust. Binary size, memory usage,
             | high performance. Also compiles to wasm.
             | 
             | Your comment implies that people rewrite in Rust for
             | security alone. But there are so many other benefits to
             | doing so.
        
           | e12e wrote:
           | > people looking for an OpenSSL solution probably won't be
           | willing to take the hit of slower run time perf and possible
           | GC pauses
           | 
           | Golang users would?
           | 
            | That aside, excellent points about rustls and libssl legacy
           | cruft.
        
             | nindalf wrote:
             | No, I'm imagining cross-language usage. Someone not using
             | Go isn't going to use the crypto/tls package from the Go
             | std lib regardless of its quality. The overhead and
             | difficulty of calling into Go make this infeasible.
             | 
             | To include a library written in another language as a
             | shared lib, it needs to be C, C++ or Rust.
        
       | burntsushi wrote:
       | I originally posted this on reddit[1], but figured I'd share this
       | here. I checked out ripgrep 0.8.0 and compiled it with both Rust
       | 1.20 (from ~5.5 years ago) and Rust 1.67 (just released):
        | 
        |     $ git clone https://github.com/BurntSushi/ripgrep
        |     $ cd ripgrep
        |     $ git checkout 0.8.0
        |     $ time cargo +1.20.0 build --release
        |     real    34.367
        |     user    1:07.36
        |     sys     1.568
        |     maxmem  520 MB
        |     faults  1575
        | 
        |     $ time cargo +1.67.0 build --release
        |     [... snip sooooo many warnings, lol ...]
        |     real    7.761
        |     user    1:32.29
        |     sys     4.489
        |     maxmem  609 MB
        |     faults  7503
       | 
       | As kryps pointed out on reddit, I believe at some point there was
       | a change to add/improve compilation times by making more
       | effective use of parallelism. So forcing the build to use a
       | single thread produces more sobering results, but still a huge
        | win:
        | 
        |     $ time cargo +1.20.0 build -j1 --release
        |     real    1:03.11
        |     user    1:01.90
        |     sys     1.156
        |     maxmem  518 MB
        |     faults  0
        | 
        |     $ time cargo +1.67.0 build -j1 --release
        |     real    46.112
        |     user    44.259
        |     sys     1.930
        |     maxmem  344 MB
        |     faults  0
       | 
       | (My CPU is a i9-12900K.)
       | 
       | These are from-scratch release builds, which probably matter less
       | than incremental builds. But they still matter. This is just one
       | barometer of many.
       | 
       | [1]:
       | https://old.reddit.com/r/rust/comments/10s5nkq/improving_rus...
        
         | ilyagr wrote:
         | Re parallelism: I have 12 cores, and cargo indeed effectively
         | uses them all. As a result, the computer becomes extremely
         | sluggish during a long compilation. Is there a way to tell Rust
         | to only use 11 cores or, perhaps, nice its processes/threads to
         | a lower priority on a few cores?
         | 
         | I suppose it's not the worst problem to have. Makes me realize
         | how spoiled I got after multiple-core computers became the
         | norm.
        
           | jrockway wrote:
           | Are they real cores or hyperthreads/SMT? I've found that
           | hyperthreading doesn't really live up to the hype; if
           | interactive software gets scheduled on the same physical core
           | as a busy hyperthread, latency suffers. Meanwhile, Linux
           | seems to do pretty well these days handling interactive
           | workloads while a 32 core compilation goes on in the
           | background.
           | 
           | SMT is a throughput thing, and I honestly turn it off on my
           | workstation for that reason. It's great for cloud providers
           | that want to charge you for a "vCPU" that can't use all of
           | that core's features. Not amazing for a workstation where you
           | want to chill out on YouTube while something CPU intensive
           | happens in the background. (For a bazel C++ build, having SMT
           | on, on a Threadripper 3970X, does increase performance by
           | 15%. But at the cost of using ~100GB of RAM at peak! I have
           | 128GB, so no big deal, but SMT can be pretty expensive. It's
           | probably not worth it for most workloads. 32 cores builds my
           | Go projects quickly enough, and if I have to build C++ code,
           | well, I wait. ;)
        
           | globalreset wrote:
            |     #!/bin/sh
            |     exec ionice -c 3 nice -n 20 "$@"
           | 
           | Make it a shell script like `takeiteasy`, and run `takeiteasy
           | cargo ...`
        
             | kstrauser wrote:
             | Partly because of being a Dudeist, and partly because it's
             | just fun to say, I just borrowed this and called it "dude"
             | on my system.                 dude cargo ...
             | 
             | has a nice flow to it.
        
           | mbrubeck wrote:
           | `cargo build -j11` will limit parallelism to eleven cores.
           | Cargo and rustc use the Make jobserver protocol [0][1][2] to
           | coordinate their use of threads and processes, even when
           | multiple rustc processes are running (as long as they are
           | part of the same `cargo` or `make` invocation):
           | 
           | [0]: https://www.gnu.org/software/make/manual/html_node/Job-
           | Slots...
           | 
            | [1]: https://github.com/rust-lang/cargo/issues/1744
           | 
           | [2]: https://github.com/rust-lang/rust/pull/42682
           | 
           | `nice cargo build` will run _all_ threads at low priority,
           | but this is generally a good idea if you want to prioritize
           | interactive processes while running a build in the
           | background.
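            | 
            | If you want a limit by default rather than per invocation, a
            | minimal sketch, assuming Cargo's `build.jobs` config key:
            | 
            |     # .cargo/config.toml
            |     [build]
            |     jobs = 11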
        
             | epage wrote:
             | To add, in rust 1.63, cargo added support for negative
             | numbers, so you can say `cargo build --jobs -2` to leave
             | two cores available.
             | 
             | See https://github.com/rust-
             | lang/cargo/blob/master/CHANGELOG.md#...
        
           | [deleted]
        
         | Ygg2 wrote:
         | As someone who uses Rust on various hobby projects, I never
         | understood why people were complaining about compile times.
         | 
         | Perhaps they were on old builds or some massive projects?
        
           | burntsushi wrote:
            | Wait, like, you don't _understand_, or you don't share their
           | complaint? I don't really understand how you don't
           | understand. If I make a change to ripgrep because I'm
           | debugging its perf and need to therefore create a release
           | build, it can take several seconds to rebuild. Compared to
           | some other projects that probably sounds amazing, but it's
           | still annoying enough to impact my flow state.
           | 
           | ripgrep is probably on the smallish side. It's not hard to
           | get a lot bigger than that and have those incremental times
           | also get correspondingly bigger.
           | 
           | And complaining about compile times doesn't mean compile
           | times haven't improved.
        
             | Ygg2 wrote:
             | I do understand some factors, but I never noticed it being
             | like super slow to build.
             | 
              | My personal project takes seconds to compile, but fair
              | enough, it's small. Even bigger projects like a game in
              | Bevy don't take that long to compile. A minute or two
              | tops. About 30 seconds when incremental.
             | 
             | People complained of 10x slower perf. Essentially 15min
             | build times.
             | 
              | The fact that older versions might be slower to compile
              | fills in another part of the puzzle.
              | 
              | That, and the fact that I have a 24-hyperthread monster
              | of a CPU.
        
               | TinkersW wrote:
               | 30 seconds isn't incremental, that is way too long.
               | 
               | I work on a large'ish C++ project and incremental is
               | generally 1-2 seconds.
               | 
                | Incremental must work in release builds (someone else said
               | it only works in debug for Rust), although it is fine to
               | disable link time optimizations as those are obviously
               | kinda slow.
        
           | jackmott42 wrote:
           | First, compile times can differ wildly based on the code in
           | question. Big projects can take minutes where hobby projects
            | take seconds.
           | 
           | Also, people have vastly different work flows. Some people
           | tend to slowly write a lot of code and compile rarely. Maybe
           | they tend to have runtime tools to tweak things. Otherwise
           | like to iterate really fast. Try a code change, see if the UI
           | looks better or things run faster, and when you work like
           | this even a compile time of 3 seconds can be a little bit
           | annoying, and 30 seconds maddening.
        
             | Taywee wrote:
             | It's less about "big projects" and more about "what
             | features are used". It's entirely possible for a 10kloc
             | project to take much more time to build than a 100kloc
             | project. Proc macros, heavy generic use, and the like will
             | drive compile time way up. It's like comparing a C++
             | project that is basically "C with classes" vs one that does
             | really heavy template dances.
             | 
             | Notably, serde can drive up compile times a lot, which is
             | why miniserde still exists and gets some use.
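              | 
              | For a sense of why, a minimal sketch (assuming serde with
              | the `derive` feature enabled): each derive below expands
              | via a proc macro into a pile of generic impls that the
              | compiler must then type-check and monomorphize.
              | 
              |     use serde::{Deserialize, Serialize};
              | 
              |     // Every field here grows the expanded impls that
              |     // get compiled in each crate using this type.
              |     #[derive(Serialize, Deserialize)]
              |     struct Config {
              |         name: String,
              |         retries: u32,
              |     }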
        
           | jph wrote:
           | Code gen takes quite a while. Diesel features are one way to
           | see the effect...
           | 
           | diesel = { version = "*", features = ["128-column-tables"],
           | ... }
        
         | [deleted]
        
         | twotwotwo wrote:
         | This also relates to something not directly about rustc: many-
         | core CPUs are much easier to get than five years ago, so a CPU-
         | hungry compiler needn't be such a drag if its big jobs can use
         | all your cores.
        
           | michaelt wrote:
           | It's true!
           | 
            | Steam hardware survey, Jan 2017 [1] vs Jan 2023, "Physical
            | CPUs (Windows)":
            | 
            |              2017     2023
            |     1 CPU    1.9%     0.2%
            |     2 CPUs  45.8%     9.6%
            |     3 CPUs   2.6%     0.4%
            |     4 CPUs  47.8%    29.6%
            |     6 CPUs   1.4%    33.0%
            |     8 CPUs   0.2%    18.8%
            |     More     0.3%     8.4%
           | 
           | [1] https://web.archive.org/web/20170225152808/https://store.
           | ste...
        
           | masklinn wrote:
           | However, rustc currently has limited ability to parallelise
           | at a sub-crate level, which makes for not-so-great tradeoffs
           | on large projects.
        
         | manholio wrote:
         | The most annoying thing in my experience is not really the raw
         | compilation times, but the lack of - or very rudimentary -
         | incremental build feature. If I'm debugging a function and make
         | a small local change that does not trickle down to some generic
         | type used throughout the project, then 1-second build times
         | should be the norm, or better yet, edit & continue debug.
         | 
         | It's beyond frustrating that any "i+=1" change requires
         | relinking a 50mb binary from scratch and rebuilding a good
         | chunk of the Win32 crate for good measure. Until such
         | enterprise features become available, high developer
         | productivity in Rust remains elusive.
        
           | burntsushi wrote:
           | To be clear, Rust has an "incremental" compilation feature,
           | and I believe it is enabled by default for debug builds.
           | 
           | I don't think it's enabled by default in release builds
           | (because it might sacrifice perf too much?) and it doesn't
           | make linking incremental.
           | 
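            | If you want to experiment anyway, a sketch of opting in per
            | project (the `incremental` key in a Cargo.toml profile):
            | 
            |     [profile.release]
            |     incremental = true
            | 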
           | Making the entire pipeline incremental, including release
           | builds, probably requires some very fundamental changes to
           | how our compilers function. I think Cranelift is making
           | inroads in this direction by caching the results of compiling
           | individual functions, but I know very little about it and
           | might even be describing it incorrectly here in this comment.
        
           | josephg wrote:
           | > It's beyond frustrating that any "i+=1" change requires
           | relinking a 50mb binary from scratch
           | 
           | It's especially hard to solve this with a language like rust,
           | but I agree!
           | 
           | I've long wanted to experiment with a compiler architecture
           | which could do fully incremental compilation, maybe down the
           | function in granularity. In the linked (debug) executable,
           | use a malloc style library to manage disk space. When a
           | function changes, recompile it, free the old copy in the
           | binary, allocate space for the new function and update jump
           | addresses. You'd need to cache a whole lot of the compiler's
           | context between invocations - but honestly that should be
           | doable with a little database like LMDB. Or alternately, we
           | could run our compiler in "interactive mode", and leave all
           | the type information and everything else resident in memory
           | between compilation runs. When the compiler notices some
           | functions are changed, it flushes the old function
           | definitions, compiles the new functions and updates
           | everything just like when the DOM updates and needs to
           | recompute layout and styles.
           | 
           | A well optimized incremental compiler should be able to do a
           | "i += 1" line change faster than my monitor's refresh rate.
           | It's crazy we still design compilers to do a mountain of
           | processing work, generate a huge amount of state and then
           | when they're done throw all that work out. Next time we run
           | the compiler, we redo all of that work again. And the work is
           | all almost identical.
           | 
           | Unfortunately this would be a particularly difficult change
           | to make in the rust compiler. Might want to experiment with a
           | simpler language first to figure out the architecture and the
           | fully incremental linker. It would be a super fun project
           | though!
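            | 
            | A toy sketch of the disk-space bookkeeping half of that idea
            | (purely hypothetical, nothing to do with rustc's actual
            | design):
            | 
            |     // Malloc-style accounting for function slots in the
            |     // output binary: freed slots get reused, otherwise
            |     // we append at the end of the file.
            |     struct SlotAllocator {
            |         free: Vec<(u64, u64)>, // (offset, len) holes
            |         end: u64,              // current end of binary
            |     }
            | 
            |     impl SlotAllocator {
            |         fn alloc(&mut self, len: u64) -> u64 {
            |             // First fit: reuse a hole left by an old
            |             // version of some recompiled function.
            |             if let Some(i) = self
            |                 .free
            |                 .iter()
            |                 .position(|&(_, l)| l >= len)
            |             {
            |                 let (off, l) = self.free[i];
            |                 self.free[i] = (off + len, l - len);
            |                 return off;
            |             }
            |             let off = self.end;
            |             self.end += len;
            |             off
            |         }
            | 
            |         fn dealloc(&mut self, off: u64, len: u64) {
            |             // Coalescing adjacent holes elided.
            |             self.free.push((off, len));
            |         }
            |     }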
        
         | CGamesPlay wrote:
         | Can you explain why the user time goes _down_ when using a
          | single thread? Does that mean that there's a huge amount
         | contention in the parallelism?
        
           | nequo wrote:
           | Faults also drop to zero. Might be worth trying to flush the
           | cache before each cargo build?
        
           | twotwotwo wrote:
           | There are hardware reasons even if you leave any software
           | scaling inefficiency to the side. For tasks that can use lots
           | of threads, modern hardware trades off per-thread performance
           | for getting more overall throughput from a given amount of
           | silicon.
           | 
           | When you max out parallelism, you're using 1) hardware
           | threads which "split" a physical core and (ideally) each run
           | at a bit more than half the CPU's single-thread speed, and 2)
           | the small "efficiency" cores on newer Intel and Apple chips.
           | Also, single-threaded runs can feed a ton of watts to the one
           | active core since it doesn't have to share much power/cooling
           | budget with the others, letting it run at a higher clock
           | rate.
           | 
           | All these tricks improve the throughput, or you wouldn't see
           | that wall-time reduction and chipmakers wouldn't want to ship
           | them, but they do increase how long it takes each thread to
           | get a unit of work done in a very multithreaded context,
           | which contributes to the total CPU time being higher than it
           | is in a single-threaded run.
        
           | celrod wrote:
           | User time is the amount of CPU time spent in user mode. It is
           | aggregated across threads. If you have 8 threads running at
           | 100% in user mode for 1 second, that gives you 8s of user
           | time.
           | 
           | Total CPU time in user mode will normally increase when you
           | add more threads, unless you're getting perfect or better-
           | than-perfect scaling.
        
           | burntsushi wrote:
           | To be honest, I don't know. My understanding of 'user' time
            | is that it represents the sum of all CPU time spent in "user
           | mode" (as opposed to "kernel mode"). In theory, given that
           | understanding and perfect scaling, the user time of a multi-
           | threaded task should roughly match the user time of a single-
           | threaded task. Of course, "perfect" scaling is unlikely to be
           | real, but still, you'd expect better scaling here.
           | 
           | If I had to guess as to what's happening, it's that there's
           | some thread pool, and at some point, near the end of
           | compilation, only one or two of those threads is busy doing
           | anything while the other threads are sitting and idling. Now
           | whether and how that "idling" gets interpreted as "CPU being
           | actively used in user mode" isn't quite clear to me. (It may
           | not, in which case, my guess is bunk.)
           | 
           | Perhaps someone more familiar with what 'user' time actually
           | means and how it interplays with multi-threaded programs will
           | be able to chime in.
           | 
           | (I do not think faults have anything to do with it. The
           | number of faults reported here is quite small, and if I re-
           | run the build, the number can change quite a bit---including
           | going to zero---and the overall time remains unaffected.)
        
             | Filligree wrote:
             | User time is the amount of CPU time spent actually doing
             | things. Unless you're using spinlocks, it won't include
             | time spent waiting on locks or otherwise sleeping -- though
             | it will include time spent setting up for locks, reloading
             | cache lines and such.
             | 
             | Extremely parallel programs can improve on this, but it's
             | perfectly normal to see 2x overhead for fine-grained
             | parallelism.
        
               | fulafel wrote:
               | Spinlocks are normal userspace code issuing machine
               | instructions in a loop that do memory operations. It is
               | counted in user time, unless the platform is unusual and
               | for some reason enters the kernel to spin on the lock.
               | Spinning is the opposite of sleeping.
               | 
               | edit: misparsed, like corrected below, my bad.
        
               | burntsushi wrote:
               | I think you're saying the same thing as the GP. You might
               | have parsed their comment incorrectly.
        
               | burntsushi wrote:
               | I'd say there's still a gap in my mental model. I agree
               | that it's normal to observe this, definitely. I see it in
               | other tools that utilize parallelism too. I just can't
               | square the 2x overhead part of it in a workload like
               | Cargo's, which I assume is _not_ fine-grained. I see the
               | same increase in user time with ripgrep too, and its
               | parallelism is maybe more fine grained than Cargo 's, but
               | is still at the level of a single file, so it isn't that
               | fine grained.
               | 
               | But maybe for Cargo, parallelism is more fine grained
               | than I think it is. Perhaps because of codegen-units. And
               | similarly for ripgrep, if it's searching a lot of tiny
               | files, that might result in fine grained parallelism in
               | practice.
        
               | Filligree wrote:
                | Well, as mentioned elsewhere, most of that overhead is
                | just hyperthreads slowing down when they have active
               | siblings.
               | 
               | Which is fine; it's still faster overall. Disable SMT and
               | you'll see much lower overhead, but higher time spent
               | overall.
        
               | burntsushi wrote:
                | Yes, I know it's fine. I just don't understand the full
               | details of why hyperthreading slows things down that
               | much. There are more experiments that could be done to
               | confirm or deny this explanation, e.g., disabling
               | hyperthreading. And playing with the thread count a bit
               | more.
        
             | ynik wrote:
             | Idle time doesn't count as user-time unless it's a spinlock
             | (please don't do those in user-mode).
             | 
             | I suspect the answer is: Perfect scaling doesn't happen on
             | real CPUs.
             | 
             | Turboboost lets a single thread go to higher frequencies
             | than a fully loaded CPU. So you would expect "sum of user
             | times" to increase even if "sum of user clock cycles" is
             | scaling perfectly.
             | 
             | Hyperthreading is the next issue: multiple threads are not
             | running independently, but might be fighting for resources
             | on a single CPU core.
             | 
             | In a pure number-crunching algorithm limited by functional
             | units, this means using $(nproc) threads instead of 1
             | thread should be expected to more than double the user time
              | based on these first two points alone!
             | 
             | Compilers of course are rarely limited by functional units:
             | they do a decent bit of pointer-chasing, branching, etc.
             | and are stalled a good bit of time. (While OS-level
             | blocking doesn't count as user time; the OS isn't aware of
             | these CPU-level stalls, so these count as user time!) This
             | is what makes hyperthreading actually helpful.
             | 
             | But compilers also tend to be memory/cache-limited. L1 is
             | shared between the hyperthreads, and other caches are
             | shared between multiple/all cores. This means running
             | multiple threads compiling different parts of the program
             | in parallel means each thread of computation gets to work
             | with a smaller portion of the cache -- the effective cache
             | size is decreasing. That's another reason for the user time
             | to go up.
             | 
             | And once you have a significant number of cache misses from
             | a bunch of cores, you might be limited on memory bandwidth.
             | At that point, also putting the last few remaining idle
             | cores to work will not be able to speed up the real-time
             | runtime anymore -- but it will make "user time" tick up
             | faster.
             | 
             | In particularly unlucky combinations of working set size
             | vs. cache size, adding another thread (bringing along
             | another working set) may even increase the real time.
             | Putting more cores to work isn't always good!
             | 
             | That said, compilers are more limited by memory/cache
             | latency than bandwidth, so adding cores is usually pretty
             | good. But it's not perfect scaling even if the compiler has
             | "perfect parallellism" without any locks.
        
               | burntsushi wrote:
               | > Turboboost lets a single thread go to higher
               | frequencies than a fully loaded CPU. So you would expect
               | "sum of user times" to increase even if "sum of user
               | clock cycles" is scaling perfectly.
               | 
               | Ah yes, this is a good one! I did not account for this.
               | Mental model updated.
               | 
               | Your other points are good too. I considered some of them
               | as well, but maybe not enough in the context of
               | competition making many things just a bit slower. Makes
               | sense.
        
           | [deleted]
        
           | pornel wrote:
           | This is caused by hyperthreading. It's not an actual
           | inefficiency, but an artifact of the way CPU time is counted.
           | 
           | The HT cores aren't real CPU cores. They're just an
           | opportunistic reuse of hardware cores when another thread is
           | waiting for RAM (RAM is relatively so slow that they're
           | waiting a lot, for a long time).
           | 
           | So code on the HT "core" doesn't run all the time, only when
            | other thread is blocked. But HT threads spend much of their
            | scheduled time waiting for their share of the core, and that
            | time still counts as CPU time, which makes them look slow.
        
             | pjmlp wrote:
             | Back in the early days of HT I was so happy to get a
             | desktop with it, that I enabled it.
             | 
             | The end result was that doing WebSphere development
             | actually got slower, because of their virtual nature and
             | everything else on the CPU being shared.
             | 
             | So I ended up disabling it again to get the original
             | performance back.
        
               | pornel wrote:
               | Yeah, the earliest attempts weren't good, but I haven't
               | heard of any HT problems post Pentium 4 (apart from
               | Spectre-like vulnerabilities).
               | 
               | I assume OSes have since then developed proper support
               | for scheduling and pre-empting hyperthreading. Also the
               | gap between RAM and CPU speed only got worse, and CPUs
               | have grown more various internal compute units, so
               | there's even more idle hardware to throw HT threads at.
        
         | fnordpiglet wrote:
         | I remember I would spend hours looking at my code change
         | because it would take hours to days to build what I was working
         | on. I would build small examples to test and debug. I was
         | shocked at Netscape with the amazing build system they had that
         | could continuously build and tell you within a short few hours
         | if you've broken the build on their N platforms they cross
         | compiled to. I was bedazzled when I had IDEs that could tell me
         | whether I had introduced bugs and could do JIT compilation and
         | feedback to me in real time if I had made a mistake and provide
         | inline lints. I was floored when I saw what amazing things rust
         | was doing in the compiler to make my code awesome and how
         | incredibly fast it builds. But what really amazed me more than
         | anything was realizing how unhappy folks were that it took 30
         | seconds to build their code. :-)
         | 
         | GET OFF MY LAWN
        
           | [deleted]
        
             | [deleted]
        
           | burntsushi wrote:
           | I dare to want better tools. And I build them when I can.
            | Like ripgrep. ¯\_(ツ)_/¯
        
             | fnordpiglet wrote:
             | Keep keeping me amazed and I'll keep loving the life I've
             | lived
        
         | burntsushi wrote:
         | Someone asked (and then deleted their comment):
         | 
         | > How many LoC there is in ripgrep? 46sec to build a grep like
         | tool with a powerful CPU seems crazy.
         | 
         | I wrote out an answer before I knew the comment was deleted,
         | so... I'll just post it as a reply to myself...
         | 
         | -----
         | 
         | Well it takes 46 seconds with only a single thread. It takes ~7
         | seconds with many threads. In the 0.8.0 checkout, if I run
          | `cargo vendor` and then tokei, I get:
          | 
          |     $ tokei -trust src/ vendor/
          |     ==========================================================
          |      Language       Files    Lines     Code  Comments  Blanks
          |     ==========================================================
          |      Rust             765   299692   276218     10274   13200
          |      |- Markdown      387    21647     2902     14886    3859
          |      (Total)                321339   279120     25160   17059
          |     ==========================================================
          |      Total            765   299692   276218     10274   13200
          |     ==========================================================
         | 
         | So that's about a quarter million lines. But this is very
         | likely to be a poor representation of actual complexity. If I
         | had to guess, I'd say the vast majority of those lines are some
         | kind of auto-generated thing. (Like Unicode tables.) That count
         | also includes tests. Just by excluding winapi, for example, the
         | count goes down to ~150,000.
         | 
         | If you _only_ look at the code in the ripgrep repo (in the
         | 0.8.0 checkout), then you get something like ~13K:
          | 
          |     $ tokei -trust src globset grep ignore termcolor wincolor
          |     ==========================================================
          |      Language       Files    Lines     Code  Comments  Blanks
          |     ==========================================================
          |      Rust              34    15484    13205       780    1499
          |      |- Markdown       30     2300        6      1905     389
          |      (Total)                 17784    13211      2685    1888
          |     ==========================================================
          |      Total             34    15484    13205       780    1499
          |     ==========================================================
         | 
         | It's probably also fair to count the regex engine too (version
          | 0.2.6):
          | 
          |     $ tokei -trust src regex-syntax
          |     ==========================================================
          |      Language       Files    Lines     Code  Comments  Blanks
          |     ==========================================================
          |      Rust              29    22745    18873      2225    1647
          |      |- Markdown       23     3250      285      2399     566
          |      (Total)                 25995    19158      4624    2213
          |     ==========================================================
          |      Total             29    22745    18873      2225    1647
          |     ==========================================================
         | 
         | Where about 5K of that are Unicode tables.
         | 
         | So I don't know. Answering questions like this is actually a
         | little tricky, and presumably you're looking for a barometer of
         | how big the project is.
         | 
         | For comparison, GNU grep takes about 17s single threaded to
          | build from scratch from its tarball:
          | 
          |     $ time (./configure --prefix=/usr && make -j1)
          |     real    17.639
          |     user    9.948
          |     sys     2.418
          |     maxmem  77 MB
          |     faults  31
         | 
         | Using `-j16` decreases the time to 14s, which is actually
          | slower than a from-scratch ripgrep 0.8.0 build, primarily due
          | to what appears to be a single-threaded configure script for GNU
         | grep.
         | 
         | So I dunno what seems crazy to you here honestly. It's also
         | worth pointing out that ripgrep has quite a bit more
         | functionality than something like GNU grep, and that
         | functionality comes with a fair bit of code. (Gitignore
         | matching, transcoding and Unicode come to mind.)
        
           | Thaxll wrote:
           | It was me, and thanks for the details. I missed the multi
           | threaded compilation in the second part, I thought it was
           | 46sec with -jx
        
           | kibwen wrote:
           | In addition, it's worth mentioning here that the measurement
           | is for release builds, which are doing far more work than
           | just reading a quarter million lines off of a disk.
        
       | lumb63 wrote:
       | I love to see work being done to improve Rust compile times. It's
       | one of the biggest barriers to adoption today, IMO.
       | 
       | Package management, one of Rust's biggest strengths, is one of
       | its biggest weaknesses here. It's so easy to pull in another
       | crate to do almost anything you want. How many of them are well-
       | written, optimized, trustworthy, etc.? My guess is, not that
       | many. That leads to applications that use them being bloated and
       | inefficient. Hopefully, as the ecosystem matures, people will pay
       | better attention to this.
        
         | pornel wrote:
         | On the contrary, commonly used Rust crates tend to be well
         | written and well optimized (source: I have done security audits
         | of hundreds of deps and I curate https://lib.rs).
         | 
         | Rust has a culture of splitting dependencies into small
         | packages. This helps pull in only focused, tailored
         | functionality that you need rather than depending on multi-
         | purpose large monoliths. Ahead-of-time compilation + generics +
         | LTO means there's no extra overhead to using code from 3rd
         | party dependency vs your own (unlike interpreted or VM
         | languages where loading code costs, or C with dynamic libraries
         | where you depend on the whole library no matter how little you
         | use from it).
         | 
          | I assume people scarred by low-quality dependencies have been
         | burned by npm. Unlike JS, Rust has a strong type system, with
         | rules that make it hard to cut corners and break things. Rust
         | also ships with a good linter, built-in unit testing, and
         | standard documentation generator. These features raise the
         | quality of average code.
         | 
         | Use of dependencies can improve efficiency of the whole
         | application. Shared dependencies-of-dependencies increase code
         | reuse, instead of each library rolling its own NIH basics like
         | loggers or base64 decode, you can have one shared copy.
         | 
         | You can also easily use very optimized implementations of
         | common tasks like JSON, hashmaps, regexes, cryptography, or
         | channels. Rust has some world-class crates for these tasks.
        
       | gregwebs wrote:
        | Haskell is one of the few languages that can compile even more
        | slowly than Rust. But it has a REPL, GHCi, that can be used to
        | reload code changes fairly quickly.
       | 
       | I wish there were some efforts at dramatically different
       | approaches like this because there's all this work going into
       | compilation but it's unlikely to make the development cycle twice
       | as fast in most cases.
        
         | sesm wrote:
         | Are there any articles/papers that explain how a mix of
         | compiled and interpreted code works for Haskell? I wanted to
         | play with this idea for my toy language, but don't know where
         | to start.
        
         | nkit wrote:
         | I've started liking evcxr (https://github.com/google/evcxr) for
         | REPL. It's a little slow compared to other REPLs, but still
         | good enough to be usable after initial load.
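          | 
          | A sketch of trying it, assuming the REPL binary comes from the
          | `evcxr_repl` crate:
          | 
          |     $ cargo install evcxr_repl
          |     $ evcxr
          |     >> let total: u64 = (1..=10).sum();
          |     >> total
          |     55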
        
           | sitkack wrote:
           | I agree, evcxr really needs to be advertised more. It might
           | need a new name, I don't even know how to say it.
        
             | antipurist wrote:
             | > eee-vic-ser
             | 
             | https://github.com/google/evcxr/issues/215
        
               | laszlokorte wrote:
               | Phonetically for a german that sounds like "eww,
               | disgusting wanker"
        
       | pornel wrote:
       | Another build time improvement coming, especially for fresh CI
       | builds, is a new registry protocol. Instead of git-cloning
       | metadata for 100,000+ packages, it can download only the data for
       | your dependencies.
       | 
       | https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...
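        | 
        | Per that post, the opt-in while it's being tested is a config
        | key (a sketch, assuming a .cargo/config.toml):
        | 
        |     [registries.crates-io]
        |     protocol = "sparse"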
        
         | MarkSweep wrote:
         | You can also use something like this to cache build artifacts
         | and dependencies between builds:
         | 
         | https://github.com/Swatinem/rust-cache
        
         | MuffinFlavored wrote:
          | I really wonder how many Dockerfiles are out there that, on
          | every PR merge, pull the entire cargo "metadata" without a
          | cache, and how wasteful that is from a bandwidth/electricity
          | standpoint, or whether in the grand scheme of things it's a
          | small drop in the bucket.
        
           | aseipp wrote:
           | In my experience it's pretty significant from the bandwidth
           | side at reasonable levels of usage. You'd be astounded at how
           | many things download packages and their metadata near
           | constantly, and the rise of fully automated CI systems has
           | really put the stress on bandwidth in particular, since most
           | things are "from scratch." And now we have things like
           | dependabot automatically creating PRs for downstream
           | advisories constantly which can incur rebuilds, closing the
           | loop fully.
           | 
           | If you use GitHub as like a storage server and totally
           | externalize the costs of the package index onto them, then
           | it's workable for free. But if you're running your own
           | servers then it's a whole different ballgame.
        
             | kzrdude wrote:
             | I think github would have throttled that cargo index
              | repository a long time ago if it wasn't used by Rust, i.e.
             | they get some kind of special favour. Which is nice but
             | maybe not sustainable.
        
               | kibwen wrote:
               | Github employees personally reached out to various
               | packagers (I know both Cargo and Homebrew for certain)
               | asking them not to perform shallow clones on their index
               | repos, because of the extra processing it was incurring
               | on the server side.
        
         | CodesInChaos wrote:
         | Why would a CI build need the index at all? The lock file
         | should already contain all the dependencies and their hashes.
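          | 
          | For reference, a Cargo.lock entry already pins the exact
          | version and a content hash (checksum value elided here):
          | 
          |     [[package]]
          |     name = "memchr"
          |     version = "2.5.0"
          |     source = "registry+https://github.com/rust-lang/crates.io-index"
          |     checksum = "..."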
        
           | kibwen wrote:
           | You're correct that Cargo doesn't check the index if it's
           | building using a lockfile, but I think the problem is that a
           | freshly-installed copy of Cargo assumes that it needs to get
           | the index the first time that any command is run. I assume
           | (but haven't verified in the slightest) that this behavior
           | will change with the move to an on-demand index by default.
        
         | Vecr wrote:
         | Good thing they will continue to support the original protocol.
         | I don't like downloading things on demand like that, not good
         | for privacy.
        
           | charcircuit wrote:
           | How is it bad for privacy?
           | 
           | Before:
           | 
           | Download all metadata, Download xyz package
           | 
           | After:
           | 
            | Download xyz's metadata, Download xyz
           | 
           | They already know you are using xyz.
        
             | throwaway894345 wrote:
             | I don't care much either way, but you have the privacy
             | argument backwards. If you're downloading all the things,
              | then no one knows if you are using xyz, only that you _might_
              | be using xyz. If you're just downloading what you need and
             | you're downloading xyz, then they know that you're using
             | xyz.
        
               | rascul wrote:
               | You're downloading specific packages either way, which
               | can potentially be tracked, regardless of whether you're
               | downloading metadata for all packages or just one.
               | 
               | Edit: A thought occurs to me. Cargo downloads metadata
               | from crates.io but clones the package repo from
               | GitHub/etc. So unless I'm missing something, downloading
               | specific metadata instead of all metadata allows for
               | crates.io to track your specific packages in addition to
               | GitHub.
        
               | pornel wrote:
               | No, repos of packages are not used, at all. Crates don't
               | even need to be in any repository, and the repository URL
               | in the metadata isn't verified in any way. Crates can
               | link to somebody else's repo or a repo full of fake code
               | unrelated to what has been published on crates.io.
               | 
               | crates.io crates are tarballs stored in S3. The tarball
               | downloads also go through a download-counting service,
               | which is how you get download stats for all crates (it's
               | not a tracker in the Google-is-watching-you sense, but
               | just an integer increment in Postgres).
               | 
               | Use https://lib.rs/cargo-crev or source view on docs.rs
               | to see the actual source code that has been uploaded by
               | Cargo.
        
               | kibwen wrote:
               | This has it backwards. crates.io has always hosted the
               | crates themselves, but has used Github for the index. In
               | the future, with the sparse HTTP index, crates.io will be
               | the only one in the loop, cutting Github out of the
               | equation.
        
               | Xorlev wrote:
               | I'm not sure I understand. This is talking about Cargo
               | metadata download improvements. You still download
               | individual packages regardless of receiving a copy of the
               | entire registry, so privacy hasn't materially changed
               | either way.
               | 
               | If knowing you use a crate is too much, then running your
               | own registry with a mirror of packages seems like all you
               | could do.
        
         | aseipp wrote:
         | Great stuff. Now, if they can just have a globally shared (at
          | least per $USER!), content-addressable target/ directory, two
         | of my complaints with Cargo would be fixed nicely...
        
       | xiphias2 wrote:
       | I see a lot of work going on making the compiler faster (which
        | looks hard at this point), but I wish I could at least make
        | changes that are guaranteed correct without needing to recompile.
       | 
       | The extract function tool is very buggy. As I spend a lot of time
        | refactoring, maybe putting time into those tools would have a
        | better ROI than pouring so much work into making the compiler
        | faster.
        
         | estebank wrote:
         | Keep in mind that the people working on rustc are not the same
         | working on rust-analyzer, even if there's some overlap in
         | contributors and there's a desire to share libraries as much as
         | possible. Someone working on speeding up rustc is unlikely to
         | have domain expertise in DX and AST manipulation, and vice-
         | versa.
        
           | xiphias2 wrote:
           | Maybe you're right, but I think both are hard enough that
           | people who are smart enough to do one can do the other if
           | they really want :)
           | 
            | By the way, AST manipulation is easy; the really hard part of
            | refactoring (that I had a lot of problems with) is creating
           | the lifetime annotations, which requires a deep understanding
           | of the type system.
           | 
           | I was trying to learn some type theory and read papers to
            | understand how Rust's lifetimes work, but only found long
           | research papers that don't even do the same thing as Rust.
           | 
           | I haven't found any documentation that documents exactly when
           | a function call is accepted by the lifetime checker (borrow
           | checking is easy).
        
             | guipsp wrote:
              | Have you read the Rustonomicon section on lifetimes? I found
             | it pretty useful
        
               | xiphias2 wrote:
               | It's cool, I just looked at the manual.
               | 
               | there's this part though:
               | 
               | // NOTE: `'a: {` and `&'b x` is not valid syntax!
               | 
                | I hate that I can't introduce a new lifetime inside a
                | function; it would make refactoring so much easier. Right
               | now I have to try to refactor, see if the compiler
               | accepts it or not, then revert the change.
               | 
               | Sometimes desugaring would be a great feature in itself,
               | sugaring makes interactions between functions much harder
               | to understand.
        
       | IshKebab wrote:
       | I really wish there was some work on hermetic compilation of
       | crates. Ideally crates would be able to opt-in (eventually opt-
       | out) to "pure" mode which would mean they can't use `build.rs`,
       | proc macros are fully sandboxed, no `env!()` and so on.
       | 
       | Without that you can't really do distributed and cached
       | compilation 100% reliably.
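        | 
        | To illustrate the opt-in (a purely hypothetical manifest key,
        | not a real Cargo feature):
        | 
        |     [package]
        |     name = "mycrate"
        |     # hypothetical: no build.rs, no env!(), sandboxed proc macros
        |     hermetic = true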
        
         | josephg wrote:
         | That would help some stuff, but it wouldn't help with
         | monomorphized code or macro expansion. Those two are the real
         | killers in terms of compilation performance. And in both of
         | those cases, most of the compilation work happens at the call
         | site - when compiling _your_ library.
        
           | IshKebab wrote:
           | Those are simply different problems though. Macro expansion
           | is not even exactly a problem.
           | 
           | For monomorphized code the compiler just needs a mode where
           | it automatically does what the Momo crate does.
           | 
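            | Roughly, the Momo trick is to peel the generic shell off
            | by hand so the body is compiled once; a minimal sketch
            | (the function itself is made up):
            | 
            |     use std::{fs, io, path::Path};
            |     
            |     // Generic shim: tiny, re-monomorphized per caller type.
            |     pub fn read_config(path: impl AsRef<Path>) -> io::Result<String> {
            |         // Non-generic inner fn: the real body is compiled
            |         // once, not once per concrete `path` type.
            |         fn inner(path: &Path) -> io::Result<String> {
            |             fs::read_to_string(path)
            |         }
            |         inner(path.as_ref())
            |     }
            | 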
           | For proc macros the Watt crate (precompiled WASM macros) will
           | make a big difference. It just needs official sanction and
           | integration.
           | 
           | Anyway yeah those are totally separate problems to caching
           | and distributed builds.
        
           | pjmlp wrote:
            | Eiffel, Ada, D, and C++ with extern templates (even better
            | if modules are also part of the story) show ways this can
            | be improved.
            | 
            | Naturally, someone has to spend time analysing how their
            | approach maps onto Rust's compilation story.
        
           | nicoburns wrote:
            | Would it not allow macro expansion to be cached? I believe
            | it can't be currently, because macros can run arbitrary
            | code and access arbitrary external state.
        
       | mcdonje wrote:
       | I don't know much about how the compiler works, so the answer
       | here is probably that I should read a book, but can external
       | crates from crates.io be precompiled? Or maybe compile my
       | reference to a part of an external crate once and then it doesn't
       | need to be done on future compilations?
       | 
        | If the concern is that I could change something in a crate,
        | then could a checksum be created on the first compilation and
        | checked on future compilations, so that if it matches the crate
        | doesn't need to be recompiled?
        
         | epage wrote:
         | Hosted pre-compiled builds would need to account for
         | 
         | - Feature flags
         | 
         | - Platform conditionals
         | 
         | - The specific rust version being used
         | 
         | - (unsure on this) the above for all dependencies of what is
         | being pre-compiled
         | 
          | There are also the impediments of designing/agreeing on a
          | security model (do you trust the author, like PyPI does, or
          | trust a central build authority, etc.) and then funding the
          | continued hosting.
         | 
         | Compiling on demand like in sccache is likely the best route
         | for not over-building and being able to evict unused items.
        
           | mcdonje wrote:
           | sccache seems awesome. I wasn't aware of it. Thanks.
        
         | runevault wrote:
          | Because of the lack of a stable ABI, they'd need to
          | pre-compile it for however many versions of the Rust compiler
          | they wanted to support.
        
         | duped wrote:
         | Cargo already does this when building incrementally, and there
         | are tools for doing it within an organization like sccache.
         | 
         | > If the concern is that I could change something in a crate
         | 
          | It's possible for a change in one crate to require
          | recompiling its dependencies and transitive dependencies, due
          | to conditional compilation (aka "features" [0]). Basically
          | you can't know which thing to compile until it's referenced
          | by a dependent and provided a feature set.
          | 
          | That said, many crates have no features or just a default
          | feature set, but the number of variants to precompile is
          | still quite large.
         | 
         | [0] https://doc.rust-lang.org/cargo/reference/features.html
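          | 
          | As a sketch, a dependency like this (feature name and
          | functions made up) compiles to different artifacts depending
          | on which features its dependents enable, so there is no
          | single binary to pre-build:
          | 
          |     // lib.rs of a hypothetical dependency: the cfg'd halves
          |     // produce different artifacts for the same crate version.
          |     #[cfg(feature = "serde")]
          |     pub fn to_json<T: serde::Serialize>(value: &T) -> String {
          |         serde_json::to_string(value).expect("serialization failed")
          |     }
          |     
          |     #[cfg(not(feature = "serde"))]
          |     pub fn to_json<T: std::fmt::Debug>(value: &T) -> String {
          |         format!("{value:?}")
          |     }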
         | 
         | Note that C and C++ have the exact same problem, but it's
         | mitigated by people never giving a shit about locking
         | dependencies and living with the horrible bugs that result from
         | it.
        
       | dcow wrote:
       | When people complain about rust compile times are they
       | complaining about cold/clean compiles or warm/cached compiles? I
       | can never really tell because people just gripe "compile times".
       | 
        | I can see how someone would come to Rust, type `cargo run`,
        | wait 3-5 minutes while cargo downloads all the dependencies and
        | compiles them along with the main package, and then say, "well,
        | that took a while, it kinda sucks". But if they change a few
        | lines in the actual project and compile again it would be near
        | instant.
        | 
        | The fair comparison would be something akin to deleting your
        | node or go modules and running a cold build. I am slightly
        | suspicious - not in a deliberate-foul-play way, more in a
        | messy-semantics and ad-hoc-anecdotes way - that many of these
        | compile-time discrepancies boil down more to differences in how
        | the cargo tooling handles dependencies, what it decides to
        | include in the compile phase, where it decides to store caches
        | and what that means for `clean`, etc., compared to similar
        | package management tooling from other languages, than to "rustc
        | is slow". But I could be wrong.
        
         | codetrotter wrote:
         | > But if they change a few lines in the actual project and
         | compile again it would be near instant.
         | 
          | If it's a big project and the lines you are changing are in
          | something that is used in many other places, then the rebuild
          | will still take a little while. (30 seconds or a minute, or
          | more, depending on the size of the project.)
         | 
          | Likewise, if you work on things in different branches, you
          | may need to wait again when you switch branches and work on
          | something there.
         | 
         | Also if you switch between Rust versions you need to wait a
         | while when you rebuild your project.
         | 
         | I love Rust, and I welcome everything that is being done to
         | bring the compile times down further!
        
           | dcow wrote:
            | I am not discouraging efforts to make compile times faster.
            | However, I also see a lot of things that would really make
            | Rust soar not being worked on: syntax quality-of-life
            | reworks being dropped once they get complex under the hood,
            | partially complete features with half-baked PRs, IDE
            | tooling and debugging support, interface types and much of
            | the momentum behind wasm, async traits and the sorely
            | lacking async_std, etc. It seems like every time I dive
            | into something moderately complex I start hitting compiler
            | caveats with links to issues that have been open for 5
            | years and a bunch of comments like "what's the status of
            | this, can we please get this merged?". It can ever so
            | slightly give one the impression that the rust community
            | has decided that the language is mature and the only thing
            | missing is faster compile times.
        
             | insanitybit wrote:
             | > "what's the status of this can we please get this
             | merged?"
             | 
             | Having written Rust professionally for a number of years,
             | this didn't happen too much. Where it did it was stuff like
             | "yeah you need to Box the thing today", which... did not
             | matter, we just did that and moved on.
             | 
             | > It can ever so slightly give one the impression that the
             | rust community has decided that the language is mature and
             | the only thing missing is faster compile times.
             | 
             | That is generally my feeling about Rust. There are a few
             | areas where I'd like to see things get wrapped up (async
             | traits, which are being actively worked on) but otherwise
             | everything feels like a bonus. In terms of things that made
             | Rust difficult to use, yeah, compile times were probably
             | the number one.
        
               | dcow wrote:
                | I mean this is what you have to do to access variables
                | from an async block:
                | 
                |     let block = || {
                |         let my_a = a.clone();
                |         let my_b = b.clone();
                |         let my_c = c.clone();
                |         async move {
                |             // use my_a, my_b, my_c
                |             let value = ...;
                |             Ok::<success::Type, error::Type>(value)
                |         }
                |     };
               | 
               | And you can't use `if let ... && let ...` (two lets for
               | one if) because it doesn't desugar correctly.
               | 
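                | i.e. you end up nesting instead (a sketch):
                | 
                |     fn demo(x: Option<i32>, y: Option<i32>) {
                |         // Desired, but not accepted on stable today:
                |         // if let Some(a) = x && let Some(b) = y { ... }
                |         // Workaround: nest the two lets.
                |         if let Some(a) = x {
                |             if let Some(b) = y {
                |                 println!("{a} {b}");
                |             }
                |         }
                |     }
                | 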
                | And error handling and backtraces are a beautiful mess.
                | Your signatures look like `Result<..., Box<dyn
                | std::error::Error>>` unless you use `anyhow::Result`,
                | but then half the stuff implements std::error::Error
                | but not Into<anyhow::Error>, and you can't add the
                | silly trait impl yourself because of _language
                | limitations_ (the orphan rule), so you have to map_err
                | everywhere.
               | 
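                | A sketch of that dance, with a made-up error type that
                | implements std::error::Error but not Send + Sync +
                | 'static, so anyhow's blanket From impl doesn't apply:
                | 
                |     use std::{error::Error, fmt, rc::Rc};
                |     
                |     // Hypothetical error: the Rc makes it !Send + !Sync.
                |     #[derive(Debug)]
                |     struct FetchError(Rc<String>);
                |     
                |     impl fmt::Display for FetchError {
                |         fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                |             write!(f, "fetch failed: {}", self.0)
                |         }
                |     }
                |     impl Error for FetchError {}
                |     
                |     fn fetch() -> Result<String, FetchError> {
                |         Err(FetchError(Rc::new("timeout".into())))
                |     }
                |     
                |     // `?` can't convert FetchError into anyhow::Error,
                |     // so every call site needs a map_err:
                |     fn load() -> anyhow::Result<String> {
                |         fetch().map_err(|e| anyhow::anyhow!(e.to_string()))
                |     }
                | 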
                | It's not just "oh, throw a box around it and you're
                | good". It's ideas that were introduced to the language
                | when there was lots of steam ultimately not making it
                | to a fully polished state (maybe the Moz layoffs are
                | partly to blame, IDK). Anyway, I love Rust and we've
                | been using it in production for years, but I think
                | there's still quite a bit to polish.
        
               | nicoburns wrote:
                | There are a few more fundamental missing pieces for me:
               | 
               | - It's impossible to describe a type that "implements
               | trait A and may or may not implement trait B"
               | 
               | - It's impossible to be generic over a trait (not a type
               | that implements a trait, the trait itself)
        
               | tialaramex wrote:
               | > It's impossible to describe a type that "implements
               | trait A and may or may not implement trait B"
               | 
               | How is this different from just describing a type that
               | only "implements trait A" ?
        
               | nicoburns wrote:
               | It would allow you to call a function to check for trait
               | B and downcast to "implements trait A and B" in the case
               | that it does implement the trait.
        
               | merely-unlikely wrote:
               | I'm still learning the language but couldn't you use an
               | enum containing two types to accomplish the same thing?
        
               | nicoburns wrote:
               | You can if you know all of the possible types in advance.
               | But if you want to expose this as an interface from a
               | library that allows users to provide their own custom
               | implementation then you need to use traits.
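                | 
                | The workaround today is an explicit, opt-in hook on the
                | first trait; a minimal sketch:
                | 
                |     trait B {
                |         fn b(&self);
                |     }
                |     
                |     trait A {
                |         fn a(&self);
                |         // Opt-in runtime query: "do I also implement B?"
                |         fn as_b(&self) -> Option<&dyn B> {
                |             None
                |         }
                |     }
                |     
                |     struct Both;
                |     impl A for Both {
                |         fn a(&self) { println!("a"); }
                |         fn as_b(&self) -> Option<&dyn B> { Some(self) }
                |     }
                |     impl B for Both {
                |         fn b(&self) { println!("b"); }
                |     }
                |     
                |     fn use_it(x: &dyn A) {
                |         x.a();
                |         if let Some(b) = x.as_b() {
                |             b.b(); // only when the type also implements B
                |         }
                |     }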
        
               | tialaramex wrote:
                | It seems like a way to ask "can this thing implement X,
                | and if so, how?" from, say, the Any trait would be what
                | you want here. I have no idea how hard that would be to
                | deliver, but I also don't see how the previous trait
                | thing is relevant - why do we need to say up front that
                | _maybe_ we will care whether trait B is implemented?
        
               | insanitybit wrote:
               | > - It's impossible to describe a type that "implements
               | trait A and may or may not implement trait B"
               | 
               | So, specialization? Or something else? I haven't found a
               | need for specialization. I remember when I came from C++
               | I had a hard time adjusting to "no specialization, no
               | variadics" but idk I haven't missed it in years.
               | 
               | > - It's impossible to be generic over a trait (not a
               | type that implements a trait, the trait itself)
               | 
               | Not sure I understand.
        
               | nicoburns wrote:
               | > So, specialization?
               | 
               | Basically yes. But that works with dynamic dispatch
               | (trait objects) as well as static dispatch (generics).
               | 
               | > Not sure I understand.
               | 
                | A specific pattern I'd like to be able to represent is:
                | 
                |     trait AlgorithmAInputData {
                |         ...
                |     }
                |     
                |     trait AlgorithmA {
                |         trait InputData = AlgorithmAInputData;
                |         ...
                |     }
                |     
                |     trait DataStorage<trait AlgorithmA> {
                |         type InputData: Algorithm::InputData;
                |     
                |         fn get_input_data() -> InputData;
                |     }
                |     
                |     fn compute_algorithm_a<Storage: DataStorage<AlgorithmA>>() {
                |         ...
                |     }
        
             | mamcx wrote:
             | > It can ever so slightly give one the impression that the
             | rust community has decided that the language is mature and
             | the only thing missing is faster compile times.
             | 
              | It's not the case; it's that the features are now _good
              | enough_ and compile times are the one major, big, sore
              | point.
             | 
             | So, if you compare Rust to X you can make a _very good
             | case_ until you hit:
             | 
             | "... wait, Rust is THAT SLOW TO COMPILE?"
             | 
             | ":(. Yes"
        
           | wongarsu wrote:
            | For the branch-switching use case you might get some
            | mileage out of sccache [1]. For local storage it's just one
            | binary and two lines of configuration to have a cache
            | around rustc, so it's worth testing out.
           | 
           | 1: https://github.com/mozilla/sccache
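            | 
            | The two lines in question are roughly this (assuming the
            | sccache binary is on PATH):
            | 
            |     # ~/.cargo/config.toml
            |     [build]
            |     rustc-wrapper = "sccache"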
        
           | Lewton wrote:
           | > (30 seconds or a minute, or more, depending on the size of
           | the project.)
           | 
            | I'm working on a largeish modern Java project using Gradle,
            | and this sounds great... Every time I start my server it
            | takes 40 seconds just for Gradle to find out that all the
            | subprojects are up to date, nothing has been changed, and
            | no compilation is necessary...
        
         | insanitybit wrote:
         | It's a few things:
         | 
          | 1. Clean builds can happen more often than some may think.
          | CI/CD pipelines can end up with a lot of clean builds -
          | especially if you use ephemeral instances (to save money),
          | but even if you don't, it's very likely.
          | 
          | Even locally it can happen sometimes. For example, we used
          | Docker to run builds, and for various reasons the cache could
          | get blown. Also, sometimes weird systemy things happen and
          | 'cargo clean' fixes them, but then you have to recompile from
          | scratch. This can take 10+ minutes on a decent-sized
          | codebase.
         | 
         | 2. On a large codebase even small changes can lead to long
         | recompile times, especially if you want to run tests - cargo
         | check won't be enough, you need to build.
        
         | pjmlp wrote:
          | Both, because sometimes a little change implies compiling the
          | world due to configuration changes.
          | 
          | Also it is quite irritating sometimes to see the same crate
          | being compiled multiple times as it gets referenced from
          | other crates.
          | 
          | Ideally Rust would offer a dumb compilation mode (or an
          | interpreter) for change-compile-debug cycles, and proper
          | compilation for release; Haskell and OCaml, for example,
          | offer such capabilities in their toolchains.
        
           | titzer wrote:
           | I primarily develop the Virgil compiler in interpreted mode
           | (i.e. running the current source on the stable binary's
           | interpreter). Loading and typechecking ~45kloc of compiler
           | source takes 80ms, so it is effectively instantaneous.
        
             | pjmlp wrote:
             | Yeah it works great having toolchains that support all
             | possible execution models.
        
             | sitkack wrote:
              | But you aren't on our timeline and haven't opted into our
              | bullshit.
              | 
              | HTML+JS projects used to be testable with the load of a
              | web page, and that community has opted in to long build
              | times.
              | 
              | Most people are so far away from flow state that they
              | can't even imagine another way of being.
        
         | anuraaga wrote:
          | cargo run is a command you'd generally use to actually get
          | something running. In many cases day-to-day development is
          | incremental and focused on unit tests instead.
          | 
          | FWIW cold builds (i.e., in docker with no cache) of cargo are
          | much slower than go, hanging for a long time on refreshing
          | the crates.io index. I don't know exactly what that is doing,
          | but I have a feeling it is implemented in a monolithic way
          | rather than on-demand. Rust has had plenty of time to make
          | this better, but it is still very slow for cold cargo builds,
          | often spending minutes refreshing the crates index. Meanwhile
          | Go misses easy optimizations like creating strings from a
          | byte slice.
          | 
          | So it is what it is - Go makes explicit promises of fast
          | compile times, and thanks to that, build scripts in go are
          | pretty fast. Any language that doesn't make that explicit
          | might be slow to compile and might run fast - that's totally
          | fine, and I would rather have two languages each optimized to
          | its case than one mediocre language.
        
           | zozbot234 wrote:
           | You don't even need a separate language, there's already a
           | "fast" compiler for Rust based on cranelift which is used in
           | debug builds by default.
        
             | burntsushi wrote:
              | Cranelift is not used for debug builds by default. I
              | think that's _probably_ a goal (although I'm not actually
              | 100% sure about that, just because I'm not dialed into
              | what the compiler team is doing). Even the OP mentions
              | this:
             | 
             | > We were able to benchmark bjorn3's cranelift codegen
             | backend on full crates as well as on the build dependencies
             | specifically (since they're also built for cargo check
             | builds, and are always built without optimizations): there
             | were no issues, and it performed impressively. It's well on
             | its way to becoming a viable alternative to the LLVM
             | backend for debug builds.
             | 
             | And the Cranelift codegen backend itself is also clear
             | about it not being ready yet:
             | https://github.com/bjorn3/rustc_codegen_cranelift
             | 
             | (To be clear, I am super excited about using Cranelift for
             | debug builds. I just want to clarify that it isn't actually
             | used by default yet.)
        
               | nicoburns wrote:
                | The more immediate goal of "distribute the cranelift
                | backend as a rustup component" has been making good
                | progress and seems like it might happen relatively
                | soon: https://github.com/bjorn3/rustc_codegen_cranelift/milestone/...
        
               | burntsushi wrote:
               | That's amazing. Thanks for that update. Can't wait.
        
               | pjmlp wrote:
               | Great news.
        
           | dwattttt wrote:
            | Refreshing the crates index has gotten quite slow because
            | it currently downloads the entire index, regardless of
            | which bits you need. There's a trial of a new protocol
            | happening now, due for release in March, that should speed
            | this up:
            | https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...
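            | 
            | If I'm reading the announcement right, opting in on a
            | recent toolchain looks something like:
            | 
            |     # ~/.cargo/config.toml -- fetch only the index entries
            |     # you need instead of the whole git index
            |     [registries.crates-io]
            |     protocol = "sparse"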
        
         | dmm wrote:
          | Incremental builds are what matter to me. On my 1240p if I
          | change one file and build it takes ~11s to build. Changing
          | one file and running tests takes ~3.5. That's all build time;
          | the tests themselves run in <100ms.
          | 
          | The incremental build performance seems to be really
          | dependent on single-thread performance. An incremental build
          | on a 2014-ish Haswell E5-2660v3 Xeon takes ~30s.
        
           | kibwen wrote:
           | _> On my 1240p if I change one file and build it takes ~11s
           | to build. Changing one file and running tests takes ~3.5_
           | 
           | `cargo test` and default `cargo build` use the same profile,
           | so presumably the first number is referring to `cargo build
           | --release`. Release builds deliberately forego compilation
           | speed in favor of optimization. In practice, most of my
           | development involves `cargo check`, which is much faster than
           | `cargo build`.
        
             | dmm wrote:
             | > so presumably the first number is referring to `cargo
             | build --release`
             | 
             | Both numbers are for debug builds. I don't know why `cargo
             | test` is faster but I appreciate it.
             | 
                | Incremental release builds with `cargo build --release`
                | are even slower, taking ~35s on the 1240p.
        
               | kibwen wrote:
               | Honestly, it should be impossible; in the absence of some
               | weird configuration, cargo test does strictly more work
               | than cargo build. :P Can you reproduce it and file a bug?
        
               | dureuill wrote:
                | I suspect most of that time is link time. Possibly the
                | linker in use is not very parallel, so linking one big
                | executable with cargo build takes longer than linking
                | many smaller test executables, which can actually be
                | done in parallel?
        
         | zozbot234 wrote:
          | Practically all Rust crates make heavy use of monomorphized
          | generics, so every use of them in a new project is bespoke
          | and has to be compiled on the spot. This is very different
          | from how Go or Node work. You _could_ compile the
          | non-monomorphic portions of a Rust crate into a C-compatible
          | system library (with a thin, header-like wrapper to translate
          | across ABIs), but in practice it wouldn't amount to much.
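          | 
          | A minimal illustration: the dependency ships, roughly
          | speaking, only type-checked generic source, and each concrete
          | instantiation is compiled in _your_ crate:
          | 
          |     // In the dependency: `largest` generates no machine code
          |     // on its own; it is only type-checked when the
          |     // dependency is built.
          |     pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
          |         let mut max = items[0];
          |         for &item in &items[1..] {
          |             if item > max {
          |                 max = item;
          |             }
          |         }
          |         max
          |     }
          |     
          |     // Downstream: each instantiation (i32, f64) is compiled
          |     // from scratch here, at the call site.
          |     fn main() {
          |         println!("{}", largest(&[1, 5, 3]));
          |         println!("{}", largest(&[1.0, 5.0, 3.0]));
          |     }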
        
           | Tobu wrote:
           | Some popular proc-macros could be pre-compiled and
           | distributed as WASM, and it would be impactful, since they
           | tend to bottleneck the early parts of a project build.
           | However I don't think that could be made entirely
           | transparent, because right now there's a combinatorial
           | explosion of possible syn features. For now I avoid depending
           | on syn/quote if I can.
        
         | conaclos wrote:
          | I contributed to the Rome tools [1], and the build takes more
          | than 1 min. This makes the write-build-test loop
          | frustrating... So frustrating that I'm hesitant to start a
          | project in Rust...
          | 
          | My machine is 7 years old. People tell me to buy a new one
          | just for compiling a Rust project... That's ecologically
          | questionable.
          | 
          | [1] https://rome.tools/
        
       | TheDesolate0 wrote:
       | [dead]
        
       | Bjartr wrote:
       | It's funny, any other post on HN about improvements to Rust I've
       | seen are chock full of comments to the effect of "I guess that
       | feature is nice, but when will they improve the compile times?"
       | And now many of the replies to this post are "Faster compiles are
       | nice, but when will they improve/implement important features?"
       | 
       | The Rust dev team can't win!
        
         | marcosdumay wrote:
          | That's what winning looks like.
        
         | capableweb wrote:
          | 1) HN is maybe one organism if you zoom out enough, but it
          | consists of people with wildly different opinions; you'll
          | have capitalists arguing with anarchists here, and any post
          | is bound to have both sides, no sides and every side, all on
          | the same page.
          | 
          | 2) It's easier to complain about stuff, somehow. Not sure
          | why, or if it's extra prominent on HN in particular, but
          | people tend to start thinking "Why am I against this thing?"
          | and then write their thoughts, rather than "Why do I like
          | this thing?". Maybe writing something that can be challenged
          | drives more engagement, and people like when others engage
          | with them, so they implicitly learn to be that way.
        
           | jchw wrote:
           | I think pessimistic and cynical reactions are the literal
           | lifeblood of news aggregator comments sections. It's been
           | like this for as long as I can remember across as many
           | aggregators as I've ever used.
           | 
            | Part of the problem is that news aggregators reward people
            | who comment early, and the earliest comments are the
            | kneejerk reactions where you brain-dump thoughts you've had
            | brewing but don't have anywhere else to put. (Probably
            | without actually clicking through.)
        
             | smolder wrote:
             | Another part of it is just psychology. People seem much
             | more inclined to join discourse to make objections than to
             | pile on affirmative comments, which generally an upvote
             | suffices for.
        
               | yadoomerta wrote:
                | It's also partly the site's culture. Not saying that's
                | wrong, since such comments add some noise and not much
                | new info, but I've been downvoted before for posting
                | comments like "Thanks for saying this!"
        
             | nindalf wrote:
             | I agree in general except HN also rewards quality a bit
             | more. All new comments get a few minutes to bask at the top
             | of their subthreads. So a really good, late comment can
             | still get to the top and stay there.
        
               | jchw wrote:
               | To a degree, the comment ranking algorithm helps, though
               | long/fast threads do often leave brand new replies buried
               | upon posting.
               | 
               | Still, I believe what makes HN unusually nice is just the
               | stellar moderation. It is definitely imperfect, but it
               | creates a nice atmosphere that I think ultimately does
               | encourage people to try to be civil, even though places
               | like these definitely have a tendency to bring out the
                | worst in people. Having a deft touch with moderation is
                | very hard nowadays, especially with increasingly
                | difficult demands put on moderators, absolutely every
                | possible subject matter turning into a miniature
                | culture war (how in the hell do you turn the discussion
                | of gas ranges vs electric ranges into a culture war?!),
                | and the unmoderated hellscapes of the Internet wrongly
                | painting all lightweight moderation with a black mark.
               | 
               | I definitely fear for the future of communities like HN,
                | because the pressure from increasingly vile malicious
                | actors, as well as the counteracting pressure from
                | others to moderate harder, stronger, faster, will
                | eventually break the sustainability of this sort of
                | community. When
               | I first joined HN, a lot of communities on the Internet
               | felt like this. Now, I know of very few.
        
         | jerf wrote:
         | This is when it is important not to model the comment section
         | as some sort of single composite individual.
         | 
         | Since it is impossible to mentally model them as the number of
         | humans they are, I find it helpful to model them as at least a
         | few very distinct individuals, or sometimes just as an
         | amorphous philosophical gas that will expand to fill all
         | available comments, where the only question is really with what
         | distribution rather than _whether_ a given point will be
         | occupied.
        
         | [deleted]
        
         | boredumb wrote:
          | Rust is amazing. I truly believe a large number of people are
          | intimidated by it and so go out of their way to shit on it
          | and pretend it's only for some niche IoT device... when it's
          | just as easy to write a full CRUD application in Rust as in
          | any other language at this point.
        
         | [deleted]
        
         | thechao wrote:
         | I used to be a grad student adjacent to Bjarne Stroustrup; he
         | has a quip he's used a bunch: if no one's complaining, no one
         | cares.
         | 
          | I see all of these complaints -- both the volume and the
          | count -- as great indicators of Rust's total health.
        
           | guhidalg wrote:
           | So true. I use a similar quip at work: "Take the shortcut and
           | if we're not out of business when it becomes an issue, then
           | it will be a good problem to have".
        
             | rascul wrote:
              | I guess that depends on the work. By day I repair houses,
              | and taking shortcuts can mean an even bigger problem to
              | solve in a few months. Luckily I have yet to be in such a
              | situation myself, but I've fixed others' shortcuts a
              | number of times.
        
               | guhidalg wrote:
                | Yes, I should specify I'm talking about software
                | development. Physical products and work rarely have the
                | luxury of making mistakes or taking shortcuts; the
                | universe is a harsh place.
        
       | bufo wrote:
       | "There are possible improvements still to be made on bigger
       | buffers for example, where we could make better use of SIMD, but
       | at the moment rustc still targets baseline x86-64 CPUs (SSE2) so
       | that's a work item left for the future."
       | 
        | I don't understand this. The vast majority (I would guess 95%+)
        | of people using Rust have CPUs with AVX2 or NEON. Why is that a
        | good reason to hold back? Why can't there be a fast path with a
        | slow path as a fallback?
        
         | pjmlp wrote:
          | Because it requires some kind of fat binary support.
          | 
          | Some C and C++ compilers offer this, and it requires some
          | infrastructure to make it happen (the simd attribute in GCC),
          | or explicitly loading different kinds of dynamic libraries.
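          | 
          | On the Rust side, a fast/slow split is already expressible
          | with runtime feature detection, at the cost of shipping both
          | paths in the binary; a sketch (function names made up):
          | 
          |     fn sum(data: &[u8]) -> u64 {
          |         #[cfg(target_arch = "x86_64")]
          |         {
          |             if std::is_x86_feature_detected!("avx2") {
          |                 // SAFETY: guarded by the runtime check above.
          |                 return unsafe { sum_avx2(data) };
          |             }
          |         }
          |         sum_scalar(data) // baseline (SSE2) fallback
          |     }
          |     
          |     #[cfg(target_arch = "x86_64")]
          |     #[target_feature(enable = "avx2")]
          |     unsafe fn sum_avx2(data: &[u8]) -> u64 {
          |         // With AVX2 enabled the compiler may auto-vectorize
          |         // this body more aggressively.
          |         data.iter().map(|&b| b as u64).sum()
          |     }
          |     
          |     fn sum_scalar(data: &[u8]) -> u64 {
          |         data.iter().map(|&b| b as u64).sum()
          |     }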
        
       ___________________________________________________________________
       (page generated 2023-02-03 23:01 UTC)