[HN Gopher] Compiling Rust is NP-hard
       ___________________________________________________________________
        
       Compiling Rust is NP-hard
        
       Author : lovasoa
       Score  : 367 points
       Date   : 2021-07-08 09:10 UTC (13 hours ago)
        
 (HTM) web link (niedzejkob.p4.team)
 (TXT) w3m dump (niedzejkob.p4.team)
        
       | FartyMcFarter wrote:
       | > What's the most efficient algorithm you can write to do this?
       | As it turns out, you can't do well in the worst case -- and we'll
       | prove this by a reduction from SAT.
       | 
        | Correction: you can't do well in the worst case, assuming P ≠ NP
        | (where "well" means "in polynomial runtime").
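        | 
        | For context, here is a sketch of how such a reduction can look (my
        | own hedged example, not necessarily the article's exact encoding):
        | each arm of a match over a tuple of bools covers exactly the
        | assignments that falsify one clause of a CNF formula, so checking
        | exhaustiveness amounts to checking unsatisfiability.
        | 
        |     // The formula (x1|x2|x3) & (x1|x2|!x3) & (x1|!x2) & (!x1) is
        |     // unsatisfiable, so this match is exhaustive and the function
        |     // compiles. Delete any arm and rustc's "non-exhaustive
        |     // patterns" error spells out a satisfying assignment of the
        |     // remaining clauses.
        |     fn encode(x: (bool, bool, bool)) {
        |         match x {
        |             (false, false, false) => {} // falsifies x1|x2|x3
        |             (false, false, true)  => {} // falsifies x1|x2|!x3
        |             (false, true, _)      => {} // falsifies x1|!x2
        |             (true, _, _)          => {} // falsifies !x1
        |         }
        |     }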
        
         | gizmo wrote:
         | Pedantic and wrong. Considering there is no known algorithm
         | that can sat solve in P, nobody in the world can do well in the
         | worst case. And if nobody can do it, then you can't do it
         | either, for any you. QED.
        
           | FartyMcFarter wrote:
           | I'm not a native English speaker, but I thought "you can't do
           | well" is equivalent to "it's impossible to do well", which is
           | how I read the original sentence.
           | 
           | As an aside, I find it a bit ironic that you're complaining
           | about pedantry while interpreting the "you" word literally.
        
             | gizmo wrote:
             | It's not ironic that my response was pedantic, it was the
             | entire point. When you engage in pedantic nerd sniping you
             | just open yourself up for even more pedantic nerd sniping.
             | 
             | "Can't" can mean impossible, or it can mean nobody can do
             | it. Compare "you can't go back in time" with "you can't
             | lift a horse".
        
           | gugagore wrote:
           | It's awkward to abbreviate "in polynomial time" as "in P". P
           | is a set of (decision) problems. We don't know that 3-SAT is
           | in P. We don't know that it's not.
           | 
           | To say that a solver is not in P is a type error.
        
           | OgAstorga wrote:
           | Your "proof" it's more like. No one has ever stepped on Mars.
           | Therefore no one will.
           | 
           | Yet another pedantic fallacy.
        
             | __s wrote:
             | Not that no one will, just that no one can
             | 
             | I give you until the end of today to show me someone
             | (anyone!) step on Mars
             | 
             | It was just someone being cheeky about someone being
             | cheeky. Now I'm being cheeky about you being cheeky over
             | someone being cheeky about someone being cheeky
        
       | sischoel wrote:
       | While this is an interesting result, isn't basically every
       | compiler NP-hard, because of register allocation?
        
         | srl wrote:
         | Compilers don't guarantee (or try to find) the absolute optimal
         | allocation, right?
        
           | WJW wrote:
           | This depends on the compiler of course. Most will not because
           | it is not worth it, but I expect that there is at least one
           | compiler out there with the option to exhaustively search the
           | allocation space for that very last nanosecond of runtime
           | optimization. Probably inside an HFT firm somewhere.
        
         | chrisseaton wrote:
         | No - you can express register allocation as an NP-hard problem
         | if you want to, but you don't have to and not all compilers do
         | it like that. You don't need an optimal register allocation -
         | you get into diminishing returns very quickly.
        
           | sischoel wrote:
           | Yes, I hadn't thought about that before, but of course the
           | difference here is that the Rust compiler cannot just produce
           | a "good enough" result for pattern matching.
        
             | kadoban wrote:
             | Well, it probably _could_ do "good enough". If a set of
             | matches gets too large/hairy it could just output a "here
             | be dragons" warning and go ahead with compiling it, no?
             | 
             | That may not count as a correct Rust compiler though, I'm
             | not 100% sure how that's defined, or if it is.
        
             | chrisseaton wrote:
             | Register allocation is a great example of over-
             | theorisation. People found that register allocation can be
             | expressed as graph colouring, so they doubled-down and
             | worked really hard to solve that pure problem, but then it
             | turned out you can just do a very simple linear allocation
             | heuristic and the generated code is basically as fast as
             | solving the NP-hard problem.
             | 
             | https://dl.acm.org/doi/abs/10.1145/330249.330250
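              | 
              | To make "a very simple linear allocation heuristic" concrete,
              | here is a toy sketch of the idea (hypothetical Interval type;
              | a real linear-scan allocator tracks liveness per instruction
              | and picks better spill candidates):
              | 
              |     // Walk live intervals in order of start point, hand out
              |     // free registers, and spill when none are left. Real
              |     // allocators spill the interval with the furthest end
              |     // point and handle splitting/coalescing.
              |     struct Interval { vreg: usize, start: usize, end: usize }
              | 
              |     fn linear_scan(mut ivs: Vec<Interval>, num_regs: usize)
              |         -> Vec<(usize, Option<usize>)>
              |     {
              |         ivs.sort_by_key(|iv| iv.start);
              |         let mut free: Vec<usize> = (0..num_regs).collect();
              |         let mut active: Vec<(usize, usize)> = Vec::new(); // (end, reg)
              |         let mut out = Vec::new();
              |         for iv in ivs {
              |             // Return registers whose intervals already ended.
              |             active.retain(|&(end, reg)| {
              |                 if end < iv.start { free.push(reg); false } else { true }
              |             });
              |             let reg = free.pop();
              |             if let Some(r) = reg { active.push((iv.end, r)); }
              |             out.push((iv.vreg, reg)); // None means "spill to memory"
              |         }
              |         out
              |     }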
        
               | imtringued wrote:
                | Well, the problem for the NP-hard algorithm is that the
                | baseline is already very efficient. It's like branch
               | prediction. Even the dumbest predictors can be 90%
               | efficient. The perfect solution doesn't have much room to
               | be better. I'd say there are more microsecond level
               | optimization problems left than on the nanosecond or
               | millisecond level.
        
         | adrianN wrote:
          | If you do an SSA transform of your program, the register
          | interference graphs are chordal. It turns out that coloring
          | chordal graphs is easy:
         | http://web.cs.ucla.edu/~palsberg/course/cs232/papers/HackGoo...
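          | 
          | As a rough sketch of why coloring is then easy (assuming the
          | ordering is already given; computing one is also cheap): greedily
          | color the nodes in the reverse of a perfect elimination order,
          | always taking the smallest color not used by an already-colored
          | neighbour. For chordal interference graphs this greedy choice is
          | optimal.
          | 
          |     // `adj[v]` lists the neighbours of v in the interference
          |     // graph; `order` is assumed to be the reverse of a perfect
          |     // elimination order. usize::MAX marks "not yet colored".
          |     fn greedy_color(adj: &[Vec<usize>], order: &[usize]) -> Vec<usize> {
          |         let mut color = vec![usize::MAX; adj.len()];
          |         for &v in order {
          |             let mut c = 0;
          |             // Smallest color unused by already-colored neighbours.
          |             while adj[v].iter().any(|&u| color[u] == c) {
          |                 c += 1;
          |             }
          |             color[v] = c;
          |         }
          |         color
          |     }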
        
         | antientropic wrote:
         | Or worse than NP-hard: C++ template instantiation and the C
         | preprocessor are Turing complete. (Except that I believe C++
         | requires some fixed limit on the maximum instantiation depth,
         | but even so it's still exponential complexity.)
        
           | rocqua wrote:
           | I believe you can encode paradoxes in C++ templating.
           | Something about two specializations which are mutually
            | impossible because they reference each other. So you can say:
            | 
            |     A is of type T <=> B is not of type T
            |     B is of type T <=> A is not of type T
           | 
           | Where <=> is if and only if.
           | 
           | In which case it is paradoxical to say that A is of type T.
           | But also paradoxical to say that A is not of type T.
        
             | MauranKilom wrote:
             | Note that "a template for which no valid instantiation is
             | possible" is actually ill-formed, no diagnostic required.
             | 
             | That said, it's not clear to me what part of "you can write
             | uninstantiable templates" is surprising, so I'm probably
             | not understanding correctly. Maybe you could find the C++
             | code you're talking about? I would be interested.
        
           | Banana699 wrote:
           | >the C preprocessor are Turing complete
           | 
           | That's surprising, the C preprocessor is pretty crippled.
            | Its creator intentionally made it that way so people wouldn't
            | abuse it to create full-blown mini languages.
            | 
            | Did you get things mixed up, or is there actually a way the C
            | preprocessor can be Turing-complete?
        
             | namibj wrote:
              | CPP needs to feed into itself (pipe stdout into stdin) if
              | you want it to be Turing-complete.
        
               | gpderetta wrote:
               | I believe that while you can't encode an infinite tape,
               | you can very easily define an arbitrarily large tape, so,
               | while not technically Turing complete, it is close
               | enough.
        
             | mtlmtlmtlmtl wrote:
              | Not sure whether this proves Turing completeness of the
              | preprocessor or not (probably not), but Simon Tatham has a
              | great (and somewhat disturbing) write-up [1] on creating
             | arbitrary control structures using macros in C. He even
             | goes as far as providing a library of helper macros for
             | doing things like making gensyms, etc.
             | 
             | If nothing else, it does demonstrate that you can do a lot
             | more in the preprocessor than you might think, and this is
             | at least a significant subset of what true MP macros are
             | used for in Lisp most of the time. Except much gnarlier...
             | 
             | [1] https://www.chiark.greenend.org.uk/~sgtatham/mp/
        
       | srl wrote:
        | TFA notes that rustc is not a particularly _efficient_ SAT solver,
       | taking something like 10^4 times as long as a commonly used
       | algorithm [1] would on a small example.
       | 
       | So, I'm curious: what's the _strongest_ case one could make for
       | including a more robust SAT solver as part of the exhaustiveness
       | checking? Is there any reasonable place where this could matter?
       | (Extremely autogenerated code?)
       | 
       | [1] https://en.wikipedia.org/wiki/DPLL_algorithm
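        | 
        | (For reference, the core of DPLL is tiny; a minimal, very
        | unoptimized sketch, representing literals as non-zero i32s where
        | +v means "variable v is true" and -v means it is false:)
        | 
        |     fn dpll(clauses: &[Vec<i32>], assignment: &mut Vec<i32>) -> bool {
        |         // Simplify each clause under the current partial assignment.
        |         let mut remaining: Vec<Vec<i32>> = Vec::new();
        |         for clause in clauses {
        |             if clause.iter().any(|l| assignment.contains(l)) {
        |                 continue; // clause already satisfied
        |             }
        |             let undecided: Vec<i32> = clause.iter()
        |                 .copied()
        |                 .filter(|&l| !assignment.contains(&-l))
        |                 .collect();
        |             if undecided.is_empty() {
        |                 return false; // every literal is false: conflict
        |             }
        |             remaining.push(undecided);
        |         }
        |         if remaining.is_empty() {
        |             return true; // all clauses satisfied
        |         }
        |         // Unit propagation: a one-literal clause forces that literal;
        |         // otherwise branch on the first undecided literal.
        |         let unit = remaining.iter().find(|c| c.len() == 1).map(|c| c[0]);
        |         let lit = unit.unwrap_or(remaining[0][0]);
        |         for &choice in &[lit, -lit] {
        |             assignment.push(choice);
        |             if dpll(clauses, assignment) {
        |                 return true;
        |             }
        |             assignment.pop();
        |             if unit.is_some() {
        |                 break; // a forced literal has no second branch
        |             }
        |         }
        |         false
        |     }
        | 
        |     fn main() {
        |         // (x1|x2) & (!x1|x2) & (!x2|x3)
        |         let f = vec![vec![1, 2], vec![-1, 2], vec![-2, 3]];
        |         let mut a = Vec::new();
        |         let sat = dpll(&f, &mut a);
        |         println!("sat: {}, assignment: {:?}", sat, a);
        |     }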
        
         | trissylegs wrote:
          | Rust's proc-macro system makes it pretty easy to create some
          | hard-to-compile code. _But_ there are many ways to generate
          | very slow-compiling code with proc macros.
        
         | Quekid5 wrote:
         | One perhaps-interesting use case might be to use the fastest
         | possible solver in the case of compiling generated/foreign
         | code, so basically code that's expected to always compile (but
         | which still needs to be checked for safety).
         | 
         | One could even imagine using it as a first-pass for code and to
         | just fall back to a slower-but-with-better-error-messages type
         | checker if errors are found.
         | 
         | (Of course you'd end up with two orthogonal implementations of
         | the type checker, but that might not even be a bad thing since
         | you'd be easily able to check for inconsistencies between them,
         | aka. type checker bugs.)
        
         | ygra wrote:
         | I was under the impression that cases where it would matter are
         | typically code that wouldn't usually be written. E.g. C# and
         | Java also can solve SAT as part of overload resolution with
         | generic types and it's an interesting peculiarity, but not a
         | pattern that is interesting enough to optimize. Perhaps that's
         | the case here as well.
         | 
          | I could also imagine that the compiler code being easy to read,
          | understand and modify is often a more valuable trait than the
          | compilation speed of highly atypical patterns in a language.
          | But that may well depend on _how_ atypical those are.
        
         | johncolanduoni wrote:
         | The fact that most dependently typed languages (which have a
         | much higher propensity for complicated exhaustiveness checking
         | when pattern matching) don't use a serious SAT solver for this
         | makes me doubt Rust would ever need to.
        
           | sanity31415 wrote:
           | Doesn't Liquid Haskell[1] use a SAT solver?
           | 
           | [1] https://ucsd-progsys.github.io/liquidhaskell-blog/
        
             | johncolanduoni wrote:
             | Like most languages with refinement types it uses an SMT
             | solver (a more general problem), though I'm not sure it
             | uses it for pattern matching. I was thinking of Coq, Agda,
             | Lean, etc. which are full-spectrum dependently typed
             | languages. LiquidHaskell is more specialized, though its
             | automated proof capabilities are much stronger.
        
         | gpm wrote:
          | If you were using rustc as some sort of JIT compiler, compile
          | times could become extremely important, and error messages less
         | important, at which point things like this might be worth it.
         | 
         | If you wanted the SAT solver for some other reason
         | (optimization? proving code correct?) it might make sense to
         | use it for this too.
        
       | multifasciatus wrote:
       | > Obviously, every formula can be transformed into this form, by
       | applying the rules of Boolean algebra.
       | 
        | Is it so obvious? Why do so many engineers write with these sorts
        | of modifiers (obviously, just, of course, etc.)?
       | 
       | Edit: I am assuming no ill intent from the author. I also use
       | these sorts of modifiers without thinking about them much when
       | writing. I am just curious as to why we as engineers call deeply
       | technical things obvious when they are not.
        
         | rgoulter wrote:
          | I liked the model in this blog post:
          | https://status451.com/2016/01/06/splain-it-to-me/
          | 
          | It discusses the different ways that sharing information can
          | come across. For some people, sharing information is a
          | demonstration that you're smart enough to know the thing. For
          | others, sharing information is for others' benefit.
         | 
         | I wouldn't prefer to interpret "obviously" as "you're not as
         | smart/cool as me if you don't know this", but I can see why
         | there are examples that could come across like that. (I'd
         | sooner interpret it as "I'm not claiming to be smart by saying
         | this").
        
         | bombela wrote:
          | I don't think there's any malice. Just a way to bring up the idea
         | of a sudden realization. Obviously, I could be wrong.
        
         | bruce343434 wrote:
         | Roughly related: why not simply[1]
         | 
         | [1] https://news.ycombinator.com/item?id=27415725
        
         | NieDzejkob wrote:
         | That sounds like something that could be solved with a linter
         | for prose. Though, I can't think of the search terms that would
         | let me find such a solution, if it exists.
        
         | guga42k wrote:
          | Usually this is a polite way to say that the proof/derivation
          | is too long to fit on this page; please look somewhere else if
          | you are interested.
          | 
          | A notable example would be Lev Landau's Theoretical Physics. If
          | you see "obviously" there, it is probably not obvious at all
          | and can take a good few hours to derive the formula.
        
         | joshribakoff wrote:
         | The good faith interpretation is that the author means in the
         | sense of "it follows then that" as opposed to "if you didn't
         | realize this upon reading the previous section then something
         | is wrong"
        
       | EdSchouten wrote:
       | It's only NP-hard if you want the compiler to give you proper
       | warnings/errors, right? As far as I know, the first matching arm
       | is used. This means that if a compiler doesn't do any analysis on
       | it, but naively generates machine code to compare the value, then
       | there is no complexity at all.
       | 
       | (Not saying that that's acceptable...)
        
         | tialaramex wrote:
         | Here's what Rust's compiler actually does today:
         | 
         | https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_bu...
         | 
         | Rust lets you say either "There are four specific possible
         | Clowns, and I want everybody to explicitly deal with that" or
         | you can write #[non_exhaustive] to say "There are several
         | Clowns, right now I defined four of them, but maybe I'll add
         | more some day" in which case _you_ are allowed to only write
         | code for those four Clowns, but everybody _else_ is obliged by
         | the compiler to handle the case with more Clowns than that
          | since they can't depend on there only being four, and they
         | have no idea how to name the additional ones that don't yet
         | exist.
         | 
         | So, a compiler that just punts is not a correct Rust compiler.
         | It needs to do this analysis or it's compiling a different less
         | useful language.
         | 
         | In the real world the (default) exhaustive case is great for
         | types that reflect some truth about the world being modelled,
         | where you really do mean that all the code relying on this
         | enumerated type should refuse to compile and needs re-
         | engineering if your list is subsequently discovered not to
          | really be exhaustive. Teapot notwithstanding, there are
         | definitely only five PlatonicSolids, such as Tetrahedron and
         | Cube. The non-exhaustive case is great for types that reflect a
         | truth you know or strongly suspect will change and everybody
         | using your system needs to be prepared for that up front. In
         | 2016 the enumerated list of US FederalHolidays didn't need
         | Juneteenth, but in 2021 it does, we don't want random code to
         | blow up because it had never imagined Congress would make new
         | ones.
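          | 
          | A minimal sketch of the difference, using the examples above
          | (hypothetical enums; note that #[non_exhaustive] only forces the
          | wildcard arm on code in _other_ crates, the defining crate can
          | still match exhaustively):
          | 
          |     // Exhaustive by default: listing all five variants satisfies
          |     // the checker, and adding a sixth variant later breaks this
          |     // match at compile time, which is exactly what you want for
          |     // a "truth about the world" type.
          |     enum PlatonicSolid { Tetrahedron, Cube, Octahedron, Dodecahedron, Icosahedron }
          | 
          |     fn faces(s: PlatonicSolid) -> u32 {
          |         match s {
          |             PlatonicSolid::Tetrahedron => 4,
          |             PlatonicSolid::Cube => 6,
          |             PlatonicSolid::Octahedron => 8,
          |             PlatonicSolid::Dodecahedron => 12,
          |             PlatonicSolid::Icosahedron => 20,
          |         }
          |     }
          | 
          |     // Opt out: downstream crates must keep a wildcard arm, so new
          |     // holidays can be added without breaking their matches.
          |     #[non_exhaustive]
          |     pub enum FederalHoliday { NewYear, IndependenceDay, Juneteenth }
          | 
          |     // In another crate:
          |     // match holiday {
          |     //     FederalHoliday::Juneteenth => party(),
          |     //     _ => work(), // required by #[non_exhaustive]
          |     // }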
        
           | EdSchouten wrote:
           | Sure. But a naive compiler whose only purpose is to stomp out
           | a poorly optimized executable can safely ignore that
           | annotation, right?
        
             | richardwhiuk wrote:
             | No, because checking that it is exhaustive is required by
             | the language definition.
             | 
             | It's not just not optimized, it's wrong.
        
             | tialaramex wrote:
             | You could instead require everything to be treated as non-
             | exhaustive in your cheap compiler. This means it would
             | reject correct Rust code that explicitly handles each of
             | the five PlatonicSolids differently, because that code
             | lacks a way to handle a sixth (non-existent) PlatonicSolid.
             | 
             | You can't just go "Eh, shrug emoji" without deciding what
             | happens when you don't match.
             | 
             | Suppose you silently treat the first (or last) handled case
             | as the default, so I can write code that only handles Cubes
             | and unlike the real Rust compiler your naive compiler spits
             | out a program. But wait, now what happens for a
             | Tetrahedron? I defined Cubes to have a 32-bit floating
             | point size, but for whatever reason Tetrahedrons have a
             | 16-bit integer length instead. Does your compiler just try
             | to treat the 16-bit integer as a 32-bit floating point
             | number since it assumes this was a Cube?
        
             | kevincox wrote:
             | Yes, that annotation doesn't affect the codegen of a
             | correct program (optimizations aside).
        
         | NieDzejkob wrote:
         | It still needs to ensure that one of the arms will always
         | match. (just not executing any branch in that case is tricky,
         | because the arms can return values)
        
           | remram wrote:
           | It does not need to check, if it _assumes_ it. mrustc works
           | that way, it is not trying to check code, only compile valid
           | code: https://github.com/thepowersgang/mrustc
           | 
            | This is similar to C/C++'s "undefined behaviors". There are
           | things you can write that aren't valid for the language, but
           | that compilers are not required to catch.
        
       | kadoban wrote:
       | I had not thought of pattern-exhaustiveness checking like that.
       | 
       | Isn't compiling several languages, including Rust, actually
       | undecidable though? Seems like if your type system or
       | metaprogramming systems are Turing complete, this has to be the
       | case.
       | 
       | So that's ~worse than NP-hard already (though calling it NP-hard
       | is still technically correct).
        
         | trissylegs wrote:
          | I think in Haskell you need to add:
          | 
          |     {-# LANGUAGE UndecidableInstances #-}
          | 
          | to get an undecidable type system.
        
           | arianvanp wrote:
            | There are other language extensions that make the Haskell type
            | system undecidable. The classic example is `RankNTypes`.
            | 
            | E.g. you can prove that type systems of rank n >= 3 are
            | undecidable. [0]
            | 
            | I think that's why Haskell has both Rank2Types and RankNTypes
            | as language extensions. So you can still use higher-rank
            | types without running into decidability problems up to rank
            | 2.
           | 
           | [0] https://dl.acm.org/doi/10.1145/182409.182456
        
         | arianvanp wrote:
          | Type checking/compiling is often conservative for this
          | purpose. It's sound (Rejects all incorrect programs),
          | decidable (There is an algorithm to reject all incorrect
          | programs) but due to Gödel's decidability problem they're
          | incomplete (There are always correct programs which the type
          | checker rejects as incorrect).
          | 
          | There are exceptions of course (C++ templating is Turing
          | complete, so type-checking is undecidable), but often you want
          | your compiler to be sound and decidable.
          | 
          | Also see https://3fx.ch/typing-is-hard.html
          | 
          | (edit: Apparently, as the link I posted shows, Rust is indeed
          | undecidable in certain edge cases. However, you can usually
          | reason about a subset of the language (restrict yourself to
          | certain features) that is still decidable. I have the feeling
          | Rust might have such a subset, but it's just a fuzzy feeling.)
        
           | thaumasiotes wrote:
           | > It's sound (Rejects all incorrect programs), decidable
           | (There is an algorithm to reject all incorrect programs) but
           | due to Godel's decidability problem they're incomplete (There
           | are always correct programs which the type checker rejects as
           | incorrect).
           | 
           | Isn't that the definition of "semidecidable" as distinct from
           | "decidable"? A decidable question is one where you can
           | determine the answer, whatever that answer may be. A
           | semidecidable question is one where, if the answer is yes,
           | you can determine that, and if the answer is no, you may not
           | be able to determine that.
           | 
           | If you can always show that a program is incorrect, but you
           | can't necessarily show that a program is correct, then the
           | correctness of a program isn't decidable.
        
             | arianvanp wrote:
              | Decidability here means "can we come to an answer in a
              | finite number of steps". If you have an algorithm for type
              | checking, then that answer is yes! You're _always_ able to
              | come up with an answer that is yes or no.
              | 
              | For example, the following algorithm for typechecking is
              | decidable:
              | 
              |     typechecks :: Program -> Bool
              |     typechecks _ = False
              | 
              | We always come to a decision. There is no "semi" here. Just
              | a clear yes/no answer.
              | 
              | The algorithm is even sound. It rejects all incorrect
              | programs.
              | 
              | However, it is not complete. It might also reject correct
              | programs (in this case it rejects all correct programs).
              | 
              | Of course you want a typechecker that is as little
              | conservative as possible without getting into trouble. But
              | it's always conservative to some extent. Preferably you
              | would be both sound _and_ complete. But the problem is that
              | any reasonable logic system can't prove its own
              | consistency. Hence we can't have both. However, we can get
              | "as close as we possibly can".
        
               | kmill wrote:
               | I believe you're explaining what a computable function
               | is. A decidable set S of a type T is defined to be one
               | for which there is a computable function f : T -> Bool
               | such that the set of values x with f x = true is equal to
               | S. A semidecidable set S of a type T is defined to be one
               | for which there is a partial computable function f : T ->
               | Bool such that whenever x is in S, then f x is computable
               | and equals true.
               | 
               | Given a definition for a well-typed program, you get a
               | set W of Program of the well-typed programs. You can ask
               | whether W is decidable or semidecidable -- i.e., whether
               | there is some (partial) computable function f : Program
               | -> Bool with the above properties.
               | 
               | The example you give is certainly computable, but it has
               | nothing to do with the set W, so it says nothing about
               | (semi)decidability of the typechecking decision problem.
               | 
                | However, it is valid to try to find the largest subset
                | of W that is (semi)decidable. Or to try to redefine what
                | it means to be well-typed so that it is (semi)decidable!
               | 
               | One interesting case of a programming language without a
               | semidecidable typechecking problem is Lean. The
               | typechecker they've implemented will both reject or time
               | out on valid programs (so the typechecker is at least a
               | computable function in a sense), but in practice these
               | cases don't really occur.
        
               | thaumasiotes wrote:
               | > We always come to decision. There is no "semi" here.
               | 
               | That's not really how I understand the "semi" in the
               | terminology. For a decidable question, you can recognize
               | when the answer is yes, and you can recognize when the
               | answer is no. For a semidecidable question, you can
               | recognize when the answer is yes.
               | 
               | It doesn't take much of a stretch to describe that as
               | cutting your capabilities in "half".
               | 
               | The way I usually think about this is that with full
               | decidability, you can rule something in or out. With
               | semidecidability, you can rule something in, but you
               | can't rule things out.
               | 
               | That framework extends well to the situation described
               | above, where if the compiler chooses to compile a
               | program, then the typing in that program was valid, and
               | if the compiler chooses to complain about the typing, we
               | can't draw any conclusions. It doesn't match your
               | example; you can't rule anything in or out.
        
             | cedilla wrote:
              | Semi-decidable means that it will accept all correct
              | formulas, and either rejects incorrect formulas or gives no
              | answer. It's the same as having a generator that generates
              | every correct formula eventually.
              | 
              | In my opinion, OP is incorrect and those systems are sound
              | but undecidable.
        
               | thaumasiotes wrote:
               | > Semi-decidable means that will accept all correct
               | formulas, and either rejects incorrect formulas or gives
               | no answer.
               | 
               | By this definition, program incorrectness isn't
               | semidecidable either - the compiler will accept all
               | correct [incorrect] formulas, and will either reject or
               | accept incorrect [correct] formulas.
        
               | ThreeFx wrote:
               | Not really, because for undecidable problems
               | semidecidability can only hold one-way. If it held in
               | both ways then the language would be decidable.
               | 
                | Take the halting problem for example: For a program p
               | which halts, you can determine whether it will halt in
               | finite time (because by definition it halts). However,
               | there is no semidecidable algorithm for "this program
               | runs infinitely long".
        
               | thaumasiotes wrote:
               | I don't understand what you're trying to say. What are
               | you contradicting with "Not really"? The claim above you
               | is "by this definition, program incorrectness isn't
               | semidecidable either". You're saying that it is?
        
           | LadyCailin wrote:
           | Sort of a side track, but there's a great layman's intro to
            | this in this Veritasium video:
           | https://m.youtube.com/watch?v=HeQX2HjkcNo
        
           | jcelerier wrote:
           | > There are exceptions of course
           | 
           | for a definition of "exceptions" being "virtually all
           | languages with a non-trivial static type system and non-zero
           | userbase" as shown in your link.
           | 
           | Java (https://arxiv.org/abs/1605.05274), C#
           | (https://blog.hediet.de/post/how-to-stress-the-csharp-
           | compile...), Scala
           | (https://michid.wordpress.com/2010/01/29/scala-type-level-
           | enc...), Haskell, Rust
           | (https://sdleffler.github.io/RustTypeSystemTuringComplete/),
           | Typescript
           | (https://github.com/Microsoft/TypeScript/issues/14833) ...
           | 
           | Having a type system with generics and not being turing-
           | complete as a result _is_ the exception, not the rule.
        
             | octachron wrote:
             | Generics is not really the right cutting point: parametric
             | polymorphism is perfectly decidable. However, few languages
             | stop at just parametric polymorphism. For instance, the
             | OCaml issue in the linked blog post happens at the module
             | level, which mixes parametric polymorphism and subtyping.
             | And typically, while SML has parametric polymorphism, it
             | lacks the OCaml extension that makes the module type system
             | undecidable.
        
             | gnulinux wrote:
              | For an example of the contrary, one can check Agda. It is not
             | Turing complete, but a useful language (I personally wrote
             | many parsers in it and its Haskell FFI is pretty good so
             | you can essentially write unsafe parts of your programs in
             | Haskell, and safe parts in Agda then prove correctness).
        
             | arianvanp wrote:
              | Haskell's System F is definitely decidable. You need to opt
              | in to undecidable features like RankNTypes (n >= 3)
              | explicitly (which are GHC-specific, not Haskell).
              | 
              | Even then, undecidability is a bit of a spectrum. E.g.
              | RankNTypes becomes sort of decidable again in the presence
              | of explicit type annotations.
        
         | Zababa wrote:
         | Rust's type system is undecidable. You can see a few languages
         | listed here: https://3fx.ch/typing-is-hard.html
        
           | silon42 wrote:
            | I believe there are limits to recursion (as mentioned in the
            | link above), so not strictly true. Perhaps they are too big?
        
             | kibwen wrote:
             | Rust's recursion limit is fairly low, I believe it's 128.
             | However, you can override this if you find it too
             | restrictive.
        
         | simiones wrote:
         | This ends up in some discussion about what "compiling" actually
         | means. Is macroexpanding a part of "compiling"?
         | 
         | Most languages with complex type systems avoid Turing
         | completeness by having an explicit limit for all type level
         | expansions - at least, that is how Java and C# do it.
        
           | kadoban wrote:
           | I think metaprogram expansion has to be considered part of
            | compiling. You have source code, you need machine code. If the
           | lang says that the process of getting there requires running
           | some metaprogram, so be it.
           | 
           | For Rust I think the type system itself is probably enough
            | though, even without that. If neither type checking nor
            | metaprogramming is part of compiling, I think your (for
           | hypothetical you) definition of compiling is a bit too
           | restrictive.
        
             | aj3 wrote:
             | Even makefiles are Turing complete.
        
         | BBC-vs-neolibs wrote:
         | Thanks, I couldn't quite put it in words, but that is exactly
         | my reaction.
        
         | jacoblambda wrote:
         | Compiling a complete language and verifying correctness is
         | undecidable but that's not necessarily what Rust does.
         | 
          | The compiler can still fail to compile/verify otherwise correct
         | parts of a program. Because it's only operating on a subset of
         | the entire possible language and it's not verifying all
         | properties of the language, it's not quite undecidable but it
         | is still very much NP-hard.
         | 
         | Languages like C, C++, or Ada on the other hand take the other
         | approach. They compile all parts of the language but make
         | little to no attempt to enforce correctness on all those parts.
          | You see verifying compilers for those languages only allowing a
          | subset that they can actually verify, which is the same as Rust.
         | 
         | At the moment Rust the language is what the main Rust compiler
         | says it is but once there are meaningfully different compilers
         | you'll start to notice the issue a bit more and there will
         | likely be parts of the language that one compiler can verify
         | but the other can't (and therefore fail to compile).
        
           | kibwen wrote:
           | Any hypothetical standards-conformant Rust implementation
           | would have to be as or more restrictive than rustc, not less.
           | Something like mrustc, which performs no borrow checking,
           | would not be allowed to call itself a conformant Rust
           | implementation.
           | 
           | (I say hypothetical because until there's a spec, it's simply
           | not feasible to build a conformant alternative
           | implementation, as you'd need to be bug-for-bug compatible
           | with rustc.)
        
             | ncmncm wrote:
             | This is also not correct. As in, completely wrong.
             | 
             | Presuming a formal spec that rustc happens to enforce,
             | another compiler could simply try longer to resolve types
             | before giving up, admitting some set of programs that rustc
             | gives up on.
        
               | tialaramex wrote:
                | The spec could specify exactly the rules rustc actually
               | has including iteration limits, this would freeze rustc
               | (and so it would be undesirable) but it is an option.
               | 
               | The thing we're talking about here has changed after Rust
               | 1.0, Rust 1.0 shipped with a rule that said if you match
               | integers you have to provide a default. In lots of code
               | that feels natural. But not everywhere. If you match all
                | possible integers (e.g. for an i8 that's from -128 to
               | 127), the compiler would say "Not good enough" until you
               | write the default match it will never actually need.
               | 
               | That was fixed with feature(exhaustive_integer_patterns)
               | in 2018 or so AIUI.
               | 
               | But you could imagine Rust being standardised with the
               | old behaviour and, even though it's clearly _possible_ to
               | write a compiler which implements
               | feature(exhaustive_integer_patterns) that would then be
               | non-standard because Rust programs lacking the default
               | match for integers are forbidden in the standard.
        
               | ncmncm wrote:
               | The type system is also Turing-complete, therefore non-
               | terminating. It would be arbitrarily hard to write a spec
               | in a non-Turing-complete language such that all
               | implementations would admit the same set of programs, or
               | even (responsive to the original claim) reliably only a
               | smaller set.
        
           | ncmncm wrote:
           | > _[Rust is] not quite undecidable but it is still very much
           | NP-hard._
           | 
           | > _Languages like C, C++, or Ada ... make little to no
           | attempt to enforce correctness_
           | 
           | Both statements are laughably false, but for different
           | reasons.
           | 
           | Both are based on the Doctrine of Rust Exceptionalism, which
           | requires that Rust is not just a variation on a theme, but
           | fundamentally different from other languages. Like most
           | doctrines, this one is indefensible.
           | 
            | Rust compilation, like C++ and Haskell, is undecidable, not
           | just NP-hard. Non-terminating compilation is artificially
           | curtailed, thus failing to produce an open set of what would
           | have been correct programs.
           | 
           | The Rust compiler performs certain correctness checks that
           | other languages do not. It rejects an infinite set of correct
           | programs that its correctness checks cannot resolve. In this
           | as in all other particulars, Rust is very much on the
           | continuum with other languages. All strongly-typed languages
           | perform a huge variety of correctness checks, some built-in,
           | others programmed, with great success, rejecting the large
           | majority of incorrect programs people write. As a
           | consequence, it is very common for Rust, C++, and Haskell
           | programs to run correctly the first time, once the compiler
            | is satisfied.
        
         | volta83 wrote:
         | > Isn't compiling several languages, including Rust, actually
         | undecidable though?
         | 
         | While Rust is Turing complete, the compiler has a recursion
         | limit, and if you take that into account, then compiling Rust
         | is decidable (you only need to evaluate compilation up to the
         | recursion limit).
        
           | ghosty141 wrote:
            | Rust's type system is Turing complete if I recall correctly.
           | Obviously, in practice everything is in a way decidable
           | because we have limitations everywhere.
        
             | derefr wrote:
             | In other words, the _abstract machine specification_ you're
             | programming your compile-time logic against in Rust, _has_
             | Turing-complete semantics; but its extant _implementations_
             | -- any Rust compiler people would care to use --
              | intentionally do not implement compile-time Turing-
             | completeness.
             | 
             | But you could in theory write a Rust compiler whose
             | compile-time runtime _is_ fully Turing-complete; and that
             | compile-time runtime would be conformant to the Rust
             | language spec. (Just, nobody would want to use it, because
             | "unbounded runtime" isn't a property people tend to want
             | from their compilers.)
        
               | dllthomas wrote:
               | > But you could in theory write a Rust compiler whose
               | compile-time runtime is fully Turing-complete
               | 
               | Only if your theory allows for infinite storage...
        
               | einpoklum wrote:
               | Well, my infinite tape seemed to work just fine last time
               | I checked.
               | 
               | Of course, I didn't check ad infinitum. <rim shot>
        
             | lovasoa wrote:
             | There is a low hardcoded limit to how deep you can nest
             | types. But I don't think there is a hardcoded limitation to
             | how large match patterns can be. That's the difference.
        
               | smitop wrote:
               | It's possible to change that hardcoded limit. By default
               | it is 256, but you can bring it all the way up to 2^64
               | with #![recursion_limit = "18446744073709551615"].
        
           | pjmlp wrote:
           | Same applies to C++ templates by the way, the compilers
           | usually provide an expansion depth flag.
        
       | comonoid wrote:
       | "I'm not just hard, I'm NP-hard, baby".
        
         | speed_spread wrote:
         | Wanna solve my constraints?
        
       | not2b wrote:
        | So just use a SAT solver to check whether pattern
       | matching is exhaustive. Any decent one will have much better
       | performance than the current approach in the Rust compiler for
       | complex cases. The interesting thing about SAT is that although
       | the worst case is exponential (as far as we know until someone
       | proves P = NP), even huge real-life problems (from circuits,
       | program analysis etc) tend to have structure that makes them
       | quickly solvable with a SAT solver. And if not, have a time
       | budget and report "too complex to solve" if exceeded.
        
         | thaunatos wrote:
          | It's worth noting that the author of the article doesn't think
         | it's worth integrating a SAT solver:
         | 
         | > Does this mean that rustc should integrate an industrial-
         | strength SAT solver? As hilarious as that would be, I'm
         | advocating no such thing. This will only be a performance issue
         | on pathological examples crafted by bored nerds, and I don't
         | think precious engineering time should be spent on that.
         | Besides, generalizing a SAT algorithm to handle the full
         | expressive power of Rust's patterns might be, to borrow some
         | language from mathematicians, non-trivial.
        
       | Ericson2314 wrote:
       | > Does this mean that rustc should integrate an industrial-
       | strength SAT solver?
       | 
       | Well, this is actually what https://github.com/rust-
       | lang/rfcs/blob/master/text/1868-port... would require. There is
       | already chalk, a logic programming language for traits too.
       | 
       | I don't balk at these things. A good logical / relational toolbox
       | is very useful for a lot of tasks.
        
       | cletus wrote:
       | Compiling really is the Achilles heel of Rust I feel. This
       | article really talks about the worst case. As others have noted,
       | Haskell and other languages have this property too.
       | 
       | Here's another perspective on practical issues [1] that's worth
       | reading.
       | 
       | Basically, Rust made some understandable but unfortunate design
       | decisions that are hard to walk back or change now.
       | 
       | [1]: https://pingcap.com/blog/rust-compilation-model-calamity
        
       | w-m wrote:
       | Presumably there's another SAT solver in the realm of the Rust
       | tool chain: cargo surely has one for dependency resolution. This
       | is a big time sink in other package managers (looking at you,
       | Anaconda). Does someone have an insight how solving is
       | implemented in Cargo?
        
         | littlestymaar wrote:
         | There is one[1] but Rust allows different major versions of the
         | same package to exist in parallel, so dependency resolution is
         | usually trivial unless someone in the dependency tree
          | explicitly adds an upper limit on the acceptable version [2],
         | which is pretty uncommon (I don't think I've ever met one in
         | practice).
         | 
         | [1]: https://github.com/rust-
         | lang/cargo/blob/master/src/cargo/cor...
         | 
         | [2]: https://doc.rust-lang.org/cargo/reference/specifying-
         | depende...
        
       | donatj wrote:
       | I had rare occasion to compile a Rust application recently
       | because a prebuilt binary didn't exist for my platform. It took
       | just over two hours! Coming from primarily Go development
       | recently, I was astonished.
       | 
       | How do you get anything done when the cost of testing an idea is
       | several hours?! I'd feel like the sunk cost fallacy would kick in
       | for even the most lackluster of changes just because you took the
       | time to try it.
        
         | spockz wrote:
          | Compiling linkerd-proxy also took me north of an hour on an
         | Intel 6850k. Most of the time was spent compiling dependencies
         | though. I also compiled envoy proxy which also took more than
         | an hour.
         | 
         | Subsequent rust builds for linkerd-proxy were quite fast. For
         | envoy most time was spent in Bazel itself. Probably because I
         | just did a Bazel build instead of specifying a more specific
         | target.
        
           | mkj wrote:
            | Out of curiosity I tried linkerd2-proxy here on an i5-1145G7
           | (I guess similar total CPU? Less cores, laptop model, newer).
           | Took 5 minutes but used ~6GB resident. Maybe yours was
           | swapping a lot?
        
         | glennpratt wrote:
         | I worked on a Rust project at work for several months. Compile
         | times were not a significant issues for developers where it was
         | generally cached. For example, repeatedly running a test only
         | compiled a small amount of code.
         | 
         | It was a bit slow in CI/CD build environments with no cache,
         | but so are the k8s Go projects I work on (go mod pulling in the
         | world).
         | 
         | The only thing approaching 2 hours I've ever seen is building
         | the compiler from scratch.
        
         | yakubin wrote:
         | Although Rust compile times are rather bad, IME such
         | astronomical compile times are rather a result of terrible code
         | quality, a lot of code doing essentially nothing, just gluing
         | glue, to the point where it's normal to get a 50-60 frames deep
          | stack trace with most functions named in a way that suggests
          | they all do basically the same thing. Then you look at their
          | bodies and... yep... doing the same thing, aka nothing.
         | 
         | With Rust, procedural macros add a lot of overhead when
         | compiling, so I try to avoid them.
         | 
         | At work, we have a C++ project which, without using distributed
         | builds with distributed ccache, running on powerful cloud
         | computers, takes 3h+. Code quality is adequate. Debugging
         | anything is a nightmare in such a codebase.
        
           | spoiler wrote:
           | While I agree that the compile times are bizarre, I don't
           | think we can jump to conclusions such as "such astronomical
           | compile times are rather a result of terrible code quality"
           | without knowing more about what's being compiled and under
           | what conditions it was being compiled!
        
         | gameswithgo wrote:
         | Short answer is it doesn't normally take that long. You also
         | don't have the problem of cgo being slow or ever having to wait
         | for the data race checker to run.
         | 
          | The initial compile has to download and compile all
          | dependencies. After that, compiling is incremental. Still
          | slower than Go, though.
        
         | adwn wrote:
         | Compiling Servo, a _web browser engine_ , takes 5-6 minutes for
         | an optimized build from scratch on my 6C/12T machine. So no, 2
         | hour build times are not normal for all but the largest
         | software projects.
        
           | genghizkhan wrote:
           | Compiling paru, an AUR helper for Arch Linux, on the other
           | hand, takes me around 3 - 4 minutes on a 4C/8T i5 for an
           | optimised build. I think Rust compile times might just fall
           | within a narrow range.
        
         | Kaze404 wrote:
         | Compiling for production and compiling for development are two
          | very different processes, with the latter benefiting from fewer
         | optimizations and incremental compilation. It wouldn't take two
         | hours to test an idea.
        
         | wongarsu wrote:
         | In my experience, Rust compilation times aren't that different
         | from what I'm used to in the JavaScript world. Initial
         | compilation takes a minute or two (comparable to the runtime of
         | `npm install` on Windows), when you make changes you usually
         | get incremental compilation within a few seconds.
         | 
         | I guess you can have projects that are just huge and/or run
         | into some pathological case that increases compile time a lot
         | (just like with C++), but for any subsequent compilations you
         | should get very fast incremental builds.
        
         | jackcviers3 wrote:
         | I get this question about large scala codebases too - clean
         | build and test cycles for things on underpowered containers in
         | cloud build pipelines are in the tens of minutes sometimes.
         | 
         | One: my local dev machine has 16 gigs of memory and 16 cores.
         | What takes the tiny docker container with 1 gig and 2 vcpus 30
         | minutes takes my computer about 1.
         | 
         | Two: Incremental compilation and testOnly make build/test
         | cycles maybe a second / twenty seconds max, and most of that is
         | test runtime on complex property based tests.
         | 
         | You just get by without a clean compile most of the time after
         | you build the project the first time. And really, a lot of
         | builds spend an inordinate amount of time just pulling in
         | external dependencies (which are also cached on my local
         | machine, but not on the containerized builds a lot of the
         | time).
        
         | lmilcin wrote:
         | I have worked for 3 years on a project where it took a whole
         | week to get the code compiled, signed by an external company
         | and deployed to the device so that I could see the results.
         | 
         | I just learned to work without compiling for a long time. Over
         | time my productivity increased and the number of bugs fell
         | dramatically.
         | 
         | Working this way requires you to really think about what you
         | are doing, which is always a good idea.
         | 
         | This was over a decade ago and now I work mostly on Java
         | backends and I am happy that I typically spend days or even
         | weeks without ever compiling the code and that it usually works
         | the first time I run it.
         | 
          | I can't imagine how I could go back. It looks really strange to
         | me to observe other developers constantly compiling and running
         | their code just to see if it works. It kinda looks as if they
         | did not exactly understand what they are doing because if they
         | did, they would be confident the implementation works.
         | 
         | The only time when I actually run a lot of compile/execute
         | iterations is when I actually don't know how something works. I
         | typically do this to learn, and I typically use a separate toy
         | project for this.
        
           | nine_k wrote:
            | Often you can be certain that you know what to expect from
           | your own code, but not from dependencies or external systems.
           | So checking that your assumptions about them are right is a
           | major reason to run and rerun your code.
        
             | lmilcin wrote:
              | That is a good reason to minimize the number of dependencies and
             | only use the ones you know and can reason about. It is also
             | part of what I do to help me reason about code before I
             | execute it.
             | 
             | As I mentioned, if I don't understand a dependency or
             | external system I make separate toy project where I can
             | rapidly experiment and learn.
             | 
              | Think of it as having fun in an aircraft simulator. You play
             | with it so that you are prepared to fly the actual plane.
             | 
             | Also checking your assumptions by trying to see if the code
             | works is a problem in itself. A lot of these broken
             | assumptions will not break your code immediately but maybe
             | sometime later. Maybe when your application is under load
             | or maybe when clock on the server is moved back by an hour
             | or maybe when the connection breaks in a certain way.
             | 
             | Base your work on _knowing_ how something works and not
             | _assuming_.
             | 
              | The best way to limit the risk of your application failing due
             | to broken assumption is to limit your reliance on
             | assumptions in the first place.
        
           | marcosdumay wrote:
            | I used to use that approach, but the new heavily typed languages
           | bring a lot of really nice tools that you only get to use at
           | compile time.
           | 
           | Specifically in Rust, you can use the language to guide you
           | through things like refactoring, thread synchronization,
           | correctness verification and a lot of other things. But you
           | have to run the compiler.
        
             | lmilcin wrote:
             | I don't write production code in Rust (though I learn for
             | my embedded hobby).
             | 
             | But you can say the same for Java. IntelliJ IDEA internally
              | does the equivalent of compilation and tells me exactly where
             | my code would fail to compile.
             | 
             | So in a sense I am not strictly practicing my approach, but
              | I also don't see a reason to do so if the tools reliably
              | give me hints when I make a mistake writing something that
             | will not compile.
        
               | slunk wrote:
               | > So in a sense I am not strictly practicing my approach
               | 
               | Developing in an IDE that compiles almost continuously is
               | about as far from the development philosophy you're
               | advocating for here as one could get :P
        
               | lmilcin wrote:
               | That doesn't make any sense.
               | 
               | This isn't about throwing away tools for some idealized
               | goal. It is about using the tools that are available to
               | achieve best results without making you reliant on the
               | tools to the point you don't know what your program is
               | going to do without compiling and running.
               | 
                | An IDE helps catch a lot of stupid simple mistakes, and
                | that helps save time. Why would that be bad?
        
               | slunk wrote:
               | I don't think using an IDE to catch lots of stupid simple
               | mistakes is bad. It's how I prefer to work.
               | 
               | > It looks really strange to me to observe other
               | developers constantly compiling and running their code
               | just to see if it works. It kinda looks as if they did
               | not exactly understand what they are doing because if
               | they did, they would be confident the implementation
               | works.
               | 
               | Explain to me how this statement doesn't apply to your
               | use of an IDE, but the other engineers you've observed
               | don't understand what they're doing.
        
               | [deleted]
        
               | lmilcin wrote:
               | If you can't read that sentence with comprehension none
               | of my explanations are going to help.
        
               | SV_BubbleTime wrote:
               | How do you like embedded Rust?
               | 
               | I'm looking forward to someone making a legit IDE/suite
               | with support, no indication of it yet but I assume some
               | day!
        
               | lmilcin wrote:
               | I mainly work with STM32 Cortex-M3 MCUs (again, these are
               | my personal projects).
               | 
                | Rust, well, "works". But there are still a bunch of
                | issues, so I keep developing in C until I get the kinks
                | ironed out.
        
               | steveklabnik wrote:
               | It's working really well for us at Oxide. I even do it on
               | Windows :)
        
           | hinkley wrote:
           | My theory for Java is that it was frog boiling turned cargo
           | culting.
           | 
           | Comparatively speaking Java compilation was very fast at the
           | beginning, so for instance Red-Green-Refactor (RGR) works
           | pretty well. There's a parallel to other creative jobs where
           | sometimes shuffling around the things you already have
           | reveals a pattern that leads to a breakthrough.
           | 
            | But there are other feedback loops where, with J2EE in
            | particular, the cycle times started to creep up, and up, and
           | at some point if you haven't stopped and looked at what
           | you're doing you don't see how crazy things have gotten. RGR
           | still has a place there because you are typically _not_
            | recompiling everything and you aren't spooling up the
            | application to run unit tests. But making one-line changes to
           | see how a page loads is just bonkers amounts of busy work.
           | 
           | One of the bad dynamics is that people more like you also
           | tend to memorize the code, which is both bad for new hires
            | (circular logic does not reveal itself when you introduce one
            | assertion at a time, but does when you get hit with all of
            | them at once), and also incentivizes you to push back on
            | refactoring. Because those damned smartasses keep moving
           | things around and they were Just Fine where they were. If
           | that happens you have cemented the entire codebase and
           | anything that is really wrong with it is going to stay wrong
           | until someone proposes a rewrite. And having learned the
           | wrong lessons the first time, we repeat them again in the
           | second iteration.
        
           | random_kris wrote:
            | I am one of those devs that is always compiling and checking
            | if their program is working. I am mainly working with Java
            | and still doing this.
            | 
            | Less so than when working with JavaScript. But please teach
            | me your ways haha
        
             | ransom_rs wrote:
             | In case anyone else does this and is new to Rust - you can
             | use `cargo check` to type check/borrow check/everything
             | else without doing any codegen
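              | 
              | For illustration, here is a minimal, hypothetical example
              | of the kind of error `cargo check` reports without running
              | any codegen:
              | 
              |     fn main() {
              |         let mut v = vec![1, 2, 3];
              |         let first = &v[0];      // immutable borrow of `v`
              |         v.push(4);              // cannot borrow `v` as mutable here
              |         println!("{}", first);  // immutable borrow still in use
              |     }
              | 
              | `cargo check` runs the same type and borrow checks as
              | `cargo build`; it just stops before generating machine
              | code, which is why it is so much faster.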
        
               | lmilcin wrote:
                | This thread is no longer about Rust or checking whether
                | the code compiles or not. It is about how you can
               | work with compilation times that are longer than a coffee
               | break.
        
             | lmilcin wrote:
             | Start by understanding that this compile/run process is a
             | crutch. Rather than use your knowledge, experience and
             | intelligence to predict if it works you resign yourself to
             | just waiting to see.
             | 
             | This is a problem for many reasons. One is that this may
             | help you get something working, but with lack of deep
             | understanding the outcome will likely be subpar if only
             | because not all problems that you could have predicted will
             | show themselves on execution.
             | 
              | Another is that it basically shrinks the part of the brain
              | that is necessary for understanding and predicting the
              | behavior of your code (figuratively). Sort of like how
              | driving with GPS makes me helpless without it.
             | 
             | Try to write larger stretches of code without compilation.
             | 
             | Try to focus on modularizing your application so that you
             | can reason about modules separately. This is always a good
             | idea, but it is even more important when you need to be
             | able to predict how something works without trying it.
             | 
             | When you have finally compiled your code and it failed, do
             | not immediately go to fix the problem. Try to spend a
             | moment to learn from the failure and improve your process
             | so that you minimize chance of this happening in the
             | future.
             | 
              | Ask yourself: what could you have done to prevent this
              | problem from happening? Could you have specified some
              | function or module better? Could you have simplified your
              | code to be able to better reason about it? Would it help if
              | you had spent a bit more time getting acquainted with this
              | internal or external library before you decided to use it?
             | 
              | From my experience, most of this comes down to the
              | following things:
             | 
             | - defining your modules and APIs correctly -- badly defined
             | modules make it difficult to predict the behavior,
             | 
             | - finding simple solutions to problems -- complex solutions
             | tend to make it difficult to predict behavior,
             | 
             | - using only tools you understand,
             | 
             | - only interacting with existing code after you have
             | understood how it works (I typically at least look over the
             | code that I plan to use),
             | 
             | - thinking hygiene (make sure you base your work on hard
             | facts and not beliefs),
             | 
             | - refactoring, refactoring, refactoring -- first solution
             | to the problem is rarely optimal. I write something that
             | works and then immediately keep refactoring it removing any
             | unnecessary complexity until I am satisfied. Don't leave
             | refactoring for later -- when you have just written a piece
             | of code it is easiest to change it.
             | 
              | - as much as possible, writing your code in a way that it
              | is not even allowed to produce a wrong result. This is a
              | very large topic so I won't explain it here; there are a
              | lot of techniques that you can research (one is sketched
              | below).
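              | 
              | For example, one common technique in that last category (a
              | minimal Rust sketch with made-up types, not anything from
              | this thread) is to make invalid states unrepresentable, so
              | a wrong result cannot even be constructed:
              | 
              |     // Instead of a struct full of Options and bools that can
              |     // contradict each other, encode only the legal states.
              |     enum Connection {
              |         Disconnected,
              |         Connecting { attempt: u32 },
              |         Connected { session_id: u64 },
              |     }
              | 
              |     fn describe(c: &Connection) -> String {
              |         // The compiler forces every state to be handled; a
              |         // "connected but no session id" state cannot exist.
              |         match c {
              |             Connection::Disconnected =>
              |                 "offline".to_string(),
              |             Connection::Connecting { attempt } =>
              |                 format!("retry #{}", attempt),
              |             Connection::Connected { session_id } =>
              |                 format!("session {}", session_id),
              |         }
              |     }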
        
               | random_kris wrote:
                | Do you think there is a time and place where it is more
                | sensible to just go by trial and error?
               | 
               | For example when I am interacting with a codebase for the
                | first time and I want to implement something, I just keep
                | bashing and throwing shit at the wall until something
                | sticks. After that I start working more in line with what
                | you described.
        
               | lmilcin wrote:
               | How exactly you arrive at understanding the codebase is
               | not as important.
               | 
               | Just make sure you keep actual development separate from
               | learning the tool if you care for your results and
               | especially reliability.
               | 
               | Now, I use various ways to learn the tools and codebase.
               | Running PoC for my idea or maintaining separate toy
               | project helps me with maintaining hygiene.
               | 
               | For example, for the past year I have been spending a lot
               | of time learning reactive programming with RxJava and
               | Reactor. I have created a dozen small projects
               | illustrating various ideas for making reactive APIs,
               | processing pipelines, separating business logic from
               | infrastructure, composing reactive modules, etc.
               | 
                | I did this with the aim of purposeful learning rather than
               | writing production code, even though some of these in the
               | end migrated to be part of production codebase.
               | 
                | I am now at a level where I can, again, write large
                | swaths of modules and libraries, but now using the
                | reactive paradigm, with a very good chance of it working
                | correctly, which for me validates that I more or less
                | understand what is going on.
        
           | throwawayboise wrote:
           | First job was in a mainframe environment, compiles were
           | pretty quick but the job queue was so big and developer
           | compiles were low enough priority that it could take hours of
           | wall clock time to turn around.
           | 
           | I don't remember the specifics but while watching the job
           | queue on the terminal screen I discovered that the job
           | priority was editable. So I bumped it up, my compile job ran,
           | and I was happy. I only did this a few times before I got a
           | lecture from the system operations guys that was quite
           | explicit that I should never do this again.
           | 
           | Yes, you figure out how to run code mentally, how to check
           | carefully for typos and syntax errors, etc. No Intellisense
           | then either, that's another modern crutch.
        
         | yosefk wrote:
         | I doubt your compile times were due to pattern matching which
         | TFA demonstrates to be NP-complete, any more than C++'s
         | compile-times are due to the undecidability of template
         | metaprogramming.
         | 
         | (Though I guess one might argue that slow compile times on
         | typical code and rarely-used computationally hard features have
         | the same root cause of not treating fast compile times as a top
         | priority.)
        
           | tialaramex wrote:
           | Right, and C++ got a lot of its compiler performance for
           | large projects by making this an embarrassingly parallel
           | problem even though the consequences are negative for the
           | main consumers of the code (people, not compilers).
           | 
           | One cost of being embarrassingly parallel is the One
           | Definition Rule. If we can compile N different code units in
           | parallel, but they're all allowed to define things in the
           | same namespace, obviously those definitions might contradict
           | each other and we wouldn't notice. So, the C++ language
           | explicitly forbids this, knowing you'll probably do it
           | anyway, at least by mistake. If (when) you do, that isn't a
           | valid C++ program, but the compiler isn't expected to produce
           | any diagnostic (warning or error). So, you get a binary, but
           | the language doesn't care what that binary does. Maybe it
           | does exactly what you expected. Maybe it does _almost_
            | exactly what you expected. If not, too bad; the One
            | Definition Rule means your compiler was _fast_ and that's
            | what matters.
        
             | sakex wrote:
              | It usually fails at link time, doesn't it?
        
               | bluGill wrote:
               | No. I have intentionally violated the one definition rule
               | and nothing broke at all. In my case I wrote a second
               | timer, which had hooks my unit test system could use to
               | advance time without having to inject timers.
               | 
               | It will fail at link time if you link everything into the
               | same library. Even here there is an escape: there are
                | ways to mark something as a weak symbol and then the
                | linker won't complain about more than one definition.
               | 
               | See your OS documentation for how this works on your
               | implementation. (though don't be surprised if the
               | documentation is wrong...)
        
               | brandmeyer wrote:
               | > I wrote a second timer, which had hooks my unit test
               | system could use to advance time without having to inject
               | timers.
               | 
                | This use case isn't an ODR violation. It's just using the
                | linker to mock an interface.
               | 
               | > It will fail at link time if you link everything into
               | the same library.
               | 
               |  _That_ is an ODR violation, although there are
               | variations on this pattern that are not required to be
               | detectable. Template instantiation is an easy way to get
               | a silent ODR violation.
        
               | jcelerier wrote:
               | > No. I have intentionally violated the one definition
               | rule and nothing broke at all.
               | 
                | did you use LTO? It always catches ODR issues for me.
               | There is also GNU Gold's --detect-odr-violations switch.
        
               | bluGill wrote:
               | No. LTO is something I've been meaning to look at. Though
               | if I can't violate the ODR intentionally I might not be
               | interested.
        
               | jcelerier wrote:
                | If you violate ODR intentionally you are just writing
                | code that _will_ break in a future version of GCC /
                | Clang / ...
        
               | bluGill wrote:
               | That is a risk I know I'm taking.
               | 
               | Though I've been tempted to write a paper for C++ to make
               | it defined behavior. I know it works in some form on most
               | implementations even though it isn't legal. Thus there
                | seem to be some other use cases for it that could/should
               | be formalized. If anyone can give me other examples of
               | why you want to do this and what the rules are on each
               | platform I'll take a shot at it.
        
               | jcelerier wrote:
               | > I know it works in some form on most implementations
               | even though it isn't legal.
               | 
                | it definitely does not; every time, I had an ODR issue
                | that caused actual bugs. For instance, dynamic_cast not
               | working because a typeid was defined in two shared
               | objects, etc.
               | 
                | What would be the behaviour you expect if you have
                | 
                | a.cpp:
                | 
                |     int constant() { return 123; }
                | 
                | b.cpp:
                | 
                |     int constant() { return 456; }
                | 
                | c.cpp:
                | 
                |     int constant();
                |     int main() { return constant(); }
               | 
               | how could you define this meaningfully other than "hard
               | error" ?
               | 
                | e.g. here with gcc, if a and b are put into shared
                | libraries, the return value depends on _the order in
                | which the libraries are passed to g++ when linking_. e.g.
                | 
                |     g++ c.cpp a.so b.so
                | 
                | calls the version in a.so, while
                | 
                |     g++ c.cpp b.so a.so
                | 
                | calls the version in b.so
        
               | bluGill wrote:
               | You can do that because the order in which libraries are
               | passed to the linker is something you can control. Of
               | course linkers don't have to do this, and future versions
               | can do something different, but it works and the rules
               | for gcc are currently "the order in which the libraries
               | are passed to g++ when linking", which is defined. Of
               | course gcc has the right to change those rules (I suspect
               | the real rules are a bit more complex)
               | 
                | Gcc also has the concept of weak symbols which, if
                | invoked (and the linker supports it), would allow you to
                | make one of the two weaker than the other and then the
                | whole doesn't depend on link order. Visual C++ also seems
               | to have something like this, but I'm sure it is
               | different.
               | 
               | Like I said, I want to write a paper to make it defined -
               | but the paper will be a lot longer than would fit in a
               | response here, and depending on information that I
               | currently don't know.
        
         | IshKebab wrote:
         | Were you compiling Servo on a Raspberry Pi or something? 2
         | hours is ridiculous even for Rust.
         | 
         | > How do you get anything done when the cost of testing an idea
         | is several hours?
         | 
         | Incremental compilation. Depending on the project you can get
         | the edit compile cycle down to somewhere between 1 second and 5
         | minutes. It's nowhere near as fast as Go but it's not like
         | anyone is actually making edits then waiting 2 hours for them
         | to compile.
        
         | weavie wrote:
         | Incremental compilation. Once you get the first build done,
         | subsequent builds are much faster.
         | 
         | Also a project is likely to be split up over multiple crates,
         | so a lot of the changes you make will require building and
         | testing just that one crate.
        
           | runevault wrote:
           | Isn't incremental compile disabled currently? I thought they
           | found a significant bug and so turned it off until they could
           | validate a fix.
        
             | kibwen wrote:
             | The grandparent was building a binary utility, which means
             | they were (hopefully!) building in release mode, which
             | doesn't use incremental compilation by default in any
             | version of the compiler.
             | 
             | For debug builds, where incremental is usually on by
             | default, it was disabled for 1.52.1 due to bugs, and then
             | kept off in 1.53 out of an abundance of caution. It should
             | be back on in 1.54.
        
               | runevault wrote:
                | Okay, that's what I thought (the fact that it's not
                | officially turned back on yet was what I was trying to
                | hint at with the abundance of caution).
        
         | ekidd wrote:
         | > How do you get anything done when the cost of testing an idea
         | is several hours?!
         | 
         | My Rust compile times are usually between 10 and 30 seconds.
         | Some tricks that help:
         | 
         | - Get a good development machine. I prefer a Dell Precision
         | laptop from the last couple of years, with plenty of cores. A
         | 5-year old laptop with 2 cores will be a _lot_ slower.
         | 
         | - Use Visual Studio Code and the rust-analyzer plugin. This
         | will give you feedback in the editor while you're working.
         | 
         | - Rely on the type system to identify most problems before ever
         | generating code.
         | 
         | - Minimize the use of slow-to-compile libraries that rely on
         | complex generic types. Diesel, for example, is a great library
         | but it's slow to compile.
         | 
         | - Install cargo-watch, and use it to automatically re-run the
         | tests.
         | 
         | Also, remember that incrementally re-compiling a single crate
         | in debug mode will be much faster than building an optimized
         | application and all its dependencies from scratch.
         | 
          | TL;DR: Rust compilation times are less than ideal, but with a
         | fast laptop and rust-analyzer, it's possible to spend very
         | little time actually waiting for the compiler.
        
           | remix2000 wrote:
           | Just get a new machine for a couple zillion bucks, why
           | haven't we thought of it earlier? :^)
        
             | isaacimagine wrote:
             | As a developer who uses Rust a bunch (work & personal),
             | this is what I ended up having to do. I'm joking, but this
             | one simple trick can cut compile times _in half_.
        
             | edflsafoiewq wrote:
             | Yeah, can really see that "empowering everyone" motto
             | coming into play.
        
             | ekidd wrote:
             | Yes. If you are a professional developer who gets paid to
             | work on large Rust projects all day long, then a US$2500
             | high-end laptop will pay for itself many times over. (I'm
             | assuming US or European salaries here.)
             | 
             | I've gotten by with much less for personal projects. But
             | Rust will make very efficient use of (say) an 8-core i9 if
             | you have one.
        
         | nicetryguy wrote:
         | I'm pretty excited for the Linux kernel to take three weeks to
         | build.
        
         | stonemetal12 wrote:
         | If I had to guess it is because all of the dependencies had to
         | be compiled too.
         | 
         | In a normal Rust application you break it up into crates.
         | Crates are the unit of compilation, and are only recompiled
         | when something in the crate changes. In a "normal" developer
          | flow you only touch one or two crates at a time, so normal
         | developer compile time would be a couple of minutes at most.
         | Even for a new build on a machine most developers would never
         | see the two hour compile time because they would have gotten
         | precompiled dependencies.
        
           | Nullabillity wrote:
           | > In a "normal" developer flow you only touch 1 or two crates
           | at a time so normal developer compile time would be a couple
           | of minutes at most.
           | 
            | Obviously this depends on your computer, but incremental
            | Rust builds should typically be on the order of
           | seconds, not minutes.
           | 
           | > Even for a new build on a machine most developers would
           | never see the two hour compile time because they would have
           | gotten precompiled dependencies.
           | 
           | Cargo doesn't do precompiled dependencies (even between
           | projects on the same machine).
        
         | tester34 wrote:
         | jesus christ
         | 
         | rust's compilation times are as terrible as c++'s?
        
           | okamiueru wrote:
           | I don't know about Him, but at least when working with C++ I
           | feel the compile time is entirely up to deliberate tradeoffs.
           | With better planning and more effort made with forward
           | headers, better implementation boundaries w/pimpl like
           | constructs, and otherwise linking, and avoiding meta-
           | programming entirely, I've not found it to be an issue.
           | 
           | Incremental C++ builds can be lightning fast, or it can be
           | ridiculously slow. I've worked on large projects where change
           | + incremental compile + test was less than 5 seconds. And
              | I've worked with projects where no effort was made, where
              | the same could take 10 minutes.
        
             | gpderetta wrote:
             | Keeping your C++ compilation times reasonable requires
             | eternal vigilance (or a distributed system with aggressive
             | caching).
        
           | kzrdude wrote:
           | It's the same ballpark
        
             | spoiler wrote:
             | In my experience (C++ dev at work) C++ generally _feels_
             | slower than Rust.
             | 
              | Edit: I quickly realised this might only be because of
              | Rust's -- in my opinion -- superior type system and
              | tooling, so maybe it just requires less waiting for
              | compilations than when working with C++.
             | 
              | Edit 2: I've never actually measured Rust compilation
              | times, but even for the large-ish graphical codebases I've
              | compiled from scratch (wezterm, alacritty, bevy games), it
              | felt noticeably faster even then.
        
             | tester34 wrote:
             | I think we must do better
             | 
             | AAA games need less resources than those tree walkers
        
           | IshKebab wrote:
           | In my experience Rust is faster for full clean builds, even
           | if you use slow dependencies like Serde.
           | 
            | However, C++'s compilation units are smaller than Rust's, which
           | helps it to have faster incremental compilation times for
           | small changes.
           | 
           | So, same order of magnitude basically.
        
         | kibwen wrote:
         | I suspect the machine you were using was swapping itself to
         | death, because I've never experienced anything resembling that
         | in Rust.
         | 
         | I also presume you were compiling in release mode, which is
         | good for producing extremely fast programs but not something I
         | bother with in regular development (in contrast to Go, which
         | has no distinction between debug and release optimization
         | levels).
         | 
         |  _> How do you get anything done._
         | 
         | The vast majority of my workflow just involves seeing if my
         | code typechecks, for which I don't even need to build the
         | program (so no codegen, no linking). This I do very frequently,
         | as a sanity check on my work. The command for this is `cargo
         | check`. This takes less than a second for small projects, one
         | to five seconds for medium projects, and one to twenty seconds
         | for large projects.
        
           | adwn wrote:
           | > _I suspect the machine you were using was swapping itself
           | to death_
           | 
           | Good point. I recently upgraded my machine to 64 GiB RAM,
           | because 16 GiB filled up pretty quickly with parallel
           | compilation, several VS Code/rust-analyzer instances and a
           | couple of browser tabs open.
        
         | tinco wrote:
         | Go is geared towards productivity. If you have a problem you
         | can solve adequately in Go, why not just do it in Go?
         | 
          | Because they gained popularity at roughly the same time, many
          | people think they are competing languages, but they're really
          | not.
         | 
          | Ruby is my main programming language; if I need something to
          | be fast and/or lightweight I switch to Go, sacrificing some
          | comfort. And if I need the absolute lowest possible overhead,
          | I switch to Rust, sacrificing a whole lot more comfort. Rust
          | is more expressive than Go, but if your Go project is so large
          | that it needs a stronger type system, in my opinion you should
          | switch to a language like C# or Scala.
        
           | sesm wrote:
            | Go is geared towards the productivity of specific people in
            | a specific environment: fresh grads at Google.
        
             | tinco wrote:
              | The first time I used it, having not written a line of Go
              | prior, I rewrote our data ingestion gateway from Ruby to
              | Go. It took me one week, and it had full feature parity. I
              | don't think I had the GPA at university to land a fresh
              | grad job at Google.
             | 
             | Sure the tooling around it (besides the compiler itself)
             | was the dumbest 21st century tooling I've seen (this was ~6
             | years ago), but it was quick to learn, quick to write,
             | quick to compile and quick to run so who am I to complain.
        
               | sim_card_map wrote:
               | How was the tooling dumb?
        
               | tinco wrote:
                | It had this weird system where all your projects were
                | supposed to be located in one directory structure that
                | mapped to their git namespaces; it made no sense at all
                | and was a pain to work around.
        
         | spoiler wrote:
         | I'm really curious what you were compiling, and if you remember
         | the machine specs (and was the machine under other load at the
         | time). Two hours seems _beyond_ ridiculous.
         | 
         | Also, I'm assuming it was a full (as opposed to an incremental)
         | release build? But even then, I've never had to wait for that
         | long.
        
           | lmilcin wrote:
           | Two hours is not beyond ridiculous. A lot of projects in the
           | past compiled for way longer than that, days even.
           | 
            | In the 90s and early 2000s, if you worked on a large desktop
            | application (the size of Office or Photoshop) you could
            | expect to spend a business day waiting for compilation
            | results.
        
             | spoiler wrote:
              | Sorry, I should've clarified that I specifically meant
              | recent versions of Rust. I wasn't speaking in the general
              | sense. I've experienced slow compilation times in
             | C++ projects, although not of the same magnitude you
             | describe (ie having to wait days)!
        
             | gpderetta wrote:
             | Around the turn of the millennium I remember recompiling,
             | for some reason I don't remember, the whole of KDE. That
             | took a bit...
        
         | swiftcoder wrote:
         | What on earth did you compile? I can't recall ever having
         | compiled a rust project that took more than 10 minutes from
         | scratch.
         | 
         | I do have memories of the days when compiling Chromium or AOSP
         | was a 4 hour battle, though :)
        
           | [deleted]
        
           | jynelson wrote:
           | The rust compiler itself can take as long as 2 hours on a
            | modern laptop if you do a full stage 2 build. I don't think
            | that's what the top poster was talking about, though; it
            | sounded like they had a Rust toolchain already.
        
             | steveklabnik wrote:
              | (You know this, but for anyone else reading, don't forget
              | that doing this compiles LLVM, which is in and of itself a
              | massive C++ project, as well as compiling the compiler
             | itself multiple times. That's part of why it would be such
             | an outlier.)
        
               | swiftcoder wrote:
               | Right. Nested C/C++ dependencies are the only case I can
               | think of where drastically-long compile times are common.
        
               | jandrese wrote:
               | Or maybe someone included Boost?
               | 
               | Sometimes it is just that someone drank the Kool-aid and
               | made everything a template. Then split their code across
               | hundreds of tiny files and did the one big massive
               | include at the top of every file that pulls in
               | _everything_.
        
         | eptcyka wrote:
         | I've only had 2 hour build times when I was compiling a rust
         | toolchain on an emulated ARM system running on x86.
        
         | JD557 wrote:
         | > How do you get anything done when the cost of testing an idea
         | is several hours?!
         | 
         | I haven't used rust in production, but usually you can just use
         | `cargo check` to only run the typecheck. Using `cargo build` is
         | also usually much faster than `cargo build --release`.
         | 
         | Having said that, at least in my toy projects it was not
         | uncommon to have to compile using `--release` to run the tests
         | (since otherwise the tests would take forever).
        
           | littlestymaar wrote:
           | > Having said that, at least in my toy projects it was not
           | uncommon to have to compile using `--release` to run the
           | tests (since otherwise the tests would take forever).
           | 
           | Maybe you're already aware of that, but if the reason why
           | your tests are slow is not "your code not being optimized"
           | but "your dependencies not being optimized" then Cargo
            | profile overrides[1] can save your life. You can have one
            | specific dependency (or all of them) built in release
            | mode even when you're building your own code in debug mode.
           | The first development build is going to be quite slow, but
           | after that, you're going to have the best of both worlds.
           | 
            | [1]: https://doc.rust-lang.org/cargo/reference/profiles.html#over...
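            | 
            | A minimal sketch of what that override looks like in
            | Cargo.toml (build every dependency with optimizations while
            | your own workspace crates keep the fast debug settings):
            | 
            |     # Cargo.toml
            |     [profile.dev.package."*"]
            |     opt-level = 3
            | 
            | Replace the "*" wildcard with a crate name to override just
            | one dependency.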
        
       | lower wrote:
       | For languages with Hindley-Milner typing, like SML and OCaml, it
       | has long been known that type-checking is even worse in the worst
       | case (DEXPTIME hard) [1]. But the programs where this matters
       | aren't what one would write by hand, typically.
       | 
       | [1] https://dl.acm.org/doi/10.1145/96709.96748
        
         | contravariant wrote:
         | To some extent it wouldn't even be surprising for a
         | sufficiently powerful language to be EXPSPACE. Pretty much
         | anything with sufficiently powerful macros would be EXPSPACE,
         | or worse.
        
         | lower wrote:
         | Maybe it's a good idea to add a concrete example (can't edit
         | the reply anymore):
         | 
          | OCaml:
          | 
          |     let x1 = fun y -> (y, y) in
          |     let x2 = fun y -> x1 (x1 y) in
          |     let x3 = fun y -> x2 (x2 y) in
          |     let x4 = fun y -> x3 (x3 y) in
          |     let x5 = fun y -> x4 (x4 y) in
          |     let x6 = fun y -> x5 (x5 y) in
          |     x6 (fun z -> z)
          | 
          | Haskell:
          | 
          |     test = let x1 = \y -> (y, y)
          |                x2 = \y -> x1 (x1 y)
          |                x3 = \y -> x2 (x2 y)
          |                x4 = \y -> x3 (x3 y)
          |                x5 = \y -> x4 (x4 y)
          |                x6 = \y -> x5 (x5 y)
          |            in x6 (\z -> z)
         | 
         | If you can compile these, add x7. The effort increases
         | exponentially as one adds x7, x8 and so on.
        
           | NieDzejkob wrote:
           | The fact that x1 is \y -> (y, y) is superficial, though.
           | Something like x1 = Just also increases exponentially, and
           | makes clear that using a deduplicated DAG for representing
           | type terms doesn't solve this.
        
             | lower wrote:
             | With x1 = Just, the program is accepted instantly by ghc. I
             | think you need a type variable to appear twice.
        
               | NieDzejkob wrote:
               | Ah, it's still exponential, but in the size of the type,
               | and the constants of the complexity are a bit different,
               | so you need around x30.
        
               | lower wrote:
               | Ah, you're right. If one goes to x_{i+1}, then there will
               | be 2^i Maybes in the type. The number of Maybe-
               | occurrences in the type doubles in each step and sharing
               | won't help there.
        
           | Yajirobe wrote:
           | could you provide a python example?
        
             | Cu3PO42 wrote:
              | Python does not perform static type checking, so this
              | problem doesn't apply directly. You can build a program
              | that takes exponential time to execute, but that is
              | probably not a revelation to you.
        
               | Enginerrrd wrote:
               | Building a program that takes exponential time to execute
               | because of python run-time dynamic type checking might be
               | interesting...
        
               | shrimpx wrote:
               | I don't think that's possible because I don't think
               | there's any aspect of the runtime type checker that
               | descends into complex type structures. It just checks if
               | runtime type tags match when you invoke an operator, or
               | that an attribute exists when you try to access it.
        
           | daxfohl wrote:
           | Formerly id id id id id id id id id... would do it, but looks
           | like someone has put in a fix for that.
        
             | lower wrote:
             | Yes, types are usually represented by a DAG with sharing.
             | OCaml does so, for example, and I assume ghc does something
             | similar.
             | 
             | So, while id id has type ('a -> 'a) -> ('a -> 'a), this is
             | stored in memory by a pointer structure that amounts to let
             | 'b = ('a -> 'a) in ('b -> 'b). The type of id id id would
             | become let 'b = ('a -> 'a) in let 'c = ('b -> 'b) in ('c ->
             | 'c). This grows linearly. If one were to write out the
             | types without sharing then their size would grow
             | exponentially.
        
               | chowells wrote:
               | There was a bug in ghci (the GHC REPL) for a while which
               | made it take exponential time to print the type of `id id
               | id id id id id id id`. It used the linear representation
               | for type-checking, but the pretty-printer was not
               | implemented so cleverly.
               | 
               | Apparently that's been fixed. I can't confirm because
               | that's not something I ever check the type of.
        
               | kmill wrote:
               | It seems to me that id id id ... id always has type a ->
               | a. If you're referring to the type of the first id in the
               | sequence, then what you're saying makes sense.
        
         | Ericson2314 wrote:
          | Note that it is polynomial with respect to the size of the
          | _output_.
          | 
          | That is to say, pathological expressions don't stay small while
          | inducing horrendous searching, but rather merely get a lot
          | bigger (and that can cause more trouble for the rest of the
          | program).
          | 
          | That means a heuristic to bail out when something grows a lot
          | would probably not violate the user's intent and would get it
          | back into polynomial time.
        
           | chalst wrote:
           | No, this is the time complexity of type checking before any
           | computation on the terms is performed.
        
             | Ericson2314 wrote:
              | Yes, I am saying that while it is exponential with respect
              | to the input, it is polynomial with respect to the output.
        
       | shadowgovt wrote:
       | The points made by the Endsay here are key, and actually a good
       | example of trade-offs in software engineering. It is definitely
       | okay to incorporate an algorithm that is NP-hard into the core of
       | a performant system as long as you can have some expectation that
       | the input size on the NP-hard part will be small. Sometimes, it
        | even allows you to simplify another piece of the problem via
        | pre-computation, so that the piece of the system that accepts
        | large inputs doesn't require a costly solution algorithm.
        
       | estebank wrote:
       | The actual code that handles exhaustiveness checks has extensive
       | explanations on how it works and what it was inspired by.
       | 
       | https://github.com/rust-lang/rust/blob/master/compiler/rustc...
        
       | rocqua wrote:
        | > The standard representation of such a formula is the conjunctive
       | normal form, or simply: AND-of-ORs. Our previous example would
       | look like this when transformed:                   A and B and (A
       | or B) and (!A or !B)
       | 
        | There is another normal form, the disjunctive normal form. This
        | is an OR of ANDs. Interestingly, every formula can be
        | represented in this form. And SAT is rather easy to determine
        | for this form.
       | 
       | The catch is that taking a logical formula, and placing it in
       | Disjunctive normal form is actually a rather arduous, non
       | polynomial process.
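        | 
        | As a concrete illustration, distributing the article's example
        | 
        |     A and B and (A or B) and (!A or !B)
        | 
        | into DNF gives four AND-terms (pick one literal from each OR),
        | and every one of them contains a variable together with its
        | negation, so the whole DNF simplifies to "false", which makes
        | the unsatisfiability obvious. The blowup shows up on formulas
        | like
        | 
        |     (A1 or B1) and (A2 or B2) and ... and (An or Bn)
        | 
        | whose DNF has 2^n terms, none of which simplify away.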
        
         | maweki wrote:
         | > And SAT is rather easy to determine for this form.
         | 
          | Gave me a chuckle; that's quite an understatement, since the
          | decision procedure is only: is there a single clause, and is
          | it just "False"?
         | 
         | > The catch is that taking a logical formula, and placing it in
         | Disjunctive normal form is actually a rather arduous, non
         | polynomial process.
         | 
         | Basically writing out all assignments that make the formula
         | true. Potentially exponentially many, as there are 2^n possible
         | assignments of n variables. So this transformation is basically
         | just solving and writing down all models.
        
           | mmarx wrote:
            | The conjunctive normal form can be exponentially large in the
            | worst case as well, though; see the footnote in the article.
           | In practice that is not a problem, since it suffices to find
           | a formula in CNF that is satisfiable precisely when the
           | original formula is satisfiable, even though it might not be
           | equivalent.
        
       ___________________________________________________________________
       (page generated 2021-07-08 23:01 UTC)