[HN Gopher] Compiling Rust is NP-hard ___________________________________________________________________ Compiling Rust is NP-hard Author : lovasoa Score : 367 points Date : 2021-07-08 09:10 UTC (13 hours ago) (HTM) web link (niedzejkob.p4.team) (TXT) w3m dump (niedzejkob.p4.team) | FartyMcFarter wrote: | > What's the most efficient algorithm you can write to do this? | As it turns out, you can't do well in the worst case -- and we'll | prove this by a reduction from SAT. | | Correction: you can't do well in the worst case, assuming P ≠ | NP (where "well" means "in polynomial runtime"). | gizmo wrote: | Pedantic and wrong. Considering there is no known algorithm | that can sat solve in P, nobody in the world can do well in the | worst case. And if nobody can do it, then you can't do it | either, for any you. QED. | FartyMcFarter wrote: | I'm not a native English speaker, but I thought "you can't do | well" is equivalent to "it's impossible to do well", which is | how I read the original sentence. | | As an aside, I find it a bit ironic that you're complaining | about pedantry while interpreting the "you" word literally. | gizmo wrote: | It's not ironic that my response was pedantic, it was the | entire point. When you engage in pedantic nerd sniping you | just open yourself up for even more pedantic nerd sniping. | | "Can't" can mean impossible, or it can mean nobody can do | it. Compare "you can't go back in time" with "you can't | lift a horse". | gugagore wrote: | It's awkward to abbreviate "in polynomial time" as "in P". P | is a set of (decision) problems. We don't know that 3-SAT is | in P. We don't know that it's not. | | To say that a solver is not in P is a type error. | OgAstorga wrote: | Your "proof" is more like: no one has ever stepped on Mars, | therefore no one will. | | Yet another pedantic fallacy. | __s wrote: | Not that no one will, just that no one can | | I give you until the end of today to show me someone | (anyone!) step on Mars | | It was just someone being cheeky about someone being | cheeky. Now I'm being cheeky about you being cheeky over | someone being cheeky about someone being cheeky | sischoel wrote: | While this is an interesting result, isn't basically every | compiler NP-hard, because of register allocation? | srl wrote: | Compilers don't guarantee (or try to find) the absolute optimal | allocation, right? | WJW wrote: | This depends on the compiler of course. Most will not because | it is not worth it, but I expect that there is at least one | compiler out there with the option to exhaustively search the | allocation space for that very last nanosecond of runtime | optimization. Probably inside an HFT firm somewhere. | chrisseaton wrote: | No - you can express register allocation as an NP-hard problem | if you want to, but you don't have to and not all compilers do | it like that. You don't need an optimal register allocation - | you get into diminishing returns very quickly. | sischoel wrote: | Yes, I hadn't thought about that before, but of course the | difference here is that the Rust compiler cannot just produce | a "good enough" result for pattern matching. | kadoban wrote: | Well, it probably _could_ do "good enough". If a set of | matches gets too large/hairy it could just output a "here | be dragons" warning and go ahead with compiling it, no? | | That may not count as a correct Rust compiler though, I'm | not 100% sure how that's defined, or if it is. | chrisseaton wrote: | Register allocation is a great example of over- | theorisation. 
People found that register allocation can be | expressed as graph colouring, so they doubled down and | worked really hard to solve that pure problem, but then it | turned out you can just do a very simple linear allocation | heuristic and the generated code is basically as fast as | solving the NP-hard problem. | | https://dl.acm.org/doi/abs/10.1145/330249.330250 | imtringued wrote: | Well, the problem for the NP-hard algorithm is that the | baseline is already very efficient. It's like branch | prediction. Even the dumbest predictors can be 90% | efficient. The perfect solution doesn't have much room to | be better. I'd say there are more microsecond level | optimization problems left than on the nanosecond or | millisecond level. | adrianN wrote: | If you do an SSA transform of your program the register graphs | are chordal. It turns out that coloring chordal graphs is easy: | http://web.cs.ucla.edu/~palsberg/course/cs232/papers/HackGoo... | antientropic wrote: | Or worse than NP-hard: C++ template instantiation and the C | preprocessor are Turing complete. (Except that I believe C++ | requires some fixed limit on the maximum instantiation depth, | but even so it's still exponential complexity.) | rocqua wrote: | I believe you can encode paradoxes in C++ templating. | Something about two specializations which are mutually | impossible because they reference each other. So you can say:
|             A is of type T  <=>  B is not of type T
|             B is of type T  <=>  A is not of type T
| | Where <=> is if and only if. | | In which case it is paradoxical to say that A is of type T. | But also paradoxical to say that A is not of type T. | MauranKilom wrote: | Note that "a template for which no valid instantiation is | possible" is actually ill-formed, no diagnostic required. | | That said, it's not clear to me what part of "you can write | uninstantiable templates" is surprising, so I'm probably | not understanding correctly. Maybe you could find the C++ | code you're talking about? I would be interested. | Banana699 wrote: | >the C preprocessor are Turing complete | | That's surprising, the C preprocessor is pretty crippled. | Its creator intentionally made it so people won't abuse it | to create full blown mini languages. | | Did you get things mixed up or is there actually a way the C | preprocessor can be Turing-complete? | namibj wrote: | CPP needs to feed into itself (pipe stdout into stdin), if | you want it to be Turing-complete. | gpderetta wrote: | I believe that while you can't encode an infinite tape, | you can very easily define an arbitrarily large tape, so, | while not technically Turing complete, it is close | enough. | mtlmtlmtlmtl wrote: | Not sure whether this proves Turing completeness of the | preprocessor or not (probably not), but Simon Tatham has a | great (and somewhat disturbing) write-up[1] on creating | arbitrary control structures using macros in C. He even | goes as far as providing a library of helper macros for | doing things like making gensyms, etc. | | If nothing else, it does demonstrate that you can do a lot | more in the preprocessor than you might think, and this is | at least a significant subset of what true MP macros are | used for in Lisp most of the time. Except much gnarlier... | | [1] https://www.chiark.greenend.org.uk/~sgtatham/mp/ | srl wrote: | TFA notes that rustc is not a particularly _efficient_ SAT solver, | taking something like 10^4 times as long as a commonly used | algorithm [1] would on a small example. 
| | So, I'm curious: what's the _strongest_ case one could make for | including a more robust SAT solver as part of the exhaustiveness | checking? Is there any reasonable place where this could matter? | (Extremely autogenerated code?) | | [1] https://en.wikipedia.org/wiki/DPLL_algorithm | trissylegs wrote: | Rust's proc-macro system makes it pretty easy to create some | hard to compile code. _but_ there 's many ways to generate very | slow compiling code with proc macros. | Quekid5 wrote: | One perhaps-interesting use case might be to use the fastest | possible solver in the case of compiling generated/foreign | code, so basically code that's expected to always compile (but | which still needs to be checked for safety). | | One could even imagine using it as a first-pass for code and to | just fall back to a slower-but-with-better-error-messages type | checker if errors are found. | | (Of course you'd end up with two orthogonal implementations of | the type checker, but that might not even be a bad thing since | you'd be easily able to check for inconsistencies between them, | aka. type checker bugs.) | ygra wrote: | I was under the impression that cases where it would matter are | typically code that wouldn't usually be written. E.g. C# and | Java also can solve SAT as part of overload resolution with | generic types and it's an interesting peculiarity, but not a | pattern that is interesting enough to optimize. Perhaps that's | the case here as well. | | I could also imagine the compiler code being easy to read, | understand and modify are often more valuable traits than | compilation speed of highly atypical patterns in a language. | But that may well depend on _how_ atypical those are. | johncolanduoni wrote: | The fact that most dependently typed languages (which have a | much higher propensity for complicated exhaustiveness checking | when pattern matching) don't use a serious SAT solver for this | makes me doubt Rust would ever need to. | sanity31415 wrote: | Doesn't Liquid Haskell[1] use a SAT solver? | | [1] https://ucsd-progsys.github.io/liquidhaskell-blog/ | johncolanduoni wrote: | Like most languages with refinement types it uses an SMT | solver (a more general problem), though I'm not sure it | uses it for pattern matching. I was thinking of Coq, Agda, | Lean, etc. which are full-spectrum dependently typed | languages. LiquidHaskell is more specialized, though its | automated proof capabilities are much stronger. | gpm wrote: | If you were using rustc as some sort of JIT compiler compile | times could become extremely important, and error messages less | important, at which point things like this might be worth it. | | If you wanted the SAT solver for some other reason | (optimization? proving code correct?) it might make sense to | use it for this too. | multifasciatus wrote: | > Obviously, every formula can be transformed into this form, by | applying the rules of Boolean algebra. | | Is it so obvious? Why do so many engineers write with these sort | of modifiers (obviously, just, of course, etc.)? | | Edit: I am assuming no ill intent from the author. I also use | these sorts of modifiers without thinking about them much when | writing. I am just curious as to why we as engineers call deeply | technical things obvious when they are not. | rgoulter wrote: | I liked the model in blogpost: | https://status451.com/2016/01/06/splain-it-to-me/ | | It discusses that there can be different ways sharing | information comes across. 
For some people, sharing information | demonstration that you're smart enough to know the thing. For | others, sharing information is for others' benefit. | | I wouldn't prefer to interpret "obviously" as "you're not as | smart/cool as me if you don't know this", but I can see why | there are examples that could come across like that. (I'd | sooner interpret it as "I'm not claiming to be smart by saying | this"). | bombela wrote: | I don't think it's any malice. Just a way to bring up the idea | of a sudden realization. Obviously, I could be wrong. | bruce343434 wrote: | Roughly related: why not simply[1] | | [1] https://news.ycombinator.com/item?id=27415725 | NieDzejkob wrote: | That sounds like something that could be solved with a linter | for prose. Though, I can't think of the search terms that would | let me find such a solution, if it exists. | guga42k wrote: | Usually this is a polite way to say that proof/derivation is | too long to fit on this page, please look somewhere else if you | are interested. | | Notable example would be Lev Landau's Theoretical Physics. If | you see "obviously" there it is probably not obvious at all and | can take good few hours to derive the formula. | joshribakoff wrote: | The good faith interpretation is that the author means in the | sense of "it follows then that" as opposed to "if you didn't | realize this upon reading the previous section then something | is wrong" | EdSchouten wrote: | It's only NP-hard if you want the compiler to give you proper | warnings/errors, right? As far as I know, the first matching arm | is used. This means that if a compiler doesn't do any analysis on | it, but naively generates machine code to compare the value, then | there is no complexity at all. | | (Not saying that that's acceptable...) | tialaramex wrote: | Here's what Rust's compiler actually does today: | | https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_bu... | | Rust lets you say either "There are four specific possible | Clowns, and I want everybody to explicitly deal with that" or | you can write #[non_exhaustive] to say "There are several | Clowns, right now I defined four of them, but maybe I'll add | more some day" in which case _you_ are allowed to only write | code for those four Clowns, but everybody _else_ is obliged by | the compiler to handle the case with more Clowns than that | since they can 't depend on there only being four, and they | have no idea how to name the additional ones that don't yet | exist. | | So, a compiler that just punts is not a correct Rust compiler. | It needs to do this analysis or it's compiling a different less | useful language. | | In the real world the (default) exhaustive case is great for | types that reflect some truth about the world being modelled, | where you really do mean that all the code relying on this | enumerated type should refuse to compile and needs re- | engineering if your list is subsequently discovered not to | really be exhaustive. Teapot not withstanding, there are | definitely only five PlatonicSolids, such as Tetrahedron and | Cube. The non-exhaustive case is great for types that reflect a | truth you know or strongly suspect will change and everybody | using your system needs to be prepared for that up front. In | 2016 the enumerated list of US FederalHolidays didn't need | Juneteenth, but in 2021 it does, we don't want random code to | blow up because it had never imagined Congress would make new | ones. | EdSchouten wrote: | Sure. 
But a naive compiler whose only purpose is to stomp out | a poorly optimized executable can safely ignore that | annotation, right? | richardwhiuk wrote: | No, because checking that it is exhaustive is required by | the language definition. | | It's not just not optimized, it's wrong. | tialaramex wrote: | You could instead require everything to be treated as non- | exhaustive in your cheap compiler. This means it would | reject correct Rust code that explicitly handles each of | the five PlatonicSolids differently, because that code | lacks a way to handle a sixth (non-existent) PlatonicSolid. | | You can't just go "Eh, shrug emoji" without deciding what | happens when you don't match. | | Suppose you silently treat the first (or last) handled case | as the default, so I can write code that only handles Cubes | and unlike the real Rust compiler your naive compiler spits | out a program. But wait, now what happens for a | Tetrahedron? I defined Cubes to have a 32-bit floating | point size, but for whatever reason Tetrahedrons have a | 16-bit integer length instead. Does your compiler just try | to treat the 16-bit integer as a 32-bit floating point | number since it assumes this was a Cube? | kevincox wrote: | Yes, that annotation doesn't affect the codegen of a | correct program (optimizations aside). | NieDzejkob wrote: | It still needs to ensure that one of the arms will always | match. (just not executing any branch in that case is tricky, | because the arms can return values) | remram wrote: | It does not need to check, if it _assumes_ it. mrustc works | that way, it is not trying to check code, only compile valid | code: https://github.com/thepowersgang/mrustc | | This is similar to C/C++'s " undefined behaviors". There are | things you can write that aren't valid for the language, but | that compilers are not required to catch. | kadoban wrote: | I had not thought of pattern-exhaustiveness checking like that. | | Isn't compiling several languages, including Rust, actually | undecidable though? Seems like if your type system or | metaprogramming systems are Turing complete, this has to be the | case. | | So that's ~worse than NP-hard already (though calling it NP-hard | is still technically correct). | trissylegs wrote: | I think in Haskell you need to add: {-# | LANGUAGE UndecidableInstances #-} | | to get an Undecidable type system. | arianvanp wrote: | There are other language extensions to make the haskell type | system undecidable. The classic example is `RankNTypes` | | E.g. you can proof that type systems of rank n >= 3 are | undecidable. [0] | | I think that's why haskell has both Rank2Types and RankNTypes | as a language extension. So you can still use higher rank | types without running into decidability problems up to rank | 2. | | [0] https://dl.acm.org/doi/10.1145/182409.182456 | arianvanp wrote: | Type checking /compiling often is conservative for this | purpose. It's sound (Rejects all incorrect programs), | decideable (There is an algorithm to reject all incorrect | programs) but due to Godel's decidability problem they're | incomplete (There are always correct programs which the type | checker rejects as incorrect). | | There are exceptions of course (C++ templating is turing | complete; so type-checking is undecidable) but often you want | your compiler to be sound and decidable | | Also see https://3fx.ch/typing-is-hard.html | | (edit: Apparently , as the link I posted shows, rust is indeed | undecidable in certain edge cases. 
However you can usually | reason about a subset of the language (restrict using certain | features) that is still decidable. I have the feeling Rust | might have such a subset but it's just a fuzzy feeling) | thaumasiotes wrote: | > It's sound (Rejects all incorrect programs), decidable | (There is an algorithm to reject all incorrect programs) but | due to Godel's decidability problem they're incomplete (There | are always correct programs which the type checker rejects as | incorrect). | | Isn't that the definition of "semidecidable" as distinct from | "decidable"? A decidable question is one where you can | determine the answer, whatever that answer may be. A | semidecidable question is one where, if the answer is yes, | you can determine that, and if the answer is no, you may not | be able to determine that. | | If you can always show that a program is incorrect, but you | can't necessarily show that a program is correct, then the | correctness of a program isn't decidable. | arianvanp wrote: | Decidability here means "Can we come to an answer in a | known amount of steps". If you have an algorithm for type | checking, then that answer is yes! You're _always_ able to | come up with an answer that is yes or no. | | For example, the following typechecking algorithm is | decidable:
|             typechecks :: Program -> Bool
|             typechecks _ = False
| | We always come to a decision. There is no "semi" here. Just a | clear yes/no answer. | | The algorithm is even sound. It rejects all incorrect | programs. | | However it is not complete. It might also reject correct | programs (in this case it rejects all correct programs). | | Of course you want a typechecker that is as little | conservative as possible without getting into trouble. But | it's always conservative to some extent. Preferably you would | be both sound _and_ complete. But the problem is that any | reasonable logic system can't prove its own consistency. Hence | we can't have both. However we can get "as close as we | possibly can". | kmill wrote: | I believe you're explaining what a computable function | is. A decidable set S of a type T is defined to be one | for which there is a computable function f : T -> Bool | such that the set of values x with f x = true is equal to | S. A semidecidable set S of a type T is defined to be one | for which there is a partial computable function f : T -> | Bool such that whenever x is in S, then f x is computable | and equals true. | | Given a definition for a well-typed program, you get a | subset W of Program, the well-typed programs. You can ask | whether W is decidable or semidecidable -- i.e., whether | there is some (partial) computable function f : Program | -> Bool with the above properties. | | The example you give is certainly computable, but it has | nothing to do with the set W, so it says nothing about | (semi)decidability of the typechecking decision problem. | | However, it is valid to try to find the largest subset | of W that is (semi)decidable. Or to try to redefine what | it means to be well-typed so that it is (semi)decidable! | | One interesting case of a programming language without a | semidecidable typechecking problem is Lean. The | typechecker they've implemented can both reject and time | out on valid programs (so the typechecker is at least a | computable function in a sense), but in practice these | cases don't really occur. | thaumasiotes wrote: | > We always come to a decision. There is no "semi" here. | | That's not really how I understand the "semi" in the | terminology. 
For a decidable question, you can recognize | when the answer is yes, and you can recognize when the | answer is no. For a semidecidable question, you can | recognize when the answer is yes. | | It doesn't take much of a stretch to describe that as | cutting your capabilities in "half". | | The way I usually think about this is that with full | decidability, you can rule something in or out. With | semidecidability, you can rule something in, but you | can't rule things out. | | That framework extends well to the situation described | above, where if the compiler chooses to compile a | program, then the typing in that program was valid, and | if the compiler chooses to complain about the typing, we | can't draw any conclusions. It doesn't match your | example; you can't rule anything in or out. | cedilla wrote: | Semi-decidable means that will accept all correct formulas, | and either rejects incorrect formulas or gives no answer. | It's the same as having a generator that generates every | correct formula eventually. | | In my opinion, OP is incorrect and those systems are sound | but indecidable. | thaumasiotes wrote: | > Semi-decidable means that will accept all correct | formulas, and either rejects incorrect formulas or gives | no answer. | | By this definition, program incorrectness isn't | semidecidable either - the compiler will accept all | correct [incorrect] formulas, and will either reject or | accept incorrect [correct] formulas. | ThreeFx wrote: | Not really, because for undecidable problems | semidecidability can only hold one-way. If it held in | both ways then the language would be decidable. | | Take the Halting program for example: For a program p | which halts, you can determine whether it will halt in | finite time (because by definition it halts). However, | there is no semidecidable algorithm for "this program | runs infinitely long". | thaumasiotes wrote: | I don't understand what you're trying to say. What are | you contradicting with "Not really"? The claim above you | is "by this definition, program incorrectness isn't | semidecidable either". You're saying that it is? | LadyCailin wrote: | Sort of a side track, but there's a great layman's intro to | this in this Veratasium video | https://m.youtube.com/watch?v=HeQX2HjkcNo | jcelerier wrote: | > There are exceptions of course | | for a definition of "exceptions" being "virtually all | languages with a non-trivial static type system and non-zero | userbase" as shown in your link. | | Java (https://arxiv.org/abs/1605.05274), C# | (https://blog.hediet.de/post/how-to-stress-the-csharp- | compile...), Scala | (https://michid.wordpress.com/2010/01/29/scala-type-level- | enc...), Haskell, Rust | (https://sdleffler.github.io/RustTypeSystemTuringComplete/), | Typescript | (https://github.com/Microsoft/TypeScript/issues/14833) ... | | Having a type system with generics and not being turing- | complete as a result _is_ the exception, not the rule. | octachron wrote: | Generics is not really the right cutting point: parametric | polymorphism is perfectly decidable. However, few languages | stop at just parametric polymorphism. For instance, the | OCaml issue in the linked blog post happens at the module | level, which mixes parametric polymorphism and subtyping. | And typically, while SML has parametric polymorphism, it | lacks the OCaml extension that makes the module type system | undecidable. | gnulinux wrote: | For an example of contrary, one can check Agda. 
It is not | Turing complete, but a useful language (I personally wrote | many parsers in it and its Haskell FFI is pretty good so | you can essentially write unsafe parts of your programs in | Haskell, and safe parts in Agda then prove correctness). | arianvanp wrote: | Haskell's System-F is definitely completely is decidable. | You need to opt in to undecidable features like RankNTypes | (n>=3) explicitly. (which are GHC; and not Haskell | specific). | | Even then; Undecidability is a bit spectrumy. e.g. | RankNTypes become sort of decideable again in the presence | of explicit type annotations | Zababa wrote: | Rust's type system is undecidable. You can see a few languages | listed here: https://3fx.ch/typing-is-hard.html | silon42 wrote: | I believe there are limits to recursion (as mentioned in link | above), so not strictly true. Perhaps they are too big? | kibwen wrote: | Rust's recursion limit is fairly low, I believe it's 128. | However, you can override this if you find it too | restrictive. | simiones wrote: | This ends up in some discussion about what "compiling" actually | means. Is macroexpanding a part of "compiling"? | | Most languages with complex type systems avoid Turing | completeness by having an explicit limit for all type level | expansions - at least, that is how Java and C# do it. | kadoban wrote: | I think metaprogram expansion has to be considered part of | compiling. You have sourcecode, you need machinecode. If the | lang says that the process of getting there requires running | some metaprogram, so be it. | | For Rust I think the type system itself is probably enough | though even without that. If neither type checking or | metaprogramming are part of compiling, I think your (for | hypothetical you) definition of compiling is a bit too | restrictive. | aj3 wrote: | Even makefiles are Turing complete. | BBC-vs-neolibs wrote: | Thanks, I couldn't quite put it in words, but that is exactly | my reaction. | jacoblambda wrote: | Compiling a complete language and verifying correctness is | undecidable but that's not necessarily what Rust does. | | The complier can still fail to compile/verify otherwise correct | parts of a program. Because it's only operating on a subset of | the entire possible language and it's not verifying all | properties of the language, it's not quite undecidable but it | is still very much NP-hard. | | Languages like C, C++, or Ada on the other hand take the other | approach. They compile all parts of the language but make | little to no attempt to enforce correctness on all those parts. | You see verifying compilers for those languages only allowing a | subset that they can actually verify which is the same as Rust. | | At the moment Rust the language is what the main Rust compiler | says it is but once there are meaningfully different compilers | you'll start to notice the issue a bit more and there will | likely be parts of the language that one compiler can verify | but the other can't (and therefore fail to compile). | kibwen wrote: | Any hypothetical standards-conformant Rust implementation | would have to be as or more restrictive than rustc, not less. | Something like mrustc, which performs no borrow checking, | would not be allowed to call itself a conformant Rust | implementation. | | (I say hypothetical because until there's a spec, it's simply | not feasible to build a conformant alternative | implementation, as you'd need to be bug-for-bug compatible | with rustc.) | ncmncm wrote: | This is also not correct. 
As in, completely wrong. | | Presuming a formal spec that rustc happens to enforce, | another compiler could simply try longer to resolve types | before giving up, admitting some set of programs that rustc | gives up on. | tialaramex wrote: | The spec. could specify exactly the rules rustc actually | has including iteration limits, this would freeze rustc | (and so it would be undesirable) but it is an option. | | The thing we're talking about here has changed after Rust | 1.0, Rust 1.0 shipped with a rule that said if you match | integers you have to provide a default. In lots of code | that feels natural. But not everywhere. If you match all | possible integers (e.g. for a i8 that's from -128 to | 127), the compiler would say "Not good enough" until you | write the default match it will never actually need. | | That was fixed with feature(exhaustive_integer_patterns) | in 2018 or so AIUI. | | But you could imagine Rust being standardised with the | old behaviour and, even though it's clearly _possible_ to | write a compiler which implements | feature(exhaustive_integer_patterns) that would then be | non-standard because Rust programs lacking the default | match for integers are forbidden in the standard. | ncmncm wrote: | The type system is also Turing-complete, therefore non- | terminating. It would be arbitrarily hard to write a spec | in a non-Turing-complete language such that all | implementations would admit the same set of programs, or | even (responsive to the original claim) reliably only a | smaller set. | ncmncm wrote: | > _[Rust is] not quite undecidable but it is still very much | NP-hard._ | | > _Languages like C, C++, or Ada ... make little to no | attempt to enforce correctness_ | | Both statements are laughably false, but for different | reasons. | | Both are based on the Doctrine of Rust Exceptionalism, which | requires that Rust is not just a variation on a theme, but | fundamentally different from other languages. Like most | doctrines, this one is indefensible. | | Rust compilation, like C++ and Haskell, is undecideable, not | just NP-hard. Non-terminating compilation is artificially | curtailed, thus failing to produce an open set of what would | have been correct programs. | | The Rust compiler performs certain correctness checks that | other languages do not. It rejects an infinite set of correct | programs that its correctness checks cannot resolve. In this | as in all other particulars, Rust is very much on the | continuum with other languages. All strongly-typed languages | perform a huge variety of correctness checks, some built-in, | others programmed, with great success, rejecting the large | majority of incorrect programs people write. As a | consequence, it is very common for Rust, C++, and Haskell | programs to run correctly the first time, once the compiler | is satisified. | volta83 wrote: | > Isn't compiling several languages, including Rust, actually | undecidable though? | | While Rust is Turing complete, the compiler has a recursion | limit, and if you take that into account, then compiling Rust | is decidable (you only need to evaluate compilation up to the | recursion limit). | ghosty141 wrote: | Rusts type system is turing complete if I recall correctly. | Obviously, in practice everything is in a way decidable | because we have limitations everywhere. 
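To make the exhaustiveness checking discussed further up concrete, here is a rough sketch (not code taken from the article) of the kind of reduction it describes: one match arm per clause of a Boolean formula, with each pattern covering exactly the assignments that falsify that clause. The match is then exhaustive precisely when the formula is unsatisfiable, so the missing-pattern witness the compiler reports doubles as a satisfying assignment.

    // Sketch for the 2-variable formula (x OR NOT y) AND (NOT x OR y),
    // reading the tuple as (x, y).
    fn check(assignment: (bool, bool)) {
        match assignment {
            (false, true) => {} // the only assignment falsifying (x OR NOT y)
            (true, false) => {} // the only assignment falsifying (NOT x OR y)
        }
    }

On stable Rust this fails with E0004 ("non-exhaustive patterns"), and the error lists uncovered patterns such as (true, true) -- each one a satisfying assignment of the formula. Deciding whether such a match needs a catch-all arm is therefore at least as hard as SAT, which is the article's point.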
| derefr wrote: | In other words, the _abstract machine specification_ you're | programming your compile-time logic against in Rust, _has_ | Turing-complete semantics; but its extant _implementations_ | -- any Rust compiler people would care to use -- | intentionally does not implement compile-time Turing- | completeness. | | But you could in theory write a Rust compiler whose | compile-time runtime _is_ fully Turing-complete; and that | compile-time runtime would be conformant to the Rust | language spec. (Just, nobody would want to use it, because | "unbounded runtime" isn't a property people tend to want | from their compilers.) | dllthomas wrote: | > But you could in theory write a Rust compiler whose | compile-time runtime is fully Turing-complete | | Only if your theory allows for infinite storage... | einpoklum wrote: | Well, my infinite tape seemed to work just fine last time | I checked. | | Of course, I didn't check ad infinitum. <rim shot> | lovasoa wrote: | There is a low hardcoded limit to how deep you can nest | types. But I don't think there is a hardcoded limitation to | how large match patterns can be. That's the difference. | smitop wrote: | It's possible to change that hardcoded limit. By default | it is 256, but you can bring it all the way up to 2^64 | with #![recursion_limit = "18446744073709551615"]. | pjmlp wrote: | Same applies to C++ templates by the way, the compilers | usually provide an expansion depth flag. | comonoid wrote: | "I'm not just hard, I'm NP-hard, baby". | speed_spread wrote: | Wanna solve my constraints? | not2b wrote: | So just use a SAT solver to check for checking whether pattern | matching is exhaustive. Any decent one will have much better | performance than the current approach in the Rust compiler for | complex cases. The interesting thing about SAT is that although | the worst case is exponential (as far as we know until someone | proves P = NP), even huge real-life problems (from circuits, | program analysis etc) tend to have structure that makes them | quickly solvable with a SAT solver. And if not, have a time | budget and report "too complex to solve" if exceeded. | thaunatos wrote: | It's worth noting that the author of the arcicle doesn't think | it's worth integrating a SAT solver: | | > Does this mean that rustc should integrate an industrial- | strength SAT solver? As hilarious as that would be, I'm | advocating no such thing. This will only be a performance issue | on pathological examples crafted by bored nerds, and I don't | think precious engineering time should be spent on that. | Besides, generalizing a SAT algorithm to handle the full | expressive power of Rust's patterns might be, to borrow some | language from mathematicians, non-trivial. | Ericson2314 wrote: | > Does this mean that rustc should integrate an industrial- | strength SAT solver? | | Well, this is actually what https://github.com/rust- | lang/rfcs/blob/master/text/1868-port... would require. There is | already chalk, a logic programming language for traits too. | | I don't balk at these things. A good logical / relational toolbox | is very useful for a lot of tasks. | cletus wrote: | Compiling really is the Achilles heel of Rust I feel. This | article really talks about the worst case. As others have noted, | Haskell and other languages have this property too. | | Here's another perspective on practical issues [1] that's worth | reading. | | Basically, Rust made some understandable but unfortunate design | decisions that are hard to walk back or change now. 
| | [1]: https://pingcap.com/blog/rust-compilation-model-calamity | w-m wrote: | Presumably there's another SAT solver in the realm of the Rust | tool chain: cargo surely has one for dependency resolution. This | is a big time sink in other package managers (looking at you, | Anaconda). Does someone have an insight how solving is | implemented in Cargo? | littlestymaar wrote: | There is one[1] but Rust allows different major versions of the | same package to exist in parallel, so dependency resolution is | usually trivial unless someone in the dependency tree | explicitly adds an upper limit of the acceptable version [2], | which is pretty uncommon (I don't think I've ever met one in | practice). | | [1]: https://github.com/rust- | lang/cargo/blob/master/src/cargo/cor... | | [2]: https://doc.rust-lang.org/cargo/reference/specifying- | depende... | donatj wrote: | I had rare occasion to compile a Rust application recently | because a prebuilt binary didn't exist for my platform. It took | just over two hours! Coming from primarily Go development | recently, I was astonished. | | How do you get anything done when the cost of testing an idea is | several hours?! I'd feel like the sunk cost fallacy would kick in | for even the most lackluster of changes just because you took the | time to try it. | spockz wrote: | Compiling linkerd-proxy took me also north of an hour on an | Intel 6850k. Most of the time was spent compiling dependencies | though. I also compiled envoy proxy which also took more than | an hour. | | Subsequent rust builds for linkerd-proxy were quite fast. For | envoy most time was spent in Bazel itself. Probably because I | just did a Bazel build instead of specifying a more specific | target. | mkj wrote: | Out of curiousity I tried linkerd2-proxy here on a i5-1145G7 | (I guess similar total CPU? Less cores, laptop model, newer). | Took 5 minutes but used ~6GB resident. Maybe yours was | swapping a lot? | glennpratt wrote: | I worked on a Rust project at work for several months. Compile | times were not a significant issues for developers where it was | generally cached. For example, repeatedly running a test only | compiled a small amount of code. | | It was a bit slow in CI/CD build environments with no cache, | but so are the k8s Go projects I work on (go mod pulling in the | world). | | The only thing approaching 2 hours I've ever seen is building | the compiler from scratch. | yakubin wrote: | Although Rust compile times are rather bad, IME such | astronomical compile times are rather a result of terrible code | quality, a lot of code doing essentially nothing, just gluing | glue, to the point where it's normal to get a 50-60 frames deep | stack trace with most functions being named such that it would | suggest that all of them do basically the same. Then you look | at their bodies and... yep... doing the same, aka nothing. | | With Rust, procedural macros add a lot of overhead when | compiling, so I try to avoid them. | | At work, we have a C++ project which, without using distributed | builds with distributed ccache, running on powerful cloud | computers, takes 3h+. Code quality is adequate. Debugging | anything is a nightmare in such a codebase. | spoiler wrote: | While I agree that the compile times are bizarre, I don't | think we can jump to conclusions such as "such astronomical | compile times are rather a result of terrible code quality" | without knowing more about what's being compiled and under | what conditions it was being compiled! 
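As littlestymaar notes above, Cargo's resolver only has to do real search when a dependency is constrained from above as well as below, which is rare in practice. A small illustrative Cargo.toml fragment (some-crate is a made-up name):

    [dependencies]
    serde = "1"                  # the common case: any 1.x is acceptable,
                                 # so resolution is essentially trivial
    some-crate = ">=0.4, <0.6"   # an explicit upper bound -- the uncommon case
                                 # that can force the resolver to actually search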
| gameswithgo wrote: | Short answer is it doesn't normally take that long. You also | don't have the problem of cgo being slow or ever having to wait | for the data race checker to run. | | initial compile has to download and compile all dependencies. | after that compiling is incremental. still slower than go | though | adwn wrote: | Compiling Servo, a _web browser engine_ , takes 5-6 minutes for | an optimized build from scratch on my 6C/12T machine. So no, 2 | hour build times are not normal for all but the largest | software projects. | genghizkhan wrote: | Compiling paru, an AUR helper for Arch Linux, on the other | hand, takes me around 3 - 4 minutes on a 4C/8T i5 for an | optimised build. I think Rust compile times might just fall | within a narrow range. | Kaze404 wrote: | Compiling for production and compiling for development are two | very different processes, with the latter benefiting from less | optimizations and incremental compilation. It wouldn't take two | hours to test an idea. | wongarsu wrote: | In my experience, Rust compilation times aren't that different | from what I'm used to in the JavaScript world. Initial | compilation takes a minute or two (comparable to the runtime of | `npm install` on Windows), when you make changes you usually | get incremental compilation within a few seconds. | | I guess you can have projects that are just huge and/or run | into some pathological case that increases compile time a lot | (just like with C++), but for any subsequent compilations you | should get very fast incremental builds. | jackcviers3 wrote: | I get this question about large scala codebases too - clean | build and test cycles for things on underpowered containers in | cloud build pipelines are in the tens of minutes sometimes. | | One: my local dev machine has 16 gigs of memory and 16 cores. | What takes the tiny docker container with 1 gig and 2 vcpus 30 | minutes takes my computer about 1. | | Two: Incremental compilation and testOnly make build/test | cycles maybe a second / twenty seconds max, and most of that is | test runtime on complex property based tests. | | You just get by without a clean compile most of the time after | you build the project the first time. And really, a lot of | builds spend an inordinate amount of time just pulling in | external dependencies (which are also cached on my local | machine, but not on the containerized builds a lot of the | time). | lmilcin wrote: | I have worked for 3 years on a project where it took a whole | week to get the code compiled, signed by an external company | and deployed to the device so that I could see the results. | | I just learned to work without compiling for a long time. Over | time my productivity increased and the number of bugs fell | dramatically. | | Working this way requires you to really think about what you | are doing, which is always a good idea. | | This was over a decade ago and now I work mostly on Java | backends and I am happy that I typically spend days or even | weeks without ever compiling the code and that it usually works | the first time I run it. | | I can't think how I could get back. It looks really strange to | me to observe other developers constantly compiling and running | their code just to see if it works. It kinda looks as if they | did not exactly understand what they are doing because if they | did, they would be confident the implementation works. | | The only time when I actually run a lot of compile/execute | iterations is when I actually don't know how something works. 
I | typically do this to learn, and I typically use a separate toy | project for this. | nine_k wrote: | Often you can be certain that you know what tp expect from | your own code, but not from dependencies or external systems. | So checking that your assumptions about them are right is a | major reason to run and rerun your code. | lmilcin wrote: | That is good reason to minimize amount of dependencies and | only use the ones you know and can reason about. It is also | part of what I do to help me reason about code before I | execute it. | | As I mentioned, if I don't understand a dependency or | external system I make separate toy project where I can | rapidly experiment and learn. | | Think of it as having fun in a aircraft simulator. You play | with it so that you are prepared to fly the actual plane. | | Also checking your assumptions by trying to see if the code | works is a problem in itself. A lot of these broken | assumptions will not break your code immediately but maybe | sometime later. Maybe when your application is under load | or maybe when clock on the server is moved back by an hour | or maybe when the connection breaks in a certain way. | | Base your work on _knowing_ how something works and not | _assuming_. | | Best way to limit the risk of your application failing due | to broken assumption is to limit your reliance on | assumptions in the first place. | marcosdumay wrote: | I used to use approach, but the new heavily typed languages | bring a lot of really nice tools that you only get to use at | compile time. | | Specifically in Rust, you can use the language to guide you | through things like refactoring, thread synchronization, | correctness verification and a lot of other things. But you | have to run the compiler. | lmilcin wrote: | I don't write production code in Rust (though I learn for | my embedded hobby). | | But you can say the same for Java. IntelliJ IDEA internally | does equivalent of compilation and tells me exactly where | my code would fail to compile. | | So in a sense I am not strictly practicing my approach, but | I also don't see reason to do so if the tools are reliably | giving me hints when I made mistake writing something that | will not compile. | slunk wrote: | > So in a sense I am not strictly practicing my approach | | Developing in an IDE that compiles almost continuously is | about as far from the development philosophy you're | advocating for here as one could get :P | lmilcin wrote: | That doesn't make any sense. | | This isn't about throwing away tools for some idealized | goal. It is about using the tools that are available to | achieve best results without making you reliant on the | tools to the point you don't know what your program is | going to do without compiling and running. | | IDE helps catch a lot of stupid simple mistakes and that | helps save time. Why would that be bad? | slunk wrote: | I don't think using an IDE to catch lots of stupid simple | mistakes is bad. It's how I prefer to work. | | > It looks really strange to me to observe other | developers constantly compiling and running their code | just to see if it works. It kinda looks as if they did | not exactly understand what they are doing because if | they did, they would be confident the implementation | works. | | Explain to me how this statement doesn't apply to your | use of an IDE, but the other engineers you've observed | don't understand what they're doing. 
| [deleted] | lmilcin wrote: | If you can't read that sentence with comprehension none | of my explanations are going to help. | SV_BubbleTime wrote: | How do you like embedded Rust? | | I'm looking forward to someone making a legit IDE/suite | with support, no indication of it yet but I assume some | day! | lmilcin wrote: | I mainly work with STM32 Cortex-M3 MCUs (again, these are | my personal projects). | | Rust, well, "works". But there is still a bunch of issues | so I keep developing using C until I get the kinks ironed | out. | steveklabnik wrote: | It's working really well for us at Oxide. I even do it on | Windows :) | hinkley wrote: | My theory for Java is that it was frog boiling turned cargo | culting. | | Comparatively speaking Java compilation was very fast at the | beginning, so for instance Red-Green-Refactor (RGR) works | pretty well. There's a parallel to other creative jobs where | sometimes shuffling around the things you already have | reveals a pattern that leads to a breakthrough. | | But there are other feedback loops that, with J2EE in | particular, the cycle times started to creep up, and up, and | at some point if you haven't stopped and looked at what | you're doing you don't see how crazy things have gotten. RGR | still has a place there because you are typically _not_ | recompiling everything and you aren 't spooling up the | application to run unit tests. But making one line changes to | see how a page loads is just bonkers amounts of busy work. | | One of the bad dynamics is that people more like you also | tend to memorize the code, which is both bad for new hires | (circular logic does not reveal itself when you introduce one | assertion at at time, but does when you get hit with all of | them at once), and also incentivizes you push back on | refactoring. Because those damned smartasses keep moving | things around and they were Just Fine where they were. If | that happens you have cemented the entire codebase and | anything that is really wrong with it is going to stay wrong | until someone proposes a rewrite. And having learned the | wrong lessons the first time, we repeat them again in the | second iteration. | random_kris wrote: | I am one of those devs that is always compiling and checking | if their program is working. I am mainly working with java | and still doing this. | | Less so than when working with JavaScript. But please teach | me your ways hahha | ransom_rs wrote: | In case anyone else does this and is new to Rust - you can | use `cargo check` to type check/borrow check/everything | else without doing any codegen | lmilcin wrote: | This thread is no longer with regards to Rust or checking | whether the code compiles or not. It is about how you can | work with compilation times that are longer than a coffee | break. | lmilcin wrote: | Start by understanding that this compile/run process is a | crutch. Rather than use your knowledge, experience and | intelligence to predict if it works you resign yourself to | just waiting to see. | | This is a problem for many reasons. One is that this may | help you get something working, but with lack of deep | understanding the outcome will likely be subpar if only | because not all problems that you could have predicted will | show themselves on execution. | | Another is that it basically shrinks the part of brain that | is necessary for understanding and predicting behavior of | your code (figuratively). Sort of like driving with GPS | makes me helpless without it. 
| | Try to write larger stretches of code without compilation. | | Try to focus on modularizing your application so that you | can reason about modules separately. This is always a good | idea, but it is even more important when you need to be | able to predict how something works without trying it. | | When you have finally compiled your code and it failed, do | not immediately go to fix the problem. Try to spend a | moment to learn from the failure and improve your process | so that you minimize chance of this happening in the | future. | | Ask yourself, what you could have done to prevent this | problem from happening? Could you have specified some | function or module better? Could you have simplified your | code to be able to better reason about it? Would it help if | you have spent a bit more time getting acquainted with this | internal or external library before you decided to use it? | | From my experience, most of this comes down to following | things: | | - defining your modules and APIs correctly -- badly defined | modules make it difficult to predict the behavior, | | - finding simple solutions to problems -- complex solutions | tend to make it difficult to predict behavior, | | - using only tools you understand, | | - only interacting with existing code after you have | understood how it works (I typically at least look over the | code that I plan to use), | | - thinking hygiene (make sure you base your work on hard | facts and not beliefs), | | - refactoring, refactoring, refactoring -- first solution | to the problem is rarely optimal. I write something that | works and then immediately keep refactoring it removing any | unnecessary complexity until I am satisfied. Don't leave | refactoring for later -- when you have just written a piece | of code it is easiest to change it. | | - as much as possible, writing your code in a way that it | is not even allowed to produce wrong result. This is very | large topic so I won't explain. There is a lot of | techniques that you can research. | random_kris wrote: | Do you think there is a time and place where it is more | sensible to just go by trial and error. | | For example when I am interacting with a codebase for the | first time and I want to implement something I just keep | bashing and throwing shit at the wall untill something | sticks. After that I start working more in line of what | you described. | lmilcin wrote: | How exactly you arrive at understanding the codebase is | not as important. | | Just make sure you keep actual development separate from | learning the tool if you care for your results and | especially reliability. | | Now, I use various ways to learn the tools and codebase. | Running PoC for my idea or maintaining separate toy | project helps me with maintaining hygiene. | | For example, for the past year I have been spending a lot | of time learning reactive programming with RxJava and | Reactor. I have created a dozen small projects | illustrating various ideas for making reactive APIs, | processing pipelines, separating business logic from | infrastructure, composing reactive modules, etc. | | I did this with aim of purposeful learning rather than | writing production code, even though some of these in the | end migrated to be part of production codebase. | | I am now at a level where I can, again, write large | swaths of modules, libraries but now using reactive | paradigm, with a very good chance of it working | correctly, which for me validates that I more or less | understand what is going on. 
| throwawayboise wrote: | First job was in a mainframe environment, compiles were | pretty quick but the job queue was so big and developer | compiles were low enough priority that it could take hours of | wall clock time to turn around. | | I don't remember the specifics but while watching the job | queue on the terminal screen I discovered that the job | priority was editable. So I bumped it up, my compile job ran, | and I was happy. I only did this a few times before I got a | lecture from the system operations guys that was quite | explicit that I should never do this again. | | Yes, you figure out how to run code mentally, how to check | carefully for typos and syntax errors, etc. No Intellisense | then either, that's another modern crutch. | yosefk wrote: | I doubt your compile times were due to pattern matching which | TFA demonstrates to be NP-complete, any more than C++'s | compile-times are due to the undecidability of template | metaprogramming. | | (Though I guess one might argue that slow compile times on | typical code and rarely-used computationally hard features have | the same root cause of not treating fast compile times as a top | priority.) | tialaramex wrote: | Right, and C++ got a lot of its compiler performance for | large projects by making this an embarrassingly parallel | problem even though the consequences are negative for the | main consumers of the code (people, not compilers). | | One cost of being embarrassingly parallel is the One | Definition Rule. If we can compile N different code units in | parallel, but they're all allowed to define things in the | same namespace, obviously those definitions might contradict | each other and we wouldn't notice. So, the C++ language | explicitly forbids this, knowing you'll probably do it | anyway, at least by mistake. If (when) you do, that isn't a | valid C++ program, but the compiler isn't expected to produce | any diagnostic (warning or error). So, you get a binary, but | the language doesn't care what that binary does. Maybe it | does exactly what you expected. Maybe it does _almost_ | exactly what you expected. If not too bad, the One Definition | Rule means your compiler was _fast_ and that 's what matters. | sakex wrote: | I usually fails at link time doesn't it? | bluGill wrote: | No. I have intentionally violated the one definition rule | and nothing broke at all. In my case I wrote a second | timer, which had hooks my unit test system could use to | advance time without having to inject timers. | | It will fail at link time if you link everything into the | same library. Even here there is an escape: there are | ways to mark something as a weak symbol and than the | linker won't complain about more than one definition. | | See your OS documentation for how this works on your | implementation. (though don't be surprised if the | documentation is wrong...) | brandmeyer wrote: | > I wrote a second timer, which had hooks my unit test | system could use to advance time without having to inject | timers. | | This use case isn't an ODR violation. Its just using the | linker to mock an interface. | | > It will fail at link time if you link everything into | the same library. | | _That_ is an ODR violation, although there are | variations on this pattern that are not required to be | detectable. Template instantiation is an easy way to get | a silent ODR violation. | jcelerier wrote: | > No. I have intentionally violated the one definition | rule and nothing broke at all. | | did you use LTO ? 
It always catches ODR issues for me. | There is also GNU Gold's --detect-odr-violations switch. | bluGill wrote: | No. LTO is something I've been meaning to look at. Though | if I can't violate the ODR intentionally I might not be | interested. | jcelerier wrote: | If you violate ODR intentionally you are just writing | code that _will_ break in a future version of GCC / | Clang / .. | bluGill wrote: | That is a risk I know I'm taking. | | Though I've been tempted to write a paper for C++ to make | it defined behavior. I know it works in some form on most | implementations even though it isn't legal. Thus there | seems to be some other use cases for it that could/should | be formalized. If anyone can give me other examples of | why you want to do this and what the rules are on each | platform I'll take a shot at it. | jcelerier wrote: | > I know it works in some form on most implementations | even though it isn't legal. | | It definitely does not; every time I had an ODR issue it | caused actual bugs. For instance, dynamic_cast not | working because a typeid was defined in two shared | objects, etc. | | What would be the behaviour you expect if you have
|             // a.cpp
|             int constant() { return 123; }
|
|             // b.cpp
|             int constant() { return 456; }
|
|             // c.cpp
|             int constant();
|             int main() { return constant(); }
| | How could you define this meaningfully other than "hard | error"? | | E.g. here with gcc, if a and b are put into shared | libraries, the return value depends on _the order in | which the libraries are passed to g++ when linking_. E.g.
|             g++ c.cpp a.so b.so
| calls the version in a.so, while
|             g++ c.cpp b.so a.so
| calls the version in b.so. | bluGill wrote: | You can do that because the order in which libraries are | passed to the linker is something you can control. Of | course linkers don't have to do this, and future versions | can do something different, but it works and the rules | for gcc are currently "the order in which the libraries | are passed to g++ when linking", which is defined. Of | course gcc has the right to change those rules (I suspect | the real rules are a bit more complex). | | Gcc also has the concept of weak symbols which, if | invoked (and the linker supports it), would allow you to | make one of the two weaker than the other, and then the | whole thing doesn't depend on link order. Visual C++ also | seems to have something like this, but I'm sure it is | different. | | Like I said, I want to write a paper to make it defined - | but the paper will be a lot longer than would fit in a | response here, and will depend on information that I | currently don't know. | IshKebab wrote: | Were you compiling Servo on a Raspberry Pi or something? 2 | hours is ridiculous even for Rust. | | > How do you get anything done when the cost of testing an idea | is several hours? | | Incremental compilation. Depending on the project you can get | the edit-compile cycle down to somewhere between 1 second and 5 | minutes. It's nowhere near as fast as Go but it's not like | anyone is actually making edits then waiting 2 hours for them | to compile. | weavie wrote: | Incremental compilation. Once you get the first build done, | subsequent builds are much faster. | | Also a project is likely to be split up over multiple crates, | so a lot of the changes you make will require building and | testing just that one crate. | runevault wrote: | Isn't incremental compile disabled currently? I thought they | found a significant bug and so turned it off until they could | validate a fix. 
| kibwen wrote: | The grandparent was building a binary utility, which means | they were (hopefully!) building in release mode, which | doesn't use incremental compilation by default in any | version of the compiler. | | For debug builds, where incremental is usually on by | default, it was disabled for 1.52.1 due to bugs, and then | kept off in 1.53 out of an abundance of caution. It should | be back on in 1.54. | runevault wrote: | Okay, that's what I thought (and the fact that it isn't | officially turned back on yet was what I was trying to hint | at with "abundance of caution"). | ekidd wrote: | > How do you get anything done when the cost of testing an idea | is several hours?! | | My Rust compile times are usually between 10 and 30 seconds. | Some tricks that help: | | - Get a good development machine. I prefer a Dell Precision | laptop from the last couple of years, with plenty of cores. A | 5-year-old laptop with 2 cores will be a _lot_ slower. | | - Use Visual Studio Code and the rust-analyzer plugin. This | will give you feedback in the editor while you're working. | | - Rely on the type system to identify most problems before ever | generating code. | | - Minimize the use of slow-to-compile libraries that rely on | complex generic types. Diesel, for example, is a great library, | but it's slow to compile. | | - Install cargo-watch, and use it to automatically re-run the | tests. | | Also, remember that incrementally re-compiling a single crate | in debug mode will be much faster than building an optimized | application and all its dependencies from scratch. | | TL;DR: Rust compilation times are less than ideal, but with a | fast laptop and rust-analyzer, it's possible to spend very | little time actually waiting for the compiler. | remix2000 wrote: | Just get a new machine for a couple zillion bucks, why | haven't we thought of it earlier? :^) | isaacimagine wrote: | As a developer who uses Rust a bunch (work & personal), | this is what I ended up having to do. I'm joking, but this | one simple trick can cut compile times _in half_. | edflsafoiewq wrote: | Yeah, can really see that "empowering everyone" motto | coming into play. | ekidd wrote: | Yes. If you are a professional developer who gets paid to | work on large Rust projects all day long, then a US$2500 | high-end laptop will pay for itself many times over. (I'm | assuming US or European salaries here.) | | I've gotten by with much less for personal projects. But | Rust will make very efficient use of (say) an 8-core i9 if | you have one. | nicetryguy wrote: | I'm pretty excited for the Linux kernel to take three weeks to | build. | stonemetal12 wrote: | If I had to guess, it is because all of the dependencies had to | be compiled too. | | In a normal Rust application you break it up into crates. | Crates are the unit of compilation, and are only recompiled | when something in the crate changes. In a "normal" developer | flow you only touch one or two crates at a time, so normal | developer compile time would be a couple of minutes at most. | Even for a new build on a machine, most developers would never | see the two-hour compile time because they would have gotten | precompiled dependencies. | Nullabillity wrote: | > In a "normal" developer flow you only touch one or two crates | at a time, so normal developer compile time would be a couple | of minutes at most. | | Obviously this depends on your computer, but incremental Rust | builds should typically be on the order of seconds, not minutes. | | > Even for a new build on a machine, most developers would | never see the two-hour compile time because they would have | gotten precompiled dependencies. | | Cargo doesn't do precompiled dependencies (even between | projects on the same machine).
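For readers who haven't seen the crate split that stonemetal12 describes, a rough sketch of a Cargo workspace (the member names are made up for illustration, not taken from the thread):

    # Cargo.toml at the workspace root
    [workspace]
    members = ["core", "cli", "server"]

    # Each member directory has its own Cargo.toml and src/. Cargo treats
    # every member crate as a separate compilation unit, so editing code in
    # cli/ does not force core/ or server/ to be rebuilt (unless they depend
    # on the crate that changed).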
| tester34 wrote: | Jesus Christ. | | Rust's compilation times are as terrible as C++'s? | okamiueru wrote: | I don't know about Him, but at least when working with C++ I | feel the compile time comes down entirely to deliberate tradeoffs. | With better planning and more effort put into forward | headers, better implementation boundaries with pimpl-like | constructs and otherwise linking, and avoiding | metaprogramming entirely, I've not found it to be an issue. | | Incremental C++ builds can be lightning fast, or they can be | ridiculously slow. I've worked on large projects where change | + incremental compile + test was less than 5 seconds. And | I've worked on projects where no effort was made, where the | same could take 10 minutes. | gpderetta wrote: | Keeping your C++ compilation times reasonable requires | eternal vigilance (or a distributed system with aggressive | caching). | kzrdude wrote: | It's the same ballpark. | spoiler wrote: | In my experience (C++ dev at work), C++ generally _feels_ | slower than Rust. | | Edit: I quickly realised this might only be because of Rust's | (in my opinion) superior type system and tooling, so maybe | it just requires less waiting for compilations than when | working with C++. | | Edit 2: I've never actually measured Rust compilation | times, but even for large-ish graphical codebases (wezterm, | alacritty, bevy games) I've compiled from scratch, it felt | noticeably faster even then. | tester34 wrote: | I think we must do better. | | AAA games need fewer resources than those tree walkers. | IshKebab wrote: | In my experience Rust is faster for full clean builds, even | if you use slow dependencies like Serde. | | However, C++'s compilation units are smaller than Rust's, which | helps it to have faster incremental compilation times for | small changes. | | So, same order of magnitude basically. | kibwen wrote: | I suspect the machine you were using was swapping itself to | death, because I've never experienced anything resembling that | in Rust. | | I also presume you were compiling in release mode, which is | good for producing extremely fast programs but not something I | bother with in regular development (in contrast to Go, which | has no distinction between debug and release optimization | levels). | | _> How do you get anything done._ | | The vast majority of my workflow just involves seeing if my | code typechecks, for which I don't even need to build the | program (so no codegen, no linking). This I do very frequently, | as a sanity check on my work. The command for this is `cargo | check`. This takes less than a second for small projects, one | to five seconds for medium projects, and one to twenty seconds | for large projects. | adwn wrote: | > _I suspect the machine you were using was swapping itself | to death_ | | Good point. I recently upgraded my machine to 64 GiB RAM, | because 16 GiB filled up pretty quickly with parallel | compilation, several VS Code/rust-analyzer instances and a | couple of browser tabs open. | tinco wrote: | Go is geared towards productivity. If you have a problem you | can solve adequately in Go, why not just do it in Go? | | Because they gained popularity roughly around the same time, | many people think they are competing languages, but they're | really not.
| | Ruby is my main programming language; if I need something to be | fast and/or lightweight I switch to Go, sacrificing some | comfort. And if I need absolute lowest possible overhead, I | switch to Rust, sacrificing a whole lot more comfort. Rust is | more expressive than Go, but if your Go project is so large | that it needs a stronger type system, in my opinion you should | switch to a language like C# or Scala. | sesm wrote: | Go is geared towards the productivity of specific people in a | specific environment: fresh grads at Google. | tinco wrote: | The first time I used it, I had not written a line of Go prior; | I rewrote our data ingestion gateway from Ruby to Go. It | took me one week, and it had full feature parity. I don't | think I had the GPA at university to land a fresh grad job | at Google. | | Sure, the tooling around it (besides the compiler itself) | was the dumbest 21st-century tooling I've seen (this was ~6 | years ago), but it was quick to learn, quick to write, | quick to compile and quick to run, so who am I to complain. | sim_card_map wrote: | How was the tooling dumb? | tinco wrote: | It had this weird system where all your projects were | supposed to be located in one directory structure that | mapped to their git namespaces; it made no sense at all | and was a pain to work around. | spoiler wrote: | I'm really curious what you were compiling, and if you remember | the machine specs (and whether the machine was under other load | at the time). Two hours seems _beyond_ ridiculous. | | Also, I'm assuming it was a full (as opposed to an incremental) | release build? But even then, I've never had to wait that | long. | lmilcin wrote: | Two hours is not beyond ridiculous. A lot of projects in the | past compiled for way longer than that, days even. | | In the 90s and early 2000s, if you worked on a large desktop | application (the size of Office or Photoshop) you could | expect to spend a business day waiting for compilation | results. | spoiler wrote: | Sorry, I should've clarified that I specifically meant | recent versions of Rust. I wasn't speaking in the | general sense. I've experienced slow compilation times in | C++ projects, although not of the same magnitude you | describe (i.e. having to wait days)! | gpderetta wrote: | Around the turn of the millennium I remember recompiling, | for some reason I don't remember, the whole of KDE. That | took a bit... | swiftcoder wrote: | What on earth did you compile? I can't recall ever having | compiled a Rust project that took more than 10 minutes from | scratch. | | I do have memories of the days when compiling Chromium or AOSP | was a 4-hour battle, though :) | [deleted] | jynelson wrote: | The Rust compiler itself can take as long as 2 hours on a | modern laptop if you do a full stage 2 build. I don't think | that's what the top poster was talking about, though; it | sounded like they had a Rust toolchain already. | steveklabnik wrote: | (You know this, but for anyone else reading, don't forget | that doing this compiles LLVM, which is in and of itself a | massive C++ project, as well as compiling the compiler | itself multiple times. That's part of why it would be such | an outlier.) | swiftcoder wrote: | Right. Nested C/C++ dependencies are the only case I can | think of where drastically long compile times are common. | jandrese wrote: | Or maybe someone included Boost? | | Sometimes it is just that someone drank the Kool-Aid and | made everything a template, then split their code across | hundreds of tiny files and did the one big massive | include at the top of every file that pulls in | _everything_. | eptcyka wrote: | I've only had 2-hour build times when I was compiling a Rust | toolchain on an emulated ARM system running on x86. | JD557 wrote: | > How do you get anything done when the cost of testing an idea | is several hours?! | | I haven't used Rust in production, but usually you can just use | `cargo check` to run only the type check. Using `cargo build` is | also usually much faster than `cargo build --release`. | | Having said that, at least in my toy projects it was not | uncommon to have to compile using `--release` to run the tests | (since otherwise the tests would take forever). | littlestymaar wrote: | > Having said that, at least in my toy projects it was not | uncommon to have to compile using `--release` to run the | tests (since otherwise the tests would take forever). | | Maybe you're already aware of it, but if the reason why | your tests are slow is not "your code not being optimized" | but "your dependencies not being optimized", then Cargo | profile overrides[1] can save your life. You can have one | specific dependency (or all of them) built in release | mode even when you're building your own code in debug mode. | The first development build is going to be quite slow, but | after that, you're going to have the best of both worlds. | | [1]: https://doc.rust-lang.org/cargo/reference/profiles.html#over...
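A minimal sketch of the profile-override trick littlestymaar describes above (the opt-level value and crate names are just examples, not from the thread):

    # In Cargo.toml: compile all dependencies with optimizations even for
    # dev builds, while your own crate keeps fast, unoptimized debug builds.
    [profile.dev.package."*"]
    opt-level = 3

    # A single slow dependency can be targeted the same way, e.g.
    # [profile.dev.package.some-slow-crate] with its own opt-level.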
| lower wrote: | For languages with Hindley-Milner typing, like SML and OCaml, it | has long been known that type-checking is even worse in the worst | case (DEXPTIME-hard) [1]. But the programs where this matters | aren't what one would write by hand, typically. | | [1] https://dl.acm.org/doi/10.1145/96709.96748 | contravariant wrote: | To some extent it wouldn't even be surprising for a | sufficiently powerful language to be EXPSPACE. Pretty much | anything with sufficiently powerful macros would be EXPSPACE, | or worse. | lower wrote: | Maybe it's a good idea to add a concrete example (can't edit | the reply anymore): | | OCaml:
|     let x1 = fun y -> (y, y) in
|     let x2 = fun y -> x1 (x1 y) in
|     let x3 = fun y -> x2 (x2 y) in
|     let x4 = fun y -> x3 (x3 y) in
|     let x5 = fun y -> x4 (x4 y) in
|     let x6 = fun y -> x5 (x5 y) in
|     x6 (fun z -> z)
| | Haskell:
|     test = let x1 = \y -> (y, y)
|                x2 = \y -> x1 (x1 y)
|                x3 = \y -> x2 (x2 y)
|                x4 = \y -> x3 (x3 y)
|                x5 = \y -> x4 (x4 y)
|                x6 = \y -> x5 (x5 y)
|            in x6 (\z -> z)
| | If you can compile these, add x7. The effort increases | exponentially as one adds x7, x8 and so on. | NieDzejkob wrote: | The fact that x1 is \y -> (y, y) is superficial, though. | Something like x1 = Just also increases exponentially, and | makes clear that using a deduplicated DAG for representing | type terms doesn't solve this. | lower wrote: | With x1 = Just, the program is accepted instantly by ghc. I | think you need a type variable to appear twice. | NieDzejkob wrote: | Ah, it's still exponential, but in the size of the type, | and the constants of the complexity are a bit different, | so you need around x30. | lower wrote: | Ah, you're right. If one goes to x_{i+1}, then there will | be 2^i Maybes in the type. The number of Maybe | occurrences in the type doubles at each step, and sharing | won't help there. | Yajirobe wrote: | Could you provide a Python example? | Cu3PO42 wrote: | Python does not perform static type checking, therefore | this problem doesn't apply directly.
| You can build a program that takes exponential time to | execute, but that is probably not a revelation to you. | Enginerrrd wrote: | Building a program that takes exponential time to execute | because of Python's run-time dynamic type checking might be | interesting... | shrimpx wrote: | I don't think that's possible, because I don't think | there's any aspect of the runtime type checker that | descends into complex type structures. It just checks if | runtime type tags match when you invoke an operator, or | that an attribute exists when you try to access it. | daxfohl wrote: | Formerly, id id id id id id id id id... would do it, but it | looks like someone has put in a fix for that. | lower wrote: | Yes, types are usually represented by a DAG with sharing. | OCaml does so, for example, and I assume ghc does something | similar. | | So, while id id has type ('a -> 'a) -> ('a -> 'a), this is | stored in memory by a pointer structure that amounts to let | 'b = ('a -> 'a) in ('b -> 'b). The type of id id id would | become let 'b = ('a -> 'a) in let 'c = ('b -> 'b) in ('c -> | 'c). This grows linearly. If one were to write out the | types without sharing, then their size would grow | exponentially. | chowells wrote: | There was a bug in ghci (the GHC REPL) for a while which | made it take exponential time to print the type of `id id | id id id id id id id`. It used the linear representation | for type-checking, but the pretty-printer was not | implemented so cleverly. | | Apparently that's been fixed. I can't confirm, because | that's not something I ever check the type of. | kmill wrote: | It seems to me that id id id ... id always has type a -> | a. If you're referring to the type of the first id in the | sequence, then what you're saying makes sense. | Ericson2314 wrote: | Note that it is polynomial with respect to the size of the | _output_. | | That is to say, pathological expressions don't stay small while | inducing horrendous searching; rather, they merely get a lot | bigger (and that can cause more trouble for the rest of the | program). | | That means a heuristic to bail out when something grows a lot | would probably not violate the user's intent, and would get it | back into polynomial time. | chalst wrote: | No, this is the time complexity of type checking before any | computation on the terms is performed. | Ericson2314 wrote: | Yes, I am saying that while it is exponential with respect | to the input, it is polynomial with respect to the | output. | shadowgovt wrote: | The points made by the essay here are key, and actually a good | example of trade-offs in software engineering. It is definitely | okay to incorporate an algorithm that is NP-hard into the core of | a performant system as long as you can have some expectation that | the input size on the NP-hard part will be small. Sometimes it | even lets you simplify another piece of the problem via | pre-computation, so that the part of the system that accepts | large inputs doesn't need a costly solution algorithm. | estebank wrote: | The actual code that handles exhaustiveness checks has extensive | explanations of how it works and what it was inspired by. | | https://github.com/rust-lang/rust/blob/master/compiler/rustc...
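To make the connection concrete, here is a small hand-written sketch (not the article's example, and not taken from the compiler source linked above): each match arm rules out some assignments of (a, b, c), so checking exhaustiveness amounts to asking whether any assignment is left uncovered, i.e. whether a small boolean formula is satisfiable.

    fn f(a: bool, b: bool, c: bool) -> u8 {
        match (a, b, c) {
            (true, _, _) => 0,  // covers every case with a == true
            (_, false, _) => 1, // covers every case with b == false
            (_, _, true) => 2,  // covers every case with c == true
            // Remove the arm below and rustc reports the single uncovered
            // case, `(false, true, false)`: exactly a satisfying assignment
            // of the formula !a && b && !c.
            (false, true, false) => 3,
        }
    }

    fn main() {
        assert_eq!(f(false, true, false), 3);
    }

The article's reduction builds much larger matches of this shape, which is where the worst-case NP-hardness comes from.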
| rocqua wrote: | > The standard representation of such a formula is the | conjunctive normal form, or simply: AND-of-ORs. Our previous | example would look like this when transformed:
|     A and B and (A or B) and (!A or !B)
| | There is another normal form, the disjunctive normal form. This | is in the form of an OR of ANDs. Interestingly, every formula can | be represented in this form. And SAT is rather easy to determine | for this form. | | The catch is that taking a logical formula and placing it in | disjunctive normal form is actually a rather arduous, | non-polynomial process. | maweki wrote: | > And SAT is rather easy to determine for this form. | | Gave me a chuckle; that's quite an understatement, as the | decision procedure is only: is there a single clause, and is it | just "False"? | | > The catch is that taking a logical formula and placing it in | disjunctive normal form is actually a rather arduous, | non-polynomial process. | | Basically, writing out all assignments that make the formula | true. Potentially exponentially many, as there are 2^n possible | assignments of n variables. So this transformation is basically | just solving and writing down all models. | mmarx wrote: | The conjunctive normal form can be exponentially large in the | worst case as well, though; see the footnote in the article. | In practice that is not a problem, since it suffices to find | a formula in CNF that is satisfiable precisely when the | original formula is satisfiable, even though it might not be | equivalent. ___________________________________________________________________ (page generated 2021-07-08 23:01 UTC)