[HN Gopher] My own C compiler on my own compiler infrastructure ___________________________________________________________________ My own C compiler on my own compiler infrastructure Author : uint256_t Score : 113 points Date : 2020-10-23 14:43 UTC (8 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | mshockwave wrote: | I'm pretty excited to see the infrastructure (i.e. cilk). But I'm | wondering if it "fixes" some of the flaw in LLVM IR? For example, | (not really a flaw but I rather prefer) using basic block | parameters instead of PHI nodes. | jcranmer wrote: | There's no theoretical difference between phis and basic block | parameters. | | When you do SSA conversion, you need a notion that allows you | to bind a value to a variable along an edge in the control-flow | graph. You can choose to actually have some edge data that | indicates these bindings, but this tends to be a poor | engineering solution. If it has to go into a basic block, you | need to either bind it to the source basic block or the target | basic block. Bind it to the target, and you get phi nodes. Bind | it to the source, and you get basic block parameters. | | The choice of which to use is an engineering tradeoff. Phis | have the issue that they are pseudo-instructions existing as | the leading instructions in a basic block, and you may well | need to special-case them in transformation or analysis passes. | Using basic block arguments also provides a more direct analogy | to interprocedural optimization, and you could potentially use | the exact same code with a touch of specialization to handle | that domain. | | Since the two forms are moving the edge instructions to | different places, they make different questions easier. With a | phi node, it's quick to see if all predecessors are providing | the same value. With a basic block argument, it's quick to see | which variables are live out (assuming you're restricted to | using only basic block-local variables [1]). In general, if you | need to query all-predecessor analyses, it's easier with phis; | if you need all-successor analyses, it's easier with basic | block arguments. | | [1] The choice of where variables can be used is _also_ | orthogonal to phis versus basic block arguments. Loop-closed | SSA already exists, which prevents you from using a variable | outside of its loop. | tom_mellior wrote: | > There's no theoretical difference between phis and basic | block parameters. | | I question this claim. Don't basic block parameters allow you | to do interesting stuff also at _split_ points, not only at | _join_ points? | | For example: if (x == 0) { y = | 2 * x; } else { y = 3 * x; } | | In SSA form this is: if (x == 0) jump | B1 else jump B2 B1: y1 = 2 * x | jump join B2: y2 = 3 * x jump | join join: y = phi(y1, y2) | | You need conditional constant propagation to realize that y1 | must be 0. | | But with basic block arguments you could do this (improvising | notation on the spot): if (x == 0) jump | B1(0) else jump B2(x) B1(arg1): y1 = 2 * | arg1; jump join(y1) B2(arg2): | y2 = 3 * arg2; jump join(y2) join(y): | ... | | Basically, as soon as information becomes available (x is 0 | along some path), that is explicit in the IR. Realizing that | y1 is 0 only takes simple, non-conditional sparse constant | propagation. You might not call that a "theoretical" | difference (what theory are we talking about?), but it does | give you a simple way to access information that is not | explicit in SSA form. | | In effect this latter representation is something like Static | Single Information (SSI) form: https://citeseerx.ist.psu.edu/ | viewdoc/summary?doi=10.1.1.1.9... | jcranmer wrote: | As I explained in my footnote, you can do the same thing | with SSA form: if (x == 0) jump B1 else | jump B2 B1: x1 = phi(x) y1 = 2 * x1 | jump join B2: x2 = phi(x) y2 = 3 * x2 | jump join join: y = phi(y1, y2) | | And then you can do the same propagation of the x == 0 to | the phi in B1 pretty easily. It is somewhat harder than the | basic-block argument case, but that's because basic block | arguments rely on the data binding being put in the head of | the edge and we're pushing data from the head into the | edge. The next step of checking if all incoming edges are | identically 0 would be easier in the phi-form, since the | data binding is in the tail node, and we're pulling data | from the edge into the tail. | uint256_t wrote: | (I'm the owner of cilk.) I didn't know the basic block | parameters. It seems better than using phi nodes. Thanks. | mshockwave wrote: | Yeah basic block parameters are especially useful when doing | data flow analysis | tptacek wrote: | Huh! I stopped learnin' compilery stuff after SSA and was | unaware of the approach. If anyone else is wondering, here's a | good Reddit thread on bblock parameters: | | https://www.reddit.com/r/ProgrammingLanguages/comments/9z8qu... | mhh__ wrote: | It's relatively new in the mainstream of compiler's, i.e. | MSVC only got SSA in its backend 5 years ago or so. | | I think Appel wrote a paper comparing SSA to functional | programming way back in the 90s. | bonzini wrote: | It's also included in his books ("Modern compiler | programming in <insert language name>", where the language | is only used in the first third of the book). | einpoklum wrote: | It says the compiler is "Powered by Cilk". But - what does that | mean? | | https://en.wikipedia.org/wiki/Cilk | | isn't Cilk a C-like language with additions for parallel | execution? How is that a compiler infrastructure? | sergeykish wrote: | cilk in dependencies, should search "rust cilk" | | https://github.com/maekawatoshiki/cilk | Retr0spectrum wrote: | See the parent directory of the repo linked in the OP: | https://github.com/maekawatoshiki/cilk | einpoklum wrote: | Ah, interesting, thanks. | | Well, the example shows convenient syntax... but - it also | says it's a "toy" project. | alcidesfonseca wrote: | I agree that Cilk is not a great name for compiler stuff as | there is already a Cilk extension to C. | | That said, it's the author's prerogative. ___________________________________________________________________ (page generated 2020-10-23 23:00 UTC)