[HN Gopher] Pointers are Easy, Optimization is Complicated
___________________________________________________________________
Pointers are Easy, Optimization is Complicated
Author : pcr910303
Score  : 16 points
Date   : 2020-09-15 09:07 UTC (1 days ago)
(HTM) web link (blog.metaobject.com)
(TXT) w3m dump (blog.metaobject.com)
| Sulik wrote:
| To me, the philosophy of C (and even C++ before things became
| nuts) is that you should be able to reasonably guess the assembly
| code resulting from the C code, the idea being that you can write
| assembly code with much less typing. These days, things seem to
| be moving in a more dogmatic direction with the underlying
| assumption that the vast majority of programmers are bad
| programmers.
| sanxiyn wrote:
| If pointers are integers, all pointers escape. This is not a
| matter of "nice to have" optimizations.
|
| (The usual model used by compilers is that pointers are not
| integers, but a pointer _becomes_ an integer when it is cast to
| an integer. This prevents the majority of pointers, which are
| never cast to integers, from escaping, which is important.)
| jcranmer wrote:
| While it's true that the difficulty of building a semantic model
| of pointers is mostly related to permitting optimizations, I
| don't think the author actually understands how many
| optimizations are actually disabled by a simplistic
| pointers-are-integers model.
|
| Let me give an example here. Consider these two functions:
|     int a() { int x = 5; return x; }
|     int b() { return 5; }
|
| In a simplistic, pointers-are-integers model, these two functions
| are _not equivalent_, and it is illegal for any compiler to
| transform the first into the second. This is because declaring
| the variable x means it must have some associated storage, which
| can be accessed by some unknown integer name.
| Any pointer which points to that integer value can then observe
| its value, and deleting the store to that temporary value is a
| potentially observable change to a legal program.
|
| To permit this basic optimization, you need to have some sort of
| rule that allows you to reason that a value whose address is
| never taken can never be legally accessed by a pointer. To
| actually effect this rule in semantics, you now have to start
| tracking extra metadata about what pointers can and cannot
| access... which is exactly the kind of model that's being
| criticized.
| dnautics wrote:
| maybe there's a question here: If you really need the kind of
| performance afforded by that level of fine-grained
| optimization, shouldn't you instead consider emitting better
| code, at least in hot loops? It seems less dangerous to have
| the code do semantically exactly what you tell it to in the 90%
| case, because otherwise you can unwittingly introduce
| difficult-to-reason-about code. For a GP programming language,
| this could be possibly deployed in mission-critical code where
| lives are at stake.
| sanxiyn wrote:
| I am not sure what the question is. Are you proposing manual
| marking of hot code and applying the pointers-are-integers model
| to all code but an optimizable pointer model to marked code?
| petergeoghegan wrote:
| I read the article as a criticism of the strict aliasing rules
| in C. I think that you're attributing something to the author
| that they couldn't possibly believe.
| sanxiyn wrote:
| This is not a strict aliasing rule. I actually agree strict
| aliasing optimizations are mostly useless. (Exhibit: Rust has
| no strict aliasing.)
| tjalfi wrote:
| If you want PCC, you know where you can find it.
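A minimal C sketch of the address-taken rule jcranmer describes above, and of sanxiyn's point that a pointer escapes once it is cast to an integer. This is not code from the thread: the function names `a`, `c`, and `poke` are hypothetical, and the sketch assumes a platform that provides `uintptr_t`.

```c
#include <stdint.h>

/* The address of x is never taken, so no pointer can legally refer
   to it. A compiler using the usual pointer model is free to treat
   this function as if it were `return 5;`. */
int a(void) {
    int x = 5;
    return x;
}

/* Hypothetical helper standing in for code the compiler cannot see:
   it turns an integer back into a pointer and writes through it. */
void poke(uintptr_t addr) {
    int *p = (int *)(void *)addr;
    *p = 7;
}

/* Here the address of x is cast to an integer and passed along, so x
   has escaped. The compiler must keep x in memory across the call and
   reload it afterwards; it may not assume the result is 5. */
int c(void) {
    int x = 5;
    poke((uintptr_t)(void *)&x);
    return x;
}
```

Calling `a()` yields 5 and calling `c()` yields 7. Under a pure pointers-are-integers model the compiler would have to treat every local like the `x` in `c`, since any integer anywhere might name its storage.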
| Taniwha wrote:
| Really what is being argued here is that C pointer optimisation
| is difficult - and because C is designed to do stuff like coding
| the insides of the 'new' primitive (ie malloc) you can't make the
| sorts of assumptions about aliasing that you can in other
| languages.
|
| I'd argue that if you don't understand that you probably
| shouldn't be coding in C (I'll let others argue about C++)
| Sulik wrote:
| +1
| Diggsey wrote:
| > I prefer the simple and obvious pointer model. Vastly.
|
| That's great, but then you're not writing C anymore. When you
| write C (or indeed, almost any language in use today) you are
| writing code against an abstract machine, not against real
| hardware.
|
| Pointers are integers on (most) real hardware. They are
| indisputably _not_ integers in the C abstract machine.
|
| The reason for this abstract machine existing is not to enable
| optimizations, it's to allow the code to be portable. If there
| was no abstract machine, you would have to reason about the
| correctness of your code for each target architecture
| independently. With an abstract machine, you can reason that your
| code is correct on the abstract machine, and the compiler can do
| the hard work of translating that to each target architecture in
| a way that preserves well-defined behaviour.
|
| Now, given that the abstract machine exists, you could ask why it
| is so complicated, and the answer this time _is_ for
| optimizations, and also to allow the translation to the target
| architecture to introduce as little overhead as possible.
___________________________________________________________________
(page generated 2020-09-16 23:00 UTC)