[HN Gopher] Pointers are Easy, Optimization is Complicated
       ___________________________________________________________________
        
       Pointers are Easy, Optimization is Complicated
        
       Author : pcr910303
       Score  : 16 points
       Date   : 2020-09-15 09:07 UTC (1 days ago)
        
 (HTM) web link (blog.metaobject.com)
 (TXT) w3m dump (blog.metaobject.com)
        
       | Sulik wrote:
       | To me, the philosophy of C (and even C++ before things became
       | nuts) is that you should be able to reasonably guess the assembly
       | code resulting from the C code, the idea being that you can write
       | assembly code with much less typing. These days, things seem to
       | be moving in a more dogmatic direction with the underlying
       | assumption that the vast majority of programmers are bad
       | programmers.
        
       | sanxiyn wrote:
       | If pointers are integers, all pointers escape. This is not a
       | matter of "nice to have" optimizations.
       | 
       | (The usual model used by compilers is that pointers are not
       | integers, but a pointer _becomes_ an integer when it is cast to
       | an integer. This prevents the majority of pointers, which are
       | never cast to integers, from escaping, which is important.)
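
A minimal C sketch of the distinction described in this comment, assuming a compiler that treats a local as escaped only once its address has been exposed. The names `sum_private`, `sum_exposed`, and `leaked` are hypothetical and come from neither the article nor the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical global that receives the integer form of a pointer. */
static uintptr_t leaked;

/* The address of x is used, but never converted to an integer, so the
   compiler can see every access to x and may fold this to `return 6`. */
static int sum_private(void) {
    int x = 5;
    int *p = &x;
    return *p + 1;
}

/* Here the address of x is cast to an integer and stored in a global.
   From that point on x has escaped: any code that can read `leaked`
   can rebuild a pointer to x while x is still alive. */
static int sum_exposed(void) {
    int x = 5;
    leaked = (uintptr_t)(void *)&x;   /* pointer becomes an integer */
    int *q = (int *)(void *)leaked;   /* integer becomes a pointer again */
    assert(q == &x);                  /* round trip through uintptr_t */
    *q = 7;                           /* writes to x through the rebuilt pointer */
    return x + 1;                     /* must observe the write, so 8 */
}
```

Under a pure pointers-are-integers model, both functions would have to be compiled as conservatively as `sum_exposed`, since any integer could name `x`.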
        
       | jcranmer wrote:
       | While it's true that the difficulty of building a semantic model
       | of pointers is mostly related to permitting optimizations, I
       | don't think the author understands how many optimizations are
       | actually disabled by a simplistic pointers-are-integers
       | model.
       | 
       | Let me give an example here. Consider these two functions:
       | int a(void) { int x = 5; return x; }
       | int b(void) { return 5; }
       | 
       | In a simplistic, pointers-are-integers model, these two functions
       | are _not equivalent_, and it is illegal for any compiler to
       | transform the first into the second. This is because declaring
       | the variable x means it must have some associated storage, which
       | can be accessed by some unknown integer name. Any pointer which
       | points to that integer value can then observe its value, and
       | deleting the store to that temporary value is a potentially
       | observable change to a legal program.
       | 
       | To permit this basic optimization, you need to have some sort
       | of rule that allows you to reason that a value whose address is
       | never taken can never be legally accessed by a pointer. To
       | actually effect this rule in semantics, you now have to start
       | tracking extra metadata about what pointers can and cannot
       | access... which is exactly the kind of model that's being
       | criticized.
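
The address-taken rule described in this comment can be sketched in C as follows. This is an illustration only: `bump`, `no_address_taken`, `address_taken`, and `check_examples` are hypothetical names, and `bump` stands in for code the optimizer cannot see into.

```c
#include <assert.h>

/* Hypothetical stand-in for code the optimizer cannot see into. */
static void bump(int *p) { *p += 1; }

/* The address of x is never taken, so no pointer can legally reach it
   and the function may be compiled as if it were `return 5;`. */
static int no_address_taken(void) {
    int x = 5;
    return x;
}

/* The address of x is passed to bump(), so the store of 5 must be
   visible during the call, and the value returned must reflect
   whatever bump() wrote (unless the compiler can see into bump()). */
static int address_taken(void) {
    int x = 5;
    bump(&x);
    return x;
}

/* Small self-check of the two results. */
static void check_examples(void) {
    assert(no_address_taken() == 5);
    assert(address_taken() == 6);
}
```

The first function can be simplified only because the language rules say that nothing else can name `x`. The second shows the case where the compiler has to keep the variable in memory.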
        
         | dnautics wrote:
         | Maybe there's a question here: if you really need the kind of
         | performance afforded by that level of fine-grained
         | optimization, shouldn't you instead consider emitting better
         | code, at least in hot loops? It seems less dangerous to have
         | the code do semantically exactly what you tell it to in the 90%
         | case, because otherwise you can unwittingly introduce
         | difficult-to-reason-about code. A general-purpose programming
         | language could be deployed in mission-critical code where
         | lives are at stake.
        
           | sanxiyn wrote:
           | I am not sure what the question is. Are you proposing manual
           | marking of hot code, applying the pointers-are-integers model
           | to all code but an optimizable pointer model to marked code?
        
         | petergeoghegan wrote:
         | I read the article as a criticism of the strict aliasing rules
         | in C. I think that you're attributing something to the author
         | that they couldn't possibly believe.
        
           | sanxiyn wrote:
           | This is not a strict aliasing rule. I actually agree strict
           | aliasing optimizations are mostly useless. (Exhibit: Rust has
           | no strict aliasing.)
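
For readers unfamiliar with the rule under discussion, here is a small C sketch of what strict aliasing forbids and the usual well-defined alternative. The names `float_bits` and `check_float_bits` are hypothetical, and the expected bit pattern assumes an IEEE 754 binary32 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Reading a float through a uint32_t lvalue, as in
       uint32_t bits = *(uint32_t *)&f;
   breaks the effective-type ("strict aliasing") rule and is undefined
   behaviour. Copying the bytes with memcpy is well defined. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Assumes IEEE 754 binary32, where 1.0f is encoded as 0x3F800000. */
static void check_float_bits(void) {
    assert(float_bits(1.0f) == 0x3F800000u);
}
```

This type-based rule is separate from the question of whether a local's address has escaped, which is the point being made in this reply.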
        
       | tjalfi wrote:
       | If you want PCC, you know where you can find it.
        
       | Taniwha wrote:
       | Really, what is being argued here is that C pointer optimisation
       | is difficult, and because C is designed to do things like code
       | the insides of the 'new' primitive (i.e. malloc), you can't make
       | the sorts of assumptions about aliasing that you can in other
       | languages.
       | 
       | I'd argue that if you don't understand that, you probably
       | shouldn't be coding in C (I'll let others argue about C++).
        
         | Sulik wrote:
         | +1
        
       | Diggsey wrote:
       | > I prefer the simple and obvious pointer model. Vastly.
       | 
       | That's great, but then you're not writing C anymore. When you
       | write C (or indeed, almost any language in use today) you are
       | writing code against an abstract machine, not against real
       | hardware.
       | 
       | Pointers are integers on (most) real hardware. They are
       | indisputably _not_ integers in the C abstract machine.
       | 
       | The reason for this abstract machine existing is not to enable
       | optimizations, it's to allow the code to be portable. If there
       | were no abstract machine, you would have to reason about the
       | correctness of your code for each target architecture
       | independently. With an abstract machine, you can reason that your
       | code is correct on the abstract machine, and the compiler can do
       | the hard work of translating that to each target architecture in
       | a way that preserves well-defined behaviour.
       | 
       | Now, given that the abstract machine exists, you could ask why it
       | is so complicated, and the answer this time _is_ for
       | optimizations, and also to allow the translation to the target
       | architecture to introduce as little overhead as possible.
        
       ___________________________________________________________________
       (page generated 2020-09-16 23:00 UTC)