[HN Gopher] AT&T Syntax versus Intel Syntax
       ___________________________________________________________________
        
       AT&T Syntax versus Intel Syntax
        
       Author : susam
       Score  : 84 points
       Date   : 2022-11-13 17:29 UTC (5 hours ago)
        
 (HTM) web link (www.cs.mcgill.ca)
 (TXT) w3m dump (www.cs.mcgill.ca)
        
       | ufo wrote:
       | On Linux, is there a way to convert an assembly language file
       | from one syntax to the other?
       | 
       | I know that there are ways to ask GCC to emit one syntax or the
       | other, as well as ways to assemble code in either syntax. However
       | I don't know any program that just translates one to the other.
        
       | Graziano_M wrote:
       | AT&T vs. Intel syntax has caused many arguments in some of my
       | circles. AT&T is an atrocity and if you disagree... you're wrong.
       | :)
        
         | [deleted]
        
         | mixmastamyk wrote:
         | op src dest
         | 
         | is the logical order, the rest of the syntax I don't care about
         | much.
        
           | SoftTalker wrote:
           | TFA says "The `source, dest' convention is maintained for
           | compatibility with previous Unix assemblers."
        
           | Joker_vD wrote:
           | Which is why all memcpy-like functions in the C/C++ standard
           | library take arguments in dst, src order.
        
           | pjmlp wrote:
           | In mathematics _variable = value_ , variable receives value,
           | ergo _op dest src_.
           | 
           | That is logic.
        
             | mrkeen wrote:
             | In mathematics _variable = value ⇔ value = variable_
        
             | googlryas wrote:
             | That is just an arbitrary notation, nothing to do with
             | logic. In English at least, "add 4 to x" is more natural
             | than "add to x 4".
             | 
             | Ergo _op src dest_
        
               | bialpio wrote:
               | But English "add 4 to x" tells you nothing about where to
               | store the result. :-)
               | 
               | "Add 4 to x and store result in x" vs "Let x be x + 4"?
        
               | monocasa wrote:
               | Without any other context, you generally assume that the
               | target is an accumulator.
               | 
               | So "add 4 coins to that bucket", generally means at the
               | end of that operation the bucket contains at least four
               | coins plus any coins that were already in the bucket.
        
               | bialpio wrote:
               | Huh, I think my brain is not wired to think about `x` as
               | a storage, it is closer to "add 3 to x for x = 2", which
               | then gets reduced to "add 3 to 2" and then the
               | destination is missing...
        
             | pavlov wrote:
             | Indeed. Besides, "let there be light" is clearly
             | destination-first, so this matter was already resolved by
             | Genesis 1:3.
        
               | Eleison23 wrote:
               | Which direction was the Hebrew text written in?
        
             | LudwigNagasena wrote:
             | That doesn't follow. _variable = value_ , ergo _dest op
             | src_.
        
             | monocasa wrote:
             | ...math has no such order. Half the point of algebra is
             | that the two sides of the equation are semantically
             | equivalent and can be swapped at will.
        
               | pjmlp wrote:
               | Good luck convincing anyone that _value = variable_ makes
               | sense.
        
               | monocasa wrote:
               | My math teachers had no problem with '42 = x' vs 'x = 42'
               | as long as the steps made sense to get there. In fact
               | they'd probably comment that there was no need to go with
               | 'x = 42' if I obviously took a circuitous route simply to
               | end up with x on the left side of the equation, as that
               | would have demonstrated a lack of internalizing some of
               | the base ideas of algebra and its approach of equation
               | symmetry.
        
               | fweimer wrote:
               | Some older C coding styles recommend that order because
               | before compilers added warnings, "if (17 = variable)"
               | resulted in a diagnostic, while "if (variable = 17)"
               | would not. Nowadays, I think most programmers prefer
               | putting the fastest-changing expression first.
        
               | [deleted]
        
               | akira2501 wrote:
               | And it has no state. Half the point of computation is
               | maintaining state for the purpose of efficiency. So,
               | computation gets an "assignment" operator where pure math
               | lacks one, pure math being relegated to subscripts of
               | time and indicator functions instead.
        
             | molticrystal wrote:
             | Postfix , infix, prefix, crucifix.
        
           | NobodyNada wrote:
           | I used to think this was more intuitive, but after using both
           | for a while I came to the conclusion that putting the
           | destination first is much more practical, because my eyes can
           | scan the left column to quickly find where a register was
           | last written to. If the destination is last, it doesn't
           | appear in a consistent location horizontally.
        
           | sph wrote:
           | Be that as it may, the AT&T indirect memory reference
           | `section:disp(base, index,scale)` is an abomination unto God.
           | 
           | At least the Intel one makes actual mathematical sense:
           | `section:[base + index*scale + disp]`
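
The two notations above name the same four fields, only in a different order. A minimal Python sketch of that correspondence follows; the function names are hypothetical, and the segment ("section") override is left out for brevity.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, the address both notations describe."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    return base + index * scale + disp


def format_intel(base, index, scale, disp):
    """Render register names in Intel form: [base + index*scale + disp]."""
    return f"[{base} + {index}*{scale} + {disp}]"


def format_att(base, index, scale, disp):
    """Render the same operand in AT&T form: disp(%base,%index,scale)."""
    return f"{disp}(%{base},%{index},{scale})"


if __name__ == "__main__":
    print(format_intel("rbx", "rcx", 4, 8))  # [rbx + rcx*4 + 8]
    print(format_att("rbx", "rcx", 4, 8))    # 8(%rbx,%rcx,4)
    print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))  # 0x1014
```

The Intel form reads like the arithmetic in `effective_address`, while the AT&T form lists the same fields positionally.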
        
           | colejohnson66 wrote:
           | That "logical" order is only because you're trying to read it
           | like a sentence ("mov/add ebx into eax") when you should be
           | reading it like a formula or what it actually is - code. And
           | that's fine, but considering Intel created the chip, it makes
           | sense that they should decide how the assembly syntax should
           | be, not AT&T.
           | 
           | The _only_ reason "AT&T syntax" exists for x86 is because
           | people working at AT&T refused to use Intel as the
           | authoritative reference on the syntax, and, instead, decided
           | to follow the convention of the PDP, Motorola, etc. family
           | and friends. Hence why `as` (and subsequently `gas`) have
           | that as the default.
        
           | JonChesterfield wrote:
           | Sketchy part is when operations read and write with the same
           | argument. Op first is nice for that as it becomes op arg0
           | arg1.
           | 
           | I quite like the SSA style which tends to be dst0 dst1 opcode
           | src0 src1 but that doesn't model assembly brilliantly.
           | Perhaps that order with read-write arguments required to
           | appear on both sides of opcode with the same symbol has some
           | merit.
        
           | userbinator wrote:
           | That is extremely confusing for comparisons, which are
           | effectively a subtraction.
        
         | [deleted]
        
         | bbarnett wrote:
         | They are using a classic dollar sign for assignment, so clearly
         | at&t is better.
        
           | masklinn wrote:
           | > They are using a classic dollar sign for assignment
           | 
           | They're using a dollar sign for _immediates_.
           | 
           | As if you can't notice that it's a number.
           | 
           |     addl $4, %eax
           | 
           | that's 3 different completely unnecessary symbols for things
           | which are not ambiguous in the first place:
           | 
           | - the operation width is provided by the register
           | 
           | - a number is an immediate
           | 
           | - a register is named
           | 
           | Hence the much less noisy Intel syntax:
           | 
           |     add eax, 4
        
             | docandrew wrote:
             | The comma is unnecessary too, so that's 4 unneeded symbols.
        
             | matja wrote:
             | addl is not necessary if it is not ambiguous
             | 
             |     add $4, %eax
             | 
             | - is fine
             | 
             | Compare with
             | 
             |     add 4, %eax
             | 
             | The "less noisy" Intel syntax becomes:
             | 
             |     add eax, DWORD PTR [4]
        
               | __init wrote:
               | Most assemblers for Intel syntax will let you write:
               | 
               |     add eax, [4]
               | 
               | if you desire. Indeed, many disassemblers will follow
               | suit in unambiguous cases. IDA, for example, does this.
        
               | colejohnson66 wrote:
               | The only time "DWORD PTR" is required is when (1) you're
               | working with old assemblers, or (2) you're using a memory
               | operand with an immediate:
               | 
               |     add eax, [4]       ; inferred
               |     add [eax], 4       ; ambiguous
               |     add DWORD [eax], 4 ; explicit
               | 
               | A disassembler may output it when not necessary, however.
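
The inference rule described in this comment can be sketched as a toy Python function. This is an illustration under simplifying assumptions, not how any real assembler is implemented: the register table is deliberately tiny, `infer_operand_bits` is a hypothetical name, and instructions with mixed operand widths are ignored.

```python
# Tiny illustrative register table; a real assembler knows many more names.
REGISTER_BITS = {"al": 8, "ax": 16, "eax": 32, "rax": 64}


def infer_operand_bits(operands, explicit_bits=None):
    """Return the operand size in bits, or None when it cannot be inferred.

    operands: operand strings such as "eax", "[eax]" or "4".
    explicit_bits: size given by a keyword such as DWORD, if any.
    """
    if explicit_bits is not None:
        return explicit_bits
    for operand in operands:
        # Only a bare register name fixes the size; "[eax]" is a memory
        # operand whose width is unknown even though eax is the address.
        if operand in REGISTER_BITS:
            return REGISTER_BITS[operand]
    return None


if __name__ == "__main__":
    print(infer_operand_bits(["eax", "[4]"]))                   # 32 (inferred)
    print(infer_operand_bits(["[eax]", "4"]))                   # None (ambiguous)
    print(infer_operand_bits(["[eax]", "4"], explicit_bits=32)) # 32 (explicit)
```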
        
           | amluto wrote:
           | Contemplate the AT&T vs Intel syntax for x86 addressing modes
           | and say that again.
        
             | ratsmack wrote:
             | X86 addressing modes are an atrocity:
             | https://stackoverflow.com/questions/63571979/clarifying-
             | thre...
        
               | amluto wrote:
               | They are an atrocity _in AT&T syntax_. They are
               | perfectly readable in Intel format. For example:
               | 
               | https://blog.yossarian.net/2020/06/13/How-x86_64-addresse
               | s-m...
        
               | ratsmack wrote:
               | That is an excellent write up, thanks.
        
       | pjmlp wrote:
       | Many moons ago, in a dumb attempt to depend only on GAS before it
       | had good support for Intel syntax, I ported some code from Intel
       | syntax to AT&T. If I were doing it today I would have listed nasm
       | or yasm as a requirement and been done with it.
        
       | retrac wrote:
       | Here's the why for the curious. It's just a historical quirk.
       | 
       | It's "AT&T syntax" because it dates to the 1978 AT&T Labs effort
       | to port UNIX to the 8086. [1] While the 8086 did not have virtual
       | memory or hardware protection, its memory segmentation model was
       | still adequate to support UNIX. It was the first microprocessor
       | practically capable of running UNIX, and this was realized before
       | the chip was even released. The porting effort started
       | immediately. (Though most of the energy would soon switch to the
       | 68000 when that was released a year later.)
       | 
       | The AT&T folks did not wait for Intel's assembler. (Written in
       | Fortran, to run on mainframes, or on Intel's development
       | systems). Nor did they closely model their assembler after it.
       | They just took the assembler they already had for the PDP-11 and
       | adapted it with minimal changes for the 8086. Quick and dirty.
       | Which was okay. You're not supposed to write assembly on UNIX
       | systems, anyway. Only the poor people who had to write kernel
       | drivers and compilers would ever have to deal with it.
       | 
       | [1] https://www.bell-labs.com/usr/dmr/www/otherports/newp.pdf
       | (see section III)
        
         | bbanyc wrote:
         | I think there's a bit more to the story. It was before my time,
         | but as I understand it the most widely used Unix for the 8086
         | was XENIX (initially a Microsoft product, later sold to SCO),
         | which used Intel-syntax MASM as its assembler.
         | 
         | XENIX for 386 was based on AT&T System V/386, which introduced
         | the AT&T syntax to 32-bit x86. I've found some references to
         | 32-bit XENIX still using an assembler called "masm" but I don't
         | know if it was still based on Microsoft's MASM or just called
         | that for compatibility, or whether it was AT&T or Intel syntax.
         | Also by that point compilers and assemblers weren't included in
         | the base OS anymore, but a "development kit" sold separately.
         | 
         | The Minix compiler and assembler also used Intel syntax.
        
         | bluedino wrote:
         | It's funny how many things come down to 'UNIX hackers did it
         | this way when they had to work with a PC'
        
       | jcalvinowens wrote:
       | I initially learned the Intel syntax, and preferred it for
       | a while. But the more I work with non-x86 CPUs, the more I prefer
       | AT&T just because it feels less different than everything else.
        
         | aap_ wrote:
         | Much agreed. Intel syntax just seems somewhat alien.
        
           | userbinator wrote:
           | Intel syntax is more similar to ARM, MIPS, and even RISC-V's
           | official syntax than AT&T.
        
           | secondcoming wrote:
           | But all the Intel instruction documentation unsurprisingly
           | uses the Intel syntax!
        
           | yakaccount4 wrote:
           | That's interesting. My entire world revolves around x86 and
           | ARM, so "Intel" syntax (which to me mostly means op dest,
           | src) is what seems normal to me.
        
             | aap_ wrote:
             | Order of dst and src i have no strong feelings about, it's
             | all the rest that i find weird about intel syntax.
        
               | GeorgeTirebiter wrote:
               | dest = src
               | 
               | That's how I think of it.
               | 
               | ALSO: I really do not like the MOV instruction; I much
               | prefer LD. The Z-80 instruction names got everything
               | mostly right.
        
       | jart wrote:
       | AT&T syntax is the most elite syntax. I've used it to write some
       | famous hacks, like Actually Portable Executable, which is a
       | 16-bit BIOS bootloader shell script ELF / Mach-O / PE executable.
       | People dislike it because writing assembly Bell Labs style
       | requires great responsibility. What makes AT&T syntax so powerful
       | is its tight cooperation with the linker. I don't think it would
       | have been possible for me to invent APE had I been using tools
       | like NASM and YASM and FASM because PC assemblers were
       | traditionally written as isolated tools that weren't able to take
       | a holistic view with linker scripts and the C preprocessor.
       | https://raw.githubusercontent.com/jart/cosmopolitan/master/a...
        
         | chrisseaton wrote:
         | Isn't this a function of the tools you were using, not the
         | syntax? Couldn't any of these tools support any syntax and do
         | the same thing?
        
           | JonChesterfield wrote:
           | The line between syntax and functionality is pretty thin for
           | an assembler.
           | 
           | I've definitely had code that some assemblers accepted and
           | others didn't on the same arch, so even if they could be
           | equally expressive in practice they aren't. Fairly sure
           | that's also true of inline assembly on clang x64, had to
           | change between intel and at&t for something a while ago.
        
         | fingleberry wrote:
         | I'm sorry, no. This is incredibly misleading. Even if the
         | linker step and assembly are completely separated, everything
         | you've built in Cosmopolitan and APE is 100% buildable in other
         | tools. It might take more effort in some cases, but there's
         | more to a native stack than choice of tooling and syntax; if
         | you genuinely know your stack and architecture, anything is
         | possible.
         | 
         | Your accomplishments have zero to do with the 'elite' tooling
         | you use (why are you gatekeeping and creating class
         | distinctions out of assemblers?), and more that you've taken
         | the time to really think about how memory is laid out and how
         | the architecture works - which most of us who started out
         | writing operating systems instead of Rails understand perfectly
         | fine. Nothing about the relationship between gas and ld
         | achieves uniqueness not seen in other native stacks. That's
         | just made up.
         | 
         | There are multiple operating systems built with hand written
         | NASM. Arguing about assemblers like they matter for more than
         | five seconds is tiring 1990s IRC stuff. They turn syntax into a
         | byte layout. It's like realizing oh, this assembler sucks at
         | ELF, why don't I just hand lay one out? and boom, you're on the
         | way to APE.
        
       | dboreham wrote:
       | Hmm. "AT&T"? I thought it at least came from DEC via the PDP-11,7
       | assembler.
       | 
       | And I'm guessing Intel didn't invent doing it backwards on their
       | own either.
        
         | masklinn wrote:
         | It's the AT&T syntax because AT&T are the ones who unleashed
         | that on the world against the wishes of everyone. See the
         | sibling comment:
         | 
         | > The AT&T folks did not even wait for Intel's assembler [...]
         | Nor did they closely model their assembler after it. They just
         | took the assembler they already had for the PDP-11 and adapted
         | it with minimal changes for the 8086.
        
         | ksherlock wrote:
         | Wikipedia confirms it.
         | 
         | https://en.wikipedia.org/wiki/X86_assembly_language#Syntax
         | 
         | "The AT&T syntax is nearly universal to all other architectures
         | with the same mov order; it was originally a syntax for PDP-11
         | assembly. The Intel syntax is specific to the x86 architecture,
         | and is the one used in the x86 platform's documentation."
         | 
         | https://en.wikipedia.org/wiki/As_(Unix)
         | 
         | "As of November 1971, an assembler invoked as as was available
         | for Unix. Implemented by Bell Labs staff, it was based upon the
         | Digital Equipment Corporation's PAL-11R assembler."
        
           | [deleted]
        
         | monocasa wrote:
         | AT&T as in Bell Labs in the work that would become Unix.
        
       | PaulHoule wrote:
       | I never liked the syntax used by gas. It feels like something
       | intended to be part of a C compiler, not like something you'd use
       | for the joy of assembly language.
        
       | userbinator wrote:
       | IMHO the most confusing part is that AT&T/GAS syntax inverts the
       | comparisons, which are otherwise natural in Intel syntax:
       | 
       |     cmp eax, ebx   ; eax ? ebx
       |     jg foo         ; jump if eax > ebx
       | 
       | Related: http://x86asm.net/articles/what-i-dislike-about-gas/
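
For readers who want to check the semantics, here is a minimal Python sketch of what `cmp` followed by `jg` does on 32-bit values. `cmp a, b` in Intel operand order computes `a - b` and keeps only the flags, and `jg` is taken when ZF is clear and SF equals OF. The helper names are hypothetical. In AT&T syntax the same instruction is written with the operands reversed (`cmp %ebx, %eax`), which is the inversion complained about above.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & SIGN32 else value


def cmp_flags(a, b):
    """Return (ZF, SF, OF) after computing a - b on 32-bit values."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    zf = result == 0
    sf = bool(result & SIGN32)
    # Signed overflow on subtraction: the operands have different signs
    # and the sign of the result differs from the sign of a.
    of = bool((a ^ b) & (a ^ result) & SIGN32)
    return zf, sf, of


def jg_taken(a, b):
    """True when `cmp a, b` followed by `jg` would jump (signed a > b)."""
    zf, sf, of = cmp_flags(a, b)
    return (not zf) and (sf == of)


if __name__ == "__main__":
    print(jg_taken(5, 3))           # True
    print(jg_taken(3, 5))           # False
    print(jg_taken(0xFFFFFFFF, 1))  # False, because 0xFFFFFFFF is -1 when signed
```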
        
         | mrjin wrote:
         | Also the % before register names is completely unnecessary, it's
         | just another extra character to type.
        
           | [deleted]
        
           | rwmj wrote:
           | It disambiguates labels from registers, assuming of course
           | you allow labels to have register names. eg this is valid:
           | mov rax,%rax
           | rax:  .ascii "hello\0"
           | 
           | Stupid perhaps, but valid.
        
             | jck wrote:
             | I wonder if it would have been more ergonomic to have the
             | labels be % prefixed instead of registers.
        
       | _tomcat_ wrote:
        
       | mshockwave wrote:
       | A similar thing also happens on 68k: Motorola syntax vs. "MIT"
       | syntax, which is probably only used by the GNU toolchain.
        
         | ack_complete wrote:
         | Practically, 68k is far more usable in AT&T syntax than x86.
         | When I used to do PalmPilot development, you could basically
         | write standard 68k asm with just some extra %s sprinkled before
         | registers and as would be fine with it. The x86 AT&T syntax is
         | far more alien compared to the syntax in the official manuals,
         | with arguments backward and nonstandard instruction names like
         | addl and movabsq.
        
       | bmc7505 wrote:
       | Interesting bit of trivia, Prof. Ratzer, the author of this
       | piece, was the first graduate student in computing at McGill
       | University [1] and one of the founding members of the School of
       | Computer Science [2], which just recently celebrated its 50th
       | anniversary. [3]
       | 
       | [1]:
       | https://en.wikipedia.org/wiki/McGill_University_School_of_Co...
       | 
       | [2]: https://www.cs.mcgill.ca/~ratzer/backup/welcome.html
       | 
       | [3]:
       | https://mcgill.imodules.com/controls/email_marketing/view_in...
        
       ___________________________________________________________________
       (page generated 2022-11-13 23:00 UTC)