[HN Gopher] Hello World
___________________________________________________________________
Hello World

Author : ddevault
Score : 88 points
Date : 2020-01-04 13:53 UTC (9 hours ago)

(HTM) web link (drewdevault.com)
(TXT) w3m dump (drewdevault.com)

| alberth wrote:
| I'm curious to know how NIM performed given it transcodes into C.

| zamadatix wrote:
| Someone commented a test shortly after you asked
| https://news.ycombinator.com/item?id=21957476

| dom96 wrote:
| I'm actually curious why Nim didn't make the list. Crystal is there and a lot of other emerging languages. Is the author not familiar with it?

| kick wrote:
| Drew's familiar with nim, and he's talked with you about it before on HN. Crystal was volunteered by someone working on Crystal, Julia was volunteered by a dev who works using Julia, etc. Haskell was also presented, I think, but Haskell was a mess.

| phoe-krk wrote:
| What is this post supposed to prove? It certainly is not supposed to prove that a hello world is a representative real-world program, from which one could infer that writing and debugging a real-world program in Julia is 835 times as complex as writing a real-world program in assembly, since the former makes 835 times as many syscalls as an assembly program. (You know, that number seems okay for me, except it needs to be applied to these languages in reverse.)
|
| I agree that software bloat is a big problem, but trivializing that problem to printing a "hello world" to the screen, punishing all languages with runtimes by measuring the syscalls involved in their startup routines, disregarding the fact that many users are going to have a single system-wide runtime for e.g.
C or Python or Julia and therefore the total-kB number does not scale linearly with the number of programs written in C or Python or Julia, ignoring the massively increased software development and debugging time for writing in low-level and memory-unsafe languages like assembly, static Zig, or C, and directly implying[0] that most problems with software complexity can be solved by writing in assembly, static Zig, or C rather than in Julia/Ruby/Java/all other languages from the bottom 90% of the list (and that's the vibe that this post gives me) is, for me, more about making a venting shitpost than creating something that provides even a part of an actual solution to software bloat in general.
|
| The "more time your users are sitting there waiting for your program" statement is especially amusing to me. Your users are not going to wait shorter for your program because you and your team are taking another year to write, test, and debug it in assembly.
|
| [0] "These numbers are real. This is more complexity that someone has to debug, more time your users are sitting there waiting for your program, less disk space available for files which actually matter to the user."

| ajkjk wrote:
| It could just be trying to be interesting, not to prove anything.

| phoe-krk wrote:
| Why is it attempting to moralize if it is not meant to prove any morals then?

| hn_throwaway_99 wrote:
| The language in the post, "Most languages do a whole lot of other crap other than printing out "hello world", even if that's all you asked for." certainly seems to imply it's moralizing about something.

| zamadatix wrote:
| "Passing /dev/urandom into perl is equally likely to print "hello world""
|
| That gave me a good chuckle towards the end.
|
| It'd be useful to break this out a little further as it'd have been interesting to see how small just the output is on the dynamically linked versions instead of just comparing static to whole dynamic bundle.
|
| It's also a bit odd that e.g. zig gets optimized for size and stripped via the compiler, c gets optimized for speed and stripped via strip, and Go/Crystal just gets built standard with no stripping at all. I don't think it'd change the big picture, it's just a bit odd.
|
| .
|
| Unrelated tangent/ramble, I played with Zig and Go as part of my yearly "take December off and tinker" break. Zig was really fun to work with but unfortunately still in a huge churn and development. Go was a lot better than I expected it to be (I had put off messing with Go for a few years now) and the size of the stdlib is just astounding. In the end it wasn't as "fun" as zig but it had very low friction and I definitely see myself using it for a few personal projects over the next year... and then seeing if Zig has less churn in December ;).

| franciscop wrote:
| Random thought/question, does `process.stdout.write("Hello World");` in Node.js make any difference? While `console.log()` is correct for this analysis since it's the more common one, it does a lot of extra internal logic:
| https://github.com/nodejs/node/blob/v13.x/lib/internal/conso...
|
| Edit: I'm just curious and don't know how to even start testing this, not trying to promote/demote Node.js in any way.

| zamadatix wrote:
| I would be surprised if console.log() made many more syscalls than process.stdout.write. More function calls probably but those aren't being counted and neither is RAM usage. "strace" would let you count and find out though!
|
| The size would be a few bytes larger since node is scripted and that's more characters.

| np_tedious wrote:
| If it is any better, it probably would be the more fair entry (and perhaps similar for python and stdout/bytes) since Go's example did basically that instead of fmt.print with a string.

| hn_throwaway_99 wrote:
| What is with it lately where there seem to be lots of posts fetishising absolute performance over lots of other attributes, or even worse, pretending those other attributes don't even matter?
|
| What is the point of this post? Yes, I fully expect a simple Hello World in assembly would be straightforward and fast. I still want the advantage of things like automated memory management, an interpreter or JIT compiler where warranted, a standard runtime environment, etc. For anything even remotely complicated.
|
| I get it, over the past 30-40 years we've built layers upon layers of abstraction, so it's worth it to take a look back and ask "Are there some cases where we overdid it?" Still, let's not throw the baby out with the bathwater, or forget why we added those layers in the first place.

| ChrisMarshallNY wrote:
| That's pretty cool. It reminds me of GodBolt (https://godbolt.org).
|
| I'm told that the story behind it is that he was arguing with someone about the efficiency of an operation, and actually wrote that site to prove his point.

| NilsIRL wrote:
| It would be interesting to know why the number of syscalls for C is so high.

| gerikson wrote:
| Which version of C?

| zamadatix wrote:
| All versions are interesting, zig/assembly manages to do it in 2/3 so what is musl doing that needs 5? And what on Earth is glibc dynamic doing that it needs 65?

| BearOso wrote:
| I'm wondering why his dynamic glibc C executable is so big.

| _paulc wrote:
| As it's not on Drew's list:
|
| Nim:
|
|     $ cat hello.nim
|     stdout.write("hello, world!\n")
|
| Static (musl):
|
|     $ nim --gcc.exe:musl-gcc --gcc.linkerexe:musl-gcc --passL:-static c -d:release hello.nim
|     $ ldd ./hello
|     not a dynamic executable
|
|     Execution Time: 0m0.002s (real)
|     Total Syscalls: 16
|     Unique Syscalls: 8
|     Size (KiB): 95K (78K stripped)
|
| Dynamic (glibc):
|
|     $ nim c -d:release hello.nim
|     $ ldd ./hello
|     linux-vdso.so.1 => (0x00007ffc994b6000)
|     libdl.so.2 => /lib64/libdl.so.2 (0x00007f7c88785000)
|     libc.so.6 => /lib64/libc.so.6 (0x00007f7c883b8000)
|     /lib64/ld-linux-x86-64.so.2 (0x00007f7c88989000)
|
|     Execution Time: 0m0.002s (real)
|     Total Syscalls: 42
|     Unique Syscalls: 13
|     Size (KiB): 91K (79K stripped)
|
| Which I think is actually pretty reasonable for a high-level GC'd language.

| zamadatix wrote:
| "Size (KiB): 95K (78K stripped)"
|
| Seems suspicious that lines up with the 95.9 KiB the author listed for C + GCC + musl static build even though the author says they stripped the binary after. I think they might have copied the wrong number into the table :).
|
| The author was counting the size of dynamic as binary + dynamically linked files. Should be about the same as the c dynamic ones in the table in this case anyways but just a note to anyone else running their own tests.

| tyingq wrote:
| Curious if the generated C is any different for nim's echo as opposed to stdout.write().

| cycloptic wrote:
| Sorry but I have to give this a thumbs down for not being a very convincing or well-written blog post. It dumps some data and then immediately jumps to a statement about how "lots of syscalls = bad" without actually detailing what those syscalls are doing in the context of the runtime. And I'm saying this as someone who already runs Alpine on my servers and doesn't need to be convinced. Drew, I think you can write much better posts than this.

| peteradio wrote:
| More matter with less art.
|
| I thought it was a breezy read with a simple thesis. I don't know why such a thing should be discouraged.

| cycloptic wrote:
| It should be no surprise to users of CPython and Ruby that those languages have a lot of startup code. The details of what that code is doing are already evident if you're watching it happen in an strace log, but those bits were left out. This isn't art, it's details, and without the details, it's just preaching to the choir. No new readers are going to be convinced.

| nielsole wrote:
| It's still an interesting table. I was surprised that Java is 10x faster than Python. I would have expected initializing the JVM would be similarly complex to initializing the Python Interpreter.

| cycloptic wrote:
| My point is that it's not clear from the article why that is the case.

| giantrobot wrote:
| The author doesn't state the Java version but more recent JRE versions (9+ IIRC) start up way faster than older versions. I'd imagine a JRE's launch time is heavily influenced by disk cache. A warm launch ends up way faster than a cold launch: having the tens of megabytes of classes already in RAM means the warm launch basically loads just the program's class file from disk.

| lttlrck wrote:
| It's comparing unassembled assembler to JITed code and compilers that are pulling in precompiled libraries.
|
| I feel it needs some kind of normalizing. I get that it is illustrating bloat but it doesn't really illustrate where that's coming from.
|
| Maybe only the output of the JITs should count, or the syscalls required to assemble the example should be included. Are musl and glibc really wasting cycles or are they doing something that the example is missing?
|
| Fun to think about.

| tensor wrote:
| It's not even meaningfully illustrating bloat. Hello world is an unrealistic edge case.
Any program that does something useful will be far more complicated, and it's entirely possible that a lot of the extra stuff being measured here will be required anyways.

| zamadatix wrote:
| > It's comparing unassembled assembler to JITed code
|
| No, he runs the assembly through NASM + GCC as documented on the page.
|
| I think it's a comparison of "when the user runs the program what runs, how long does it take and how much disk space does it need" based off the column headings. It's not a comparison of the tooling prior to the user's computer as far as I can tell.

| ddevault wrote:
| It's deliberate that JITs, interpreters, and compiled languages are compared on the same terms here. JITs and interpreters are fundamentally less performant than compiled languages, they don't get a pass on performance tests just because it's by design.

| joshuamorton wrote:
| > JITs
|
| This is...highly context dependent. For highly polymorphic code, my understanding is that JITs can outperform precompiled binaries, since they can inline virtual/polymorphic calls in tight loops.
|
| This also isn't a "performance" test in any real sense. It's a test of startup time. Where, yes, JITs lose, but unless you're writing short lived interactive command line tools, or something that runs on lambda, that shouldn't be a concern. For "normal" serverside or desktop apps that run for more than, say, 30 seconds, the difference between 0s of startup and 0.2s of startup time is literally in the noise.

| ddevault wrote:
| Startup time is the least interesting metric in this article. The more interesting metric is the number of syscalls. This isn't a measure of performance, it's a measure of complexity and busywork. Complexity tends to indirectly affect performance, but that's not the point of the article.

| joshuamorton wrote:
| Complexity of _what_?
|
| The resulting generated binary? Well no, a python binary is smaller than the c binary.
The toolchain? Well gcc is pretty complex and that's unaccounted for. The build process? Again, no.
|
| The closest thing I can think of is the language runtime. But why do I care about how complex the language runtime is? Often more complex language runtimes make my life easier anyway, and they're all sitting atop the Intel microcode magic box anyway.
|
| There's a very specific definition of complexity you're using, and I'm still not sure what it is. In my world, you usually add complexity to eke out extra performance by breaking the less complex abstractions.

| ddevault wrote:
| The complexity of the _system_.

| joshuamorton wrote:
| I'm still confused. Why is runtime compilation a component of the "system", but aot compilation is not? Why are python interpreters a component of the system, but microcode interpreters and aot compilation not?
|
| If you haven't given a clear definition of what "the system" is, I can't really use your evaluation to influence my decision making.

| ddevault wrote:
| Because AOT compilation can be used to construct a system which is functional _without_ the compiler, but runtime compilation requires it at runtime. The difference is plainly obvious.

| marcosscriven wrote:
| Curious what all the syscalls Rust is making, and why?

| steveklabnik wrote:
| One reason is that println! will lock stdout for you. I'm not sure what percentage that makes up.
|
| If you don't want that behavior, you can control this all yourself with write! and friends.
|
| EDIT: deeper analysis on Reddit:
| https://www.reddit.com/r/programming/comments/ejxwlu/hello_w...

| tyingq wrote:
| Perl seems to lead the pack for interpreted languages by a wide margin for this microbenchmark. I wonder if that's just for the narrow case of print() / hello-world.

| [deleted]

| [deleted]
___________________________________________________________________
(page generated 2020-01-04 23:00 UTC)