[HN Gopher] Tiny C Compiler
       ___________________________________________________________________
        
       Tiny C Compiler
        
       Author : Koshkin
       Score  : 145 points
       Date   : 2020-06-23 17:44 UTC (5 hours ago)
        
 (HTM) web link (bellard.org)
 (TXT) w3m dump (bellard.org)
        
       | Conlectus wrote:
       | Heads up: Though amazing, the compiler makes use of the
       | deprecated POSIX setcontext API[1], which means you may have
       | trouble running it on some modern distributions.
       | 
       | [1] https://en.wikipedia.org/wiki/Setcontext
        
       | stephc_int13 wrote:
       | Anyone impressed by the compile times of modern languages such
       | as Swift, Rust or Go should try TCC at least once in their
       | lifetime.
       | 
       | Both compiling and linking are generally finished before your
       | fingers have released the Enter key...
        
       | FpUser wrote:
       | This has already been discussed numerous times. But I never miss a chance
       | to sing praises to Bellard. One of the very few real programming
       | geniuses.
        
       | Teknoman117 wrote:
       | My main experience with tcc was that it was the only C compiler
       | available for Damn Small Linux (a <50 MB Linux distro with X and
       | Firefox).
        
       | jonathonf wrote:
       | Want to "skip" the compile step?
       | 
       |     //usr/bin/tcc -run $0; exit
       |     main() { printf("Hello\n"); return 0; }
       | 
       | Just `chmod +x` and run it...
        
         | sergeykish wrote:
         | From man tcc(1):
         | 
         |     #!/usr/bin/tcc -run
         |     #include <stdio.h>
         | 
         |     int main()
         |     {
         |         printf("Hello World\n");
         |         return 0;
         |     }
        
         | mmastrac wrote:
         | How does this work? Is there an alternate form for #! (aka
         | shebang)?
         | 
         | EDIT: hah, I should have realized that "//" is the C comment
         | and "//usr/bin/tcc" is equivalent to "/usr/bin/tcc". Clever!
        
           | dfabulich wrote:
           | "C comment," you mean.
           | 
           | So it's running the file as a shell script, where the first
           | line runs tcc on the current file and quits.
        
             | mmastrac wrote:
             | Fixed typo, thanks. Yeah for some reason it didn't click
             | when I saw it.
        
         | hawski wrote:
         | It's obviously not the same, but you can emulate this with a
         | conventional compiler:
         | 
         |     #if 0
         |     set -e; [ "$0" -nt "$0.bin" ] &&
         |     cc "$0" -o "$0.bin"
         |     exec "$0.bin" "$@"
         |     #endif
         | 
         |     #include <stdio.h>
         | 
         |     int
         |     main(int argc, char *argv[]) {
         |       puts("Hello world!");
         |       return 0;
         |     }
         | 
         | It works, because by default system shell will be spawned.
        
         | shakna wrote:
         | And add the -b commandline flag to enable the memory and bounds
         | checker.
        
       | agumonkey wrote:
       | Anyone up for reviving tccboot?
        
       | dang wrote:
       | See also
       | 
       | 2016 https://news.ycombinator.com/item?id=13249851
       | 
       | 2017 https://news.ycombinator.com/item?id=15272894
       | 
       | 2018 - obfuscated! https://news.ycombinator.com/item?id=17335856
        
       | fwsgonzo wrote:
       | It's not just a tiny C compiler - it's a compiler you can use as
       | a static library that can compile your C code directly to memory.
       | And then you can call into it, if you dare.
       | 
       | I used it as a scripting backend for a long time, but eventually
       | you will realize that you are not the only one that could write
       | scripts, but you also really need to have a trust boundary (or
       | just a different address space), so that errors in the script
       | don't drag the whole thing down.
       | 
       | That's where Lua and LuaJIT shine: they're simple, and they're
       | sandboxed. However, there is still one thing missing. These days,
       | several programming languages have several targets that they can
       | compile to, and so imagine if you could emulate some platform at
       | high performance with low overhead, in a sandbox. You would then
       | be able to script in whatever language you so desire, provided it
       | has that target.
       | 
       | Unfortunately, those languages tend to be system languages and
       | not the absolute best for scripting. With one big exception:
       | Compile-time programming support.
        
         | BubRoss wrote:
         | Have you actually gotten the static library and compile to
         | memory mode to work?
        
           | fwsgonzo wrote:
           | Yes, I used it in a game engine for many years, on Windows
           | and Linux. No problems on that part.
           | 
           | I don't really recommend doing it, unless you want to do it
           | as a learning experiment. It was a fun thing to do.
        
           | saagarjha wrote:
           | I have tried the latter and it seems to work fine.
        
         | rubber_duck wrote:
         | I feel like embedding a web assembly runtime and then exposing
         | scripting API to it would give you similar benefits (you would
         | have to add a compile C to WASM step before loading) while
         | giving you a really secure sandbox.
         | 
         | It is more complicated than just embedding a scripting language
         | (or a scripting compiler :)) but more general and secure.
         | 
         | Especially if WASM evolves to a point where you can compile C#
         | and other high level languages to it (you can to a point
         | already but it's not on par with native runtimes) - it would be
         | the most general embedding runtime.
        
         | monocasa wrote:
         | You can see this technique being used in TCCBOOT, using tcc as
         | a boot loader to compile a Linux kernel into memory and then
         | run it in a handful of seconds.
         | 
         | https://bellard.org/tcc/tccboot.html
        
           | SomeoneFromCA wrote:
           | Yeah well, modern Linux cannot be compiled in 15 seconds. The
           | link is from 2004. OTOH, I wonder how quickly a modern Ryzen
           | would be able to boot a 2004 kernel from source.
        
             | tyingq wrote:
             | The Threadripper 3970x can compile the 5.x kernel in ~24
             | seconds. That's not too shabby.
             | https://www.phoronix.com/scan.php?page=article&item=amd-
             | linu...
        
             | monocasa wrote:
             | Kernel compile time has apparently stayed pretty static
             | over the years (or at least tracked with perf
             | improvements). That being said, I think that takes into
             | account SMP which probably wouldn't be easily accessible to
             | tccboot.
        
         | haberman wrote:
         | LuaJIT is not sandboxed.
         | 
         | Mike Pall told me this in no uncertain terms:
         | http://lua-users.org/lists/lua-l/2011-02/msg01582.html
         | 
         | > The only reasonably safe way to run untrusted/malicious Lua
         | scripts is to sandbox it at the process level. Everything else
         | has far too many loopholes.
        
           | fwsgonzo wrote:
           | I see, that's unfortunate. I don't have those issues in my
           | emulated environment though. I don't use Lua anymore - I'm
           | running emulated RISC-V in my own emulator.
           | 
           | I have been benchmarking against LuaJIT for the longest time
           | because I thought it was an equal in that respect. Guess it
           | should have been against regular Lua.
        
             | haberman wrote:
             | For what it's worth, I think his statement applies to
             | regular Lua too, see:
             | http://lua-users.org/lists/lua-l/2011-02/msg01595.html
        
           | wahern wrote:
           | The issue isn't that Lua can't nominally provide a sandboxed
           | environment--it can, better than almost any other language.
           | The central issue is whether LuaJIT, PUC Lua, or any other
           | particular piece of software can be made sufficiently free of
           | bugs that you can trust such a sandbox to run potentially
           | malicious code.
           | 
           | The answer in the case of LuaJIT is definitely no, because
           | the JIT engine is sufficiently complex that exploits are
           | inevitable. Note that this is _also_ the case with
           | JavaScript. Many browser exploits start with some codegen bug
           | in V8 or SpiderMonkey. And there are many more eyeballs
           | looking to fix bugs in V8 and SpiderMonkey than LuaJIT, so in
           | the case of LuaJIT the prudent answer is that you should
           | never trust it to run potentially malicious code.
           | 
           | The case for PUC Lua is more nuanced. Lua used to have a
           | bytecode verifier, but it was removed for 5.2 because too
           | many bugs were found in the verifier, and because the VM
           | relied heavily on the verifier to filter bad opcode patterns,
           | that led to sandbox breakouts. This made the PUC developers
           | believe that the verifier made Lua worse off as compared to a
           | more holistic emphasis on correctness and robustness, so
           | they dropped the verifier. They also dropped any pretense
           | that you could safely sandbox bytecode (i.e. precompiled
           | scripts); if you want a sandbox in Lua you should only load
           | untrusted code as plain Lua scripts into the sandboxed
           | environment. To that end Lua 5.2 added a parameter to all
           | APIs for loading code that specified whether to accept text
           | scripts or binary bytecode. In other words, the bytecode
           | verifier was considered a third wheel, so they removed it and
           | redirected attention to the compiler and the rest of the VM.
           | 
           | So for PUC Lua the issue really comes down to how prudent it
           | is to draw a trust boundary around a pure Lua sandbox; or
           | rather, what your adversarial model is, precisely. PUC Lua is
           | committed to Lua's sandboxing features, but many developers
           | are fairly of the opinion that the only way to run untrusted
           | code, if you're to run untrusted code at all, is using either
           | a hardware VM or a very strict seccomp jail. If you're of the
           | latter opinion, the language is irrelevant--you shouldn't
           | trust Lua, JavaScript, Java, or any other language
           | environment, period. In practice, however, even people of the
           | latter opinion generally apply the principle of defense in
           | depth. That's why browser JavaScript APIs and capabilities
           | are still relatively limited, even though browsers execute
           | JavaScript in OS-based sandboxes. In most practical contexts
           | Lua's sandboxing features still provide great value; it just
           | needs to be understood that they're a complement to rather
           | than substitute for process sandboxing.
           | 
           | The same analysis applies to WebAssembly, FWIW, especially
           | JIT'ing WASM environments. Anybody who thinks WASM is a magic
           | cure-all for running untrusted code is mistaken.
        
       | kriro wrote:
       | On a somewhat related note, I always thought dietlibc was a
       | pretty cool project (basically trying to get libc to be very
       | tiny). It's been a pretty long time since I browsed the code but
       | I remember it being quite elegant.
       | 
       | https://www.fefe.de/dietlibc/
        
         | Koshkin wrote:
         | > _2. Fabrice Bellard's Tiny C Compiler. You can't compile the
         | diet libc with it._
         | 
         | :(
        
         | Conlectus wrote:
         | From my experience musl libc[1] is a more popular project in
         | this space
         | 
         | [1] https://musl.libc.org/
        
           | acqq wrote:
           | The creator of the Zig language wrote in praise of musl:
           | 
           | https://andrewkelley.me/post/why-donating-to-musl-libc-
           | proje...
        
       | mywittyname wrote:
       | > Measures were done on a 2.4 GHz Pentium 4. Real time is
       | measured. Compilation time includes compilation, assembly and
       | linking.
       | 
       | How old is this project?
        
         | eeereerews wrote:
         | > (Nov 18, 2001) TCC version 0.9 is out. First public version.
        
       | algorithm314 wrote:
       | For some time now it has had a riscv64 backend.
        
       | mmastrac wrote:
       | It's worth reading through all the other bellard projects. He is
       | an amazingly productive developer.
       | 
       | https://news.ycombinator.com/from?site=bellard.org
        
         | LeoPanthera wrote:
         | Perhaps most famously, the original creator of both ffmpeg and
         | qemu.
        
       | hawski wrote:
       | I still wonder about the feasibility of Rob Landley's idea of QCC
       | - TCC with TCG backend - the code generator used by qemu which
       | comes originally from TCC itself. That would make it fast to
       | compile and with quite an extensive repertoire of architectures
       | covered.
       | 
       | https://landley.net/qcc/
        
       | winrid wrote:
       | As a side note, I'm surprised how big Links is!
        
       | laurensr wrote:
       | The author is a genius: a while ago he implemented an x86 virtual
       | machine in javascript:
       | https://bellard.org/jslinux/vm.html?url=buildroot-x86.cfg
        
         | simlevesque wrote:
         | He created FFmpeg, QEMU and claimed the pi digits record... A
         | legend.
        
       | tankfeeder wrote:
       | It's not dead; I've been using it from git for years.
       | 
       | https://repo.or.cz/w/tinycc.git
        
         | slezyr wrote:
         | wtf are those tags?
        
           | nadavami wrote:
           | Looks like anyone can add them right next to the tag cloud...
        
             | jjice wrote:
             | Classic mistake to allow anyone to add content to a public
             | site anonymously
        
       ___________________________________________________________________
       (page generated 2020-06-23 23:00 UTC)