[HN Gopher] Tiny C Compiler
___________________________________________________________________
Tiny C Compiler

Author : Koshkin
Score  : 145 points
Date   : 2020-06-23 17:44 UTC (5 hours ago)

(HTM) web link (bellard.org)
(TXT) w3m dump (bellard.org)

| Conlectus wrote:
| Heads up: Though amazing, the compiler makes use of the
| deprecated POSIX setcontext API[1], which means you may have
| trouble running it on some modern distributions.
|
| [1] https://en.wikipedia.org/wiki/Setcontext

| stephc_int13 wrote:
| Anyone impressed by the compile times of modern languages
| such as Swift, Rust or Go should try TCC at least once
| in their lifetime.
|
| Both compilation and linking are generally finished before your
| fingers have released the enter key...

| FpUser wrote:
| Already discussed numerous times, but I never miss a chance
| to sing Bellard's praises. One of the very few real programming
| geniuses.

| Teknoman117 wrote:
| My main experience with tcc was that it was the only C compiler
| available for Damn Small Linux (a <50 MB Linux distro with X and
| Firefox).

| jonathonf wrote:
| Want to "skip" the compile step?
|
|     //usr/bin/tcc -run $0; exit
|     main() { printf("Hello\n"); return 0; }
|
| Just `chmod +x` and run it...

| sergeykish wrote:
| From man tcc(1):
|
|     #!/usr/bin/tcc -run
|     #include <stdio.h>
|     int main() {
|         printf("Hello World\n");
|         return 0;
|     }

| mmastrac wrote:
| How does this work? Is there an alternate form for #! (aka
| shebang)?
|
| EDIT: hah, I should have realized that "//" is the C comment
| and "//usr/bin/tcc" is equivalent to "/usr/bin/tcc". Clever!

| dfabulich wrote:
| "C comment," you mean.
|
| So it's running the file as a shell script, where the first
| line runs tcc on the current file and quits.

| mmastrac wrote:
| Fixed typo, thanks. Yeah, for some reason it didn't click
| when I saw it.
| hawski wrote:
| It's obviously not the same, but you can emulate this with a
| conventional compiler:
|
|     #if 0
|     set -e; [ "$0" -nt "$0.bin" ] && cc "$0" -o "$0.bin"
|     exec "$0.bin" "$@"
|     #endif
|     #include <stdio.h>
|     int main(int argc, char *argv[]) {
|         puts("Hello world!");
|         return 0;
|     }
|
| It works because, by default, the system shell will be spawned.

| shakna wrote:
| And add the -b command-line flag to enable the memory and bounds
| checker.

| agumonkey wrote:
| Someone to revive tccboot?

| dang wrote:
| See also
|
| 2016 https://news.ycombinator.com/item?id=13249851
|
| 2017 https://news.ycombinator.com/item?id=15272894
|
| 2018 - obfuscated! https://news.ycombinator.com/item?id=17335856

| fwsgonzo wrote:
| It's not just a tiny C compiler - it's a compiler you can use as
| a static library that can compile your C code directly to memory.
| And then you can call into it, if you dare.
|
| I used it as a scripting backend for a long time, but eventually
| you will realize that you are not the only one who could write
| scripts, and that you really need to have a trust boundary (or
| just a different address space), so that errors in the script
| don't drag the whole thing down.
|
| That's where Lua and LuaJIT shine: they're simple, and they're
| sandboxed. However, there is still one thing missing. These days,
| several programming languages have several targets that they can
| compile to, and so imagine if you could emulate some platform at
| high performance with low overhead, in a sandbox. You would then
| be able to script in whatever language you so desire, provided it
| has that target.
|
| Unfortunately, those languages tend to be systems languages and
| not the absolute best for scripting. With one big exception:
| Compile-time programming support.

| BubRoss wrote:
| Have you actually gotten the static library and compile-to-memory
| mode to work?

| fwsgonzo wrote:
| Yes, I used it in a game engine for many years, on Windows
| and Linux. No problems on that part.
|
| I don't really recommend doing it, unless you want to do it
| as a learning experiment. It was a fun thing to do.

| saagarjha wrote:
| I have tried the latter and it seems to work fine.

| rubber_duck wrote:
| I feel like embedding a WebAssembly runtime and then exposing
| a scripting API to it would give you similar benefits (you would
| have to add a compile-C-to-WASM step before loading) while
| giving you a really secure sandbox.
|
| It is more complicated than just embedding a scripting language
| (or a scripting compiler :)) but more general and secure.
|
| Especially if WASM evolves to a point where you can compile C#
| and other high-level languages to it (you can to a point
| already but it's not on par with native runtimes) - it would be
| the most general embedding runtime.

| monocasa wrote:
| You can see this technique being used in TCCBOOT, using tcc as
| a boot loader to compile a Linux kernel into memory and then
| run it in a handful of seconds.
|
| https://bellard.org/tcc/tccboot.html

| SomeoneFromCA wrote:
| Yeah well, modern Linux cannot be compiled in 15 seconds. The
| link is from 2004. OTOH, I wonder how quickly a modern Ryzen
| would be able to boot a 2004 kernel from source.

| tyingq wrote:
| The Threadripper 3970X can compile the 5.x kernel in ~24
| seconds. That's not too shabby.
| https://www.phoronix.com/scan.php?page=article&item=amd-linu...

| monocasa wrote:
| Kernel compile time has apparently stayed pretty static
| over the years (or at least tracked with perf
| improvements). That being said, I think that takes into
| account SMP, which probably wouldn't be easily accessible to
| tccboot.

| haberman wrote:
| LuaJIT is not sandboxed.
|
| Mike Pall told me this in no uncertain terms:
| http://lua-users.org/lists/lua-l/2011-02/msg01582.html
|
| > The only reasonably safe way to run untrusted/malicious Lua
| scripts is to sandbox it at the process level. Everything else
| has far too many loopholes.
| fwsgonzo wrote:
| I see, that's unfortunate. I don't have those issues in my
| emulated environment though. I don't use Lua anymore - I'm
| running emulated RISC-V in my own emulator.
|
| I have been benchmarking against LuaJIT for the longest time
| because I thought it was an equal in that respect. Guess it
| should have been against regular Lua.

| haberman wrote:
| For what it's worth, I think his statement applies to
| regular Lua too, see:
| http://lua-users.org/lists/lua-l/2011-02/msg01595.html

| wahern wrote:
| The issue isn't that Lua can't nominally provide a sandboxed
| environment--it can, better than almost any other language.
| The central issue is whether LuaJIT, PUC Lua, or any other
| particular piece of software can be made sufficiently free of
| bugs that you can trust such a sandbox to run potentially
| malicious code.
|
| The answer in the case of LuaJIT is definitely no, because
| the JIT engine is sufficiently complex that exploits are
| inevitable. Note that this is _also_ the case with
| JavaScript. Many browser exploits start with some codegen bug
| in V8 or SpiderMonkey. And there are many more eyeballs
| looking to fix bugs in V8 and SpiderMonkey than LuaJIT, so in
| the case of LuaJIT the prudent answer is that you should
| never trust it to run potentially malicious code.
|
| The case for PUC Lua is more nuanced. Lua used to have a
| bytecode verifier, but it was removed for 5.2 because too
| many bugs were found in the verifier, and because the VM
| relied heavily on the verifier to filter bad opcode patterns,
| that led to sandbox breakouts. This made the PUC developers
| believe that the verifier made Lua worse off as compared to a
| more holistic emphasis on correctness and robustness, so
| they dropped the verifier. They also dropped any pretense
| that you could safely sandbox bytecode (i.e.
| precompiled scripts); if you want a sandbox in Lua you should only load
| untrusted code as plain Lua scripts into the sandboxed
| environment. To that end Lua 5.2 added a parameter to all
| APIs for loading code that specified whether to accept text
| scripts or binary bytecode. In other words, the bytecode
| verifier was considered a third wheel, so they removed it and
| redirected attention to the compiler and the rest of the VM.
|
| So for PUC Lua the issue really comes down to how prudent it
| is to draw a trust boundary around a pure Lua sandbox; or
| rather, what your adversarial model is, precisely. PUC Lua is
| committed to Lua's sandboxing features, but many developers
| are fairly of the opinion that the only way to run untrusted
| code, if you're to run untrusted code at all, is using either
| a hardware VM or a very strict seccomp jail. If you're of the
| latter opinion, the language is irrelevant--you shouldn't
| trust Lua, JavaScript, Java, or any other language
| environment, period. In practice, however, even people of the
| latter opinion generally apply the principle of defense in
| depth. That's why browser JavaScript APIs and capabilities
| are still relatively limited, even though browsers execute
| JavaScript in OS-based sandboxes. In most practical contexts
| Lua's sandboxing features still provide great value; it just
| needs to be understood that they're a complement to rather
| than substitute for process sandboxing.
|
| The same analysis applies to WebAssembly, FWIW, especially
| JIT'ing WASM environments. Anybody who thinks WASM is a magic
| cure-all for running untrusted code is mistaken.

| kriro wrote:
| On a somewhat related note, I always thought dietlibc was a
| pretty cool project (basically trying to get libc to be very
| tiny). It's been a pretty long time since I browsed the code but
| I remember it being quite elegant.
|
| https://www.fefe.de/dietlibc/

| Koshkin wrote:
| > _2. Fabrice Bellard's Tiny C Compiler.
| You can't compile the diet libc with it._
|
| :(

| Conlectus wrote:
| From my experience musl libc[1] is a more popular project in
| this space
|
| [1] https://musl.libc.org/

| acqq wrote:
| The creator of the Zig language wrote in praise of musl:
|
| https://andrewkelley.me/post/why-donating-to-musl-libc-proje...

| mywittyname wrote:
| > Measures were done on a 2.4 GHz Pentium 4. Real time is
| measured. Compilation time includes compilation, assembly and
| linking.
|
| How old is this project?

| eeereerews wrote:
| > (Nov 18, 2001) TCC version 0.9 is out. First public version.

| algorithm314 wrote:
| For some time now it has had a riscv64 backend.

| mmastrac wrote:
| It's worth reading through all the other Bellard projects. He is
| an amazingly productive developer.
|
| https://news.ycombinator.com/from?site=bellard.org

| LeoPanthera wrote:
| Perhaps most famously, the original creator of both ffmpeg and
| qemu.

| hawski wrote:
| I still wonder about the feasibility of Rob Landley's idea of QCC
| - TCC with a TCG backend - the code generator used by qemu which
| comes originally from TCC itself. That would make it fast to
| compile and with quite an extensive repertoire of architectures
| covered.
|
| https://landley.net/qcc/

| winrid wrote:
| As a side note, I'm surprised how big Links is!

| laurensr wrote:
| The author is a genius: a while ago he implemented an x86 virtual
| machine in JavaScript:
| https://bellard.org/jslinux/vm.html?url=buildroot-x86.cfg

| simlevesque wrote:
| He created FFmpeg, QEMU and claimed the pi digits record... A
| legend.

| tankfeeder wrote:
| It's not dead; I've been using it from git for years.
|
| https://repo.or.cz/w/tinycc.git

| slezyr wrote:
| wtf are those tags?

| nadavami wrote:
| Looks like anyone can add them right next to the tag cloud...

| jjice wrote:
| Classic mistake to allow anyone to add content to a public
| site anonymously
___________________________________________________________________
(page generated 2020-06-23 23:00 UTC)