[HN Gopher] Tiny-C Compiler (2001)
___________________________________________________________________
Tiny-C Compiler (2001)
Author : swatson741
Score : 199 points
Date : 2023-03-13 10:30 UTC (12 hours ago)
(HTM) web link (www.iro.umontreal.ca)
(TXT) w3m dump (www.iro.umontreal.ca)
| WoodenChair wrote:
| This is an interpreter for a super restricted subset of C and it
| looks well written from a pedagogical standpoint (keeps things
| pretty simple, fairly easy to read). But it's slightly awkward to
| strip down a language (what features do you keep, what do you
| lose?). I think it's more fun to build an interpreter for an
| actual tiny language. In my next book I have interpreters for
| Brainfuck [0], an obfuscated kind of joke of a language, and Tiny
| BASIC [1], a real tiny language that was used on early personal
| computers. These are pretty common first projects for folks
| interested in doing an interpreter.
|
| Here's why real languages are better than stripped down
| languages: Anyone with programming knowledge can implement a
| Brainfuck interpreter in a few hours and run any Brainfuck
| program. Anyone with a tiny bit of CS knowledge can implement a
| Tiny BASIC interpreter in just a day and then you can run any
| real Tiny BASIC program from the late 70s. It's cool to run real
| programs people actually used. With this stripped down C, there
| are no pre-made real programs...
|
| 0: https://en.wikipedia.org/wiki/Brainfuck
| 1: https://en.wikipedia.org/wiki/Tiny_BASIC
| Gordonjcp wrote:
| FORTH is another language that's quick and easy to write from
| scratch, where you need a couple of dozen words written in
| assembler and then the rest of FORTH can be written in FORTH.
| doodlesdev wrote:
| Another language that's more modern and currently useful but
| which is very tiny to write an interpreter for is Lua [0][1].
| Currently the official Lua interpreter has around 30k LOC which
| I find pretty amusing for a language used so widely in games
| and for scripting purposes [2]. Of course it's still at least
| an order of magnitude larger than a small Tiny BASIC
| interpreter but the fact it's a current language used in so
| many places makes it even more interesting to make your for-fun
| implementation.
|
| Also related to small language implementations I find notable
| PicoC [3] which is a C interpreter written in around 3k LOC of
| C. Past discussion about it here 13 years ago [4].
|
| [0]: https://www.lua.org/about.html
|
| [1]: https://www.lua.org/spe.html
|
| [2]: https://en.wikipedia.org/wiki/Lua_(programming_language)
|
| [3]: https://gitlab.com/zsaleeba/picoc
|
| [4]: https://news.ycombinator.com/item?id=1658890
| benj111 wrote:
| While I appreciate your point:
|
| 1. You use Tiny BASIC as the example of a 'real language', and
| I don't see how Tiny BASIC is a 'real language' but Tiny C is
| a stripped down language.
|
| 2. You can build on this to make a full C implementation. A
| minimal C implementation that can potentially bootstrap a full
| C environment is more useful than a Brainfuck interpreter.
| northernskys30 wrote:
| I did my CS degree at umontreal and this was an assignment in a
| second year class. This was a pretty interesting introduction to
| compilers, and even if this is a toy subset of C, this was
| challenging, at least for me. We would get 0 if there was any
| memory leak, so we were pretty paranoid about it.
|
| The second assignment was writing a Scheme interpreter.
| ndiddy wrote:
| It's kind of surprising that they cared so much about memory use;
| a lot of one-shot C programs such as compilers don't bother
| freeing memory and let the OS clean up after them once they
| exit.
| ComputerGuru wrote:
| I was about to comment and say the same thing, but as a
| graded learning exercise there is certainly value in that
| approach.
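To make the "0 if there was any memory leak" rule concrete, here is a minimal sketch of what leak-free handling of a syntax tree can look like in C. It is not taken from tinyc.c or from the assignment; the node layout and the names `new_node` and `free_tree` are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds for this sketch: a number literal and a binary add. */
enum kind { NODE_NUM, NODE_ADD };

/* Hypothetical AST node: a kind tag, a value (used by NODE_NUM), two children. */
struct node {
    enum kind kind;
    int value;
    struct node *left, *right;
};

/* Allocate one node. Aborts on allocation failure, so callers never see NULL. */
struct node *new_node(enum kind kind, int value,
                      struct node *left, struct node *right)
{
    /* A number has no children; an addition needs both operands. */
    assert(kind == NODE_NUM ? (!left && !right) : (left && right));

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->kind = kind;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

/* Release a whole tree, children before parent; returns how many nodes were freed. */
size_t free_tree(struct node *n)
{
    if (n == NULL)
        return 0;
    size_t freed = free_tree(n->left) + free_tree(n->right);
    free(n);
    return freed + 1;
}
```

Building the tree for `1 + 2` with three `new_node` calls and then calling `free_tree` on the root releases all three nodes. A one-shot compiler could skip this and let the OS reclaim everything at exit, which is the trade-off ndiddy describes.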
| ttvecthrowaway wrote:
| Not to be confused with https://bellard.org/tcc/, which is a tiny
| compiler for the C language.
| Laaas wrote:
| I use tcc for all of my small C "scripts" for doing ioctls,
| etc. Less bloat, suckless. I imagine most software would be
| better off using tcc than gcc/clang. Performance isn't that
| important in most cases.
| notorandit wrote:
| I think you are confusing the work of Fabrice Bellard with
| this very one. The former is a C-language compiler. This one
| is a compiler for a language called "Tiny C". Understandable
| confusion, though.
| [deleted]
| doublepg23 wrote:
| > Performance isn't that important in most cases.
|
| Optimizing for storage space is...better?
| vidarh wrote:
| Since they say "scripts", note that tcc supports being
| invoked in the shebang line. E.g.
|     #!/usr/bin/tcc -run
|
| You _can_ do that with gcc/clang too (e.g. #if 0, #endif
| to wrap a block of shell script to compile the current file
| and execute the result) but a primary value of tcc is that
| it _compiles fast_.
|
| On a more philosophical note, the suckless approach is to
| optimise for _simplicity_ not storage. It's perfectly
| valid to disagree with that of course, but if simplicity
| of the system as a whole is a consideration gcc and clang
| don't really fit.
| LukeShu wrote:
| You can only _sort of_ do that with gcc/clang. The #if 0
| trick relies on funny behavior that is in a few common
| shells. When you try to execve(2) a script without a
| proper #! shebang, the kernel will return ENOEXEC. Bash
| will check for ENOEXEC then check a few heuristics to see
| if it looks like a text file, and if it does, then it
| will try to run it as a shell script.
|
| This means that your script will work when run from a
| shell, but won't work when exec()ed from a non-shell
| program, which is a weird foot-gun.
| LanternLight83 wrote:
| Thanks for sharing!
| I've yet to go through my C phase, but
| see it on the horizon, and will remember this and the shebang
| trick.
| kevin_thibedeau wrote:
| This is a recommended practice for scripting with Nim if
| you want a batteries-included language.
| circuit10 wrote:
| I feel like a lot of software written in C is written in C
| for performance reasons. Obviously that's not always the case
| and TCC is useful but I wouldn't say that most software
| should use it.
| squarefoot wrote:
| It is sad that tcc is unmaintained as it would be really useful
| in small embedded systems. I just tried it on Debian and
| compilation fails without #undefining CONFIG_TCC_MALLOC_HOOKS
| in lib/bcheck.c. After compilation it passes tests, but they
| warn that it could be unreliable.
| jart wrote:
| Try chibicc. It's x86_64 native and so much more readable as
| a codebase than TCC.
| dantrell wrote:
| While Fabrice Bellard is no longer working on TCC [0] and an
| official release tarball hasn't been packaged since version
| 0.9.27 (5 years ago), the project is by no means unmaintained.
|
| For details, check their current working repository [1] and
| mailing list [2].
|
| [0]: https://bellard.org/tcc/
|
| [1]: https://repo.or.cz/tinycc.git
|
| [2]: https://lists.nongnu.org/archive/html/tinycc-devel/
| siliconunit wrote:
| I'm quite confused, not the same project at all? To me tiny c
| compiler always meant the bellard page. Super useful stuff for
| micro hacky projects.
| hawski wrote:
| One could say that the one from this submission is Tiny-C
| Compiler and Bellard's is Tiny C-Compiler.
| Narishma wrote:
| This is a compiler for a language called Tiny-C.
| notorandit wrote:
| I understand the confusion: it is more about "syntax
| associativity":
|
| (tiny C) compiler --> "This is a compiler for the Tiny-C
| language"
|
| vs
|
| Tiny (C compiler) --> "TinyCC [...] is a small but hyper fast
| C compiler"
|
| That's it!
| ;-)
| moffkalast wrote:
| Now obviously the next step is to make a tiny tiny c
| compiler compiler.
| Koshkin wrote:
| Sigh. I wish people would teach compilers using Oberon as an
| example. One can write a small yet complete compiler for what
| turns out to be a not-so-tiny language.
| peacefulhat wrote:
| Best to pick languages anybody has heard of.
| stevekemp wrote:
| That's a cute project, thanks for sharing.
|
| I hacked in support for ">", ">=", and "<=" to match the "<"
| support, but I just noticed that ints are truncated, so the
| maximum value stored in a variable is 127.
| bitwize wrote:
| Oh, Marc Feeley. Wonder if we'll see a Tiny-C target for Gambit?
| feeley wrote:
| That's not on my TODO! But Gambit does have support for TCC.
| For example you can use TCC to compile a file to a dynamically
| loadable object file (aka shared library). The compilation is
| faster than gcc and the code size is typically smaller too:
|     $ cat hello.scm
|     (display "hello!\n")
|     $ gsc hello.scm
|     $ gsi hello.o1
|     hello!
|     $ ls -l hello.o1 # this is generated by gcc
|     -rwxrwxr-x 1 feeley feeley 18152 Mar 13 17:16 hello.o1
|     $ rm hello.o1
|     $ gsc -cc "tcc -shared" hello.scm
|     $ gsi hello.o1
|     hello!
|     $ ls -l hello.o1 # this is generated by tcc
|     -rwxrwxr-x 1 feeley feeley 4432 Mar 13 17:17 hello.o1
| fernly wrote:
| Um, excuse me, but there existed a Tiny-C in 1979. Whatever you
| are talking about creating in 2000 is in no way an original idea.
|
| References:
|
| Dr. Dobb's Journal #32 (Feb 1979) page 41, review of Tiny-C User
| Manual by Ted Shapin [0]
|
| Dr. Dobb's Journal #35 (May 1979) page 37, "Tiny-C Interpreter on
| C-Dos" by Ray Duncan [1]
|
| Tiny-C Associates incorporated in Holmdel, NJ, March 1978 [2]
|
| "Tiny C" trademark application filed 1979, cancelled 1987 [3]
|
| There was also a "Small C", see DDJ #69 (July 1982) p. 66, "Small
| C for the 9900" by Matthew Halfant [4]
|
| [0] https://archive.org/details/dr_dobbs_journal_vol_04_201803/p...
|
| [1] https://archive.org/details/dr_dobbs_journal_vol_04_201803/p...
|
| [2] https://www.bizapedia.com/nj/tiny-c-associates.html
|
| [3] https://alter.com/trademarks/tiny-c-73219160
|
| [4] https://archive.org/details/dr_dobbs_journal_vol_07_201803/p...
| mati365 wrote:
| Recently I've been working on a toy C compiler and x86 assembler
| in TypeScript [1], and I can confirm that the amount of work that
| has to be done to compile and print a simple Hello World is
| astronomically huge (as is the satisfaction).
|
| [1] https://github.com/Mati365/ts-c-compiler
| Narishma wrote:
| This isn't a C compiler though. It's a compiler for a language
| called Tiny-C.
| [deleted]
| jokoon wrote:
| first assignment would be to add the multiply and divide
| operators...
|
| I admit I have trouble understanding how the VM run() function
| works... can anybody give some insight?
| mav88 wrote:
| The function runs through the program by incrementing the
| program counter (*pc++) and dispatching on whatever instruction
| it sees. It's a stack-based VM, so values are pushed onto and
| popped from the stack depending on the instruction being
| executed. Is there anything specific you don't grok? Happy to
| help.
| feeley wrote:
| Author here. Just for context tinyc.c was created in 2000 (I
| found the file in my archives and the last modification date is
| January 12, 2001). I was not aware at the time of Fabrice
| Bellard's work which after all won the IOCCC in 2001, so the
| confusion with TCC was not intentional. My tinyc.c was meant to
| teach the basics of compilers in a relatively accessible way,
| from parsing to AST to code generation to bytecode interpreter.
| And yes it is the subset of C that is tiny, not a tiny compiler
| for the full C language.
| bullen wrote:
| I wish I had time to make a list of what would be required to
| bootstrap this.
|
| Either by adding complexity (more features to the compiler) or
| dropping complexity (fewer C features in the implementation).
|
| Did you ever look at that?
|
| Edit: functions, enum, struct, arrays and maybe make all
| variables/functions a-z?
|
| Edit2: https://joyofsource.com/projects/bootstrappable-tcc.html
| userbinator wrote:
| It's unfortunately not self-compiling, but has a structure which
| is very reminiscent of C4 --- another tiny C-subset compiler +
| stack-based VM which is self-compiling:
|
| https://news.ycombinator.com/item?id=8558822
|
| The 26 predefined integer variables make this look like a variant
| of minimal BASIC, except with structured control flow instead of
| only GOTO.
| bakul wrote:
| This doesn't have types, functions, arrays or much error
| checking. It has one-char identifiers. I don't think we should
| read into this any more than a tiny example or experiment by the
| author.
| Gordonjcp wrote:
| So, it's the C equivalent of Tiny BASIC?
|
| So, a Tiny C?
| bakul wrote:
| Not even that as it doesn't have function calls or even
| print!
|
| See Feeley's response for the proper context.
| netgusto wrote:
| It's worth noting that this is a compiler for the Tiny-C
| language, and not, as one might think, a tiny compiler for the C
| language.
| susam wrote:
| Yes, a better title would be:
|
| Compiler for the Tiny-C Language (2001)
|
| In fact, that is exactly how the source code describes itself
| in the comments.
| unwind wrote:
| It's probably better to call it an interpreter, since it will
| also run the program and print the values of all non-zero
| variables afterward.
|
| Calling it a compiler is (to me) really stretching things, I
| can't see any code to emit any other form of the code, it's all
| aimed at evaluating (executing) it.
|
| Edit: oops, I didn't read the code closely enough, it does emit
| code but only internally, that code is what gets executed.
| Thanks for the corrections!
| northernskys30 wrote:
| It compiles to a sort of byte code that is executed by a
| stack based virtual machine.
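For readers wondering what "byte code executed by a stack based virtual machine" looks like in practice, here is a minimal sketch of the `switch (*pc++)` dispatch loop described earlier in the thread. It is not the code from tinyc.c: the opcode names, the fixed stack size, and the convention that jump operands are absolute indexes into the code array are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical opcodes invented for this sketch; not the ones in tinyc.c. */
enum op { OP_PUSH, OP_LOAD, OP_STORE, OP_ADD, OP_SUB, OP_LT,
          OP_JZ, OP_JMP, OP_HALT };

enum { STACK_MAX = 64, NUM_VARS = 26 };

/* One slot per single-letter variable 'a'..'z' (index = letter - 'a'). */
int vars[NUM_VARS];

/* Run bytecode until OP_HALT; return the top of the stack, or 0 if it is empty. */
int run(const int *code)
{
    int stack[STACK_MAX];
    int sp = 0;               /* number of values currently on the stack */
    const int *pc = code;     /* program counter */

    for (;;) {
        switch (*pc++) {      /* fetch the opcode, then step past it */
        case OP_PUSH:         /* operand: literal value */
            assert(sp < STACK_MAX);
            stack[sp++] = *pc++;
            break;
        case OP_LOAD:         /* operand: variable index */
            assert(sp < STACK_MAX);
            stack[sp++] = vars[*pc++];
            break;
        case OP_STORE:        /* operand: variable index; pops the value */
            assert(sp >= 1);
            vars[*pc++] = stack[--sp];
            break;
        case OP_ADD:          /* pops two values, pushes their sum */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_SUB:          /* pops two values, pushes their difference */
            assert(sp >= 2);
            stack[sp - 2] -= stack[sp - 1];
            sp--;
            break;
        case OP_LT:           /* pops two values, pushes 1 if first < second */
            assert(sp >= 2);
            stack[sp - 2] = stack[sp - 2] < stack[sp - 1];
            sp--;
            break;
        case OP_JZ:           /* operand: absolute target; pops the condition */
            assert(sp >= 1);
            if (stack[--sp] == 0)
                pc = code + *pc;
            else
                pc++;
            break;
        case OP_JMP:          /* operand: absolute target */
            pc = code + *pc;
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

Under these assumptions, a source fragment like `i = 0; while (i < 5) i = i + 1;` could be compiled to a flat array of opcodes and operands: the loop test becomes `OP_LOAD`, `OP_PUSH`, `OP_LT` followed by an `OP_JZ` past the body, and the body ends with an `OP_JMP` back to the test. The compiler part only has to emit that array; `run` does the rest.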
| userbinator wrote:
| It is a compiler rather than a direct evaluator, since it
| generates bytecode for a stack VM --- and also includes the
| interpreter for that (look at the bottom).
| masklinn wrote:
| That's more or less every interpreter. CPython compiles to
| bytecode before interpreting that, yet nobody would call it
| a compiler.
| Mike_12345 wrote:
| That is definitely a compiler and anyone with a CS degree
| would call it that if they were discussing its
| functionality, because that's technically what it is.
| (Referring specifically to the part which compiles Python
| to bytecode)
|
| Your SQL database also has a compiler. SQL is compiled to
| an execution plan. Compile doesn't only mean "create a
| machine code executable file".
| masklinn wrote:
| > That is definitely a compiler and anyone with a CS
| degree would call it that if they were discussing its
| functionality because that's technically what it is.
|
| None of these assertions is correct.
|
| > (Referring specifically to the part which compiles
| Python to bytecode)
|
| So referring specifically to something different than
| what I explicitly specified, it's called something else.
|
| By that reasoning, a cow is a muscle and you are an acid.
|
| > Your SQL database also has a compiler.
|
| "Has a" and "is a" are rather different relationships.
|
| > Compile doesn't only mean "create a machine code
| executable file".
|
| You're the only person who made that assertion.
| shadowfox wrote:
| In contrast, Java also did that and I doubt if most
| people think of Java as interpreted. So, using a
| bytecode interpreter may not be the criterion most people
| are using to decide on this. Truthfully, I think it is all
| a bit arbitrary.
| [deleted]
| zabzonk wrote:
| not sure i understand how enums work here. but interesting.
___________________________________________________________________
(page generated 2023-03-13 23:01 UTC)